Done
Details
Assignee
Kamil Holubicki
Reporter
Juan Arruti
Labels
Needs QA
Yes
In progress time
6.25
Time tracking
No time logged; 1w 1d 2h remaining
Sprint
None
Fix versions
Affects versions
Priority
Created July 13, 2024 at 2:24 AM
Updated December 23, 2024 at 11:40 AM
Resolved September 27, 2024 at 7:45 AM
On PXC 8.0.36, a flapping flow control scenario may hang the cluster in a multi-writer environment. It also affects 5.7.44 and 5.7.25.
InnoDB status from the affected node shows threads in replicating state:
The receive queue does not show write-sets:
And flow control is still active:
Nodes 2 and 3 also show flow control as active:
Killing the threads doesn't fix the issue; the node needs to be restarted to recover the cluster:
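The symptom described above (flow control reported as active while the receive queue holds no write-sets) can be expressed as a small check. This is only a sketch: the helper name `looks_stuck` is hypothetical, and it assumes the node's status has already been collected into a dictionary keyed by the PXC status variables `wsrep_flow_control_status` and `wsrep_local_recv_queue`, with values as strings the way `SHOW GLOBAL STATUS` returns them.

```python
from typing import Mapping


def looks_stuck(status: Mapping[str, str]) -> bool:
    """Return True when flow control is ON although the receive queue is empty.

    `status` is assumed to map status variable names to their string values,
    e.g. as gathered from SHOW GLOBAL STATUS LIKE 'wsrep%'.
    """
    # Missing variables fall back to the "healthy" value so that an
    # incomplete sample is never reported as stuck.
    flow_control_on = status.get("wsrep_flow_control_status", "OFF").upper() == "ON"
    recv_queue_empty = int(status.get("wsrep_local_recv_queue", "0")) == 0
    return flow_control_on and recv_queue_empty
```

A single positive sample can be transient during normal flow control, so a check like this is only meaningful when it stays true across repeated samples.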
How to repeat:
Use the attached my.cnf to create a 3-node PXC 8.0.36 cluster.
Create the following tables:
On node 1, configure an 8M redo log and strict durability settings:
On node 1, run the following command to induce flow control flapping:
And run the following workload:
On node 2, run the following commands:
Monitor flow control on node 1; you may need to add more inserts if the flapping happens only every several seconds.
Since it’s a race condition, it may take seconds to minutes to trigger the bug.
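To judge whether flow control is actually flapping while monitoring node 1, one option is to sample the flow control state at a fixed interval and count how often it changes. The sketch below assumes the samples are the string values of `wsrep_flow_control_status` (`ON` or `OFF`) collected by some external polling loop; the function names and the `min_transitions` threshold are hypothetical and not part of the original report.

```python
from typing import Iterable


def count_transitions(samples: Iterable[str]) -> int:
    """Count how many times consecutive samples differ (ON->OFF or OFF->ON)."""
    transitions = 0
    previous = None
    for sample in samples:
        current = sample.upper()
        if previous is not None and current != previous:
            transitions += 1
        previous = current
    return transitions


def is_flapping(samples: Iterable[str], min_transitions: int = 4) -> bool:
    """Treat the window as flapping once it has at least `min_transitions` changes.

    The default threshold is an arbitrary illustration value.
    """
    return count_transitions(samples) >= min_transitions
```

For example, a window sampled as `["OFF", "ON", "OFF", "ON", "OFF"]` contains four state changes and would count as flapping with the default threshold, while a window that stays `ON` throughout would not.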