Status: Done
Assignee: Unassigned
Reporter: jinyou.ma
Needs QA: Yes
Fix versions:
Affects versions:
Priority: Medium
Created October 18, 2023 at 7:39 AM
Updated June 6, 2024 at 8:02 AM
Resolved January 16, 2024 at 5:56 PM
Crash log
The SEMAPHORES section of the InnoDB status output shows threads waiting on an RW-latch.
The RW-latch is held by a thread.
Reproduce
Deploy one PXC cluster and one replication cluster.
Connect the two clusters with asynchronous replication.
Run the SQL that crashes the PXC cluster.
When the output pauses, check the InnoDB status on node 1 of the PXC cluster with the command below.
Root cause
Because ha_list is null in ha_commit_low, the thread does not call ht->commit, so its thd is never popped from wsrep_group_commit_queue.
That thd therefore stays as the first element of wsrep_group_commit_queue. When other threads call wsrep_wait_for_turn_in_group_commit, they wait on the condition COND_wsrep_group_commit until MySQL crashes.
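The blocking pattern described above can be modelled in a few lines. This is a minimal single-threaded sketch, not the server code: GroupCommitQueue, commit_low, CommitResult and replay_third_commit are hypothetical names, and where the real server would wait on COND_wsrep_group_commit the sketch simply returns Blocked.

```cpp
#include <cassert>
#include <deque>

// Hypothetical stand-in for wsrep_group_commit_queue: Xids in commit order.
struct GroupCommitQueue {
  std::deque<int> xids;

  void register_xid(int xid) { xids.push_back(xid); }
  bool is_turn_of(int xid) const { return !xids.empty() && xids.front() == xid; }
  void pop_front() { xids.pop_front(); }
};

enum class CommitResult { Committed, SkippedEngineCommit, Blocked };

// Models one pass through the commit path (assumed shape, for illustration).
// The real server would block on a condition variable where this returns Blocked.
CommitResult commit_low(GroupCommitQueue& queue, int xid, bool ha_list_is_null) {
  if (ha_list_is_null) {
    // No engine commit is called, so nothing pops this Xid from the queue.
    return CommitResult::SkippedEngineCommit;
  }
  if (!queue.is_turn_of(xid)) {
    // Another Xid is still at the head of the queue.
    return CommitResult::Blocked;
  }
  queue.pop_front();  // The engine commit removes the head and lets the next Xid proceed.
  return CommitResult::Committed;
}

// Replays three registered transactions; the middle one may have a null ha_list.
// Returns what happens to the third transaction.
CommitResult replay_third_commit(bool middle_has_null_ha_list) {
  GroupCommitQueue queue;
  for (int xid : {4302, 4303, 4304}) queue.register_xid(xid);
  commit_low(queue, 4302, false);
  commit_low(queue, 4303, middle_has_null_ha_list);
  return commit_low(queue, 4304, false);
}
```

Calling replay_third_commit(true) leaves the middle Xid at the head of the queue, so the third commit is reported as Blocked. With replay_third_commit(false) every Xid is popped in turn and the third commit goes through.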
The trace below was captured by adding breakpoints:
Id 21 [ Xid 4302 ] register
Id 15 [ Xid 4303 ] register
Id 16 [ Xid 4304 ] register
Id 17 [ Xid 4305 ] register
Id 21 [ Xid 4302 ] enters ha_commit_low
Id 18 [ Xid 4306 ] register
Id 21 [ Xid 4302 ] ha_list is not null
Id 21 [ Xid 4302 ] innobase_commit
Id 21 [ Xid 4302 ] wait
Id 21 [ Xid 4302 ] leaves
Id 15 [ Xid 4303 ] enters ha_commit_low
Id 16 [ Xid 4304 ] enters ha_commit_low
Id 16 [ Xid 4304 ] ha_list is not null
Id 16 [ Xid 4304 ] innobase_commit
Id 16 [ Xid 4304 ] wait
Id 22 [ Xid 4307 ] register
Xid 4303 enters ha_commit_low but does not call innobase_commit.
As a result, Xid 4303 is never popped from wsrep_group_commit_queue.
When Xid 4304 commits, it waits on the condition.
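A trace in this format can also be scanned mechanically for the stuck transaction. The sketch below assumes each line has the shape "Id <n> [ Xid <m> ] <event>" as in the trace above. The function name xids_missing_engine_commit is hypothetical. It reports every Xid that logged "enters ha_commit_low" but never logged "innobase_commit".

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Returns the Xids that entered ha_commit_low but never reached innobase_commit.
std::vector<int> xids_missing_engine_commit(const std::string& trace) {
  std::set<int> entered;
  std::set<int> committed;
  std::istringstream lines(trace);
  std::string line;
  while (std::getline(lines, line)) {
    std::istringstream fields(line);
    std::string id_kw, open, xid_kw, close;
    int id = 0;
    int xid = 0;
    // Expected shape: Id <n> [ Xid <m> ] <event...>; other lines are skipped.
    if (!(fields >> id_kw >> id >> open >> xid_kw >> xid >> close)) continue;
    if (id_kw != "Id" || xid_kw != "Xid") continue;
    std::string event;
    std::getline(fields >> std::ws, event);  // rest of the line is the event text
    if (event == "enters ha_commit_low") entered.insert(xid);
    if (event == "innobase_commit") committed.insert(xid);
  }
  std::vector<int> missing;
  for (int xid : entered) {
    if (committed.count(xid) == 0) missing.push_back(xid);
  }
  return missing;  // ascending order, because std::set iterates in sorted order
}
```

Fed the trace above, the only Xid with an "enters ha_commit_low" line and no "innobase_commit" line is 4303.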