PXC in non-primary if IO starvation + topology change
General
Escalation
Description
Environment
None
AFFECTED CS IDs
275852
Smart Checklist
Activity
Julia Vural March 4, 2025 at 9:26 PM
It appears that this issue is no longer being worked on, so we are closing it for housekeeping purposes. If you believe the issue still exists, please open a new ticket after confirming it's present in the latest release.
Won't Do
Details
Assignee
Unassigned
Reporter
Carlos Tutte
Affects versions
Priority
Created June 3, 2020 at 7:08 PM
Updated March 4, 2025 at 9:26 PM
Resolved March 4, 2025 at 9:26 PM
If a cluster has "N" nodes and one of the nodes shuts down (either gracefully or unexpectedly) while the other nodes are experiencing IO starvation or have no IO access at all, the cluster is left in a non-primary state.
To reproduce, set up a 3-node PXC cluster with all nodes up.
Freeze the filesystem on nodes 1 and 2:
# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        40G  4.4G   36G  11% /
devtmpfs        1.9G     0  1.9G   0% /dev
tmpfs           1.9G     0  1.9G   0% /dev/shm
tmpfs           1.9G  8.5M  1.9G   1% /run
tmpfs           1.9G     0  1.9G   0% /sys/fs/cgroup
tmpfs           379M     0  379M   0% /run/user/1000
# fsfreeze -f /
Shut down node 3:
systemctl stop mysql
After unfreezing the filesystems, it can be seen that the cluster is non-primary:
fsfreeze -u /
...
PXC: root@localhost ((none)) > show global status like 'wsrep_cluster_s%';
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_size       | 2                                    |
| wsrep_cluster_state_uuid | a7c767b5-a4f6-11ea-a007-5ad7ece0d11b |
| wsrep_cluster_status     | non-Primary                          |
+--------------------------+--------------------------------------+
3 rows in set (0.00 sec)
In the error log, these messages show that the cluster cannot reach the other nodes:
2020-06-02T19:54:22.319479Z 0 [Warning] WSREP: last inactive check more than PT1.5S (3*evs.inactive_check_period) ago (PT14.1444S), skipping check
2020-06-02T19:54:22.319550Z 0 [Warning] WSREP: evs::proto(ac7a1c98, INSTALL, view_id(REG,0bef8e20,53)) install timer expired
evs::proto(evs::proto(ac7a1c98, INSTALL, view_id(REG,0bef8e20,53)), INSTALL) {
current_view=Current view of cluster as seen by this node
view (view_id(REG,0bef8e20,53)
memb {
        0bef8e20,0
        ac7a1c98,0
        bf9d7f66,0
        }
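The first warning quantifies how long the node was stalled (here PT14.1444S against a PT1.5S threshold, i.e. 3 * evs.inactive_check_period). A small sketch for pulling those two durations out of a log line when scanning for this condition; the regex and function name are illustrative:

```python
import re

# Matches the WSREP "last inactive check" warning and captures the
# threshold and the actual elapsed time, both in ISO-8601 seconds (PTxS).
WARN_RE = re.compile(
    r"last inactive check more than PT([\d.]+)S .* ago \(PT([\d.]+)S\)"
)

def stall_seconds(log_line: str):
    """Return (threshold, actual) stall durations in seconds, or None."""
    m = WARN_RE.search(log_line)
    if not m:
        return None
    return float(m.group(1)), float(m.group(2))

line = ("2020-06-02T19:54:22.319479Z 0 [Warning] WSREP: last inactive check "
        "more than PT1.5S (3*evs.inactive_check_period) ago (PT14.1444S), "
        "skipping check")
print(stall_seconds(line))  # (1.5, 14.1444)
```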
Repeating the experiment without freezing the filesystem and tracing file activity with the BCC tools (filetop), it can be seen that the file gvwstate.dat is accessed by MySQL:
TID    COMM        READS  WRITES  R_Kb  W_Kb  T  FILE
1719   irqbalance  7      0       7     0     R  smp_affinity
4247   filetop     2      0       2     0     R  loadavg
4810   mysqld      0      8       0     1     R  error.log
1719   irqbalance  1      0       1     0     R  stat
1719   irqbalance  1      0       1     0     R  smp_affinity
1719   irqbalance  1      0       1     0     R  interrupts
4812   mysqld      0      6       0     0     R  error.log
4804   mysqld      0      6       0     0     R  error.log
4804   mysqld      0      1       0     0     R  gvwstate.dat.tmp
4803   mysqld      0      1       0     0     R  error.log
The mysqld process tries to access the files error.log and gvwstate.dat.tmp. From the source code documentation file doc/source/wsrep-files-index.rst:
* :file:`gvwstate.dat` This file is used for Primary Component recovery feature. This file is created once primary component is formed or changed, so you can get the latest primary component this node was in. And this file is deleted when the node is shutdown gracefully. First part contains the node :term:`UUID` information. Second part contains the view information. View information is written between ``#vwbeg`` and ``#vwend``. View information consists of:
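The two-part layout described above (a node UUID section, then view information between #vwbeg and #vwend) can be read with a short sketch like the following; the sample file content is illustrative, not a capture from a real node:

```python
# Hypothetical reader for gvwstate.dat as documented above: a node UUID
# part, then the view information enclosed in #vwbeg/#vwend markers.
def parse_gvwstate(text: str) -> dict:
    state = {"my_uuid": None, "view": []}
    in_view = False
    for line in text.splitlines():
        line = line.strip()
        if line == "#vwbeg":
            in_view = True
        elif line == "#vwend":
            in_view = False
        elif in_view:
            state["view"].append(line)
        elif line.startswith("my_uuid:"):
            state["my_uuid"] = line.split(":", 1)[1].strip()
    return state

# Illustrative sample (UUIDs are placeholders, not real cluster data).
sample = """\
my_uuid: ac7a1c98-0000-0000-0000-000000000000
#vwbeg
view_id: 3 0bef8e20-0000-0000-0000-000000000000 53
bootstrap: 0
member: 0bef8e20-0000-0000-0000-000000000000 0
member: ac7a1c98-0000-0000-0000-000000000000 0
#vwend
"""

state = parse_gvwstate(sample)
print(state["my_uuid"])    # ac7a1c98-0000-0000-0000-000000000000
print(len(state["view"]))  # 4
```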
Since PXC needs to write a file to disk during topology changes, IO issues/starvation on some nodes can bring the whole cluster down.
Suggested fix:
Is it possible to make the topology change independent of creating a new "gvwstate.dat" file on disk?