pxc-2 was the only alive node before the pxc resource was deleted; the re-created cluster does not recover
General
Escalation
Description
If 2 of the 3 nodes were down before the cluster was deleted, crash recovery does not work after the cluster is re-created.
Steps to reproduce:
1. on pxc-2 create a table, insert a few rows
2. on pxc-0 freeze mysqld (kill -STOP on the node)
3. on pxc-2 insert additional rows
4. on pxc-1 freeze mysqld (kill -STOP on the node)
5. on pxc-2 force primary: SET GLOBAL wsrep_provider_options='pc.bootstrap=YES'; and insert a few rows
6. delete pxc resource cluster1
7. create the cluster
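Steps 5 to 7 can be sketched as a dry-run shell script that only prints the commands and executes nothing. The container name `pxc`, the `ROOT_PASSWORD` variable, the `CR_FILE` path, and the helper function names are assumptions for illustration and do not come from the report. Steps 2 and 4 are left out because how mysqld gets frozen depends on where the process is reachable.

```shell
#!/bin/sh
# Dry-run sketch of reproduction steps 5-7: prints commands, runs nothing.
# Assumptions (not from the report): container "pxc", a ROOT_PASSWORD
# environment variable, and CR_FILE pointing at the cluster manifest.

CLUSTER="${CLUSTER:-cluster1}"
CR_FILE="${CR_FILE:-deploy/cr.yaml}"

# Step 5: print the command that forces pod $1 into a primary component.
force_primary_cmd() {
  printf 'kubectl exec %s -c pxc -- mysql -uroot -p"$ROOT_PASSWORD" -e "%s"\n' \
    "$1" "SET GLOBAL wsrep_provider_options='pc.bootstrap=YES';"
}

# Step 6: print the command that deletes the pxc custom resource.
delete_cluster_cmd() {
  printf 'kubectl delete pxc %s\n' "$CLUSTER"
}

# Step 7: print the command that re-creates the cluster from the manifest.
create_cluster_cmd() {
  printf 'kubectl apply -f %s\n' "$CR_FILE"
}

force_primary_cmd "${CLUSTER}-pxc-2"
delete_cluster_cmd
create_cluster_cmd
```

Review the printed commands and adjust the assumed names to the actual deployment before running any of them.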
pxc-0 and pxc-1 had no quorum before the restart. pxc-2 was stopped normally, without a crash.
As a result, crash recovery is not initiated.
Recovery can be triggered by creating an empty gvwstate.dat file:
kubectl exec -it cluster1-pxc-2 -c logs -- touch /var/lib/mysql/gvwstate.dat
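The workaround can be wrapped in a small script, shown here as a minimal sketch. The pod name, the `logs` container, and the file path come from the report. The `DRY_RUN` switch and the helper function names are illustrative assumptions. The `-it` flags are dropped because `touch` needs no interactive terminal.

```shell
#!/bin/sh
# Sketch of the workaround: create an empty gvwstate.dat on one pod.
# DRY_RUN and the function names are assumptions made for this example.

DRY_RUN="${DRY_RUN:-1}"          # 1 = only print the command, 0 = also run it
CONTAINER="${CONTAINER:-logs}"   # container used in the report
GVWSTATE="/var/lib/mysql/gvwstate.dat"

# Print the kubectl command for pod $1.
touch_gvwstate_cmd() {
  printf 'kubectl exec %s -c %s -- touch %s\n' "$1" "$CONTAINER" "$GVWSTATE"
}

# Print the command, and execute it only when DRY_RUN=0.
touch_gvwstate() {
  touch_gvwstate_cmd "$1"
  if [ "$DRY_RUN" = "0" ]; then
    kubectl exec "$1" -c "$CONTAINER" -- touch "$GVWSTATE"
  fi
}

touch_gvwstate "${1:-cluster1-pxc-2}"
```

By default the script only prints the command; setting `DRY_RUN=0` runs it against the current kubectl context.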
Environment
None
AFFECTED CS IDs
CS0017674
Activity
Jira Bot September 28, 2021 at 9:58 AM
Hello, it's jira-bot again. Your bug report is important to us, but we haven't heard from you since the previous notification. If we don't hear from you on this in 7 days, the issue will be automatically closed.
Jira Bot September 11, 2021 at 12:57 PM
Hello, I'm jira-bot, Percona's automated helper script. Your bug report is important to us, but we've been unable to reproduce it and have asked you for more information. If we haven't heard from you on this in 3 more weeks, the issue will be automatically closed.
Slava Sarzhan June 22, 2021 at 4:39 PM
Hi,
I tried to reproduce this issue on 1.8.0 but could not. I generated traffic, deleted pod pxc-0 using chaos-mesh, then deleted pod pxc-1, and then removed the cluster entirely. After that I started the cluster again, and it recovered. Could you please check it on 1.8.0 on your end too?
P.S. If the node has 'bootstrap=1', I don't think it is safe for the operator to change it.