Implement automatic SST for joiners or --tc-heuristic-recover=COMMIT

Description

With PXC 5.7, WSREP position recovery can fail with the error “[ERROR] Found 2 prepared transactions! It means that mysqld was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash.”

This puts the pod into the CrashLoopBackOff state.

{"log":"2024-01-20T06:11:18.520078Z 0 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.7.36-39 started; log sequence number 129250032827677\n","file":"/var/lib/mysql/wsrep_recovery_verbose.log"}
{"log":"2024-01-20T06:11:18.520096Z 0 [Warning] InnoDB: Skipping buffer pool dump/restore during wsrep recovery.\n","file":"/var/lib/mysql/wsrep_recovery_verbose.log"}
{"log":"2024-01-20T06:11:18.520386Z 0 [Note] Plugin 'FEDERATED' is disabled.\n","file":"/var/lib/mysql/wsrep_recovery_verbose.log"}
{"log":"2024-01-20T06:11:18.525864Z 0 [Note] InnoDB: Starting recovery for XA transactions...\n","file":"/var/lib/mysql/wsrep_recovery_verbose.log"}
{"log":"2024-01-20T06:11:18.525881Z 0 [Note] InnoDB: Transaction 159171881702 in prepared state after recovery\n","file":"/var/lib/mysql/wsrep_recovery_verbose.log"}
{"log":"2024-01-20T06:11:18.525885Z 0 [Note] InnoDB: Transaction contains changes to 1 rows\n","file":"/var/lib/mysql/wsrep_recovery_verbose.log"}
{"log":"2024-01-20T06:11:18.525888Z 0 [Note] InnoDB: Transaction 159171881701 in prepared state after recovery\n","file":"/var/lib/mysql/wsrep_recovery_verbose.log"}
{"log":"2024-01-20T06:11:18.525891Z 0 [Note] InnoDB: Transaction contains changes to 1 rows\n","file":"/var/lib/mysql/wsrep_recovery_verbose.log"}
{"log":"2024-01-20T06:11:18.525894Z 0 [Note] InnoDB: 2 transactions in prepared state after recovery\n","file":"/var/lib/mysql/wsrep_recovery_verbose.log"}
{"log":"2024-01-20T06:11:18.525897Z 0 [Note] Found 2 prepared transaction(s) in InnoDB\n","file":"/var/lib/mysql/wsrep_recovery_verbose.log"}
{"log":"2024-01-20T06:11:18.525908Z 0 [Warning] WSREP: Discovered discontinuity in recovered wsrep transaction XIDs. Truncating the recovery list to 0 entries\n","file":"/var/lib/mysql/wsrep_recovery_verbose.log"}
{"log":"2024-01-20T06:11:18.525911Z 0 [Note] WSREP: Last wsrep seqno to be recovered 79670565700\n","file":"/var/lib/mysql/wsrep_recovery_verbose.log"}
{"log":"2024-01-20T06:11:18.525949Z 0 [ERROR] Found 2 prepared transactions! It means that mysqld was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start mysqld with --tc-heuristic-recover switch to commit or rollback pending transactions.\n","file":"/var/lib/mysql/wsrep_recovery_verbose.log"}

It would be great to have automatic recovery for such cases, either via SST or by starting wsrep recovery with --tc-heuristic-recover=COMMIT (if PXC developers can confirm that it’s safe). I’ve made multiple attempts to reproduce the issue but have not been able to simulate the production problem. The same problem is also reported in https://perconadev.atlassian.net/browse/K8SPXC-913
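
As a starting point for the automatic handling requested here, the detection step could look roughly like the sketch below. It only scans the wsrep recovery log for the error quoted in this ticket; it does not run mysqld. The function names are hypothetical, and returning --tc-heuristic-recover=COMMIT as the extra argument is an assumption that still depends on PXC developers confirming it is safe (forcing SST is the alternative).

```python
import json
import re

# Matches the mysqld error quoted in this ticket.
PREPARED_XA_ERROR = re.compile(r"\[ERROR\] Found (\d+) prepared transactions!")


def count_prepared_transactions(log_lines):
    """Return the prepared-transaction count reported by mysqld, or 0 if absent.

    Each line may be a raw mysqld log line or a JSON wrapper with a "log" key,
    like the entries in wsrep_recovery_verbose.log shown in this ticket.
    """
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        message = line
        if line.startswith("{"):
            try:
                message = str(json.loads(line).get("log", ""))
            except json.JSONDecodeError:
                message = line  # not a single JSON object: match the raw text
        match = PREPARED_XA_ERROR.search(message)
        if match:
            return int(match.group(1))
    return 0


def extra_recovery_args(log_lines):
    """Hypothetical policy: retry recovery with heuristic COMMIT if the error is seen.

    ASSUMPTION: whether COMMIT is safe here is unconfirmed (see the ticket text).
    """
    if count_prepared_transactions(log_lines) > 0:
        return ["--tc-heuristic-recover=COMMIT"]
    return []
```

An entrypoint script could read /var/lib/mysql/wsrep_recovery_verbose.log after a failed recovery, pass its lines to extra_recovery_args, and decide from the result whether to retry recovery or fall back to SST.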

Environment

None

AFFECTED CS IDs

CS0043398

Activity

Nickolay Ihalainen January 29, 2024 at 5:56 PM

In the past it was possible to use the unsafe bootstrap option:

https://github.com/percona/percona-docker/blob/pxc-operator-1.13.0/percona-xtradb-cluster-5.7/dockerdir/unsafe-bootstrap.sh

But this code is dead with PXC operator 1.13.0. Even with crVersion 1.9.0, /var/lib/mysql/unsafe-bootstrap.sh is used (it comes from the pxc8 image and contains no tc-recover).

There are no reports of such errors for PXC8, so probably only PXC57 is affected by this issue.

Details

Assignee

Reporter

Needs QA

Yes

Affects versions

Priority

Created January 29, 2024 at 4:55 PM
Updated September 11, 2024 at 11:07 AM
