Backup on a busy node fails when the log information cannot be retrieved
Description
Environment
None
Details
Assignee: Marcelo Altmann (Deactivated)
Reporter: Marco Tusa
Needs QA: Yes
Created November 28, 2022 at 1:37 PM
Updated June 10, 2024 at 2:18 PM
This happens when a node is busy: not at saturation point, but busy. In the case of PXC, flow control (FC) is also happening.
The last step of the PXB backup process is to take a snapshot of the log information with:
SELECT server_uuid, local, replication, storage_engines FROM performance_schema.log_status
Unfortunately, this query can time out:
221125 17:02:39 >> log scanned up to (195884949862)
Error: failed to execute query 'SELECT server_uuid, local, replication, storage_engines FROM performance_schema.log_status': 1205 (HY000) Lock wait timeout exceeded; try restarting transaction
2022-11-25T17:02:39.851237Z 0 [ERROR] [MY-000000] [WSREP-SST] ------------ innobackup.backup.log (END) ------------
This causes the backup to be aborted at the very end of the process.
The line where this happens:
https://github.com/percona/percona-xtrabackup/blob/05b32eab68e301ab71a980ec186ff88bc12a4349/storage/innobase/xtrabackup/src/backup_mysql.cc#L1689
After a brief discussion with @Marcelo Altmann, we concluded that it could be beneficial to increase the lock wait timeout so the operation can complete, or alternatively to add a retry step.
In any case, at the moment querying the performance_schema.log_status table is not free of locking, with all the related issues.
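The retry step suggested above could look roughly like the sketch below. It is only an illustration of the idea, not PXB code (PXB itself is written in C++). The names `QueryError`, `query_with_retry` and `run_query`, and the default attempt count and delay, are hypothetical and chosen for this example. The only values taken from the report are the query text and error code 1205.

```python
import time

# MySQL error number reported in the log above (lock wait timeout exceeded).
ER_LOCK_WAIT_TIMEOUT = 1205

LOG_STATUS_QUERY = (
    "SELECT server_uuid, local, replication, storage_engines "
    "FROM performance_schema.log_status"
)


class QueryError(Exception):
    """Hypothetical error type that carries the MySQL error number."""

    def __init__(self, errno, message):
        super().__init__(f"{errno}: {message}")
        self.errno = errno


def query_with_retry(run_query, query, attempts=3, delay_seconds=1.0,
                     sleep=time.sleep):
    """Run `query` through `run_query`, retrying only on a lock wait timeout.

    `run_query` is a caller-supplied callable (an assumption of this sketch)
    that executes the query and raises QueryError on failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return run_query(query)
        except QueryError as err:
            # Any other error, or a timeout on the final attempt, is re-raised.
            if err.errno != ER_LOCK_WAIT_TIMEOUT or attempt == attempts:
                raise
            # Give the busy node some time before trying again.
            sleep(delay_seconds)
```

A caller would pass its own function that executes the statement, for example `query_with_retry(run_query, LOG_STATUS_QUERY, attempts=5)`. Retrying only on error 1205 keeps genuine failures, such as missing privileges, visible immediately instead of delaying them.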