Donor or backup node can be blocked for writes completely by DDL or DCL queries
Description
Environment
None
AFFECTED CS IDs
261917
Activity
Julia Vural March 4, 2025 at 9:26 PM
It appears that this issue is no longer being worked on, so we are closing it for housekeeping purposes. If you believe the issue still exists, please open a new ticket after confirming it's present in the latest release.
Truong Hua June 24, 2021 at 5:29 AM
I have the same issue, and I think these forum threads are related:
https://forums.percona.com/t/donor-node-can-not-serve-write-query-while-sst/11010
https://forums.percona.com/t/donor-replication-queue-overflow-during-sst/6651/10
Won't Do
Details
Assignee
Unassigned
Reporter
Przemyslaw Malkowski
Time tracking
2d 4h logged
Sprint
None
Created July 25, 2019 at 1:17 PM
Updated March 4, 2025 at 9:26 PM
Resolved March 4, 2025 at 9:26 PM
The XtraBackup-based SST method is described as "non-blocking", and XtraBackup on PXC uses lightweight backup locks. Normal writes (DML) therefore remain possible, but all queries handled by TOI (DDL, DCL, etc.) must wait for the backup lock.
Unfortunately, the way Galera/PXC handles TOI leaves the whole node blocked for writes, because a DDL query that is waiting for the backup lock is also holding the Galera commit monitor.
In addition, TOI takes ownership of those queries, so there is no way to cancel or kill them.
This leaves the donor (or a node used for backups) completely blocked for writes, which is far from ideal, especially since writes unrelated to the blocked DDL, including those coming from replication, are blocked as well.
An example situation may look like this:
*************************** 4. row ***************************
           Id: 6
         User: root
         Host: localhost
           db: NULL
      Command: Query
         Time: 275
        State: Waiting for backup lock
         Info: CREATE USER 'test'@'10.0.0.10'
    Rows_sent: 0
Rows_examined: 0
*************************** 5. row ***************************
           Id: 7
         User: root
         Host: localhost
           db: test
      Command: Query
         Time: 224
        State: wsrep: initiating pre-commit for write set (685489)
         Info: delete from t1 limit 1
    Rows_sent: 0
Rows_examined: 1
5 rows in set (0.00 sec)
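For monitoring purposes, the pattern in this output can be detected mechanically. The following is only a minimal sketch, not part of PXC or any Percona tool: the helper names `parse_vertical` and `find_toi_stall`, the `TOI_PREFIXES` list, and the `SAMPLE` text are hypothetical, and the state strings are taken verbatim from the example above, so other versions may word them differently.

```python
import re

# Statement prefixes assumed to be executed in TOI (DDL/DCL). This list is an
# illustrative assumption, not an exhaustive definition of what TOI covers.
TOI_PREFIXES = ("CREATE", "ALTER", "DROP", "TRUNCATE", "RENAME", "GRANT", "REVOKE")

# Matches the row separator of vertical-format processlist output.
ROW_HEADER = re.compile(r"^\*+ \d+\. row \*+$")


def parse_vertical(text):
    """Parse vertical-format processlist output into a list of dicts."""
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if ROW_HEADER.match(line):
            rows.append({})
        elif rows and ": " in line:
            # Split on the first ": " only, since states such as
            # "wsrep: initiating ..." contain the separator themselves.
            key, _, value = line.partition(": ")
            rows[-1][key] = value
    return rows


def find_toi_stall(rows):
    """Return (blockers, victims) for the situation described in this ticket.

    blockers: TOI-style statements waiting for the backup lock.
    victims:  writes stuck in the wsrep pre-commit stage.
    """
    blockers = [
        r for r in rows
        if r.get("State") == "Waiting for backup lock"
        and r.get("Info", "").upper().startswith(TOI_PREFIXES)
    ]
    victims = [
        r for r in rows
        if r.get("State", "").startswith("wsrep: initiating pre-commit")
    ]
    return blockers, victims


# Abbreviated copy of the two rows shown in the ticket.
SAMPLE = """
*************************** 4. row ***************************
           Id: 6
         Time: 275
        State: Waiting for backup lock
         Info: CREATE USER 'test'@'10.0.0.10'
*************************** 5. row ***************************
           Id: 7
         Time: 224
        State: wsrep: initiating pre-commit for write set (685489)
         Info: delete from t1 limit 1
"""

if __name__ == "__main__":
    blockers, victims = find_toi_stall(parse_vertical(SAMPLE))
    for b in blockers:
        print(f"TOI statement waiting on backup lock: id={b['Id']} ({b['Info']})")
    for v in victims:
        print(f"write stuck in pre-commit: id={v['Id']} ({v['Info']})")
```

A check like this only reports the stall. As noted above, the blocked TOI query itself cannot be cancelled or killed.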
Maybe this behavior could be improved by moving the backup lock handling above the TOI handling, so that the Galera commit monitor is not blocked and unrelated DML writes can still proceed?