PXC concurrent backups - metadata lock hang with xtrabackup_history

Description

Hi Percona,

We have found a hang when running XtraBackup concurrently on multiple PXC nodes.
We cannot always reproduce the problem, but when it happens we see "metadata lock" on the wsrep_applier thread, and the only way to recover is to kill one of the xtrabackup processes.

The metadata lock corresponds to the "CREATE DATABASE IF NOT EXISTS PERCONA_SCHEMA" statement in the XtraBackup code.
This makes us suspect that TOI (Total Order Isolation) is causing the issue: the statement takes a lock that conflicts with the backups running on the other nodes, even when the object already exists.

Reproduction Scenario
PXC - 3 nodes
Run a full XtraBackup backup with xtrabackup_history enabled on all 3 nodes at the same time
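
The scenario above could be driven from a single control host with a small script like the one below. This is only a sketch: the node names, the target directory and the use of passwordless ssh are assumptions for illustration, and connection options (user, password, socket) are left to each node's own configuration. The `--backup`, `--history` and `--target-dir` options are the standard xtrabackup ones.

```python
import subprocess
from typing import List

# Hypothetical host names; replace with the real PXC nodes.
NODES = ["pxc-node1", "pxc-node2", "pxc-node3"]


def backup_command(node: str, target_dir: str) -> List[str]:
    """Build the ssh command that runs a full backup with history recording."""
    return [
        "ssh", node,
        "xtrabackup", "--backup", "--history",
        f"--target-dir={target_dir}",
    ]


def run_concurrently(nodes: List[str], target_dir: str) -> List[int]:
    """Start every backup before waiting on any, so that they overlap in time."""
    procs = [subprocess.Popen(backup_command(n, target_dir)) for n in nodes]
    return [p.wait() for p in procs]
```

Calling `run_concurrently(NODES, "/backups/full")` (a hypothetical path) starts the three backups together. Since the hang is intermittent, the run may need to be repeated several times before it shows up.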

Fix:
Execute the DDL only if the object doesn't already exist:

  1. Check if database PERCONA_SCHEMA exists - create it if not

  2. Check if table XTRABACKUP_HISTORY exists - create it if not

  3. Check if column BINLOG_POS has the required type - alter it if not (I believe we don't need this migration anymore)
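
To illustrate the idea behind steps 1 and 2, here is a minimal sketch in Python. It is not the XtraBackup patch itself (which is C++). The `query` callable, the `%s` placeholder style and the lowercase table name are assumptions made for illustration, and the CREATE TABLE statement is passed in rather than reproduced here. Step 3 is omitted because the required column type is not stated in this report.

```python
from typing import Callable, List, Sequence

# Hypothetical helper type: runs a read-only query and returns the result rows.
QueryFn = Callable[[str, Sequence[str]], List[tuple]]

SCHEMA = "PERCONA_SCHEMA"
TABLE = "xtrabackup_history"  # assumed on-disk (lowercase) name of the history table


def pending_ddl(query: QueryFn, create_table_sql: str) -> List[str]:
    """Return only the DDL statements that are still needed.

    The existence checks are plain SELECTs on information_schema,
    so no DDL is issued when the objects are already in place.
    """
    ddl: List[str] = []

    # Step 1: does the schema exist?
    if not query(
        "SELECT 1 FROM information_schema.SCHEMATA WHERE SCHEMA_NAME = %s",
        (SCHEMA,),
    ):
        ddl.append(f"CREATE DATABASE IF NOT EXISTS {SCHEMA}")

    # Step 2: does the history table exist?
    if not query(
        "SELECT 1 FROM information_schema.TABLES "
        "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s",
        (SCHEMA, TABLE),
    ):
        ddl.append(create_table_sql)

    return ddl
```

When both objects already exist the function returns an empty list, so a backup on an already-initialised cluster would send no DDL at all and only the very first backup would go through TOI.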

With these changes the hang no longer occurs. I will raise a pull request so you can review them.

Environment

Percona XtraDB Cluster 8.0.36 - 3 nodes - RHEL8
Percona XtraBackup 8.0.35-31

Activity

Aaditya Dubey September 6, 2024 at 11:56 AM

Hi

Thank you for the report.
This looks related to the following issues:
https://perconadev.atlassian.net/browse/PXB-3084
https://perconadev.atlassian.net/browse/PXC-2593

Thank you for the contribution. It will be reviewed, and you will be notified accordingly.

Details

Needs QA: Yes

Created September 3, 2024 at 9:42 AM
Updated January 22, 2025 at 12:57 PM
