Restore the safe-slave-backup behavior from before the lock-ddl implementation

Description

The behavior of STOP SLAVE changed in version 8.0.22, when the "--lock-ddl" option was added. Since that version, STOP SLAVE is executed before the InnoDB files are copied.

Could xtrabackup go back to the old behavior, executing STOP SLAVE after copying the InnoDB tables?

Environment

None

AFFECTED CS IDs

CS0029559

Activity

Marcelo Altmann February 2, 2023 at 3:34 PM

In summary:

Server Behavior:

The current server behavior of allowing STOP SLAVE to be executed if it is issued on a different session than the session that executed LTFB is correct.

STOP SLAVE will hang if any of the slave worker threads is waiting for LTFB while executing a DDL.

As soon as the session that holds LTFB executes UNLOCK TABLES, the slave worker thread can proceed with its DDL, and once the DDL completes, STOP SLAVE completes too.

Deadlock only occurs if STOP SLAVE is executed in the same session holding LTFB.
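
The server behavior summarized above can be pictured as a wait-for graph. The following is only a toy sketch, not server code: the node names (backup_session, replica_worker, stop_slave) and the has_cycle helper are hypothetical, and an edge A -> B is assumed to mean "A cannot finish until B finishes".

```python
# Toy wait-for graph model (illustrative only, not MySQL/Percona Server code).
# An edge A -> B means "A cannot finish until B finishes".

def has_cycle(waits_for):
    """Return True if the wait-for graph contains a cycle, i.e. a deadlock."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node that is still on the current path
        visiting.add(node)
        for nxt in waits_for.get(node, ()):
            if visit(nxt):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(node) for node in list(waits_for))


# STOP SLAVE issued from a different session than the one holding LTFB:
# it waits for the worker, which waits for the lock holder to UNLOCK TABLES.
separate_session = {
    "stop_slave": ["replica_worker"],
    "replica_worker": ["backup_session"],
}

# STOP SLAVE issued from the session holding LTFB: that session cannot
# reach UNLOCK TABLES until its own STOP SLAVE returns.
same_session = {
    "backup_session": ["stop_slave"],
    "stop_slave": ["replica_worker"],
    "replica_worker": ["backup_session"],
}

print(has_cycle(separate_session))  # hang that ends after UNLOCK TABLES
print(has_cycle(same_session))      # circular wait
```

In this model the separate-session case is a chain that unwinds once the lock holder releases LTFB, while the same-session case closes the loop.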

 

Percona Xtrabackup Behavior:

The current PXB behavior is correct. If we have to execute STOP SLAVE, it has to be done before LTFB/LIFB; otherwise it will be blocked by the global instance lock acquired in the same session that executes STOP SLAVE. Even if we open a new connection in PXB to execute STOP SLAVE before copying non-InnoDB tables, we can still deadlock if a slave worker thread is waiting on LTFB to execute a DDL: STOP SLAVE will hang waiting for PXB to UNLOCK TABLES, which only happens at the end of the backup.

 

We will continue to execute STOP SLAVE before LTFB/LIFB.
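
The ordering argument in this comment can be sketched as a small simulation. This is not Percona XtraBackup code: the run_backup function, the step lists, and the replica_ddl_pending flag are hypothetical simplifications, and the model assumes the backup does not move on to the next step until STOP SLAVE returns.

```python
# Simplified model of backup step ordering (illustrative only, not PXB code).

def run_backup(steps, replica_ddl_pending):
    """Walk the steps in order; return the step that would hang, or None."""
    lock_held = False
    for step in steps:
        if step == "LOCK TABLES FOR BACKUP":
            lock_held = True
        elif step == "UNLOCK TABLES":
            lock_held = False
        elif step == "STOP SLAVE":
            # A replica worker stuck on a DDL cannot finish while the lock is
            # held, STOP SLAVE waits for that worker, and the backup never
            # reaches UNLOCK TABLES because it is waiting for STOP SLAVE.
            if lock_held and replica_ddl_pending:
                return step
    return None


# Assumed simplification of the current order: STOP SLAVE before the lock.
current_order = [
    "STOP SLAVE",
    "LOCK TABLES FOR BACKUP",
    "copy InnoDB files",
    "copy non-InnoDB files",
    "UNLOCK TABLES",
]

# Assumed simplification of the requested order: STOP SLAVE after the
# InnoDB copy, while the backup lock is already held.
requested_order = [
    "LOCK TABLES FOR BACKUP",
    "copy InnoDB files",
    "STOP SLAVE",
    "copy non-InnoDB files",
    "UNLOCK TABLES",
]

print(run_backup(current_order, replica_ddl_pending=True))
print(run_backup(requested_order, replica_ddl_pending=True))
```

Under these assumptions the requested order hangs only when a replicated DDL is waiting on the backup lock, which is why the problem may not show up in every test run.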

 

Yura Sorokin January 12, 2023 at 2:42 PM

In my opinion this is a bug.
Similarly to LIFB, "STOP SLAVE" under LTFB should not be allowed on any session.
This needs to be fixed in the PS code.

Marcelo Altmann January 9, 2023 at 3:13 PM

Can you please validate/explain why we allow STOP SLAVE under LTFB on a different session? This is most likely a bug that has to be fixed on the server side.

FYI, this was introduced at . Upstream had the same issue with LIFB and they block all sessions - see https://github.com/mysql/mysql-server/commit/7e75ca24a273bd2e1aaa662e6712025c4dd8e0ca

Sveta Smirnova November 18, 2022 at 4:19 PM

Hi

If LOCK TABLES FOR BACKUP; and STOP SLAVE; run in one connection, they conflict. However, it is possible to run STOP SLAVE in a separate connection without any issue.

In another connection:

Check replica status in any of them:

LOCK INSTANCE FOR BACKUP blocks the STOP REPLICA/SLAVE operation no matter which connection it is called from:

Another connection:

Marcelo Altmann November 15, 2022 at 4:59 PM

Hi

I implemented the work required on this ticket; however, it seems that test 3 is actually blocked too:

 

Can you please check again?

Won't Do

Details

Created September 8, 2022 at 5:39 PM
Updated July 22, 2024 at 1:18 PM
Resolved February 2, 2023 at 3:34 PM