pt-online-schema-change race overwrites new_table_name

Description

When pt-osc is run with --history, the query that updates the history entry with new_table_name does not constrain the UPDATE with a key, so it overwrites ALL entries in the history table. If you execute multiple migrations in parallel, they end up stepping on each other. If you then resume a migration, you can end up in a scenario where only a portion of the key space is copied to the new table, which is then swapped in.

The problem is this stanza of code starting on line 9689 (v3.6.0):


if ( $o->get('history') ) {
    my $sth = $cxn->dbh()->prepare(
        "UPDATE ${hist_table} SET new_table_name = ?"
    );
    $sth->execute($new_tbl->{tbl});
}

I have made and tested this change, which appears to do the correct thing now:

if ( $o->get('history') ) {
    my $sth = $cxn->dbh()->prepare(
        "UPDATE ${hist_table} SET new_table_name = ? WHERE job_id = ?"
    );
    $sth->execute($new_tbl->{tbl}, $job_id);
}
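To illustrate why the missing WHERE clause matters, here is a minimal sketch that reproduces the effect in isolation. It uses Python's built-in sqlite3 module rather than the tool's Perl/DBI code and MySQL, and the history table layout, the job IDs, and the table names (_orders_new, _users_new) are assumptions made up for the demonstration; only the new_table_name and job_id columns come from the snippets above.

```python
import sqlite3


def make_history(conn):
    # Hypothetical, simplified history table: one row per migration job.
    conn.execute(
        "CREATE TABLE history (job_id INTEGER PRIMARY KEY, new_table_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO history (job_id, new_table_name) VALUES (?, ?)",
        [(1, None), (2, None)],
    )


def set_name_unkeyed(conn, name):
    # Mirrors the original statement: no WHERE clause, so every row is updated.
    conn.execute("UPDATE history SET new_table_name = ?", (name,))


def set_name_keyed(conn, name, job_id):
    # Mirrors the proposed change: only the row for this job is updated.
    conn.execute(
        "UPDATE history SET new_table_name = ? WHERE job_id = ?",
        (name, job_id),
    )


def names(conn):
    # Return {job_id: new_table_name} for all history rows.
    rows = conn.execute(
        "SELECT job_id, new_table_name FROM history ORDER BY job_id"
    )
    return dict(rows)


# Two "parallel" jobs recording their working table with the unkeyed UPDATE.
buggy = sqlite3.connect(":memory:")
make_history(buggy)
set_name_unkeyed(buggy, "_orders_new")  # job 1
set_name_unkeyed(buggy, "_users_new")   # job 2 clobbers job 1's entry
print(names(buggy))

# The same two jobs with the keyed UPDATE.
fixed = sqlite3.connect(":memory:")
make_history(fixed)
set_name_keyed(fixed, "_orders_new", 1)
set_name_keyed(fixed, "_users_new", 2)
print(names(fixed))
```

With the unkeyed statement, job 1's row ends up pointing at job 2's working table, which is the state a later resume of job 1 would read. With the keyed statement, each job's row keeps its own table name.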

We observed this behavior while testing the resume functionality. The log output showed ___table_new, but the history table contained __table_new, which came from a different invocation of pt-osc. As a result, approximately 1.9 million keys were copied into the ___table_new working table, while 2.1 million keys were copied into a table called __table_new that was missing the column being added by the migration. The final table contained 2.1 million keys and lacked the column the migration added.

Environment

None

Activity

Sveta Smirnova March 13, 2025 at 4:50 PM

https://perconadev.atlassian.net/browse/PT-2355 is not the same issue. This one is about the wrong table name being updated, and that one is about an error when resuming a job with NULL boundaries.

Aaditya Dubey February 3, 2025 at 12:21 PM

Hi

Thank you for the report.
This issue is known to us and is being tracked at https://perconadev.atlassian.net/browse/PT-2355. However, we are not marking this task as a duplicate because it contains the piece of code where the issue occurs. It should be fixed in the upcoming PT release.

Done

Details

Needs QA

Yes

Created January 27, 2025 at 10:24 PM
Updated 2 days ago
Resolved 2 days ago