LP #1096274: pt-archiver misses data when only copying data
Description
Environment
Smart Checklist
Activity

lpjirasync January 24, 2018 at 2:18 PM
**Comment from Launchpad by: Carlos Salguero on: 17-11-2016 18:00:01
This is the default behavior.
From the docs:
--[no]safe-auto-increment
default: yes
Do not archive row with max AUTO_INCREMENT.
Adds an extra WHERE clause to prevent pt-archiver from removing the newest row when ascending a single-column AUTO_INCREMENT key. This guards against re-using AUTO_INCREMENT values if the server restarts, and is enabled by default.
The extra WHERE clause contains the maximum value of the auto-increment column as of the beginning of the archive or purge job. If new rows are inserted while pt-archiver is running, it will not see them.

lpjirasync January 24, 2018 at 2:18 PM
**Comment from Launchpad by: Jesse McLaughlin on: 17-11-2016 14:14:41
Although if the test case listed above was added to the current code, it would also break. So either way you have broken tests.

lpjirasync January 24, 2018 at 2:18 PM
**Comment from Launchpad by: Carlos Salguero on: 17-11-2016 13:43:20
That 'fix' breaks all the tests

lpjirasync January 24, 2018 at 2:18 PM
**Comment from Launchpad by: Jesse McLaughlin on: 06-10-2016 10:53:29
FYI,
With reference to earlier in this thread, the following patch fixes the test case above:
$ diff pt-archiver pt-archiver-patched
6247c6247
< $first_sql .= " AND ($col < " . $q->quote_val($val) . ")";
—
> $first_sql .= " AND ($col <= " . $q->quote_val($val) . ")";

lpjirasync January 24, 2018 at 2:18 PM
**Comment from Launchpad by: Jesse McLaughlin on: 04-10-2016 14:30:13
Here's a test case that reproduces this issue:
STEP 1. Create some test data (create table and inserts below):
$ mysql tmp
Server version: 5.5.37-MariaDB-log MariaDB Server
MariaDB [tmp]>
CREATE TABLE `tiger_pony` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`ref` varchar(32) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
insert into tiger_pony (ref) values ('Timber');
insert into tiger_pony (ref) values ('Shiny');
select * from tiger_pony;
--------+
id | ref |
--------+
1 | Timber |
2 | Shiny |
--------+
STEP 2. Run the Percona archiver utility on the table:
$ /opt/percona-toolkit-2.2.17/bin/pt-archiver -u <user> -p <password> --source h=<host>,D=tmp,t=tiger_pony --where 'true' --no-delete --limit 1000 --commit-each --file ./tmp.txt
$ cat tmp.txt
1 Timber
So the original select gives two rows, and the Percona archiver utility only outputs one.
Details
Assignee
UnassignedUnassignedReporter
lpjirasynclpjirasync(Deactivated)Priority
High
Details
Details
Assignee
Reporter

Priority
Smart Checklist
Open Smart Checklist
Smart Checklist
Open Smart Checklist
Smart Checklist

**Reported in Launchpad by Pritam Mungse last update 17-11-2016 18:00:56
when using '--no-delete' option in pt-archiver (to only copy data to destination) if the where clause column can have multiple records then pt-archiver will miss records in copying.
e.g. create table with following columns:
id, timestamp
data:
001-25941882
1280490242452
001-25941883
1280490242452
001-25941884
1280490242452
001-25941885
1280490242452
001-25941886
1280495048106
001-25941887
1280495048108
001-25942993
1280495048108
001-25942994
1280495048108
001-25942995
1280490776956
001-25942996
1280493558772
use batch size of 2 (limit) and with commit-each option, the pt-archiver will miss copying few records.
pt version- 2.1.7
mysql version - 5.1.40
command:
/usr/bin/pt-archiver --source h=127.0.0.1,P=3306,D=test_db_1,u=root,p=xxx,t=test,i=idx_timestamp,A=utf8 --dest h=127.0.0.1,P=3306,D=test_db_2,u=root,p=xxx,t=test,A=utf8 --where "created < 1280493558772 " --limit 2--sentinel /tmp/sentinel --set-vars "SQL_MODE=''" --no-delete --commit-each --skip-foreign-key-checks --no-safe-auto-increment --why-quit --statistics --progress 2
It's pretty easy to reproduce and happens due to optimized SQL (which add in above case timestamp > [max value in last run]) in the next sql clause - which cause the next run to miss few records.