It seems that HAProxy is exposed at the time when PITR restore pod is running and if they have conflicting transactions it would lead to PITR pod stuck on duplicate key error.
Test:
Create database and table
CREATE DATABASE test;
CREATE TABLE test.t1(
a int not null auto_increment primary key,
b timestamp
);
Create one script that inserts data and execute
#!/bin/bash
MYSQL="mysql -h 192.168.0.116 -P 3306 -uroot -pROOTPASSWORD -BN -e "
while [ 1 ]
do
$MYSQL "INSERT INTO test.t1(b) VALUES(NOW())"
sleep 1
echo "inserting..."
done
Create one script that shows latest inserts and execute
#!/bin/bash
MYSQL="mysql -h 192.168.0.116 -P 3306 -uroot -pROOTPASSWORD -e "
while [ 1 ]
do
$MYSQL "SELECT * FROM test.t1 ORDER BY b DESC LIMIT 20"
sleep 1
date
done
mysql: [Warning] Using a password on the command line interface can be insecure.
Monday, 15 July, 2024 03:35:45 PM PST
mysql: [Warning] Using a password on the command line interface can be insecure.
Monday, 15 July, 2024 03:35:47 PM PST
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.116:3306' (111)
inserting...
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.116:3306' (111)
inserting...
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.116:3306' (111)
inserting...
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.116:3306' (111)
Restore process has begun. However, strangely, HAProxy has been allowed external access to the MySQL at the same time when PITR restore job is running at the very bottom of this output:
From the logs, you can see that the restore pod is still running but is not able to apply PITR due to duplicate key error:
# kubectl -n everest logs pitr-job-restore-9se-mysql-3fu-w2f6n
2024/07/15 07:39:12 run recover
2024/07/15 07:39:13 current gtid set is 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68
2024/07/15 07:39:13 checking current file name binlog_1721028875_f7ff19c6e318624d5ee64c0dda415af9 gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:229-252
2024/07/15 07:39:14 Checking sub result binlog gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:229-252 sub result 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68
2024/07/15 07:39:14 checking current file name binlog_1721028810_78116a2a61f82d5c90af78ed04cf2e22 gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:188-228
2024/07/15 07:39:14 Checking sub result binlog gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:188-228 sub result 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68
2024/07/15 07:39:14 checking current file name binlog_1721028746_ef2980fc327fb0f60c22aaabd5892e57 gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:149-187
2024/07/15 07:39:14 Checking sub result binlog gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:149-187 sub result 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68
2024/07/15 07:39:14 checking current file name binlog_1721028683_ba1298b3bafd2606acbc07300b85fe24 gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:109-148
2024/07/15 07:39:14 Checking sub result binlog gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:109-148 sub result 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68
2024/07/15 07:39:14 checking current file name binlog_1721028643_3d1ae7694c9347c1cdff82a9b56616b1 gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:69-108
2024/07/15 07:39:14 Checking sub result binlog gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:69-108 sub result 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68
2024/07/15 07:39:14 checking current file name binlog_1721028619_68a0725b668103e2f6a008776a5d70b6 gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:49-68
2024/07/15 07:39:14 Checking sub result binlog gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:49-68 sub result 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-48
2024/07/15 07:39:14 Running /usr/bin/mysql -h mysql-3fu-pxc-0.mysql-3fu-pxc.everest.svc.cluster.local -P 33062 -u xtrabackup
2024/07/15 07:39:14 working with binlog_1721028619_68a0725b668103e2f6a008776a5d70b6, 6 out of 6 remaining
2024/07/15 07:39:14 Running /usr/bin/sh -c mysqlbinlog --disable-log-bin --stop-datetime="2024-07-15 07:34:00" -
2024/07/15 07:39:14 working with binlog_1721028643_3d1ae7694c9347c1cdff82a9b56616b1, 5 out of 6 remaining
2024/07/15 07:39:14 Running /usr/bin/sh -c mysqlbinlog --disable-log-bin --stop-datetime="2024-07-15 07:34:00" -
2024/07/15 07:39:14 working with binlog_1721028683_ba1298b3bafd2606acbc07300b85fe24, 4 out of 6 remaining
2024/07/15 07:39:14 Running /usr/bin/sh -c mysqlbinlog --disable-log-bin --stop-datetime="2024-07-15 07:34:00" -
2024/07/15 07:39:14 working with binlog_1721028746_ef2980fc327fb0f60c22aaabd5892e57, 3 out of 6 remaining
2024/07/15 07:39:14 Running /usr/bin/sh -c mysqlbinlog --disable-log-bin --stop-datetime="2024-07-15 07:34:00" -
ERROR 1062 (23000) at line 622: Duplicate entry '66' for key 't1.PRIMARY'
# kubectl get pods -n everest
NAME READY STATUS RESTARTS AGE
percona-xtradb-cluster-operator-77c5ffddf6-d7tmn 1/1 Running 0 31m
xb-cron-mysql-3fu-fullbackup-202471573020-25ss9-mwkhw 0/1 Completed 0 16m
restore-job-restore-9se-mysql-3fu-bp5bt 0/1 Completed 0 8m48s
mysql-3fu-pxc-0 1/1 Running 0 8m33s
pitr-job-restore-9se-mysql-3fu-w2f6n 1/1 Running 0 7m56s
mysql-3fu-haproxy-0 2/2 Running 0 8m30s
On Everest side, the status is still in “Restoring” state
Environment
None
Attachments
3
16 Jul 2024, 01:23 AM
16 Jul 2024, 01:23 AM
16 Jul 2024, 01:23 AM
Activity
Eleonora Zinchenko
November 21, 2024 at 10:27 AM
Hi,
Verified. Proxies are scaled down to 0 during pitr restore, pitr finishes successfully. Agreed with @ege.gunes that we will add to tests check that proxies are scaled down during pitr restore so moving task back to in progress.
Slava Sarzhan
November 8, 2024 at 10:03 AM
Hi @Jaime Sicam , thank you for task. We have improved this PITR restoration behavior. During PITR operator will not start (scale down) proxy pods. It was added into 1.16.0. You can use main image if you want to test it.
It seems that HAProxy is exposed at the time when PITR restore pod is running and if they have conflicting transactions it would lead to PITR pod stuck on duplicate key error.
Test:
Create database and table
CREATE DATABASE test; CREATE TABLE test.t1( a int not null auto_increment primary key, b timestamp );
Create one script that inserts data and execute
#!/bin/bash MYSQL="mysql -h 192.168.0.116 -P 3306 -uroot -pROOTPASSWORD -BN -e " while [ 1 ] do $MYSQL "INSERT INTO test.t1(b) VALUES(NOW())" sleep 1 echo "inserting..." done
Create one script that shows latest inserts and execute
#!/bin/bash MYSQL="mysql -h 192.168.0.116 -P 3306 -uroot -pROOTPASSWORD -e " while [ 1 ] do $MYSQL "SELECT * FROM test.t1 ORDER BY b DESC LIMIT 20" sleep 1 date done
Current Inserts at 3:30PM PHT:
Monday, 15 July, 2024 03:30:26 PM PST mysql: [Warning] Using a password on the command line interface can be insecure. +----+---------------------+ | a | b | +----+---------------------+ | 35 | 2024-07-15 07:30:25 | | 34 | 2024-07-15 07:30:23 | | 33 | 2024-07-15 07:30:22 | | 32 | 2024-07-15 07:30:20 | | 31 | 2024-07-15 07:30:19 | | 30 | 2024-07-15 07:30:17 | | 29 | 2024-07-15 07:30:16 | | 28 | 2024-07-15 07:30:14 | | 27 | 2024-07-15 07:30:13 | | 26 | 2024-07-15 07:30:11 | | 25 | 2024-07-15 07:30:09 | | 24 | 2024-07-15 07:30:07 | | 23 | 2024-07-15 07:30:05 | | 22 | 2024-07-15 07:30:03 | | 21 | 2024-07-15 07:30:02 | | 20 | 2024-07-15 07:30:00 | | 19 | 2024-07-15 07:29:58 | | 18 | 2024-07-15 07:29:57 | | 17 | 2024-07-15 07:29:55 | | 16 | 2024-07-15 07:29:54 |
Scheduled Full Backup at 3:30PM PHT:
Current inserts at 3:33PM PHT:
mysql: [Warning] Using a password on the command line interface can be insecure. +-----+---------------------+ | a | b | +-----+---------------------+ | 247 | 2024-07-15 07:33:55 | | 246 | 2024-07-15 07:33:53 | | 245 | 2024-07-15 07:33:52 | | 244 | 2024-07-15 07:33:50 | | 243 | 2024-07-15 07:33:49 | | 242 | 2024-07-15 07:33:47 | | 241 | 2024-07-15 07:33:46 | | 240 | 2024-07-15 07:33:44 | | 239 | 2024-07-15 07:33:42 | | 238 | 2024-07-15 07:33:41 | | 237 | 2024-07-15 07:33:39 | | 236 | 2024-07-15 07:33:37 | | 235 | 2024-07-15 07:33:35 | | 234 | 2024-07-15 07:33:34 | | 233 | 2024-07-15 07:33:32 | | 232 | 2024-07-15 07:33:31 | | 231 | 2024-07-15 07:33:30 | | 230 | 2024-07-15 07:33:28 | | 229 | 2024-07-15 07:33:27 | | 228 | 2024-07-15 07:33:26 | +-----+---------------------+
Stop writes and truncate table at 3:35PM PHT
mysql> truncate table test.t1; Query OK, 0 rows affected (0.02 sec)
Content of table at 3:35PM PHT
mysql: [Warning] Using a password on the command line interface can be insecure. Monday, 15 July, 2024 03:35:45 PM PST mysql: [Warning] Using a password on the command line interface can be insecure. Monday, 15 July, 2024 03:35:47 PM PST mysql: [Warning] Using a password on the command line interface can be insecure.
Apply PITR up to 3:34PM PHT
Start insert script again. Insert script cannot connect now.
mysql: [Warning] Using a password on the command line interface can be insecure. ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.116:3306' (111) inserting... mysql: [Warning] Using a password on the command line interface can be insecure. ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.116:3306' (111) inserting... mysql: [Warning] Using a password on the command line interface can be insecure. ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.116:3306' (111) inserting... mysql: [Warning] Using a password on the command line interface can be insecure. ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.116:3306' (111)
Restore process has begun. However, strangely, HAProxy has been allowed external access to the MySQL at the same time when PITR restore job is running at the very bottom of this output:
everest mysql-3fu-pitr-5bdfb8cbdb-f4fc2 1/1 Terminating 0 11m everest mysql-3fu-pitr-5bdfb8cbdb-f4fc2 1/1 Terminating 0 11m everest mysql-3fu-pitr-5bdfb8cbdb-f4fc2 0/1 Terminating 0 11m everest mysql-3fu-pitr-5bdfb8cbdb-f4fc2 0/1 Terminating 0 11m everest mysql-3fu-pitr-5bdfb8cbdb-f4fc2 0/1 Terminating 0 11m everest mysql-3fu-pxc-0 1/1 Terminating 0 14m everest mysql-3fu-haproxy-0 2/2 Terminating 0 14m everest mysql-3fu-haproxy-0 2/2 Terminating 0 14m everest mysql-3fu-haproxy-0 0/2 Terminating 0 14m everest mysql-3fu-haproxy-0 0/2 Terminating 0 14m everest mysql-3fu-haproxy-0 0/2 Terminating 0 14m everest mysql-3fu-pxc-0 1/1 Terminating 0 14m everest mysql-3fu-pxc-0 0/1 Terminating 0 14m everest mysql-3fu-pxc-0 0/1 Terminating 0 14m everest mysql-3fu-pxc-0 0/1 Terminating 0 14m everest restore-job-restore-9se-mysql-3fu-bp5bt 0/1 Pending 0 0s everest restore-job-restore-9se-mysql-3fu-bp5bt 0/1 Pending 0 0s everest restore-job-restore-9se-mysql-3fu-bp5bt 0/1 ContainerCreating 0 0s everest restore-job-restore-9se-mysql-3fu-bp5bt 0/1 ContainerCreating 0 0s everest restore-job-restore-9se-mysql-3fu-bp5bt 1/1 Running 0 3s everest restore-job-restore-9se-mysql-3fu-bp5bt 0/1 Completed 0 7s everest restore-job-restore-9se-mysql-3fu-bp5bt 0/1 Completed 0 8s everest restore-job-restore-9se-mysql-3fu-bp5bt 0/1 Completed 0 9s everest mysql-3fu-pxc-0 0/1 Pending 0 0s everest mysql-3fu-pxc-0 0/1 Pending 0 0s everest mysql-3fu-haproxy-0 0/2 Pending 0 0s everest mysql-3fu-haproxy-0 0/2 Pending 0 0s everest mysql-3fu-pxc-0 0/1 Init:0/1 0 0s everest mysql-3fu-pxc-0 0/1 Terminating 0 0s everest mysql-3fu-haproxy-0 0/2 Terminating 0 0s everest mysql-3fu-haproxy-0 0/2 Terminating 0 1s everest mysql-3fu-pxc-0 0/1 Terminating 0 2s everest mysql-3fu-haproxy-0 0/2 Terminating 0 2s everest mysql-3fu-pxc-0 0/1 Terminating 0 4s everest mysql-3fu-pxc-0 0/1 Terminating 0 5s everest mysql-3fu-pxc-0 0/1 Terminating 0 5s everest mysql-3fu-pxc-0 0/1 Terminating 0 5s everest mysql-3fu-pxc-0 0/1 Terminating 0 5s everest mysql-3fu-pxc-0 0/1 Pending 0 0s everest mysql-3fu-pxc-0 0/1 Pending 0 0s everest mysql-3fu-pxc-0 0/1 Pending 0 1s everest mysql-3fu-pxc-0 0/1 Init:0/1 0 1s everest mysql-3fu-haproxy-0 0/2 Terminating 0 7s everest mysql-3fu-haproxy-0 0/2 Terminating 0 7s everest mysql-3fu-pxc-0 0/1 Init:0/1 0 2s everest mysql-3fu-haproxy-0 0/2 Terminating 0 7s everest mysql-3fu-haproxy-0 0/2 Terminating 0 8s everest mysql-3fu-haproxy-0 0/2 Terminating 0 8s everest mysql-3fu-haproxy-0 0/2 Pending 0 0s everest mysql-3fu-haproxy-0 0/2 Pending 0 0s everest mysql-3fu-haproxy-0 0/2 Init:0/1 0 0s everest mysql-3fu-haproxy-0 0/2 Init:0/1 0 1s everest mysql-3fu-pxc-0 0/1 PodInitializing 0 7s everest mysql-3fu-haproxy-0 0/2 PodInitializing 0 6s everest mysql-3fu-pxc-0 0/1 Running 0 11s everest mysql-3fu-haproxy-0 1/2 Running 0 12s everest mysql-3fu-pxc-0 1/1 Running 0 32s everest pitr-job-restore-9se-mysql-3fu-w2f6n 0/1 Pending 0 0s everest pitr-job-restore-9se-mysql-3fu-w2f6n 0/1 Pending 0 0s everest pitr-job-restore-9se-mysql-3fu-w2f6n 0/1 ContainerCreating 0 0s everest pitr-job-restore-9se-mysql-3fu-w2f6n 0/1 ContainerCreating 0 0s everest pitr-job-restore-9se-mysql-3fu-w2f6n 1/1 Running 0 2s everest mysql-3fu-haproxy-0 2/2 Running 0 37s
From the select script you can see that full backup has been restored but not PITR. Also, new data has been inserted on the table:
bash select.sh mysql: [Warning] Using a password on the command line interface can be insecure. +----+---------------------+ | a | b | +----+---------------------+ | 67 | 2024-07-15 07:39:15 | | 66 | 2024-07-15 07:39:14 | | 65 | 2024-07-15 07:39:12 | <-- new inserts | 64 | 2024-07-15 07:30:42 | <-- full backup | 62 | 2024-07-15 07:30:41 | | 60 | 2024-07-15 07:30:40 | | 58 | 2024-07-15 07:30:39 | | 56 | 2024-07-15 07:30:38 | | 54 | 2024-07-15 07:30:37 | | 52 | 2024-07-15 07:30:36 | | 50 | 2024-07-15 07:30:35 | | 48 | 2024-07-15 07:30:34 | | 46 | 2024-07-15 07:30:33 | | 44 | 2024-07-15 07:30:32 | | 42 | 2024-07-15 07:30:31 | | 40 | 2024-07-15 07:30:30 | | 38 | 2024-07-15 07:30:28 | | 37 | 2024-07-15 07:30:27 | | 36 | 2024-07-15 07:30:26 | | 35 | 2024-07-15 07:30:25 | +----+---------------------+
From the logs, you can see that the restore pod is still running but is not able to apply PITR due to duplicate key error:
# kubectl -n everest logs pitr-job-restore-9se-mysql-3fu-w2f6n 2024/07/15 07:39:12 run recover 2024/07/15 07:39:13 current gtid set is 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68 2024/07/15 07:39:13 checking current file name binlog_1721028875_f7ff19c6e318624d5ee64c0dda415af9 gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:229-252 2024/07/15 07:39:14 Checking sub result binlog gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:229-252 sub result 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68 2024/07/15 07:39:14 checking current file name binlog_1721028810_78116a2a61f82d5c90af78ed04cf2e22 gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:188-228 2024/07/15 07:39:14 Checking sub result binlog gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:188-228 sub result 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68 2024/07/15 07:39:14 checking current file name binlog_1721028746_ef2980fc327fb0f60c22aaabd5892e57 gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:149-187 2024/07/15 07:39:14 Checking sub result binlog gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:149-187 sub result 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68 2024/07/15 07:39:14 checking current file name binlog_1721028683_ba1298b3bafd2606acbc07300b85fe24 gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:109-148 2024/07/15 07:39:14 Checking sub result binlog gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:109-148 sub result 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68 2024/07/15 07:39:14 checking current file name binlog_1721028643_3d1ae7694c9347c1cdff82a9b56616b1 gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:69-108 2024/07/15 07:39:14 Checking sub result binlog gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:69-108 sub result 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68 2024/07/15 07:39:14 checking current file name binlog_1721028619_68a0725b668103e2f6a008776a5d70b6 gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:49-68 2024/07/15 07:39:14 Checking sub result binlog gtid 633cb8b4-427b-11ef-85da-fe4687ec9e6f:49-68 sub result 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-48 2024/07/15 07:39:14 Running /usr/bin/mysql -h mysql-3fu-pxc-0.mysql-3fu-pxc.everest.svc.cluster.local -P 33062 -u xtrabackup 2024/07/15 07:39:14 working with binlog_1721028619_68a0725b668103e2f6a008776a5d70b6, 6 out of 6 remaining 2024/07/15 07:39:14 Running /usr/bin/sh -c mysqlbinlog --disable-log-bin --stop-datetime="2024-07-15 07:34:00" - 2024/07/15 07:39:14 working with binlog_1721028643_3d1ae7694c9347c1cdff82a9b56616b1, 5 out of 6 remaining 2024/07/15 07:39:14 Running /usr/bin/sh -c mysqlbinlog --disable-log-bin --stop-datetime="2024-07-15 07:34:00" - 2024/07/15 07:39:14 working with binlog_1721028683_ba1298b3bafd2606acbc07300b85fe24, 4 out of 6 remaining 2024/07/15 07:39:14 Running /usr/bin/sh -c mysqlbinlog --disable-log-bin --stop-datetime="2024-07-15 07:34:00" - 2024/07/15 07:39:14 working with binlog_1721028746_ef2980fc327fb0f60c22aaabd5892e57, 3 out of 6 remaining 2024/07/15 07:39:14 Running /usr/bin/sh -c mysqlbinlog --disable-log-bin --stop-datetime="2024-07-15 07:34:00" - ERROR 1062 (23000) at line 622: Duplicate entry '66' for key 't1.PRIMARY' # kubectl get pods -n everest NAME READY STATUS RESTARTS AGE percona-xtradb-cluster-operator-77c5ffddf6-d7tmn 1/1 Running 0 31m xb-cron-mysql-3fu-fullbackup-202471573020-25ss9-mwkhw 0/1 Completed 0 16m restore-job-restore-9se-mysql-3fu-bp5bt 0/1 Completed 0 8m48s mysql-3fu-pxc-0 1/1 Running 0 8m33s pitr-job-restore-9se-mysql-3fu-w2f6n 1/1 Running 0 7m56s mysql-3fu-haproxy-0 2/2 Running 0 8m30s
On Everest side, the status is still in “Restoring” state