PITR restore hung with duplicate key error

Description

It seems that HAProxy is exposed at the time when PITR restore pod is running and if they have conflicting transactions it would lead to PITR pod stuck on duplicate key error.

Test:

  1. Create database and table

CREATE DATABASE test; CREATE TABLE test.t1(  a int not null auto_increment primary key,  b timestamp );
  1. Create one script that inserts data and execute

#!/bin/bash MYSQL="mysql -h 192.168.0.116 -P 3306 -uroot -pROOTPASSWORD -BN -e " while [ 1 ] do   $MYSQL "INSERT INTO test.t1(b) VALUES(NOW())"   sleep 1   echo "inserting..." done
  1. Create one script that shows latest inserts and execute

#!/bin/bash MYSQL="mysql -h 192.168.0.116 -P 3306 -uroot -pROOTPASSWORD -e " while [ 1 ] do $MYSQL "SELECT * FROM test.t1 ORDER BY b DESC LIMIT 20" sleep 1 date done
  1. Current Inserts at 3:30PM PHT:

Monday, 15 July, 2024 03:30:26 PM PST mysql: [Warning] Using a password on the command line interface can be insecure. +----+---------------------+ | a  | b               | +----+---------------------+ | 35 | 2024-07-15 07:30:25 | | 34 | 2024-07-15 07:30:23 | | 33 | 2024-07-15 07:30:22 | | 32 | 2024-07-15 07:30:20 | | 31 | 2024-07-15 07:30:19 | | 30 | 2024-07-15 07:30:17 | | 29 | 2024-07-15 07:30:16 | | 28 | 2024-07-15 07:30:14 | | 27 | 2024-07-15 07:30:13 | | 26 | 2024-07-15 07:30:11 | | 25 | 2024-07-15 07:30:09 | | 24 | 2024-07-15 07:30:07 | | 23 | 2024-07-15 07:30:05 | | 22 | 2024-07-15 07:30:03 | | 21 | 2024-07-15 07:30:02 | | 20 | 2024-07-15 07:30:00 | | 19 | 2024-07-15 07:29:58 | | 18 | 2024-07-15 07:29:57 | | 17 | 2024-07-15 07:29:55 | | 16 | 2024-07-15 07:29:54 |

 

  1. Scheduled Full Backup at 3:30PM PHT:

  1. Current inserts at 3:33PM PHT:

mysql: [Warning] Using a password on the command line interface can be insecure. +-----+---------------------+ | a   | b               | +-----+---------------------+ | 247 | 2024-07-15 07:33:55 | | 246 | 2024-07-15 07:33:53 | | 245 | 2024-07-15 07:33:52 | | 244 | 2024-07-15 07:33:50 | | 243 | 2024-07-15 07:33:49 | | 242 | 2024-07-15 07:33:47 | | 241 | 2024-07-15 07:33:46 | | 240 | 2024-07-15 07:33:44 | | 239 | 2024-07-15 07:33:42 | | 238 | 2024-07-15 07:33:41 | | 237 | 2024-07-15 07:33:39 | | 236 | 2024-07-15 07:33:37 | | 235 | 2024-07-15 07:33:35 | | 234 | 2024-07-15 07:33:34 | | 233 | 2024-07-15 07:33:32 | | 232 | 2024-07-15 07:33:31 | | 231 | 2024-07-15 07:33:30 | | 230 | 2024-07-15 07:33:28 | | 229 | 2024-07-15 07:33:27 | | 228 | 2024-07-15 07:33:26 | +-----+---------------------+
  1. Stop writes and truncate table at 3:35PM PHT

mysql> truncate table test.t1; Query OK, 0 rows affected (0.02 sec)
  1. Content of table at 3:35PM PHT

mysql: [Warning] Using a password on the command line interface can be insecure. Monday, 15 July, 2024 03:35:45 PM PST mysql: [Warning] Using a password on the command line interface can be insecure. Monday, 15 July, 2024 03:35:47 PM PST mysql: [Warning] Using a password on the command line interface can be insecure.
  1. Apply PITR up to 3:34PM PHT

  1. Start insert script again. Insert script cannot connect now.

mysql: [Warning] Using a password on the command line interface can be insecure. ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.116:3306' (111) inserting... mysql: [Warning] Using a password on the command line interface can be insecure. ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.116:3306' (111) inserting... mysql: [Warning] Using a password on the command line interface can be insecure. ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.116:3306' (111) inserting... mysql: [Warning] Using a password on the command line interface can be insecure. ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.0.116:3306' (111)
  1. Restore process has begun. However, strangely, HAProxy has been allowed external access to the MySQL at the same time when PITR restore job is running at the very bottom of this output:

everest          mysql-3fu-pitr-5bdfb8cbdb-f4fc2                               1/1 Terminating     0      11m everest          mysql-3fu-pitr-5bdfb8cbdb-f4fc2                               1/1 Terminating     0      11m everest          mysql-3fu-pitr-5bdfb8cbdb-f4fc2                               0/1 Terminating     0      11m everest          mysql-3fu-pitr-5bdfb8cbdb-f4fc2                               0/1 Terminating     0      11m everest          mysql-3fu-pitr-5bdfb8cbdb-f4fc2                               0/1 Terminating     0      11m everest          mysql-3fu-pxc-0                                               1/1 Terminating     0      14m everest          mysql-3fu-haproxy-0                                           2/2 Terminating     0      14m everest          mysql-3fu-haproxy-0                                           2/2 Terminating     0      14m everest          mysql-3fu-haproxy-0                                           0/2 Terminating     0      14m everest          mysql-3fu-haproxy-0                                           0/2 Terminating     0      14m everest          mysql-3fu-haproxy-0                                           0/2 Terminating     0      14m everest          mysql-3fu-pxc-0                                               1/1 Terminating     0      14m everest          mysql-3fu-pxc-0                                               0/1 Terminating     0      14m everest          mysql-3fu-pxc-0                                               0/1 Terminating     0      14m everest          mysql-3fu-pxc-0                                               0/1 Terminating     0      14m everest          restore-job-restore-9se-mysql-3fu-bp5bt                       0/1 Pending         0      0s everest          restore-job-restore-9se-mysql-3fu-bp5bt                       0/1 Pending         0      0s everest          restore-job-restore-9se-mysql-3fu-bp5bt                       0/1 ContainerCreating   0      0s everest          restore-job-restore-9se-mysql-3fu-bp5bt                       0/1 ContainerCreating   0      0s everest          restore-job-restore-9se-mysql-3fu-bp5bt                       1/1 Running         0      3s everest          restore-job-restore-9se-mysql-3fu-bp5bt                       0/1 Completed       0      7s everest          restore-job-restore-9se-mysql-3fu-bp5bt                       0/1 Completed       0      8s everest          restore-job-restore-9se-mysql-3fu-bp5bt                       0/1 Completed       0      9s everest          mysql-3fu-pxc-0                                               0/1 Pending         0      0s everest          mysql-3fu-pxc-0                                               0/1 Pending         0      0s everest          mysql-3fu-haproxy-0                                           0/2 Pending         0      0s everest          mysql-3fu-haproxy-0                                           0/2 Pending         0      0s everest          mysql-3fu-pxc-0                                               0/1 Init:0/1        0      0s everest          mysql-3fu-pxc-0                                               0/1 Terminating     0      0s everest          mysql-3fu-haproxy-0                                           0/2 Terminating     0      0s everest          mysql-3fu-haproxy-0                                           0/2 Terminating     0      1s everest          mysql-3fu-pxc-0                                               0/1 Terminating     0      2s everest          mysql-3fu-haproxy-0                                           0/2 Terminating     0      2s everest          mysql-3fu-pxc-0                                               0/1 Terminating     0      4s everest          mysql-3fu-pxc-0                                               0/1 Terminating     0      5s everest          mysql-3fu-pxc-0                                               0/1 Terminating     0      5s everest          mysql-3fu-pxc-0                                               0/1 Terminating     0      5s everest          mysql-3fu-pxc-0                                               0/1 Terminating     0      5s everest          mysql-3fu-pxc-0                                               0/1 Pending         0      0s everest          mysql-3fu-pxc-0                                               0/1 Pending         0      0s everest          mysql-3fu-pxc-0                                               0/1 Pending         0      1s everest          mysql-3fu-pxc-0                                               0/1 Init:0/1        0      1s everest          mysql-3fu-haproxy-0                                           0/2 Terminating     0      7s everest          mysql-3fu-haproxy-0                                           0/2 Terminating     0      7s everest          mysql-3fu-pxc-0                                               0/1 Init:0/1        0      2s everest          mysql-3fu-haproxy-0                                           0/2 Terminating     0      7s everest          mysql-3fu-haproxy-0                                           0/2 Terminating     0      8s everest          mysql-3fu-haproxy-0                                           0/2 Terminating     0      8s everest          mysql-3fu-haproxy-0                                           0/2 Pending         0      0s everest          mysql-3fu-haproxy-0                                           0/2 Pending         0      0s everest          mysql-3fu-haproxy-0                                           0/2 Init:0/1        0      0s everest          mysql-3fu-haproxy-0                                           0/2 Init:0/1        0      1s everest          mysql-3fu-pxc-0                                               0/1 PodInitializing 0      7s everest          mysql-3fu-haproxy-0                                           0/2 PodInitializing 0      6s everest          mysql-3fu-pxc-0                                               0/1 Running         0      11s everest          mysql-3fu-haproxy-0                                           1/2 Running         0      12s everest          mysql-3fu-pxc-0                                               1/1 Running         0      32s everest          pitr-job-restore-9se-mysql-3fu-w2f6n                          0/1 Pending         0      0s everest          pitr-job-restore-9se-mysql-3fu-w2f6n                          0/1 Pending         0      0s everest          pitr-job-restore-9se-mysql-3fu-w2f6n                          0/1 ContainerCreating   0      0s everest          pitr-job-restore-9se-mysql-3fu-w2f6n                          0/1 ContainerCreating   0      0s everest          pitr-job-restore-9se-mysql-3fu-w2f6n                          1/1 Running         0      2s everest          mysql-3fu-haproxy-0                                           2/2 Running          0      37s
  1. From the select script you can see that full backup has been restored but not PITR. Also, new data has been inserted on the table:

bash select.sh mysql: [Warning] Using a password on the command line interface can be insecure. +----+---------------------+ | a  | b               | +----+---------------------+ | 67 | 2024-07-15 07:39:15 | | 66 | 2024-07-15 07:39:14 | | 65 | 2024-07-15 07:39:12 | <-- new inserts | 64 | 2024-07-15 07:30:42 | <-- full backup | 62 | 2024-07-15 07:30:41 | | 60 | 2024-07-15 07:30:40 | | 58 | 2024-07-15 07:30:39 | | 56 | 2024-07-15 07:30:38 | | 54 | 2024-07-15 07:30:37 | | 52 | 2024-07-15 07:30:36 | | 50 | 2024-07-15 07:30:35 | | 48 | 2024-07-15 07:30:34 | | 46 | 2024-07-15 07:30:33 | | 44 | 2024-07-15 07:30:32 | | 42 | 2024-07-15 07:30:31 | | 40 | 2024-07-15 07:30:30 | | 38 | 2024-07-15 07:30:28 | | 37 | 2024-07-15 07:30:27 | | 36 | 2024-07-15 07:30:26 | | 35 | 2024-07-15 07:30:25 | +----+---------------------+

 

  1. From the logs, you can see that the restore pod is still running but is not able to apply PITR due to duplicate key error:

# kubectl -n everest logs pitr-job-restore-9se-mysql-3fu-w2f6n 2024/07/15 07:39:12 run recover 2024/07/15 07:39:13 current gtid set is 633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68 2024/07/15 07:39:13 checking current file  name  binlog_1721028875_f7ff19c6e318624d5ee64c0dda415af9  gtid  633cb8b4-427b-11ef-85da-fe4687ec9e6f:229-252 2024/07/15 07:39:14 Checking sub result  binlog gtid  633cb8b4-427b-11ef-85da-fe4687ec9e6f:229-252  sub result  633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68 2024/07/15 07:39:14 checking current file  name  binlog_1721028810_78116a2a61f82d5c90af78ed04cf2e22  gtid  633cb8b4-427b-11ef-85da-fe4687ec9e6f:188-228 2024/07/15 07:39:14 Checking sub result  binlog gtid  633cb8b4-427b-11ef-85da-fe4687ec9e6f:188-228  sub result  633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68 2024/07/15 07:39:14 checking current file  name  binlog_1721028746_ef2980fc327fb0f60c22aaabd5892e57  gtid  633cb8b4-427b-11ef-85da-fe4687ec9e6f:149-187 2024/07/15 07:39:14 Checking sub result  binlog gtid  633cb8b4-427b-11ef-85da-fe4687ec9e6f:149-187  sub result  633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68 2024/07/15 07:39:14 checking current file  name  binlog_1721028683_ba1298b3bafd2606acbc07300b85fe24  gtid  633cb8b4-427b-11ef-85da-fe4687ec9e6f:109-148 2024/07/15 07:39:14 Checking sub result  binlog gtid  633cb8b4-427b-11ef-85da-fe4687ec9e6f:109-148  sub result  633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68 2024/07/15 07:39:14 checking current file  name  binlog_1721028643_3d1ae7694c9347c1cdff82a9b56616b1  gtid  633cb8b4-427b-11ef-85da-fe4687ec9e6f:69-108 2024/07/15 07:39:14 Checking sub result  binlog gtid  633cb8b4-427b-11ef-85da-fe4687ec9e6f:69-108  sub result  633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-68 2024/07/15 07:39:14 checking current file  name  binlog_1721028619_68a0725b668103e2f6a008776a5d70b6  gtid  633cb8b4-427b-11ef-85da-fe4687ec9e6f:49-68 2024/07/15 07:39:14 Checking sub result  binlog gtid  633cb8b4-427b-11ef-85da-fe4687ec9e6f:49-68  sub result  633cb8b4-427b-11ef-85da-fe4687ec9e6f:1-48 2024/07/15 07:39:14 Running /usr/bin/mysql -h mysql-3fu-pxc-0.mysql-3fu-pxc.everest.svc.cluster.local -P 33062 -u xtrabackup 2024/07/15 07:39:14 working with binlog_1721028619_68a0725b668103e2f6a008776a5d70b6, 6 out of 6 remaining 2024/07/15 07:39:14 Running /usr/bin/sh -c mysqlbinlog --disable-log-bin --stop-datetime="2024-07-15 07:34:00" - 2024/07/15 07:39:14 working with binlog_1721028643_3d1ae7694c9347c1cdff82a9b56616b1, 5 out of 6 remaining 2024/07/15 07:39:14 Running /usr/bin/sh -c mysqlbinlog --disable-log-bin --stop-datetime="2024-07-15 07:34:00" - 2024/07/15 07:39:14 working with binlog_1721028683_ba1298b3bafd2606acbc07300b85fe24, 4 out of 6 remaining 2024/07/15 07:39:14 Running /usr/bin/sh -c mysqlbinlog --disable-log-bin --stop-datetime="2024-07-15 07:34:00" - 2024/07/15 07:39:14 working with binlog_1721028746_ef2980fc327fb0f60c22aaabd5892e57, 3 out of 6 remaining 2024/07/15 07:39:14 Running /usr/bin/sh -c mysqlbinlog --disable-log-bin --stop-datetime="2024-07-15 07:34:00" - ERROR 1062 (23000) at line 622: Duplicate entry '66' for key 't1.PRIMARY' # kubectl get pods -n everest NAME                                                READY   STATUS  RESTARTS   AGE percona-xtradb-cluster-operator-77c5ffddf6-d7tmn    1/1 Running 0      31m xb-cron-mysql-3fu-fullbackup-202471573020-25ss9-mwkhw   0/1 Completed   0      16m restore-job-restore-9se-mysql-3fu-bp5bt             0/1 Completed   0      8m48s mysql-3fu-pxc-0                                     1/1 Running 0      8m33s pitr-job-restore-9se-mysql-3fu-w2f6n                1/1 Running 0      7m56s mysql-3fu-haproxy-0                                 2/2 Running 0      8m30s
  1. On Everest side, the status is still in “Restoring” state

Environment

None

Attachments

3

Activity

Eleonora Zinchenko 
November 21, 2024 at 10:27 AM

Hi,

Verified. Proxies are scaled down to 0 during pitr restore, pitr finishes successfully. Agreed with that we will add to tests check that proxies are scaled down during pitr restore so moving task back to in progress.

Slava Sarzhan 
November 8, 2024 at 10:03 AM

Hi , thank you for task. We have improved this PITR restoration behavior. During PITR operator will not start (scale down) proxy pods. It was added into 1.16.0. You can use main image if you want to test it.

Done

Details

Assignee

Reporter

Labels

Needs QA

Fix versions

Affects versions

Priority

Created July 16, 2024 at 1:23 AM
Updated December 19, 2024 at 8:11 PM
Resolved December 2, 2024 at 8:57 AM

Flag notifications