PITR restore hung with duplicate key error

Done

Description

It seems that HAProxy remains exposed while the PITR restore pod is running. If client writes arriving through HAProxy conflict with the transactions being replayed, the PITR pod gets stuck on a duplicate key error.

Test:

  1. Create a database and a table.

  2. Create a script that inserts data, and run it.

  3. Create a script that shows the latest inserts, and run it.

  4. Current inserts at 3:30 PM PHT:

  5. Scheduled full backup at 3:30 PM PHT:

  6. Current inserts at 3:33 PM PHT:

  7. Stop writes and truncate the table at 3:35 PM PHT.

  8. Content of the table at 3:35 PM PHT:

  9. Apply PITR up to 3:34 PM PHT.

  10. Start the insert script again. The insert script cannot connect at this point.

  11. The restore process has begun. Strangely, however, HAProxy allows external access to MySQL while the PITR restore job is running, as shown at the very bottom of this output:

  12. The select script shows that the full backup has been restored but PITR has not been applied. New data has also been inserted into the table:

  13. The logs show that the restore pod is still running but cannot apply PITR because of a duplicate key error:

  14. On the Everest side, the status is still “Restoring”.
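The failure mode in the steps above can be modeled without a cluster. The following is a minimal sketch, not the reporter's scripts: it uses Python's built-in `sqlite3` module in place of MySQL, and the table `t`, the function `simulate_restore`, and the row contents are all hypothetical. It assumes that the replayed PITR events carry their original primary keys, so a client insert that arrives between the full-backup restore and the replay can claim a key that the replay needs.

```python
import sqlite3


def simulate_restore(proxy_exposed: bool) -> str:
    """Model a full-backup restore followed by PITR replay on one table.

    sqlite3 stands in for MySQL here; this only illustrates the key collision.
    """
    db = sqlite3.connect(":memory:")
    try:
        db.execute(
            "CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, note TEXT)"
        )

        # Phase 1: the full backup is restored (rows that existed at backup time).
        db.executemany(
            "INSERT INTO t (id, note) VALUES (?, ?)",
            [(1, "before backup"), (2, "before backup")],
        )

        # Phase 2: if the proxy is reachable, a client inserts a row now.
        # The auto-increment counter assigns it the next free id, which is 3.
        if proxy_exposed:
            db.execute("INSERT INTO t (note) VALUES ('client write during restore')")

        # Phase 3: PITR replays inserts logged after the backup.
        # Assumption: replayed rows keep their original ids (3 and 4).
        try:
            db.executemany(
                "INSERT INTO t (id, note) VALUES (?, ?)",
                [(3, "replayed"), (4, "replayed")],
            )
        except sqlite3.IntegrityError:
            return "duplicate key"
        return "restored"
    finally:
        db.close()


if __name__ == "__main__":
    print("proxy exposed:    ", simulate_restore(proxy_exposed=True))
    print("proxy scaled down:", simulate_restore(proxy_exposed=False))
```

In this model the replay only succeeds when no client write lands between the two restore phases, which is why keeping the proxy unreachable during the restore avoids the conflict.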

Environment

None

Attachments

3

Details

Needs QA

Yes

Created July 16, 2024 at 1:23 AM
Updated December 19, 2024 at 8:11 PM
Resolved December 2, 2024 at 8:57 AM

Activity

Eleonora Zinchenko, November 21, 2024 at 10:27 AM

Hi,

Verified. Proxies are scaled down to 0 during PITR restore, and PITR finishes successfully. It was agreed that we will add a check to the tests that proxies are scaled down during PITR restore, so I am moving the task back to In Progress.

Slava Sarzhan, November 8, 2024 at 10:03 AM

Hi, thank you for the task. We have improved this PITR restore behavior. During PITR the operator will not start proxy pods (they stay scaled down). This was added in 1.16.0. You can use the main image if you want to test it.
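The test check mentioned in the verification comment could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual test code: the helper names `proxies_scaled_down` and `check_restore_isolation` are hypothetical, and the input is assumed to be a proxy StatefulSet already parsed into a dict (for example from JSON output), using the standard Kubernetes `spec.replicas` and `status.replicas` fields. The "Restoring" state string is the one reported on the Everest side in this ticket.

```python
def proxies_scaled_down(workload: dict) -> bool:
    """Return True when a StatefulSet-like dict has no desired or running replicas."""
    # Kubernetes defaults spec.replicas to 1 when it is unset, so a missing
    # value is treated as "not scaled down".
    desired = workload.get("spec", {}).get("replicas", 1)
    # A missing status.replicas is treated as zero running pods.
    running = workload.get("status", {}).get("replicas", 0)
    return desired == 0 and running == 0


def check_restore_isolation(restore_state: str, proxy_workload: dict) -> None:
    """Raise AssertionError if proxies are up while a restore is in progress."""
    if restore_state == "Restoring" and not proxies_scaled_down(proxy_workload):
        raise AssertionError(
            "proxy pods are running while PITR restore is in progress"
        )


if __name__ == "__main__":
    scaled_down = {"spec": {"replicas": 0}, "status": {"replicas": 0}}
    still_up = {"spec": {"replicas": 3}, "status": {"replicas": 3}}

    check_restore_isolation("Restoring", scaled_down)  # passes silently
    try:
        check_restore_isolation("Restoring", still_up)
    except AssertionError as exc:
        print("check failed as expected:", exc)
```

A real test would poll this check for the whole duration of the restore rather than sampling it once, since the proxy could be started partway through.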