PXC in a Desynced state after joiner being killed

General

Escalation

Description

Scenario:

1 bootstrapped node,
2 joiners started at the same time. The first was doing SST; the second was waiting for a donor to become available.
The joiner was killed during the prepare state. The donor hung in the donor/desync state.

Environment

None

AFFECTED CS IDs

CS0026005

Linked work items

causes

PXC-4266

Support for httpnc based snapshot transfer

relates to

K8SPXC-998

PXC 5.7 backups are stopping garbd too early

Activity

puneet.kaushik
May 17, 2022 at 5:32 AM
(edited)

Bug fix verified in PXC 5.7.37.

Kamil Holubicki
April 14, 2022 at 2:05 PM

Hi,

Probably the documentation of the above config params will need to be updated.

Kamil Holubicki
April 13, 2022 at 12:17 PM
(edited)

How to test
Use a 2-node cluster, with n1 having wsrep_provider_options="pc.weight=3".

1. Start n1. Load some data using sysbench (particularly useful for testing Case 2).
2. Start n2.

Case 1:

3. When the n1 log says 'Sleeping before data transfer for SST', partition the network.

Case 2:

3. When the n1 log says 'Streaming the backup to joiner at...', which happens just after 'Sleeping before data transfer for SST', partition the network. The partition must happen while the SST transfer is in progress.

4. Wait.

Expected result

In both cases, the donor should cancel serving SST and go back to the SYNCED state. For Case 1 the timeout results from the combination of 'retry=N' and 'donor-timeout' (retry=30 with donor-timeout=2 gives approximately 60 seconds). For Case 2 the timeout is 'sst-idle-timeout'.

The joiner should abort after 'sst-idle-timeout' seconds (default 120).
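
As a rough illustration of the Case 1 arithmetic, here is a small sketch. It assumes the donor's total wait is simply retry multiplied by donor-timeout, which is inferred from the example above (30 x 2 = approx. 60 seconds) and not taken from the SST script itself. The function name donor_wait_seconds is hypothetical.

```shell
# Rough estimate of how long the donor keeps trying in Case 1.
# ASSUMPTION: total wait = retry * donor-timeout, inferred from the
# "retry=30, donor-timeout=2 gives approx. 60 seconds" example in this ticket.
# donor_wait_seconds is a hypothetical helper, not part of PXC.
donor_wait_seconds() {
    retry="${1:-30}"          # default when sockopt has no retry=N
    donor_timeout="${2:-10}"  # default donor-timeout
    echo $(( retry * donor_timeout ))
}

donor_wait_seconds 30 2   # prints 60
```

Treat the result as an order-of-magnitude expectation for how long to wait in step 4 before checking the donor state, not as an exact figure.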

Partition network: iptables -P INPUT DROP && iptables -P OUTPUT DROP

Enable network: iptables -P INPUT ACCEPT && iptables -P OUTPUT ACCEPT
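
The manual steps above could be scripted roughly as follows. This is only a sketch: the helper names and the log file path in the usage comment are hypothetical, the iptables commands are the ones quoted above, and they must be run as root on the node whose traffic is to be cut off.

```shell
# Hypothetical helpers for the test procedure above.

# Poll a log file until a fixed string appears, or give up after a timeout.
# $1 = text to look for, $2 = log file, $3 = timeout in seconds (default 300)
wait_for_log_line() {
    pattern="$1"
    logfile="$2"
    timeout="${3:-300}"
    elapsed=0
    until grep -qF -- "$pattern" "$logfile" 2>/dev/null; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$(( elapsed + 1 ))
    done
}

# Drop all traffic in both directions (same commands as in this comment).
partition_network() {
    iptables -P INPUT DROP && iptables -P OUTPUT DROP
}

# Restore traffic.
restore_network() {
    iptables -P INPUT ACCEPT && iptables -P OUTPUT ACCEPT
}

# Example for Case 1 (the log path is an assumption, adjust to your setup):
# wait_for_log_line 'Sleeping before data transfer for SST' /path/to/n1-error.log \
#     && partition_network
```

For Case 2 the same call can be made with 'Streaming the backup to joiner at' as the text to wait for. Polling once per second is a simplification; with a small data set the transfer may finish before the partition takes effect, which is why loading data with sysbench first is useful.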

SST timeout control

The following parameters in the [sst] section of the configuration file control SST timeouts:
sockopt - if it contains retry=N, N is used; otherwise 30.
donor-timeout - (default: 10). The value of 'connect-timeout' on the donor side.
joiner-timeout (sst-initial-timeout) - (default: 60). Time the joiner waits for the SST transfer to start.
sst-idle-timeout - (default: 120). Timeout for a stuck transfer. If no data is sent or received within this time window, the SST process is aborted.
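
For reference, a sketch of what such an [sst] section might look like with the defaults listed above, printed from a small shell function so it can be redirected into a configuration file. The function name print_sst_section is hypothetical, and the key spellings are taken from this comment as written; check them against the documentation for your PXC version before use.

```shell
# Print an illustrative [sst] section using the defaults named in this ticket.
# ASSUMPTIONS: key names are copied from the comment above; the comment also
# gives sst-initial-timeout as another name for joiner-timeout.
# retry=N is read from sockopt when present; the exact sockopt syntax is not
# shown in this ticket, so it is left out here.
print_sst_section() {
cat <<'EOF'
[sst]
donor-timeout=10
joiner-timeout=60
sst-idle-timeout=120
EOF
}

print_sst_section
```

Lowering donor-timeout (for example to 2, as in the test above) shortens the time the donor stays desynced when the joiner disappears before the transfer starts.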

Kamil Holubicki
April 13, 2022 at 10:30 AM
(edited)

5.7 fix
https://github.com/percona/percona-xtradb-cluster/pull/1617

8.0 fix
https://github.com/percona/percona-xtradb-cluster/pull/1618

Iwo Panowicz
April 11, 2022 at 1:27 PM

The easiest way of reproducing that is to simulate network issues.

For instance, a single node cluster with `garbd --sst` used for testing.

Joiner:

garbd

./receive

Donor:

donor logs

Here, just after the backup has started, it is enough to block all communication between the joiner and the donor, for instance with iptables -P INPUT DROP && iptables -P OUTPUT DROP. It's important to do this in a way that prevents any communication at all (not even an RST).

In that case PXC waits for socat to finish one way or another (with either success or failure). More interestingly, Galera notices that the joiner has already left the cluster, but the backup continues to be streamed.

This feature of PXC is also used for taking backups in PXC-Operator when 5.7 is used. For 8.0 the --recv-script option is used, which also forces garbd to reply to Galera messages while the backup is in progress.

In some particular cases socat can wait for the connection to expire for hours or days, but it really depends on the infrastructure and topology. 16 minutes in the above example is also a very long time.

Done

Details

Assignee

Kamil Holubicki

Reporter

Iwo Panowicz

Labels

platform

Time tracking

1d 4h 31m logged, 6h 18m remaining

Fix versions

5.7.37-31.57 (Q1 2022)

8.0.28-19 (Q1 2022)

Affects versions

8.0.19-10

8.0.21-12.1

5.7.35-31.53 (Q3 2021)

Priority

Medium

Parent

PXC-3532 PXC 8.x Stability Review and Modifications

Created August 25, 2020 at 8:16 AM

Updated May 27, 2024 at 12:08 PM

Resolved April 14, 2022 at 2:03 PM