Xtrabackup fails on primary node, causing SST failure

Description

When deploying a PXC cluster of size 3 with HAProxy of size 2, HAProxy spins up successfully, but the PXC cluster enters a crashed state: nodes 2 and 3 are in CrashLoopBackOff, probably because xtrabackup fails on node 1, which in turn causes the SST from node 1 to fail.

My PXC configuration is:

pxc.yaml

apiVersion: pxc.percona.com/v1-10-0
kind: PerconaXtraDBCluster
metadata:
  name: test-mysql
spec:
  allowUnsafeConfigurations: true
  backup:
    image: percona/percona-xtradb-cluster-operator:1.10.0-pxc8.0-backup
    schedule:
      - name: test-mysql-schedule
        schedule: 0 */12 * * *
        storageName: us-east-1
    storages:
      us-east-1:
        s3:
          bucket: /test
          credentialsSecret: backup-credentials
          endpointUrl: 'https://s3.us-east-1.amazonaws.com'
          region: us-east-1
        type: s3
  crVersion: 1.10.0
  enableCRValidationWebhook: true
  haproxy:
    affinity:
      antiAffinityTopologyKey: kubernetes.io/hostname
    enabled: true
    gracePeriod: 30
    image: percona/percona-xtradb-cluster-operator:1.10.0-haproxy
    resources:
      limits:
        cpu: '0.2'
        memory: 256Mi
      requests:
        cpu: '0.1'
        memory: 64Mi
    size: 2
  initImage: percona/percona-xtradb-cluster-operator:1.10.0
  pxc:
    affinity:
      antiAffinityTopologyKey: kubernetes.io/hostname
    autoRecovery: true
    gracePeriod: 600
    image: percona/percona-xtradb-cluster:8.0.23-14.1
    imagePullPolicy: Always
    livenessDelaySec: 600
    readinessDelaySec: 15
    resources:
      limits:
        cpu: '0.4'
        memory: 512Mi
      requests:
        cpu: '0.1'
        memory: 256Mi
    size: 3
    volumeSpec:
      persistentVolumeClaim:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1G
  updateStrategy: RollingUpdates
  secretsName: test-mysql-credentials

Node 1's relevant output showing xtrabackup failure:

2021-11-22T18:40:53.573905Z 0 [Note] [MY-000000] [WSREP-SST] Streaming the backup to joiner at 10.42.8.61 4444
2021-11-22T18:41:23.598288Z 0 [Note] [MY-000000] [WSREP-SST] 2021/11/22 18:41:23 socat[17586] E connect(7, AF=2 10.42.8.61:4444, 16): Connection refused
2021-11-22T18:41:23.598919Z 0 [Note] [MY-000000] [WSREP-SST] donor: => Rate:[2.00KiB/s] Avg:[2.00KiB/s] Elapsed:0:00:30 ETA 6:00:17
donor: => Rate:[2.00KiB/s] Avg:[2.00KiB/s] Elapsed:0:00:30
2021-11-22T18:41:24.268955Z 0 [ERROR] [MY-000000] [WSREP-SST] ******************* FATAL ERROR **********************
2021-11-22T18:41:24.268992Z 0 [ERROR] [MY-000000] [WSREP-SST] xtrabackup finished with error: 1. Check /var/lib/mysql//innobackup.backup.log
2021-11-22T18:41:24.269016Z 0 [ERROR] [MY-000000] [WSREP-SST] Line 1920
2021-11-22T18:41:24.272105Z 0 [ERROR] [MY-000000] [WSREP-SST] ------------ innobackup.backup.log (START) ------------
xtrabackup: recognized server arguments: --datadir=/var/lib/mysql --server-id=37644180 --innodb_flush_log_at_trx_commit=0 --innodb_flush_method=O_DIRECT --innodb_file_per_table=1 --innodb_buffer_pool_size=402653175 --defaults_group=mysqld
xtrabackup: recognized client arguments: --socket=/tmp/mysql.sock --parallel=4 --user=mysql.pxc.sst.user --password=* --socket=/tmp/mysql.sock --lock-ddl=1 --backup=1 --galera-info=1 --stream=xbstream --xtrabackup-plugin-dir=/usr/bin/pxc_extra/pxb-8.0/lib/plugin --target-dir=/tmp/pxc_sst_DzXW/donor_xb_jhEc
/usr/bin/pxc_extra/pxb-8.0/bin/xtrabackup version 8.0.23-16 based on MySQL server 8.0.23 Linux (x86_64) (revision id: 934bc8f)
xtrabackup: perl binary not found. Skipping the version check
211122 18:40:53 Connecting to MySQL server host: localhost, user: mysql.pxc.sst.user, password: set, port: not set, socket: /tmp/mysql.sock
Using server version 8.0.23-14.1
211122 18:40:53 Executing LOCK TABLES FOR BACKUP...
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql
xtrabackup: open files limit requested 0, set to 1048576
xtrabackup: using the following InnoDB configuration:
xtrabackup: innodb_data_home_dir = .
xtrabackup: innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup: innodb_log_group_home_dir = ./
xtrabackup: innodb_log_files_in_group = 2
xtrabackup: innodb_log_file_size = 50331648
xtrabackup: using O_DIRECT
Number of pools: 1
WARNING: unknown option --binlog-info=ON
211122 18:40:53 Connecting to MySQL server host: localhost, user: mysql.pxc.sst.user, password: set, port: not set, socket: /tmp/mysql.sock
xtrabackup: Redo Log Archiving is not set up.
Starting to parse redo log at lsn = 29112889
211122 18:40:53 >> log scanned up to (29113358)
xtrabackup: Generating a list of tablespaces
xtrabackup: Generating a list of tablespaces
Scanning './'
Completed space ID check of 2 files.
Allocated tablespace ID 1 for sys/sys_config, old maximum was 0
Using undo tablespace './undo_001'.
Using undo tablespace './undo_002'.
Opened 2 existing undo tablespaces.
xtrabackup: Starting 4 threads for parallel data files transfer
211122 18:40:54 [01] Streaming ./ibdata1
211122 18:40:54 [02] Streaming ./sys/sys_config.ibd
211122 18:40:54 [03] Streaming ./mysql/wsrep_cluster.ibd
211122 18:40:54 [04] Streaming ./mysql/wsrep_cluster_members.ibd
211122 18:40:54 [02] ...done
211122 18:40:54 [03] ...done
211122 18:40:54 [04] ...done
211122 18:40:54 [02] Streaming ./mysql/wsrep_streaming_log.ibd
211122 18:40:54 [02] ...done
211122 18:40:54 >> log scanned up to (29114860)
211122 18:40:55 >> log scanned up to (29114870)
211122 18:40:56 >> log scanned up to (29114870)
211122 18:40:57 >> log scanned up to (29114870)
211122 18:40:58 >> log scanned up to (29114870)
211122 18:40:59 >> log scanned up to (29114870)
211122 18:41:00 >> log scanned up to (29114870)
211122 18:41:01 >> log scanned up to (29114870)
211122 18:41:02 >> log scanned up to (29114870)
211122 18:41:03 >> log scanned up to (29114870)
211122 18:41:04 >> log scanned up to (29116542)
211122 18:41:05 >> log scanned up to (29116542)
211122 18:41:06 >> log scanned up to (29116542)
211122 18:41:07 >> log scanned up to (29116542)
211122 18:41:08 >> log scanned up to (29116542)
211122 18:41:09 >> log scanned up to (29116542)
211122 18:41:10 >> log scanned up to (29116542)
211122 18:41:11 >> log scanned up to (29116542)
211122 18:41:12 >> log scanned up to (29116542)
211122 18:41:13 >> log scanned up to (29116542)
211122 18:41:14 >> log scanned up to (29116542)
211122 18:41:15 >> log scanned up to (29116542)
211122 18:41:16 >> log scanned up to (29116542)
211122 18:41:17 >> log scanned up to (29116542)
211122 18:41:18 >> log scanned up to (29116542)
211122 18:41:19 >> log scanned up to (29116542)
211122 18:41:20 >> log scanned up to (29116542)
211122 18:41:21 >> log scanned up to (29116542)
211122 18:41:22 >> log scanned up to (29116542)
xtrabackup: Error writing file '<unopen fd>' (OS errno 32 - Broken pipe)
[04] xtrabackup: Error: failed to copy datafile.
xtrabackup: Error writing file '<unopen fd>' (OS errno 32 - Broken pipe)
[02] xtrabackup: Error: failed to copy datafile.
xtrabackup: Error writing file '<unopen fd>' (OS errno 32 - Broken pipe)
xb_stream_write_data() failed.
xtrabackup: Error writing file '<unopen fd>' (OS errno 32 - Broken pipe)
[01] xtrabackup: Error: xtrabackup_copy_datafile() failed.
[01] xtrabackup: Error: failed to copy datafile.
211122 18:41:23 [03] Streaming ./mysql.ibd
xtrabackup: Error writing file '<unopen fd>' (OS errno 32 - Broken pipe)
xb_stream_write_data() failed.
xtrabackup: Error writing file '<unopen fd>' (OS errno 32 - Broken pipe)
[03] xtrabackup: Error: xtrabackup_copy_datafile() failed.
[03] xtrabackup: Error: failed to copy datafile.
211122 18:41:23 >> log scanned up to (29116542)
2021-11-22T18:41:24.272150Z 0 [ERROR] [MY-000000] [WSREP-SST] ------------ innobackup.backup.log (END) ------------
2021-11-22T18:41:24.272201Z 0 [ERROR] [MY-000000] [WSREP-SST] ******************************************************
2021-11-22T18:41:24.272288Z 0 [ERROR] [MY-000000] [WSREP-SST] Cleanup after exit with status:22
2021-11-22T18:41:24.284106Z 0 [ERROR] [MY-000000] [WSREP] Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.42.8.61:4444/xtrabackup_sst//1' --socket '/tmp/mysql.sock' --datadir '/var/lib/mysql/' --basedir '/usr/' --plugindir '/usr/lib64/mysql/plugin/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --mysqld-version '8.0.23-14.1' --binlog 'binlog' --gtid '1ad48574-4bb8-11ec-b6a5-d7463964cfa3:80' : 22 (Invalid argument)
2021-11-22T18:41:24.285808Z 0 [Note] [MY-000000] [Galera] SST sending failed: -22

Environment

None

Activity

Aaditya Dubey January 19, 2023 at 10:13 AM

Hi,

We still haven't heard back from you, so I assume the reported issue no longer persists and will close the ticket. If you disagree, just reply and create a follow-up.

Aaditya Dubey June 6, 2022 at 8:57 AM

Hi,

Thank you for the report.
Please let me know whether the issue is fixed or still occurs for you.

Conrad Hanson December 7, 2021 at 9:28 PM

This issue no longer seems to occur once the resource's `spec.pxc.volumeSpec.persistentVolumeClaim.resources.requests.storage` value is set to `4G` or higher.

It might be useful to have the operator raise a warning when a PXC storage request is set below `4G`.
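
The suggested warning could look roughly like the following. This is only a minimal sketch in Python and not operator code: the names `parse_quantity`, `storage_warning`, and `MIN_STORAGE_BYTES` are hypothetical, the quantity parser handles only a subset of Kubernetes suffixes, and the `4G` threshold is just the value observed in this ticket, not a documented minimum.

```python
import re

# Subset of the decimal and binary suffixes used by Kubernetes quantities.
_SUFFIXES = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}

# Assumption: "4G" is the threshold observed in this ticket.
MIN_STORAGE_BYTES = 4 * 10**9


def parse_quantity(value: str) -> int:
    """Convert a quantity such as '1G' or '512Mi' to bytes (simplified)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([A-Za-z]*)", value.strip())
    if match is None or match.group(2) not in _SUFFIXES:
        raise ValueError(f"unsupported quantity: {value!r}")
    return int(float(match.group(1)) * _SUFFIXES[match.group(2)])


def storage_warning(storage_request: str):
    """Return a warning string if the request is below the threshold, else None."""
    if parse_quantity(storage_request) < MIN_STORAGE_BYTES:
        return (
            f"pxc storage request {storage_request} is below 4G; "
            "SST may fail on small volumes"
        )
    return None


# The value from the reported configuration triggers the warning;
# the value that worked does not.
print(storage_warning("1G"))
print(storage_warning("4G"))
```

A real check would belong in the operator's CR validation and would use the Kubernetes quantity parser rather than this simplified one.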

Incomplete

Details

Created November 22, 2021 at 8:01 PM
Updated March 5, 2024 at 5:43 PM
Resolved January 19, 2023 at 10:13 AM
