ALTER TABLE causes wsrep_cluster_status Disconnected

Description

I have a PXC environment. When I try "ALTER TABLE" including "ADD UNIQUE" and table contains duplicate entries, the result is "Disconnected" node or even whole cluster crash.

Environment:
3 servers, Debian Linux 11
8.0.33-25.1 Percona XtraDB Cluster (GPL), Release rel25, Revision 0c56202, WSREP version 26.1.4.3
mysqld.cnf attached

MySQL table definition:
mysql> describe iot_devices_values;
+----------------+--------------+------+-----+-------------------+-------------------+
| Field          | Type         | Null | Key | Default           | Extra             |
+----------------+--------------+------+-----+-------------------+-------------------+
| id             | int unsigned | NO   | PRI | NULL              | auto_increment    |
| iot_devices_id | int unsigned | NO   | MUL | NULL              |                   |
| timestamp      | timestamp    | NO   |     | CURRENT_TIMESTAMP | DEFAULT_GENERATED |
| data           | json         | NO   |     | NULL              |                   |
+----------------+--------------+------+-----+-------------------+-------------------+
4 rows in set (0.00 sec)mysql> show indexes from iot_devices_values;
+--------------------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table              | Non_unique | Key_name       | Seq_in_index | Column_name    | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+--------------------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| iot_devices_values |          0 | PRIMARY        |            1 | id             | A         |       32780 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| iot_devices_values |          1 | iot_devices_id |            1 | iot_devices_id | A         |           7 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
+--------------------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
2 rows in set (0.01 sec)mysql>

Attempted command:
ALTER TABLE `iot_devices_values` ADD UNIQUE(`iot_devices_id`, `timestamp`);

Result (expected, I'm aware that the table contains duplicates):
ERROR 1062 (23000): Duplicate entry '43-2023-03-18 06:25:17' for key 'iot_devices_values.iot_devices_id_2'

Unexpected result:
One or multiple nodes in the PXC cluster become "Disconnected" and make the cluster broken. In my test scenario I broke server 2 (timestamp 2023-08-24T14:56:39).
mysql> show status like 'wsrep_clu%';
+----------------------------+--------------------------------------+
| Variable_name              | Value                                |
+----------------------------+--------------------------------------+
| wsrep_cluster_capabilities |                                      |
| wsrep_cluster_conf_id      | 18446744073709551615                 |
| wsrep_cluster_size         | 0                                    |
| wsrep_cluster_state_uuid   | 62cb5731-428a-11ee-8fd0-07da8bce8819 |
| wsrep_cluster_status       | Disconnected                         |
+----------------------------+--------------------------------------+
5 rows in set (0.00 sec)mysql>
{{{}{}}}I will attach error.log from all 3 nodes.

Thanks a lot!

Vítek

Environment

None

Attachments

9

Activity

puneet.kaushik January 16, 2024 at 6:12 PM

Bug Fix verified in PXC 8.0.35 ! By using attached test-case i am not able to reproduce the scenario.

Kamil Holubicki August 28, 2023 at 3:57 PM

8.0.32 affected in the same way:

n1: Duplicate entry '43-2023-03-18 06:25:17' for key 'iot_devices_values.iot_devices_id_2', Error_code: 1062;

n2: Duplicate entry '47-2023-05-09 13:21:05' for key 'iot_devices_values.iot_devices_id_2

n3: Duplicate entry '43-2023-03-18 06:25:17' for key 'iot_devices_values.iot_devices_id_2', Error_code: 1062;

So the n2 gets evicted.

What is interesting, it was the 2nd attempt. For the 1st time, all nodes voted in the way n2 voted above and all got to the consensus.

Kamil Holubicki August 28, 2023 at 3:18 PM

Hi ,
I confirm I can reproduce it. It happened after executing ALTER TABLE several times on node-1.

Vit Novak August 25, 2023 at 9:55 AM

Typo: At server1: SET GLOBAL pxc_strict_mode=STRICT; -> SET GLOBAL pxc_strict_mode=ENFORCING;

Vit Novak August 25, 2023 at 9:50 AM

Oh... I have just tried to run the command "ALTER TABLE `iot_devices_values` ADD UNIQUE(`iot_devices_id`, `timestamp`);" at another node (server3) and it made server2 Disconnected.

My test env:

  1. Create 3 Debian 11 VMs in Azure Cloud (Standard B1s (1 vcpu, 1 GiB memory)) within the same VNET.

  2. Install PXC following the guide https://docs.percona.com/percona-xtradb-cluster/8.0/apt.html.

  3. Configure the nodes following the guide https://docs.percona.com/percona-xtradb-cluster/8.0/configure-nodes.html#configure. .cnf files attached.

  4. Bootstrap server1, https://docs.percona.com/percona-xtradb-cluster/8.0/bootstrap.html.

  5. Start server2 and server3, https://docs.percona.com/percona-xtradb-cluster/8.0/add-node.html#add-node.

  6. Create new database at server1.

  7. At server1: SET GLOBAL pxc_strict_mode=PERMISSIVE;

  8. Import the SQL dump at server1

  9. At server1: SET GLOBAL pxc_strict_mode=STRICT;

  10. Run "ALTER TABLE `iot_devices_values` ADD UNIQUE(`iot_devices_id`, `timestamp`);" at any of the servers. During the last test I had to try the query multiple times.

Done

Details

Assignee

Reporter

Needs Review

Yes

Needs QA

Yes

Affects versions

Priority

Smart Checklist

Created August 24, 2023 at 3:22 PM
Updated June 6, 2024 at 7:57 AM
Resolved January 16, 2024 at 6:15 PM