writer node switched when another node becomes online

Description

Sorry about the cryptic title, but I'll try to explain what we use and what seems problematic to us.
The setup is a 3-node PXC 8.0 cluster and ProxySQL 2.0.12, deployed with the PXC Kubernetes operator. In proxysql-admin.cnf we set "export WRITE_NODE='cluster1-pxc-0.cluster1-pxc.pxc-demo.svc.cluster.local:3306'", which means that whenever possible we want node 0 (cluster1-pxc-0) to be the writer.
The hostgroups in our setup are: 11-writer, 12-backup_writer, 10-reader.

Starting setup:

Looks ok, node0 is writer.

Then we start the cluster upgrade process and upgrade the nodes one by one: node2 first, then node1, and node0 (the writer) last.

Node2 is getting upgraded:

Node0 is writer, so ok.

Node1 is getting upgraded:

Node0 is writer, so ok.

Node0 is getting upgraded and node2 is selected as writer; node1 is online, but not yet re-added to ProxySQL.

Node1 is re-added to ProxySQL, but the writer is now switched to node1.

Why didn't the writer just stay node2?

At the end, node0 is upgraded, re-added to ProxySQL, and selected as writer:

That last writer switch is ok since we prefer to have node0 as writer, but why the earlier switch from node2 to node1?
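To make the expectation concrete, here is a minimal Python sketch of the writer selection we would have expected during the upgrade. It is only a model of the behaviour described in this report, not ProxySQL's or proxysql-admin's actual algorithm; the function name `expected_writer` and the short node names are made up for illustration.

```python
from typing import Optional, Sequence


def expected_writer(
    online: Sequence[str], current: Optional[str], preferred: str
) -> Optional[str]:
    """Return the writer this report expects (hypothetical model, not ProxySQL code).

    online    -- nodes currently registered and ONLINE in ProxySQL
    current   -- the node that is the writer right now (may be None)
    preferred -- the node configured via WRITE_NODE
    """
    if preferred in online:
        # The configured WRITE_NODE wins whenever it is available.
        return preferred
    if current in online:
        # Otherwise keep the current writer; no switch is needed.
        return current
    # Only pick a new writer when the current one is gone.
    return online[0] if online else None


# Replay of the last three steps of the upgrade described above.
step1 = expected_writer(["node2"], current="node0", preferred="node0")
step2 = expected_writer(["node1", "node2"], current=step1, preferred="node0")
step3 = expected_writer(["node0", "node1", "node2"], current=step2, preferred="node0")
print(step1, step2, step3)
```

Under this model the writer stays on node2 when node1 is re-added, and only moves back to node0 once node0 is available again. The observed behaviour differs at the second step, where the writer moved to node1.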

Environment

None

Activity

Morten Tryfoss August 25, 2020 at 6:50 AM

Edit: my error might be related to backups instead, with a similar error message.

Morten Tryfoss August 25, 2020 at 6:25 AM

Hi,

We got the exact same issue. We did not discover this until all PXC nodes were offline and the cluster stopped responding.

This seems to be triggered right after midnight every day. Some days the nodes get ONLINE again, but not always.

The time of day might be related to work at our cloud provider.

What's the key change in the percona/percona-xtradb-cluster-operator:mykola-proxysql image regarding this?

Tomislav Plavcic July 1, 2020 at 3:16 PM


This looks much better, but I still see one thing: when node-0 comes online, the writer node-2 is selected as donor and then goes into the offline group. You can see this in table #2.

"When the writer node is going into a Donor/Desync status, ProxySQL will move the write traffic to the backup writer node, after promoting it from HG4 (writer backup HG) to HG2 (writer HG)."
https://proxysql.com/blog/proxysql-native-galera-support/

Not sure at the moment if there's something we can do about it.

Here's the part of the ProxySQL logs where it moved node-2 from the writer group to the offline group.
Note the line: "2020-07-01 14:06:15 MySQL_HostGroups_Manager.cpp:4425:update_galera_set_offline(): [WARNING] Galera: setting host cluster1-pxc-2.cluster1-pxc.pxc-test.svc.cluster.local:3306 offline because: wsrep_local_state=2"
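For anyone scanning ProxySQL logs for the same symptom, a small Python sketch like the following can pull the host and the reason out of such warnings. The regular expression is written against the single log line quoted above, so treat the pattern as an assumption about the message format; the helper name `parse_offline_warning` is made up. The state names follow the Galera `wsrep_local_state` values (2 is Donor/Desynced).

```python
import re
from typing import Optional

# Galera wsrep_local_state codes and their usual names.
WSREP_LOCAL_STATE = {1: "Joining", 2: "Donor/Desynced", 3: "Joined", 4: "Synced"}

# Pattern derived from the one warning quoted in this ticket (an assumption,
# not an official description of ProxySQL's log format).
OFFLINE_RE = re.compile(
    r"Galera: setting host (?P<host>[^:\s]+):(?P<port>\d+) offline "
    r"because: wsrep_local_state=(?P<state>\d+)"
)


def parse_offline_warning(line: str) -> Optional[dict]:
    """Return host, port and state for a 'setting host ... offline' warning, else None."""
    match = OFFLINE_RE.search(line)
    if match is None:
        return None
    state = int(match.group("state"))
    return {
        "host": match.group("host"),
        "port": int(match.group("port")),
        "state": state,
        "state_name": WSREP_LOCAL_STATE.get(state, "unknown"),
    }


line = (
    "2020-07-01 14:06:15 MySQL_HostGroups_Manager.cpp:4425:update_galera_set_offline(): "
    "[WARNING] Galera: setting host "
    "cluster1-pxc-2.cluster1-pxc.pxc-test.svc.cluster.local:3306 "
    "offline because: wsrep_local_state=2"
)
print(parse_offline_warning(line))
```

For the line above this reports node-2 with state 2, which matches the donor explanation: the writer was chosen as the donor for node-0 and was therefore taken offline.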

Mykola Marzhan July 1, 2020 at 12:46 PM

Could you retest with the percona/percona-xtradb-cluster-operator:mykola-proxysql image?

Done

Details

Time tracking

2h 5m logged

Created June 30, 2020 at 8:53 AM
Updated March 5, 2024 at 6:13 PM
Resolved August 25, 2020 at 11:42 AM