cluster goes into unhealthy status after clustercheck secret changed

Description

It seems that this affects the master branch only, for now.
Setup: 1x ProxySQL, 3x PXC, using images built from master.
After the cluster is up and running, patch the clustercheck secret, e.g.:
kubectl patch secret my-cluster-secrets -p="{\"data\":{\"clustercheck\": \"Y2x1c3RlcjEyMzQ1\"}}"
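Kubernetes Secret values under `data` are base64-encoded, so the string in the patch above is just an encoded password. A minimal sketch of decoding the sample value and encoding a replacement (the plaintext shown is only what the sample value decodes to, not a real credential):

```shell
# Decode the value used in the patch above:
printf '%s' 'Y2x1c3RlcjEyMzQ1' | base64 --decode   # prints: cluster12345

# Encode a replacement password the same way before patching the secret:
NEW_PASS='cluster12345'   # example value only
printf '%s' "$NEW_PASS" | base64                   # prints: Y2x1c3RlcjEyMzQ1
```

The encoded output is what goes into the `data.clustercheck` field of the patch.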

Observe that node-2 is restarted; after some time, node-0 and node-1 go into an unready state.

NAME                                               READY   STATUS    RESTARTS   AGE
cluster1-proxysql-0                                3/3     Running   0          12m
cluster1-pxc-0                                     0/1     Running   0          11m
cluster1-pxc-1                                     0/1     Running   0          10m
cluster1-pxc-2                                     1/1     Running   0          7m43s
percona-xtradb-cluster-operator-6fc947d9bd-gsqld   1/1     Running   0          13m
Events:
  Type     Reason                  Age                   From                                                          Message
  ----     ------                  ---                   ----                                                          -------
  Normal   Scheduled               12m                   default-scheduler                                             Successfully assigned pxc-test/cluster1-pxc-0 to gke-tomislav-cluster-117-default-pool-ee53df37-vp08
  Normal   SuccessfulAttachVolume  12m                   attachdetach-controller                                       AttachVolume.Attach succeeded for volume "pvc-2f00de5c-3cea-4b5f-9f82-b819a85c09ee"
  Normal   Pulling                 11m                   kubelet, gke-tomislav-cluster-117-default-pool-ee53df37-vp08  Pulling image "perconalab/percona-xtradb-cluster-operator:1.6.0"
  Normal   Pulled                  11m                   kubelet, gke-tomislav-cluster-117-default-pool-ee53df37-vp08  Successfully pulled image "perconalab/percona-xtradb-cluster-operator:1.6.0"
  Normal   Created                 11m                   kubelet, gke-tomislav-cluster-117-default-pool-ee53df37-vp08  Created container pxc-init
  Normal   Started                 11m                   kubelet, gke-tomislav-cluster-117-default-pool-ee53df37-vp08  Started container pxc-init
  Normal   Pulling                 11m                   kubelet, gke-tomislav-cluster-117-default-pool-ee53df37-vp08  Pulling image "perconalab/percona-xtradb-cluster-operator:master-pxc8.0"
  Normal   Pulled                  11m                   kubelet, gke-tomislav-cluster-117-default-pool-ee53df37-vp08  Successfully pulled image "perconalab/percona-xtradb-cluster-operator:master-pxc8.0"
  Normal   Created                 11m                   kubelet, gke-tomislav-cluster-117-default-pool-ee53df37-vp08  Created container pxc
  Normal   Started                 11m                   kubelet, gke-tomislav-cluster-117-default-pool-ee53df37-vp08  Started container pxc
  Warning  Unhealthy               25s (x17 over 8m25s)  kubelet, gke-tomislav-cluster-117-default-pool-ee53df37-vp08  Readiness probe failed: ERROR 1045 (28000): Access denied for user 'clustercheck'@'localhost' (using password: YES) + [[ '' == \P\r\i\m\a\r\y ]] + exit 1
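The probe trace above (`ERROR 1045` followed by `[[ '' == Primary ]]` and `exit 1`) suggests the readiness check queries MySQL as the clustercheck user and requires the node to report a Primary cluster status. The operator's actual script is not reproduced here; the failing comparison can be sketched as follows, with the MySQL query shown only as an assumption in a comment:

```shell
# Sketch of the failing readiness logic, inferred from the probe trace above.
# In the real probe, the status would come from something like:
#   mysql -u clustercheck -p"$PASSWORD" -N -e "SHOW STATUS LIKE 'wsrep_cluster_status'"
# When the password baked into the pod no longer matches the secret, that query
# fails with ERROR 1045, the status stays empty, and the check exits non-zero.
is_ready() {
  [ "$1" = "Primary" ]
}

is_ready "Primary" && echo "ready"
is_ready "" || echo "not ready: empty status after auth failure"
```

This matches the observed behavior: node-0 and node-1 keep the old password after the secret changes, so every probe run fails until the pods are recreated.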

And the cluster seems to be stuck in this state.

If I manually delete pods 0 and 1, they return to a normal state.

Environment

None

Smart Checklist

Activity

Tomislav Plavcic November 2, 2020 at 1:12 PM

One thing I might add: if I saw correctly, the init script runs after node-2 is restarted, but it does not run when node-1 is restarted. That might be worth checking.

Done

Details

Needs Review

Yes

Time tracking

6h logged

Created October 29, 2020 at 4:41 PM
Updated March 5, 2024 at 6:03 PM
Resolved February 2, 2021 at 3:49 PM
