Restart DB Cluster restarts only one PXC server and one ProxySQL, not all of them
Description
How to test
Follow the steps to reproduce. The issue should no longer be reproducible.
How to document
None
Activity

Andrei Minkin January 9, 2023 at 3:06 PM
It was done during the and

Andrei Minkin November 24, 2022 at 3:59 PM
The fix: https://github.com/percona/dbaas-operator/commit/e4bcd7db7c51280988a821a67681a27a220db397 I'll apply the same pattern to PSMDB clusters as well. Waiting for to be fixed

Diogo Recharte November 9, 2022 at 2:26 PM
This bug will be fixed with the architectural changes done in .

Alexey Palazhchenko January 20, 2021 at 4:35 PM
We will use "pause" functionality as discussed earlier at and other places.

Sergey Pronin January 20, 2021 at 6:09 AM
Why don't we use 'pause' functionality of the operator for this? The same you use for suspend/resume.
Done
Details
Assignee

Reporter

Priority
Components
Labels
Needs QA
Yes
Planned Version/s
Fix versions
Story Points
3
Created January 19, 2021 at 9:43 PM
Updated March 6, 2024 at 3:20 AM
Resolved January 10, 2023 at 11:13 AM
Impact on the user:
The user does not get the outcome they expected from restarting the DB cluster.
Steps to reproduce:
Set up a PXC cluster in PMM
Wait until it is ready
Restart it
Wait until it has restarted
Check the pod status in k8s
Actual result:
spronin-test-proxysql-0   3/3   Running             6   13h
spronin-test-proxysql-1   3/3   Running             0   13h
spronin-test-proxysql-2   0/3   ContainerCreating   0   1s
spronin-test-pxc-0        1/1   Running             0   13h
spronin-test-pxc-1        1/1   Running             8   13h
spronin-test-pxc-2        1/1   Terminating         0   2m29s
As you can see, only one PXC pod and one ProxySQL pod started recently; the others have been up for 13h.
Expected Result:
All pods have recent uptime
Workaround:
Restart the pods manually with kubectl
Suggested Implementation:
When dbaas-controller gets a Restart request:
Pause the DB cluster
Wait until the DB cluster is paused
Start a new goroutine to resume the DB cluster
Return the response
Possible issues:
If dbaas-controller restarts during a cluster restart, the DB cluster may get stuck in the paused state
After pausing, the DB cluster may still report an active status for a few seconds (PMM-7397)
Details:
implemented database cluster restart with kubectl rollout restart. It is good enough for alpha, but not good enough for beta: we can lose data this way, and the operators' team confirmed it is not safe. For beta1, we should implement restart via full cluster pause/resume: https://www.percona.com/doc/kubernetes-operator-for-pxc/pause.html
The same functionality will be available for PSMDB.
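For illustration, pause and resume come down to toggling one field on the cluster custom resource. The sketch below assumes the `spec.pause` field described in the operator's pause documentation linked above; `pausePatch` is a hypothetical helper that only builds a JSON merge patch body, and how the patch is sent to the Kubernetes API is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pausePatch builds a JSON merge patch body that sets spec.pause on a
// cluster custom resource. Hypothetical helper; the field name is taken
// from the operator's pause documentation.
func pausePatch(paused bool) ([]byte, error) {
	patch := map[string]map[string]bool{
		"spec": {"pause": paused},
	}
	return json.Marshal(patch)
}

func main() {
	// Print the patch for pausing, then the patch for resuming.
	for _, paused := range []bool{true, false} {
		body, err := pausePatch(paused)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(body))
	}
}
```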