pxc operator robustness improvement


Description

Currently the operator always sleeps until a required condition is met, which blocks it. If the operator crashes, or the required condition ends in an error, the cluster does not recover automatically (for example, when a restore fails, the PXC size information is lost).

In Kubernetes, an operator should always drive the cluster toward the desired final state based on the current state; it is not procedure-oriented. The operator will become more stable if we follow this Kubernetes way of thinking.
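To make the idea more concrete, here is a minimal sketch in Go of a state-driven restore. It is not the operator's actual code: the Cluster type, its fields, the phase names, and the StartRestore and Reconcile functions are all hypothetical and only illustrate the pattern. Each call looks at the stored state, takes at most one step, and returns without sleeping. The original size is saved in the resource before scaling down, so it survives an operator crash or a failed restore.

```go
package main

import "fmt"

// Phase is persisted in the resource status, so a restarted operator
// can pick up where the previous process stopped.
type Phase string

const (
	PhaseIdle        Phase = ""
	PhaseScalingDown Phase = "ScalingDown"
	PhaseRestoring   Phase = "Restoring"
	PhaseScalingUp   Phase = "ScalingUp"
)

// Cluster is a hypothetical, simplified stand-in for the PXC custom resource.
type Cluster struct {
	SpecSize   int    // desired number of PXC pods
	SavedSize  int    // size recorded before the restore scaled the cluster down
	Phase      Phase  // current step of the restore
	ReadyPods  int    // observed: pods currently running
	RestoreJob string // observed: "", "Running", "Succeeded" or "Failed"
}

// Result tells the caller whether to run Reconcile again later.
type Result struct {
	Requeue bool
	Action  string
}

// StartRestore records the original size before scaling down.
func StartRestore(c *Cluster) {
	c.SavedSize = c.SpecSize
	c.SpecSize = 0
	c.Phase = PhaseScalingDown
}

// Reconcile takes at most one step from the observed state and returns
// immediately; it never sleeps while waiting for a condition.
func Reconcile(c *Cluster) Result {
	switch c.Phase {
	case PhaseScalingDown:
		if c.ReadyPods > 0 {
			return Result{true, "waiting for pods to stop"}
		}
		c.Phase = PhaseRestoring
		c.RestoreJob = "Running" // stands in for creating the restore job
		return Result{true, "restore job started"}
	case PhaseRestoring:
		if c.RestoreJob == "Running" {
			return Result{true, "waiting for restore job"}
		}
		// Succeeded or Failed: either way the saved size is put back.
		c.SpecSize = c.SavedSize
		c.Phase = PhaseScalingUp
		return Result{true, "restore job " + c.RestoreJob + ", scaling back up"}
	case PhaseScalingUp:
		if c.ReadyPods < c.SpecSize {
			return Result{true, "waiting for pods to start"}
		}
		c.Phase = PhaseIdle
		return Result{false, "restore finished"}
	}
	return Result{false, "nothing to do"}
}

func main() {
	c := &Cluster{SpecSize: 3, ReadyPods: 3}
	StartRestore(c)
	c.ReadyPods = 0
	fmt.Println(Reconcile(c).Action)
	c.RestoreJob = "Failed" // the restore fails, possibly while the operator is down
	fmt.Println(Reconcile(c).Action)
	c.ReadyPods = 3
	fmt.Println(Reconcile(c).Action)
	fmt.Println("size:", c.SpecSize)
}
```

Because every decision is derived from the fields stored in the resource, a new operator process can call Reconcile on the same object after a crash and continue from the recorded phase, and the cluster returns to its original size even when the restore job fails.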

I am sorry that my English is poor; I am not sure whether my idea is clear. I would be glad to talk with the operator team and help improve the operator's robustness. Please let me know of any group or channel where I can chat with the team.

Environment

None


Activity


Jira Bot August 29, 2021 at 11:57 AM

Hello,
It's been 52 days since this issue went into Incomplete and we haven't heard
from you on this.

At this point, our policy is to Close this issue, to keep things from getting
too cluttered. If you have more information about this issue and wish to
reopen it, please reply with a comment containing "jira-bot=reopen".

Jira Bot August 21, 2021 at 11:56 AM

Hello,
It's jira-bot again. Your bug report is important to us, but we haven't heard
from you since the previous notification. If we don't hear from you on
this in 7 days, the issue will be automatically closed.

Jira Bot August 6, 2021 at 10:57 AM

Hello,
I'm jira-bot, Percona's automated helper script. Your bug report is important
to us but we've been unable to reproduce it, and asked you for more
information. If we haven't heard from you on this in 3 more weeks, the issue
will be automatically closed.

Lalit Choudhary July 8, 2021 at 10:16 AM

Thank you for the report and your inputs.

"If the operator crashes, or the required condition ends in an error, the cluster does not recover automatically (for example, when a restore fails, the PXC size information is lost).
In Kubernetes, an operator should always drive the cluster toward the desired final state based on the current state; it is not procedure-oriented. The operator will become more stable if we follow this Kubernetes way of thinking."

PXC Operator versions 1.7 and 1.8 include a few improvements for auto-recovery.

New features in 1.7.0 and 1.8.0

Add support for point-in-time recovery

PXC cluster will now recover automatically from a full crash when Pods are stuck in CrashLoopBackOff status

Operator can now automatically recover Percona XtraDB Cluster after network partitioning

https://www.percona.com/doc/kubernetes-operator-for-pxc/ReleaseNotes/index.html

Apart from this, if you have further suggestions for improvement, it would help if you could add an example use case and describe the behavior you expect.

Feel free to add a comment here.

Incomplete

Details
Assignee: Lalit Choudhary
Reporter: 朱礼程
Components:
Priority: Medium


Created May 11, 2021 at 7:43 AM

Updated March 5, 2024 at 5:51 PM

Resolved August 29, 2021 at 11:57 AM
