DBaaS: cfg pod keeps restarting

Description

Impact:

The PSMDB cluster won't stay up, which makes our UI tests unstable.

STR (steps to reproduce):

1. Start minikube using our kubernetes-cluster-staging job.
2. Register the minikube cluster (this only works with an instance created by the aws-staging-start job).
3. Create a PSMDB cluster.
4. Wait for a while.

Result:

The cluster won't stay up for long: the cfg pod keeps restarting and eventually goes into CrashLoopBackOff.
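A quick way to see this from the command line (a minimal sketch, assuming kubectl points at the minikube cluster and the PSMDB pods run in the current namespace; my-cluster-name-cfg-0 is a placeholder pod name):

# Watch pod status; the cfg pods show a growing RESTARTS count and eventually CrashLoopBackOff
kubectl get pods -w

# Inspect the last termination reason and the logs of the previously crashed container
# (add -c <container> if the pod has sidecar containers)
kubectl describe pod my-cluster-name-cfg-0
kubectl logs my-cluster-name-cfg-0 --previous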

Details:

Could not reproduce this with minikube running on localhost or with EKS; it only happens with our kubernetes-cluster-staging deployment.

It seems to be related to the mongo clusters consuming a lot of CPU.
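A rough way to check the CPU theory (assumes the metrics-server addon is available in minikube; output will vary per deployment):

# Enable the metrics-server addon in minikube if it isn't already on
minikube addons enable metrics-server

# Compare per-pod and per-node CPU usage against node capacity
kubectl top pods
kubectl top nodes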

More details in the Slack thread: https://perconacorp.slack.com/archives/C01F3TX1CL8/p1631872806027100

How to test

None

How to document

None

Activity

Roma Novikov 
November 26, 2021 at 9:43 AM

Is this a QA-only problem, or is this a DBaaS/Operator problem?
If it's DBaaS, then maybe the Operators team can check this too.

Former user 
August 23, 2021 at 8:26 AM

Could you please open a discussion about this topic in the #dbaas channel?
It may be caused by a noisy-neighbour problem: when the cluster has a low CPU limit or the machine is under high load, the PSMDB cluster fails to start. That is a known issue. It's only my guess that this is the cause; I'm not sure whether the Cloud team can do anything about that.
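For reference, one way to test that guess is to give the config-server replica set more CPU. A minimal sketch, assuming the PSMDB custom resource is named my-cluster-name, the psmdb short name is installed by the operator CRD, and the standard cr.yaml layout where config-server resources sit under spec.sharding.configsvrReplSet.resources (values are illustrative only):

# Raise CPU requests/limits for the cfg replica set and let the operator roll the pods
kubectl patch psmdb my-cluster-name --type=merge -p '{
  "spec": {
    "sharding": {
      "configsvrReplSet": {
        "resources": {
          "requests": {"cpu": "500m", "memory": "0.5G"},
          "limits": {"cpu": "1", "memory": "1G"}
        }
      }
    }
  }
}'

# Confirm the new resources landed on the cfg pods after they restart
kubectl get pod my-cluster-name-cfg-0 -o jsonpath='{.spec.containers[0].resources}'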

Roma Novikov 
August 17, 2021 at 10:25 AM

Is this a defect for https://perconadev.atlassian.net/browse/PMM-8316 then? Or is it related to the K8s version rather than the Operator?

Details

Environment

percona/percona-server-mongodb-operator:1.9.0
minikube version: v1.20.0
kubernetes version: v1.20.1
PMM Server: dev-latest

Created August 12, 2021 at 1:31 PM
Updated March 27, 2024 at 2:59 PM
