DBaaS: cfg pod keeps restarting

Description

Impact:

The PSMDB cluster won't stay up, which makes our UI tests unstable.

STR (steps to reproduce):

1. Start minikube using our kubernetes-cluster-staging job.
2. Register the minikube cluster (this only works with an instance created by the aws-staging-start job).
3. Create a PSMDB cluster.
4. Wait for a while.

Result:

The cluster won't stay up for long: the cfg pod keeps restarting and eventually goes into CrashLoopBackOff.
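A quick way to see this from the command line (a minimal sketch, assuming kubectl points at the minikube cluster and the PSMDB pods run in the current namespace; my-cluster-name-cfg-0 is a placeholder pod name):

# Watch pod status; the cfg pods show a growing RESTARTS count and eventually CrashLoopBackOff
kubectl get pods -w

# Inspect the last termination reason and the logs of the previously crashed container
# (add -c <container> if the pod has sidecar containers)
kubectl describe pod my-cluster-name-cfg-0
kubectl logs my-cluster-name-cfg-0 --previous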

Details:

Could not reproduce this with minikube running on localhost or with EKS; it only happens with our kubernetes-cluster-staging deployment.

It seems to be related to the mongo clusters consuming a lot of CPU.
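A rough way to check the CPU theory (assumes the metrics-server addon is available in minikube; output will vary per deployment):

# Enable the metrics-server addon in minikube if it isn't already on
minikube addons enable metrics-server

# Compare per-pod and per-node CPU usage against node capacity
kubectl top pods
kubectl top nodes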

More details in the Slack thread: https://perconacorp.slack.com/archives/C01F3TX1CL8/p1631872806027100

How to test

None

How to document

None

Activity

Roma Novikov 
November 26, 2021 at 9:43 AM

Is this a QA-only problem, or is this a DBaaS/Operator problem?
If it's DBaaS, then maybe the Operators team can check this too.

Former user 
August 23, 2021 at 8:26 AM

Could you please open a discussion about this topic in the #dbaas channel?
It may be caused by a noisy-neighbour problem: when the cluster has a low CPU limit or the machine is under high load, the PSMDB cluster fails to start. That is a known issue. It's only my guess that this is the cause; I'm not sure whether the Cloud team can do anything about that.
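For reference, one way to test that guess is to give the config-server replica set more CPU. A minimal sketch, assuming the PSMDB custom resource is named my-cluster-name, the psmdb short name is installed by the operator CRD, and the standard cr.yaml layout where config-server resources sit under spec.sharding.configsvrReplSet.resources (values are illustrative only):

# Raise CPU requests/limits for the cfg replica set and let the operator roll the pods
kubectl patch psmdb my-cluster-name --type=merge -p '{
  "spec": {
    "sharding": {
      "configsvrReplSet": {
        "resources": {
          "requests": {"cpu": "500m", "memory": "0.5G"},
          "limits": {"cpu": "1", "memory": "1G"}
        }
      }
    }
  }
}'

# Confirm the new resources landed on the cfg pods after they restart
kubectl get pod my-cluster-name-cfg-0 -o jsonpath='{.spec.containers[0].resources}'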

Roma Novikov 
August 17, 2021 at 10:25 AM

Is this a defect for https://perconadev.atlassian.net/browse/PMM-8316 then? Or is it related to the K8s version rather than the Operator?

Details

Environment

percona/percona-server-mongodb-operator:1.9.0
minikube version: v1.20.0
kubernetes version: v1.20.1
PMM Server: dev-latest

Created August 12, 2021 at 1:31 PM
Updated March 27, 2024 at 2:59 PM
