MongoDB operator crashes when pause is enabled and there are no pods created

Description

Describe the bug
When a user supplies an incorrect configuration that prevents the StatefulSets from creating pods and also enables pause, the operator keeps crashing.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy a MongoDB cluster with pause enabled, 3 replicas, and a configuration that prevents the StatefulSet from creating pods (e.g. a non-existent runtimeClassName), for example:

Expected behavior
The operator should run normally and be available.

Current behavior
The operator keeps crashing and ends up in the CrashLoopBackOff state.

Root Cause
The root cause is that when pause is enabled, the operator tries to delete the replset pods. Before deleting them, it checks whether a pod is the primary at this line: https://github.com/percona/percona-server-mongodb-operator/blob/main/pkg/controller/perconaservermongodb/finalizers.go#L172. If no pods have been created, the operator accesses index 0 of an empty PodList, which causes an index-out-of-range panic.
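
The failure mode described above can be illustrated with a minimal sketch. This is not the operator's actual code: Pod, PodList and firstPod are hypothetical stand-ins for the Kubernetes types and for whatever lookup the operator performs, and the pod name is made up. It only shows the kind of length check that would avoid indexing an empty list.

```go
package main

import "fmt"

// Pod and PodList are simplified, hypothetical stand-ins for the
// Kubernetes core/v1 types; only the fields needed here are modelled.
type Pod struct {
	Name string
}

type PodList struct {
	Items []Pod
}

// firstPod returns the first pod in the list. The boolean is false when
// the list is empty, so callers never index into an empty slice.
func firstPod(list PodList) (Pod, bool) {
	if len(list.Items) == 0 {
		return Pod{}, false
	}
	return list.Items[0], true
}

func main() {
	// No pods were created (e.g. the StatefulSet could not schedule any).
	if _, ok := firstPod(PodList{}); !ok {
		fmt.Println("no pods found, skipping primary check")
	}

	// At least one pod exists, so the primary check can proceed.
	if pod, ok := firstPod(PodList{Items: []Pod{{Name: "rs0-0"}}}); ok {
		fmt.Println("checking whether", pod.Name, "is primary")
	}
}
```

Returning a boolean alongside the pod keeps the "no pods" case explicit at the call site, so the caller can skip the primary check instead of panicking.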

Desktop (please complete the following information):

  • OS: Ubuntu

  • Reproduced on latest release v0.8.3

Environment

None

Activity

Slava Sarzhan September 9, 2024 at 6:53 PM

In general, when pause: true is set, the operator should not start the DB cluster at all.

Details

Needs QA

Yes

Created September 9, 2024 at 6:36 PM
Updated December 16, 2024 at 2:13 PM