MongoDB Cluster Shows DOWN in Percona Everest UI After Manual Finalizer Deletion and Configuration Change

Description

  • Percona Everest: v1.3.0-rc4

  • psmdb: 1.17.0

  • Platform: GKE

A MongoDB cluster with sharding enabled was initially set up with 3 nodes (Medium size per node). The deployment got stuck because of insufficient resources, so I decided to delete it manually by editing the psmdb resource with the following command:

kubectl edit psmdb/mongodb-test -n everest

I deleted the finalizer called delete-psmdb-pods-in-order and saved the changes.
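
For reference, a finalizer removal like the one above can also be done non-interactively with kubectl patch. This is only a sketch and not a recommended procedure: the helper name remove_psmdb_finalizers is hypothetical, and the merge patch used here clears every finalizer on the resource, which is broader than removing only delete-psmdb-pods-in-order.

```shell
# Hypothetical helper (not part of Everest or the psmdb operator).
# Usage: remove_psmdb_finalizers <name> <namespace>
# WARNING: this clears ALL finalizers on the psmdb resource, so any
# cleanup the operator would normally run on deletion is skipped.
remove_psmdb_finalizers() {
  kubectl patch psmdb "$1" -n "$2" --type=merge \
    -p '{"metadata":{"finalizers":null}}'
}
```

For the cluster in this report, the call would be remove_psmdb_finalizers mongodb-test everest.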

After some time, I recreated the cluster with 3 nodes but this time using Small size nodes to avoid resource issues. This configuration worked perfectly. Later, I decided to change the configuration to 5 nodes with a Small size per node. However, in the Percona Everest UI, the MongoDB cluster status shows as DOWN, even though all pods are in a Running state.

Image:

Some logs:

Discussion in the Slack thread: https://perconacorp.slack.com/archives/C0545J2BEJX/p1731593603806659

Steps to reproduce:

  • Deploy a MongoDB cluster with sharding enabled and 3 Medium nodes; the deployment gets stuck due to insufficient resources.

  • Manually delete the stuck deployment by removing the delete-psmdb-pods-in-order finalizer.

  • Recreate the cluster with 3 Small nodes, confirming successful deployment.

  • Update the cluster to 5 Small nodes.

  • In Percona Everest UI, the cluster shows as DOWN even though all pods are Running.

  • Check for errors, especially authentication issues, using kubectl get db <DBName> -n <DBNamespace>.
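
The last step can be extended into a small inspection routine that compares what Everest reports with what the operator and the pods report. This is a sketch under assumptions: the helper name inspect_everest_db is hypothetical, and it assumes the psmdb resource has the same name as the Everest database, as with mongodb-test in this report.

```shell
# Hypothetical helper; the resource short names `db` and `psmdb` are the
# ones used elsewhere in this report.
# Usage: inspect_everest_db <name> <namespace>
inspect_everest_db() {
  # Everest-level view of the cluster, including its status block
  kubectl get db "$1" -n "$2" -o yaml
  # Operator-level view of the same cluster (assumes the same name)
  kubectl get psmdb "$1" -n "$2" -o yaml
  # Pod states, to compare against the DOWN status shown in the UI
  kubectl get pods -n "$2"
}
```

For example: inspect_everest_db mongodb-test everest. Any error message in the status section of the first two outputs is the place to look for the authentication issues mentioned above.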

Environment

None

Attachments

2

Activity

Edith Erika Puclla Pareja January 29, 2025 at 9:22 AM

This issue is related to
When we create a single node of MongoDB

Edith Erika Puclla Pareja January 21, 2025 at 3:49 PM

This happened again with 0.0.0 (main branch) on GKE, with a single MongoDB node (no backups, but a storage registered).
There is a Slack thread where there are some logs:
It might be related to:

Edith Erika Puclla Pareja December 5, 2024 at 10:21 AM

Try to replicate it one last time; if it does not happen again, close the issue.

Diogo Recharte November 14, 2024 at 6:28 PM

Tried on EKS, wasn’t able to reproduce.

Details

Created November 14, 2024 at 4:22 PM
Updated January 30, 2025 at 9:48 AM