MongoDB is transiently unavailable during initial cluster boot even though the mongos servers report Ready

Description

When the cluster is first booting up, MongoDB is transiently unavailable for writes even though the mongos servers already show Ready status.

MongoDB does become available some time after the mongos servers become ready. We think the MongoDB cluster is not yet fully ready at the point where the current mongos readiness probe starts succeeding.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy any MongoDB cluster with sharding enabled.

  2. Deploy a MongoDB client that keeps trying to perform a write. In our case, we use the MongoDB Go client to call collection.InsertOne() with a document.

Expected behavior
Once the mongos servers report Ready, the cluster should be fully ready for client workloads.

Current behavior
Although the mongos servers' readiness probes succeed, the InsertOne calls fail with errors reporting that the shard was not found.

Root Cause
The current readiness probe for mongos lists the admin database and checks whether that request succeeds. However, the admin database can be listed before the MongoDB cluster is ready for write workloads, so the probe reports Ready while writes still fail.

Environment

None

Needs QA

Yes

Created 5 days ago
Updated 5 days ago