Sharded cluster restoration failure in 2.8.0

General

Escalation

Description

This ticket duplicates an issue in the PBM project; it was filed as requested, for further investigation.

Context:

PSMDB 1.19.1, pbm 2.8.0

Problem:

PITR restore of a sharded cluster fails with an error that I have never seen before:

Attached: PSMDB operator and PBM logs, pbm status output, and the psmdb, psmdb-restore and psmdb-backup CRs. The backup files are in the percona-dev AWS account, bucket oksana-s3-2; direct link: https://us-east-2.console.aws.amazon.com/s3/buckets/oksana-s3-2?region=us-east-2&bucketType=general&prefix=mongodb-5ta/&showversions=false

STR

Create a sharded cluster: 3 nodes, 3 shards (rs), 3 config servers, PITR enabled
Take a scheduled backup
Add ~1.2 GB of data (replace PASSWORD with the databaseAdmin password, and set CLUSTER_NAME and NAMESPACE):
Take a scheduled backup
Enable sharding for the collection:
1. Get the clusterAdmin password (replace NAMESPACE and CLUSTER_NAME):
2. Run percona-client:
3. Connect to mongo as clusterAdmin:
4. Shard the collection:
5. Wait until the data is sharded (run the last command several times until it shows 3 shards with a stable distribution percentage between runs)
Take a scheduled backup
Wait ~30 min
(UPD: in my case I monitored the PITR chunk upload intervals in pbm status. The interval is updated every 10 min by default. At first the interval ended at the time the latest backup completed; 10 min later it was updated (update1, +10 min), and 10 min after that it was updated again (update2, +10 min).)
Restore using the latest backup + PITR to a time after the latest backup (UPD: a time between update1 and update2)
Restore fails with:
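
For reference, the restore target used in the steps above (a time between update1 and update2) comes down to simple date arithmetic. The sketch below is only an illustration and was not part of the original reproduction: the function names and the example backup completion time are hypothetical, the 10-minute interval is the default mentioned in the steps, and the printed format is an assumption about what the restore CR expects.

```python
from datetime import datetime, timedelta

# Default PITR chunk upload interval mentioned in the steps above.
DEFAULT_INTERVAL = timedelta(minutes=10)


def pitr_updates(backup_finished, interval=DEFAULT_INTERVAL):
    """Return the expected end times of the PITR range after the first
    and second chunk uploads that follow the backup (update1, update2)."""
    update1 = backup_finished + interval
    update2 = update1 + interval
    return update1, update2


def pick_restore_time(backup_finished, interval=DEFAULT_INTERVAL):
    """Pick the midpoint between update1 and update2 as the restore target."""
    update1, update2 = pitr_updates(backup_finished, interval)
    return update1 + (update2 - update1) / 2


if __name__ == "__main__":
    # Hypothetical completion time of the latest backup (assumption).
    backup_finished = datetime(2025, 2, 13, 10, 0, 0)
    target = pick_restore_time(backup_finished)
    # Assumed timestamp format for the PITR restore target.
    print(target.strftime("%Y-%m-%d %H:%M:%S"))
```

In practice the actual interval end times should be read from pbm status rather than computed, since uploads do not necessarily land exactly on the 10-minute boundary.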

None

Boris Ilijic

Oksana Grishchenko

Yes

High

Created February 13, 2025 at 12:01 PM

Updated April 15, 2025 at 11:20 AM