Restore from pbm hangs after some time.

Description

Hello!

We have a problem restoring a physical MongoDB backup created with pbm.
I was forwarded here from the community forum.

When we restore the backup to the test server with a remapped replica set name, the restore hangs after some time without any additional logs. We can see that network traffic drops and nothing happens on the server. We can reproduce this error only with incremental physical backups. If we restore only the base of an incremental physical backup, everything works fine.
We also get this error when we use different servers for restores and different storages for backups. The storages used are OVH object storage and Hetzner object storage.
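
For illustration only, here is a minimal sketch of the kind of restore invocation described above, written as a small Python helper that assembles the command line. The replica set names `rs-test` and `rs-prod` and the helper name `build_restore_cmd` are hypothetical placeholders, not our real names. The backup name is only an example timestamp. The sketch assumes the `--replset-remapping` flag takes `target=source` pairs and that `--wait` is passed.

```python
import shlex


def build_restore_cmd(backup_name, remapping, wait=True):
    """Return the argv list for a `pbm restore` with replica set remapping.

    `remapping` maps the target replica set name to the source one
    (hypothetical names; adjust to the real environment).
    """
    cmd = ["pbm", "restore", backup_name]
    if remapping:
        # Assumed format: comma-separated "target=source" pairs.
        pairs = ",".join(f"{target}={source}" for target, source in remapping.items())
        cmd.append(f"--replset-remapping={pairs}")
    if wait:
        # Ask the CLI to wait for the restore to finish.
        cmd.append("--wait")
    return cmd


if __name__ == "__main__":
    argv = build_restore_cmd("2025-03-10T06:00:05Z", {"rs-test": "rs-prod"})
    print(shlex.join(argv))
```

The helper only builds the command and does not run it, so it can be reviewed or logged before being passed to something like `subprocess.run`.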

pbm describe-restore also seems to show incorrect values for the start time and last_transition_time.

Also, it seems that when the process hangs (and traffic drops), pbm-cli stops waiting, even though we set a fairly large wait time. We saw these errors on a different restore, so I will send those examples in a separate message so that the differing timestamps do not cause confusion. Let me know if this is important for the discussion.

Also, let me know if we can attach any additional information about the restores.

Environment

MongoDB version - percona-server-mongodb-server/stable,now 6.0.9-7.jammy
PBM version - percona-backup-mongodb/unknown,now 2.7.0-1.jammy
PRETTY_NAME="Ubuntu 22.04.5 LTS"

Activity

radoslaw.szulgo 4 days ago

Thanks for the update, Danila! That helps. We plan to review these findings in about 2 weeks.

Danila Terekhov 4 days ago

Hello!
Good news!!
We were able to restore an incremental backup with replica set remapping using pbm-agent version 2.5.0 together with MongoDB 6.0.9.
It seems that the problem with incremental restores appeared in recent pbm-agent versions, so for now we will use pbm-agent 2.5.0. We will share any additional information from future restores.

For now, we cannot say for certain whether the problem appears in version 2.6.0, but it is clear that the problem with incremental restores exists in both 2.7.0 and 2.8.0.

Could you confirm on your end that incremental restores fail in newer versions of pbm-agent?

Danila Terekhov 5 days ago


It seems that the problem is not in replica set remapping. We recreated the replica set used for the restore with the same name as in the original cluster and created a new incremental backup, and the restore still fails.

Here are the pbm-agent and mongod statuses:

 

Node load is the same as before; traffic drops to zero after the restore hangs. Screenshot is here.
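
As an illustration of how the "traffic drops to zero" symptom could be checked automatically, here is a minimal sketch. It assumes cumulative received-byte counters for the network interface are sampled at a fixed interval (for example from `/proc/net/dev`). The function name `is_stalled` and the thresholds are hypothetical and not part of pbm.

```python
def is_stalled(rx_bytes_samples, window=5, min_bytes_per_interval=1024):
    """Return True if the last `window` sampling intervals each received
    fewer than `min_bytes_per_interval` bytes.

    `rx_bytes_samples` is a list of cumulative received-byte counters,
    oldest first, taken at a fixed interval.
    """
    # Need window + 1 samples to compute `window` deltas.
    if len(rx_bytes_samples) < window + 1:
        return False
    recent = rx_bytes_samples[-(window + 1):]
    deltas = [later - earlier for earlier, later in zip(recent, recent[1:])]
    return all(delta < min_bytes_per_interval for delta in deltas)
```

A check like this only flags that the download has gone quiet; it cannot tell a hang from a restore phase that legitimately needs no network traffic.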

Now pbm-cli gave us some logs like this:

But nothing appears in the pbm-agent logs.


From the filenames in the “copy …” log lines, we can see that the restore hangs on the incremental backups; all files from 2025-03-10T06:00:05Z, including indexes, were downloaded correctly.


pbm describe-restore also looks incorrect.


Please let me know if I can attach any additional information about the restore other than the diagnostic report, because it is unavailable for physical restores.

Danila Terekhov last week

Hello once again! Sorry for the late reply, I was on holiday.

> In the meantime, as you are now on 2.8.0 - you can provide diagnostic report.

Your documentation says that the diagnostic report is only available for logical backups, and we are using physical ones.

> we’ve identified that replica set remapping doesn’t work for incremental backups

As far as we remember, incremental restores previously worked fine with replica set remapping (before the update to 2.7.0, or when backups were smaller), but I may be wrong here. We will try an incremental restore to a different replica set that uses the same replica set name as the original cluster, and tell you whether it worked.

> Note that the number of shards must be the same as in the environment where the you made the backup.

We have a non-sharded cluster. The number of nodes is different: for the restore we have a single-node cluster with an initialized replica set, while the original cluster that produced the backups has more than 5 data nodes. We have only ever used one node for restores, and previously everything worked.

radoslaw.szulgo March 6, 2025 at 1:54 PM

Indeed, it seems it should work. Let us check again:

> Starting with version 2.2.0, you can restore physical and incremental physical backups into a new environment with a different replica set names. Note that the number of shards must be the same as in the environment where the you made the backup.

Details

Security Level: None

Needs QA: Yes

Created February 18, 2025 at 10:03 AM
Updated 4 days ago