Provide post-backup validation functionality

Description

Over the past few years, we have on several occasions been unable to restore backups because of BSON deserialisation errors. The affected snapshots were taken, and the restores failed, on entirely different storage infrastructures (one Azure Blob store, one NFS, one CIFS).

Nothing in the logs indicated a problem when the snapshots were taken, and our attempts to read and decompress the files revealed no apparent problems, yet the restore processes still failed:

[rs0/XXX:27017] [pitrestore/snapdate] restore: mongorestore: restore mongo dump (successes: XXX / fails: 0): db_name.collection_name: error restoring from archive on stdin: reading bson input: error demultiplexing archive; archive io error

The major risk is that we had no idea these backups were unrestorable until we performed a full restore into Mongo. All of these backups were taken with old versions, so I’m not asking for bug fixes here; we are in the process of moving to the latest release.

What I would like to highlight is the need for a validation feature which can test the backups to the greatest extent possible in-place without performing a destructive restore. Possibilities include:

  • Calculating checksums of files as they are written to storage, which can be verified later

  • Restoring snapshots to Mongo into a different database, but with index creation disabled

  • An intermediate option in which dumps are read and decoded, but not sent to mongod
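
To make the first and third options concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not how the backup tool works internally: the function names are hypothetical, SHA-256 is an arbitrary choice of checksum, and the decode step checks only BSON document framing (a little-endian int32 length prefix and a trailing 0x00 byte) on a plain stream of concatenated documents read from a buffered file object. The multiplexed archive format mentioned in the error above would need its own parser and is not handled here.

```python
import hashlib
import io
import struct


def write_with_checksum(chunks, dest):
    """Write chunks to dest; return the SHA-256 hex digest of the bytes written."""
    digest = hashlib.sha256()
    for chunk in chunks:
        dest.write(chunk)
        digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(src, expected_hex, chunk_size=1 << 20):
    """Re-read src and compare its SHA-256 digest with the recorded one."""
    digest = hashlib.sha256()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest() == expected_hex


def count_bson_documents(src):
    """Walk concatenated BSON documents, checking framing only.

    Each document starts with a little-endian int32 total length (which
    includes the 4 length bytes) and ends with a 0x00 byte. Raises
    ValueError on a truncated or malformed document.
    """
    count = 0
    while True:
        header = src.read(4)
        if not header:
            return count  # clean end of stream
        if len(header) < 4:
            raise ValueError(f"truncated length prefix after {count} documents")
        (length,) = struct.unpack("<i", header)
        if length < 5:
            raise ValueError(f"invalid length {length} after {count} documents")
        body = src.read(length - 4)
        if len(body) != length - 4 or body[-1] != 0:
            raise ValueError(f"truncated or unterminated document after {count} documents")
        count += 1


if __name__ == "__main__":
    # Smallest valid BSON document: length 5, no elements, terminator byte.
    empty_doc = b"\x05\x00\x00\x00\x00"
    stored = io.BytesIO()  # stands in for the backup storage
    recorded = write_with_checksum([empty_doc, empty_doc], stored)
    stored.seek(0)
    print(verify_checksum(stored, recorded))  # True
    stored.seek(0)
    print(count_bson_documents(stored))  # 2
```

A checksum recorded at write time catches corruption introduced by storage or transfer, while the framing walk catches streams that were already malformed when written. Neither replaces a real test restore, but both can run in place without touching a live database.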

Environment

None

Activity

Aaditya Dubey June 4, 2024 at 12:17 PM

Hi

Thank you for the report and feedback.

Details

Needs QA

Yes

Created May 24, 2024 at 10:53 AM
Updated February 3, 2025 at 12:10 PM
