Over the past few years, we have on several occasions been unable to restore backups because of BSON deserialisation errors. The snapshots were taken, and the failures occurred, on entirely different infrastructures (one Azure Blob store, one NFS, one CIFS).
Nothing in the logs indicated a problem when the snapshots were taken, and our attempts to read and decompress the files didn't reveal any apparent problems, but the restore processes still failed.
The major risk is that we had no idea these backups were unrestorable until we performed a full restore into Mongo. All of these backups were taken with old versions, so I'm not asking for bug fixes here; we are in the process of moving to the latest release.
What I would like to highlight is the need for a validation feature which can test the backups to the greatest extent possible in-place without performing a destructive restore. Possibilities include:
Calculating checksums of files as they are written to storage, which can be verified later
Restoring snapshots to Mongo into a different database, but with index creation disabled
An intermediate mode where dumps are read and decoded, but not sent to mongod
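
To make the first and third ideas more concrete, here is a rough Python sketch of what such checks might look like from the outside. It assumes the backup is an uncompressed file made of BSON documents written back to back, each starting with a 4-byte little-endian length and ending with a zero byte, as the BSON format defines. The function names `sha256_of` and `check_bson_framing` and the size limit are my own placeholders, not part of any Mongo tooling, and the framing check is only a partial stand-in for real decoding: it confirms document boundaries but does not parse individual fields.

```python
import hashlib
import struct

# Assumption: 16 MiB, the documented default maximum BSON document size.
# Passed as a parameter below so it can be raised if a deployment needs it.
MAX_BSON_SIZE = 16 * 1024 * 1024


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_bson_framing(path, max_doc_size=MAX_BSON_SIZE):
    """Walk a file of concatenated BSON documents without touching mongod.

    Returns the number of documents found, or raises ValueError at the
    first document whose length prefix or terminator is inconsistent.
    """
    count = 0
    with open(path, "rb") as fh:
        while True:
            offset = fh.tell()
            header = fh.read(4)
            if not header:
                return count  # clean end of file
            if len(header) < 4:
                raise ValueError(f"truncated length prefix at byte {offset}")
            (size,) = struct.unpack("<i", header)
            # The smallest valid document is 5 bytes: length plus terminator.
            if size < 5 or size > max_doc_size:
                raise ValueError(f"implausible document size {size} at byte {offset}")
            body = fh.read(size - 4)
            if len(body) != size - 4:
                raise ValueError(f"truncated document at byte {offset}")
            if body[-1] != 0:
                raise ValueError(f"missing terminator at byte {offset}")
            count += 1
```

In this sketch the digest would be recorded next to the file when the backup is written and compared again before a restore, while the framing walk could be run against the stored file at any time. A built-in validation feature would presumably go further and decode every element, which is where our deserialisation errors actually surfaced.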