Implement a way to skip an operation
Description
Activity

Akira Kurogane July 5, 2021 at 7:00 AM
Sorry Prasenjeet but we won't do this.
The idea is simple if you understand MongoDB's idempotent oplog and how it is applied. But it is not simple once you consider the MongoDB catalog gotchas (collections, indexes, UUIDs, sharding metadata). E.g. a "test.foo" collection that is dropped will be implicitly recreated by the next user insert to "test.foo", but the new "test.foo" has a different collection UUID. The UUID is what the oplog application process will use, and it is also how the sharding metadata identifies the collection.
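To illustrate the UUID point, here is a toy model in Python. It is only a sketch, not MongoDB code: the `catalog` dict and the `insert` and `drop` helpers are hypothetical stand-ins for the real catalog, and the `ui` key is assumed to carry the collection UUID as it does in oplog entries.

```python
import uuid

# Hypothetical in-memory catalog: namespace -> collection UUID.
catalog = {}

def insert(ns):
    """Model a user insert: the collection is created implicitly if missing."""
    if ns not in catalog:
        catalog[ns] = uuid.uuid4()
    return catalog[ns]

def drop(ns):
    """Model a drop: the namespace and its UUID disappear from the catalog."""
    catalog.pop(ns, None)

old_uuid = insert("test.foo")   # original collection
drop("test.foo")                # the disastrous command
new_uuid = insert("test.foo")   # implicit recreation by the next insert

# An older oplog entry still refers to the original UUID.
old_entry = {"op": "i", "ns": "test.foo", "ui": old_uuid}
entry_matches_catalog = old_entry["ui"] == catalog["test.foo"]
```

Same namespace, different UUID: an entry recorded against the original collection no longer matches what the catalog now holds under "test.foo".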
The UI design of this feature in PBM would also be difficult.
If this were provided as a feature, I can predict with high confidence that more people would make a mistake, or hit a catalog issue they didn't foresee, than would use it successfully.
'The op' you wish to eliminate during restore can be many ops in different replica sets. E.g. a drop of a sharded collection is a 'drop' command on each shard replica set, plus an update in at least the "collections" and "chunks" collections in the config database. If you don't remove all of them cleanly, there will be unrecoverable errors afterwards. Programming PBM to identify these situations to ensure safe running is not feasible.
If you'd like to do a restore using PBM with one disastrous command removed, I suggest the following: download the PITR oplog span file or files from the backup. Decompress them; you'll have to search for a tool that can decompress the 'snappy' compression used, or whatever other compression library applies if you chose a non-default compression option. Insert them using mongorestore into a standalone mongod node, remove the command you need to, mongodump again as BSON, confirming the output is ordered by "ts", then upload it back to the remote backup store under the same file name. PBM won't be aware that the oplog spans were updated.
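The "remove the command" and "confirm the output is ordered by ts" steps could be sketched roughly as below. This is a minimal illustration under assumptions, not a PBM tool: it assumes the oplog entries have already been decoded into Python dicts, represents each "ts" as a `(seconds, increment)` tuple standing in for a BSON Timestamp, and uses the hypothetical helper names `remove_op` and `is_ordered_by_ts`. It handles one replica set's entries only; for a sharded collection the matching entries on every shard and in the config metadata would each need the same care.

```python
from typing import Any, Dict, List, Tuple

# (seconds, increment): an assumed stand-in for a BSON Timestamp.
Ts = Tuple[int, int]
Entry = Dict[str, Any]

def remove_op(entries: List[Entry], skip_ts: Ts) -> List[Entry]:
    """Return the entries without the single op whose "ts" equals skip_ts."""
    kept = [e for e in entries if e["ts"] != skip_ts]
    removed = len(entries) - len(kept)
    if removed != 1:
        # Refuse to continue if nothing, or more than one op, matched.
        raise ValueError(f"expected to remove exactly 1 op, removed {removed}")
    return kept

def is_ordered_by_ts(entries: List[Entry]) -> bool:
    """Check that "ts" never decreases from one entry to the next."""
    return all(a["ts"] <= b["ts"] for a, b in zip(entries, entries[1:]))

# Made-up sample data for illustration only.
oplog = [
    {"ts": (1625468400, 1), "op": "i", "ns": "test.foo"},
    {"ts": (1625468401, 1), "op": "c", "ns": "test.$cmd", "o": {"drop": "foo"}},
    {"ts": (1625468402, 1), "op": "i", "ns": "test.bar"},
]

cleaned = remove_op(oplog, (1625468401, 1))
ordered = is_ordered_by_ts(cleaned)
```

Raising when the match count is not exactly one is a deliberate choice here: a wrong timestamp should stop the procedure rather than silently produce an unchanged or over-trimmed file.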
Please analyze the feasibility of implementing a way to skip an operation.
For example, a drop was issued and the timestamp of the operation has been properly identified. Instead of losing 10 minutes (the interval at which PITR generates new dumps), just skip that operation in that oplog dump.