multi-shard transactions research
Description
Environment
is cloned by
is duplicated by
relates to
Smart Checklist
Activity
Akira Kurogane December 17, 2019 at 5:57 AMEdited
Four additions to replications internals README in the MongoDB upstream are relevant reading.
Akira Kurogane October 28, 2019 at 12:46 PM
@Former user The test we'd like to do here is:
Prepare 2+ shard cluster
Create collection that is spread over 2+ shards. E.g.: use a shard key of the "_id" field; Insert doc with "_id": 99; and another with "_id": 101; sh.splitAt({"_id": 100}); sh.moveChunk("test.foo", {"_id": 101}, "shard2") (whatever the other shard is).
Start a transaction, make it a multi-shard transaction by deleting both docs "_id": 99 and 101. (It could be inserts or updates, but I think deletes is simpler to check). But don't commit the transaction
Without committing the transaction, stop and immediately make the backup. Important to finish the backup before the 60-second transaction timeout finishes.
Let the 60 second expire.
Run pbm restore.
Allow 60 seconds to pass.
Confirm the transaction writes didn't happen. E.g. if there were deletes confirm that docs with id 99 and 101 exist.
Akira Kurogane October 7, 2019 at 3:10 AM
N.b. We're not 100% sure that this doesn't already work. That is, we're not sure if (or if not) a v4.2 primary node that is having these "prepare": true
oplog docs applied via the applyOps command will buffer rather than immediately apply them.
Two ways forward: read the code, or do the test.
Details
Assignee
UnassignedUnassignedReporter
Akira KuroganeAkira Kurogane(Deactivated)Needs QA
YesTime tracking
30m loggedFix versions
Priority
Medium
Details
Details
Assignee
Reporter
Needs QA
Time tracking
Fix versions
Priority
Smart Checklist
Open Smart Checklist
Smart Checklist
Open Smart Checklist
Smart Checklist

PBM v1.0 does no detection of the MongoDB 4.2. transactions that are multi-shard, which need to be buffered and not applied until a "commitTransaction" is reached.
We need to document that v1.0 does not support them.
It does support transactions of the 4.0 style. This means the way they are in non-sharded replicasets (v4.2 too), and ones in 4.2 clusters that are limited to a single shard are the same oplog format, so those are OK.