restore fails with applyOps: (Location10065) invalid parameter: expected an object ()
Description
Environment
Attachments
Activity
Tomislav Plavcic October 4, 2021 at 9:50 AM (edited)
@andrew.pogrebnoi
Here are the steps to reproduce:
Use the script from here: https://github.com/tplavcic/cloud-qa/blob/main/mongo_startup.sh
(it's the same old script, but changed to run 3 shards instead of 2; it seems easier to reproduce with 3 shards)
Use the PSMDB binary tarball and the PBM binary tarball with this script.
Steps:
$ ./mongo_startup.sh -s --pbmDir=/home/plavi/lab/pbm/bin/percona-backup-mongodb-1.6.0 --pbmStorage=fs
$ export PBM_MONGODB_URI=mongodb://localhost:27017/
$ ./pbm_store_set.sh
$ ./pbm backup # first backup
Wait until "config.system.sessions" starts to get sharded and its chunks get migrated; it's best to have chunks on all 3 shards, like below (it might take a few minutes for this to start, and I'm not sure what exactly kicks it off in mongo; a quick config.chunks query for checking the distribution follows the sample output):
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
config.system.sessions
shard key: { "_id" : 1 }
unique: false
balancing: true
chunks:
rs1 1020
rs2 2
rs3 2
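A quick way to check the distribution without reading the whole sh.status() output is to count chunks per shard in config.chunks. This is just a convenience check, not one of the original steps; it relies on config.chunks still having the "ns" field, which is the case on 4.2/4.4 (5.0 keys chunks by collection UUID instead):
mongos> db.getSiblingDB("config").chunks.aggregate([ { $match: { ns: "config.system.sessions" } }, { $group: { _id: "$shard", chunks: { $sum: 1 } } } ])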
$ ./pbm backup # run the second backup while this migration is in progress
mongos> sh.stopBalancer()
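The ID for the restore is just the backup name (the timestamp), so if needed it can be looked up first with:
$ ./pbm list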
$ ./pbm restore <ID of second (or last) backup>
Notice this in the pbm logs:
$ ./pbm logs
2021-10-04T09:27:58Z I [rs3/localhost:29019] [restore/2021-10-04T09:26:38Z] restore started
2021-10-04T09:27:58Z I [rs2/localhost:28019] [restore/2021-10-04T09:26:38Z] restore started
2021-10-04T09:27:58Z I [rs1/localhost:27019] [restore/2021-10-04T09:26:38Z] restore started
2021-10-04T09:27:58Z I [config/localhost:27028] [restore/2021-10-04T09:26:38Z] restore started
2021-10-04T09:28:02Z I [rs3/localhost:29019] [restore/2021-10-04T09:26:38Z] mongorestore finished
2021-10-04T09:28:02Z I [rs2/localhost:28019] [restore/2021-10-04T09:26:38Z] mongorestore finished
2021-10-04T09:28:02Z I [rs1/localhost:27019] [restore/2021-10-04T09:26:38Z] mongorestore finished
2021-10-04T09:28:02Z I [config/localhost:27028] [restore/2021-10-04T09:26:38Z] mongorestore finished
2021-10-04T09:28:04Z I [rs2/localhost:28019] [restore/2021-10-04T09:26:38Z] restoring users and roles
2021-10-04T09:28:04Z I [rs3/localhost:29019] [restore/2021-10-04T09:26:38Z] restoring users and roles
2021-10-04T09:28:04Z I [rs2/localhost:28019] [restore/2021-10-04T09:26:38Z] starting oplog replay
2021-10-04T09:28:04Z I [rs3/localhost:29019] [restore/2021-10-04T09:26:38Z] starting oplog replay
2021-10-04T09:28:04Z I [rs2/localhost:28019] [restore/2021-10-04T09:26:38Z] oplog replay finished on {1633339611 20}
2021-10-04T09:28:04Z I [rs3/localhost:29019] [restore/2021-10-04T09:26:38Z] oplog replay finished on {1633339614 21}
2021-10-04T09:28:04Z I [rs2/localhost:28019] [restore/2021-10-04T09:26:38Z] restore finished successfully
2021-10-04T09:28:04Z I [rs3/localhost:29019] [restore/2021-10-04T09:26:38Z] restore finished successfully
2021-10-04T09:28:04Z I [rs1/localhost:27019] [restore/2021-10-04T09:26:38Z] restoring users and roles
2021-10-04T09:28:04Z I [rs1/localhost:27019] [restore/2021-10-04T09:26:38Z] starting oplog replay
2021-10-04T09:28:04Z E [rs1/localhost:27019] [restore/2021-10-04T09:26:38Z] restore: oplog apply: applying an entry: applyOps: (Location10065) invalid parameter: expected an object ()
2021-10-04T09:28:04Z E [config/localhost:27028] [restore/2021-10-04T09:26:38Z] restore: waiting for start: cluster failed: oplog apply: applying an entry: applyOps: (Location10065) invalid parameter: expected an object ()
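Just as an idea for narrowing it down (I haven't dug into it yet): the failing entries are presumably applyOps command entries written around the chunk migration, so while the migration is running they can be inspected directly in the shard primary's oplog with something like:
rs1:PRIMARY> db.getSiblingDB("local").oplog.rs.find({ op: "c", "o.applyOps": { $exists: true } }).sort({ $natural: -1 }).limit(5).pretty()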
Kimberly Wilkins September 30, 2021 at 7:10 AM
Does this occur all the time? Or only when using Kubernetes envs?
Tomislav Plavcic September 29, 2021 at 8:13 AM
1. I have found today that 4.0 is not affected by this. So 4.0 and 5.0 seem unaffected, while 4.2/4.4 seem affected.
2. Yes, this is just a regular backup/restore without PITR.
andrew.pogrebnoi September 29, 2021 at 8:05 AM
Thanks, Tomislav!
Did you try v4.0?
It's just a regular backup/restore - without any PITR. Right?
Tomislav Plavcic September 24, 2021 at 3:14 PM (edited)
So I have played a bit with this today.
Good news: it doesn't affect PSMDB 5.0.2, only 4.4/4.2.
Bad news: sleeping before the backup and before the restore doesn't help. I also tried inserting much more data during the sleep before the backup, and it didn't help either.
I have managed to reproduce it manually at least once, without much data, with steps like these:
1. create a new user: db.createUser({user:"myApp",pwd:"myPass",roles:[{db:"myApp",role:"readWrite"}]})
2. add 4 documents into one test collection
3. create backup
4. drop test collection
5. do restore
Not sure if any of this helps, but these are my findings so far.
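For reference, steps 1 and 2 above look roughly like this in the mongo shell (the test1 collection name is just an example):
> use myApp
> db.createUser({user:"myApp",pwd:"myPass",roles:[{db:"myApp",role:"readWrite"}]})
> for (var i = 0; i < 4; i++) { db.test1.insert({ x: i }) }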
The issue seems to happen on restore and looks like:
...
2021-09-19T20:05:28.150+0000 no indexes to restore for collection config.migrationCoordinators
2021-09-19T20:05:28.150+0000 no indexes to restore for collection config.rangeDeletions
2021-09-19T20:05:28.150+0000 no indexes to restore for collection admin.pbmRUsers
2021-09-19T20:05:28.000+0000 I [restore/2021-09-19T20:01:28Z] mongorestore finished
2021-09-19T20:05:31.000+0000 I [restore/2021-09-19T20:01:28Z] restoring users and roles
2021-09-19T20:05:31.000+0000 I [restore/2021-09-19T20:01:28Z] starting oplog replay
2021-09-19T20:05:31.000+0000 E [restore/2021-09-19T20:01:28Z] restore: oplog apply: applying an entry: applyOps: (Location10065) invalid parameter: expected an object ()
2021-09-19T20:05:31.000+0000 D [restore/2021-09-19T20:01:28Z] releasing lock
I'm not sure at the moment whether it's caused by some parameter we pass or whether it's a genuine PBM issue.
More logs:
rs0 pbm-agent logs:
cfg0 pbm-agent logs:
operator log:
Version info:
2021-09-20T08:35:28.000+0000 I pbm-agent:
Version: 1.6.0
Platform: linux/amd64
GitCommit: 5ccc7f0911ddc73c6a33bb36c85e2a8183f17bea
GitBranch: release-1.6.0-fix-k9sdocker
BuildTime: 2021-09-19_16:53_UTC
GoVersion: go1.15.15
MongoDB version: 4.4.8-9
PBM status:
Steps to reproduce:
1. set env variables
export IMAGE=perconalab/percona-server-mongodb-operator:k8spsmdb-566
export IMAGE_PMM=percona/pmm-client:2.21.0
export IMAGE_BACKUP=perconalab/percona-server-mongodb-operator:1.10.0-backup
export IMAGE_MONGOD=percona/percona-server-mongodb:4.4.8-9
2. git clone https://github.com/percona/percona-server-mongodb-operator.git --branch release-1.10.0
3. add the cloud secret into e2e-tests/conf/cloud-secret.yml (if you don't have it, please ping me)
4. execute e2e-tests/demand-backup-sharded/run and wait until it reaches the restore from the AWS backup (~20 min)
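To watch the restore while the test runs, something like the following can be used; the psmdb-backup/psmdb-restore short names and the backup-agent sidecar container name are what I'd expect for this operator version, but they may differ, and the pod name below is just a placeholder:
$ kubectl get psmdb-backup
$ kubectl get psmdb-restore
$ kubectl logs <cluster-rs0-pod> -c backup-agent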