restore fails with applyOps: (Location10065) invalid parameter: expected an object ()

Description

The issue seems to happen during restore and looks like this:

...
2021-09-19T20:05:28.150+0000 no indexes to restore for collection config.migrationCoordinators
2021-09-19T20:05:28.150+0000 no indexes to restore for collection config.rangeDeletions
2021-09-19T20:05:28.150+0000 no indexes to restore for collection admin.pbmRUsers
2021-09-19T20:05:28.000+0000 I [restore/2021-09-19T20:01:28Z] mongorestore finished
2021-09-19T20:05:31.000+0000 I [restore/2021-09-19T20:01:28Z] restoring users and roles
2021-09-19T20:05:31.000+0000 I [restore/2021-09-19T20:01:28Z] starting oplog replay
2021-09-19T20:05:31.000+0000 E [restore/2021-09-19T20:01:28Z] restore: oplog apply: applying an entry: applyOps: (Location10065) invalid parameter: expected an object ()
2021-09-19T20:05:31.000+0000 D [restore/2021-09-19T20:01:28Z] releasing lock

I'm not sure yet whether this is caused by some parameter that we pass or by a genuine PBM issue.
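For context: the error comes from the server's applyOps command, which PBM uses to replay the oplog during restore. applyOps expects an array of oplog-entry documents, and (Location10065) invalid parameter: expected an object () is raised when a value the command expects to be a BSON document isn't one. A minimal mongo-shell sketch of the expected shape (the namespace and document are illustrative, not taken from the failing oplog):

rs1:PRIMARY> db.adminCommand({
...   applyOps: [
...     // each entry must be a document describing one oplog operation;
...     // "myApp.test" and the inserted doc here are only illustrative
...     { op: "i", ns: "myApp.test", o: { _id: 1, msg: "example insert" } }
...   ]
... })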

More logs:
rs0 pbm-agent logs:


cfg0 pbm-agent logs:


operator log:

Version info:
2021-09-20T08:35:28.000+0000 I pbm-agent:
Version: 1.6.0
Platform: linux/amd64
GitCommit: 5ccc7f0911ddc73c6a33bb36c85e2a8183f17bea
GitBranch: release-1.6.0-fix-k9sdocker
BuildTime: 2021-09-19_16:53_UTC
GoVersion: go1.15.15

MongoDB version: 4.4.8-9

PBM status:

Steps to reproduce:

1. set env variables

export IMAGE=perconalab/percona-server-mongodb-operator:k8spsmdb-566
export IMAGE_PMM=percona/pmm-client:2.21.0
export IMAGE_BACKUP=perconalab/percona-server-mongodb-operator:1.10.0-backup
export IMAGE_MONGOD=percona/percona-server-mongodb:4.4.8-9

2. git clone https://github.com/percona/percona-server-mongodb-operator.git --branch release-1.10.0
3. add cloud secret into e2e-tests/conf/cloud-secret.yml (if you don't have it please ping me)
4. execute e2e-tests/demand-backup-sharded/run and wait until it reaches the restore from the AWS backup (~20 min)

Environment

None

Attachments


Smart Checklist

Activity

Tomislav Plavcic October 4, 2021 at 9:50 AM
Edited


Here are the steps to reproduce:
Use the script from here: https://github.com/tplavcic/cloud-qa/blob/main/mongo_startup.sh
(it's the same old script, but changed to run 3 shards instead of 2 - it seems that it's easier to reproduce with 3 shards)
Use PSMDB binary tarball and PBM binary tarball for this script.

Steps:
$ ./mongo_startup.sh -s --pbmDir=/home/plavi/lab/pbm/bin/percona-backup-mongodb-1.6.0 --pbmStorage=fs
$ export PBM_MONGODB_URI=mongodb://localhost:27017/
$ ./pbm_store_set.sh
$ ./pbm backup # first backup
wait until "config.system.sessions" starts to get sharded and chunks migrated, best to have chunks in all 3 shards like below (and it might take few minutes for this to start, I'm not sure what really kicks it off in mongo):

databases: { "_id" : "config", "primary" : "config", "partitioned" : true } config.system.sessions shard key: { "_id" : 1 } unique: false balancing: true chunks: rs1 1020 rs2 2 rs3 2
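The chunk distribution above comes from sh.status() run on mongos; re-running it periodically is a simple way to watch the migration progress:

mongos> sh.status()   // re-run until config.system.sessions has chunks on all 3 shards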

$ ./pbm backup # run second backup while this migration is in progress

mongos> sh.stopBalancer()
$ ./pbm restore <ID of second (or last) backup>

notice this in pbm logs:

$ ./pbm logs
2021-10-04T09:27:58Z I [rs3/localhost:29019] [restore/2021-10-04T09:26:38Z] restore started
2021-10-04T09:27:58Z I [rs2/localhost:28019] [restore/2021-10-04T09:26:38Z] restore started
2021-10-04T09:27:58Z I [rs1/localhost:27019] [restore/2021-10-04T09:26:38Z] restore started
2021-10-04T09:27:58Z I [config/localhost:27028] [restore/2021-10-04T09:26:38Z] restore started
2021-10-04T09:28:02Z I [rs3/localhost:29019] [restore/2021-10-04T09:26:38Z] mongorestore finished
2021-10-04T09:28:02Z I [rs2/localhost:28019] [restore/2021-10-04T09:26:38Z] mongorestore finished
2021-10-04T09:28:02Z I [rs1/localhost:27019] [restore/2021-10-04T09:26:38Z] mongorestore finished
2021-10-04T09:28:02Z I [config/localhost:27028] [restore/2021-10-04T09:26:38Z] mongorestore finished
2021-10-04T09:28:04Z I [rs2/localhost:28019] [restore/2021-10-04T09:26:38Z] restoring users and roles
2021-10-04T09:28:04Z I [rs3/localhost:29019] [restore/2021-10-04T09:26:38Z] restoring users and roles
2021-10-04T09:28:04Z I [rs2/localhost:28019] [restore/2021-10-04T09:26:38Z] starting oplog replay
2021-10-04T09:28:04Z I [rs3/localhost:29019] [restore/2021-10-04T09:26:38Z] starting oplog replay
2021-10-04T09:28:04Z I [rs2/localhost:28019] [restore/2021-10-04T09:26:38Z] oplog replay finished on {1633339611 20}
2021-10-04T09:28:04Z I [rs3/localhost:29019] [restore/2021-10-04T09:26:38Z] oplog replay finished on {1633339614 21}
2021-10-04T09:28:04Z I [rs2/localhost:28019] [restore/2021-10-04T09:26:38Z] restore finished successfully
2021-10-04T09:28:04Z I [rs3/localhost:29019] [restore/2021-10-04T09:26:38Z] restore finished successfully
2021-10-04T09:28:04Z I [rs1/localhost:27019] [restore/2021-10-04T09:26:38Z] restoring users and roles
2021-10-04T09:28:04Z I [rs1/localhost:27019] [restore/2021-10-04T09:26:38Z] starting oplog replay
2021-10-04T09:28:04Z E [rs1/localhost:27019] [restore/2021-10-04T09:26:38Z] restore: oplog apply: applying an entry: applyOps: (Location10065) invalid parameter: expected an object ()
2021-10-04T09:28:04Z E [config/localhost:27028] [restore/2021-10-04T09:26:38Z] restore: waiting for start: cluster failed: oplog apply: applying an entry: applyOps: (Location10065) invalid parameter: expected an object ()

Kimberly Wilkins September 30, 2021 at 7:10 AM

Does this occur all the time?  Or only when using Kubernetes envs?

Tomislav Plavcic September 29, 2021 at 8:13 AM

1. I found today that 4.0 is not affected by this. So 4.0 and 5.0 seem unaffected, while 4.2/4.4 seem affected.
2. Yes, this is just a regular backup/restore without PITR.

andrew.pogrebnoi September 29, 2021 at 8:05 AM

Thanks, Tomislav!

  1. Did you try v4.0?

  2. It's just a regular backup/restore - without any PITR. Right?

 

Tomislav Plavcic September 24, 2021 at 3:14 PM
Edited

So I have played a bit with this today.

Good news: it doesn't affect PSMDB 5.0.2, only 4.4/4.2.
Bad news: sleeping before the backup and before the restore doesn't help. I also tried inserting much more data during the sleep before the backup, and it didn't help either.

I have managed to reproduce it manually at least once, without much data, with steps like these (see the mongo-shell sketch after the list):
1. create a new user: "db.createUser({user:"myApp",pwd:"myPass",roles:[{db:"myApp",role:"readWrite"}]})"
2. add 4 documents into one test collection
3. create backup
4. drop test collection
5. do restore
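A rough mongo-shell sketch of steps 1, 2 and 4 (the database, collection and documents are illustrative; the backup and restore in steps 3 and 5 are done with the pbm CLI as above):

mongos> use myApp
mongos> db.createUser({user: "myApp", pwd: "myPass", roles: [{db: "myApp", role: "readWrite"}]})
mongos> for (var i = 0; i < 4; i++) { db.test.insert({x: i}) }   // step 2: add 4 documents to a test collection
// step 3: take a backup here ("pbm backup")
mongos> db.test.drop()   // step 4: drop the test collection, then run the restore (step 5)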

Not sure if any of this helps, but these are my findings so far.

Done

Details

Assignee

Reporter

Time tracking

1d 6h logged

Fix versions

Affects versions

Priority

Smart Checklist

Created September 20, 2021 at 6:59 AM
Updated March 5, 2024 at 7:00 PM
Resolved October 27, 2021 at 12:21 PM
