[Investigation] Add support for incremental backups

Description

Incremental backups solve the following problems:

  1. Storage efficiency: only the changes made since the last backup (full or incremental) are saved.

  2. Faster backups: only the changes are uploaded, not a full backup.

  3. Cost effectiveness: lower network load and storage consumption.

  4. Improved RTO: combined with PITR, users can recover faster because the oplog has to be replayed over a much shorter period.

 

Percona Backup for MongoDB supports incremental backups for the physical backup type:

 

We need to support it in the Operator.

Environment

None

Activity

inel.pandzic September 17, 2024 at 8:07 AM

Regarding the implementation, it shouldn't be problematic for us: the whole mechanism we need is already in place.

Regarding the backup process, an incremental chain must start from an initial base full backup. This means that if the first backup the user creates is an incremental backup, the operator can automatically create a full physical backup instead. The second incremental backup can then be a true incremental backup based on that initial base backup.

The operator can create a missing base backup like this:
pbm backup --type incremental --base
However, it may be better to create it through a proper backup object that initiates a full physical backup.
After that, incremental backups are made with the pbm backup --type incremental command.
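
The choice between the two commands above can be sketched as a small shell helper. This is only an illustration under stated assumptions, not the operator's actual logic: the function name next_backup_cmd and its yes/no argument are hypothetical, and only the two pbm backup invocations come from the discussion above.

```shell
#!/bin/sh
# Hypothetical helper: print the pbm command for the next backup in a chain.
# $1 = "yes" if a base backup for the chain already exists, anything else otherwise.
next_backup_cmd() {
    if [ "$1" = "yes" ]; then
        # A base exists: take a regular incremental backup on top of it.
        echo "pbm backup --type incremental"
    else
        # No base yet: start a new incremental chain with a base backup.
        echo "pbm backup --type incremental --base"
    fi
}
```

The helper only prints the command instead of running it, so the decision can be checked without a running PBM agent.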

Regarding the restore process, the flow for an incremental backup is the same as for a full physical backup: specify the backup name in the pbm restore command, for example pbm restore 2022-11-25T14:13:43Z.

As seen from the backup of the command above, our spec could be something like this:

One consideration we need to keep in mind:
Percona Backup for MongoDB tracks the backup history only on the node where the base incremental backup was taken. This means that subsequent incremental backups must always be run on that very node. To make this happen, Percona Backup for MongoDB tries to schedule backups on that same node.

If the node with the base incremental backup is down or unavailable, you must start the incremental backup chain anew on another node.
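
The node constraint could be handled with a check along these lines. This is a minimal sketch under assumptions: the function chain_action, its arguments, and the continue/restart return values are hypothetical, and how the operator would actually learn which nodes are available is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: decide whether an incremental chain can continue.
# $1 = name of the node that holds the base incremental backup
# $2 = space-separated list of nodes that are currently available
chain_action() {
    base_node="$1"
    available="$2"
    for node in $available; do
        if [ "$node" = "$base_node" ]; then
            # The base node is reachable: the next incremental backup can run there.
            echo "continue"
            return 0
        fi
    done
    # The base node is down or unavailable: a new chain must be started on another node.
    echo "restart"
}
```

In this sketch, a "restart" result would correspond to taking a new base backup with pbm backup --type incremental --base, while "continue" would correspond to a plain pbm backup --type incremental.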

Done

Details

Needs QA

Yes

Created March 19, 2024 at 8:10 AM
Updated December 16, 2024 at 1:31 PM
Resolved September 17, 2024 at 8:08 AM