Backup script will store a partial backup on error during SST

Description

If SST data transfer starts, but doesn't finish fully, the backup script stores an incomplete backup.

This is related to K8SPXC-242: we have no way to check that SST is completed successfully and that the donor is still alive.

Reproduction steps:

  • create a modified backup SST script which transmits an indefinite amount of random data (cat /dev/urandom), and modify the backup script to use it (or: add a large dataset to the cluster)

  • set up a cluster

  • kill the donor sst script before it completes

  • observe that the backup script thinks that it successfully received the entire dataset
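
The failure mode in these steps can be illustrated locally with a minimal sketch. It does not use the real SST or backup scripts: the `sh -c` sender and the `receive_stream` function are hypothetical stand-ins for the donor and the receiving side, and the 1 KiB size is arbitrary. It only shows that a receiver reading from a pipe sees a plain end-of-stream when the sender is killed, so it exits successfully with a truncated file.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins, not the operator's real scripts:
#   the `sh -c` sender plays the donor SST script,
#   receive_stream plays the receiving side of the backup script.

# Copy whatever arrives on stdin into the file named by $1.
receive_stream() {
    cat > "$1"
}

workdir="$(mktemp -d)"

# The sender writes 1 KiB and then kills itself with SIGKILL,
# simulating a donor that dies in the middle of the transfer.
sh -c 'head -c 1024 /dev/urandom; kill -9 $$' | receive_stream "$workdir/partial.stream"

# Without `set -o pipefail`, $? is the status of the last command
# in the pipeline, i.e. the receiver.
receiver_status=$?

received_bytes="$(wc -c < "$workdir/partial.stream" | tr -d ' ')"
echo "receiver exit status: $receiver_status"
echo "bytes received: $received_bytes"
```

In this sketch the receiver reports success even though the sender died, which is why the receiving side needs some explicit end-of-transfer signal rather than relying on its own exit status.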

Environment

None

Activity

Rahul Malik March 23, 2020 at 12:06 PM

For a successful backup, it should end with "completed OK!".
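
A check along these lines could be sketched as follows, assuming the backup tool's log output has been captured to a file. The helper name `backup_completed_ok`, the log paths, and the sample log lines are all made up for illustration; only the "completed OK!" marker comes from this thread.

```shell
#!/usr/bin/env bash
# backup_completed_ok is a hypothetical helper; the log file location
# is an assumption and depends on how the backup job captures output.

# Succeed only if the last non-empty line of the log ends with "completed OK!".
backup_completed_ok() {
    local log_file="$1"
    [ -r "$log_file" ] || return 1
    grep -v '^[[:space:]]*$' "$log_file" \
        | tail -n 1 \
        | grep -q 'completed OK![[:space:]]*$'
}

# Made-up sample logs: one that finished, one that was cut off.
demo_dir="$(mktemp -d)"
printf '%s\n' 'streaming data' '200323 12:06:00 completed OK!' > "$demo_dir/good.log"
printf '%s\n' 'streaming data' > "$demo_dir/truncated.log"

for log in "$demo_dir/good.log" "$demo_dir/truncated.log"; do
    if backup_completed_ok "$log"; then
        echo "$(basename "$log"): backup finished"
    else
        echo "$(basename "$log"): backup incomplete or failed"
    fi
done
```

Matching only the final non-empty line avoids a false positive when the marker text happens to appear earlier in the log.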

Mykola Marzhan March 23, 2020 at 10:34 AM

Does xbstream have anything at the end of the stream that could serve as a flag of a successful transfer?
In short: I want to run a command like tail -1 xbstream | grep "successful backup" to determine whether backup streaming finished correctly.

Zsolt Parragi March 23, 2020 at 7:02 AM

I tested this with the file-based backup, and there I didn't see any issues reported (and xbcloud isn't used there). It's possible that this only affects the file-based backup.

Mykola Marzhan March 23, 2020 at 6:59 AM

We definitely know whether the backup finished: xbstream has its own strict format, which is read by xbcloud, and if the backup uploaded correctly, xbcloud creates the "$S3_BUCKET_PATH.md5" file after a successful upload.
A simple check for the "$S3_BUCKET_PATH.md5" file is already embedded in the job.

Won't Do

Details

Created March 23, 2020 at 6:13 AM
Updated March 5, 2024 at 6:18 PM
Resolved March 19, 2021 at 1:21 PM