xbstream does long post-processing after receiving all data

Description

When streaming a backup that contains huge tables, xbstream spends a significant amount of time on post-processing after it has received all the data.

Why this is a problem: PXC uses PXB for SST. On the Joiner node, a mechanism guards against a stuck SST transfer by monitoring whether the receive directory grows. If the directory size does not change for sst-idle-timeout (default 120 sec), the SST script considers the transfer stuck and cancels the SST. During xbstream's post-processing the directory size stays constant, so a healthy transfer can be cancelled.
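
The idle check described above can be pictured as a small watchdog. This is only a minimal sketch under assumptions, not the actual PXC SST script: the function names `dir_size` and `watch_dir_growth` are made up for illustration, the polling interval is an assumption, and `du -b` requires GNU coreutils.

```shell
# Hypothetical helper: apparent size of a directory in bytes
# (same measurement as "du -b -s" used in the reproduction steps).
dir_size() {
    du -b -s "$1" | cut -f1
}

# Hypothetical watchdog: watch_dir_growth DIR IDLE_TIMEOUT [INTERVAL]
# Loops for as long as the size of DIR keeps changing. Returns 1 once the
# size has stayed the same for roughly IDLE_TIMEOUT seconds.
watch_dir_growth() {
    local dir="$1" idle_timeout="$2" interval="${3:-1}"
    local last_size size idle=0
    last_size=$(dir_size "$dir")
    while [ "$idle" -lt "$idle_timeout" ]; do
        sleep "$interval"
        size=$(dir_size "$dir")
        if [ "$size" != "$last_size" ]; then
            # The directory changed: remember the new size, reset the idle timer.
            last_size="$size"
            idle=0
        else
            idle=$((idle + interval))
        fi
    done
    echo "no size change in $dir for ${idle}s, treating transfer as stuck" >&2
    return 1
}
```

For example, `watch_dir_growth /path/to/receive-dir 120` would give up after about 120 seconds without a size change, even if a process is still working on files that are already fully written.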

Side note: in PXC it seems possible to monitor not only the receive directory size but also xbstream activity. That is an imperfect workaround, though: how much CPU consumption should count as “doing something”, and how much as “waiting for data while the network is stuck”? A better solution would be to avoid this “silent” period in PXB, or to let the outside world know in some way that streaming is still in progress.

Steps to reproduce:

  1. Create a large database. I’ve attached create_manga_db.sql.

  2. Populate the database tables with data. I’ve attached the manga-insert.sh script (modify the db connection parameters). It takes some time, but it results in a database of ca. 500 GB with a few huge tables (more than 100 GB each).

  3. Take a backup (I used PXB 8.0.35)

    ./xtrabackup --defaults-file=/home/kamil.holubicki/sandboxes/msb_8_0_41/my.sandbox.cnf --backup --compress=lz4 --stream=xbstream --target-dir /home/kamil.holubicki/backup-1 -umsandbox -pmsandbox -H127.0.0.1 -P8041 > /home/kamil.holubicki/backup-1.stream
  4. Restore the backup

    mkdir -p /bigdisk/kamil.holubicki/decompressed
    cd /bigdisk/kamil.holubicki/decompressed
    cat ../backup-1.stream | /home/kamil.holubicki/pxb-8.0.35/bin/xbstream -x --decompress --decompress-threads=4 --parallel=4
  5. In another terminal observe the directory size

    while sleep 1; do du -b -s decompressed; du -b -s -h decompressed; done
  6. Once the directory size stops growing, check that the xbstream process is still active. In my test (highram2 server) it stays active for ca. 3 min 45 s, and then xbstream finishes.
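
Steps 5 and 6 can be combined into one helper that reports the length of the silent period directly. The following is a rough sketch, assuming GNU `du` and a shell with `local`; the function name `measure_silent_time` is hypothetical, and the result is only accurate to about the polling interval.

```shell
# Hypothetical helper: measure_silent_time PID DIR [INTERVAL]
# Polls DIR until process PID exits, then prints the number of seconds
# between the last observed size change of DIR and the exit of PID.
measure_silent_time() {
    local pid="$1" dir="$2" interval="${3:-1}"
    local last_size size last_change now
    last_size=$(du -b -s "$dir" | cut -f1)
    last_change=$(date +%s)
    # "kill -0" sends no signal; it only checks that the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        sleep "$interval"
        size=$(du -b -s "$dir" | cut -f1)
        if [ "$size" != "$last_size" ]; then
            last_size="$size"
            last_change=$(date +%s)
        fi
    done
    now=$(date +%s)
    echo $((now - last_change))
}
```

It could be started in the second terminal as `measure_silent_time "$(pidof xbstream)" decompressed`, assuming exactly one xbstream process is running.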

Note 1: A stream file is already prepared (to skip steps 1, 2, and 3). Its location is /bigdisk/kamil.holubicki/backup-1.stream


I attached gdb to xbstream during the “silence”, and here is the call stack:

    #0 0x00007f8b1d4fdb6c in read () from /lib64/libc.so.6
    #1 0x0000000000414910 in read (__nbytes=10485760, __buf=0x7f8b0ca04b10, __fd=3) at /usr/include/bits/unistd.h:38
    #2 my_read (fd=3, Buffer=0x7f8b0ca04b10 ".Q\234^", Count=10485760, MyFlags=MyFlags@entry=0) at /usr/src/debug/percona-xtrabackup-80-8.0.35-32.1.el9.x86_64/mysys/my_read.cc:87
    #3 0x00000000004197d8 in datafile_read (cursor=0x7f8b18bf5930) at /usr/src/debug/percona-xtrabackup-80-8.0.35-32.1.el9.x86_64/storage/innobase/xtrabackup/src/file_utils.cc:266
    #4 restore_sparseness (buffer_size=10485760, error=0x7f8b18bf52c0 "\300\211\001\024\213\177", src_file_path=0x7f8b18bf54c0 "manga_data/manga_reading_status.ibd") at /usr/src/debug/percona-xtrabackup-80-8.0.35-32.1.el9.x86_64/storage/innobase/xtrabackup/src/file_utils.cc:316
    #5 extract_worker_thread_func (ctxt=...) at /usr/src/debug/percona-xtrabackup-80-8.0.35-32.1.el9.x86_64/storage/innobase/xtrabackup/src/xbstream.cc:595
    #6 0x00007f8b1d8dbad4 in execute_native_thread_routine () from /lib64/libstdc++.so.6
    #7 0x00007f8b1d489d32 in start_thread () from /lib64/libc.so.6
    #8 0x00007f8b1d50edc0 in clone3 () from /lib64/libc.so.6
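
A call stack like this can also be captured without an interactive session, which helps when the silent window is short. This is a small sketch, not part of the original report: the wrapper name `dump_stacks` and the `GDB` override variable are hypothetical, while `-p`, `-batch`, and `-ex` are standard gdb options.

```shell
# Hypothetical wrapper: dump_stacks PID
# Attaches gdb to PID, prints the backtraces of all threads, then detaches
# and exits. Set GDB to use a gdb binary that is not on PATH (assumption).
dump_stacks() {
    local pid="$1"
    "${GDB:-gdb}" -p "$pid" -batch -ex "thread apply all bt"
}
```

Running `dump_stacks "$(pidof xbstream)"` a few times during the silence shows whether the worker threads stay in the same frames.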

Indeed, it seems that restore_sparseness reads the whole (huge) .ibd file, which takes time.

This behavior was introduced in PXB by commit 079ea2bb.

Note 2: It is probably possible to reproduce the issue with just one huge table.

Note 3: A compressed backup is necessary, because the problematic branch is executed only in that case.

Environment

None

Attachments

3
  • 21 Mar 2025, 03:17 PM
  • 21 Mar 2025, 03:17 PM
  • 21 Mar 2025, 03:16 PM

Details

Needs QA: Yes

Created last week
Updated last week

Activity

Kamil Holubicki, last week

I created an xbstream coredump: /bigdisk/kamil.holubicki/xbstream.coredump

To load the matching debug symbols in gdb:

    symbol-file /home/kamil.holubicki/pxb-debug/usr/lib/debug/usr/bin/xbstream-8.0.35-32.1.el9.x86_64.debug