Missing 'posix_fadvise(file, 0, 0, POSIX_FADV_SEQUENTIAL)' when reading the redo log files

User Story

It is painful to run xtrabackup against a server with a high write load. The data file part usually completes but the redo copy loop is slow and sometimes ends with a "wrapped around" message. It appears the kernel is not doing read ahead because 'posix_fadvise(file, 0, 0, POSIX_FADV_SEQUENTIAL)' is not called. This is a fairly easy fix.

Acceptance Criteria

Faster redo log copy.

Description

'posix_fadvise(file, 0, 0, POSIX_FADV_SEQUENTIAL)' is missing when copying the redo log.

Environment

None

Attachments

1
  • 14 Mar 2025, 05:37 PM

Activity

Show:

Satya Bodapati last week

Thanks, ​@Yves Trudeau. We will handle redo log copying as a performance improvement. I have an EPIC to handle the catchup phase(get rid of it), and other strategies to handle it.
We hope to solve this someday.

Yves Trudeau last week

So, I generated flamegraphs of all the versions and found no difference. Even the stock version, without my fadvise patch, apparently calls the read ahead functions. As there are workarounds for the observed problems, I believe we can close this ticket.

--Yves

Yves Trudeau March 19, 2025 at 2:22 PM

I am very surprised too with the differences between the packaged and compiled version. I may need to profile it. Having said that, I discussed with the customer the use of the redo-log-arch-dir option. This option could be better documented but that is another story. Another customer having PXB issues found out the NFS mount used for the backup was using a degraded raid-5 array. Write performance was terrible.

I’ll run a more test today and possible have some flamegraphs.

Satya Bodapati March 18, 2025 at 4:15 PM

Regarding the 22MB/s comment, I meant that if server is writing at 120MB/s, xtrabackup could better (Assuming there is sufficient disk IO capacity left.)

And coming to the differences, I doubt if there are some build issues or packagaging flags that can cause this slowness.
Do you think there is difference between compiled xtrabackup and packaged xtrabackup? You can remove my changes too, and compare package vs compiled?

Yves Trudeau March 14, 2025 at 7:51 PM

So, I recompiled with your patch and with and without mine. The results are below:

root@ytrudeau-coder-ytrudeau-devbox:~# grep catchup xberr.log xberr.log.2 xberr.log.3 xberr.log:2025-03-14T19:11:51.443789-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup started xberr.log:2025-03-14T19:11:59.787957-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup completed: 8 seconds xberr.log:2025-03-14T19:11:59.788016-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup copied: 6970865152 bytes xberr.log:2025-03-14T19:11:59.789454-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup speed: in bytes/sec: 8.71358e+08 xberr.log.2:2025-03-14T19:14:47.796244-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup started xberr.log.2:2025-03-14T19:14:57.182263-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup completed: 10 seconds xberr.log.2:2025-03-14T19:14:57.182323-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup copied: 7002074624 bytes xberr.log.2:2025-03-14T19:14:57.182355-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup speed: in bytes/sec: 7.00207e+08 xberr.log.3:2025-03-14T19:30:49.273405-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup started xberr.log.3:2025-03-14T19:30:58.767806-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup completed: 9 seconds xberr.log.3:2025-03-14T19:30:58.769196-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup copied: 6991886336 bytes xberr.log.3:2025-03-14T19:30:58.769243-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup speed: in bytes/sec: 7.76876e+08 root@ytrudeau-coder-ytrudeau-devbox:~# grep catchup xberr.log.patch* xberr.log.patch:2025-03-14T19:08:08.697207-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup started xberr.log.patch:2025-03-14T19:08:18.541751-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup completed: 10 seconds xberr.log.patch:2025-03-14T19:08:18.542438-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup copied: 6995402240 bytes xberr.log.patch:2025-03-14T19:08:18.542466-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup speed: in bytes/sec: 6.9954e+08 xberr.log.patch.2:2025-03-14T19:18:02.288104-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup started xberr.log.patch.2:2025-03-14T19:18:10.387148-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup completed: 8 seconds xberr.log.patch.2:2025-03-14T19:18:10.388001-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup copied: 7054010368 bytes xberr.log.patch.2:2025-03-14T19:18:10.388032-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup speed: in bytes/sec: 8.81751e+08 xberr.log.patch.3:2025-03-14T19:23:40.374267-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup started xberr.log.patch.3:2025-03-14T19:23:49.794520-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup completed: 9 seconds xberr.log.patch.3:2025-03-14T19:23:49.794566-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup copied: 6999840256 bytes xberr.log.patch.3:2025-03-14T19:23:49.794589-00:00 0 [Note] [MY-011825] [Xtrabackup] Redo catchup speed: in bytes/sec: 7.7776e+08

As you can see, there are no differences. I retried the packaged xtrabackup and got a slow result (13s) . Could it be a different build environment? Seems too big. Let’s say I am puzzled.

Duplicate

Details

Assignee

Reporter

Needs QA

Yes

Priority

Smart Checklist

Created March 11, 2025 at 8:16 PM
Updated last week
Resolved last week