FLUSH CHANGED_PAGE_BITMAPS leaves gaps between the last written bitmap LSN and InnoDB checkpoint LSN

Description

This happens because the checkpoint LSN is not necessarily aligned to the log block size. The log tracker reads in chunks aligned to LOG_BLOCK_SIZE to avoid reading incomplete blocks. When log_online_follow_redo_log completes, it may leave the last redo log record in the parse buffer. The last tracked LSN will then be roughly the last checkpoint LSN minus the length of the last log record.
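
The effect described above can be illustrated with a small, simplified model. This is only a sketch and not the server code: the 512-byte block size, the helper names `align_down` and `last_tracked_lsn`, and the record layout (records placed back to back, block headers and trailers ignored) are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using lsn_t = std::uint64_t;

// Assumed log block size for this sketch.
constexpr lsn_t kLogBlockSize = 512;

// Round an LSN down to the start of its log block.
lsn_t align_down(lsn_t lsn, lsn_t block_size) {
  return lsn - lsn % block_size;
}

// Hypothetical model of one tracking pass: the reader only sees bytes up to
// the block-aligned checkpoint LSN, and only complete records are counted as
// tracked. A record that does not fit stays in the parse buffer.
lsn_t last_tracked_lsn(lsn_t start_lsn,
                       const std::vector<lsn_t>& record_lengths,
                       lsn_t checkpoint_lsn) {
  assert(start_lsn <= checkpoint_lsn);
  const lsn_t read_limit = align_down(checkpoint_lsn, kLogBlockSize);
  lsn_t tracked = start_lsn;
  for (lsn_t len : record_lengths) {
    if (tracked + len > read_limit) {
      break;  // incomplete record: left unparsed until more log is read
    }
    tracked += len;
  }
  return tracked;
}
```

With records of 8192, 12000 and 9000 bytes starting at LSN 0 and a checkpoint LSN of 29000, this model stops at LSN 20192, which is 8808 bytes (a little over 17 blocks of 512 bytes) behind the checkpoint. The figures are illustrative only.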

Environment

None

Activity

Laurynas Biveinis March 1, 2019 at 8:47 AM

I am sorry, I got it backwards. Of course it makes sense.

Sergei Glushchenko March 1, 2019 at 8:31 AM

There is a comment in log0recv.cc:

I read it as follows: redo log apply starts at the checkpoint LSN or, if the checkpoint LSN points to the middle of a log record, at the first log record past the checkpoint LSN.

What the code actually does is start parsing the redo log from within the log block that checkpoint_lsn points to. This is done because checkpoint_lsn may point to the middle of a log record. It does, however, remember bytes_to_ignore_before_checkpoint as the number of bytes to ignore between parse_start_lsn and checkpoint_lsn. So it seems to do exactly what is advertised: log records are indeed applied starting at the checkpoint LSN or at some point after it.

Anyway, it is guaranteed that redo log application starts from the log block containing the checkpoint LSN. What I see with the current changed page tracking is that the tracked LSN is ~9k bytes behind the checkpoint LSN, which is ~17 log blocks. (I ran large inserts with log record sizes from 8k to 16k.)
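
The relationship between parse_start_lsn, checkpoint_lsn and bytes_to_ignore_before_checkpoint can be sketched as below. This is a simplified model rather than the code from log0recv.cc: the functions `same_log_block` and `bytes_to_ignore` and the 512-byte block size are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;

// Assumed log block size for this sketch.
constexpr lsn_t kLogBlockSize = 512;

// True when both LSNs fall into the same log block.
bool same_log_block(lsn_t a, lsn_t b) {
  return a / kLogBlockSize == b / kLogBlockSize;
}

// Hypothetical helper: parsing starts somewhere inside the block that
// contains the checkpoint LSN, so the bytes between the parse start and the
// checkpoint have to be skipped rather than applied.
lsn_t bytes_to_ignore(lsn_t parse_start_lsn, lsn_t checkpoint_lsn) {
  assert(same_log_block(parse_start_lsn, checkpoint_lsn));
  return parse_start_lsn < checkpoint_lsn ? checkpoint_lsn - parse_start_lsn
                                          : 0;
}
```

For example, with a parse start at LSN 28684 and a checkpoint LSN of 29000 (both in the block starting at 28672), the model skips 316 bytes before it reaches the checkpoint.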

Laurynas Biveinis March 1, 2019 at 8:02 AM

How can crash recovery apply a record whose tail falls after the checkpoint LSN? By definition it's not a persisted one?

Sergei Glushchenko February 28, 2019 at 4:23 PM

Crash recovery starts applying log records at the checkpoint LSN or after it. So we need to track everything at least up to the checkpoint LSN (it is fine if we track more). If we tracked less, the backup would be invalid.
The suggested fix is to remember the checkpoint LSN at the beginning of the FLUSH command. This LSN will be equal to or greater than the one xtrabackup started to copy redo logs from. Then run log_online_follow_redo_log, possibly several times, until log_bmp_sys->parse_buf.get_current_lsn() becomes greater than or equal to the saved LSN. At that point we consider the FLUSH done.
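
A minimal sketch of the suggested loop, under stated assumptions: `TrackerHooks` and its three callbacks are hypothetical stand-ins for reading the checkpoint LSN, calling log_online_follow_redo_log, and calling log_bmp_sys->parse_buf.get_current_lsn(). The `max_passes` bound is also an assumption added so the example always terminates; it is not part of the proposal.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

using lsn_t = std::uint64_t;

// Hypothetical stand-ins for the server-side calls named in the comment.
struct TrackerHooks {
  std::function<lsn_t()> checkpoint_lsn;  // current InnoDB checkpoint LSN
  std::function<bool()> follow_redo_log;  // one tracking pass, false on error
  std::function<lsn_t()> tracked_lsn;     // LSN reached by the parse buffer
};

// Returns true once the tracked LSN has reached the checkpoint LSN that was
// saved when the FLUSH started; false on error or if the pass limit is hit.
bool flush_changed_page_bitmaps(const TrackerHooks& hooks, int max_passes) {
  assert(max_passes > 0);
  // Remember the checkpoint LSN at the beginning of the FLUSH command.
  const lsn_t target_lsn = hooks.checkpoint_lsn();
  for (int pass = 0; pass < max_passes; ++pass) {
    if (!hooks.follow_redo_log()) {
      return false;  // tracking pass failed
    }
    if (hooks.tracked_lsn() >= target_lsn) {
      return true;  // everything up to the saved LSN is tracked
    }
  }
  return false;  // gave up: tracked LSN is still behind the saved LSN
}
```

The key design point is that the target is captured once, before the first pass, so the loop does not chase a checkpoint LSN that keeps advancing while the FLUSH runs.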

Laurynas Biveinis February 28, 2019 at 1:37 PM

Is that an actual issue and not something forced by design? InnoDB crash recovery now operates under the same constraint: it might find a truncated log record right at the checkpoint boundary, hence effectively the last log record is the one before it.

What would be the fix?

Done

Details

Time tracking

1d 25m logged

Created February 27, 2019 at 4:51 PM
Updated March 6, 2024 at 12:19 PM
Resolved March 6, 2019 at 6:55 AM