MyRocks - rocksdb_flush_log_at_trx_commit 0 and 2 seem to have the same behavior
General
Escalation
General
Escalation
Description
Environment
pt-summary
pt-mysql-summary
Smart Checklist
Activity
Show:

Julia Vural March 4, 2025 at 9:02 PM
It appears that this issue is no longer being worked on, so we are closing it for housekeeping purposes. If you believe the issue still exists, please open a new ticket after confirming it's present in the latest release.

Matthew Boehm July 24, 2024 at 8:20 PM
Any updates on this issue? Prominent customer is asking about this behavior as well.

Lalit Choudhary February 15, 2019 at 1:32 PM
In test seeing increment once per second for Rocksdb_flush_log_at_trx_commit= 0 and 2

Lalit Choudhary February 15, 2019 at 1:29 PM
Hi Peter,
Thank you for the report.
I did a little testing with rocksdb_flush_log_at_trx_commit in order to get a better idea of how it works, especially considering the value of 0 in InnoDB where you are reliant on innodb_flush_log_at_timeout to help establish when log flushing should occur.
For test 1 I set the value of rocksdb_flush_log_at_trx_commit to 1, performed 10 inserts and checked for approximately 10 WAL syncs. No issues here.
For my second test I set the value to 2 and performed 500 inserts. I also set rocksdb_wal_bytes_per_sync to a very high value to ensure no extra syncs would occur as a result of WAL byte writing. The insert batch took 19 seconds. There were 23 syncs. Accounting for human time I would consider this to be close enough. I also manually checked the rocksdb_wal_synced status variable during this time and during periods of no activity I still saw it increment once per second. This appears to be good as well.
For my third and final set I set the value of rocksdb_flush_log_at_trx_commit to 0. Per the documentation this indicates "Do not sync on transaction commit". However it doesn't really state when the sync should occur. For InnoDB this is controled by innodb_flush_log_at_timeout, but there doesn't seem to be a similar variable for MyRocks. I figured this might be controlled by rocksdb_wal_bytes_per_sync, but this did not appear to be the case.
As soon as I configured this variable to 0 I monitored the status of rocksdb_wal_synced and noticed that it was increment once per second, just like value 2. I decided to test further.
For test 4 I figured that In order to test the syncing behavior of rocksdb_wal_bytes_per_sync I need to write data where rocksdb_wal_bytes_per_sync is greater than the amount of data in 1 row, but less that the amount of data that can be written in 1 second. Per my initial single record write tests, 2461 bytes is written per record. Per a batch test 500 records (individually inserted) takes 20 seconds and consumed 1183053 WAL bytes, meaning it consumes about 60000 bytes per second.
With this taken into consideration, assuming that rocksdb_wal_bytes_per_sync acts as the WAL in-memory buffer size, If I insert 500 records with a value of rocksdb_wal_bytes_per_sync = 30000 I should see 40 syncs for cache fill and 20 syncs for the timed flush, assuming the timed flush still happens
This failed. Regardless of how I configured rocksdb_wal_bytes_per_sync, rocksdb_wal_synced always incremented once per second when the rocksdb_flush_log_at_trx_commit variable was set to 0
In conclusion it looks like value 0 and 2 for rocksdb_flush_log_at_trx_commit do the same thing. It's possible that rocksdb_wal_bytes_per_sync is not part of the equation for this as this was just an educated guess on my part. I would suggest including a variable similar to innodb_flush_log_at_timeout, or at least update the documentation to better show under what circumstances the WAL will be synced with rocksdb_flush_log_at_trx_commit set to 0.