Document new features in percona-202302 to percona-202305

General

Escalation

Description

https://github.com/percona/percona-server/pull/5191

1. New variables:
a) Add a new variable rocksdb_block_cache_numshardbits:

   :dyn: No
   :scope: Global
   :vartype: Numeric
   :default: -1
   :min: -1
   :max: 8
    "Specifies numShardBits used for block cache for RocksDB. The cache is sharded to 2^numShardBits shards, by hash of the key.",

b) Add a new variable rocksdb_check_iterate_bounds:

   :dyn: Yes
   :scope: Global, Session
   :vartype: Bool
   :default: ON
    "Check rocksdb iterator upper/lower bounds during iterating",

c) Add a new variable rocksdb_compact_lzero_now:

   :dyn: Yes
   :scope: Global
   :vartype: Bool
   :default: OFF
    "Force to compact all L0 files.",

d) Add a new variable rocksdb_file_checksums:

   :dyn: No
   :scope: Global
   :vartype: Bool
   :default: OFF
    "Whether to write and check RocksDB file-level checksums.",

e) Add a new variable rocksdb_max_file_opening_threads:

   :dyn: No
   :scope: Global
   :vartype: Numeric
   :default: 16
   :min: 1
   :max: INT_MAX
    "Set DBOptions::max_file_opening_threads for RocksDB.",

f) Add a new variable rocksdb_partial_index_ignore_killed:

   :dyn: Yes
   :scope: Global
   :vartype: Bool
   :default: ON
    "If ON, partial index materialization will ignore the killed flag and "
    "continue materialization until completion. If queries are killed during "
    "materialization due to timeout, then the work done so far is wasted, and "
    "it is likely that killed query will be retried later, hitting the same "
    "problem.",

2. Change default values of MyRocks variables:
a) Change default value of rocksdb_compaction_sequential_deletes from 0 to 149999
b) Change default value of rocksdb_compaction_sequential_deletes_count_sd from OFF to ON
c) Change default value of rocksdb_compaction_sequential_deletes_window from 0 to 150000
d) Change default value of rocksdb_force_flush_memtable_now from ON to OFF
e) Change default value of rocksdb_large_prefix from OFF to ON and deprecate this system variable

3. Add clone plugin variables:
a) Add a new variable clone_compression_algorithm:

   :dyn: Yes
   :scope: Global
   :vartype: enum { ZLIB = 0, ZSTD }
   :default: ZSTD
    "Set compression algorithm used in clone.",

b) Add a new variable clone_zstd_compression_level:

   :dyn: Yes
   :scope: Global
   :vartype: Numeric
   :default: 3
   :min: 1
   :max: 10
    "Set zstd compression level.",

Commit descriptions that may help with describing new variables:
a) rocksdb_file_checksums

Implement RocksDB file-level checksums (#1280)
Upstream commit ID: <https://github.com/facebook/mysql-5.6/commit/14df9f16d1f24ffab7d7982133fa9247087d8f32>
Summary:
Add a new boolean system variable rocksdb_file_checksums.

b) rocksdb_force_flush_memtable_now

* rocksdb_force_flush_memtable_now default value changed to OFF to match what is done for rocksdb_force_flush_memtable_and_lzero_now and rocksdb_compact_lzero_now
* confirm that the value to which these variables are set can be parsed; that wasn't done for all of them
* don't raise an error when these are set to OFF; that will be a no-op

Note: these variables are triggers; the action is taken when one of them is set, for example: set global var = ON | true | 1
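
The trigger behavior described in the note can be sketched from the client side. This is a minimal Python sketch, not a confirmed client workflow: the helper names fire_trigger and flush_then_compact_l0 are hypothetical, the cursor is assumed to be any DB-API style object with an execute() method, and the sketch assumes each SET statement returns only after its action has finished, which this ticket does not confirm.

```python
# Hypothetical helpers; only the variable names come from the ticket.
TRIGGER_VARIABLES = {
    "rocksdb_force_flush_memtable_now",
    "rocksdb_force_flush_memtable_and_lzero_now",
    "rocksdb_compact_lzero_now",
}


def fire_trigger(cursor, variable: str) -> None:
    """Fire a MyRocks trigger variable by setting it to ON."""
    # Reject anything that is not a known trigger, so the variable name
    # can be safely interpolated into the statement text.
    if variable not in TRIGGER_VARIABLES:
        raise ValueError(f"not a known trigger variable: {variable}")
    cursor.execute(f"SET GLOBAL {variable} = ON")


def flush_then_compact_l0(cursor) -> None:
    """Flush the memtable first, then request compaction of L0 files."""
    # Assumption: each SET blocks until the triggered action completes.
    fire_trigger(cursor, "rocksdb_force_flush_memtable_now")
    fire_trigger(cursor, "rocksdb_compact_lzero_now")
```

Setting a trigger to OFF is described above as a no-op, so the sketch only ever sends ON.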

c) rocksdb_compact_lzero_now

The alternative is to add a new global variable, rocksdb_compact_lzero_now, that when set will request L0 -> base_level compaction. This allows a client to first set rocksdb_force_flush_memtable_now, wait for that to finish, then set rocksdb_compact_lzero_now.

d) rocksdb_large_prefix

Set rocksdb_large_prefix to ON and deprecate this system variable (#1322)
Upstream commit ID: <https://github.com/facebook/mysql-5.6/commit/e4751cae6431ccdbf02e2ad847c18d2b86903e09>
Summary:
This variable was introduced to mirror innodb_large_prefix, which was deprecated in 5.7 and removed in 8.0. It no longer serves any real purpose, nor is there anything to gain from its 'OFF' setting, which is also incompatible with the data dictionary schema. Thus 1) change the default to 'ON'; 2) deprecate it.

e) rocksdb_check_iterate_bounds

Check upper and lower bounds for WriteBatchWithIndex
Upstream commit ID: <https://github.com/facebook/mysql-5.6/commit/0bfdb056c66e3f1e9168087c735744e21283a658>
Add a variable, rocksdb_check_iterate_bounds, that controls whether iterate bounds are checked. When rocksdb_check_iterate_bounds is true, these bounds are checked inside MyRocks.

f) rocksdb_block_cache_numshardbits

Add rocksdb_block_cache_numshardbits for issue 1336 (#1339)
Upstream commit ID: <https://github.com/facebook/mysql-5.6/commit/730887a4080fd3426203aba33838e03f1bbde5b7>

Summary:
This fixes <https://github.com/facebook/mysql-5.6/issues/1336>

This adds the my.cnf option rocksdb_block_cache_numshardbits.

This option can be set to fix the number of block cache shards used by RocksDB.

The default value is -1 to match existing behavior. When the value is -1, RocksDB code determines the number of block cache shards as min(6, rocksdb_block_cache_size / min_shard_size); today min_shard_size is 512K for LRU and 32M for Hyper.

The math above frequently results in a block cache with too many small shards when rocksdb_block_cache_size is not very large (a few GB counts as not very large), and the resulting performance problems are hard to debug.
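
As a rough illustration of the arithmetic above, the following Python sketch computes the shard count for an explicit numShardBits value and approximates the default used when the variable is -1. The function names are hypothetical. The halving loop is one reading of the default heuristic, in which the cap of 6 applies to the number of shard bits (at most 64 shards); that reading is an assumption and is not stated in this ticket.

```python
def shard_count(num_shard_bits: int) -> int:
    """Number of shards for an explicit numShardBits value (2^bits)."""
    if num_shard_bits < 0:
        raise ValueError("use default_shard_bits() when the setting is -1")
    return 1 << num_shard_bits


def default_shard_bits(capacity_bytes: int, min_shard_size_bytes: int,
                       max_bits: int = 6) -> int:
    """Approximate the shard bits chosen when the variable is -1.

    Assumption: the bit count grows by one for each halving of
    capacity / min_shard_size and is capped at max_bits.
    """
    num_shards = capacity_bytes // min_shard_size_bytes
    bits = 0
    while num_shards > 1 and bits < max_bits:
        num_shards >>= 1
        bits += 1
    return bits


# Example: 1 GiB cache with a 512 KiB minimum shard size (the LRU value above).
bits = default_shard_bits(1 << 30, 512 * 1024)
shards = shard_count(bits)
per_shard_mib = (1 << 30) // shards // (1 << 20)
```

Under these assumptions the example yields 6 shard bits, that is, 64 shards of 16 MiB each, which is the kind of small-shard layout the commit message warns about.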

g) rocksdb_compaction_sequential_deletes

Changing default rocksdb_compaction_sequential_deletes values
Upstream commit ID: <https://github.com/facebook/mysql-5.6/commit/51116fb65fca10e4d7a60b28f6612ad71678aba3>

Summary:
This diff enables "Deletion-Triggered-Compaction" behavior by default.
A typical performance issue with LSM databases such as RocksDB is that
if you issue many updates or deletes on nearby key ranges, subsequent
range scans may read a huge number of tombstones, causing high CPU
usage and long query execution times.
With deletion-triggered compaction, when a new SST file is created by
flush or compaction and there are many tombstones within a window (this
diff sets 149999/150000), another compaction is triggered to try to
remove the tombstones. This increases CPU time for compactions, but in
many cases it helps to prevent range scan performance regressions.
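
The window check described above can be sketched in a few lines of Python. This is a simplified model and not the MyRocks implementation: the function name is hypothetical, the strict "more than trigger" comparison is an assumption, and any other conditions the real code may apply (for example how single deletes are counted) are left out. Here trigger stands for rocksdb_compaction_sequential_deletes and window for rocksdb_compaction_sequential_deletes_window.

```python
from collections import deque
from typing import Iterable


def needs_compaction(entries: Iterable[bool], window: int, trigger: int) -> bool:
    """Return True if any run of `window` consecutive entries holds
    more than `trigger` tombstones.

    entries: one bool per key written to the new SST file, True = tombstone.
    A window or trigger of 0 disables the check (the old defaults).
    """
    if window <= 0 or trigger <= 0:
        return False
    recent = deque()  # the last `window` entries seen
    deletes = 0       # tombstones currently inside the window
    for is_delete in entries:
        recent.append(is_delete)
        deletes += int(is_delete)
        if len(recent) > window:
            deletes -= int(recent.popleft())
        # Assumption: strict comparison against the trigger value.
        if deletes > trigger:
            return True
    return False
```

With the new defaults scaled down to window=5 and trigger=4, a file flagged under this model must contain five consecutive tombstones; a file that alternates deletes and puts is never flagged.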

h) rocksdb_max_file_opening_threads

Adding rocksdb_max_file_opening_threads sysvar
Upstream commit ID: <https://github.com/facebook/mysql-5.6/commit/0d685cb30ab9b07c2d7bddf221dfb12845f51332>

Summary:
This diff adds a rocksdb_max_file_opening_threads sysvar,
which maps to RocksDB max_file_opening_threads DBOption.

The default max_file_opening_threads is 16. On instance crash recovery,
if there are a lot of SST files and the instance runs on slower storage
devices, 16 threads may be high enough to cause stalls on other
database instances running on the same devices. You can control
concurrency to prevent such stalls.

i) rocksdb_partial_index_ignore_killed

Add option to ignore killed flag for materialization

Upstream commit ID: <https://github.com/facebook/mysql-5.6/commit/7d17329e6b686653a4661b166679c357a284df37>

Summary:
When queries time out, we abandon any work that was done to materialize partial indexes. This may not be desirable since the client will most likely retry, and still fail because it is querying a large group that requires materialization.

Add a flag that lets the materialization code path ignore the kill flag and finish materialization. If this turns out to cause problems, the flag can be turned off so that the kill flag is checked again.
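
The effect of the flag can be sketched as a loop that either honors or ignores a kill check. This is an illustrative Python model only: materialize, QueryKilled, and the is_killed callback are hypothetical names, and the real MyRocks code path is not shown in this ticket.

```python
from typing import Callable, Iterable, List


class QueryKilled(Exception):
    """Raised when materialization is abandoned because the query was killed."""


def materialize(groups: Iterable[str], is_killed: Callable[[], bool],
                ignore_killed: bool = True) -> List[str]:
    """Materialize each group; optionally stop when the query is killed.

    ignore_killed mirrors rocksdb_partial_index_ignore_killed: when True,
    the kill check is skipped and the work always runs to completion.
    """
    done: List[str] = []
    for group in groups:
        if not ignore_killed and is_killed():
            # Work collected in `done` is discarded, as described above.
            raise QueryKilled(f"killed after {len(done)} group(s)")
        done.append(group)  # stand-in for the real materialization work
    return done
```

With ignore_killed=True a retried query finds the work already finished; with ignore_killed=False a kill during the loop throws the partial work away.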