myrocks table lost all data after alter table in IngestExternalFile
Activity

Przemyslaw Malkowski February 14, 2025 at 3:55 PM
The problem still exists for simple alter table x engine = rocksdb; queries.
Table data is lost entirely after running such a statement in a simple loop.
The same is confirmed with a loop of alter table sbtest1 add column d int; alter table sbtest1 drop column d;
A big enough sysbench table (1M rows) and a fast disk allow reproducing the issue in just 3-4 iterations.
Enabling rocksdb_bulk_load_use_sst_partitioner does not help in this case.
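A minimal sketch of such a rebuild loop (table name, database name, and iteration count are assumptions; the mysql client call is illustrative and left commented out):

```shell
# Hypothetical sketch of the destructive rebuild loop described above.
# Assumes a sysbench-prepared table test.sbtest1; only the SQL text is
# generated here.
for i in 1 2 3 4 5; do
  echo "ALTER TABLE test.sbtest1 ENGINE = ROCKSDB;"
done > alter_loop.sql
# Replay against the server while sysbench traffic runs, e.g.:
#   mysql test < alter_loop.sql
```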

Julia Vural September 27, 2024 at 7:16 AM
Proposed a workaround, closing the issue.

Przemyslaw Skibinski July 1, 2024 at 7:43 AM
Hi, we don’t know why rocksdb-bulk-load-use-sst-partitioner is disabled by default, but I expect some side effects are possible, because Meta is not afraid to change default values when it makes sense for them.

Przemyslaw Skibinski June 28, 2024 at 7:36 AM
We received feedback from Meta. They suggested using both:
1) a one-line fix
2) setting the --rocksdb-bulk-load-use-sst-partitioner=1 option
In my experiments 1) doesn't solve the issue but 2) does.
Basically, with 2) they gave us just a workaround. The problem is that rocksdb-bulk-load-use-sst-partitioner is disabled by default, so I expect it has some drawbacks, e.g. worse performance.
Moreover, Meta mentioned that they plan to rewrite bulk loading, so probably the current implementation is buggy and hard to fix.
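In config form, the suggested workaround reduces to a single server option; a minimal my.cnf fragment (section name per the standard MySQL config layout) might look like:

```ini
[mysqld]
# Workaround suggested by Meta for the lost-data-on-ALTER issue.
# This option is disabled by default upstream, so side effects
# (e.g. worse performance) have not been ruled out.
rocksdb_bulk_load_use_sst_partitioner = 1
```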

Przemyslaw Skibinski April 2, 2024 at 3:58 PM
We noticed that rocksdb_stress.drop_cf_stress sometimes fails with the same issue, i.e. Global seqno is required, but disabled.
I managed to simplify rocksdb_stress.drop_cf_stress and created rocksdb_stress.alter_table_crash. The crash is caused by an assert(0) I put after Global seqno is required, but disabled is logged/encountered.
To trigger the issue, each worker runs a statement in a loop with different table names (tbl0X).
It is probably a race condition, because I didn't manage to reproduce it with a single thread.
The issue is encountered without using rocksdb_bulk_load=1.
To reproduce the issue please use:
I also confirmed that the issue started to appear after the offending commit was introduced to the RocksDB repo.
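The multi-worker pattern above can be sketched as follows (the specific DDL statements, worker count, and file layout are assumptions; the actual rocksdb_stress.alter_table_crash test drives the server directly):

```shell
# Hypothetical sketch: several workers, each looping DDL against its
# own table (tbl01, tbl02, ...), which is the shape of the suspected
# race. Only the SQL text is generated here; feeding it to the server
# concurrently is left out.
for w in 1 2 3 4; do
  for i in $(seq 1 10); do
    echo "ALTER TABLE tbl0${w} ADD COLUMN d INT;"
    echo "ALTER TABLE tbl0${w} DROP COLUMN d;"
  done > worker${w}.sql
done
```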
Description
When rocksdb_bulk_load is 1, an ALTER TABLE on a MyRocks table may lose all data in IngestExternalFile.
The error log is:
Based on the comment, IngestExternalFile() will fail if the key range overlaps with existing keys or tombstones in the DB. The commit 9502856edd77260bf8a12a66f2a232078ddb2d60 on RocksDB contributed to this issue. You can find more details about the change in PR #10988.
Steps to reproduce:
1. create the tables
2. session 1: run sysbench
3. session 2: run the ALTER query
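The steps above can be sketched as follows (sysbench options, database name, and session layout are assumptions, not taken verbatim from the report):

```shell
# Hypothetical two-session repro sketch for the bulk-load ALTER bug.
# Step 1: create/populate tables (sysbench prepare); step 2: keep
# write traffic running; step 3: the rebuilding ALTER with
# rocksdb_bulk_load=1. Only the session-2 SQL is materialized here;
# the sysbench and mysql invocations are illustrative comments.
#
#   sysbench oltp_write_only --mysql-db=test --tables=1 \
#       --table-size=1000000 prepare
#   sysbench oltp_write_only --mysql-db=test --tables=1 \
#       --table-size=1000000 run &        # session 1
cat > session2.sql <<'EOF'
SET SESSION rocksdb_bulk_load = 1;
ALTER TABLE sbtest1 ENGINE = ROCKSDB;
EOF
#   mysql test < session2.sql             # session 2
```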