Document MyRocks 'column family' functionality
Description
Activity

George Lorch, October 24, 2018 at 6:26 PM (edited)
8.0 specifics:
Column Family placement specification and naming rules:
A Column Family name may only be specified via 'cfname=xxx'. This is a user interface improvement specific to 8.0.
Column Family names may not contain leading or trailing spaces; any such spaces are stripped when the name is specified in an INDEX COMMENT. Example: 'cfname= asdf jkl ;' results in a parsed cfname of 'asdf jkl'.
Column Family names are case sensitive, so 'Foo' and 'foo' reference distinct Column Families.
The 'cfname=' token must be EXACTLY 'cfname=', with no internal spaces and no uppercase characters. Leading spaces are stripped.
Only the first 'cfname=' read from the INDEX COMMENT is used; any others are ignored, with the exception of partition naming, which is explained at https://github.com/facebook/mysql-5.6/wiki/Column-Families-on-Partitioned-Tables.
'cfname=' may be terminated with a ';'. Example: 'cfname=foo; special column family' results in a parsed cfname of 'foo'.
Any failure to parse or detect a Column Family name results in the index being placed in the 'default' Column Family with no warning.
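The 8.0 rules above can be modeled with a small Python sketch. This is not the MyRocks implementation, only an illustration of the rules as stated in this comment. The function name parse_cfname_80 is hypothetical, and the sketch assumes that qualifiers in an INDEX COMMENT are separated by ';', that an empty name falls back to 'default', and it ignores the partition naming form entirely.

```python
DEFAULT_CF = "default"
TOKEN = "cfname="  # must match exactly: lowercase, no internal spaces


def parse_cfname_80(index_comment: str) -> str:
    """Model of the 8.0 rules: return the Column Family for an INDEX COMMENT.

    Assumption: qualifiers are ';'-separated segments of the comment.
    Partition-specific naming is not modeled here.
    """
    for segment in index_comment.split(";"):
        # Leading spaces in front of the token are stripped.
        segment = segment.lstrip(" ")
        if segment.startswith(TOKEN):
            # Only the first 'cfname=' is used; leading and trailing
            # spaces around the name are stripped.
            name = segment[len(TOKEN):].strip(" ")
            # Assumption: an empty name counts as a parse failure.
            return name if name else DEFAULT_CF
    # No usable 'cfname=' found: silently fall back to 'default'.
    return DEFAULT_CF
```

With this sketch, parse_cfname_80('cfname= asdf jkl ;') yields 'asdf jkl', parse_cfname_80('cfname=foo; special column family') yields 'foo', and parse_cfname_80('CFNAME=foo') yields 'default' because the token is not an exact lowercase match.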

George Lorch, October 24, 2018 at 6:22 PM (edited)
5.7 specifics:
Column Family placement specification and naming rules:
Column Family index placement may be specified by adding 'cfname=xxx;' to an INDEX COMMENT.
If the cfname specifier is not used, the entire index comment is treated as the Column Family name.
If the cfname specifier is used, the following rules apply:
Column Family names may not contain leading or trailing spaces; any such spaces are stripped when the name is specified in an INDEX COMMENT. Example: 'cfname= asdf jkl ;' results in a parsed cfname of 'asdf jkl'.
Column Family names are case sensitive, so 'Foo' and 'foo' reference distinct Column Families.
The 'cfname=' token must be EXACTLY 'cfname=', with no internal spaces and no uppercase characters. Leading spaces are stripped.
Only the first 'cfname=' read from the INDEX COMMENT is used; any others are ignored, with the exception of partition naming, which is explained at https://github.com/facebook/mysql-5.6/wiki/Column-Families-on-Partitioned-Tables.
'cfname=' may be terminated with a ';'. Example: 'cfname=foo; special column family' results in a parsed cfname of 'foo'.
Any failure to parse or detect a Column Family name results in the index being placed in the 'default' Column Family with no warning.
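The 5.7 behavior differs mainly in its fallback: a comment without the cfname specifier is itself taken as the Column Family name. A minimal Python sketch of that difference follows. It is an illustration of the rules as stated here, not the MyRocks code. The function name parse_cfname_57 is hypothetical, and the sketch assumes ';'-separated qualifiers, that a bare comment is used verbatim without trimming, and that an empty comment or empty name maps to 'default'. Partition naming is not modeled.

```python
DEFAULT_CF = "default"
TOKEN = "cfname="  # must match exactly: lowercase, no internal spaces


def parse_cfname_57(index_comment: str) -> str:
    """Model of the 5.7 rules: return the Column Family for an INDEX COMMENT."""
    if not index_comment:
        # No comment at all: the index goes to the 'default' Column Family.
        return DEFAULT_CF
    for segment in index_comment.split(";"):
        # Leading spaces in front of the token are stripped.
        segment = segment.lstrip(" ")
        if segment.startswith(TOKEN):
            # Only the first 'cfname=' is used; spaces around the name go.
            name = segment[len(TOKEN):].strip(" ")
            # Assumption: an empty name counts as a parse failure.
            return name if name else DEFAULT_CF
    # No cfname specifier: the entire comment is the Column Family name.
    # Assumption: the comment is used verbatim, without trimming.
    return index_comment
```

Under these assumptions, parse_cfname_57('my_cf') yields 'my_cf', while parse_cfname_57('cfname=foo; special column family') yields 'foo'. One consequence worth noting in the documentation is that a descriptive comment on a 5.7 index, written with no intent to name a Column Family, would be interpreted as a Column Family name.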

George Lorch, October 24, 2018 at 6:22 PM (edited)
General info:
https://github.com/facebook/mysql-5.6/wiki/Column-Families-on-Partitioned-Tables
Column Families are similar in concept to tablespaces but are more flexible: many of their traits, such as block size and compression, can be customized per Column Family.
Each index of a table (the PRIMARY KEY is an index) may reside in a different Column Family.
Each Column Family instance has its own MemTable (https://github.com/facebook/rocksdb/wiki/MemTable), regardless of whether the Column Family is in active use. To keep the overall memory requirement low, the number of Column Family instances should be kept to the absolute minimum needed.
On system initialization, MyRocks creates two Column Families: one named '__system__', which cannot be used by any user-created tables or indexes, and one named 'default', which is the default location for all user tables and indexes.
Index placement into a specific Column Family is specified via an INDEX COMMENT. If no comment is specified, tables and indexes are placed into the 'default' Column Family. Example: 'CREATE TABLE t1 (a INT, b INT, PRIMARY KEY(a) COMMENT 'cfname=cf1', KEY kb(b) COMMENT 'cfname=cf2') ENGINE=ROCKSDB' places the PRIMARY KEY into 'cf1' and the KEY 'kb' into 'cf2'.
As of PS 5.7.23-24 (or 5.7.24-24, whichever comes next) and 8.0, there is a new option that can be enabled to prohibit a user from creating new Column Families via INDEX COMMENTs.
Column Family options:
On server startup, the rocksdb_default_cf_options server option is applied to all Column Families. To override an option for a specific Column Family, use the rocksdb_override_cf_options server option to specify the Column Family name and the options that should be overridden for it. At runtime, these two server variables are read-only.
At runtime, some (not all) Column Family options may be changed via the rocksdb_update_cf_options server variable. IMPORTANT! Any changes made to a Column Family's options through rocksdb_update_cf_options ARE NOT PERSISTED: they last only until the server is restarted.
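The layering described above (server-wide defaults, per-family overrides, then non-persistent runtime updates) can be illustrated with a small Python model. This is a conceptual sketch only: the class CfOptionsModel and its methods are hypothetical, options are passed as plain dictionaries rather than in the server's option-string syntax, the option names and values are illustrative, and the model does not enforce which options are actually changeable at runtime.

```python
class CfOptionsModel:
    """Toy model of how Column Family options are layered.

    default_opts  stands in for rocksdb_default_cf_options (all families)
    override_opts stands in for rocksdb_override_cf_options (per family)
    update()      stands in for rocksdb_update_cf_options (runtime only)
    """

    def __init__(self, default_opts: dict, override_opts: dict):
        # Startup configuration: fixed for the lifetime of the server.
        self._default = dict(default_opts)
        self._override = {cf: dict(o) for cf, o in override_opts.items()}
        # Runtime changes: held in memory only, never persisted.
        self._runtime = {}

    def effective(self, cf_name: str) -> dict:
        """Options in effect for one Column Family; later layers win."""
        merged = dict(self._default)
        merged.update(self._override.get(cf_name, {}))
        merged.update(self._runtime.get(cf_name, {}))
        return merged

    def update(self, cf_name: str, opts: dict) -> None:
        """Apply a runtime change to one Column Family."""
        self._runtime.setdefault(cf_name, {}).update(opts)

    def restart(self) -> None:
        """A server restart discards every runtime change."""
        self._runtime.clear()
```

In this model, a value set through update() is visible in effective() until restart() is called, after which the family reverts to its startup defaults and overrides. To make a change permanent, it would have to be added to the startup configuration as well.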
Details
Assignee: Borys Belinsky (Deactivated)
Reporter: George Lorch (Deactivated)
Time tracking: 1d 2h 30m logged
Priority: Low

MyRocks and RocksDB have the concept of column families, which are roughly analogous to tablespaces. Individual tables, partitions, and indexes can be placed into different column families. Each column family can have its own configuration options that affect all data within it; the most commonly used are compression options. We need to document the concept of column families and how to use rocksdb_default_cf_options and rocksdb_update_cf_options, as well as the CREATE/ALTER TABLE/INDEX 'comment' field, to set and change column family options. The upstream wiki has some documentation to this effect, but IMHO it is a bit fragmented, and the various ideas need to be pulled together into a single document section.