MySQL converts collation of date data type in ibd but data dictionary

Description

Description

When MySQL converts the charset collation on a table, it converts the date and time data types columns in ibd file. The collation_id in the ibd does not match that of data dictionary.

  • the table structure is below

mysql> CREATE TABLE test.a ( a datetime ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_as_ci;

the collation_id is 8 (latin1_swedish_ci)

shell> ibd2sdi /var/lib/mysql/test/a.ibd | jq '.[1].object.dd_object.columns[0]' | grep collation_id "collation_id": 8,
  • after converting the charset collation, the collation_id becomes 224 (utf8mb4_unicode_ci)

mysql> ALTER TABLE test.a CONVERT TO CHARACTER SET utf8mb4 collate utf8mb4_unicode_ci; shell> ibd2sdi /var/lib/mysql/test/a.ibd | jq '.[1].object.dd_object.columns[0]' | grep collation_id "collation_id": 224,
  • when coping the table to a new table, the collation_id of the new table is 8 (latin1_swedish_ci)

mysql> create table test.b like test.a; shell> ibd2sdi /var/lib/mysql/test/b.ibd | jq '.[1].object.dd_object.columns[0]' | grep collation_id "collation_id": 8,
  • After altering table, the the collation_id becomes 8 (latin1_swedish_ci)

mysql> ALTER TABLE test.a ENGINE = INNODB; shell> ibd2sdi /var/lib/mysql/test/a.ibd | jq '.[1].object.dd_object.columns[0]' | grep collation_id "collation_id": 8,

Reproduce

CREATE DATABASE test; CREATE TABLE test.a ( a datetime ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_as_ci; ALTER TABLE test.a CONVERT TO CHARACTER SET utf8mb4 collate utf8mb4_unicode_ci;

Environment

None

AFFECTED CS IDs

CS0045899

Activity

Show:

Venkatesh Prasad May 17, 2024 at 7:15 AM

I debugged a bit deeper and found that

  • For CREATE TABLE LIKE queries the charset for any field is taken from the table->field object of the source table. Ref: here

  • During the preparation of the fields Field_datetimef object's pointer is passed to Create_field constructor and so that it uses the same charset as of the source column. Ref: here

  • However, all temporal fields are always created with my_charset_latin1 charset. Ref: here

  • All derived classes of Field_temporal will always return my_charset_latin1 charset. Ref: here

It is for the above reason, even queries like ALTER TABLE ENGINE=INNODB  reset the charset of the datetime column

1. create table a(dt datetime); ../bin/ibd2sdi ./var/mysqld.1/data/test/a.ibd -> "name": "dt", "collation_id": 8 2. alter table a CONVERT TO CHARACTER SET utf8mb4 collate utf8mb4_unicode_ci; ../bin/ibd2sdi ./var/mysqld.1/data/test/a.ibd -> "name": "dt", "collation_id": 224 3. alter table a engine = innodb; ../bin/ibd2sdi ./var/mysqld.1/data/test/a.ibd -> "name": "dt", "collation_id": 8

Aaditya Dubey May 13, 2024 at 2:11 PM

Hi

It is done!

Julia Vural May 13, 2024 at 1:54 PM

Please copy the details on escalation to this ticket.

Done

Details

Assignee

Reporter

Labels

Needs QA

Yes

Sprint

Affects versions

Priority

Smart Checklist

Created May 1, 2024 at 1:33 AM
Updated August 29, 2024 at 12:19 PM
Resolved June 17, 2024 at 6:47 AM