[pt-archiver] charset mismatch and invalid Cyrillic chars

Description

Hello. I am trying to dump a table to a TSV file with pt-archiver.

  pt-archiver --no-delete --source h=10.2.0.103,D=stat,t=CallStatV4 --limit=1000 --where "id > 1350363818" --file load_to_clickhouse.tsv
  Character set mismatch: --source DSN uses latin1, table uses utf8. You can disable this check by specifying --no-check-charset.

I get an error:

Character set mismatch: --source DSN uses latin1, table uses utf8. You can disable this check by specifying --no-check-charset.

But utf8 is configured everywhere on the server:

collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8

mysql> SELECT default_character_set_name FROM information_schema.SCHEMATA WHERE schema_name = "stat";
+----------------------------+
| default_character_set_name |
+----------------------------+
| utf8                       |
+----------------------------+

mysql> SELECT CCSA.character_set_name FROM information_schema.`TABLES` T, information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA WHERE CCSA.collation_name = T.table_collation AND T.table_schema = "stat" AND T.table_name = "CallStatV4";
+--------------------+
| character_set_name |
+--------------------+
| utf8               |
+--------------------+

If I pass --charset=utf8 to pt-archiver, the error goes away, but Cyrillic characters in the output file are invalid, like this: "name": "Анжелика"
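For anyone trying to narrow this down: one common way to get invalid Cyrillic from utf8 data is for the UTF-8 bytes to be read as latin1 somewhere on the way to the file. The ticket does not confirm that this is what happens here, so the following is only a minimal Python sketch of that assumed failure mode. It is not pt-archiver code, and all variable names are illustrative.

```python
# Hypothetical illustration of a latin1/utf8 mix-up; not pt-archiver code.
original = "Анжелика"

# Bytes as they would be stored in a utf8 column: 2 bytes per Cyrillic letter.
utf8_bytes = original.encode("utf-8")

# Assumed failure: the same bytes are interpreted as latin1, so every
# Cyrillic letter turns into two unrelated single-byte characters.
garbled = utf8_bytes.decode("latin-1")

# If the garbled text is then written out as UTF-8, the result is
# "double-encoded" data, twice as long as the correct byte sequence.
double_encoded = garbled.encode("utf-8")

# Double encoding is reversible by undoing the wrong latin1 step.
repaired = double_encoded.decode("utf-8").encode("latin-1").decode("utf-8")

print(len(utf8_bytes), len(double_encoded))
print(repaired == original)
```

If the bytes in load_to_clickhouse.tsv are about twice as long as expected for the Cyrillic fields, that would point to this kind of double encoding rather than to data loss.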

 

Environment

None

Activity

Jira Bot March 4, 2021 at 3:56 PM

Hello,
It's been 52 days since this issue went into Incomplete and we haven't heard
from you on this.

At this point, our policy is to Close this issue, to keep things from getting
too cluttered. If you have more information about this issue and wish to
reopen it, please reply with a comment containing "jira-bot=reopen".

Jira Bot February 24, 2021 at 3:56 PM

Hello,
It's jira-bot again. Your bug report is important to us, but we haven't heard
from you since the previous notification. If we don't hear from you on
this in 7 days, the issue will be automatically closed.

Jira Bot February 9, 2021 at 2:56 PM

Hello,
I'm jira-bot, Percona's automated helper script. Your bug report is important
to us but we've been unable to reproduce it, and asked you for more
information. If we haven't heard from you on this in 3 more weeks, the issue
will be automatically closed.

Lalit Choudhary January 11, 2021 at 1:58 PM

Hi

Thank you for the report.

I can't reproduce the described case with PT version 3.2.1.

Please provide a reproducible test case or example.

 

Incomplete

Created November 19, 2020 at 1:37 PM
Updated February 29, 2024 at 8:56 PM
Resolved March 4, 2021 at 3:56 PM