pt-heartbeat doesn't reconnect for check-read-only
Description
Environment
None
AFFECTED CS IDs
275153
Iwo Panowicz June 7, 2020 at 1:19 PM (edited)
There's also another use-case in which pt-heartbeat should be reconnecting.
Steps to reproduce:
non-root user
MySQL started with read_only=1
pt-heartbeat started as
pt-heartbeat --update --database=percona --host=127.0.0.1 --port=5534 --user=msandbox --password=msandbox --replace --check-read-only --no-version-check
After the connection is killed, pt-heartbeat exits with exit code 2:
# pt_heartbeat:5969 19725 Sleeping for 1.0 seconds
# pt_heartbeat:5969 19725 Sleeping for 1.0 seconds
# pt_heartbeat:5969 19725 Sleeping for 1.0 seconds
# pt_heartbeat:5969 19725 Sleeping for 1.0 seconds
# pt_heartbeat:5969 19725 Sleeping for 1.0 seconds
# pt_heartbeat:5969 19725 Sleeping for 1.0 seconds
# pt_heartbeat:5969 19725 Sleeping for 1.0 seconds
DBD::mysql::db selectrow_array failed: Lost connection to MySQL server during query [for Statement "SELECT @@global.read_only"] at ./pt line 6469.
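Until pt-heartbeat reconnects on its own in this case, one possible stopgap is to wrap it in a supervisor loop that starts it again whenever it exits with a non-zero code. The sketch below is only an illustration: `run_with_restart` and its two parameters (maximum restarts, delay in seconds) are hypothetical names, not part of Percona Toolkit, and the pt-heartbeat invocation is left as a comment.

```shell
#!/bin/bash
# Hypothetical helper (not part of Percona Toolkit): run a command and
# start it again whenever it exits non-zero, up to max_restarts times.
run_with_restart() {
    local max_restarts=$1 delay=$2
    shift 2
    local restarts=0 status=0
    while true; do
        if "$@"; then
            return 0            # clean exit: stop supervising
        else
            status=$?           # exit code of the supervised command
        fi
        restarts=$((restarts + 1))
        if [ "$restarts" -gt "$max_restarts" ]; then
            echo "giving up after $max_restarts restarts (last exit code $status)" >&2
            return "$status"
        fi
        sleep "$delay"
    done
}

# Example wiring (commented out; options copied from the steps above):
# run_with_restart 10 5 pt-heartbeat --update --database=percona \
#     --host=127.0.0.1 --port=5534 --user=msandbox --password=msandbox \
#     --replace --check-read-only --no-version-check
```

A restart budget is used instead of an unconditional loop so that a permanent failure, such as wrong credentials, does not spin forever.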
Carlos Salguero May 19, 2020 at 12:49 PM
I cannot reproduce this.
I killed the connection 3 times and pt-heartbeat reconnected every time.
mysql> show processlist;
+------+----------+-----------------+--------+-------------+-------+---------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+----------+-----------------+--------+-------------+-------+---------------------------------------------------------------+------------------+
| 8 | msandbox | localhost:44060 | NULL | Binlog Dump | 84706 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 8389 | msandbox | localhost:38702 | NULL | Sleep | 4 | | NULL |
| 8393 | msandbox | localhost:38714 | NULL | Sleep | 4 | | NULL |
| 8394 | msandbox | localhost:38718 | NULL | Sleep | 4 | | NULL |
| 8396 | msandbox | localhost:38726 | sakila | Sleep | 0 | | NULL |
| 8400 | msandbox | localhost:38754 | NULL | Query | 0 | starting | show processlist |
+------+----------+-----------------+--------+-------------+-------+---------------------------------------------------------------+------------------+
6 rows in set (0.00 sec)

mysql> kill 8396
    -> ;
Query OK, 0 rows affected (0.00 sec)

mysql> show processlist;
+------+----------+-----------------+--------+-------------+-------+---------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+----------+-----------------+--------+-------------+-------+---------------------------------------------------------------+------------------+
| 8 | msandbox | localhost:44060 | NULL | Binlog Dump | 84800 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 8400 | msandbox | localhost:38754 | NULL | Query | 0 | starting | show processlist |
| 8408 | msandbox | localhost:38796 | sakila | Sleep | 0 | | NULL |
| 8412 | msandbox | localhost:38830 | NULL | Sleep | 1 | | NULL |
| 8414 | msandbox | localhost:38840 | NULL | Sleep | 1 | | NULL |
| 8415 | msandbox | localhost:38842 | NULL | Sleep | 1 | | NULL |
+------+----------+-----------------+--------+-------------+-------+---------------------------------------------------------------+------------------+
6 rows in set (0.00 sec)

mysql> kill 8408;
Query OK, 0 rows affected (0.00 sec)

mysql> show processlist;
+------+----------+-----------------+--------+-------------+-------+---------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+----------+-----------------+--------+-------------+-------+---------------------------------------------------------------+------------------+
| 8 | msandbox | localhost:44060 | NULL | Binlog Dump | 84813 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 8400 | msandbox | localhost:38754 | NULL | Query | 0 | starting | show processlist |
| 8412 | msandbox | localhost:38830 | NULL | Sleep | 1 | | NULL |
| 8414 | msandbox | localhost:38840 | NULL | Sleep | 1 | | NULL |
| 8415 | msandbox | localhost:38842 | NULL | Sleep | 1 | | NULL |
| 8418 | msandbox | localhost:38870 | sakila | Sleep | 0 | | NULL |
+------+----------+-----------------+--------+-------------+-------+---------------------------------------------------------------+------------------+
6 rows in set (0.00 sec)

mysql> kill 8418;
Query OK, 0 rows affected (0.00 sec)
Sveta Smirnova May 5, 2020 at 10:39 AM
Workaround:
crontab:
* * * * * /etc/pt_heartbeat_restart.sh
script:
#!/bin/bash
# Restart pt-heartbeat if journalctl reports the error:
# MySQL server has gone away [for Statement "SELECT @@global.read_only"]
if journalctl -u pt-heartbeat --since '1 min ago' | grep -q "MySQL server has gone away"; then
    systemctl restart pt-heartbeat.service
fi
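The script above matches only the "MySQL server has gone away" message. The first error in the loop, and the error reported when pt-heartbeat exits, is "Lost connection to MySQL server during query", so a variant could match both. This is a rough sketch; `connection_lost` is a hypothetical helper name, and the journalctl wiring is left commented out.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 when the log text on stdin contains
# either of the two connection errors quoted in this issue.
connection_lost() {
    grep -q -E 'MySQL server has gone away|Lost connection to MySQL server during query'
}

# Example wiring (commented out so the sketch has no side effects):
# if journalctl -u pt-heartbeat --since '1 min ago' | connection_lost; then
#     systemctl restart pt-heartbeat.service
# fi
```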
Done
Created April 21, 2020 at 10:36 AM
Updated February 29, 2024 at 8:59 PM
Resolved May 20, 2020 at 2:52 PM
Steps to reproduce:
1. Start pt-heartbeat like:
pt-heartbeat --update --database=percona --create-table --host=127.0.0.1 --port=5725 --user=msandbox --password=msandbox --replace --check-read-only
2. Kill the pt-heartbeat connection:
mysql [localhost:5725] {msandbox} ((none)) > show processlist;
+----+----------+-----------------+---------+---------+------+----------+------------------+-----------+---------------+
| Id | User     | Host            | db      | Command | Time | State    | Info             | Rows_sent | Rows_examined |
+----+----------+-----------------+---------+---------+------+----------+------------------+-----------+---------------+
| 16 | msandbox | localhost       | NULL    | Query   |    0 | starting | show processlist |         0 |             0 |
| 17 | msandbox | localhost:42608 | percona | Sleep   |    0 |          | NULL             |         0 |             0 |
+----+----------+-----------------+---------+---------+------+----------+------------------+-----------+---------------+
2 rows in set (0.00 sec)

mysql [localhost:5725] {msandbox} ((none)) > kill connection 17;
Query OK, 0 rows affected (0.00 sec)
pt-heartbeat will enter an endless loop of:
...
Lost connection to MySQL server during query [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]
MySQL server has gone away [for Statement "SELECT @@global.read_only"]