PXC hangs at shutdown if IST fails

Description

If a node fails during SST/IST because its local seqno is higher than the group seqno, PXC starts a shutdown but then hangs forever.

pmp shows that all InnoDB threads are still up and waiting on a signal or condition, as if Galera never flagged mysqld to stop.

Wed Jun  5 13:05:59 CDT 2019
libaio::??(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:2811),LinuxAIOHandler::poll(os0file.cc:2957),os_aio_linux_handler(os0file.cc:3013),os_aio_handler(os0file.cc:3013),fil_aio_wait(fil0fil.cc:6359),io_handler_thread(srv0start.cc:347),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_wait,native_cond_wait(thr_cond.h:140),my_cond_wait(thr_cond.h:140),inline_mysql_cond_wait(thr_cond.h:140),Per_thread_connection_handler::block_until_new_connection(thr_cond.h:140),handle_connection(connection_handler_per_thread.cc:365),pfs_spawn_thread(pfs.cc:2190),start_thread(libpthread.so.0),clone(libc.so.6)
poll(libc.so.6),vio_io_wait(viosocket.c:1157),vio_socket_io_wait(viosocket.c:116),vio_read(viosocket.c:171),net_read_raw_loop(net_serv.cc:672),net_read_packet_header(net_serv.cc:756),net_read_packet(net_serv.cc:756),my_net_read(net_serv.cc:899),Protocol_classic::read_packet(protocol_classic.cc:808),Protocol_classic::get_command(protocol_classic.cc:965),do_command(sql_parse.cc:1063),handle_connection(connection_handler_per_thread.cc:318),pfs_spawn_thread(pfs.cc:2190),start_thread(libpthread.so.0),clone(libc.so.6)
nanosleep(libpthread.so.0),os_thread_sleep(os0thread.cc:303),buf_lru_manager_sleep_if_needed(buf0flu.cc:3576),buf_lru_manager(buf0flu.cc:3576),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_wait,wait(os0event.h:156),os_event::wait_low(os0event.h:156),srv_worker_thread(srv0srv.cc:3059),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_wait,wait(os0event.h:156),os_event::wait_low(os0event.h:156),buf_flush_page_cleaner_worker(buf0flu.cc:3533),start_thread(libpthread.so.0),clone(libc.so.6)
sigwaitinfo(libc.so.6),timer_notify_thread_func(posix_timers.c:77),pfs_spawn_thread(pfs.cc:2190),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_join(libpthread.so.0),mysqld_main(mysqld.cc:6353),__libc_start_main(libc.so.6),_start
pthread_cond_wait,wait(os0event.h:156),os_event::wait_low(os0event.h:156),srv_purge_coordinator_suspend(srv0srv.cc:3220),srv_purge_coordinator_thread(srv0srv.cc:3220),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_wait,wait(os0event.h:156),os_event::wait_low(os0event.h:156),buf_resize_thread(buf0buf.cc:3027),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_wait,wait(os0event.h:156),os_event::wait_low(os0event.h:156),buf_dump_thread(buf0dump.cc:791),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_wait,native_cond_wait(thr_cond.h:140),my_cond_wait(thr_cond.h:140),inline_mysql_cond_wait(thr_cond.h:140),wsrep_rollback_process(thr_cond.h:140),start_wsrep_THD(mysqld.cc:7460),pfs_spawn_thread(pfs.cc:2190),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_wait,native_cond_wait(thr_cond.h:140),my_cond_wait(thr_cond.h:140),inline_mysql_cond_wait(thr_cond.h:140),wsrep_pfs_instr_cb(thr_cond.h:140),wait,galera::ServiceThd::thd_func,start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_wait,native_cond_wait(thr_cond.h:140),my_cond_wait(thr_cond.h:140),inline_mysql_cond_wait(thr_cond.h:140),Global_THD_manager::wait_till_wsrep_thd_eq(thr_cond.h:140),wsrep_wait_appliers_close(mysqld.cc:7666),wsrep_stop_replication(wsrep_mysqld.cc:1297),signal_hand(mysqld.cc:2746),pfs_spawn_thread(pfs.cc:2190),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_wait,native_cond_wait(thr_cond.h:140),my_cond_wait(thr_cond.h:140),inline_mysql_cond_wait(thr_cond.h:140),Event_queue::cond_wait(thr_cond.h:140),Event_queue::get_top_for_execution_if_time(event_queue.cc:579),Event_scheduler::run(event_scheduler.cc:563),event_scheduler_thread(event_scheduler.cc:244),pfs_spawn_thread(pfs.cc:2190),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_timedwait,os_event::timed_wait(os0event.cc:81),os_event::wait_time_low(os0event.cc:208),srv_monitor_thread(srv0srv.cc:1962),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_timedwait,os_event::timed_wait(os0event.cc:81),os_event::wait_time_low(os0event.cc:208),srv_error_monitor_thread(srv0srv.cc:2135),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_timedwait,os_event::timed_wait(os0event.cc:81),os_event::wait_time_low(os0event.cc:208),pc_sleep_if_needed(buf0flu.cc:2772),buf_flush_page_cleaner_coordinator(buf0flu.cc:2772),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_timedwait,os_event::timed_wait(os0event.cc:81),os_event::wait_time_low(os0event.cc:208),lock_wait_timeout_thread(lock0wait.cc:612),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_timedwait,os_event::timed_wait(os0event.cc:81),os_event::wait_time_low(os0event.cc:208),ib_wqueue_timedwait(ut0wqueue.cc:160),fts_optimize_thread(fts0opt.cc:2900),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_timedwait,os_event::timed_wait(os0event.cc:81),os_event::wait_time_low(os0event.cc:208),dict_stats_thread(dict0stats_bg.cc:428),start_thread(libpthread.so.0),clone(libc.so.6)
pthread_cond_timedwait,native_cond_timedwait(thr_cond.h:129),my_cond_timedwait(thr_cond.h:129),inline_mysql_cond_timedwait(thr_cond.h:129),audit_log_flush(thr_cond.h:129),audit_log_flush_worker(thr_cond.h:129),start_thread(libpthread.so.0),clone(libc.so.6)
nanosleep(libpthread.so.0),os_thread_sleep(os0thread.cc:303),srv_master_sleep(srv0srv.cc:2845),srv_master_thread(srv0srv.cc:2845),start_thread(libpthread.so.0),clone(libc.so.6)
nanosleep(libc.so.6),usleep(libc.so.6),galera::ReplicatorSMM::async_recv,galera_recv,wsrep_replication_process(wsrep_thd.cc:470),start_wsrep_THD(mysqld.cc:7460),pfs_spawn_thread(pfs.cc:2190),start_thread(libpthread.so.0),clone(libc.so.6)
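
In this dump, the signal handler thread is blocked in wsrep_wait_appliers_close while an applier thread still appears to be inside galera_recv. A minimal sketch for checking a saved pmp output file for that same pair of frames follows. The function name has_hang_signature and the file argument are hypothetical, and the match is only a heuristic derived from this one dump, not a general diagnostic.

```shell
#!/bin/sh
# Heuristic check (assumption: frame names match the dump in this ticket).
# Returns 0 when a saved stack dump shows both:
#   - a thread blocked in wsrep_wait_appliers_close (shutdown is waiting), and
#   - an applier thread still inside galera_recv.
has_hang_signature() {
    dump_file=$1
    grep -q 'wsrep_wait_appliers_close' "$dump_file" &&
        grep -q 'galera_recv,wsrep_replication_process' "$dump_file"
}
```

Example use, assuming the aggregated stacks were saved to a file named pmp.out: `has_hang_signature pmp.out && echo "shutdown is waiting on a live applier"`.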

Error Log:

2019-06-05T10:46:40.078234-06:00 1 [ERROR] WSREP: Local state seqno (536527181) is greater than group seqno (536526629): states diverged. Aborting to avoid potential data loss. Remove '/home/mysqladm/mysql/data//grastate.dat' file and restart if you wish to continue. (FATAL)
	 at galera/src/replicator_str.cpp:state_transfer_required():39
2019-06-05T10:46:40.078868-06:00 1 [Note] WSREP: applier thread exiting (code:8)
2019-06-05T10:46:40.079052-06:00 1 [Note] WSREP: Starting Shutdown
2019-06-05T10:46:40.178688-06:00 0 [Note] WSREP: Received shutdown signal. Will sleep for 10 secs before initiating shutdown. pxc_maint_mode switched to SHUTDOWN
2019-06-05T10:46:54.917187-06:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 16115ms. The settings might not be optimal. (flushed=8, during the time.)
2019-06-05T10:46:55.030873-06:00 0 [Note] WSREP: Stop replication
2019-06-05T10:46:55.031848-06:00 0 [Note] WSREP: Waiting for active wsrep applier to exit
2019-06-05T10:47:45.002593-06:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5075ms. The settings might not be optimal. (flushed=200, during the time.)
2019-06-05T10:48:20.011604-06:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4400ms. The settings might not be optimal. (flushed=200, during the time.)
2019-06-05T10:48:51.877875-06:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 7925ms. The settings might not be optimal. (flushed=200, during the time.)
2019-06-05T10:49:14.586805-06:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 6693ms. The settings might not be optimal. (flushed=150, during the time.)

--- hangs forever ---
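
The same condition can be looked for in the error log: the "Waiting for active wsrep applier to exit" note appears, and nothing indicating a finished shutdown follows it. The sketch below assumes that a clean shutdown logs a line containing "Shutdown complete"; that marker and the function name shutdown_hung_in_log are assumptions, not taken from this ticket.

```shell
#!/bin/sh
# Returns 0 if the log shows mysqld waiting for the wsrep applier with no
# later "Shutdown complete" line (assumed clean-shutdown marker), else 1.
shutdown_hung_in_log() {
    awk '
        /WSREP: Waiting for active wsrep applier to exit/ { waiting = 1; next }
        waiting && /Shutdown complete/ { waiting = 0 }
        END { exit (waiting ? 0 : 1) }
    ' "$1"
}
```

A log that is still being written can give a false positive, so this is only meaningful once the shutdown has had reasonable time to finish.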

How to reproduce

1 - Start sysbench on node1:

mysql -psekret -e "CREATE DATABASE sbtest"
sysbench --db-driver=mysql --db-ps-mode=disable --mysql-port=3306 --mysql-user=root --mysql-password=sekret --table_size=1000000 --tables=4 /usr/share/sysbench/oltp_read_write.lua prepare
sysbench --db-driver=mysql --db-ps-mode=disable --mysql-port=3306 --mysql-user=root --mysql-password=sekret --table_size=1000000 --tables=4 /usr/share/sysbench/oltp_read_write.lua --threads=1 --report-interval=1 --time=0 run

2 - On node2, force the IST issue by changing the group seqno when the check is performed:

gdb -p $(pidof mysqld) -ex "b replicator_str.cpp:30" -ex "continue" -ex "set variable group_seqno=1" -ex "continue" -batch

3 - On node2, in a new terminal window, force a TCP delay, which will cause the node to be evicted from the cluster:

tc qdisc add dev eth0 root netem delay 3000ms

4 - On node2, once the node has been evicted, let it rejoin the cluster by removing the TCP delay:

tc qdisc del dev eth0 root netem

At this point, gdb will provoke the error and the server will start a shutdown, but it will never exit. You will have to run kill -9 on the mysqld pid.
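
When the reproduction is scripted, the manual kill can be wrapped in a helper that gives the server a bounded time to exit first. This is a minimal sketch: the function name wait_or_kill and the timeout value are hypothetical, and it assumes the caller has permission to signal the process.

```shell
#!/bin/sh
# wait_or_kill PID TIMEOUT_SECONDS
# Polls PID once per second. Returns 0 if the process disappears within the
# timeout; otherwise sends SIGKILL and returns 1.
wait_or_kill() {
    pid=$1
    timeout=$2
    elapsed=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            kill -9 "$pid"
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

For example, `wait_or_kill "$(pidof mysqld)" 120` waits up to two minutes for the shutdown to finish. A return value of 1 means the server had to be killed, which is the outcome this ticket describes.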

Environment

None

Activity

Julia Vural March 4, 2025 at 9:28 PM

It appears that this issue is no longer being worked on, so we are closing it for housekeeping purposes. If you believe the issue still exists, please open a new ticket after confirming it's present in the latest release.

Won't Do

Details
Assignee: Unassigned
Reporter: Marcelo Altmann (Deactivated)
Affects versions: 5.7.25-31.35
Priority: Medium

Created June 6, 2019 at 11:08 PM

Updated March 4, 2025 at 9:28 PM

Resolved March 4, 2025 at 9:28 PM
