Modify to end mysqld process when the joiner fails during an SST

Description

Start up two nodes (node1 and node2). I am using an older PXB version, so the SST fails on the joiner with this error message:

 

2020-04-08T09:53:23.979792Z WSREP_SST: [ERROR] ******************* FATAL ERROR **********************
2020-04-08T09:53:23.980952Z WSREP_SST: [ERROR] The xtrabackup version is 2.4.19. Needs xtrabackup-2.4.20 or higher to perform SST
2020-04-08T09:53:23.982050Z WSREP_SST: [ERROR] ******************************************************

 

 

This is fine. However, the mysqld process does not fully exit, even though the log reports a complete shutdown:

 

2020-04-08T09:53:26.987087Z 2 [Note] WSREP: gcomm: closed
2020-04-08T09:53:26.987118Z 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2020-04-08T09:53:26.987174Z 0 [Note] WSREP: Flow-control interval: [100, 100]
2020-04-08T09:53:26.987178Z 0 [Note] WSREP: Trying to continue unpaused monitor
2020-04-08T09:53:26.987181Z 0 [Note] WSREP: Received NON-PRIMARY.
2020-04-08T09:53:26.987201Z 0 [Note] WSREP: Shifting PRIMARY -> OPEN (TO: 0)
2020-04-08T09:53:26.987210Z 0 [Note] WSREP: Received self-leave message.
2020-04-08T09:53:26.987214Z 0 [Note] WSREP: Flow-control interval: [0, 0]
2020-04-08T09:53:26.987216Z 0 [Note] WSREP: Trying to continue unpaused monitor
2020-04-08T09:53:26.987219Z 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2020-04-08T09:53:26.987221Z 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 0)
2020-04-08T09:53:26.987225Z 0 [Note] WSREP: RECV thread exiting 0: Success
2020-04-08T09:53:26.987289Z 2 [Note] WSREP: recv_thread() joined.
2020-04-08T09:53:26.987298Z 2 [Note] WSREP: Closing replication queue.
2020-04-08T09:53:26.987302Z 2 [Note] WSREP: Closing slave action queue.
2020-04-08T09:53:26.987307Z 2 [Note] WSREP: Closing aborting applier THD: 2
2020-04-08T09:53:26.987367Z 0 [Note] WSREP: wsrep running threads now: 0
2020-04-08T09:53:26.987654Z 0 [Note] WSREP: Waiting for active wsrep applier to exit
2020-04-08T09:53:26.987669Z 0 [Note] WSREP: Service disconnected.
2020-04-08T09:53:26.987672Z 0 [Note] WSREP: Waiting to close threads......
2020-04-08T09:53:31.988541Z 0 [Note] WSREP: Some threads may fail to exit.
2020-04-08T09:53:31.988680Z 0 [Note] Binlog end
2020-04-08T09:53:31.988951Z 0 [Note] /home/kennt/dev/pxc/build-bin/bin/mysqld: Shutdown complete

The process is still running, with the main thread waiting for the SST. Backtraces of the remaining threads:

 

Thread 3 (Thread 0x7fcb95b6b700 (LWP 61652)):
#0 0x00007fcb944ce449 in futex_wait (private=<optimized out>, expected=12, futex_word=0x563a5b14d1c4 <COND_wsrep_sst+36>) at ../sysdeps/unix/sysv/linux/futex-internal.h:61
#1 futex_wait_simple (private=<optimized out>, expected=12, futex_word=0x563a5b14d1c4 <COND_wsrep_sst+36>) at ../sysdeps/nptl/futex-internal.h:135
#2 __pthread_cond_destroy (cond=0x563a5b14d1a0 <COND_wsrep_sst>) at pthread_cond_destroy.c:54
#3 0x0000563a5915e041 in native_cond_destroy (cond=0x563a5b14d1a0 <COND_wsrep_sst>) at include/thr_cond.h:122
#4 0x0000563a5915e5b0 in inline_mysql_cond_destroy (that=0x563a5b14d1a0 <COND_wsrep_sst>) at include/mysql/psi/mysql_thread.h:1164
#5 0x0000563a59160b6e in clean_up_mutexes () at sql/mysqld.cc:1841
#6 0x0000563a591600d9 in mysqld_exit (exit_code=1) at sql/mysqld.cc:1544
#7 0x0000563a5916008f in unireg_abort (exit_code=1) at sql/mysqld.cc:1532
#8 0x0000563a59198e69 in wsrep_sst_prepare (msg=0x7fcb95b68620, thd=0x7fcb6c000b40) at sql/wsrep_sst.cc:876
#9 0x0000563a5918c846 in wsrep_view_handler_cb (app_ctx=0x563a5b14fe04 <key_FILE_galera_gvwstate>, recv_ctx=0x7fcb6c000b40, view=0x7fcb6c0123e0, state=0x0, state_len=0, sst_req=0x7fcb95b68620, sst_req_len=0x7fcb95b68628) at sql/wsrep_mysqld.cc:721
#10 0x00007fcb92c87a0f in galera::ReplicatorSMM::process_conf_change (this=0x563a5c744c50, recv_ctx=0x7fcb6c000b40, view_info=..., repl_proto=9, next_state=galera::Replicator::S_CONNECTED, seqno_l=1) at galera/src/replicator_smm.cpp:1628
#11 0x00007fcb92c60cd9 in galera::GcsActionSource::dispatch (this=0x563a5c745308, recv_ctx=0x7fcb6c000b40, act=..., exit_loop=@0x7fcb95b68b6b: false) at galera/src/gcs_action_source.cpp:135
#12 0x00007fcb92c61355 in galera::GcsActionSource::process (this=0x563a5c745308, recv_ctx=0x7fcb6c000b40, exit_loop=@0x7fcb95b68b6b: false) at galera/src/gcs_action_source.cpp:180
#13 0x00007fcb92c80d75 in galera::ReplicatorSMM::async_recv (this=0x563a5c744c50, recv_ctx=0x7fcb6c000b40) at galera/src/replicator_smm.cpp:408
#14 0x00007fcb92ca24ff in galera_recv (gh=0x563a5c6c6bc0, recv_ctx=0x7fcb6c000b40) at galera/src/wsrep_provider.cpp:244
#15 0x0000563a591a5bd2 in wsrep_replication_process (thd=0x7fcb6c000b40) at sql/wsrep_thd.cc:470
#16 0x0000563a5916ae29 in start_wsrep_THD (arg=0x563a591a5ad9 <wsrep_replication_process(THD*)>) at sql/mysqld.cc:7467
#17 0x0000563a59c5f44c in pfs_spawn_thread (arg=0x563a5c7b3ae0) at storage/perfschema/pfs.cc:2198
#18 0x00007fcb944c86db in start_thread (arg=0x7fcb95b6b700) at pthread_create.c:463
#19 0x00007fcb938b288f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7fcb929ac700 (LWP 61645)):
#0 0x00007fcb944ce9f3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x563a5c7a4868) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1 __pthread_cond_wait_common (abstime=0x0, mutex=0x563a5c7a47e0, cond=0x563a5c7a4840) at pthread_cond_wait.c:502
#2 __pthread_cond_wait (cond=0x563a5c7a4840, mutex=0x563a5c7a47e0) at pthread_cond_wait.c:655
#3 0x0000563a5918aebf in native_cond_wait (cond=0x563a5c7a4840, mutex=0x563a5c7a47e0) at include/thr_cond.h:147
#4 0x0000563a5918af45 in my_cond_wait (cond=0x563a5c7a4840, mp=0x563a5c7a47e0) at include/thr_cond.h:202
#5 0x0000563a5918b41c in inline_mysql_cond_wait (that=0x563a5c7a4840, mutex=0x563a5c7a47e0, src_file=0x563a5a234760 "sql/wsrep_mysqld.cc", src_line=416) at include/mysql/psi/mysql_thread.h:1202
#6 0x0000563a5918ba2b in wsrep_pfs_instr_cb (type=WSREP_PFS_INSTR_TYPE_CONDVAR, ops=WSREP_PFS_INSTR_OPS_WAIT, tag=WSREP_PFS_INSTR_TAG_SERVICE_THD_CONDVAR, value=0x563a5c745268, alliedvalue=0x563a5c745258, ts=0x0) at sql/wsrep_mysqld.cc:416
#7 0x00007fcb92ad0fbb in gu::Lock::wait (this=0x7fcb929abd10, cond=...) at galerautils/src/gu_lock.hpp:112
#8 0x00007fcb92c5cdab in galera::ServiceThd::thd_func (arg=0x563a5c745240) at galera/src/galera_service_thd.cpp:37
#9 0x00007fcb944c86db in start_thread (arg=0x7fcb929ac700) at pthread_create.c:463
#10 0x00007fcb938b288f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7fcb95cae0c0 (LWP 61641)):
#0 0x00007fcb944ce9f3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x563a5b14d1c8 <COND_wsrep_sst+40>) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1 __pthread_cond_wait_common (abstime=0x0, mutex=0x563a5b14d160 <LOCK_wsrep_sst>, cond=0x563a5b14d1a0 <COND_wsrep_sst>) at pthread_cond_wait.c:502
#2 __pthread_cond_wait (cond=0x563a5b14d1a0 <COND_wsrep_sst>, mutex=0x563a5b14d160 <LOCK_wsrep_sst>) at pthread_cond_wait.c:655
#3 0x0000563a59196110 in native_cond_wait (cond=0x563a5b14d1a0 <COND_wsrep_sst>, mutex=0x563a5b14d160 <LOCK_wsrep_sst>) at include/thr_cond.h:147
#4 0x0000563a5919614f in my_cond_wait (cond=0x563a5b14d1a0 <COND_wsrep_sst>, mp=0x563a5b14d160 <LOCK_wsrep_sst>) at include/thr_cond.h:202
#5 0x0000563a591963e9 in inline_mysql_cond_wait (that=0x563a5b14d1a0 <COND_wsrep_sst>, mutex=0x563a5b14d160 <LOCK_wsrep_sst>, src_file=0x563a5a2361c0 "sql/wsrep_sst.cc", src_line=257) at include/mysql/psi/mysql_thread.h:1202
#6 0x0000563a59196c42 in wsrep_sst_wait () at sql/wsrep_sst.cc:257
#7 0x0000563a5918ec45 in wsrep_init_startup (first=true) at sql/wsrep_mysqld.cc:1239
#8 0x0000563a5916727b in init_server_components () at sql/mysqld.cc:4872
#9 0x0000563a59169118 in mysqld_main (argc=37, argv=0x563a5c6bce88) at sql/mysqld.cc:5877
#10 0x0000563a5915ddea in main (argc=10, argv=0x7ffc5b28f778) at sql/main.cc:32
#11 0x00007fcb937b2b97 in __libc_start_main (main=0x563a5915ddca <main(int, char**)>, argc=10, argv=0x7ffc5b28f778, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffc5b28f768) at ../csu/libc-start.c:310
#12 0x0000563a5915dcea in _start ()
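
Reading the backtraces: Thread 3 reaches unireg_abort() from wsrep_sst_prepare() and is blocked in __pthread_cond_destroy() on COND_wsrep_sst, while Thread 1 (the main thread) is still waiting on that same condition variable in wsrep_sst_wait(). Neither thread can make progress. One possible direction, shown here only as a minimal standalone sketch and not as the server's actual code or the fix that was applied, is for the failing path to publish an error and wake the waiter, so that the waiting thread sees the failure and performs the exit itself. SstWaitState, wait_for_sst, report_sst_result and simulate_failed_sst are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the state guarded by LOCK_wsrep_sst / COND_wsrep_sst.
// The names and layout are illustrative only, not the server's.
struct SstWaitState {
    std::mutex lock;
    std::condition_variable cond;
    bool finished = false;  // set once the SST outcome is known
    int error = 0;          // 0 on success, non-zero on failure
};

// Waiter side (the role wsrep_sst_wait() plays in Thread 1):
// block until an outcome has been published, then return it.
int wait_for_sst(SstWaitState& s) {
    std::unique_lock<std::mutex> guard(s.lock);
    s.cond.wait(guard, [&s] { return s.finished; });
    assert(s.finished);
    return s.error;
}

// Reporting side (the role of the failing path in Thread 3):
// publish the outcome and wake every waiter instead of tearing
// down the condition variable while someone still waits on it.
void report_sst_result(SstWaitState& s, int error) {
    {
        std::lock_guard<std::mutex> guard(s.lock);
        s.finished = true;
        s.error = error;
    }
    s.cond.notify_all();
}

// Runs both roles and returns the code the waiter observed, which is
// what the main thread would then use as its exit code.
int simulate_failed_sst(int error) {
    SstWaitState state;
    int observed = -1;
    std::thread waiter([&state, &observed] { observed = wait_for_sst(state); });
    report_sst_result(state, error);
    waiter.join();  // the state is destroyed only after the waiter has left
    return observed;
}
```

Calling simulate_failed_sst(1) returns 1 without hanging, because the condition variable is destroyed only after the waiter has been woken and joined. Whether the real code can be restructured this way depends on details of wsrep_sst.cc that are not shown in this report.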

 

Environment

None

Activity

Zsolt Parragi April 10, 2020 at 2:08 PM

Tested on 8.0, doesn't happen.

Duplicate

Details

Created April 8, 2020 at 10:14 AM
Updated March 6, 2024 at 9:38 PM
Resolved April 27, 2020 at 9:19 AM
