Assertion failure on donor if joiner is killed in debug build
Description
Steps to reproduce the issue (the issue occurs only in debug builds):
Set up a two-node cluster, say node1 and node2.
Start a sysbench load on node2.
While the sysbench load is in progress, join a new node, node3, so that node1 is the donor.
Wait a few seconds for SST to start on the joiner, then kill the joiner node while SST is in progress.
A bash script with the steps to reproduce the issue is attached.
The donor (node1) then aborts with the following assertion failure:
2025-02-23T09:53:00.167235Z 0 [Note] [MY-000000] [WSREP-SST] [second(s) to timeout: 188]
2025-02-23T09:53:01.181520Z 0 [Note] [MY-000000] [WSREP-SST] [second(s) to timeout: 187]
mysqld: /home/parveez.baig/pxc_clone/percona-xtradb-cluster-galera/galerautils/src/gu_asio_stream_react.cpp:257: virtual size_t gu::AsioStreamReact::write(const gu::AsioConstBuffer&): Assertion `write_result.bytes_transferred == buf.size() || (in_progress_ & socket_shutdown_in_progress)' failed.
2025-02-23T09:53:01.828187Z 0 [Note] [MY-000000] [WSREP] Initiating SST cancellation
2025-02-23T09:53:01.828349Z 0 [Note] [MY-000000] [WSREP] Terminating SST process
2025-02-23T09:53:01Z UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
BuildID[sha1]=b3bdaccf8c78606829ea76cf92696b3ae898e5a1
Server Version: 8.0.40-31-debug Source distribution, wsrep_26.1.4.3
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x100000
2025-02-23T09:53:01.828635Z 0 [Note] [MY-000000] [WSREP-SST] Cleanup DONOR.
2025-02-23T09:53:01.832878Z 0 [Note] [MY-000000] [Galera] (cbb263ac-8dc4, 'tcp://127.0.0.1:4030') turning message relay requesting on, nonlive peers: tcp://127.0.0.1:6030
/home/parveez.baig/pxc_clone/bld/install/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x59) [0x5637ac0c55f1]
/home/parveez.baig/pxc_clone/bld/install/bin/mysqld(print_fatal_signal(int)+0x3ce) [0x5637aad185b5]
/home/parveez.baig/pxc_clone/bld/install/bin/mysqld(handle_fatal_signal+0x83) [0x5637aad187e0]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f64beb7f520]
/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c) [0x7f64bebd39fc]
/lib/x86_64-linux-gnu/libc.so.6(raise+0x16) [0x7f64beb7f476]
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3) [0x7f64beb657f3]
/lib/x86_64-linux-gnu/libc.so.6(+0x2871b) [0x7f64beb6571b]
/lib/x86_64-linux-gnu/libc.so.6(+0x39e96) [0x7f64beb76e96]
/home/parveez.baig/pxc_clone/bld/install/lib/libgalera_smm.so(+0x28b781) [0x7f64b1edf781]
/home/parveez.baig/pxc_clone/bld/install/lib/libgalera_smm.so(+0xb726d) [0x7f64b1d0b26d]
/home/parveez.baig/pxc_clone/bld/install/lib/libgalera_smm.so(+0xb4e3f) [0x7f64b1d08e3f]
/home/parveez.baig/pxc_clone/bld/install/lib/libgalera_smm.so(+0xaf0cb) [0x7f64b1d030cb]
/home/parveez.baig/pxc_clone/bld/install/lib/libgalera_smm.so(+0xaf61a) [0x7f64b1d0361a]
/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3) [0x7f64bebd1ac3]
/lib/x86_64-linux-gnu/libc.so.6(+0x126850) [0x7f64bec63850]
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
Writing a core file
The issue is seen with both the xtrabackup and clone SST methods.
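For readers unfamiliar with the failing check, here is a minimal sketch of the predicate from the assertion message. It is not Galera's code: `WriteResult`, `kShutdownInProgress`, `write_invariant_holds` and `write_checked` are hypothetical stand-ins, and only the shape of the condition is taken from the log. It assumes the check is a standard `assert`, which the glibc-style message suggests. If so, it is compiled out when `NDEBUG` is defined, which would be consistent with the crash appearing only in debug builds.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the result of a socket write.
struct WriteResult {
    std::size_t bytes_transferred;
};

// Hypothetical stand-in for the socket_shutdown_in_progress flag bit.
constexpr unsigned kShutdownInProgress = 0x1;

// Same shape as the condition in the assertion message: a write is
// acceptable if every byte was sent, or if a shutdown is already in progress.
bool write_invariant_holds(const WriteResult& r, std::size_t buf_size,
                           unsigned in_progress) {
    return r.bytes_transferred == buf_size ||
           (in_progress & kShutdownInProgress) != 0;
}

// With assertions enabled, a short write outside of shutdown aborts the
// process (SIGABRT, i.e. signal 6). With NDEBUG defined, the assert
// expands to nothing and the short write is simply returned to the caller.
std::size_t write_checked(const WriteResult& r, std::size_t buf_size,
                          unsigned in_progress) {
    assert(write_invariant_holds(r, buf_size, in_progress));
    return r.bytes_transferred;
}
```

In this sketch, a killed joiner corresponds to a short write such as `WriteResult{4}` against a 10-byte buffer with no shutdown flag set. The predicate is false, so a build with assertions enabled aborts.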