Node crashes with Transport endpoint is not connected
Activity
Kamil Holubicki September 22, 2023 at 11:25 AM
A post-merge fix will be provided in 8.0.34 by this PR:
https://github.com/percona/galera/pull/270
Neil Billett September 22, 2023 at 8:23 AM
Hi,
As per the discussion here: https://forums.percona.com/t/occasional-db-crashes-in-pxc-8-0-32-around-remote-endpoint-transport-endpoint-is-not-connected/25158 we believe this is still an issue in PXC 8.0.32.
I’ve been able to replicate the crash with a two-node cluster on PXC 8.0.32 using nmap’s TCP connect scan against port 4567 on both nodes.
I’ve got our application connected to <node1> (generating some writeset changes) and if I leave this running from a third host:
while true; do nmap -T2 -sT <node1> -p4567; nmap -T2 -sT <node2> -p4567; done
…I see a lot of these entries in both node logs as the commands loop:
[Warning] [MY-000000] [Galera] Failed to accept: remote_endpoint: Transport endpoint is not connected
…and after some time (usually minutes) <node1> falls over, e.g.:
2023-09-21T16:40:43.106276+01:00 0 [Warning] [MY-000000] [Galera] Failed to accept: remote_endpoint: Transport endpoint is not connected
2023-09-21T16:40:44.811107+01:00 0 [Warning] [MY-000000] [Galera] Failed to accept: remote_endpoint: Transport endpoint is not connected
2023-09-21T16:40:46.527115+01:00 0 [Warning] [MY-000000] [Galera] Failed to accept: remote_endpoint: Transport endpoint is not connected
terminate called after throwing an instance of 'std::system_error'
what(): remote_endpoint: Transport endpoint is not connected
2023-09-21T16:40:48.235578+01:00 0 [Note] [MY-000000] [WSREP] Initiating SST cancellation
2023-09-21T15:40:48Z UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
BuildID[sha1]=df9f6877fc91c9a71d439f27569eabdef408f622
Server Version: 8.0.32-24.2 Percona XtraDB Cluster (GPL), Release rel24, Revision 2119e75, WSREP version 26.1.4.3, wsrep_26.1.4.3
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x80000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x41) [0x2253a31]
/usr/sbin/mysqld(print_fatal_signal(int)+0x39f) [0x1262d0f]
/usr/sbin/mysqld(handle_fatal_signal+0xd8) [0x1262df8]
/lib64/libpthread.so.0(+0x12cf0) [0x7f7a1e6f9cf0]
/lib64/libc.so.6(gsignal+0x10f) [0x7f7a1caa7aff]
/lib64/libc.so.6(abort+0x127) [0x7f7a1ca7aea5]
/lib64/libstdc++.so.6(+0x9009b) [0x7f7a1d44909b]
/lib64/libstdc++.so.6(+0x9653c) [0x7f7a1d44f53c]
/lib64/libstdc++.so.6(+0x96597) [0x7f7a1d44f597]
/lib64/libstdc++.so.6(+0x967f8) [0x7f7a1d44f7f8]
/usr/lib64/galera4/libgalera_smm.so(+0x922cf) [0x7f7a0f5872cf]
/usr/lib64/galera4/libgalera_smm.so(+0x92d7c) [0x7f7a0f587d7c]
/usr/lib64/galera4/libgalera_smm.so(+0xa6885) [0x7f7a0f59b885]
/usr/lib64/galera4/libgalera_smm.so(+0xb3c98) [0x7f7a0f5a8c98]
/usr/lib64/galera4/libgalera_smm.so(+0x8e400) [0x7f7a0f583400]
/usr/lib64/galera4/libgalera_smm.so(+0x8e6b3) [0x7f7a0f5836b3]
/usr/lib64/galera4/libgalera_smm.so(+0x1c15ae) [0x7f7a0f6b65ae]
/usr/lib64/galera4/libgalera_smm.so(+0x1c16d6) [0x7f7a0f6b66d6]
/lib64/libpthread.so.0(+0x81ca) [0x7f7a1e6ef1ca]
/lib64/libc.so.6(clone+0x43) [0x7f7a1ca92e73]
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
Hope it's helpful!
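The repeated warning in the log above is the visible precursor of the crash, so counting it can show whether a node is being probed. A minimal sketch, assuming a readable error log; the function name `count_accept_failures` and the log path in the usage comment are hypothetical, and only the message text is taken from the log above.

```shell
#!/usr/bin/env bash
# Count Galera "Failed to accept" warnings in an error log.
# The pattern is matched as a fixed string (-F), not as a regex.
count_accept_failures() {
  local logfile="$1"
  local pattern='Failed to accept: remote_endpoint: Transport endpoint is not connected'
  # grep -c prints the number of matching lines; it exits non-zero when
  # the count is 0, so "|| true" keeps the function usable under "set -e".
  grep -c -F -- "$pattern" "$logfile" || true
}

# Usage (the path is an assumption, adjust it to your log_error setting):
#   count_accept_failures /var/log/mysqld.log
```

A non-zero count only indicates that connections were dropped before Galera could read the peer address; it does not by itself mean the node will crash.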
yoann.lacancellera March 10, 2023 at 11:08 AM
Thank you, this really helps
Sadly, nothing was documented in the PXC releases about the WSREP version, and the variables still show 26.4.3:
| version_comment | Percona XtraDB Cluster binary (GPL) 8.0.30, Revision aff6a8b, WSREP version 26.4.3 |
I found out I should have checked the GALERA_VERSIONS file in the submodules.
Kamil Holubicki March 3, 2023 at 4:57 PM
Minimal testcase assuming default Galera communication port 4567:
start single node cluster
while true; do nmap -p4567 127.0.0.1; done
It crashes immediately.
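Where nmap is not installed, a connect-and-close probe can be approximated with bash alone. This is a sketch under assumptions: it relies on bash's `/dev/tcp` redirection (present in most distribution builds), the function name `probe` is hypothetical, and whether it triggers the same crash as the nmap loop has not been verified here.

```shell
#!/usr/bin/env bash
# probe HOST PORT: open a TCP connection and close it straight away.
# Prints "open" if the connection succeeded, "closed" otherwise.
probe() {
  local host="$1" port="$2"
  # The redirection runs in a subshell, so descriptor 3 (the socket)
  # is closed as soon as the subshell exits.
  if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# Usage against the default Galera port on a disposable test node:
#   while true; do probe 127.0.0.1 4567; done
```

Run it only against a throwaway node, since the point of the test case is to crash the server.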
The issue was fixed by Galera upstream commit 930c016108d7086b472ad7a8b9d0f6989202b48a and is included in Galera 26.4.12, so:
8.0.27 -> galera 26.4.10 - failure
8.0.28 -> galera 26.4.11 - failure
8.0.29 -> galera 26.4.12 - works fine
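The cutoff in this list can be checked mechanically. A minimal sketch, assuming GNU `sort -V` is available; the function name `has_upstream_fix` is hypothetical, and the check reflects only the Galera 26.4.12 cutoff named in this comment, not the later post-merge fix.

```shell
#!/usr/bin/env bash
# Succeeds if the given Galera version is at or above 26.4.12,
# the release this comment names as containing the upstream commit.
has_upstream_fix() {
  local fixed_in="26.4.12"
  # sort -V orders version strings numerically per component; if the
  # lowest of the two is the cutoff itself, the argument is >= cutoff.
  [ "$(printf '%s\n%s\n' "$fixed_in" "$1" | sort -V | head -n1)" = "$fixed_in" ]
}

# The three Galera versions listed above.
for v in 26.4.10 26.4.11 26.4.12; do
  if has_upstream_fix "$v"; then
    echo "galera $v: includes upstream commit"
  else
    echo "galera $v: affected"
  fi
done
```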
Related to:
https://jira.mariadb.org/browse/MDEV-25068
https://forums.percona.com/t/looks-like-bug-to-many-connection-crashes-pxc/17920
Upgrading the WSREP version from 26.4.3 to 26.4.12 should resolve a crash triggered by some vulnerability scan utilities.
This affects 8.0.27, but it should also affect 8.0.30, as the WSREP version is the same.
2023-02-28T13:18:54.794062+01:00 0 [Warning] [MY-000000] [Galera] unserialize error invalid protocol version 2: 71 (Protocol error) at gcomm/src/gcomm/datagram.hpp:unserialize():133
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<std::system_error> >'
what(): remote_endpoint: Transport endpoint is not connected
2023-02-28T13:19:41.105324+01:00 0 [Note] [MY-000000] [WSREP] Initiating SST cancellation
12:19:41 UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Build ID: df3732f507bb44de9b2cf240d6ec633fccedccbe
Server Version: 8.0.27-18.1 Percona XtraDB Cluster (GPL), Release rel18, Revision ac35177, WSREP version 26.4.3, wsrep_26.4.3
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x100000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x3d) [0x20d55fd]
/usr/sbin/mysqld(handle_fatal_signal+0x383) [0x1172a83]
/lib64/libpthread.so.0(+0xf630) [0x7f56d8736630]
/lib64/libc.so.6(gsignal+0x37) [0x7f56d6a21387]
/lib64/libc.so.6(abort+0x148) [0x7f56d6a22a78]
/lib64/libstdc++.so.6(__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f56d7331a95]
/lib64/libstdc++.so.6(+0x5ea06) [0x7f56d732fa06]
/lib64/libstdc++.so.6(+0x5ea33) [0x7f56d732fa33]
/lib64/libstdc++.so.6(+0x5ec53) [0x7f56d732fc53]
/usr/lib64/galera4/libgalera_smm.so(+0x1dbce) [0x7f56c7b9bbce]
/usr/lib64/galera4/libgalera_smm.so(+0x93f48) [0x7f56c7c11f48]
/usr/lib64/galera4/libgalera_smm.so(+0xa3dc5) [0x7f56c7c21dc5]
/usr/lib64/galera4/libgalera_smm.so(+0xa6b6a) [0x7f56c7c24b6a]
/usr/lib64/galera4/libgalera_smm.so(+0xaddaf) [0x7f56c7c2bdaf]
/usr/lib64/galera4/libgalera_smm.so(+0x8c160) [0x7f56c7c0a160]
/usr/lib64/galera4/libgalera_smm.so(+0x1c418e) [0x7f56c7d4218e]
/usr/lib64/galera4/libgalera_smm.so(+0x1c42b2) [0x7f56c7d422b2]
/lib64/libpthread.so.0(+0x7ea5) [0x7f56d872eea5]
/lib64/libc.so.6(clone+0x6d) [0x7f56d6ae99fd]