CLONE - Percona server 5.6.40 restarting with signal 11

Description

We recently migrated our MySQL server to new hardware, where it ran in slave mode for 15 days. We made it the master on 11th June. On 13th June it restarted for the first time with signal 11. Stack trace from the first segfault:

04:32:32 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=33554432
read_buffer_size=131072
max_used_connections=547
max_threads=5002
thread_count=430
connection_count=430
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2023198 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x2a83900
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f86b8c24e88 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x8c66bc]
/usr/sbin/mysqld(handle_fatal_signal+0x469)[0x64d079]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7f8e9871c890]
/usr/sbin/mysqld(_Z25gtid_pre_statement_checksPK3THD+0x0)[0x848820]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x316)[0x6cb8a6]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x5d8)[0x6d15e8]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x117f)[0x6d2eaf]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x1a2)[0x69f962]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x69fa00]
/usr/sbin/mysqld(pfs_spawn_thread+0x146)[0x8fbfe6]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064)[0x7f8e98715064]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f8e9675462d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f85e012ee80): is an invalid pointer
Connection ID (thread ID): 247827
Status: NOT_KILLED

 

Two days later we observed three more segfaults:

Segfault 1:

02:15:36 UTC - mysqld got signal 11 ;

This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=33554432
read_buffer_size=131072
max_used_connections=723
max_threads=5002
thread_count=358
connection_count=358
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2023198 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x2b73a90
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f0e12d45e88 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x8c66bc]
/usr/sbin/mysqld(handle_fatal_signal+0x469)[0x64d079]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7f15f20b9890]
/usr/sbin/mysqld[0x64b000]
/usr/sbin/mysqld(vio_io_wait+0x76)[0xb77b56]
/usr/sbin/mysqld(vio_socket_io_wait+0x18)[0xb77bf8]
/usr/sbin/mysqld(vio_read+0xca)[0xb77cda]
/usr/sbin/mysqld[0x642203]
/usr/sbin/mysqld[0x6424f4]
/usr/sbin/mysqld(my_net_read+0x304)[0x6432e4]
/usr/sbin/mysqld(_Z10do_commandP3THD+0xca)[0x6d413a]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x1a2)[0x69f962]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x69fa00]
/usr/sbin/mysqld(pfs_spawn_thread+0x146)[0x8fbfe6]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064)[0x7f15f20b2064]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f15f00f162d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 40900
Status: NOT_KILLED

Segfault 2:

02:36:32 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=33554432
read_buffer_size=131072
max_used_connections=401
max_threads=5002
thread_count=369
connection_count=369
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2023198 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x32448f0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f2fb82c3e88 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x8c66bc]
/usr/sbin/mysqld(handle_fatal_signal+0x469)[0x64d079]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7f3792426890]
/usr/sbin/mysqld(_ZN9PROFILING15start_new_queryEPKc+0x0)[0x6e60a0]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x47)[0x6d1d77]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x1a2)[0x69f962]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x69fa00]
/usr/sbin/mysqld(pfs_spawn_thread+0x146)[0x8fbfe6]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064)[0x7f379241f064]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f379045e62d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 482
Status: NOT_KILLED

 

Segfault 3:

04:53:22 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=33554432
read_buffer_size=131072
max_used_connections=448
max_threads=5002
thread_count=368
connection_count=368
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2023198 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x22bc600
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f10c19b9e88 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x8c66bc]
/usr/sbin/mysqld(handle_fatal_signal+0x469)[0x64d079]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7f189c309890]
/usr/sbin/mysqld(_Z20net_after_header_psiP6st_netPvmc+0x0)[0x5804b0]
/usr/sbin/mysqld[0x64250b]
/usr/sbin/mysqld(my_net_read+0x304)[0x6432e4]
/usr/sbin/mysqld(_Z10do_commandP3THD+0xca)[0x6d413a]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x1a2)[0x69f962]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x69fa00]
/usr/sbin/mysqld(pfs_spawn_thread+0x146)[0x8fbfe6]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064)[0x7f189c302064]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f189a34162d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 1386
Status: NOT_KILLED

 

After this we did a master/slave switch and took this server out of the cluster.

We observed the same issue on the new master as well. We also tried to reproduce it by running a long OLTP sysbench benchmark.

During the benchmark the server crashed once again:

 
 
10:51:18 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=33554432
read_buffer_size=131072
max_used_connections=4
max_threads=5002
thread_count=3
connection_count=3
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2023198 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x22c8270
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f65ec060e88 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x8c66bc]
/usr/sbin/mysqld(handle_fatal_signal+0x469)[0x64d079]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7f6644371890]
/lib/x86_64-linux-gnu/libc.so.6(__poll+0x0)[0x7f66423a0ac0]
/usr/sbin/mysqld(vio_io_wait+0x86)[0xb77b66]
/usr/sbin/mysqld(vio_socket_io_wait+0x18)[0xb77bf8]
/usr/sbin/mysqld(vio_read+0xca)[0xb77cda]
/usr/sbin/mysqld[0x642203]
/usr/sbin/mysqld[0x6424f4]
/usr/sbin/mysqld(my_net_read+0x304)[0x6432e4]
/usr/sbin/mysqld(_Z10do_commandP3THD+0xca)[0x6d413a]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x1a2)[0x69f962]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x69fa00]
/usr/sbin/mysqld(pfs_spawn_thread+0x146)[0x8fbfe6]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064)[0x7f664436a064]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f66423a962d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 32
Status: NOT_KILLED

Logs from sysbench:

FATAL: mysql_drv_query() returned error 2013 (Lost connection to MySQL server during query) for query 'SELECT SUM(k) FROM sbtest20 WHERE id BETWEEN 5008643 AND 5008742'
FATAL: mysql_drv_query() returned error 2013 (Lost connection to MySQL server during query) for query 'DELETE FROM sbtest4 WHERE id=5025943'
FATAL: mysql_drv_query() returned error 2013 (Lost connection to MySQL server during query) for query 'SELECT SUM(k) FROM sbtest15 WHERE id BETWEEN 5049412 AND 5049511'
FATAL: `thread_run' function failed: /usr/share/sysbench/oltp_common.lua:432: SQL error, errno = 2013, state = 'HY000': Lost connection to MySQL server during query
FATAL: `thread_run' function failed: /usr/share/sysbench/oltp_common.lua:487: SQL error, errno = 2013, state = 'HY000': Lost connection to MySQL server during query
FATAL: `thread_run' function failed: /usr/share/sysbench/oltp_common.lua:432: SQL error, errno = 2013, state = 'HY000': Lost connection to MySQL server during query
Error in my_thread_global_end(): 3 threads didn't exit

We cannot enable the general query log because the load is very high and the log fills up the disk. We also cannot reproduce the crash deterministically.

MySQL version: 5.6.40-84.0-log
OS (Debian) kernel: Linux version 3.16.0-6-amd64 (debian-kernel@lists.debian.org) (gcc version 4.9.2 (Debian 4.9.2-10+deb8u1) ) #1 SMP Debian 3.16.56-1+deb8u1 (2018-05-08)

Machine memory is 40 GB; the InnoDB buffer pool is 30 GB.
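For comparing the backtraces above, the frame that was executing when the signal arrived can be pulled out mechanically. The following is a minimal Python sketch, not part of the original report. The helper names `parse_frames` and `faulting_frame` are hypothetical, the regular expression is only based on the frame format seen in these logs, and skipping the libpthread frame directly after `handle_fatal_signal` assumes that this frame is the signal-return entry, as it appears to be in every trace here. The sample frames are copied from Segfault 2.

```python
import re

# Matches frames in the format seen in the logs above, e.g.
#   /usr/sbin/mysqld(vio_read+0xca)[0xb77cda]
#   /usr/sbin/mysqld[0x642203]            (no symbol information)
FRAME_RE = re.compile(
    r"(?P<binary>/\S+?)"
    r"(?:\((?P<symbol>[^()+]*)\+(?P<offset>0x[0-9a-fA-F]+)\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]"
)


def parse_frames(log_text):
    """Return every stack frame found in log_text, in order of appearance."""
    frames = []
    for m in FRAME_RE.finditer(log_text):
        frames.append({
            "binary": m.group("binary"),
            # An empty or missing symbol is reported as None.
            "symbol": m.group("symbol") or None,
            "offset": int(m.group("offset"), 16) if m.group("offset") else None,
            "address": int(m.group("address"), 16),
        })
    return frames


def faulting_frame(frames):
    """Return the first frame below the crash handler, or None if not found.

    Assumption: the libpthread frame right after handle_fatal_signal is the
    signal-return entry, so it is skipped as well.
    """
    for i, frame in enumerate(frames):
        if frame["symbol"] == "handle_fatal_signal":
            rest = frames[i + 1:]
            if rest and "libpthread" in rest[0]["binary"]:
                rest = rest[1:]
            return rest[0] if rest else None
    return None


# Frames copied from "Segfault 2" in this report.
SAMPLE = """\
/usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x8c66bc]
/usr/sbin/mysqld(handle_fatal_signal+0x469)[0x64d079]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7f3792426890]
/usr/sbin/mysqld(_ZN9PROFILING15start_new_queryEPKc+0x0)[0x6e60a0]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x47)[0x6d1d77]
"""

if __name__ == "__main__":
    top = faulting_frame(parse_frames(SAMPLE))
    print(top["symbol"], hex(top["offset"]), hex(top["address"]))
```

The mangled C++ names that this returns can be made readable by piping them through `c++filt` from binutils. Because the regular expression does not depend on line breaks, it should also work on log text whose lines have been run together.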


Environment

None

Attachments

4
  • 26 Jun 2019, 05:38 PM
  • 26 Jun 2019, 05:38 PM
  • 26 Jun 2019, 05:31 PM
  • 26 Jun 2019, 05:31 PM

Activity

Jira Bot March 29, 2020 at 4:15 PM

Hello ,
It's been 208 days since this issue went into Incomplete and we haven't heard
from you on this.

At this point, our policy is to Close this issue, to keep things from getting
too cluttered. If you have more information about this issue and wish to
reopen it, please reply with a comment containing "jira-bot=reopen".

Jira Bot February 28, 2020 at 8:55 PM

Hello ,
It's jira-bot again. Your bug report is important to us, but we haven't heard
from you since the previous notification. If we don't hear from you on
this in 7 days, the issue will be automatically closed.

Jira Bot January 25, 2020 at 10:20 PM

Hello ,
I'm jira-bot, Percona's automated helper script. Your bug report is important
to us but we've been unable to reproduce it, and asked you for more
information. If we haven't heard from you on this in 3 more weeks, the issue
will be automatically closed.

Lalit Choudhary September 3, 2019 at 1:31 PM

Hello Sarthak,

Could you please provide new crash logs (crash logs from after the upgrade to PS 5.6.44)?

Sarthak Shrivastava June 26, 2019 at 5:38 PM

I have attached the standard and full backtrace as well.

Incomplete

Created June 26, 2019 at 5:31 PM
Updated March 6, 2024 at 12:02 PM
Resolved March 29, 2020 at 4:15 PM