Issues
- LP #1401133: The problem with Node after using innobackupex and "Backup Locks" (PXC-1020, resolved; assignee: Krunal Bauskar)
- Block create/alter/drop/undo truncation while backup lock is available and hold lock until operation is completed (PS-7328, resolved; assignee: Zsolt Parragi)
- Block enable/disable redo log with lock tables for backup (PS-7245, resolved; assignee: Zsolt Parragi)
- mysqldump --single-transaction --lock-for-backup missing mutex to avoid concurrency with DDLs (PS-5631, resolved; assignee: Yura Sorokin)
- Test LOCK TABLES FOR BACKUP with performance_schema.log_status and LOCK INSTANCE FOR BACKUP (PS-5030, resolved)
- LOCK TABLES FOR BACKUP should depend on BACKUP_ADMIN instead of RELOAD (PS-5014, resolved; assignee: Przemyslaw Skibinski)
- handle_fatal_signal (sig=11) in handler::unbind_psi (PS-4929, resolved)
- A sequence of LOCK TABLES FOR BACKUP and STOP SLAVE SQL_THREAD can cause replication to be blocked and cannot be restarted normally (PS-4758, resolved; assignee: Sergei Glushchenko)
- LP #1701154: backup lock should not block partitioned innodb table updates (PS-3713, resolved)
- LP #1527463: Waiting for binlog lock (PS-3345, resolved; assignee: Przemyslaw Skibinski)
- LP #1617267: -Wunused-but-set-variable warning in Global_backup_lock::set_explicit_locks_duration for should_own (PS-2188, resolved)
- LP #1616333: Test rpl.rpl_backup_locks_mts is unstable (PS-2180, resolved)
- LP #1712202: XA operations do not take global commit lock under LOCK TABLES FOR BACKUP (PS-1817, resolved)
- LP #1432494: Assertion `duration != MDL_EXPLICIT || !thd->mdl_context.is_lock_owner(m_namespace, "", "", MDL_INTENTION_EXCLUSIVE)' failed in sql/lock.cc:1198 (PS-1613, resolved)
- LP #1393682: Assertion `m_prot_lock != __null && thd->mdl_context.is_lock_owner(m_namespace, "", "", MDL_INTENTION_EXCLUSIVE)' failed in Global_backup_lock::release_protection (PS-1584, resolved)
- LP #1371827: Sporadic partial-hangup on various queries + related (same-testcase) crashes/asserts (PS-1540, resolved)
- LP #1384583: mysqld got signal 11 on update query | handle_fatal_signal (sig=11) in open_table (PS-831, resolved)
- LP #1364707: sql/table_cache.h:527: void Table_cache::release_table(THD*, TABLE*): Assertion `! table->s->has_old_version()' failed. | sig6 abort() in Table_cache::release_table (PS-813, resolved)
LP #1401133: The problem with Node after using innobackupex and "Backup Locks"
Details
Assignee: Krunal Bauskar
Reporter: lpjirasync
Priority: High
Activity
lpjirasync, January 12, 2018 at 10:32 AM
Comment from Launchpad by Krunal Bauskar on 18-11-2015 10:02:29
Could anyone still reproduce this after the fix for #2, as mentioned in comment #21?
lpjirasync, January 12, 2018 at 10:32 AM
Comment from Launchpad by Alexey Kopytov on 06-06-2015 19:25:53
There are two parts to this bug report:
1. A deadlock triggered by the LOCK BINLOG FOR BACKUP statement
executed by XtraBackup.
2. The XtraBackup connection is closed due to the above error while
holding a backup (TABLE) lock, which results in heap corruption and a
server crash, possibly some time later.
I believe #2 has been reported and fixed in PS 5.6.24-72.2 with
https://github.com/percona/percona-server/pull/26
If my assumption is correct, the latest PXC release 5.6.24-25.11
should not crash when backup locks are used by XtraBackup, but
XtraBackup may still fail with an error due to #1.
#1 (that is, a deadlock with LOCK BINLOG FOR BACKUP) looks like a
PXC-specific issue, but so far I've been unable to reproduce it
with either PXC 5.6.22-25.8 or PXC 5.6.24-25.11. Here's what I
tried:
1. a sysbench OLTP_RW workload + concurrent LOCK * FOR BACKUP statements
in the same sequence as executed by XtraBackup on the same node
2. sysbench autocommit updates + the same concurrent LOCK * FOR BACKUP
statements on the same node
3. the above 2 cases, but with LOCK * FOR BACKUP statements executed on
the same node where queries coming from sysbench are executed
4. all of the above cases with binary logging enabled.
At this point I'm out of ideas on how to reproduce the LOCK BINLOG
FOR BACKUP deadlock.
If you can still reproduce either #1 or #2 with PXC 5.6.24-25.11 or a
later release, please provide your my.cnf and at least a general
description of your workload (tables, queries, storage engines involved,
etc.)
Thank you.
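For reference, the backup-lock statement sequence described in this comment (and visible in the innobackupex log below) can be sketched as follows. This is only an illustration assembled from the statements quoted in this report: the script merely writes the SQL to a temporary file, since actually exercising the reported deadlock requires piping it into a live PXC node, which is out of scope here.

```shell
#!/bin/sh
# Sketch of the server-side lock sequence innobackupex executes with
# backup locks, per the log output in this report. Writing to a file
# only; to reproduce the issue you would feed this to `mysql` on a
# live PXC node while a concurrent write workload runs.
cat > /tmp/backup_lock_sequence.sql <<'EOF'
-- Phase 1: block non-InnoDB writes and DDL while files are copied
LOCK TABLES FOR BACKUP;
-- (non-InnoDB tables and .frm files are copied at this point)
-- Phase 2: freeze the binlog position; this is the statement that
-- hit "Deadlock found when trying to get lock" in the report
LOCK BINLOG FOR BACKUP;
-- (binlog coordinates are recorded at this point)
UNLOCK BINLOG;
UNLOCK TABLES;
EOF
echo "wrote /tmp/backup_lock_sequence.sql"
```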
lpjirasync, January 12, 2018 at 10:32 AM
Comment from Launchpad by Reiner Rusch on 28-05-2015 01:42:34
... and not to forget: teach your customers not to use MyISAM anymore.
lpjirasync, January 12, 2018 at 10:32 AM
Comment from Launchpad by Reiner Rusch on 28-05-2015 01:30:11
I had the same problem, and it cost me some really hard days.
So to summarize:
1) The tip to use --no-backup-locks did not work for @Medali
2) The next tip: force FLUSH TABLES WITH READ LOCK ....
This is quite a contrast, isn't it?
One does no locking; the other explicitly takes a read lock.
-> 1) No locking should make the machine more responsive, but could cause problems with MyISAM (more on that later)
-> 2) Could get you into serious trouble with deadlocks
Either way, locking tables (on one node, if all works fine) is a problem, and read locks are not much better.
It is a pity to read this (see the last block) only after searching for hints for hours:
https://www.percona.com/doc/percona-xtradb-cluster/5.6/limitation.html
This also doesn't seem to be really safe. Hmm. (I wish that last block were in red and bold!)
My guess (still testing) is that the only solution would be this:
https://www.percona.com/blog/2012/03/23/how-flush-tables-with-read-lock-works-with-innodb-tables/
So don't do any locking at all (perhaps even forbid it) and run innobackupex with
--no-lock, in combination with --safe-slave-backup if interacting with a slave.
Any ideas?
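The workaround suggested in this comment would be invoked roughly as below. This is a hedged sketch, not a recommendation from the bug report itself: --no-lock is only consistent when no DDL or non-InnoDB writes happen during the copy, and --safe-slave-backup applies when backing up from an async replica. The script only records the command line rather than running it, since a live server would be required.

```shell
#!/bin/sh
# Illustration of the no-locking workaround from the comment above.
# We only write the command line to a file; running it for real
# requires a configured PXC/MySQL instance and a backup target.
CMD="innobackupex --no-lock --safe-slave-backup /mnt/xtrabackup/"
printf '%s\n' "$CMD" > /tmp/innobackupex_cmd.txt
cat /tmp/innobackupex_cmd.txt
```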
lpjirasync, January 12, 2018 at 10:32 AM
Comment from Launchpad by Raghavendra D Prabhu on 28-04-2015 05:49:50
@Medali,
Can you provide the error (and innobackupex) logs from when you used
[sst]
inno-backup-opts='--no-backup-locks'
Alternatively, you can set FORCE_FTWRL=1 in /etc/sysconfig/mysql (or its systemd service variant) on CentOS, or in /etc/default/mysql on Debian.
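The two workarounds mentioned in this comment can be sketched as config fragments. To stay harmless, the script below writes them to temporary files instead of the real paths (the real locations, per the comment, are my.cnf for the [sst] section and /etc/sysconfig/mysql or /etc/default/mysql for FORCE_FTWRL).

```shell
#!/bin/sh
# Option 1: make the xtrabackup-v2 SST script pass --no-backup-locks
# to innobackupex (normally goes in the server's my.cnf).
cat > /tmp/my_cnf_sst_fragment <<'EOF'
[sst]
inno-backup-opts='--no-backup-locks'
EOF
# Option 2: force FLUSH TABLES WITH READ LOCK instead of backup locks
# (normally /etc/sysconfig/mysql on CentOS, /etc/default/mysql on Debian).
cat > /tmp/sysconfig_mysql_fragment <<'EOF'
FORCE_FTWRL=1
EOF
echo "wrote both config fragments"
```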
Reported in Launchpad by Aleksey Sokolov, last update 30-06-2017 09:32:34
Good evening, dear colleagues.
After the upgrade of PXC to the latest version, we had some problems with innobackupex and "Backup locks".
Current MySQL server version: "Server version: 5.6.21-70.1-56-log Percona XtraDB Cluster (GPL), Release rel70.1, Revision 938, WSREP version 25.8, wsrep_25.8.r4150"
Before the upgrade: "Server version: 5.6.20-68.0-56-log Percona XtraDB Cluster (GPL), Release rel68.0, Revision 888, WSREP version 25.7, wsrep_25.7.r4126"
OS: Scientific Linux release 6.6 (Carbon)
The number of nodes in the cluster: 3
While using the innobackupex tool, errors occur while taking locks and the node crashes.
Trying to run innobackupex:
[root@natrium mysql]# innobackupex --parallel=2 /mnt/xtrabackup/
..............................................
..............................................
>> log scanned up to (2892514252116)
>> log scanned up to (2892514252116)
[01] ...done
xtrabackup: Creating suspend file '/mnt/xtrabackup/2014-12-10_11-49-36/xtrabackup_suspended_2' with pid '24638'
>> log scanned up to (2892514252116)
141210 12:36:34 innobackupex: Continuing after ibbackup has suspended
141210 12:36:34 innobackupex: Executing LOCK TABLES FOR BACKUP...
DBD::mysql::db do failed: MySQL server has gone away at /usr/bin/innobackupex line 3035.
innobackupex: got a fatal error with the following stacktrace: at /usr/bin/innobackupex line 3038
main::mysql_query('HASH(0x2a82530)', 'LOCK TABLES FOR BACKUP') called at /usr/bin/innobackupex line 3440
main::mysql_lock_tables('HASH(0x2a82530)') called at /usr/bin/innobackupex line 1982
main::backup() called at /usr/bin/innobackupex line 1592
innobackupex: Error:
Error executing 'LOCK TABLES FOR BACKUP': DBD::mysql::db do failed: MySQL server has gone away at /usr/bin/innobackupex line 3035.
141210 12:36:34 innobackupex: Waiting for ibbackup (pid=24638) to finish
The node immediately crashes. Logs:
10:07:54 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=8
max_threads=2050
thread_count=386
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 949471 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x8f97d5]
/usr/sbin/mysqld(handle_fatal_signal+0x4b4)[0x6655c4]
/lib64/libpthread.so.0(+0xf710)[0x7fd38c8f1710]
/usr/lib64/libjemalloc.so.1(+0xf2c4)[0x7fd38cb0e2c4]
/usr/lib64/libjemalloc.so.1(+0x2834f)[0x7fd38cb2734f]
/usr/lib64/libjemalloc.so.1(malloc+0x28d)[0x7fd38cb0585d]
/usr/sbin/mysqld(my_malloc+0x32)[0x8f4ef2]
/usr/sbin/mysqld(init_dynamic_array2+0x62)[0x8e09d2]
/usr/sbin/mysqld(_ZN8Gtid_set4initEv+0x47)[0x881c37]
/usr/sbin/mysqld(_ZN3THDC1Ebb+0x966)[0x6b7866]
/usr/sbin/mysqld(_Z26handle_connections_socketsv+0x43f)[0x581d4f]
/usr/sbin/mysqld(_Z11mysqld_mainiPPc+0xf1b)[0x58a7cb]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7fd38ad23d5d]
/usr/sbin/mysqld[0x57a229]
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
141210 12:07:55 mysqld_safe Number of processes running now: 0
141210 12:07:55 mysqld_safe WSREP: not restarting wsrep node automatically
141210 12:07:55 mysqld_safe mysqld from pid file /var/lib/mysql/natrium.la.net.ua.pid ended
I restarted MySQL (using the IST method for recovery) and tried again:
[root@natrium mysql]# innobackupex --parallel=2 /mnt/xtrabackup/
..............................................
..............................................
xtrabackup: Creating suspend file '/mnt/xtrabackup/2014-12-10_13-45-24/xtrabackup_suspended_2' with pid '10719'
141210 14:34:31 innobackupex: Continuing after ibbackup has suspended
141210 14:34:31 innobackupex: Executing LOCK TABLES FOR BACKUP...
141210 14:34:31 innobackupex: Backup tables lock acquired
141210 14:34:31 innobackupex: Starting to backup non-InnoDB tables and files
innobackupex: in subdirectories of '/var/lib/mysql/'
innobackupex: Backing up files '/var/lib/mysql//mysql/*.{frm,isl,MYD,MYI,MAD,MAI,MRG,TRG,TRN,ARM,ARZ,CSM,CSV,opt,par}' (75 files)
>> log scanned up to (2895378455386)
innobackupex: Backing up files '/var/lib/mysql//__zabbix/*.{frm,isl,MYD,MYI,MAD,MAI,MRG,TRG,TRN,ARM,ARZ,CSM,CSV,opt,par}' (122 files)
innobackupex: Backing up file '/var/lib/mysql//test/db.opt'
innobackupex: Backing up files '/var/lib/mysql//performance_schema/*.{frm,isl,MYD,MYI,MAD,MAI,MRG,TRG,TRN,ARM,ARZ,CSM,CSV,opt,par}' (53 files)
innobackupex: Backing up files '/var/lib/mysql//billing/*.{frm,isl,MYD,MYI,MAD,MAI,MRG,TRG,TRN,ARM,ARZ,CSM,CSV,opt,par}' (563 files)
>> log scanned up to (2895378479147)
>> log scanned up to (2895378479640)
innobackupex: Backing up file '/var/lib/mysql//billing_storage/db.opt'
innobackupex: Backing up file '/var/lib/mysql//billing_storage/file_chunks.frm'
innobackupex: Backing up file '/var/lib/mysql//billing_storage/files.frm'
innobackupex: Backing up files '/var/lib/mysql//billing_backups/*.{frm,isl,MYD,MYI,MAD,MAI,MRG,TRG,TRN,ARM,ARZ,CSM,CSV,opt,par}' (38 files)
innobackupex: Backing up file '/var/lib/mysql//quick/db.opt'
innobackupex: Backing up file '/var/lib/mysql//quick/KEY_COLUMN_USAGE.frm'
innobackupex: Backing up file '/var/lib/mysql//quick/REFERENTIAL_CONSTRAINTS.frm'
innobackupex: Backing up file '/var/lib/mysql//asterisk/db.opt'
innobackupex: Backing up file '/var/lib/mysql//asterisk/permissions.frm'
innobackupex: Backing up file '/var/lib/mysql//asterisk/sipfriends.frm'
141210 14:34:34 innobackupex: Finished backing up non-InnoDB tables and files
141210 14:34:34 innobackupex: Executing LOCK BINLOG FOR BACKUP...
DBD::mysql::db do failed: Deadlock found when trying to get lock; try restarting transaction at /usr/bin/innobackupex line 3035.
innobackupex: got a fatal error with the following stacktrace: at /usr/bin/innobackupex line 3038
main::mysql_query('HASH(0x2148530)', 'LOCK BINLOG FOR BACKUP') called at /usr/bin/innobackupex line 3490
main::mysql_lock_binlog('HASH(0x2148530)') called at /usr/bin/innobackupex line 2000
main::backup() called at /usr/bin/innobackupex line 1592
innobackupex: Error:
Error executing 'LOCK BINLOG FOR BACKUP': DBD::mysql::db do failed: Deadlock found when trying to get lock; try restarting transaction at /usr/bin/innobackupex line 3035.
141210 14:34:34 innobackupex: Waiting for ibbackup (pid=10719) to finish
On the second attempt, a deadlock error occurred. The server worked for another 15 minutes and then crashed again. Logs:
12:55:01 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=5
max_threads=2050
thread_count=385
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 949471 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x8f97d5]
/usr/sbin/mysqld(handle_fatal_signal+0x4b4)[0x6655c4]
/lib64/libpthread.so.0(+0xf710)[0x7fd0951ff710]
/usr/lib64/libjemalloc.so.1(+0xf2c4)[0x7fd09541c2c4]
/usr/lib64/libjemalloc.so.1(+0x2834f)[0x7fd09543534f]
/usr/lib64/libjemalloc.so.1(malloc+0x28d)[0x7fd09541385d]
/usr/sbin/mysqld(my_malloc+0x32)[0x8f4ef2]
/usr/sbin/mysqld(init_dynamic_array2+0x62)[0x8e09d2]
/usr/sbin/mysqld(_ZN8Gtid_set4initEv+0x47)[0x881c37]
/usr/sbin/mysqld(_ZN3THDC1Ebb+0x966)[0x6b7866]
/usr/sbin/mysqld(_Z26handle_connections_socketsv+0x43f)[0x581d4f]
/usr/sbin/mysqld(_Z11mysqld_mainiPPc+0xf1b)[0x58a7cb]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7fd093631d5d]
/usr/sbin/mysqld[0x57a229]
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
141210 14:55:01 mysqld_safe Number of processes running now: 0
141210 14:55:01 mysqld_safe WSREP: not restarting wsrep node automatically
141210 14:55:01 mysqld_safe mysqld from pid file /var/lib/mysql/natrium.la.net.ua.pid ended
Prior to the last update such problems did not arise (previously we used FLUSH TABLES WITH READ LOCK).
All packages are installed from the Percona RPM repository:
[root@natrium /]# rpm -qa | grep Percona
Percona-XtraDB-Cluster-client-56-5.6.21-25.8.938.el6.x86_64
Percona-XtraDB-Cluster-shared-56-5.6.21-25.8.938.el6.x86_64
Percona-XtraDB-Cluster-full-56-5.6.21-25.8.938.el6.x86_64
Percona-XtraDB-Cluster-galera-3-3.8-1.3390.rhel6.x86_64
Percona-XtraDB-Cluster-56-debuginfo-5.6.21-25.8.938.el6.x86_64
Percona-XtraDB-Cluster-server-56-5.6.21-25.8.938.el6.x86_64
Percona-XtraDB-Cluster-garbd-3-3.8-1.3390.rhel6.x86_64
Percona-Server-shared-51-5.1.73-rel14.12.624.rhel6.x86_64
Percona-XtraDB-Cluster-test-56-5.6.21-25.8.938.el6.x86_64
Percona-XtraDB-Cluster-galera-3-debuginfo-3.8-1.3390.rhel6.x86_64
Total size of all databases: 256G
We also use partitioning.
/etc/my.cnf:
[mysqld_safe]
open-files-limit = 120000
malloc-lib=/usr/lib64/libjemalloc.so.1
[mysqld]
skip-name-resolve
user = mysql
bind-address = 10.10.91.4
port = 3306
max_connections = 2048
datadir = /var/lib/mysql
socket = /var/lib/mysql/mysql.sock
tmpdir = /tmp/mysql
symbolic-links = 0
table_open_cache = 8192
table_definition_cache = 4496
table_open_cache_instances = 16
thread_cache_size = 64
default_storage_engine = InnoDB
explicit_defaults_for_timestamp = On
ft_min_word_len = 3
large-pages
slow_query_log = On
slow_launch_time = 5
general_log_file = '/var/log/mysql/general/general.log'
general_log = Off
query_cache_size = 0
query_cache_type = off
innodb_buffer_pool_size = 64G
innodb_buffer_pool_instances = 16
innodb_log_file_size = 8G
innodb_log_buffer_size = 16M
innodb_log_block_size = 4096
innodb_log_group_home_dir = /var/lib/mysql_logs/innodb
innodb_data_file_path = /ibdata1:64M:autoextend
innodb_data_home_dir = /var/lib/mysql_logs/innodb
innodb_print_all_deadlocks = 0
innodb_open_files = 8192
innodb_file_per_table = 1
innodb_rollback_on_timeout = On
innodb_flush_log_at_trx_commit = 2
innodb_doublewrite = 1
innodb_flush_method = O_DIRECT
innodb_lock_wait_timeout = 300
innodb_flush_neighbors = 0
innodb_io_capacity = 40000
innodb_io_capacity_max = 70000
innodb_write_io_threads = 24
innodb_read_io_threads = 24
innodb_purge_threads = 4
innodb_random_read_ahead = On
innodb_support_xa = 0
innodb_autoinc_lock_mode = 2 # Galera
innodb_locks_unsafe_for_binlog = 1 # Galera
innodb_buffer_pool_load_at_startup = On
innodb_buffer_pool_dump_at_shutdown = On
log-bin = /var/lib/mysql_logs/binary/binlog
max_binlog_size = 1024M
binlog_format = ROW
binlog_cache_size = 5M
max_binlog_files = 10
expire_logs_days = 0
sync_binlog = 0
relay_log = /var/lib/mysql_logs/relay/relaylog
slave_load_tmpdir = /tmp/mysql
server-id = 12
skip-slave-start
log_slave_updates = On
log_error = "/var/log/mysql/error.log"
slow_query_log_file = "/var/log/mysql/slow.log"
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_provider_options="gcache.size = 1G; gcache.page_size = 1G; gcache.name = /var/lib/mysql_logs/galera/galera.cache; gcs.fc_limit = 20000"
wsrep_node_address="10.10.91.4"
wsrep_cluster_name="PXC"
wsrep_node_name="natrium"
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth = sst_xtrabackup:*****
wsrep_notify_cmd = '/usr/local/bin/wsrep_notify.sh'
wsrep_replicate_myisam = On
wsrep_forced_binlog_format = ROW
wsrep_log_conflicts = Off
wsrep_auto_increment_control = On
wsrep_retry_autocommit = 10
wsrep_slave_threads = 128
wsrep_convert_LOCK_to_trx = 1
wsrep_max_ws_size = 2147483648 # 2G
wsrep_max_ws_rows = 1048576 # 1M