LP #1188219: ineffective searching for unreserved slot in srv_conc_queue

Description

Reported in Launchpad by Hui Liu; last update 16-06-2013 06:25:11

If the number of threads inside InnoDB exceeds the configured concurrency limit (for example, 32), a newly arrived thread has to wait in srv_conc_queue (FIFO). To do so, it searches srv_conc_slots for an unreserved slot, scanning from index 0 up to OS_THREAD_MAX_N (OS_THREAD_MAX_N is typically 50K if the buffer pool is larger than 1GB).

With many connections, such as 1K, each thread that has to wait in srv_conc_queue loops from 0 to k (k is much less than OS_THREAD_MAX_N) before it finds an unreserved slot, although any free slot between 0 and OS_THREAD_MAX_N would be usable.

Here is the perf top output with disassembly for srv_conc_enter_innodb:
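To illustrate the cost described above, here is a minimal sketch of a scan that always starts at index 0. The type conc_slot and the function find_free_from_zero are hypothetical stand-ins, not the InnoDB definitions; only the reserved flag is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a wait slot; only the flag matters here. */
typedef struct {
    bool reserved;
} conc_slot;

/* Scan from index 0 and return the index of the first unreserved slot,
   or n if every slot is reserved. *probes receives the number of slots
   that were inspected. */
static size_t find_free_from_zero(const conc_slot *slots, size_t n,
                                  size_t *probes)
{
    size_t i;

    assert(slots != NULL && probes != NULL);

    for (i = 0; i < n; i++) {
        if (!slots[i].reserved) {
            break;
        }
    }

    *probes = (i < n) ? i + 1 : n;
    return i;
}
```

With k slots already reserved at the front of the array, every new waiter inspects k + 1 slots, so the work per waiter grows with the number of threads already queued.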

srv_conc_enter_innodb
0.83 : 8959ac: sub $0x1,%rax
0.00 : 8959b0: mov %rax,0x882ff9(%rip) # 11189b0 <srv_conc_n_waiting_threads>
:
: goto retry;
0.00 : 8959b7: jmpq 895803 <srv_conc_enter_innodb+0x120>
: }
:
: /* Too many threads inside: put the current thread to a queue */
:
: for (i = 0; i < OS_THREAD_MAX_N; i++) {
0.00 : 8959bc: movq $0x0,-0x60(%rbp)
0.00 : 8959c4: jmp 8959f9 <srv_conc_enter_innodb+0x316>
: slot = srv_conc_slots + i;
8.83 : 8959c6: mov 0x88b2db(%rip),%rcx # 1120ca8 <srv_conc_slots>
0.17 : 8959cd: mov -0x60(%rbp),%rdx
0.04 : 8959d1: mov %rdx,%rax
0.03 : 8959d4: shl $0x2,%rax
8.18 : 8959d8: add %rdx,%rax
0.22 : 8959db: shl $0x3,%rax
0.14 : 8959df: lea (%rcx,%rax,1),%rax
0.18 : 8959e3: mov %rax,-0x68(%rbp)
:
: if (!slot->reserved) {
8.33 : 8959e7: mov -0x68(%rbp),%rax
3.07 : 8959eb: mov 0x8(%rax),%rax
47.67 : 8959ef: test %rax,%rax
0.00 : 8959f2: je 895a08 <srv_conc_enter_innodb+0x325>
: goto retry;
: }
:
: /* Too many threads inside: put the current thread to a queue */
:
: for (i = 0; i < OS_THREAD_MAX_N; i++) {
5.91 : 8959f4: addq $0x1,-0x60(%rbp)
1.87 : 8959f9: mov 0x882f80(%rip),%rax # 1118980 <srv_max_n_threads>
2.08 : 895a00: cmp %rax,-0x60(%rbp)
0.00 : 895a04: jb 8959c6 <srv_conc_enter_innodb+0x2e3>
0.00 : 895a06: jmp 895a09 <srv_conc_enter_innodb+0x326>
: slot = srv_conc_slots + i;
:
: if (!slot->reserved) {
:
: break;
0.01 : 895a08: nop
: }
: }

In fact, any free slot between 0 and OS_THREAD_MAX_N can be used. So the search can continue from the index where the previous search stopped, which is done by declaring the variable i as "static".

Here is the sysbench test case; OLTP read-only throughput increased about 10x with 2K connections.
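The idea of resuming the search at the last position can be sketched as follows. This is not the actual patch: conc_slot and find_free_rotating are hypothetical names, the wrap-around is an assumption, and the sketch assumes the caller holds whatever mutex protects the slot array, because the static cursor is shared by all threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a wait slot; only the flag matters here. */
typedef struct {
    bool reserved;
} conc_slot;

/* Resume the scan where the previous call stopped, wrapping around at n.
   Returns the index of an unreserved slot, or n if all are reserved.
   Assumption: the caller serializes calls (for example with a mutex),
   since the static cursor is shared state. */
static size_t find_free_rotating(const conc_slot *slots, size_t n)
{
    static size_t cursor = 0;

    assert(slots != NULL && n > 0);

    for (size_t k = 0; k < n; k++) {
        size_t i = (cursor + k) % n;

        if (!slots[i].reserved) {
            cursor = (i + 1) % n;   /* next search starts after this slot */
            return i;
        }
    }

    return n;
}
```

Because the cursor keeps moving forward, a new waiter usually lands on a free slot after very few probes instead of walking over all slots reserved by earlier waiters.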

/home/xiyu.lh/xiyu/sysbench-db-scripts/sysbench/sysbench --test=/home/xiyu.lh/sysbench/sysbench/tests/db/oltp.lua --init-rng=on --oltp-read-only=on --oltp-skip-trx=off --oltp-dist-type=uniform --num-threads=2000 --max-requests=0 --mysql-socket=/u01/mysql2/run/mysql.sock --mysql-db=PRDB --oltp-tables-count=12 --mysql-user=root --oltp-table-size=100000 --oltp-dist-type=special run

original:
-------- QPS TPS Hit% ---innodb rows status--- ------threads------
time | ins upd del sel iud| lor hit| ins upd del read| run con cre cac|
11:10:14| 0 0 0 5720 0| 54354 99.99| 0 0 0 183620|2001 2002 0 0|
11:10:15| 0 0 0 7582 0| 68204 99.99| 0 0 0 217686|1998 2002 0 0|
11:10:17| 0 0 0 7786 0| 69081 99.99| 0 0 0 219590|2001 2002 0 0|
11:10:18| 0 0 0 6751 0| 59462 99.98| 0 0 0 192695|2001 2002 0 0|
11:10:19| 0 0 0 5928 0| 55851 99.99| 0 0 0 187567|2001 2002 0 0|

patched:
-------- QPS TPS Hit% ---innodb rows status--- ------threads------
time | ins upd del sel iud| lor hit| ins upd del read| run con cre cac|
11:50:48| 0 0 0 69802 0| 653556 100.00| 0 0 0 2092003|1983 2001 0 0|
11:50:50| 0 0 0 64959 0| 578098 100.00| 0 0 0 1162614|1993 2001 0 0|
11:50:52| 0 0 0 51835 0| 519396 100.00| 0 0 0 1686133|1995 2001 0 0|
11:50:54| 0 0 0 92420 0| 805322 100.00| 0 0 0 2166806|1997 2001 0 0|

Environment

None

Activity

lpjirasync January 23, 2018 at 3:44 PM

Comment from Launchpad by: Alexey Kopytov on: 16-06-2013 06:25:04

Does not apply to 5.6, as concurrency control is implemented in a different way there. If the target platform supports atomics, the wait slots array and srv_conc_mutex are not used.
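As a rough illustration of how admission control can work without a slot array or a mutex, here is a sketch based on a single atomic counter. It is not the 5.6 source: n_active, try_enter, and leave are hypothetical names, and the sketch uses C11 atomics as an assumption about the available primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter of threads currently admitted. */
static atomic_long n_active;

/* Try to enter: optimistically bump the counter, and back out if the
   limit is exceeded. Returns true if the caller was admitted. */
static bool try_enter(long limit)
{
    assert(limit > 0);

    long now = atomic_fetch_add(&n_active, 1) + 1;

    if (now <= limit) {
        return true;
    }

    atomic_fetch_sub(&n_active, 1);   /* over the limit: undo and retry later */
    return false;
}

/* Leave: give the admission back. */
static void leave(void)
{
    atomic_fetch_sub(&n_active, 1);
}
```

A caller that is refused would typically sleep briefly and call try_enter again, so no search for a free wait slot is needed at all.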

lpjirasync January 23, 2018 at 3:44 PM

Comment from Launchpad by: Hui Liu on: 09-06-2013 03:21:30

Results are slightly better when i is initialized as:
i = thd_get_thread_id(trx->mysql_thd) % OS_THREAD_MAX_N;

That is, the base QPS/TPS are (6133, 1974), the previous static-variable tuning gives (19923, 5694), and the new idea of combining it with thread_id gives (19937, 5704).
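A minimal sketch of starting the search at an index derived from the thread id is shown below. The names conc_slot and find_free_from_hint are hypothetical, the thread id is passed in as a plain integer instead of calling thd_get_thread_id, and the wrap-around to index 0 is an assumption about how the rest of the array would be covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a wait slot; only the flag matters here. */
typedef struct {
    bool reserved;
} conc_slot;

/* Start scanning at thread_id % n and wrap around, so that different
   threads begin at different positions. Returns the index of an
   unreserved slot, or n if all slots are reserved. */
static size_t find_free_from_hint(const conc_slot *slots, size_t n,
                                  unsigned long thread_id)
{
    assert(slots != NULL && n > 0);

    size_t start = (size_t)(thread_id % n);

    for (size_t k = 0; k < n; k++) {
        size_t i = (start + k) % n;

        if (!slots[i].reserved) {
            return i;
        }
    }

    return n;
}
```

Unlike a shared static index, the starting point here is computed per call, so the loop variable does not need to be shared between threads.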

lpjirasync January 23, 2018 at 3:44 PM

Comment from Launchpad by: Hui Liu on: 08-06-2013 09:24:50

I have not recorded the time consumed for comparison, but it can be tested simply with systemtap:

$ cat /tmp/f.stp
#! /usr/bin/env stap

global start, intervals

probe process("/tmp/mysqld_4056").function("srv_conc_enter_innodb") {
    start[tid()] = gettimeofday_us()
}

# Fires on return, so the recorded interval is the time spent inside the function.
probe process("/tmp/mysqld_4056").function("srv_conc_enter_innodb").return {
    t = gettimeofday_us()
    old_t = start[tid()]
    if (old_t) intervals <<< t - old_t
    delete start[tid()]
}

probe end {
    printf("intervals min:%dus avg:%dus max:%dus count:%d\n",
           @min(intervals), @avg(intervals), @max(intervals),
           @count(intervals))
    print(@hist_log(intervals));
}

$ sudo stap /tmp/f.stp
^Cintervals min:0us avg:1us max:305us count:2931699
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1158638
1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1665699
2 | 11081
4 | 4772
8 |@@ 70329
16 | 17337
32 | 3781
64 | 40
128 | 13
256 | 9
512 | 0
1024 | 0

I will try initializing i with thread_id % MAX; it is worth a try.

lpjirasync January 23, 2018 at 3:44 PM

Comment from Launchpad by: Laurynas Biveinis on: 07-06-2013 13:40:21

Looks reasonable, thank you.

Does srv_conc_enter_innodb CPU time or srv_conc_mutex wait time register on your benchmarks after this patch?

If yes, maybe optimistic concurrency control would work better for searching the srv_conc_slots array. But then i should become thread-local again. Maybe initializing it to thread id mod OS_THREAD_MAX_N would be good.

This is all speculation for now; I haven't tried out or measured anything yet.
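One way to read the optimistic suggestion is a lock-free claim on each slot's flag with compare-and-swap, using a per-thread starting index. The sketch below is only an illustration under that assumption: conc_slot_atomic, claim_slot, and release_slot are hypothetical names, and C11 atomics stand in for whatever primitives the server would actually use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wait slot whose flag can be claimed atomically. */
typedef struct {
    atomic_bool reserved;
} conc_slot_atomic;

/* Try to claim a slot without holding a mutex. The scan starts at a
   per-thread index (for example thread id mod n) and wraps around.
   Returns the claimed index, or n if every slot was taken. */
static size_t claim_slot(conc_slot_atomic *slots, size_t n, size_t start)
{
    assert(slots != NULL && n > 0);

    for (size_t k = 0; k < n; k++) {
        size_t i = (start + k) % n;
        bool expected = false;

        /* Succeeds only if the slot was free; otherwise another thread
           owns it and we move on to the next one. */
        if (atomic_compare_exchange_strong(&slots[i].reserved,
                                           &expected, true)) {
            return i;
        }
    }

    return n;
}

/* Give a previously claimed slot back. */
static void release_slot(conc_slot_atomic *slots, size_t i)
{
    atomic_store(&slots[i].reserved, false);
}
```

Here the loop index is local to each call, which matches the remark that i would have to become thread-local again once the mutex no longer serializes the search.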

lpjirasync January 23, 2018 at 3:44 PM

Comment from Launchpad by: Hui Liu on: 06-06-2013 15:16:23

Discuss group: https://groups.google.com/forum/?fromgroups#!topic/percona-discussion/odrEhfBgvfQ

Done

Created January 23, 2018 at 3:44 PM
Updated January 23, 2018 at 3:44 PM
Resolved January 23, 2018 at 3:44 PM