LP #1656022: Partition pruning doesn't seem to work with TokuDB engine

Description

**Reported in Launchpad by Rick Pizzi last update 20-01-2017 23:05:01

We noticed that partition pruning, although correctly shown by the EXPLAIN PARTITIONS command, is not honoured for the TokuDB engine.

Please consider the following table:

show create table mytable\G
*************************** 1. row ***************************
Table: mytable
Create Table: CREATE TABLE `mytable` (
`ID` char(64) NOT NULL,
`DATA_BLOB` mediumblob NOT NULL,
`EXPIRE_DATE` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`ID`,`EXPIRE_DATE`),
KEY `EXPIRE_DATE_IX` (`EXPIRE_DATE`)
) ENGINE=TokuDB DEFAULT CHARSET=latin1 ROW_FORMAT=TOKUDB_UNCOMPRESSED
/*!50100 PARTITION BY RANGE (UNIX_TIMESTAMP(EXPIRE_DATE))
(PARTITION p1003 VALUES LESS THAN (1484240400) ENGINE = TokuDB,
PARTITION p1004 VALUES LESS THAN (1484244000) ENGINE = TokuDB,
PARTITION p1005 VALUES LESS THAN (1484247600) ENGINE = TokuDB,
PARTITION p1006 VALUES LESS THAN (1484251200) ENGINE = TokuDB,
PARTITION p1007 VALUES LESS THAN (1484254800) ENGINE = TokuDB,
PARTITION p1008 VALUES LESS THAN (1484258400) ENGINE = TokuDB,
PARTITION p1009 VALUES LESS THAN (1484262000) ENGINE = TokuDB,
PARTITION p1010 VALUES LESS THAN (1484265600) ENGINE = TokuDB,
PARTITION p1011 VALUES LESS THAN (1484269200) ENGINE = TokuDB,
PARTITION p1012 VALUES LESS THAN (1484272800) ENGINE = TokuDB,
PARTITION p1013 VALUES LESS THAN (1484276400) ENGINE = TokuDB,
PARTITION p1014 VALUES LESS THAN (1484280000) ENGINE = TokuDB,
PARTITION p1015 VALUES LESS THAN (1484283600) ENGINE = TokuDB,
PARTITION p1016 VALUES LESS THAN (1484287200) ENGINE = TokuDB,
PARTITION p1017 VALUES LESS THAN (1484290800) ENGINE = TokuDB,
PARTITION p1018 VALUES LESS THAN (1484294400) ENGINE = TokuDB,
PARTITION p1019 VALUES LESS THAN (1484298000) ENGINE = TokuDB,
PARTITION p1020 VALUES LESS THAN (1484301600) ENGINE = TokuDB,
PARTITION p1021 VALUES LESS THAN (1484305200) ENGINE = TokuDB,
PARTITION p1022 VALUES LESS THAN (1484308800) ENGINE = TokuDB,
PARTITION p1023 VALUES LESS THAN (1484312400) ENGINE = TokuDB,
PARTITION p1024 VALUES LESS THAN (1484316000) ENGINE = TokuDB,
PARTITION p1025 VALUES LESS THAN (1484319600) ENGINE = TokuDB,
PARTITION p1026 VALUES LESS THAN (1484323200) ENGINE = TokuDB,
PARTITION p1027 VALUES LESS THAN (1484326800) ENGINE = TokuDB,
PARTITION p1028 VALUES LESS THAN (1484330400) ENGINE = TokuDB,
PARTITION p1029 VALUES LESS THAN (1484334000) ENGINE = TokuDB,
PARTITION p1030 VALUES LESS THAN (1484337600) ENGINE = TokuDB,
PARTITION p1031 VALUES LESS THAN (1484341200) ENGINE = TokuDB,
PARTITION p1032 VALUES LESS THAN (1484344800) ENGINE = TokuDB,
PARTITION p1033 VALUES LESS THAN (1484348400) ENGINE = TokuDB,
PARTITION p1034 VALUES LESS THAN (1484352000) ENGINE = TokuDB,
PARTITION p1035 VALUES LESS THAN (1484355600) ENGINE = TokuDB,
PARTITION p1036 VALUES LESS THAN (1484359200) ENGINE = TokuDB,
PARTITION p1037 VALUES LESS THAN (1484362800) ENGINE = TokuDB,
PARTITION p1038 VALUES LESS THAN (1484366400) ENGINE = TokuDB,
PARTITION p1039 VALUES LESS THAN (1484370000) ENGINE = TokuDB,
PARTITION p1040 VALUES LESS THAN (1484373600) ENGINE = TokuDB,
PARTITION p1041 VALUES LESS THAN (1484377200) ENGINE = TokuDB,
PARTITION p1042 VALUES LESS THAN (1484380800) ENGINE = TokuDB,
PARTITION p1043 VALUES LESS THAN (1484384400) ENGINE = TokuDB,
PARTITION p1044 VALUES LESS THAN (1484388000) ENGINE = TokuDB,
PARTITION p1045 VALUES LESS THAN (1484391600) ENGINE = TokuDB,
PARTITION p1046 VALUES LESS THAN (1484395200) ENGINE = TokuDB,
PARTITION p1047 VALUES LESS THAN (1484398800) ENGINE = TokuDB,
PARTITION p1048 VALUES LESS THAN (1484402400) ENGINE = TokuDB,
PARTITION p1049 VALUES LESS THAN (1484406000) ENGINE = TokuDB,
PARTITION p1050 VALUES LESS THAN (1484409600) ENGINE = TokuDB,
PARTITION p1051 VALUES LESS THAN (1484413200) ENGINE = TokuDB,
PARTITION p1052 VALUES LESS THAN (1484416800) ENGINE = TokuDB,
PARTITION p1053 VALUES LESS THAN (1484420400) ENGINE = TokuDB,
PARTITION p1054 VALUES LESS THAN (1484424000) ENGINE = TokuDB,
PARTITION p1055 VALUES LESS THAN (1484427600) ENGINE = TokuDB,
PARTITION p1056 VALUES LESS THAN (1484431200) ENGINE = TokuDB,
PARTITION p1057 VALUES LESS THAN (1484434800) ENGINE = TokuDB,
PARTITION p1058 VALUES LESS THAN (1484438400) ENGINE = TokuDB,
PARTITION p1059 VALUES LESS THAN (1484442000) ENGINE = TokuDB,
PARTITION p1060 VALUES LESS THAN (1484445600) ENGINE = TokuDB,
PARTITION p1061 VALUES LESS THAN (1484449200) ENGINE = TokuDB,
PARTITION p1062 VALUES LESS THAN (1484452800) ENGINE = TokuDB,
PARTITION p1063 VALUES LESS THAN (1484456400) ENGINE = TokuDB,
PARTITION p1064 VALUES LESS THAN (1484460000) ENGINE = TokuDB) */

If we run the following explain, all seems okay:

mysql>explain partitions DELETE FROM mytable WHERE ID = 'aa60185e07d29c8866f00a5fa9e951aca07f3e3ae42b7ebb57474bf1a26721b9' AND EXPIRE_DATE > date_add(NOW(), interval 1 hour)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: mytable
partitions: p1004,p1005,p1006,p1007,p1008,p1009,p1010,p1011,p1012,p1013,p1014,p1015,p1016,p1017,p1018,p1019,p1020,p1021,p1022,p1023,p1024,p1025,p1026,p1027,p1028,p1029,p1030,p1031,p1032,p1033,p1034,p1035,p1036,p1037,p1038,p1039,p1040,p1041,p1042,p1043,p1044,p1045,p1046,p1047,p1048,p1049,p1050,p1051,p1052,p1053,p1054,p1055,p1056,p1057,p1058,p1059,p1060,p1061,p1062,p1063,p1064
type: range
possible_keys: PRIMARY,EXPIRE_DATE_IX
key: PRIMARY
key_len: 68
ref: const,const
rows: 1
Extra: Using where
1 row in set (0.00 sec)

We can see that partition p1003 is pruned because contains stale data (rows with EXPIRE_DATE less than one hour old).

However, when running DROP PARTITION p1003, we see stalls in SELECT and DELETE statements that should NOT hit that partition because of pruning:

  1. Time: 170112 16:02:15

  2. Schema: myschema Last_errno: 0 Killed: 0

  3. Query_time: 12.219223 Lock_time: 9.126238 Rows_sent: 0 Rows_examined: 0 Rows_affected: 0

  4. Bytes_sent: 11 Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0

  5. QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No

  6. Filesort: No Filesort_on_disk: No Merge_passes: 0
    SET timestamp=1484233335;
    DELETE FROM mytable WHERE ID = '70f8536e1c07367e591be7d05092d2bf10fd3b2df463a7bdad412e6d705477f1' AND EXPIRE_DATE > DATE_ADD(NOW(), INTERVAL 1 HOUR);

You can see that this query took 12 seconds and was locked for 9 seconds waiting for what seems a lock on the partition which we expected to be pruned.

We wouldn't expect the above DELETE to lock if the partition we are dropping is pruned and therefore out of scope for the DML.

Server version: 5.6.34-79.1-log Percona Server (GPL), Release 79.1, Revision 1c589f9

thanks
Rick

Environment

None

Smart Checklist

Activity

lpjirasync January 21, 2018 at 4:51 AM

**Comment from Launchpad by: Rick Pizzi on: 16-01-2017 20:18:39

Saw the issue, thanks!
Looking forward to try the fix,

I have noticed that once the dataset grows more than what the OS cache can handle, we are back to square one as far as stalls go.
Tomorrow I will go back to 16k (sorry!) and to 50% tokudb_cache_mem which offered the best results in our situation so far.

Mind you, with 8 GB toku cache and 64k once the dataset hit the disk, I have been able to saturate the tokudb_cachetable_pool_threads default value of 80 (the only one of out the 3 that I left to default).
Not sure why this happened or what that means actually, but having the pool saturated for one minute when I had the stall just seemed related.

Thanks!

Rick

Riccardo Pizzi
pizzi@leopardus.com
http://www.vecchiflipper.it
http://www.pinballowners.com/webmaster

Dogs have masters. Cats have staff.

On 16/gen/2017, at 20:31, George Ormond Lorch III <george.lorch@percona.com> wrote:

> Hi Rick,
> I've been looking around this one call and decided that a slightly different approach should be taken to correct this, assuming that it is what is I think it is as evidenced by the pmp output.
>
> Here is the issue tracker for TokuDB
> https://tokutek.atlassian.net/browse/DB-1033
>
> Once I get this fixed and tested I will try to get you an experimental
> binary to test out.
>
> –
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1656022
>
> Title:
> Partition pruning doesn't seem to work with TokuDB engine
>
> Status in Percona Server:
> New
>
> Bug description:
> We noticed that partition pruning, although correctly shown by the
> EXPLAIN PARTITIONS command, is not honoured for the TokuDB engine.
>
> Please consider the following table:
>
> show create table mytable\G
> *************************** 1. row ***************************
> Table: mytable
> Create Table: CREATE TABLE `mytable` (
> `ID` char(64) NOT NULL,
> `DATA_BLOB` mediumblob NOT NULL,
> `EXPIRE_DATE` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
> PRIMARY KEY (`ID`,`EXPIRE_DATE`),
> KEY `EXPIRE_DATE_IX` (`EXPIRE_DATE`)
> ) ENGINE=TokuDB DEFAULT CHARSET=latin1 ROW_FORMAT=TOKUDB_UNCOMPRESSED
> /*!50100 PARTITION BY RANGE (UNIX_TIMESTAMP(EXPIRE_DATE))
> (PARTITION p1003 VALUES LESS THAN (1484240400) ENGINE = TokuDB,
> PARTITION p1004 VALUES LESS THAN (1484244000) ENGINE = TokuDB,
> PARTITION p1005 VALUES LESS THAN (1484247600) ENGINE = TokuDB,
> PARTITION p1006 VALUES LESS THAN (1484251200) ENGINE = TokuDB,
> PARTITION p1007 VALUES LESS THAN (1484254800) ENGINE = TokuDB,
> PARTITION p1008 VALUES LESS THAN (1484258400) ENGINE = TokuDB,
> PARTITION p1009 VALUES LESS THAN (1484262000) ENGINE = TokuDB,
> PARTITION p1010 VALUES LESS THAN (1484265600) ENGINE = TokuDB,
> PARTITION p1011 VALUES LESS THAN (1484269200) ENGINE = TokuDB,
> PARTITION p1012 VALUES LESS THAN (1484272800) ENGINE = TokuDB,
> PARTITION p1013 VALUES LESS THAN (1484276400) ENGINE = TokuDB,
> PARTITION p1014 VALUES LESS THAN (1484280000) ENGINE = TokuDB,
> PARTITION p1015 VALUES LESS THAN (1484283600) ENGINE = TokuDB,
> PARTITION p1016 VALUES LESS THAN (1484287200) ENGINE = TokuDB,
> PARTITION p1017 VALUES LESS THAN (1484290800) ENGINE = TokuDB,
> PARTITION p1018 VALUES LESS THAN (1484294400) ENGINE = TokuDB,
> PARTITION p1019 VALUES LESS THAN (1484298000) ENGINE = TokuDB,
> PARTITION p1020 VALUES LESS THAN (1484301600) ENGINE = TokuDB,
> PARTITION p1021 VALUES LESS THAN (1484305200) ENGINE = TokuDB,
> PARTITION p1022 VALUES LESS THAN (1484308800) ENGINE = TokuDB,
> PARTITION p1023 VALUES LESS THAN (1484312400) ENGINE = TokuDB,
> PARTITION p1024 VALUES LESS THAN (1484316000) ENGINE = TokuDB,
> PARTITION p1025 VALUES LESS THAN (1484319600) ENGINE = TokuDB,
> PARTITION p1026 VALUES LESS THAN (1484323200) ENGINE = TokuDB,
> PARTITION p1027 VALUES LESS THAN (1484326800) ENGINE = TokuDB,
> PARTITION p1028 VALUES LESS THAN (1484330400) ENGINE = TokuDB,
> PARTITION p1029 VALUES LESS THAN (1484334000) ENGINE = TokuDB,
> PARTITION p1030 VALUES LESS THAN (1484337600) ENGINE = TokuDB,
> PARTITION p1031 VALUES LESS THAN (1484341200) ENGINE = TokuDB,
> PARTITION p1032 VALUES LESS THAN (1484344800) ENGINE = TokuDB,
> PARTITION p1033 VALUES LESS THAN (1484348400) ENGINE = TokuDB,
> PARTITION p1034 VALUES LESS THAN (1484352000) ENGINE = TokuDB,
> PARTITION p1035 VALUES LESS THAN (1484355600) ENGINE = TokuDB,
> PARTITION p1036 VALUES LESS THAN (1484359200) ENGINE = TokuDB,
> PARTITION p1037 VALUES LESS THAN (1484362800) ENGINE = TokuDB,
> PARTITION p1038 VALUES LESS THAN (1484366400) ENGINE = TokuDB,
> PARTITION p1039 VALUES LESS THAN (1484370000) ENGINE = TokuDB,
> PARTITION p1040 VALUES LESS THAN (1484373600) ENGINE = TokuDB,
> PARTITION p1041 VALUES LESS THAN (1484377200) ENGINE = TokuDB,
> PARTITION p1042 VALUES LESS THAN (1484380800) ENGINE = TokuDB,
> PARTITION p1043 VALUES LESS THAN (1484384400) ENGINE = TokuDB,
> PARTITION p1044 VALUES LESS THAN (1484388000) ENGINE = TokuDB,
> PARTITION p1045 VALUES LESS THAN (1484391600) ENGINE = TokuDB,
> PARTITION p1046 VALUES LESS THAN (1484395200) ENGINE = TokuDB,
> PARTITION p1047 VALUES LESS THAN (1484398800) ENGINE = TokuDB,
> PARTITION p1048 VALUES LESS THAN (1484402400) ENGINE = TokuDB,
> PARTITION p1049 VALUES LESS THAN (1484406000) ENGINE = TokuDB,
> PARTITION p1050 VALUES LESS THAN (1484409600) ENGINE = TokuDB,
> PARTITION p1051 VALUES LESS THAN (1484413200) ENGINE = TokuDB,
> PARTITION p1052 VALUES LESS THAN (1484416800) ENGINE = TokuDB,
> PARTITION p1053 VALUES LESS THAN (1484420400) ENGINE = TokuDB,
> PARTITION p1054 VALUES LESS THAN (1484424000) ENGINE = TokuDB,
> PARTITION p1055 VALUES LESS THAN (1484427600) ENGINE = TokuDB,
> PARTITION p1056 VALUES LESS THAN (1484431200) ENGINE = TokuDB,
> PARTITION p1057 VALUES LESS THAN (1484434800) ENGINE = TokuDB,
> PARTITION p1058 VALUES LESS THAN (1484438400) ENGINE = TokuDB,
> PARTITION p1059 VALUES LESS THAN (1484442000) ENGINE = TokuDB,
> PARTITION p1060 VALUES LESS THAN (1484445600) ENGINE = TokuDB,
> PARTITION p1061 VALUES LESS THAN (1484449200) ENGINE = TokuDB,
> PARTITION p1062 VALUES LESS THAN (1484452800) ENGINE = TokuDB,
> PARTITION p1063 VALUES LESS THAN (1484456400) ENGINE = TokuDB,
> PARTITION p1064 VALUES LESS THAN (1484460000) ENGINE = TokuDB) */
>
> If we run the following explain, all seems okay:
>
> mysql>explain partitions DELETE FROM mytable WHERE ID = 'aa60185e07d29c8866f00a5fa9e951aca07f3e3ae42b7ebb57474bf1a26721b9' AND EXPIRE_DATE > date_add(NOW(), interval 1 hour)\G
> *************************** 1. row ***************************
> id: 1
> select_type: SIMPLE
> table: mytable
> partitions: p1004,p1005,p1006,p1007,p1008,p1009,p1010,p1011,p1012,p1013,p1014,p1015,p1016,p1017,p1018,p1019,p1020,p1021,p1022,p1023,p1024,p1025,p1026,p1027,p1028,p1029,p1030,p1031,p1032,p1033,p1034,p1035,p1036,p1037,p1038,p1039,p1040,p1041,p1042,p1043,p1044,p1045,p1046,p1047,p1048,p1049,p1050,p1051,p1052,p1053,p1054,p1055,p1056,p1057,p1058,p1059,p1060,p1061,p1062,p1063,p1064
> type: range
> possible_keys: PRIMARY,EXPIRE_DATE_IX
> key: PRIMARY
> key_len: 68
> ref: const,const
> rows: 1
> Extra: Using where
> 1 row in set (0.00 sec)
>
> We can see that partition p1003 is pruned because contains stale data
> (rows with EXPIRE_DATE less than one hour old).
>
> However, when running DROP PARTITION p1003, we see stalls in SELECT
> and DELETE statements that should NOT hit that partition because of
> pruning:
>
> # Time: 170112 16:02:15
> # Schema: myschema Last_errno: 0 Killed: 0
> # Query_time: 12.219223 Lock_time: 9.126238 Rows_sent: 0 Rows_examined: 0 Rows_affected: 0
> # Bytes_sent: 11 Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0
> # QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No
> # Filesort: No Filesort_on_disk: No Merge_passes: 0
> SET timestamp=1484233335;
> DELETE FROM mytable WHERE ID = '70f8536e1c07367e591be7d05092d2bf10fd3b2df463a7bdad412e6d705477f1' AND EXPIRE_DATE > DATE_ADD(NOW(), INTERVAL 1 HOUR);
>
> You can see that this query took 12 seconds and was locked for 9
> seconds waiting for what seems a lock on the partition which we
> expected to be pruned.
>
> We wouldn't expect the above DELETE to lock if the partition we are
> dropping is pruned and therefore out of scope for the DML.
>
> Server version: 5.6.34-79.1-log Percona Server (GPL), Release
> 79.1, Revision 1c589f9
>
> thanks
> Rick
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/percona-server/+bug/1656022/+subscriptions
>

lpjirasync January 21, 2018 at 4:51 AM

**Comment from Launchpad by: George Ormond Lorch III on: 16-01-2017 19:31:42

Hi Rick,
I've been looking around this one call and decided that a slightly different approach should be taken to correct this, assuming that it is what is I think it is as evidenced by the pmp output.

Here is the issue tracker for TokuDB https://tokutek.atlassian.net/browse/DB-1033

Once I get this fixed and tested I will try to get you an experimental binary to test out.

lpjirasync January 21, 2018 at 4:51 AM

**Comment from Launchpad by: Rick Pizzi on: 15-01-2017 21:15:04

Addendum - we had a new peak of traffic and the situation with 8 GB cache and 64k blocks got MUCH worse, with stalls of 40 seconds and more....

By looking at PMM graph I was being able to spot a bottleneck in tokudb_cachetable_pool_threads - the graph shows that pool saturated for several second (at 2 x NCPU, since I did not change that one). For sure there is a relationship. I am going to increase that pool from default to something higher to see if that solves this problem.

lpjirasync January 21, 2018 at 4:51 AM

**Comment from Launchpad by: Rick Pizzi on: 14-01-2017 09:31:24

Situation with 64k block size is a little better but stalls still there.
I will contact you privately for that patched binary.

lpjirasync January 21, 2018 at 4:51 AM

**Comment from Launchpad by: Rick Pizzi on: 13-01-2017 18:46:46

I am now running with 64k block size; so far (about 2 hours) results are encouraging, but let it pass peak hours to see how it goes. If stalls still around I will be glad to test your binary (this is a prod env, but the application is resilient to cache failures so we can just test it there).
I'll be back tomorrow with some results.

Thanks for continued support!

Rick

Done

Details

Assignee

Reporter

Priority

Smart Checklist

Created January 21, 2018 at 4:46 AM
Updated January 21, 2018 at 4:51 AM
Resolved January 21, 2018 at 4:46 AM