Unnecessary flush list iteration during ALTER TABLE ADD INDEX

Description

**Reported in Launchpad by jocelyn fournier last update 25-05-2017 08:07:39

Hi,

5.7 is much slower than 5.6 to execute ALTER TABLE ... ADD INDEX, even on an empty table when the number of dirty pages in the buffer pool is high.

In 5.6, with 1.2m dirty pages (6.6% of the buffer pool):

CREATE TABLE `data_3468769` (
-> `id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
-> `extid` varchar(256) COLLATE latin1_general_ci NOT NULL DEFAULT '',
-> `pra` varchar(60) CHARACTER SET latin1 NOT NULL,
-> `from` varchar(60) CHARACTER SET latin1 NOT NULL,
-> `rcpt` varchar(60) CHARACTER SET latin1 NOT NULL,
-> `domaine` varchar(50) CHARACTER SET latin1 NOT NULL,
-> `etat` enum('','done','filtre','soft','hard') CHARACTER SET latin1 NOT NULL DEFAULT '',
-> `smtpmsg` tinytext CHARACTER SET latin1 NOT NULL,
-> `ouvert` tinyint(3) unsigned NOT NULL,
-> `clicks` tinyint(3) unsigned NOT NULL,
-> `sales_count` tinyint(3) unsigned NOT NULL DEFAULT '0',
-> `sales_amount` decimal(7,2) NOT NULL DEFAULT '0.00',
-> `desabo` tinyint(3) unsigned NOT NULL,
-> `spam` tinyint(3) unsigned NOT NULL,
-> PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;
Query OK, 0 rows affected (0.15 sec)

mysql> ALTER TABLE `data_3468769` ADD INDEX (etat), ADD INDEX (domaine);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0

In 5.7, with 1.4m dirty pages (8% of the buffer pool):
ALTER TABLE `data_3468769` ADD INDEX (etat), ADD INDEX (domaine);
Query OK, 0 rows affected (1.38 sec)

With about 4.2m dirty pages:
ALTER TABLE `data_3468769` ADD INDEX (etat), ADD INDEX (domaine);
Query OK, 0 rows affected (5.45 sec)

It seems the slowdown is cause by the call to FlushObserver::flush in row_merge_build_indexes, which count the number of dirty pages (buf_pool_get_dirty_pages_count). It scans the whole flush_list for each buffer pool searching the right space id, which is quite inefficient here!

The number of dirty pages seems to be only used by begin_phase_flush to estimate the amount of work, perhaps it would be faster to use the number of records of the table, instead of trying to compute the real number of pending dirty pages?

Thanks,
Jocelyn

Environment

None

Smart Checklist

Activity

Show:

jocelyn fournier June 15, 2018 at 7:57 AM

ah yes  a big server here, with 5TB dataset and 300000 tables (this is bad ! )

Satya Bodapati June 15, 2018 at 7:55 AM

oh thats pretty big I need to find instance of similar size

Thanks!

jocelyn fournier June 15, 2018 at 7:48 AM

Hi Satya!

 

FYI, I think the buffer_pool_size was 270G on this case.

 

  Jocelyn 

Satya Bodapati June 15, 2018 at 6:29 AM
Edited

I did more study and found that buf_flush_get_dirty_pages_count() is only used for estimate for PFS.

This estimate can be calculated without iterating flush_lists in all buffer pool instances.

The idea for fix is that when mtr adds dirty pages to flush list with an observer, we can calculate the dirty pages being added to flush list and store it as part of observer.

Fix is being tested.

Satya Bodapati May 31, 2018 at 5:05 PM

I think we should look at larger picture. In 5.7, add index is significantly faster in many cases. This is because it skips redo logging, undo logging and also does bottom up build (aka bulk index build). Since it skips redo logging, before the ALTER finishes, all the dirty pages has to be flushed. Otherwise recovery cannot work properly. We cannot use the estimate of table records etc.

Every page that was created part of 'ADD INDEX' process has to be flushed. So flushing cannot be avoided.

So I agree that adding empty index is slower in 5.7(unfortunate but most users wouldn't care), in majority cases, it is faster.

Done

Details

Assignee

Reporter

Time tracking

1h 2m logged

Fix versions

Priority

Smart Checklist

Created January 24, 2018 at 11:22 AM
Updated March 30, 2020 at 5:23 AM
Resolved June 29, 2018 at 5:43 AM