Done
Details
Details
Assignee
Satya Bodapati
Satya BodapatiReporter
lpjirasync
lpjirasync(Deactivated)Time tracking
1h 2m logged
Fix versions
Priority
Smart Checklist
Smart Checklist
Created January 24, 2018 at 11:22 AM
Updated March 30, 2020 at 5:23 AM
Resolved June 29, 2018 at 5:43 AM
**Reported in Launchpad by jocelyn fournier last update 25-05-2017 08:07:39
Hi,
5.7 is much slower than 5.6 to execute ALTER TABLE ... ADD INDEX, even on an empty table when the number of dirty pages in the buffer pool is high.
In 5.6, with 1.2m dirty pages (6.6% of the buffer pool):
CREATE TABLE `data_3468769` (
-> `id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
-> `extid` varchar(256) COLLATE latin1_general_ci NOT NULL DEFAULT '',
-> `pra` varchar(60) CHARACTER SET latin1 NOT NULL,
-> `from` varchar(60) CHARACTER SET latin1 NOT NULL,
-> `rcpt` varchar(60) CHARACTER SET latin1 NOT NULL,
-> `domaine` varchar(50) CHARACTER SET latin1 NOT NULL,
-> `etat` enum('','done','filtre','soft','hard') CHARACTER SET latin1 NOT NULL DEFAULT '',
-> `smtpmsg` tinytext CHARACTER SET latin1 NOT NULL,
-> `ouvert` tinyint(3) unsigned NOT NULL,
-> `clicks` tinyint(3) unsigned NOT NULL,
-> `sales_count` tinyint(3) unsigned NOT NULL DEFAULT '0',
-> `sales_amount` decimal(7,2) NOT NULL DEFAULT '0.00',
-> `desabo` tinyint(3) unsigned NOT NULL,
-> `spam` tinyint(3) unsigned NOT NULL,
-> PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;
Query OK, 0 rows affected (0.15 sec)
mysql> ALTER TABLE `data_3468769` ADD INDEX (etat), ADD INDEX (domaine);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
In 5.7, with 1.4m dirty pages (8% of the buffer pool):
ALTER TABLE `data_3468769` ADD INDEX (etat), ADD INDEX (domaine);
Query OK, 0 rows affected (1.38 sec)
With about 4.2m dirty pages:
ALTER TABLE `data_3468769` ADD INDEX (etat), ADD INDEX (domaine);
Query OK, 0 rows affected (5.45 sec)
It seems the slowdown is cause by the call to FlushObserver::flush in row_merge_build_indexes, which count the number of dirty pages (buf_pool_get_dirty_pages_count). It scans the whole flush_list for each buffer pool searching the right space id, which is quite inefficient here!
The number of dirty pages seems to be only used by begin_phase_flush to estimate the amount of work, perhaps it would be faster to use the number of records of the table, instead of trying to compute the real number of pending dirty pages?
Thanks,
Jocelyn