MySQL Table Fragmentation: Beware of Bulk INSERT with FAILURE or ROLLBACK



Usually, database people are familiar with table fragmentation caused by DELETE statements. After a huge delete, in most cases they rebuild the table to reclaim the disk space. But do you think only DELETEs can cause table fragmentation? (Answer: NO.)

In this blog post, I am going to explain how table fragmentation can also happen with INSERT statements.

Before going into the topic, we need to know that in MySQL there are two kinds of fragmentation:

  • Fragmentation where some of the InnoDB pages are completely free inside the table.
  • Fragmentation where some of the InnoDB pages are not completely filled (the page has some free space).

There are three major cases of table fragmentation with INSERTs:

  • INSERT with ROLLBACK
  • Failed INSERT statement
  • Fragmentation with page-splits

Test Environment

I have created my own test environment to experiment with those cases.

  • DB: percona
  • Tables: frag, ins_frag, frag_page_spl
  • Table Size: 2G

Case 1: INSERT with ROLLBACK

First, I created a new table, “ins_frag”. Then I started a transaction (with BEGIN) and began copying the data from table “frag” into table “ins_frag”, as shown below.

mysql> create table ins_frag like frag;
Query OK, 0 rows affected (0.01 sec)
 
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
 
mysql> insert into ins_frag select * from frag;
Query OK, 47521280 rows affected (3 min 7.45 sec)
Records: 47521280  Duplicates: 0  Warnings: 0
 
#Linux shell
sakthi-3.2# ls -lrth
total 8261632
-rw-r-----  1 _mysql  _mysql   2.0G Jun 17 02:43 frag.ibd
-rw-r-----  1 _mysql  _mysql   2.0G Jun 17 03:00 ins_frag.ibd

From the above, you can see the INSERT has been executed, but I have not yet committed or rolled it back. Note that both tables now occupy 2 GB of disk space.

Now, I am going to ROLLBACK the INSERT.

mysql> select count(*) from ins_frag;
+----------+
| count(*) |
+----------+
| 47521280 |
+----------+
1 row in set (1.87 sec)
 
mysql> rollback;
Query OK, 0 rows affected (5 min 45.21 sec)
mysql> select count(*) from ins_frag;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)
 
 
#Linux shell
sakthi-3.2# ls -lrth
total 8261632
-rw-r-----  1 _mysql  _mysql   2.0G Jun 17 02:43 frag.ibd
-rw-r-----  1 _mysql  _mysql   2.0G Jun 17 03:09 ins_frag.ibd

Yes, after rolling back the INSERT, the table “ins_frag” still occupies the same 2 GB of disk space. Let’s find the fragmented space through the MySQL client.

mysql> SELECT
-> table_schema as 'DATABASE',
-> table_name as 'TABLE',
-> CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 * 1024 ), 2), 'G') 'TOTAL',
-> CONCAT(ROUND(data_free / ( 1024 * 1024 * 1024 ), 2), 'G') 'DATAFREE'
-> FROM information_schema.TABLES
-> where table_schema='percona' and table_name='ins_frag';
+----------+----------+-------+----------+
| DATABASE | TABLE    | TOTAL | DATAFREE |
+----------+----------+-------+----------+
| percona  | ins_frag | 0.00G | 1.96G    |
+----------+----------+-------+----------+
1 row in set (0.01 sec)

This makes it clear that rolling back an INSERT creates fragmentation. We need to rebuild the table to reclaim the disk space.

mysql> alter table ins_frag engine=innodb;
Query OK, 0 rows affected (2.63 sec)
Records: 0  Duplicates: 0  Warnings: 0
 
#Linux shell
 
sakthi-3.2# ls -lrth
total 4131040
-rw-r-----  1 _mysql  _mysql   2.0G Jun 17 02:43 frag.ibd
-rw-r-----  1 _mysql  _mysql   112K Jun 17 03:11 ins_frag.ibd

Case 2: Failed INSERT Statement

To test this case, I opened two MySQL client sessions (session 1 and session 2).

In session 1, I ran the same INSERT statement within a transaction. This time, I interrupted and killed the INSERT query from session 2.

Session 1

#Linux shell
 
sakthi-3.2# ls -lrth
total 4131040
-rw-r-----  1 _mysql  _mysql   2.0G Jun 17 02:43 frag.ibd
-rw-r-----  1 _mysql  _mysql   112K Jun 17 04:02 ins_frag.ibd
 
#MySQL shell
 
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
 
mysql> insert into ins_frag select * from frag;   #is running

Session 2

mysql> pager grep -i insert ; show processlist;
PAGER set to 'grep -i insert'
| 33 | root            | localhost | percona | Query   |    14 | executing              | insert into ins_frag select * from frag |
4 rows in set (0.00 sec)
 
mysql> kill 33;
Query OK, 0 rows affected (0.00 sec)

The INSERT was interrupted and failed.

Back in session 1:

mysql> insert into ins_frag select * from frag;
ERROR 2013 (HY000): Lost connection to MySQL server during query
 
#Linux shell
 
sakthi-3.2# ls -lrth
total 4591616
-rw-r-----  1 _mysql  _mysql   2.0G Jun 17 02:43 frag.ibd
-rw-r-----  1 _mysql  _mysql   212M Jun 17 04:21 ins_frag.ibd
 
#MySQL shell
 
mysql> select count(*) from ins_frag;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.10 sec)

The INSERT did not complete and there is no data in the table. But the table’s .ibd file has still grown to 212 MB. Let’s see the fragmented space through the MySQL client.

mysql> SELECT
-> table_schema as 'DATABASE',
-> table_name as 'TABLE',
-> CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 ), 2), 'M') 'TOTAL',
-> CONCAT(ROUND(data_free / ( 1024 * 1024 ), 2), 'M') 'DATAFREE'
-> FROM information_schema.TABLES
-> where table_schema='percona' and table_name='ins_frag';
+----------+----------+---------+----------+
| DATABASE | TABLE    | TOTAL   | DATAFREE |
+----------+----------+---------+----------+
| percona  | ins_frag | 0.03M   | 210.56M  |
+----------+----------+---------+----------+
1 row in set (0.01 sec)

It shows the table has fragmented space, and the table has to be rebuilt to reclaim that space.

mysql> alter table ins_frag engine='innodb';
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0
 
#Linux shell
 
sakthi-3.2# ls -lrth
total 4131040
-rw-r-----  1 _mysql  _mysql   2.0G Jun 17 02:43 frag.ibd
-rw-r-----  1 _mysql  _mysql   112K Jun 17 04:32 ins_frag.ibd

Case 3: Fragmentation with Page-Splits

We know that internally, InnoDB records are stored in InnoDB pages. The page size is 16KB by default, but you have the option to change it (via innodb_page_size, when initializing the instance).
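As a quick check, you can confirm the page size of a running instance; note that the value is fixed when the data directory is initialized and cannot be changed afterwards:

```sql
-- innodb_page_size is set at instance initialization; for an existing
-- instance this is read-only.
SELECT @@innodb_page_size;  -- 16384 (16KB) by default
```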

If an InnoDB page doesn’t have enough space to accommodate the new record or index entry, it is split into two pages, each about 50% full. This means that even for an insert-only workload, with no rollbacks or deletes, you may end up with only 75% average page utilization on average, a 25% loss to this kind of internal page fragmentation.

Indexes built by sorting (for example, during a table rebuild) are packed tightly, so if the table then receives a lot of inserts that land at random locations in the index, page splits will occur.

Check out this excellent blog written by Marco Tusa, InnoDB Page Merging and Page Splitting, that has the complete internals about the page-split and InnoDB page structure/operations.

For the experiment, I created a table with a descending index:

mysql> show create table frag_page_spl\G
*************************** 1. row ***************************
Table: frag_page_spl
Create Table: CREATE TABLE `frag_page_spl` (
`id` int NOT NULL AUTO_INCREMENT,
`name` varchar(16) DEFAULT NULL,
`messages` varchar(600) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_spl` (`messages` DESC)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.07 sec)

We can monitor page-split activity through the INFORMATION_SCHEMA.INNODB_METRICS table. For this, you need to enable the InnoDB monitor counters.

mysql> SET GLOBAL innodb_monitor_enable=all;
Query OK, 0 rows affected (0.09 sec)
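The load script itself is not included here; as a rough, hypothetical sketch (the procedure name and value generation are my own), random inserts like the ones used in this test could be produced by a stored procedure invoked from six parallel client sessions:

```sql
-- Hypothetical reconstruction of the load generator. Each row gets a
-- random `messages` value, so the idx_spl entries land on random pages
-- of the secondary index and force page splits.
DELIMITER //
CREATE PROCEDURE random_inserts(IN n INT)
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < n DO
    INSERT INTO frag_page_spl (name, messages)
    VALUES (LEFT(MD5(RAND()), 16),
            REPEAT(LEFT(MD5(RAND()), 20), 25));  -- 500-char random string
    SET i = i + 1;
  END WHILE;
END //
DELIMITER ;

-- Run from six parallel client sessions:
-- CALL random_inserts(1000000);
```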

Then I ran a script that triggered INSERTs at random index locations using six parallel threads. After the script finished:

mysql> select name,count,type,status,comment from information_schema.innodb_metrics where name like '%index_page_spl%'\G
*************************** 1. row ***************************
name: index_page_splits
count: 52186
type: counter
status: enabled
comment: Number of index page splits
1 row in set (0.05 sec)
 
mysql> SELECT
-> table_schema as 'DATABASE',
-> table_name as 'TABLE',
-> CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 ), 2), 'M') 'TOTAL',
-> CONCAT(ROUND(data_free / ( 1024 * 1024 ), 2), 'M') 'DATAFREE'
-> FROM information_schema.TABLES
-> where table_schema='percona' and table_name='frag_page_spl';
+----------+---------------+----------+----------+
| DATABASE | TABLE         | TOTAL    | DATAFREE |
+----------+---------------+----------+----------+
| percona  | frag_page_spl | 2667.55M | 127.92M  |
+----------+---------------+----------+----------+
1 row in set (0.00 sec)

Yes, from the metrics we can see the page-split counter has increased. The output shows that 52,186 page-split operations occurred, which created 127.92 MB of fragmentation.

Once a page has been split, the only way to merge it back is for the page to drop below the merge threshold. When that happens, InnoDB moves the data out of the split page with a merge operation. MERGE_THRESHOLD is configurable per table and per index.
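For reference, the merge threshold can be lowered from its 50% default using a table or index COMMENT (the syntax comes from the MySQL manual; the 45 and 40 values below are just examples):

```sql
-- Lower the merge threshold for all indexes of the table...
ALTER TABLE frag_page_spl COMMENT='MERGE_THRESHOLD=45';

-- ...or for one specific index (the index has to be re-created):
ALTER TABLE frag_page_spl
  DROP KEY idx_spl,
  ADD KEY idx_spl (messages DESC) COMMENT 'MERGE_THRESHOLD=40';
```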

The other way to reorganize the data is to OPTIMIZE the table. This can be a very heavy and long process, but often it is the only way to recover from a situation where too many pages are located in sparse extents.
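For example:

```sql
-- For InnoDB, OPTIMIZE TABLE is mapped to ALTER TABLE ... FORCE, which
-- rebuilds the table. The note "Table does not support optimize, doing
-- recreate + analyze instead" in its output is expected and harmless.
OPTIMIZE TABLE frag_page_spl;
```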

Conclusion

  • The first two cases are rare, because most applications are not designed to write huge amounts of data in a single statement.
  • You need to be aware of these issues whenever doing bulk INSERTs (INSERT INTO ... SELECT, loading data from mysqldump, INSERTs with huge data sets, etc.).
  • Remember that fragmented disk space is always reusable by future writes to the same table.
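One practical way to limit the damage from a failed or rolled-back bulk copy is to insert in smaller committed chunks, so a failure only rolls back the current chunk instead of the whole copy. A sketch, assuming the source table has an integer primary key `id` (an assumption; the chunk boundaries are illustrative):

```sql
-- Copy in primary-key ranges instead of one giant INSERT ... SELECT.
INSERT INTO ins_frag SELECT * FROM frag WHERE id BETWEEN 1 AND 1000000;
COMMIT;
INSERT INTO ins_frag SELECT * FROM frag WHERE id BETWEEN 1000001 AND 2000000;
COMMIT;
-- ...and so on, typically driven by a small script or stored procedure.
```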
