内容简介:如果对
如果对 索引字段 做 函数 操作,可能会 破坏索引值的有序性 ,因此 优化器 会决定 放弃 走 树搜索 功能
条件字段函数操作
交易日志表
CREATE TABLE `tradelog` (
`id` INT(11) NOT NULL,
`tradeid` VARCHAR(32) DEFAULT NULL,
`operator` INT(11) DEFAULT NULL,
`t_modified` DATETIME DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `tradeid` (`tradeid`),
KEY `t_modified` (`t_modified`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
-- 94608000 = 3 * 365 * 24 * 3600
-- t_modified : 2016-01-01 00:00:00 ~ 2019-01-01 00:00:00
DELIMITER ;;
CREATE PROCEDURE tdata()
BEGIN
DECLARE i INT;
SET i=0;
WHILE i<1000000 DO
INSERT INTO tradelog VALUES (i,i,i,FROM_UNIXTIME(UNIX_TIMESTAMP('2016-01-01 00:00:00')+FLOOR(0+(RAND()*94608000))));
SET i=i+1;
END WHILE;
END;;
DELIMITER ;
CALL tdata();
month函数
SELECT COUNT(*) FROM tradelog WHERE MONTH(t_modified)=7;
explain
mysql> EXPLAIN SELECT COUNT(*) FROM tradelog WHERE MONTH(t_modified)=7\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tradelog
partitions: NULL
type: index
possible_keys: NULL
key: t_modified
key_len: 6
ref: NULL
rows: 998838
filtered: 100.00
Extra: Using where; Using index
-
key=t_modified:优化器选择了遍历二级索引t_modified -
type=index:表示 全索引扫描 (二级索引) -
rows=998,838≈1,000,000:说明这条语句基本 扫描 了整个二级索引t_modified -
Using index:表示使用了 覆盖索引 ( 无需回表 ) - 在索引字段
t_modified上加上MONTH函数,导致了 全索引扫描 ,无法使用 树搜索 功能
slowlog
Rows_examined=1,000,000 ,佐证了 全索引扫描
# Time: 2019-02-12T14:25:07.158350+08:00 # User@Host: root[root] @ localhost [] Id: 13 # Query_time: 0.208787 Lock_time: 0.000162 Rows_sent: 1 Rows_examined: 1000000 SET timestamp=1549952707; SELECT COUNT(*) FROM tradelog WHERE MONTH(t_modified)=7;
分析
-
WHERE t_modified='2018-07-01',InnoDB会按照绿色箭头的路线找到结果(树搜索)- 这源于B+树的特性: 同一层兄弟节点的有序性
-
WHERE MONTH(t_modified)=7,在树的第一层就不知道如何操作,因此 优化器放弃了树搜索功能- 优化器可以选择遍历 聚簇索引 ,或者遍历 二级索引
t_modified - 优化器在对比索引大小后发现,二级索引
t_modified更小,最终选择了遍历二级索引t_modified
- 优化器可以选择遍历 聚簇索引 ,或者遍历 二级索引
优化方案
mysql> SELECT COUNT(*) FROM tradelog WHERE
-> (t_modified >= '2016-7-1' AND t_modified<'2016-8-1') OR
-> (t_modified >= '2017-7-1' AND t_modified<'2017-8-1') OR
-> (t_modified >= '2018-7-1' AND t_modified<'2018-8-1');
explain
mysql> EXPLAIN SELECT COUNT(*) FROM tradelog WHERE
-> (t_modified >= '2016-7-1' AND t_modified<'2016-8-1') OR
-> (t_modified >= '2017-7-1' AND t_modified<'2017-8-1') OR
-> (t_modified >= '2018-7-1' AND t_modified<'2018-8-1')\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tradelog
partitions: NULL
type: range
possible_keys: t_modified
key: t_modified
key_len: 6
ref: NULL
rows: 180940
filtered: 100.00
Extra: Using where; Using index
-
type=range:表示 索引范围扫描 (二级索引) -
rows=180,940 < 998,838,扫描行数 远小于 上面使用MONTH函数的情况
slowlog
Rows_examined=84,704 < 1,000,000 , Query_time 也仅为使用 MONTH 函数情况的 25%
# Time: 2019-02-12T14:56:51.727672+08:00 # User@Host: root[root] @ localhost [] Id: 13 # Query_time: 0.051701 Lock_time: 0.000239 Rows_sent: 1 Rows_examined: 84704 SET timestamp=1549954611; SELECT COUNT(*) FROM tradelog WHERE (t_modified >= '2016-7-1' AND t_modified<'2016-8-1') OR (t_modified >= '2017-7-1' AND t_modified<'2017-8-1') OR (t_modified >= '2018-7-1' AND t_modified<'2018-8-1');
id+1
mysql> explain select * from tradelog where id+1 = 1000000\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tradelog
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 998838
filtered: 100.00
Extra: Using where
mysql> EXPLAIN SELECT * FROM tradelog WHERE id = 999999\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tradelog
partitions: NULL
type: const
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 1
filtered: 100.00
Extra: NULL
- 优化器会偷懒,依然认为
id+1=1,000,000是应用在索引字段上的函数,因此采用的是 全表扫描 - 而
id=999,999会走 聚簇索引 的 树搜索 ,const表示这是 常量 操作(最多只会有一行记录匹配)
隐式类型转换
字符串 -> 数字
在 MySQL 中,如果字符串和数字做比较,会先 将字符串转换为数字
mysql> SELECT '10' > 9; +----------+ | '10' > 9 | +----------+ | 1 | +----------+
tradeid
explain
mysql> EXPLAIN SELECT * FROM tradelog WHERE tradeid=625912\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tradelog
partitions: NULL
type: ALL
possible_keys: tradeid
key: NULL
key_len: NULL
ref: NULL
rows: 998838
filtered: 10.00
Extra: Using where
-
type=ALL:表示 全表扫描 -
rows=998,838≈1,000,000 - 等价于
SELECT * FROM tradelog WHERE CAST(tradid AS SIGNED INT)=625912;- 隐式的类型转换,导致会在索引字段上做函数操作,优化器会放弃走树搜索的功能
slowlog
Rows_examined 依然为 1,000,000
# Time: 2019-02-12T15:30:09.033772+08:00 # User@Host: root[root] @ localhost [] Id: 13 # Query_time: 0.312170 Lock_time: 0.000114 Rows_sent: 1 Rows_examined: 1000000 SET timestamp=1549956609; SELECT * FROM tradelog WHERE tradeid=625912;
id
explain
mysql> EXPLAIN SELECT * FROM tradelog WHERE id='625912'\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tradelog
partitions: NULL
type: const
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 1
filtered: 100.00
Extra: NULL
-
type=const:表示 常量操作 -
key=PRIMARY:走 聚簇索引 的树搜索功能 -
rows=1:只需要扫描一行 - 等价于
SELECT * FROM tradelog WHERE id=CAST('625912' AS SIGNED INT);- 只是在 输入参数 上做隐式类型转换,在索引字段上并没有做函数操作,依然可以走 聚簇索引 的树搜索功能
slowlog
Rows_examined=1 ,只需要扫描一行
# Time: 2019-02-12T15:45:38.222760+08:00 # User@Host: root[root] @ localhost [] Id: 13 # Query_time: 0.000476 Lock_time: 0.000210 Rows_sent: 1 Rows_examined: 1 SET timestamp=1549957538; SELECT * FROM tradelog WHERE id='625912';
隐式字符编码转换
交易详情表
-- tradelog的编码为utf8mb4,trade_detail的编码为utf8
CREATE TABLE `trade_detail` (
`id` INT(11) NOT NULL,
`tradeid` VARCHAR(32) DEFAULT NULL,
`trade_step` INT(11) DEFAULT NULL,
`step_info` VARCHAR(32) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `tradeid` (`tradeid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO tradelog VALUES (1, 'aaaaaaaa', 1000, NOW()); INSERT INTO tradelog VALUES (2, 'aaaaaaab', 1000, NOW()); insert into tradelog VALUES (3, 'aaaaaaac', 1000, NOW()); INSERT INTO trade_detail VALUES (1, 'aaaaaaaa', 1, 'add'); INSERT INTO trade_detail VALUES (2, 'aaaaaaaa', 2, 'update'); INSERT INTO trade_detail VALUES (3, 'aaaaaaaa', 3, 'commit'); INSERT INTO trade_detail VALUES (4, 'aaaaaaab', 1, 'add'); INSERT INTO trade_detail VALUES (5, 'aaaaaaab', 2, 'update'); INSERT INTO trade_detail VALUES (6, 'aaaaaaab', 3, 'update again'); INSERT INTO trade_detail VALUES (7, 'aaaaaaab', 4, 'commit'); INSERT INTO trade_detail VALUES (8, 'aaaaaaac', 1, 'add'); INSERT INTO trade_detail VALUES (9, 'aaaaaaac', 2, 'update'); INSERT INTO trade_detail VALUES (10, 'aaaaaaac', 3, 'update again'); INSERT INTO trade_detail VALUES (11, 'aaaaaaac', 4, 'commit');
函数作用于二级索引
explain
mysql> EXPLAIN SELECT d.* FROM tradelog l, trade_detail d WHERE d.tradeid=l.tradeid AND l.id=2; +----+-------------+-------+------------+-------+-----------------+---------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+-----------------+---------+---------+-------+------+----------+-------------+ | 1 | SIMPLE | l | NULL | const | PRIMARY,tradeid | PRIMARY | 4 | const | 1 | 100.00 | NULL | | 1 | SIMPLE | d | NULL | ALL | NULL | NULL | NULL | NULL | 11 | 100.00 | Using where | +----+-------------+-------+------------+-------+-----------------+---------+---------+-------+------+----------+-------------+
-
tradelog称为 驱动表 ,trade_detail称为 被驱动表 ,tradeid为 关联字段 。驱动原则: 小表驱动大表 - 优化器会先在
tradelog表上查找id=2的行,使用了tradelog的聚簇索引,只扫描了一行,取出tradeid='aaaaaaab' - 然后到
trade_detail表上查找tradeid='aaaaaaab'的行,但没有选择 二级索引tradeid,而选择了 全表扫描-
type=ALL,不符合预期,本希望走二级索引tradeid的树搜索功能 - 原因:两个表的 字符集不相同
-
tradelog的编码为utf8mb4,trade_detail的编码为utf8, -
utf8mb4是utf8的超集,详见 mysql中utf8和utf8mb4区别 -
d.tradeid=l.tradeid时,需要先 将utf8字符串转换成utf8mb4字符串 - 因此,被驱动表
trade_detail里面的tradeid字段需要先转换成utf8mb4类型,再跟L2进行比较
-
- 等价于
SELECT * FROM trade_detail WHERE CONVERT(traideid USING utf8mb4)=$L2.tradeid.value;- 隐式的 字符编码转换 ,导致会在二级索引
tradeid上做函数操作,优化器会放弃走 树搜索 的功能
- 隐式的 字符编码转换 ,导致会在二级索引
-
slowlog
# Time: 2019-02-12T16:45:14.841502+08:00 # User@Host: root[root] @ localhost [] Id: 13 # Query_time: 0.000470 Lock_time: 0.000202 Rows_sent: 4 Rows_examined: 11 SET timestamp=1549961114; SELECT d.* FROM tradelog l, trade_detail d WHERE d.tradeid=l.tradeid AND l.id=2;
函数作用于输入参数
explain
mysql> EXPLAIN SELECT l.* FROM tradelog l, trade_detail d WHERE d.tradeid=l.tradeid AND d.id=4; +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+ | 1 | SIMPLE | d | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL | | 1 | SIMPLE | l | NULL | ref | tradeid | tradeid | 131 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
-
trade_detail称为 驱动表 ,tradelog称为 被驱动表 ,tradeid为 关联字段 - 被驱动表
tradelog的编码为utf8mb4,驱动表trade_detail的编码为utf8- 等价于
SELECT * FROM tradelog WHERE traideid = CONVERT($R4.tradeid.value USING utf8mb4); - 函数是用在 输入参数 上的,并非二级索引
tradeid上,因此可以用 树搜索 功能(key=tradeid和rows=1) -
type=ref: Join语句中被驱动表索引引用的查询
- 等价于
slowlog
# Time: 2019-02-12T17:31:50.553151+08:00 # User@Host: root[root] @ localhost [] Id: 13 # Query_time: 0.004090 Lock_time: 0.001874 Rows_sent: 1 Rows_examined: 1 SET timestamp=1549963910; SELECT l.* FROM tradelog l, trade_detail d WHERE d.tradeid=l.tradeid AND d.id=4;
优化方案
- 常用:将
trade_detail.tradeid的字符串编码修改为utf8mb4-
ALTER TABLE trade_detail MODIFY tradeid VARCHAR(32) CHARACTER SET utf8mb4 DEFAULT NULL;
-
- 修改SQL(场景:数据量较大或暂不支持该DDL)
- 主动把
l.tradeid转换为utf8,避免了 被驱动表上的隐式字符编码转换 -
SELECT d.* FROM tradelog l, trade_detail d WHERE d.tradeid=CONVERT(l.tradeid USING utf8) AND l.id=2;
- 主动把
mysql> EXPLAIN SELECT d.* FROM tradelog l, trade_detail d WHERE d.tradeid=CONVERT(l.tradeid USING utf8) AND l.id=2; +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+ | 1 | SIMPLE | l | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL | | 1 | SIMPLE | d | NULL | ref | tradeid | tradeid | 99 | const | 4 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
# Time: 2019-02-12T17:50:29.844772+08:00 # User@Host: root[root] @ localhost [] Id: 13 # Query_time: 0.000504 Lock_time: 0.000206 Rows_sent: 4 Rows_examined: 4 SET timestamp=1549965029; SELECT d.* FROM tradelog l, trade_detail d WHERE d.tradeid=CONVERT(l.tradeid USING utf8) AND l.id=2;
参考资料
《MySQL实战45讲》
转载请注明出处:http://zhongmingmao.me/2019/02/12/mysql-index-function/
访问原文「 MySQL -- 索引上的函数 」获取最佳阅读体验并参与讨论
以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网
猜你喜欢:- MySQL 8.0.13(GA) 发布,新增函数索引、强制主键等新特性
- MySQL 8.0.13(GA) 发布,新增函数索引、强制主键等新特性
- MySQL索引使用说明(单列索引和多列索引)
- Elasticsearch索引的基本操作(3)-索引的滚动索引
- Coreseek 增量索引模拟实时索引
- Coreseek 增量索引模拟实时索引
本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。