MySQL - Adding columns to SELECT degrades performance

Posted by pwuypxnk on 2023-05-16 in MySQL


I have a table with a few million rows. It has a primary key on id and a composite unique key (called comp_indx) on col2, col3, col4 and my_date. Sample data is shown here:

id   col2 col3 col4 my_date             col5 col6 col7
1    1    1    1    2020-01-03 02:00:00 a    1    a
2    1    2    1    2020-01-03 01:00:00 b    2    1
3    1    3    1    2020-01-03 03:00:00 c    3    b
4    2    1    1    2020-02-03 01:00:00 d    4    2
5    2    2    1    2020-02-03 02:00:00 e    5    c
6    2    3    1    2020-02-03 03:00:00 f    6    3
7    3    1    1    2020-03-03 03:00:00 g    7    d
8    3    2    1    2020-03-03 02:00:00 h    8    4
9    3    3    1    2020-03-03 01:00:00 i    9    e

If I run the following query...

SELECT col2, col3, max(my_date)
FROM my_table
where col4=1 and my_date <= '2001-01-27'
group by col2, col3

...the query is very efficient, and running EXPLAIN shows:

select_type type  key       key_len rows Extra
----------- ----- --------- ------- ---- -------------------------------------
SIMPLE      range comp_indx 11      669  Using where; Using index for group-by

However, if I run a similar query that only requests more columns (none of which are part of the index), for example:

SELECT col2, col3, max(my_date), col5, col7
FROM my_table
where col4=1 and my_date <= '2001-01-27'
group by col2, col3

...then performance drops immediately, and if I run EXPLAIN again I get:

select_type type  key       key_len rows     Extra
----------- ----- --------- ------- -------  -----------
SIMPLE      index comp_indx 11      5004953  Using where

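I can see that the type has changed from range to index, and that the index is no longer used for the group-by.
I would like to know why this happens and, more importantly, how I can fix it.
The table definition is: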

CREATE TABLE `my_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `col2` smallint(6) NOT NULL,
  `col3` smallint(6) NOT NULL,
  `col4` smallint(6) NOT NULL,
  `my_date` datetime NOT NULL,
  `col5` char(1) NOT NULL,
  `col6` char(1) NOT NULL,
  `col7` char(1) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `comp_indx` (`col2`,`col3`,`col4`,`my_date`)
) ENGINE=InnoDB;

mysql

Source: https://stackoverflow.com/questions/76188981/mysql-adding-columns-to-select-degrades-performance

5 Answers


xjreopfe1#

Add the following index:

alter table my_table add key cl4_dt_cl2_cl3 (col4,my_date,col2,col3);

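Also, if the sql_mode ONLY_FULL_GROUP_BY is enabled, the following query is invalid, because col5 and col7 are neither aggregated nor listed in the GROUP BY clause: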

SELECT col2, col3, max(my_date), col5, col7
FROM my_table
where col4=1 and my_date <= '2001-01-27'
group by col2, col3
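
One way to make such a query acceptable under ONLY_FULL_GROUP_BY is sketched below. It assumes MySQL 5.7 or later (where ANY_VALUE() is available) and the table name my_table from the question's CREATE TABLE. Note that ANY_VALUE() returns col5 and col7 from an arbitrary row of each group, which is not necessarily the row holding the maximum my_date.

```sql
-- Check whether ONLY_FULL_GROUP_BY is part of the current session's sql_mode.
SELECT @@SESSION.sql_mode;

-- ANY_VALUE() tells the server that an arbitrary value per group is
-- acceptable for the non-aggregated columns col5 and col7.
SELECT col2, col3, MAX(my_date) AS max_date,
       ANY_VALUE(col5) AS col5, ANY_VALUE(col7) AS col7
FROM my_table
WHERE col4 = 1 AND my_date <= '2001-01-27'
GROUP BY col2, col3;
```

If the values of col5 and col7 must come from the row with the latest my_date, a join back to the table is needed instead.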

2023-05-16

yeotifhr2#

Your original query:

SELECT col2, col3, max(my_date), col5, col7
FROM my_table
where col4=1 and my_date <= '2001-01-27'
group by col2, col3

col5 and col7 should also be added to the GROUP BY clause, right?

2023-05-16

mv1qrgav3#

If you don't need id for anything, then this will speed up the query, regardless of which extra columns (col5/6/7) you need to fetch.

CREATE TABLE `my_table` (
  `col2` smallint(6) NOT NULL,
  `col3` smallint(6) NOT NULL,
  `col4` smallint(6) NOT NULL,
  `my_date` datetime NOT NULL,
  `col5` char(1) NOT NULL,
  `col6` char(1) NOT NULL,
  `col7` char(1) NOT NULL,
  PRIMARY KEY (col4,my_date,col2,col3)  -- in this order
) ENGINE=InnoDB;

If you do need id because it is referenced by other tables, then add:

`id` int(11) NOT NULL AUTO_INCREMENT,
  INDEX(id)  -- This is sufficient to keep auto_inc happy
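
Putting the two fragments above together, the full definition might look like the following. This is a sketch only: it merges the answer's suggested primary key with the id column from the question, and it assumes the table is being created fresh rather than altered in place.

```sql
CREATE TABLE `my_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `col2` smallint(6) NOT NULL,
  `col3` smallint(6) NOT NULL,
  `col4` smallint(6) NOT NULL,
  `my_date` datetime NOT NULL,
  `col5` char(1) NOT NULL,
  `col6` char(1) NOT NULL,
  `col7` char(1) NOT NULL,
  -- Clustered key ordered to match the WHERE clause (col4, my_date) first.
  PRIMARY KEY (col4, my_date, col2, col3),
  -- An AUTO_INCREMENT column must be indexed; a plain index is enough.
  INDEX (id)
) ENGINE=InnoDB;
```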

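My suggested PK is 11 bytes (2 bytes for each of the three SMALLINT columns plus 5 for the DATETIME), versus 4 bytes for the INT. Every secondary index will include those 11 bytes. However, any columns common to the PK and a secondary index are not duplicated. For example, INDEX(col2, col7) is really INDEX(col2, col7, col4, my_date, col3).
Keep in mind that the PK determines the rows' "locality of reference". Any secondary index starting with col4 is almost useless, since the PK already starts with col4. (This depends on cardinality and so on, of course.)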

2023-05-16

ctrmrzij4#

I have now solved my performance problem by using two selects on the same table and a join, for example:

SELECT *
FROM (
    SELECT col2, col3, max(my_date) as max_date
    FROM my_table
    where col4=1 and my_date <= '2001-01-27'
    group by col2, col3
) aaa
join
(
    SELECT col2, col3, my_date, col5, col6, col7
    FROM my_table
    where col4=1
) bbb
on (aaa.col2=bbb.col2 and aaa.col3=bbb.col3 and aaa.max_date=bbb.my_date);
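
The same idea can also be written with the base table joined directly instead of through a second derived table. The sketch below is a variant, not the poster's exact query; it assumes the table name my_table from the question, and the aliases latest and t are illustrative. Because comp_indx is unique on (col2, col3, col4, my_date), each grouped row matches at most one table row.

```sql
SELECT t.col2, t.col3, t.my_date, t.col5, t.col6, t.col7
FROM (
    -- Index-only step: latest my_date per (col2, col3) for col4 = 1.
    SELECT col2, col3, MAX(my_date) AS max_date
    FROM my_table
    WHERE col4 = 1 AND my_date <= '2001-01-27'
    GROUP BY col2, col3
) AS latest
JOIN my_table AS t
  ON  t.col2 = latest.col2
  AND t.col3 = latest.col3
  AND t.col4 = 1
  AND t.my_date = latest.max_date;
```

Joining on all four columns of comp_indx gives the optimizer the option of a unique-key lookup per group; whether it takes that plan is worth confirming with EXPLAIN.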

2023-05-16

iyfjxgzm5#

You may need to add this covering index to make the second query faster:

create index comp2_index on my_table(col2, col3, col4, my_date, col5, col7);
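
To check whether the optimizer actually picks the new index, EXPLAIN can be run on the second query again. This is a sketch that assumes the table is named my_table, as in the question's CREATE TABLE, and that ONLY_FULL_GROUP_BY is disabled (otherwise the statement is rejected). Which key is chosen depends on the MySQL version and table statistics, so no output is shown here.

```sql
EXPLAIN
SELECT col2, col3, MAX(my_date), col5, col7
FROM my_table
WHERE col4 = 1 AND my_date <= '2001-01-27'
GROUP BY col2, col3;
```

If the Extra column contains "Using index", the query is being answered from the index alone, without reading the table rows.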

2023-05-16
