MySQL query with IN clause loses performance

Posted by w1jd8yoj on 2021-06-18 in MySQL

I have a table that stores data imported from CSV files. It is a big table (more than 40 million rows). This is its structure:

CREATE TABLE `imported_lines` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `day` date NOT NULL,
  `name` varchar(256) NOT NULL,
  `origin_id` int(11) NOT NULL,
  `time` time(3) NOT NULL,
  `main_index` tinyint(4) NOT NULL DEFAULT 0,
  `transaction_index` tinyint(4) NOT NULL DEFAULT 0,
  `data` varchar(4096) NOT NULL,
  `error` bit(1) NOT NULL,
  `expressions_applied` bit(1) NOT NULL,
  `count_records` smallint(6) NOT NULL DEFAULT 0,
  `client_id` tinyint(4) NOT NULL DEFAULT 0,
  `receive_date` datetime(3) NOT NULL,
  PRIMARY KEY (`id`,`client_id`),
  UNIQUE KEY `uq` (`client_id`,`name`,`origin_id`,`receive_date`),
  KEY `dh` (`day`,`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 
/*!50100 PARTITION BY HASH (`client_id`) PARTITIONS 15 */

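When I run the SELECT with a one-day filter, it returns data very quickly (0.4 s). But as I widen the date range it gets slower, until it ends in a timeout error.
Here is the query: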

SELECT origin_id, error, main_index, transaction_index, 
expressions_applied, name, day,    
COUNT(id) AS total, SUM(count_records) AS sum_records
FROM imported_lines FORCE INDEX (dh)
WHERE client_id = 1
AND day >= '2017-07-02' AND day <= '2017-07-03'  
AND name IN ('name1', 'name2', 'name3', ...)  
GROUP BY origin_id, error, main_index, transaction_index, expressions_applied, name, day;

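I thought the IN clause might be what is hurting performance. I also tried adding the uq index to this query, which gave a small gain ( FORCE INDEX (dh, uq) ). I also tried INNER JOIN (SELECT name FROM providers WHERE id = 2) prov ON prov.name = il.name but that did not make the query faster either.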
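For readers trying to reproduce the join attempt, here is a sketch of how that fragment might fit into the full query. It is an assumption, not the query actually run: the question does not show the providers table, so its columns (id, name) are inferred from the fragment, and the alias il is taken from the ON clause.

```sql
-- Sketch only: providers(id, name) is assumed from the quoted fragment;
-- the question does not show that table's schema.
SELECT il.origin_id, il.error, il.main_index, il.transaction_index,
       il.expressions_applied, il.name, il.day,
       COUNT(il.id) AS total, SUM(il.count_records) AS sum_records
FROM imported_lines il FORCE INDEX (dh)
-- The derived table replaces the literal IN (...) list of names.
INNER JOIN (SELECT name FROM providers WHERE id = 2) prov
        ON prov.name = il.name
WHERE il.client_id = 1
  AND il.day >= '2017-07-02' AND il.day <= '2017-07-03'
GROUP BY il.origin_id, il.error, il.main_index, il.transaction_index,
         il.expressions_applied, il.name, il.day;
```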
EDIT: EXPLAIN output for the query

id - 1
select_type - SIMPLE
table - imported_lines
type - range
possible_keys - uq, dh
key - dh
key_len - 261
ref - NULL
rows - 297988
extra - Using where; Using temporary; Using filesort

Any suggestions?

mysql

Source: https://stackoverflow.com/questions/52725855/mysql-query-with-in-clause-loses-performance

1 answer

0yg35tkg1#

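I made some changes: I added a new multi-column index (as suggested by @uuerdo) and rewrote the query as another user suggested (though he has since deleted his answer).
I ran EXPLAIN PARTITIONS on the query a few times and tested with SQL_NO_CACHE to make sure the cache was not being used. Searching one month of data now takes 1.8 seconds.
Much faster! Here is what I did: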

ALTER TABLE `imported_lines` DROP INDEX dh;
ALTER TABLE `imported_lines` ADD INDEX dhc (`day`, `name`, `client_id`);
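As a possible alternative to the two statements above, MySQL accepts several alterations in one ALTER TABLE, so the drop and the add can be issued together. This is only a sketch, not what was run for this answer, and its effect on a table of this size was not measured here. The SHOW INDEX check afterwards is likewise an addition for illustration.

```sql
-- Run this INSTEAD of the two separate ALTER TABLE statements, not after them
-- (a second DROP INDEX dh would fail because dh no longer exists).
ALTER TABLE `imported_lines`
  DROP INDEX dh,
  ADD INDEX dhc (`day`, `name`, `client_id`);

-- Confirm the new index exists and inspect its column order (Seq_in_index).
SHOW INDEX FROM `imported_lines` WHERE Key_name = 'dhc';
```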

Query:

SELECT origin_id, error, main_index, transaction_index, 
expressions_applied, name, day,    
COUNT(il.id) AS total, SUM(count_records) AS sum_records
FROM imported_lines il
INNER JOIN (
    SELECT id FROM imported_lines
    WHERE client_id = 1 
    AND day >= '2017-07-01' AND day <= '2017-07-31'  
    AND name IN ('name1', 'name2', 'name3', ...)  
) AS il_filter
ON il_filter.id = il.id
WHERE il.client_id = 1
GROUP BY origin_id, error, main_index, transaction_index, expressions_applied, name, day;

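I noticed that with the INNER JOIN, EXPLAIN PARTITIONS showed the query starting to use the index. Also, with WHERE il.client_id = 1, the query reduced the number of partitions it has to look at.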
Thanks for your help!
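The check described in this answer can be sketched roughly as follows. The name list uses three placeholder values only, and the exact statement that was run is not shown in the answer, so treat this as an illustration. Note that EXPLAIN PARTITIONS and SQL_NO_CACHE belong to the MySQL 5.x line: in MySQL 8.0, plain EXPLAIN already includes the partitions column, and SQL_NO_CACHE has no effect because the query cache was removed.

```sql
-- Shows the chosen key and the partitions touched; remove the
-- EXPLAIN PARTITIONS line to execute the query and time it instead.
EXPLAIN PARTITIONS
SELECT SQL_NO_CACHE
       il.origin_id, il.error, il.main_index, il.transaction_index,
       il.expressions_applied, il.name, il.day,
       COUNT(il.id) AS total, SUM(il.count_records) AS sum_records
FROM imported_lines il
INNER JOIN (
    SELECT id FROM imported_lines
    WHERE client_id = 1
    AND day >= '2017-07-01' AND day <= '2017-07-31'
    AND name IN ('name1', 'name2', 'name3')  -- placeholder names
) AS il_filter
ON il_filter.id = il.id
WHERE il.client_id = 1
GROUP BY il.origin_id, il.error, il.main_index, il.transaction_index,
         il.expressions_applied, il.name, il.day;
```

In the output, the partitions column should list a single partition when the client_id = 1 filter allows pruning, and the key column should show dhc for the inner SELECT if the new index is being used.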
