当应用多个过滤器时，如何提高查询性能？

ruyhziif 于 2021-06-20 发布在 Mysql

关注(0)|答案(4)|浏览(265)

我有一种三元关系，叫做 ternary 这样地：

id_Offer    -   id_Profile  -   id_Skill
1           -   1           -   1
1           -   2           -   1

[and so on, there would be more registers for each id_Offer from Offer but I want to limit the example]

表profile如下所示（profile\u interest是建立profile和interest之间关系的表，仅此而已）：

id_Profile -   profile_name
1          -   profile-1
2          -   profile-2
3          -   profile-3

因此，当我进行以下查询时，我添加的or子句越多，查询的性能就越差，从~0.1-0.2秒开始，这是我进行的任何其他查询得到的结果，最多1.5秒。

SELECT DISTINCT ternary_table.id_profile, COUNT(distinct profile_interest.id_interest) as matching 
FROM ternary_table INNER JOIN profile ON ternary_table.id_profile=profile.id_profile 
INNER JOIN profile_interest ON profile.id_profile=profile_interest.id_profile 
WHERE profile_interest.id_interest= '1' 
 OR profile_interest.id_interest = '2' 
 OR profile_interest.id_interest = '3'
 OR profile_interest.id_interest = '14'
 OR profile_interest.id_interest = '15'
 OR profile_interest.id_interest = '16'
GROUP BY(ternary_table.id_profile) 
ORDER BY matching DESC;

我已尝试将字段配置文件\u interest.id\u interest设置为索引列，其中包含： CREATE INDEX filter_interest ON profile_interest(id_interest ); 没有任何改善。数据库的重量不到1千兆字节，是一个非常小的数据库，大约有15个表，所以我想知道是否有任何方法可以缩短查询延迟。
edit：为了添加更多信息，我之所以担心这个问题，是因为这个数据的唯一目的是连接到api，因此sql中的任何延迟都会延迟对这个数据的每次调用。
edit1：添加了explain输出，删除了first distinct，因为这是不必要的

+----+-------------+---------------------+------------+--------+------------------------------------------------+------------+---------+------------------------------------+------+----------+-----------------------------------------------------------+
| id | select_type | table               | partitions | type   | possible_keys                                  | key        | key_len | ref                                | rows | filtered | Extra                                                     |
+----+-------------+---------------------+------------+--------+------------------------------------------------+------------+---------+------------------------------------+------+----------+-----------------------------------------------------------+
|  1 | SIMPLE      | profile_interest      | NULL       | range  | PRIMARY,id_interest,filter_interest | id_interest | 202 | NULL                               |   40 |   100.00 | Using where; Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | perfil              | NULL       | eq_ref | PRIMARY                                        | PRIMARY    | 202     | BBDD.profile_interest.id_perfil    |    1 |   100.00 | Using index                                               |
|  1 | SIMPLE      | oferta_skill_perfil | NULL       | ref    | PRIMARY,id_skill,id_perfil                     | id_perfil  | 202     | BBDD.profile_interest.id_perfil    | 4609 |   100.00 | Using index                                               |
+----+-------------+---------------------+------------+--------+------------------------------------------------+------------+---------+------------------------------------+------+----------+-----------------------------------------------------------+

编辑2：添加了每个请求的表创建

SET FOREIGN_KEY_CHECKS=1;

CREATE TABLE profile (
    id_profile VARCHAR(200) NOT NULL,
    name_profile VARCHAR(200),
    type_profile VARCHAR(200),
    PRIMARY KEY (id_profile)
);

CREATE TABLE ternary (
    id_oferta VARCHAR(200) NOT NULL,
    id_skill VARCHAR(200) NOT NULL,
    id_profile VARCHAR(200) NOT NULL,
    ranking_skill DOUBLE NOT NULL,
    PRIMARY KEY (id_oferta, id_skill, id_profile),
    FOREIGN KEY (id_oferta) REFERENCES oferta(id_oferta),
    FOREIGN KEY (id_skill) REFERENCES skill(id_skill),
    FOREIGN KEY (id_profile) REFERENCES profile(id_profile)
);

 CREATE TABLE interest (
    id_interest VARCHAR(200) NOT NULL,
    name_interes VARCHAR(200),
    PRIMARY KEY (id_interest)
 );

CREATE TABLE profile_interest (
    id_profile VARCHAR(200) NOT NULL,
    id_interest VARCHAR(200) NOT NULL, 
    PRIMARY KEY (id_profile, id_interest),
    FOREIGN KEY (id_profile) REFERENCES profile(id_profile),
    FOREIGN KEY (id_interest) REFERENCES interes(id_interest)
);

sql mysql performance

来源：https://stackoverflow.com/questions/50702306/how-to-improve-query-performance-when-applying-multiple-filters

4条答案

按热度按时间

bfnvny8b1#

你可以试着这样写：

select tt.id_profile,
       (select count(distinct pi.id_interest)
        from profile_interest pi
        where tt.id_profile = pi.id_profile and
              pi.id_interest in (1, 2, 3, 14, 15, 16)
       ) as matching
from ternary_table tt;

为了让它工作，你需要一个索引 profile_interest(id_profile, id_interest) .
编辑：
如果只需要匹配的行，则可以添加：

having matching > 0

赞(0）回复(0）举报 2021-06-20

zpf6vheq2#

没有答案。评论时间太长。。。
fwiw，我觉得这更容易阅读。。。

SELECT x.id_profile
     , COUNT(DISTINCT i.id_interest) matching 
  FROM ternary_table x
  JOIN profile p 
    ON p.id_profile = x.id_profile
  JOIN profile_interest i
    ON i.id_profile = p.id_profile
 WHERE i.id_interest IN(1,2,3,14,15,16)
 GROUP 
    BY x.id_profile
 ORDER 
    BY matching DESC;

现在，如果我们能看到show create table语句以及对它的解释就好了。

赞(0）回复(0）举报 2021-06-20

8qgya5xd3#

您的查询可以简化为：

SELECT id_profile, 
   COUNT(distinct id_interest) as matching  -- or COUNT(*)??
FROM profile_interest 
WHERE id_interest IN(1,2,3,14,15,16) -- those ids are probably integers, not strings
GROUP BY id_profile
ORDER BY matching DESC;

赞(0）回复(0）举报 2021-06-20

fwzugrvs4#

SELECT  id_profile,
        COUNT(id_interest) as matching
    FROM  profile_interest AS pi
    WHERE  id_interest IN (1,2,4,14,15,16)
      AND EXISTS ( SELECT * FROM oferta_skill_perfil
                                          WHERE id_profile = pi.id_profile )
      AND EXISTS ( SELECT * FROM profile  WHERE id_profile = pi.id_profile )
    GROUP BY id_profile
    ORDER BY  matching DESC;

以及 profile_interest 需要 INDEX(id_interest, id_profile) 按这个顺序。
我曾经 EXISTS 只是因为这似乎是触摸其他table的真正目的。但也许他们不需要被触碰？
减速是由我所说的“爆炸-内爆”或“膨胀-收缩”引起的。当你 JOIN 一些表（导致更多中间行），然后 GROUP BY 回到你开始的状态。摆脱它的技巧是首先集中精力做聚合( COUNT ，在你的情况下），那么 JOIN 根据需要。 EXISTs 如果只需要检查是否存在，而不是实际查找所有4609行，则速度要快得多。 profile_interest 似乎是一个多对多Map表。请看我的提示。
注意，它推荐了我上面建议的索引。
通常id是整数；为什么你有 VARCHAR(200) ? 这种长字符串没有明显的来源。

赞(0）回复(0）举报 2021-06-20

我来回答

当应用多个过滤器时，如何提高查询性能？

4条答案

相关问题

热门标签

最新问答