PostgreSQL SELECT query hangs during execution

Posted by okxuctiv on 2021-07-29 in Java

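I am using PostgreSQL 11. The table holds millions of rows. All operations, including INSERT and UPDATE, work fine, but when I run a SELECT query with some filters it hangs and does not respond even after 10 minutes.
I am testing with this simple query, and it still hangs: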

select tweet_id, user_id 
from "TweetsData" 
where lid_id=1 and tweet_time<='2020-06-09 09:00:00' 
order by tweet_time desc limit 10

I cannot even get EXPLAIN ANALYZE output for this query.
Here is the EXPLAIN output:

Limit  (cost=2053991.46..2053992.62 rows=10 width=43)
  ->  Gather Merge  (cost=2053991.46..3462030.09 rows=12068060 width=43)
        Workers Planned: 2
        ->  Sort  (cost=2052991.43..2068076.51 rows=6034030 width=43)
              Sort Key: tweet_time DESC
              ->  Parallel Seq Scan on "TweetsData"  (cost=0.00..1922598.21 rows=6034030 width=43)
                    Filter: ((tweet_time <= '2020-06-09 09:00:00'::timestamp without time zone) AND (lid_id = 1))

How can I solve this problem? It is very critical. Thanks in advance.

Answer #1 (cidc1ykv)

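Without a usable index, the planner has only one plan to choose from: scan all rows, filter them, sort the matches, and then pick the first 10. That means fetching about 12M records from disk just to obtain 10 result tuples. An index access only needs to fetch x*10 pages from disk (with x somewhere between 0.1 and a few dozen).
Here is the same kind of query using a single index (on the timestamp); the column names differ slightly: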

select version();

\d tweets_created_at_idx

-- Before Jun 9
explain analyse
select id, user_id
from tweets
where sucker_id = 507                   -- low-cardinality condition
and created_at < '2020-06-09 09:00:00'  -- about 7M records
order by created_at desc
limit 10;

-- After Jun 9
explain analyse
select id, user_id
from tweets
where sucker_id = 507                   -- low-cardinality condition
and created_at >= '2020-06-09 09:00:00' -- about 5K records
order by created_at desc
limit 10;

Output:

version                                                
-------------------------------------------------------------------------------------------------------
 PostgreSQL 11.3 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit
(1 row)

           Index "public.tweets_created_at_idx"
   Column   |           Type           | Key? | Definition 
------------+--------------------------+------+------------
 created_at | timestamp with time zone | yes  | created_at
btree, for table "public.tweets"

                                                                        QUERY PLAN                                                                        
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..1.65 rows=10 width=24) (actual time=0.059..0.079 rows=10 loops=1)
   ->  Index Scan Backward using tweets_created_at_idx on tweets  (cost=0.43..853049.60 rows=6995588 width=24) (actual time=0.052..0.071 rows=10 loops=1)
         Index Cond: (created_at < '2020-06-09 09:00:00+02'::timestamp with time zone)
         Filter: (sucker_id = 507)
 Planning Time: 0.339 ms
 Execution Time: 0.102 ms
(6 rows)

                                                                     QUERY PLAN                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..11.98 rows=10 width=24) (actual time=0.035..0.053 rows=10 loops=1)
   ->  Index Scan Backward using tweets_created_at_idx on tweets  (cost=0.43..6604.02 rows=5720 width=24) (actual time=0.034..0.050 rows=10 loops=1)
         Index Cond: (created_at >= '2020-06-09 09:00:00+02'::timestamp with time zone)
         Filter: (sucker_id = 507)
 Planning Time: 0.143 ms
 Execution Time: 0.067 ms
(6 rows)

Output on a Raspberry Pi B (roughly 10 times slower):

version                                                          
---------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.5 on armv6l-unknown-linux-gnueabihf, compiled by gcc (Raspbian 6.3.0-18+rpi1+deb9u1) 6.3.0 20170516, 32-bit
(1 row)

Did not find any relation named "tweets_created_at_idx".
               Index "public.tweets_du_idx"
   Column   |           Type           | Key? | Definition 
------------+--------------------------+------+------------
 created_at | timestamp with time zone | yes  | created_at
 user_id    | bigint                   | yes  | user_id
btree, for table "public.tweets"

                                                                    QUERY PLAN                                                                     
---------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..7.16 rows=10 width=24) (actual time=0.337..0.481 rows=10 loops=1)
   ->  Index Scan Backward using tweets_du_idx on tweets  (cost=0.43..4796963.78 rows=7126207 width=24) (actual time=0.322..0.428 rows=10 loops=1)
         Index Cond: (created_at < '2020-06-09 09:00:00+02'::timestamp with time zone)
         Filter: (sucker_id = 507)
 Planning Time: 4.272 ms
 Execution Time: 0.828 ms
(6 rows)

                                                                  QUERY PLAN                                                                   
-----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..7.75 rows=10 width=24) (actual time=0.202..0.435 rows=10 loops=1)
   ->  Index Scan Backward using tweets_du_idx on tweets  (cost=0.43..12876.25 rows=17588 width=24) (actual time=0.186..0.380 rows=10 loops=1)
         Index Cond: (created_at >= '2020-06-09 09:00:00+02'::timestamp with time zone)
         Filter: (sucker_id = 507)
 Planning Time: 8.428 ms
 Execution Time: 0.784 ms
(6 rows)
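
Applied to the table from the question, the same idea might look like the sketch below. It assumes the table and column names shown in the question's query ("TweetsData", lid_id, tweet_time); the index name tweetsdata_lid_time_idx is made up for illustration. The plans above still apply the equality condition as a Filter on every row the index scan returns, which is cheap when most rows match but can mean reading many index entries when few do. Putting the equality column first in a composite index is one way to avoid that, at the cost of a larger index.

```sql
-- Hypothetical index name; table and column names are taken from the question.
-- CONCURRENTLY builds the index without blocking writes on the table,
-- but it cannot be run inside a transaction block.
CREATE INDEX CONCURRENTLY tweetsdata_lid_time_idx
    ON "TweetsData" (lid_id, tweet_time DESC);

-- Refresh planner statistics so the row estimates are up to date.
ANALYZE "TweetsData";

-- Re-check the plan. With the index in place, the expectation is an
-- Index Scan that stops after 10 rows instead of a Parallel Seq Scan plus Sort.
EXPLAIN (ANALYZE, BUFFERS)
SELECT tweet_id, user_id
FROM "TweetsData"
WHERE lid_id = 1
  AND tweet_time <= '2020-06-09 09:00:00'
ORDER BY tweet_time DESC
LIMIT 10;
```

If the extra index size is a concern, a single-column index on tweet_time alone, as in the examples above, should also let the planner read rows in descending time order and stop early, provided enough rows satisfy lid_id = 1.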
