ruby-on-rails Postgres只对第一个项目使用组合索引,搜索速度要慢得多

dwthyt8l  于 2023-01-27  发布在  Ruby
关注(0)|答案(1)|浏览(132)

我有两个索引显示非常不同的结果相似的搜索,我不知道这是为什么:

# The two indexes
"index_buyer_suppliers_on_buyer_id" btree (buyer_id)
"index_buyer_suppliers_on_supplier_id_and_buyer_id" UNIQUE, btree (supplier_id, buyer_id)

# The two searches
@current_suppliers = BuyerSupplier.where("buyer_id = ?", @entity.id).length
@current_buyers    = BuyerSupplier.where("supplier_id = ?", @entity.id).length

# The EXPLAINs
1. EXPLAIN SELECT "buyer_suppliers".* FROM "buyer_suppliers" WHERE (buyer_id = 891285);
                                             QUERY PLAN                                                 
------------------------------------------------------------------------------------------------------------
 Index Scan using index_buyer_suppliers_on_buyer_id on buyer_suppliers  (cost=0.43..54.48 rows=15 width=56)
   Index Cond: (buyer_id = 891285)
(2 rows)

2. EXPLAIN SELECT "buyer_suppliers".* FROM "buyer_suppliers" WHERE (supplier_id = 891285);
                                             QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on buyer_suppliers  (cost=2309.31..32194.90 rows=99727 width=56)
   Recheck Cond: (supplier_id = 891285)
   ->  Bitmap Index Scan on index_buyer_suppliers_on_supplier_id_and_buyer_id  (cost=0.00..2284.38 rows=99727 width=0)
         Index Cond: (supplier_id = 891285)
(4 rows)

# The EXPLAINs with (ANALYZE, BUFFERS)
1. EXPLAIN (ANALYZE, BUFFERS) SELECT "buyer_suppliers".* FROM "buyer_suppliers" WHERE (buyer_id = 891285);
                                              QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using index_buyer_suppliers_on_buyer_id on buyer_suppliers  (cost=0.43..54.48 rows=15 width=56) (actual time=0.042..0.042 rows=0 loops=1)
   Index Cond: (buyer_id = 891285)
   Buffers: shared hit=3
 Planning Time: 0.103 ms
 Execution Time: 0.057 ms
(5 rows)

Time: 0.399 ms

2. EXPLAIN (ANALYZE, BUFFERS) SELECT "buyer_suppliers".* FROM "buyer_suppliers" WHERE (supplier_id = 891285);
                                             QUERY PLAN                                                                              
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on buyer_suppliers  (cost=2309.31..32194.90 rows=99727 width=56) (actual time=8.584..32.907 rows=99205 loops=1)
   Recheck Cond: (supplier_id = 891285)
   Heap Blocks: exact=9016
   Buffers: shared hit=9399
   ->  Bitmap Index Scan on index_buyer_suppliers_on_supplier_id_and_buyer_id  (cost=0.00..2284.38 rows=99727 width=0) (actual time=7.316..7.317 rows=99205 loops=1)
         Index Cond: (supplier_id = 891285)
         Buffers: shared hit=383
 Planning Time: 0.107 ms
 Execution Time: 39.306 ms
(9 rows)

Time: 39.806 ms

我一直认为只使用组合索引的第一部分是完全可以的,但是在这里我们看到这样做会导致性能的大幅度下降。为什么会发生这种情况?这是否意味着推荐的索引应该是以下四个:1):buyer_id,2):supplier_id,3)[:buyer_id, :supplier_id],4)[:supplier_id, :buyer_id]?这看起来有点矫枉过正,但基于上述情况,这可能是必要的吗?

nukf8bse

nukf8bse1#

返回99205的索引扫描比不返回任何行的索引扫描花费更多的时间,这并不奇怪。40毫秒似乎并不慢。因此,这里没有证据表明使用多列索引比单列索引慢得多。
为了验证这一点,你可以只在supplier_id上创建一个索引,看看它是否慢得多。任何速度增益都必须在扫描实际索引的前7毫秒内。较小的索引会稍微快一些,PostgreSQL会更喜欢它,但不要期望平均速度增益超过几毫秒。

相关问题