我看到了数以百万计的线程根据加速postgresql查询与全文搜索。我试图做的一切,但没有更多的想法。
我有一个很大的表(目前有20 612 971条记录),用pgsql的全文搜索器搜索,然后按ts_rank_cd排序。我执行查询的时间达到了3500- 4000 ms。有什么想法可以让它更快吗?如果可能的话,我不想使用像sphinx或solr这样的外部软件。所以,首选原生postgresql解决方案:)
下面是我的表的描述和解释分析选择的例子.
# \d artifacts.item
Table "artifacts.item"
Column | Type | Modifiers
-------------------------+-----------------------------+-------------------------------------------------------------
add_timestamp | timestamp without time zone |
author_account_id | integer |
description | text |
id | integer | not null default nextval('artifacts.item_id_seq'::regclass)
removed_since_timestamp | timestamp without time zone |
slug | character varying(2044) | not null
thumb_height | integer |
thumb_path | character varying(2044) | default NULL::character varying
thumb_width | integer |
title | character varying(2044) | not null
search_data | tsvector |
tags | integer[] |
is_age_restricted | boolean | not null default false
is_on_homepage | boolean | not null default false
is_public | boolean | not null default false
thumb_filename | character varying(2044) |
is_removed | boolean | not null default false
Indexes:
"artifacts_item_add_timestamp_idx" btree (add_timestamp DESC NULLS LAST)
"artifacts_item_id_idx" btree (id)
"artifacts_item_is_on_homepage_add_timestamp" btree (is_on_homepage DESC, add_timestamp DESC NULLS LAST)
"artifacts_item_is_on_homepage_idx" btree (is_on_homepage)
"artifacts_item_search_results" gin (search_data) WHERE is_public IS TRUE AND is_removed IS FALSE
"artifacts_item_tags_gin_idx" gin (tags)
"artifacts_item_thumbs_list" btree (is_public, is_removed, id DESC)
"index1" btree (add_timestamp)
"itemIdx" btree (is_public, is_removed, is_age_restricted)
"item_author_account_id_idx" btree (author_account_id)
分析:
# explain analyze SELECT i.id,
# i.title,
# i.description,
# i.slug,
# i.thumb_path,
# i.thumb_filename,
# CONCAT(
# i.thumb_path,
# '/',
# i.thumb_filename
# ) AS thumb_url,
# (CASE WHEN i.thumb_width = 0 THEN 280 ELSE i.thumb_width END) as thumb_width,
# (CASE WHEN i.thumb_height = 0 THEN 280 ELSE i.thumb_height END) as thumb_height,
# (i.thumb_height > i.thumb_width) AS is_vertical,
# i.add_timestamp
# FROM artifacts.item AS i
# WHERE i.is_public IS true
# AND i.is_removed IS false
# AND (i.search_data @@ to_tsquery('public.polish', $$'lego'$$))
# ORDER BY ts_rank_cd(i.search_data, to_tsquery('public.polish', $$'lego'$$)) desc,
# ts_rank_cd(i.search_data, to_tsquery('public.polish', $$'lego'$$)) desc,
# i.add_timestamp DESC NULLS LAST
# LIMIT 60
# OFFSET 0;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=358061.78..358061.93 rows=60 width=315) (actual time=335.870..335.876 rows=60 loops=1)
-> Sort (cost=358061.78..358357.25 rows=118189 width=315) (actual time=335.868..335.868 rows=60 loops=1)
Sort Key: (ts_rank_cd(search_data, '''lego'' | ''lega'''::tsquery)), add_timestamp
Sort Method: top-N heapsort Memory: 55kB
-> Bitmap Heap Scan on item i (cost=2535.96..353980.19 rows=118189 width=315) (actual time=33.163..308.371 rows=62025 loops=1)
Recheck Cond: ((search_data @@ '''lego'' | ''lega'''::tsquery) AND (is_public IS TRUE) AND (is_removed IS FALSE))
-> Bitmap Index Scan on artifacts_item_search_results (cost=0.00..2506.42 rows=118189 width=0) (actual time=23.066..23.066 rows=62085 loops=1)
Index Cond: (search_data @@ '''lego'' | ''lega'''::tsquery)
Total runtime: 335.967 ms
(9 rows)
Time: 3444.731 ms
1条答案
按热度按时间fnx2tebb1#
有62025行符合条件,必须对它们进行排序...
这将需要一段时间。有没有可能你可以有整个数据库或至少在RAM中的索引?这将有所帮助。