postgresql 检查一个Postgres数组是否包含一个有序的子数组

7d7tgy0s 于 12个月前发布在 PostgreSQL

关注(0)|答案(2)|浏览(163)

我需要编写一个SQL（或SQL函数）来检查Postgres项目是否包含项目的切片。我将此项目称为'Item'。所以基本上'Item'看起来像这样：
x1c 0d1x的数据
e1的大小始终与e2相同。
现在有两个数组，我就叫它们a1和a2，看起来像这样。
“{B，c，d}”、“{2，3，4}”。
e2（a2）中的值必须与e1（a1）中的索引完全匹配，因此在本例中，表中的第1和第4项都将匹配，但第2或第3项不匹配。因此，如果2个数组是'{c，d}'，'{3，4}'，则第1，第3和第4项将匹配。
我不知道如何做到这一点，我需要使用类似generate_series（）的东西来生成所有可能的项目切片，然后检查吗？我现在有点困惑。

postgresql

来源：https://stackoverflow.com/questions/46636923/check-if-a-postgres-array-contains-a-subarray-in-order

2条答案

按热度按时间

idv4meu81#

该函数返回子数组sub的数组arr中的索引，如果arr不包含sub，则返回0：

create or replace function index_of_subarray(arr anyarray, sub anyarray)
returns integer language plpgsql immutable as $$
begin
    for i in 1 .. cardinality(arr)- cardinality(sub)+ 1 loop
        if arr[i:i+ cardinality(sub)- 1] = sub then
            return i;
        end if;
    end loop;
    return 0;
end $$;

字符串
用途：

with my_table(e1, e2) as (
values
    ('{a,b,c,d}'::text[], '{1,2,3,4}'::int[]),
    ('{b,c,d,a}', '{1,2,3,4}'),
    ('{c,d}', '{3,4}'),
    ('{b,c,d}', '{2,3,4}')
)

select e1, e2
from my_table
where index_of_subarray(e1, '{b,c,d}') > 0 
and index_of_subarray(e1, '{b,c,d}') = index_of_subarray(e2, '{2,3,4}')

    e1     |    e2     
-----------+-----------
 {a,b,c,d} | {1,2,3,4}
 {b,c,d}   | {2,3,4}
(2 rows)

型

赞(0）回复(0）举报 12个月前

bq9c1y662#

根据具体情况，您可以从数组中创建一个字符串，并使用regex/like查询它。
这听起来像是一个天真的方法，结果比接受答案的plpgsql函数更快。至少在以下情况下：

drop table if exists temp;
create table temp as
  SELECT array[
    chr((floor(random()*26) + 65)::int),
    chr((floor(random()*26) + 65)::int),
    chr((floor(random()*26) + 65)::int),
    chr((floor(random()*26) + 65)::int),
    chr((floor(random()*26) + 65)::int),
    chr((floor(random()*26) + 65)::int)
    ] as a FROM GENERATE_SERIES(1, 1000000);

字符串
用Like查询

\timing

select count(*) from temp
where array_to_string(a, '>>') like '%A>>B%';
 count
-------
  7505
(1 row)

Time: 270,404 ms

型
查询函数

select count(*) from temp
where index_of_subarray (a, array['A', 'B']) != 0;
 count
-------
  7505
(1 row)

Time: 1999,002 ms (00:01,999)

型
当您需要匹配通配符时，性能上的差异变得更加明显：

select count(*) from temp
where array_to_string(a, '>>') like '%A>>B%>>C>>D%';
 count
-------
     7
(1 row)

Time: 173,343 ms

型
与功能

select count(*) from temp
where index_of_subarray (a, array['A', 'B']) != 0
and index_of_subarray (a, array['A', 'B']) < index_of_subarray (a, array['C', 'D']);
 count
-------
     7
(1 row)

Time: 1999,791 ms (00:02,000)

型
我对表现上的差异感到惊讶。
也许是因为plpgsql比Regex的原生实现慢得多，后者很可能是非常优化的。
也许在所创建的测试情况中有一些特殊的东西导致了性能上的优势。
此外，为了使字符串化解决方案可靠，您需要为array_to_string提供一个不在数组元素中出现的字符串。在某些情况下，这可能不可行。
但除此之外，你可以考虑从数组中创建一个字符串，并用like/regex查询它。

赞(0）回复(0）举报 12个月前

我来回答

postgresql 检查一个Postgres数组是否包含一个有序的子数组

2条答案

相关问题

热门标签

最新问答