Compare two arrays and count the number of matching strings

zfycwa2u  posted on 2021-07-26  in Java

I want to compare the arrays held in two columns and return, in a third column, the number of strings that appear in both arrays.

|---------------------|------------------|------------------|
|      column 1       |     column 2     |     column 3     |
|---------------------|------------------|------------------|
|   [cat, dog, bird]  |   [cat, bird]    |          2       |
|---------------------|------------------|------------------|
|  [cat, bear, tiger] |      [tiger]     |          1       |
|---------------------|------------------|------------------|
|    [cat, tiger]     |  [tiger, cat]    |          2       |
|---------------------|------------------|------------------|
ht4b089n  1#

You can use unnest(). This assumes that no single array contains duplicates:

with t as (
      select array['cat', 'dog', 'bird'] as column1, array['cat', 'bird'] as column2 union all
      select array['cat', 'bear', 'tiger'], array['tiger'] union all
      select array['cat', 'tiger'], array['tiger', 'cat']
     )
select t.*,
       (select count(*)
        from unnest(column1) el1 join
             unnest(column2) el2
             on el1 = el2
       ) as column3
from t;
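
The no-duplicates assumption matters because the join counts matching pairs, not distinct strings. The following is only a rough Python model of that subquery's semantics, not BigQuery code; the function name `join_count` and the extra duplicate row are hypothetical and used purely for illustration.

```python
def join_count(column1, column2):
    """Count (el1, el2) pairs with el1 == el2, like the unnest/join subquery."""
    return sum(1 for el1 in column1 for el2 in column2 if el1 == el2)


# The three rows from the question.
rows = [
    (["cat", "dog", "bird"], ["cat", "bird"]),
    (["cat", "bear", "tiger"], ["tiger"]),
    (["cat", "tiger"], ["tiger", "cat"]),
]
counts = [join_count(a, b) for a, b in rows]
print(counts)  # [2, 1, 2]

# Hypothetical row with a duplicate inside one array: the join produces
# two matching pairs even though only one distinct string is shared.
print(join_count(["cat", "cat"], ["cat"]))  # 2
```

If duplicates can occur, counting distinct matched values instead of matched pairs would avoid the overcount.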
kyxcudwk  2#

Assuming the arrays contain no duplicates, here is another option:


#standardSQL
SELECT *, 
  ARRAY_LENGTH(ARRAY(
    SELECT item
    FROM UNNEST(column1 || column2) AS item
    GROUP BY item
    HAVING COUNT(1) > 1
  )) AS column3
FROM `project.dataset.table`

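What the query above does: it concatenates the two arrays, groups the elements and discards every element that occurs only once, which leaves the distinct list of repeated elements, and finally counts the elements in the resulting array.
Alternatively, I think the simplest and most direct approach is: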


#standardSQL
SELECT *, 
  (SELECT COUNT(1) FROM (
    SELECT * FROM t.column1 INTERSECT DISTINCT 
    SELECT * FROM t.column2
  )) AS column3
FROM `project.dataset.table` t

I don't think the last version needs any explanation.
Both versions above return the expected output:

Row column1             column2         column3  
1   [cat, dog, bird]    [cat, bird]     2    
2   [cat, bear, tiger]  [tiger]         1    
3   [cat, tiger]        [tiger, cat]    2
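
To make the difference between the two versions concrete, here is a small Python sketch that models their logic. It is not BigQuery code: the function names `having_count` and `intersect_distinct_count` and the final duplicate row are assumptions made for illustration only.

```python
from collections import Counter


def having_count(column1, column2):
    """Model of GROUP BY item HAVING COUNT(1) > 1 over column1 || column2."""
    occurrences = Counter(column1 + column2)
    return sum(1 for n in occurrences.values() if n > 1)


def intersect_distinct_count(column1, column2):
    """Model of INTERSECT DISTINCT: distinct strings present in both arrays."""
    return len(set(column1) & set(column2))


# The three rows from the question.
rows = [
    (["cat", "dog", "bird"], ["cat", "bird"]),
    (["cat", "bear", "tiger"], ["tiger"]),
    (["cat", "tiger"], ["tiger", "cat"]),
]
print([having_count(a, b) for a, b in rows])              # [2, 1, 2]
print([intersect_distinct_count(a, b) for a, b in rows])  # [2, 1, 2]

# Hypothetical row with a duplicate inside a single array: the grouped
# version counts "cat" although the arrays share no string at all.
print(having_count(["cat", "cat"], ["dog"]))              # 1
print(intersect_distinct_count(["cat", "cat"], ["dog"]))  # 0
```

In this model the two versions agree on the sample rows, and only the intersection-based count stays correct once a single array may repeat a string.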
