How to get the intersection of two JSON arrays using jq

Posted by gmxoilav on 2023-08-08 in Other

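Given arrays X and Y (preferably both as input, but otherwise with one as input and the other hard-coded), how can I use jq to output an array containing all the elements common to both? For example, what should f be so that the command below outputs [2,4]?

echo '[1,2,3,4]' | jq 'f([2,4,6,8,10])'

I tried the following: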

map(select(in([2,4,6,8,10])))  --> outputs [1,2,3,4]
select(map(in([2,4,6,8,10])))  --> outputs [1,2,3,4,5]

Source: https://stackoverflow.com/questions/38364458/how-to-get-the-intersection-of-two-json-arrays-using-jq

4 Answers

j2datikz1#

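Note: this solution assumes array1 contains no duplicates.

Simple explanation

The complexity of all these answers obscures the principle. That is unfortunate, because the principle is simple:

array1 minus array2 returns:
whatever is left in array1
after removing everything that is in array2
(and discarding the rest of array2)

Simple demo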

# From array1, subtract array2, leaving the remainder
$ jq --null-input '[1,2,3,4] - [2,4,6,8]'
[
  1,
  3
]

# Subtract the remainder from the original
$ jq --null-input '[1,2,3,4] - [1,3]'
[
  2,
  4
]

# Put it all together
$ jq --null-input '[1,2,3,4] - ([1,2,3,4] - [2,4,6,8])'
[
  2,
  4
]
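
Applied to the command from the question, the same subtraction can be wrapped in a def. This is a minimal sketch rather than part of the original demo: the name f is taken from the question, and the second command only illustrates the no-duplicates caveat noted at the top of this answer.

```shell
# f(y): keep the elements of the input array that also occur in y
out=$(echo '[1,2,3,4]' | jq -c 'def f(y): . - (. - y); f([2,4,6,8,10])')
echo "$out"    # [2,4]

# Duplicates in the first array survive, so the result is not a set
dup=$(jq -n -c '[2,2,3] - ([2,2,3] - [2])')
echo "$dup"    # [2,2]
```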

`comm` demo

def comm:
  (.[0] - (.[0] - .[1])) as $d |
    [.[0]-$d, .[1]-$d, $d]
;

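With this understanding, I was able to imitate the behavior of the *nix comm command, which is documented as follows:
With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.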

$ echo 'def comm: (.[0]-(.[0]-.[1])) as $d | [.[0]-$d,.[1]-$d, $d];' > comm.jq
$ echo '{"a":101, "b":102, "c":103, "d":104}'                        > 1.json
$ echo '{         "b":202,          "d":204, "f":206, "h":208}'      > 2.json

$ jq --slurp '.' 1.json 2.json
[
  {
    "a": 101,
    "b": 102,
    "c": 103,
    "d": 104
  },
  {
    "b": 202,
    "d": 204,
    "f": 206,
    "h": 208
  }
]

$ jq --slurp '[.[] | keys | sort]' 1.json 2.json
[
  [
    "a",
    "b",
    "c",
    "d"
  ],
  [
    "b",
    "d",
    "f",
    "h"
  ]
]

$ jq --slurp 'include "comm"; [.[] | keys | sort] | comm' 1.json 2.json
[
  [
    "a",
    "c"
  ],
  [
    "f",
    "h"
  ],
  [
    "b",
    "d"
  ]
]

$ jq --slurp 'include "comm"; [.[] | keys | sort] | comm[2]' 1.json 2.json
[
  "b",
  "d"
]
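
For comparison, here is a sketch of the same comm def applied directly to the two arrays from the question. The def is inlined rather than included, which is a choice made for this example so that it does not depend on a comm.jq file being found.

```shell
# comm defined inline, so no comm.jq file or module search path is needed
out=$(jq -n -c 'def comm: (.[0] - (.[0] - .[1])) as $d | [.[0]-$d, .[1]-$d, $d];
  [[1,2,3,4], [2,4,6,8,10]] | comm')
echo "$out"    # [[1,3],[6,8,10],[2,4]]
```

The third element is the intersection; the first two are the elements unique to each array.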

3df52oht2#

A simple and fairly fast (but somewhat naive) filter that should do essentially what you want can be defined as follows:

# x and y are arrays
def intersection(x;y):
  ( (x|unique) + (y|unique) | sort) as $sorted
  | reduce range(1; $sorted|length) as $i
      ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;

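If x is provided as input on STDIN and y is provided in some other way (e.g. def y: ...), then you could use it as: intersection(.;y)
Other ways to provide two distinct arrays as input include:

using the --slurp option
using --arg a v (or --argjson a v, if available in your jq); note that --arg passes v as a string, which has to be decoded with fromjson, whereas --argjson passes it as parsed JSON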
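
The three ways of supplying the arrays could look roughly like this. It is a sketch, assuming the intersection(x;y) def above is kept in a shell variable; the variable names defs, s, j and g are made up for the example.

```shell
# intersection(x;y) as defined above, kept in a shell variable for reuse
defs='def intersection(x;y):
  ( (x|unique) + (y|unique) | sort) as $sorted
  | reduce range(1; $sorted|length) as $i
      ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;'

# 1. --slurp: both arrays arrive on STDIN and are wrapped in one outer array
s=$(echo '[1,2,3,4] [2,4,6,8,10]' | jq -c --slurp "$defs"' intersection(.[0]; .[1])')

# 2. --argjson: the second array is passed as parsed JSON in $a
j=$(echo '[1,2,3,4]' | jq -c --argjson a '[2,4,6,8,10]' "$defs"' intersection(.; $a)')

# 3. --arg: the second array is passed as a string and decoded with fromjson
g=$(echo '[1,2,3,4]' | jq -c --arg a '[2,4,6,8,10]' "$defs"' intersection(.; $a | fromjson)')

echo "$s $j $g"    # [2,4] [2,4] [2,4]
```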

Here is a simpler def. It is slower, but still quite fast in practice:

def i(x;y):
       if (y|length) == 0 then []
       else (x|unique) as $x
       | $x - ($x - y)
       end ;
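
A quick way to try the def above from the command line, as a sketch. The input arrays are made up for the example, with a duplicate in x to show that it is removed by unique.

```shell
# i(x;y) as defined above, called with literal arrays
out=$(jq -n -c 'def i(x;y):
    if (y|length) == 0 then []
    else (x|unique) as $x
    | $x - ($x - y)
    end ;
  i([1,2,2,3,4]; [2,4,6,8,10])')
echo "$out"    # [2,4]
```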

Here is a standalone filter for finding the intersection of arbitrarily many arrays:

# Input: an array of arrays
def intersection:
  def i(y): ((unique + (y|unique)) | sort) as $sorted
  | reduce range(1; $sorted|length) as $i
       ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
  reduce .[1:][] as $a (.[0]; i($a)) ;

Examples:

[ [1,2,4], [2,4,5], [4,5,6]] #=> [4]
[[]]                         #=> []
[]                           #=> null
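
The three examples can be reproduced with something like the following sketch, which assumes the def is stored in a shell variable (the names defs and out are only for illustration).

```shell
# intersection for an array of arrays, as defined above
defs='def intersection:
  def i(y): ((unique + (y|unique)) | sort) as $sorted
  | reduce range(1; $sorted|length) as $i
       ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
  reduce .[1:][] as $a (.[0]; i($a)) ;'

# Run the filter once per example input, one result per line
out=$(for input in '[[1,2,4],[2,4,5],[4,5,6]]' '[[]]' '[]'; do
  echo "$input" | jq -c "$defs intersection"
done)
echo "$out"
# [4]
# []
# null
```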

Of course, if x and y are known to be sorted and/or unique, more efficient solutions are possible. See in particular "Finite Sets of JSON Entities".
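
As one illustration of what sorted input allows, here is a sketch based on binary search. It is not taken from the reference mentioned above; it assumes y is already sorted in jq's sort order and that your jq provides the bsearch builtin. The name isect_sorted is hypothetical.

```shell
# isect_sorted is a hypothetical name; y must already be sorted in jq's order.
# bsearch returns a non-negative index when the element is found.
out=$(jq -n -c 'def isect_sorted(x; y):
    y as $y
    | [x[] | select(. as $e | ($y | bsearch($e)) >= 0)];
  isect_sorted([1,2,3,4]; [2,4,6,8,10])')
echo "$out"    # [2,4]
```

Each element of x costs one binary search in y instead of a scan, and the order of x is preserved in the result.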

rggaifut3#

$ echo '[1,2,3,4] [2,4,6,8,10]' | jq --slurp '[.[0][] as $x | .[1][] | select($x == .)]'
[
  2,
  4
]
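
One thing to be aware of with this nested iteration: every matching pair is emitted, so duplicates in the inputs multiply in the output. The sketch below uses made-up inputs to show this, and appends unique as one possible way to get a set back.

```shell
# Each 2 in the first array matches each 2 in the second: 2 x 2 = 4 results
raw=$(echo '[1,2,2] [2,2]' | jq -c --slurp '[.[0][] as $x | .[1][] | select($x == .)]')
echo "$raw"      # [2,2,2,2]

# Appending unique collapses the repeats
dedup=$(echo '[1,2,2] [2,2]' | jq -c --slurp '[.[0][] as $x | .[1][] | select($x == .)] | unique')
echo "$dedup"    # [2]
```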

eoxn13cs4#

Here is a solution that works by using foreach to count the occurrences of each element across the arrays:

[
  foreach ($X[], $Y[]) as $r (
    {}
  ; .[$r|tostring] += 1
  ; if .[$r|tostring] == 2 then $r else empty end
  )
]

If this filter is in filter.jq, then

jq -M -n -c --argjson X '[1,2,3,4]' --argjson Y '[2,4,6,8,10]' -f filter.jq

will produce

[2,4]

It assumes there are no duplicates in the initial arrays. If that is not the case, it is easy to compensate with unique. For example:

[
  foreach (($X|unique)[], ($Y|unique)[]) as $r (
    {}
  ; .[$r|tostring] += 1
  ; if .[$r|tostring] == 2 then $r else empty end
  )
]
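
A further caveat, shown as a sketch with made-up inputs: tostring leaves strings unchanged, so the number 1 and the string "1" end up under the same key and are counted together. Keying on tojson instead is one possible variant; it is an assumption of this example, not part of the answer above.

```shell
# tostring maps 1 and "1" to the same key, so they are counted together
a=$(jq -n -c --argjson X '[1]' --argjson Y '["1"]' '
  [foreach (($X|unique)[], ($Y|unique)[]) as $r
     ({}; .[$r|tostring] += 1; if .[$r|tostring] == 2 then $r else empty end)]')
echo "$a"    # ["1"]

# Variant (an assumption, not from the answer): key on tojson instead
b=$(jq -n -c --argjson X '[1]' --argjson Y '["1"]' '
  [foreach (($X|unique)[], ($Y|unique)[]) as $r
     ({}; .[$r|tojson] += 1; if .[$r|tojson] == 2 then $r else empty end)]')
echo "$b"    # []
```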
