sql—如何获取表中a行和b行之间的行

9nvpjoqh  于 2021-07-26  发布在  Java
关注(0)|答案(1)|浏览(364)

我有一个包含许多行的事件日志,如:


我想回去:

group                  | text
-----------------------------------------------------
ErB0zyuzRq6DstbP0UdMag | Received: 1
                       | Received: more text
                       | Received: and some more text

所以本质上我想把所有的 text 发生在 group 因为在事件的开始和结束之间可能会发生多个事件 ErB0zyuzRq6DstbP0UdMag .
我试过用数字来划分(在这个例子中是36/37),但我似乎无法获胜。任何帮助都将不胜感激。每个序列的开头总是标着36,结尾总是标着37,所以我肯定认为使用window会管用,但到目前为止运气不好。
另外,我想要的内容总是标为25。

ncgqoxb0

ncgqoxb01#

下面是bigquery标准sql


# standardSQL

SELECT grp, ARRAY_AGG(text ORDER BY ts) AS text
FROM (
  SELECT *,
    LAST_VALUE(IF(label=36, text, NULL) IGNORE NULLS) OVER(ORDER BY ts) AS grp
  FROM `project.dataset.table`
  WHERE label IN (36, 37, 25)
)
WHERE label = 25
GROUP BY grp

如果应用于样本/虚拟数据,如下例所示


# standardSQL

WITH `project.dataset.table` AS (
  SELECT 36 label, '2020-06-01 15:54:31.266711 UTC' ts, 'ErB0zyuzRq6DstbP0UdMag' text UNION ALL
  SELECT 25, '2020-06-01 15:55:31.266711 UTC', 'Received: 1' UNION ALL
  SELECT 25, '2020-06-01 15:56:31.266711 UTC', 'Received: more text' UNION ALL
  SELECT 26, '2020-06-01 15:57:31.266711 UTC', 'Received: more text - to ignore' UNION ALL
  SELECT 25, '2020-06-01 15:58:31.266711 UTC', 'Received: and some more text' UNION ALL
  SELECT 37, '2020-06-01 16:54:31.266711 UTC', 'ErB0zyuzRq6DstbP0UdMag' UNION ALL
  SELECT 36, '2020-06-01 16:55:31.266711 UTC', 'XrB0zyuzRq6DstbP0UdMag' UNION ALL
  SELECT 28, '2020-06-01 16:56:31.266711 UTC', 'Received: 2 - to ignore' UNION ALL
  SELECT 25, '2020-06-01 16:57:31.266711 UTC', 'Received: more text 2' UNION ALL
  SELECT 37, '2020-06-01 16:58:31.266711 UTC', 'XrB0zyuzRq6DstbP0UdMag'
) 
SELECT grp, ARRAY_AGG(text ORDER BY ts) AS text
FROM (
  SELECT *,
    LAST_VALUE(IF(label=36, text, NULL) IGNORE NULLS) OVER(ORDER BY ts) AS grp
  FROM `project.dataset.table`
  WHERE label IN (36, 37, 25)
)
WHERE label = 25
GROUP BY grp

结果是

相关问题