根据bigquery上的时间戳聚合变量

3qpi33ja  于 2021-07-29  发布在  Java
关注(0)|答案(2)|浏览(287)

我计划为每个用户计算一天中最频繁的部分。在这种情况下,我首先将时间戳编码为天中的部分,然后将其聚合为天中频率最高的部分。我使用数组\u agg来计算模式()。但是,我不知道如何用数组\u agg处理时间戳,因为有错误,所以我的代码结构可能是错误的

SELECT  User_ID, time,
ARRAY_AGG(Time ORDER BY cnt DESC LIMIT 1)[OFFSET(0)] part_of_day,

case
  when  time BETWEEN '04:00:00' AND '12:00:00' 
  then  "morning"
  when  time < '04:00:00' OR time > '20:00:00' 
  then  "night" 
end AS part_of_day 

FROM (
      SELECT User_ID, 
        TIME_TRUNC(TIME(Request_Timestamp), SECOND) AS Time
        COUNT(*) AS cnt

收到错误:

Syntax error: Expected ")" but got identifier "COUNT" at [19:9]
htzpubme

htzpubme1#

即使您没有共享任何示例数据,我也能够识别代码中的一些问题。
我使用了一些基于代码中使用的格式和函数创建的示例数据来保持一致性。下面是代码,没有任何错误:

WITH data AS (
SELECT 98 as User_ID,DATETIME "2008-12-25 05:30:00.000000" AS Request_Timestamp, "something!" AS channel UNION ALL
SELECT 99 as User_ID,DATETIME "2008-12-25 22:30:00.000000" AS Request_Timestamp, "something!" AS channel
)
SELECT  User_ID, time,
ARRAY_AGG(Time ORDER BY cnt DESC LIMIT 1)[OFFSET(0)] part_of_day1,

case
  when  time BETWEEN '04:00:00' AND '12:00:00' 
  then  "morning"
  when  time < '04:00:00' OR time > '20:00:00' 
  then  "night" 
end AS part_of_day 

FROM (
      SELECT User_ID,
        TIME_TRUNC(TIME(Request_Timestamp), SECOND) AS time,
        COUNT(*) AS cnt
      FROM data
   GROUP BY User_ID, Channel, Request_Timestamp
   #order by Request_Timestamp

    )
    GROUP BY User_ID, Time;

首先,请注意,我已经更改了您的 ARRAY_AGG() 方法,因为它将导致错误“重复的列名”。第二,在你 TIME_TRUNC() 函数,它缺少一个逗号,因此您可以选择 COUNT(*) . 那么,在你的 GROUP BY ,您需要分组 Request_Timestamp 因为它既没有聚合也没有分组。最后,在上一个groupby中,您需要聚合或分组 time . 因此,在这些更正之后,您的代码将在没有任何错误的情况下执行。
注:以下为 Syntax error: Expected ")" but got identifier "COUNT" at [19:9] 您遇到的错误是由于缺少逗号。其他的将在更正此项后显示。

5lhxktic

5lhxktic2#

如果您需要每天最频繁的部分,则需要在聚合中使用“日”部分:

SELECT User_ID,
       ARRAY_AGG(part_of_day ORDER BY cnt DESC LIMIT 1)[OFFSET(0)] part_of_day
FROM (SELECT User_ID, 
             (case when time BETWEEN '04:00:00' AND '12:00:00' then 'morning'
                   when time < '04:00:00' OR time > '20:00:00' then 'night'
              end) AS part_of_day 
             COUNT(*) AS cnt
      FROM cognitivebot2.chitchaxETL.conversations 
      GROUP BY User_ID, part_of_day
     ) u
GROUP BY User_ID;

显然,如果您也需要通道,那么您需要在查询中包含该通道。

相关问题