sql—如何在按不同记录分组时以15分钟的时间间隔对记录进行计数

baubqpgj  于 2021-07-24  发布在  Java
关注(0)|答案(1)|浏览(840)

我试图从时间戳字段计算15分钟时间间隔内的记录数。我基本上是想计算每个用户id的15分钟时间间隔的数量。我是teradata的新手,所以我可能没有充分利用它的潜力,而是在寻找任何类型的指导。
当前表数据

期望输出

失败的当前sql

SELECT
 user_id,
 CAST(CREATION_DATE as date format 'YYYY-MM-DD') as contact_date,
 CAST(CREATION_DATE as TIME) contact_time, 
 HOUR(contact_time) AS h, 
 MINUTE(contact_time)-(MINUTE(contact_time) MOD 15) AS m, 
 COUNT(*) contact_count
FROM 
table1
zqdjd7g9

zqdjd7g91#

也许有一种更巧妙的方法来解决这个问题,比如16.10(我认为)中引入了一些新的时态逻辑,但这会起到作用:

SELECT user_id, COUNT(creation_period)
FROM
(
   SELECT NORMALIZE
      user_id,
       CASE 
         WHEN PERIOD(creation_date, NEXT(creation_date + INTERVAL '15' MINUTE)) CONTAINS LEAD(creation_date) OVER (PARTITION BY user_id ORDER BY creation_date)
            THEN PERIOD(creation_date, NEXT(LEAD(creation_date) OVER (PARTITION BY user_id ORDER BY creation_date)))
         WHEN PERIOD(creation_date - INTERVAL '15' MINUTE, NEXT(CREATION_DATE)) CONTAINS LAG(creation_date) OVER (PARTITION BY user_id ORDER BY creation_date)
            THEN PERIOD(LAG(creation_date) OVER (PARTITION BY user_id ORDER BY creation_date), NEXT(creation_date))
         ELSE
            PERIOD(creation_date, NEXT(creation_date))
         END AS creation_period
   FROM table1
) dt
GROUP BY 1;

基本上,这是做的是缝合在一起的 Period 通过查找15分钟间隔内的相邻记录来确定数据类型。那么我们 Normalize 记录每个不同的 user_id 制造周期相互交叉或重叠的地方。然后我们只计算剩余的不同时期。
输入:

CREATE MULTISET VOLATILE TABLE table1
(
   user_id VARCHAR(30),
   creation_date TIMESTAMP(0)
) ON COMMIT PRESERVE ROWS;
INSERT INTO table1 VALUES ('user', CURRENT_TIMESTAMP(0));
INSERT INTO table1 VALUES ('user', CURRENT_TIMESTAMP(0) + INTERVAL '10' MINUTE);
INSERT INTO table1 VALUES ('user', CURRENT_TIMESTAMP(0) + INTERVAL '1' DAY);
INSERT INTO table1 VALUES ('user', CURRENT_TIMESTAMP(0) + INTERVAL '1' DAY + INTERVAL '15' MINUTE);
INSERT INTO table1 VALUES ('user', CURRENT_TIMESTAMP(0) + INTERVAL '2' DAY);
INSERT INTO table1 VALUES ('user', CURRENT_TIMESTAMP(0) + INTERVAL '3' DAY);
INSERT INTO table1 VALUES ('user', CURRENT_TIMESTAMP(0) + INTERVAL '3' DAY + INTERVAL '3' MINUTE);
INSERT INTO table1 VALUES ('user' || '2', CURRENT_TIMESTAMP(0));
INSERT INTO table1 VALUES ('user' || '2', CURRENT_TIMESTAMP(0) + INTERVAL '10' MINUTE);
INSERT INTO table1 VALUES ('user' || '2', CURRENT_TIMESTAMP(0) + INTERVAL '11' MINUTE);
INSERT INTO table1 VALUES ('user' || '2', CURRENT_TIMESTAMP(0) + INTERVAL '1' DAY + INTERVAL '15' MINUTE);
INSERT INTO table1 VALUES ('user' || '2', CURRENT_TIMESTAMP(0) + INTERVAL '1' DAY + INTERVAL '20' MINUTE);
INSERT INTO table1 VALUES ('user' || '2', CURRENT_TIMESTAMP(0) + INTERVAL '3' MINUTE);
INSERT INTO table1 VALUES ('user' || '2', CURRENT_TIMESTAMP(0) + INTERVAL '3' DAY + INTERVAL '3' MINUTE);

输出:

+---------+------------------------+
| user_id | Count(creation_period) |
+---------+------------------------+
| user    |                      4 |
| user2   |                      3 |
+---------+------------------------+

相关问题