如何计算在mysql中用户被标记为橙色的连续天数?

ercv8c1e  于 2021-08-09  发布在  Java
关注(0)|答案(2)|浏览(419)

我想知道如何计算到今天为止连续有多少天一个用户被标记为橙色。我有以下几点

CREATE TABLE `survey_daily` (
  `id` int(11) NOT NULL,
  `user_id` varchar(30) NOT NULL,
  `color` varchar(10) NOT NULL,
  `timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO `survey_daily` (`id`, `user_id`, `color`, `timestamp`) VALUES
(1, '3236', "ORANGE", '2020-05-12 02:40:59'),
(2, '3236', "WHITE", '2020-05-13 02:40:59'),
(3, '3236', "ORANGE", '2020-05-14 02:40:59'),
(4, '3236', "ORANGE", '2020-05-15 02:40:59'),
(5, '3237', "ORANGE", '2020-05-15 02:40:59'),
(6, '3237', "ORANGE", '2020-05-16 02:40:59'),
(7, '3236', "ORANGE", '2020-05-16 02:40:59');

小提琴:http://sqlfiddle.com/#!9/40cb26/1。
基本上,我有多个用户在一个表上,我想计算有多少连续几天的用户被标记为橙色。
在我的示例中,用户id 3236应该连续3天被标记为橙色,而用户3237应该有2天被标记为橙色直到今天。如果它们今天都没有记录,它将返回0。

谢谢您

jutyujz0

jutyujz01#

SELECT t1.user_id, MAX(1 + DATEDIFF(t2.`timestamp`, t1.`timestamp`)) max_delta
FROM survey_daily t1
JOIN survey_daily t2 ON t1.user_id = t2.user_id
WHERE t1.color = 'ORANGE'
  AND t2.color = 'ORANGE'
  AND t1.`timestamp` <= t2.`timestamp`
  AND NOT EXISTS ( SELECT NULL
                   FROM survey_daily t3
                   WHERE t1.user_id = t3.user_id
                     AND t3.color != 'ORANGE'
                     AND t1.`timestamp` < t3.`timestamp`
                     AND t3.`timestamp` < t2.`timestamp` )
GROUP BY t1.user_id;

逻辑。为用户获取所有记录对,其中两个记录的颜色都是橙色,并且它们之间不存在具有其他颜色的记录。计算每对中的距离(以天为单位)。获得最大间隙值。
fiddle(多亏gmb提供了一个fiddle,从中获取源数据脚本)。
如果某个用户没有橙色的记录,则不会返回该用户。如果你也需要这样的用户,那就得到一份 survey_daily 表,将我的查询作为子查询 user_id ,然后从表中获取用户,并从子查询中获取连续天数(使用coalesce函数 Package 它,以便将null值转换为零)。

jm81lzqq

jm81lzqq2#

这是一个缺口和孤岛问题。如果您运行的是mysql 8.0,一种方法是使用 row_numbers() 要在用户具有相同颜色的位置构建连续记录组,请执行以下操作:

select 
    user_id, 
    count(*) no_records, 
    min(timestamp) start_timestamp, 
    max(timestamp) max_timestamp
from (
    select 
        s.*,
        row_number() over(partition by user_id order by timestamp) rn1,
        row_number() over(partition by user_id, color order by timestamp) rn2
    from survey_daily s
) t
where color = 'orange'
group by user_id, rn1 - rn2
order by user_id, start_timestamp

这将为每个用户的一系列相邻橙色记录生成一条记录:

user_id | no_records | start_timestamp     | max_timestamp      
:------ | ---------: | :------------------ | :------------------
3236    |          1 | 2020-05-12 02:40:59 | 2020-05-12 02:40:59
3236    |          3 | 2020-05-14 02:40:59 | 2020-05-16 02:40:59
3237    |          2 | 2020-05-15 02:40:59 | 2020-05-16 02:40:59

如果您只想获得每个用户的最长条纹,可以在此基础上使用聚合,或者再次使用窗口函数:

select *
from (
    select 
        user_id, 
        count(*) no_records, 
        min(timestamp) start_timestamp, 
        max(timestamp) max_timestamp,
        row_number() over(partition by user_id order by count(*) desc) rn
    from (
        select 
            s.*,
            row_number() over(partition by user_id order by timestamp) rn1,
            row_number() over(partition by user_id, color order by timestamp) rn2
        from survey_daily s
    ) t
    where color = 'ORANGE'
    group by user_id, rn1 - rn2
) t
where rn = 1
order by user_id, start_timestamp
user_id | no_records | start_timestamp     | max_timestamp       | rn
:------ | ---------: | :------------------ | :------------------ | -:
3236    |          3 | 2020-05-14 02:40:59 | 2020-05-16 02:40:59 |  1
3237    |          2 | 2020-05-15 02:40:59 | 2020-05-16 02:40:59 |  1

db小提琴演示

相关问题