sql—一周级别的用户保留

qvtsj1bj  于 2021-06-24  发布在  Hive
关注(0)|答案(2)|浏览(311)

我有如下所示的订阅数据。

+------+-----------------+-----------+-----------+----------+--------+
| user | subscription_id |   start   |    end    | wk_start | wk_end |
+------+-----------------+-----------+-----------+----------+--------+
|    1 | 1A              | 6/1/2019  | 6/30/2019 |       22 |     27 |
|    2 | 2A              | 6/1/2019  | 6/21/2019 |       22 |     25 |
|    3 | 3A              | 6/1/2019  | 6/21/2019 |       22 |     25 |
|    4 | 4A              | 6/1/2019  | 6/15/2019 |       22 |     24 |
|      |                 |           |           |          |        |
|    1 | 1B              | 7/4/2019  | 8/4/2019  |       27 |     32 |
|    2 | 2B              | 7/1/2019  | 7/31/2019 |       27 |     31 |
|    3 | 3B              | 6/24/2019 | 7/24/2019 |       26 |     30 |
+------+-----------------+-----------+-----------+----------+--------+

数据显示用户何时购买了订阅。是的 user_id,subscription_id,start date and end_date . 我想找出用户保留率。
我想看看有多少用户第一次购买订阅在一个特定的星期是活跃在未来几周。
它们可以在当前订阅或当前订阅到期后购买的新订阅上处于活动状态。
所需输出如下

+----------+-------------+----------------+--+-----------------------------------------------------------------------------+
| start_wk | Rolling_wk  | Retained Users |  |                  Active User(Not a part of desired output)                  |
+----------+-------------+----------------+--+-----------------------------------------------------------------------------+
|       22 |          22 |              4 |  | 1,2,3,4                                                                     |
|       22 |          23 |              4 |  | 1,2,3,4                                                                     |
|       22 |          24 |              4 |  | 1,2,3,4                                                                     |
|       22 |          25 |              3 |  | 1,2,3                                                                       |
|       22 |          26 |              2 |  | 1,3(with subscription_id = 3B)                                              |
|       22 |          27 |              3 |  | 1,2,3(1 is counted only once. He was active with subscription_id 1A and 1B) |
|       22 |          28 |              3 |  | 1,2,3                                                                       |
|       22 |          29 |              3 |  | 1,2,3                                                                       |
|       22 |          30 |              3 |  | 1,2,3                                                                       |
+----------+-------------+----------------+--+-----------------------------------------------------------------------------+

请注意 Active User 不是所需输出的一部分。这只是为了了解如何在列中的数字 Retained_User 获得。
我想要列 start_wk , Rolling_wk 以及 Retained Users 作为输出。
我将有一个巨大的数据,像这样为每个星期,并希望在类似的方式输出每周。在每种情况下 start_wk 会改变 rolling_wk 将从 start_wk ```
+----------+------------+----------------+
| start_wk | rolling_wk | Retained_users |
+----------+------------+----------------+
| 22 | 22 | 100 |
| 22 | 23 | 80 |
| 22 | 24 | 50 |
| 22 | …… | …… |
| 22 | ……. | ……. |
| 23 | 23 | 150 |
| 23 | 24 | 120 |
| 23 | 25 | 110 |
| 23 | 26 | 94 |
| 23 | …… | …… |
| 23 | ……. | ……. |
| 23 | ……. | ……. |
| 24 | 24 | 78 |
| 24 | 25 | 56 |
| 24 | 26 | 43 |
| 24 | ……. | ……. |
| 24 | ……. | ……. |
+----------+------------+----------------+

任何帮助都将不胜感激。
30byixjq

30byixjq1#

我要做一张table weeks 其中将有从1到56的条目作为“week”列,您也可以使用loop。基本上 weeks 表表示所有可能的周数。

select 
    w1.week, w2.week, count(s1.user) as Retained_Users
from 
    weeks w1, weeks2 w2, subscriptions s1 
where   
    w1.week <= w2.week and
    s1.wk_start <= ALL(
        select s2.wk_start 
        from subscriptions s2 
        where s2.user = s1.user 
    ) 
    and
    (   select true
        from subscriptions s3
        where s3.user = s1.user and
            s3.wk_start <= w2.week and 
            w2.week <= s3.wk_end
        limit 1)
group by w1.week, w2.week
guykilcj

guykilcj2#

您的查询应该在下面这个查询的某个位置,以获得拥有多个/单个订阅的订户的顺序和计数,以及他们正在进行的+即将进行的计划的最大有效期<到当前工作\u开始时间+行数(在oracle中为22+1,22+2…)/行号()(我猜是在sql server中)

Select wk_start, wk_start+rownum,
         (Select count(*) from table where      
         (wk_start+rownum)  <= All (Select 
           wk_max  from 
          (SELECT user, 
           count(*) as 
          "no_of_subscriptions", 
         max(wk_end) as wk_max
           from table 
             group by user) as Retained Users
           from table;

相关问题