SQL Server Count the number of customers on a specific status per month\year? [duplicate]

0lvr5msh  于 12个月前  发布在  其他
关注(0)|答案(1)|浏览(117)

This question already has an answer here:

substitude zeros for missing months in mysql query (1 answer)
Closed last month.

I have a table that looks like this:

IDDateStatus
12020-10-01Silver
22021-01-01Gold
32020-11-01Bronze
12020-12-01Bronze
12021-03-01Platinum
22020-10-01Bronze
32021-04-01Diamond
22021-04-01Bronze
12020-04-01Gold

I need to count the number of customers per status per month, the thing is that the table only shows the entry when the status changed, but I would need to also account for the status for that particular customer on the months prior or after that. For example, ID 1 was Gold before(we can assume because it was the first entry) and on 04\2020, and remained gold until 10\2020 when status changed to Silver, then Silver until 12\2020 when it changed to Bronze and so forth... I would have to account for each customer for those months between the status changes that shows on the table, and I do not know how to do that.

I came up with this query, but it only accounts for the months where there was a status change, basically only the entries that shows on the table:

SELECT FORMAT(Date, 'yyyy-MM') AS Month_Year, Status, COUNT(ID) AS Count_Status
FROM Table1
GROUP BY FORMAT(Date , 'yyyy-MM'), Status

My output should be like:

Month_YearStatusCount_Status
2020\04Bronze11111
2020\04Silver...22222
2020\05Bronze11111
2020\05Silver...22222
2020\05Diamond33333

And so forth...

Can anyone point me to the right direction?

I also have another table that contains two columns, ID and Registration Date for each customer. How would be my approach to basically get the same output as above but this time account for only customer that registered on that particular month?

Thank you!

ac1kyiln

ac1kyiln1#

See example.
First we create calendar table - all month between min and max dates in table, or other date interval.
Then, expand status changes (status,from_date)->(status,from_date,to_date=next_date=next from_date).
Then JOIN Calendar table and expanded status changes table and get status for every Id for every possible months.

with recursive dates as(  -- calendar table
  select min(date) yy_mm_1,max(date)edt from test
  union all
  select 
--    date_add(yy_mm_1,interval 1 month) yy_mm_1  -- mySql
    dateadd(1,month,yy_mm_1) yy_mm_1            -- SQL Server
    ,edt 
  from dates
  where 
   --  date_add(yy_mm_1,interval 1 month)<=edt
     dateadd(1,month,yy_mm_1)<=edt

)
,status_changes as( -- status changes from date to next date
select * 
  ,lead(date)over(partition by id order by date) next_date
from test
)
,all_status as( -- JOIN Calendar and status_changes
select d.yy_mm_1,t1.* 
from dates d
left join status_changes t1
 on t1.date<=d.yy_mm_1 and (d.yy_mm_1<t1.next_date or t1.next_date is null)
)

-- group by year_month and status
select yy_mm_1,status,count(*) status_count
  -- ,group_concat(id order by id separator ',') ids -- MySql
  ,string_agg(id,',') within group(order by id)ids    -- SQL Server
from all_status
group by yy_mm_1,status
order by yy_mm_1

For your test data

create table test (ID int,Date date,Status varchar(10));
insert into test values
 (1,'2020-10-01','Silver')
,(2,'2021-01-01','Gold') 
,(3,'2020-11-01','Bronze') 
,(1,'2020-12-01','Bronze') 
,(1,'2021-03-01','Platinum') 
,(2,'2020-10-01','Bronze') 
,(3,'2021-04-01','Diamond') 
,(2,'2021-04-01','Bronze') 
,(1,'2020-04-01','Gold')
;

result is

yy_mm_1Statusstatus_countids
2020-04-01Gold11
2020-05-01Gold11
2020-06-01Gold11
2020-07-01Gold11
2020-08-01Gold11
2020-09-01Gold11
2020-10-01Bronze12
2020-10-01Silver11
2020-11-01Bronze22,3
2020-11-01Silver11
2020-12-01Bronze31,2,3
2021-01-01Bronze21,3
2021-01-01Gold12
2021-02-01Bronze21,3
2021-02-01Gold12
2021-03-01Bronze13
2021-03-01Gold12
2021-03-01Platinum11
2021-04-01Bronze12
2021-04-01Diamond13
2021-04-01Platinum11

For the selected report interval, you can modify the query as follows (MySql syntax)

with recursive dates as(
  select cast('2020-10-01' as date) yy_mm_1,cast('2021-02-01' as date)edt 
  union all
  select date_add(yy_mm_1,interval 1 month),edt from dates
  where date_add(yy_mm_1,interval 1 month)<=edt
)
,status_changes as(
select * 
from(
  select * 
     ,lead(date)over(partition by id order by date) next_date
  from test
  where date<=cast('2021-02-01' as date)
  )t
where next_date>=cast('2020-10-01' as date) or next_date is null
)
,all_status as(
select d.yy_mm_1,t1.* 
from dates d
left join status_changes t1
 on t1.date<=d.yy_mm_1 and (d.yy_mm_1<t1.next_date or t1.next_date is null)
)
select yy_mm_1,status,count(*) status_count
  ,group_concat(id order by id separator ',') ids
from all_status
group by yy_mm_1,status
order by yy_mm_1

There result is

yy_mm_1Statusstatus_countids
2020-10-01Bronze12
2020-10-01Silver11
2020-11-01Bronze22,3
2020-11-01Silver11
2020-12-01Bronze31,2,3
2021-01-01Bronze21,3
2021-01-01Gold12
2021-02-01Bronze21,3
2021-02-01Gold12

相关问题