I would like to count the number of objects (lightbulbs, scheduled jobs, customer billing status) that are in a specific status by date. However, the data is intermittent. Objects may or may not have a row on a specific date.
This is not a simple GROUP BY problem: the number of objects "in a specific status" on a given date depends on rows from earlier dates.
In the minimal example below, a GROUP BY for 03-01 gives "1 on and 1 off". On 03-02 only a single object has a row, so a GROUP BY for 03-02 gives "1 on and 0 off". The correct totals for 03-02 are "2 on and 1 off", because the objects whose latest row is on 03-01 must still be counted in their most recent status.
A minimal example of the problem:
SET NOCOUNT ON
-->>-- minimum problem to solve -- count by status for a specific day
IF OBJECT_ID('tempdb..#d') IS NOT NULL DROP TABLE #d
CREATE TABLE #d (ndx SMALLINT IDENTITY(1,1), id TINYINT, dt DATE, status CHAR(10) )
INSERT INTO #d (id, dt, status)
VALUES
( 1, '20230301' , 'on' )
, ( 3, '20230301' , 'off' )
, ( 2, '20230302' , 'on' )
, ( 3, '20230303' , 'off' )
, ( 3, '20230305' , 'on' )
, ( 1, '20230308' , 'off' )
, ( 2, '20230308' , 'off' )
, ( 1, '20230310' , 'off' )
, ( 2, '20230311' , 'off' )
, ( 1, '20230312' , 'off' )
, ( 3, '20230312' , 'off' )
, ( 2, '20230313' , 'on' )
, ( 1, '20230314' , 'on' )
, ( 3, '20230314' , 'off' )
, ( 3, '20230316' , 'off' )
, ( 2, '20230320' , 'on' )
, ( 1, '20230321' , 'off' )
SELECT * FROM #d d ORDER BY id, dt
IF OBJECT_ID('tempdb..#c') IS NOT NULL DROP TABLE #c
CREATE TABLE #c ( calendardt DATE )
INSERT INTO #c(calendardt)
VALUES
('2023-03-01'), ('2023-03-02'), ('2023-03-03'), ('2023-03-04'), ('2023-03-05')
, ('2023-03-06'), ('2023-03-07'), ('2023-03-08'), ('2023-03-09'), ('2023-03-10')
, ('2023-03-11'), ('2023-03-12'), ('2023-03-13'), ('2023-03-14'), ('2023-03-15')
, ('2023-03-16'), ('2023-03-17'), ('2023-03-18'), ('2023-03-19'), ('2023-03-20')
, ('2023-03-21'), ('2023-03-22'), ('2023-03-23'), ('2023-03-24'), ('2023-03-25')
SELECT * FROM #c UNION ALL SELECT * FROM #c ORDER BY calendardt
SELECT *
FROM #c c
LEFT JOIN #d d ON d.dt = c.calendardt
ORDER BY c.calendardt, d.id
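For contrast, this is roughly what the plain GROUP BY described above looks like. It is only a sketch to show the shortfall, and it assumes the #d table created above:

```sql
-- Naive per-day count: it only sees objects that have a row on that exact date,
-- so objects whose status was set on an earlier date are not counted.
SELECT d.dt, d.status, COUNT(*) AS [count]
FROM #d d
GROUP BY d.dt, d.status
ORDER BY d.dt, d.status DESC;
```

For 2023-03-02 this returns a single row (on, 1), because only id 2 has a row on that date.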
The expected result should look like this
-->>-- expected result
calendardt [status] [count]
2023-03-01 on 1
2023-03-01 off 1
2023-03-02 on 2
2023-03-02 off 1
2023-03-03 on 2
2023-03-03 off 1
2023-03-04 on 2
2023-03-04 off 1
2023-03-05 on 3
2023-03-05 off 0
2023-03-06 on 3
2023-03-06 off 0
2023-03-07 on 3
2023-03-07 off 0
2023-03-08 on 1
2023-03-08 off 2
2023-03-09 on 1
2023-03-09 off 2
2023-03-10 on 1
2023-03-10 off 2
2023-03-11 on 1
2023-03-11 off 2
2023-03-12 on 1
2023-03-12 off 2
2023-03-13 on 0
2023-03-13 off 3
2023-03-14 on 0
2023-03-14 off 3
2023-03-15 on 0
2023-03-15 off 3
2023-03-16 on 0
2023-03-16 off 3
I do not have an attempt. I am able to track changes to the data on a per-day basis. I can capture the first date an object entered a new status and calculate how long it has been in that status. But I am not able to calculate the totals by date and status.
Any help counting by date and status would be appreciated.
2 answers
yfwxisqw1#
SQL Server dbfiddle
I initially wrote the solution in Oracle (dbfiddle is here). My knowledge of SQL Server is very limited and I can test it only in dbfiddle, but it returns the expected values up to 13 March. For that date I think the result should be 1 ON, because id=2 is ON.
The idea is to first eliminate rows that change nothing (status off->off, for example), then join with the calendar and cross join with all ids. This gives one row for each day and id. LAST_VALUE then fills in the missing statuses, and the result can be aggregated.
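The steps described above might be sketched as follows. This is not the code from the fiddle, only an illustration. It assumes the #d and #c tables from the question exist, and it assumes SQL Server 2022 or later, since IGNORE NULLS on LAST_VALUE is not available in earlier versions. The CTE names (changes, grid, filled) are made up for this example.

```sql
-- Leading semicolon: the statements in the question are not terminated,
-- and a CTE requires the previous statement to end with ";".
;WITH changes AS (
    -- Step 1: keep only rows where the status differs from the id's previous row.
    SELECT id, dt, status
    FROM (
        SELECT id, dt, status,
               LAG(status) OVER (PARTITION BY id ORDER BY dt) AS prev_status
        FROM #d
    ) t
    WHERE prev_status IS NULL OR prev_status <> status
),
grid AS (
    -- Step 2: one row per calendar day and id; status is NULL on days without a change.
    SELECT c.calendardt, i.id, ch.status
    FROM #c c
    CROSS JOIN (SELECT DISTINCT id FROM #d) i
    LEFT JOIN changes ch ON ch.id = i.id AND ch.dt = c.calendardt
),
filled AS (
    -- Step 3: carry the last known status forward over the NULL days.
    SELECT calendardt, id,
           LAST_VALUE(status) IGNORE NULLS
               OVER (PARTITION BY id ORDER BY calendardt
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS status
    FROM grid
)
-- Step 4: aggregate; ids that have no row yet (status still NULL) are skipped.
SELECT calendardt, status, COUNT(*) AS [count]
FROM filled
WHERE status IS NOT NULL
GROUP BY calendardt, status
ORDER BY calendardt, status DESC;
```

For 2023-03-02 this should give on = 2 and off = 1, matching the question. Note that a plain GROUP BY like this emits no row for a status with zero objects on a given day.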
hwamh0ep2#
On/off table is missing
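This remark presumably refers to the zero-count rows in the expected output (for example "off 0" on 2023-03-05): a GROUP BY cannot emit a status that no object currently has, so a list of all statuses is needed to join against. A minimal sketch, not the code from the fiddle, assuming the #d and #c tables from the question and a hypothetical lookup table #s:

```sql
-- #s is a hypothetical status lookup table introduced for this example.
IF OBJECT_ID('tempdb..#s') IS NOT NULL DROP TABLE #s
CREATE TABLE #s (status CHAR(10))
INSERT INTO #s (status) VALUES ('on'), ('off')

SELECT c.calendardt, s.status, COUNT(cur.id) AS [count]
FROM #c c
CROSS JOIN #s s                      -- every day paired with every status
LEFT JOIN (
    -- Latest known status of each id on or before each calendar day.
    SELECT c2.calendardt, i.id, l.status
    FROM #c c2
    CROSS JOIN (SELECT DISTINCT id FROM #d) i
    CROSS APPLY (
        SELECT TOP (1) d.status
        FROM #d d
        WHERE d.id = i.id AND d.dt <= c2.calendardt
        ORDER BY d.dt DESC, d.ndx DESC
    ) l
) cur ON cur.calendardt = c.calendardt AND cur.status = s.status
GROUP BY c.calendardt, s.status
ORDER BY c.calendardt, s.status DESC;
```

COUNT(cur.id) counts only matched rows, so a status with no objects on a given day is reported as 0 instead of being dropped. For 2023-03-05 this should give on = 3 and off = 0.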
db<>fiddle