SQL Server: summarising data, including zeros where no data exists

xwbd5t1u, posted on 2023-05-16 in Other

I have a table containing a list of years, from 2010 onwards, along the lines of:

CREATE TABLE Years
(
    YearName int
);
    
INSERT INTO Years (YearName)
VALUES
    (2010), (2011), (2012), (2013),
    (2014), (2015), (2016), (2017),
    (2018), (2019), (2020), (2021),
    (2022), (2023), (2024), (2025);

I have a second table containing people, along the lines of:

CREATE TABLE People
(
    PersonID int PRIMARY KEY, 
    PersonName varchar(50)
);
    
INSERT INTO People (PersonID, PersonName)
VALUES
    (1, 'Bob'),
    (2, 'Kate'),
    (3, 'Jo'),
    (4, 'Fred');

And I have a table containing the work that people have done, of various types each year:

CREATE TABLE Workload
(
    ID int PRIMARY KEY, 
    PersonID int, 
    YearName int, 
    WorkType varchar(8), 
    Hours int
);
    
INSERT INTO Workload (ID, PersonID, YearName, WorkType, Hours)
VALUES
    (1, 1, 2014, 'Plumbing', 7),
    (2, 1, 2020, 'Washing', 9),
    (3, 1, 2020, 'Cooking', 10),
    (4, 1, 2020, 'Drawing', 4),
    (5, 1, 2021, 'Reading', 2),
    (6, 2, 2020, 'Washing', 9),
    (7, 2, 2021, 'Cooking', 10),
    (8, 2, 2022, 'Drawing', 4),
    (9, 3, 2014, 'Cooking', 4),
    (10, 3, 2014, 'Plumbing', 22),
    (11, 3, 2015, 'Washing', 7);

I want to summarise the total amount of work done by each person, per year. So, I've got this as my starting point:

SELECT 
    PersonName, YearName, SUM(Hours) AS WorkDone
FROM 
    People p 
INNER JOIN 
    Workload w ON p.PersonID = w.PersonID
WHERE 
    YearName BETWEEN YEAR(GETDATE()) - 9 AND YEAR(GETDATE())
GROUP BY 
    PersonName, YearName

This works fine, but what I'd like to have is an output that gives a zero value for each person in each year where they did none of these tasks:

Person  Year  Workload
Bob     2014  7
Bob     2015  0
Bob     2016  0
Bob     2017  0
Bob     2018  0
Bob     2019  0
Bob     2020  23
Bob     2021  2
Bob     2022  0
Bob     2023  0
Kate    2014  0
Kate    2015  0
Kate    2016  0
Kate    2017  0
Kate    2018  0
Kate    2019  0
Kate    2020  9
Kate    2021  10
Kate    2022  4
Kate    2023  0

... and so on for the other people and years.

How do I best achieve this? I sense that I probably need to cross apply the three tables, but can't seem to figure out how to do so and get the results I need.

Answer 1 (wfypjpf4)

You could cross join the Years and People tables to generate all possible combinations, then bring in the Workload table with a LEFT JOIN, which ensures that year/person combinations with no work are not filtered out. The final step is aggregation, with COALESCE() returning 0 for the unmatched combinations.

SELECT p.PersonName, y.YearName, COALESCE(SUM(w.Hours), 0) as WorkDone
FROM People p 
CROSS JOIN Years y
LEFT JOIN Workload w ON w.PersonID = p.PersonID AND w.YearName = y.YearName
WHERE y.YearName BETWEEN YEAR(GetDate())-9 AND YEAR(GETDATE())
GROUP BY p.PersonID, p.PersonName, y.YearName
ORDER BY p.PersonName, y.YearName

Note that it is safer to include the person id in the group by clause; two different persons might have the same name, and you probably don't want their workloads to be grouped together.
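
To illustrate the point about names, here is a minimal sketch using made-up inline data rather than the tables from the question. The second 'Bob' (PersonID 5) and the hours are hypothetical, as are the CTE names P and W.

```sql
-- Hypothetical data: two different people who happen to share the name 'Bob'.
WITH P (PersonID, PersonName) AS (
    SELECT * FROM (VALUES (1, 'Bob'), (5, 'Bob')) v (PersonID, PersonName)
),
W (PersonID, YearName, Hours) AS (
    SELECT * FROM (VALUES (1, 2020, 9), (5, 2020, 4)) v (PersonID, YearName, Hours)
)
SELECT p.PersonID, p.PersonName, w.YearName, SUM(w.Hours) AS WorkDone
FROM P p
INNER JOIN W w ON w.PersonID = p.PersonID
-- PersonID in the GROUP BY keeps the two Bobs as separate rows (9 and 4 hours).
-- Grouping by PersonName, YearName alone would merge them into one row of 13 hours.
GROUP BY p.PersonID, p.PersonName, w.YearName;
```

If two people really can share a name, it may also be worth returning PersonID in the SELECT list so the rows can be told apart.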

We could also use a correlated subquery (or APPLY) to compute the workload, which avoids the outer aggregation:

SELECT p.PersonName, y.YearName, w.WorkDone
FROM People p 
CROSS JOIN Years y
CROSS APPLY (
    SELECT COALESCE(SUM(w.Hours), 0) as WorkDone
    FROM Workload w 
    WHERE w.PersonID = p.PersonID AND w.YearName = y.YearName
) w
WHERE y.YearName BETWEEN YEAR(GetDate())-9 AND YEAR(GETDATE())
ORDER BY p.PersonName, y.YearName
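
A note on why CROSS APPLY is enough here and does not drop the person/year pairs with no work: an aggregate query without GROUP BY returns exactly one row even when no rows match. As a quick check against the question's sample data (Fred, PersonID 4, has no Workload rows; the year 2020 is picked arbitrarily):

```sql
-- No Workload rows match, so SUM() is NULL and COALESCE turns it into 0.
-- The query still returns a single row, which is what CROSS APPLY needs.
SELECT COALESCE(SUM(w.Hours), 0) AS WorkDone
FROM Workload w
WHERE w.PersonID = 4 AND w.YearName = 2020;
```

If the applied subquery could return no rows at all (for example, if it had its own GROUP BY), OUTER APPLY would be the safer choice.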

Answer 2 (flvlnr44)

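This query finds the earliest year in which each person has work recorded and returns a row for every year from then up to the current year, using zero for the years that have no workload rows. Note that people with no work at all (Fred, in the sample data) do not appear, because the CTE starts from an INNER JOIN.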

;WITH _list AS (
    -- One row per person and year, carrying that year's total and the person's first year.
    SELECT *
    FROM (
        SELECT
            PersonName,
            YearName,
            SUM(w.Hours) OVER (PARTITION BY p.PersonID, YearName) AS Workload,
            ROW_NUMBER() OVER (PARTITION BY p.PersonID, YearName ORDER BY p.PersonID) AS rw,
            MIN(YearName) OVER (PARTITION BY p.PersonID ORDER BY p.PersonID) AS minYearName
        FROM People p
        INNER JOIN Workload w ON p.PersonID = w.PersonID
    ) a
    WHERE a.rw = 1
)
SELECT
    ISNULL(a.PersonName, b.PersonName) AS PersonName,
    b.YearName,
    ISNULL(a.Workload, b.Workload) AS Workload
FROM (
    -- Every year from each person's first year up to the current year, defaulting to 0.
    SELECT a.PersonName, b.YearName, 0 AS Workload
    FROM Years b
    CROSS JOIN (
        SELECT PersonName, MIN(minYearName) AS minYearName
        FROM _list
        GROUP BY PersonName
    ) a
    WHERE b.YearName BETWEEN a.minYearName AND DATEPART(YEAR, GETDATE())
) b
LEFT JOIN _list a
    ON a.YearName = b.YearName
    AND a.PersonName = b.PersonName
