I have a table called 'staff':
Account Score UpdateTime UpdateTime_order
K1897 A 2023-09-08 14:57:58.113 1
K1897 B 2023-09-08 14:57:57.896 2
K1897 B 2023-08-01 10:07:57.487 3
K1897 B 2023-06-28 07:23:57.696 4
K1897 B 2023-06-05 14:20:13.789 5
K1898 C 2023-06-04 14:20:13.789 1
Every staff can only have one score per day, so Account K1897's score should be A at 2023-09-08. (The score changed from B to A that day)
To solve this problem, I decided to convert datetime to date format, then kept the newest UpdateTime when they are duplicate.
For example, I deselected the row when UpdateTime_order=2 from Account K1897 because its original UpdateTime 2023-09-08 14:57:57.896 < 2023-09-08 14:57:58.113
Account Score UpdateTime UpdateTime_order
K1897 A 2023-09-08 1
K1897 B 2023-08-01 3
K1897 B 2023-06-28 4
K1897 B 2023-06-05 5
K1898 C 2023-06-04 1
Then reordered the UpdateTime_order based on new result.
What I expected:
Account Score UpdateTime UpdateTime_order
K1897 A 2023-09-08 1
K1897 B 2023-08-01 2
K1897 B 2023-06-28 3
K1897 B 2023-06-05 4
K1898 C 2023-06-04 1
My code:
;WITH CTE_staff AS (
select
Account,
Score,
CAST([UpdateTime] AS Date) UpdateDate,
UpdateTime,
UpdateTime_order
FROM staff
)
select
Account,
Score,
UpdateDate,
ROW_NUMBER()OVER(PARTITION BY Account ORDER BY UpdateTime DESC) as UpdateTime_order3
from(
select *,
ROW_NUMBER()OVER(PARTITION BY Account, UpdateDate ORDER BY UpdateTime DESC) as UpdateTime_order2
from CTE_staff
) jj
where jj.UpdateTime_order2=1
It ran succesfully, but I think I wrote it in a complicated way by creating new columns. Wandering if there's an easy way to do this?
fiddle: https://dbfiddle.uk/dJ1qw3Lt
3条答案
按热度按时间lb3vh1jj1#
Fiddle: https://dbfiddle.uk/_NZfe5JW
This query uses a common table expression (CTE) named
RankedStaff
to assign a row number to each row within a partition defined by the combination ofAccount
and the date part ofUpdateTime
. TheROW_NUMBER()
function is used to order the rows within each partition based onUpdateTime
in descending order, so the row with the newestUpdateTime
gets the row number 1.The final
SELECT
statement then filters the results to only include rows whereRowNum
is equal to 1, meaning it's the row with the newestUpdateTime
for each day for each account.This should give you a result where only the rows with the latest
UpdateTime
for each day for each account are included.xfb7svmp2#
Here's how to accomplish it using one
ROW_NUMBER()
to get the most recently updated row per account and day :It is also possible to use
GROUP BY
and the aggregate functionMAX()
: We need first determine the maximum update time per account and day:Then Join this dataset with the table as follow :
Demo here
qc6wkl3g3#
This should return result you said you expect:
See the fiddle .