SQL Server Using LAG() function to get the previous row value in SQL/BigQuery but still incorrect

rqdpfwrv  于 2023-08-02  发布在  其他
关注(0)|答案(2)|浏览(95)

I used the below query to retrieve the desired table:

with

PairingActivityInPairing AS (
SELECT  DISTINCT
    P.Id AS PairingId,
    PD.Id AS PairingDutyId,
    PA.Id AS PairingActivityId,
    E.Id AS EmployeeId,
    PEV.PairingEventTypeId,
    PEV.StartDate AS PEStartDate,
    FF.ArrivalActual AS FFArrival,
    TZ.ZuluOffset
FROM
   <my_table>
ORDER BY
   P.Id, PD.Id, PA.Id
PairingIdPairingDutyIdPairingActivityIdEmployeeIdPairingEventTypeIdPEStartDateFFArrivalZuluOffset
201068224790710nullnull2021-06-18 15:18:00null
201068224790810nullnull2021-06-18 20:15:00null
20106822479091082021-06-18 19:55:00null9.0
201068324791010nullnull2021-06-20 01:55:00null
19469802253883nullnull2022-03-01 19:28:00null
1946980225389382023-03-01 19:40:00null9.0
19469812253903nullnull2022-03-03 02:31:00null

And I want to get the previous FFArrival of each row PARTITION BY PairingId, PairingDutyId

I tried to add the LAG() function in the query as below:

with

PairingActivityInPairing AS (
SELECT  DISTINCT
    P.Id AS PairingId,
    PD.Id AS PairingDutyId,
    PA.Id AS PairingActivityId,
    E.Id AS EmployeeId,
    PEV.PairingEventTypeId,
    PEV.StartDate AS PEStartDate,
    FF.ArrivalActual AS FFArrival,
    LAG(FF.ArrivalActual) OVER (PARTITION BY P.Id, PD.Id ORDER BY PA.Id) AS LagFFArrival,
    TZ.ZuluOffset
FROM
   <my_table>
ORDER BY
   P.Id, PD.Id, PA.Id

However, the queried result is not the same as expected one

PairingIdPairingDutyIdPairingActivityIdEmployeeIdPairingEventTypeIdPEStartDateFFArrivalLagFFArrivalZuluOffset
201068224790710nullnull2021-06-18 15:18:002021-06-18 15:18:00null
201068224790810nullnull2021-06-18 20:15:002021-06-18 15:18:00null
20106822479091082021-06-18 19:55:00nullnull9.0
201068324791010nullnull2021-06-20 01:55:002021-06-20 01:55:00null
19469802253883nullnull2022-03-01 19:28:002022-03-01 19:28:00null
1946980225389382023-03-01 19:40:00nullnull9.0
19469812253903nullnull2022-03-03 02:31:002022-03-03 02:31:00null

It simply copies the value of FFArrival and my expected result would be below

PairingIdPairingDutyIdPairingActivityIdEmployeeIdPairingEventTypeIdPEStartDateFFArrivalLagFFArrivalZuluOffset
201068224790710nullnull2021-06-18 15:18:00nullnull
201068224790810nullnull2021-06-18 20:15:002021-06-18 15:18:00null
20106822479091082021-06-18 19:55:00null2021-06-18 20:15:009.0
201068324791010nullnull2021-06-20 01:55:00nullnull
19469802253883nullnull2022-03-01 19:28:00nullnull
1946980225389382023-03-01 19:40:00null2022-03-01 19:28:009.0
19469812253903nullnull2022-03-03 02:31:00nullnull

Please help me to fix the issue. Thanks!

sf6xfgos

sf6xfgos1#

CREATE TABLE PairingActivity (
    PairingId INT,
    PairingDutyId INT,
    PairingActivityId INT,
    EmployeeId INT,
    PairingEventTypeId INT,
    PEStartDate DATETIME,
    FFArrival DATETIME,
    ZuluOffset DECIMAL(5, 2)
);

INSERT INTO PairingActivity (PairingId, PairingDutyId, PairingActivityId, EmployeeId, PairingEventTypeId, PEStartDate, FFArrival, ZuluOffset)
VALUES
    (2010, 682, 247907, 10, NULL, NULL, '2021-06-18 15:18:00', NULL),
    (2010, 682, 247908, 10, NULL, NULL, '2021-06-18 20:15:00', NULL),
    (2010, 682, 247909, 10, 8, '2021-06-18 19:55:00', NULL, 9.0),
    (2010, 683, 247910, 10, NULL, NULL, '2021-06-20 01:55:00', NULL),
    (1946, 980, 225388, 3, NULL, NULL, '2022-03-01 19:28:00', NULL),
    (1946, 980, 225389, 3, 8, '2023-03-01 19:40:00', NULL, 9.0),
    (1946, 981, 225390, 3, NULL, NULL, '2022-03-03 02:31:00', NULL);

  
WITH PairingActivityInPairing AS (
    SELECT *,
        LAG(FFArrival) OVER (PARTITION BY PairingId, PairingDutyId ORDER BY PairingActivityId) AS LagFFArrival
    FROM PairingActivity
)
SELECT *
FROM PairingActivityInPairing
ORDER BY PairingId desc
PairingIdPairingDutyIdPairingActivityIdEmployeeIdPairingEventTypeIdPEStartDateFFArrivalZuluOffsetLagFFArrival
201068224790710nullnull2021-06-18 15:18:00.000nullnull
201068224790810nullnull2021-06-18 20:15:00.000null2021-06-18 15:18:00.000
20106822479091082021-06-18 19:55:00.000null9.002021-06-18 20:15:00.000
201068324791010nullnull2021-06-20 01:55:00.000nullnull
19469802253883nullnull2022-03-01 19:28:00.000nullnull
1946980225389382023-03-01 19:40:00.000null9.002022-03-01 19:28:00.000
19469812253903nullnull2022-03-03 02:31:00.000nullnull

fiddle

y0u0uwnf

y0u0uwnf2#

If you include LAG() under the SELECT DISTINCT, DISTINCT will mess up the ORDER BY sequence you put it within LAG() , making the final result non-ordered, which probably is not what you intend to do.

It makes more sense to perform LAG() on the set of unique values after DISTINCT . DISTINCT is an aggregate function that returns the set of unique values so it synthesizes/condenses information. Whereas window function (ie. LAG()) retains individual records.

So

SELECT *, LAG() OVER (PARTITION BY... ORDER BY ...) AS LagFFArrival
FROM (
       SELECT DISTINCT
         all the listed columns
      ) AS sub;

相关问题