SQL-Server Return rows where only column X has changed [duplicate]

qmelpv7a  于 2023-08-02  发布在  其他
关注(0)|答案(3)|浏览(105)

This question already has answers here:

Is there a way to access the "previous row" value in a SELECT statement? (9 answers)
Closed 20 days ago.

I have a historized table in SQL Server where the id (part of a composite key) repeats when a record is altered. For now, I would like to check which rows with the same id have a difference ONLY in col_a. Suppose the following data sample:

row_id | id | col_a  | col_b
------------------------------
1      | 1  |value_a | value_b
2      | 1  |value_a | value_c
3      | 1  |value_d | value_c
4      | 1  |value_e | value_c
5      | 1  |value_f | value_g

I would like my output to be the following:

row_id | id | col_a   | col_b
-------------------------------
3      | 1  | value_d | value_c
4      | 1  | value_e | value_c

My data has a lot more columns, but the idea is that I really want to return a table that contains only rows that have been duplicated to track a change in col_a and nothing else. Preferably without having to create/write to a table. I added the row_id for illustration purposes, I would have to create it out of the composite key if required to solve the problem.

a6b3iqyw

a6b3iqyw1#

Here is how to achieve using LAG function:

WITH CTE_Lag AS
(
    SELECT *
     , LAG(col_a) OVER (PARTITION BY id ORDER BY row_id) as prev_col_a 
     , LAG(col_b) OVER (PARTITION BY id ORDER BY row_id) as prev_col_b
    FROM mytable
)
SELECT * FROM CTE_Lag
WHERE col_a <> prev_col_a AND col_b = prev_col_b
e5nqia27

e5nqia272#

With 2 columns, one that needs to change one that doesn't, then you could just do a IS (NOT) DISTINCT FROM (assuming SQL Server 2022 as no tags to suggest otherwise) to check that Col_A is different while Col_b is not with a LAG / LEAD n a CTE

You suggest that you have more columns though. As such an INTERSECT might work better here. I use ROW_NUMBER in a CTE and then an EXISTS (with said INTERSECT ) against the same CTE but on the ROW_NUMBER +1.:

WITH RNs AS(
    SELECT row_id,
           id,
           col_a,
           col_b,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY row_id ASC) AS RN --Asumes Row_id is irrelevant of id, which isn't what we want
    FROM dbo.YourTable YT1)
SELECT RN1.row_id,
       RN1.id,
       RN1.col_a,
       RN1.col_b
FROM RNs RN1
     JOIN RNs RN2 ON RN1.id = RN2.id
                 AND RN1.col_a IS DISTINCT FROM RN2.col_a --Assumes 2022+ again
                 AND RN1.RN = RN2.RN + 1
WHERE EXISTS (SELECT RN1.col_b, RN1.col_c, RN1.col_d, RN1.col_e
              INTERSECT
              SELECT RN2.col_b, RN2.col_c, RN2.col_d, RN2.col_e);
mbyulnm0

mbyulnm03#

data

CREATE TABLE mytable(
   row_id INTEGER  NOT NULL 
  ,id     int  NOT NULL
  ,col_a  VARCHAR(30) NOT NULL
  ,col_b  VARCHAR(30) NOT NULL
);
INSERT INTO mytable
(row_id,id,col_a,col_b) VALUES 
(1,1,'value_a','value_b'),
(2,1,'value_a','value_c'),
(3,1,'value_d','value_c'),
(4,1,'value_e','value_c'),
(5,1,'value_f','value_g'),
(7,1,'value_d','value_c');/*Repeated value*/
--(7,1,'value_b','value_c'); /*that is not a Repeated value, it is whole new value*/

use DENSE_RANK

WITH t
     AS (SELECT *,
                DENSE_RANK()
                  OVER(
                    partition BY id, col_b
                    ORDER BY col_a ) id2
         FROM   mytable)

SELECT min(row_id) row_id ,id,col_a,col_b
FROM   t
WHERE  id2 > 1  
group by id,col_a,col_b

or use a reference of a CTE to another CTE

WITH t
     AS (SELECT *,
                Dense_rank()
                  OVER(
                    partition BY id, col_b
                    ORDER BY col_a ) id2
         FROM   mytable),
/*new CTE*/
     a
     AS (SELECT row_id,
                id,
                col_a,
                col_b,
                Row_number()
                  OVER(
                    partition BY id, col_a, col_b
                    ORDER BY row_id ) id3
         FROM   t
         WHERE  id2 > 1)
SELECT row_id,
       id,
       col_a,
       col_b
FROM   a
WHERE  id3 = 1

dbfiddle

相关问题