SQL Server Shuffling data across rows in a table

rn0zuynd 于 9个月前发布在其他

关注(0)|答案(2)|浏览(81)

I want to update a table to have the data shuffled around. Let's say I have a table with 5 columns. column 1 is the primary key so that'll remain unchanged. columns 2 and 3 will be shuffled to different rows. columns 4 and 5 will also be shuffled but as pairs (they must be moved together).

How would I write the script for this? The primary key is a random unique identifier so it doesn't correlate with the row number.

Before:
| c1 | c2 | c3 | c4 | c5 |
| ------------ | ------------ | ------------ | ------------ | ------------ |
| 10 | 1 | 2 | 3 | 4 |
| 20 | 5 | 6 | 7 | 8 |
| 30 | 9 | 10 | 11 | 12 |
| 40 | 13 | 14 | 15 | 16 |

After:

c1	c2	c3	c4	c5
10	5	14	11	12
20	9	6	15	16
30	1	2	3	4
40	13	10	7	8

To provide an update on why I am doing this, this isn't a production database or anything. I'm shuffling the data around to anonymize the data for testing purposes, that's about it.

To provide additional details, say this is a people table. I might shuffle the address or phone number to anonymize the person.

sql-server

来源：https://stackoverflow.com/questions/77545412/shuffling-data-across-rows-in-a-table

2条答案

按热度按时间

ccgok5k51#

If temporary tables are allowed:

Create a temporary table that contains row_number based keys
One key acts as a secondary key
Other keys will consist secondary key values but sorted randomly

select *
     , sk = row_number() over (order by c1)
     , k1 = row_number() over (order by newid())
     , k2 = row_number() over (order by newid())
     , k3 = row_number() over (order by newid())
into #temp
from t

c1	c2	c3	c4	c5	sk	k1	k2	k3
10	1	2	3	4	1	4	3	1
20	5	6	7	8	2	3	4	2
30	9	10	11	12	3	1	2	3
40	13	14	15	16	4	2	1	4

You can use an update-with-join query:

update t set
t.c2 = a.c2,
t.c3 = b.c3,
t.c4 = c.c4,
t.c5 = c.c5
from t
join #temp as x on t.c1 = x.c1
join #temp as a on x.sk = a.k1
join #temp as b on x.sk = b.k2
join #temp as c on x.sk = c.k3

DB<>Fiddle

赞(0）回复(0）举报 9个月前

z4iuyo4d2#

You appear to be under the impression that all these values are unrelated; they aren't. 10 is related directly to 1 , 2 , 3 and 4 , and those values to each other as well. If you "shuffle" the value of 10 (in c1 ) then the values 1 , 2 , 3 and 4 go with it as they are a single row.

If you need to do this type of thing then firstly I'd suggest that all these values are on different rows and denote what "c" they belong to. So your data would look more like this:
| c | v |
| ------------ | ------------ |
| 1 | 10 |
| 2 | 1 |
| 3 | 2 |
| 4 | 3 |
| 5 | 4 |
| 1 | 20 |
| 2 | 5 |
| 3 | 6 |
| 4 | 7 |
| 5 | 8 |

etc...

If your data looks like that, it's far easier to get the results you want. You can order you data on something "random" and the use conditional aggregation to get the values into their respective columns:

WITH CTE AS(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY C ORDER BY NEWID()) AS I
    FROM (VALUES(1,10),
                (2,1),
                (3,2),
                (4,3),
                (5,4),
                (1,20),
                (2,5),
                (3,6),
                (4,7),
                (5,8))YT(C,V))
SELECT MAX(CASE C WHEN 1 THEN V END) AS C1,
       MAX(CASE C WHEN 2 THEN V END) AS C2,
       MAX(CASE C WHEN 3 THEN V END) AS C3,
       MAX(CASE C WHEN 4 THEN V END) AS C4,
       MAX(CASE C WHEN 5 THEN V END) AS C5
FROM CTE
GROUP BY I;

If you "can't" get your data in a better format, you could unpivot it before you repivot it:

WITH CTE AS(
    SELECT V.C,
           V.V,
           ROW_NUMBER() OVER (PARTITION BY V.C ORDER BY NEWID()) AS I
    FROM (VALUES(10, 1, 2, 3, 4),
                (20, 5, 6, 7, 8),
                (30, 9,10,11,12),
                (40,13,14,15,16))YT(C1,C2,C3,C4,C5)
         CROSS APPLY (VALUES(1,C1),
                            (2,C2),
                            (3,C3),
                            (4,C4),
                            (5,C5))V(C,V))
SELECT MAX(CASE C WHEN 1 THEN V END) AS C1,
       MAX(CASE C WHEN 2 THEN V END) AS C2,
       MAX(CASE C WHEN 3 THEN V END) AS C3,
       MAX(CASE C WHEN 4 THEN V END) AS C4,
       MAX(CASE C WHEN 5 THEN V END) AS C5
FROM CTE
GROUP BY I;

赞(0）回复(0）举报 9个月前

我来回答

SQL Server Shuffling data across rows in a table

2条答案

相关问题

热门标签

最新问答