I want to update a table to have the data shuffled around. Let's say I have a table with 5 columns. column 1 is the primary key so that'll remain unchanged. columns 2 and 3 will be shuffled to different rows. columns 4 and 5 will also be shuffled but as pairs (they must be moved together).
How would I write the script for this? The primary key is a random unique identifier so it doesn't correlate with the row number.
Before:
| c1 | c2 | c3 | c4 | c5 |
| ------------ | ------------ | ------------ | ------------ | ------------ |
| 10 | 1 | 2 | 3 | 4 |
| 20 | 5 | 6 | 7 | 8 |
| 30 | 9 | 10 | 11 | 12 |
| 40 | 13 | 14 | 15 | 16 |
After:
c1 | c2 | c3 | c4 | c5 |
---|---|---|---|---|
10 | 5 | 14 | 11 | 12 |
20 | 9 | 6 | 15 | 16 |
30 | 1 | 2 | 3 | 4 |
40 | 13 | 10 | 7 | 8 |
To provide an update on why I am doing this, this isn't a production database or anything. I'm shuffling the data around to anonymize the data for testing purposes, that's about it.
To provide additional details, say this is a people table. I might shuffle the address or phone number to anonymize the person.
2条答案
按热度按时间ccgok5k51#
If temporary tables are allowed:
Create a temporary table that contains row_number based keys
One key acts as a secondary key
Other keys will consist secondary key values but sorted randomly
You can use an update-with-join query:
DB<>Fiddle
z4iuyo4d2#
You appear to be under the impression that all these values are unrelated; they aren't.
10
is related directly to1
,2
,3
and4
, and those values to each other as well. If you "shuffle" the value of10
(inc1
) then the values1
,2
,3
and4
go with it as they are a single row.If you need to do this type of thing then firstly I'd suggest that all these values are on different rows and denote what "c" they belong to. So your data would look more like this:
| c | v |
| ------------ | ------------ |
| 1 | 10 |
| 2 | 1 |
| 3 | 2 |
| 4 | 3 |
| 5 | 4 |
| 1 | 20 |
| 2 | 5 |
| 3 | 6 |
| 4 | 7 |
| 5 | 8 |
etc...
If your data looks like that, it's far easier to get the results you want. You can order you data on something "random" and the use conditional aggregation to get the values into their respective columns:
If you "can't" get your data in a better format, you could unpivot it before you repivot it: