-- Create a sample table
`CREATE TABLE #YourTable (
    RowID INT,
    column1 VARCHAR(255),
    column2 VARCHAR(255),
    column3 VARCHAR(255),
    AssociatedRows VARCHAR(MAX)
);`
-- Insert sample data
`INSERT INTO #YourTable (RowID, column1, column2, column3, AssociatedRows)
VALUES
(1, 'A', 'B', 'C', ''),
(2, 'D', 'E', 'F', ''),
(3, 'A', 'X', 'Y', ''),
(4, 'G', 'H', 'I', ''),
(5, 'J', 'K', 'C', '');`
Expected output:
RowID | column1 | column2 | column3 | AssociatedRows
1     | A       | B       | C       | 1,3,5
2     | D       | E       | F       | 2
3     | A       | X       | Y       | 1,3,5
4     | G       | H       | I       | 4
5     | J       | K       | C       | 1,3,5
Row IDs 1, 3, and 5 are associated with each other because they share a matching value in at least one field, either directly or through another row (rows 1 and 3 share column1 = 'A', and rows 1 and 5 share column3 = 'C'). Their associated row IDs are stored in the AssociatedRows column.
I tried the code below but didn't get the expected output:
;WITH Addressdups (EmployeeAddressClean, EmployeeCityClean, EmployeeStateZipClean, EmployeeNumber)
AS (
    -- Employees that share the same cleaned address, city, and state/zip
    SELECT EmployeeAddressClean, EmployeeCityClean, EmployeeStateZipClean,
           EmployeeNumber = STUFF((SELECT ', ' + EmployeeNumber
                                   FROM #Employee e2
                                   WHERE e1.EmployeeAddressClean = e2.EmployeeAddressClean
                                     AND e1.EmployeeCityClean = e2.EmployeeCityClean
                                     AND e1.EmployeeStateZipClean = e2.EmployeeStateZipClean
                                   FOR XML PATH('')), 1, 2, '')
    FROM #Employee e1
    GROUP BY EmployeeAddressClean, EmployeeCityClean, EmployeeStateZipClean
    HAVING COUNT(DISTINCT e1.EmployeeNumber) > 1
),
SSNDups (SSN, EmployeeNumber)
AS (
    -- Employees that share the same SSN
    SELECT SSN,
           EmployeeNumber = STUFF((SELECT ', ' + EmployeeNumber
                                   FROM #Employee e2
                                   WHERE e1.SSN = e2.SSN
                                   FOR XML PATH('')), 1, 2, '')
    FROM #Employee e1
    GROUP BY SSN
    HAVING COUNT(DISTINCT e1.EmployeeNumber) > 1
),
NameDups (EmployeeName, EmployeeNumber)
AS (
    -- Employees that share the same name
    SELECT EmployeeName,
           EmployeeNumber = STUFF((SELECT ', ' + EmployeeNumber
                                   FROM #Employee e2
                                   WHERE e1.EmployeeName = e2.EmployeeName
                                   FOR XML PATH('')), 1, 2, '')
    FROM #Employee e1
    GROUP BY EmployeeName
    HAVING COUNT(DISTINCT e1.EmployeeNumber) > 1
)
SELECT e1.*, CONCAT_WS(',', a.EmployeeNumber, s.EmployeeNumber, n.EmployeeNumber) AS duplicate
FROM #Employee e1
LEFT JOIN Addressdups a ON e1.EmployeeAddressClean = a.EmployeeAddressClean
                       AND e1.EmployeeCityClean = a.EmployeeCityClean
                       AND e1.EmployeeStateZipClean = a.EmployeeStateZipClean
LEFT JOIN SSNDups s ON e1.SSN = s.SSN
LEFT JOIN NameDups n ON e1.EmployeeName = n.EmployeeName;
You can try the following option.
Based on the example data and its expected output, I understand that you need to find associated rows by searching multiple levels recursively.
Create a scalar function that returns the associated rows as a comma-separated string for a given input RowID. The function also takes a second input parameter, @MaxSearchLvel, which sets how many levels deep the associated rows are searched; it defaults to 3.
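A minimal sketch of what such a function might look like is shown here; it is not necessarily the original implementation. The function name dbo.fn_GetAssociatedRows is hypothetical, and the sketch assumes the sample data lives in a permanent table dbo.YourTable with an INT RowID column, because T-SQL user-defined functions cannot read temp tables such as #YourTable.

```sql
-- Hypothetical sketch. Assumes a permanent table dbo.YourTable
-- (RowID INT, column1, column2, column3) holding the sample data.
CREATE FUNCTION dbo.fn_GetAssociatedRows
(
    @RowID         INT,
    @MaxSearchLvel INT = 3   -- how many levels of association to follow
)
RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @Found  TABLE (RowID INT PRIMARY KEY);
    DECLARE @Level  INT = 0;
    DECLARE @Added  INT = 1;
    DECLARE @Result VARCHAR(MAX);

    -- Level 0: the row itself
    INSERT INTO @Found (RowID) VALUES (@RowID);

    -- Each pass adds rows that share a value, in the same column,
    -- with any row found so far. Stop when nothing new turns up
    -- or when the level limit is reached.
    WHILE @Level < @MaxSearchLvel AND @Added > 0
    BEGIN
        INSERT INTO @Found (RowID)
        SELECT DISTINCT t2.RowID
        FROM @Found AS f
        JOIN dbo.YourTable AS t1 ON t1.RowID = f.RowID
        JOIN dbo.YourTable AS t2
          ON t2.column1 = t1.column1
          OR t2.column2 = t1.column2
          OR t2.column3 = t1.column3
        WHERE NOT EXISTS (SELECT 1 FROM @Found AS x WHERE x.RowID = t2.RowID);

        SET @Added = @@ROWCOUNT;
        SET @Level = @Level + 1;
    END;

    -- Build the comma-separated list in RowID order
    SET @Result = STUFF((SELECT ',' + CAST(RowID AS VARCHAR(20))
                         FROM @Found
                         ORDER BY RowID
                         FOR XML PATH('')), 1, 1, '');

    RETURN @Result;
END;
```

The level limit keeps the loop bounded on large tables; chains of associations longer than @MaxSearchLvel would be cut off, so the value may need to be raised for real data.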
Then run a SELECT query to view the associated rows for the example data.
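A query along these lines could be used. It assumes a scalar function with the hypothetical name dbo.fn_GetAssociatedRows(@RowID, @MaxSearchLvel) and a permanent table dbo.YourTable holding the sample data; both names are illustrative.

```sql
-- Hypothetical names: dbo.fn_GetAssociatedRows and dbo.YourTable are assumptions.
-- Scalar functions need the DEFAULT keyword to pick up a parameter's default value.
SELECT t.RowID,
       t.column1,
       t.column2,
       t.column3,
       dbo.fn_GetAssociatedRows(t.RowID, DEFAULT) AS AssociatedRows
FROM dbo.YourTable AS t
ORDER BY t.RowID;
```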
The output of the above query matches the expected output exactly. Please note that the performance of this query is uncertain, so test it against realistic data volumes.
Gives expected output: