SQL Server Find and update duplicate values held in a column with a unique constraint? - Msg 402 The data types are incompatible in the add operator

lymgl2op  posted 12 months ago in Other

I need to create a SQL Server query to clean up an old database by ensuring that all non-unique values in columns with a unique constraint are updated to be unique.

My plan is to find every column in the database that has a unique constraint, scan the values in those columns for duplicates, and then append a suffix to each non-unique value. The suffix increments by 1 each time it is used.

For all duplicates, the table, column, old non-unique value, and new unique value need to be written to the console.

To me, it seemed like I would need to use dynamic SQL to achieve this.

Is there a better way?

This is what I currently have:

DECLARE @tableName NVARCHAR(255)
DECLARE @columnName NVARCHAR(255)
DECLARE @sql NVARCHAR(MAX)

DECLARE uniqueColumns CURSOR FOR
SELECT
    cu.TABLE_NAME,
    cu.COLUMN_NAME
FROM
    INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE cu
    JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc ON cu.CONSTRAINT_NAME = tc.CONSTRAINT_NAME
WHERE
    tc.CONSTRAINT_TYPE = 'UNIQUE'

OPEN uniqueColumns

FETCH NEXT FROM uniqueColumns INTO @tableName, @columnName

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'
        WITH DuplicatesCTE AS (
            SELECT                
                ColumnNameString AS ColumnName,
                ColumnNameString AS OriginalValue,
                ColumnNameString + ''_'' + CAST(ROW_NUMBER() OVER (PARTITION BY ColumnNameString ORDER BY (SELECT 1)) AS NVARCHAR(MAX)) AS NewValue
            FROM TableNameString
            WHERE ColumnNameString IN (
                SELECT ColumnNameString
                FROM TableNameString
                GROUP BY ColumnNameString
                HAVING COUNT(*) > 1
            )
        )
        UPDATE DuplicatesCTE
        SET OriginalValue = NewValue
        OUTPUT INSERTED.ColumnName, INSERTED.NewValue
    ';

    SET @sql = REPLACE(@sql, 'TableNameString', QUOTENAME(@tableName))
    SET @sql = REPLACE(@sql, 'ColumnNameString', QUOTENAME(@columnName))

    --EXEC sp_executesql @sql
    PRINT @sql

    FETCH NEXT FROM uniqueColumns INTO @tableName, @columnName
END

CLOSE uniqueColumns
DEALLOCATE uniqueColumns

I am getting this error:

Msg 402, Level 16, State 1, Line 6
The data types uniqueidentifier and varchar are incompatible in the add operator.

If I print the @sql instead of executing it I get this SQL:

WITH DuplicatesCTE AS (
    SELECT                
        [AdminRoleId] AS ColumnName,
        [AdminRoleId] AS OriginalValue,
        [AdminRoleId] + '_' + CAST(ROW_NUMBER() OVER (PARTITION BY [AdminRoleId] ORDER BY (SELECT 1)) AS NVARCHAR(MAX)) AS NewValue
    FROM [AdminRoleModules]
    WHERE [AdminRoleId] IN (
        SELECT [AdminRoleId]
        FROM [AdminRoleModules]
        GROUP BY [AdminRoleId]
        HAVING COUNT(*) > 1
    )
)
UPDATE DuplicatesCTE
SET OriginalValue = NewValue

And I get the same error:

Msg 402, Level 16, State 1, Line 5
The data types uniqueidentifier and varchar are incompatible in the add operator.
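
The error can be reproduced in isolation, which may help show where it comes from. The snippet below is a minimal sketch, not part of the original script: the variable @id is a hypothetical stand-in for a uniqueidentifier column such as [AdminRoleId]. Note that the converted string could not be written back into a uniqueidentifier column, so this only illustrates the cause of Msg 402.

```sql
-- @id is a hypothetical stand-in for a uniqueidentifier column such as [AdminRoleId].
DECLARE @id UNIQUEIDENTIFIER = NEWID();

-- Uncommenting the next line raises Msg 402, because + is not defined
-- between uniqueidentifier and varchar:
-- SELECT @id + '_' + CAST(1 AS NVARCHAR(10));

-- Converting the uniqueidentifier to a string first makes the concatenation valid.
SELECT CAST(@id AS NVARCHAR(36)) + N'_' + CAST(1 AS NVARCHAR(10)) AS NewValue;
```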

Answer #1, by sqyvllje

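It looks like there were several syntax errors in my dynamic SQL. In addition, I had forgotten to restrict it to string columns: the cursor also picked up key columns such as the uniqueidentifier [AdminRoleId], which cannot be concatenated with '_' using +, hence Msg 402. These issues have been fixed in the SQL below: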

DECLARE @tableName NVARCHAR(255)
DECLARE @columnName NVARCHAR(255)
DECLARE @sql NVARCHAR(MAX)

DECLARE uniqueColumns CURSOR FOR
SELECT
    cu.TABLE_NAME,
    cu.COLUMN_NAME
FROM
    INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE cu
    JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc ON cu.CONSTRAINT_NAME = tc.CONSTRAINT_NAME AND cu.CONSTRAINT_SCHEMA = tc.CONSTRAINT_SCHEMA
    JOIN INFORMATION_SCHEMA.COLUMNS col ON cu.COLUMN_NAME = col.COLUMN_NAME AND cu.TABLE_NAME = col.TABLE_NAME AND cu.TABLE_SCHEMA = col.TABLE_SCHEMA
WHERE
    tc.CONSTRAINT_TYPE = 'UNIQUE'
    AND col.DATA_TYPE = 'nvarchar' -- DATA_TYPE is reported in lowercase

OPEN uniqueColumns

FETCH NEXT FROM uniqueColumns INTO @tableName, @columnName

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'
        WITH DuplicatesCTE AS (
            SELECT                
                ColumnNameString AS ColumnName,
                ColumnNameString AS OriginalValue,
                (ColumnNameString + ''_'' + CAST(ROW_NUMBER() OVER (PARTITION BY ColumnNameString ORDER BY (SELECT 1)) AS NVARCHAR(MAX))) AS NewValue
            FROM TableNameString
            WHERE ColumnNameString IN (
                SELECT ColumnNameString
                FROM TableNameString
                GROUP BY ColumnNameString
                HAVING COUNT(*) > 1
            )
        )
        UPDATE DuplicatesCTE
        SET OriginalValue = NewValue
        OUTPUT INSERTED.ColumnName--, INSERTED.NewValue
    ';

    SET @sql = REPLACE(@sql, 'TableNameString', QUOTENAME(@tableName))
    SET @sql = REPLACE(@sql, 'ColumnNameString', QUOTENAME(@columnName))

    EXEC sp_executesql @sql
    --PRINT @sql

    FETCH NEXT FROM uniqueColumns INTO @tableName, @columnName
END

CLOSE uniqueColumns
DEALLOCATE uniqueColumns

However, this still does not output the table name, column name, old value, and new value, so the changes cannot be audited. It is also too broad in its scope of changes. We are now moving toward a report-only query, so that we can identify the tables and columns that need cleaning up and then address them individually. I will update this answer when I have the full solution.
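
A report-only pass along those lines might look roughly like the sketch below. It is an illustration under assumptions, not the final solution: the temp table #DuplicateReport and its column names are hypothetical, and the filter is widened to both nvarchar and varchar columns. It changes no data; it only collects each duplicated value with its schema, table, column, and occurrence count. One caveat: a column that is part of a multi-column unique constraint is checked on its own here, so repeated values reported for such a column may be perfectly legitimate.

```sql
-- Report-only sketch: nothing is updated. #DuplicateReport is a hypothetical name.
CREATE TABLE #DuplicateReport (
    SchemaName     SYSNAME,
    TableName      SYSNAME,
    ColumnName     SYSNAME,
    DuplicateValue NVARCHAR(MAX),
    Occurrences    INT
);

DECLARE @schemaName SYSNAME, @tableName SYSNAME, @columnName SYSNAME;
DECLARE @sql NVARCHAR(MAX);

-- String columns that participate in a UNIQUE constraint.
DECLARE uniqueColumns CURSOR LOCAL FAST_FORWARD FOR
SELECT cu.TABLE_SCHEMA, cu.TABLE_NAME, cu.COLUMN_NAME
FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE cu
JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
    ON  cu.CONSTRAINT_SCHEMA = tc.CONSTRAINT_SCHEMA
    AND cu.CONSTRAINT_NAME   = tc.CONSTRAINT_NAME
JOIN INFORMATION_SCHEMA.COLUMNS col
    ON  cu.TABLE_SCHEMA = col.TABLE_SCHEMA
    AND cu.TABLE_NAME   = col.TABLE_NAME
    AND cu.COLUMN_NAME  = col.COLUMN_NAME
WHERE tc.CONSTRAINT_TYPE = 'UNIQUE'
  AND col.DATA_TYPE IN ('nvarchar', 'varchar');

OPEN uniqueColumns;
FETCH NEXT FROM uniqueColumns INTO @schemaName, @tableName, @columnName;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Identifiers are quoted with QUOTENAME; the names stored in the report
    -- are passed as parameters rather than concatenated into the statement.
    SET @sql = N'
        INSERT INTO #DuplicateReport (SchemaName, TableName, ColumnName, DuplicateValue, Occurrences)
        SELECT @s, @t, @c, ' + QUOTENAME(@columnName) + N', COUNT(*)
        FROM ' + QUOTENAME(@schemaName) + N'.' + QUOTENAME(@tableName) + N'
        GROUP BY ' + QUOTENAME(@columnName) + N'
        HAVING COUNT(*) > 1;';

    EXEC sp_executesql @sql,
         N'@s SYSNAME, @t SYSNAME, @c SYSNAME',
         @s = @schemaName, @t = @tableName, @c = @columnName;

    FETCH NEXT FROM uniqueColumns INTO @schemaName, @tableName, @columnName;
END

CLOSE uniqueColumns;
DEALLOCATE uniqueColumns;

-- The report: one row per duplicated value.
SELECT SchemaName, TableName, ColumnName, DuplicateValue, Occurrences
FROM #DuplicateReport
ORDER BY SchemaName, TableName, ColumnName, DuplicateValue;

DROP TABLE #DuplicateReport;
```

Collecting the rows in a temp table rather than printing them keeps the result queryable, so the tables with the most duplicates can be sorted to the top before deciding which ones to clean up individually.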
