Is there a way to simplify a NULL compare of 2 values

bkkx9g8r  于 2023-02-28  发布在  其他
关注(0)|答案(3)|浏览(128)

This is my simplified statement

SELECT ...
FROM tab1 AS i FULL OUTER JOIN tab2 AS d ON i.[Id]=d.[Id] 
WHERE d.[Data]<>i.[Data] OR 
    (d.[Data] IS NULL AND i.[Data] IS NOT NULL) OR 
    (d.[Data] IS NOT NULL AND i.[Data] IS NULL)

I want to get all entries that are

  1. i.[Data] is different from d.[Data]
  2. At least one value in table i or d is NOT NULL

So I don't want to see records were and i and d contain the same data or are both NULL.

My statement look so long and complicated. Is there an easier way?

Using ISNULL(d.[Data],'')<>ISNULL(i.[Data],'') works for text, but not for DATE or TIME(0) columns.

My statement works for every type.

5rgfhyps

5rgfhyps1#

Yes you can, and you can get the optimizer to recognize it too.

Paul White has this little ditty :

WHERE NOT EXISTS (
    SELECT d.[Data]
    INTERSECT
    SELECT i.[Data])

This works because of the semantics of INTERSECT which deal with nulls. What this says is "are there no rows in the subquery made up of value B and value B", this will only be satisfied if they are different values or one is null and the other not. If both are nulls, there will be a row with a null.

If you check the XML query plan (not the graphical one in SSMS), you will see that it compiles all the way down to d.[Data] <> i.[Data] , but the operator it uses will have CompareOp="IS" and not EQ .

See the full plan here .

The relevant part of the plan is:

<Predicate>
                  <ScalarOperator ScalarString="@t1.[i] as [t1].[i] = @t2.[i] as [t2].[i]">
                    <Compare CompareOp="IS">
                      <ScalarOperator>
                        <Identifier>
                          <ColumnReference Table="@t1" Alias="[t1]" Column="i" />
                        </Identifier>
                      </ScalarOperator>
                      <ScalarOperator>
                        <Identifier>
                          <ColumnReference Table="@t2" Alias="[t2]" Column="i" />
                        </Identifier>
                      </ScalarOperator>
                    </Compare>
                  </ScalarOperator>
                </Predicate>

I find the optimizer works very well this way round, rather than doing EXISTS / EXCEPT .

I urge you to vote for the Azure Feedback to implement a proper operator.

UPDATE FOR SQL SERVER 2022: This operator is now available as IS [NOT] DISTINCT FROM .

iqjalb3h

iqjalb3h2#

You can simplify to:

SELECT ...
FROM ...
WHERE d.[Data] <> i.[Data]
OR IIF(d.[Data] IS NULL, 0, 1) = IIF(i.[Data] IS NULL, 1, 0)

See live demo.

qacovj5a

qacovj5a3#

A new option is now available in SQL Server 2022, which has now implemented the ISO-standard IS [NOT] DISTINCT FROM syntax mentioned in the comments.

SELECT ...
FROM tab1 AS i
FULL OUTER JOIN tab2 AS d ON i.Id = d.Id
WHERE d.Data IS DISTINCT FROM i.Data;

This compiles into an IS comparison underneath (the same as in my other answer using EXCEPT ), and is very efficient.

相关问题