UPDATE: The question was re-written, so some comments may be not relevant anymore.
After the change of the collation in the table we've got problems with the usage of the indexes. There appeared lot's of the scans. The default database collation is SQL_Latin1_General_CP1_CI_AS. Now let's take a look to the example:
IF OBJECT_ID('colltest') > 0 DROP TABLE CollTest;
CREATE TABLE dbo.CollTest
(
cs CHAR(8) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL
, ci CHAR(8) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
, cs_latin CHAR(8) COLLATE Latin1_General_CS_AS NOT NULL
, ci_latin CHAR(8) COLLATE Latin1_General_CI_AS NOT NULL
);
CREATE INDEX ix_cs ON dbo.CollTest (cs);
CREATE INDEX ix_ci ON dbo.CollTest (ci);
CREATE INDEX ix_cs_latin ON dbo.CollTest (cs_latin);
CREATE INDEX ix_ci_latin ON dbo.CollTest (ci_latin);
WITH q (n) AS (SELECT 1
UNION ALL
SELECT 1
UNION ALL
SELECT 1
UNION ALL
SELECT 1
UNION ALL
SELECT 1
UNION ALL
SELECT 1
UNION ALL
SELECT 1
UNION ALL
SELECT 1
UNION ALL
SELECT 1
UNION ALL
SELECT 1)
, q100 (n) AS (SELECT 1 FROM q a, q b)
, q10000 (n) AS (SELECT 1 FROM q100 a, q100 b)
, q100000 (n) AS (SELECT 1 FROM q a, q10000 b)
INSERT INTO dbo.CollTest
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) step
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) step
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) step
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) step
FROM q100000;
And here are the explain plans for following queries:
SELECT cs FROM dbo.colltest WHERE cs = CAST('1000000' AS NVARCHAR(MAX))
SELECT ci FROM dbo.colltest WHERE ci = CAST('1000000' AS NVARCHAR(MAX))
SELECT cs_latin FROM dbo.colltest WHERE cs_latin = CAST('1000000' AS NVARCHAR(MAX))
SELECT ci_latin FROM dbo.colltest WHERE ci_latin = CAST('1000000' AS NVARCHAR(MAX))
So it shows that when COLLATION is SQL_* then it uses scans, when Latin* then it uses seeks. Why is that?
1条答案
按热度按时间0x6upsns1#
This is explained in Comparing SQL collations to Windows collations
For a Windows collation, a comparison of non-Unicode data is implemented by using the same algorithm as Unicode data. ... In a SQL collation, SQL Server defines different comparison semantics for non-Unicode data.
NVARCHAR
has a higher datatype precedence thanVARCHAR
so when you compare aVARCHAR
column to anNVARCHAR
the column needs to be implicitly cast.Example Data
The query
Is able to perform a dynamic seek and call
GetRangeThroughConvert
to convert the predicate into a seekable index range on thevarchar
index.For the query
You see a scan
Note that this returns two results
¹
and1
Also note the results of this
The index on this column will be stored in the order returned from the above.
1
and¹
aren't stored next to each other in the index so it isn't possible to convert the predicateSQL_CollationVC = N'1'
into a simple seek on that index.