SQL Server Index is not working with SQL_Latin1_General_CP1 collation but works with Latin1_General

kgqe7b3p  于 2023-05-28  发布在  其他
关注(0)|答案(1)|浏览(167)

UPDATE: The question was re-written, so some comments may be not relevant anymore.

After the change of the collation in the table we've got problems with the usage of the indexes. There appeared lot's of the scans. The default database collation is SQL_Latin1_General_CP1_CI_AS. Now let's take a look to the example:

IF OBJECT_ID('colltest') > 0 DROP TABLE CollTest;

CREATE TABLE dbo.CollTest
(
    cs       CHAR(8) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL
  , ci       CHAR(8) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
  , cs_latin CHAR(8) COLLATE Latin1_General_CS_AS NOT NULL
  , ci_latin CHAR(8) COLLATE Latin1_General_CI_AS NOT NULL
);

CREATE INDEX ix_cs ON dbo.CollTest (cs);

CREATE INDEX ix_ci ON dbo.CollTest (ci);

CREATE INDEX ix_cs_latin ON dbo.CollTest (cs_latin);

CREATE INDEX ix_ci_latin ON dbo.CollTest (ci_latin);

WITH q (n) AS (SELECT 1
               UNION ALL
               SELECT 1
               UNION ALL
               SELECT 1
               UNION ALL
               SELECT 1
               UNION ALL
               SELECT 1
               UNION ALL
               SELECT 1
               UNION ALL
               SELECT 1
               UNION ALL
               SELECT 1
               UNION ALL
               SELECT 1
               UNION ALL
               SELECT 1)
   , q100 (n) AS (SELECT 1 FROM q a, q b)
   , q10000 (n) AS (SELECT 1 FROM q100 a, q100 b)
   , q100000 (n) AS (SELECT 1 FROM q a, q10000 b)
INSERT INTO dbo.CollTest
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) step
     , ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) step
     , ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) step
     , ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) step
  FROM q100000;

And here are the explain plans for following queries:

SELECT cs FROM dbo.colltest WHERE cs = CAST('1000000' AS NVARCHAR(MAX))
SELECT ci FROM dbo.colltest WHERE ci = CAST('1000000' AS NVARCHAR(MAX))
SELECT cs_latin FROM dbo.colltest WHERE cs_latin = CAST('1000000' AS NVARCHAR(MAX))
SELECT ci_latin FROM dbo.colltest WHERE ci_latin = CAST('1000000' AS NVARCHAR(MAX))

So it shows that when COLLATION is SQL_* then it uses scans, when Latin* then it uses seeks. Why is that?

0x6upsns

0x6upsns1#

This is explained in Comparing SQL collations to Windows collations
For a Windows collation, a comparison of non-Unicode data is implemented by using the same algorithm as Unicode data. ... In a SQL collation, SQL Server defines different comparison semantics for non-Unicode data.

NVARCHAR has a higher datatype precedence than VARCHAR so when you compare a VARCHAR column to an NVARCHAR the column needs to be implicitly cast.

Example Data

DECLARE @T TABLE
(
SQL_CollationVC   VARCHAR(1) COLLATE SQL_Latin1_General_CP1_CI_AS,
Win_CollationVC   VARCHAR(1) COLLATE Latin1_General_CS_AS,
INDEX SQL_CollationVC(SQL_CollationVC),
INDEX Win_CollationVC(Win_CollationVC)
);

INSERT INTO @T
VALUES
(N'¹',N'¹'),
(N'½',N'½'),
(N'¾',N'¾'),
(N'0',N'0'),
(N'1',N'1');

The query

SELECT Win_CollationVC
FROM @T
WHERE Win_CollationVC = N'1'

Is able to perform a dynamic seek and call GetRangeThroughConvert to convert the predicate into a seekable index range on the varchar index.

For the query

SELECT SQL_CollationVC
FROM @T
WHERE SQL_CollationVC = N'1'

You see a scan

Note that this returns two results ¹ and 1

Also note the results of this

SELECT SQL_CollationVC
FROM @T
ORDER BY SQL_CollationVC

The index on this column will be stored in the order returned from the above.

+-----------------+
| SQL_CollationVC |
+-----------------+
| ¹               |
| ½               |
| ¾               |
| 0               |
| 1               |
+-----------------+

1 and ¹ aren't stored next to each other in the index so it isn't possible to convert the predicate SQL_CollationVC = N'1' into a simple seek on that index.

相关问题