IF OBJECT_ID('core.NTallyRangeAB') IS NOT NULL DROP FUNCTION core.NTallyRangeAB;
GO
CREATE FUNCTION core.NTallyRangeAB
(
@tiles BIGINT,
@rows BIGINT,
@desc BIT
)
/*****************************************************************************************
[Purpose]:
NTallyRangeAB is a faster, 100% readless alternative to the ANSI SQL:2003 compliant
T-SQL NTILE ranking function which:
"Distributes the rows in an ordered partition into a specified number of groups. The
groups are numbered, starting at one. For each row, NTILE returns the number of the
group to which the row belongs...
... If the number of rows in a partition is not divisible by integer_expression, this
will cause groups of two sizes that differ by one member. Larger groups come before
smaller groups in the order specified by the OVER clause. For example: if have 53 rows
and the number of groups is five, the first three groups will have 11 rows and the two
remaining groups will each have 10. If the total number of rows is divisible by the
number of groups, the rows will be evenly distributed among the groups."
SEE: https://docs.microsoft.com/en-us/sql/t-sql/functions/ntile-transact-sql
The ASNI SQL:2003 NTILE function allows developers to divide a set "as evenly as
possible." NTallyRangeAB(@tiles,@rows,1) returns the same value as NTILE(@tiles) for
a set with @rows*rows. NTallyRangeAB (n) does this by returning an ordered set of
Row Numbers (n.RN) and two columns of "Tile Numbers" (n.Tile and n.TileOp). n.Tile will
be the same value as NTILE(@tiles) OVER (ORDER BY n.RN). n.RN can be used to *join* to
the table with the column(s) that need to be divided into "tiles". Note this query:
DECLARE @tiles BIGINT = 3, @rows BIGINT = 8, @desc BIT = 0;
SELECT n.RN, n.Tile, [NTILE(@tiles)] = NTILE(@tiles) OVER (ORDER BY n.RN)
FROM core.NTallyRangeAB(@tiles,@rows,@desc) AS n;
In this example ^^ the tiles are divided up where lower tile groups numbers(n.TN) always
have an equal or greater number of members than the higher tile group numbers. The above
query returns 3X 1's, 3X 2's but only 2X 3's.
The third parameter (@desc) can be changed to 1 to reverse the distribution so that the
higher group always have an equal or greater number of members. This query is identical
to the one above except that @desc = 1:
DECLARE @tiles BIGINT = 3, @rows BIGINT = 8, @desc BIT = 1;
SELECT n.RN, n.Tile, [NTILE(@tiles)] = NTILE(@tiles) OVER (ORDER BY n.RN)
FROM core.NTallyRangeAB(@tiles,@rows,@desc) AS n;
Notice ^^ how the query returns 3X 2's and 3X 3's but only 2X 1's.
[Author]:
AJB
[Compatibility]:
SQL Server 2005+
[Syntax]:
--===== Autonomous
SELECT r.RN, r.Tile
FROM core.NTallyRangeAB(@tiles,@rows) AS r;
--===== Against a table using APPLY
WITH anchor(RN,SomeValue) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY t.SomeValue), t.SomeValue
FROM SomeTable AS t
)
SELECT t.SomeValue, Tile = nt.Tile
FROM anchor AS t
CROSS APPLY core.NTallyRangeAB(@tiles, (SELECT COUNT(*) FROM anchor)) AS nt
WHERE t.RN = nt.RN;
[Parameters]:
@tiles = BIGINT; requested number of tile groups (same as the parameter passed to NTILE)
@rows = BIGINT; the number of rows to be "tiled" (have group number assigned to it)
@desc = BIT; when @desc=0 the function distributes the tile groups so that lower tile
group numbers always have an equal or greater number of members than
higher tile group numbers. When @desc=1 the opposite is true: higher
group numbers will always have an equal or greater number of members
[Returns]:
Inline Table Valued Function returns:
RN = BIGINT; a row number beginning with 1 and ending with @rows
Tile = BIGINT; a "tile number" or group number the same
TileOP = BIGINT; an "opposite tile number"
[Dependencies]:
core.RangeAB
[Developer Notes]:
1. When there isn't any partitioning involved, core.NTallyRangeAB (which uses
core.RangeAB) is ~10-15% slower than core.NTally which uses a persisted tally table.
core.NTallyRangeAB, however, doesn't generate any reads or require Tally table.
2. For best results a P.O.C. index should exists on the table that you are "tiling". For
more information about P.O.C. indexes see:
http://sqlmag.com/sql-server-2012/sql-server-2012-how-write-t-sql-window-functions-part-3
3. NTallyRangeAB is deterministic; for more about deterministic and nondeterministic functions
see https://msdn.microsoft.com/en-us/library/ms178091.aspx
[Examples]:
--===== 1. Demonstrating how the function mimics NTILE
--; To better understand NTallyRangeAB, run the DML with different values assigned to @rows and
--; @tiles. Note how the tile column and NTILE produces the same results.
DECLARE @rows BIGINT = 8, @tiles BIGINT = 3, @desc BIT = 0;
SELECT rn, tile, NTILE(@tiles) OVER (ORDER BY rn) as [NTILE]
FROM core.NTallyRangeAB(@tiles,@rows,@desc);
--===== 2. Using NTallyRangeAB is a faster alternative to NTILE (with no PARTITION BY clause)
--; Run the code below from --;START to --;END
--; Note how you get the same result but how, the more rows you add, the more efficient
--; the NTallyRangeAB solution is, with respect to reads, when compared to NTILE:
--; e.g. NTILE against 100K rows = 200K+ reads, 0 (ZERO) reads for NTallyRangeAB
--;START
--; Declare variables
DECLARE @rows BIGINT = 8, @tiles BIGINT = 5;
--; Setup sample data
DECLARE @SomeTable TABLE (SomeValue int primary key);
INSERT @SomeTable
SELECT TOP(@rows) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))*5
FROM sys.all_columns a, sys.all_columns b;
--; How to divide @some table into 3 tile groups using NTILE
SET STATISTICS IO ON;
PRINT 'NTILE version:';
SELECT SomeValue, NTILE(@tiles) OVER (ORDER BY SomeValue) AS TileGroup
FROM @SomeTable;
--; How to divide @SomeTable into 3 tile groups using NTallyRangeAB
PRINT CHAR(10)+'NTallyRangeAB version:';
WITH anchor AS
(
SELECT SomeValue, ROW_NUMBER() OVER (ORDER BY SomeValue) AS rn
FROM @SomeTable
)
SELECT SomeValue, nt.tile AS TileGroup
FROM anchor a
CROSS APPLY core.NTallyRangeAB(@tiles,(SELECT COUNT(*) FROM @SomeTable),0) AS nt
WHERE a.RN = nt.RN;
SET STATISTICS IO OFF;
--;END
--===== 3. Using NTallyRangeAB an alternative to NTILE with a PARTITION BY clause
--; 3.1. Create sample table with 10 rows and 3 partitions
IF OBJECT_ID('tempdb..#SomeTable') IS NOT NULL DROP TABLE #SomeTable;
CREATE TABLE #SomeTable
(
PartitionKey int NOT NULL,
SomeValue int NOT NULL,
CONSTRAINT pk_SomeTable PRIMARY KEY(PartitionKey,SomeValue)
);
INSERT #SomeTable
SELECT TOP (12)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))/5+1,
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))*5
FROM sys.all_columns;
--; 3.2. Using NTILE and PARTITION BY
SELECT
s.PartitionKey,
s.SomeValue,
NTILE(3) OVER (PARTITION BY s.PartitionKey ORDER BY s.SomeValue) AS TileGroup
FROM #SomeTable s;
--; 3.3. Using the NTallyRangeAB function
WITH
anchor AS -- Use ROW_NUMBER for your partitioning and sorting
(
SELECT RN = ROW_NUMBER() OVER (PARTITION BY v.PartitionKey ORDER BY v.SomeValue),
PartitionKey, SomeValue
FROM #SomeTable AS v
),
parts AS -- collect the number of rows per partition
(
SELECT a.PartitionKey, mxrn = MAX(a.RN)
FROM anchor AS a
GROUP BY a.PartitionKey
)
SELECT a.PartitionKey, a.SomeValue, nt.tile AS TileGroup
FROM parts AS p
CROSS APPLY core.NTallyRangeAB(3,mxrn,0) AS nt
CROSS APPLY anchor AS a
WHERE p.PartitionKey = a.PartitionKey
AND a.RN = nt.RN;
-----------------------------------------------------------------------------------------
[Revision History]:
Rev 00 - 20190114 - Initial Creation - AJB
*****************************************************************************************/
RETURNS TABLE WITH SCHEMABINDING AS RETURN
SELECT RN = ROW_NUMBER() OVER (ORDER BY t.RN),
Tile = t.RN,
TileOp = t.OP
FROM core.rangeAB(1,@tiles,1,1) AS t
CROSS APPLY (VALUES(IIF(@desc=0,t.RN,t.OP))) AS d(D)
CROSS APPLY core.rangeAB(1,@rows/@tiles+IIF(d.D<=@rows%@tiles,1,0),1,1) AS x;
GO
2条答案
按热度按时间wd2eg0qa1#
对行数进行模化:
如果没有要排序的列,请使用:
但请注意,它不会生成稳定/可预测的行编号
wwtsj6pe2#
考伊斯的答案正是我该怎么做。我将在他的答案的基础上,引入Ntaly表。代码如下。简言之,它做的和ntile做的一样,只是速度更快。这里也可以用。
请注意:
退货:
靠着table:
退货:
Ntaly代码: