在sql server中计算缺少存储桶的直方图

zyfwsgd6  于 2021-07-26  发布在  Java
关注(0)|答案(3)|浏览(242)

我有以下数据集:

| 0 |
    | 0 |
    | 0 |
    | 1 |
    | 1 |
    | 1 |
    | 1 |
    | 1 |
    | 1 |
    | 2 |
    | 2 |
    | 2 |
    | 2 |
    | 2 |
    | 2 |
    | 2 |
    | 3 |
    | 3 |
    | 3 |
    | 4 |
    | 4 |
    | 4 |
    | 4 |
    | 4 |
    | 4 |
    | 6 |
    | 6 |
    | 6 |
    | 7 |
    | 7 |
    | 7 |
    | 7 |
    | 8 |
    | 8 |
    | 8 |
    | 9 |
    | 9 |
    | 14 |
    | 16 |
    | 21 |
    | 28 |
    | 30 |
    | 32 |
    | 57 |

我有以下sql查询:

select bucket_floor,
CONCAT(bucket_floor, ' to ', bucket_ceiling-1) as bucket_name,
count(*) as count
  from (
      select 
          floor([value]/5)*5 as bucket_floor,
          floor([value]/5)*5 + 5 as bucket_ceiling
      from @URQuartileDataRaw
        ) a
        group by bucket_floor, CONCAT(bucket_floor, ' to ', bucket_ceiling-1)
        order by bucket_floor;

当我运行这个时,我得到以下信息:

|Bucket_Name|Release_Count|
    |0 to 4     |25
    |5 to 9     |12
    |10 to 14   |1
    |15 to 19   |1
    |20 to 24   |1
    |25 to 29   |1
    |30 to 34   |2
    |55 to 59   |1

如何插入35到39、40到44、45到49、50到54的bucket,因为数据集中没有这些bucket的数据,但是我需要它们,因为这些数据提供了一个图表?
我意识到所有的建议都假设我知道我需要多少组。我需要水桶是动态的,所以我想出了以下方法:

declare @Buckets table(
   ROWID INT IDENTITY(1,1),
   LowValue int,
   HiValue int
   )

   declare @Maxiteratons int = (Select ceiling(max([value])/5) from @QuartileDataRaw)

   -- Declare an iterator
   DECLARE @iteratons INT, @LowVal int = 0, @HiVal int = 4
   -- Initialize the iterator
   SET @iteratons = 0

    -- Loop through the rows of a table @myTable
   WHILE (@iteratons < @Maxiteratons)
       BEGIN
           insert into @Buckets (LowValue,HiValue)
           select @LowVal, @HiVal

           set @LowVal = @LowVal + 5
           set @HiVal = @HiVal + 5

           - Increment the iterator
           SET @iteratons = @iteratons + 1

       END

       ---- Get the number of rows in the looping table
       SET @Maxiteratons = (SELECT COUNT(ROWID) FROM @Buckets) 
       SET @iteratons = 1
       WHILE (@iteratons <= @Maxiteratons)
           BEGIN

                  -- Get the data from table and set to variables
                  SELECT @LowVal = LowValue, @HiVal = HiValue  FROM @Buckets WHERE ROWID = @iteratons

                  Insert into @DataTable (Bucket_Floor,Bucket_Name,Release_Count)

                  select 
                  @LowVal, CONCAT(@LowVal, ' to ', @HiVal),Count(*)
                  from @QuartileDataRaw
                  where [value] >= @LowVal
                  and [value] <= @HiVal

                  SET @iteratons = @iteratons + 1

                 END
6qfn3psc

6qfn3psc1#

我喜欢将我的层保存在一个通用表中。这将从代码中删除逻辑,并允许您通过向tiergrp传递一个参数来为多个主机提供服务
例子

Declare @Tier table (TierGrp varchar(50),TierTitle varchar(50),TierR1 money,TierR2 money)
Insert Into @Tier values
 ('Sample','0 - 4'   ,0   ,5)
,('Sample','5 - 10'  ,5   ,10)
,('Sample','10 - 14' ,10  ,15)
,('Sample','15 - 19' ,15  ,20)
,('Sample','20 - 24' ,20  ,25)
,('Sample','25+'     ,25  ,999999)

Declare @YourTable table (SomeCol int)
Insert Into @YourTable values
 (0),(0),(0),(1),(1),(15)

Select T.TierTitle
      ,ItemCnt = count(SomeCol)
 From  @Tier   T 
 Left  Join  @YourTable D on TierGrp='Sample' and SomeCol >=TierR1 and SomeCol <TierR2
 Group By T.TierR1,T.TierTitle
 Order By T.TierR1

退货

TierTitle   ItemCnt
0 - 4       5
5 - 10      0
10 - 14     0
15 - 19     1
20 - 24     0
25+         0
zzoitvuj

zzoitvuj2#

使用 values() 或者类似的生成桶然后使用 left join :

select v.bucket, count(dr.value)
from (values (0, 4, '0 to 4'),
             (5, 9, '5 to 9'),
             . . . 
     ) v(bucket, lo, hi) left join
     @URQuartileDataRaw dr
     on dr.value between v.lo and v.hi
group by v.bucket, v.lo
order by v.lo;
soat7uwm

soat7uwm3#

使用一个派生表,把你的查询联合起来,对每个bucket生成0个计数的常量进行查询,按bucket\u name分组,求出释放\u计数的总和。

相关问题