PostgreSQL: computing two cumulative sums for two different groups

Posted by 3vpjnl9f on 2023-02-12 in PostgreSQL

I have a table that looks like this:

id   position    value
5    senior      10000
6    senior      20000
8    senior      30000
9    junior      5000
4    junior      7000
1    junior      7000
3    junior      10000

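It is already sorted by position and value (asc). I want to count how many seniors and juniors fit within a budget of 50,000, giving priority to seniors.
So, for example, here 2 seniors (the first and second) + 3 juniors fit within the 50,000 budget.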

id   position    value     cum_sum
5    senior      10000     10000
6    senior      20000     30000
8    senior      30000     60000   ----not possible because it is more than 50000
-----------------------------------   --- so out of 50k, 30k is used for 2 seniors.
9    junior      5000      5000 
4    junior      7000      12000
1    junior      7000      19000 ---with the remaining 20k, these 3 juniors can also fit
3    junior      10000     29000

So the output should look like this:

juniors    seniors
3          2

How can I achieve this in SQL?

ni65a41a #1

Here is one possible solution: DB Fiddle

with seniorsCte as (
  select id, position, value, total
  from budget b
  inner join (
    select id, position, value, (sum(value) over (order by value, id)) total
    from people
    where position = 'senior'
  ) as s 
  on s.total <= b.amount
)
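, juniorsCte as (
  -- total is the running cost including what the seniors already used
  select j.id, j.position, j.value, j.total + r.seniorsTotal as total
  from (
    select  coalesce(max(total), 0) seniorsTotal
    , max(b.amount) - coalesce(max(total), 0) remainingAmount
    from budget b
    -- left join keeps the budget row even when no senior fits;
    -- a cross join with an empty seniorsCte would make remainingAmount null
    left join seniorsCte on true
  ) as r
  inner join (
    select id, position, value, (sum(value) over (order by value, id)) total
    from people
    where position = 'junior'
  ) as j
  on j.total <= r.remainingAmount
)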
/* use this if you want the specific records
select *
from seniorsCte 
union all
select *
from juniorsCte
*/
select (select count(1) from seniorsCte) seniors
, (select count(1) from juniorsCte) juniors

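Judging from your question, I'd guess you're familiar with window functions, but in case you aren't: the query below takes all rows from the people table where position is 'senior' and adds a column total, which is the running sum of value over those rows, starting from the smallest value and going up. Rows are then ordered by id so the behavior is consistent when several rows share the same value, though that is not strictly required if we are happy to take those rows in arbitrary order.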

select id, position, value, (sum(value) over (order by value, id)) total
from people
where position = 'senior'
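For readers who prefer to see the running total outside SQL, here is a minimal Python sketch of what that window expression computes. Python is not used in the original answer, and the function name running_totals is made up for illustration; the rows are the three seniors from the question.

```python
from itertools import accumulate

# Senior rows from the question as (id, value) pairs, deliberately unsorted.
seniors = [(8, 30000), (5, 10000), (6, 20000)]

def running_totals(rows):
    """Return (id, value, total) tuples ordered by value, then id.

    Mirrors sum(value) over (order by value, id): each total is the sum
    of this row's value and all rows that sort before it.
    """
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))
    totals = accumulate(value for _, value in ordered)
    return [(row_id, value, total)
            for (row_id, value), total in zip(ordered, totals)]

for row in running_totals(seniors):
    print(row)
```

The third row's total is 60000, which is why only the first two seniors pass the s.total <= b.amount join condition for a budget of 50000.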

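The budget table is just something I used to hold a single row with a single value stating our cutoff; it avoids hard-coding the 50k value you mentioned, so we can easily change it as needed.
The common table expressions (CTEs) let us filter the juniors subquery based on the output of the seniors subquery (we only want juniors up to the difference between the budget and the seniors' total), while still returning the junior and senior results independently. If we wanted the actual rows rather than just the counts, this lets us do a union all between the two sets, as shown in the commented-out code.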
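The two-stage logic (seniors against the full budget, then juniors against whatever is left) can be sketched in plain Python as a rough model of the query, not a replacement for it. The function names are hypothetical, and the sample data assumes the four junior rows shown in the question's worked example, including id 1.

```python
from itertools import accumulate

def count_within(values, limit):
    """Count the cheapest values whose running total stays <= limit.

    Returns (count, amount_spent). Assumes all values are positive.
    """
    totals = list(accumulate(sorted(values)))
    fitting = [t for t in totals if t <= limit]
    return len(fitting), (fitting[-1] if fitting else 0)

def seniors_then_juniors(people, budget):
    """people is a list of (position, value) pairs."""
    seniors = [v for pos, v in people if pos == "senior"]
    juniors = [v for pos, v in people if pos == "junior"]
    n_seniors, spent = count_within(seniors, budget)
    # Juniors only get what the seniors left over.
    n_juniors, _ = count_within(juniors, budget - spent)
    return {"seniors": n_seniors, "juniors": n_juniors}

people = [
    ("senior", 10000), ("senior", 20000), ("senior", 30000),
    ("junior", 5000), ("junior", 7000), ("junior", 7000), ("junior", 10000),
]
print(seniors_then_juniors(people, 50000))  # {'seniors': 2, 'juniors': 3}
```

If no senior fits, the amount spent is 0 and the juniors are checked against the whole budget.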

fslejnso #2

For this to work, the sum has to be not only cumulative but also selective. As mentioned in the comments, you can do that with a recursive CTE: online demo

with recursive 
 ordered as --this will be fed to the actual recursive cte
(   select *,
           row_number() over (order by position desc,value asc) 
    from test_table)
,recursive_cte as 
( select id,
         position,
         value, 
         value*(value<50000)::int as cum_sum,
         value<50000 as is_hired,
         2 as next_i
  from ordered
  where row_number=1
  union
  select o.id,
         o.position,
         o.value, 
         case when o.value+r.cum_sum<50000 then o.value+r.cum_sum else r.cum_sum end,
         (o.value+r.cum_sum)<50000 as is_hired,
         r.next_i+1 as next_i
  from recursive_cte r, 
       ordered o
  where o.row_number=next_i
)
select count(*) filter (where position='junior') as juniors,
       count(*) filter (where position='senior') as seniors
from recursive_cte 
where is_hired;
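  • row_number() over () is a window function.
  • count(*) filter (where...) is an aggregate filter. It is a faster variant of the sum(case when expr then a else 0 end) or count(nullif(expr)) approach, for when you only want to aggregate a specific subset of values. Here it only serves to put the counts into columns, as in your expected result; the same numbers could be produced as stacked rows with select position, count(*) from recursive_cte where is_hired group by position.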

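All it does is order your list according to your priorities in the first CTE, then walk through it row by row in the second one, adding a row to the running total only if the total would still be below your limit/budget.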
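The row-by-row walk is easier to follow as an ordinary loop. The following Python sketch is an assumed equivalent of the recursive CTE, not code from the answer; the function name is hypothetical, and the strict < comparison is kept to match the query. The sample data again assumes the four junior rows from the question's worked example.

```python
def selective_running_sum(people, budget):
    """people is a list of (position, value) pairs.

    Visits seniors first, cheapest first within each position, and adds a
    row to the running total only if the new total stays below the budget.
    """
    ordered = sorted(people, key=lambda p: (p[0] != "senior", p[1]))
    spent = 0
    hired = {"seniors": 0, "juniors": 0}
    for position, value in ordered:
        if spent + value < budget:  # strict comparison, as in the query
            spent += value
            hired[position + "s"] += 1
        # Otherwise the row is skipped and the total carries over unchanged.
    return hired

people = [
    ("senior", 10000), ("senior", 20000), ("senior", 30000),
    ("junior", 5000), ("junior", 7000), ("junior", 7000), ("junior", 10000),
]
print(selective_running_sum(people, 50000))  # {'seniors': 2, 'juniors': 3}
```

The third senior would bring the total to 60000, so it is skipped and the total stays at 30000; the first three juniors then raise it to 49000, and the last junior (59000) is skipped as well.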

14ifxucb #3

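PostgreSQL supports the window form SUM(col) OVER(). Note that the query below checks each position's running total against the full 50000 separately and orders by id, so the seniors' spend is not deducted from the juniors' budget: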

with cte as (
  SELECT *, SUM(value) OVER(PARTITION BY position ORDER BY id) AS cumulative_sum
  FROM mytable
)
select position, count(1)
from cte
where cumulative_sum < 50000
group by position

Another way to get the result in a single row:

with cte as (
  SELECT *, SUM(value) OVER(PARTITION BY position ORDER BY id) AS cumulative_sum
  FROM mytable
),
cte2 as (
  select position, count(1) as _count
  from cte
  where cumulative_sum < 50000
  group by position
)
select
sum(case when position = 'junior' then _count else null end) juniors,
sum(case when position = 'senior' then _count else null end) seniors
from cte2

Demo here

06odsfpq #4

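An example using running totals, with the juniors' limit of 20000 (what remains after the two seniors) hard-coded: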

select 
count(case when chek_sum_jun > 0 and position = 'junior'  then position else null end) chek_jun,
count(case when chek_sum_sen > 0 and position = 'senior' then  position else null end) chek_sen
    from (
    select position, 
    20000 - sum(case when position = 'junior' then value else 0 end) over (partition by position order by value asc rows between unbounded preceding and current row )  chek_sum_jun,
    50000 - sum(case when position = 'senior' then value else 0 end) over (partition by position order by value asc rows between unbounded preceding and current row )  chek_sum_sen
    from test_table) x

Demo: https://dbfiddle.uk/ZgOoSzF0
