postgresql 如何在分组集查询中仅显示前2个值和前1个值?

xqnpmsa8  于 2023-01-08  发布在  PostgreSQL
关注(0)|答案(1)|浏览(226)

我有这个PostgreSQL表和数据:

CREATE TABLE info (
  brand VARCHAR(255),
  segment VARCHAR(255),
  name VARCHAR(255)
);

INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'SUV', 'Highlander');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'SUV', 'Highlander');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'SUV', 'Highlander');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'SUV', '4Runner');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'SUV', 'RAV4');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'SUV', 'RAV4');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'Sedan', 'Camry');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'Sedan', 'Camry');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'Sedan', 'Corolla');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'Sedan', 'Corolla');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'Sedan', 'Corolla');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'Truck', 'Tacoma');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'Truck', 'Tundra');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'Truck', 'Tacoma');
INSERT INTO info (brand, segment, name) VALUES ('Toyota', 'Van', 'Sienna');

我执行此查询是为了显示每个分组集的计数,并按每个品牌、细分市场和名称的总计数对其进行排序:

SELECT
  brand,
  segment,
  name,
  count (1) as total
FROM
  info
GROUP BY
  GROUPING SETS (
    (brand),
    (brand, segment),
    (brand, segment,name)
  )
ORDER BY
 max(count (1)) over (partition by brand) desc,
 max(count (1)) over (partition by brand,segment) desc,
 count (1) desc;

This fiddle显示了它的外观。
现在,我只想选择每个品牌的前2个细分市场,以及每个品牌/细分市场的前1个名称。因此,结果应该如下所示:
| 烙印|段|姓名|总计|
| - ------|- ------|- ------|- ------|
| 丰田|||十五|
| 丰田|越野车||六个|
| 丰田|越野车|汉兰达|三个|
| 丰田|轿车||五个|
| 丰田|轿车|花冠|三个|
我试过使用窗口函数,但结果不是我所期望的

zphenhs4

zphenhs41#

尝试按如下方式使用ROW_NUMBER函数:

WITH get_grouping_set AS
(
  SELECT brand, segment, name, count(1) AS total
  FROM info
  GROUP BY GROUPING SETS 
  (
    (brand),
    (brand, segment),
    (brand, segment, name)
  )
), 
brand_segment_order AS
(
  SELECT brand, segment,
    ROW_NUMBER() OVER (PARTITION BY brand ORDER BY total DESC) rn_seg
    FROM get_grouping_set
  WHERE segment IS NOT NULL AND name IS NULL
), 
joined_data AS
(
  SELECT T.*,
    ROW_NUMBER() OVER (PARTITION BY T.brand, T.segment ORDER BY T.total DESC) rn 
  FROM get_grouping_set T JOIN brand_segment_order T2
  ON T.brand = T2.brand AND T.segment = T2.segment OR T.segment IS NULL
  WHERE T2.rn_seg <= 2
 )
SELECT brand, segment, name, total  
FROM joined_data 
WHERE (rn = 1 AND segment IS NULL ) OR (rn <= 2 AND segment IS NOT NULL)
ORDER BY brand, MAX(Total) OVER (PARTITION BY brand, segment) DESC, 
         Total DESC, segment NULLS FIRST, name NULLS FIRST

See demo

另一种解决方案。您可以对每个品牌细分组使用按max(count)排序的dense_rank函数,如下所示:

WITH get_grouping_set AS
(
  SELECT brand, segment, name, count(1) AS total, 
    MAX(count(*)) over (PARTITION BY brand, segment) max_brand_segment
  FROM info
  GROUP BY GROUPING SETS 
  (
    (brand),
    (brand, segment),
    (brand, segment, name)
  )
),
brand_segment_order AS
(
  SELECT *,
    DENSE_RANK() OVER (PARTITION BY brand ORDER BY max_brand_segment DESC) segment_rank,
    DENSE_RANK() OVER (PARTITION BY brand, segment ORDER BY Total DESC) name_rank
    FROM get_grouping_set
)
SELECT brand, segment, name, total 
FROM brand_segment_order
WHERE segment_rank <= 3 AND name_rank <= 2
ORDER BY brand, max_brand_segment DESC, 
         Total DESC, segment NULLS FIRST, name NULLS FIRST

WHERE segment_rank <= 3这将检索每个品牌的两个段,加上一个,因为包括了基本品牌(其中段为空)。
AND name_rank <= 2这将为每个段检索一个名称,加上一个,因为包括了基本段(其中名称为空)。
dense_rank函数的使用是为了在平局的情况下,即当存在具有相同max(count)的多个段/名称时,获得所有的(段,名称)。

相关问题