我有一个像这样的变化表
CREATE TABLE IF NOT EXISTS changes (
entity_id TEXT NOT NULL,
column_id TEXT NOT NULL,
value JSONB NOT NULL,
updated_at TIMESTAMP NOT NULL
);
INSERT INTO changes VALUES ('1', 'height', to_jsonb(140), '01-01-2021 00:00:00'::TIMESTAMP);
INSERT INTO changes VALUES ('1', 'weight', to_jsonb(30), '01-01-2021 00:00:00'::TIMESTAMP);
INSERT INTO changes VALUES ('1', 'height', to_jsonb(145), '01-02-2021 00:00:00'::TIMESTAMP);
INSERT INTO changes VALUES ('1', 'weight', to_jsonb(34),'01-03-2021 00:00:00'::TIMESTAMP);
entity_id | column_id | value | updated_at
-----------+-----------+-------+---------------------
1 | height | 140 | 2021-01-01 00:00:00
1 | weight | 30 | 2021-01-01 00:00:00
1 | height | 145 | 2021-01-02 00:00:00
1 | weight | 34 | 2021-01-03 00:00:00
我想得到这个表的累积视图
entity_id | height | weight | updated_at
-----------+--------+--------+---------------------
1 | 140 | 30 | 2021-01-01 00:00:00
1 | 145 | 30 | 2021-01-02 00:00:00
1 | 145 | 34 | 2021-01-03 00:00:00
我当前的查询看起来有效
SELECT
entity_id,
coalesce(change->'height', lag(change->'height', 1, null) over (partition by entity_id order by updated_at)) as height,
coalesce(change->'weight', lag(change->'weight', 1, null) over (partition by entity_id order by updated_at)) as weight,
updated_at
FROM (
SELECT entity_id, json_object_agg(column_id, value) as change, updated_at FROM changes
GROUP BY entity_id, updated_at
) as changes;
但是我不喜欢这里的json_object_agg
,我确信有一种方法可以在没有冗余聚合的情况下实现它?我错过的一些使用窗口聚合函数的方法。
UPD. @SelVazi帮助使查询更好,但我觉得这不是最终的解决方案。
with cte as (
SELECT
entity_id,
max(case when column_id = 'height' then value::int end) as height,
max(case when column_id = 'weight' then value::int end) as weight,
updated_at
from changes
GROUP by entity_id, updated_at
)
select
entity_id,
coalesce(height, lag(height) over (partition by entity_id order by updated_at)) as height,
coalesce(weight, lag(weight) over (partition by entity_id order by updated_at)) as weight,
updated_at
from cte;
2条答案
按热度按时间2lpgd9681#
这比看起来要复杂得多。可以使用条件聚合来将高度和重量旋转到列,但之后我们必须填充“缺失”的值。
我假设两个度量中的任何一个都可能存在多个日期的间隔,这使得
lag()
在Postgres中是错误的,因为它只能回顾预定义的行数(并且不能忽略null
值)。我们可以通过在示例数据的末尾添加一行来演示
lag()
的问题:| 实体标识|列ID|价值|更新于|
| --------------|--------------|--------------|--------------|
| 1|高度|一百四十|2019 -01- 21 00:00:00|
| 1|重量|三十|2019 -01- 21 00:00:00|
| 1|高度|一百四十五|2019 -02-01 00:00:00|
| 1|重量|34人|2021-03-01 00:00:00|
| 1|重量|一百四十|2021-04-01 00:00:00|
一种解决方法是使用间隙和孤岛技术将“缺失”值放入以非
null
值开始的组中,然后该值成为新值。fiddle
| 实体标识|更新于|高度|重量|
| --------------|--------------|--------------|--------------|
| 1| 2019 -01- 21 00:00:00|一百四十|三十|
| 1| 2019 -02-01 00:00:00|一百四十五|三十|
| 1| 2021-03-01 00:00:00|一百四十五|34人|
| 1| 2021-04-01 00:00:00|一百四十五|一百四十|
2cmtqfgy2#
不使用
json_object_agg
,你可以使用条件聚合来获取列的高度和宽度:最后一个查询:
结果: