sql—在postgresql的后续行中分布值

des4xlb0  于 2021-07-29  发布在  Java
关注(0)|答案(1)|浏览(183)

我有两张table。一个包含用户状态转换。

create table state (
  id serial primary key,
  ctime timestamp with time zone not null,
  state text not null
);

insert into state (ctime, state) values
  ('2019-05-01 03:58:40+00', 'Busy')
, ('2019-05-01 03:58:42+00', 'Ready')
, ('2019-05-01 04:00:14+00', 'Busy')
, ('2019-05-01 04:16:26+00', 'Ready')
, ('2019-05-01 04:16:36+00', 'Busy')
;

另一个包含用户的操作。

create table action (
  id serial primary key,
  ctime timestamp with time zone not null,
  action text not null
);

insert into action (ctime, action) values
  ('2019-05-01 03:58:42+00', 'vasah')
, ('2019-05-01 03:58:42+00', 'mituh')
, ('2019-05-01 04:00:14+00', 'jumuf')
, ('2019-05-01 04:00:16+00', 'vibaj')
, ('2019-05-01 04:00:16+00', 'sasij')
, ('2019-05-01 04:16:21+00', 'husih')
, ('2019-05-01 04:16:26+00', 'radod')
, ('2019-05-01 04:16:30+00', 'zadub')
, ('2019-05-01 04:16:35+00', 'mimoh')
, ('2019-05-01 04:16:36+00', 'rimoh')
, ('2019-05-01 04:16:37+00', 'zahuf')
, ('2019-05-01 04:16:37+00', 'fisak')
;

很容易合并这些表,并直观地查看执行操作的状态。

select *
from (
  select ctime, state, null from state
   union all
     select ctime, null, action from action) x
order by ctime;

输出:

2019-05-01 06:58:40+03 | Busy  |
 2019-05-01 06:58:42+03 |       | mituh
 2019-05-01 06:58:42+03 |       | vasah
 2019-05-01 06:58:42+03 | Ready |
 2019-05-01 07:00:14+03 |       | jumuf
 2019-05-01 07:00:14+03 | Busy  |
 2019-05-01 07:00:16+03 |       | vibaj
 2019-05-01 07:00:16+03 |       | sasij
 2019-05-01 07:16:21+03 |       | husih
 2019-05-01 07:16:26+03 | Ready |
 2019-05-01 07:16:26+03 |       | radod
 2019-05-01 07:16:30+03 |       | zadub
 2019-05-01 07:16:35+03 |       | mimoh
 2019-05-01 07:16:36+03 | Busy  |
 2019-05-01 07:16:36+03 |       | rimoh
 2019-05-01 07:16:37+03 |       | zahuf
 2019-05-01 07:16:37+03 |       | fisak

我怎样才能“填空”,使输出是这样的?

2019-05-01 06:58:40+03 | Busy  |
 2019-05-01 06:58:42+03 | Busy  | mituh
 2019-05-01 06:58:42+03 | Busy  | vasah
 2019-05-01 06:58:42+03 | Ready |
 2019-05-01 07:00:14+03 | Ready | jumuf
 2019-05-01 07:00:14+03 | Busy  |
 2019-05-01 07:00:16+03 | Busy  | vibaj
 2019-05-01 07:00:16+03 | Busy  | sasij
 2019-05-01 07:16:21+03 | Busy  | husih
 2019-05-01 07:16:26+03 | Ready |
 2019-05-01 07:16:26+03 | Ready | radod
 2019-05-01 07:16:30+03 | Ready | zadub
 2019-05-01 07:16:35+03 | Ready | mimoh
 2019-05-01 07:16:36+03 | Busy  |
 2019-05-01 07:16:36+03 | Busy  | rimoh
 2019-05-01 07:16:37+03 | Busy  | zahuf
 2019-05-01 07:16:37+03 | Busy  | fisak
soat7uwm

soat7uwm1#

这是一个 lag(ignore nulls) 真的很有用。但不适用于postgres。所以你可以分两步来做。根据分配组 state . 然后传播价值:

select t.*,
       max(state) over (partition by grp) as imputed_state
from (select t.*, count(state) over (order by ctime) as grp
      from t
     ) t;

请注意 t 实际上是你的 union all 查询。我将其抽象为一个表别名,以说明逻辑的重要部分。
此外,您的数据中还存在关联(具有相同时间的行)。这意味着答案是不确定的——关系可以同时在状态之前或之后。
这是一把小提琴。

相关问题