I have the sample data shown in the table below.
| id | type| cycle| datetimestamp | status1 | status2 |
|:----:|:---:|:----:|:-------------------:|:---------:|:---------:|
| 27 | AN | 123 | 2022-12-28 04:12:31 | Normal A | Normal A |
| 27 | AR | 124 | 2022-12-28 04:12:31 | Normal A | Normal A |<--Delete
| 19 | AN | 125 | 2022-12-28 05:24:30 | Normal A | Normal A |
| 19 | AR | 126 | 2022-12-28 06:18:20 | Normal A | Normal A |
| 19 | AR | 234 | 2022-12-28 07:22:20 | Normal A | Normal A |<--Delete
| 19 | AN | 235 | 2022-12-28 07:22:20 | Normal A | Normal A |
| 20 | AR | 236 | 2022-12-28 08:25:49 | Normal A | Normal A |<--Delete
| 20 | AN | 237 | 2022-12-28 08:25:49 | Normal A | Normal A |
| 19 | AR | 129 | 2022-12-28 09:08:19 | Normal A | Normal A |<--Delete
| 19 | AN | 127 | 2022-12-28 09:08:19 | Normal A | Normal A |
| 19 | AR | 238 | 2022-12-28 10:04:31 | Normal A | Normal A |<--Delete
| 19 | AN | 230 | 2022-12-28 10:04:31 | Normal A | Normal A |
| 22 | AN | 239 | 2022-12-28 11:04:58 | Normal A | Normal A |
| 22 | AR | 256 | 2022-12-28 11:04:58 | Normal A | Normal A |<--Delete
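I want to find duplicates using the columns (id, type, datetimestamp, status1, status2) and delete the duplicate rows whose type is 'AR' (keeping the ones whose type is 'AN'), so that the table ends up like this: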
| id | type| cycle| datetimestamp | status1 | status2 |
|:----:|:---:|:----:|:-------------------:|:---------:|:---------:|
| 27 | AN | 123 | 2022-12-28 04:12:31 | Normal A | Normal A |
| 19 | AN | 125 | 2022-12-28 05:24:30 | Normal A | Normal A |
| 19 | AR | 126 | 2022-12-28 06:18:20 | Normal A | Normal A |
| 19 | AN | 235 | 2022-12-28 07:22:20 | Normal A | Normal A |
| 20 | AN | 237 | 2022-12-28 08:25:49 | Normal A | Normal A |
| 19 | AN | 127 | 2022-12-28 09:08:19 | Normal A | Normal A |
| 19 | AN | 230 | 2022-12-28 10:04:31 | Normal A | Normal A |
| 22 | AN | 239 | 2022-12-28 11:04:58 | Normal A | Normal A |
But my query returns the rows with type = 'AN' instead of 'AR'.
~~~pgsql
select * from test_data e
where exists
( select * from test_data e2
where e.datetimestamp=e2.datetimestamp and e.id=e2.id
and e.status1=e2.status1
and e.status2=e2.status2
and e.type='AN' and e2.type='AR') order by e.datetimestamp asc;
~~~
Table setup queries:
~~~pgsql
CREATE TABLE test_data (
id character varying(2) NOT NULL,
type character varying(2),
cycle integer,
datetimestamp timestamp without time zone NOT NULL,
status1 character varying(10),
status2 character varying(10),
PRIMARY KEY(id, cycle, datetimestamp)
);
INSERT INTO test_data VALUES
(27, 'AN', 123, '2022-12-28 04:12:31', 'Normal A', 'Normal A')
, (27, 'AR', 124, '2022-12-28 04:12:31', 'Normal A', 'Normal A')
, (19, 'AN', 125, '2022-12-28 05:24:30', 'Normal A', 'Normal A')
, (19, 'AR', 126, '2022-12-28 06:18:20', 'Normal A', 'Normal A')
, (19, 'AR', 234, '2022-12-28 07:22:20', 'Normal A', 'Normal A')
, (19, 'AN', 235, '2022-12-28 07:22:20', 'Normal A', 'Normal A')
, (20, 'AR', 236, '2022-12-28 08:25:49', 'Normal A', 'Normal A')
, (20, 'AN', 237, '2022-12-28 08:25:49', 'Normal A', 'Normal A')
, (19, 'AR', 129, '2022-12-28 09:08:19', 'Normal A', 'Normal A')
, (19, 'AN', 127, '2022-12-28 09:08:19', 'Normal A', 'Normal A')
, (19, 'AR', 238, '2022-12-28 10:04:31', 'Normal A', 'Normal A')
, (19, 'AN', 230, '2022-12-28 10:04:31', 'Normal A', 'Normal A')
, (22, 'AN', 239, '2022-12-28 11:04:58', 'Normal A', 'Normal A')
, (22, 'AR', 256, '2022-12-28 11:04:58', 'Normal A', 'Normal A')
;
~~~
3 Answers
r6vfmomb #1
You can list the rows that would be deleted with the following query:
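A minimal sketch of such a listing query, not the answerer's original code. It assumes a surrogate key column named `row_id` (a hypothetical name that is not in the question's schema) and that the existing primary key still has PostgreSQL's default constraint name `test_data_pkey`.

~~~pgsql
-- Hypothetical surrogate key; row_id is not part of the original schema.
ALTER TABLE test_data ADD COLUMN row_id bigserial;
-- Assumes the existing composite key has the default name test_data_pkey.
ALTER TABLE test_data DROP CONSTRAINT test_data_pkey;
ALTER TABLE test_data ADD PRIMARY KEY (row_id);

-- List every 'AR' row that has an 'AN' twin with the same
-- id, datetimestamp, status1 and status2.
SELECT *
FROM test_data
WHERE row_id IN (
    SELECT dup.row_id
    FROM test_data AS dup
    JOIN test_data AS keep
      ON (keep.id, keep.datetimestamp, keep.status1, keep.status2)
       = (dup.id, dup.datetimestamp, dup.status1, dup.status2)
    WHERE dup.type = 'AR'
      AND keep.type = 'AN'
)
ORDER BY datetimestamp;
~~~

On the sample data this should return the six rows marked `<--Delete` (cycles 124, 234, 236, 129, 238 and 256). The 'AR' row with cycle 126 is not listed because no 'AN' row shares its timestamp.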
So the delete query would be:
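A sketch of what that delete could look like, again not the answerer's original code. It assumes the table already has a unique column, here called `row_id` (a hypothetical name).

~~~pgsql
-- Assumes test_data already has the hypothetical unique column row_id,
-- e.g. added with: ALTER TABLE test_data ADD COLUMN row_id bigserial;
DELETE FROM test_data
WHERE row_id IN (
    -- 'AR' rows that have an 'AN' twin on the other compared columns
    SELECT dup.row_id
    FROM test_data AS dup
    JOIN test_data AS keep
      ON (keep.id, keep.datetimestamp, keep.status1, keep.status2)
       = (dup.id, dup.datetimestamp, dup.status1, dup.status2)
    WHERE dup.type = 'AR'
      AND keep.type = 'AN'
);
~~~

Running the subquery on its own first is a cheap way to check which rows will be removed before executing the `DELETE`.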
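A short explanation: my solution creates a new unique column and makes it the primary key, then filters the duplicate rows and deletes the ones whose type is 'AR'.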
P.S.: if you are not allowed to add a column, you can use the ctid column instead, but I don't recommend that solution.
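For illustration only, here is a sketch of the `ctid` variant that leaves the schema untouched. `ctid` is PostgreSQL's system column for a row's physical location. It can change when a row is updated or the table is rewritten (for example by `VACUUM FULL`), so the sketch uses it only inside a single statement.

~~~pgsql
-- Sketch only: identify the rows to delete by their physical location (ctid)
-- instead of an added key column.
DELETE FROM test_data
WHERE ctid IN (
    SELECT dup.ctid
    FROM test_data AS dup
    JOIN test_data AS keep
      ON (keep.id, keep.datetimestamp, keep.status1, keep.status2)
       = (dup.id, dup.datetimestamp, dup.status1, dup.status2)
    WHERE dup.type = 'AR'
      AND keep.type = 'AN'
);
~~~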
Hope this helps.
watbbzwu #2
Try this:
Test results in dbfiddle
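mnemlml8 #3
`exists` looks like a good approach. Starting from your query, I only fixed the conditions on `type` (we want to delete the 'AR' records, based on the existence of a matching 'AN' record). I also rewrote the multiple equalities as a tuple equality, but that is just syntactic sugar: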
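A sketch of the query described above, reconstructed from the description rather than copied from the answer. The outer row must be 'AR', the row whose existence is tested must be 'AN', and the remaining columns are compared as one tuple.

~~~pgsql
-- Delete each 'AR' row for which an 'AN' row exists with the same
-- id, datetimestamp, status1 and status2.
DELETE FROM test_data e
WHERE e.type = 'AR'
  AND EXISTS (
      SELECT 1
      FROM test_data e2
      WHERE e2.type = 'AN'
        AND (e2.id, e2.datetimestamp, e2.status1, e2.status2)
          = (e.id, e.datetimestamp, e.status1, e.status2)
  );
~~~

Replacing `DELETE FROM test_data e` with `SELECT * FROM test_data e` previews the affected rows. On the sample data that is the six rows marked `<--Delete`. Note that `=` never matches when `status1` or `status2` is NULL. If NULLs should count as equal, the tuple comparison can use `IS NOT DISTINCT FROM` instead.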