我有两个表c0
和m0
,我使用左连接合并。结果是一个有13行的表。我想添加下一个40天窗口的第一个日期。以下是数据:
DROP TABLE IF EXISTS test0.c0;
CREATE TABLE test0.c0 (
id INTEGER
, cons_dt DATE
);
INSERT INTO test0.c0 VALUES
('1','2000-01-01')
,('1','2000-02-01')
,('1','2000-03-01')
,('1','2000-04-01')
,('1','2000-05-01')
,('1','2000-06-01')
,('1','2000-07-01')
,('1','2000-08-01')
,('1','2000-09-01')
,('1','2000-10-01')
,('1','2000-11-01')
,('1','2000-12-01')
;
DROP TABLE IF EXISTS test0.m0;
CREATE TABLE test0.m0 (
id INTEGER
, start_dt DATE
, atc CHAR(1)
);
INSERT INTO test0.m0 VALUES
('1','2000-03-01','A')
,('1','2000-04-01','A')
,('1','2000-08-01','A')
,('1','2000-08-01','B')
,('1','2000-09-01','A')
,('1','2000-10-01','B')
;
字符串
下面是代码:
SELECT c.*, m.start_dt, m.atc
, (SELECT MIN(c.cons_dt)
FROM test0.c0 c2
WHERE c2.id = c.id
AND c2.cons_dt BETWEEN c.cons_dt + INTERVAL 1 DAY AND c.cons_dt + INTERVAL 40 DAY
) AS stop_dt
FROM test0.c0 c
LEFT OUTER JOIN test0.m0 m ON c.id = m.id AND c.cons_dt = m.start_dt
型
输出如下:
+----+------------+------------+-----+---------+
| id | cons_dt | start_dt | atc | stop_dt |
+----+------------+------------+-----+---------+
| 1 | 2000-03-01 | 2000-03-01 | A | \N |
+----+------------+------------+-----+---------+
型
如果条件没有被满足,我希望得到13行带有停止日期或没有停止日期的数据。代码有什么问题吗?我知道我可以使用windows函数,但这不适用于这种数据。考虑到这是一个简化的数据集。原始数据集有许多id和各种atc。
更新1:这里有一个代码,它给出了正确的结果:
SELECT c.*, m.start_dt, m.atc, MIN(c2.cons_dt) AS stop_dt
FROM test0.c0 c
LEFT OUTER JOIN test0.m0 m ON c.id = m.id AND c.cons_dt = m.start_dt
LEFT JOIN test0.c0 c2 ON c.id = c2.id
AND c2.cons_dt BETWEEN c.cons_dt + INTERVAL 1 DAY AND c.cons_dt + INTERVAL 30 DAY
GROUP BY c.id, c.cons_dt, m.start_dt, m.atc
;
型
这是我得到的表格,我期望:
+----+------------+------------+-----+------------+
| id | cons_dt | start_dt | atc | stop_dt |
+----+------------+------------+-----+------------+
| 1 | 2000-01-01 | \N | \N | \N |
| 1 | 2000-02-01 | \N | \N | 2000-03-01 |
| 1 | 2000-03-01 | 2000-03-01 | A | \N |
| 1 | 2000-04-01 | 2000-04-01 | A | 2000-05-01 |
| 1 | 2000-05-01 | \N | \N | \N |
| 1 | 2000-06-01 | \N | \N | 2000-07-01 |
| 1 | 2000-07-01 | \N | \N | \N |
| 1 | 2000-08-01 | 2000-08-01 | A | \N |
| 1 | 2000-08-01 | 2000-08-01 | B | \N |
| 1 | 2000-09-01 | 2000-09-01 | A | 2000-10-01 |
| 1 | 2000-10-01 | 2000-10-01 | B | \N |
| 1 | 2000-11-01 | \N | \N | 2000-12-01 |
| 1 | 2000-12-01 | \N | \N | \N |
+----+------------+------------+-----+------------+
型
我只是想了解为什么子查询不起作用。一个有趣的问题也是哪个执行得更好。
更新2:ONLY_FULL_GROUP_BY slaakso指出了问题所在。当ONLY_FULL_GROUP_BY被启用时(它是默认的),引擎限制了聚合函数的使用。然而,这是令人困惑的,因为这是有效的:
SELECT c.* ,m.atc ,m.start_dt
, (SELECT COUNT(*)
FROM test0.c0 c2
LEFT OUTER JOIN test0.m0 m2 ON c2.id = m2.id AND c2.cons_dt = m2.start_dt
WHERE c2.id = c.id
AND m2.atc <=> m.atc
AND c.cons_dt > c2.cons_dt
) + 1 AS counter
FROM test0.c0 C
LEFT OUTER JOIN test0.m0 m ON c.id = m.id AND c.cons_dt = m.start_dt
ORDER BY m.atc, c.cons_dt, counter
;
型
这里是输出:
+----+------------+-----+------------+---------+
| id | cons_dt | atc | start_dt | counter |
+----+------------+-----+------------+---------+
| 1 | 2000-01-01 | \N | \N | 1 |
| 1 | 2000-02-01 | \N | \N | 2 |
| 1 | 2000-05-01 | \N | \N | 3 |
| 1 | 2000-06-01 | \N | \N | 4 |
| 1 | 2000-07-01 | \N | \N | 5 |
| 1 | 2000-11-01 | \N | \N | 6 |
| 1 | 2000-12-01 | \N | \N | 7 |
| 1 | 2000-03-01 | A | 2000-03-01 | 1 |
| 1 | 2000-04-01 | A | 2000-04-01 | 2 |
| 1 | 2000-08-01 | A | 2000-08-01 | 3 |
| 1 | 2000-09-01 | A | 2000-09-01 | 4 |
| 1 | 2000-08-01 | B | 2000-08-01 | 1 |
| 1 | 2000-10-01 | B | 2000-10-01 | 2 |
+----+------------+-----+------------+---------+
型
为什么第一个密码不起作用,而第二个密码起作用?是什么使这两个密码不同?
更新3:回到我最初的问题
我的代码是错误的。子查询中的MIN(c.cons_dt)是错误的。正确的是MIN(c2.cons_dt)。现在它给出了预期的结果。我想是因为c2.cons_dt属于子查询。
SELECT c.*, m.start_dt, m.atc
, (SELECT MIN(c2.cons_dt)
FROM test0.c0 c2
WHERE c.id = c2.id
AND c2.cons_dt BETWEEN c.cons_dt + INTERVAL 1 DAY AND c.cons_dt + INTERVAL 30 DAY
) AS stop_dt
FROM test0.c0 c
LEFT OUTER JOIN test0.m0 m ON c.id = m.id AND c.cons_dt = m.start_dt
;
型
这里有一个替代方案:
SELECT c.*, m.start_dt, m.atc
, (SELECT c2.cons_dt
FROM test0.c0 c2
WHERE c.id = c2.id
AND c2.cons_dt BETWEEN c.cons_dt + INTERVAL 1 DAY AND c.cons_dt + INTERVAL 30 DAY
ORDER BY c.id, cons_dt
LIMIT 1
) AS stop_dt
FROM test0.c0 c
LEFT OUTER JOIN test0.m0 m ON c.id = m.id AND c.cons_dt = m.start_dt
;
型
整理:我最初的代码是错误的。多亏了这个,我学会了sql_mode ONLY_FULL_GROUP_BY
。
1条答案
按热度按时间e0bqpujr1#
当您将普通列与聚合函数混合使用时,需要包含
GROUP BY
子句。确保您的服务器设置了
ONLY_FULL_GROUP_BY
模式,因为它会捕获许多不正确的查询。