每次我们收到一批新的数据时,我们都会将其加载到一个表中,该表存储我们以前收到的每个数据集。每批新产品都有一个新的 batch_id
. 只有另外两列是 item_id
以及 cost
. 我想建立一个报告,它将描述添加了多少新项(基于项的id)、删除了多少项以及最新批处理中项的净变化。考虑到这一点,我想比较所有批次,但只有这样,更高的 batch_id
比较 batch_id
就在那之前。澄清一下,如果我有3个 batch_id
然后我要第3批和第2批比较,第2批和第1批比较。第一批没有什么可比性。我可以用下面的脚本来实现这一点,但只针对我指定的批处理(结果中的第二行),对于成百上千的批处理,这会让人麻木。有没有办法在SQLServer中使其更具动态性?我在下面提供了一个示例数据集和预期结果。
WITH b1
AS (SELECT *
FROM example_cumulative
WHERE batch_id = 2),
b2
AS (SELECT *
FROM example_cumulative
WHERE batch_id = 3)
SELECT SUM(CASE
WHEN b1.item_id IS NULL
THEN 1
ELSE 0
END) [new items],
SUM(CASE
WHEN b2.item_id IS NULL
THEN 1
ELSE 0
END) [removed items],
COUNT(b2.item_id) - COUNT(b1.item_id) [net change in item volume]
FROM b1
FULL OUTER JOIN b2 ON b1.item_id = b2.item_id;
预期结果
CREATE TABLE example_cumulative
(batch_id INT NOT NULL,
item_id INT NOT NULL,
cost FLOAT NOT NULL
);
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (1,10,100)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (1,11,100)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (1,12,106)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (1,13,142)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (1,14,152)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (1,15,154)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (2,10,110)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (2,11,140)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (2,12,100)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (2,16,100)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (3,10,110)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (3,11,100)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (3,16,195)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (3,17,102)
INSERT INTO example_cumulative (batch_id, item_id, cost) VALUES (3,18,101)
1条答案
按热度按时间56lgkhnf1#
你可以延长你的电流
full join
逻辑如下,通过在连续batch_id
s、 像这样:这个
where
子句的作用是防止查询返回一个额外的、不需要的记录batch_id
大于最大可用值。对于示例数据,返回: