sql中的多行和计数行

goucqfw6  于 2021-05-29  发布在  Hadoop
关注(0)|答案(3)|浏览(312)

我有一张有仓库进出口的table。我想多行到每天,然后计算和总和存储成本到每天((原始图像)

+ -------- + -------------- + ---------- + ---------- + -------------- + ------------ +
| material | wasting_time_a | indate     | outdate    | count_material | storage_cost |
+ -------- + -------------- + ---------- + ---------- + -------------- + ------------ +
| 963651   | 5              | 2016-12-02 | 2016-12-06 | 2              | 0.04357      |
| 963651   | 6              | 2016-12-02 | 2016-12-07 | 1              | 0.02615      |
| 963651   | 7              | 2016-12-02 | 2016-12-08 | 1              | 0.0305       |
| 963651   | 11             | 2016-12-02 | 2016-12-12 | 4              | 0.1917       |
| 963651   | 12             | 2016-12-02 | 2016-12-13 | 1              | 0.05229      |
| 963651   | 13             | 2016-12-02 | 2016-12-14 | 3              | 0.1699       |
| 963651   | 14             | 2016-12-02 | 2016-12-15 | 9              | 0.5490       |
| 963651   | 15             | 2016-12-02 | 2016-12-16 | 7              | 0.4575       |
| 963651   | 16             | 2016-12-02 | 2016-12-17 | 2              | 0.1394       |
| 963651   | 18             | 2016-12-02 | 2016-12-19 | 5              | 0.3922       |
| 963651   | 19             | 2016-12-02 | 2016-12-20 | 6              | 0.4968       |
| 963651   | 20             | 2016-12-02 | 2016-12-21 | 6              | 0.5229       |
| 963651   | 21             | 2016-12-02 | 2016-12-22 | 2              | 0.1830       |
| 963651   | 22             | 2016-12-02 | 2016-12-23 | 1              | 0.0959       |
| 963651   | 2              | 2016-12-22 | 2016-12-23 | 2              | 0.01743      |
| 963651   | 9              | 2016-12-22 | 2016-12-30 | 3              | 0.1177       |
| 963651   | 10             | 2016-12-22 | 2016-12-31 | 1              | 0.04357      |
| 963651   | 12             | 2016-12-22 | 2017-01-02 | 1              | 0.05229      |
| 963651   | 14             | 2016-12-22 | 2017-01-04 | 2              | 0.1220       |
+ -------- + -------------- + ---------- + ---------- + -------------- + ------------ +

等待时间只不过是过期和过期的区别。盘点物料是过期时从仓库取出的物料的盘点。
存储成本每天都在增加,所以我需要具体计算每一天。我的想法是这样计算每一天:我从indate到outdate为每一行计算多行,最后我将求和。根据我的表格,2016-12-02的存储成本将是按indate统计的物料组的总和。但我不知道怎么计算2016-12-03。
我正在使用impala,但每一个sql都会有所帮助:)
表格如下:

+ ---------- + ------------------ +
| DATE       | total_storage_cost |
+ ---------- + ------------------ +
| 2016-12-02 | 40                 |
| 2016-12-03 | 47                 |
| 2016-02-04 | ...                |
| 2016-02-05 | ...                |
| 2016-02-06 | ...                |
+ ---------- + ------------------ +

谢谢你的帮助

lmyy7pcs

lmyy7pcs1#

@全部。我已经关注这些hadoop讨论几个星期了。我在这里读了1000多篇文章。我也在读几本关于hadoop的书。这篇文章让我觉得hadoop就像sqlserver。我真的看不出SQLServer和hadoop之间有什么区别。不过,正如我在上面提到的,我对整个hadoop概念非常陌生。我使用SQLServer已经快10年了。如果我已经非常了解sql server,那么像我这样的人学习hadoop有很大的优势吗?我只是好奇。
谢谢。

6mzjoqzu

6mzjoqzu2#

这可能是个开始。但是你想如何计算表中没有的天数呢?也许你可以再详细一点?

DECLARE @D TABLE(mDate DATE)
INSERT INTO @D VALUES ('20161201'),('20161202'),('20161203'),('20161204'),('20161205'),('20161206'),('20161207'),('20161208'),
                  ('20161209'),('20161210'),('20161211'),('20161212'),('20161213'),('20161214'),('20161215'),('20161216'),
                  ('20161217'),('20161218'),('20161219'),('20161220'),('20161221'),('20161222'),('20161223'),('20161224')
DECLARE @T TABLE(Material INT
            ,Waiting_Time_a INT
            ,Indate DATE
            ,Outdate DATE
            ,Count_Material INT
            ,Storage_cost DECIMAL(18,10))
INSERT INTO @T VALUES
(963651,5,'20161202','20161206',2,0.0435749999),
(963651,6,'20161202','20161207',1,0.026145),
(963651,7,'20161202','20161208',1,0.0305025000),
(963651,11,'20161202','20161212',4,0.19173),
(963651,12,'20161202','20161213',1,0.05229),
(963651,13,'20161202','20161214',3,0.1699425),
(963651,14,'20161202','20161215',9,0.5490449999),
(963651,15,'20161202','20161216',7,0.4575375),
(963651,16,'20161202','20161217',2,0.13944),
(963651,18,'20161202','20161219',5,0.3921750000),
(963651,19,'20161202','20161220',6,0.4967549999),
(963651,20,'20161202','20161221',6,0.522899999),
(963651,21,'20161202','20161222',2,0.183015),
(963651,22,'20161202','20161223',1,0.095865),
(963651,2,'20161222','20161223',2,0.01743),
(963651,9,'20161222','20161230',3,0.1176525000),
(963651,10,'20161222','20161231',1,0.04357499999),
(963651,12,'20161222','20170102',1,0.05229),
(963651,14,'20161222','20170104',2,0.1220100000)

SELECT d.mDate, t.dCost
FROM @D AS d
   LEFT OUTER JOIN
    (SELECT Indate, SUM(Storage_cost / Waiting_Time_a) AS dCost
     FROM @T
     GROUP BY Indate) AS t
        ON t.Indate = d.mDate
wtzytmuj

wtzytmuj3#

对于某些rdbms,这可以使用递归cte来完成,但impala不支持它们。
对于您的情况,我建议创建一些仅包含单列的表,其中包含带有每日增量的日期(2016-12-02、2016-12-03、2016-12-04……)。
如果您不需要任何编码,可以通过ms excel完成,然后导出到csv,然后导入到hdfs。
然后您可以像在表联接和日期范围过滤中那样使用这个表。

相关问题