累积和

lvmkulzt  于 2021-06-24  发布在  Hive
关注(0)|答案(1)|浏览(358)

我想在中使用配置单元sql进行累计计数 recorrencia 列根据其他的。

+------------+---------+-------+--------------+--+
| t.ano_mes  | t.site  | t.uf  | recorrencia  |
+------------+---------+-------+--------------+--+
| 202001     | 174     | AM    | 1            |
| 202002     | 174     | AM    | 1            |
| 202003     | 174     | AM    | 1            |
| 202004     | 174     | AM    | 1            |
| 202005     | 174     | AM    | 1            |
| 202006     | 174     | AM    | 1            |
| 202007     | 174     | AM    | 1            |
| 202008     | 174     | AM    | 1            |
| 202005     | 1JN     | SP    | 1            |
| 202006     | 1JN     | SP    | 1            |
| 202005     | 1LJ     | SP    | 1            |
| 202009     | 1LJ     | SP    | 1            |
| 202001     | 1RG     | SP    | 1            |
| 202002     | 1RG     | SP    | 1            |
| 202003     | 1RG     | SP    | 1            |
| 202004     | 1RG     | SP    | 1            |
| 202005     | 1RG     | SP    | 1            |
| 202006     | 1RG     | SP    | 1            |
| 202007     | 1RG     | SP    | 1            |

期望输出

+------------+---------+-------+--------------+--------+
| t.ano_mes  | t.site  | t.uf  | recorrencia  |cum_rec
+------------+---------+-------+--------------+--------+
| 202001     | 174     | AM    | 1            |1
| 202002     | 174     | AM    | 1            |2
| 202003     | 174     | AM    | 1            |3
| 202004     | 174     | AM    | 1            |4
| 202005     | 174     | AM    | 1            |5
| 202006     | 174     | AM    | 1            |6
| 202007     | 174     | AM    | 1            |7
| 202008     | 174     | AM    | 1            |8
| 202005     | 1JN     | SP    | 1            |1
| 202006     | 1JN     | SP    | 1            |2
| 202005     | 1LJ     | SP    | 1            |1
| 202009     | 1LJ     | SP    | 1            |2
| 202001     | 1RG     | SP    | 1            |1
| 202002     | 1RG     | SP    | 1            |2
| 202003     | 1RG     | SP    | 1            |3
| 202004     | 1RG     | SP    | 1            |4
| 202005     | 1RG     | SP    | 1            |5
| 202006     | 1RG     | SP    | 1            |6
| 202007     | 1RG     | SP    | 1            |7

我试过很多功能,比如 COUNT(*) OVER (t.ano_mes) 以及 COUNT(*) OVER (t.site) 但是它运行sum直到表的末尾,并且不作为 t.site 变化。
一旦 t.site 更改后,计数器应重新启动。

r1zk6ea1

r1zk6ea11#

那就是:

sum(recorrencia) over(partition by t.site order by t.ano_mes) as cum_rec

这个 partition by 子句使总和在每次站点更改时重置。
注意,如果 recorrencia 总是 1 ,如示例数据所示 row_number() 足够了:

row_number() over(partition by t.site order by t.ano_mes) as cum_rec

相关问题