sparksql-lag,当我更改列时,结果得到不同的行

oknrviil  于 2021-05-24  发布在  Spark
关注(0)|答案(0)|浏览(270)

当一个字段符合某些条件时,我试图延迟它,因为我需要使用过滤器,所以我使用max函数来延迟它,因为延迟函数本身不能按我需要的方式工作。我已经能够使用下面的id\u event\u log代码来完成它,但是当我将max中的id\u event\u log更改为ensaio列时,我会延迟ensaio列,它不能正常工作。下面的例子。
数据集:

+------------+---------+------+
|ID_EVENT_LOG|ID_PAINEL|ENSAIO|
+------------+---------+------+
|           1|        1|  null|
|           2|        1|  null|
|           3|        1|INICIO|
|           4|        1|  null|
|           5|        1|  null|
|           6|        1|  null|
|           7|        1|   FIM|
|           8|        1|  null|
|           9|        1|  null|
|          10|        1|  null|
|          11|        2|   FIM|
|          12|        2|   FIM|
|          13|        2|INICIO|
|          14|        2|  null|
|          15|        2|   FIM|
+------------+---------+------+

工作代码

DFReadFile = spark.read.format('csv').option("header", "true").option('sep',',').load('12_delete_between_inicio_fim_v4.csv')

DFReadFile.show()

DFReadFile.createOrReplaceTempView("12_delete_between_inicio_fim")

sqlDF = spark.sql("SELECT *, \
                        CASE \
                            WHEN (ENSAIO like '%null%') THEN \
                                MAX(CASE WHEN (ENSAIO like '%INICIO%') OR (ENSAIO like '%FIM%') THEN ID_EVENT_LOG END) \
                                    OVER (PARTITION BY ID_PAINEL ORDER BY int(ID_EVENT_LOG) RANGE BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)  \
                            ELSE 0 \
                            END as TESTE \
                            FROM 12_delete_between_inicio_fim")

sqlDF.show()

结果:

+------------+---------+------+-----+
|ID_EVENT_LOG|ID_PAINEL|ENSAIO|TESTE|
+------------+---------+------+-----+
|           1|        1|  null| null|
|           2|        1|  null| null|
|           3|        1|INICIO|    0|
|           4|        1|  null|    3|
|           5|        1|  null|    3|
|           6|        1|  null|    3|
|           7|        1|   FIM|    0|
|           8|        1|  null|    7|
|           9|        1|  null|    7|
|          10|        1|  null|    7|
|          11|        2|   FIM|    0|
|          12|        2|   FIM|    0|
|          13|        2|INICIO|    0|
|          14|        2|  null|   13|
|          15|        2|   FIM|    0|
+------------+---------+------+-----+

要解决的错误:
数据集相同
无效代码(唯一的更改是ensaio的id\事件\日志):

DFReadFile = spark.read.format('csv').option("header", "true").option('sep',',').load('12_delete_between_inicio_fim_v4.csv')

DFReadFile.show()

DFReadFile.createOrReplaceTempView("12_delete_between_inicio_fim")

sqlDF = spark.sql("SELECT *, \
                        CASE \
                            WHEN (ENSAIO like '%null%') THEN \
                                MAX(CASE WHEN (ENSAIO like '%INICIO%') OR (ENSAIO like '%FIM%') THEN ENSAIO END) \
                                    OVER (PARTITION BY ID_PAINEL ORDER BY int(ID_EVENT_LOG) RANGE BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)  \
                            ELSE 0 \
                            END as TESTE \
                            FROM 12_delete_between_inicio_fim")

sqlDF.show()

结果:

+------------+---------+------+------+
|ID_EVENT_LOG|ID_PAINEL|ENSAIO| TESTE|
+------------+---------+------+------+
|           1|        1|  null|  null|
|           2|        1|  null|  null|
|           3|        1|INICIO|     0|
|           4|        1|  null|INICIO|
|           5|        1|  null|INICIO|
|           6|        1|  null|INICIO|
|           7|        1|   FIM|     0|
|           8|        1|  null|INICIO|
|           9|        1|  null|INICIO|
|          10|        1|  null|INICIO|
|          11|        2|   FIM|     0|
|          12|        2|   FIM|     0|
|          13|        2|INICIO|     0|
|          14|        2|  null|INICIO|
|          15|        2|   FIM|     0|
+------------+---------+------+------+

预期结果:

+------------+---------+------+------+
|ID_EVENT_LOG|ID_PAINEL|ENSAIO| TESTE|
+------------+---------+------+------+
|           1|        1|  null|  null|
|           2|        1|  null|  null|
|           3|        1|INICIO|     0|
|           4|        1|  null|INICIO|
|           5|        1|  null|INICIO|
|           6|        1|  null|INICIO|
|           7|        1|   FIM|     0|
|           8|        1|  null|   FIM|
|           9|        1|  null|   FIM|
|          10|        1|  null|   FIM|
|          11|        2|   FIM|     0|
|          12|        2|   FIM|     0|
|          13|        2|INICIO|     0|
|          14|        2|  null|INICIO|
|          15|        2|   FIM|     0|
+------------+---------+------+------+

先谢谢你

暂无答案!

目前还没有任何答案,快来回答吧!

相关问题