PySpark outputs incorrect results when using cast

yeotifhr · posted 2023-10-15 in Spark
Follow (0) | Answers (1) | Views (97)

When using spark.sql and calling cast() to decimal(38,16), the result is not computed to 16 decimal places: only the first 10 decimal digits carry significant values, and the remaining places are padded with zeros.
The relevant code is below:

cast(
    AVG(
        CASE WHEN
            a.is_debtor = 1 AND
            a.is_kk = 0 AND
            a.is_open = 0
        THEN datediff((a.date_end_plan + interval '1 day'), a.date_start) / (365.25 / 12)
        END
    ) as decimal(38,16)
) AS avg_term_plan_closed

This produces the following output:

+--------------------+
|avg_term_plan_closed|
+--------------------+
| -3.2744695414000000|
| 11.2689938398000000|
|  0.9856262834000000|
|                null|
| -1.1498973306000000|
                  ...
+--------------------+
only showing top 20 rows

However, the result should look like this:

+--------------------+
|avg_term_plan_closed|
+--------------------+
| -3.2744695414099936|
| 11.2689938398357280|
|  0.9856262833675564|
|                null|
| -1.1498973305954825|
                  ...
+--------------------+

I have tried placing the CAST in every possible spot in the query, but it did not help. Interestingly, if you remove the division by (365.25/12), the result comes out correct. Please help, I have been struggling with this bug for a week.
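What seems to happen (a sketch of the mechanism, not Spark's exact implementation): the literal 365.25 is parsed as a decimal, and decimal-by-decimal division in Spark derives the result's precision and scale from the operand types; when the derived precision would exceed 38, the scale is reduced, here leaving only 10 significant decimal places before the final cast to decimal(38,16). Python's `decimal` module can mimic the effect, using the value 343 that appears in the answer below:

```python
from decimal import Decimal, localcontext

numdays = 343  # example day count, same row as in the answer below

# 365.25 / 12 is exact in decimal arithmetic: 30.4375
divisor = Decimal("365.25") / Decimal(12)

with localcontext() as ctx:
    ctx.prec = 38
    # Full-precision quotient: what decimal(38,16) should be able to hold
    full = Decimal(numdays) / divisor

    # Mimic an intermediate result capped at scale 10, then cast to scale 16:
    # the last six decimal places come out as zeros, as in the question
    capped = full.quantize(Decimal("1e-10")).quantize(Decimal("1e-16"))

print(full)    # 11.2689938398357289...
print(capped)  # 11.2689938398000000
```

This is only an analogy: the exact scale Spark settles on depends on its decimal promotion rules (and on `spark.sql.decimalOperations.allowPrecisionLoss`), but the zero-padded tail in the question is consistent with the quotient being carried at a reduced scale before the outer cast.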


ikfrs5lh1#

Cast the divisor to a floating-point type to get the desired output. The literal 365.25 is otherwise treated as a decimal, and Spark's decimal division caps the scale of the quotient; once the divisor is a float, the division runs in floating point and keeps the significant digits.

spark.sql('select numdays, cast((numdays/cast(365.25/12 as float)) as decimal(38,16)) as c1 from samp').show()

# +-------+-------------------+
# |numdays|                 c1|
# +-------+-------------------+
# |    343|11.2689938398357280|
# +-------+-------------------+

Without casting the divisor:

spark.sql('select numdays, cast(numdays/(365.25/12) as decimal(38,16)) as c1 from samp').show()

# +-------+-------------------+
# |numdays|                 c1|
# +-------+-------------------+
# |    343|11.2689938398000000|
# +-------+-------------------+
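A side note on why the float cast is lossless in this particular query (my reasoning, not part of the original answer): 365.25/12 = 30.4375, which is a finite sum of powers of two, so it is exactly representable in binary floating point; the subsequent division then runs at full floating-point precision instead of a capped decimal scale. In plain Python:

```python
# 30.4375 = 16 + 8 + 4 + 2 + 0.25 + 0.125 + 0.0625, i.e. exact in binary
# floating point, so casting this divisor to float/double loses nothing here
divisor = 365.25 / 12
assert divisor == 30.4375

numdays = 343  # same example row as above
result = numdays / divisor  # plain floating-point division
# result now carries ~15-17 significant digits, unlike the capped decimal quotient
```

For divisors that are not exactly representable in binary (say, 1/3), casting to float would introduce its own rounding, so `double` is generally the safer choice than `float` when applying this workaround.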
