get_json_object fails with selectExpr() but works with select() in PySpark

cpjpxq1n  posted on 2021-05-27  in Spark

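I'm facing a strange problem. I'm trying to display values from my JSON object: this works fine with select(), but it does not work with selectExpr(), where I get a strange error. Here is my implementation: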

from pyspark.sql import SparkSession
from pyspark.sql.functions import *
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("JsonPractice").getOrCreate()
my_json_df = spark.range(1).selectExpr(
    """'{"sample_json":{"sample_json1":["1st_vale","2nd_val"]}}' as my_json_column""")
my_json_df.selectExpr(get_json_object(col("my_json_column"), "$.sample_json.sample_json1[1]")).show(2)
my_select_expr = get_json_object(col('my_json_column'), '$.sample_json.sample_json1')
my_json_df.selectExpr(my_select_expr).show()

I get the following error:
raise TypeError("Column is not iterable")
TypeError: Column is not iterable
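
As a cross-check that does not involve Spark, the values the two JSON paths in the snippet are meant to return can be worked out in plain Python. This is only a sketch using the standard `json` module; the variable names are illustrative and not part of the original code. Note that `get_json_object` array indices are zero-based, so `[1]` selects the second element.

```python
import json

raw = '{"sample_json":{"sample_json1":["1st_vale","2nd_val"]}}'
doc = json.loads(raw)

# Plain-Python equivalent of the path $.sample_json.sample_json1
whole_array = doc["sample_json"]["sample_json1"]

# Plain-Python equivalent of $.sample_json.sample_json1[1] (zero-based index)
second_item = whole_array[1]

print(whole_array)
print(second_item)
```

Spark returns the array case as a JSON string rather than as a Python list, but the content should match.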

0s7z1bwu #1

We don't need to wrap the column in `col` when using `selectExpr`. It takes SQL expression strings rather than Column objects, so write the whole expression as a string:
```
my_select_expr = "get_json_object(my_json_column, '$.sample_json.sample_json1')"
my_json_df.selectExpr(my_select_expr).show(10, False)
```
or
```
my_json_df.selectExpr("get_json_object(my_json_column, '$.sample_json.sample_json1')").show(10, False)
```
Output:
```
+-----------------------------------------------------------+
|get_json_object(my_json_column, $.sample_json.sample_json1)|
+-----------------------------------------------------------+
|["1st_vale","2nd_val"]                                     |
+-----------------------------------------------------------+
```
UPDATE: if you want to keep the Column object built with `col`, pass it to `select` instead of `selectExpr`:
```
from pyspark.sql.functions import col, get_json_object

my_select_expr = get_json_object(col('my_json_column'), '$.sample_json.sample_json1')
my_json_df.select(my_select_expr).show(10, False)
```
Output:
```
+-----------------------------------------------------------+
|get_json_object(my_json_column, $.sample_json.sample_json1)|
+-----------------------------------------------------------+
|["1st_vale","2nd_val"]                                     |
+-----------------------------------------------------------+
```
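
To put the two working variants side by side, here is a minimal sketch. `json_expr` and `run_demo` are hypothetical helper names introduced only for illustration, and the `second_val` alias is likewise an assumption. The Spark part sits inside a function with a local import, so the string helper can be tried without a Spark installation.

```python
def json_expr(column: str, path: str) -> str:
    """Build a SQL expression string suitable for selectExpr() (hypothetical helper)."""
    return f"get_json_object({column}, '{path}')"


def run_demo() -> None:
    """Show both working variants; needs pyspark, so it only runs when called."""
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, get_json_object

    spark = SparkSession.builder.appName("JsonPractice").getOrCreate()
    my_json_df = spark.range(1).selectExpr(
        """'{"sample_json":{"sample_json1":["1st_vale","2nd_val"]}}' as my_json_column"""
    )

    # selectExpr(): pass a SQL string; the alias is part of the expression text.
    my_json_df.selectExpr(
        json_expr("my_json_column", "$.sample_json.sample_json1[1]") + " AS second_val"
    ).show(truncate=False)

    # select(): pass a Column object; the alias is applied with .alias().
    my_json_df.select(
        get_json_object(col("my_json_column"), "$.sample_json.sample_json1[1]").alias("second_val")
    ).show(truncate=False)

    spark.stop()
```

Calling `run_demo()` in an environment with PySpark installed should print the same single-row table twice. The rule of thumb is that strings go to `selectExpr`, while Column objects go to `select`.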
