spark json-apply schema with nullable=false

Posted by tv6aics1 on 2021-05-27 in Spark


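I am trying to apply nullable=false to a field when reading my JSON file, but the resulting schema always shows the default nullable=true, even though I wrote my own schema.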

import org.apache.spark.sql.types._

// Only Miles_per_Gallon is declared non-nullable; the other fields keep the default (nullable = true).
val carsSchema = StructType(Array(
    StructField("Name", StringType),
    StructField("Miles_per_Gallon", DoubleType, nullable = false),
    StructField("Cylinders", LongType),
    StructField("Displacement", DoubleType),
    StructField("Horsepower", LongType),
    StructField("Weight_in_lbs", LongType),
    StructField("Acceleration", DoubleType),
    StructField("Year", StringType),
    StructField("Origin", StringType)))

df.printSchema()

root
 |-- Name: string (nullable = true)
 |-- Miles_per_Gallon: double (nullable = true)
 |-- Cylinders: long (nullable = true)
 |-- Displacement: double (nullable = true)
 |-- Horsepower: long (nullable = true)
 |-- Weight_in_lbs: long (nullable = true)
 |-- Acceleration: double (nullable = true)
 |-- Year: string (nullable = true)
 |-- Origin: string (nullable = true)

After some research, I read the file as an RDD first and then applied the schema to it with the code below.

val jsonRDD = spark.sparkContext.textFile(carsDataWithErrorjsonfile)
val carDF = spark.read
  //.format("json")
  //.option("inferSchema", true)
  .schema(carsSchema)
  .option("mode", "permissive") // failFast, permissive, dropMalformed
  .json(jsonRDD) // the json(RDD[String]) overload is the one the IDE flags as deprecated

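This works as expected, but the IDE shows that the json method that takes an RDD is deprecated. Is there an alternative way to set nullable to false?

Sample dataset: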

{"Name":"chevrolet chevelle malibu", "Miles_per_Gallon":18, "Cylinders":8, "Displacement":307, "Horsepower":130, "Weight_in_lbs":3504, "Acceleration":12, "Year":"1970-01-01", "Origin":"USA"}

hadoop scala apache-spark apache-spark-sql

Source: https://stackoverflow.com/questions/62028523/spark-json-apply-schema-with-nullable-false
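One possible direction for the question above, offered as a sketch rather than a confirmed answer: read the JSON through the `json(Dataset[String])` overload, which is the non-deprecated counterpart of `json(RDD[String])`, and then re-apply the schema with `spark.createDataFrame(rowRDD, schema)`. Whether the reader itself keeps `nullable = false` appears to vary by Spark version, so the sketch does not rely on it. The local session, the trimmed three-field schema, the in-memory `jsonLines` stand-in for the file, and the name `strictDF` are all assumptions made for illustration.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import org.apache.spark.sql.types._

// Assumption: a local session, for illustration only.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("nullable-sketch")
  .getOrCreate()
import spark.implicits._

// Trimmed version of the schema from the question.
val carsSchema = StructType(Array(
  StructField("Name", StringType),
  StructField("Miles_per_Gallon", DoubleType, nullable = false),
  StructField("Cylinders", LongType)))

// Hypothetical stand-in for the file; with a real file,
// spark.read.textFile(path) also yields a Dataset[String].
val jsonLines: Dataset[String] = Seq(
  """{"Name":"chevrolet chevelle malibu", "Miles_per_Gallon":18, "Cylinders":8}"""
).toDS()

// Read through the Dataset[String] overload instead of the deprecated RDD one.
val carDF: DataFrame = spark.read
  .schema(carsSchema)
  .option("mode", "PERMISSIVE")
  .json(jsonLines)

// Re-apply the schema so the nullable flags are taken as declared.
// Spark does not validate the data here: a null in a column marked
// non-nullable can surface later as a runtime error or a wrong result.
val strictDF: DataFrame = spark.createDataFrame(carDF.rdd, carsSchema)

strictDF.printSchema()
```

With this approach the schema printed for `strictDF` should report `Miles_per_Gallon` as `nullable = false`. Because the flag is only metadata at this point, it is worth filtering out or rejecting null values in that column before re-applying the schema.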
