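The schema string I get from .toDDL is compact, but for complex schemas it is very hard to read. How can I format it with indentation and line breaks so that it is easier to read?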
06odsfpq1#
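I don't believe there is a built-in function for formatting the DDL string. I use the code below to format it. It adds a printDDL method to df.schema, or to any StructType object.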
scala> df.printSchema
root
 |-- author: struct (nullable = true)
 |    |-- firstname: string (nullable = true)
 |    |-- lastname: string (nullable = true)
 |-- category: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- editor: struct (nullable = true)
 |    |-- firstname: string (nullable = true)
 |    |-- lastname: string (nullable = true)
 |-- isbn: string (nullable = true)
 |-- title: string (nullable = true)
scala> df.schema.toDDL
res86: String = author STRUCT<firstname: STRING, lastname: STRING>,category ARRAY<STRING>,editor STRUCT<firstname: STRING, lastname: STRING>,isbn STRING,title STRING
scala> :paste
// Entering paste mode (ctrl-D to finish)

implicit class DDL(val schema: org.apache.spark.sql.types.StructType) {
  def printDDL: Unit = {
    val tableName = "_source"
    // Create a throwaway table from the schema so that Spark renders its DDL.
    spark.sql(s"DROP TABLE IF EXISTS ${tableName}")
    spark.sql(s"CREATE TABLE IF NOT EXISTS ${tableName}(${schema.toDDL}) USING orc")
    // Keep only the column lines of the SHOW CREATE TABLE output.
    println(spark.sql(s"SHOW CREATE TABLE ${tableName}")
      .as[String]
      .head
      .split("\n")
      .filterNot(l => l.contains("CREATE") || l.contains("USING"))
      .mkString("\n ", "\n ", "")
      .dropRight(1))
  }
}
scala> df.schema.printDDL

 author STRUCT<firstname: STRING, lastname: STRING>,
 category ARRAY<STRING>,
 editor STRUCT<firstname: STRING, lastname: STRING>,
 isbn STRING,
 title STRING
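If creating and dropping a temporary table is undesirable, the DDL string can also be re-indented with plain string processing. The following is only a minimal sketch, not a Spark API: DdlPretty and format are hypothetical names, and it assumes the input looks like toDDL output (STRUCT<...> for nested fields, backtick-quoted names, single-quoted comments). It breaks lines at top-level commas and inside STRUCT, and leaves ARRAY<...>, MAP<...> and DECIMAL(p,s) on one line.

```scala
// Hypothetical helper, not part of Spark: re-indents a DDL string such as the one from toDDL.
object DdlPretty {
  def format(ddl: String, indent: String = "  "): String = {
    val out = new StringBuilder
    // One entry per open '<': true if it belongs to a STRUCT, false for ARRAY/MAP.
    var open = List.empty[Boolean]
    var parens = 0                   // depth inside (...), e.g. DECIMAL(10,2)
    var quote: Option[Char] = None   // set while inside `name` or 'comment'
    def pad: String = indent * open.count(identity)

    var i = 0
    while (i < ddl.length) {
      val c = ddl(i)
      quote match {
        case Some(q) =>
          // Copy quoted text verbatim; honour backslash escapes in '...' only.
          out.append(c)
          if (c == '\\' && q == '\'' && i + 1 < ddl.length) { i += 1; out.append(ddl(i)) }
          else if (c == q) quote = None
        case None =>
          c match {
            case '`' | '\'' => quote = Some(c); out.append(c)
            case '(' => parens += 1; out.append(c)
            case ')' => parens -= 1; out.append(c)
            case '<' =>
              val isStruct =
                out.substring(math.max(0, out.length - 6)).equalsIgnoreCase("STRUCT")
              open = isStruct :: open
              out.append(c)
              if (isStruct) out.append('\n').append(pad)
            case '>' if open.nonEmpty =>
              val wasStruct = open.head
              open = open.tail
              if (wasStruct) out.append('\n').append(pad)
              out.append(c)
            case ',' if parens == 0 && open.headOption.getOrElse(true) =>
              // Field separator at top level or directly inside a STRUCT.
              out.append(",\n").append(pad)
              while (i + 1 < ddl.length && ddl(i + 1) == ' ') i += 1
            case _ => out.append(c)
          }
      }
      i += 1
    }
    out.toString
  }
}
```

In the shell, println(DdlPretty.format(df.schema.toDDL)) would then print each nested STRUCT field on its own indented line while leaving category ARRAY<STRING> unchanged. Unusual escaping inside quoted names or comments is not handled beyond the simple cases noted in the code.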