How to pretty-print a Spark DDL string?

Posted by m2xkgtsf on 2023-10-23 in Apache


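The schema string I get from .toDDL is compact, but for complex schemas it is very hard to read. How can I format it with indentation and line breaks so that it is easier to read?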

apache-spark

Source: https://stackoverflow.com/questions/77236408/how-to-pretty-print-spark-ddl-string

1 Answer


06odsfpq1#

I believe there is no built-in function for formatting DDL.
I use the code below to format/parse the DDL. It adds a printDDL method to df.schema, or to any StructType object.

scala> df.printSchema
root
 |-- author: struct (nullable = true)
 |    |-- firstname: string (nullable = true)
 |    |-- lastname: string (nullable = true)
 |-- category: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- editor: struct (nullable = true)
 |    |-- firstname: string (nullable = true)
 |    |-- lastname: string (nullable = true)
 |-- isbn: string (nullable = true)
 |-- title: string (nullable = true)

scala> df.schema.toDDL
res86: String = author STRUCT<firstname: STRING, lastname: STRING>,category ARRAY<STRING>,editor STRUCT<firstname: STRING, lastname: STRING>,isbn STRING,title STRING

scala> :paste
// Entering paste mode (ctrl-D to finish)

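implicit class DDL(val schema: org.apache.spark.sql.types.StructType) {
    // Relies on a SparkSession named `spark` and on spark.implicits._ (needed by .as[String]),
    // both of which spark-shell puts in scope automatically.
    def printDDL: Unit = {
        val tableName = "_source"
        // Round-trip the schema through a scratch table so that SHOW CREATE TABLE
        // renders each column on its own line. Note that this drops any existing table named _source.
        spark.sql(s"DROP TABLE IF EXISTS ${tableName}")
        spark.sql(s"CREATE TABLE IF NOT EXISTS ${tableName}(${schema.toDDL}) USING orc")
        println(spark.sql(s"SHOW CREATE TABLE ${tableName}")
        .as[String]
        .head
        .split("\n")
        // Keep only the column lines, then drop the last character of the joined result.
        .filterNot(l => l.contains("CREATE") || l.contains("USING")).mkString("\n ", "\n ", "")
        .dropRight(1))
    }
}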

scala> df.schema.printDDL

   author STRUCT<firstname: STRING, lastname: STRING>,
   category ARRAY<STRING>,
   editor STRUCT<firstname: STRING, lastname: STRING>,
   isbn STRING,
   title STRING

scala>
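If creating and dropping a scratch table is undesirable, another option is to re-indent the DDL string itself. The following is only a minimal sketch: DdlPrettyPrinter and format are hypothetical names invented for this example, not part of Spark, and the code assumes input shaped like the output of toDDL. It is not a full DDL parser; for instance, it does not look inside quoted names when deciding whether a `<...>` group is simple.

```scala
// Hypothetical helper, not a Spark API: re-indents a DDL schema string.
object DdlPrettyPrinter {
  def format(ddl: String, indent: String = "  "): String = {
    val out = new StringBuilder
    var depth = 0                       // indentation level of expanded '<' groups
    var parens = 0                      // nesting of '(' ... ')', e.g. DECIMAL(10,2)
    var quote: Option[Char] = None      // quote character currently open, if any
    var expanded = List.empty[Boolean]  // stack: was each open '<' group split over lines?
    var i = 0

    def newline(): Unit = { out.append('\n'); out.append(indent * depth) }
    def skipSpaces(): Unit = while (i + 1 < ddl.length && ddl(i + 1) == ' ') i += 1

    // A group is "simple" when it has no nested group and no comma, e.g. ARRAY<STRING>.
    def isSimple(open: Int): Boolean = {
      val close = ddl.indexOf('>', open)
      val inner = if (close < 0) ddl.substring(open + 1) else ddl.substring(open + 1, close)
      !inner.exists(ch => ch == '<' || ch == ',')
    }

    while (i < ddl.length) {
      val c = ddl(i)
      quote match {
        case Some(q) =>
          // Inside a quoted name or comment: copy characters through unchanged.
          out.append(c)
          if (c == '\\' && i + 1 < ddl.length) { i += 1; out.append(ddl(i)) }
          else if (c == q) quote = None
        case None =>
          c match {
            case '`' | '\'' | '"' =>
              quote = Some(c)
              out.append(c)
            case '(' =>
              parens += 1
              out.append(c)
            case ')' =>
              parens -= 1
              out.append(c)
            case '<' =>
              out.append(c)
              val simple = isSimple(i)
              expanded = (!simple) :: expanded
              if (!simple) { depth += 1; newline(); skipSpaces() }
            case '>' =>
              val wasExpanded = expanded.headOption.getOrElse(false)
              expanded = expanded.drop(1)
              if (wasExpanded) { depth -= 1; newline() }
              out.append(c)
            case ',' if parens == 0 =>
              // A comma outside parentheses separates columns or struct fields.
              out.append(c)
              newline()
              skipSpaces()
            case _ =>
              out.append(c)
          }
      }
      i += 1
    }
    out.toString
  }
}
```

In spark-shell this could be used as println(DdlPrettyPrinter.format(df.schema.toDDL)). With the schema above, each top-level column and each STRUCT field would land on its own line, with struct fields indented by two spaces, while ARRAY<STRING> stays inline.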

Answered on 2023-10-23
