Scala implicitly converts a Map to tuples

oxf4rvwz  posted on 2021-05-27  in Spark

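I am trying to create an array of Maps based on a few conditions. My function is below. Even though I return a Map from each case, Scala forces me to declare a tuple as the element of the return type. Is there any way to fix this?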

import org.apache.spark.sql.types._

def getSchemaMap(schema: StructType): Array[(String, String)] = {
  schema.fields.flatMap {
    case StructField(name, StringType, _, _) => Map(name -> "String")
    case StructField(name, IntegerType, _, _) => Map(name -> "int")
    case StructField(name, LongType, _, _) => Map(name -> "int")
    case StructField(name, DoubleType, _, _) => Map(name -> "int")
    case StructField(name, TimestampType, _, _) => Map(name -> "timestamp")
    case StructField(name, DateType, _, _) => Map(name -> "date")
    case StructField(name, BooleanType, _, _) => Map(name -> "boolean")
    case StructField(name, _: DecimalType, _, _) => Map(name -> "decimal")
    case StructField(name, _, _, _) => Map(name -> "String")
  }
}

hs1ihplo  #1

Use toMap to convert the Array[(String, String)] into a Map[String, String]:

def getSchemaMap(schema: StructType): Map[String, String] = {
  schema.fields.flatMap {
    case StructField(name, StringType, _, _) => Map(name -> "String")
    case StructField(name, IntegerType, _, _) => Map(name -> "int")
    case StructField(name, LongType, _, _) => Map(name -> "int")
    case StructField(name, DoubleType, _, _) => Map(name -> "int")
    case StructField(name, TimestampType, _, _) => Map(name -> "timestamp")
    case StructField(name, DateType, _, _) => Map(name -> "date")
    case StructField(name, BooleanType, _, _) => Map(name -> "boolean")
    case StructField(name, _: DecimalType, _, _) => Map(name -> "decimal")
    case StructField(name, _, _, _) => Map(name -> "String")
  }.toMap
}

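But you don't actually need flatMap here, because you are mapping each value to exactly one value, not to several. In this case you can simply map each field to a tuple and then convert the resulting array of tuples to a Map: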

def getSchemaMap(schema: StructType): Map[String, String] = {
  schema.fields.map {
    case StructField(name, StringType, _, _) => name -> "String"
    case StructField(name, IntegerType, _, _) => name -> "int"
    case StructField(name, LongType, _, _) => name -> "int"
    case StructField(name, DoubleType, _, _) => name -> "int"
    case StructField(name, TimestampType, _, _) => name -> "timestamp"
    case StructField(name, DateType, _, _) => name -> "date"
    case StructField(name, BooleanType, _, _) => name -> "boolean"
    case StructField(name, _: DecimalType, _, _) => name -> "decimal"
    case StructField(name, _, _, _) => name -> "String"
 }.toMap
}
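
As a minimal illustration of the final toMap step, here is a sketch in plain Scala with no Spark dependency. The array `pairs` is hypothetical stand-in data, not taken from the question. It also shows one thing to keep in mind: if the same key appears more than once, the later pair replaces the earlier one.

```scala
// Hypothetical name/type pairs standing in for the result of schema.fields.map { ... }
val pairs: Array[(String, String)] =
  Array("id" -> "int", "name" -> "String", "id" -> "long")

// toMap builds an immutable Map from the (key, value) tuples.
// For a repeated key, the last pair in the array wins.
val asMap: Map[String, String] = pairs.toMap

println(asMap("id"))   // value taken from the later "id" pair
println(asMap.size)    // number of distinct keys
```

If duplicate names or the original ordering matter for your use case, keeping the Array[(String, String)] may be the safer choice.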

rks48beu  #2

Everything is fine, except that flatMap flattens the Array[Map[String, String]] into an Array[(String, String)].
Replace flatMap with map.

def getSchemaMap(schema: StructType): Array[Map[String, String]] =
  schema.fields.map {
  //case statements
  }
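
The difference between the two calls can be reproduced without Spark. The sketch below uses a hypothetical array of field names (`names`) in place of schema.fields; everything else is standard Scala collections.

```scala
// Hypothetical field names standing in for schema.fields
val names: Array[String] = Array("id", "label")

// map: one result per element, so each element stays a whole Map
val perField: Array[Map[String, String]] =
  names.map(n => Map(n -> "String"))

// flatMap: each Map is treated as a collection of (key, value) entries,
// and those entries are concatenated into a single Array of tuples
val flattened: Array[(String, String)] =
  names.flatMap(n => Map(n -> "String"))

println(perField.length)   // one Map per name
println(flattened.length)  // one tuple per Map entry
```

This is why the compiler asks for Array[(String, String)] in the question: a Map is a collection of tuples, and flatMap unwraps it.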

wvt8vs2t  #3

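If I understand correctly, you are trying to extract the columns (name/type pairs) into a Map[String, String]. You can use built-in functionality and the existing API for this, so I don't think any custom pattern matching is needed.
You can use df.schema.fields or df.schema.toDDL as shown below: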

df.schema.fields.map(f => (f.name, f.dataType.typeName)).toMap // also try out f.dataType.simpleString
// res4: scala.collection.immutable.Map[String,String] = Map(col1 -> string, col2 -> integer, col3 -> string, col4 -> string)

df.schema.toDDL.split(",").map{f => (f.split(" ")(0), f.split(" ")(1))}.toMap
// res8: scala.collection.immutable.Map[String,String] = Map(`col1` -> STRING, `col2` -> INT, `col3` -> STRING, `col4` -> STRING)

Or wrapped in a function:

def schemaToMap(schema: StructType): Map[String, String] =
    schema.fields.map(f => (f.name, f.dataType.typeName)).toMap
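
One caveat for the toDDL variant above: splitting on "," assumes that no column type contains a comma. The sketch below does not call Spark; it hard-codes a hypothetical DDL string, assuming a decimal column is rendered as DECIMAL(10,2), and applies the same split logic to show where it goes wrong.

```scala
import scala.util.Try

// Hypothetical DDL text, hard-coded here instead of calling df.schema.toDDL
val ddl = "`id` INT,`price` DECIMAL(10,2)"

// The comma inside DECIMAL(10,2) is also treated as a separator,
// so two columns are split into three pieces
val pieces: Array[String] = ddl.split(",")
pieces.foreach(println)

// The last piece contains no space, so indexing (1) after split(" ") throws;
// Try captures that failure instead of letting it propagate
val parsed: Try[Map[String, String]] =
  Try(pieces.map(f => (f.split(" ")(0), f.split(" ")(1))).toMap)

println(parsed.isFailure)
```

For schemas with decimal or nested types, the schema.fields variant avoids this problem because it never parses text.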
