How do I convert a DataFrame to a list of lists (Scala)?

zzlelutf  posted on 2021-05-29 in Spark

Right now I have something that looks like this:

Input:

MemberID | CourseID
1        | 3
1        | 4
2        | 3
2        | 5
2        | 6

Output: List(List(3, 4), List(3, 5, 6))

zzlelutf's question, answer #1, by zz2j4svz:

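import org.apache.spark.sql.functions.collect_list
// Assumes a SparkSession named `spark` is in scope (as in spark-shell)
import spark.implicits._

val df = Seq((1, 3),
    (1, 4),
    (2, 3),
    (2, 5),
    (2, 6)).toDF("MemberID", "CourseID")
df.show(false)

// Collect each member's courses into a single array<int> column
val resDF = df.groupBy("MemberID").agg(collect_list($"CourseID").alias("CourseID"))
// Read each array back as a List[Int]; concat_ws(",", ...) would instead
// yield one comma-joined string per member, not a list of integers
val result = resDF.orderBy("MemberID").select("CourseID").collect.toList.map(_.getSeq[Int](0).toList)
//  +--------+--------+
//  |MemberID|CourseID|
//  +--------+--------+
//  |1       |3       |
//  |1       |4       |
//  |2       |3       |
//  |2       |5       |
//  |2       |6       |
//  +--------+--------+
//
//  df: org.apache.spark.sql.DataFrame = [MemberID: int, CourseID: int]
//  resDF: org.apache.spark.sql.DataFrame = [MemberID: int, CourseID: array<int>]
//  result: List[List[Int]] = List(List(3, 4), List(3, 5, 6))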
