How to create multiple DataFrames dynamically in a for loop in Spark Scala

Posted by jaxagkaj on 2021-05-19 in Spark


I have an Array[String] with the values ["df1", "df2", "df3"].
My config file contains the following values.

DF1=select * from TB1
DF2=select * from TB2
DF3=select * from TB3

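I need to create 3 DataFrames dynamically in a for loop by loading the corresponding tables.
Assume my array is named array1.
My code is: for (for1 <- ARRAY1){val $for1+_DF = spark.sql(for1)}. The above is only pseudocode.
Please help me with the correct code syntax.
Thanks, Naveen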

scala apache-spark

Source: https://stackoverflow.com/questions/64452732/how-to-create-multiple-data-frames-dynamically-in-a-for-loop-in-spark-scala

1 Answer


rkue9o1l #1

Check the following code.

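scala> val queries = Map(
         "df1" -> "select * from TB1",
         "df2" -> "select * from TB2",
         "df3" -> "select * from TB3"
       ) // In practice, populate this map from the config file.

scala> queries.values.par.map(q => spark.sql(q)).foreach(_.show(false))

// .par runs the three queries in parallel. The map step yields a parallel collection of three DataFrame objects, and you can apply any operation to them; here show(false) simply prints each one without truncating values.
// On Scala 2.13, .par needs the scala-parallel-collections module and import scala.collection.parallel.CollectionConverters._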

+---+
|id |
+---+
|1  |
|2  |
|3  |
|4  |
|5  |
|6  |
|7  |
|8  |
|9  |
+---+

+---+
|id |
+---+
|1  |
|2  |
|3  |
|4  |
|5  |
|6  |
|7  |
|8  |
|9  |
+---+

+---+
|id |
+---+
|1  |
|2  |
|3  |
|4  |
|5  |
|6  |
|7  |
|8  |
|9  |
+---+
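
The snippet above prints each DataFrame but does not keep the df1/df2/df3 names. Scala cannot create val names at runtime, so if you need to look the DataFrames up by name later, one option is to keep them in a Map keyed by name. This is only a rough sketch under assumptions: the object DynamicDataFrames, the helper loadAll, the "_DF" key suffix (taken from the pseudocode in the question), the upper-casing of names to match the config keys, and the stand-in temp views are all hypothetical and not part of the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DynamicDataFrames {

  // Hypothetical helper: runs one query per name and keys the result as "<name>_DF".
  // Assumes config keys are the upper-cased names (DF1, DF2, DF3) as in the question.
  // A name with no matching config key throws NoSuchElementException.
  def loadAll(spark: SparkSession,
              names: Array[String],
              config: Map[String, String]): Map[String, DataFrame] =
    names.map { name =>
      val query = config(name.toUpperCase)
      s"${name}_DF" -> spark.sql(query)
    }.toMap

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .appName("dynamic-dataframes")
      .master("local[*]")
      .getOrCreate()

    // Stand-in tables (assumed data) so the sketch runs on its own.
    Seq("TB1", "TB2", "TB3").foreach { table =>
      spark.range(1, 10).toDF("id").createOrReplaceTempView(table)
    }

    // Assumed to have been read from the config file already.
    val config = Map(
      "DF1" -> "select * from TB1",
      "DF2" -> "select * from TB2",
      "DF3" -> "select * from TB3"
    )
    val array1 = Array("df1", "df2", "df3")

    val dataFrames = loadAll(spark, array1, config)

    // Look a DataFrame up by its generated name.
    dataFrames("df1_DF").show(false)

    spark.stop()
  }
}
```

After this, dataFrames("df2_DF") and dataFrames("df3_DF") can be used like any other DataFrame. spark.sql is lazy, so building the Map sequentially is cheap; parallelism only matters once actions such as show or count run.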

Answered 2021-05-20
