How to dynamically create multiple DataFrames in a for loop in Spark Scala

jaxagkaj  posted on 2021-05-19 in Spark

I have an Array[String] whose values are ["df1", "df2", "df3"].
My config file contains the following values.

DF1=select * from TB1
DF2=select * from TB2
DF3=select * from TB3

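I need to create three DataFrames dynamically in a for loop by loading the corresponding tables.
Assume my array is named array1.
My code is for (for1 <- ARRAY1){val $for1+_DF = spark.sql(for1)}. The above is only pseudocode.
Please help me with the correct syntax.
Thanks, Naveen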

rkue9o1l

Check the following code.

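val queries = Map(
  "df1" -> "select * from TB1",
  "df2" -> "select * from TB2",
  "df3" -> "select * from TB3"
) // After reading from config file.

// On Scala 2.13, .par also needs the scala-parallel-collections module and
// import scala.collection.parallel.CollectionConverters._
queries
  .values
  .par                     // parallel collection: the calls below run concurrently
  .map(q => spark.sql(q))  // one DataFrame per query
  .foreach(_.show(false))  // print each DataFrame without truncating values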

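// .par runs the three queries in parallel. After .map, all three DataFrame objects are held in one (parallel) collection, and you can apply any operation to them. Here I use the show function to display the values; because the calls run concurrently, the tables below may print in any order.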

+---+
|id |
+---+
|1  |
|2  |
|3  |
|4  |
|5  |
|6  |
|7  |
|8  |
|9  |
+---+

+---+
|id |
+---+
|1  |
|2  |
|3  |
|4  |
|5  |
|6  |
|7  |
|8  |
|9  |
+---+

+---+
|id |
+---+
|1  |
|2  |
|3  |
|4  |
|5  |
|6  |
|7  |
|8  |
|9  |
+---+
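
Scala cannot create new `val` names at runtime, so the `$for1+_DF` idea from the question has no direct equivalent. A common substitute is to keep the DataFrames in a `Map` keyed by name. The following is a minimal sketch under stated assumptions, not the answerer's code: the temp views `TB1` to `TB3` built with `spark.range` are stand-ins for the real tables, and the names `dataFrames` and `array1` as well as the local `SparkSession` setup are illustrative.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Assumption: a local session for illustration. In spark-shell, `spark`
// already exists and getOrCreate() simply returns it.
val spark = SparkSession.builder()
  .appName("named-dataframes")
  .master("local[*]")
  .getOrCreate()

// Stand-in tables so the sketch is self-contained (hypothetical data).
Seq("TB1", "TB2", "TB3").foreach { table =>
  spark.range(1, 10).toDF("id").createOrReplaceTempView(table)
}

// Same shape as the `queries` map in the answer above.
val queries: Map[String, String] = Map(
  "df1" -> "select * from TB1",
  "df2" -> "select * from TB2",
  "df3" -> "select * from TB3"
)

// One DataFrame per name; the map key plays the role of the variable name.
val dataFrames: Map[String, DataFrame] =
  queries.map { case (name, query) => name -> spark.sql(query) }

// Look up a single DataFrame by name.
dataFrames("df1").show(false)

// Or loop over the names from the question's array.
val array1 = Array("df1", "df2", "df3")
for (name <- array1) {
  println(s"$name has ${dataFrames(name).count()} rows")
}
```

Looking up `dataFrames("df1")` throws `NoSuchElementException` when the name is missing; `dataFrames.get("df1")` returns an `Option` instead, which may be safer when the names come from a config file.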
