I want to compare the efficiency of an RDD with that of an ordinary List
in Scala Spark.
val list = Range(1, 1000000, 1)
val dist_list = sc.parallelize(list)
val start_time = System.nanoTime
val sum = list.reduce((x,y) => x+y)
println(s"for list time is ${System.nanoTime - start_time}")
val s_time = System.nanoTime
val dist_sum = dist_list.reduce((x,y) => x+y)
println(s"for dist_list time is ${System.nanoTime - s_time}")
The result:
for list time is 24849500
for dist_list time is 378051900
This means the RDD reduce is about 15 times slower than the same operation on an ordinary List. Why is that?
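A note on the measurement itself: each reduce above is timed exactly once, so both numbers include one-time costs such as JVM warm-up, and the RDD number also includes the fixed cost of scheduling a Spark job and collecting its result. With only about a million integers, that fixed cost can easily outweigh the actual summation. Below is a minimal sketch of a timing helper that runs the work a few times untimed and then reports the fastest timed run. The names `ReduceTiming` and `bestOf` are hypothetical and are not part of Spark or of the code above. The values are widened to Long here because the sum of 1 to 999999 does not fit in an Int.

```scala
object ReduceTiming {
  // Hypothetical helper: evaluates `body` `warmups` times untimed, then `runs` times timed.
  // Returns the last result together with the fastest timed run, in nanoseconds.
  def bestOf[A](warmups: Int, runs: Int)(body: => A): (A, Long) = {
    require(runs > 0, "runs must be positive")
    (1 to warmups).foreach(_ => body)
    var best = Long.MaxValue
    var result: Option[A] = None
    for (_ <- 1 to runs) {
      val start = System.nanoTime
      val r = body
      val elapsed = System.nanoTime - start
      result = Some(r)
      if (elapsed < best) best = elapsed
    }
    (result.get, best)
  }

  def main(args: Array[String]): Unit = {
    // Widen to Long: the sum of 1 to 999999 is 499999500000, which overflows an Int.
    val list = Range(1, 1000000, 1).map(_.toLong)
    val (sum, nanos) = bestOf(warmups = 3, runs = 5)(list.reduce(_ + _))
    println(s"for list best time is $nanos ns, sum = $sum")
    // Assuming a SparkContext `sc` is in scope (as in the spark-shell), the same
    // helper could wrap the RDD version:
    //   val dist_list = sc.parallelize(list)
    //   val (distSum, distNanos) = bestOf(3, 5)(dist_list.reduce(_ + _))
  }
}
```

Even with warm-up, the RDD version may well stay slower at this size, since every Spark action still pays per-job scheduling and serialization overhead. The comparison tends to become meaningful only when the data is large enough, or the per-element work expensive enough, for the parallelism to pay for that overhead.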
No answers yet.