I have a large number of ORC files under abfss://adv3@ej89g4hf7lokiu.dfs.core.windows.net/efgrd/hf/oiukefhgeoikijukioiuy/wfg/iolpkiubnmhgtrfe/. My goal is to convert all of the ORC files at that location into a single CSV. This is the Scala code I am using:
import org.apache.spark.sql.SaveMode

val df_test = spark.read
  .format("orc")
  .option("inferSchema", "true")
  .option("recursiveFileLookup", "true")
  .load("abfss://adv3@ej89g4hf7lokiu.dfs.core.windows.net/efgrd/hf/oiukefhgeoikijukioiuy/wfg/iolpkiubnmhgtrfe/")
val df_test_final = df_test.drop("column1", "column2")
df_test_final
  .repartition(1)                 // collapse everything into a single output file
  .write
  .option("header", "true")
  .mode(SaveMode.Append)
  .csv("abfss://adv3@ej89g4hf7lokiu.dfs.core.windows.net/testenvironment/output/")
The error I get from the original code is:
Exception in thread "main" org.apache.spark.SparkException: Job aborted.
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 4.0 failed 1 times, most recent failure: Lost task 0.0 in stage 4.0 (TID 285, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows.
Caused by: java.lang.NullPointerException
Interestingly, if I call df_test.show() instead of trying to write the CSV, the error is exactly the same. What kind of data problem could cause this? Do you see any hints in my code?
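If it helps, this is the kind of per-file check I was planning to run next to narrow it down: list every .orc file under the folder and force a full read of each one, so a single corrupt or mismatched file should fail on its own. This is only a sketch (untested) and assumes the same SparkSession spark and filesystem configuration as above:

import java.net.URI
import org.apache.hadoop.fs.{FileSystem, Path}

val basePath = "abfss://adv3@ej89g4hf7lokiu.dfs.core.windows.net/efgrd/hf/oiukefhgeoikijukioiuy/wfg/iolpkiubnmhgtrfe/"
val fs = FileSystem.get(new URI(basePath), spark.sparkContext.hadoopConfiguration)
val files = fs.listFiles(new Path(basePath), true)   // recursive listing

while (files.hasNext) {
  val file = files.next().getPath.toString
  if (file.endsWith(".orc")) {
    try {
      // Force every row of this single file to be read and converted.
      spark.read.format("orc").load(file).rdd.count()
      println(s"OK   $file")
    } catch {
      case e: Exception => println(s"FAIL $file : ${e.getMessage}")
    }
  }
}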