Attempting to write Parquet data to S3 or to HDFS produces the same error, using the df.write overwrite call I already mentioned:
resFinal.write.mode(SaveMode.Overwrite).partitionBy("pro".......
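
For context, here is a minimal sketch of what the complete write presumably looks like. Only mode(SaveMode.Overwrite) and the partition column "pro" come from the question; the input and output paths and how resFinal is built are placeholders, since the original line is truncated.

import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder().appName("history_cleanup").getOrCreate()

// Hypothetical input; the question does not show how resFinal is computed.
val resFinal = spark.read.parquet("s3://example-bucket/input/")

// Assumed shape of the truncated call: overwrite the target, partitioned
// by the "pro" column. The output path is a placeholder.
resFinal.write
  .mode(SaveMode.Overwrite)
  .partitionBy("pro")
  .parquet("s3://example-bucket/output/")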
The spark-submit command used for this job is:
spark-submit --master yarn --deploy-mode cluster \
  --executor-memory 50G --driver-memory 54G --executor-cores 5 --queue High \
  --conf spark.yarn.maxAppAttempts=1 \
  --conf spark.driver.maxResultSize=7g \
  --conf spark.executor.memoryOverhead=4500 \
  --conf spark.driver.memoryOverhead=5400 \
  --conf spark.sql.shuffle.partitions=7000 \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.spill.compress=true \
  --conf spark.sql.tungsten.enabled=true \
  --conf spark.sql.autoBroadcastJoinThreshold=-1 \
  --conf spark.speculation=true \
  --conf spark.dynamicAllocation.minExecutors=200 \
  --conf spark.dynamicAllocation.maxExecutors=500 \
  --conf spark.memory.storageFraction=0.6 \
  --conf spark.memory.fraction=0.7 \
  --class com.mnb.history_cleanup \
  s3://dv-cam/1/cleanup-1.0-SNAPSHOT.jar H 0 20170101-20170102 HSO
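
As an aside, a few of these --conf flags could equally be set programmatically on the SparkSession builder; the following is just an illustrative sketch with values copied from the command above, not something from the question:

import org.apache.spark.sql.SparkSession

// Equivalent programmatic form of some of the --conf flags; these must be
// set before the SparkContext is created, so on the builder, not afterwards.
val spark = SparkSession.builder()
  .appName("history_cleanup")
  .config("spark.sql.shuffle.partitions", "7000")
  .config("spark.sql.autoBroadcastJoinThreshold", "-1")
  .config("spark.speculation", "true")
  .getOrCreate()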
Whether I write to HDFS or to S3, I get:
org.apache.hadoop.fs.FileAlreadyExistsException: Path already exists as a file: s3://dv-ms-east-1/ms414x-test1/dl/ry8/.spark-staging-28e84dbb-7e91-4d5c-87ba-8e880cf28904/
or
File does not exist: /user/m/dl/vi/.spark-staging-bdb317f3-7ff9-458e-9ea8-7fb70ce4/pro