Spark job fails with a "path already exists" error even after using df.overwrite

zazmityj  posted on 2021-05-27  in Hadoop

Writing Parquet data to either S3 or HDFS produces the same error, even though I have already set the write mode to overwrite (df.overwrite):

resFinal.write.mode(SaveMode.Overwrite).partitionBy("pro".......

The spark-submit command used for this job is:

spark-submit --master yarn --deploy-mode cluster \
  --executor-memory 50G --driver-memory 54G --executor-cores 5 --queue High \
  --conf spark.yarn.maxAppAttempts=1 \
  --conf spark.driver.maxResultSize=7g \
  --conf spark.executor.memoryOverhead=4500 \
  --conf spark.driver.memoryOverhead=5400 \
  --conf spark.sql.shuffle.partitions=7000 \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.spill.compress=true \
  --conf spark.sql.tungsten.enabled=true \
  --conf spark.sql.autoBroadCastJoinThreshold=-1 \
  --conf spark.speculation=true \
  --conf spark.dynamicAllocation.minExecutors=200 \
  --conf spark.dynamicAllocation.maxExecutors=500 \
  --conf spark.memory.storageFraction=0.6 \
  --conf spark.memory.fraction=0.7 \
  --class com.mnb.history_cleanup \
  s3://dv-cam/1/cleanup-1.0-SNAPSHOT.jar H 0 20170101-20170102 HSO

Whether I write to HDFS or S3, I get:

org.apache.hadoop.fs.FileAlreadyExistsException: Path already exists as a file: s3://dv-ms-east-1/ms414x-test1/dl/ry8/.spark-staging-28e84dbb-7e91-4d5c-87ba-8e880cf28904/

File does not exist: /user/m/dl/vi/.spark-staging-bdb317f3-7ff9-458e-9ea8-7fb70ce4/pro
