Spark `FileAlreadyExistsException` when `saveAsTextFile` even though the output directory doesn't exist

Posted by b5buobof on 2021-05-29 in Hadoop


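This question already has an answer here:

How to overwrite the output directory in Spark (8 answers)
Closed 4 years ago.
I am running the following command line: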

hadoop fs -rm -r /tmp/output

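followed by the main() of a Java 8 Spark job:

```
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
// JSONObject comes from a JSON library that the original post does not name.

SparkConf sparkConf = new SparkConf();
JavaSparkContext sc = new JavaSparkContext(sparkConf);
JavaRDD<JSONObject> rdd = sc.textFile("/tmp/input")
        .map(s -> new JSONObject(s));
rdd.saveAsTextFile("/tmp/output");
sc.stop();
```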

I get an error:

ERROR ApplicationMaster: User class threw exception: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory /tmp/output already exists

Any idea how to fix this?

Java hadoop hdfs apache-spark

Source: https://stackoverflow.com/questions/35407107/spark-filealreadyexistsexception-when-saveastextfile-even-though-the-output

1 Answer


d7v8vwbk #1

You deleted the HDFS directory, but Spark is trying to save to the local file system.
To save to HDFS, try the following:

rdd.saveAsTextFile("hdfs://<URL-hdfs>:<PORT-hdfs>/tmp/output");

For localhost, a common default is:

rdd.saveAsTextFile("hdfs://localhost:9000/tmp/output");
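
The scheme matters because Hadoop resolves a path with no scheme, such as /tmp/output, against the configured default file system (`fs.defaultFS`), which is `file:///` unless the configuration overrides it. The sketch below only mimics that resolution with plain `java.net.URI` to show the idea. The `PathQualifier` class and its `qualify` helper are hypothetical and are not part of Spark or Hadoop.

```java
import java.net.URI;

class PathQualifier {
    // Hypothetical helper: interpret a path against a default file system URI.
    // A path that already carries a scheme is returned unchanged.
    public static URI qualify(String path, String defaultFs) {
        URI uri = URI.create(path);
        if (uri.getScheme() != null) {
            return uri; // fully qualified, so the default file system is ignored
        }
        return URI.create(defaultFs).resolve(uri);
    }

    public static void main(String[] args) {
        // Same scheme-less path, two different default file systems.
        System.out.println(qualify("/tmp/output", "file:///"));
        System.out.println(qualify("/tmp/output", "hdfs://localhost:9000"));
        // An explicit hdfs:// URI is unaffected by the default.
        System.out.println(qualify("hdfs://localhost:9000/tmp/output", "file:///"));
    }
}
```

If the job's default file system is the local one, `hadoop fs -rm -r /tmp/output` and `saveAsTextFile("/tmp/output")` may be addressing two different directories, which would be consistent with the error in the question.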

Another solution is to delete /tmp/output from the local file system.
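
For the local case, the cleanup could also be done from Java before the job writes its output. This is a minimal sketch using only `java.nio.file`. The `LocalOutputCleaner` class and `deleteIfPresent` method are hypothetical names, and the code only affects the machine it runs on, which on a cluster may not be the node that writes the output.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LocalOutputCleaner {
    // Hypothetical helper: recursively delete a local directory if it exists.
    // Returns true if the directory was present and has been deleted.
    public static boolean deleteIfPresent(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return false;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order so that children come before their parents.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Demonstrated on a throwaway temp directory instead of /tmp/output.
        Path dir = Files.createTempDirectory("output");
        Files.createFile(dir.resolve("part-00000"));
        System.out.println(deleteIfPresent(dir));
        System.out.println(Files.exists(dir));
    }
}
```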
Best regards

