Hadoop: Is it possible to avoid replication for certain files?

Posted by lzfw57am on 2021-06-02 in Hadoop

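As far as I know, every file in HDFS is replicated. Our jobs write some log files while they run, and we don't need those replicated, since that would only maintain unnecessary copies. Is it possible to avoid replication for the log files only?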

kwvwclae #1

You can set the replication factor with the -setrep option of the hadoop fs shell command.

Usage: hadoop fs -setrep [-R] [-w] <numReplicas> <path>

Changes the replication factor of a file. If path is a directory then the command recursively changes the replication factor of all files under the directory tree rooted at path.

Options:

The -w flag requests that the command wait for the replication to complete. This can potentially take a very long time.
The -R flag is accepted for backwards compatibility. It has no effect.
Example:

hadoop fs -setrep -w 3 /user/hadoop/dir1

To avoid replication, set numReplicas to 1, so that HDFS keeps only a single copy of each file.
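As a minimal sketch of applying this to job logs: the path /user/hadoop/logs and the helper function names below are hypothetical, chosen only for illustration. The first helper lowers the replication factor of everything under a path to 1, and the second prints the replication factor of a file using the %r format of hadoop fs -stat.

```shell
#!/usr/bin/env bash
# Hypothetical log location; replace with the directory your jobs write to.
LOG_DIR="${LOG_DIR:-/user/hadoop/logs}"

# Set the replication factor to 1 for the given path.
# For a directory, -setrep applies to all files under it.
set_single_replica() {
  hadoop fs -setrep 1 "$1"
}

# Print the current replication factor of a file (%r in -stat).
show_replication() {
  hadoop fs -stat %r "$1"
}
```

A typical use would be set_single_replica "$LOG_DIR" followed by show_replication on one of the log files. Note that -setrep only changes files that already exist. Files written later get the writing client's default, so the client that writes the logs would also need dfs.replication set to 1, for example hadoop fs -D dfs.replication=1 -put job.log "$LOG_DIR"/ (job.log is again a placeholder name).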
