addCacheFile in Hadoop

Posted by mspsb9vt on 2021-06-02 in Hadoop


I am new to Hadoop. I would like to understand the following code:

DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);

Specifically, what does the new Path(args[0]).toUri() part mean in the call to addCacheFile?
Thanks.

Java hadoop mapreduce Arguments

Source: https://stackoverflow.com/questions/41954063/addcachefile-in-hadoop

2 Answers


#1 pcrecxhr

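DistributedCache.addCacheFile() takes the URI of the file to add to the distributed cache. Here, new Path(args[0]) builds a Path from the first command-line argument, toUri() converts that Path to a URI, and the URI is then used to add the file to Hadoop's distributed cache.
A Path can name either a file or a directory.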
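As a rough illustration of that conversion, the sketch below uses only java.net.URI from the JDK, not Hadoop itself. The class name CacheUriSketch, the helper toCacheUri, the host name, and the file paths are all made up for the example. It shows what a path string looks like once it is a URI, which is the form addCacheFile expects.

```java
import java.net.URI;

final class CacheUriSketch {
    // Hypothetical helper: parses a command-line argument as a URI.
    // It is a stand-in for new Path(arg).toUri() and uses no Hadoop classes.
    static URI toCacheUri(String arg) {
        return URI.create(arg);
    }

    public static void main(String[] args) {
        // Assumed example values; neither the host nor the files are from the question.
        URI qualified = toCacheUri("hdfs://namenode:8020/cache/lookup.txt");
        URI bare = toCacheUri("/cache/lookup.txt");

        // Print the scheme and path components of each URI.
        System.out.println(qualified.getScheme() + " " + qualified.getPath());
        System.out.println(bare.getScheme() + " " + bare.getPath());
    }
}
```

A string with a scheme keeps that scheme in the URI, while a plain path produces a URI whose scheme is null.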
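Once the file has been added to the distributed cache, every map task can use it. If you have a small file, this is one of the optimization techniques in Hadoop: you make the file available on all nodes so that the data can be accessed faster.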
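The usual way to benefit from a small cached file is to load it into memory once per task and use it as a lookup table. The sketch below shows only that idea in plain Java, without any Hadoop classes. The class SmallFileLookup, its method names, and the tab-separated "key, value" line format are assumptions made for the example, not something stated in the question.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SmallFileLookup {
    // Parses "key<TAB>value" lines into an in-memory map.
    // Lines without a tab are skipped.
    static Map<String, String> parseLookup(List<String> lines) {
        Map<String, String> table = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.split("\t", 2);
            if (parts.length == 2) {
                table.put(parts[0], parts[1]);
            }
        }
        return table;
    }

    // Reads a local copy of the small file once, for example during task setup.
    // Note: Path here is java.nio.file.Path, not Hadoop's Path.
    static Map<String, String> loadLookup(Path localCopy) {
        try {
            return parseLookup(Files.readAllLines(localCopy, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample records standing in for the contents of a cached file.
        Map<String, String> table = parseLookup(Arrays.asList("p1\tsquare", "p2\ttriangle"));
        System.out.println(table.get("p2"));
    }
}
```

Each input record can then be matched against the in-memory table instead of re-reading the file for every record.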
For more details, see:
https://hadoop.apache.org/docs/r1.2.1/api/org/apache/hadoop/fs/path.html
The related question "Confusion about distributed cache in Hadoop"

2021-06-02

#2 dw1jzc5e

Thanks, Paritosh Ahuja.
I have two txt files describing polygons. Here is my full code:

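import java.io.IOException;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class OverlayPhase2 extends Configured implements Tool
{
    public int run(String[] args) throws IOException
    {
        JobConf conf = new JobConf(getConf(), OverlayPhase2.class);

        conf.setOutputKeyClass(IntWritable.class);
        conf.setOutputValueClass(Text.class);
        conf.setMapperClass(OverlayPhase2Mapper.class);

        conf.setReducerClass(OverlayPhase2Reducer.class);
        conf.setNumMapTasks(2);
        conf.setNumReduceTasks(8);

        // args[0]: the file to add to the distributed cache
        DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);

        Path inp1 = new Path(args[1]);
        Path inp2 = new Path(args[2]);
        Path out1 = new Path(args[3]);
        // Pass both inputs in one call; a second setInputPaths call would replace the first.
        FileInputFormat.setInputPaths(conf, inp1, inp2);
        // The output path is set through FileOutputFormat, not FileInputFormat.
        FileOutputFormat.setOutputPath(conf, out1);
        JobClient.runJob(conf);
        return 0;
    }

    public static void main(String[] args) throws Exception
    {
        int exitCode = ToolRunner.run(new OverlayPhase2(), args);
        System.exit(exitCode);
    }
}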

I set args[1], args[2], and args[3] as follows:

args[1] = /home/mostafa/Desktop/b1.txt
args[2] = /home/mostafa/Desktop.b2.txt
args[3] = /home/mostafa/Desktaop/output

So what should args[0] be?
Best regards, Mostafa
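Going by the run() method above, args[0] is the only argument passed to DistributedCache.addCacheFile, so it names the file to put in the distributed cache, while the remaining three arguments are the two inputs and the output. A minimal sketch of that argument layout follows. The class OverlayArgs, its field names, and the placeholder values in main are hypothetical and only illustrate the ordering.

```java
final class OverlayArgs {
    final String cacheFile; // args[0]: file handed to addCacheFile
    final String input1;    // args[1]: first input path
    final String input2;    // args[2]: second input path
    final String output;    // args[3]: output path

    private OverlayArgs(String cacheFile, String input1, String input2, String output) {
        this.cacheFile = cacheFile;
        this.input1 = input1;
        this.input2 = input2;
        this.output = output;
    }

    // True when exactly the four expected arguments are present.
    static boolean hasExpectedShape(String[] args) {
        return args != null && args.length == 4;
    }

    // Maps the positional arguments to named fields, rejecting any other count.
    static OverlayArgs parse(String[] args) {
        if (!hasExpectedShape(args)) {
            throw new IllegalArgumentException(
                "usage: <cacheFile> <input1> <input2> <output>");
        }
        return new OverlayArgs(args[0], args[1], args[2], args[3]);
    }

    public static void main(String[] args) {
        // Placeholder values, not real paths.
        OverlayArgs parsed = parse(new String[] {"cache.txt", "in1.txt", "in2.txt", "out"});
        System.out.println("cache file: " + parsed.cacheFile);
    }
}
```

Checking the argument count up front turns a missing args[0] into a clear usage message instead of an ArrayIndexOutOfBoundsException inside run().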

2021-06-02
