Hadoop DistributedCache returns null

Posted by 5lhxktic on 2021-06-04 in Hadoop


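I am using the Hadoop DistributedCache and have run into a problem. My Hadoop installation runs in pseudo-distributed mode.
From here we can see that in pseudo-distributed mode we use DistributedCache.getLocalCache(xx) to retrieve cached files.
First, I add my file to the DistributedCache: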

DistributedCache.addCacheFile(
        new Path("hdfs://localhost:8022/user/administrator/myfile").toUri(),
        job.getConfiguration());

Then I retrieve it in the mapper's setup(), but DistributedCache.getLocalCache returns null. I can see my cached file through System.out.println("Cache: "+context.getConfiguration().get("mapred.cache.files")); which prints: hdfs://localhost:8022/user/administrator/myfile Here is my pseudocode:

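import java.io.IOException;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// The key/value types below are placeholders; the original post did not state them.
public static class JoinMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void setup(Context context) throws IOException {
        // Task-local copies of the cached files; this is the call that returns null.
        Path[] cacheFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
        System.out.println("Cache: " + context.getConfiguration().get("mapred.cache.files"));
        Path cacheFile;
        if (cacheFiles != null) {
            // Not reached, because cacheFiles is null.
        }
    }
}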

xx....

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "Join Test");
    DistributedCache.addCacheFile(
            new Path("hdfs://localhost:8022/user/administrator/myfile").toUri(),
            job.getConfiguration());
}

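Sorry for the poor formatting. Can anyone help?
By the way, I can get the URIs using URI[] uris = DistributedCache.getCacheFiles(context.getConfiguration()); The returned URI is: hdfs://localhost:8022/user/administrator/myfile
When I try to read from that URI, I get a "file not found" exception.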

hadoop NullPointerException distributed-cache

Source: https://stackoverflow.com/questions/17257023/hadoop-distributedcache-returns-null

1 answer


zpgglvta #1

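The distributed cache copies your files from HDFS to the local filesystem of every TaskTracker. How are you reading the file? If the file is in HDFS, you must obtain the HDFS FileSystem; otherwise the default filesystem is used (probably the local one). So, to read the file in HDFS, try: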

Path path = new Path("hdfs://localhost:8022/user/administrator/myfile");
// Ask for the filesystem that owns this URI (HDFS) rather than the default one.
FileSystem fs = FileSystem.get(path.toUri(), new Configuration());
BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(path)));
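
To illustrate why the filesystem has to be chosen from the URI, here is a minimal sketch that uses only the JDK and no Hadoop classes. It splits the cache URI from the question into its parts and checks whether the scheme points away from the local disk. The class name CacheUriCheck and the helper needsRemoteFileSystem are hypothetical names made up for this illustration; they are not part of any Hadoop API.

```java
import java.net.URI;

class CacheUriCheck {
    /**
     * Hypothetical helper: returns true when the URI carries a scheme other
     * than "file", i.e. when a plain local-file open would look in the wrong place.
     */
    public static boolean needsRemoteFileSystem(URI uri) {
        String scheme = uri.getScheme();
        // A missing scheme means a bare path, which resolves against the default filesystem.
        return scheme != null && !scheme.equals("file");
    }

    public static void main(String[] args) {
        URI uri = URI.create("hdfs://localhost:8022/user/administrator/myfile");
        System.out.println("scheme    = " + uri.getScheme());
        System.out.println("authority = " + uri.getAuthority());
        System.out.println("path      = " + uri.getPath());
        System.out.println("needs remote filesystem: " + needsRemoteFileSystem(uri));
    }
}
```

If only the path part (/user/administrator/myfile) is opened through the local filesystem, a "file not found" error like the one in the question would be expected, since that path exists in HDFS and not necessarily on the local disk.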

Answered 2021-06-04
