How do I use BufferedReader/FileReader in a Hadoop job?

Posted by tktrz96b on 2021-05-27 in Hadoop


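I am trying to run a Hadoop job to which I pass arguments (inputpath, outputpath, somestring): hadoop jar q2.jar Q2 /user/p2/points_small.csv /user/p2/output -D "hdfs://localhost:9000/user/p2/centroids.csv" 'dfs[a-z.]+'
I use a job runner and can successfully parse the desired string "hdfs://localhost:9000/user/p2/centroids.csv" into a variable called centroidFile that my mapper function can access. I am trying to open this file and read its data into an array or list that the mapper can use (the work done here is deliberately basic, for testing).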

public static class PointMapper extends Mapper <Object, Text, Text, Text> {

    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {

        List<List<String>> centers = new ArrayList<>();
        String line = "";
        try (BufferedReader br = new BufferedReader(new FileReader(centroidFile))) {
            while ((line = br.readLine()) != null) {
                String[] values = line.split(",");
                centers.add(Arrays.asList(values));
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

        String record = value.toString();
        String[] parts = record.split(",");

        context.write(new Text(parts[0]), new Text(centers.get(0).get(0)));
    }
}

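The job fails with the reason "NA", but I am fairly sure the problem is that I am writing something bad to the context: the CSV is never read, so the value is never written.
How can I successfully read the data at this file path during the job? My end goal is to compare my input data (a set of points) with the data I load from this CSV (another set of points).
Note that the code that reads the CSV from the obtained file path will eventually live in main(); I included it in map() here for readability.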
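One plausible explanation, sketched with the standard library only: java.io.FileReader resolves its argument against the local filesystem and does not interpret URI schemes, so an hdfs:// string is treated as an ordinary local path that most likely does not exist. In the mapper above the resulting IOException is caught and only printed, which would leave centers empty and make centers.get(0) throw. The names LocalReaderDemo, opensWithFileReader and schemeOf are hypothetical and exist only for illustration.

```java
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;

// Hypothetical demo class (not from the original post).
public class LocalReaderDemo {

    // Returns true only if java.io.FileReader can open the given string
    // as a path on the local filesystem; any I/O failure yields false.
    public static boolean opensWithFileReader(String path) {
        try (FileReader reader = new FileReader(path)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Extracts the URI scheme of a location string, e.g. "hdfs" or "file".
    public static String schemeOf(String location) {
        return URI.create(location).getScheme();
    }

    public static void main(String[] args) {
        String centroidFile = "hdfs://localhost:9000/user/p2/centroids.csv";
        System.out.println("scheme: " + schemeOf(centroidFile));
        System.out.println("opened by FileReader: " + opensWithFileReader(centroidFile));
    }
}
```

On a machine with no local directory literally named hdfs:, the second line should report that FileReader could not open the path, even if the file exists in HDFS.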
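A minimal sketch of one way to structure the load, offered as an assumption rather than a confirmed fix: keep the CSV parsing independent of where the bytes come from, so the same code works for a local file or an HDFS stream. In a real job the Reader would typically be built from Hadoop's FileSystem.get(URI, Configuration) and FileSystem.open(Path), and the load would run once in the mapper's setup(Context) rather than on every map() call. Those Hadoop calls appear only in a comment because the block is meant to run without Hadoop on the classpath. CentroidLoader and parseCentroids are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the original post).
public class CentroidLoader {

    // Reads comma-separated rows from any Reader, skipping blank lines.
    // I/O errors are rethrown instead of swallowed, so a failed load
    // stops the caller rather than leaving an empty list behind.
    public static List<List<String>> parseCentroids(Reader source) {
        List<List<String>> centers = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue;
                }
                centers.add(Arrays.asList(line.split(",")));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return centers;
    }

    public static void main(String[] args) {
        // Stand-in data. Inside Mapper.setup(Context) the Reader could instead be
        // built like this (assumption, requires the Hadoop client libraries):
        //   FileSystem fs = FileSystem.get(URI.create(centroidFile), context.getConfiguration());
        //   Reader source = new InputStreamReader(fs.open(new Path(centroidFile)), StandardCharsets.UTF_8);
        Reader source = new StringReader("1.0,2.0\n\n3.5,4.5\n");
        List<List<String>> centers = parseCentroids(source);
        System.out.println(centers);
    }
}
```

Storing the parsed list in a mapper field during setup() means each map() call only compares the current point against centroids already in memory.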

Java hadoop hdfs mapreduce csv

Source: https://stackoverflow.com/questions/64397353/how-do-i-use-bufferedreader-filereader-in-a-hadoop-job
