Writing plain text from HDFS into HBase: "Output directory not set"

r1zk6ea1  posted on 2021-05-29 in Hadoop

In a map-only job, I read a file from HDFS and use its contents to update HBase.
Versions: hadoop 2.5.1, hbase 1.0.0
The exception is as follows:

Exception in thread "main" org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.

The problem may be in the following line:

job.setOutputFormatClass(TableOutputFormat.class);

The compiler reports this for that line:

The method setOutputFormatClass(Class<? extends OutputFormat>) in the type Job is not applicable for the arguments (Class<TableOutputFormat>)

The code is as follows:

public class HdfsAppend2HbaseUtil extends Configured implements Tool{

    public static class HdfsAdd2HbaseMapper extends Mapper<Text, Text, ImmutableBytesWritable, Put>{

        public void map(Text ikey, Text ivalue, Context context) 
                throws IOException, InterruptedException {

            String oldIdList = HBaseHelper.getValueByKey(ikey.toString());

            StringBuffer sb = new StringBuffer(oldIdList);
            String newIdList = ivalue.toString();
            sb.append("\t" + newIdList);

            Put p = new Put(ikey.toString().getBytes());
            p.addColumn("idFam".getBytes(), "idsList".getBytes(), sb.toString().getBytes());
            context.write(new ImmutableBytesWritable(), p);

        }

    }

    public int run(String[] paths) throws Exception {

        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "master,salve1");
        conf.set("hbase.zookeeper.property.clientPort", "2181");

        Job job = Job.getInstance(conf,"AppendToHbase");
        job.setJarByClass(cn.edu.hadoop.util.HdfsAppend2HbaseUtil.class);

        job.setInputFormatClass(KeyValueTextInputFormat.class);

        job.setMapperClass(HdfsAdd2HbaseMapper.class);
        job.setNumReduceTasks(0);

        job.setOutputFormatClass(TableOutputFormat.class); 

        job.getConfiguration().set(TableOutputFormat.OUTPUT_TABLE, "reachableTable");

        FileInputFormat.setInputPaths(job, new Path(paths[0]));

        job.setOutputKeyClass(ImmutableBytesWritable.class);
        job.setOutputValueClass(Put.class);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {

        System.out.println("Append Start: ");

        long time1 = System.currentTimeMillis();
        long time2;
        String[] pathsStr = {Const.TwoDegreeReachableOutputPathDetail};

        int exitCode = ToolRunner.run(new HdfsAppend2HbaseUtil(), pathsStr);
        time2 = System.currentTimeMillis();
        System.out.println("Append Cost " + "\t" + (time2-time1)/1000 +" s");

        System.exit(exitCode);
    }
}

rjee0c15  #1

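You did not specify an output directory. That is where the job writes its output, and it has to be set just like the input path.
Set it like this: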

FileOutputFormat.setOutputPath(job, new Path(<output path>));

cgh8pdjw  #2

I finally found the cause, and it is what I suspected:

job.setOutputFormatClass(TableOutputFormat.class);

The compiler reports this for that line:

The method setOutputFormatClass(Class<? extends OutputFormat>) in the type Job is not applicable for the arguments (Class<TableOutputFormat>)

We actually need to import

org.apache.hadoop.hbase.mapreduce.TableOutputFormat

and not

org.apache.hadoop.hbase.mapred.TableOutputFormat
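
As an illustration of where that import sits, here is a minimal sketch of a map-only job that writes to HBase through the new-API class. It is not the poster's exact code: the class names AppendToHBaseJob and LineToPutMapper are hypothetical, the lookup through the poster's own HBaseHelper is left out, and the table, family and qualifier names are simply copied from the question. It assumes the Hadoop 2.x and HBase 1.0 client jars are on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
// New-API class from the "mapreduce" package, not the "mapred" one.
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical class name, used only for this sketch.
public class AppendToHBaseJob extends Configured implements Tool {

    // Turns each "key<TAB>value" input line into one Put on row "key".
    public static class LineToPutMapper
            extends Mapper<Text, Text, ImmutableBytesWritable, Put> {
        @Override
        protected void map(Text key, Text value, Context context)
                throws IOException, InterruptedException {
            byte[] row = Bytes.toBytes(key.toString());
            Put put = new Put(row);
            // Family and qualifier names are taken from the question.
            put.addColumn(Bytes.toBytes("idFam"), Bytes.toBytes("idsList"),
                    Bytes.toBytes(value.toString()));
            context.write(new ImmutableBytesWritable(row), put);
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        // Layer hbase-site.xml settings over whatever ToolRunner passed in.
        Configuration conf = HBaseConfiguration.create(getConf());
        conf.set(TableOutputFormat.OUTPUT_TABLE, "reachableTable");

        Job job = Job.getInstance(conf, "AppendToHbase");
        job.setJarByClass(AppendToHBaseJob.class);

        job.setInputFormatClass(KeyValueTextInputFormat.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));

        job.setMapperClass(LineToPutMapper.class);
        job.setNumReduceTasks(0); // map-only: mapper output goes straight to the table

        // Compiles because this TableOutputFormat extends the new-API OutputFormat.
        job.setOutputFormatClass(TableOutputFormat.class);
        job.setOutputKeyClass(ImmutableBytesWritable.class);
        job.setOutputValueClass(Put.class);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new AppendToHBaseJob(), args));
    }
}
```

The job sets no file output path; the target table is named through TableOutputFormat.OUTPUT_TABLE instead.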

The mapred class extends org.apache.hadoop.mapred.FileOutputFormat.
See: https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapred/tableoutputformat.html
The mapreduce class extends org.apache.hadoop.mapreduce.OutputFormat, which is the type that Job.setOutputFormatClass expects.
See:
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/tableoutputformat.html
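
To see why the compiler rejects one class and accepts the other, the two hierarchies can be modelled with plain Java stand-ins. This sketch needs no Hadoop or HBase jars. Every class in it is a simplified, hypothetical placeholder for the real class named in its comment, and setOutputFormatClass here only imitates the shape of the real Job method's parameter.

```java
public class OutputFormatMismatchDemo {

    // Stand-in for org.apache.hadoop.mapreduce.OutputFormat (new API).
    public abstract static class NewOutputFormat {}

    // Stand-in for org.apache.hadoop.mapred.FileOutputFormat (old API).
    // It shares no supertype with NewOutputFormat other than Object.
    public abstract static class OldFileOutputFormat {}

    // Stand-in for org.apache.hadoop.hbase.mapreduce.TableOutputFormat.
    public static class NewTableOutputFormat extends NewOutputFormat {}

    // Stand-in for org.apache.hadoop.hbase.mapred.TableOutputFormat.
    public static class OldTableOutputFormat extends OldFileOutputFormat {}

    // Same parameter shape as Job.setOutputFormatClass(Class<? extends OutputFormat>).
    public static String setOutputFormatClass(Class<? extends NewOutputFormat> cls) {
        return cls.getSimpleName();
    }

    // Runtime version of the check the compiler performs on the bounded wildcard.
    public static boolean fitsNewApi(Class<?> cls) {
        return NewOutputFormat.class.isAssignableFrom(cls);
    }

    public static void main(String[] args) {
        // Accepted: the class is a subclass of the new-API base type.
        System.out.println(setOutputFormatClass(NewTableOutputFormat.class));

        // Rejected at compile time with a "not applicable for the arguments" error:
        // setOutputFormatClass(OldTableOutputFormat.class);

        System.out.println(fitsNewApi(NewTableOutputFormat.class)); // true
        System.out.println(fitsNewApi(OldTableOutputFormat.class)); // false
    }
}
```

Both real classes share the simple name TableOutputFormat, so the import alone decides which hierarchy the compiler sees.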
Finally, thank you all very much!
