Check whether an HBase row exists when using MapReduce to import data into HBase

Posted by 9jyewag0 on 2021-06-02 in Hadoop


I have a table in HBase named table1, whose rows look like this:

<Image_Id, <float,float,.....>>

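where Image_Id is the image ID, followed by a list of floats.
I want to read the data from this table and store new values in another table (say the new table is named Table2, which is empty at first).
I use MapReduce to implement this task: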

TableMapReduceUtil.initTableMapperJob(
    "Table1",         // input table
    scan,             // Scan instance to control CF and attribute selection
    MyMapper.class,   // mapper class
    null,             // mapper output key
    null,             // mapper output value
    job);
TableMapReduceUtil.initTableReducerJob(
    "Table2",         // output table
    null,             // reducer class
    job);
job.setNumReduceTasks(0);  // map-only job

For example, in the mapper:

public static class MyMapper extends TableMapper<ImmutableBytesWritable, Put> {

    @Override
    public void map(ImmutableBytesWritable row, Result value, Context context)
            throws IOException, InterruptedException {
        // read data in table 1 here
    }
}

Suppose that after reading a value from each row of table1, I compute a hash of it like this:

int hcode = hash(GetRowValue())

and then insert this hcode into Table2 like this:

context.write(hcode, Image_ID);

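where the row key is the hash code and the value is the corresponding Image_Id from table1.
The problem is that when hcode is equal to one that was already written, I want to end up with <hashCode, <Image_ID1, Image_ID2>>, using a List<> to hold the list of values. To do that, I would check whether the row key already exists in Table2, and then either insert a new row or update the current one.
But after running my code, I see that Table2 is only populated with data after the MapReduce job has finished. While the job is running, Table2 is still empty.
Edited: so, is there any way to implement a workflow like checking whether a row exists in an HBase table and then updating its value while using MapReduce?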
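To make the desired result concrete, here is a minimal sketch in plain Java that does not touch HBase at all. It only simulates the grouping: every Image_Id whose feature vector has the same hash ends up in one list under that hash. The class name `HashGroupingSketch`, the in-memory maps standing in for table1 and Table2, and the use of `Arrays.hashCode` in place of the unspecified `hash(...)` function are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HashGroupingSketch {

    // Stand-in for the question's unspecified hash(...) function (assumption).
    public static int hash(float[] features) {
        return Arrays.hashCode(features);
    }

    // Groups image IDs by the hash of their feature vector.
    // The input map plays the role of table1, the returned map that of Table2.
    public static Map<Integer, List<String>> groupByHash(Map<String, float[]> table1) {
        Map<Integer, List<String>> table2 = new LinkedHashMap<>();
        for (Map.Entry<String, float[]> row : table1.entrySet()) {
            int hcode = hash(row.getValue());
            // "Insert a new row or update the current one" in a single step:
            // create the list the first time hcode is seen, then append to it.
            table2.computeIfAbsent(hcode, k -> new ArrayList<>()).add(row.getKey());
        }
        return table2;
    }

    public static void main(String[] args) {
        Map<String, float[]> table1 = new LinkedHashMap<>();
        table1.put("Image_ID1", new float[] {0.1f, 0.2f});
        table1.put("Image_ID2", new float[] {0.1f, 0.2f}); // same vector, same hash
        table1.put("Image_ID3", new float[] {0.9f, 0.8f});

        for (Map.Entry<Integer, List<String>> e : groupByHash(table1).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

In an actual MapReduce job, the same grouping is what the shuffle phase provides when the mapper emits (hcode, Image_ID) pairs and a reducer is configured: the reducer receives each hcode once, together with all of its image IDs. That would mean not setting the number of reduce tasks to 0 as in the job setup above.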

hadoop hbase mapreduce

Source: https://stackoverflow.com/questions/39632214/check-hbase-row-exist-in-using-mapreduce-to-import-data-into-hbase
