How to use HBase ColumnRangeFilter with Spark

Posted by tv6aics1 on 2021-06-10 in HBase


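I am trying to figure out how to use HBase's ColumnRangeFilter from Spark.
I looked at org.apache.hadoop.hbase.mapreduce.TableInputFormat, but its API does not expose ColumnRangeFilter.
So I do not know how to apply the filter from Spark.
For example, I want to use a ColumnRangeFilter that starts at "20170225" and ends at "20170305".
I can already scan a range of rows with the code below.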

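import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat

// Scan rows of the "like_count" table between startRow and endRow (sc is the SparkContext).
val conf = HBaseConfiguration.create()
conf.set(TableInputFormat.INPUT_TABLE, "like_count")
val startRow = "001"
val endRow = "100"
conf.set(TableInputFormat.SCAN_ROW_START, startRow)
conf.set(TableInputFormat.SCAN_ROW_STOP, endRow)
sc.newAPIHadoopRDD(conf, classOf[TableInputFormat], classOf[ImmutableBytesWritable], classOf[Result])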

What code do I need to add? Any suggestions would be appreciated.

hbase apache-spark

Source: https://stackoverflow.com/questions/42604507/how-to-use-hbase-columnrangefilter-by-spark

1 Answer


h7wcgrx31#

Use a Scan object to set the start and stop rows, set that Scan object in the HBase configuration, and then pass the configuration to TableInputFormat. See https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/scan.html

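// Row-key range for the scan; startRow and endRow are byte[] row keys.
Scan scan = new Scan(startRow, endRow);
scan.setMaxVersions(MAX_VERSIONS);

// ColumnRangeFilter bounds are column qualifiers, not row keys.
// The bounds below are the ones from the question; both ends are inclusive.
scan.setFilter(new ColumnRangeFilter(Bytes.toBytes("20170225"), true, Bytes.toBytes("20170305"), true));

HBaseConfiguration.merge(conf, HBaseConfiguration.create(conf));

conf.set(TableInputFormat.INPUT_TABLE, username + ":" + path);
// Serialize the Scan into the configuration; recent HBase versions expose this
// helper as the public static method TableMapReduceUtil.convertScanToString.
conf.set(TableInputFormat.SCAN, TableMapReduceUtil.convertScanToString(scan));

tableInputFormat.setConf(conf);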
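
To see which columns a range such as "20170225" to "20170305" would keep, it can help to reproduce the comparison outside HBase. ColumnRangeFilter compares column qualifiers as bytes in lexicographic order, so fixed-width yyyyMMdd qualifiers sort chronologically. The sketch below is only an illustration using the Java standard library: ColumnRangeSketch, compareBytes, inRange and select are hypothetical names, not HBase APIs, and the sample qualifiers are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: mimics the inclusive/exclusive
// range check that a column range filter applies to column qualifiers.
final class ColumnRangeSketch {

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        // On a shared prefix, the shorter array sorts first.
        return a.length - b.length;
    }

    // True if qualifier lies between min and max under the given inclusivity flags.
    static boolean inRange(String qualifier, String min, boolean minInclusive,
                           String max, boolean maxInclusive) {
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        int lo = compareBytes(q, min.getBytes(StandardCharsets.UTF_8));
        int hi = compareBytes(q, max.getBytes(StandardCharsets.UTF_8));
        boolean aboveMin = minInclusive ? lo >= 0 : lo > 0;
        boolean belowMax = maxInclusive ? hi <= 0 : hi < 0;
        return aboveMin && belowMax;
    }

    // Keeps the qualifiers that fall in [min, max], both ends inclusive.
    static List<String> select(List<String> qualifiers, String min, String max) {
        List<String> kept = new ArrayList<>();
        for (String q : qualifiers) {
            if (inRange(q, min, true, max, true)) {
                kept.add(q);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample qualifiers around the bounds from the question.
        List<String> qualifiers = Arrays.asList(
                "20170224", "20170225", "20170301", "20170305", "20170306");
        // prints [20170225, 20170301, 20170305]
        System.out.println(select(qualifiers, "20170225", "20170305"));
    }
}
```

Because the comparison is byte-wise, qualifiers of differing lengths (for example "2017031") would not necessarily sort the way a date would, so this range only behaves chronologically when every qualifier uses the same fixed-width format.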

Posted on 2021-06-10
