Writing to HBase from multiple processes

6psbrbz9 posted on 2021-05-29 in Hadoop

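I have to load about 2.5 TB of data into an HBase database. A single process takes far too long to write it, so I am trying to run multiple processes to do the work. I am not sure whether this is safe, though, because I am calling batch.commit(finalize=True) from several processes at the same time. The HBase version on my system is 1.1.2.2.6.1.0-129.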
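A minimal sketch of the multi-process pattern described above, under stated assumptions. StubBatch, chunk_rows, and write_chunk are hypothetical names introduced only for illustration, and the stub merely mimics the insert / commit(finalize=True) calls instead of contacting the HBase REST server. The point it shows is that each worker process builds and commits its own batch over a disjoint slice of the rows, so no batch object is shared between processes.

```python
from multiprocessing import Pool


class StubBatch:
    """Hypothetical stand-in for a REST client batch object; it only buffers rows."""

    def __init__(self):
        self.rows = []
        self.committed = 0

    def insert(self, row_key, data):
        # Buffer the row locally until commit() is called.
        self.rows.append((row_key, data))

    def commit(self, finalize=True):
        # A real client would send the buffered rows to the HBase REST server here.
        self.committed += len(self.rows)
        self.rows = []


def chunk_rows(rows, n_chunks):
    """Split rows into n_chunks disjoint slices so no row is written twice."""
    return [rows[i::n_chunks] for i in range(n_chunks)]


def write_chunk(chunk):
    """Run in one worker process: build a private batch and commit it once."""
    batch = StubBatch()  # created inside the worker, never shared
    for row_key, data in chunk:
        batch.insert(row_key, data)
    batch.commit(finalize=True)
    return batch.committed


if __name__ == "__main__":
    # Made-up sample rows; the real data set is far larger.
    rows = [("row-%d" % i, {"cf": {"col": str(i)}}) for i in range(1000)]
    with Pool(processes=4) as pool:
        written = pool.map(write_chunk, chunk_rows(rows, 4))
    print(sum(written))
```

Replacing StubBatch with a real client batch would turn each commit into a separate request to the REST server, which is the concurrent load the question is asking about.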
I am doing all of the writes through the HBase REST API. With several writers running at once, the following error shows up on my server:

2017-08-07 11:01:18,311 INFO  [hconnection-0x5e834655-shared--pool3-t52497] client.AsyncProcess: #7542, table=clueweb12, attempt=12/35 failed=124ops, last exception: org.apache.hadoop.hbase.RegionTooBusyException: org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, regionName=clueweb12,http://www.innovativeusers.org/list/archives/2006/msg058,1502034291085.f923654226ace7491d61f8b67764c46f., server=node9.local,16020,1499079576247, memstoreSize=539000216, blockingMemStoreSize=536870912
        at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:3824)
        at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2977)
        at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2928)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:748)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:708)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2124)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32393)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2141)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
 on node9.local,16020,1499079576247, tracking started null, retrying after=20140ms, replay=124ops

Can this error cause data loss?
