Integrating Apache Nutch with Cloudera HBase and Solr

wfauudbj posted on 2021-06-02 in Hadoop

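I am trying to integrate Cloudera Hadoop with Apache Nutch. Unfortunately, when I try to crawl a website, the exception below is thrown. I have configured Nutch with plain HBase and Solr without any problems, but Cloudera seems to have made changes inside HBase that Nutch's Gora module cannot understand.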

./crawl urls/seed.txt testCrawl localhost:8983/solr/ 2
InjectorJob: starting at 2014-05-18 17:29:33
InjectorJob: Injecting urlDir: urls/seed.txt
InjectorJob: org.apache.gora.util.GoraException: java.lang.RuntimeException: java.lang.NumberFormatException: For input string: "60000��u��Tm#PBUF
"
localhost�������("
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:221)
at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:251)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:273)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:282)
Caused by: java.lang.RuntimeException: java.lang.NumberFormatException: For input string: "60000��u��Tm#PBUF
"
localhost�������("
at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:127)
at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
... 7 more
Caused by: java.lang.NumberFormatException: For input string: "60000��u��Tm#PBUF
"
localhost�������("
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:458)
at java.lang.Integer.parseInt(Integer.java:499)
at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:63)
at org.apache.hadoop.hbase.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:63)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:354)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:94)
at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:109)
... 9 more
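
One possible reading of the trace, offered as an assumption and not a confirmed diagnosis: the HBase client that Gora loads expects the master address stored in ZooKeeper to be plain `host:port` text, while the Cloudera HBase server appears to write a binary payload (note the `PBUF` marker in the message). The port part then contains non-digit bytes and `Integer.parseInt` fails. The sketch below only reproduces that failure mode; `MasterAddressParseSketch`, `parsePort`, and `parses` are hypothetical names, and the code is not the real `HServerAddress` source.

```java
public class MasterAddressParseSketch {

    // Hypothetical stand-in for an old client that treats the stored
    // master address as "host:port" text. Not the real HBase code.
    static int parsePort(String hostAndPort) {
        int colon = hostAndPort.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Not a host:port pair: " + hostAndPort);
        }
        // Throws NumberFormatException if anything but digits follows the colon.
        return Integer.parseInt(hostAndPort.substring(colon + 1));
    }

    // Returns true when the port part parses cleanly, false otherwise.
    static boolean parses(String hostAndPort) {
        try {
            parsePort(hostAndPort);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Plain text, as an old client would expect.
        System.out.println(parsePort("localhost:60000"));

        // Assumed shape of a newer payload: the port followed by binary data.
        // The exact bytes here are made up for illustration.
        String binaryLike = "localhost:60000\u0001PBUF\n";
        System.out.println(parses(binaryLike));
    }
}
```

If this reading is right, the mismatch is between the HBase client jar on Nutch's classpath and the HBase version that Cloudera runs, so comparing those two versions would be a reasonable first check.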

Best regards.
