Selenium WebDriver

bfnvny8b posted on 2021-06-02 in Hadoop

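We have developed a crawler that uses Nutch 2 with an HBase backend. We wrote a plugin for the web parser that uses Selenium WebDriver. Everything works in local mode. However, when we try to deploy it to the cluster using Nutch's deploy mode, we get an error saying "Unable to successfully parse". The error is below.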
java.util.concurrent.ExecutionException: java.lang.NoSuchFieldError: INSTANCE
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:206)
    at org.apache.nutch.parse.ParseUtil.runParser(ParseUtil.java:164)
    at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:146)
    at org.apache.nutch.parse.ParserChecker.run(ParserChecker.java:142)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.parse.ParserChecker.main(ParserChecker.java:214)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.NoSuchFieldError: INSTANCE
    at org.apache.http.conn.ssl.SSLConnectionSocketFactory.<clinit>(SSLConnectionSocketFactory.java:144)
    at org.openqa.selenium.remote.internal.HttpClientFactory.getClientConnectionManager(HttpClientFactory.java:71)
    at org.openqa.selenium.remote.internal.HttpClientFactory.<init>(HttpClientFactory.java:57)
    at org.openqa.selenium.remote.internal.HttpClientFactory.<init>(HttpClientFactory.java:60)
    at org.openqa.selenium.remote.internal.ApacheHttpClient$Factory.getDefaultHttpClientFactory(ApacheHttpClient.java:251)
    at org.openqa.selenium.remote.internal.ApacheHttpClient$Factory.<init>(ApacheHttpClient.java:228)
    at org.openqa.selenium.remote.HttpCommandExecutor.getDefaultClientFactory(HttpCommandExecutor.java:96)
    at org.openqa.selenium.remote.HttpCommandExecutor.<init>(HttpCommandExecutor.java:70)
    at org.openqa.selenium.remote.HttpCommandExecutor.<init>(HttpCommandExecutor.java:58)
    at org.openqa.selenium.firefox.internal.NewProfileExtensionConnection.start(NewProfileExtensionConnection.java:97)
    at org.openqa.selenium.firefox.FirefoxDriver.startClient(FirefoxDriver.java:271)
    at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:117)
    at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:216)
    at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:211)
    at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:207)
    at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:124)
    at org.apache.nutch.store.readable.seleniumhandlers.HttpWebClient$1.initialValue(HttpWebClient.java:148)
    at org.apache.nutch.store.readable.seleniumhandlers.HttpWebClient$1.initialValue(HttpWebClient.java:49)
    at java.lang.ThreadLocal.setInitialValue(ThreadLocal.java:180)
    at java.lang.ThreadLocal.get(ThreadLocal.java:170)
    at org.apache.nutch.store.readable.seleniumhandlers.HttpWebClient.getHtmlPage(HttpWebClient.java:318)
    at org.apache.nutch.store.readable.seleniumhandlers.HttpWebClient.getHtmlPage(HttpWebClient.java:309)
    at org.apache.nutch.store.readable.parserhandlers.JsoupToKepia.constructJson(JsoupToKepia.java:108)
    at org.apache.nutch.store.readable.StoreReadable.addJsonToPage(StoreReadable.java:349)
    at org.apache.nutch.store.readable.StoreReadable.getParse(StoreReadable.java:311)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:36)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:23)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
It looks like Selenium does not work on Hadoop. I think this is the related issue. Is it that Selenium cannot run on Hadoop, or are there any suggestions for solving this?
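
For anyone trying to narrow this down: a NoSuchFieldError generally means a class was compiled against one version of a dependency while a different version of that dependency is loaded at run time. The sketch below is one possible way to check where a class from the trace is actually loaded from in each mode. The class name ClassOrigin and the method originOf are hypothetical names made up for this example; the two class names passed in are taken from the stack trace above. It is only a diagnostic aid, and it assumes it is run with the same classpath as the failing job.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Nutch, Hadoop, or Selenium.
final class ClassOrigin {

    // Returns the location (usually a jar URL) a class was loaded from.
    static String originOf(String className) {
        try {
            // initialize=false: do not run static initializers, since the
            // static initializer is exactly what fails in the trace above.
            Class<?> cls = Class.forName(
                    className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // JDK core classes have no code source.
            return src == null ? "bootstrap/unknown" : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Class names copied from the stack trace in the question.
        String[] names = {
            "org.apache.http.conn.ssl.SSLConnectionSocketFactory",
            "org.openqa.selenium.firefox.FirefoxDriver"
        };
        for (String name : names) {
            System.out.println(name + " -> " + originOf(name));
        }
    }
}
```

If the location printed for SSLConnectionSocketFactory differs between local mode and deploy mode, that would suggest the cluster classpath supplies a different Apache HttpClient jar than the one Selenium was built against. That is an assumption to verify, not a confirmed diagnosis of this error.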
