Nutch crawl using protocol-selenium with PhantomJS launched as a Mesos task: org.openqa.selenium.NoSuchElementException

Posted by 9bfwbjaz on 2021-06-26 in Mesos


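I am trying to crawl an AJAX-based site with Nutch, using the protocol-selenium plugin with the PhantomJS driver. I am using apache-nutch-1.13, compiled from Nutch's GitHub repository. The crawls are launched as tasks on a system managed by Mesos. When I run Nutch's crawl script from a terminal on the server, everything works and the site is crawled as expected. However, when I run the same crawl script with the same arguments inside a Mesos task, Nutch throws an exception: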

fetch of http://XXXXX failed with: java.lang.RuntimeException: org.openqa.selenium.NoSuchElementException: {"errorMessage":"Unable to find element with tag name 'body'","request":{"headers":{"Accept-Encoding":"gzip,deflate","Connection":"Keep-Alive","Content-Length":"35","Content-Type":"application/json; charset=utf-8","Host":"localhost:12215","User-Agent":"Apache-HttpClient/4.3.5 (java 1.5)"},"httpVersion":"1.1","method":"POST","post":"{\"using\":\"tag name\",\"value\":\"body\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/a7f98ec0-b8aa-11e6-8b84-232b0d8e1024/element"}}

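My first guess was that something was off with the environment variables (HADOOP_HOME, PATH, CLASSPATH, ...), but I set the same variables in the Nutch script as in the terminal, and the result is still the same.
Any idea what I am doing wrong?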
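One way to check the environment more systematically is to dump it with `env` in both contexts (the terminal session and the Mesos task) and diff the two dumps. The following is only a minimal sketch: the helper names `parse_env` and `diff_env` and the sample values are hypothetical, and it assumes each variable fits on a single `KEY=VALUE` line.

```python
def parse_env(dump: str) -> dict:
    """Parse `env` output (one KEY=VALUE per line) into a dict.

    Assumption: values do not span multiple lines; lines without '=' are skipped.
    """
    result = {}
    for line in dump.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            result[key] = value
    return result


def diff_env(terminal: dict, task: dict) -> dict:
    """Return {name: (terminal_value, task_value)} for every variable that differs.

    A value of None means the variable is not set in that context.
    """
    diffs = {}
    for key in sorted(set(terminal) | set(task)):
        a, b = terminal.get(key), task.get(key)
        if a != b:
            diffs[key] = (a, b)
    return diffs


if __name__ == "__main__":
    # Hypothetical dumps; in practice, capture them with `env > terminal.env`
    # in the shell and `env > task.env` at the top of the Mesos task command.
    terminal_dump = "PATH=/usr/bin:/opt/phantomjs/bin\nHADOOP_HOME=/opt/hadoop\nDISPLAY=:0"
    task_dump = "PATH=/usr/bin\nHADOOP_HOME=/opt/hadoop"
    for name, (term, task) in diff_env(parse_env(terminal_dump), parse_env(task_dump)).items():
        print(f"{name}: terminal={term!r} task={task!r}")
```

A diff like this also surfaces variables that were not set by hand (for example the working directory, user, or home directory), which can differ between an interactive shell and a task started by an executor.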

selenium mesos nutch phantomjs

Source: https://stackoverflow.com/questions/40937333/nutch-crawl-using-protocol-selenium-with-phantomjs-launched-as-a-mesos-task-or
