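I'm a new Java developer.
I'm having a problem with jsoup. A few days ago I wrote a web crawler (purely to practice and learn something new), and it worked: there were bugs in the code that I still had to fix, but at least it ran from the console. I mean, it didn't produce the expected result, but it ran. Now I don't understand what happened, and I'm having trouble compiling it.
This is the error in the output: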
Scanning for projects...
--------------< com.webcrawler.jsoupexample:jsoupexample >--------------
Building jsoupexample 1.0-SNAPSHOT
--------------------------------[ jar ]---------------------------------
--- maven-resources-plugin:2.6:resources (default-resources) @ jsoupexample ---
Using platform encoding (Cp1252 actually) to copy filtered resources, i.e. build is platform dependent!
Copying 0 resource
--- maven-compiler-plugin:3.1:compile (default-compile) @ jsoupexample ---
Changes detected - recompiling the module!
File encoding has not been set, using platform encoding Cp1252, i.e. build is platform dependent!
Compiling 2 source files to C:\Users\**\**\**\**\**\web-crawler-jsoup-example-master\webcrawler\target\classes
-------------------------------------------------------------
COMPILATION ERROR :
-------------------------------------------------------------
com/webcrawler/jsoupexample/ParserEngine.java:[3,17] package org.jsoup does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[4,23] package org.jsoup.nodes does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[5,23] package org.jsoup.nodes does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[6,24] package org.jsoup.select does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[22,9] cannot find symbol
symbol: class Document
location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[22,24] cannot find symbol
symbol: variable Jsoup
location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[23,9] cannot find symbol
symbol: class Elements
location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[25,14] cannot find symbol
symbol: class Element
location: class com.webcrawler.jsoupexample.ParserEngine
8 errors
-------------------------------------------------------------
------------------------------------------------------------------------
BUILD FAILURE
------------------------------------------------------------------------
Total time: 3.465 s
Finished at: 2020-12-06T20:00:55-03:00
------------------------------------------------------------------------
Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project jsoupexample: Compilation failure: Compilation failure:
com/webcrawler/jsoupexample/ParserEngine.java:[3,17] package org.jsoup does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[4,23] package org.jsoup.nodes does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[5,23] package org.jsoup.nodes does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[6,24] package org.jsoup.select does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[22,9] cannot find symbol
symbol: class Document
location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[22,24] cannot find symbol
symbol: variable Jsoup
location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[23,9] cannot find symbol
symbol: class Elements
location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[25,14] cannot find symbol
symbol: class Element
location: class com.webcrawler.jsoupexample.ParserEngine
-> [Help 1]
To see the full stack trace of the errors, re-run Maven with the -e switch.
Re-run Maven using the -X switch to enable full debug logging.
For more information about the errors and possible solutions, please read the following articles:
[Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Here is my pom.xml:
```
<modelVersion>4.0.0</modelVersion>
<groupId>com.webcrawler.jsoupexample</groupId>
<artifactId>jsoupexample</artifactId>
<version>1.0-SNAPSHOT</version>
<dependencies>
    <!-- https://mvnrepository.com/artifact/org.jsoup/jsoup -->
    <dependency>
        <groupId>org.jsoup</groupId>
        <artifactId>jsoup</artifactId>
        <version>1.11.3</version>
    </dependency>
</dependencies>
```
Here is `ParserEngine.java`:
```
package com.webcrawler.jsoupexample;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements; // these 4 imports show errors now; a few days ago they didn't

import java.io.IOException;
import java.util.ArrayList;

public class ParserEngine {
    private String baseUrl;
    private ArrayList<String> urlList;

    public ParserEngine(String baseUrl) {
        this.baseUrl = baseUrl;
        this.urlList = new ArrayList<String>();
    }

    public void crawl(String url) throws IOException {
        Document doc = Jsoup.connect(url).ignoreContentType(true).get();
        Elements links = doc.select("a[href]");
        // here I found the reason the crawler doesn't work the way I want,
        // but that isn't my actual issue; I want to be able to run it in the console again
        for (Element link : links) {
            String actualUrl = link.attr("abs:href");
            if (!urlList.contains(actualUrl) && actualUrl.startsWith(baseUrl)) {
                print(" * a: <%s> (%s)", actualUrl, trim(link.text(), 35));
                urlList.add(actualUrl);
                crawl(actualUrl);
            }
        }
    }

    private static void print(String msg, Object... args) {
        System.out.println(String.format(msg, args));
    }

    private static String trim(String s, int width) {
        if (s.length() > width)
            return s.substring(0, width - 1) + ".";
        else
            return s;
    }

    public String getBaseUrl() {
        return baseUrl;
    }

    public void setBaseUrl(String url) {
        baseUrl = url;
    }

    public ArrayList<String> getUrlList() {
        return urlList;
    }
}
```
Here is `Main.java`:
```
package com.webcrawler.jsoupexample;

import java.io.IOException;

public class Main {
    public static void main(String[] args) throws IOException {
        String url = "http://elfreneticoinformatico.com";
        ParserEngine parser = new ParserEngine(url);
        parser.crawl(parser.getBaseUrl());
        System.out.println("Crawler finished. Total URLs: " + parser.getUrlList().size());
    }
}
```
Can anyone help?
1 answer
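If you aren't already using a decent Java IDE (integrated development environment), I suggest NetBeans, IntelliJ IDEA, or Eclipse. NetBeans is the easiest to get started with because it works with Maven projects directly. An IDE can even detect problems before you build.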
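For what it's worth, `package org.jsoup does not exist` means the compiler could not find the jsoup jar on its classpath, even though the dependency is declared in the pom. Below is a minimal sketch for checking whether a class is visible on the classpath your code actually runs with. The class name `ClasspathCheck` and the helper `isOnClasspath` are made up for illustration and are not part of jsoup or Maven:

```java
class ClasspathCheck {
    // Hypothetical helper: returns true if the named class can be located
    // by the class loader that loaded this class (without initializing it).
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A JDK class, as a sanity check that the helper itself works
        System.out.println("java.util.ArrayList: " + isOnClasspath("java.util.ArrayList"));
        // The class the failing imports refer to
        System.out.println("org.jsoup.Jsoup: " + isOnClasspath("org.jsoup.Jsoup"));
    }
}
```

If the second line reports `false` when this is run from inside the project, that would suggest the jar is not reaching the classpath, and that the problem lies in the build setup rather than in `ParserEngine`.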
…I tried building it with Maven's `clean` and `compile` goals, in both IntelliJ IDEA Community 2020.3 and NetBeans 12.2, and… it builds and runs fine, so I'm not sure what is going on.
[NetBeans build output]
[IntelliJ run output]
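Since the same sources and pom build fine on my machine, one thing that may be worth ruling out is a missing or incomplete jsoup download in your local Maven repository. Below is a rough sketch, assuming the repository is in its default location (`~/.m2/repository`) and uses the standard `groupId/artifactId/version` directory layout. `LocalRepoCheck` and `artifactPath` are hypothetical names:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class LocalRepoCheck {
    // Builds the path where the default repository layout stores a jar:
    // <repo>/<groupId with dots as directories>/<artifactId>/<version>/<artifactId>-<version>.jar
    static Path artifactPath(Path repoRoot, String groupId, String artifactId, String version) {
        Path dir = repoRoot;
        for (String part : groupId.split("\\.")) {
            dir = dir.resolve(part);
        }
        return dir.resolve(artifactId)
                  .resolve(version)
                  .resolve(artifactId + "-" + version + ".jar");
    }

    public static void main(String[] args) {
        // Assumption: the local repository has not been moved from ~/.m2/repository
        Path repo = Paths.get(System.getProperty("user.home"), ".m2", "repository");
        // Coordinates taken from the pom.xml in the question
        Path jar = artifactPath(repo, "org.jsoup", "jsoup", "1.11.3");
        System.out.println(jar + (Files.isRegularFile(jar) ? " exists" : " is missing"));
    }
}
```

If the jar is reported missing, running the build again while online should let Maven fetch it. If it is there, the cause is probably elsewhere.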