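I have a web crawler that uses a third-party proxy provider (luminati.io), and it has worked for several websites without any problems. Today, however, I built a crawler for a new site and ran into javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake when trying to connect to the host endpoint. I am running JDK version 1.8.0_151. Here is the code for the proxy client: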
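    import com.gargoylesoftware.htmlunit.ProxyConfig;
    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.html.HtmlPage;
    import org.apache.http.HttpHost;
    import org.apache.http.auth.AuthScope;
    import org.apache.http.auth.UsernamePasswordCredentials;
    import org.apache.http.client.CredentialsProvider;
    import org.apache.http.impl.client.BasicCredentialsProvider;

    import java.io.IOException;
    import java.util.Random;

    public class ProxyClient implements Client
    {
        private static final String username = "my-luminati-username";
        private static final String password = "my-luminati-pw";
        private static final String theHostname = "zproxy.lum-superproxy.io";
        private static final int port = 22225;
        // Random session id, appended to the proxy login below
        public String session_id = Integer.toString(new Random().nextInt(Integer.MAX_VALUE));
        private WebClient theWebClient;

        public ProxyClient(String country){
            // Login format: <username>[-country-<cc>]-session-<id>
            String myLogin = username+(country!=null ? "-country-"+country : "")
                    +"-session-" + session_id;
            CredentialsProvider myCredentialsProvider = new BasicCredentialsProvider();
            myCredentialsProvider.setCredentials(new AuthScope(new HttpHost(theHostname, port)),
                    new UsernamePasswordCredentials(myLogin, password));
            theWebClient = new WebClient();
            theWebClient.getOptions().setCssEnabled(false);
            theWebClient.getOptions().setJavaScriptEnabled(false);
            // Route all requests through the proxy and attach its credentials
            theWebClient.getOptions().setProxyConfig(new ProxyConfig(theHostname, port));
            theWebClient.setCredentialsProvider(myCredentialsProvider);
        }

        public HtmlPage request(String aUrl) throws IOException
        {
            return theWebClient.getPage(aUrl);
        }

        public void close() throws IOException { theWebClient.close(); }
    }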
Below is a simplified version of the crawler I am running, where the client is passed in as a ProxyClient:
    public class BusinessSearchTaxCrawler
    {
        private String theBaseUrl = "https://apps.ilsos.gov/corporatellc/CorporateLlcController";
        private HtmlPage thePage;

        public BusinessSearchTaxCrawler()
        {
            thePage = null;
        }

        public boolean getBusinessMailingAddress(Client aClient, PropertyInfo aPropertyInfo)
        {
            try
            {
                thePage = aClient.request(theBaseUrl);
            } catch (Exception aE)
            {
                aE.printStackTrace();
            }
            return false;
        }
    }
Here is the full stack trace of the error:
javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1002)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:436)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.upgrade(DefaultHttpClientConnectionOperator.java:191)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.upgrade(PoolingHttpClientConnectionManager.java:392)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:428)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:177)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1324)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1241)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:348)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:417)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:402)
at ProxyClient.request(ProxyClient.java:39)
at BusinessSearchTaxCrawler.getBusinessMailingAddress(BusinessSearchTaxCrawler.java:24)
at Main.main(Main.java:27)
Caused by: java.io.EOFException: SSL peer shut down incorrectly
at sun.security.ssl.InputRecord.read(InputRecord.java:505)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
... 22 more
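Here are the steps I have taken to debug the problem:
- Running the code without the proxy works fine. I can connect to the host endpoint without any problem using the following code:
    WebClient theWebClient = new WebClient();
    theWebClient.getOptions().setCssEnabled(false);
    theWebClient.getOptions().setJavaScriptEnabled(false);
    thePage = theWebClient.getPage(theBaseUrl);
- I tried adding -Dhttps.protocols=TLSv1.1,TLSv1.2 to the VM options. This did not change the result.
- I ran the application with -Djavax.net.debug=all and observed the following in the debug output: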
Ignoring unsupported cipher suite: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_RSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_DHE_RSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_DHE_DSS_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_RSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_DHE_RSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_DHE_DSS_WITH_AES_128_CBC_SHA256 for TLSv1.1
%% No cached client session
***ClientHello, TLSv1.2
RandomCookie: GMT: 1591992531 bytes = { 169, 86, 174, 70, 252, 104, 167, 236, 15, 50, 36, 85, 3, 119, 151, 231, 179, 110, 140, 53, 104, 169, 249, 35, 95, 76, 189, 130 }
Session ID: {}
Cipher Suites: [TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256, TLS_RSA_WITH_AES_128_CBC_SHA256, TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256, TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_DSS_WITH_AES_128_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA, TLS_ECDH_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_DSS_WITH_AES_128_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_RSA_WITH_AES_128_GCM_SHA256, TLS_ECDH_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDH_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_DSS_WITH_AES_128_GCM_SHA256, TLS_EMPTY_RENEGOTIATION_INFO_SCSV]
Compression Methods: { 0 }
Extension elliptic_curves, curve names: {secp256r1, secp384r1, secp521r1, sect283k1, sect283r1, sect409k1, sect409r1, sect571k1, sect571r1, secp256k1}
Extension ec_point_formats, formats: [uncompressed]
Extension signature_algorithms, signature_algorithms: SHA512withECDSA, SHA512withRSA, SHA384withECDSA, SHA384withRSA, SHA256withECDSA, SHA256withRSA, SHA256withDSA, SHA224withECDSA, SHA224withRSA, SHA224withDSA, SHA1withECDSA, SHA1withRSA, SHA1withDSA
Extension server_name, server_name: [type=host_name (0), value=apps.ilsos.gov]
***
[write] MD5 and SHA1 hashes: len = 176
0000: 01 00 00 AC 03 03 5F E4 E1 D3 A9 56 AE 46 FC 68 ......_....V.F.h
0010: A7 EC 0F 32 24 55 03 77 97 E7 B3 6E 8C 35 68 A9 ...2$U.w...n.5h.
0020: F9 23 5F 4C BD 82 00 00 2C C0 23 C0 27 00 3C C0 .#_L....,.#.'.<.
0030: 25 C0 29 00 67 00 40 C0 09 C0 13 00 2F C0 04 C0 %.).g.@...../...
0040: 0E 00 33 00 32 C0 2B C0 2F 00 9C C0 2D C0 31 00 ..3.2.+./...-.1.
0050: 9E 00 A2 00 FF 01 00 00 57 00 0A 00 16 00 14 00 ........W.......
0060: 17 00 18 00 19 00 09 00 0A 00 0B 00 0C 00 0D 00 ................
0070: 0E 00 16 00 0B 00 02 01 00 00 0D 00 1C 00 1A 06 ................
0080: 03 06 01 05 03 05 01 04 03 04 01 04 02 03 03 03 ................
0090: 01 03 02 02 03 02 01 02 02 00 00 00 13 00 11 00 ................
00A0: 00 0E 61 70 70 73 2E 69 6C 73 6F 73 2E 67 6F 76 ..apps.ilsos.gov
main, WRITE: TLSv1.2 Handshake, length = 176
[Raw write]: length = 181
0000: 16 03 03 00 B0 01 00 00 AC 03 03 5F E4 E1 D3 A9 ..........._....
0010: 56 AE 46 FC 68 A7 EC 0F 32 24 55 03 77 97 E7 B3 V.F.h...2$U.w...
0020: 6E 8C 35 68 A9 F9 23 5F 4C BD 82 00 00 2C C0 23 n.5h..#_L....,.#
0030: C0 27 00 3C C0 25 C0 29 00 67 00 40 C0 09 C0 13 .'.<.%.).g.@....
0040: 00 2F C0 04 C0 0E 00 33 00 32 C0 2B C0 2F 00 9C ./.....3.2.+./..
0050: C0 2D C0 31 00 9E 00 A2 00 FF 01 00 00 57 00 0A .-.1.........W..
0060: 00 16 00 14 00 17 00 18 00 19 00 09 00 0A 00 0B ................
0070: 00 0C 00 0D 00 0E 00 16 00 0B 00 02 01 00 00 0D ................
0080: 00 1C 00 1A 06 03 06 01 05 03 05 01 04 03 04 01 ................
0090: 04 02 03 03 03 01 03 02 02 03 02 01 02 02 00 00 ................
00A0: 00 13 00 11 00 00 0E 61 70 70 73 2E 69 6C 73 6F .......apps.ilso
00B0: 73 2E 67 6F 76 s.gov
main, received EOFException: error
main, handling exception: javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
main, SEND TLSv1.2 ALERT: fatal, description = handshake_failure
main, WRITE: TLSv1.2 Alert, length = 2
[Raw write]: length = 7
0000: 15 03 03 00 02 02 28 ......(
main, called closeSocket()
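For readers interpreting the last raw write in the log above (15 03 03 00 02 02 28), the following is a minimal sketch that decodes a 7-byte TLS alert record using the field layout from RFC 5246. The class and method names are hypothetical and only a handful of alert codes are named. Note that the log marks this alert as SEND, so it is the local JVM reacting to the EOF, not a message received from the remote side.

```java
public class TlsAlertDecoder {

    /** Maps the two version bytes of a TLS record header to a protocol name. */
    public static String versionName(int major, int minor) {
        if (major == 3 && minor >= 1 && minor <= 3) {
            return minor == 1 ? "TLSv1" : "TLSv1." + (minor - 1);
        }
        return major + "." + minor;
    }

    /** Names a few alert descriptions from RFC 5246; others are shown as numbers. */
    public static String descriptionName(int code) {
        switch (code) {
            case 0:  return "close_notify";
            case 10: return "unexpected_message";
            case 40: return "handshake_failure";
            case 70: return "protocol_version";
            case 80: return "internal_error";
            default: return "alert_" + code;
        }
    }

    /**
     * Decodes one unencrypted alert record:
     * type (1 byte, 0x15) | version (2) | length (2) | level (1) | description (1).
     */
    public static String describe(int... record) {
        if (record.length != 7 || record[0] != 0x15) {
            throw new IllegalArgumentException("not a 7-byte TLS alert record");
        }
        int length = (record[3] << 8) | record[4];
        String level = record[5] == 1 ? "warning"
                : record[5] == 2 ? "fatal"
                : "level_" + record[5];
        return versionName(record[1], record[2]) + " alert, length " + length
                + ": " + level + ", " + descriptionName(record[6]);
    }

    public static void main(String[] args) {
        // Bytes copied from the final [Raw write] in the debug output.
        System.out.println(describe(0x15, 0x03, 0x03, 0x00, 0x02, 0x02, 0x28));
    }
}
```

Run on the bytes from the log, this reports a fatal handshake_failure alert for TLSv1.2, which matches the "SEND TLSv1.2 ALERT" line printed by the JVM.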
I ran openssl s_client -connect ilsos.gov:443 and observed:
SSL-Session:
Protocol : TLSv1.2
Cipher : ECDHE-RSA-AES128-GCM-SHA256
Session-ID: 66D1C471C9CA0DA2BCE6DA7675DF099D134BB0495C69D05B52AE0A5F4CF7976F
Session-ID-ctx:
Master-Key: A7B388126D92E03C1314EDE2815E9E8A38CF10FD745CB13C2F6163E0FBB05F35CF17CAF18128F072FCF1D1B03A4C3A11
Start Time: 1608833542
Timeout : 7200 (sec)
Verify return code: 0 (ok)
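The openssl check above connects directly to ilsos.gov:443, so it exercises neither the proxy nor the apps.ilsos.gov host named in the failing request. A rough way to repeat the handshake along the failing path with only JDK classes might look like the sketch below. It is a diagnostic sketch, not a fix: the class and method names are hypothetical, the credentials are placeholders passed on the command line, and it assumes the proxy accepts Basic authentication sent with the CONNECT request.

```java
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ProxyHandshakeProbe {

    /** Builds an HTTP CONNECT request carrying Basic proxy credentials. */
    public static String connectRequest(String host, int port, String user, String password) {
        String token = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.ISO_8859_1));
        return "CONNECT " + host + ":" + port + " HTTP/1.1\r\n"
                + "Host: " + host + ":" + port + "\r\n"
                + "Proxy-Authorization: Basic " + token + "\r\n\r\n";
    }

    /** Reads one line byte by byte so nothing past the line is consumed. */
    static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                line.append((char) b);
            }
        }
        return line.toString();
    }

    /** Opens a tunnel through the proxy, then attempts a TLS handshake inside it. */
    public static String probe(String proxyHost, int proxyPort, String host, int port,
                               String user, String password) throws IOException {
        try (Socket tunnel = new Socket(proxyHost, proxyPort)) {
            OutputStream out = tunnel.getOutputStream();
            out.write(connectRequest(host, port, user, password)
                    .getBytes(StandardCharsets.ISO_8859_1));
            out.flush();

            InputStream in = tunnel.getInputStream();
            String status = readLine(in);
            while (!readLine(in).isEmpty()) {
                // Skip the remaining response headers up to the blank line.
            }
            if (!status.contains(" 200")) {
                return "proxy refused tunnel: " + status;
            }

            SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
            try (SSLSocket tls = (SSLSocket) factory.createSocket(tunnel, host, port, true)) {
                tls.startHandshake(); // throws SSLHandshakeException on failure
                return status + " | " + tls.getSession().getProtocol()
                        + " " + tls.getSession().getCipherSuite();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java ProxyHandshakeProbe <proxy-login> <proxy-password>");
            return;
        }
        // Proxy host, port and target host are taken from the question.
        System.out.println(probe("zproxy.lum-superproxy.io", 22225,
                "apps.ilsos.gov", 443, args[0], args[1]));
    }
}
```

If the proxy answers the CONNECT with a non-200 status, the probe reports that status line instead of attempting TLS, which helps separate a proxy-level refusal from a failure inside the TLS handshake itself.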
Finally, I read about adding System.setProperty("https.protocols", "TLSv1,TLSv1.1,TLSv1.2"); so I added it to the ProxyClient class. This did not solve the problem either.
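To check what the JVM itself offers, independent of HtmlUnit, a small JDK-only sketch such as the following prints the https.protocols property next to the protocols that the default SSLContext enables and supports. The class and method names are hypothetical. Whether HtmlUnit's HTTP transport consults https.protocols at all is not established here; the ClientHello in the debug output above already shows TLSv1.2 being offered.

```java
import javax.net.ssl.SSLContext;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

public class TlsProtocolCheck {

    /** Returns the JVM's default SSLContext, wrapping the checked exception. */
    private static SSLContext defaultContext() {
        try {
            return SSLContext.getDefault();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("no default SSLContext available", e);
        }
    }

    /** Protocols enabled by default for sockets created from the default context. */
    public static List<String> enabledProtocols() {
        return Arrays.asList(defaultContext().getDefaultSSLParameters().getProtocols());
    }

    /** Protocols the default context could enable if asked to. */
    public static List<String> supportedProtocols() {
        return Arrays.asList(defaultContext().getSupportedSSLParameters().getProtocols());
    }

    public static void main(String[] args) {
        System.out.println("https.protocols = " + System.getProperty("https.protocols"));
        System.out.println("enabled   = " + enabledProtocols());
        System.out.println("supported = " + supportedProtocols());
    }
}
```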
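I am still a novice, so I am not sure how to interpret the debug information above. However, I suspect the proxy is using an older TLS protocol than my machine. Thanks for your help.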