Apache HttpClient does not show the Content-Length and Content-Encoding headers of the response

Posted by kxkpmulp on 2023-02-04 in Apache

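I installed Apache httpcomponents-client-5.0.x, and when inspecting the headers of an HTTP response I was shocked to find that it does not show the Content-Length and Content-Encoding headers. This is the code I used for testing: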

import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.CloseableHttpResponse;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.client5.http.classic.methods.HttpGet;
import org.apache.hc.core5.http.Header;
import java.net.URI;

CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet request = new HttpGet(new URI("https://www.example.com"));
CloseableHttpResponse response = httpclient.execute(request);
Header[] responseHeaders = response.getHeaders();
for(Header header: responseHeaders) {               
    System.out.println(header.getName());
}
// this prints all the headers except 
// status code header
// Content-Length
// Content-Encoding

No matter what I try, I get the same result, for example with this:

Iterator<Header> headersItr = response.headerIterator();
while(headersItr.hasNext()) {
    Header header = headersItr.next();
    System.out.println(header.getName());
}

or with this:

HttpEntity entity = response.getEntity();
System.out.println(entity.getContentEncoding()); // NULL
System.out.println(entity.getContentLength());   // -1

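According to this question, asked 6 years ago, this seems to be an old problem that also affected older versions of Apache HttpClient.
The server does return these headers, as confirmed both by Wireshark and by Apache HttpClient's own log output: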

2020-04-03 07:59:09,106 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << HTTP/1.1 200 OK
2020-04-03 07:59:09,106 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Content-Encoding: gzip
2020-04-03 07:59:09,106 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Accept-Ranges: bytes
2020-04-03 07:59:09,107 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Age: 451956
2020-04-03 07:59:09,107 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Cache-Control: max-age=604800
2020-04-03 07:59:09,107 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Content-Type: text/html; charset=UTF-8
2020-04-03 07:59:09,107 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Date: Fri, 03 Apr 2020 05:59:09 GMT
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Etag: "3147526947+gzip"
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Expires: Fri, 10 Apr 2020 05:59:09 GMT
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Server: ECS (dcb/7EEB)
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Vary: Accept-Encoding
2020-04-03 07:59:09,109 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << X-Cache: HIT
2020-04-03 07:59:09,109 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Content-Length: 648

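By the way, the java.net.http library, known as the JDK HttpClient, works fine and shows all the headers.
Am I doing something wrong, or should I report a bug that has existed for years?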

Answer #1 (j5fpnvbx)

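HttpComponents committer here...
You did not listen closely to what Dave G said. By default, HttpClientBuilder enables transparent decompression, and the reason you do not see some of the headers is the following: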

if (decoderFactory != null) {
  response.setEntity(new DecompressingEntity(response.getEntity(), decoderFactory));
  response.removeHeaders(HttpHeaders.CONTENT_LENGTH);
  response.removeHeaders(HttpHeaders.CONTENT_ENCODING);
  response.removeHeaders(HttpHeaders.CONTENT_MD5);
} ...
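
To make the effect of that snippet concrete, here is a simplified, library-free model of the same idea. It is only a sketch: the `TransparentDecodingModel` class, its `Response` type, and the `decodeTransparently` method are hypothetical names made up for illustration, not HttpClient code. It assumes a gzip body and shows why the three headers disappear once the body has been decoded: they described the encoded bytes, which the caller never sees.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class TransparentDecodingModel {

    /** Hypothetical stand-in for a response: case-insensitive headers plus a body. */
    static final class Response {
        final Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        byte[] body;
    }

    /** Decodes a gzip body in place and drops the headers that described the encoded form. */
    static void decodeTransparently(Response response) throws IOException {
        String encoding = response.headers.get("Content-Encoding");
        if (encoding == null || !encoding.equalsIgnoreCase("gzip")) {
            return; // nothing to decode, so the headers stay untouched
        }
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(response.body))) {
            response.body = in.readAllBytes();
        }
        // These values no longer match the body handed to the caller.
        response.headers.remove("Content-Length");
        response.headers.remove("Content-Encoding");
        response.headers.remove("Content-MD5");
    }

    /** Helper for the demo: gzip-compresses raw bytes. */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Response response = new Response();
        response.body = gzip("<html>hello</html>".getBytes(StandardCharsets.UTF_8));
        response.headers.put("Content-Encoding", "gzip");
        response.headers.put("Content-Length", String.valueOf(response.body.length));
        response.headers.put("Content-Type", "text/html; charset=UTF-8");

        decodeTransparently(response);

        System.out.println(response.headers.keySet()); // prints [Content-Type]
        System.out.println(new String(response.body, StandardCharsets.UTF_8));
    }
}
```

A response without `Content-Encoding: gzip` passes through this model unchanged, which matches the observation that the headers are only missing when compression is negotiated.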

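As for the JDK HttpClient, it does not perform any transparent decompression, so you see the length of the compressed stream and have to decompress it yourself.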
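
A rough sketch of what "decompress it yourself" could look like with the JDK client follows. The class name `JdkManualGunzip` and the helpers `decodeBody` and `fetchDecoded` are hypothetical, and the sketch assumes the server answers with either gzip or no content encoding. Only JDK classes are used (Java 11 or later).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class JdkManualGunzip {

    /** Decodes a body according to its Content-Encoding value; only gzip and identity are handled. */
    static byte[] decodeBody(String contentEncoding, byte[] body) throws IOException {
        if (contentEncoding == null || contentEncoding.equalsIgnoreCase("identity")) {
            return body;
        }
        if (contentEncoding.equalsIgnoreCase("gzip")) {
            try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
                return in.readAllBytes();
            }
        }
        throw new IOException("Unsupported Content-Encoding: " + contentEncoding);
    }

    /** Fetches the URI, prints the response headers as received, and returns the decoded body. */
    static byte[] fetchDecoded(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Accept-Encoding", "gzip") // the JDK client does not add this on its own
                .GET()
                .build();
        HttpResponse<byte[]> response = client.send(request, HttpResponse.BodyHandlers.ofByteArray());
        // The JDK client leaves the headers as the server sent them.
        response.headers().map().forEach((name, values) -> System.out.println(name + ": " + values));
        String encoding = response.headers().firstValue("Content-Encoding").orElse(null);
        return decodeBody(encoding, response.body());
    }

    /** Helper for the offline demo: gzip-compresses raw bytes. */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        if (args.length > 0) {
            // Needs network access, e.g. pass https://www.example.com as the first argument.
            byte[] fetched = fetchDecoded(URI.create(args[0]));
            System.out.println("decoded length: " + fetched.length);
            return;
        }
        // Offline demo: round-trip a small body through gzip.
        byte[] raw = "<html>hello</html>".getBytes(StandardCharsets.UTF_8);
        byte[] decoded = decodeBody("gzip", gzip(raw));
        System.out.println(new String(decoded, StandardCharsets.UTF_8));
    }
}
```

With this approach, the Content-Length header still refers to the compressed bytes on the wire, while the array returned by `fetchDecoded` has the decoded length.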
curl committer here...
I have also raised an issue.

Update (03 Feb. '23): the code for disabling automatic decompression is:
CloseableHttpClient httpclient = HttpClients.createMinimal();
// OR
CloseableHttpClient httpclient = HttpClients.custom().disableContentCompression().build();
Answer #2 (kyks70gy)

In this case the Content-Length is probably being dropped. If I request the identity encoding instead:

HttpGet request = new HttpGet(new URI("https://www.example.com"));
request.setHeader("Accept-Encoding", "identity");
CloseableHttpResponse response = httpclient.execute(request);

then I can read the following:

HttpEntity entity = response.getEntity();
System.out.println(entity.getContentLength());
System.out.println(entity.getContentEncoding());

Output:

...
2020-04-03 03:04:17.760 DEBUG 34196 --- [           main] org.apache.hc.client5.http.headers       : http-outgoing-0 << Content-Length: 1256
...
1256
null

I would like to draw your attention to this header, which is sent with the default request:

http-outgoing-0 >> Accept-Encoding: gzip, x-gzip, deflate

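This tells the server that the client can accept gzip, x-gzip, and deflate compressed content in the response. The response then declares that it is 'gzip' encoded: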

http-outgoing-0 << Content-Encoding: gzip

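I believe HttpClient handles this transparently internally and makes the decoded content available.
As mentioned in the other post you referenced, one of the answers points out that EntityUtils.toByteArray(httpResponse.getEntity()).length can be used to obtain the content length.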
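
EntityUtils.toByteArray buffers the whole body in memory. If only the decoded length is needed, the bytes can also be counted while streaming. The sketch below uses plain JDK streams so it runs without HttpClient on the classpath; `DecodedLengthCounter` and `countBytes` are hypothetical names. With HttpClient, the stream would presumably come from `entity.getContent()`, which is an assumption about how you would wire it in, not something shown in the answers above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class DecodedLengthCounter {

    /** Reads the stream to the end and returns how many bytes it produced. */
    static long countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Build a gzip-compressed body of 1256 decoded bytes for the demo.
        byte[] raw = new byte[1256];
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(compressed)) {
            out.write(raw);
        }

        // The length on the wire and the decoded length differ, which is why a
        // decompressing stream cannot report its length up front.
        System.out.println("compressed length: " + compressed.size());
        try (InputStream decoded =
                 new GZIPInputStream(new ByteArrayInputStream(compressed.toByteArray()))) {
            System.out.println("decoded length: " + countBytes(decoded)); // prints decoded length: 1256
        }
    }
}
```

The stream can only be consumed once, so counting this way means the body itself is discarded unless it is also written somewhere while reading.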
