I've run into a weird certificate issue that I've been debugging for days, across multiple attempts at a fix.
My application simply uploads a directory to an S3 bucket and then reads that same directory back from the bucket into a Spark dataframe.
My only dependencies are Apache Spark, hadoop-aws, and aws-java-sdk-bundle: Spark 3.1.1, Scala 2.12, Hadoop 3.2.0, AWS Java SDK 1.11.901.
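For context, the upload half of the flow can be sketched in Python against a boto3-style client (the real app is JVM-side; `upload_directory` and all names below are hypothetical illustration, not the actual code):

```python
import os

def upload_directory(s3_client, local_dir, bucket, prefix=""):
    """Upload every file under local_dir, mapping relative paths to S3 keys.

    s3_client is assumed to expose boto3's upload_file(filename, bucket, key);
    this helper and all names in it are illustrative, not the app's real code.
    """
    uploaded = []
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            path = os.path.join(root, name)
            rel = os.path.relpath(path, local_dir)
            # Normalize OS path separators into S3's forward slashes.
            key = (prefix + rel).replace(os.sep, "/")
            s3_client.upload_file(path, bucket, key)
            uploaded.append(key)
    return uploaded
```

Reading the prefix back is then a single `spark.read` against `s3a://bucket-name/directory`, which is the step that fails with the TLS error below.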
- verified AWS secret key and access key are 100% correct
- Running application locally without docker works without any issues
When I run the application in Docker, the upload succeeds, but when it then tries to authenticate and read the directory back, I hit this stack trace (the later exceptions are probably just propagation of the first one):
Exception in thread "main" org.apache.hadoop.fs.s3a.AWSClientIOException: getFileStatus on s3a://bucket-name/directory: com.amazonaws.SdkClientException: Unable to execute HTTP request: Certificate for <bucket-name.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]: Unable to execute HTTP request: Certificate for <bucket-name.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: Certificate for <bucket-name.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
Caused by: javax.net.ssl.SSLPeerUnverifiedException: Certificate for <bucket-name.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
	at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.verifyHostname(SSLConnectionSocketFactory.java:507)
	at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:437)
	at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:384)
	at com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
	at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
	at sun.reflect.GeneratedMethodAccessor137.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
	at com.amazonaws.http.conn.$Proxy60.connect(Unknown Source)
	at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
	at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
	at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
	at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
	at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
	at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
	at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1333)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
I find it odd because my colleague uses the same credentials as me, yet he does not run into this issue at all.
Any ideas on why this might be happening?
- Could it be something with the bucket policy?
1 Answer
How to read from AWS S3 with Docker, Spark, and Python
Docker image:
spark_version=3.3.0
hadoop_version=3
python_version=3.10.6
PySpark code:
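The answer's original code block did not survive extraction. A minimal sketch consistent with its one surviving hint (enabling `spark.hadoop.fs.s3a.path.style.access`) might look like the following; the bucket name, file format, and credentials are placeholders, not the answerer's actual code:

```python
# Hypothetical reconstruction of the lost PySpark snippet, built around the
# setting the answer credits: spark.hadoop.fs.s3a.path.style.access.
# Path-style access sends requests to s3.amazonaws.com/bucket-name instead of
# bucket-name.s3.amazonaws.com, sidestepping TLS hostname (SAN) mismatches on
# the virtual-hosted name -- e.g. for bucket names containing dots, or DNS/proxy
# quirks inside a container.

S3A_OPTIONS = {
    "spark.hadoop.fs.s3a.path.style.access": "true",
    "spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
    "spark.hadoop.fs.s3a.endpoint": "s3.amazonaws.com",
    "spark.hadoop.fs.s3a.access.key": "YOUR_ACCESS_KEY",  # placeholder
    "spark.hadoop.fs.s3a.secret.key": "YOUR_SECRET_KEY",  # placeholder
}

def build_spark(options=S3A_OPTIONS):
    # Imported inside the function so the option dict above stays usable
    # without a pyspark installation.
    from pyspark.sql import SparkSession
    builder = SparkSession.builder.appName("s3a-read")
    for key, value in options.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()

if __name__ == "__main__":
    spark = build_spark()
    # Placeholder path and format, mirroring the question's layout.
    df = spark.read.csv("s3a://bucket-name/directory", header=True)
    df.show()
```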
Thanks to @FelipeGonzalez for the spark.hadoop.fs.s3a.path.style.access hint.