I tried upgrading from Spark 3.0.0-preview2 to Spark 3.0.0, and now running `spark.read.load("s3a://...")` fails with the error below. The same code worked on the previous version.
20/07/21 08:57:06 INFO spark.SecurityManager: Changing view acls to: ec2-user
20/07/21 08:57:06 INFO spark.SecurityManager: Changing modify acls to: ec2-user
20/07/21 08:57:06 INFO spark.SecurityManager: Changing view acls groups to:
20/07/21 08:57:06 INFO spark.SecurityManager: Changing modify acls groups to:
20/07/21 08:57:06 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ec2-user); groups with view permissions: Set(); users with modify permissions: Set(ec2-user); groups with modify permissions: Set()
20/07/21 08:57:06 INFO client.TransportClientFactory: Successfully created connection to Master/10.0.179.117:39527 after 68 ms (0 ms spent in bootstraps)
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:283)
at org.apache.spark.executor.YarnCoarseGrainedExecutorBackend$.main(YarnCoarseGrainedExecutorBackend.scala:81)
at org.apache.spark.executor.YarnCoarseGrainedExecutorBackend.main(YarnCoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:302)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:103)
at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:87)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.$anonfun$run$1(CoarseGrainedExecutorBackend.scala:311)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:62)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
... 4 more
Caused by: org.apache.spark.SparkException: Unsupported message RpcMessage(10.0.179.117:49308,RetrieveSparkAppConfig(0),org.apache.spark.rpc.netty.RemoteNettyRpcCallContext@516aceb5) from 10.0.179.117:49308
at org.apache.spark.rpc.netty.Inbox.$anonfun$process$2(Inbox.scala:104)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint$$anonfun$receiveAndReply$1.applyOrElse(CoarseGrainedSchedulerBackend.scala:205)
at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:103)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:203)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I am currently using netty 4.1.47 and aws-java-sdk 1.7.4. Sorry I can't provide more information, because I don't know where this error is coming from.
EDIT:
Fail to execute line 2: data = spark.read.load("s3a://[some parquet file]")
Traceback (most recent call last):
File "/tmp/1595350396558-0/zeppelin_python.py", line 153, in <module>
exec(code, _zcUserQueryNameSpace)
File "<stdin>", line 2, in <module>
File "/home/ec2-user/spark/python/pyspark/sql/readwriter.py", line 178, in load
return self._df(self._jreader.load(path))
File "/home/ec2-user/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/home/ec2-user/spark/python/pyspark/sql/utils.py", line 137, in deco
raise_from(converted)
File "<string>", line 3, in raise_from
pyspark.sql.utils.AnalysisException: <unprintable AnalysisException object>
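The last traceback line swallows the actual error text. One way to recover it is to catch the exception and print its `desc` and `stackTrace` attributes, which PySpark 3.0's `AnalysisException` (via `CapturedException`) carries. The sketch below uses a stand-in class so it runs without a cluster; in the real session you would catch `pyspark.sql.utils.AnalysisException` around the `spark.read.load(...)` call instead.

```python
# Hedged sketch: surface the hidden message of an AnalysisException.
# The class below is a stand-in mimicking pyspark.sql.utils.AnalysisException
# (3.0.x), which exposes `desc` and `stackTrace`; it is NOT the real class.

class AnalysisException(Exception):
    """Stand-in for pyspark.sql.utils.AnalysisException."""
    def __init__(self, desc, stackTrace=""):
        super().__init__(desc)
        self.desc = desc
        self.stackTrace = stackTrace

def load_or_explain(load):
    """Run a zero-argument load callable; on failure, print the
    details that Zeppelin rendered as '<unprintable ... object>'."""
    try:
        return load()
    except AnalysisException as e:
        print(e.desc)        # the human-readable message
        print(e.stackTrace)  # the Java-side stack trace, if any
        return None

# With a live SparkSession this would be (path elided as in the question):
#   load_or_explain(lambda: spark.read.load("s3a://..."))
```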