从pyspark访问文件时无法连接到hadoop群集

yyyllmsg  于 2021-06-02  发布在  Hadoop
关注(0)|答案(0)|浏览(151)

我正在运行以下代码:

conf = SparkConf().setAppName("basicRegressionUbuntu").setMaster("spark://MyCUSTOMIP:7077")
sc = SparkContext(conf=conf)

rdd = sc.textFile("hdfs://MYHADOOPMASTERNODE:8020/sampleData/Sacramentorealestatetransactions.csv")

它抛出以下内容:

16/03/25 10:01:11 WARN security.UserGroupInformation: PriviledgedActionException as:hduser (auth:SIMPLE) cause:java.io.IOException: Failed to connect to /10.0.2.15:42939
Exception in thread "main" java.io.IOException: Failed to connect to /10.0.2.15:42939
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:183)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection timed out: /10.0.2.15:42939
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more

我知道文件路径存在,因为当我ssh到myhadoopmasternode并执行 hdfs dfs -ls /sampleData/ 它给我看鱼片。
任何帮助都将不胜感激!

暂无答案!

目前还没有任何答案,快来回答吧!

相关问题