Spark cluster startup problems

pu3pd22g  posted on 2021-05-29 in Hadoop

I'm new to Spark and am trying to set up a Spark cluster. I did the following to set it up and to check the cluster's status, but I'm not sure whether it is actually up.
I tried checking master-ip:8081 in a browser (and also 8080, 4040, 4041; 8080 is the master web UI, 8081 the worker UI, 4040 a running application's UI), but I don't see anything there. First, I set up and started the Hadoop cluster.

JPS gives:

 2436 SecondaryNameNode
 2708 NodeManager
 2151 NameNode
 5495 Master
 2252 DataNode
 2606 ResourceManager
 5710 Jps

Question: is it necessary to start Hadoop first?
In /usr/local/spark/conf/slaves on the master:

localhost
 slave-node-1
 slave-node-2

Now, starting the Spark master with

$SPARK_HOME/sbin/start-master.sh

and checking it with

ps -ef|grep spark
  hduser    5495     1  0 18:12 pts/0    00:00:04 /usr/local/java/bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/*:/usr/local/hadoop/etc/hadoop/ -Xmx1g org.apache.spark.deploy.master.Master --host master-hostname --port 7077 --webui-port 8080

On slave node 1:

$SPARK_HOME/sbin/start-slave.sh spark://205.147.102.19:7077

Tested with:

ps -ef|grep spark
 hduser    1847     1 20 18:24 pts/0    00:00:04 /usr/local/java/bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://master-ip:7077

The same on slave node 2:

$SPARK_HOME/sbin/start-slave.sh spark://master-ip:7077
  ps -ef|grep spark
  hduser    1948     1  3 18:18 pts/0    00:00:03 /usr/local/java/bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://master-ip:7077
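A side note: since conf/slaves above already lists both workers, Spark's bundled scripts can start the whole standalone cluster from the master instead of logging in to each node. A minimal sketch, assuming passwordless SSH from the master to the slave nodes:

# on the master: reads conf/slaves and launches a Worker on every listed host
$SPARK_HOME/sbin/start-slaves.sh
# or start the master and all workers in one go
$SPARK_HOME/sbin/start-all.sh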

I can't see anything on the Spark web console... so I figured the problem might be the firewall. This is my iptables:

iptables -L -nv
  Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
  pkts bytes target     prot opt in     out     source               destination         
  6136  587K fail2ban-ssh  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            multiport dports 22
  151K   25M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
  6   280 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0           
  579 34740 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0           
  34860 2856K ACCEPT     all  --  eth1   *       0.0.0.0/0            0.0.0.0/0           
  145  7608 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:22
  56156 5994K REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited
  0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8080
  0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8081

  Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
  pkts bytes target     prot opt in     out     source               destination         
  0     0 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited

 Chain OUTPUT (policy ACCEPT 3531 packets, 464K bytes)
 pkts bytes target     prot opt in     out     source               destination         

 Chain fail2ban-ssh (1 references)
 pkts bytes target     prot opt in     out     source               destination         
 2   120 REJECT     all  --  *      *       218.87.109.153       0.0.0.0/0            reject-with icmp-port-unreachable
 5794  554K RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0

I'm doing whatever I can to figure out whether the Spark cluster is set up correctly and how to check it. If the cluster is set up, why can't I see it in the web console? What is going wrong? Any hints would be helpful...
Edit: adding the logs after the spark-shell --master local command (on the master):

17/01/11 18:12:46 INFO util.Utils: Successfully started service 'sparkMaster' on port 7077.
 17/01/11 18:12:47 INFO master.Master: Starting Spark master at spark://master:7077
 17/01/11 18:12:47 INFO master.Master: Running Spark version 2.1.0
 17/01/11 18:12:47 INFO util.log: Logging initialized @3326ms
 17/01/11 18:12:47 INFO server.Server: jetty-9.2.z-SNAPSHOT
 17/01/11 18:12:47 INFO handler.ContextHandler: Started   o.s.j.s.ServletContextHandler@20f0b5ff{/app,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@734e74b2{/app/json,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1bc45d76{/,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6a274a23{/json,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4f5d45d5{/static,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4fb65368{/app/kill,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@76208805{/driver/kill,null,AVAILABLE}
 17/01/11 18:12:47 INFO server.ServerConnector: Started ServerConnector@258dbadd{HTTP/1.1}{0.0.0.0:8080}
 17/01/11 18:12:47 INFO server.Server: Started @3580ms
 17/01/11 18:12:47 INFO util.Utils: Successfully started service 'MasterUI' on port 8080.
 17/01/11 18:12:47 INFO ui.MasterWebUI: Bound MasterWebUI to 0.0.0.0, and started at http://master:8080
 17/01/11 18:12:47 INFO server.Server: jetty-9.2.z-SNAPSHOT
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1cfbb7e9{/,null,AVAILABLE}
 17/01/11 18:12:47 INFO server.ServerConnector: Started ServerConnector@2f7af4e{HTTP/1.1}{master:6066}
 17/01/11 18:12:47 INFO server.Server: Started @3628ms
 17/01/11 18:12:47 INFO util.Utils: Successfully started service on port 6066.
 17/01/11 18:12:47 INFO rest.StandaloneRestServer: Started REST server for submitting applications on port 6066
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@799d5f4f{/metrics/master/json,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@647c46e3{/metrics/applications/json,null,AVAILABLE}
 17/01/11 18:12:47 INFO master.Master: I have been elected leader! New state: ALIVE

And on the slave nodes:

17/01/11 18:22:46 INFO Worker: Connecting to master master:7077...
 17/01/11 18:22:46 WARN Worker: Failed to connect to master master:7077

followed by lots of Java errors...

17/01/11 18:31:18 ERROR Worker: All masters are unresponsive! Giving up.
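The failure above means the worker either cannot resolve the name master or cannot reach port 7077 on it, and a quick reachability check from a slave node separates the two. A minimal sketch, assuming nc (netcat) is installed (telnet master 7077 works as a stand-in):

# from slave-node-1: does the name resolve, and does anything answer on 7077?
getent hosts master
nc -zv master 7077
# repeat against the raw IP used in start-slave.sh
nc -zv 205.147.102.19 7077

If the name resolves but the port is unreachable, that points at iptables, which matches the accepted answer below.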

wn9m85ua1#

The problem was iptables; most everything else was fine. So I just followed the instructions at https://wiki.debian.org/iptables to fix iptables, and that worked for me. The only thing you should know is which ports will be used for Spark/Hadoop etc. I opened 8080, 54310, 50070, 7077 (defaults used by many Hadoop and Spark installs)...
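One detail worth spelling out from the iptables listing in the question: rules match top to bottom, and the ACCEPT entries for 8080/8081 there sit below the catch-all REJECT, so they can never fire. A minimal sketch of opening the ports above the REJECT instead of appending them (the rule position comes from that listing; adjust to your own chain):

# insert ACCEPTs at position 7, above the catch-all REJECT in the INPUT chain
iptables -I INPUT 7 -p tcp --dport 8080 -j ACCEPT   # Spark master web UI
iptables -I INPUT 7 -p tcp --dport 8081 -j ACCEPT   # Spark worker web UI
iptables -I INPUT 7 -p tcp --dport 7077 -j ACCEPT   # Spark master RPC
# then persist the rules, e.g. on Debian: iptables-save > /etc/iptables/rules.v4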


vlju58qv2#

The Spark web UI starts when a SparkContext is created.
Try running spark-shell --master spark://yourmaster:7077 and then open the Spark UI. You can also use spark-submit to submit an application, which will also create a SparkContext.
An example of spark-submit, from the Spark documentation:

./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

To answer your first question: you have to start the Hadoop components if you want to use HDFS or YARN. If not, they don't need to be started.
You can also edit /etc/hosts to map the master hostname to the right IP (or 127.0.0.1), or set the MASTER_IP variable in the Spark configuration to the correct hostname.
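A sketch of what that hostname fix can look like, assuming the addresses from the question; note that in Spark 2.x the spark-env.sh variable is SPARK_MASTER_HOST (SPARK_MASTER_IP in older releases):

# /etc/hosts on every node (same mapping everywhere; addresses are examples)
205.147.102.19   master
# ...plus one line per slave node

# $SPARK_HOME/conf/spark-env.sh on the master
export SPARK_MASTER_HOST=master

Workers can then register with $SPARK_HOME/sbin/start-slave.sh spark://master:7077.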
