I ran the following Spark code on the master:
import pyspark
from pyspark import SparkContext
sc = SparkContext()
nums = sc.parallelize([1, 2, 3, 4])
nums.collect()
My cluster configuration: 3 nodes (1 master + 2 slaves) in standalone/client mode
Master config 600mb RAM, 1CPU
Slave1 config 600mb RAM, 1CPU
Slave2 config 16GB RAM, 4CPU
When I submit the job with the following command, it keeps running for a long time and never finishes:
spark-submit --master spark://<MASTER_IP>:7077 --num-executors=6 --conf spark.driver.memory=500M --conf spark.executor.memory=6G --deploy-mode client test.py
Logs on screen:
20/05/11 19:43:09 INFO BlockManagerMaster: Removal of executor 105 requested
20/05/11 19:43:09 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200511193954-0001/106 on worker-20200511192038--MASTER_IP:44249 (MASTER_IP:44249) with 4 core(s)
20/05/11 19:43:09 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 105
20/05/11 19:43:09 INFO BlockManagerMasterEndpoint: Trying to remove executor 105 from BlockManagerMaster.
20/05/11 19:43:10 INFO StandaloneSchedulerBackend: Granted executor ID app-20200511193954-0001/106 on hostPort MASTER_IP:44249 with 4 core(s), 6.0 GB RAM
^C20/05/11 19:43:58 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
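Solutions tried:
I tried adding a new node, Slave3, to the cluster, because what I found when searching for the error above pointed to insufficient resources, but the error persists.
Is it because my Master node has too little memory? Any suggestions?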
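To make the resource question concrete, here is a rough sketch that checks which workers could host one executor of a given size. It is a simplified model, not Spark's actual scheduler: the `Worker` class and `workers_that_fit` function are hypothetical names, and the memory figures are the machine totals from the post, which may be higher than what each worker actually offers to Spark.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    memory_mb: int  # assumption: total machine RAM as listed in the post
    cores: int


def workers_that_fit(workers, executor_memory_mb, executor_cores=1):
    """Return the names of workers with enough memory and cores for one executor."""
    return [
        w.name
        for w in workers
        if w.memory_mb >= executor_memory_mb and w.cores >= executor_cores
    ]


workers = [Worker("Slave1", 600, 1), Worker("Slave2", 16 * 1024, 4)]

# A 6 GB executor (as in the spark-submit command) versus a 512 MB one.
fits_6g = workers_that_fit(workers, 6 * 1024)
fits_512m = workers_that_fit(workers, 512)

print("6G executor fits on:", fits_6g)
print("512M executor fits on:", fits_512m)
```

Under this model only Slave2 can hold a 6 GB executor, so the 600 MB nodes contribute nothing to this job at the requested executor size.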
1 Answer
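Try running with minimal resource requirements first. Also change the deploy mode to cluster so that the worker nodes are used. Read more at https://spark.apache.org/docs/latest/submitting-applications.html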
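A minimal sketch of what "minimal requirements" could look like, written as a small Python helper that assembles the spark-submit arguments. The helper name `build_submit_command` and the values `512m` and one core are illustrative assumptions, not recommendations from the answer. The sketch keeps client deploy mode because, as far as the Spark documentation states, standalone clusters do not support cluster deploy mode for Python applications. It also leaves out `--num-executors`, which spark-submit documents as a YARN/Kubernetes option, and uses `--total-executor-cores` instead.

```python
import shlex


def build_submit_command(master_url, script,
                         executor_memory="512m",   # assumption: small enough for a 600 MB node
                         driver_memory="500m",     # same driver size as in the question
                         total_executor_cores=1):  # assumption: start with a single core
    """Assemble a spark-submit argument list with deliberately small resource requests."""
    return [
        "spark-submit",
        "--master", master_url,
        "--deploy-mode", "client",
        "--driver-memory", driver_memory,
        "--executor-memory", executor_memory,
        "--total-executor-cores", str(total_executor_cores),
        script,
    ]


cmd = build_submit_command("spark://<MASTER_IP>:7077", "test.py")

# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(cmd))
```

If the job is accepted with these settings, the memory and core values can be raised step by step until the failing configuration is found.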