I deployed a Hadoop (YARN + Spark) cluster on Google Compute Engine, with one master node and two worker nodes. When I run the following shell command:
spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 1 --driver-memory 1g --executor-memory 1g --executor-cores 1 /home/hadoop/spark-install/lib/spark-examples-1.1.0-hadoop2.4.0.jar 10
the job keeps running, and every second I get a message like this:
15/02/06 22:47:12 INFO yarn.Client: Application report from ResourceManager:
application identifier: application_1423247324488_0008
appId: 8
clientToAMToken: null
appDiagnostics:
appMasterHost: hadoop-w-zrem.c.myapp.internal
appQueue: default
appMasterRpcPort: 0
appStartTime: 1423261517468
yarnAppState: RUNNING
distributedFinalState: UNDEFINED
appTrackingUrl: http://hadoop-m-xxxx:8088/proxy/application_1423247324488_0008/
appUser: achitre
2 Answers

ltskdhd1 #1
Instead of
--master yarn-cluster
use
--master yarn-client
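For reference, a sketch of the full submit command from the question rewritten for yarn-client mode (all paths and values are the ones from the question; adjust them for your installation). In yarn-client mode the driver runs on the submitting machine, so its output appears in your console instead of only in the YARN application report:

```shell
# Same SparkPi job as in the question, but with the driver running
# locally (yarn-client) rather than inside the YARN cluster.
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-client \
  --num-executors 1 \
  --driver-memory 1g \
  --executor-memory 1g \
  --executor-cores 1 \
  /home/hadoop/spark-install/lib/spark-examples-1.1.0-hadoop2.4.0.jar 10
```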
djp7away #2
It worked after I added the following line to my script:
export SPARK_JAVA_OPTS="-Dspark.yarn.executor.memoryOverhead=1024 -Dspark.local.dir=/tmp -Dspark.executor.memory=1024"
I think we should not use 'm', 'g', etc. when specifying memory here; otherwise we get a NumberFormatException.
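Putting this answer together, a minimal submit script might look like the sketch below. The property values are the ones from this answer; note that the memory values passed through system properties are plain numbers (interpreted as megabytes by Spark 1.x), which is what avoids the NumberFormatException mentioned above. SPARK_JAVA_OPTS was deprecated in later Spark releases in favour of --conf / spark-defaults.conf:

```shell
#!/bin/sh
# Sketch of a submit script using the settings from this answer
# (Spark 1.1-era configuration; paths are from the original question).
export SPARK_JAVA_OPTS="-Dspark.yarn.executor.memoryoverhead=1024 -Dspark.local.dir=/tmp -Dspark.executor.memory=1024"

spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  /home/hadoop/spark-install/lib/spark-examples-1.1.0-hadoop2.4.0.jar 10
```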