How do I configure a YARN cluster with Spark?

Posted by r1zk6ea1 on 2021-06-01 in Hadoop


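I have two machines, each with 32 GB of RAM and 8 cores. How should I configure YARN with Spark, and which properties do I need to use to tune resources for a given dataset? My dataset is 8 GB, so can anyone suggest a YARN configuration for running Spark jobs in parallel?
Here is the YARN configuration. I am using Hadoop 2.7.3, Spark 2.2.0, and Ubuntu 16: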

`yarn.scheduler.minimum-allocation-mb = 2048
yarn.scheduler.maximum-allocation-mb = 5120
yarn.nodemanager.resource.memory-mb = 30720
yarn.scheduler.minimum-allocation-vcores = 1
yarn.scheduler.maximum-allocation-vcores = 6
yarn.nodemanager.resource.cpu-vcores = 6`

Here is the Spark configuration:

spark.master    master:7077
spark.yarn.am.memory    4g
spark.yarn.am.cores    4
spark.yarn.am.memoryOverhead    412m
spark.executor.instances    3
spark.executor.cores    4
spark.executor.memory    4g
spark.yarn.executor.memoryOverhead    412m
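
For illustration, the Spark settings listed above could be passed on the command line as `--conf key=value` pairs. The sketch below is only one way to do it: `build_submit_args` and `my_job.py` are hypothetical names, not part of the original post. It also assumes the job is submitted with `--master yarn` in client mode, since the `spark.yarn.am.*` properties apply to the YARN Application Master in client mode only (in cluster mode the driver settings are used instead).

```python
# Spark properties copied from the question (the values are the asker's own).
spark_conf = {
    "spark.yarn.am.memory": "4g",
    "spark.yarn.am.cores": "4",
    "spark.yarn.am.memoryOverhead": "412m",
    "spark.executor.instances": "3",
    "spark.executor.cores": "4",
    "spark.executor.memory": "4g",
    "spark.yarn.executor.memoryOverhead": "412m",
}


def build_submit_args(conf, app, master="yarn", deploy_mode="client"):
    """Hypothetical helper: build a spark-submit argument list from a dict."""
    args = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    for key, value in conf.items():
        # Each property becomes one "--conf key=value" pair.
        args += ["--conf", f"{key}={value}"]
    args.append(app)  # application file comes last
    return args


if __name__ == "__main__":
    # "my_job.py" is a placeholder application name.
    print(" ".join(build_submit_args(spark_conf, "my_job.py")))
```

The same properties could equally live in `spark-defaults.conf`; the command-line form just makes it easy to vary them per application.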

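But my question is this: each machine has 32 GB of RAM and 8 cores. Is this configuration correct, and how many applications can I run? Right now only two applications run in parallel.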
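
A rough way to reason about the question is to estimate how many containers fit into each node by memory alone. The sketch below uses the numbers from the post and rests on several assumptions: both machines run a NodeManager, each application needs one Application Master container plus three executors, and the scheduler rounds every request up to a multiple of `yarn.scheduler.minimum-allocation-mb` and caps it at `yarn.scheduler.maximum-allocation-mb`. The helper name `container_mb` is hypothetical. Vcore limits (6 per node against 4 per container) and scheduler limits on Application Master resources are not modelled and may lower the real number, so treat the result as an upper bound rather than a prediction.

```python
import math

# Values taken from the YARN and Spark settings in the question.
MIN_ALLOC_MB = 2048
MAX_ALLOC_MB = 5120
NODE_MB = 30720
NODES = 2                     # assumption: both machines run a NodeManager
EXECUTORS_PER_APP = 3
EXECUTOR_REQUEST_MB = 4096 + 412   # executor memory + memoryOverhead
AM_REQUEST_MB = 4096 + 412         # AM memory + memoryOverhead (client mode)


def container_mb(request_mb, min_alloc_mb, max_alloc_mb):
    """Assumed normalization: round up to a multiple of the minimum
    allocation, then cap at the maximum allocation."""
    rounded = math.ceil(max(request_mb, min_alloc_mb) / min_alloc_mb) * min_alloc_mb
    return min(rounded, max_alloc_mb)


executor_mb = container_mb(EXECUTOR_REQUEST_MB, MIN_ALLOC_MB, MAX_ALLOC_MB)
am_mb = container_mb(AM_REQUEST_MB, MIN_ALLOC_MB, MAX_ALLOC_MB)

# Both container types end up the same size here, so packing is simple.
containers_per_node = NODE_MB // executor_mb
containers_per_app = EXECUTORS_PER_APP + 1   # executors plus one AM
apps_by_memory = (containers_per_node * NODES) // containers_per_app

if __name__ == "__main__":
    print("container size (MB):", executor_mb)          # 5120
    print("containers per node:", containers_per_node)  # 6
    print("apps by memory only:", apps_by_memory)       # 3
```

Under these assumptions a 4508 MB request occupies a 5120 MB container, so memory alone would allow about three such applications at once; seeing fewer than that suggests another limit, such as vcores or the scheduler's Application Master quota, is being hit first.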

hadoop yarn apache-spark hadoop2.7.3

Source: https://stackoverflow.com/questions/52361121/how-to-configure-the-yarn-cluster-with-spark
