cgroups: Failed to initialize container executor in a non-secure cluster

Posted by pu82cl6c on 2021-06-02 in Hadoop

I am trying to use cgroups with YARN 2.6.0 in non-secure mode. With DefaultContainerExecutor everything works fine, but as soon as I switch to LinuxContainerExecutor I get an error.
Now, when I run $ yarn nodemanager, it fails with:

ExitCodeException exitCode=24: File /home/hduser2/hadoop/hadoop-2.6.0/etc/hadoop must be owned by root, but is owned by 1001

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:181)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:209)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:462)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
15/08/08 23:07:39 INFO nodemanager.ContainerExecutor: 
15/08/08 23:07:39 INFO service.AbstractService: Service NodeManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:211)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:462)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:187)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:209)
    ... 3 more
Caused by: ExitCodeException exitCode=24: File /home/hduser2/hadoop/hadoop-2.6.0/etc/hadoop must be owned by root, but is owned by 1001

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:181)
    ... 4 more
15/08/08 23:07:39 WARN service.AbstractService: When stopping the service NodeManager : java.lang.NullPointerException
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.stopRecoveryStore(NodeManager.java:161)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop(NodeManager.java:273)
    at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
    at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
    at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:171)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:462)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
15/08/08 23:07:39 FATAL nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:211)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:462)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:187)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:209)
    ... 3 more
Caused by: ExitCodeException exitCode=24: File /home/hduser2/hadoop/hadoop-2.6.0/etc/hadoop must be owned by root, but is owned by 1001

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:181)
    ... 4 more
15/08/08 23:07:39 INFO nodemanager.NodeManager: SHUTDOWN_MSG:

The site-specific YARN configuration properties include:

<property>
        <name>yarn.nodemanager.container-executor.class</name>
        <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value> 
</property>
<property>
        <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
        <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value> 
</property>
<property>
        <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
        <value>/hadoop-yarn</value> 
</property>
<property>
        <name>yarn.nodemanager.linux-container-executor.cgroups.mount</name>
        <value>true</value> 
</property>
<property>
        <name>yarn.nodemanager.linux-container-executor.cgroups.mount-path</name>
        <value>/cgroup</value> 
</property>
<property>
        <name>yarn.nodemanager.linux-container-executor.group</name>
        <value>hadoop</value> 
</property>
<property>
        <name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name>
        <value>95</value>
</property>

<property>
        <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
        <value>true</value>
</property>

The container-executor.cfg is:

yarn.nodemanager.linux-container-executor.group=hadoop
min.user.id=1000

It would be great if someone could help me figure out what is wrong with my setup.

798qvoo8 #1

Adding the following property on all NodeManagers (NMs) should solve your problem:

yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users = false

bsxbgnwa #2

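Your problem is explained by this line in the stack trace:

Caused by: ExitCodeException exitCode=24: File /home/hduser2/hadoop/hadoop-2.6.0/etc/hadoop must be owned by root, but is owned by 1001

The entire path leading to container-executor.cfg must be owned by root and writable only by root.
Most likely this can be fixed with chown root:root /home/hduser2/hadoop/hadoop-2.6.0/etc/hadoop, although you may need to keep changing ownership on the parent directories further up the path.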
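
Before running chown, it can help to see who owns each component of the path. Below is a minimal, read-only sketch that assumes GNU coreutils (stat -c, readlink -f). The function names list_path_owners and non_root_components are made up for illustration and are not part of Hadoop.

```shell
#!/bin/sh
# Print "owner mode path" for a path and each of its parent directories
# up to /, so that components not owned by root are easy to spot.
# Read-only: nothing is modified. Assumes GNU stat and readlink.
list_path_owners() (
  # Resolve symlinks and relative segments first.
  p=$(readlink -f -- "$1") || exit 1
  while :; do
    stat -c '%U %a %n' -- "$p" || exit 1
    [ "$p" = "/" ] && break
    p=$(dirname -- "$p")
  done
)

# Print only the components whose owner is not root.
non_root_components() {
  list_path_owners "$1" | awk '$1 != "root"'
}
```

For example, non_root_components /home/hduser2/hadoop/hadoop-2.6.0/etc/hadoop lists every component of that path that is not owned by root. Those are the likely candidates for the next "must be owned by root" error once the first directory has been fixed.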
