EMR Hadoop streaming job fails while looking for container tokens

Posted by k3bvogb1 on 2021-06-02 in Hadoop

My EMR streaming jobs frequently fail with the following error:

2014-10-15 18:36:36,560 ERROR [main] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[main,5,main] threw an Exception.
java.io.IOException: Exception reading /mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1413396780703_0003/container_1413396780703_0003_01_000218/container_tokens
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:177)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:744)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:703)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:605)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:98)
Caused by: java.io.FileNotFoundException: /mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1413396780703_0003/container_1413396780703_0003_01_000218/container_tokens (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:146)
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:172)
    ... 4 more
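
To get a feel for how often this happens, and on which containers, the task logs can be scanned for the "Exception reading .../container_tokens" line. The following is only a rough sketch: the function names and the `sample` list are hypothetical, and the regular expression assumes the path layout shown in the trace above.

```python
import re
from collections import Counter

# Matches the "Exception reading .../container_tokens" line from the trace.
# Assumption: the path ends with <application id>/<container id>/container_tokens.
TOKEN_ERROR = re.compile(
    r"Exception reading \S*/(application_\d+_\d+)"
    r"/(container_\d+_\d+_\d+_\d+)/container_tokens"
)


def find_token_failures(lines):
    """Return one (application_id, container_id) pair per matching log line."""
    failures = []
    for line in lines:
        match = TOKEN_ERROR.search(line)
        if match:
            failures.append((match.group(1), match.group(2)))
    return failures


def failures_per_application(lines):
    """Count missing-token failures per application id."""
    return Counter(app for app, _ in find_token_failures(lines))


# Hypothetical input: two lines copied from the trace in this question.
sample = [
    "java.io.IOException: Exception reading /mnt/var/lib/hadoop/tmp/nm-local-dir/"
    "usercache/hadoop/appcache/application_1413396780703_0003/"
    "container_1413396780703_0003_01_000218/container_tokens",
    "    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:177)",
]

for app, container in find_token_failures(sample):
    print(app, container)
```

The "Caused by" line repeats the same path but does not contain "Exception reading", so each failure is counted once.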

The failure is non-deterministic, but it happens too often on large clusters. This is how I launch the cluster:

elastic-mapreduce --create --alive --instance-group master --instance-type m1.large \
--instance-count 1 \
--instance-group core --instance-type r3.xlarge \
--instance-count 200 --hadoop-version "2.4.0" \
--ami-version "3.2.1" --enable-debugging --json ./emr_config \
--bootstrap-action 's3://path/to/bootstrap.sh' --bootstrap-name Bootstrap

This is the step configuration (emr_config):

[
  {
    "Name": "Step Name",
    "ActionOnFailure": "CONTINUE",
    "HadoopJarStep": {
      "Jar": "/home/hadoop/contrib/streaming/hadoop-streaming.jar",
      "Args": [
         "-files", "s3://path/to/mapper.py",
         "-input",     "s3://path/to/input/",
         "-output",    "s3://path/to/output/",
         "-mapper",    "mapper.py",
         "-reducer",   "/bin/cat",
         "-jobconf",   "mapreduce.map.java.opts=-Xmx22528m",
         "-jobconf",   "mapreduce.map.memory.mb=23424",
         "-jobconf",   "mapreduce.task.timeout=24000000",
         "-jobconf",   "mapreduce.job.maps=200",
         "-jobconf",   "mapreduce.tasktracker.map.tasks.maximum=1",
         "-jobconf",   "mapred.map.tasks.speculative.execution=false"
      ]
    }
  }
]
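
When comparing runs, it can help to pull the `-jobconf` values out of the step arguments and look at them side by side, for example how much room is left between the JVM heap and the container size. The sketch below is only a sanity check on the numbers in this configuration, not a known cause of the failure; the helper names are hypothetical, and the `-Xmx` parser only handles `m` and `g` suffixes.

```python
import re


def jobconf_settings(args):
    """Collect the key=value pair that follows each "-jobconf" flag."""
    settings = {}
    for flag, value in zip(args, args[1:]):
        if flag == "-jobconf":
            key, _, val = value.partition("=")
            settings[key] = val
    return settings


def heap_mb(java_opts):
    """Return the -Xmx value in MB, or None if no -Xmx option is present."""
    match = re.search(r"-Xmx(\d+)([mMgG])", java_opts)
    if match is None:
        return None
    size = int(match.group(1))
    return size * 1024 if match.group(2) in "gG" else size


# A subset of the "Args" list from the step configuration in this question.
args = [
    "-mapper", "mapper.py",
    "-jobconf", "mapreduce.map.java.opts=-Xmx22528m",
    "-jobconf", "mapreduce.map.memory.mb=23424",
]

settings = jobconf_settings(args)
heap = heap_mb(settings["mapreduce.map.java.opts"])
container = int(settings["mapreduce.map.memory.mb"])
headroom = container - heap
print(f"heap={heap} MB, container={container} MB, headroom={headroom} MB")
```

With the values above, the map container is 23424 MB and the heap is 22528 MB, which leaves 896 MB for everything outside the Java heap.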

Does anyone know the root cause of this problem, or a workaround?
