Hadoop streaming failed with error code 1 in RHadoop

r7xajy2e · Posted 2021-05-29 in Hadoop

I am using RHadoop with the following code:

Sys.setenv(HADOOP_OPTS="-Djava.library.path=/usr/local/hadoop/lib/native")
Sys.setenv(HADOOP_HOME="/usr/local/hadoop")
Sys.setenv(HADOOP_CMD="/usr/local/hadoop/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-3.0.0.jar")
Sys.setenv(JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64")

library(rJava)
library(rhdfs)
library(rmr2)
hdfs.init()

mapper = function(., X) {
  n = nrow(X)
  # add a column of ones so that summing it yields the per-class row count
  ones = matrix(rep(1, n), nrow = n, ncol = 1)
  # sum the count column and the 79 feature columns, grouped by the label in column 80
  ag = aggregate(cbind(ones, X[, 1:79]), by = list(X[, 80]), FUN = "sum")
  key = factor(ag[, 1])
  keyval(key, split(ag[, -1], key))
}

reducer = function(k, A) {
  # element-wise sum of the partial per-class aggregates emitted for key k
  keyval(k, list(Reduce('+', A)))
}

GroupSums <-  from.dfs( mapreduce(input = "/ISCXFlowMeter.csv", map = mapper, reduce = reducer, combine = T))

When I run this code, I get the following error:
packageJobJar: [/tmp/hadoop-unjar7138506441946536619/] [] /tmp/streamjob6099552934186757596.jar tmpDir=null
2018-06-12 22:40:04,651 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2018-06-12 22:40:04,945 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2018-06-12 22:40:05,201 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/uel/.staging/job_1528838017005_0012
2018-06-12 22:40:06,158 INFO mapred.FileInputFormat: Total input files to process : 1
2018-06-12 22:40:06,171 INFO net.NetworkTopology: Adding a new node: /default-rack/127.0.1:9866
2018-06-12 22:40:06,233 INFO mapreduce.JobSubmitter: number of splits:2
2018-06-12 22:40:06,348 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2018-06-12 22:40:06,608 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1528838017005_0012
2018-06-12 22:40:06,610 INFO mapreduce.JobSubmitter: Executing with tokens: []
2018-06-12 22:40:06,945 INFO conf.Configuration: resource-types.xml not found
2018-06-12 22:40:06,945 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2018-06-12 22:40:07,022 INFO impl.YarnClientImpl: Submitted application application_1528838017005_0012
2018-06-12 22:40:07,249 INFO mapreduce.Job: The url to track the job: http://uel-deskop-vm:8088/proxy/application_1528838017005_0012/
2018-06-12 22:40:07,251 INFO mapreduce.Job: Running job: job_1528838017005_0012
2018-06-12 22:40:09,301 INFO mapreduce.Job: Job job_1528838017005_0012 running in uber mode : false
2018-06-12 22:40:09,305 INFO mapreduce.Job:  map 0% reduce 0%
2018-06-12 22:40:09,337 INFO mapreduce.Job: Job job_1528838017005_0012 failed with state FAILED due to: Application application_1528838017005_0012 failed 2 times due to AM Container for appattempt_1528838017005_0012_000002 exited with exitCode: 127
Failing this attempt. Diagnostics: [2018-06-12 22:40:08.734] Exception from container-launch.
Container id: container_1528838017005_0012_02_000001
Exit code: 127
[2018-06-12 22:40:08.736] Container exited with a non-zero exit code 127. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
/bin/bash: /bin/java: No such file or directory
[2018-06-12 22:40:08.736] Container exited with a non-zero exit code 127. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
/bin/bash: /bin/java: No such file or directory
For more detailed output, check the application tracking page: http://uel-deskop-vm:8088/cluster/app/application_1528838017005_0012 Then click on links to logs of each attempt. Failing the application.
2018-06-12 22:40:09,368 INFO mapreduce.Job: Counters: 0
2018-06-12 22:40:09,369 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce,  :
  hadoop streaming failed with error code 1
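
The decisive line in this output is "/bin/bash: /bin/java: No such file or directory": the YARN container tries to run java at /bin/java and fails, which is what produces exit code 127. As a hypothetical driver-side sanity check (not part of the original post), the paths exported with Sys.setenv() above can be verified from R before resubmitting; note, however, that the failing lookup happens inside the YARN container, not in the R session itself.

# Hypothetical sanity checks (not from the original question): confirm that the
# driver-side paths exported via Sys.setenv() actually exist.
file.exists(file.path(Sys.getenv("JAVA_HOME"), "bin", "java"))  # expect TRUE
file.exists(Sys.getenv("HADOOP_CMD"))                           # expect TRUE
file.exists(Sys.getenv("HADOOP_STREAMING"))                     # expect TRUE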
The ISCXFlowMeter.csv file loaded into Hadoop is available at: https://www.dropbox.com/s/rbppzg6x2slzcjz/iscxflowmeter.csv?dl=1
Could you guide me on how to fix this problem?

Answer 1 (xcitsw88):

After a while, I was able to fix the error by adding the following properties to mapred-site.xml.

<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
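
If editing mapred-site.xml is not convenient, the same three properties can, in principle, also be passed per job from R through rmr2's backend.parameters argument, which forwards them to the streaming call as -D options. This is only a sketch under that assumption; /usr/local/hadoop is the HADOOP_HOME used in the question.

# Sketch only: pass the equivalent settings per job via rmr2's backend.parameters
# instead of editing mapred-site.xml (assumes HADOOP_HOME = /usr/local/hadoop).
GroupSums <- from.dfs(mapreduce(
  input = "/ISCXFlowMeter.csv",
  map = mapper,
  reduce = reducer,
  combine = TRUE,
  backend.parameters = list(hadoop = list(
    D = "yarn.app.mapreduce.am.env=HADOOP_MAPRED_HOME=/usr/local/hadoop",
    D = "mapreduce.map.env=HADOOP_MAPRED_HOME=/usr/local/hadoop",
    D = "mapreduce.reduce.env=HADOOP_MAPRED_HOME=/usr/local/hadoop"
  ))
))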

However, the problem now is that after the MapReduce job completes, the returned key-values are NULL. Any help is appreciated.
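
A quick way to see where the NULL appears (not part of the original answer) is to inspect the object returned by from.dfs() with rmr2's keys() and values() helpers; a minimal sketch, assuming the GroupSums object from the question:

# Inspect the result returned by from.dfs().
str(GroupSums)           # raw keyval structure
k <- keys(GroupSums)     # expected: the class labels emitted by the mapper
v <- values(GroupSums)   # expected: one summed row block per class
head(k)
str(v)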
