"hadoop streaming failed with error code 5"

Posted by krugob8w on 2021-06-02 in Hadoop


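I created a multi-node Hadoop cluster using my two laptops and tested it successfully. After that, I installed RHadoop on top of the Hadoop environment. All the necessary packages were installed and the path variables were set.
Then I tried to run a wordcount example as follows: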

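library(rmr2)  # provides mapreduce() and keyval()

# Map: split each input line on whitespace and emit (word, 1) pairs
map <- function(k, lines) {
  words.list <- strsplit(lines, "\\s")
  words <- unlist(words.list)
  return(keyval(words, 1))
}

# Reduce: sum the counts collected for each word
reduce <- function(word, counts) {
  keyval(word, sum(counts))
}

# Run the job on an HDFS input path and write the result to an HDFS output path
wordcount <- function(input, output = NULL) {
  mapreduce(input = input, output = output, input.format = "text", map = map, reduce = reduce)
}

# HDFS paths, relative to the HDFS home directory of the user running R
hdfs.root <- "wordcount"
hdfs.data <- file.path(hdfs.root, "data")
hdfs.out <- file.path(hdfs.root, "out")
out <- wordcount(hdfs.data, hdfs.out)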
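
For reference, the computation that the map and reduce functions above are meant to perform can be sketched in plain R, without Hadoop or rmr2. This is only an illustration for checking the word-count logic separately from the cluster setup. The function name local_wordcount and the sample input are hypothetical and not part of the original post.

```r
# Plain-R stand-in for the wordcount job; no Hadoop or rmr2 involved.
# local_wordcount is a hypothetical helper name used only for illustration.
local_wordcount <- function(lines) {
  # "map" step: split each line on single whitespace characters, as in the question
  words <- unlist(strsplit(lines, "\\s"))
  # drop empty strings that consecutive whitespace characters would produce
  words <- words[nzchar(words)]
  # "reduce" step: group a count of 1 per occurrence by word and sum each group
  vapply(split(rep(1, length(words)), words), sum, numeric(1))
}

# Hypothetical sample input
sample_lines <- c("hello hadoop", "hello r")
counts <- local_wordcount(sample_lines)
print(counts)
```

If this produces the expected counts, the map and reduce logic itself is unlikely to be what makes the streaming job fail to launch.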

Running the wordcount call gives me the following error:

15/05/24 21:09:20 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
15/05/24 21:09:20 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
15/05/24 21:09:20 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with     processName=JobTracker, sessionId= - already initialized
15/05/24 21:09:21 INFO mapreduce.JobSubmitter: Cleaning up the staging area     file:/app/hadoop/tmp/mapred/staging/master91618435/.staging/job_local91618435_0001
15/05/24 21:09:21 ERROR streaming.StreamJob: Error Launching job : No such file or directory
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce,  : 
  hadoop streaming failed with error code 5
Called from: mapreduce(input = input, output = output, input.format = "text", 
    map = map, reduce = reduce)

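Before running this, I created two HDFS folders, wordcount/data and wordcount/out, and uploaded some text to the first one from the command line.
Another issue: I have two users on my computer, hduser and master. The first one was created for the Hadoop installation. I suspect that when I open R/RStudio I am running it as master, and because Hadoop was set up for hduser, there is some permission problem that leads to this error. As the 4th line of the output shows, the system tries to look for master91618435, which I think should be hduser... instead.
My question is: how can I get rid of this error?
P.S.: There is a similar question here, but it has no answers that are useful to me.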
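
The suspicion above about the hduser and master accounts can be checked from inside the R session before resubmitting the job. The following is a minimal sketch, not a fix: it assumes rmr2 locates Hadoop through the HADOOP_CMD and HADOOP_STREAMING environment variables, and it only reports which OS user owns the R session and whether those variables point at existing files. The helper name check_hadoop_env is hypothetical.

```r
# Report, for each environment variable, whether it is set and whether the
# path it holds exists on disk. check_hadoop_env is a hypothetical helper.
check_hadoop_env <- function(vars = c("HADOOP_CMD", "HADOOP_STREAMING")) {
  paths <- Sys.getenv(vars, unset = NA)
  is_set <- !is.na(paths)
  data.frame(
    variable = vars,
    path = unname(paths),
    is_set = unname(is_set),
    # file.exists("") is FALSE, so unset variables are reported as missing
    exists = unname(is_set & file.exists(ifelse(is_set, paths, ""))),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# OS user that owns this R session (master or hduser in the setup described above)
cat("R session user:", Sys.info()[["user"]], "\n")
print(check_hadoop_env())
```

If the session user is master, or either variable is unset or points to a path that does not exist, that would be consistent with the "No such file or directory" message, though other causes are possible.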

hadoop r rhadoop

Source: https://stackoverflow.com/questions/30427648/hadoop-streaming-failed-with-error-code-5
