我正在学习使用hadoop的map reduce,并运行以下命令: hadoop jar /usr/lib/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.2.jar -mapper mapper.py -reducer reducer.py -file mapper.py -file reducer.py -input sales_data -output salesout
我包括我得到的全部错误输出:
16/04/15 00:39:26 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
packageJobJar: [mapper.py, reducer.py] [] /tmp/streamjob4183555536412178637.jar tmpDir=null
16/04/15 00:39:28 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
16/04/15 00:39:28 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
16/04/15 00:39:28 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
16/04/15 00:39:28 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/tmp/hadoop-enlighter/mapred/staging/enlighter1664997312/.staging/job_local1664997312_0001
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Illegal character in path at index 112: file:/run/media/enlighter/dd3546e2-4871-4fc6-a57e-7336392cb705/home/enlighter/workspace/dbms-lab/assign_4(Hadoop MapReduce project)/code/tester/mapper.py
at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:109)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:95)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:190)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
at org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:1014)
at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:135)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:50)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.net.URISyntaxException: Illegal character in path at index 112: file:/run/media/enlighter/dd3546e2-4871-4fc6-a57e-7336392cb705/home/enlighter/workspace/dbms-lab/assign_4(Hadoop MapReduce project)/code/tester/mapper.py
at java.net.URI$Parser.fail(URI.java:2848)
at java.net.URI$Parser.checkChars(URI.java:3021)
at java.net.URI$Parser.parseHierarchical(URI.java:3105)
at java.net.URI$Parser.parse(URI.java:3053)
at java.net.URI.<init>(URI.java:588)
at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:107)
... 26 more
似乎执行过程中我的mapper和reducer scipts所在的系统路径有问题。无法正确解析路径中的特殊字符。
我应该怎么做才能成功运行作业?我需要更改文件夹名称吗?还是有更好的解决办法?
1条答案
按热度按时间at0kjp5o1#
我也遇到了同样的例外,我能够通过删除foldername中存放mapper和reducer程序的空格来解决这个问题。