WARN mapreduce.LoadIncrementalHFiles: Skipping non-directory hdfs: on EMR

Posted by ercv8c1e on 2021-06-01 in Hadoop

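I am trying to bulk load a text file into HBase using MapReduce. Everything works fine, but at the last step, the bulk load itself, I get a warning and my MapReduce job gets stuck.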

17/06/15 10:22:43 INFO mapreduce.Job: Job job_1495181241247_0013 completed successfully
17/06/15 10:22:43 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=836391
                FILE: Number of bytes written=1988049
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=73198
                HDFS: Number of bytes written=12051358
                HDFS: Number of read operations=8
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters
                Launched map tasks=1
                Launched reduce tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=196200
                Total time spent by all reduces in occupied slots (ms)=428490
                Total time spent by all map tasks (ms)=4360
                Total time spent by all reduce tasks (ms)=4761
                Total vcore-milliseconds taken by all map tasks=4360
                Total vcore-milliseconds taken by all reduce tasks=4761
                Total megabyte-milliseconds taken by all map tasks=6278400
                Total megabyte-milliseconds taken by all reduce tasks=13711680
        Map-Reduce Framework
                Map input records=5604
                Map output records=5603
                Map output bytes=8240332
                Map output materialized bytes=836387
                Input split bytes=240
                Combine input records=0
                Combine output records=0
                Reduce input groups=5603
                Reduce shuffle bytes=836387
                Reduce input records=5603
                Reduce output records=179296
                Spilled Records=11206
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=137
                CPU time spent (ms)=11240
                Physical memory (bytes) snapshot=820736000
                Virtual memory (bytes) snapshot=7694557184
                Total committed heap usage (bytes)=724566016
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=72958
        File Output Format Counters
                Bytes Written=12051358
Incremental upload completed..........
job is successfull..........H file Loading Will start Now
17/06/15 10:22:43 WARN mapreduce.LoadIncrementalHFiles: Skipping non-directory hdfs://ip:8020/user/hadoop/ESGTRF/outputdir/output0/_SUCCESS

The same code works on Cloudera, but when I run it on AWS EMR I hit this problem.
I suspect a configuration issue; I have not set any configuration explicitly.

Answer #1 by 8gsdolmq

After I set the permissions explicitly, my problem was resolved:

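// hfs is the HDFS FileSystem handle; this grants rwx to everyone on the
// column family directory under the HFile output path.
hfs.setPermission(new Path(outputPath + "/columnFamilyName"), FsPermission.valueOf("drwxrwxrwx"));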
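
The string passed to FsPermission.valueOf above is the 10-character form that `ls -l` prints: one file-type character followed by nine permission characters. As a rough illustration of what that string encodes, here is a minimal JDK-only sketch that converts it to the numeric mode. It is not Hadoop's implementation, and the class name SymbolicMode and the method toMode are hypothetical names chosen for this example.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper, not part of Hadoop or HBase.
public class SymbolicMode {

    /** Converts a 10-character ls-style string such as "drwxr-xr-x" to its numeric mode. */
    static int toMode(String symbolic) {
        if (symbolic.length() != 10) {
            throw new IllegalArgumentException("expected 10 characters, got: " + symbolic);
        }
        // Drop the leading file-type character ('d' or '-'); the JDK parser expects 9 characters.
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString(symbolic.substring(1));
        int mode = 0;
        // The enum is declared OWNER_READ ... OTHERS_EXECUTE, so the most significant bit comes first.
        for (PosixFilePermission p : PosixFilePermission.values()) {
            mode = (mode << 1) | (perms.contains(p) ? 1 : 0);
        }
        return mode;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toOctalString(toMode("drwxrwxrwx"))); // prints 777
        System.out.println(Integer.toOctalString(toMode("drwxr-xr-x"))); // prints 755
    }
}
```

So "drwxrwxrwx" corresponds to mode 777, which should also be expressible in Hadoop as new FsPermission((short) 0777). That mode lets every user read, write and traverse the directory, so a narrower mode may be preferable if it is enough for the account that performs the load.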
