Problems running a Pig script in mapreduce mode

gzjq41n4  posted on 2021-05-30  in Hadoop

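I have a running Hadoop (2.6.0) cluster with 6 nodes (including the master), and I want to run a Pig (0.14.0) script in mapreduce mode. The script runs without errors, but unfortunately it seems to run only on the master node. While researching this I tried several changes to the Hadoop configuration files, without success.
Can you help me figure out how to get Pig working across the whole cluster?
Here is some information:
Configuration on every node:
General:
/etc/hosts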

127.0.0.1       localhost
192.168.101.3   master
192.168.101.4   node1
192.168.101.5   node2
192.168.101.6   node3
192.168.101.7   node4
192.168.101.8   node5

Hadoop:
yarn-site.xml

<configuration>
<!-- Site specific YARN configuration properties -->
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>master</value>
                <description>...</description>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>master:8025</value>
                <description>...</description>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>master:8030</value>
                <description>...</description>
        </property>
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>master:8050</value>
                <description>...</description>
        </property>
        <property>
                <name>yarn.resourcemanager.admin.address</name>
                <value>master:8041</value>
                <description>...</description>
        </property>
        <property>
                <name>yarn.nodemanager.aux_services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux_services.mapreduce.shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
                <name>yarn.log.server.url</name>
                <value>master:19888/jobhistory/logs/</value>
        </property>
</configuration>

core-site.xml

<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/app/hadoop/tmp</value>
                <description>A base for other temporary directories.</description>
        </property>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://master:9000/</value>
                <description>...</description>
        </property>
</configuration>

mapred-site.xml

<configuration>
        <property>
                <name>mapreduce.jobtracker.address</name>
                <value>master:54311</value>
                <description>...</description>
        </property>
        <property>
                <name>mapred.framework.name</name>
                <value>yarn</value>
                <final>true</final>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>master:10020</value>
                <description>...</description>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>master:19888</value>
                <description>...</description>
        </property>

</configuration>

Pig output:

15/01/09 13:12:54 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
15/01/09 13:12:54 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
15/01/09 13:12:54 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2015-01-09 13:12:54,845 [main] INFO  org.apache.pig.Main - Apache Pig version 0.14.0 (r1640057) compiled Nov 16 2014, 18:02:05
2015-01-09 13:12:54,845 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/hduser/pig_1420805574843.log
2015-01-09 13:12:56,450 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/hduser/.pigbootup not found
2015-01-09 13:12:56,876 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-01-09 13:12:56,886 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:12:56,886 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://master:9000/
2015-01-09 13:12:58,146 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: master:54311
2015-01-09 13:12:59,195 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:12:59,418 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:12:59,598 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:13:00,496 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: FILTER,UNION
2015-01-09 13:13:00,618 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:13:00,634 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2015-01-09 13:13:00,713 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2015-01-09 13:13:00,987 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2015-01-09 13:13:01,037 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2015-01-09 13:13:01,038 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2015-01-09 13:13:01,079 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:13:01,103 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - session.id is deprecated. Instead, use dfs.metrics.session-id
2015-01-09 13:13:01,105 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
2015-01-09 13:13:01,149 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2015-01-09 13:13:01,161 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2015-01-09 13:13:01,161 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2015-01-09 13:13:01,161 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2015-01-09 13:13:01,167 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2015-01-09 13:13:19,222 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hduser/pig-0.14.0/pig-0.14.0-core-h2.jar to DistributedCache through /tmp/temp-1277984423/tmp-918732110/pig-0.14.0-core-h2.jar
2015-01-09 13:13:20,063 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hduser/pig-0.14.0/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1277984423/tmp883771618/automaton-1.11-8.jar
2015-01-09 13:13:20,621 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hduser/pig-0.14.0/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1277984423/tmp-1372558595/antlr-runtime-3.4.jar
2015-01-09 13:13:26,600 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hadoop/share/hadoop/common/lib/guava-11.0.2.jar to DistributedCache through /tmp/temp-1277984423/tmp-1556176302/guava-11.0.2.jar
2015-01-09 13:13:29,300 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hduser/pig-0.14.0/lib/joda-time-2.1.jar to DistributedCache through /tmp/temp-1277984423/tmp145012374/joda-time-2.1.jar
2015-01-09 13:13:29,718 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2015-01-09 13:13:29,736 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2015-01-09 13:13:29,736 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2015-01-09 13:13:29,736 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2015-01-09 13:13:29,840 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2015-01-09 13:13:29,841 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
2015-01-09 13:13:30,191 [JobControl] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2015-01-09 13:13:30,384 [JobControl] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:13:30,785 [JobControl] WARN  org.apache.hadoop.mapreduce.JobSubmitter - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2015-01-09 13:13:30,949 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-01-09 13:13:30,949 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-01-09 13:13:31,250 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 52
2015-01-09 13:13:31,309 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-01-09 13:13:31,309 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-01-09 13:13:31,355 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 24
2015-01-09 13:13:31,378 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-01-09 13:13:31,379 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-01-09 13:13:31,394 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 6
2015-01-09 13:13:31,587 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:82
2015-01-09 13:13:31,706 [JobControl] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:13:32,475 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_local647507189_0001
2015-01-09 13:13:33,628 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805612754/pig-0.14.0-core-h2.jar <- /home/hduser/pig-0.14.0-core-h2.jar
2015-01-09 13:13:33,758 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp-1277984423/tmp-918732110/pig-0.14.0-core-h2.jar as file:/app/hadoop/tmp/mapred/local/1420805612754/pig-0.14.0-core-h2.jar
2015-01-09 13:13:33,759 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805612755/automaton-1.11-8.jar <- /home/hduser/automaton-1.11-8.jar
2015-01-09 13:13:33,770 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp-1277984423/tmp883771618/automaton-1.11-8.jar as file:/app/hadoop/tmp/mapred/local/1420805612755/automaton-1.11-8.jar
2015-01-09 13:13:33,772 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805612756/antlr-runtime-3.4.jar <- /home/hduser/antlr-runtime-3.4.jar
2015-01-09 13:13:33,781 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp-1277984423/tmp-1372558595/antlr-runtime-3.4.jar as file:/app/hadoop/tmp/mapred/local/1420805612756/antlr-runtime-3.4.jar
2015-01-09 13:15:54,534 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hadoop/share/hadoop/common/lib/guava-11.0.2.jar to DistributedCache through /tmp/temp206201348/tmp-1481268210/guava-11.0.2.jar
2015-01-09 13:15:56,233 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/hduser/pig-0.14.0/lib/joda-time-2.1.jar to DistributedCache through /tmp/temp206201348/tmp-1921418840/joda-time-2.1.jar
2015-01-09 13:15:56,340 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2015-01-09 13:15:56,366 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2015-01-09 13:15:56,367 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2015-01-09 13:15:56,368 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2015-01-09 13:15:56,483 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2015-01-09 13:15:56,486 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
2015-01-09 13:15:56,505 [JobControl] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2015-01-09 13:15:56,582 [JobControl] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:15:56,695 [JobControl] WARN  org.apache.hadoop.mapreduce.JobSubmitter - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2015-01-09 13:15:57,070 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-01-09 13:15:57,070 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-01-09 13:15:57,197 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 52
2015-01-09 13:15:57,227 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-01-09 13:15:57,228 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-01-09 13:15:57,263 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 24
2015-01-09 13:15:57,289 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-01-09 13:15:57,289 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-01-09 13:15:57,306 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 6
2015-01-09 13:15:57,393 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:82
2015-01-09 13:15:57,416 [JobControl] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:15:57,791 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_local561414911_0001
2015-01-09 13:15:58,741 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805758017/pig-0.14.0-core-h2.jar <- /home/hduser/pig-0.14.0-core-h2.jar
2015-01-09 13:15:58,755 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp206201348/tmp1912320441/pig-0.14.0-core-h2.jar as file:/app/hadoop/tmp/mapred/local/1420805758017/pig-0.14.0-core-h2.jar
2015-01-09 13:15:58,757 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805758018/automaton-1.11-8.jar <- /home/hduser/automaton-1.11-8.jar
2015-01-09 13:15:58,766 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp206201348/tmp-886499198/automaton-1.11-8.jar as file:/app/hadoop/tmp/mapred/local/1420805758018/automaton-1.11-8.jar
2015-01-09 13:15:58,768 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805758019/antlr-runtime-3.4.jar <- /home/hduser/antlr-runtime-3.4.jar
2015-01-09 13:15:58,778 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp206201348/tmp1437387446/antlr-runtime-3.4.jar as file:/app/hadoop/tmp/mapred/local/1420805758019/antlr-runtime-3.4.jar
2015-01-09 13:15:58,779 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805758020/guava-11.0.2.jar <- /home/hduser/guava-11.0.2.jar
2015-01-09 13:15:58,786 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp206201348/tmp-1481268210/guava-11.0.2.jar as file:/app/hadoop/tmp/mapred/local/1420805758020/guava-11.0.2.jar
2015-01-09 13:15:58,787 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /app/hadoop/tmp/mapred/local/1420805758021/joda-time-2.1.jar <- /home/hduser/joda-time-2.1.jar
2015-01-09 13:15:58,795 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://master:9000/tmp/temp206201348/tmp-1921418840/joda-time-2.1.jar as file:/app/hadoop/tmp/mapred/local/1420805758021/joda-time-2.1.jar
2015-01-09 13:15:58,953 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/app/hadoop/tmp/mapred/local/1420805758017/pig-0.14.0-core-h2.jar
2015-01-09 13:15:58,954 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/app/hadoop/tmp/mapred/local/1420805758018/automaton-1.11-8.jar
2015-01-09 13:15:58,955 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/app/hadoop/tmp/mapred/local/1420805758019/antlr-runtime-3.4.jar
2015-01-09 13:15:58,955 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/app/hadoop/tmp/mapred/local/1420805758020/guava-11.0.2.jar
2015-01-09 13:15:58,955 [JobControl] INFO  org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/app/hadoop/tmp/mapred/local/1420805758021/joda-time-2.1.jar
2015-01-09 13:15:58,970 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://localhost:8080/
2015-01-09 13:15:58,973 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local561414911_0001
2015-01-09 13:15:58,973 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases records_infobox,records_mappingbased,records_person,records_union,result_filter
2015-01-09 13:15:58,973 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: records_person[10,17],records_person[-1,-1],null[-1,-1],records_union[13,16],records_infobox[6,18],records_infobox[-1,-1],result_filter[16,16],records_mappingbased[8,23],records_mappingbased[-1,-1],null[-1,-1] C:  R: 
2015-01-09 13:15:58,990 [Thread-19] INFO  org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter set in config null
2015-01-09 13:15:58,991 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2015-01-09 13:15:58,994 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_local561414911_0001]
2015-01-09 13:15:59,067 [Thread-19] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2015-01-09 13:15:59,069 [Thread-19] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-09 13:15:59,069 [Thread-19] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-01-09 13:15:59,094 [Thread-19] INFO  org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter is org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2015-01-09 13:15:59,257 [Thread-19] INFO  org.apache.hadoop.mapred.LocalJobRunner - Waiting for map tasks
2015-01-09 13:15:59,258 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local561414911_0001_m_000000_0
2015-01-09 13:15:59,459 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task -  Using ResourceCalculatorProcessTree : [ ]
2015-01-09 13:15:59,470 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 134217728
Input split[0]:
   Length = 134217728
   ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
   Locations:

-----------------------

2015-01-09 13:15:59,522 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed hdfs://master:9000/wiki/infobox_properties_en.nt:0+134217728
2015-01-09 13:15:59,662 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2015-01-09 13:15:59,743 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: records_person[10,17],records_person[-1,-1],null[-1,-1],records_union[13,16],records_infobox[6,18],records_infobox[-1,-1],result_filter[16,16],records_mappingbased[8,23],records_mappingbased[-1,-1],null[-1,-1] C:  R: 
2015-01-09 13:15:59,798 [LocalJobRunner Map Task Executor #0] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject(ACCESSING_NON_EXISTENT_FIELD): Attempt to access field which was not found in the input
2015-01-09 13:15:59,815 [LocalJobRunner Map Task Executor #0] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject(ACCESSING_NON_EXISTENT_FIELD): Attempt to access field which was not found in the input
2015-01-09 13:16:05,578 [communication thread] INFO  org.apache.hadoop.mapred.LocalJobRunner - map > map
2015-01-09 13:16:08,582 [communication thread] INFO  org.apache.hadoop.mapred.LocalJobRunner - map > map
2015-01-09 13:16:10,209 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - map > map
2015-01-09 13:16:10,699 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task - Task:attempt_local561414911_0001_m_000000_0 is done. And is in the process of committing
2015-01-09 13:16:10,714 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - map > map
2015-01-09 13:16:10,714 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task - Task attempt_local561414911_0001_m_000000_0 is allowed to commit now
2015-01-09 13:16:10,849 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_local561414911_0001_m_000000_0' to hdfs://master:9000/tmp/temp206201348/tmp-1297558267/_temporary/0/task_local561414911_0001_m_000000
2015-01-09 13:16:10,854 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - map
2015-01-09 13:16:10,854 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task - Task 'attempt_local561414911_0001_m_000000_0' done.
2015-01-09 13:16:10,855 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - Finishing task: attempt_local561414911_0001_m_000000_0
2015-01-09 13:16:10,855 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local561414911_0001_m_000001_0
2015-01-09 13:16:10,877 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task -  Using ResourceCalculatorProcessTree : [ ]
2015-01-09 13:16:10,883 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
....
dgsult0t  1#

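I had a similar problem, though with a different mapred-site.xml; nevertheless, I think the cause is the same. YARN is the next generation of MapReduce, which is why we need the following part in the file to make sure it works with old programs: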

<property>
            <name>mapred.framework.name</name>
            <value>yarn</value>
            <final>true</final>
    </property>

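However, given that you are using YARN, you have no JobTracker, because in a sense it has been replaced by the ResourceManager (it is actually a complete redesign; you can read about it at http://blog.cloudera.com/blog/2013/11/migrating-to-mapreduce-2-on-yarn-for-operators/ ).
Therefore, you need to remove the following lines: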

<property>
            <name>mapreduce.jobtracker.address</name>
            <value>master:54311</value>
            <description>...</description>
    </property>

from the file, and then Pig should be good to go.
(There is a related answer that discusses why a mapreduce.jobtracker.address setting still exists on YARN.)
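
One more thing worth checking, offered as a hedged sketch rather than a confirmed fix: the job IDs in the posted log start with job_local and the tasks run under LocalJobRunner, which is what Hadoop prints when the framework setting resolves to its default, local. As far as I know, the property Hadoop 2.x reads is mapreduce.framework.name, not mapred.framework.name, so the block quoted above may simply be ignored. Assuming that is the cause, a mapred-site.xml along these lines (same host names and ports as in the question, jobtracker entry dropped) might look like:

```xml
<configuration>
        <!-- Assumption: Hadoop 2.x reads mapreduce.framework.name; its default value is "local" -->
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <!-- JobHistory server settings, copied unchanged from the question -->
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>master:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>master:19888</value>
        </property>
</configuration>
```

If the change takes effect, the job ID in the Pig log should begin with job_ followed by a cluster timestamp rather than job_local, and LocalJobRunner should no longer appear.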

zhte4eai  2#

Run: yarn application -list and check whether the nodes are able to connect to the ResourceManager. In your case that is master:8050.
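
Once jobs do reach YARN, the shuffle service entries in the question's yarn-site.xml may also need attention. This is an assumption based on the Hadoop 2.x property names as I know them, not something the posted log shows: the keys are spelled with a hyphen (aux-services), and the class key is suffixed with the service name mapreduce_shuffle. A sketch of the two properties under that assumption:

```xml
<!-- Assumption: Hadoop 2.x NodeManagers look up "aux-services" (hyphen), not "aux_services" -->
<property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
</property>
<!-- The class key carries the service name declared above as its suffix -->
<property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
```

The NodeManagers would need a restart on every node for a change like this to be picked up.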
