Sqoop workflow run through Oozie always fails

Posted by 2mbi3lxu on 2021-06-03 in Sqoop

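While learning Sqoop, I ran a Sqoop command to list all MySQL databases on Cloudera's CDH, and it correctly returned every available database. The problem is that when I run the same command as a job in an Oozie workflow, it always fails.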

job.properties

nameNode=hdfs://quickstart.cloudera:8020
resourceManager=0.0.0.0:8032

oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/oozie/pig_demo

workflow.xml

<workflow-app name="foo-wf" xmlns="uri:oozie:workflow:0.2">
    <start to="sqoop-36c5"/>

    <action name="sqoop-36c5">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${resourceManager}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>list-databases --m 1 
                 --connect "jdbc:mysql://quickstart.cloudera:3306"
                 --username retail_dba
                 --password cloudera</command>
        </sqoop>
        <ok to="finish"/>
        <error to="errorHalt"/>
    </action>

    <kill name="errorHalt">
        <message>Input unavailable,error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="finish"/>
</workflow-app>

Here is the log that was generated:

2019-01-24 09:52:09,352 INFO [Thread-69] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/history/done_intermediate/cloudera/job_1548311302916_0001-1548312666604-cloudera-oozie%3Alauncher%3AT%3Dsqoop%3AW%3Dfoo%2Dwf%3AA%3Dsqoop%2D36c5%3AID%3D00-1548312729060-1-0-SUCCEEDED-root.cloudera-1548312703464.jhist_tmp to hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/history/done_intermediate/cloudera/job_1548311302916_0001-1548312666604-cloudera-oozie%3Alauncher%3AT%3Dsqoop%3AW%3Dfoo%2Dwf%3AA%3Dsqoop%2D36c5%3AID%3D00-1548312729060-1-0-SUCCEEDED-root.cloudera-1548312703464.jhist
2019-01-24 09:52:09,352 INFO [Thread-69] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped JobHistoryEventHandler. super.stop()
2019-01-24 09:52:09,353 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1548311302916_0001_m_000000_0
2019-01-24 09:52:09,437 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1548311302916_0001_m_000000_0 TaskAttempt Transitioned from SUCCESS_FINISHING_CONTAINER to SUCCEEDED
2019-01-24 09:52:09,441 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Setting job diagnostics to 
2019-01-24 09:52:09,442 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: History url is http://quickstart.cloudera:19888/jobhistory/job/job_1548311302916_0001
2019-01-24 09:52:09,480 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Waiting for application to be successfully unregistered.
2019-01-24 09:52:10,483 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Final Stats: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2019-01-24 09:52:10,488 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Deleting staging directory hdfs://quickstart.cloudera:8020 /tmp/hadoop-yarn/staging/cloudera/.staging/job_1548311302916_0001
2019-01-24 09:52:10,505 INFO [Thread-69] org.apache.hadoop.ipc.Server: Stopping server on 34049
2019-01-24 09:52:10,517 INFO [IPC Server listener on 34049] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 34049
2019-01-24 09:52:10,523 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2019-01-24 09:52:10,524 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted
2019-01-24 09:52:10,531 INFO [Ping Checker] org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: TaskAttemptFinishingMonitor thread interrupted
2019-01-24 09:52:10,556 INFO [Thread-69] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Job end notification started for jobID : job_1548311302916_0001
2019-01-24 09:52:10,560 INFO [Thread-69] org.mortbay.log: Job end notification attempts left 0
2019-01-24 09:52:10,560 INFO [Thread-69] org.mortbay.log: Job end notification trying http://quickstart.cloudera:11000/oozie/callback?id=0000000-190124093049946-oozie-oozi-W@sqoop-36c5&status=SUCCEEDED
2019-01-24 09:52:10,590 INFO [Thread-69] org.mortbay.log: Job end notification to http://quickstart.cloudera:11000/oozie/callback?id=0000000-190124093049946-oozie-oozi-W@sqoop-36c5&status=SUCCEEDED succeeded
2019-01-24 09:52:10,590 INFO [Thread-69] org.mortbay.log: Job end notification succeeded for job_1548311302916_0001
2019-01-24 09:52:15,605 INFO [Thread-69] org.apache.hadoop.ipc.Server: Stopping server on 44688
2019-01-24 09:52:15,613 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2019-01-24 09:52:15,617 INFO [IPC Server listener on 44688] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 44688
2019-01-24 09:52:15,637 INFO [Thread-69] org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:0

The Sqoop job succeeds, but the task attempt is killed while it transitions from SUCCESS_FINISHING_CONTAINER to SUCCEEDED. Why is that?

[cloudera@quickstart ~]$ yarn version
Hadoop 2.6.0-cdh5.13.0
Subversion http://github.com/cloudera/hadoop -r 42e8860b182e55321bd5f5605264da4adc8882be
Compiled by jenkins on 2017-10-04T18:08Z
Compiled with protoc 2.5.0
From source with checksum 5e84c185f8a22158e2b0e4b8f85311
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.13.0.jar

nr9pn0ug1#

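First of all, you are running a list-databases command. I am not sure why you would put that in a workflow.xml file.
In my experience the Cloudera VM does not behave very consistently, because we usually run it with limited memory: either no containers get allocated, or the containers get killed. Restarting the whole VM does not help either.
If you create a fresh instance of the Cloudera VM from the image and try running the workflow again, that may solve your problem. It has worked for us in the past.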
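
One more thing that may be worth checking, separately from the VM resources: as far as I understand the Oozie Sqoop action documentation, the contents of `<command>` are split on every space and no shell-style quote handling is applied, so the double quotes around the JDBC URL in the question's workflow.xml would presumably reach Sqoop as part of the URL. Below is a minimal sketch of the same workflow using one `<arg>` element per token instead. It reuses the names from the question. Leaving out `--m 1` is my own assumption (list-databases does not start any mappers), and I have not run this on the quickstart VM.

```xml
<workflow-app name="foo-wf" xmlns="uri:oozie:workflow:0.2">
    <start to="sqoop-36c5"/>

    <action name="sqoop-36c5">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${resourceManager}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- One arg element per token: each value is passed to Sqoop as-is,
                 so the JDBC URL needs no surrounding quotes. -->
            <arg>list-databases</arg>
            <arg>--connect</arg>
            <arg>jdbc:mysql://quickstart.cloudera:3306</arg>
            <arg>--username</arg>
            <arg>retail_dba</arg>
            <arg>--password</arg>
            <arg>cloudera</arg>
        </sqoop>
        <ok to="finish"/>
        <error to="errorHalt"/>
    </action>

    <kill name="errorHalt">
        <message>Input unavailable,error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="finish"/>
</workflow-app>
```

If the action still ends in the error transition, the stdout/stderr of the launcher job (job_1548311302916_0001 in the log above) is likely to be more informative than the application master log, since list-databases runs inside the launcher itself.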
