I have a workflow that looks like this:
<workflow-app xmlns="uri:oozie:workflow:0.2" name="oozie-sqoop">
    <start to="sqoop1" />
    <action name="sqoop1">
        <sqoop xmlns="uri:oozie:sqoop-action:0.4">
            <job-tracker>localhost:8032</job-tracker>
            <name-node>hdfs://quickstart.cloudera:8020</name-node>
            <arg>import</arg>
            <arg>--connect</arg>
            <arg>jdbc:mysql://8.8.8.8:3306/pro-data</arg>
            <arg>--username</arg>
            <arg>root</arg>
            <arg>--table</arg>
            <arg>data_source</arg>
            <arg>--hive-import</arg>
        </sqoop>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>sqoop failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end" />
</workflow-app>
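For comparison, the `<arg>` list above corresponds to the standalone Sqoop command sketched below. Running it directly on a gateway host, outside Oozie, often surfaces the underlying Hive error that the Launcher's generic exit code 1 hides (the command is only echoed here, since executing it needs a live cluster):

```shell
# Standalone equivalent of the workflow's <arg> list, echoed rather than executed.
SQOOP_CMD="sqoop import --connect jdbc:mysql://8.8.8.8:3306/pro-data --username root --table data_source --hive-import"
echo "$SQOOP_CMD"
```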
It always fails with this error:
Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
The import works if I add a --target-dir argument pointing to HDFS, but when I use --hive-import it fails. Is there anything wrong with my XML?
I am actually submitting this through the Oozie REST API. The endpoint and request body are shown below:
http://8.8.8.8:11000/oozie/v1/jobs?jobtype=sqoop
Request body:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://quickstart.cloudera:8020</value>
    </property>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:8032</value>
    </property>
    <property>
        <name>user.name</name>
        <value>cloudera</value>
    </property>
    <property>
        <name>oozie.sqoop.command</name>
        <value>
import
--connect
jdbc:mysql://ip:3306/pro-data
--username
root
--table
data_source
--hive-home
/user/cloudera/warehouse/
-m
1
--incremental
append
--check-column
id
--hive-import
</value>
    </property>
    <property>
        <name>oozie.libpath</name>
        <value>hdfs://quickstart.cloudera:8020/user/oozie/share/lib/lib_20160715181153/sqoop</value>
    </property>
    <property>
        <name>hcat.metastore.uri</name>
        <value>thrift://127.0.0.1:9083</value>
    </property>
    <property>
        <name>oozie.use.system.libpath</name>
        <value>True</value>
    </property>
    <property>
        <name>oozie.proxysubmission</name>
        <value>True</value>
    </property>
</configuration>
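For reference, a minimal sketch of the proxy submission, assuming the configuration document above is saved as config.xml and the Oozie server is reachable at the endpoint shown earlier (the command is only echoed here, since it needs a live server):

```shell
# Proxy-submission of a Sqoop job via the Oozie REST API (v1), echoed rather than executed.
OOZIE_URL="http://8.8.8.8:11000/oozie/v1/jobs?jobtype=sqoop"
SUBMIT_CMD="curl -s -X POST -H 'Content-Type: application/xml;charset=UTF-8' --data-binary @config.xml $OOZIE_URL"
echo "$SUBMIT_CMD"
```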
Oozie log:
2016-07-16 14:30:38,171 INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@:start:] Start action [0000016-160716103436859-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-07-16 14:30:38,199 INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@:start:] [***0000016-160716103436859-oozie-oozi-W@:start:***]Action status=DONE
2016-07-16 14:30:38,204 INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@:start:] [***0000016-160716103436859-oozie-oozi-W@:start:***]Action updated in DB!
2016-07-16 14:30:38,475 INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] Start action [0000016-160716103436859-oozie-oozi-W@sqoop1] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-07-16 14:31:17,880 INFO SqoopActionExecutor:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] checking action, hadoop job ID [job_1468690384910_0024] status [RUNNING]
2016-07-16 14:31:17,887 INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] [***0000016-160716103436859-oozie-oozi-W@sqoop1***]Action status=RUNNING
2016-07-16 14:31:17,887 INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] [***0000016-160716103436859-oozie-oozi-W@sqoop1***]Action updated in DB!
2016-07-16 14:34:40,286 INFO CallbackServlet:520 - SERVER[quickstart.cloudera] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] callback for action [0000016-160716103436859-oozie-oozi-W@sqoop1]
2016-07-16 14:34:42,001 INFO SqoopActionExecutor:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] checking action, hadoop job ID [job_1468690384910_0024] status [RUNNING]
2016-07-16 14:34:57,679 INFO CallbackServlet:520 - SERVER[quickstart.cloudera] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] callback for action [0000016-160716103436859-oozie-oozi-W@sqoop1]
2016-07-16 14:34:58,642 INFO SqoopActionExecutor:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] action completed, external ID [job_1468690384910_0024]
2016-07-16 14:34:58,663 WARN SqoopActionExecutor:523 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
2016-07-16 14:34:58,987 INFO ActionEndXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@sqoop1] ERROR is considered as FAILED for SLA
2016-07-16 14:34:59,299 INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@fail] Start action [0000016-160716103436859-oozie-oozi-W@fail] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-07-16 14:34:59,343 INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@fail] [***0000016-160716103436859-oozie-oozi-W@fail***]Action status=DONE
2016-07-16 14:34:59,349 INFO ActionStartXCommand:520 - SERVER[quickstart.cloudera] USER[cloudera] GROUP[-] TOKEN[] APP[oozie-sqoop] JOB[0000016-160716103436859-oozie-oozi-W] ACTION[0000016-160716103436859-oozie-oozi-W@fail] [***0000016-160716103436859-oozie-oozi-W@fail***]Action updated in DB!
YARN log:
mapreduce.tasktracker.http.threads=40
dfs.stream-buffer-size=4096
tfile.fs.output.buffer.size=262144
fs.permissions.umask-mode=022
dfs.client.datanode-restart.timeout=30
dfs.namenode.resource.du.reserved=104857600
yarn.resourcemanager.am.max-attempts=2
yarn.nodemanager.resource.percentage-physical-cpu-limit=100
ha.failover-controller.graceful-fence.connection.retries=1
mapreduce.job.speculative.speculative-cap-running-tasks=0.1
dfs.datanode.drop.cache.behind.writes=false
hadoop.common.configuration.version=0.23.0
mapreduce.job.ubertask.enable=false
yarn.app.mapreduce.am.resource.cpu-vcores=1
dfs.namenode.replication.work.multiplier.per.iteration=2
mapreduce.job.acl-modify-job=
io.seqfile.local.dir=${hadoop.tmp.dir}/io/local
yarn.resourcemanager.system-metrics-publisher.enabled=false
fs.s3.sleepTimeSeconds=10
mapreduce.client.output.filter=FAILED
------------------------
Sqoop command arguments :
import
--connect
jdbc:mysql://172.16.1.18:3306/pro-data
--username
root
--table
data_source
--hive-home
/user/cloudera/warehouse/
-m
1
--incremental
append
--check-column
id
--hive-import
Fetching child yarn jobs
tag id : oozie-a68d0f5f197314a14720c8ff3935b1dc
Child yarn jobs are found -
=================================================================
>>> Invoking Sqoop command line now >>>
42238 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
42453 [uber-SubtaskRunner] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6-cdh5.5.0
42572 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.BaseSqoopTool - Using Hive-specific delimiters for output. You can override
42572 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.BaseSqoopTool - delimiters with --fields-terminated-by, etc.
42685 [uber-SubtaskRunner] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
43432 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.MySQLManager - Preparing to use a MySQL streaming resultset.
43491 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
45931 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `data_source` AS t LIMIT 1
46198 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `data_source` AS t LIMIT 1
46219 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
62817 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-yarn/compile/78cb8ad53d1f0fe6f62c936c7688a4b8/data_source.jar
62926 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Maximal id query for free form incremental import: SELECT MAX(`id`) FROM `data_source`
62937 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Incremental import based on column `id`
62937 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Upper bound value: 45
62937 [uber-SubtaskRunner] WARN org.apache.sqoop.manager.MySQLManager - It looks like you are importing from mysql.
62937 [uber-SubtaskRunner] WARN org.apache.sqoop.manager.MySQLManager - This transfer can be faster! Use the --direct
62937 [uber-SubtaskRunner] WARN org.apache.sqoop.manager.MySQLManager - option to exercise a MySQL-specific fast path.
62937 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.MySQLManager - Setting zero DATETIME behavior to convertToNull (mysql)
62979 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Beginning import of data_source
63246 [uber-SubtaskRunner] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
65748 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.DBInputFormat - Using read commited transaction isolation
Heart beat
Heart beat
Heart beat
148412 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Transferred 754 bytes in 85.1475 seconds (8.8552 bytes/sec)
148429 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Retrieved 9 records.
148464 [uber-SubtaskRunner] INFO org.apache.sqoop.util.AppendUtils - Appending to directory data_source
148520 [uber-SubtaskRunner] INFO org.apache.sqoop.util.AppendUtils - Using found partition 2
148685 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `data_source` AS t LIMIT 1
148741 [uber-SubtaskRunner] WARN org.apache.sqoop.hive.TableDefWriter - Column created_date had to be cast to a less precise type in Hive
148741 [uber-SubtaskRunner] WARN org.apache.sqoop.hive.TableDefWriter - Column updated_date had to be cast to a less precise type in Hive
148743 [uber-SubtaskRunner] INFO org.apache.sqoop.hive.HiveImport - Loading uploaded data into Hive
Heart beat
Intercepting System.exit(1)
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
Oozie Launcher failed, finishing Hadoop job gracefully
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://quickstart.cloudera:8020/user/cloudera/oozie-oozi/0000009-160719121646145-oozie-oozi-W/sqoop1--sqoop/action-data.seq
Oozie Launcher ends