IllegalArgumentException and "Illegal partition for 'val'" in Sqoop

Asked by gv8xihay on 2021-06-03 in Sqoop

I have asked this question over and over, and I am starting to believe I am missing something really basic. Not many people seem to have run into this, and I am genuinely stuck:
The error appears when I specify the --merge-key argument with an incremental lastmodified import in Sqoop. The job works fine when I run it from the command line, but not when I submit it through Oozie (I submit my jobs via Oozie, from Hue). I am not sure whether Oozie or Hue is at fault, but the Sqoop job itself is not the problem, because it runs correctly, including the merge step, when executed from the command line.
My Sqoop job is defined as follows:

sqoop job --meta-connect jdbc:hsqldb:hsql://FQDN:16000/sqoop \
    --create test_table \
    -- import \
    --driver com.mysql.jdbc.Driver \
    --connect jdbc:mysql://IP/DB?zeroDateTimeBehavior=convertToNull \
    --username USER_NAME --password 'PASSWORD' \
    --table test_table --merge-key id --split-by id \
    --target-dir LOCATION \
    --incremental lastmodified --last-value 0 --check-column updated_at
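For reference, a saved job like the one above would normally be inspected and executed against the same metastore with the standard sqoop-job subcommands (FQDN and the job name here are the placeholders from the definition above):

```shell
# List saved jobs in the shared metastore, show this job's definition,
# and execute it (the incremental state is saved back to the metastore).
sqoop job --meta-connect jdbc:hsqldb:hsql://FQDN:16000/sqoop --list
sqoop job --meta-connect jdbc:hsqldb:hsql://FQDN:16000/sqoop --show test_table
sqoop job --meta-connect jdbc:hsqldb:hsql://FQDN:16000/sqoop --exec test_table
```

These commands require a running Sqoop metastore and cluster, so they are shown as a command fragment only.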

The first import runs fine. When the second import starts, I get the errors below.
I created a small test table (an int, a datetime, and a varchar) to test with, and there are no NULLs or invalid characters in the data, but I hit the same problem:


# id, updated_at, name

'1', '2016-07-02 17:16:53', 'l'
'3', '2016-06-29 14:12:53', 'f'

There are only 2 rows of data, but I get:

Error: java.lang.IllegalArgumentException
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
    at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:51)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1848)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1508)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:723)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Error: java.io.IOException: Illegal partition for 3 (-2)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1083)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:82)
    at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
    at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Error: java.lang.IllegalArgumentException
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
    at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:51)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1848)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1508)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:723)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Error: java.io.IOException: Illegal partition for 1 (-2)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1083)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:82)
    at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
    at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
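For what it's worth, the "(-2)" in the message is the partition number that the merge job's partitioner computed for the key: MapOutputBuffer.collect (visible at the top of the trace) rejects any partition outside the range [0, numReduceTasks). A minimal sketch of that check, purely for illustration (not actual Sqoop or Hadoop code):

```shell
# Model of the legality check in MapTask$MapOutputBuffer.collect():
# a partition is legal only if 0 <= partition < numReduceTasks.
is_legal_partition() {
  local partition=$1 num_reduce_tasks=$2
  [ "$partition" -ge 0 ] && [ "$partition" -lt "$num_reduce_tasks" ]
}

# "Illegal partition for 3 (-2)" means the partitioner returned -2 for key 3:
if ! is_legal_partition -2 1; then
  echo "Illegal partition for 3 (-2)"
fi
```

So the real question is why the merge job's partitioner produces a negative value only when launched from Oozie.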

I only get this error under Oozie, with the job submitted through Hue; when I run the Sqoop job from the command line, everything works, including the merge MapReduce step.
Taken from the Oozie launcher, this is what my MapReduce job log looks like.
Note the "Merge MapReduce job failed!" error near the bottom:
>>> Invoking Sqoop command line now >>>

5373 [uber-SubtaskRunner] WARN  org.apache.sqoop.tool.SqoopTool  - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
    5407 [uber-SubtaskRunner] INFO  org.apache.sqoop.Sqoop  - Running Sqoop version: 1.4.6-cdh5.7.0
    5702 [uber-SubtaskRunner] WARN  org.apache.sqoop.tool.BaseSqoopTool  - Setting your password on the command-line is insecure. Consider using -P instead.
    5715 [uber-SubtaskRunner] WARN  org.apache.sqoop.ConnFactory  - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
    5740 [uber-SubtaskRunner] WARN  org.apache.sqoop.ConnFactory  - Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
    5754 [uber-SubtaskRunner] INFO  org.apache.sqoop.manager.SqlManager  - Using default fetchSize of 1000
    5754 [uber-SubtaskRunner] INFO  org.apache.sqoop.tool.CodeGenTool  - Beginning code generation
    6091 [uber-SubtaskRunner] INFO  org.apache.sqoop.manager.SqlManager  - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
    6098 [uber-SubtaskRunner] INFO  org.apache.sqoop.manager.SqlManager  - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
    6118 [uber-SubtaskRunner] INFO  org.apache.sqoop.orm.CompilationManager  - HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p0.45/lib/hadoop-mapreduce
    8173 [uber-SubtaskRunner] INFO  org.apache.sqoop.orm.CompilationManager  - Writing jar file: /tmp/sqoop-yarn/compile/454902ac78d49b783a1f51b7bfe0a2be/test_table.jar
    8185 [uber-SubtaskRunner] INFO  org.apache.sqoop.manager.SqlManager  - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
    8192 [uber-SubtaskRunner] INFO  org.apache.sqoop.tool.ImportTool  - Incremental import based on column updated_at
    8192 [uber-SubtaskRunner] INFO  org.apache.sqoop.tool.ImportTool  - Lower bound value: '2016-07-02 17:13:24.0'
    8192 [uber-SubtaskRunner] INFO  org.apache.sqoop.tool.ImportTool  - Upper bound value: '2016-07-02 17:16:56.0'
    8194 [uber-SubtaskRunner] INFO  org.apache.sqoop.mapreduce.ImportJobBase  - Beginning import of test_table
    8214 [uber-SubtaskRunner] INFO  org.apache.sqoop.manager.SqlManager  - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
    8230 [uber-SubtaskRunner] WARN  org.apache.sqoop.mapreduce.JobBase  - SQOOP_HOME is unset. May not be able to find all job dependencies.
    8716 [uber-SubtaskRunner] INFO  org.apache.sqoop.mapreduce.db.DBInputFormat  - Using read commited transaction isolation
    8717 [uber-SubtaskRunner] INFO  org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat  - BoundingValsQuery: SELECT MIN(id), MAX(id) FROM test_table WHERE ( updated_at >= '2016-07-02 17:13:24.0' AND updated_at < '2016-07-02 17:16:56.0' )
    8721 [uber-SubtaskRunner] INFO  org.apache.sqoop.mapreduce.db.IntegerSplitter  - Split size: 0; Num splits: 4 from: 1 to: 1
    25461 [uber-SubtaskRunner] INFO  org.apache.sqoop.mapreduce.ImportJobBase  - Transferred 26 bytes in 17.2192 seconds (1.5099 bytes/sec)
    25471 [uber-SubtaskRunner] INFO  org.apache.sqoop.mapreduce.ImportJobBase  - Retrieved 1 records.
    25536 [uber-SubtaskRunner] WARN  org.apache.sqoop.mapreduce.ExportJobBase  - IOException checking input file header: java.io.EOFException
    25550 [uber-SubtaskRunner] WARN  org.apache.sqoop.mapreduce.JobBase  - SQOOP_HOME is unset. May not be able to find all job dependencies.
    Heart beat
    Heart beat
    70628 [uber-SubtaskRunner] ERROR org.apache.sqoop.tool.ImportTool  - Merge MapReduce job failed!
    70628 [uber-SubtaskRunner] INFO  org.apache.sqoop.tool.ImportTool  - Saving incremental import state to the metastore
    70831 [uber-SubtaskRunner] INFO  org.apache.sqoop.tool.ImportTool  - Updated data for job: test_table
Answer from 5anewei6:

Here is a workaround that solved the problem for me: http://www.yourtechchick.com/sqoop/illegal-partition-exception-sqoop-incremental-imports/
