如何使用sqoop将json数据从hdfs插入mysql?

q8l4jmvw  于 2021-06-02  发布在  Hadoop
关注(0)|答案(1)|浏览(333)

我已经将json数据加载到我的hdfs中,我在mysql数据库中创建了包含必需列的表,如下所示。
如何使用行格式化程序创建表以接受json?
我的hdfs数据

{
"Employees" : [
{
"userId":"rirani",
"jobTitleName":"Developer",
"firstName":"Romin",
"lastName":"Irani",
"preferredFullName":"Romin Irani",
"employeeCode":"E1",
"region":"CA",
"phoneNumber":"408-1234567",
"emailAddress":"romin.k.irani@gmail.com"
},
{
"userId":"nirani",
"jobTitleName":"Developer",
"firstName":"Neil",
"lastName":"Irani",
"preferredFullName":"Neil Irani",
"employeeCode":"E2",
"region":"CA",
"phoneNumber":"408-1111111",
"emailAddress":"neilrirani@gmail.com"
},
{
"userId":"thanks",
"jobTitleName":"Program Directory",
"firstName":"Tom",
"lastName":"Hanks",
"preferredFullName":"Tom Hanks",
"employeeCode":"E3",
"region":"CA",
"phoneNumber":"408-2222222",
"emailAddress":"tomhanks@gmail.com"
}
]
}

我的sql表结构

mysql> create table employee(userid int,jobTitleName varchar(20),firstName varchar(20),lastName varchar(20),preferrredFullName varchar(20),employeeCode varchar(20),region varchar(20),phoneNumber varchar(20), emailAddress varchar(20),modifiedDate timestamp  DEFAULT CURRENT_TIMESTAMP);
mysql> desc employee;
+--------------------+-------------+------+-----+-------------------+-------+
| Field              | Type        | Null | Key | Default           | Extra |
+--------------------+-------------+------+-----+-------------------+-------+
| userid             | int(11)     | YES  |     | NULL              |       |
| jobTitleName       | varchar(20) | YES  |     | NULL              |       |
| firstName          | varchar(20) | YES  |     | NULL              |       |
| lastName           | varchar(20) | YES  |     | NULL              |       |
| preferrredFullName | varchar(20) | YES  |     | NULL              |       |
| employeeCode       | varchar(20) | YES  |     | NULL              |       |
| region             | varchar(20) | YES  |     | NULL              |       |
| phoneNumber        | varchar(20) | YES  |     | NULL              |       |
| emailAddress       | varchar(20) | YES  |     | NULL              |       |
| modifiedDate       | timestamp   | NO   |     | CURRENT_TIMESTAMP |       |
+--------------------+-------------+------+-----+-------------------+-------+
10 rows in set (0.00 sec)

我尝试使用sqoop export将上表的数据从hdfs加载到mysql,如下所示

sqoop export --connect jdbc:mysql://localhost/emp_scheme --username root --password adithyan --table employee --export-dir /user/adithyan/filesystem/employee.txt

它以以下例外结束

17/02/18 19:35:35 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
17/02/18 19:35:35 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/02/18 19:35:35 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/02/18 19:35:35 INFO tool.CodeGenTool: Beginning code generation
17/02/18 19:35:36 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `employee` AS t LIMIT 1
17/02/18 19:35:36 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `employee` AS t LIMIT 1
17/02/18 19:35:36 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/adithyan/hadoop_dir/hadoop-1.2.1
Note: /tmp/sqoop-adithyan/compile/35afadf151a1dd1626a3658577cbc2dd/employee.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/02/18 19:35:41 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-adithyan/compile/35afadf151a1dd1626a3658577cbc2dd/employee.jar
17/02/18 19:35:41 INFO mapreduce.ExportJobBase: Beginning export of employee
17/02/18 19:35:45 INFO input.FileInputFormat: Total input paths to process : 1
17/02/18 19:35:45 INFO input.FileInputFormat: Total input paths to process : 1
17/02/18 19:35:45 INFO util.NativeCodeLoader: Loaded the native-hadoop library
17/02/18 19:35:45 WARN snappy.LoadSnappy: Snappy native library not loaded
17/02/18 19:35:46 INFO mapred.JobClient: Running job: job_201702181051_0002
17/02/18 19:35:47 INFO mapred.JobClient:  map 0% reduce 0%
17/02/18 19:36:17 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000000_0, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '"firstName":"Tom"'
    at employee.__loadFromFields(employee.java:596)
    at employee.parse(employee.java:499)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
    ... 10 more
Caused by: java.lang.NumberFormatException: For input string: ""firstName":"Tom""
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:569)
    at java.lang.Integer.valueOf(Integer.java:766)
    at employee.__loadFromFields(employee.java:548)
    ... 12 more

17/02/18 19:36:18 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000001_0, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '{'
    at employee.__loadFromFields(employee.java:596)
    at employee.parse(employee.java:499)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
    ... 10 more
Caused by: java.lang.NumberFormatException: For input string: "{"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Integer.valueOf(Integer.java:766)
    at employee.__loadFromFields(employee.java:548)
    ... 12 more

17/02/18 19:36:29 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000000_1, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '"firstName":"Tom"'
    at employee.__loadFromFields(employee.java:596)
    at employee.parse(employee.java:499)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
    ... 10 more
Caused by: java.lang.NumberFormatException: For input string: ""firstName":"Tom""
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:569)
    at java.lang.Integer.valueOf(Integer.java:766)
    at employee.__loadFromFields(employee.java:548)
    ... 12 more

17/02/18 19:36:29 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000001_1, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '{'
    at employee.__loadFromFields(employee.java:596)
    at employee.parse(employee.java:499)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
    ... 10 more
Caused by: java.lang.NumberFormatException: For input string: "{"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Integer.valueOf(Integer.java:766)
    at employee.__loadFromFields(employee.java:548)
    ... 12 more

17/02/18 19:36:42 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000000_2, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '"firstName":"Tom"'
    at employee.__loadFromFields(employee.java:596)
    at employee.parse(employee.java:499)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
    ... 10 more
Caused by: java.lang.NumberFormatException: For input string: ""firstName":"Tom""
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:569)
    at java.lang.Integer.valueOf(Integer.java:766)
    at employee.__loadFromFields(employee.java:548)
    ... 12 more

17/02/18 19:36:42 INFO mapred.JobClient: Task Id : attempt_201702181051_0002_m_000001_2, Status : FAILED
java.io.IOException: Can't export data, please check failed map task logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Can't parse input data: '{'
    at employee.__loadFromFields(employee.java:596)
    at employee.parse(employee.java:499)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
    ... 10 more
Caused by: java.lang.NumberFormatException: For input string: "{"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Integer.valueOf(Integer.java:766)
    at employee.__loadFromFields(employee.java:548)

有人能帮我吗?

wsewodh2

wsewodh21#

你可能需要考虑多种选择。。json_set/replace/insert-这些选项可能还不受sqoop的直接支持。
另一种选择是使用pig预处理数据,然后在sqooping到rdbms之前将数据放在hdfs中。

相关问题