在sqoop中处理零值

4sup72z8  于 2021-06-03  发布在  Sqoop
关注(0)|答案(2)|浏览(311)

我创建了一个测试表,其中有一个布尔字段,名为“inactive”。
在sqlserver非活动字段中,值可以为零 0 或者一个 1 . 但每当我将数据从sqlserver打包到配置单元时,零 0 替换为 NULL 在Hive里。
但我需要保持 0 既然如此,那么如何处理sqoop中的零值呢?

CREATE EXTERNAL TABLE IF NOT EXISTS test (
     IndividualUid string,
     First string,
     Middle string,
     Last string,
     Inactive boolean
)row format delimited fields terminated by ',' location 'Individual';

谢谢您。

koaltpgm

koaltpgm1#

不知道你的案子到底出了什么问题。你能试试下面吗。

mysql> create table tablewithboolean (id int, name varchar(10), status boolean);
Query OK, 0 rows affected (0.01 sec)

mysql> desc tablewithboolean;
+--------+-------------+------+-----+---------+-------+
| Field  | Type        | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| id     | int(11)     | YES  |     | NULL    |       |
| name   | varchar(10) | YES  |     | NULL    |       |
| status | tinyint(1)  | YES  |     | NULL    |       |
+--------+-------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

mysql> insert into tablewithboolean values(1, 'One', false);
Query OK, 1 row affected (0.02 sec)

mysql> insert into tablewithboolean values(1, 'Two', true);
Query OK, 1 row affected (0.00 sec)

mysql> select * from tablewithboolean;
+------+------+--------+
| id   | name | status |
+------+------+--------+
|    1 | One  |      0 |
|    1 | Two  |      1 |
+------+------+--------+
2 rows in set (0.00 sec)

在配置单元中创建外部表

hive> use hadoopexam;

create external table hivetablewithboolean (id int, name varchar(10), status boolean)
row format delimited fields terminated by ',' 
location '/user/hive/warehouse/hivetablewithboolean';

hive> create external table hivetablewithboolean (id int, name varchar(10), status boolean)
    > row format delimited fields terminated by ',' 
    > location '/user/hive/warehouse/hivetablewithboolean';

hive> describe hivetablewithboolean;
OK
id                      int                                         
name                    varchar(10)                                 
status                  boolean                                     
Time taken: 0.266 seconds, Fetched: 3 row(s)

将数据从mysql导入hive。

sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username retail_dba --password cloudera \
--query 'select id, name, status from tablewithboolean where $CONDITIONS' \
--hive-table hadoopexam.hivetablewithboolean \
--target-dir /user/hive/warehouse/hivetablewithboolean \
--split-by id

检查目标目录。

[root@quickstart spark]# hadoop fs -ls -R /user/hive/warehouse/hivetablewithboolean/
-rw-r--r--   1 root supergroup          0 2018-02-28 15:56 /user/hive/warehouse/hivetablewithboolean/_SUCCESS
-rw-r--r--   1 root supergroup         23 2018-02-28 15:56 /user/hive/warehouse/hivetablewithboolean/part-m-00000
[root@quickstart spark]# hadoop fs -cat /user/hive/warehouse/hivetablewithboolean/part-m-00000
1,One,false
1,Two,true

检查配置单元表的内容

hive> select * from hivetablewithboolean;
OK
1   One false
1   Two true
Time taken: 0.147 seconds, Fetched: 2 row(s
ffx8fchx

ffx8fchx2#

也许您可以在import语句中尝试如下操作:
--query“select id,name,cast(status as decimal)new from tablewithboolean where$conditions”\

相关问题