Unable to load data into a Hive table in the correct format

Posted by nnsrf1az on 2021-06-01 in Hadoop

I am trying to load the data below into a Hive table that has two array-type columns.
Base table:

Array<int> col1   Array<string> col2
[1,2]             ['a','b','c']
[3,4]             ['d','e','f']

I created the table in Hive as follows:

create table base(col1 array<int>,col2 array<string>) row format delimited fields terminated by '\t' collection items terminated by ',';

Then I loaded the data as follows:

load data local inpath '/home/hduser/Desktop/batch/hiveip/basetable' into table base;

I then ran the following query:

select * from base;

This is the result I got:

[null,null]     ["['a'","'b'","'c']"]
[null,null]     ["['d'","'e'","'f']"]

The data does not come back in the correct format.
Please help me find out what I am doing wrong.

Answer #1 (ergxz8rk)

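You can change the data type of col1 to an array of strings instead of an array of ints; you will then be able to see the col1 data.
With col1 declared as array<string>: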

hive> create table base(col1 array<string>,col2 array<string>) row format delimited fields terminated by '\t' collection items terminated by ',';
hive> select * from base;
+--------------+------------------------+--+
|     col1     |          col2          |
+--------------+------------------------+--+
| ["[1","2]"]  | ["['a'","'b'","'c']"]  |
| ["[3","4]"]  | ["['d'","'e'","'f']"]  |
+--------------+------------------------+--+

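The reason for this behavior is that, with collection items terminated by ',', Hive splits the text [1,2] into the items [1 and 2]. Neither item is a valid integer because of the enclosing [] brackets, so every element of an array<int> column comes back as null. Stored as strings, the col1 elements show the stray brackets: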

hive> select col1[0],col1[1] from base;
    +------+------+--+
    | _c0  | _c1  |
    +------+------+--+
    | [1   | 2]   |
    | [3   | 4]   |
    +------+------+--+
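
The splitting described above can be imitated outside Hive. The following is a minimal Python sketch, not Hive's actual implementation: the helper names parse_int_array and parse_string_array are hypothetical, and the sketch only assumes that a field is cut on the collection delimiter and that an item which fails integer conversion becomes NULL (shown here as None).

```python
def parse_string_array(field, item_sep=","):
    """Split a text field on the collection delimiter, keeping items verbatim."""
    return field.split(item_sep)


def parse_int_array(field, item_sep=","):
    """Split a text field and convert each item to int.

    An item that is not a valid integer becomes None, standing in for
    the NULL that the query in the question returned.
    """
    items = []
    for item in field.split(item_sep):
        try:
            items.append(int(item))
        except ValueError:
            items.append(None)
    return items


if __name__ == "__main__":
    print(parse_int_array("[1,2]"))             # brackets make both items invalid
    print(parse_int_array("1,2"))               # plain digits convert cleanly
    print(parse_string_array("['a','b','c']"))  # brackets and quotes stay in the items
```

The first call reproduces the [null,null] pattern from the question, and the last one shows why col2 keeps its brackets and quotes inside the string elements.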

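(or)
With col1 declared as array<int>:
If you do not want to change the data type, the input file needs to look like the following, with no [] square brackets around the array values of col1.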

1,2     ['a','b','c']
3,4     ['d','e','f']
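
One possible way to produce such a file from the original one is a small preprocessing script. This is a sketch under assumptions: the function names strip_brackets and convert_file are hypothetical, the input is assumed to be tab-separated with exactly two columns, and the clean_strings option (which also removes the brackets and single quotes from col2) goes beyond what this answer does.

```python
def strip_brackets(line, field_sep="\t", clean_strings=False):
    """Remove the enclosing [] from col1 of one tab-separated line.

    With clean_strings=True, also drop the brackets and single quotes
    from col2 (an optional extra, not part of the answer above).
    """
    col1, col2 = line.rstrip("\n").split(field_sep, 1)
    col1 = col1.strip().strip("[]")
    if clean_strings:
        col2 = col2.strip().strip("[]").replace("'", "")
    return field_sep.join([col1, col2])


def convert_file(src_path, dst_path, **options):
    """Rewrite every non-empty line of src_path into dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(strip_brackets(line, **options) + "\n")


# Example call (both paths are placeholders):
# convert_file("basetable", "basetable_clean")
```

The converted file can then be loaded with the same load data statement used in the question, pointing at the new path.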

Then create the same table as in the question. Hive can now read the leading 1,2 as array elements of type int.

hive> create table base(col1 array<int>,col2 array<string>) row format delimited fields terminated by '\t' collection items terminated by ',';

hive> select * from base;
+--------+------------------------+--+
|  col1  |          col2          |
+--------+------------------------+--+
| [1,2]  | ["['a'","'b'","'c']"]  |
| [3,4]  | ["['d'","'e'","'f']"]  |
+--------+------------------------+--+

Accessing array elements:

hive> select col1[0] from base;
    +------+--+
    | _c0  |
    +------+--+
    | 1    |
    | 3    |
    +------+--+
