配置单元表数组列-使用数组索引分解

dm7nw8vv  于 2021-05-29  发布在  Hadoop
关注(0)|答案(3)|浏览(824)

嗨,我有一张Hive桌

select a,b,c,d from riskfactor_table 
In the above table B, C and D columns are array columns. Below is my Hive DDL 
Create external table riskfactor_table 
(a string, 
b array<string>, 
c array<double>, 
d array<double> ) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY '~'  
stored as textfile location 'user/riskfactor/data';

以下是我的表格数据:
id400s,[“jms”,“jndi”,“jaxb”,“jaxn”],[100200300400],[1,2,3,4]
id200n,[“一”,“二”,“三”],[212352418],[6,10,8]
如果要拆分数组列,如何拆分?
如果我使用explode函数,我只能为一列拆分数组值
从riskfactor\表中选择explode(b)作为b;
输出:

jms  
jndi  
jaxb  
jxn  
one  
two  
three

但我想用下面的一个select语句填充所有列-
查询-从风险系数中选择a、b、c、d;
输出:

row1-  ID400S    jms    100    1  
row2-  ID400S    jndi   200    2  
row3-  ID400S    jaxb    300    3  
row4-  ID400S    jaxn    400    4

如何填充所有数据?

e1xvtsh3

e1xvtsh31#

我也在寻找同样问题的答案。谢谢杰罗姆,为这个砖房解决方案。
我不得不做一个小小的改变(将别名“n1添加为n”)如下,以使其适用于我的案例:

hive> describe test;
OK
id              string
animals     array<string>
cnt         array<bigint>

hive> select * from test;
OK
abc     ["cat","dog","elephant","dolphin","snake","parrot","ant","frog","kuala","cricket"]      [10597,2027,1891,1868,1804,1511,1496,1432,1305,1299]

hive> select `id`, array_index(`animals`,n), array_index(`cnt`,n) from test lateral view numeric_range(0,10) n1 as n;
OK
abc     cat             10597
abc     dog             2027
abc     elephant        1891
abc     dolphin         1868
abc     snake           1804
abc     parrot          1511
abc     ant             1496
abc     frog            1432
abc     kuala           1305
abc     cricket         1299

唯一的事情是我必须事先知道有10个元素要爆炸。

sigwle7e

sigwle7e2#

您可以使用侧视图来实现这一点

SELECT Mycoulmna, Mycoulmnb ,Mycoulmnc
                 FROM  riskfactor_table
             LATERAL VIEW explode(a) myTablea AS Mycoulmna
             LATERAL VIEW explode(a) myTableb AS Mycoulmnb
             LATERAL VIEW explode(a) myTablec AS Mycoulmnc ;

想知道更多细节,就扔吧。

yyhrrdl8

yyhrrdl83#

使用brickhouse的“numeric\u range”自定义项。下面是一篇描述细节的博客文章。
https://brickhouseconfessions.wordpress.com/2013/03/07/exploding-multiple-arrays-at-the-same-time-with-numeric_range/
在您的情况下,您的查询将类似于

SELECT a, 
       array_index( b, i ),
       array_index( c, i ),
       array_index( d, i )
FROM risk_factor_table
 LATERAL VIEW numeric_range( 0, 3 );

相关问题