I have a metadata table, cust, in Hive, as shown below.
Cust_dec | Cust_det
buy      | Interested
buy      | Cheap
no_buy   | Found Cheaper
no_buy   | No Interest
no_buy   | Other Faults
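There is another table, reg_cst_dtls, that needs to be joined with the metadata table above several times to derive multiple fields. It looks like this.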
item_id ca_brd_dec ne_brd_dec co_brd_dec ca_dtl ne_dtl co_dtl
1012 buy no_buy no_buy Interested Found Cheaper Other Faults
5278 buy buy Found Cheaper
1572 no_buy buy buy No Interest Cheap Cheap
6896 no_buy no_buy no_buy Other Faults Cheap Found Cheaper
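Now, for each item_id in reg_cst_dtls, I need to check whether ca_brd_dec matches Cust_dec and ca_dtl matches Cust_det in the same cust row. If both match, the new field ca_resp should equal ca_dtl; otherwise it should be null. Similarly, if ne_brd_dec matches Cust_dec and ne_dtl matches Cust_det, ne_resp should equal ne_dtl, else null. And if co_brd_dec matches Cust_dec and co_dtl matches Cust_det, co_resp should equal co_dtl, else null. The expected result is shown below.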
item_id ca_brd_dec ne_brd_dec co_brd_dec ca_dtl ne_dtl co_dtl ca_resp ne_resp co_resp
1012 buy no_buy no_buy Interested Found Cheaper Other Faults Interested Found Cheaper Other Faults
5278 buy buy Found Cheaper
1572 no_buy buy buy No Interest Cheap Cheap No Interest
6896 no_buy no_buy no_buy Other Faults Cheap Found Cheaper Other Faults cheap Found Cheaper
Can anyone help with how to achieve this in Hive?
Thanks!
1 Answer
rkkpypqq #1
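You can use the following Hive query. If your data should be compared case-insensitively, try wrapping the compared columns in the upper or lower function. A Hive UDF approach would also work here.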
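The answer's query is not included in this copy of the thread. As a minimal sketch of one way to do it, the metadata table can be left-joined three times, once per decision/detail pair. The table and column names come from the question; the aliases r, ca, ne and co are illustrative, and the sketch assumes each (Cust_dec, Cust_det) pair appears only once in cust, so the joins do not duplicate rows.

```sql
-- Sketch only, not the answerer's original query.
-- Assumes (Cust_dec, Cust_det) pairs in cust are unique.
SELECT r.item_id,
       r.ca_brd_dec, r.ne_brd_dec, r.co_brd_dec,
       r.ca_dtl, r.ne_dtl, r.co_dtl,
       -- A non-null Cust_dec on the joined side means both columns matched.
       CASE WHEN ca.Cust_dec IS NOT NULL THEN r.ca_dtl END AS ca_resp,
       CASE WHEN ne.Cust_dec IS NOT NULL THEN r.ne_dtl END AS ne_resp,
       CASE WHEN co.Cust_dec IS NOT NULL THEN r.co_dtl END AS co_resp
FROM reg_cst_dtls r
LEFT OUTER JOIN cust ca
  ON r.ca_brd_dec = ca.Cust_dec AND r.ca_dtl = ca.Cust_det
LEFT OUTER JOIN cust ne
  ON r.ne_brd_dec = ne.Cust_dec AND r.ne_dtl = ne.Cust_det
LEFT OUTER JOIN cust co
  ON r.co_brd_dec = co.Cust_dec AND r.co_dtl = co.Cust_det;
```

A CASE expression without an ELSE branch yields NULL when no row of cust matched, which gives the "else null" behaviour described in the question. To compare case-insensitively, both sides of each join condition could be wrapped in lower(), for example lower(r.ca_dtl) = lower(ca.Cust_det).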