配置单元中多个字段上的相同联接

t3psigkw  于 2021-06-24  发布在  Hive
关注(0)|答案(1)|浏览(269)

我在hive中有一个元数据表cust,如下所示

Cust_dec       Cust_det
buy            Interested
buy            Cheap
no_buy         Found Cheaper
no_buy         No Interest
no_buy         Other Faults

还有另一个表reg\u cst\u dtls需要与上面的元数据表多次连接,并派生多个字段,如下所示。

item_id     ca_brd_dec     ne_brd_dec     co_brd_dec     ca_dtl             ne_dtl              co_dtl
1012        buy            no_buy         no_buy         Interested         Found Cheaper       Other Faults
5278        buy            buy                           Found Cheaper      
1572        no_buy         buy            buy            No Interest        Cheap               Cheap
6896        no_buy         no_buy         no_buy         Other Faults       Cheap               Found Cheaper

现在,对于每个item\u id reg\u cst\u dtls,我需要查看ca\u brd\u dec是否匹配cust\u dec,然后ca\u dtl也应该匹配cust\u det,并且新字段ca\u resp应该等于ca\u dtl else null。类似地,如果ne\u brd\u dec匹配cust\u dec,则ne\u dtl也应匹配cust\u det,且ne\u resp应等于ne\u dtl else null,如果co\u brd\u dec匹配cust\u dec,则co\u dtl也应匹配cust\u det,且co\u resp应等于co\u dtl else null。预期结果如下。

item_id     ca_brd_dec     ne_brd_dec     co_brd_dec     ca_dtl             ne_dtl              co_dtl              ca_resp         ne_resp             co_resp
1012        buy            no_buy         no_buy         Interested         Found Cheaper       Other Faults        Interested      Found Cheaper       Other Faults
5278        buy            buy                           Found Cheaper      
1572        no_buy         buy            buy            No Interest        Cheap               Cheap               No Interest
6896        no_buy         no_buy         no_buy         Other Faults       Cheap               Found Cheaper       Other Faults    cheap               Found Cheaper

有人能帮助我们如何在Hive中实现这一点吗?
谢谢。。。!

rkkpypqq

rkkpypqq1#

您可以使用以下配置单元查询。如果您的数据不区分大小写,请尝试使用upper或lower函数。hive-udf方法在这里是可行的。

SELECT item_id,
   ca_brd_dec,
   ne_brd_dec,
   co_brd_dec,
   ca_dtl,
   ne_dtl,
   co_dtl,
   CASE
       WHEN ca_j.cust_loopup IS NULL THEN NULL
       ELSE ca_dtl
   END AS ca_resp,
   CASE
       WHEN ne_j.cust_loopup IS NULL THEN NULL
       ELSE ne_dtl
   END AS ne_resp,
   CASE
       WHEN co_j.cust_loopup IS NULL THEN NULL
       ELSE co_dtl
   END AS co_resp
FROM
  (SELECT item_id,
          ca_brd_dec,
          ne_brd_dec,
          co_brd_dec,
          ca_dtl,
          ne_dtl,
          co_dtl,
          concat(ca_brd_dec, ca_dtl) AS ca,
          concat(ne_brd_dec, ne_dtl) AS ne,
          concat(co_brd_dec, co_dtl) AS co
   FROM reg_cst_dtls reg
   LEFT OUTER JOIN
     (SELECT concat(cust_dec, cust_det) AS cust_loopup
      FROM cust) ca_j ON reg.ca = ca_j.cust_loopup
   LEFT OUTER JOIN
     (SELECT concat(cust_dec, cust_det) AS cust_loopup
      FROM cust) ne_j ON reg.ne = ne_j.cust_loopup
   LEFT OUTER JOIN
     (SELECT concat(cust_dec, cust_det) AS cust_loopup
      FROM cust) co_j ON reg.co = co_j.cust_loopup) sub

相关问题