配置单元:计算分组项的最大和时出错

myzjeezk  于 2021-06-26  发布在  Hive
关注(0)|答案(1)|浏览(405)

我想运行一个查询,计算每个信用卡每月的最高消费额。对于每一张信用卡,我需要计算每个月的消费金额。我有一张包含信用卡交易的表 credit_transact :

processdate timestamp   ""
cardno_hash string  ""
amount  int ""
year    int ""
month   int ""

组成样本数据:

card    year    month    amount
a123    2016    12       23160
a123    2016    10       287
c123    2016    11       5503
c123    2016    11       4206

我想:

card    year    month    amount
a123    2016    12       23160
c123    2016    11       9709

重要的一点是年和月是分区列。
我尝试了如下子查询:

USE credit_card_db;
SELECT sum_amount_transact.cardno_hash, sum_amount_transact.year, sum_amount_transact.month, MAX(sum_amount_transact.sum_amount)
FROM
(
  SELECT cardno_hash, year, month, SUM(amount) AS sum_amount FROM credit_transact
  GROUP BY cardno_hash, year, month
) AS sum_amount_transact
GROUP BY sum_amount_transact.cardno_hash, sum_amount_transact.year;

但是,会显示以下错误:

java.lang.Exception: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException Line 0:-1 Invalid column reference 'month'

以下子查询工作正常,并按预期返回结果:

SELECT cardno_hash, year, month, SUM(amount) AS sum_amount FROM credit_transact
  GROUP BY cardno_hash, year, month

结果是:

card    year    month    amount
a123    2016    12       23160
a123    2016    10       287
c123    2016    11       9709

如果有人能帮助解决这个问题,我们将不胜感激。

wlsrxk51

wlsrxk511#

我不太清楚你到底想要什么,但我很确定你想要什么 row_number() . 我想你想要每年的最大月数:

SELECT ct.*
FROM (SELECT cardno_hash, year, month, SUM(amount) AS sum_amount,
             ROW_NUMBER() OVER (PARTITION BY cardno_hash, year ORDER BY SUM(amount) DESC) as seqnum
      FROM credit_transact
      GROUP BY cardno_hash, year, month
     ) ct
WHERE seqnum = 1;

相关问题