如何在两个表之间对join include group by和max value进行sql查询

rnmwe5a2  于 2021-06-26  发布在  Hive
关注(0)|答案(1)|浏览(246)

我有两张table, coaches 以及 awardscoaches . coaches 让我了解教练和团队之间的关系。 awardscoaches 包括教练奖励信息。其结构如下:
教练

coachid: string   (the id of each coach, it is the primary key in this table)
tmid:    string   (the team id)

奖品支票

coachid: string   (the id of each coach)
award:   string   (the award the coach got, each coach may have more than one award, so the primary key in this table is the combine of coachid and award)

现在我要写一个查询,找出每支球队哪个教练获奖最多。
以下是我目前拥有的sql:

select c.tmid tmid, max(a.count) count from coaches c 
inner join (select coachid, count(award) count 
from awardscoaches group by coachid) a 
on a.coachid = c.coachid group by c.tmid;

此查询将返回每个团队中的最大奖励数。但是我不知道如何在结果集中选择coachid,因为我只能从groupby中选择字段。我正在寻找一个通用的sql语句来实现这个要求。
我试过以下命令:

select coachid,tmid,award_count 
from (select coachid,tmid,award_count
      ,rank() over(partition by tmid order by award_count desc) as rnk
      from (select a.coachid, count(*) over(partition by a.coachid) as award_count,c.tmid 
            from awardscoaches a
            join coaches c on c.coachid=a.coachid
           ) t
      ) t
where rnk = 1

但有重复的行如下:

murrabr01c  WAS 17
murrabr01c  WAS 17
murrabr01c  WAS 17
murrabr01c  WAS 17
krommbo01c  WIJ 10
krommbo01c  WIJ 10
krommbo01c  WIJ 10
krommbo01c  WIJ 10
wattto01c   WIN 7
wattto01c   WIN 7
wattto01c   WIN 7
jjhzyzn0

jjhzyzn01#

max(struct(a.cnt,a.coachid)).col2   as coachid
select      c.tmid                              as tmid
           ,max(struct(a.cnt,a.coachid)).col2   as coachid 

from                    coaches c 

            join       (select      coachid
                                   ,count(*)    as cnt

                        from        awardscoaches 

                        group by    coachid
                        ) a 

            on          a.coachid = 
                        c.coachid 

group by    c.tmid
;

相关问题