semanticexception[error 10004]表别名或列引用无效

omvjsjqw  于 2021-06-25  发布在  Hive
关注(0)|答案(2)|浏览(1174)

我试图按id获取每组中第一行的记录。行号是按id的顺序生成的。我一直得到相同的错误[10004]。
此配置单元脚本是在sas studio环境中构建的,下面是一段过程sql代码:
数据集id\u f包含变量x、y和z

execute (create temporary table event_id_f1 as
        select *,ROW_NUMBER() OVER(ORDER BY id) as rownum
        from id_f
                        group by id
                        having rownum = min(rownum)
        )
by df20;

错误:执行错误:org.apache.hive.service.cli.hivesqlexception:编译语句时出错:失败:semanticexception[error 10004]:行1:129无效的表别名或列引用“rownum”:(可能的列名为:x、y、z)
谢谢你的帮助!
@伯尼建议Hive不能识别别名。我应该如何嵌套别名以使其工作?代码也在execute语句中。我的尝试:

execute (create temporary table event_id_f1 as

        from (
             select *,
             ROW_NUMBER() OVER(ORDER BY id) as rownum
             from id_f
              group by id
              having rownum = min(rownum) ) ranked
             WHERE ranked=1
             )
by df20;
olmpazwi

olmpazwi1#

我的拙见是以下sql有问题(取自另一个答案):

select * 
from (
    select id, ROW_NUMBER() OVER (ORDER BY id) as rownum
    from id_f
    group by id 
) ranked
where ranked.rownum = 1

此查询分为三个阶段:
第一, select id from id_f group by id ,这相当于获得所有 id .
然后,一个全局窗口 order by 这些不同的ID并给每个ID一个行号。
最后是 where ranked.rownum = 1 筛选器使结果集仅包含一行:

minimal_id, 1

如果你真的只想要这些,就用吧 min 应该好多了。
这个问题乍一看对我来说没什么意义。我想你要找的是 partition by id ,在每个分区内, order by something_else ,然后选择在这些不同分区中排名第一的行。如果幸运的话,我的猜测是对的,我建议这样做:

select 
    * 
from (
    select 
        *, ROW_NUMBER() OVER (partition by id ORDER BY STH_ELSE) as rownum
    from id_f
) ranked
where ranked.rownum = 1
7cwmlq89

7cwmlq892#

我就是这样写的。如果有任何错误,请告诉我:

execute (create temporary table event_id_f1 as
    select * 
    from (
        select id, ROW_NUMBER() OVER (ORDER BY id) as rownum
        from id_f
        group by id 
    ) ranked
    where ranked.rownum = 1         
) by df20;

相关问题