我有两个表,我想对一个短语在数据中的次数进行求和。需要注意的是,我想做一个月到现在的分析,所以收集从一个月的第一天到当前日期-1(所以在脚本运行的前一天)之间的所有记录。我编写了一个脚本来获取计数(见下文),但我只希望有一个记录的计数总和。
这是我的密码:
WITH AB AS (
SELECT substr(A1.tm, 1,10) as the_DATE , A1.LOG_ID
FROM db1.table1 A1 ,
db1.table2 A2 WHERE A1.LOG_ID=A2.LOG_ID
and substr(A1.tm, 1,10) between date_add(last_day(add_months(current_date, -1)),1) and date_sub(current_timestamp(),1)
),
BC AS (
SELECT AB.the_DATE ,
COUNT ( DISTINCT (CASE WHEN (TXT like '% assistance%') THEN AB.LOG_ID ELSE NULL END )) AS
assistance ,
COUNT ( DISTINCT (CASE WHEN (TXT like '%help%') THEN AB.LOG_ID ELSE NULL END )) AS
helpp , COUNT(DISTINCT AB.LOG_ID) AS VOL_TOTAL
FROM AB LEFT JOIN db1.tabel2 BC ON AB.LOG_ID =BC.LOG_ID
where substr(tm, 1,10) between date_add(last_day(add_months(current_date, -1)),1) and date_sub(current_timestamp(),1)
group by AB.the_DATE
)
SELECT coalesce(BC.the_date ,date_sub(current_timestamp(),1)) as the_DATE ,
COALESCE(BC.VOL_TOTAL,0) AS VOL_TOTAL ,COALESCE (BC.assistance,0) AS assistance ,
COALESCE (BC.help,0) AS helpp
FROM BC;
产生
the_date | vol_total | assistance | helpp
2021-04-03 , 3 , 5 , 6
2021-04-06 , 3 , 5 , 6
2021-04-01 , 3 , 5 , 6
2021-04-04 , 3 , 5 , 6
2021-04-07 , 3 , 5 , 6
我希望输出为:
the_date | vol_total | assistance | helpp
2021-04-07 , 15 , 25 , 30
因此,每个字段的总和与“theu date”字段一起计算为昨天的日期。我是新的Hive和它的功能,所以任何想法都将有助于获得我想要的输出。
暂无答案!
目前还没有任何答案,快来回答吧!