我想算出板球运动员的滚动平均数。任何了解这项运动的人都会知道,除非击球手没有出局,否则平均分是以得分/局来计算的。如果一个击球手打了两局,其中一局“没有出局”,他们的平均得分将被算作
runs scored innings 1 + runs scored innings 2 / 1
如果他们两局都出局的话,计算结果将是
runs scored innings 1 + runs scored innings 2 / 2
这是很容易工作的整体平均数,但我想计算这作为一个运行平均数。在使用循环并分别计算每行的平均值之前,我已经这样做了,但是有人能建议一种使用任何内置函数的方法吗?
当前代码示例:
with cte as (
select
Innings_Player,
Innings_Runs_Scored,
Innings_Date,
CASE WHEN Innings_Runs_Scored = "DNB" THEN null WHEN Innings_Runs_Scored LIKE "%*%" THEN REPLACE(Innings_Runs_Scored,"*","") ELSE Innings_Runs_Scored END AS RunsNum,
CASE WHEN Innings_Runs_Scored LIKE "%*%" THEN 1 ELSE 0 END AS NotOutFlag,
ROW_NUMBER() OVER (PARTITION BY Innings_Player ORDER BY Innings_Date) as RN
from TABLE
where Innings_Player = "JE Root"
AND Innings_Runs_Scored IS NOT NULL
ORDER BY Innings_Date
)
,cte2 as
(
select
*,
SUM(CAST(RunsNum AS INT64)) OVER (PARTITION BY Innings_Player ORDER BY RN) AS RunningTotal,
AVG(CAST(RunsNum AS INT64)) OVER (PARTITION BY Innings_Player ORDER BY RN) AS RunningAvg,
from cte
where runsNum IS NOT NULL AND runsNum <> "TDNB"
)
select * from cte2
结果数据集:
所以,平均值是不正确的。对于滚动平均值,第三行的计算应该是前三行的局数除以2而不是3,正如你从notoutflag中看到的,列表中的3局没有出局。
同样地,第4行应该被3除,第5行被4除,然后第6行也是一个非out,第6行应该被4除,第7行被5除等等,我认为这个等式是
Innings_Run_Scored / Innings - Not Out Count
1条答案
按热度按时间dgiusagp1#
AVG
基本上是一个SUM / COUNT
既然你想改变COUNT
我建议放弃使用avg函数。你可以用一个SUM
与CASE
只统计那些NotOutFlag
是0
所以这条线会变成
当然,您需要添加更多的逻辑来避免被0除。