SQL Server percentile_cont with aggregate Function

pnwntuvh  于 2023-06-21  发布在  其他
关注(0)|答案(2)|浏览(115)

I'm using SQL Server 2012 and trying to calculate some aggregate functions and percentiles on a data set with two columns (id and time). In my research I found some solutions, but they don't work for me (maybe it's because of my outdated SQL version)
| id | time_seconds |
| ------------ | ------------ |
| 1 | 120 |
| 2 | 10 |
| 3 | 50 |
| 4 | 80 |
| 5 | 60 |
| 6 | 42 |
| 7 | 96 |

I'll tried something like that:

SELECT 
    MIN(time_seconds) AS min,
    MAX(time_seconds) AS max,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY time_seconds) OVER () AS '95 percentil'
from table

If i run this query, it wants me to make an include "time_seconds" in Group by, but then the aggregate function doesnt work anymore.

The Output id like to become is something like this :
| min | max | 95 percentil |
| ------------ | ------------ | ------------ |
| 10 | 120 | 110 |

Thanks for your help!

xxb16uws

xxb16uws1#

Unfortunately PERCENTILE_CONT works as a window function only, not as an aggregate function. This is a (rare) case when select distinct may come useful:

select distinct
    min(time_seconds) over() as min_time_seconds,
    max(time_seconds) over() as max_time_seconds,
    percentile_cont(0.95) within group (order by time_seconds) over () as percentil_95
from mytable

Basically this takes the problem the opposite way: since we can't have an aggregate percentile computation, we turn other aggregate functions to window functions: the window computations yield the same values on all rows, so we can then deduplicate the resultset with distinct .

min_time_secondsmax_time_secondspercentil_95
10120112.8

fiddle - this is SQL Server 2014, since db<>fiddle does not support 2012.

jaql4c8m

jaql4c8m2#

Try this:

;WITH cte AS (
    SELECT
        CAST(COUNT(*) OVER() AS float) AS total_rows,
        ROW_NUMBER() OVER (ORDER BY time_seconds) AS row_number,
        time_seconds
    FROM (
        SELECT 1 AS id, 120 AS time_seconds UNION ALL
        SELECT 2 AS id, 10 AS time_seconds UNION ALL
        SELECT 3 AS id, 50 AS time_seconds UNION ALL
        SELECT 4 AS id, 80 AS time_seconds UNION ALL
        SELECT 5 AS id, 60 AS time_seconds UNION ALL
        SELECT 6 AS id, 42 AS time_seconds UNION ALL
        SELECT 7 AS id, 96 AS time_seconds
    ) AS YourTable
),
percentiles AS (
    SELECT
        time_seconds,
        100.0 * ((row_number - 1) / (total_rows - 1)) AS percentile,
        LEAD(time_seconds) OVER (ORDER BY row_number) AS next_time_seconds,
        100.0 * (row_number / (total_rows - 1)) AS next_percentile
    FROM cte
)
SELECT 
    MIN(time_seconds) AS min_time_seconds, 
    MAX(time_seconds) AS max_time_seconds,
    (
        SELECT 
            time_seconds + (next_time_seconds - time_seconds) * (95 - percentile) / (next_percentile - percentile)
        FROM percentiles 
        WHERE percentile <= 95 AND next_percentile >= 95
    ) AS percentile_95
FROM cte;

相关问题