I have a group of athletes and I would like to calculate where their scores land from a percentile perspective. Doing so based on their best score doesn't make sense, but doing so based on their average is closer to the mark.
So let's assume this data:
| athleteId | points |
| ------------ | ------------ |
| 1 | 100 |
| 1 | 200 |
| 1 | 300 |
| 1 | 400 |
| 1 | 500 |
| 2 | 101 |
| 2 | 202 |
| 2 | 303 |
| 2 | 404 |
| 2 | 505 |
| 3 | 10 |
| 3 | 20 |
| 3 | 30 |
| 3 | 40 |
| 3 | 50 |
Using this data, I can execute this simple query to calculate their average points:
SELECT
t.athleteId,
AVG(t.points) AS average
FROM table t
GROUP BY t.athleteId
And I will get a result of:
athleteId | average |
---|---|
1 | 300 |
2 | 303 |
3 | 30 |
Using those averages, I now want to calculate the athlete's percentile based on their average score against the original table. To make things a bit more complicated (since this query is part of a bigger picture and I can't lose sight of that), I would also like to know the athlete's best score, the total number of scores, and how their best score ranked. Therefore, I would expect the following results:
athleteId | average | bestScore | totalScores | rank | percentile |
---|---|---|---|---|---|
1 | 300 | 500 | 15 | 2 | 0.60 |
2 | 303 | 505 | 15 | 1 | 0.67 |
3 | 30 | 50 | 15 | 11 | 0.13 |
I need the SQL Server query that can produce that result.
1条答案
按热度按时间23c0lvtd1#
I can't get your exact results. The data you set out is quite "santiary", in that each athlete's average score is also one of their actual scores, so the "percentile of average score" neatly resolves itself when we eyeball the data.
This solution implements a single query with CTE's - if you're writing a stored proc then you might want to use temp tables, and a variable for the total row count.
Anyway, here we go.
I'll declare a table variable to hold the data in this example.
Now express the various CTE's.
The first one adds the rank of each score
.. this one throws the total number of scores in ..
.. this one calculates the averages ..
Note the cast to float, which will force a non-integer calculation later when we determine the percentile.
.. so far, so good.
Then I had difficulty with the percentile. I have taken the athletes calculated average and added it to the total list of scores, performed a rank, pulled out the rank for the average score, and then divided that rank by the total row count + 1.
It's not quite the same result as you were looking for, but it gets close.
Hope this helps you along your way.