dataframe-rank的答案不同于sql-rank

evrscar2  于 2021-07-24  发布在  Java
关注(0)|答案(1)|浏览(341)

我在学习sql中的rank函数时发现,它使用的排序方法与pandas方法不同。如何得到相同的答案?
问题链接:https://www.windowfunctions.com/questions/ranking/1

sql代码

select rank() over (order by weight desc) as ranking, weight, name from cats order by ranking, name

我的Pandas代码

df[['weight','name']].assign(ranking=df['weight'].rank(method='dense',ascending=False)).sort_values('weight',ascending=False)

# I am getting a different answer than SQL

# How to get the same answer?

数据

df = pd.DataFrame({'name': ['Molly', 'Ashes', 'Felix', 'Smudge', 'Tigger', 'Alfie', 'Oscar', 'Millie', 'Misty', 'Puss', 'Smokey', 'Charlie'],
          'breed': ['Persian', 'Persian', 'Persian', 'British Shorthair', 'British Shorthair', 'Siamese', 'Siamese', 'Maine Coon', 'Maine Coon', 'Maine Coon', 'Maine Coon', 'British Shorthair'],
          'weight': [4.2, 4.5, 5.0, 4.9, 3.8, 5.5, 6.1, 5.4, 5.7, 5.1, 6.1, 4.8],
          'color': ['Black', 'Black', 'Tortoiseshell', 'Black', 'Tortoiseshell', 'Brown', 'Black', 'Tortoiseshell', 'Brown', 'Tortoiseshell', 'Brown', 'Black'],
          'age': [1, 5, 2, 4, 2, 5, 1, 5, 2, 2, 4, 4]})

我的输出

weight  name    ranking
6   6.1 Oscar   1.0
10  6.1 Smokey  1.0
8   5.7 Misty   2.0 # this should be 3 and so on
5   5.5 Alfie   3.0
7   5.4 Millie  4.0
9   5.1 Puss    5.0
2   5.0 Felix   6.0
3   4.9 Smudge  7.0
11  4.8 Charlie 8.0
1   4.5 Ashes   9.0
0   4.2 Molly   10.0
4   3.8 Tigger  11.0
6yt4nkrj

6yt4nkrj1#

如注解所示,我们可以使用以下方法在sql和pandas中得到相同的答案:

df[['weight','name']].assign(ranking=df['weight'].rank(method='min',ascending=False)).sort_values('weight',ascending=False)

相关问题