Writing group ranking without for loops in Python + NumPy

Posted by gywdnpxw 11 months ago in Python


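My goal is to create a function that takes two input arrays and outputs the rank of each score within its group. The first input array holds the scores (scores), and the second holds the group sizes (groups). The group sizes sum to the total number of scores, so the scores array can be divided into groups, similar to np.split(scores, np.cumsum(groups)[:-1]). In the output, rank 1 denotes the highest score within the group, 2 the second highest, and so on. I implemented it as follows (including two test cases):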

import numpy as np

def assign_ranks(scores, groups):
    # Create an array to store the ranks
    ranks = np.zeros(len(scores), dtype=int)

    # Iterate through groups
    start = 0
    for group_size in groups:
        end = start + group_size
        ranks[start:end] = (-scores[start:end]).argsort().argsort() + 1
        start = end

    return ranks

def test():
    # Test case 1
    scores1 = np.array([0.1, 0.5, 0.3, 0.4, 0.5])
    groups1 = np.array([3, 2])
    result1 = assign_ranks(scores1, groups1)
    print("Test case 1 result:", result1)  # Expected: [3 1 2 2 1]
    assert np.all(result1 == np.array([3, 1, 2, 2, 1]))

    # Test case 2
    scores2 = np.array([0.7, 0.3, 0.6, 0.1, 0.2, 0.5, 0.8, 0.9, 0.4])
    groups2 = np.array([4, 3, 2])
    result2 = assign_ranks(scores2, groups2)
    print("Test case 2 result:", result2)  # Expected: [1 3 2 4 3 2 1 1 2]
    assert np.all(result2 == np.array([1, 3, 2, 4, 3, 2, 1, 1, 2]))

if __name__ == "__main__":
    test()

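While this works correctly, I wonder whether it can be done without the for loop, potentially improving performance. I have made several unsuccessful attempts. Any suggestions?
I tried other approaches, such as using np.lexsort and np.split, but without success.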

numpy

Source: https://stackoverflow.com/questions/77369820/writing-group-ranking-without-for-loops-in-python-numpy

1 Answer


bvk5enib1#

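If all the values are positive, you can offset each one by a base value that is a multiple of the index of the group it belongs to. Sorting the offset values then gives each item's global position while keeping the members of a group together, in group order. Finally, convert the global positions back to group-relative positions and reverse the order to get the ranks: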

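import numpy as np

def assign_ranks(scores, groups):
    # Shift each group by group_index * max(scores) so the groups sort in order
    offset = np.repeat(np.arange(groups.size), groups) * np.max(scores)
    # argsort twice: the first call gives the sort order, the second each item's position in it
    order = np.argsort(np.argsort(scores + offset))
    # End index of each item's group; subtracting the position makes rank 1 the highest score
    groupBase = np.repeat(np.cumsum(groups), groups)
    return groupBase - order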

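To make the steps above concrete, here is a minimal sketch that traces the intermediate arrays on a small made-up input. The variable names and sample values are illustrative and do not come from the answer. It calls argsort twice: the first call returns the sort order, and the second turns that order into each item's sorted position, which is what the subtraction needs.

```python
import numpy as np

# Hypothetical sample input: two groups of sizes 3 and 2, all scores positive
scores = np.array([0.3, 0.1, 0.2, 0.4, 0.5])
groups = np.array([3, 2])

# Group index of every item
group_ids = np.repeat(np.arange(groups.size), groups)    # [0 0 0 1 1]

# Shift each group upward so later groups sort after earlier ones
shifted = scores + group_ids * np.max(scores)            # about [0.3 0.1 0.2 0.9 1.0]

# Indices that would sort the shifted values in ascending order
sort_order = np.argsort(shifted)                         # [1 2 0 3 4]

# Inverse permutation: the sorted position of each original item
position = np.argsort(sort_order)                        # [2 0 1 3 4]

# End index (exclusive) of the group each item belongs to
group_end = np.repeat(np.cumsum(groups), groups)         # [3 3 3 5 5]

# Highest score in a group sits at position group_end - 1, so it gets rank 1
ranks = group_end - position                             # [1 3 2 2 1]
print(ranks)
```

This input was chosen so that the sort order and its inverse differ; on inputs where they happen to coincide, a single argsort would give the same answer by accident.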
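If there are negative values, you need a larger offset (larger than * np.max(scores)) to ensure the groups do not overlap in the resulting values.
You can use * 2 * np.max(np.abs(scores)) for this,
or * (np.max(scores) - np.min(scores)).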
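As an alternative that avoids choosing an offset at all, the following sketch sorts on two keys with np.lexsort: the group index first, then the negated score. This is not the answer's method, and the function name assign_ranks_lexsort is made up for illustration. It assumes signed numeric scores (so that negation is safe) and group sizes that sum to len(scores).

```python
import numpy as np

def assign_ranks_lexsort(scores, groups):
    """Rank scores within consecutive groups (1 = highest), without a Python loop."""
    scores = np.asarray(scores)
    groups = np.asarray(groups)

    # Group index of every item, e.g. groups [3, 2] -> [0 0 0 1 1]
    group_ids = np.repeat(np.arange(groups.size), groups)

    # lexsort treats the LAST key as the primary one:
    # sort by group index, then by descending score within each group
    order = np.lexsort((-scores, group_ids))

    # Start index of the group that occupies each sorted position
    group_start = np.repeat(np.cumsum(groups) - groups, groups)

    # The item order[k] lands at sorted position k; its rank is its
    # distance from the start of its group, plus one
    ranks = np.empty(scores.size, dtype=int)
    ranks[order] = np.arange(scores.size) - group_start + 1
    return ranks

# The two test cases from the question
print(assign_ranks_lexsort(np.array([0.1, 0.5, 0.3, 0.4, 0.5]), np.array([3, 2])))
print(assign_ranks_lexsort(
    np.array([0.7, 0.3, 0.6, 0.1, 0.2, 0.5, 0.8, 0.9, 0.4]), np.array([4, 3, 2])))
```

Because the group index is a separate sort key, negative scores need no special handling and a group's maximum cannot collide with the next group's minimum. Equal scores within a group are ranked in their original order, since lexsort is a stable sort.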

Answered 11 months ago
