numpy: What am I doing wrong with this Cython code?

Posted by 92dk7w1h on 2023-06-23 in Other


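Could you comment on this code of mine?
For context, I'm trying to learn Cython to see how it could serve future use cases where I want to integrate C and Python.
This "exercise" code works as follows:
1. I read a long list of 3D coordinates from a file; they describe two points over a series of time steps.
2. I compute the Euclidean distance between the two points at each time step and report the result as a NumPy array.
I'm using Cython's Pure Python mode.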

# computer.py

import cython
# from cython.cimports.cpython import array
# import array
import numpy as np

if cython.compiled:
    print("Yep, I'm compiled.")
    from cython.cimports.libc.math import sqrt

else:
    print("Just a lowly interpreted script.")
    from math import sqrt

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.cfunc
def compute_distance_cy(x1: cython.float, y1: cython.float, z1: cython.float,
                        x2: cython.float, y2: cython.float, z2: cython.float):

    return sqrt(sum(((x1 - x2) ** 2.0, (y1 - y2) ** 2.0, (z1 - z2) ** 2.0)))

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.ccall
def compute_distances_pure(points: cython.double[:, :]):

    # get the maximum dimensions of the array
    x_max: cython.size_t = points.shape[0]
    y_max: cython.size_t = points.shape[1]

    # create memoryviews of the single points
    view2d: cython.double[:, :] = points
    view1d: cython.double[:]

    # create memoryviews of the results
    result = np.zeros(x_max, dtype=np.double)
    result2dview: cython.double[:] = result

    # access the memoryview by way of our constrained indexes
    x: cython.size_t
    for x in range(x_max):
        view1d = view2d[x, :]
        result2dview[x] = compute_distance_cy(
            view1d[0], view1d[1], view1d[2], view1d[3], view1d[4], view1d[5])

    return result

It is called from:

...

def pure_python_mode(points):
    return computer.compute_distances_pure(points)

def do_it_in_numpy(points):
    return np.sqrt((points[:, 0] - points[:, 3])**2 +
                   (points[:, 1] - points[:, 4])**2 +
                   (points[:, 2] - points[:, 5])**2)

points_ndarray = np.array(points_list, dtype=np.double)
points_distance_array_from_cython = pure_python_mode(points_ndarray)
points_distance_array_from_numpy = do_it_in_numpy(points_ndarray)

I time both approaches with a simple wrapper. At this point I get:

Function pure_python_mode Took 0.0439 seconds
Function do_it_in_numpy Took 0.0183 seconds
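
The timing wrapper itself is not shown in the post. A minimal sketch of a decorator that would print lines in the format above might look like the following; the names `timed` and `example` are hypothetical, and the real wrapper may differ.

```python
import functools
import time


def timed(func):
    """Print how long each call to func takes, then return its result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        # Same message format as the timings quoted in the post.
        print(f"Function {func.__name__} Took {elapsed:.4f} seconds")
        return result
    return wrapper


@timed
def example(n):
    # Hypothetical workload, used only to show the decorator in action.
    return sum(range(n))


total = example(10)
```

A single timed call like this is noisy, so for short functions it is usually worth repeating the measurement several times and taking the best run.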

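My Cython version typically performs 4 to 8 times worse than NumPy. That is acceptable for my use case, but I'd like to know whether someone can point out what I'm doing wrong, or whether this is as good as it gets.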


Source: https://stackoverflow.com/questions/76365181/what-am-i-doing-wrong-with-this-cython-code

1 Answer

e7arh2l61#

As a general tip, Cython has an "annotate" option: if you compile with it, it shows which lines are likely to be the problem (i.e. still use Python rather than C) and gives you a good idea why.
For example, if you are building via setup.py, you can enable this by changing cythonize("your_file.pyx") to cythonize("your_file.pyx", annotate=True).
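
As a rough illustration of the annotate option, a build script for the question's module could look like the sketch below. It assumes the source file is named computer.py (as in the question) and that the project is built with setuptools; the actual build setup is not shown in the post.

```python
# setup.py (sketch; assumes the module from the question is saved as computer.py)
from setuptools import setup
from Cython.Build import cythonize

setup(
    # annotate=True asks Cython to also write an HTML report next to the
    # generated C file, highlighting lines that still go through Python.
    ext_modules=cythonize("computer.py", annotate=True),
)
```

Building with `python setup.py build_ext --inplace` should then produce an annotated HTML report alongside the compiled extension.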
Some specific notes:

`compute_distance_cy`:

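(x1 - x2) ** 2.0 and the like are evaluated using Python's rules rather than C's. Here, just square by hand; in general, consider cimporting math functions from C the same way you did for sqrt.
sum(...) is relatively expensive; here you can simply use +, and for larger arrays consider a loop.
You probably want cython.double rather than cython.float (otherwise you lose about half the precision, because cython.float is a single-precision C float while your input array holds doubles).
Add a return type.
boundscheck and wraparound are unnecessary (you don't do any indexing in this function).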
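
To make the float-versus-double note concrete, here is a small NumPy sketch of what happens when a double value is narrowed to single precision, which is what passing it through a cython.float argument does. The coordinate value is made up for illustration.

```python
import numpy as np

coordinate = 1234.56789012345  # hypothetical coordinate value

as_double = np.float64(coordinate)
as_float = np.float32(coordinate)  # what a C float parameter would hold

# Error introduced purely by storing the value in single precision.
rounding_error = abs(float(as_float) - float(as_double))

float_bits = np.finfo(np.float32).nmant    # stored mantissa bits in a float
double_bits = np.finfo(np.float64).nmant   # stored mantissa bits in a double

print("float mantissa bits:", float_bits)
print("double mantissa bits:", double_bits)
print("rounding error:", rounding_error)
```

The error is small in absolute terms, but it is introduced before any arithmetic happens, so the Cython result can differ from the NumPy one even when the formula is identical.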

Combining these, your function might look something like this:

@cython.cfunc
def compute_distance_cy(x1: cython.double, y1: cython.double, z1: cython.double, x2: cython.double, y2: cython.double, z2: cython.double) -> cython.double:
    dx: cython.double = x1 - x2
    dy: cython.double = y1 - y2
    dz: cython.double = z1 - z2
    return sqrt(dx*dx + dy*dy + dz*dz)

`compute_distances_pure`

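Some of your views are unnecessary. points (which you assign to view2d) is already the type you want, so just use it directly.
Indexing into a view of a view is slower than indexing directly (i.e. view1d[0] is slower than view2d[x, 0]); avoid it where possible.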

Simplifying your second function could look something like:

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.ccall
def compute_distances_pure(points: cython.double[:, :]) -> cython.double[:]:
    x_max: cython.size_t = points.shape[0]
    # typed memoryview over a freshly allocated NumPy array
    result : cython.double[:] = np.zeros(x_max, dtype=np.double)

    x: cython.size_t
    for x in range(x_max):
        # index the 2D view directly instead of slicing out a 1D view per row
        result[x] = compute_distance_cy(points[x, 0], points[x, 1], points[x, 2],
                                        points[x, 3], points[x, 4], points[x, 5])

    return result

Results

Combining these ideas, I get a 75x speedup over the original Cython version.
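
The benchmark behind that figure is not shown. One way to check that a rewritten function still agrees with the NumPy version, and to time both, is sketched below. Here `reference_loop` is a hypothetical plain-Python stand-in for the compiled `computer.compute_distances_pure`, and the random input is made up; swap in the real function and data to reproduce the comparison.

```python
import timeit

import numpy as np


def do_it_in_numpy(points):
    # Vectorized reference from the question.
    return np.sqrt((points[:, 0] - points[:, 3]) ** 2 +
                   (points[:, 1] - points[:, 4]) ** 2 +
                   (points[:, 2] - points[:, 5]) ** 2)


def reference_loop(points):
    # Hypothetical stand-in for computer.compute_distances_pure.
    out = np.zeros(points.shape[0], dtype=np.double)
    for i in range(points.shape[0]):
        dx = points[i, 0] - points[i, 3]
        dy = points[i, 1] - points[i, 4]
        dz = points[i, 2] - points[i, 5]
        out[i] = (dx * dx + dy * dy + dz * dz) ** 0.5
    return out


rng = np.random.default_rng(0)
points = rng.random((1000, 6))  # made-up data: 1000 time steps, two 3D points

expected = do_it_in_numpy(points)
# np.asarray also accepts a typed memoryview returned by a Cython function.
actual = np.asarray(reference_loop(points))
matches = np.allclose(actual, expected)

# Best of a few repeats is less noisy than a single timed call.
numpy_time = min(timeit.repeat(lambda: do_it_in_numpy(points), number=5, repeat=3))
loop_time = min(timeit.repeat(lambda: reference_loop(points), number=5, repeat=3))

print("results match:", matches)
print(f"numpy: {numpy_time:.4f} s, candidate: {loop_time:.4f} s")
```

Checking the results before comparing timings guards against a fast but wrong rewrite, for example one that silently narrows the inputs to single precision.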

Answered on 2023-06-23
