pandas: NumPy discrete Fourier transform (FFT) on multiple arrays

gudnpqoy, posted on 2023-04-28 in Other

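This is a follow-up to this other question.
I have several fairly large pandas DataFrames, and I need to apply NumPy's fft functions to denoise each row separately.
What I do at the moment is iterate over the rows of the DataFrame and apply the NumPy fft to each row: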

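import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10000, 52), columns=range(1,53))

output = np.empty((0,52))

for index, row in df.iterrows():
    spectrum = np.fft.rfft(row)        # real FFT of one row
    spectrum[6:] = 0                   # zero the coefficients from index 6 onward
    verified = np.fft.irfft(spectrum)  # inverse transform back to 52 samples
    output = np.vstack((output, verified))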

This is the output:

0         1         2         3         4         5         6   \
0     0.476861  0.449378  0.445224  0.458605  0.480861  0.504024  0.523671   
1     0.569499  0.642474  0.691314  0.703127  0.673323  0.607825  0.522125   
2     0.229334  0.206852  0.194395  0.186826  0.181918  0.181320  0.189769   
3     0.542612  0.485116  0.454579  0.454857  0.480331  0.517998  0.551836   
4     0.350204  0.428149  0.532144  0.627614  0.683061  0.680347  0.620361   
...        ...       ...       ...       ...       ...       ...       ...   
9995  0.241540  0.247316  0.296193  0.381337  0.487676  0.595264  0.683786   
9996  0.433201  0.386454  0.346898  0.324144  0.324614  0.349595  0.394791   
9997  0.585794  0.503882  0.450025  0.438172  0.469075  0.529971  0.598605   
9998  0.364178  0.363996  0.400953  0.465722  0.540743  0.605164  0.640928   
9999  0.720946  0.693376  0.642622  0.577479  0.510498  0.454305  0.418147   

            7         8         9   ...        42        43        44  \
0     0.540032  0.557019  0.579607  ...  0.482889  0.561783  0.642733   
1     0.437323  0.374242  0.347437  ...  0.294238  0.296055  0.301317   
2     0.212893  0.254505  0.314413  ...  0.484699  0.568643  0.623170   
3     0.567988  0.559083  0.526314  ...  0.490357  0.514340  0.561120   
4     0.522275  0.416836  0.336087  ...  0.326105  0.452154  0.574466   
...        ...       ...       ...  ...       ...       ...       ...   
9995  0.737062  0.746413  0.712124  ...  0.364438  0.427846  0.487851   
9996  0.451591  0.509691  0.560211  ...  0.432432  0.453183  0.472881   
9997  0.650264  0.665516  0.636145  ...  0.545284  0.532642  0.546533   
9998  0.638102  0.597791  0.531701  ...  0.548747  0.573257  0.603792   
9999  0.405732  0.414925  0.439214  ...  0.765117  0.746037  0.711731   

            45        46        47        48        49        50        51  
0     0.709887  0.749437  0.753514  0.722475  0.664799  0.594611  0.527631  
1     0.305615  0.308494  0.314105  0.329759  0.362792  0.416867  0.489137  
2     0.638698  0.614139  0.556706  0.479495  0.397632  0.324239  0.267452  
3     0.622705  0.685017  0.731861  0.749860  0.732776  0.683743  0.614625  
4     0.658297  0.680878  0.638835  0.549281  0.444148  0.359660  0.324439  
...        ...       ...       ...       ...       ...       ...       ...  
9995  0.531525  0.548069  0.531840  0.484457  0.415239  0.339668  0.276112  
9996  0.491950  0.509969  0.524766  0.532484  0.528706  0.510272  0.477096  
9997  0.589529  0.652623  0.717826  0.764054  0.774347  0.741951  0.673153  
9998  0.630760  0.642783  0.630788  0.591811  0.531082  0.461475  0.400280  
9999  0.674761  0.647708  0.639035  0.650343  0.675875  0.704429  0.722997

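On my PC this script takes 5-6 seconds. Since I have hundreds of DataFrames to run it on, the whole process would take a long time. Is there a way to apply the fft function to the whole DataFrame at once, or otherwise make the script faster?
Thanks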

Answer #1 (jm2pwxwz)

Thanks to a comment by Warren Weckesser, I found the following solution:

# axis=1 transforms along each row, so all rows are processed in one call
spectrum = pd.DataFrame(np.fft.rfft(df.to_numpy(), axis=1))
spectrum.iloc[:, 6:] = 0
output = pd.DataFrame(np.fft.irfft(spectrum.to_numpy(), axis=1))

There must be a better solution that avoids the round trip between NumPy and pandas, but it is 65,000 times faster than iterrows, so it's fine for me :)
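For what it's worth, the intermediate DataFrame can probably be skipped by keeping the spectrum as a plain NumPy array and converting back to pandas only once at the end. Below is a minimal sketch under that assumption; the function name `lowpass_rows` and its `keep` parameter are made up for illustration and do not come from the question or the answer. It also passes `n` to `np.fft.irfft` so the output width matches the input even when the number of columns is odd, and it cross-checks the result against a plain row-by-row loop on a small random frame.

```python
import numpy as np
import pandas as pd


def lowpass_rows(df, keep=6):
    """Zero every rfft coefficient from index `keep` onward, for each row.

    `lowpass_rows` and `keep` are hypothetical names used for illustration.
    """
    values = df.to_numpy()
    spectrum = np.fft.rfft(values, axis=1)  # one real FFT per row
    spectrum[:, keep:] = 0                  # drop the higher-frequency terms
    # n is given explicitly so the output has as many columns as the input
    denoised = np.fft.irfft(spectrum, n=values.shape[1], axis=1)
    return pd.DataFrame(denoised, index=df.index, columns=df.columns)


# Small random frame with the same column layout as in the question
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((200, 52)), columns=range(1, 53))
result = lowpass_rows(df)

# Cross-check against a row-by-row loop over the underlying array
values = df.to_numpy()
expected = np.empty_like(values)
for i, row in enumerate(values):
    s = np.fft.rfft(row)
    s[6:] = 0
    expected[i] = np.fft.irfft(s, n=row.shape[0])

matches = np.allclose(result.to_numpy(), expected)
```

Whether this is measurably faster than the version above will depend on the data size; the main difference is that it keeps the original index and column labels and builds only one DataFrame.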
