Python/Pandas: for loop not working properly on multiple DataFrames

Posted by zvokhttg on 2023-04-10 in Python

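I'm trying to use a for loop to process a list of DataFrames in several ways (the example shows 2; in reality there are more). Dropping columns from the DataFrames referenced in the loop works fine, but concat inside the loop doesn't appear to do anything. I want the original DataFrames referenced in dfs to be updated.
Updated problem statement
The previous example didn't cover this case / didn't seem to work. Example adapted from here: pandas dataframe concat using for loop not working
Narrowing the example down leads to the following (code partly borrowed from another question):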

import numpy as np
import pandas as pd

data = [['Alex',10],['Bob',12],['Clarke',13]]
data2 = ['m','m','x']
A = pd.DataFrame(data, columns=['Name','Age'])
B = pd.DataFrame(data, columns=['Name','Age'])
C = pd.DataFrame(data2, columns=['Gender'])

#expected result for A:
Anew=pd.DataFrame([['Alex','m'],['Bob','m'],['Clarke','x']], columns=['Name', 'Gender'])

dfs = [A,B]

for k, v in enumerate(dfs):
    # The following line works as expected on A and B respectively; inplace is required to actually modify A, B as defined above
    dfs[k]=v.drop('Age',axis=1, inplace=True)
    # The following line doesn't do anything to A; I was expecting Anew (see above)
    dfs[k] = pd.concat([v, C], axis=1)
    # The following line prints the expected result within the loop
    print(dfs[k])

# This just shows A, not Anew. To me that means A was never updated with dfs[k] as I thought it would be.
print(A)

Answer #1 (4c8rllxm)

Update

Try:

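import pandas as pd

data = [['Alex',10],['Bob',12],['Clarke',13]]
data2 = ['m','m','x']
A = pd.DataFrame(data, columns=['Name','Age'])
B = pd.DataFrame(data, columns=['Name','Age'])
C = pd.DataFrame(data2, columns=['Gender'])
Anew = pd.DataFrame([['Alex','m'],['Bob','m'],['Clarke','x']], columns=['Name', 'Gender'])

dfs = [A, B]
for v in dfs:
    # Both statements mutate the object that A (or B) refers to,
    # so the change is visible through the original names.
    v.drop('Age', axis=1, inplace=True)
    v['Gender'] = C['Gender']  # select the column explicitly; values align on the index
print(A)
print(Anew)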

Output:

>>> A
     Name Gender
0    Alex      m
1     Bob      m
2  Clarke      x

>>> Anew
     Name Gender
0    Alex      m
1     Bob      m
2  Clarke      x

If you use inplace=True, pandas does not return a DataFrame: drop modifies the object in place and returns None, so dfs[k] is now None.

dfs[k]=v.drop('Age', axis=1, inplace=True)  # <- Remove inplace=True
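
To make the two effects at play easier to see, here is a minimal sketch (the two-row frame and the variable name `result` are made up for illustration, not taken from the question). It shows that `drop(..., inplace=True)` returns None, and that assigning to `dfs[0]` only rebinds the list slot rather than the name `A`:

```python
import pandas as pd

# Small illustrative frame (assumed data, shortened from the question)
A = pd.DataFrame([['Alex', 10], ['Bob', 12]], columns=['Name', 'Age'])
dfs = [A]

# inplace=True mutates the object A refers to and returns None
result = dfs[0].drop('Age', axis=1, inplace=True)
print(result)           # None
print(list(A.columns))  # ['Name']

# Assigning to dfs[0] stores a *new* object in the list;
# the name A still refers to the old one.
dfs[0] = pd.concat([A, pd.DataFrame({'Gender': ['m', 'm']})], axis=1)
print(dfs[0] is A)      # False
print(list(A.columns))  # ['Name']
```

So operations that mutate the object (such as an in-place drop or a column assignment) show up in A, while anything that builds a new DataFrame (such as pd.concat) has to be assigned back to a name you keep using.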

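Try:

dfs = [A, B]
for k, v in enumerate(dfs):
    # Without inplace=True, drop returns a new DataFrame
    dfs[k] = v.drop('Age', axis=1)
    # Concatenate onto the dropped frame (dfs[k]), not the original v
    dfs[k] = pd.concat([dfs[k], C], axis=1)
out = dfs[0]  # A itself is unchanged; the updated frame lives in dfs

Output:

>>> out
     Name Gender
0    Alex      m
1     Bob      m
2  Clarke      x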
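
Because the loop above stores its results in dfs, the names A and B keep pointing at the original frames. One possible way to get the updated frames back under usable names is sketched below. The helper `replace_age_with_gender` and the dictionary `frames` are hypothetical names introduced for this example, not part of the original answer:

```python
import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
A = pd.DataFrame(data, columns=['Name', 'Age'])
B = pd.DataFrame(data, columns=['Name', 'Age'])
C = pd.DataFrame(['m', 'm', 'x'], columns=['Gender'])

def replace_age_with_gender(df, gender):
    """Hypothetical helper: return a new frame without 'Age', with `gender` appended."""
    return pd.concat([df.drop(columns='Age'), gender], axis=1)

# Option 1: keep the frames in a dict and look them up by key afterwards
frames = {'A': A, 'B': B}
frames = {name: replace_age_with_gender(df, C) for name, df in frames.items()}

# Option 2: rebind the original names by unpacking the results
A, B = [replace_age_with_gender(df, C) for df in (A, B)]

print(A)
```

With many frames, the dict variant tends to scale better, since no name has to be spelled out twice; unpacking is convenient when there are only a handful.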
