Pandas ValueError: Columns must be same length as key

Posted by uqxowvwt on 2023-01-11 in PyCharm

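I want to use a dictionary to add a column to a pandas DataFrame. I use apply with a lambda and a helper function to compute the value for each row. I get 'ValueError: Columns must be same length as key'. I should be able to add a new column, but to simplify the example I included the column to be changed in the df.
I don't know what I'm doing wrong.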

import pandas as pd

court_dict = dict(zip(['INC:INC08 Pensions', 'TX:TX01 Federal Tax', 'HO:HO08 Rent'], [8, 8, 0]))
bank_info = {
        'Category':['INC:INC08 Pensions', 'TX:TX01 Federal Tax', 'HO:HO08 Rent'],
        'Amount':[1250.23, 300.0, 1000],
        'Paragraph': ['', '', '', ]
            }
bank2 = pd.DataFrame(bank_info)

def get_column_names(row: pd.core.series.Series, position: int) -> str:
    category = row['Category']
    result = court_dict.get(category, 'd')
    print(category, result)
    return result

if __name__=="__main__":
    bank2[['Paragraph']] = bank2.apply(lambda row:get_column_names(row, 0), axis=1)
    print(bank2)

Here is the output:

C:\Users\Steve\anaconda3\envs\AccountingPersonal\python.exe C:\Users\Steve\PycharmProjects\AccountingPersonal\src\get_simple.py 
INC:INC08 Pensions 8
TX:TX01 Federal Tax 8
HO:HO08 Rent 0
Traceback (most recent call last):
  File "C:\Users\Steve\PycharmProjects\AccountingPersonal\src\get_simple.py", line 20, in <module>
    bank2[['Paragraph']] = bank2.apply(lambda row:get_column_names(row, 0), axis=1)
  File "C:\Users\Steve\anaconda3\envs\AccountingPersonal\lib\site-packages\pandas\core\frame.py", line 3643, in __setitem__
    self._setitem_array(key, value)
  File "C:\Users\Steve\anaconda3\envs\AccountingPersonal\lib\site-packages\pandas\core\frame.py", line 3702, in _setitem_array
    self._iset_not_inplace(key, value)
  File "C:\Users\Steve\anaconda3\envs\AccountingPersonal\lib\site-packages\pandas\core\frame.py", line 3721, in _iset_not_inplace
    raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key

Process finished with exit code 1

Answer #1 (mzsu5hc0)

Everything looks fine except for the brackets. Use single brackets [] on the left-hand side:

bank2['Paragraph'] = bank2.apply(lambda row:get_column_names(row, 0), axis=1)

[] -> Series, [[]] -> DataFrame
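To make the difference concrete, here is a minimal sketch on a small made-up frame (the `df` below and its values are illustrative, not the asker's data). It only shows what each bracket style selects:

```python
import pandas as pd

# Hypothetical toy frame, used only to illustrate the two selection styles
df = pd.DataFrame({"Category": ["a", "b", "c"], "Amount": [1.0, 2.0, 3.0]})

single = df["Category"]    # one label -> 1-D Series
double = df[["Category"]]  # list of labels -> 2-D DataFrame with one column

print(type(single).__name__, single.shape)  # Series (3,)
print(type(double).__name__, double.shape)  # DataFrame (3, 1)
```

Because the double-bracket form is keyed by a list of columns, an assignment to it expects a value shaped to match that list rather than a plain 1-D Series.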


Answer #2 (xriantvc)

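With bank2[['Paragraph']], you are working with a DataFrame and not a Series, so pandas expects a value with one column per key. You need to use single square brackets []: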

def get_column_names(row: pd.core.series.Series, position: int) -> str:
    category = row['Category']
    result = court_dict.get(category, 'd')
    print(category, result)
    return result

if __name__=="__main__":
    bank2['Paragraph'] = bank2.apply(lambda row:get_column_names(row, 0), axis=1) # <- line updated
    print(bank2)

By the way, you can use pandas.Series.map instead of apply with a custom function to get the expected output/column.

if __name__=="__main__":
    bank2['Paragraph'] = bank2['Category'].map(court_dict)
    print(bank2)
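One caveat worth noting: map with a dict yields NaN for categories that are not in the dict, whereas the original function falls back to 'd'. A possible way to keep that fallback is sketched below; the 'XX:Unknown' row is a made-up category added only to trigger the missing-key case.

```python
import pandas as pd

court_dict = {"INC:INC08 Pensions": 8, "TX:TX01 Federal Tax": 8, "HO:HO08 Rent": 0}

# 'XX:Unknown' is a hypothetical category that is not in court_dict
bank2 = pd.DataFrame({
    "Category": ["INC:INC08 Pensions", "HO:HO08 Rent", "XX:Unknown"],
    "Amount": [1250.23, 1000.0, 5.0],
})

# Plain dict mapping: missing keys become NaN
mapped = bank2["Category"].map(court_dict)

# Passing a callable keeps the same default as court_dict.get(category, 'd')
bank2["Paragraph"] = bank2["Category"].map(lambda c: court_dict.get(c, "d"))

print(mapped)
print(bank2)
```

If every category is guaranteed to be in the dict, the plain map(court_dict) call is enough.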

Answer #3 (nkhmeac6)

Use single brackets when you are assigning to a column. Double brackets select a list of columns and return a DataFrame.

bank2['Paragraph'] = bank2.apply(lambda row:get_column_names(row, 0), axis=1)
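For reference, a small self-contained sketch that reproduces the error and then applies the fix. The frame and the `values` Series are illustrative stand-ins for the asker's data, and the exact error text may differ between pandas versions:

```python
import pandas as pd

# Hypothetical stand-ins for bank2 and the result of apply(...)
df = pd.DataFrame({"Category": ["a", "b", "c"], "Paragraph": ["", "", ""]})
values = pd.Series([8, 8, 0])

# List key on the left, 1-D Series of three values on the right: shapes do not line up
try:
    df[["Paragraph"]] = values
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

# Single label on the left: the Series is assigned as the column
df["Paragraph"] = values
print(df)
```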
