pandas: how to speed up a custom function

lawou6xi, posted on 2023-02-06 in Other

How can I speed up my custom function?
I have three lists of numbers:
list1, list2, list3
The pandas DataFrame looks like this:
| id | inum | DESC_1 | recs |
| --- | --- | --- | --- |
| id1 | inum1 | 1 | recs1 |
| id2 | inum2 | 2 | recs2 |
| id3 | inum3 | 3 | recs3 |
My custom function:

def keep_inum(row):
    if len(row) != 0:
        if int(row['inum']) in list1:
            if row['DESC_1'] == 1:
                return row['recs']
            else:
                return ''
        elif int(row['inum']) in list2:
            if row['DESC_1'] == 2:
                return row['recs']
            else:
                return ''
        elif int(row['inum']) in list3:
            if row['DESC_1'] == 3:
                return row['recs']
            else:
                return ''
        else:
            return row['recs']
    else:
        pass

Applying the function to the DataFrame:

df['recs'] = df.apply(keep_inum, axis = 1)

6ie5vjzr #1

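By not using a custom function at all. Build boolean masks over whole columns and assign through `df.loc` instead: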

import pandas as pd

df = pd.DataFrame(
    {
        "id": ["id1", "id2", "id3", "id4"],
        "inum": ["111", "222", "333", "331"],
        "DESC_1": [1, 4, 3, 3],
        "recs": ["recs1", "recs2", "recs3", "yes"],
    }
)

print(df)
print("---")

list1 = [111]
list2 = [222]
list3 = [333, 331]

# Cast inum to int in one go
df["inum_int"] = df["inum"].astype(int)
# Empty the recs where inum doesn't match desc
df.loc[df["inum_int"].isin(list1) & ~(df["DESC_1"] == 1), "recs"] = ""
df.loc[df["inum_int"].isin(list2) & ~(df["DESC_1"] == 2), "recs"] = ""
df.loc[df["inum_int"].isin(list3) & ~(df["DESC_1"] == 3), "recs"] = ""
df.drop(columns=["inum_int"], inplace=True)
print(df)

This outputs:

    id inum  DESC_1   recs
0  id1  111       1  recs1
1  id2  222       4  recs2
2  id3  333       3  recs3
3  id4  331       3    yes
---
    id inum  DESC_1   recs
0  id1  111       1  recs1
1  id2  222       4       
2  id3  333       3  recs3
3  id4  331       3    yes
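
One difference worth noting: the three `.loc` statements are applied independently, whereas the `if/elif` chain in the question stops at the first list that contains the `inum`. The two only agree when no `inum` appears in more than one list. A possible variant, not part of the original answer, is to build a single lookup from `inum` to the `DESC_1` value it requires and compare in one pass. The names `expected_desc`, `expected` and `keep` are introduced here purely for illustration; this is a sketch under the assumption that every `inum` string can be cast to `int`.

```python
import pandas as pd

# Sample data copied from the answer above.
df = pd.DataFrame(
    {
        "id": ["id1", "id2", "id3", "id4"],
        "inum": ["111", "222", "333", "331"],
        "DESC_1": [1, 4, 3, 3],
        "recs": ["recs1", "recs2", "recs3", "yes"],
    }
)
list1 = [111]
list2 = [222]
list3 = [333, 331]

# Hypothetical lookup: inum -> the DESC_1 value required to keep recs.
# Lists are inserted in reverse order so that list1 overwrites list2 and
# list3 on overlap, mirroring the if/elif precedence in the question.
expected_desc = {}
for required, members in [(3, list3), (2, list2), (1, list1)]:
    for inum in members:
        expected_desc[inum] = required

# NaN where the inum is in none of the lists.
expected = df["inum"].astype(int).map(expected_desc)

# Keep recs when the inum is unlisted or DESC_1 matches the required value.
keep = expected.isna() | (expected == df["DESC_1"])
df["recs"] = df["recs"].where(keep, "")
print(df)
```

On the sample data this blanks only the `recs` of `id2`, the same result as the answer's output. Adding a fourth list would only mean adding one more `(required, members)` pair to the loop.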
