Why does .merge() drop all my columns in Pandas? [duplicate]

Posted by mnemlml8 on 2022-12-28 in Other
This question already has answers here:

Pandas Merging 101 (8 answers)
Closed 6 hours ago.
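I am developing a GUI for my job in health data that edits and compares the CSV files users upload each month. I already built this for one of our client businesses, and I am now trying to reconfigure it to work for another client business. The code runs without errors, but for some reason my new_users DataFrame prints as blank after the merge. The same code works fine in the other package I built. My first thought was that UniqueID needs to be 10 characters long, but even after enforcing that requirement I still get an empty new_users DataFrame. This has me stumped.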

def client_merge(ef_in, ul_in):
    pd.set_option('mode.chained_assignment', None)

    ef_in['UniqueID'] = ef_in['UniqueID'].astype(object)
    ef_in['HireDate'] = ef_in['HireDate'].astype(object)
    ef_in['DateOfBirth'] = ef_in['DateOfBirth'].astype(object)
    ul_in['UniqueID'] = ul_in['UniqueID'].astype(object)
    ul_in['Action'] = ul_in['Action'].astype(object)
    ul_in['ZipCode'] = ul_in['ZipCode'].astype(object)

    df = pd.concat(([ef_in, ul_in]), axis=0, ignore_index=True, sort=False)
    df.drop_duplicates(subset=["UniqueID"], keep=False, inplace=True)

    df['UniqueID'] = df['UniqueID'].str.rjust(10, "0")
    ef_in['UniqueID'] = ef_in['UniqueID'].str.rjust(10, "0")
    ul_in['UniqueID'] = ul_in['UniqueID'].str.rjust(10, "0")

    print(ef_in)
    new_users = df.merge(ef_in)
    #print(new_users)
    disable_users = df.merge(ul_in)
    #print(ul_in)

    disable_users['Action'].fillna('Disable', inplace=True)
    ready_to_print_file = pd.concat([new_users, disable_users], ignore_index=False)

    rtpf1 = ready_to_print_file[ready_to_print_file["FirstName"].str.contains("companytest") == False]
    rtpf2 = rtpf1[rtpf1["FirstName"].str.contains("Clarks|test") == False]

    rtpf2.to_csv(path, header=True, index=False)

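I have been playing with this for two hours, cross-referencing my files against a comparison I did by hand in Excel, and the result definitely should not come back blank. My working code for the other client is attached below: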

def client_merge(ef_in, ul_in):
    pd.set_option('mode.chained_assignment', None)

    ef_in['EmployeeId'] = ef_in['EmployeeId'].astype(object)
    ul_in['EmployeeId'] = ul_in['EmployeeId'].astype(object)
    ul_in['Action'] = ul_in['Action'].astype(object)
    ul_in['PrimaryMemberEmployeeId'] = ul_in['PrimaryMemberEmployeeId'].astype(object)
    ul_in['ZipCode'] = ul_in['ZipCode'].astype(object)

    df = pd.concat(([ef_in, ul_in]), axis=0, ignore_index=True, sort=False)
    df.drop_duplicates(subset=["EmployeeId"], keep=False, inplace=True)

    new_users = df.merge(ef_in)
    disable_users = df.merge(ul_in)

    disable_users['Action'].fillna('Disable', inplace=True)
    ready_to_print_file = pd.concat([new_users, disable_users], ignore_index=False)

    rtpf1 = ready_to_print_file[ready_to_print_file["FirstName"].str.contains("admin") == False]
    rtpf2 = rtpf1[rtpf1["FirstName"].str.contains("client") == False]

    rtpf2.to_csv(path, header=True, index=False)
nfzehxib #1

df1.merge(ef_in, how='left', on='a')
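
In the call above, `df1` and `'a'` are placeholders for the question's frame and key column. The likely reason an explicit `on` matters: when `on` is omitted, `merge` joins on every column the two frames share, so a mismatch in any shared column (not just the ID) can empty an inner join. A minimal sketch on made-up data, where the `ZipCode` values are purely illustrative:

```python
import pandas as pd

# Hypothetical toy frames; only the UniqueID column name comes from the question.
ef_in = pd.DataFrame({
    "UniqueID": ["0000000001", "0000000002"],
    "ZipCode": ["12345", "23456"],
})
df = pd.DataFrame({
    "UniqueID": ["0000000001", "0000000002"],
    "ZipCode": ["12345.0", "23456.0"],  # same IDs, but ZipCode text differs
})

# No `on`: pandas joins on all shared columns (UniqueID and ZipCode),
# and the default how="inner" keeps only rows that match on both.
default_merge = df.merge(ef_in)

# Explicit key: rows match on UniqueID alone; the overlapping ZipCode
# columns are kept side by side with the default _x / _y suffixes.
keyed_merge = df.merge(ef_in, how="left", on="UniqueID")

print(len(default_merge), len(keyed_merge))
print(keyed_merge.columns.tolist())
```

Whether this is the exact cause in the question cannot be confirmed without the real CSV files, but comparing `df.dtypes` with `ef_in.dtypes` and checking a few shared columns by eye is a cheap first test.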

For details, see the link:
Reference for pandas dataframe merging
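
Going by the variable names in the question, the goal appears to be "IDs only in `ef_in` become new users, IDs only in `ul_in` get disabled". That reading is an assumption, as are the sample rows below. Under it, one way to avoid the concat, drop_duplicates, and re-merge round trip is an outer merge on the key with `indicator=True`:

```python
import pandas as pd

# Hypothetical sample data; the roles of ef_in and ul_in are assumed, not confirmed.
ef_in = pd.DataFrame({
    "UniqueID": ["0000000001", "0000000002", "0000000003"],
    "FirstName": ["Ann", "Bob", "Cy"],
})
ul_in = pd.DataFrame({
    "UniqueID": ["0000000002", "0000000003", "0000000004"],
    "FirstName": ["Bob", "Cy", "Dee"],
})

# Outer merge on the key only; the _merge column records where each ID was found.
flags = ef_in[["UniqueID"]].merge(
    ul_in[["UniqueID"]], on="UniqueID", how="outer", indicator=True
)
new_ids = flags.loc[flags["_merge"] == "left_only", "UniqueID"]
disable_ids = flags.loc[flags["_merge"] == "right_only", "UniqueID"]

# Select the full rows from each source frame by ID.
new_users = ef_in[ef_in["UniqueID"].isin(new_ids)]
disable_users = ul_in[ul_in["UniqueID"].isin(disable_ids)].assign(Action="Disable")

print(new_users)
print(disable_users)
```

Because the merge touches only `UniqueID`, differences in other shared columns cannot remove rows.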
