Python 3.x: append to a new column if an item in one column is not equal to an item in another column

pes8fvy9  posted on 2023-02-06  in Python

Suppose I have a CSV file, invites.csv:

Email Invite                    Email Denied                                                                                       
batman@email.com                batman@email.com                       
poisonivy@email.com             catgirl@email.com             
superman@email.com              supergirl@email.com           
catgirl@email.com                           
joker@email.com                             
supergirl@email.com

I want to compare these two columns and create a new column, Emails Left, that contains only the emails that do not appear in the Email Denied column.

Email Invite                    Email Denied               Emails Left                                                                                 
batman@email.com                batman@email.com           poisonivy@email.com               
poisonivy@email.com             catgirl@email.com          superman@email.com   
superman@email.com              supergirl@email.com        joker@email.com   
catgirl@email.com                                          flash@email.com
joker@email.com                             
supergirl@email.com
flash@email.com

Here is my code:

import pandas as pd

Dir = 'invites.csv'

# read_csv already returns a DataFrame, so no extra wrapping is needed
df = pd.read_csv(Dir)

a = len(df['Email Invite'])
aList = []

# compare the two columns row by row (column names are case-sensitive)
for i in range(a):
    if df['Email Invite'][i] != df['Email Denied'][i]:
        aList.append(df['Email Invite'][i])

# place list as third column df['Emails Left']
Answer 1 (63lcw9qa)

I figured it out.
Before running the if statement, I first had to make the rows of the two columns line up with each other.

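import pandas as pd

Dir = 'invites.csv'
df = pd.read_csv(Dir)

# left merge: keep every invited email and place its matching
# denied email (if any) on the same row; unmatched rows get NaN
df = pd.merge(df[['Email Invite']],
              df[['Email Denied']],
              left_on='Email Invite',
              right_on='Email Denied',
              how='left')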

After that, the DataFrame looks like this:

Email Invite                    Email Denied                                                                                       
batman@email.com                batman@email.com                       
poisonivy@email.com                          
superman@email.com                         
catgirl@email.com               catgirl@email.com             
joker@email.com                             
supergirl@email.com             supergirl@email.com
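
To make the alignment step easier to check, here is a minimal sketch that rebuilds the question's data in memory (an assumption, standing in for invites.csv) and runs the same left merge. Rows with no denied match end up with NaN in Email Denied, and since NaN compares unequal to any string, a row-wise `!=` check picks out exactly those rows.

```python
import pandas as pd

# In-memory stand-in for invites.csv (assumed to match the question's data)
df = pd.DataFrame({
    'Email Invite': ['batman@email.com', 'poisonivy@email.com',
                     'superman@email.com', 'catgirl@email.com',
                     'joker@email.com', 'supergirl@email.com'],
    'Email Denied': ['batman@email.com', 'catgirl@email.com',
                     'supergirl@email.com', None, None, None],
})

# Same left merge as in the answer: one row per invited email
merged = pd.merge(df[['Email Invite']],
                  df[['Email Denied']],
                  left_on='Email Invite',
                  right_on='Email Denied',
                  how='left')

# Unmatched invited emails have a missing value in 'Email Denied'
unmatched = merged['Email Denied'].isna()
not_denied = merged.loc[unmatched, 'Email Invite'].tolist()
print(not_denied)
```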

Then I continued with my for loop and if statement:

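a = len(df['Email Invite'])
aList = []

# after the merge, a row differs only when the invited email was not denied
for i in range(a):
    if df['Email Invite'][i] != df['Email Denied'][i]:
        aList.append(df['Email Invite'][i])

# shorter than the DataFrame, so the remaining rows are filled with NaN
df['Emails Left'] = pd.Series(aList)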

Now I have my extra column:

Email Invite                  Email Denied           Emails Left                                                                                      
batman@email.com              batman@email.com       poisonivy@email.com                
poisonivy@email.com                                  superman@email.com  
superman@email.com                                   joker@email.com
catgirl@email.com             catgirl@email.com              
joker@email.com                             
supergirl@email.com           supergirl@email.com

Now I can write it out to a new CSV.

df.to_csv("NewInvite.csv", index=False)

Now the program works fine.
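
For comparison, the merge and the loop could probably be replaced by a boolean mask built with `Series.isin`. This is a sketch of an alternative technique, not the approach used in the answer above, and it assumes the same data as the question, built in memory instead of read from invites.csv.

```python
import pandas as pd

# In-memory stand-in for invites.csv (assumed to match the question's data)
df = pd.DataFrame({
    'Email Invite': ['batman@email.com', 'poisonivy@email.com',
                     'superman@email.com', 'catgirl@email.com',
                     'joker@email.com', 'supergirl@email.com'],
    'Email Denied': ['batman@email.com', 'catgirl@email.com',
                     'supergirl@email.com', None, None, None],
})

# Drop the empty cells, then keep invited emails not found among the denied
denied = df['Email Denied'].dropna()
left = df.loc[~df['Email Invite'].isin(denied), 'Email Invite']

# Reset the index so the survivors start at row 0; assignment aligns on the
# index, so the remaining rows of the new column become NaN
df['Emails Left'] = left.reset_index(drop=True)
print(df)
```

Because no row alignment is needed, this version does not depend on the order of the emails in either column.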

Answer 2 (vwhgwdsa)

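Using a DataFrame whose columns have different lengths is not recommended. You have to pad the remaining elements with NaN, '' or something else so that the columns are equal in length. Lists work better in this case: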

import pandas as pd

df = pd.DataFrame({
    'Email Invite': ['batman@email.com', 'poisonivy@email.com', 'superman@email.com',
                     'catgirl@email.com', 'joker@email.com', 'supergirl@email.com'],
    'Email Denied': ['batman@email.com', 'catgirl@email.com', 'supergirl@email.com',
                     '', '', ''],
})

email_invited = list(df['Email Invite'])
email_denied = list(df[df['Email Denied']!='']['Email Denied'])

email_left = [email for email in email_invited if email not in email_denied]
print(email_left)
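
If the result still has to go back into a table as an Emails Left column, as the question asks, the lists can be padded to a common length first. The sketch below is one possible way to do that; the `pad` helper and the `out` name are hypothetical, introduced only for illustration, and the set is an optional tweak that makes each membership check cheaper than scanning a list.

```python
import pandas as pd

email_invited = ['batman@email.com', 'poisonivy@email.com', 'superman@email.com',
                 'catgirl@email.com', 'joker@email.com', 'supergirl@email.com']
email_denied = ['batman@email.com', 'catgirl@email.com', 'supergirl@email.com']

# A set gives constant-time membership checks on average
denied_set = set(email_denied)
email_left = [email for email in email_invited if email not in denied_set]


def pad(values, length, fill=''):
    """Hypothetical helper: right-pad a list with `fill` up to `length`."""
    return values + [fill] * (length - len(values))


# Every column must have the same length before building the DataFrame
n = len(email_invited)
out = pd.DataFrame({
    'Email Invite': email_invited,
    'Email Denied': pad(email_denied, n),
    'Emails Left': pad(email_left, n),
})
print(out)
```

From there, `out.to_csv("NewInvite.csv", index=False)` would write the three columns to a file, as in the first answer.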
