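Extract text marked by a special character (:) from a column in a pandas DataFrame and fill the same value down the rows until the next marker appears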

0md85ypi, posted 2023-04-19 in Other

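Using a pandas DataFrame, I want to extract the text from the rows of a column that are marked with a special character (:), forward fill (ffill) that text so the same value repeats until the next marked row, and then drop the marked rows. I tried the following but did not get the result I want.
Input DataFrame: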

import pandas as pd
df = pd.DataFrame({
    'col1': ['White color :',
             'I am not really sure how to do this',
             'I am not really sure how to do this',
             'Black color :',
             'I am not ready to solve your issue',
             'I am not ready to solve your issue',
             'I am not ready to solve your issue'],
})

df['new_col'] = df['col1'].str.extract('^([^:]+)', expand=False)
mask = df.apply(lambda x: x.str.contains(':')).any(axis=1)
df.loc[mask, :] = df.loc[mask, :].ffill(axis=1)
df

Desired output DataFrame:

                                  col1      new_col
0  I am not really sure how to do this  White color
1  I am not really sure how to do this  White color
2   I am not ready to solve your issue  Black color
3   I am not ready to solve your issue  Black color
4   I am not ready to solve your issue  Black color

s2j5cfk0 #1

Following your approach:

m = df["col1"].str.contains(":", na=False)

out = (df.assign(new_col=df["col1"].str.extract(r"^([^:]+)", expand=False)
                 .where(m).ffill()).loc[~m]).reset_index(drop=True)

Output:

print(out)

                                  col1       new_col
0  I am not really sure how to do this  White color 
1  I am not really sure how to do this  White color 
2   I am not ready to solve your issue  Black color 
3   I am not ready to solve your issue  Black color 
4   I am not ready to solve your issue  Black color
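
To see why the `.where(m)` step matters, here is a minimal sketch that splits the chain above into its intermediate Series. The variable names `extracted`, `labels`, and `filled` are mine, not part of the answer. On rows without a colon, `[^:]+` matches the whole text, so those values have to be masked out before the forward fill.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": [
        "White color :",
        "I am not really sure how to do this",
        "I am not really sure how to do this",
        "Black color :",
        "I am not ready to solve your issue",
        "I am not ready to solve your issue",
        "I am not ready to solve your issue",
    ]
})

# True only for the marker rows (the ones containing ":").
m = df["col1"].str.contains(":", na=False)

# Everything before the first ":"; rows without a colon match in full.
extracted = df["col1"].str.extract(r"^([^:]+)", expand=False)

# Keep the extracted text on marker rows only, NaN everywhere else.
labels = extracted.where(m)

# Propagate each label down until the next marker row.
filled = labels.ffill()

# Attach the labels, drop the marker rows, renumber the index.
out = df.assign(new_col=filled).loc[~m].reset_index(drop=True)
print(out)
```

The pattern keeps the space before the colon, which is why the values print as `White color ` with a trailing blank. Chaining `.str.strip()` onto `extracted` should remove it if that matters.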

vjhs03f7 #2

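You can extract the color from col1 into new_col without the colon and then forward fill it. This works because the pattern only matches the col1 values that contain a :, so every other row gets NaN and is filled from the row above. After that, simply drop the rows where col1 holds the color: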

df['new_col'] = df['col1'].str.extract(r'^(.*)\s+:').ffill()
df = df[~df['col1'].str.contains(':')].reset_index(drop=True)

Output:

                                  col1      new_col
0  I am not really sure how to do this  White color
1  I am not really sure how to do this  White color
2   I am not ready to solve your issue  Black color
3   I am not ready to solve your issue  Black color
4   I am not ready to solve your issue  Black color
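
The pattern `^(.*)\s+:` needs at least one whitespace character before the colon, and `str.contains(':')` would also drop an ordinary row that happens to contain a colon mid-sentence. As a hedged variation, assuming marker rows always end with a colon, a single anchored pattern can serve both as the extractor and as the row filter. The pattern and the sample data below are illustrative and not from the question.

```python
import pandas as pd

# Illustrative data: one marker with a space before ":", one without,
# and an ordinary row that contains a colon in the middle.
df = pd.DataFrame({
    "col1": ["White color :", "first note", "Black color:", "note: second"]
})

# Assumed rule: a marker row is some text followed by a trailing colon.
pattern = r"^(.*?)\s*:\s*$"

# NaN wherever the pattern does not match, the label text where it does.
labels = df["col1"].str.extract(pattern, expand=False)
is_marker = labels.notna()

out = (
    df.assign(new_col=labels.ffill())
      .loc[~is_marker]
      .reset_index(drop=True)
)
print(out)
```

Because the match itself marks the rows to drop, there is no second `str.contains` call that could disagree with the extraction.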

bxfogqkk #3

This is a brute-force approach, but it works like a charm.

import pandas as pd
df = pd.DataFrame({
    'col1': ['White color :',
             'I am not really sure how to do this',
             'I am not really sure how to do this',
             'Black color :',
             'I am not ready to solve your issue',
             'I am not ready to solve your issue',
             'I am not ready to solve your issue'],
})

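# Marker rows are hard-coded here; each one ends with " :".
special_character_set = ["White color :", "Black color :"]
special_char = ""  # label of the most recent marker row
special = []       # label for each kept row
temp = []          # text of each kept row
for _, row in df.iterrows():
    if row["col1"] in special_character_set:
        # Strip the trailing " :" (2 characters) to get the label.
        special_char = row["col1"][:-2]
    else:
        temp.append(row["col1"])
        special.append(special_char)

final_df = pd.DataFrame({
    "col1": temp,
    "new_col": special
})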
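
If hard-coding the list of marker rows is not practical, the same idea can probably be written without a loop. The sketch below is a swapped-in technique, not the method above: it numbers each block with `cumsum` and copies the block's first row to every member with `groupby(...).transform("first")`. It assumes every marker row ends with ":" and that the data starts with a marker row. The names `is_marker`, `block_id`, and `header` are mine.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": [
        "White color :",
        "I am not really sure how to do this",
        "I am not really sure how to do this",
        "Black color :",
        "I am not ready to solve your issue",
        "I am not ready to solve your issue",
        "I am not ready to solve your issue",
    ]
})

# Assumption: a marker row is any row whose text ends with ":".
is_marker = df["col1"].str.endswith(":")

# Each marker row starts a new block: 1, 1, 1, 2, 2, 2, 2.
block_id = is_marker.cumsum()

# The first row of every block is its marker row; broadcast it to the block.
header = df.groupby(block_id)["col1"].transform("first")

final_df = (
    df.assign(new_col=header.str.rstrip(": "))  # drop the trailing " :"
      .loc[~is_marker]
      .reset_index(drop=True)
)
print(final_df)
```

Rows that come before the first marker would land in block 0 and be labelled with their own first row, so the starting-marker assumption is worth checking on real data.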
