Moving specific strings between columns in Pandas

plicqrtu  posted on 2023-01-07  in  Other

I have a Pandas DataFrame of properties.

**Address**      | **Added on**         |
15 Smith Close   | Added on 17/11/22    |
1 Apple Drive    | Reduced on 19/11/22  |
27 Pride place   | Added on 18/1//22    |

I want to move every "Reduced on ..." entry in the "Added on" column into another column of the DataFrame named "Reduced on". How can I do this?
Many thanks.


eeq64g8w  #1

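You can use pd.Series.where (the Series counterpart of pd.DataFrame.where), which keeps the values where the condition is True and replaces the rest with NaN: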

df['Reduced on'] = df['Added on'].where(df['Added on'].str.contains('Reduced on'))
df['Added on'] = df['Added on'].where(~ df['Added on'].str.contains('Reduced on'))

df

          Address           Added on           Reduced on
0  15 Smith Close  Added on 17/11/22                  NaN
1   1 Apple Drive                NaN  Reduced on 19/11/22
2  27 Pride place  Added on 18/1//22                  NaN
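
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same where-based approach. The sample frame is rebuilt from the question's table; the variable name is_reduced and the na=False argument (so that rows with a missing value count as "not reduced") are my own additions, not part of the answer above.

```python
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({
    "Address": ["15 Smith Close", "1 Apple Drive", "27 Pride place"],
    "Added on": ["Added on 17/11/22", "Reduced on 19/11/22", "Added on 18/1//22"],
})

# Boolean mask: True where the cell contains "Reduced on".
# na=False (an assumption) makes missing cells evaluate to False.
is_reduced = df["Added on"].str.contains("Reduced on", na=False)

# Keep the value only where the mask is True; everything else becomes NaN
df["Reduced on"] = df["Added on"].where(is_reduced)
# Keep the value only where the mask is False
df["Added on"] = df["Added on"].where(~is_reduced)

print(df)
```

Computing the mask once, before either column is overwritten, also means the order of the two assignments no longer matters.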

Another approach is to use pd.Series.str.extract together with pd.concat:

pd.concat([df['Address'], df['Added on'].str.extract('(?P<Added_on>Add.*)|(?P<Reduced_on>Reduced.*)')], axis=1)

          Address           Added_on           Reduced_on
0  15 Smith Close  Added on 17/11/22                  NaN
1   1 Apple Drive                NaN  Reduced on 19/11/22
2  27 Pride place  Added on 18/1//22                  NaN
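
The named groups in the pattern become the column names of the extracted frame, and because the two groups are alternatives, only one of them is filled per row. Since regex group names cannot contain spaces, the columns come out as Added_on and Reduced_on. A possible sketch that renames them back to the headings asked for in the question is below; the rename step, the slightly stricter pattern, and the names parts and result are my own additions.

```python
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({
    "Address": ["15 Smith Close", "1 Apple Drive", "27 Pride place"],
    "Added on": ["Added on 17/11/22", "Reduced on 19/11/22", "Added on 18/1//22"],
})

# One named group per target column; the group that does not match yields NaN
pattern = r"(?P<Added_on>Added on.*)|(?P<Reduced_on>Reduced on.*)"
parts = df["Added on"].str.extract(pattern)

# Group names cannot contain spaces, so restore the desired headings afterwards
parts = parts.rename(columns={"Added_on": "Added on", "Reduced_on": "Reduced on"})

# Put the untouched Address column next to the two extracted columns
result = pd.concat([df["Address"], parts], axis=1)

print(result)
```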

knpiaxh1  #2

Proposed code:

import pandas as pd
import numpy as np

# Build Dataframe to work on
df = pd.DataFrame({"**Address**": ['15 Smith Close', '1 Apple Drive', '27 Pride place'],
                   "**Added on**": ['Added on 17/11/22', 'Reduced on 19/11/22', 'Added on 18/1//22']})

# Define the mask m
m = df['**Added on**'].str.contains('Reduced')               

# 1- Copy the 'Reduced' rows into **Reduced on** (all other rows become NaN)
df['**Reduced on**'] = df['**Added on**'].where(m, np.nan)
# 2- Blank out the 'Reduced' rows in **Added on**
df['**Added on**'] = df['**Added on**'].where(~m, np.nan)

print(df)

Result:

      **Address**       **Added on**       **Reduced on**
0  15 Smith Close  Added on 17/11/22                  NaN
1   1 Apple Drive                NaN  Reduced on 19/11/22
2  27 Pride place  Added on 18/1//22                  NaN
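
A small side note on the two where calls: NaN is already the default replacement value, so passing np.nan is optional, and Series.mask is the mirror image of Series.where (it blanks the rows where the condition is True). The sketch below checks both equivalences on the question's data; the names s, via_where and via_mask are only illustrative.

```python
import numpy as np
import pandas as pd

# The single column from the question's table
s = pd.Series(["Added on 17/11/22", "Reduced on 19/11/22", "Added on 18/1//22"])
m = s.str.contains("Reduced")

# where keeps values where the condition is True ...
via_where = s.where(~m, np.nan)
# ... while mask replaces values where the condition is True
via_mask = s.mask(m)

# Series.equals treats NaNs in the same position as equal
print(via_where.equals(via_mask))
# The explicit np.nan is the default, so it can be dropped
print(s.where(m, np.nan).equals(s.where(m)))
```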

eqzww0vc  #3

This should work as well:

(df[['Address']]
 .join(df[['Added on']]
       .set_index(df['Added on']
                  .str.rsplit(n=1)
                  .str[0]
                  .rename(None), append=True)['Added on']
       .unstack()))

Output:

          Address           Added on           Reduced on
0  15 Smith Close  Added on 17/11/22                  NaN
1   1 Apple Drive                NaN  Reduced on 19/11/22
2  27 Pride place  Added on 18/1//22                  NaN
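
Because the chain above is fairly dense, here is the same idea unrolled into separate steps, as a sketch rather than a replacement. The intermediate names labels, indexed, wide and result are mine. Note that it relies on the date being the last whitespace-separated token of each cell, which holds for the question's sample data.

```python
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({
    "Address": ["15 Smith Close", "1 Apple Drive", "27 Pride place"],
    "Added on": ["Added on 17/11/22", "Reduced on 19/11/22", "Added on 18/1//22"],
})

# Step 1: split off the last token (the date) and keep the text before it,
# giving one label per row: "Added on" or "Reduced on"
labels = df["Added on"].str.rsplit(n=1).str[0].rename(None)

# Step 2: append the label as a second index level next to the row number
indexed = df[["Added on"]].set_index(labels, append=True)["Added on"]

# Step 3: pivot the label level into columns, one column per distinct label
wide = indexed.unstack()

# Step 4: join the new columns back onto the addresses by row index
result = df[["Address"]].join(wide)

print(result)
```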
