I have the following DataFrame:
import pandas as pd
df = pd.DataFrame({'Category': {0: 'onboarding segment-confirmation-unexpected-input origin',
1: 'onboarding segment-confirmation-unexpected-input view',
2: 'product-availability cpf-request-unexpected-input origin',
3: 'product-availability postalcode-validation-true-unexpected-input origin',
4: 'product-availability postalcode-validation-true-unexpected-input view'},
'UserId': {0: 9090, 1: 4545, 2: 3266, 3: 2894, 4: 2772}})
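What I want to do is build a flag that checks whether the part of the string other than the word "view" or "origin" is equal to the value in the previous row. If it is, the flag stays the same; if it is not, the flag value is incremented.
Expected result: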
df = pd.DataFrame({'Category': {0: 'onboarding segment-confirmation-unexpected-input origin',
1: 'onboarding segment-confirmation-unexpected-input view',
2: 'product-availability cpf-request-unexpected-input origin',
3: 'product-availability postalcode-validation-true-unexpected-input origin',
4: 'product-availability postalcode-validation-true-unexpected-input view'},
'UserId': {0: 9090, 1: 4545, 2: 3266, 3: 2894, 4: 2772},
'Flag':{0:'Flag_1',1:'Flag_1',2:'Flag_2',3:'Flag_3',4:'Flag_3'}})
How can I do this? I tried slicing the string and using a groupby, but I am having some trouble with the increment part.
3 Answers
kulphzqa1#
Assuming you want to consider the first two chunks of the string (chunks are separated by spaces):
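A minimal sketch of this idea, not necessarily the code originally posted: take the first two space-separated chunks as a key, compare each key with the previous row using `shift`, and turn the changes into a running counter with `cumsum`. The names `key` and `changed` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': [
        'onboarding segment-confirmation-unexpected-input origin',
        'onboarding segment-confirmation-unexpected-input view',
        'product-availability cpf-request-unexpected-input origin',
        'product-availability postalcode-validation-true-unexpected-input origin',
        'product-availability postalcode-validation-true-unexpected-input view',
    ],
    'UserId': [9090, 4545, 3266, 2894, 2772],
})

# Keep only the first two space-separated chunks, which drops the
# trailing "view" / "origin" word (assumes every row has three chunks).
key = df['Category'].str.split().str[:2].str.join(' ')

# True wherever the key differs from the previous row; the first row
# has no predecessor, so it always counts as a change.
changed = key.ne(key.shift())

# The running count of changes is 1, 1, 2, 3, 3 for this data.
df['Flag'] = 'Flag_' + changed.cumsum().astype(str)

print(df[['Category', 'Flag']])
```

Because the comparison is against the previous row only, a key that reappears later after a different key would get a new flag rather than reusing its old one.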
Output:
0x6upsns2#
This worked for me:
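One possible variant, offered as a sketch rather than as the code this answer contained: instead of counting chunks, strip a trailing "view" or "origin" with a regular expression, then number the consecutive runs of equal keys. The regex and the variable names are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': [
        'onboarding segment-confirmation-unexpected-input origin',
        'onboarding segment-confirmation-unexpected-input view',
        'product-availability cpf-request-unexpected-input origin',
        'product-availability postalcode-validation-true-unexpected-input origin',
        'product-availability postalcode-validation-true-unexpected-input view',
    ],
    'UserId': [9090, 4545, 3266, 2894, 2772],
})

# Remove a trailing " view" or " origin"; rows ending in any other
# word are left unchanged by this pattern.
key = df['Category'].str.replace(r'\s+(?:view|origin)$', '', regex=True)

# Start a new run whenever the key differs from the previous row,
# then number the runs from 1.
run_id = key.ne(key.shift()).cumsum()

df['Flag'] = 'Flag_' + run_id.astype(str)

print(df[['Category', 'Flag']])
```

If the goal were to give every distinct key one flag regardless of row order, `df.groupby(key, sort=False).ngroup() + 1` could replace the `shift`/`cumsum` step; on this sample data both give the same result.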
iqxoj9l93#