numpy: Conditional fillna in pandas, incrementing from the previous value

dgtucam1, posted on 2022-12-04 in Other

I want to fill the missing values in the "last unique id" column by incrementing the previous value for the same Channel.

**Input:**
Channel last unique id
0   MYNTRA  MN000351370
1   NYKAA   NYK00038219
2   NYKAA   NaN
3   NYKAA   NaN
4   NYKAA   NaN
5   NYKAA   NaN
6   MYNTRA  NaN
7   MYNTRA  NaN
8   MYNTRA  NaN
9   MYNTRA  NaN
10  MYNTRA  NaN
11  MYNTRA  NaN

**Expected output:**

        Channel last unique id
0   MYNTRA  MN000351370
1   NYKAA   NYK00038219
2   NYKAA   NYK00038220
3   NYKAA   NYK00038221
4   NYKAA   NYK00038222
5   NYKAA   NYK00038223
6   MYNTRA  MN000351371
7   MYNTRA  MN000351372
8   MYNTRA  MN000351373
9   MYNTRA  MN000351374
10  MYNTRA  MN000351375
11  MYNTRA  MN000351376

I hope the problem is clear.

2vuwiymt #1

You can use groupby.cumcount to produce a per-group increment and add it to the numeric part of the id:

g = df.groupby('Channel')

# ffill per group
# extract letter and number part
df2 = (g['last unique id'].ffill()
       .str.extract(r'(\D+)(\d+)')
       )

# convert number part to integer
# add cumcount, merge back as string
df['last unique id'] = (df2[0]
       .add(df2[1].astype(int)
                  .add(g.cumcount())
                  .astype(str))
       )

print(df)

Output (note that converting the numeric part to an integer drops its leading zeros):

Channel last unique id
0   MYNTRA       MN351370
1    NYKAA       NYK38219
2    NYKAA       NYK38220
3    NYKAA       NYK38221
4    NYKAA       NYK38222
5    NYKAA       NYK38223
6   MYNTRA       MN351371
7   MYNTRA       MN351372
8   MYNTRA       MN351373
9   MYNTRA       MN351374
10  MYNTRA       MN351375
11  MYNTRA       MN351376
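
To make the increment step concrete, here is a minimal sketch of what groupby.cumcount returns on its own. The small frame below is made up for illustration and is not the data from the question: each row gets its zero-based position within its group, which is why adding it to the forward-filled number yields consecutive ids.

```python
import pandas as pd

# Hypothetical toy frame, only to illustrate cumcount
df = pd.DataFrame({"Channel": ["MYNTRA", "NYKAA", "NYKAA", "MYNTRA", "NYKAA"]})

# Zero-based running position of each row within its Channel group
counts = df.groupby("Channel").cumcount()

print(counts.tolist())  # [0, 0, 1, 1, 2]
```

Because the count starts at 0, the row that already holds the known id keeps its number unchanged. This also means the approach assumes each group has a single known id in its first row.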

u4dcyp6a #2

Example

import pandas as pd

data = {'col1': {0: 'A', 1: 'B', 2: 'A', 3: 'A', 4: 'B', 5: 'B'},
 'col2': {0: 'A001', 1: 'BC020', 2: None, 3: None, 4: 'BC021', 5: None}}
df = pd.DataFrame(data)

df:

col1 col2
0   A   A001
1   B   BC020
2   A   None
3   A   None
4   B   BC021
5   B   None

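Code

# forward-fill within each group, then split into prefix (col3) and number (col4)
df[['col3', 'col4']] = df.groupby('col1')['col2'].ffill().str.extract(r'(\D+)(\d+)')
# add the running count since the last known id in the same group
df['col4'] = df['col4'].astype('int') + df.groupby(['col1', 'col4']).cumcount()
# fill only the missing ids, padding the number to 3 digits
df['col2'] = df['col2'].fillna(df['col3'] + df['col4'].astype('str').str.zfill(3))
# drop the helper columns
df = df.drop(['col3', 'col4'], axis=1)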

Result (df):

col1 col2
0   A   A001
1   B   BC020
2   A   A002
3   A   A003
4   B   BC021
5   B   BC022
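
The code above pads the number to a fixed 3 digits. As a hedged variant, the sketch below takes the pad width from each forward-filled id instead, so ids with different digit counts keep their own width. The names `filled`, `parts`, `offset`, `width` and `new_ids` are introduced here for illustration only. It assumes every group starts with a known id.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["A", "B", "A", "A", "B", "B"],
    "col2": ["A001", "BC020", None, None, "BC021", None],
})

# Forward-fill the last known id within each col1 group
filled = df.groupby("col1")["col2"].ffill()

# Split into non-digit prefix (column 0) and digit string (column 1)
parts = filled.str.extract(r"(\D+)(\d+)")

# Rows since the last known id: the count restarts whenever the filled id changes
offset = df.groupby(["col1", filled]).cumcount()

number = parts[1].astype(int) + offset
width = parts[1].str.len()  # digit count of the id each row was filled from

# Rows that already had an id get offset 0, so they are rebuilt unchanged
new_ids = [p + str(n).zfill(w) for p, n, w in zip(parts[0], number, width)]
df["col2"] = new_ids

print(df)
```

On this sample the result matches the table above: A001, BC020, A002, A003, BC021, BC022.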

5cnsuln7 #3

Here is how to get the desired output by zero-padding, so that the id always keeps a fixed length of 11.
First
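
The code that belonged to this answer is not preserved here. As a minimal sketch of the idea only, and not the answerer's actual code, the following pads the numeric part so that prefix plus number is always 11 characters long. `TOTAL_LEN`, `parts`, `prefix` and `number` are names assumed for illustration, and the frame is a shortened version of the data in the question.

```python
import pandas as pd

# Shortened version of the question's data
df = pd.DataFrame({
    "Channel": ["MYNTRA", "NYKAA", "NYKAA", "NYKAA", "MYNTRA", "MYNTRA"],
    "last unique id": ["MN000351370", "NYK00038219", None, None, None, None],
})

TOTAL_LEN = 11  # assumed fixed id length, as stated in the answer text

g = df.groupby("Channel")

# Forward-fill per group, then split into prefix and digits
parts = g["last unique id"].ffill().str.extract(r"(\D+)(\d+)")
prefix = parts[0]

# Increment the number by the row's position within its group
number = parts[1].astype(int) + g.cumcount()

# Pad the digits so that prefix + digits is TOTAL_LEN characters long
df["last unique id"] = [
    p + str(n).zfill(TOTAL_LEN - len(p)) for p, n in zip(prefix, number)
]

print(df)
```

Padding to `TOTAL_LEN - len(p)` rather than to a fixed digit count handles prefixes of different lengths, such as MN (2 characters) and NYK (3 characters).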
