numpy: Fill a pandas DataFrame column with values based on previous values of the same group, subject to a condition

Posted by 3pvhb19x on 2023-05-07 in Other

Consider the following pandas DataFrame:

import pandas as pd

df = pd.DataFrame({'Make': ['Tesla','Tesla','Tesla','Toyota','Ford','Ford','Ford','BMW','BMW','BMW','Mercedes','Mercedes','Mercedes'],
                   'Type': ['Model X','Model X','Model X','Corolla','Bronco','Bronco','Mustang','3 Series','3 Series','7 Series','C-Class','C-Class','S-Class'],
                   'Year': [2015, 2015, 2015, 2017, 2018, 2018, 2020, 2015, 2015, 2017, 2018, 2018, 2020],
                   'Price': [85000, 90000, 95000, 20000, 35000, 35000, 45000, 40000, 40000, 65000, 50000, 50000, 75000],
                   'Color': ['White','White','White','Red','Blue','Blue','Yellow','Silver','Silver','Black','White','White','Black'],
                   'Code'  : ['TSLABG','TSLA',None,'TYTA','FRD','_BG',None,None,'BMW','BMW','MercedesBG','Mercedes_BG','MercedesBG']
                  })
df
     Make   Type       Year Price   Color   Code
0   Tesla   Model X    2015 85000   White   TSLABG
1   Tesla   Model X    2015 90000   White   TSLA
2   Tesla   Model X    2015 95000   White   None
3   Toyota  Corolla    2017 20000   Red     TYTA
4    Ford   Bronco     2018 35000   Blue    FRD 
5    Ford   Bronco     2018 35000   Blue    _BG
6    Ford   Mustang    2020 45000   Yellow  None
7     BMW   3 Series   2015 40000   Silver  None
8     BMW   3 Series   2015 40000   Silver  BMW
9     BMW   7 Series   2017 65000   Black   BMW
10 Mercedes C-Class    2018 50000   White   MercedesBG
11 Mercedes C-Class    2018 50000   White   Mercedes_BG
12 Mercedes S-Class    2020 75000   Black   MercedesBG

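I am trying to update the Code column based on the Make column: where Code is None, it should be filled in from the other Code values of the same Make. In other words, if a Code is defined anywhere in the Code column for a given Make, that value should be used to fill that Make's None values; and if BG or _BG is appended to any Code value of a Make, every Code value of that Make should have BG or _BG appended, respectively.
BMW has no BG or _BG on its existing Code values, so nothing is appended when its None is replaced. For Ford, one value is _BG and another is FRD, so every Ford Code value should be FRD_BG.
For Mercedes, some Code values have BG appended; since one of them has the BG prefixed with _, the suffix _BG should be applied to all of its Code values.
The code I tried: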

code = (df['Code'].str.split('(_)', expand=True).add_prefix('part').replace('-', None).groupby(df['Make']).transform('first').fillna('').agg(''.join, axis=1))
df['Code'] = code
df

The output I get contains MercedesBG_BG, but I expect Mercedes_BG:

     Make   Type       Year Price   Color   Code
0   Tesla   Model X    2015 85000   White   TSLABG
1   Tesla   Model X    2015 90000   White   TSLABG
2   Tesla   Model X    2015 95000   White   TSLABG
3   Toyota  Corolla    2017 20000   Red     TYTA
4    Ford   Bronco     2018 35000   Blue    FRD_BG
5    Ford   Bronco     2018 35000   Blue    FRD_BG
6    Ford   Mustang    2020 45000   Yellow  FRD_BG
7     BMW   3 Series   2015 40000   Silver  BMW
8     BMW   3 Series   2015 40000   Silver  BMW
9     BMW   7 Series   2017 65000   Black   BMW
10 Mercedes C-Class    2018 50000   White   MercedesBG_BG
11 Mercedes C-Class    2018 50000   White   MercedesBG_BG
12 Mercedes S-Class    2020 75000   Black   MercedesBG_BG
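
To see where the doubled suffix comes from, here is a minimal sketch that replays the split on the three Mercedes codes only. It assumes that `transform('first')` picks the first non-null value of each part column independently, which the `dropna().iloc[0]` step below imitates; the names `codes`, `parts`, `first` and `joined` are only illustrative.

```python
import pandas as pd

# The three Mercedes codes from the example data.
codes = pd.Series(['MercedesBG', 'Mercedes_BG', 'MercedesBG'], name='Code')

# Split on '_' and keep the separator (the capture group preserves it).
parts = codes.str.split('(_)', expand=True).add_prefix('part')
print(parts)

# Take the first non-null value of every column, independently of the others.
first = parts.apply(lambda col: col.dropna().iloc[0])

# Joining the per-column picks mixes pieces that come from different rows.
joined = ''.join(first)
print(joined)
```

Because 'MercedesBG' contains no underscore, it lands whole in the first column, while '_' and 'BG' are taken from the 'Mercedes_BG' row, so the pieces come from different rows.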

The expected output is:

     Make   Type       Year Price   Color   Code
0   Tesla   Model X    2015 85000   White   TSLABG
1   Tesla   Model X    2015 90000   White   TSLABG
2   Tesla   Model X    2015 95000   White   TSLABG
3   Toyota  Corolla    2017 20000   Red     TYTA
4    Ford   Bronco     2018 35000   Blue    FRD_BG
5    Ford   Bronco     2018 35000   Blue    FRD_BG
6    Ford   Mustang    2020 45000   Yellow  FRD_BG
7     BMW   3 Series   2015 40000   Silver  BMW
8     BMW   3 Series   2015 40000   Silver  BMW
9     BMW   7 Series   2017 65000   Black   BMW
10 Mercedes C-Class    2018 50000   White   Mercedes_BG
11 Mercedes C-Class    2018 50000   White   Mercedes_BG
12 Mercedes S-Class    2020 75000   Black   Mercedes_BG

bfnvny8b #1

With a custom function:

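def f(x):
    x = x.copy()  # avoid mutating the group slice handed over by groupby
    bg = x.str.endswith('BG', na=False)  # base mask: codes ending in BG (None -> False)
    if bg.any():
        if x.str.endswith('_BG', na=False).any():  # _BG takes priority over BG
            # replace codes that start with '_BG', strip any existing
            # BG/_BG suffix from the rest, then append '_BG'
            x = (x.replace('^_BG', None, regex=True)
                 .replace('_?BG$', '', regex=True) + '_BG')
        else:
            x[~bg] = x[~bg] + 'BG'  # add BG to the codes that lack it
    return x.ffill().bfill()  # fill the remaining gaps within the group

# group_keys=False keeps the original row index so the result aligns with df
df['Code'] = df.groupby('Make', group_keys=False)['Code'].apply(f)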
print(df)
        Make      Type  Year  Price   Color         Code
0      Tesla   Model X  2015  85000   White       TSLABG
1      Tesla   Model X  2015  90000   White       TSLABG
2      Tesla   Model X  2015  95000   White       TSLABG
3     Toyota   Corolla  2017  20000     Red         TYTA
4       Ford    Bronco  2018  35000    Blue       FRD_BG
5       Ford    Bronco  2018  35000    Blue       FRD_BG
6       Ford   Mustang  2020  45000  Yellow       FRD_BG
7        BMW  3 Series  2015  40000  Silver          BMW
8        BMW  3 Series  2015  40000  Silver          BMW
9        BMW  7 Series  2017  65000   Black          BMW
10  Mercedes   C-Class  2018  50000   White  Mercedes_BG
11  Mercedes   C-Class  2018  50000   White  Mercedes_BG
12  Mercedes   S-Class  2020  75000   Black  Mercedes_BG
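
As a cross-check, the same per-Make rule can also be written in plain Python and mapped back onto the rows. This is only a sketch under stated assumptions: `resolve_code` and `demo` are hypothetical names, `demo` is a two-column copy of the example data, and the helper assumes that all known codes of a Make share the same base, so it simply takes the first one.

```python
import re
import pandas as pd

def resolve_code(codes):
    """Hypothetical helper: pick the one code a Make should use."""
    known = [c for c in codes if isinstance(c, str)]  # ignore None/NaN
    if not known:
        return None
    # Suffix priority: '_BG' beats 'BG', which beats no suffix at all.
    if any(c.endswith('_BG') for c in known):
        suffix = '_BG'
    elif any(c.endswith('BG') for c in known):
        suffix = 'BG'
    else:
        suffix = ''
    # Strip any existing suffix; an entry such as '_BG' leaves an empty base.
    bases = [b for b in (re.sub(r'_?BG$', '', c) for c in known) if b]
    # Assumption: every known code of a Make has the same base.
    return (bases[0] if bases else '') + suffix

demo = pd.DataFrame({
    'Make': ['Tesla', 'Tesla', 'Tesla', 'Toyota', 'Ford', 'Ford', 'Ford',
             'BMW', 'BMW', 'BMW', 'Mercedes', 'Mercedes', 'Mercedes'],
    'Code': ['TSLABG', 'TSLA', None, 'TYTA', 'FRD', '_BG', None,
             None, 'BMW', 'BMW', 'MercedesBG', 'Mercedes_BG', 'MercedesBG'],
})

# Resolve one code per Make, then map it back onto every row of that Make.
mapping = {make: resolve_code(group)
           for make, group in demo.groupby('Make')['Code']}
demo['Code'] = demo['Make'].map(mapping)
print(demo)
```

Resolving a single value per Make and mapping it back avoids any dependence on row order inside a group, at the cost of a Python-level loop over the groups.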
