pandas: How to replace specific values in a Jupyter Notebook [duplicate]

Posted by bxfogqkk on 2023-03-28 in Other

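This question already has answers here:

(9 answers)
Closed 7 hours ago.
I started learning Python for data science a few weeks ago and ran into this problem in a project of my own. I am trying to replace the game publisher's name with "Other" when its count is 5 or lower. I tried the .mask() function, but it seems to replace the "Counts" values with "Other" as well. Is it possible to change the "Publisher" value to "Other" while keeping the "Counts" value as it is?
Here is what I tried: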

publisher_subset = data.filter(['Publisher'])
df = publisher_subset.value_counts().reset_index(name='Counts')
df.mask(df["Counts"] <= 5, "Other", inplace=False)



#1 r3i60tvu

You are looking for np.where:

import numpy as np    
df['Publisher'] = np.where(df["Counts"] <= 5, "Other", df['Publisher'])
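
For anyone who wants to try this without the original dataset, here is a minimal self-contained sketch. The sample rows and the publisher names "Atlus" and "Koei" are made up for illustration; only the `np.where` line comes from the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one row per game (names and row counts are made up).
data = pd.DataFrame({
    "Publisher": ["Sony"] * 7 + ["Bandai"] * 6 + ["Atlus"] * 5 + ["Koei"] * 4
})

# Count rows per publisher, as in the question.
df = data.filter(["Publisher"]).value_counts().reset_index(name="Counts")

# Where the count is 5 or lower, use "Other"; otherwise keep the existing name.
# Only the Publisher column is assigned, so Counts is left alone.
df["Publisher"] = np.where(df["Counts"] <= 5, "Other", df["Publisher"])

print(df)
```

Because the result is assigned to a single column, the "Counts" values stay numeric.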

#2 vkc1a9a2

You can use the .loc[] indexer to apply the mask selectively to the Publisher column only:

publisher_subset = data.filter(['Publisher'])
df = publisher_subset.value_counts().reset_index(name='Counts')

df.loc[df["Counts"] <= 5, "Publisher"] = "Other"
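
As a rough illustration of why this leaves "Counts" untouched: `.loc` takes a row selector and a column label, so the assignment only reaches the "Publisher" cells of the matching rows. The sketch below uses a small hand-written counts table as a stand-in for the question's `df`; the names "Atlus" and "Koei" and the variable `is_rare` are assumptions for the example.

```python
import pandas as pd

# Hypothetical counts table standing in for the question's df.
df = pd.DataFrame({
    "Publisher": ["Sony", "Bandai", "Atlus", "Koei"],
    "Counts": [7, 6, 5, 4],
})

# Boolean row selector: True where the publisher count is 5 or lower.
is_rare = df["Counts"] <= 5

# Row selector plus column label: only Publisher cells in the selected rows change.
df.loc[is_rare, "Publisher"] = "Other"

print(df)
```

By contrast, the question's `df.mask(df["Counts"] <= 5, "Other")` was called on the whole DataFrame, which is consistent with the asker seeing "Counts" replaced as well.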

#3 ryhaxcpt

You can use mask if you select only the Publisher column:

# Select publisher column only to mask values
rename_others = lambda x: x['Publisher'].mask(x['counts'] <= 5, other='Others')

out = (df.value_counts('Publisher').reset_index(name='counts')
         .assign(Publisher=rename_others))
print(out)

# Output
  Publisher  counts
0      Sony       7
1    Bandai       6
2    Others       5
3    Others       5
4    Others       5
5    Others       4
6    Others       4

I suppose you also want to sum the counts by Publisher:

out = (df.value_counts('Publisher').reset_index(name='counts')
         .assign(Publisher=rename_others)
         .groupby('Publisher', sort=False, as_index=False).sum())

  Publisher  counts
0      Sony       7
1    Bandai       6
2    Others      23
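
To run the pipeline above end to end, the following sketch builds raw data whose per-publisher row counts match the output shown (7, 6, 5, 5, 5, 4, 4). The names "Pub A" to "Pub E" and the `rows_per_publisher` dict are placeholders invented for this example, not part of the original data.

```python
import pandas as pd

# Hypothetical raw data: one row per game. "Pub A" to "Pub E" are placeholder names.
rows_per_publisher = {
    "Sony": 7, "Bandai": 6,
    "Pub A": 5, "Pub B": 5, "Pub C": 5, "Pub D": 4, "Pub E": 4,
}
df = pd.DataFrame({
    "Publisher": [name for name, n in rows_per_publisher.items() for _ in range(n)]
})

# Mask only the Publisher column where the count is 5 or lower.
rename_others = lambda x: x["Publisher"].mask(x["counts"] <= 5, other="Others")

out = (df.value_counts("Publisher").reset_index(name="counts")
         .assign(Publisher=rename_others)
         # sort=False keeps groups in order of first appearance.
         .groupby("Publisher", sort=False, as_index=False).sum())
print(out)
```

The five rare publishers collapse into a single "Others" row whose count is 5 + 5 + 5 + 4 + 4 = 23.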
