Pandas: Generate the percentage of one column relative to another column

Posted by 0md85ypi on 2023-03-11 in Other

My dataset looks like this:
| Stu | Athlete | Baseball | Cheer | Football |
| --- | --- | --- | --- | --- |
| 1 | 0 | 0 | 0 | 0 |
| 2 | 1 | 1 | 0 | 0 |
| 3 | 1 | 0 | 1 | 0 |
| 4 | 1 | 0 | 0 | 1 |
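I want to calculate the percentage of athletes out of the total count, as well as the percentage of each sport group within the athlete population (in other words, Athlete/Count and Sport/Athlete).
I would like to get output like this: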
| Header 1 | |
| --- | --- |
| Athlete | 75 |
| Baseball | 33 |
| Cheer | 33 |
| Football | 33 |
I can get there by running statements like these:

df['Athlete'].sum()/df['Count'].sum() * 100

df['Baseball'].sum()/df['Athlete'].sum() * 100

and so on.
However, is there a way to do this in a single statement, without writing a separate statement for each column?
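
For reference, here is a minimal sketch that rebuilds the sample table as a DataFrame and reproduces the two statements. It assumes the empty cells in the table are 0. Because the table shown has no `Count` column, it also uses the row count `len(df)` as the denominator for the athlete percentage. Both are assumptions, not something the question states.

```python
import pandas as pd

# Sample data from the question's table; using 0 for "not in this group"
# is an assumption.
df = pd.DataFrame({
    'Stu': [1, 2, 3, 4],
    'Athlete': [0, 1, 1, 1],
    'Baseball': [0, 1, 0, 0],
    'Cheer': [0, 0, 1, 0],
    'Football': [0, 0, 0, 1],
})

# Athletes as a percentage of all rows (the row count stands in for 'Count').
athlete_pct = df['Athlete'].sum() / len(df) * 100

# Baseball players as a percentage of athletes.
baseball_pct = df['Baseball'].sum() / df['Athlete'].sum() * 100

print(athlete_pct, round(baseball_pct, 1))
```

With three athletes among four rows and one baseball player among three athletes, this gives 75 and roughly 33.3, matching the desired output after truncation.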

Answer 1 (m528fe3b)

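Here is an approach that uses pandas.DataFrame.iloc to select the sport columns and then builds a new DataFrame:

import pandas as pd

# Columns from position 2 onward are the sports (Baseball, Cheer, Football)
sport_pct = (df.iloc[:, 2:].sum() / df['Athlete'].sum() * 100).astype(int)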
athlete_pct = int(df['Athlete'].sum() / len(df) * 100)

new_df = pd.DataFrame({
    'header 1': ['Athlete'] + list(sport_pct.index),
    '': [athlete_pct] + list(sport_pct.values)
})
print(new_df)

Output:

   header 1    
0   Athlete  75
1  Baseball  33
2     Cheer  33
3  Football  33
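
The same figures can also be produced by dividing the column sums by a matching Series of denominators. The sketch below is one possible variant, not part of the answer above. The sample data (with 0 for empty cells) and the names `sports`, `sums`, `denominators`, and `pct` are assumptions made for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question; 0 for empty cells is an assumption.
df = pd.DataFrame({
    'Stu': [1, 2, 3, 4],
    'Athlete': [0, 1, 1, 1],
    'Baseball': [0, 1, 0, 0],
    'Cheer': [0, 0, 1, 0],
    'Football': [0, 0, 0, 1],
})

sports = ['Baseball', 'Cheer', 'Football']
sums = df[['Athlete'] + sports].sum()

# Denominators: the athlete total for every sport, the row count for 'Athlete'.
denominators = pd.Series(sums['Athlete'], index=sums.index)
denominators['Athlete'] = len(df)

# Element-wise division aligns on the index labels; astype(int) truncates.
pct = (sums / denominators * 100).astype(int)
print(pct)
```

Because the division aligns on column names, adding another sport only requires extending the `sports` list.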

Answer 2 (omvjsjqw)

# Total number of athletes
total_athletes = df['Athlete'].sum()

# Percentage of athletes in each sport (one value per sport)
sport_percentages = df[['Baseball', 'Cheer', 'Football']].sum().apply(
    lambda x: x / total_athletes * 100)

# Each row's share of the athlete total, per sport (one value per row and sport)
group_percentages = df[['Baseball', 'Cheer', 'Football']].apply(
    lambda x: x / df['Athlete'].sum() * 100)

# Stack the results into one DataFrame; cells with no matching column are NaN
result = pd.concat([pd.DataFrame({'Athlete': [total_athletes]}),
                    sport_percentages.to_frame().transpose(), group_percentages])

Answer 3 (pftdvrlh)

You can try groupby and agg, as follows:

# group by 'Athlete' 
result = df.groupby('Athlete').agg(
    Total=('Athlete', 'size'),
    Baseball_Percentage=('Baseball', 'mean'),
    Cheer_Percentage=('Cheer', 'mean'),
    Football_Percentage=('Football', 'mean')
)

# calculate the percentage
result['Percentage_of_Total_Athletes'] = result['Total'] / result['Total'].sum() * 100
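
To see what this returns, the sketch below runs the same groupby on the sample data, assuming the empty cells in the question's table are 0. Note that `'mean'` yields fractions between 0 and 1, so the sport columns need to be multiplied by 100 to read as percentages. The row for `Athlete == 1` is the one that holds the figures the question asks about.

```python
import pandas as pd

# Sample data rebuilt from the question; 0 for empty cells is an assumption.
df = pd.DataFrame({
    'Stu': [1, 2, 3, 4],
    'Athlete': [0, 1, 1, 1],
    'Baseball': [0, 1, 0, 0],
    'Cheer': [0, 0, 1, 0],
    'Football': [0, 0, 0, 1],
})

result = df.groupby('Athlete').agg(
    Total=('Athlete', 'size'),
    Baseball_Percentage=('Baseball', 'mean'),
    Cheer_Percentage=('Cheer', 'mean'),
    Football_Percentage=('Football', 'mean'),
)
result['Percentage_of_Total_Athletes'] = result['Total'] / result['Total'].sum() * 100

# Keep only the athlete group and scale the sport means to percentages.
athletes = result.loc[1]
sport_cols = ['Baseball_Percentage', 'Cheer_Percentage', 'Football_Percentage']
print(athletes['Percentage_of_Total_Athletes'])
print((athletes[sport_cols] * 100).round(1))
```

On this data the athlete group covers 75 percent of the rows, and each sport accounts for about a third of that group.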
