numpy: How to simplify "percentage for each class"

Posted by j91ykkif on 2023-08-05 in Other

I have the well-known Titanic dataset:

import pandas as pd

fr1 = pd.DataFrame({
"class": ["1", "2", "2"],
"survived": [0, 1, 1]})

I need the percentage of survivors in each class, so first I apply a mask to split the data, then group:

# Column names must match fr1 ("class", "survived"); class stays as the index for the merge below
fr2 = fr1[fr1["survived"] == 0]
fr2 = fr2.groupby("class")["survived"].agg(["count"])
fr3 = fr1[fr1["survived"] == 1]
fr3 = fr3.groupby("class")["survived"].agg(["count"])


Now I merge the resulting DataFrames and create a percentage column to see how many people survived in each class:

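merged = pd.merge(fr2, fr3, left_index=True, right_index=True)
# fr2 holds the deaths and fr3 the survivors, so the labels follow that order
merged.columns = "Died Survived".split()
# Divide by the class total (not len(fr1)) to get a per-class percentage
merged["Percentage"] = merged["Survived"] / (merged["Survived"] + merged["Died"]) * 100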

eyh26e7m  #1

A very simple approach: since you already have 0/1 values, the per-group mean is the fraction of survivors, so just take groupby.mean:

out = (fr1.groupby('class')['survived']
          .mean().mul(100).reset_index()
       )

Variant:

out = (fr1.groupby('class', as_index=False)['survived']
          .agg(lambda g: g.mean()*100)
       )


Output:

  class  survived
0     1       0.0
1     2     100.0
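
If the counts are also wanted next to the percentage, as in the question's merged table, a named aggregation can probably produce them in one pass without masks or a merge. This is only a minimal sketch on the same toy fr1; the column names Survived, Total, Died and Percentage are chosen here for illustration and are not from the original answer.

```python
import pandas as pd

# Same toy frame as in the question (not the real Titanic data)
fr1 = pd.DataFrame({
    "class": ["1", "2", "2"],
    "survived": [0, 1, 1]})

# Named aggregation on the 0/1 column: no masks, no merge
summary = fr1.groupby("class")["survived"].agg(
    Survived="sum",   # the 1s add up to the number of survivors
    Total="count",    # number of passengers in the class
)

# Illustrative derived columns (names are assumptions, see above)
summary["Died"] = summary["Total"] - summary["Survived"]
summary["Percentage"] = summary["Survived"] / summary["Total"] * 100
summary = summary.reset_index()

print(summary)
```

Unlike an inner merge of two masked frames, this keeps classes in which nobody survived or nobody died, because every class appears in the single groupby.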
