pandas: dropping rows that fall below a certain percentage threshold of the total rows/sum [Python]

Posted by cnwbcb6i on 2023-02-02 in Python


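I am having trouble filtering out crimes (the "OffenseDescription" column) that account for less than 5% of the total number of rows in the DataFrame. Either a specific or a general solution would help, so that I can reuse and adapt it as needed.
This is what I have tried so far, but it crashes the kernel; it appears to run as an endless loop.
I am doing this in a Jupyter notebook in VS Code.
Here is the code I have tried so far: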

tot = crime.OffenseDescription.sum()  # find sum of column

# calculate percentage, filter as per condition
crime[crime.groupby(['OffenseDescription']).transform(lambda x: (x.div(tot) * 100) < 0.05)]

Screenshot of the .head() of the DataFrame I am using: [image not available]

Thanks in advance.

pandas

Source: https://stackoverflow.com/questions/75281234/dropping-rows-that-fall-below-a-certain-percentage-threshold-of-the-total-rows-s

1 answer


vngu2lb8 #1

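Use Series.value_counts with normalize=True to get each value's share of the total rows. Then map those shares back onto the column with Series.map and use boolean indexing to keep only the rows whose share is greater than or equal to 0.05, which drops the groups below the 5% threshold: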

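# Share of all rows taken by each OffenseDescription value (fractions that sum to 1)
percentage = crime['OffenseDescription'].value_counts(normalize=True)

# Keep only the rows whose offense makes up at least 5% of all rows
crime[crime['OffenseDescription'].map(percentage) >= 0.05]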
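
To see the filter working end to end, here is a minimal sketch on a small made-up DataFrame. The sample data is invented for illustration, since the real dataset is not shown in the post; only the OffenseDescription column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data: 40 rows, one rare offense (1 of 40 rows = 2.5%).
crime = pd.DataFrame({
    "OffenseDescription": ["THEFT"] * 25 + ["ASSAULT"] * 14 + ["ARSON"] * 1
})

# Share of rows for each offense (fractions that sum to 1).
percentage = crime["OffenseDescription"].value_counts(normalize=True)

# Map every row to its offense's share and keep rows at or above 5%.
filtered = crime[crime["OffenseDescription"].map(percentage) >= 0.05]

print(filtered["OffenseDescription"].value_counts())
```

Here the single ARSON row falls below the threshold and is dropped, while the THEFT and ASSAULT rows are kept.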

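
Since the question also asks for a general solution, the idea can be wrapped in a small helper. This is only a sketch under stated assumptions: the function name `drop_rare_groups`, its parameters, the `Loss` column, and the sample data are all hypothetical and not part of the original answer. With no `weight` column the share is based on row counts; with one, it is based on each group's sum of that numeric column.

```python
import pandas as pd


def drop_rare_groups(df, key, min_share=0.05, weight=None):
    """Keep rows whose group makes up at least `min_share` of the total.

    Hypothetical helper: with weight=None the share is the group's fraction
    of all rows; otherwise it is the group's fraction of df[weight].sum().
    """
    if weight is None:
        # Fraction of all rows that belong to each row's group.
        share = df[key].map(df[key].value_counts(normalize=True))
    else:
        # Group sum of `weight`, broadcast to each row, over the overall sum.
        share = df.groupby(key)[weight].transform("sum") / df[weight].sum()
    return df[share >= min_share]


# Hypothetical sample data; the "Loss" column is invented for illustration.
crime = pd.DataFrame({
    "OffenseDescription": ["THEFT"] * 25 + ["ASSAULT"] * 14 + ["ARSON"] * 1,
    "Loss": [10] * 25 + [5] * 14 + [500],
})

by_rows = drop_rare_groups(crime, "OffenseDescription", min_share=0.05)
by_loss = drop_rare_groups(crime, "OffenseDescription", min_share=0.10, weight="Loss")
```

In this sample, filtering by row share drops ARSON (2.5% of rows), while filtering by share of `Loss` with a 10% threshold drops ASSAULT instead (70 of 820, about 8.5%).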
