我有以下的框架:
Country Crop Harvest Year Area (ha)
Afghanistan Maize 2019 94910
Afghanistan Maize 2020 140498
Afghanistan Maize 2021 92144
Afghanistan Winter Wheat 2019 2334000
Afghanistan Winter Wheat 2020 2668000
Afghanistan Winter Wheat 2021 1833357
Argentina Maize 2019 7232761
Argentina Maize 2020 7730506
Argentina Maize 2021 8146596
Argentina Winter Wheat 2019 6050953
Argentina Winter Wheat 2020 6729838
Argentina Winter Wheat 2021 6394102
China Maize 2019 41309740
China Maize 2020 41292000
China Maize 2021 43355859
China Winter Wheat 2019 23732560
China Winter Wheat 2020 23383000
China Winter Wheat 2021 23571400
Ethiopia Maize 2019 2274306
Ethiopia Maize 2020 2363507
Ethiopia Maize 2021 2530000
Ethiopia Winter Wheat 2019 1789372
Ethiopia Winter Wheat 2020 1829051
Ethiopia Winter Wheat 2021 1950000
France Maize 2019 1506100
France Maize 2020 1691130
France Maize 2021 1549520
France Winter Wheat 2019 5244250
France Winter Wheat 2020 4512420
France Winter Wheat 2021 5276730
India Maize 2019 9027130
India Maize 2020 9569060
India Maize 2021 9860000
India Winter Wheat 2019 29318780
India Winter Wheat 2020 31357020
India Winter Wheat 2021 31610000
Namibia Maize 2019 21123
Namibia Maize 2020 35000
Namibia Maize 2021 46070
Namibia Winter Wheat 2019 1079
Namibia Winter Wheat 2020 2000
Namibia Winter Wheat 2021 3026
字符串
我想选择前2个国家的平均值Area (ha)
列在整个收获年的.我尝试了这一点,但它不工作:
df = df.groupby("Crop", dropna=False).apply( lambda x: x.nlargest(2, "Area (ha)") )个 产量应该是,这里中国和印度是玉米和冬小麦平均
Area (ha)`最大的国家,但在完整的数据集中,不同国家的不同作物的平均值最大:
Country Crop Harvest Year Area (ha)
China Maize 2019 41309740
China Maize 2020 41292000
China Maize 2021 43355859
China Winter Wheat 2019 23732560
China Winter Wheat 2020 23383000
China Winter Wheat 2021 23571400
India Maize 2019 9027130
India Maize 2020 9569060
India Maize 2021 9860000
India Winter Wheat 2019 29318780
India Winter Wheat 2020 31357020
India Winter Wheat 2021 31610000
型
2条答案
按热度按时间gblwokeq1#
IIUC,你可以做两个
.groupby
:字符串
按平均面积打印前2名作物/国家:
型
然后你可以使用这个索引来过滤原始的框架:
型
打印:
型
2w2cym1i2#
您在分组中混淆了
Crop
和Harvest Year
。首先,将数据限制为特定年份的所有项目,然后保留n个最大的项目:字符串
结果如预期
的数据