matplotlib: stacked AND grouped horizontal bar chart in Python

x8goxv8g  posted on 2022-11-15 in Python

I'm trying to get a stacked and grouped horizontal bar chart in Python:

female_numbers_2015 = [20882, 31322, 52204, 52205, 31322, 20881]
female_numbers_2018 = [20882, 31322, 52204, 52205, 31322, 20881]
male_numbers_2015 = [11352, 17080, 28380, 28380, 17028, 11351]
male_numbers_2018 = [11454, 17181, 28636, 28634, 17181, 11454]

total_numbers_2015 = [306669]
total_numbers_2018 = [323356]

percent_males_2015 = [i /j * 100 for i,j in zip(male_numbers_2015, total_numbers_2015)]
percent_females_2015 = [i /j * 100 for i,j in zip(female_numbers_2015, total_numbers_2015)]
percent_males_2018 = [i /j * 100 for i,j in zip(male_numbers_2018, total_numbers_2018)]
percent_females_2018 = [i /j * 100 for i,j in zip(female_numbers_2018, total_numbers_2018)]

index = ['Poorest 10%', '10-25%', '25-50%', '50-75%', '75-90%', 'Richest 10%']

df = pd.DataFrame({'percent_females_2015': percent_females_2015,'percent_males_2015': percent_males_2015,
                  'percent_females_2018': percent_females_2018,'percent_males_2018': percent_males_2018}, index=index)

x = np.arange(len(index))
width = 0.35  # the width of the bars

fig, ax = plt.subplots()
rects1 = ax.barh(x = {male_numbers_2015, female_numbers_2015}, x - width/2, width, label='2015', stacked = True)
rects2 = ax.barh(x = {male_numbers_2018, female_numbers_2018}, x + width/2, width, label='2018', stacked = True)

plt.show()

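Here I want to group the bars by the index variable. For example, the Poorest 10% category would have two bars associated with that label: one for the 2015 data and one for the 2018 data. Within each bar I need to stack the male and female data. For example, in the Poorest 10% category, the 2015 bar would consist of the percentage of females and the percentage of males that made up this category in 2015.
Any help is much appreciated!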

fcg9iug3 #1

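There are a few errors in your code, and the logic needs some changes as well. First, the percentage calculation needs to be fixed. Below is the code that sets up the data and the lists that go into the data frame. Note that I have changed some of the data points, because your data gave the same percentages for males as for females.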

import pandas as pd
import matplotlib.pyplot as plt

## Your data, some changes to differentiate the values
female_numbers_2015 = [20882, 31322, 52204, 52205, 31322, 20881]
female_numbers_2018 = [20882, 31322, 52204, 52205, 31322, 20881]
male_numbers_2015 = [13352, 15080, 24380, 32380, 15028, 13351]
male_numbers_2018 = [14454, 14181, 30636, 26634, 12181, 16454]

## Percentage calculation corrected. Need to just divide each entry by sum(vals)
percent_males_2015 = [i /sum(male_numbers_2015) * 100 for i in male_numbers_2015]
percent_females_2015 = [i /sum(female_numbers_2015) * 100 for i in female_numbers_2015]
percent_males_2018 = [i /sum(male_numbers_2018) * 100 for i in male_numbers_2018]
percent_females_2018 = [i /sum(female_numbers_2018) * 100 for i in female_numbers_2018]

myindex = ['Poorest 10%', '10-25%', '25-50%', '50-75%', '75-90%', 'Richest 10%']
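
As a side note on why the original percentage lists came out wrong: `zip` stops at the shortest input, so zipping a six-element list with a one-element total produces a single pair. Here is a minimal sketch using the 2015 male counts from the question. The helper name `percent_shares` is made up for illustration and only mirrors the list comprehensions above.

```python
male_numbers_2015 = [11352, 17080, 28380, 28380, 17028, 11351]
total_numbers_2015 = [306669]

# zip() stops at the shortest iterable, so only one pair is produced
pairs = list(zip(male_numbers_2015, total_numbers_2015))
print(len(pairs))  # 1


def percent_shares(values):
    """Return each value as a percentage of the list's own sum (hypothetical helper)."""
    total = sum(values)
    return [v / total * 100 for v in values]


shares = percent_shares(male_numbers_2015)
print(len(shares))  # 6
```

Dividing by the sum of the list itself means each list adds up to 100. If you want shares of the whole population instead, divide by a single scalar total rather than zipping with a one-element list.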

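The next step is to get the data into the data frame in the right layout, so that the pandas plot can see it and build the right chart. Basically, create three lists: one for the year, one for the female values and one for the male values. Then add them to a data frame. Reshape the data frame so that its index is the one you used (the income groups: Poorest 10%, ...) and its columns are split by gender first and then by year (2015, 2018).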

# One entry per row: the six 2015 rows first, then the six 2018 rows
Year = ['2015'] * len(percent_females_2015) + ['2018'] * len(percent_females_2018)

# Values in the same row order as Year
Female = percent_females_2015 + percent_females_2018
Male = percent_males_2015 + percent_males_2018

df=pd.DataFrame({'index':myindex*2, 'Year':Year, 'Female':Female, 'Male':Male})

df.set_index(['Year', 'index'], inplace=True)
df0 = df.reorder_levels(['index', 'Year']).sort_index()
df0 = df0.unstack(level=-1)
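
To make the target layout concrete, here is a small sketch of the same reshaping on toy values (two income groups, made-up numbers, not the real percentages). It writes the steps as one chain, which I believe is equivalent to the code above. The trailing `reindex` is my own addition: `sort_index` orders the labels alphabetically, so it restores the original group order.

```python
import pandas as pd

# Toy values only, chosen so the result is easy to read
groups = ['Poorest 10%', '10-25%']
df = pd.DataFrame({
    'index': groups * 2,
    'Year': ['2015'] * 2 + ['2018'] * 2,
    'Female': [10.0, 20.0, 30.0, 40.0],
    'Male': [1.0, 2.0, 3.0, 4.0],
})

# Index by (group, year), sort, then move the year level into the columns
wide = df.set_index(['index', 'Year']).sort_index().unstack('Year')

# sort_index() sorted the groups alphabetically; put them back in the original order
wide = wide.reindex(groups)

print(wide.columns.tolist())
# [('Female', '2015'), ('Female', '2018'), ('Male', '2015'), ('Male', '2018')]
print(wide['Female'])  # one row per income group, one column per year
```

With this layout, `wide['Female']` and `wide['Male']` are each a frame with one column per year, which is what lets pandas draw the two years as grouped bars.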

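Once the data is ready, plot it with the pandas/matplotlib barh plot. Note that I have not used stacked=True. Instead, I plotted female + male first and then drew just male over it, so the male segment starts at zero and the female segment is the part of the total bar that stays visible beyond it. I used the Paired colors: with the indices below, the male segments come out in shades of red and the female segments in shades of green.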

colors = plt.cm.Paired.colors
fig, ax = plt.subplots(figsize=(10,5))
(df0['Female']+df0['Male']).plot(kind='barh', color=[colors[3], colors[2]], rot=0, ax=ax)
df0['Male'].plot(kind='barh', color=[colors[5], colors[4]], rot=0, ax=ax)

legend_labels = [f'{val} ({context})' for val, context in df0.columns]
ax.legend(legend_labels)

plt.show()
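
For comparison, the same kind of chart can also be drawn with plain matplotlib, which is closer to what the question attempted. `barh` has no `stacked` argument. Stacking is done with `left=`, and grouping by shifting the y positions. This is only a sketch under assumptions: the numbers are illustrative, and the dictionaries `male` and `female` are names I made up for it.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

groups = ['Poorest 10%', '10-25%', '25-50%']
# Illustrative percentages only, not the values computed above
male = {'2015': [10.0, 15.0, 25.0], '2018': [12.0, 14.0, 24.0]}
female = {'2015': [11.0, 16.0, 26.0], '2018': [11.0, 17.0, 27.0]}

y = np.arange(len(groups))
height = 0.35  # thickness of each horizontal bar

fig, ax = plt.subplots(figsize=(10, 5))
for offset, year in [(-height / 2, '2015'), (height / 2, '2018')]:
    # Male segment starts at zero
    ax.barh(y + offset, male[year], height, label=f'Male ({year})')
    # Female segment starts where the male segment ends
    ax.barh(y + offset, female[year], height, left=male[year],
            label=f'Female ({year})')

ax.set_yticks(y)
ax.set_yticklabels(groups)
ax.set_xlabel('Percent')
ax.legend()
```

In an interactive session, drop the `matplotlib.use('Agg')` line and finish with `plt.show()`. Compared with the overlay approach, each segment here is its own bar, so the legend entries map directly to the `label` arguments.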
