matplotlib: How to add error bars on a grouped barplot from a column

Posted by 64jmpszr on 2023-01-21 in Other


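I have a pandas DataFrame df with four columns: Candidate, Sample_Set, Values, and Error. The Candidate column has, for example, three unique entries, [X, Y, Z], and there are three sample sets, so Sample_Set also has three unique values: [1, 2, 3]. The df looks roughly like this.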

import pandas as pd

data = {'Candidate': ['X', 'Y', 'Z', 'X', 'Y', 'Z', 'X', 'Y', 'Z'],
        'Sample_Set': [1, 1, 1, 2, 2, 2, 3, 3, 3],
        'Values': [20, 10, 10, 200, 101, 99, 1999, 998, 1003],
        'Error': [5, 2, 3, 30, 30, 30, 10, 10, 10]}
df = pd.DataFrame(data)

# display(df)
  Candidate  Sample_Set  Values  Error
0         X           1      20      5
1         Y           1      10      2
2         Z           1      10      3
3         X           2     200     30
4         Y           2     101     30
5         Z           2      99     30
6         X           3    1999     10
7         Y           3     998     10
8         Z           3    1003     10

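I am using seaborn to create a grouped barplot with x="Candidate", y="Values", hue="Sample_Set". All is well until I try to add error bars along the y-axis using the values in the column named Error. I am using the following code.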

import seaborn as sns

# factorplot was renamed to catplot in seaborn 0.9, and its size parameter to height
g = sns.catplot(x="Candidate", y="Values", hue="Sample_Set", data=df,
                height=8, kind="bar")

How do I incorporate the error values?
I would appreciate a workaround or a more elegant way to accomplish this task.

matplotlib

Source: https://stackoverflow.com/questions/42017049/how-to-add-error-bars-on-a-grouped-barplot-from-a-column

4 Answers


hfwmuf9z1#

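As @ResMar pointed out in the comments, there seems to be no built-in functionality in seaborn to easily set individual error bars.
If you care more about the result than about the way to get there, the following (not so elegant) solution might help. It builds on matplotlib.pyplot.bar; the seaborn import is only used to get the same style.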

import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd

def grouped_barplot(df, cat, subcat, val, err):
    # one tick position per unique category
    u = df[cat].unique()
    x = np.arange(len(u))
    subx = df[subcat].unique()
    # horizontal offsets that centre the bars of each group on its tick
    offsets = (np.arange(len(subx)) - np.arange(len(subx)).mean()) / (len(subx) + 1.)
    width = np.diff(offsets).mean()
    for i, gr in enumerate(subx):
        # rows of one subcategory; assumes each category occurs once, in the same order
        dfg = df[df[subcat] == gr]
        plt.bar(x + offsets[i], dfg[val].values, width=width,
                label="{} {}".format(subcat, gr), yerr=dfg[err].values)
    plt.xlabel(cat)
    plt.ylabel(val)
    plt.xticks(x, u)
    plt.legend()
    plt.show()

cat = "Candidate"
subcat = "Sample_Set"
val = "Values"
err = "Error"

# call the function with df from the question
grouped_barplot(df, cat, subcat, val, err )

Note that by simply swapping the category and subcategory

cat = "Sample_Set"
subcat = "Candidate"

you can get a different grouping.
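To see where the bars drawn by grouped_barplot end up, the offset arithmetic can be checked on its own. The following is a minimal sketch, not part of the original answer: dodge_offsets is a hypothetical helper name, and the fallback width of 0.8 for a single bar is an arbitrary assumption (the original expression would give nan in that case).

```python
import numpy as np


def dodge_offsets(n_sub):
    """Return (offsets, width) for n_sub bars sharing one tick.

    Hypothetical helper that mirrors the arithmetic inside grouped_barplot.
    """
    idx = np.arange(n_sub)
    # Centre the bar indices on zero, then divide by n_sub + 1 so that the
    # group is narrower than one tick interval.
    offsets = (idx - idx.mean()) / (n_sub + 1.0)
    # Neighbouring offsets are equally spaced; that spacing is the bar width.
    # With a single bar np.diff is empty, so fall back to an assumed 0.8.
    width = np.diff(offsets).mean() if n_sub > 1 else 0.8
    return offsets, width


offsets, width = dodge_offsets(3)
# offsets holds -0.25, 0.0, 0.25 and width is 0.25
print(offsets, width)
```

With three subcategories each group is 0.75 wide, which leaves a gap of one bar width between neighbouring groups.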


mcvgt66p2#

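seaborn plots generate error bars when aggregating data; this data, however, is already aggregated and has a specified error column.
The easiest solution is to use pandas to create the bar chart, with pandas.DataFrame.plot and kind='bar'.
matplotlib is used as the plotting backend by default, and the plotting API has a yerr parameter, which accepts the following:
As a DataFrame or dict of errors, with column names matching the columns attribute of the plotting DataFrame or matching the name attribute of the Series.
As a str indicating which of the columns of the plotting DataFrame contain the error values.
As raw values (list, tuple, or np.ndarray). Must be the same length as the plotting DataFrame/Series.
For the data in the question, this can be accomplished by reshaping the DataFrame from long format to wide format with pandas.DataFrame.pivot.
See the pandas User Guide: Plotting with error bars.
Tested in python 3.8.12, pandas 1.3.4, matplotlib 3.4.3.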

# reshape the dataframe into a wide format for Values
vals = df.pivot(index='Candidate', columns='Sample_Set', values='Values')

# display(vals)
Sample_Set   1    2     3
Candidate                
X           20  200  1999
Y           10  101   998
Z           10   99  1003

# reshape the dataframe into a wide format for Errors
yerr = df.pivot(index='Candidate', columns='Sample_Set', values='Error')

# display(yerr)
Sample_Set  1   2   3
Candidate            
X           5  30  10
Y           2  30  10
Z           3  30  10

# plot vals with yerr
ax = vals.plot(kind='bar', yerr=yerr, logy=True, rot=0, figsize=(6, 5))
_ = ax.legend(title='Sample Set', bbox_to_anchor=(1, 1.02), loc='upper left')
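Two of the yerr forms listed in this answer can be compared side by side. The sketch below is an illustration under assumptions rather than part of the original answer: it rebuilds the question's data, selects the non-interactive Agg backend so that it runs without a display, and uses the hypothetical variable names errs, ax_frame and ax_raw.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Candidate": ["X", "Y", "Z"] * 3,
    "Sample_Set": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Values": [20, 10, 10, 200, 101, 99, 1999, 998, 1003],
    "Error": [5, 2, 3, 30, 30, 30, 10, 10, 10],
})
vals = df.pivot(index="Candidate", columns="Sample_Set", values="Values")
errs = df.pivot(index="Candidate", columns="Sample_Set", values="Error")

fig, (ax_frame, ax_raw) = plt.subplots(1, 2, figsize=(10, 4))

# DataFrame form: the column labels of errs match those of vals, so every
# bar series picks up its own column of errors.
vals.plot(kind="bar", yerr=errs, rot=0, ax=ax_frame, title="yerr as DataFrame")

# Raw-values form: one Series (Sample_Set 1) with an ndarray of equal length.
vals[1].plot(kind="bar", yerr=errs[1].to_numpy(), rot=0, ax=ax_raw,
             title="yerr as ndarray (Sample_Set 1)")

fig.tight_layout()
# Call fig.savefig(...) or switch to an interactive backend to view the result.
```

The DataFrame form is the one that scales to grouped bars, because the errors are matched to the bars by label rather than by position.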


63lcw9qa3#

I suggest extracting the position coordinates from the patches attribute and then plotting the error bars.

ax = sns.barplot(data=df, x="Candidate", y="Values", hue="Sample_Set")
x_coords = [p.get_x() + 0.5*p.get_width() for p in ax.patches]
y_coords = [p.get_height() for p in ax.patches]
ax.errorbar(x=x_coords, y=y_coords, yerr=df["Error"], fmt="none", c= "k")
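The snippet above passes df["Error"] in row order, so it only works when the rows of df are in the same order as the bars in ax.patches. The following is a minimal sketch of one way to make that order explicit. It assumes that seaborn draws the bars one hue level at a time (all Candidate bars for the first Sample_Set, then the next) and that ax.patches contains only those bars; both assumptions should be checked against the installed seaborn version. errors_in_patch_order is a hypothetical helper name.

```python
import pandas as pd


def errors_in_patch_order(df, x, hue, err, order, hue_order):
    """Flatten the error column hue level by hue level (hypothetical helper)."""
    wide = df.pivot(index=hue, columns=x, values=err)
    # Rows follow hue_order and columns follow order; ravel() then walks the
    # table row by row, i.e. all x categories of the first hue level first.
    return wide.loc[hue_order, order].to_numpy().ravel()


df = pd.DataFrame({
    "Candidate": ["X", "Y", "Z"] * 3,
    "Sample_Set": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Values": [20, 10, 10, 200, 101, 99, 1999, 998, 1003],
    "Error": [5, 2, 3, 30, 30, 30, 10, 10, 10],
})

# Shuffle the rows to show that the result no longer depends on row order.
shuffled = df.sample(frac=1, random_state=0)
yerr = errors_in_patch_order(shuffled, "Candidate", "Sample_Set", "Error",
                             order=["X", "Y", "Z"], hue_order=[1, 2, 3])
# yerr holds 5, 2, 3, 30, 30, 30, 10, 10, 10
print(yerr)
```

Passing the same order and hue_order lists to sns.barplot, and this array as yerr to ax.errorbar, keeps the two in step.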


wr98u20j4#

With the pandas plotting functions you can get close to what you need (see this answer):

# df is the DataFrame from the question; yerr="Error" makes each group use its own error values
bars = df.groupby("Candidate").plot(kind='bar', x="Sample_Set", y="Values", yerr="Error")

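This is not exactly what you asked for, but it is close. Unfortunately, ggplot2 for Python currently does not render error bars properly. Personally, I would resort to R ggplot2 in this case: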

data <- read.csv("~/repos/tmp/test.csv")
data
library(ggplot2)
ggplot(data, aes(x=Candidate, y=Values, fill=factor(Sample_Set))) + 
  geom_bar(position=position_dodge(), stat="identity") +
  geom_errorbar(aes(ymin=Values-Error, ymax=Values+Error), width=.1, position=position_dodge(.9))
