I have the following DataFrame in pandas:
import numpy as np
import pandas as pd
import seaborn as sns

data = {
'idx': [1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10],
'hue_val': ["A","A","A","A","A","A","A","A","A","A","B","B","B","B","B","B","B","B","B","B","C","C","C","C","C","C","C","C","C","C"],
'value': np.random.rand(30),
}
df = pd.DataFrame(data)
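Now I want a line plot of the cumulative sum of "value", following "idx", for each "hue_val". The result should be three strictly increasing curves (since the values are positive), one each for "A", "B" and "C".
I found this code in several sources: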
sns.lineplot(x="idx", y="value", hue="hue_val", data=df, estimator="cumsum")
This does not work: both the curves and the x-axis come out wrong.
2 Answers
bpsygsoo #1
You can compute the cumulative sum separately and plot the result.
neekobn8 #2
Given the OP's DataFrame, there are two things to do:
1. Compute the cumulative sum for each hue_val
2. Plot it
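1. Compute the cumulative sum for each hue_val
To compute the cumulative sum, one can use pandas.DataFrame.groupby and pandas.Series.cumsum, grouping by hue_val and accumulating within each group. As the OP requested, a variable named column is used to select the column to consider.
Since NumPy is already used to generate some of the DataFrame's values, one can also use it to compute the cumulative sum, with pandas.DataFrame.apply and numpy.cumsum.
2. Plot it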
The result can then be plotted with seaborn.lineplot.
Note: regarding .apply(), I would recommend reading this: When should I (not) want to use pandas apply() in my code?