pandas: change scatter plot marker size based on column values [duplicate]

Posted by 9w11ddsr on 2023-04-28 in Other

This question already has an answer here:

Customize lineplot marker size based on values of a column (1 answer)
Closed 2 days ago.

I have a dataset that looks like this:

{'ScoreDate': {0: '12/1/2019',
  1: '1/1/2020',
  2: '2/1/2020',
  3: '3/1/2020',
  4: '4/1/2020',
  5: '5/1/2020',
  6: '6/1/2020',
  7: '7/1/2020',
  8: '7/1/2020',
  9: '7/1/2020',
  10: '7/1/2020',
  11: '7/1/2020',
  12: '7/1/2020',
  13: '8/1/2020',
  14: '8/1/2020',
  15: '8/1/2020',
  16: '8/1/2020',
  17: '8/1/2020',
  18: '9/1/2020'},
 'CustomerID': {0: 4554,
  1: 4554,
  2: 4554,
  3: 4554,
  4: 4554,
  5: 4554,
  6: 4554,
  7: 4554,
  8: 4554,
  9: 4554,
  10: 4554,
  11: 4554,
  12: 4554,
  13: 4554,
  14: 4554,
  15: 4554,
  16: 4554,
  17: 4554,
  18: 4554},
 'Supplier_Name': {0: 'ABC Company',
  1: 'ABC Company',
  2: 'ABC Company',
  3: 'ABC Company',
  4: 'ABC Company',
  5: 'ABC Company',
  6: 'ABC Company',
  7: 'ABC Company',
  8: 'ABC Company',
  9: 'ABC Company',
  10: 'ABC Company',
  11: 'ABC Company',
  12: 'ABC Company',
  13: 'ABC Company',
  14: 'ABC Company',
  15: 'ABC Company',
  16: 'ABC Company',
  17: 'ABC Company',
  18: 'ABC Company'},
 'Score': {0: 90,
  1: 90,
  2: 90,
  3: 75,
  4: 75,
  5: 75,
  6: 90,
  7: 90,
  8: 90,
  9: 90,
  10: 90,
  11: 90,
  12: 90,
  13: 90,
  14: 90,
  15: 90,
  16: 90,
  17: 90,
  18: 90},
 'EDate': {0: nan,
  1: nan,
  2: nan,
  3: nan,
  4: '4/1/2020',
  5: nan,
  6: '6/1/2020',
  7: '7/1/2020',
  8: '7/1/2020',
  9: '7/1/2020',
  10: '7/1/2020',
  11: '7/1/2020',
  12: '7/1/2020',
  13: '8/1/2020',
  14: '8/1/2020',
  15: '8/1/2020',
  16: '8/1/2020',
  17: '8/1/2020',
  18: nan}}

And some code that generates a line plot of Score, with a marker for each EDate:

import matplotlib.pyplot as plt
import seaborn as sns

# df is assumed to be a pandas DataFrame built from the dictionary above
size = 15
params = {'legend.fontsize': 'large',
      'figure.figsize': (20,8),
      'axes.labelsize': size,
      'axes.titlesize': size,
      'xtick.labelsize': size*0.75,
      'ytick.labelsize': size*0.75,
      'axes.titlepad': 25}
plt.figure(figsize=(10,5))
sns.set(style="darkgrid")
plt.rcParams.update(params)

# Line plot of Score over time, plus one orange marker per EDate row
sns.lineplot(data=df, x='ScoreDate', y='Score', ci=None, 
             linewidth=2, palette="deep").set(title="Score")
sns.scatterplot(data=df, x='EDate', y='Score', color='orange')

This produces (image not included):

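What I'm hoping to achieve:

  • Set the marker size to the number of EDate values (events) that occurred on that date

I have managed to group the data using the following: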

c_df = df.groupby(['ScoreDate', 'Score'])['EDate'].count().reset_index(name='count')

size = 15
params = {'legend.fontsize': 'large',
      'figure.figsize': (20,8),
      'axes.labelsize': size,
      'axes.titlesize': size,
      'xtick.labelsize': size*0.75,
      'ytick.labelsize': size*0.75,
      'axes.titlepad': 25}
plt.figure(figsize=(10,5))
sns.set(style="darkgrid")
plt.rcParams.update(params)

sns.lineplot(data=c_df, x='ScoreDate', y='Score', ci=None, 
             linewidth=2, palette="deep").set(title="Score")
sns.scatterplot(data=c_df, x='ScoreDate', y='count', color='orange')

This produces (image not included):

This is clearly not what I'm looking for. How can I accomplish my goal?

Answer 1 (4xrmg8kj)

I believe you are looking for the size parameter:

sns.lineplot(data=df, x='ScoreDate', y='Score', ci=None, 
             linewidth=2, palette="deep").set(title="Score")

sns.scatterplot(data=c_df, x='ScoreDate', y='Score', size='count', color='orange')

Output (image not included):

Note: you can also pass the sizes parameter (e.g. sizes=[0,30,60,90]) to set the marker size for each count group manually. For example (image not included):
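A minimal sketch of what such a call might look like. The `c_df` literal is a hand-written stand-in for the grouped counts of the sample data, and `levels` and `marker_sizes` are illustrative names. It relies on the sample data having exactly four distinct count values (0, 1, 5, 6), since a list passed to `sizes` should have one entry per distinct value.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hand-written stand-in for c_df: the per-date event counts of the sample data.
c_df = pd.DataFrame({
    'ScoreDate': ['12/1/2019', '1/1/2020', '2/1/2020', '3/1/2020', '4/1/2020',
                  '5/1/2020', '6/1/2020', '7/1/2020', '8/1/2020', '9/1/2020'],
    'Score': [90, 90, 90, 75, 75, 75, 90, 90, 90, 90],
    'count': [0, 0, 0, 0, 1, 0, 1, 6, 5, 0],
})

# Distinct count values in sorted order; seaborn pairs the sizes list with
# these, so here 0 -> 0, 1 -> 30, 5 -> 60, 6 -> 90.
levels = sorted(c_df['count'].unique())
marker_sizes = [0, 30, 60, 90]

fig, ax = plt.subplots(figsize=(10, 5))
sns.scatterplot(data=c_df, x='ScoreDate', y='Score', size='count',
                sizes=marker_sizes, color='orange', ax=ax)
```

If the number of distinct counts may change, a dict such as `dict(zip(levels, marker_sizes))` passed to `sizes` makes the mapping explicit.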

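Notice how the markers differ in size (for example, dates with a count of zero are not shown at all). Alternatively, you can filter those rows out of c_df before plotting with c_df.query('count>0').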
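The filtering alternative can be sketched with pandas alone. The `c_df` literal below is again a hand-written stand-in for the grouped counts, and `events` is an illustrative name.

```python
import pandas as pd

# Hand-written stand-in for c_df: the per-date event counts of the sample data.
c_df = pd.DataFrame({
    'ScoreDate': ['12/1/2019', '1/1/2020', '2/1/2020', '3/1/2020', '4/1/2020',
                  '5/1/2020', '6/1/2020', '7/1/2020', '8/1/2020', '9/1/2020'],
    'Score': [90, 90, 90, 75, 75, 75, 90, 90, 90, 90],
    'count': [0, 0, 0, 0, 1, 0, 1, 6, 5, 0],
})

# Keep only the dates on which at least one event occurred.
events = c_df.query('count > 0')
```

Passing `events` instead of `c_df` as `data` to `sns.scatterplot` then draws markers only for the four dates that have events, and the legend no longer includes a zero entry.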
