pandas: How to annotate scatter points plotted with the prince library

Posted by b5buobof on 2023-06-20 in Other

I am using the prince library to perform correspondence analysis:

from prince import CA

My contingency table, dummy_contingency, looks like this:

{'v1': {'0': 4.479591836734694,
  '1': 75.08163265306122,
  '2': 1.1020408163265305,
  '3': 5.285714285714286,
  '4': 14.244897959183673,
  '5': 0.0,
  '6': 94.06122448979592,
  '7': 0.5102040816326531,
  '8': 87.62244897959184,
  '9': 16.102040816326532},
 'v2': {'0': 6.142857142857143,
  '1': 24.653061224489797,
  '2': 0.3979591836734694,
  '3': 2.63265306122449,
  '4': 18.714285714285715,
  '5': 0.0,
  '6': 60.92857142857143,
  '7': 1.030612244897959,
  '8': 71.73469387755102,
  '9': 14.76530612244898},
 'v3': {'0': 3.642857142857143,
  '1': 21.551020408163264,
  '2': 0.8061224489795918,
  '3': 2.979591836734694,
  '4': 14.5,
  '5': 0.030612244897959183,
  '6': 39.60204081632653,
  '7': 0.7551020408163265,
  '8': 71.89795918367346,
  '9': 11.571428571428571},
 'v4': {'0': 6.1020408163265305,
  '1': 25.632653061224488,
  '2': 0.6938775510204082,
  '3': 3.9285714285714284,
  '4': 21.581632653061224,
  '5': 0.22448979591836735,
  '6': 10.704081632653061,
  '7': 0.8469387755102041,
  '8': 71.21428571428571,
  '9': 12.489795918367347}}

A chi-square test indicates that the rows and columns are not independent:

Chi-square statistic: 69.6630377155341
p-value: 1.2528156966101567e-05
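For readers who want to see where a statistic like this comes from, here is a minimal pure-Python sketch of the Pearson chi-square computation on a contingency table. The function name chi_square and the small table are made up for illustration; this is not the code that produced the numbers above.

```python
def chi_square(table):
    """Return (statistic, degrees of freedom) for a table given as a list of rows."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in cell (i, j) if rows and columns were independent
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected

    dof = (len(row_sums) - 1) * (len(col_sums) - 1)
    return stat, dof


# Small made-up table, NOT the data from the question
stat, dof = chi_square([[10, 20], [20, 10]])
print(stat, dof)
```

For the 10 x 4 table in the question the degrees of freedom would be (10 - 1) * (4 - 1) = 27. If SciPy is installed, the p-value can then be obtained with scipy.stats.chi2.sf(stat, dof).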

Now I fit the model to the data:

import pandas as pd

dummy_contingency = pd.DataFrame(dummy_contingency)

ca_dummy = CA(n_components=2)  # Number of components for correspondence analysis
ca_dummy.fit(dummy_contingency)

The plot:

fig = ca_dummy.plot(
    X=dummy_contingency)
fig

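How can I add labels to this plot? An example posted by someone else (Using mca package in Python) uses the function plot_coordinates(), which has an option for placing labels. That function no longer seems to be available in the prince package, though; plot() has to be used instead, and it has no option for placing labels. Thanks for your help.
Edit: an example of output with labels: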

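The text next to each point in that plot, such as "strawberry", "banana", "yogurt", is the kind of label I am looking for. In my case the blue points should be labeled with the index values 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and the orange points with the column names "v1", "v2", "v3", "v4".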

pandas

Source: https://stackoverflow.com/questions/76475602/how-to-annotate-scatter-points-plotted-with-the-prince-library

1 Answer

beq87vna1#

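Adding annotations to a scatter plot is covered in How to do annotations with Altair; however, that does not include the steps needed to plot the points from ca.
To annotate a correspondence-analysis plot, the .column_coordinates and .row_coordinates must be extracted from the ca model. These are the points shown on the plot, not the values in df.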

import pandas as pd
import prince
import altair as alt

# convert the dictionary of data to a dataframe
df = pd.DataFrame(dummy_contingency)

# create the model
ca = prince.CA()

# fit the model
ca = ca.fit(df)

# extract the column coordinate dataframe, and change the column names
cc = ca.column_coordinates(df).reset_index()
cc.columns = ['name', 'x', 'y']

# extract the row coordinates dataframe, and change the column names
rc = ca.row_coordinates(df).reset_index()
rc.columns = ['name', 'x', 'y']

# combine the dataframes
crc_df = pd.concat([cc, rc], ignore_index=True)

# plot and annotate
points = ca.plot(df)

annot = alt.Chart(crc_df).mark_text(
    align='left',
    baseline='middle',
    fontSize = 20,
    dx = 7
).encode(
    x='x',
    y='y',
    text='name'
)

points + annot

Note that the plot already has hover annotations, even without adding annot.
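Assigning cc.columns = ['name', 'x', 'y'] by hand works only when the model has exactly two components. A slightly more defensive way to build the label table is sketched below, using pandas only. The coordinate values are invented stand-ins for the output of ca.row_coordinates(df) and ca.column_coordinates(df), and the helper label_frame and the extra kind column are hypothetical names, not part of prince.

```python
import pandas as pd

# Invented coordinates standing in for ca.row_coordinates(df) and
# ca.column_coordinates(df); real values come from the fitted model.
rc = pd.DataFrame({0: [0.10, -0.25], 1: [0.02, 0.07]}, index=["0", "1"])
cc = pd.DataFrame({0: [0.30, -0.12], 1: [-0.05, 0.04]}, index=["v1", "v2"])


def label_frame(coords, kind):
    """Keep the first two components and turn the index into a 'name' column."""
    out = coords.iloc[:, :2].copy()
    out.columns = ["x", "y"]
    out = out.rename_axis("name").reset_index()
    out["kind"] = kind  # records whether the label belongs to a row or a column
    return out


labels = pd.concat(
    [label_frame(rc, "row"), label_frame(cc, "column")],
    ignore_index=True,
)
print(labels)
```

The kind column is optional. It could be passed to the text layer as color='kind' in encode() so that row and column labels are drawn in different colors.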

The annotations can also be added without combining cc and rc into a single DataFrame.

points = ca.plot(df)

annot1 = alt.Chart(cc).mark_text(
    align='left',
    baseline='middle',
    fontSize = 20,
    dx = 7
).encode(
    x='x',
    y='y',
    text='name'
)

annot2 = alt.Chart(rc).mark_text(
    align='left',
    baseline='middle',
    fontSize = 20,
    dx = 7
).encode(
    x='x',
    y='y',
    text='name'
)

points + annot1 + annot2
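As background on why the labels have to be placed at the model's coordinates rather than at values from df, here is a rough NumPy sketch of the textbook correspondence-analysis computation (principal coordinates from an SVD of the standardized residuals). It is not prince's actual implementation, the function name ca_coordinates and the example table are made up, and individual axes may come out with the opposite sign to what a library reports. It also assumes every row and column has a nonzero total.

```python
import numpy as np


def ca_coordinates(table, n_components=2):
    """Textbook CA: return (row coords, column coords, inertia per axis)."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()      # correspondence matrix
    r = P.sum(axis=1)    # row masses
    c = P.sum(axis=0)    # column masses

    # Standardized residuals from the independence model r * c^T
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)

    # Principal coordinates: scale singular vectors by sigma and by the masses
    rows = (U * sigma) / np.sqrt(r)[:, None]
    cols = (Vt.T * sigma) / np.sqrt(c)[:, None]
    return rows[:, :n_components], cols[:, :n_components], sigma ** 2


# Made-up 3 x 3 table, NOT the data from the question
example = [[20, 5, 5], [5, 20, 5], [5, 5, 20]]
rows, cols, inertia = ca_coordinates(example)
print(rows)
print(cols)
```

The sum of inertia equals the chi-square statistic of the table divided by its grand total, which is one way to sanity-check the result.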

Answered 2023-06-20
