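I followed this tutorial:
https://towardsdev.com/multi-dimension-visualization-in-python-part-i-85c13e9b7495
https://towardsdev.com/multi-dimension-visualization-in-python-part-ii-8c56d861923a
The circles in my 5D scatter plot are far too large. How can I change their size so that they are 1/10 or 1/50 of the current size?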
The CSV looks like this:
The plot with the oversized scatter circles:
Code:
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib as mpl
import numpy as np
import seaborn as sns
dataidxbluechip = pd.read_csv("/home/browni/LasthrimProjection/Python/csv/idxbluechipstocks.csv", index_col=0, parse_dates = True)
dataidxpenny = pd.read_csv("/home/browni/LasthrimProjection/Python/csv/idxpennystocks.csv", index_col=0, parse_dates = True)
# Print first 5 rows of data
print(dataidxbluechip.head())
print(dataidxpenny.head())
# Store stocks type as an attribute
dataidxbluechip['stocks_type'] = 'IDX Blue chip'
dataidxpenny['stocks_type'] = 'IDX Penny Stocks'
# bucket stocks quality scores into qualitative quality labels
dataidxbluechip['PER 2023'] = dataidxbluechip['per2023'].apply(
    lambda value: 'low' if value <= 10
    else 'medium' if value <= 15
    else 'high')
dataidxbluechip['PER 2023'] = pd.Categorical(dataidxbluechip['PER 2023'],
                                             categories=['low', 'medium', 'high'])
dataidxpenny['PER 2023'] = dataidxpenny['per2023'].apply(
    lambda value: 'low' if value <= 10
    else 'medium' if value <= 15
    else 'high')
dataidxpenny['PER 2023'] = pd.Categorical(dataidxpenny['PER 2023'],
                                          categories=['low', 'medium', 'high'])
allstocks = pd.concat([dataidxbluechip, dataidxpenny])
# Print first 5 rows of the concatenated data; the new 'PER 2023'
# column holds the PER level (low/medium/high) derived from per2023
print(allstocks.head())
# Visualizing 5-D mix data using bubble charts
# leveraging the concepts of hue, size and depth
g = sns.FacetGrid(allstocks, col="stocks_type", hue='PER 2023',
                  col_order=['IDX Blue chip', 'IDX Penny Stocks'],
                  hue_order=['low', 'medium', 'high'],
                  aspect=1.2, palette=sns.light_palette('navy', 4)[1:])
# The size='bvps2023' seems problematic.. ask..
g.map_dataframe(sns.scatterplot, "pricejan2010", "debtequityratio2023", alpha=0.9,
                edgecolor='white', linewidth=0.5, size='bvps2023',
                sizes=(allstocks['bvps2023'].min(), allstocks['bvps2023'].max()))
fig = g.fig
fig.subplots_adjust(top=0.8, wspace=0.3)
fig.suptitle('Stocks Type - Book Value - Debt Equity Ratio - PER 2023', fontsize=14)
l = g.add_legend(title='Stocks PER Quality Class')
plt.show()
fig.savefig('5dscatterplot.png')
1 Answer
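size= specifies which column of the dataframe determines the size of the scatter dots. For example, size='weight' uses the weight column: the smallest weight is mapped to the smallest dot size, and the largest weight to the largest dot size. By default, seaborn chooses a reasonable range of sizes. A different range can be set with sizes=(min_size, max_size). Passing (mpg['weight'].min(), mpg['weight'].max()) sets the range of dot sizes to (1613, 5140), which is very large. If you don't like the default (sizes=None), you can experiment with your own range, e.g. sizes=(10, 150).
Note that marker sizes are measured in points**2, where a point is 1/72 inch; points are the unit commonly used for font sizes, as in a 12 point font. So a marker of size 100 is comparable in size to a character in a 10 point font.
Here is some code, starting from seaborn's mpg dataset, that demonstrates what is happening: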
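As a minimal sketch of such a comparison, the following draws the same scatter plot twice, once with the data range passed as sizes= and once with sizes=(10, 150). It uses a small synthetic DataFrame in place of seaborn's mpg dataset so that nothing has to be downloaded; the data values, the random seed, the variable names and the Agg backend are assumptions made for illustration, and only the weight range 1613 to 5140 is taken from the text above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for seaborn's mpg dataset (all values are made up).
rng = np.random.default_rng(0)
n = 50
weight = rng.uniform(1613, 5140, n)
weight[0], weight[1] = 1613, 5140  # pin the weight range quoted in the text
mpg = pd.DataFrame({
    "horsepower": rng.uniform(50, 230, n),
    "mpg": rng.uniform(10, 45, n),
    "weight": weight,
})

fig, (ax_huge, ax_ok) = plt.subplots(ncols=2, figsize=(12, 5))

# Data range used directly as the marker size range (points**2): huge dots.
sns.scatterplot(data=mpg, x="horsepower", y="mpg", size="weight",
                sizes=(mpg["weight"].min(), mpg["weight"].max()),
                legend=False, ax=ax_huge)
ax_huge.set_title("sizes=(weight.min(), weight.max())")

# A modest explicit range: same data, much smaller dots.
sns.scatterplot(data=mpg, x="horsepower", y="mpg", size="weight",
                sizes=(10, 150), legend=False, ax=ax_ok)
ax_ok.set_title("sizes=(10, 150)")

# Read back the marker sizes that were actually drawn.
huge_sizes = ax_huge.collections[0].get_sizes()
ok_sizes = ax_ok.collections[0].get_sizes()
print("data range as sizes:", huge_sizes.min(), huge_sizes.max())
print("sizes=(10, 150):    ", ok_sizes.min(), ok_sizes.max())
```

The printed minimum and maximum should match the two ends of each sizes= range, since the smallest and largest weight are mapped to the smallest and largest dot. To look at the figure, add fig.savefig(...) with a file name of your choice, or remove the matplotlib.use("Agg") line and call plt.show().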
High values for sizes= create huge dots:
The same code, but now with sizes=(10, 150):
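To get a feel for what these numbers mean on the page, the points**2 unit can be converted into an approximate marker width. This is a rough sketch, assuming the width of a marker is the square root of its size in points**2 (the same relation that makes size 100 comparable to a 10 point character); the helper names are hypothetical and not part of seaborn or matplotlib.

```python
import math

POINTS_PER_INCH = 72  # one typographic point is 1/72 inch


def marker_width_points(area_points2):
    """Approximate marker width in points for a size given in points**2."""
    return math.sqrt(area_points2)


def marker_width_inches(area_points2):
    """Approximate marker width in inches for a size given in points**2."""
    return marker_width_points(area_points2) / POINTS_PER_INCH


# Sizes mentioned in the answer: the suggested range and the data range.
for area in (10, 100, 150, 1613, 5140):
    print(f"size {area:>5} -> about {marker_width_points(area):5.1f} pt "
          f"({marker_width_inches(area):.2f} in) wide")
```

Under this assumption, a marker of size 5140 is roughly an inch wide, while one of size 150 is about 12 points wide. In the question's code, this suggests replacing sizes=(allstocks['bvps2023'].min(), allstocks['bvps2023'].max()) with a small explicit range such as sizes=(10, 150) and adjusting it to taste.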