pandas: the markers in my 5D scatter plot are too large; how do I change the size attribute?

Posted by wi3ka0sx on 2023-03-06 in Other

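I have been following this tutorial: https://towardsdev.com/multi-dimension-visualization-in-python-part-i-85c13e9b7495
https://towardsdev.com/multi-dimension-visualization-in-python-part-ii-8c56d861923a
The markers in my 5D scatter plot come out far too large. How do I change the size so that they are 1/10 or 1/50 of their current size?
The CSV looks like this: [Image: the CSV contents]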

The plot, with oversized scatter circles: [Image: the resulting plot]

The code:

import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib as mpl
import numpy as np
import seaborn as sns

dataidxbluechip = pd.read_csv("/home/browni/LasthrimProjection/Python/csv/idxbluechipstocks.csv", index_col=0, parse_dates = True)
dataidxpenny = pd.read_csv("/home/browni/LasthrimProjection/Python/csv/idxpennystocks.csv", index_col=0, parse_dates = True)

# Print first 5 rows of data
print(dataidxbluechip.head())
print(dataidxpenny.head())

# Store stocks type as an attribute
dataidxbluechip['stocks_type'] = 'IDX Blue chip'
dataidxpenny['stocks_type'] = 'IDX Penny Stocks'

# bucket stocks quality scores into qualitative quality labels
dataidxbluechip['PER 2023'] = dataidxbluechip['per2023'].apply(lambda value: 'low'
        if value <= 10 
        else 'medium' if value <= 15
        else 'high')
dataidxbluechip['PER 2023'] = pd.Categorical(dataidxbluechip['PER 2023'],
            categories=['low','medium','high'])

dataidxpenny['PER 2023'] = dataidxpenny['per2023'].apply(lambda value: 'low'
        if value <= 10 
        else 'medium' if value <= 15
        else 'high')
dataidxpenny['PER 2023'] = pd.Categorical(dataidxpenny['PER 2023'],
            categories=['low','medium','high'])

allstocks = pd.concat([dataidxbluechip, dataidxpenny])

# Print the first 5 rows after concatenation; the new 'PER 2023' column
# holds the PER level (low / medium / high) derived from per2023
print(allstocks.head())

# Visualizing 5-D mix data using bubble charts
# leveraging the concepts of hue, size and depth

g = sns.FacetGrid(allstocks, col="stocks_type", hue='PER 2023',
                  col_order=['IDX Blue chip', 'IDX Penny Stocks'],
                  hue_order=['low', 'medium', 'high'],
                  aspect=1.2, palette=sns.light_palette('navy', 4)[1:])

# The size='bvps2023' mapping seems to be the problem: the markers come out huge
g.map_dataframe(sns.scatterplot, "pricejan2010", "debtequityratio2023", alpha=0.9,
                edgecolor='white', linewidth=0.5, size='bvps2023',
                sizes=(allstocks['bvps2023'].min(), allstocks['bvps2023'].max()))

fig = g.fig 
fig.subplots_adjust(top=0.8, wspace=0.3)
fig.suptitle('Stocks Type - Book Value - Debt Equity Ratio - PER 2023', fontsize=14)
l = g.add_legend(title='Stocks PER Quality Class')

plt.show()
fig.savefig('5dscatterplot.png')

ffx8fchx1#

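size= names the DataFrame column that determines the marker size. For example, size='weight' uses the weight column to set the point sizes: the smallest weight maps to the smallest point size and the largest weight to the largest point size. By default, seaborn picks a reasonable size range on its own. sizes=(min_size, max_size) sets a different range. Passing sizes=(mpg['weight'].min(), mpg['weight'].max()) sets the point size range to (1613, 5140), and those sizes are very large. If you don't like the default (sizes=None), you can try your own range, e.g. sizes=(10, 150).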
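To make the mapping concrete, here is a minimal pure-Python sketch of a linear rescaling from data values onto a (min_size, max_size) range. The helper name map_to_sizes is made up for illustration, and this is only an approximation of what seaborn does for a numeric size variable, not its actual implementation.

```python
def map_to_sizes(values, sizes):
    """Linearly rescale values onto the (min_size, max_size) range.

    Hypothetical helper for illustration; seaborn's internals may differ.
    """
    lo, hi = min(values), max(values)
    min_size, max_size = sizes
    if hi == lo:
        # Arbitrary choice for this sketch: identical values all get min_size.
        return [float(min_size) for _ in values]
    span = max_size - min_size
    return [min_size + (v - lo) / (hi - lo) * span for v in values]


# A few weights spanning the mpg range quoted above (1613 to 5140).
weights = [1613, 2500, 3376.5, 5140]

# Using the data range itself as the size range leaves the values unchanged,
# so the marker sizes end up in the thousands.
huge = map_to_sizes(weights, (1613, 5140))

# A small fixed range keeps the same ordering with much smaller markers.
small = map_to_sizes(weights, (10, 150))

print(huge)
print(small)
```

The point is that sizes= only sets the output range; the data column decides where each point falls inside it.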
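Note that marker sizes are measured in points**2, where one point is 1/72 inch; points are commonly used for font sizes, e.g. a 12-point font. A marker size of 100 is therefore about 10 points wide (sqrt(100) = 10), comparable to a character in a 10-point font.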
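As a quick check of that arithmetic, the sketch below converts a marker size in points**2 into an approximate width in points and inches. The function names are made up for illustration, and it assumes the marker width is simply the square root of the size.

```python
import math

POINTS_PER_INCH = 72  # one point is 1/72 inch


def marker_width_points(size_points2):
    """Approximate marker width in points for a size given in points**2."""
    return math.sqrt(size_points2)


def marker_width_inches(size_points2):
    """The same width expressed in inches."""
    return marker_width_points(size_points2) / POINTS_PER_INCH


for size in (10, 100, 150, 1613, 5140):
    print(f"size={size:>5}: "
          f"{marker_width_points(size):6.2f} pt, "
          f"{marker_width_inches(size):5.3f} in")
```

Under that assumption, a size of 5140 gives a marker almost one inch across, which explains the oversized circles.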
Here is some code, using seaborn's mpg dataset, that demonstrates what is happening:

import matplotlib.pyplot as plt
import seaborn as sns

mpg = sns.load_dataset('mpg')

g = sns.FacetGrid(mpg, col='origin', hue='cylinders',
                  aspect=1.2, palette='turbo')

g.map_dataframe(sns.scatterplot, 'model_year', 'mpg', alpha=0.9,
                edgecolor='white', linewidth=0.5, size='weight',
                sizes=(mpg['weight'].min(), mpg['weight'].max()))
plt.show()

Large values in sizes= produce huge points: [Image: the plot with oversized markers]

The same code, but now with sizes=(10, 150):

g = sns.FacetGrid(mpg, col='origin', hue='cylinders',
                  aspect=1.2, palette='turbo')

g.map_dataframe(sns.scatterplot, 'model_year', 'mpg', alpha=0.9,
                edgecolor='white', linewidth=0.5, size='weight',
                sizes=(10, 150))
plt.show()
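If you really want markers at 1/10 or 1/50 of the current size, as the question asks, one option is to divide the current range by that factor. This is only a sketch: shrink_size_range is a made-up helper, and the range (1613, 5140) is the mpg example from above rather than your bvps2023 values. Because sizes are areas, dividing by 50 shrinks the marker width by only about a factor of 7 (sqrt(50)).

```python
def shrink_size_range(sizes, factor):
    """Divide both ends of a (min_size, max_size) range by factor.

    Hypothetical helper for illustration only.
    """
    min_size, max_size = sizes
    return (min_size / factor, max_size / factor)


current = (1613, 5140)  # the mpg weight range used in the example above

tenth = shrink_size_range(current, 10)
fiftieth = shrink_size_range(current, 50)

print(tenth)
print(fiftieth)

# Possible use in the question's code (untested there, shown as a comment):
# sizes=shrink_size_range((allstocks['bvps2023'].min(),
#                          allstocks['bvps2023'].max()), 50)
```

A fixed range such as sizes=(10, 150) is usually simpler, since it does not depend on the units of the data column.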
