Filtering positive values in a matplotlib scatter plot

Posted by 8yoxcaq7 on 2023-05-18 in Other

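Suppose a nested dictionary whose first level is keyed by dates and whose second-level dictionaries map a number to the value corresponding to that number. I want to make a scatter plot of the positive values for each date, without losing the dates on the x-axis. That is, if a nested dictionary (one particular date) has no positive values (all zero or negative), I don't want to lose the date, but no y values should be shown for it.
Here is a working example to clarify. It plots everything. I want to keep the same labels and spacing on the x-axis but not show zero or negative values on the y-axis. In this example, I don't want to see any points on the dates '2006-02-03' and '2006-04-05'. Any help would be appreciated.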

import matplotlib.pyplot as plt

dictt = {('2005-01-04', '2006-01-04'): {0: 0, 1: 3, 2: -1, 3: 5}, ('2005-02-03', '2006-02-03'): {0: 0, 1: 0, 2: 0, 3: 0},
('2005-03-07', '2006-03-07'):  {0: -3, 1: 0, 2: 3, 3: 5}, ('2005-04-06', '2006-04-05'):  {0: -2, 1: -3, 2: -1, 3: -2}}

for k in list(dictt.keys()):
    for i in range(4):
        plt.scatter(k[1], dictt[k[0], k[1]][i])

dldeef67 #1

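Which baseline do you prefer: zero (my preference) or the minimum of the plotted values? I provide both solutions!
In addition, the solution labels the points according to the value of the n key. I think that is an improvement, don't you?

Here is the code. It is a bit convoluted because, oh well, your data format is convoluted.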

import matplotlib.pyplot as plt

dictt = {('2005-01-04', '2006-01-04'): {0:  0, 1:  3, 2: -1, 3:  5},
         ('2005-02-03', '2006-02-03'): {0:  0, 1:  0, 2:  0, 3:  0},
         ('2005-03-07', '2006-03-07'): {0: -3, 1:  0, 2:  3, 3:  5}, 
         ('2005-04-06', '2006-04-05'): {0: -2, 1: -3, 2: -1, 3: -2}}

dates = [k[1] for k in dictt.keys()]

by_n = {}  # regroup the data by the inner key n
for (d0, d1), d in dictt.items():
    for n in d.keys():
        if n not in by_n: by_n[n] = []  # empty list
        by_n[n].append((d1, d[n] if d[n]>0 else None))

fig, (ax0, axmin) = plt.subplots(ncols=2, figsize=(8,3), layout='constrained')

# first, using 0 as the bottom line    
for n, by_date in by_n.items(): ax0.scatter(*zip(*by_date), label='n=%d'%n)
for day in dates: ax0.scatter(day, 0, s=0)
ax0.legend()

# next, using the minimum value, that of course we must compute 
min_val = min(v for d in dictt.values() for v in d.values() if v > 0)
for n, by_date in by_n.items(): axmin.scatter(*zip(*by_date), label='n=%d'%n)
for day in dates: axmin.scatter(day, min_val, s=0)
axmin.legend()

plt.show()
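
The `d[n] if d[n]>0 else None` masking above could also be written with NaN, which `scatter` leaves out when it draws the markers. The following is only a minimal sketch under a few assumptions: the data has the same layout as `dictt` in the question, every inner dictionary has the same keys, and the helper names `mask_non_positive` and `plot_masked` are hypothetical (they are not part of matplotlib).

```python
import math

dictt = {('2005-01-04', '2006-01-04'): {0:  0, 1:  3, 2: -1, 3:  5},
         ('2005-02-03', '2006-02-03'): {0:  0, 1:  0, 2:  0, 3:  0},
         ('2005-03-07', '2006-03-07'): {0: -3, 1:  0, 2:  3, 3:  5},
         ('2005-04-06', '2006-04-05'): {0: -2, 1: -3, 2: -1, 3: -2}}


def mask_non_positive(data):
    """Return (dates, columns); columns[n] holds one y per date, NaN where y <= 0."""
    dates = [end for (_start, end) in data]
    columns = {}
    for inner in data.values():
        for n, y in inner.items():
            columns.setdefault(n, []).append(y if y > 0 else math.nan)
    return dates, columns


def plot_masked(data):
    # imported here so that the data helper stays usable without matplotlib
    import matplotlib.pyplot as plt

    dates, columns = mask_non_positive(data)
    fig, ax = plt.subplots()
    # invisible markers (s=0) at y=0 keep every date on the x-axis
    ax.scatter(dates, [0] * len(dates), s=0)
    for n, ys in columns.items():
        ax.scatter(dates, ys, label=f"n={n}")
    ax.legend()
    return fig, ax
```

Calling `plot_masked(dictt)` followed by `plt.show()` should give a plot comparable to the zero-baseline panel above; the invisible anchor markers are what keep the empty dates on the axis.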

rkttyhzu #2

One approach is to use ylim() and then draw the zero values in white. Here is the link to the ylim() documentation: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.ylim.html#matplotlib.pyplot.ylim
The code looks like this:

import matplotlib.pyplot as plt

# named dictt rather than dict, so the built-in dict type is not shadowed
dictt = {
    ("2005-01-04", "2006-01-04"): {0: 0, 1: 3, 2: -1, 3: 5},
    ("2005-02-03", "2006-02-03"): {0: 0, 1: 0, 2: 0, 3: 0},
    ("2005-03-07", "2006-03-07"): {0: -3, 1: 0, 2: 3, 3: 5},
    ("2005-04-06", "2006-04-05"): {0: -2, 1: -3, 2: -1, 3: -2},
}

# collect every y value so the top limit of the y-axis can be set from the maximum;
# set the top as well as the bottom so the plot looks like before
flat_list = [y for inner in dictt.values() for y in inner.values()]
# negative values fall below bottom=0 and are therefore not shown
plt.ylim(bottom=0, top=max(flat_list) + 0.5)
for k in dictt:
    for i in range(4):
        y = dictt[k][i]
        if y == 0:  # remove this branch if you want to keep the 0 values
            plt.scatter(k[1], y, color="white")
        else:
            plt.scatter(k[1], y)


ttcibm8c #3

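Your example is inconsistent with your description.
1. If you want to remove the points on the dates '2006-02-03' and '2006-04-05' but keep all points on the other dates: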

# Uses plt and dictt from the question.
# Plot against integer positions and set the xticks explicitly to keep the date order.
for x, k in enumerate(dictt):
    values = list(dictt[k].values())  # a list, so x and y have matching sizes
    if any(v > 0 for v in values):
        plt.scatter([x] * len(values), values)
plt.xticks(range(len(dictt)), [k[1] for k in dictt])

2. If you want to remove all non-positive points while keeping the date tick labels:

x = 0
for k in list(dictt.keys()):
    for i in range(4):
        y = dictt[k[0], k[1]][i]
        if y > 0:
            plt.scatter(x, y)
    x += 1
plt.xticks(range(4), [i[1] for i in dictt.keys()])
# You can also set the yticks if you like.
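
Both variants above rest on the same idea: plot against integer positions and attach the date labels to the ticks afterwards. A minimal sketch of the second variant as a reusable helper follows, assuming the `dictt` layout from the question. The names `positive_points` and `plot_positive` are hypothetical, and unlike the loop above this version draws all points in a single `scatter` call, so they share one color.

```python
dictt = {('2005-01-04', '2006-01-04'): {0: 0, 1: 3, 2: -1, 3: 5},
         ('2005-02-03', '2006-02-03'): {0: 0, 1: 0, 2: 0, 3: 0},
         ('2005-03-07', '2006-03-07'): {0: -3, 1: 0, 2: 3, 3: 5},
         ('2005-04-06', '2006-04-05'): {0: -2, 1: -3, 2: -1, 3: -2}}


def positive_points(data):
    """Return (xs, ys, labels): integer x positions, positive y values, date labels."""
    labels = [end for (_start, end) in data]
    xs, ys = [], []
    for x, inner in enumerate(data.values()):
        for y in inner.values():
            if y > 0:  # drop zero and negative values
                xs.append(x)
                ys.append(y)
    return xs, ys, labels


def plot_positive(data):
    # imported here so that the data helper stays usable without matplotlib
    import matplotlib.pyplot as plt

    xs, ys, labels = positive_points(data)
    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.set_xticks(range(len(labels)))
    ax.set_xticklabels(labels)
    # fix the x range explicitly so dates without points stay visible
    ax.set_xlim(-0.5, len(labels) - 0.5)
    return fig, ax
```

Separating the filtering from the drawing makes it easy to check which points survive before anything is plotted.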

8mmmxcuj #4

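In the comments, the OP asked for an alternative that can draw points in different colors even when the inner dictionaries contain more than 10 distinct elements. In response, I suggest using a colorbar.

Here is the code that generates the plot with the colorbar. Of course, I changed dictt to dt so that the inner dictionaries have more elements.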

import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import BoundaryNorm

from datetime import datetime ### it's better to convert the date strings to date objects
convert = lambda date: datetime.strptime(date, '%Y-%m-%d').date()

# modified to have a variety of "n"
dt = {('2005-01-04','2006-01-04'):{0:0,1:0,2:0,3:0,4:8,5:0,6:0,7:0,8:0,9:0,10:0}, 
      ('2005-02-03','2006-02-03'):{0:4,1:0,2:0,3:0,4:8,5:0,6:0,7:0,8:0,9:0,10:4},
      ('2005-03-07','2006-03-07'):{0:0,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0,10:0},
      ('2005-04-06','2006-04-05'):{0:6,1:0,2:0,3:4,4:0,5:0,6:0,7:8,8:8,9:0,10:0}}

nmin, nmax = 0, 10

dates = [convert(d1) for d0, d1 in dt.keys()]

by_n = {}
for (d0, d1), d in dt.items():
    for n in d.keys():
        if n not in by_n: by_n[n] = []  # empty list
        by_n[n].append((convert(d1), d[n] if d[n]>0 else None))

# the scatterplot(s)
cm = plt.get_cmap('cool')
fig, ax0  = plt.subplots(figsize=(5,3), layout='constrained')
for day in dates: ax0.scatter(day, 0, s=0)
for n, by_date in by_n.items():
    ax0.scatter(*zip(*by_date), label='n=%d'%n, color=[cm((n-nmin)/(nmax-nmin))], s=80, ec='k')
ax0.set_xticks(dates)

# the colorbar
bounds = [n+0.5 for n in range(nmin-1, nmax+1)]
norm = BoundaryNorm(bounds, 256)
cb = plt.colorbar(ScalarMappable(norm, cm), format='%02d', ax=plt.gca())
# fix the ticks, removing the secondary ones
cb.set_ticks(range(nmin, nmax+1))
cb.set_ticks([], minor=True)
# this is optional, to draw the dividers between colors
for b in bounds:cb.ax.axhline(b, color='k', lw=0.50)

plt.show()
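
A possible variation on the colorbar idea: if all points are drawn by one `scatter` call with `c=`, `cmap=` and `norm=`, the returned collection can be passed to `fig.colorbar` directly, without building a separate `ScalarMappable`. The following is a minimal sketch only. It assumes the same layout as `dt` above, uses a reduced sample (`dt_sample`) for brevity, and the names `flatten_positive` and `plot_with_colorbar` are hypothetical.

```python
from datetime import date, datetime

# reduced sample in the same layout as dt, for illustration only
dt_sample = {('2005-01-04', '2006-01-04'): {0: 0, 4: 8, 10: 0},
             ('2005-02-03', '2006-02-03'): {0: 4, 4: 8, 10: 4}}


def to_date(text):
    return datetime.strptime(text, '%Y-%m-%d').date()


def flatten_positive(data):
    """Return parallel lists (days, ys, ns) for the strictly positive values only."""
    days, ys, ns = [], [], []
    for (_start, end), inner in data.items():
        for n, y in inner.items():
            if y > 0:
                days.append(to_date(end))
                ys.append(y)
                ns.append(n)
    return days, ys, ns


def plot_with_colorbar(data, nmin=0, nmax=10):
    # imported here so that the data helpers stay usable without matplotlib
    import matplotlib.pyplot as plt
    from matplotlib.colors import BoundaryNorm

    days, ys, ns = flatten_positive(data)
    all_days = [to_date(end) for (_start, end) in data]
    cmap = plt.get_cmap('cool')
    # one color bin per integer n: boundaries at nmin-0.5, ..., nmax+0.5
    bounds = [n - 0.5 for n in range(nmin, nmax + 2)]
    norm = BoundaryNorm(bounds, cmap.N)

    fig, ax = plt.subplots(figsize=(5, 3))
    ax.scatter(all_days, [0] * len(all_days), s=0)  # invisible anchors keep all dates
    points = ax.scatter(days, ys, c=ns, cmap=cmap, norm=norm, s=80, edgecolors='k')
    ax.set_xticks(all_days)
    cb = fig.colorbar(points, ax=ax, ticks=list(range(nmin, nmax + 1)))
    cb.minorticks_off()
    return fig, ax
```

With this layout the marker colors and the colorbar come from the same norm, so they stay consistent by construction. The trade-off is that there is no per-n legend entry, since all points belong to one collection.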
