python - How to plot a graph by averaging individual columns of a DataFrame based on the timestamp?

Posted by yc0p9oo0 on 2021-08-25 in Java

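I am trying to plot a diurnal curve showing data over a specified duration. With the code below I can plot curves (snapshot below) showing the average of the data at each time of day, one curve per month.
Is there a way, for the dataframe below, to plot a single curve that averages the data in the desired column at each given time of day across all months (for example all the "01:00:00" values, all the "02:00:00" values, and so on), and draws the result as one line instead of a separate line for each month?
[Image: snapshot of the diurnal plot]
For the code, I have included a sample dataframe (the original dataframe has thousands of rows and several columns):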

import pandas as pd
import matplotlib as pt
from matplotlib import dates as d
import datetime as dt
import matplotlib.pyplot as plt
import numpy as np
dataframe = pd.DataFrame( 
    columns = ['From Date',   'NO',          'NO2',       'NOx',    'CO',           'Ozone'],           
    data = [
        ['2018-12-30 00:00:00', 5.856666,    39.208341,   28.97,   331.280881,  19.778900],
        ['2018-12-30 01:00:00', 4.050059,    16.262145,   13.53,   454.031703,  25.075286],
        ['2018-12-30 02:00:00', 4.057806,    15.293990,   12.96,   466.502681,  24.825294],
        ['2018-12-30 03:00:00', 3.835476,    13.526193,   11.71,   446.526784,  25.033312],
        ['2018-12-30 04:00:00', 4.230690,    11.251531,   10.70,   355.638469,  25.748796],
        ['2020-01-01 00:00:00',    1,            2,        6.91,    4,             5],
        ['2020-01-01 01:00:00',            5,           10,        7.37,    13.2,          9],
        ['2020-01-01 02:00:00',            4,           13,        8.28,    4,             4],
        ['2020-01-01 03:00:00',            3,           9,         8.57,    3,             5],
        ['2020-01-01 04:00:00',            2,           4,         9.12,    4,             6],
        ['2020-02-01 00:00:00',            2,            3,        6,    8,             9],
        ['2020-02-01 01:00:00',            5,           10,        7.37,    10.2,          8],
        ['2020-02-01 02:00:00',            4,           13,        8.28,    2,             5],
        ['2020-02-01 03:00:00',            3,           9,         8.57,    7,             3],
        ['2020-02-01 04:00:00',            2,           4,         9.12,    2,             2]        
    ]
)
dataframe['From Date'] = pd.to_datetime(dataframe['From Date'])
dataframe = dataframe.set_index('From Date')
dataframe.replace('NoData', np.nan, inplace= True)
dataframe['Ozone']=dataframe['Ozone'].astype(float)
dataframe['NOx']=dataframe['NOx'].astype(float)
dataframe['NO']=dataframe['NO'].astype(float)
dataframe['NO2']=dataframe['NO2'].astype(float)
dataframe['Month'] = dataframe.index.map(lambda x: x.strftime("%m"))
dataframe['Time'] = dataframe.index.map(lambda x: x.strftime("%H:%M"))

# Creates a single axes; the loop below draws one line per month on it

fig, ax = plt.subplots(1, figsize=(12,6))

for month in dataframe['Month'].unique():

    df = dataframe.loc[dataframe['Month'] == month]
    df = df.groupby('Time').describe()
    ax.plot(df.index, df['NO2']['mean'], linewidth=6.0, label=month)
    ax.legend()
    ticks = ax.get_xticks()
    ax.set_xticks(np.linspace(ticks[0], d.date2num(
        d.num2date(ticks[-1]) + dt.timedelta(hours=3)), 5))
    ax.set_xticks(np.linspace(ticks[0], d.date2num(
        d.num2date(ticks[-1]) + dt.timedelta(hours=3)), 25), minor=True)
    ax.set_title("NO2 conc")
    ax.set_xlabel("Time")
    ax.set_ylabel("Concn in ppb")
ax.plot(df.index, df['NO2']['75%'], color='g')
ax.plot(df.index, df["NO2"]['25%'], color='r')
ax.fill_between(df.index, df["NO2"]['mean'], df["NO2"]['75%'], alpha=.5, facecolor='y')
ax.fill_between(df.index, df["NO2"]['mean'], df["NO2"]['25%'], alpha=.5, facecolor='r')    
fig.tight_layout(pad=2.0)
Answer #1, by 0x6upsns

You can use pd.Grouper:

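# 'From Date' is the index after set_index above, so group on the index level;
# selecting several columns needs a list (double brackets)
dataframe.groupby(pd.Grouper(level='From Date', freq='3h'))[['NO', 'NO2']].mean()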

I used a 3-hour frequency, but there are more options. Check the pd.Grouper documentation.
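Note that pd.Grouper with a freq bins rows along the calendar, so readings from different months end up in different bins. If the goal is the single diurnal curve described in the question, one possible approach is to group on the time of day alone. The following is only a minimal sketch under that assumption, not necessarily what the answer intended: the sample values are made up, and the names `df`, `profile`, `q25` and `q75` are hypothetical.

```python
import pandas as pd

# Small made-up sample: hourly NO2 readings on three different days.
df = pd.DataFrame(
    {
        "From Date": pd.to_datetime(
            [
                "2018-12-30 00:00:00", "2018-12-30 01:00:00",
                "2020-01-01 00:00:00", "2020-01-01 01:00:00",
                "2020-02-01 00:00:00", "2020-02-01 01:00:00",
            ]
        ),
        "NO2": [39.0, 16.0, 2.0, 10.0, 3.0, 10.0],
    }
).set_index("From Date")

# Keep only the time of day, so rows from every month share one "HH:MM" key.
df["Time"] = df.index.strftime("%H:%M")

# Aggregate across all dates at once instead of looping over months.
grouped = df.groupby("Time")["NO2"]
profile = pd.DataFrame(
    {
        "mean": grouped.mean(),
        "q25": grouped.quantile(0.25),
        "q75": grouped.quantile(0.75),
    }
)
print(profile)
```

The columns of `profile` could then be passed to `ax.plot` and `ax.fill_between` in the same way the question's code uses the per-month `describe()` output, which gives one line with one quartile band.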
