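I am trying to draw a diurnal plot showing data over a specified duration. With the code below, I can plot curves (snapshot below) showing the mean of the data for each time of day.
Is there a way to take the dataframe below and average the data in the desired column at each time of day across all months (for example all "01:00:00" readings together, all "02:00:00" readings together, and so on), then plot the result as a single line instead of a separate line for each month?
[Image: diurnal plot snapshot]
For the code, I have included a sample dataframe (the original dataframe has thousands of rows and several columns):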
import pandas as pd
import matplotlib as pt
from matplotlib import dates as d
import datetime as dt
import matplotlib.pyplot as plt
import numpy as np
dataframe = pd.DataFrame(
columns = ['From Date', 'NO', 'NO2', 'NOx', 'CO', 'Ozone'],
data = [
['2018-12-30 00:00:00', 5.856666, 39.208341, 28.97, 331.280881, 19.778900],
['2018-12-30 01:00:00', 4.050059, 16.262145, 13.53, 454.031703, 25.075286],
['2018-12-30 02:00:00', 4.057806, 15.293990, 12.96, 466.502681, 24.825294],
['2018-12-30 03:00:00', 3.835476, 13.526193, 11.71, 446.526784, 25.033312],
['2018-12-30 04:00:00', 4.230690, 11.251531, 10.70, 355.638469, 25.748796],
['2020-01-01 00:00:00', 1, 2, 6.91, 4, 5],
['2020-01-01 01:00:00', 5, 10, 7.37, 13.2, 9],
['2020-01-01 02:00:00', 4, 13, 8.28, 4, 4],
['2020-01-01 03:00:00', 3, 9, 8.57, 3, 5],
['2020-01-01 04:00:00', 2, 4, 9.12, 4, 6],
['2020-02-01 00:00:00', 2, 3, 6, 8, 9],
['2020-02-01 01:00:00', 5, 10, 7.37, 10.2, 8],
['2020-02-01 02:00:00', 4, 13, 8.28, 2, 5],
['2020-02-01 03:00:00', 3, 9, 8.57, 7, 3],
['2020-02-01 04:00:00', 2, 4, 9.12, 2, 2]
]
)
dataframe['From Date'] = pd.to_datetime(dataframe['From Date'])
dataframe = dataframe.set_index('From Date')
dataframe.replace('NoData', np.nan, inplace= True)
dataframe['Ozone']=dataframe['Ozone'].astype(float)
dataframe['NOx']=dataframe['NOx'].astype(float)
dataframe['NO']=dataframe['NO'].astype(float)
dataframe['NO2']=dataframe['NO2'].astype(float)
dataframe['Month'] = dataframe.index.map(lambda x: x.strftime("%m"))
dataframe['Time'] = dataframe.index.map(lambda x: x.strftime("%H:%M"))
# Create one figure with a single axes; each month is drawn on it as its own line
fig, ax = plt.subplots(1, figsize=(12,6))
for month in dataframe['Month'].unique():
    # Select one month, then summarize every column per clock time within it
    df = dataframe.loc[dataframe['Month'] == month]
    df = df.groupby('Time').describe()
    ax.plot(df.index, df['NO2']['mean'], linewidth=6.0, label=month)
ax.legend()
ticks = ax.get_xticks()
ax.set_xticks(np.linspace(ticks[0], d.date2num(
d.num2date(ticks[-1]) + dt.timedelta(hours=3)), 5))
ax.set_xticks(np.linspace(ticks[0], d.date2num(
d.num2date(ticks[-1]) + dt.timedelta(hours=3)), 25), minor=True)
ax.set_title("NO2 conc") # <--------------
ax.set_xlabel("Time") # <--------------
ax.set_ylabel("Concn in ppb")# <--------------
ax.plot(df.index, df['NO2']['75%'], color='g')
ax.plot(df.index, df["NO2"]['25%'], color='r')
ax.fill_between(df.index, df["NO2"]['mean'], df["NO2"]['75%'], alpha=.5, facecolor='y')
ax.fill_between(df.index, df["NO2"]['mean'], df["NO2"]['25%'], alpha=.5, facecolor='r')
fig.tight_layout(pad=2.0)
1 answer
0x6upsns1#
You can use pd.Grouper.
I used a 3-hour frequency, but there are more options. Check the pd.Grouper documentation.
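As a minimal sketch of that idea (not necessarily the exact code this answer used): the example below rebuilds only the NO2 column from the question's sample data, bins it into 3-hour blocks with pd.Grouper, and then groups those bins by clock time so that every day and month collapses onto one curve. The variable names `no2`, `binned`, `diurnal`, and `hourly` are made up for illustration, and the second grouping step by clock time is an assumption about how to get from per-day bins to a single line.

```python
import pandas as pd

# Illustrative data only: the NO2 column from the question's sample frame.
no2 = pd.Series(
    [39.208341, 16.262145, 15.293990, 13.526193, 11.251531,
     2, 10, 13, 9, 4,
     3, 10, 13, 9, 4],
    index=pd.to_datetime(
        [f"2018-12-30 0{h}:00:00" for h in range(5)]
        + [f"2020-01-01 0{h}:00:00" for h in range(5)]
        + [f"2020-02-01 0{h}:00:00" for h in range(5)]
    ),
    name="NO2",
)

# Step 1: bin along the timeline in 3-hour blocks with pd.Grouper.
# Bins that fall in the long gaps between sample days are empty (NaN mean)
# and are dropped.
binned = no2.groupby(pd.Grouper(freq="3h")).mean().dropna()

# Step 2 (assumption): collapse all days and months onto one diurnal axis
# by grouping the bins on the clock time of their left edge.
diurnal = binned.groupby(binned.index.strftime("%H:%M")).mean()

# Alternative without 3-hour binning: group every reading by its clock time.
# describe() also yields the 25% and 75% columns used for the shaded bands.
hourly = no2.groupby(no2.index.strftime("%H:%M")).describe()

print(diurnal)
print(hourly[["mean", "25%", "75%"]])
```

Either result has one row per time of day, so something like `ax.plot(hourly.index, hourly['mean'])` should give one line for all months, without the loop over `Month`. Note that `diurnal` is a mean of per-day bin means, which can differ slightly from the overall mean when bins hold different numbers of readings.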