pandas: how to split a DataFrame by year or month

r3i60tvu  posted on 2023-01-15 in Other

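I have a DataFrame containing a time series of hourly data from 2015 to 2020. I want to create a new DataFrame with one column per year (or per month of each year) holding that period's values, so that I can analyze each one separately. Because one of the years is a leap year, I want the columns to share an index, with NaN at the February 29 positions for the other years. I tried a merge after creating two new columns called month and day_of_month, but the index goes crazy and ends up with millions of entries instead of the ~40,000 it should have. In the end it takes up more than 20 GB of RAM and crashes: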

years = pd.DataFrame(index=pd.date_range('2016-01-01', '2017-01-01', freq='1H'))
years['month'] = years.index.month
years['day_of_month'] = years.index.day
gp = data_md[['value', 'month', 'day_of_month']].groupby(pd.Grouper(freq='1Y'))
for name, group in gp:
    years = years.merge(group, right_on=['month', 'day_of_month'], left_on=['month', 'day_of_month'])

RESULT:

    month   day_of_month    value
0   1   1   0
1   1   1   6
2   1   1   2
3   1   1   0
4   1   1   1
...     ...     ...     ...
210259  12  31  6
210260  12  31  2
210261  12  31  4
210262  12  31  5
210263  12  31  1
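
The row count in the RESULT above is consistent with a many-to-many merge. The keys (month, day_of_month) are not unique in hourly data: each day has 24 rows on both sides, so a single merge pairs every left row with 24 right rows, and each further pass of the loop multiplies the count by roughly 24 again. The default inner join also drops February 29 whenever the right-hand year has no such day. A minimal sketch of the effect on two days of synthetic data (`left`, `right` and `merged` are illustrative names, not part of the original code):

```python
import pandas as pd

# Two days of hourly timestamps; only (month, day_of_month) is used as the key.
idx = pd.date_range('2016-01-01', periods=48, freq='h')
left = pd.DataFrame({'month': idx.month, 'day_of_month': idx.day})
right = pd.DataFrame({'month': idx.month, 'day_of_month': idx.day, 'value': range(48)})

# Each day has 24 rows on both sides, so every day yields 24 * 24 = 576 pairs.
merged = left.merge(right, on=['month', 'day_of_month'])
print(len(left), len(right), len(merged))  # 48 48 1152
```

Adding the hour to the merge keys would make them unique per year and stop the multiplication, although a reshape avoids the repeated merges altogether.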

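How can I construct a frame that has one value column for each year or month?
Below is the original frame from which I want to create the new one. For now the only column I need is value.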

value    month  day_of_month    week    day_name    year    hour    season  dailyp  day_of_week     ...     hourly_no_noise     daily_trend     daily_seasonal  daily_residuals     daily_no_noise  daily_trend_h   daily_seasonal_h    daily_residuals_h   daily_no_noise_h    Total
date                                                                                    
2015-01-01 00:00:00     0   1   1   1   Thursday    2015    0   Invierno    165.0   3   ...     NaN     NaN     -9.053524   NaN     NaN     NaN     -3.456929   NaN     NaN     6436996.0
2015-01-01 01:00:00     6   1   1   1   Thursday    2015    1   Invierno    NaN     3   ...     NaN     NaN     -9.053524   NaN     NaN     NaN     -4.879983   NaN     NaN     NaN
2015-01-01 02:00:00     2   1   1   1   Thursday    2015    2   Invierno    NaN     3   ...     NaN     NaN     -9.053524   NaN     NaN     NaN     -5.895367   NaN     NaN     NaN
2015-01-01 03:00:00     0   1   1   1   Thursday    2015    3   Invierno    NaN     3   ...     NaN     NaN     -9.053524   NaN     NaN     NaN     -6.468616   NaN     NaN     NaN
2015-01-01 04:00:00     1   1   1   1   Thursday    2015    4   Invierno    NaN     3   ...     NaN     NaN     -9.053524   NaN     NaN     NaN     -6.441830   NaN     NaN     NaN
...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...
2019-12-31 19:00:00     6   12  31  1   Tuesday     2019    19  Invierno    NaN     1   ...     11.529465   230.571429  -4.997480   -11.299166  237.299166  9.613095    2.805720    1.176491    17.823509   NaN
2019-12-31 20:00:00     3   12  31  1   Tuesday     2019    20  Invierno    NaN     1   ...     11.314857   230.571429  -4.997480   -11.299166  237.299166  9.613095    2.928751    1.176491    17.823509   NaN
2019-12-31 21:00:00     3   12  31  1   Tuesday     2019    21  Invierno    NaN     1   ...     10.141139   230.571429  -4.997480   -11.299166  237.299166  9.613095    1.774848    1.176491    17.823509   NaN
2019-12-31 22:00:00     3   12  31  1   Tuesday     2019    22  Invierno    NaN     1   ...     8.823152    230.571429  -4.997480   -11.299166  237.299166  9.613095    0.663344    1.176491    17.823509   NaN
2019-12-31 23:00:00     6   12  31  1   Tuesday     2019    23  Invierno    NaN     1   ...     6.884636    230.571429  -4.997480   -11.299166  237.299166  9.613095    -1.624980   1.176491    17.823509   NaN

wpx232ag  #1

Is this what you are looking for? Let me know if I should edit my answer.

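import pandas as pd

# Hourly timestamps covering 2016 (a leap year), both endpoints included.
# Newer pandas versions spell the hourly alias 'h' instead of 'H'.
df = pd.DataFrame(pd.date_range('2016-01-01', '2017-01-01', freq='1H'), columns=['Date and Time'])

# Format each timestamp as a 'YYYY-MM-DD' string
df['str'] = df['Date and Time'].dt.strftime('%Y-%m-%d')

# Split the string into zero-padded Year, Month and Day string columns
df[['Year', 'Month', 'Day']] = df['str'].str.split('-', expand=True)

print(df)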

Output:

Date and Time         str  Year Month Day
0    2016-01-01 00:00:00  2016-01-01  2016    01  01
1    2016-01-01 01:00:00  2016-01-01  2016    01  01
2    2016-01-01 02:00:00  2016-01-01  2016    01  01
3    2016-01-01 03:00:00  2016-01-01  2016    01  01
4    2016-01-01 04:00:00  2016-01-01  2016    01  01
...                  ...         ...   ...   ...  ..
8780 2016-12-31 20:00:00  2016-12-31  2016    12  31
8781 2016-12-31 21:00:00  2016-12-31  2016    12  31
8782 2016-12-31 22:00:00  2016-12-31  2016    12  31
8783 2016-12-31 23:00:00  2016-12-31  2016    12  31
8784 2017-01-01 00:00:00  2017-01-01  2017    01  01
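
For the layout the question asks for (one value column per year, a shared index, NaN on February 29 in non-leap years), one possible approach is to index the values by (year, month, day, hour) and unstack the year level. This is only a sketch, assuming a tz-naive hourly index with no duplicate timestamps. The `data_md` built here is a synthetic stand-in for the asker's frame, and `by_year` / `by_month` are illustrative names:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's `data_md`: hourly values, 2015-2019.
idx = pd.date_range('2015-01-01', '2019-12-31 23:00', freq='h')
data_md = pd.DataFrame({'value': np.arange(len(idx)) % 7}, index=idx)

# Re-index the values by calendar components instead of by timestamp.
s = pd.Series(
    data_md['value'].to_numpy(),
    index=pd.MultiIndex.from_arrays(
        [idx.year, idx.month, idx.day, idx.hour],
        names=['year', 'month', 'day_of_month', 'hour'],
    ),
)

# One column per year; rows are the shared (month, day_of_month, hour) slots.
by_year = s.unstack('year')
print(by_year.shape)  # (8784, 5)

# Feb 29 exists only in 2016, so the other years hold NaN in those rows.
print(by_year.loc[(2, 29)])

# One column per (year, month) pair; rows are (day_of_month, hour) slots.
by_month = s.unstack(['year', 'month'])
```

A single year can then be analyzed on its own with `by_year[2016]`, and a single month with `by_month[(2016, 2)]`. Unlike the merge loop, the reshape never produces more rows than there are (month, day, hour) slots.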
