Python pandas: convert a DataFrame with a single month/year column into a DataFrame with one row per year

Posted by pb3skfrl on 2023-01-24 in Python

Given a pandas DataFrame of the form

January-2021,0.294 
February-2021,0.252 
March-2021,0.199 
...
January-2022,0.384 
February-2022,0.333 
March-2022,0.271 
...

how can I convert it into a DataFrame with 12 columns (one per month), so that it looks like this?

year,January,February,March,...
2021,0.294,0.252,0.199,...
2022,0.384,0.333,0.271,...

#1 esbemjvw

You can extract the month and year with str.extract, then build the table with pd.crosstab:

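import pandas as pd

# `month-year` is the name of the date column;
# a raw string keeps \w and \d from being treated as string escapes
dates = df['month-year'].str.extract(r'(?P<month>\w+)-(?P<year>\d+)')

# `data` is the name of the data column
pd.crosstab(dates['year'], dates['month'], values=df['data'], aggfunc='first')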

Output:

month  February  January  March
year                           
2021      0.252    0.294  0.199
2022      0.333    0.384  0.271
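Note that the columns above come out in alphabetical order, while the question asks for calendar order. A minimal sketch of one way to reorder them, assuming the column names `month-year` and `data` from the snippet above, English month names, and the default (English) locale for the `calendar` module; `table` and `month_order` are names introduced here for illustration:

```python
import calendar

import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({
    "month-year": ["January-2021", "February-2021", "March-2021",
                   "January-2022", "February-2022", "March-2022"],
    "data": [0.294, 0.252, 0.199, 0.384, 0.333, 0.271],
})

dates = df["month-year"].str.extract(r"(?P<month>\w+)-(?P<year>\d+)")
table = pd.crosstab(dates["year"], dates["month"],
                    values=df["data"], aggfunc="first")

# calendar.month_name[0] is an empty string, so skip it;
# keep only the months that actually occur in the data
month_order = [m for m in list(calendar.month_name)[1:] if m in table.columns]
table = table.reindex(columns=month_order)

print(table)
```

Filtering `month_order` against the existing columns avoids creating all-NaN columns for months that are absent from the data; drop the filter if a fixed 12-column layout is wanted.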

#2 6jjcrrmo

You can first use str.split and then DataFrame.pivot:

# The 'Date' column holds January-2021, ...
# expand=True returns a DataFrame aligned on df's index, so this also
# works when df does not have a default RangeIndex
df[['Month', 'Year']] = df['Date'].str.split('-', expand=True)

# The 'Value' column holds 0.294, ...
df_new = df.pivot(index='Year', columns='Month', values='Value')

print(df_new)

Output:

Month  February  January  March
Year                           
2021      0.252    0.294  0.199
2022      0.333    0.384  0.271
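With str.split the years stay strings and the months are sorted alphabetically. If numeric years and calendar-ordered months are preferred, one possible variant is to parse the dates first. This is a sketch rather than part of the original answer; it assumes the `Date` and `Value` column names used above, English month names, and an English/C locale for `%B` parsing:

```python
import calendar

import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({
    "Date": ["January-2021", "February-2021", "March-2021",
             "January-2022", "February-2022", "March-2022"],
    "Value": [0.294, 0.252, 0.199, 0.384, 0.333, 0.271],
})

# %B is the full month name, %Y the four-digit year
parsed = pd.to_datetime(df["Date"], format="%B-%Y")

# Pivot on the month *number* so the columns sort in calendar order
df_new = (
    df.assign(Year=parsed.dt.year, Month=parsed.dt.month)
      .pivot(index="Year", columns="Month", values="Value")
)

# Turn the month numbers back into names
df_new.columns = [calendar.month_name[int(m)] for m in df_new.columns]

print(df_new)
```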

#3 ubof19bj

I would use pivot to get the desired output.

prepare dataframe

import pandas as pd

data = [('January-2021', 0.294), ('February-2021', 0.252), ('March-2021', 0.199),
        ('January-2022', 0.384), ('February-2022', 0.333), ('March-2022', 0.271)]
df = pd.DataFrame(data, columns=['date', 'val'])

extract month and year

df['month'] = df['date'].apply(lambda x: x.split('-')[0])
df['year'] = df['date'].apply(lambda x: x.split('-')[1])
df.drop(['date'], axis=1, inplace=True)

pivoting the dataframe

df = df.pivot(index='year', columns='month')

rename the columns

df.columns = [c[1] for c in df.columns]
df = df.reset_index()

In the end you get a DataFrame with a year column followed by one column per month.
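One caveat with pivot: it raises a ValueError if the same month/year pair appears more than once. A hedged sketch of a fallback using pivot_table, assuming that averaging duplicates is acceptable for your data; the duplicated January-2021 row below is hypothetical and only there to trigger the error:

```python
import pandas as pd

# Hypothetical data with a duplicated January-2021 entry
data = [("January-2021", 0.294), ("January-2021", 0.306),
        ("February-2021", 0.252)]
df = pd.DataFrame(data, columns=["date", "val"])
df[["month", "year"]] = df["date"].str.split("-", expand=True)

# pivot cannot reshape when an (index, column) pair is not unique
try:
    df.pivot(index="year", columns="month", values="val")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table aggregates the duplicates instead (here: their mean)
result = df.pivot_table(index="year", columns="month",
                        values="val", aggfunc="mean")

print(pivot_failed)
print(result)
```

Other aggregations such as "first" or "sum" can be passed as aggfunc, depending on what a duplicate means in your data.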
