Merging on year-week with plus or minus n weeks in pandas

kcwpcxri  posted on 2023-08-01  in Other

I have the following DataFrames:

import pandas as pd
aa = pd.DataFrame({'id': ['a','a','a','b'],
             'week': ['2022-W13','2022-W14', '2022-W19', '2022-W14']})
bb = pd.DataFrame({'id': ['a','a','a','a','a','a','b','b','b','b'],
              'week': ['2022-W12','2022-W13','2022-W14','2022-W15','2022-W16','2022-W20',
                      '2022-W13','2022-W14','2022-W15','2022-W16'],
             'val': [0,1,2,3,4,5,6,7,8,9]})

  id      week
0  a  2022-W13
1  a  2022-W14
2  a  2022-W19
3  b  2022-W14
  id      week  val
0  a  2022-W12    0
1  a  2022-W13    1
2  a  2022-W14    2
3  a  2022-W15    3
4  a  2022-W16    4
5  a  2022-W20    5
6  b  2022-W13    6
7  b  2022-W14    7
8  b  2022-W15    8
9  b  2022-W16    9

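I want to merge the two DataFrames on id and week, but for week I want to allow a tolerance of plus or minus 1 week.
Explanation:

  • For the first row of aa, where id == 'a' and week == '2022-W13', I need the rows of bb where id == 'a' and week is '2022-W12', '2022-W13', or '2022-W14'
  • For the second row of aa, where id == 'a' and week == '2022-W14', I need the rows of bb where id == 'a' and week is '2022-W13', '2022-W14', or '2022-W15'
  • And so on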
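
To make the matching rule in the bullets concrete, here is a minimal sketch that lists the week labels a single aa row should match. The helper name `neighbor_weeks` and the parameter `n` are made up for illustration, and it assumes the labels are ISO year-week strings, which the question does not state explicitly.

```python
from datetime import date, timedelta

def neighbor_weeks(week_label, n=1):
    """Hypothetical helper: labels of the weeks within +/- n weeks of week_label.

    Assumes week_label is an ISO year-week string such as '2022-W13'.
    """
    year, week = week_label.split("-W")
    # Monday of the given ISO week
    monday = date.fromisocalendar(int(year), int(week), 1)
    labels = []
    for k in range(-n, n + 1):
        iso_year, iso_week, _ = (monday + timedelta(weeks=k)).isocalendar()
        labels.append(f"{iso_year}-W{iso_week:02d}")
    return labels

print(neighbor_weeks("2022-W13"))  # ['2022-W12', '2022-W13', '2022-W14']
```

Working on dates rather than on the week number itself means a window that crosses New Year rolls over into the neighbouring year instead of producing a week 0 or week 53 label.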

The output DataFrame should be:

pd.DataFrame({'id': ['a','a','a','a','a','b','b','b'],
              'week': ['2022-W12','2022-W13','2022-W14','2022-W15','2022-W20',
                      '2022-W13','2022-W14','2022-W15'],
             'val': [0,1,2,3,5,
                     6,7,8]})

    id  week        val
0   a   2022-W12    0
1   a   2022-W13    1
2   a   2022-W14    2
3   a   2022-W15    3
4   a   2022-W20    5
5   b   2022-W13    6
6   b   2022-W14    7
7   b   2022-W15    8

snz8szmq  1#

You can replicate the DataFrame with the week shifted by +/- 1 week, then merge:

#get the date (Monday of the week) from the week column
aa["week"] = pd.to_datetime(aa["week"]+"-1", format="%Y-W%W-%w")

#repeat the dataframe with +/- 1 week
left = pd.concat([aa, aa.assign(week=aa["week"]+pd.DateOffset(days=-7)),aa.assign(week=aa["week"]+pd.DateOffset(days=7))])

#convert date back to week number to merge
left["week"] = left["week"].dt.strftime("%Y-W%W")

#merge to get the output
output = left.drop_duplicates().sort_values(by=["id","week"]).merge(bb)

>>> output
  id      week  val
0  a  2022-W12    0
1  a  2022-W13    1
2  a  2022-W14    2
3  a  2022-W15    3
4  a  2022-W20    5
5  b  2022-W13    6
6  b  2022-W14    7
7  b  2022-W15    8
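
Since the title asks about plus or minus n weeks, the same idea might be generalized by concatenating one shifted copy per offset. This is only a sketch: the variable `n` and the names `monday` and `shifted` are introduced here for illustration, and it keeps the `%W` week numbering used in the answer above.

```python
import pandas as pd

aa = pd.DataFrame({'id': ['a', 'a', 'a', 'b'],
                   'week': ['2022-W13', '2022-W14', '2022-W19', '2022-W14']})
bb = pd.DataFrame({'id': ['a'] * 6 + ['b'] * 4,
                   'week': ['2022-W12', '2022-W13', '2022-W14', '2022-W15', '2022-W16', '2022-W20',
                            '2022-W13', '2022-W14', '2022-W15', '2022-W16'],
                   'val': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]})

n = 1  # hypothetical window size, in weeks

# Monday of each week in aa (same %W convention as the answer above)
monday = pd.to_datetime(aa["week"] + "-1", format="%Y-W%W-%w")

# One copy of aa per offset k in [-n, n], with the week label shifted by k weeks
shifted = pd.concat(
    [aa.assign(week=(monday + pd.Timedelta(weeks=k)).dt.strftime("%Y-W%W"))
     for k in range(-n, n + 1)]
)

# Drop duplicate keys, inner-merge with bb, and sort for a stable row order
output = (shifted.drop_duplicates()
          .merge(bb, on=["id", "week"])
          .sort_values(by=["id", "week"])
          .reset_index(drop=True))
print(output)
```

With n = 1 this should reproduce the eight rows shown above; unlike the original snippet, it leaves aa itself unmodified.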

x4shl7ld  2#

Try this:

def convert_to_datetime(row):
    # Return the Mondays of the previous, current, and next week for this row
    year, week_number = row['week'].split('-W')
    base_date = pd.to_datetime(f"{year}-W{week_number}-1", format="%Y-W%U-%w")
    one_week_earlier = base_date - pd.DateOffset(weeks=1)
    one_week_later = base_date + pd.DateOffset(weeks=1)
    return one_week_earlier, base_date, one_week_later

# Apply the conversion to each row (a tuple of three dates per row), then explode into one row per date
aa['week'] = aa.apply(convert_to_datetime, axis=1)
aa = aa.explode(column='week').reset_index(drop=True)

def convert_to_week_format(ts):
    # Zero-pad the week number (assumes bb uses labels such as '2022-W05' for single-digit weeks)
    return f"{ts.year}-W{ts.weekofyear:02d}"

# Convert the dates back to the original week format
aa['week'] = aa['week'].apply(convert_to_week_format)

aa.merge(bb).drop_duplicates()

Output:

    id  week        val
0   a   2022-W12    0
1   a   2022-W13    1
3   a   2022-W14    2
5   a   2022-W15    3
6   a   2022-W20    5
7   b   2022-W13    6
8   b   2022-W14    7
9   b   2022-W15    8
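
Both answers round-trip through `%W` / `%U` week numbers, which coincide with ISO week numbers in 2022 but not in every year (2020 is one where they differ). If the labels are meant as ISO weeks, which is an assumption about the data, an alternative sketch is to merge on id only and then filter by the distance between the two weeks. The helper `week_to_monday` and the variable `n` are hypothetical names.

```python
from datetime import date
import pandas as pd

def week_to_monday(week_label):
    """Hypothetical helper: 'YYYY-Www' (ISO) -> Timestamp of that week's Monday."""
    year, week = week_label.split("-W")
    return pd.Timestamp(date.fromisocalendar(int(year), int(week), 1))

aa = pd.DataFrame({'id': ['a', 'a', 'a', 'b'],
                   'week': ['2022-W13', '2022-W14', '2022-W19', '2022-W14']})
bb = pd.DataFrame({'id': ['a'] * 6 + ['b'] * 4,
                   'week': ['2022-W12', '2022-W13', '2022-W14', '2022-W15', '2022-W16', '2022-W20',
                            '2022-W13', '2022-W14', '2022-W15', '2022-W16'],
                   'val': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]})

n = 1  # hypothetical tolerance, in weeks

# Pair every aa row with every bb row that shares the same id
pairs = aa.rename(columns={"week": "week_aa"}).merge(bb, on="id")

# Signed distance in days between the bb week and the aa week
diff_days = (pd.to_datetime(pairs["week"].map(week_to_monday))
             - pd.to_datetime(pairs["week_aa"].map(week_to_monday))).dt.days

# Keep bb rows within the tolerance, once each
matched = (pairs.loc[diff_days.abs() <= 7 * n, ["id", "week", "val"]]
           .drop_duplicates()
           .sort_values(["id", "week"])
           .reset_index(drop=True))
print(matched)
```

The intermediate frame grows with the number of aa rows times bb rows per id, so this trades memory for not having to rebuild week strings; for large groups the concat-and-merge approach is likely cheaper.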
