pandas: how to find date differences in a DataFrame and exclude the first item

Posted by hkmswyz6 on 2023-05-05 in Other

I have the following dataset:

import pandas as pd

df = pd.DataFrame([['x','iii-2019-10-16','18/07/2019'],
                   ['x','iii-2019-10-16','21/04/2019'],
                   ['x','iii-2019-10-16','12/09/2019'],
                   ['x','zzz-2020-10-25','12/04/2022'],
                   ['y','qqq-2018-05-28','10/12/2020'], 
                   ['y','qqq-2018-05-28','15/02/2018'],
                   ['y','ooo-2019-11-22','30/05/2019'],
                   ['y','rrr-16-12-2020','16/12/2020'],
                   ['z','ppt-2019-12-03','07/02/2018'],
                   ['z','ttt-2019-12-03','28/05/2019'],
                   ['z','ttt-2019-12-03','09/09/2019'],
                   ['z','ttt-2019-12-03','30/09/2019']
                  ],
                  columns=['Car_code','customer_rent_code','Rent_Date'])

I need to find the difference in days between consecutive entries for each Car_code (for example, car x), with the first date of each car left empty.

Answer #1 (1zmg4dgp)

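Take the date difference within each group. Parse Rent_Date as day-first dates, then call diff() per Car_code. The first row of each group comes out as NaN, and abs() keeps the gaps non-negative when the rows are not in chronological order: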

df['Rent_Date'] = pd.to_datetime(df['Rent_Date'], dayfirst=True)
df['Date diff'] = df.groupby('Car_code')['Rent_Date'].diff().abs().dt.days

Output:

   Car_code customer_rent_code  Rent_Date  Date diff
0         x     iii-2019-10-16 2019-07-18        NaN
1         x     iii-2019-10-16 2019-04-21       88.0
2         x     iii-2019-10-16 2019-09-12      144.0
3         x     zzz-2020-10-25 2022-04-12      943.0
4         y     qqq-2018-05-28 2020-12-10        NaN
5         y     qqq-2018-05-28 2018-02-15     1029.0
6         y     ooo-2019-11-22 2019-05-30      469.0
7         y     rrr-16-12-2020 2020-12-16      566.0
8         z     ppt-2019-12-03 2018-02-07        NaN
9         z     ttt-2019-12-03 2019-05-28      475.0
10        z     ttt-2019-12-03 2019-09-09      104.0
11        z     ttt-2019-12-03 2019-09-30       21.0
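
Note that the result above compares each row with the previous row in the order the rows appear, which is why abs() is needed. If the goal is instead the gap between chronologically consecutive rentals of the same car, one option is to sort by date before calling diff(). The following is a minimal sketch on a reduced copy of the data. The column name Days_since_prev and the cast to the nullable Int64 dtype are assumptions for illustration and are not part of the original answer.

```python
import pandas as pd

# Reduced copy of the question's data (cars x and y only), kept short for illustration
df = pd.DataFrame(
    [
        ["x", "18/07/2019"],
        ["x", "21/04/2019"],
        ["x", "12/09/2019"],
        ["y", "10/12/2020"],
        ["y", "15/02/2018"],
    ],
    columns=["Car_code", "Rent_Date"],
)
df["Rent_Date"] = pd.to_datetime(df["Rent_Date"], dayfirst=True)

# Sort chronologically within each car so diff() compares rentals that are
# neighbours in time rather than neighbours in the original row order
ordered = df.sort_values(["Car_code", "Rent_Date"])
gap = ordered.groupby("Car_code")["Rent_Date"].diff().dt.days

# Assignment aligns on the index, so each gap lands on its original row.
# Int64 (hypothetical choice) keeps whole-day counts as integers with <NA>
# for the earliest rental of each car instead of float NaN.
df["Days_since_prev"] = gap.astype("Int64")

print(df)
```

With this approach no abs() is needed, because every difference is taken forward in time, and the empty value falls on the earliest rental of each car rather than on the first row listed.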
