Python Pandas - How to take the values from rows 2 and 3 and add them to the row with the max date

Posted by j9per5c4 on 2023-02-02 in Python

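I have the following dataset. I want to take the values from the other cells, put them into the row with the max date, and then drop everything else.
Sample dataset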

ID  MILESTONE_NAME  DESC               Completed_Date  DECISION_DATE  SUBMISSION_DATE  EPUBLISHED_DATE
 1  DECISION        Final Decision     6/6/2017        6/6/2017
 1  DECISION        Response Received  6/5/2017        6/5/2017
 2  SUBMIT          Submission         1/1/2019                       1/1/2019
 2  SUBMIT          Re-Submission      1/20/2019                      1/20/2019
 3  EPUBLICATION    E-Published        2/2/2021                                        2/2/2021
 3  SUBMIT          First Submission   12/1/2020                      12/1/2020

Expected output

ID  MILESTONE_NAME  DESC            Completed_Date  DECISION_DATE  EPUBLICATION_DATE  SUBMISSION_DATE
 1  DECISION        Final Decision  6/6/2017        6/6/2017
 2  SUBMIT          Re-Submission   1/20/2019                                         1/20/2019
 3  EPUBLICATION    E-Published     12/1/2020                      2/2/2021           12/1/2020
Answer #1 by olhwl3o2

Assuming your date columns have already been converted to datetimes, this should work:

# Date columns are aggregated with 'max'; every other column keeps its first value per group.
d = {k: 'max' if v == 'datetime64[ns]' else 'first' for k, v in df.dtypes.items()}

df.groupby('ID', as_index=False).agg(d)
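
The answer assumes the date columns are already datetimes. As a minimal sketch of that preparation step, the block below rebuilds the question's sample data by hand, parses the date columns with `pd.to_datetime`, and then applies the same aggregation idea. The `date_cols` selection by column-name suffix, the `%m/%d/%Y` format, and the use of `pd.api.types.is_datetime64_any_dtype` in place of the string comparison are assumptions made for this illustration, not part of the original answer. `ID` is left out of the dictionary here because it is already the grouping key.

```python
import pandas as pd

# Sample data from the question, typed in by hand; empty cells become None.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3, 3],
    "MILESTONE_NAME": ["DECISION", "DECISION", "SUBMIT", "SUBMIT",
                       "EPUBLICATION", "SUBMIT"],
    "DESC": ["Final Decision", "Response Received", "Submission",
             "Re-Submission", "E-Published", "First Submission"],
    "Completed_Date": ["6/6/2017", "6/5/2017", "1/1/2019",
                       "1/20/2019", "2/2/2021", "12/1/2020"],
    "DECISION_DATE": ["6/6/2017", "6/5/2017", None, None, None, None],
    "SUBMISSION_DATE": [None, None, "1/1/2019", "1/20/2019", None, "12/1/2020"],
    "EPUBLISHED_DATE": [None, None, None, None, "2/2/2021", None],
})

# Assumption: every date column has a name ending in "_DATE" (case-insensitive).
date_cols = [c for c in df.columns if c.upper().endswith("_DATE")]
for col in date_cols:
    # Missing values are parsed as NaT.
    df[col] = pd.to_datetime(df[col], format="%m/%d/%Y")

# Latest value for datetime columns, first non-null value for the others.
d = {
    k: "max" if pd.api.types.is_datetime64_any_dtype(v) else "first"
    for k, v in df.dtypes.items()
    if k != "ID"  # ID is the grouping key, so it is not aggregated
}
result = df.groupby("ID", as_index=False).agg(d)
print(result)
```

Checking the dtype with `is_datetime64_any_dtype` avoids depending on the exact datetime unit in the dtype string.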

Output:

ID  MILESTONE_NAME               DESC Completed_Date DECISION_DATE  SUBMISSION_DATE EPUBLISHED_DATE
0   1  DECISION           Final Decision     2017-06-06    2017-06-06             NaT             NaT
1   2  SUBMIT                 Submission     2019-01-20           NaT      2019-01-20             NaT   
2   3  EPUBLICATION          E-Published     2021-02-02           NaT      2020-12-01      2021-02-02
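
Note that `'first'` takes values in row order, so in the output above ID 2 keeps the DESC `Submission` rather than the `Re-Submission` shown in the expected output. One possible adjustment, sketched below and not part of the original answer, is to sort each ID's rows by `Completed_Date` in descending order before grouping, so that the non-date columns come from the row with the latest date. The sample frame is rebuilt by hand, and `date_cols` and the date format are assumptions of this sketch. The expected output lists 12/1/2020 as the `Completed_Date` for ID 3; this sketch keeps the maximum (2021-02-02), as the answer does.

```python
import pandas as pd

# Sample data from the question, typed in by hand; empty cells become None.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3, 3],
    "MILESTONE_NAME": ["DECISION", "DECISION", "SUBMIT", "SUBMIT",
                       "EPUBLICATION", "SUBMIT"],
    "DESC": ["Final Decision", "Response Received", "Submission",
             "Re-Submission", "E-Published", "First Submission"],
    "Completed_Date": ["6/6/2017", "6/5/2017", "1/1/2019",
                       "1/20/2019", "2/2/2021", "12/1/2020"],
    "DECISION_DATE": ["6/6/2017", "6/5/2017", None, None, None, None],
    "SUBMISSION_DATE": [None, None, "1/1/2019", "1/20/2019", None, "12/1/2020"],
    "EPUBLISHED_DATE": [None, None, None, None, "2/2/2021", None],
})

# Assumption: every date column has a name ending in "_DATE" (case-insensitive).
date_cols = [c for c in df.columns if c.upper().endswith("_DATE")]
for col in date_cols:
    df[col] = pd.to_datetime(df[col], format="%m/%d/%Y")

# Within each ID, put the row with the latest Completed_Date first.
ordered = df.sort_values(["ID", "Completed_Date"], ascending=[True, False])

# Date columns take their maximum; other columns take the first non-null
# value, which now comes from the latest row of each group.
agg = {c: ("max" if c in date_cols else "first") for c in df.columns if c != "ID"}
result = ordered.groupby("ID", as_index=False).agg(agg)
print(result)
```

With this ordering, ID 2 ends up with `Re-Submission` and ID 3 with `E-Published`, while the date columns are unchanged from the output above.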
