pandas: how to fix "Overflow in int64 addition"

68de4m5k  posted on 2023-06-04  in Other

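I am trying to compute a future date by adding a column that holds a number of days, df['num_days'], to another column, df['sampling_date'], but I get an "Overflow in int64 addition" error. Source code: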

df['sampling_date']=pd.to_datetime(df['sampling_date'], errors='coerce')
df['future_date'] = df['sampling_date'] + pd.to_timedelta(df['num_days'], unit='D')
df['future_date'] = pd.to_datetime(df['future_date']).dt.strftime('%Y-%m-%d')
df['future_date'] = df['future_date'].astype(np.str)
df['future_date'] = np.where(df['num_days']<=0,0, df['future_date'])

The values in column df['num_days'] are [0, 866, 729, 48357555, 567, 478].
I want to run this on a Unix server. Please help me fix it.

b1uwtaje (answer #1)

The problem is this value: 48357555
You can write a simple function like the one below that returns NaT if an error is raised.
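As a quick check of why that value fails, the sketch below compares it with the largest number of days a nanosecond-resolution pandas Timedelta can hold. The names `max_days` and `problem_value` are illustrative only and are not part of the original code.

```python
import pandas as pd

# Largest whole number of days a nanosecond-resolution Timedelta can hold.
max_days = pd.Timedelta.max.days
print(max_days)  # 106751

# The offending entry from df['num_days'] (hypothetical variable name).
problem_value = 48357555
print(problem_value > max_days)  # True: it cannot be represented as a Timedelta
```

Every other value in the column is far below this limit, which is why only this one row causes trouble.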

import numpy as np
import pandas as pd

# Here is an example df
df = pd.DataFrame({
    'sampling_date': ['2022-01-01', '2022-02-01', '2022-03-01', '2022-04-01', '2022-05-01', '2022-06-01'],
    'num_days': [0, 866, 729, 48357555, 567, 478]
})

df['sampling_date'] = pd.to_datetime(df['sampling_date'], errors='coerce')

def calculate_future_date(row):
    try:
        return row['sampling_date'] + pd.to_timedelta(row['num_days'], unit='D')
    except (OverflowError, ValueError):  # out-of-range Timedelta or Timestamp
        return pd.NaT

# Apply the function to each row
df['future_date'] = df.apply(calculate_future_date, axis=1)
df['future_date'] = np.where(df['num_days'] <= 0, df['sampling_date'], df['future_date'])
# strftime leaves missing values for NaT rows; fill them with '0'
df['future_date'] = df['future_date'].dt.strftime('%Y-%m-%d').fillna('0').astype(str)
print(df)

  sampling_date  num_days future_date
0    2022-01-01         0  2022-01-01
1    2022-02-01       866  2024-06-16
2    2022-03-01       729  2024-02-28
3    2022-04-01  48357555           0
4    2022-05-01       567  2023-11-19
5    2022-06-01       478  2023-09-22
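On larger frames a row-wise apply can be slow. The following is a minimal vectorized sketch of the same idea, not the answer's own method. It assumes the sampling dates are ordinary present-day dates and treats negative or out-of-range num_days as invalid, writing '0' for them; the names `max_days`, `safe` and `delta` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'sampling_date': ['2022-01-01', '2022-02-01', '2022-03-01',
                      '2022-04-01', '2022-05-01', '2022-06-01'],
    'num_days': [0, 866, 729, 48357555, 567, 478],
})
df['sampling_date'] = pd.to_datetime(df['sampling_date'], errors='coerce')

# Days that can still be added to each date before passing Timestamp.max.
max_days = (pd.Timestamp.max - df['sampling_date']).dt.days

# Rows whose offset is non-negative and small enough to be representable.
safe = (df['num_days'] >= 0) & (df['num_days'] <= max_days)

# Use a harmless 0-day offset for unsafe rows, then blank those rows out as NaT.
delta = pd.to_timedelta(df['num_days'].where(safe, 0), unit='D')
future = (df['sampling_date'] + delta).where(safe)

# Format valid dates; rows that could not be computed become '0'.
df['future_date'] = future.dt.strftime('%Y-%m-%d').fillna('0')
print(df)
```

Checking the bound before the addition means no exception is ever raised, so no try/except is needed. Adjust the `safe` condition if rows with non-positive num_days should be handled differently.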
