Combining pandas df [datetime] time and date columns into a single datetime column


I'm a bit stuck trying to combine pandas DataFrame columns that hold datetime.date and datetime.time values. The DataFrame looks like this:

    VJNo  VJIdx    lnTime      lnDate
0  32613      1  05:00:00  2023-04-18
1  32613      2  05:01:00  2023-04-18
2  32613      3  05:02:30  2023-04-18
3  32613      5  05:05:30  2023-04-18
4  32613      6  05:06:30  2023-04-18
5  32613      8  05:07:30  2023-04-18
6  32613      9  05:08:30  2023-04-18
7  32613     11  05:10:30  2023-04-18

I wanted to use pandas.Timestamp.combine(date, time), but apparently it does not work on a Series (?). Running:

import pandas as pd

# Defining the data
data = {'VJNo': [32613, 32613, 32613, 32613, 32613, 32613, 32613, 32613],
        'VJIdx': [1, 2, 3, 5, 6, 8, 9, 11],
        'lnTime': ['05:00:00', '05:01:00', '05:02:30', '05:05:30', '05:06:30', '05:07:30', '05:08:30', '05:10:30'],
        'lnDate': ['2023-04-18', '2023-04-18', '2023-04-18', '2023-04-18', '2023-04-18', '2023-04-18', '2023-04-18', '2023-04-18']}

# Create pandas dataframe
df = pd.DataFrame(data)

df['tmp'] = pd.Timestamp.combine( df['lnDate'], df['lnTime'])


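raises the error: combine() argument 1 must be datetime.date, not Series. The values are datetime.date objects, but they are wrapped in a Series. Unfortunately, other solutions, such as the ones found here, do not work either (possibly because of changes in pandas):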

df['tmp'] = df.apply(pd.Timestamp.combine, df['lnDate'], df['lnTime'])


df['tmp'] = df.apply(lambda x: pd.Timestamp.combine(x['lnDate'], x['lnTime']))


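Am I doing something wrong? A last resort would be to convert the date and time to strings and then call pd.to_datetime on them, but I don't think that is the right approach.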

p5cysglq1#

You need to vectorize your function:

import numpy as np

f = np.vectorize(pd.Timestamp.combine)
# f = np.vectorize(lambda d,t: pd.Timestamp.combine(d, t))

df['out'] = f(df['lnDate'], df['lnTime'])

Or use a list comprehension:

df['out'] = [pd.Timestamp.combine(d, t) for d, t in zip(df['lnDate'], df['lnTime'])]


Output:

    VJNo  VJIdx    lnTime      lnDate                 out
0  32613      1  05:00:00  2023-04-18 2023-04-18 05:00:00
1  32613      2  05:01:00  2023-04-18 2023-04-18 05:01:00
2  32613      3  05:02:30  2023-04-18 2023-04-18 05:02:30
3  32613      5  05:05:30  2023-04-18 2023-04-18 05:05:30
4  32613      6  05:06:30  2023-04-18 2023-04-18 05:06:30
5  32613      8  05:07:30  2023-04-18 2023-04-18 05:07:30
6  32613      9  05:08:30  2023-04-18 2023-04-18 05:08:30
7  32613     11  05:10:30  2023-04-18 2023-04-18 05:10:30
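
Both snippets in this answer expect the columns to hold real datetime.date and datetime.time objects, as described in the question text, rather than the strings used in the sample data. A minimal self-contained sketch of the list-comprehension variant, using a hypothetical two-row frame built only for illustration:

```python
import datetime as dt

import pandas as pd

# Hypothetical two-row frame holding real date/time objects (not strings)
df = pd.DataFrame({
    "lnDate": [dt.date(2023, 4, 18), dt.date(2023, 4, 18)],
    "lnTime": [dt.time(5, 0, 0), dt.time(5, 1, 0)],
})

# Combine element by element; pandas stores the resulting Timestamps
# as a datetime64 column
df["out"] = [
    pd.Timestamp.combine(d, t) for d, t in zip(df["lnDate"], df["lnTime"])
]
```

If the columns contain strings instead, pd.Timestamp.combine rejects them, so they would have to be parsed first.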

xfb7svmp2#

Part of the problem is that your values are encoded as strings. You can get the desired result by converting them to datetime first and then combining:

df['lnDate'] = pd.to_datetime(df['lnDate'])
df['lnTime'] = pd.to_datetime(df['lnTime'], format='%H:%M:%S').dt.time

# Sourced from: https://stackoverflow.com/a/39474812/10521959
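# pd.datetime is no longer available in current pandas; pd.Timestamp.combine does the same job
df['tmp'] = df.apply(lambda r: pd.Timestamp.combine(r['lnDate'], r['lnTime']), axis=1)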
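
The axis argument is what separates this call from the lambda attempt in the question. With axis=1 the function receives one row at a time, while the default axis=0 passes whole columns. A small sketch of the difference, using a hypothetical two-row frame and a helper flag named failed that exists only for this illustration:

```python
import pandas as pd

# Hypothetical two-row frame, starting from strings as in the sample data
df = pd.DataFrame({
    "lnDate": ["2023-04-18", "2023-04-18"],
    "lnTime": ["05:00:00", "05:01:00"],
})
df["lnDate"] = pd.to_datetime(df["lnDate"])
df["lnTime"] = pd.to_datetime(df["lnTime"], format="%H:%M:%S").dt.time

# axis=1 hands each row to the lambda, so both lookups return scalars
df["tmp"] = df.apply(
    lambda r: pd.Timestamp.combine(r["lnDate"], r["lnTime"]), axis=1
)

# With the default axis=0 the lambda receives one column at a time,
# indexed by row label, so looking up "lnDate" raises KeyError
try:
    df.apply(lambda c: pd.Timestamp.combine(c["lnDate"], c["lnTime"]))
    failed = False
except KeyError:
    failed = True
```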

Alternatively, if the data is reliably encoded as strings, you can concatenate the date and time columns and parse the result as a string:

# Sourced from https://stackoverflow.com/a/17978188/10521959
df['tmp'] = pd.to_datetime(df['lnDate'] + ' ' + df['lnTime'])
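
Another option that avoids both string concatenation and a Python-level loop is to parse the date column to timestamps and the time column to timedeltas, then add them. This is not from the original answer; it is a sketch that assumes the time strings are in HH:MM:SS form, shown on a hypothetical two-row frame:

```python
import pandas as pd

# Hypothetical two-row frame with string columns, as in the sample data
df = pd.DataFrame({
    "lnDate": ["2023-04-18", "2023-04-18"],
    "lnTime": ["05:00:00", "05:02:30"],
})

# Dates parse to midnight timestamps, times parse to offsets from midnight;
# adding the two gives the combined datetime column
df["tmp"] = pd.to_datetime(df["lnDate"]) + pd.to_timedelta(df["lnTime"])
```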
