Regex: unable to extract the date and time from a string

nhn9ugyo  posted on 2022-11-18  in  Other

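I am trying to extract the time and date from the text "FOX News October 27,2022 5:00 pm-6:00 pm PDT", but I have not found a standard way to do it. I tried slicing by string position, but that does not work because the length of the string varies with the month (April through December) and with the time.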

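import pandas as pd

# Note the comma between the two strings; without it Python concatenates them into one.
text = ['FOX News  October 27, 2022 5:00pm-6:00pm PDT',
        'FOX News  April 28, 2022 10:00pm-11:00pm PDT']
df = pd.DataFrame(list(zip(text)), columns=['text'])
df['text'].str[-20:]  # fixed-width slice from the end of each string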

Output:

22 5:00pm-6:00pm PDT

How can I improve the code so that it generalizes and returns the date and the time in two separate columns?

polhcujo1#

You can use dateutil to extract the date:

import pandas as pd
from dateutil import parser

text = ['FOX News  October 27, 2022 5:00pm-6:00pm PDT', 'FOX News  October 28, 2022 5:00pm-6:00pm PDT']
df = pd.DataFrame(list(zip(text)), columns=['text'])
# fuzzy=True lets the parser skip tokens it cannot interpret, such as "FOX News"
df['text'] = df['text'].apply(parser.parse, fuzzy=True)

Output:

>>> df['text']
0   2022-10-27 17:00:00-06:00
1   2022-10-28 17:00:00-06:00
Name: text, dtype: datetime64[ns, tzoffset(None, -21600)]
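
Note that the result above is a single datetime per row, and the -06:00 offset appears to come from the end time "-6:00pm" being read as a UTC offset rather than from PDT. If separate date and time columns are wanted, as the question asks, one option is a regex with Series.str.extract. The following is a minimal sketch, not part of the original answer; the pattern and the group names (month, day, year, start, end) are assumptions based only on the two sample strings in the question.

```python
import pandas as pd

text = [
    'FOX News  October 27, 2022 5:00pm-6:00pm PDT',
    'FOX News  April 28, 2022 10:00pm-11:00pm PDT',
]
df = pd.DataFrame({'text': text})

# Assumed layout: "<Month> <day>, <year> <start>-<end> <zone>".
# The named groups are illustrative; adjust the pattern if real data differs.
pattern = (
    r'(?P<month>[A-Z][a-z]+)\s+(?P<day>\d{1,2}),\s*(?P<year>\d{4})\s+'
    r'(?P<start>\d{1,2}:\d{2}\s*[ap]m)\s*-\s*(?P<end>\d{1,2}:\d{2}\s*[ap]m)'
)

# str.extract returns one column per named group; rows that do not match become NaN.
parts = df['text'].str.extract(pattern)

# Rebuild the date from its pieces so that "27,2022" and "27, 2022" parse the same way.
df['date'] = pd.to_datetime(
    parts['month'] + ' ' + parts['day'] + ' ' + parts['year'],
    format='%B %d %Y',
)
# Keep the time range as text; the zone abbreviation is deliberately left out here.
df['time'] = parts['start'] + '-' + parts['end']

print(df[['date', 'time']])
```

Because the pattern anchors on the month name and the comma rather than on character positions, it does not depend on the length of the month or of the times. The time range stays a string here; parsing it into start and end timestamps, or attaching a time zone, would need extra steps.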
