I am running a query in Spark SQL, shown below:
campaign_df = spark.sql('''select CAMPAIGN_ID,CAMPAIGN_NAME,TAGS,
CAMPAIGN_CREATED_DATE,
UPDATED_DATE,
FIRST_SENT,LAST_SENT,
SCHEDULE_TYPE,CHANNELS,ARCHIVED
from pipeline.campaign_details_raw''')
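The date values in columns such as CAMPAIGN_CREATED_DATE and UPDATED_DATE come back in the format "2015-01-03T17:00:07+00:00", while FIRST_SENT comes back in the format "2014-10-26T16:00:00Z". For these columns I want a single, uniform format in the DataFrame: "2014-10-18T17:00:00.000+0000".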
campaign_df.head(2)
[Row(CAMPAIGN_ID='e4b32e76-8707-4406-8c16-c31410239660', CAMPAIGN_NAME='10/18 Push: $10 off $10', TAGS='', CAMPAIGN_CREATED_DATE='2014-10-17T15:11:59+00:00', UPDATED_DATE='2014-10-18T17:00:12+00:00', FIRST_SENT='2014-10-18T17:00:00Z', LAST_SENT='2014-10-18T17:00:00Z', SCHEDULE_TYPE='time_based', CHANNELS='ios_push,', ARCHIVED='False'),
Row(CAMPAIGN_ID='ed06f75e-6e3b-422d-8226-6d279f2be3bf', CAMPAIGN_NAME='10/26 - 40% off Everything - EARLY40', TAGS='', CAMPAIGN_CREATED_DATE='2014-10-24T15:53:06+00:00', UPDATED_DATE='2014-10-26T16:30:00+00:00', FIRST_SENT='2014-10-26T16:00:00Z', LAST_SENT='2014-10-26T16:00:00Z', SCHEDULE_TYPE='time_based', CHANNELS='ios_push,', ARCHIVED='False')]
campaign_df
campaign_df:pyspark.sql.dataframe.DataFrame
CAMPAIGN_ID:string
CAMPAIGN_NAME:string
TAGS:string
CAMPAIGN_CREATED_DATE:string
UPDATED_DATE:string
FIRST_SENT:string
LAST_SENT:string
SCHEDULE_TYPE:string
CHANNELS:string
ARCHIVED:string
Thanks in advance!
1 answer
a11xaf1n1#
Convert all of the formats to an ISO timestamp.
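The answer gives no code, so here is a minimal sketch of one way this might be done. It is not necessarily what the answerer had in mind. The helper names `normalize_ts` and `normalize_columns` are hypothetical, and the sketch assumes that every value is either empty or an ISO 8601 string ending in `Z` or in a `+HH:MM` offset, as in the sample rows in the question. The parsing is plain Python; the Spark part wraps it in a UDF, which is simple but may be slower than built-in column functions on large tables.

```python
from datetime import datetime, timezone


def normalize_ts(value):
    """Return value as 'YYYY-MM-DDTHH:MM:SS.mmm+0000' in UTC, or None if empty."""
    if value is None or value.strip() == "":
        return None
    text = value.strip()
    # datetime.fromisoformat only accepts a trailing 'Z' from Python 3.11 on,
    # so rewrite it as an explicit UTC offset first.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text).astimezone(timezone.utc)
    millis = parsed.microsecond // 1000
    return parsed.strftime("%Y-%m-%dT%H:%M:%S") + ".%03d+0000" % millis


def normalize_columns(df, columns):
    """Apply normalize_ts to each named string column of a Spark DataFrame."""
    # Imported here so the pure-Python helper above works without PySpark.
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    to_iso = F.udf(normalize_ts, StringType())
    for name in columns:
        df = df.withColumn(name, to_iso(F.col(name)))
    return df
```

With the DataFrame from the question, this could be called as `campaign_df = normalize_columns(campaign_df, ['CAMPAIGN_CREATED_DATE', 'UPDATED_DATE', 'FIRST_SENT', 'LAST_SENT'])`. The columns stay strings; only their text format changes.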