How to change string values in a Spark DataFrame

kb5ga3dv  posted on 2021-05-27  in Spark

I am running the following query in Spark SQL:

campaign_df = spark.sql('''select CAMPAIGN_ID,CAMPAIGN_NAME,TAGS,
                                CAMPAIGN_CREATED_DATE,
                                UPDATED_DATE,
                                FIRST_SENT,LAST_SENT,
                                SCHEDULE_TYPE,CHANNELS,ARCHIVED 
                                from pipeline.campaign_details_raw''')

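The date values come back in two different formats: columns such as CAMPAIGN_CREATED_DATE and UPDATED_DATE look like "2015-01-03T17:00:07+00:00", while FIRST_SENT looks like "2014-10-26T16:00:00Z". I would like all of these columns to use one format in the DataFrame: "2014-10-18T17:00:00.000+0000".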

campaign_df.head(2)
[Row(CAMPAIGN_ID='e4b32e76-8707-4406-8c16-c31410239660', CAMPAIGN_NAME='10/18 Push: $10 off $10', TAGS='', CAMPAIGN_CREATED_DATE='2014-10-17T15:11:59+00:00', UPDATED_DATE='2014-10-18T17:00:12+00:00', FIRST_SENT='2014-10-18T17:00:00Z', LAST_SENT='2014-10-18T17:00:00Z', SCHEDULE_TYPE='time_based', CHANNELS='ios_push,', ARCHIVED='False'),
 Row(CAMPAIGN_ID='ed06f75e-6e3b-422d-8226-6d279f2be3bf', CAMPAIGN_NAME='10/26 - 40% off Everything - EARLY40', TAGS='', CAMPAIGN_CREATED_DATE='2014-10-24T15:53:06+00:00', UPDATED_DATE='2014-10-26T16:30:00+00:00', FIRST_SENT='2014-10-26T16:00:00Z', LAST_SENT='2014-10-26T16:00:00Z', SCHEDULE_TYPE='time_based', CHANNELS='ios_push,', ARCHIVED='False')]
campaign_df
campaign_df:pyspark.sql.dataframe.DataFrame
CAMPAIGN_ID:string
CAMPAIGN_NAME:string
TAGS:string
CAMPAIGN_CREATED_DATE:string
UPDATED_DATE:string
FIRST_SENT:string
LAST_SENT:string
SCHEDULE_TYPE:string
CHANNELS:string
ARCHIVED:string

Thanks in advance!

a11xaf1n  #1

Cast all of these formats to ISO timestamps:

# Alias each cast so the columns keep their original names.
campaign_df = spark.sql('''select CAMPAIGN_ID,CAMPAIGN_NAME,TAGS,
                                cast(CAMPAIGN_CREATED_DATE as timestamp) as CAMPAIGN_CREATED_DATE,
                                cast(UPDATED_DATE as timestamp) as UPDATED_DATE,
                                cast(FIRST_SENT as timestamp) as FIRST_SENT,
                                cast(LAST_SENT as timestamp) as LAST_SENT,
                                SCHEDULE_TYPE,CHANNELS,ARCHIVED 
                                from pipeline.campaign_details_raw''')
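
The cast gives timestamp-typed columns rather than strings in the "2014-10-18T17:00:00.000+0000" layout the question asks for. If the columns should stay strings in that layout, one option is to reformat them with a small Python function wrapped as a UDF. The following is only a sketch: the names `DATE_COLUMNS`, `to_target_format` and `normalize_date_columns` are made up for illustration, and it assumes the inputs carry no fractional seconds, as in the sample rows.

```python
from datetime import datetime, timezone

# Columns from the question that hold ISO 8601 strings (hypothetical constant).
DATE_COLUMNS = ["CAMPAIGN_CREATED_DATE", "UPDATED_DATE", "FIRST_SENT", "LAST_SENT"]


def to_target_format(value):
    """Render '...+00:00' or '...Z' strings as 'yyyy-MM-ddTHH:mm:ss.SSS+0000' in UTC."""
    if not value:
        return None  # nulls and empty strings become null
    # On Python 3.7+, %z accepts both '+00:00' and 'Z'.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z").astimezone(timezone.utc)
    millis = parsed.microsecond // 1000
    return parsed.strftime("%Y-%m-%dT%H:%M:%S") + ".%03d" % millis + parsed.strftime("%z")


def normalize_date_columns(df, columns=DATE_COLUMNS):
    """Apply to_target_format to each listed string column of a Spark DataFrame."""
    # Imported here so the helper above can be tried without Spark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    format_udf = F.udf(to_target_format, StringType())
    for name in columns:
        df = df.withColumn(name, format_udf(F.col(name)))
    return df


print(to_target_format("2014-10-18T17:00:00Z"))       # 2014-10-18T17:00:00.000+0000
print(to_target_format("2014-10-17T15:11:59+00:00"))  # 2014-10-17T15:11:59.000+0000
```

With the original string-typed DataFrame from the question, usage would be `campaign_df = normalize_date_columns(campaign_df)`. A Python UDF is slower than built-in functions. Something like `date_format(to_timestamp(col), "yyyy-MM-dd'T'HH:mm:ss.SSSZ")` may do the same job natively, though the offset it prints likely depends on the session time zone, so it is worth checking on your Spark version.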
