Query to get the date of every Sunday and Saturday in Hive or PySpark

fzwojiic posted on 2021-04-03 in Hive

For example, if the given date is 2020-10-01, the query needs to return two columns, sunday_dates and saturday_dates, containing all the Sunday and Saturday dates after "2020-10-01".
I tried the following, but it doesn't seem to work for me.

spark.sql("select date_sub('2020-10-01', cast(date_format(current_date(),'u')%7 as int)) as sunday_dates").show(10,False)
+------------+
|sunday_dates|
+------------+
|2020-09-29  |
+------------+

Is there any way to achieve this in Hive or PySpark?
Thanks!


x7yiwoj41#

You need to use date_trunc() to get the start of the week, and then date_sub()/date_add() to get the Saturday and Sunday dates.
Create the DataFrame here:

from pyspark.sql import functions as F

df = spark.createDataFrame([("2020-11-02", 1), ("2020-11-03", 2), ("2020-11-04", 3)], ["event_dt", "word"])
df.show()

# Truncate each date to the Monday that starts its week
df = df.withColumn("week_start", F.date_trunc('WEEK', F.col("event_dt")))

# In case you want the backward (previous) weekend
df = df.selectExpr('*', 'date_sub(week_start, 2) as backward_Saturday')
df = df.selectExpr('*', 'date_sub(week_start, 1) as backward_Sunday')

# In case you want the forward (upcoming) weekend
df = df.selectExpr('*', 'date_add(week_start, 5) as forward_Saturday')
df = df.selectExpr('*', 'date_add(week_start, 6) as forward_Sunday')
df.show()

input

+----------+----+
|  event_dt|word|
+----------+----+
|2020-11-02|   1|
|2020-11-03|   2|
|2020-11-04|   3|
+----------+----+

output

+----------+----+-------------------+-----------------+---------------+----------------+--------------+
|  event_dt|word|         week_start|backward_Saturday|backward_Sunday|forward_Saturday|forward_Sunday|
+----------+----+-------------------+-----------------+---------------+----------------+--------------+
|2020-11-02|   1|2020-11-02 00:00:00|       2020-10-31|     2020-11-01|      2020-11-07|    2020-11-08|
|2020-11-03|   2|2020-11-02 00:00:00|       2020-10-31|     2020-11-01|      2020-11-07|    2020-11-08|
|2020-11-04|   3|2020-11-02 00:00:00|       2020-10-31|     2020-11-01|      2020-11-07|    2020-11-08|
+----------+----+-------------------+-----------------+---------------+----------------+--------------+
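Note that the above gives one Saturday/Sunday pair per row. If you instead need every Saturday and Sunday after the given date (as the question asks), the same arithmetic can be checked in plain Python first. This is a minimal sketch assuming an inclusive end date that bounds the range; Python's weekday() numbers Monday as 0 and Sunday as 6:

```python
from datetime import date, timedelta

def weekend_dates(start, end):
    """Collect all Sundays and Saturdays in the inclusive range [start, end]."""
    sundays, saturdays = [], []
    d = start
    while d <= end:
        if d.weekday() == 6:       # Sunday
            sundays.append(d)
        elif d.weekday() == 5:     # Saturday
            saturdays.append(d)
        d += timedelta(days=1)
    return sundays, saturdays

sundays, saturdays = weekend_dates(date(2020, 10, 1), date(2020, 10, 31))
print(saturdays[0], sundays[0])  # first Saturday / Sunday on or after 2020-10-01
```

In Spark 2.4+ the equivalent set-based approach would be to build the date range with sequence() and keep rows where dayofweek() is 1 (Sunday) or 7 (Saturday), e.g. `spark.sql("select explode(sequence(to_date('2020-10-01'), to_date('2020-10-31'))) as d").filter("dayofweek(d) in (1, 7)")` -- treat that as a sketch to verify against your Spark version.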
