从pyspark中的日期列获取上一季度的月份结束日期,并获取上一年的所有月份结束日期

nzkunb0c  于 2021-07-13  发布在  Spark
关注(0)|答案(1)|浏览(420)

我是python和pyspark的新手。我正在处理一个日期为整数格式的Dataframe。我必须从列中找到最大日期,并根据该日期找到-
最近12个月结束日期。
上一季度月结束日期
上个月结束日期。
这是Dataframe
日期20201219202012102020101220200718200624
最大日期(date)=20201219
输出Dataframe-
自定义日期信息20200731prev quarter20200831prev quarter20200930prev quarter20201130当前年份20201031当前年份20200930当前年份20200831当前年份20200731当前年份20200630当前年份20200531当前年份20200430当前年份20200331当前年份20200229当前年份20200131当前年份20191231当前年份20191231prev年份20191130prev20191031上一年20190930上一年20190831上一年20190731上一年20190630上一年20190531上一年20190430上一年20190331上一年20190228上一年20190131上一年
到目前为止,我已经尝试了一些化学需氧量,我能够得到今年,但在获得季度和上一年卡住

m4pnthwp

m4pnthwp1#

您可以构建一些日期序列并将其全部分解:

maxdate = df.select(F.to_date(F.max('date'), 'yyyyMMdd').cast('string')).head()[0]

df2 = spark.sql(f"""
select explode(transform(sequence(trunc(to_date('{maxdate}'), 'quarter') - interval 2 months, trunc(to_date('{maxdate}'), 'quarter'), interval 1 month), x -> x - interval 1 day)) as custom_date, 'prev quarter' as info

union all

select explode(transform(sequence(trunc(to_date('{maxdate}'), 'month'), trunc(to_date('{maxdate}'), 'month') - interval 11 month, interval -1 month), x -> x - interval 1 day)) as custom_date, 'current year' as info

union all

select explode(transform(sequence(trunc(to_date('{maxdate}'), 'year'), trunc(to_date('{maxdate}'), 'year') - interval 11 month, interval -1 month), x -> x - interval 1 day)) as custom_date, 'prev year' as info
""")

df2.show(99)
+-----------+------------+
|custom_date|        info|
+-----------+------------+
| 2020-07-31|prev quarter|
| 2020-08-31|prev quarter|
| 2020-09-30|prev quarter|
| 2020-11-30|current year|
| 2020-10-31|current year|
| 2020-09-30|current year|
| 2020-08-31|current year|
| 2020-07-31|current year|
| 2020-06-30|current year|
| 2020-05-31|current year|
| 2020-04-30|current year|
| 2020-03-31|current year|
| 2020-02-29|current year|
| 2020-01-31|current year|
| 2019-12-31|current year|
| 2019-12-31|   prev year|
| 2019-11-30|   prev year|
| 2019-10-31|   prev year|
| 2019-09-30|   prev year|
| 2019-08-31|   prev year|
| 2019-07-31|   prev year|
| 2019-06-30|   prev year|
| 2019-05-31|   prev year|
| 2019-04-30|   prev year|
| 2019-03-31|   prev year|
| 2019-02-28|   prev year|
| 2019-01-31|   prev year|
+-----------+------------+

相关问题