PySpark: 'DataFrame' object has no attribute 'map'

nwnhqdif  posted on 2021-07-09  in Spark

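I used PySpark on Databricks to produce the following summary of a dataset:

OrderMonthYear                 SaleAmount
2012-11-01T00:00:00.000+0000   473760.570000001
2010-04-01T00:00:00.000+0000   490967.0900000001

When I try to convert OrderMonthYear to an integer type, this map call fails with a DataFrame error: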

results = summary.map(lambda r: (int(r.OrderMonthYear.replace('-','')), r.SaleAmount)).toDF(["OrderMonthYear","SaleAmount"])

Any ideas?

AttributeError: 'DataFrame' object has no attribute 'map'

igetnqfo  #1

I found a solution in a post about PySpark yyyy-MMM-dd date conversion:

from pyspark.sql.functions import col, unix_timestamp, from_unixtime, date_format

# Convert OrderMonthYear to epoch seconds, then back to a timestamp string
df = summary.withColumn('date', from_unixtime(unix_timestamp("OrderMonthYear", 'yyyy-MMM')))

# Render that date as a compact yyyyMMdd string
df2 = df.withColumn("new_date_str", date_format(col("date"), "yyyyMMdd"))
display(df2)  # display() is the Databricks notebook helper

Thanks to @mck for the help! Cheers
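
For anyone who wants to keep the original map approach: in Spark 2.0 and later a PySpark DataFrame does not expose map, but the underlying RDD of Row objects (summary.rdd) does. Below is a minimal sketch of that route, not the accepted fix above. The helper names order_month_to_int and summary_to_int_months are hypothetical, and the sketch assumes OrderMonthYear arrives either as a Python date/datetime or as an ISO-style string like the ones shown in the question.

```python
from datetime import date, datetime


def order_month_to_int(value):
    """Turn a date-like value into an integer of the form yyyyMMdd."""
    # datetime is a subclass of date, so this branch covers both.
    if isinstance(value, date):
        return int(value.strftime("%Y%m%d"))
    # Assumption: anything else is an ISO-style string such as
    # "2012-11-01T00:00:00.000+0000", whose first 10 characters are yyyy-MM-dd.
    return int(str(value)[:10].replace("-", ""))


def summary_to_int_months(summary):
    """Hypothetical helper; `summary` is the DataFrame from the question."""
    # DataFrame has no .map in Spark 2.0+, so map over the RDD of Rows instead,
    # then rebuild a DataFrame with the same two column names.
    return summary.rdd.map(
        lambda r: (order_month_to_int(r.OrderMonthYear), r.SaleAmount)
    ).toDF(["OrderMonthYear", "SaleAmount"])
```

On Databricks this would be called as results = summary_to_int_months(summary). Going through the RDD runs the conversion in Python row by row, so a column expression such as date_format(col("OrderMonthYear"), "yyyyMMdd").cast("int") is probably the better choice when the data is large.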
