I get an error when trying to cast StringType to IntType on a PySpark DataFrame:
joint = aggregates.join(df_data_3, aggregates.year == df_data_3.year)
joint2 = joint.filter(joint.CountyCode == 999).filter(joint.CropName == 'WOOL')\
    .select(aggregates.year, 'Production')\
    .withColumn("ProductionTmp", df_data_3.Production.cast(IntegerType))\
    .drop("Production")\
    .withColumnRenamed("ProductionTmp", "Production")
I get:
TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
      1 joint = aggregates.join(df_data_3, aggregates.year == df_data_3.year)
----> 2 joint2 = joint.filter(joint.CountyCode==999).filter(joint.CropName=='WOOL')        .select(aggregates.year,'Production')        .withColumn("ProductionTmp", df_data_3.Production.cast(IntegerType))        .drop("Production")        .withColumnRenamed("ProductionTmp", "Production")

/usr/local/src/spark20master/spark/python/pyspark/sql/column.py in cast(self, dataType)
    335             jc = self._jc.cast(jdt)
    336         else:
--> 337             raise TypeError("unexpected type: %s" % type(dataType))
    338         return Column(jc)
    339

TypeError: unexpected type:
1 Answer
PySpark SQL data types are no longer singletons (that was the case before 1.3). You have to create an instance.
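A minimal sketch reusing the joint DataFrame and the Production column from the question, with only the cast changed:

from pyspark.sql.types import IntegerType

# IntegerType() with parentheses builds an instance of the type
joint.withColumn("ProductionTmp", df_data_3.Production.cast(IntegerType()))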
as opposed to passing the class itself:
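# the class itself is passed here, not an instance, which is what raises the TypeError above
df_data_3.Production.cast(IntegerType)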
The cast method can also be used with a string description of the type. For an overview of the data types supported in Spark SQL and DataFrames, you can follow this link.
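A sketch of the same cast using the string form, again with the column from the question:

# "integer" names IntegerType; "int" also works
joint.withColumn("ProductionTmp", df_data_3.Production.cast("integer"))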