How to fix an error when reading Hive ORC files in Spark

798qvoo8 · posted 2021-07-13 in Spark

JDK 1.8, Scala 2.12.11, Spark 3.0.1. When I read a Hive table in Scala Spark and export it as an ORC file:

df.write.option("compression", "none").mode(SaveMode.Overwrite).orc(dump_path)

it runs successfully.
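
For context, here is a minimal sketch of the full write path, assuming a Hive-enabled SparkSession and a hypothetical table name (neither is shown in the setup above):

import org.apache.spark.sql.{SaveMode, SparkSession}

// Hive-enabled session; "knowledge_source" is a hypothetical table name.
val spark = SparkSession.builder()
  .appName("orc-dump")
  .enableHiveSupport()
  .getOrCreate()

// Read the Hive table and dump it as uncompressed ORC.
val df = spark.table("knowledge_source")
val dump_path = "/Users/muller/Documents/gitcode/personEtl/knowledge_source_100.orc"
df.write.option("compression", "none").mode(SaveMode.Overwrite).orc(dump_path)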
When I read the exported ORC file back from Python with PySpark, it runs successfully:

dfs = spark.read.orc("/Users/muller/Documents/gitcode/personEtl/knowledge_source_100.orc")

But when I read the same ORC file from Scala Spark, it fails.
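
Presumably the failing Scala call mirrors the PySpark one (a sketch; the exact Scala snippet is not shown above):

val df = spark.read.orc("/Users/muller/Documents/gitcode/personEtl/knowledge_source_100.orc")

which throws: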
java.lang.ClassCastException: org.apache.orc.impl.ReaderImpl cannot be cast to java.io.Closeable

at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2538)
at org.apache.spark.sql.execution.datasources.orc.OrcUtils$.readSchema(OrcUtils.scala:65)
at org.apache.spark.sql.execution.datasources.orc.OrcUtils$.$anonfun$readSchema$4(OrcUtils.scala:88)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
at scala.collection.TraversableOnce.collectFirst(TraversableOnce.scala:172)
at scala.collection.TraversableOnce.collectFirst$(TraversableOnce.scala:159)
at scala.collection.AbstractIterator.collectFirst(Iterator.scala:1431)
at org.apache.spark.sql.execution.datasources.orc.OrcUtils$.readSchema(OrcUtils.scala:88)
at org.apache.spark.sql.execution.datasources.orc.OrcUtils$.inferSchema(OrcUtils.scala:128)
at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.inferSchema(OrcFileFormat.scala:96)
at org.apache.spark.sql.execution.datasources.DataSource.$anonfun$getOrInferFileFormatSchema$11(DataSource.scala:198)
at scala.Option.orElse(Option.scala:447)
at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:195)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:408)
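
The top frame is telling: Utils.tryWithResource requires its argument to implement java.io.Closeable, and OrcUtils.readSchema passes it the Reader returned by OrcFile.createReader. In the orc-core version Spark 3.0.1 is built against, Reader implements Closeable, but in older ORC releases (including the ORC classes bundled into some Hive fat jars) it does not, so a stale ORC jar shadowing the one Spark ships is a plausible cause of exactly this cast failure. A quick diagnostic sketch to run in spark-shell (the interpretation of the output is an assumption to check against your own classpath):

// Check where org.apache.orc.Reader is actually loaded from at runtime.
val readerClass = classOf[org.apache.orc.Reader]
println(readerClass.getProtectionDomain.getCodeSource.getLocation)
// Prints true for an orc-core where Reader extends Closeable; false
// indicates an older ORC on the classpath, matching the exception above.
println(classOf[java.io.Closeable].isAssignableFrom(readerClass))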

No answers yet!

