I am trying to learn Spark and Scala. When I try to write the resulting DataFrame to a Parquet file by calling the parquet method, I get the error below.
Code that fails:
df2.write.mode(SaveMode.Overwrite).parquet(outputPath)
This also fails:
df2.write.format("org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat").mode(SaveMode.Overwrite).parquet(outputPath)
Error log:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Multiple sources found for parquet (org.apache.spark.sql.execution.datasources.v2.parquet.ParquetDataSourceV2, org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat), please specify the fully qualified class name.;
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:707)
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:733)
at org.apache.spark.sql.DataFrameWriter.lookupV2Provider(DataFrameWriter.scala:967)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:304)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:288)
at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:848)
The code works correctly if I call a different method to save it.
This works fine:
df2.write.format("org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat").mode(SaveMode.Overwrite).save(outputPath)
Although I have a workaround for the problem, I would like to understand why the first approach does not work and how to fix it.
Details of the versions I am using: Scala 2.12.9, Java 1.8, Spark 2.4.4.
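A build.sbt pinned to these versions would look roughly like the following sketch (the project name is a placeholder; this is not my actual build file):

name := "spark-parquet-example"  // placeholder project name
scalaVersion := "2.12.9"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.4"  // Parquet support ships with spark-sql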
P.S.: The issue is only seen on Spark.