Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/catalyst/StructFilters in a Spark Scala application in IntelliJ

pkmbmrz7 asked on 2021-07-09 in Spark

My pom.xml:

<dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.4</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.specs</groupId>
      <artifactId>specs</artifactId>
      <version>1.2.5</version>
      <scope>test</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.12</artifactId>
      <version>3.0.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.12</artifactId>
      <version>3.0.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.databricks/spark-xml -->
    <dependency>
      <groupId>com.databricks</groupId>
      <artifactId>spark-xml_2.12</artifactId>
      <version>0.10.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/mysql/mysql-connector-java -->
    <dependency>
      <groupId>mysql</groupId>
      <artifactId>mysql-connector-java</artifactId>
      <version>5.1.29</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk -->
    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk</artifactId>
      <version>1.11.985</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-s3 -->
    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk-s3</artifactId>
      <version>1.11.985</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-core -->
    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk-core</artifactId>
      <version>1.11.985</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-dynamodb -->
    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk-dynamodb</artifactId>
      <version>1.11.985</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-cloudwatch -->
    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk-cloudwatch</artifactId>
      <version>1.11.985</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-kinesis -->
    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk-kinesis</artifactId>
      <version>1.11.985</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-avro -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-avro_2.12</artifactId>
      <version>3.1.1</version>
    </dependency>
</dependencies>

I want to read an avro file:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SparkSession
import scala.io.Source

val conf = new SparkConf().setAppName("Nightmare").setMaster("local")
val sc = new SparkContext(conf)
sc.setLogLevel("ERROR")
val spark = SparkSession.builder().getOrCreate()
import spark.implicits._
// Step 1-2: read the avro file
println("Step 1-2")
val df1 = spark.read
  .format("com.databricks.spark.avro")
  .option("multiline", "true")
  .load("file:///D:/bigdata_tasks/nightmare.avro")
// Step 3-4: hit the URL and convert the response to a DataFrame -- https://randomuser.me/api/0.8/?results=1000 -> df2
println("Step 3-5")
val html = Source.fromURL("https://randomuser.me/api/0.8/?results=1000")
val rdddata = html.mkString
// convert the JSON string to an RDD, then read it as a DataFrame
val paralleldata = sc.parallelize(List(rdddata))
val df2 = spark.read.json(paralleldata)
df2.printSchema()
df2.show()

Running it throws this exception:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/catalyst/StructFilters
I also tried the following code:

val df1 = spark.read
  .format("avro")
  .option("multiline", "true")
  .load("file:///D:/bigdata_tasks/nightmare.avro")

But it throws the same exception. My Spark version is 2.12. Should I update the Spark version?

flvtvl50 answered:

The most likely cause is mixed Spark versions: your Avro library comes from Spark 3.1.1, while Spark core comes from Spark 3.0.1 (it is usually best to declare the version as a Maven property, so all components share a single version). Also remove unnecessary dependencies such as spark-xml, the AWS SDKs, etc.
Also check the Scala version.
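
For example, a minimal sketch of such a pom.xml, assuming Spark 3.0.1 and Scala 2.12 (the versions already used by spark-core in the question); only the Spark artifacts are shown:

<properties>
  <!-- Single source of truth for versions; adjust as needed -->
  <scala.binary.version>2.12</scala.binary.version>
  <spark.version>3.0.1</spark.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
  </dependency>
  <!-- spark-avro now comes from the same Spark release as spark-core/spark-sql -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-avro_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
  </dependency>
</dependencies>

With all Spark artifacts pinned to the same ${spark.version}, the second snippet in the question (.format("avro")) should no longer fail, since the NoClassDefFoundError came from spark-avro 3.1.1 expecting a catalyst class that Spark 3.0.1 does not ship.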
