spark版本:3.1.0
我是Maven的新成员,有Spark和斯卡拉。我正在尝试创建一个项目并使用sparksession编写spark文件来计算文本文件中的字数,但在运行代码时出现以下错误:
使用的pom:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.sparkscala</groupId>
<artifactId>mavenproject</artifactId>
<version>0.0.1-SNAPSHOT</version>
<pluginRepositories>
<pluginRepository>
<id>scala-tools.org</id>
<name>Scala-tools Maven2 Repository</name>
<url>http://scala-tools.org/repo-releases</url>
</pluginRepository>
</pluginRepositories>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>3.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>3.1.0</version>
</dependency>
</dependencies>
<build>
<plugins>
<!-- mixed scala/java compile -->
<plugin>
<groupId>org.scala-tools</groupId>
<artifactId>maven-scala-plugin</artifactId>
<executions>
<execution>
<id>compile</id>
<goals>
<goal>compile</goal>
</goals>
<phase>compile</phase>
</execution>
<execution>
<id>test-compile</id>
<goals>
<goal>testCompile</goal>
</goals>
<phase>test-compile</phase>
</execution>
<execution>
<phase>process-resources</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.6</source>
<target>1.6</target>
</configuration>
</plugin>
<!-- for fatjar -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.4</version>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
<executions>
<execution>
<id>assemble-all</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<configuration>
<archive>
<manifest>
<addClasspath>true</addClasspath>
<mainClass>fully.qualified.MainClass</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
</plugins>
<pluginManagement>
<plugins>
<!--This plugin's configuration is used to store Eclipse m2e settings
only. It has no influence on the Maven build itself. -->
<plugin>
<groupId>org.eclipse.m2e</groupId>
<artifactId>lifecycle-mapping</artifactId>
<version>1.0.0</version>
<configuration>
<lifecycleMappingMetadata>
<pluginExecutions>
<pluginExecution>
<pluginExecutionFilter>
<groupId>org.scala-tools</groupId>
<artifactId>
maven-scala-plugin
</artifactId>
<versionRange>
[2.15.2,)
</versionRange>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
</pluginExecutionFilter>
<action>
<execute></execute>
</action>
</pluginExecution>
</pluginExecutions>
</lifecycleMappingMetadata>
</configuration>
</plugin>
</plugins>
</pluginManagement>
</build>
</project>
项目结构:
项目结构
代码“wordcount.scala”:
import org.apache.spark.sql.SparkSession
object Sample {
def main(args:Array[String]) ={ println("Hello World");
val spark = SparkSession.builder.appName("WordCount").master("local[*]").getOrCreate
}
}
我得到以下错误:
Hello World
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/02/25 20:07:37 INFO SparkContext: Running Spark version 3.1.0
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/collections/map/UnmodifiableMap
at org.apache.hadoop.conf.Configuration$DeprecationContext.<init>(Configuration.java:509)
at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:546)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:304)
at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1828)
at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:710)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:660)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:571)
at org.apache.spark.util.Utils$.$anonfun$getCurrentUserName$1(Utils.scala:2474)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2474)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:314)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2678)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:942)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:936)
at com.sparkscala.mavenproject.Sample$.main(Sample.scala:5)
at com.sparkscala.mavenproject.Sample.main(Sample.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.collections.map.UnmodifiableMap
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 17 more
我已经包括了apachecommons集合jar:collections jar
请帮我解决这个错误。提前谢谢。
1条答案
按热度按时间yshpjwxd1#
通过使用以下依赖关系解决此问题: