I tried to launch a simple Spring Spark application on the cluster, but ran into the following problem:
Caused by: java.lang.ClassCastException: cannot assign instance of
java.lang.invoke.SerializedLambda to field
org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$1$1.f$3 of type
org.apache.spark.api.java.function.FlatMapFunction in instance of
org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$1$1
The application I am trying to launch is the following:
import java.util.Arrays;
import java.util.Date;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("Test");
    // Ship the application jar to the executors.
    conf.setJars(new String[]{"/home/ubuntu/spring-spark-word-count-master/target/spring-spark-word-count-0.0.1-SNAPSHOT.jar"});
    JavaSparkContext sc = new JavaSparkContext(conf);
    JavaRDD<String> words = sc.textFile("hdfs://master.vmware.local:8020/test/test.txt");
    // Split each line into words; this flatMap lambda is where the error shows up.
    JavaRDD<String> wordsFromFile = words.flatMap(s -> Arrays.asList(s.split(" ")).iterator());
    Map<String, Long> wordCounts = wordsFromFile.countByValue();
    wordCounts.forEach((k, v) -> System.out.println(k + " " + v));
    words.saveAsTextFile("hdfs://master.vmware.local:8020/test/" + String.valueOf(new Date().getTime()));
    sc.close();
}
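Since the stack trace points at org.apache.spark.api.java.function.FlatMapFunction, here is, for reference, the same flatMap written as an explicit anonymous class instead of a lambda. A concrete class is serialized as an ordinary object rather than through java.lang.invoke.SerializedLambda; this is only a sketch of the alternative, I have not verified that it avoids the error:

import java.util.Iterator;
import org.apache.spark.api.java.function.FlatMapFunction;

// Same splitting logic as the lambda above, written as an anonymous class
// so that no SerializedLambda is involved in (de)serialization.
JavaRDD<String> wordsFromFile = words.flatMap(new FlatMapFunction<String, String>() {
    @Override
    public Iterator<String> call(String s) {
        return Arrays.asList(s.split(" ")).iterator();
    }
});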
After some testing, I found that the problem is caused by the flatMap. To launch the application on the cluster, I use the following command:
spark-submit "/home/ubuntu/spring-spark-word-count-master/target/spring-spark-word-count-0.0.1-SNAPSHOT.jar"
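For completeness, the expanded form of that submission would look something like this; the main class com.example.WordCountApplication is a placeholder for my actual main class, and --master yarn is an assumption based on the Ambari-managed cluster:

spark-submit \
  --class com.example.WordCountApplication \
  --master yarn \
  --deploy-mode cluster \
  "/home/ubuntu/spring-spark-word-count-master/target/spring-spark-word-count-0.0.1-SNAPSHOT.jar"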
When I launch the application locally on the master node it works, whereas when I distribute it across the cluster nodes it gives me this problem. I don't understand where the issue is. Below are the pom and the cluster configuration extracted from Ambari:
POM:
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>com.thoughtworks.paranamer</groupId>
        <artifactId>paranamer</artifactId>
        <version>2.8</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>3.1.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.12</artifactId>
        <version>2.3.0</version>
        <exclusions>
            <exclusion>
                <artifactId>hadoop-client</artifactId>
                <groupId>org.apache.hadoop</groupId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>
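Note that the <build> section is not shown above. For context, a flat uber-jar packaging for Spark would typically use the maven-shade-plugin along these lines (a generic sketch, not my actual build configuration):

<!-- Generic sketch, not the actual build section: packages the application
     and its dependencies into one flat jar so executors can load the classes. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.2.1</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>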
Cluster configuration:
HDFS 3.1.1.3.1
YARN 3.1.1
MapReduce2 3.1.1
Hive 3.1.0
Spark2 2.3.0