I just started with Spark Streaming and I am trying to build a sample application that counts words from a Kafka stream. Although it compiles with `sbt package`, when I run it I get a `NoClassDefFoundError`. This post seems to describe the same problem, but the solution there is for Maven and I have not been able to reproduce it with sbt.

`KafkaApp.scala`:
```scala
import kafka.serializer.StringDecoder

import org.apache.spark._
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming._
import org.apache.spark.streaming.kafka._

object KafkaApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("kafkaApp").setMaster("local[*]")
    val ssc = new StreamingContext(conf, Seconds(1))

    val kafkaParams = Map(
      "zookeeper.connect" -> "localhost:2181",
      "zookeeper.connection.timeout.ms" -> "10000",
      "group.id" -> "sparkGroup"
    )

    // One receiver thread for the "test" topic.
    val topics = Map("test" -> 1)

    // Stream of (key, message) pairs read from Kafka.
    val messages = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics, StorageLevel.MEMORY_AND_DISK)

    // Count the words in each batch and print the result.
    messages.flatMap { case (_, line) => line.split(" ") }.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```
`build.sbt`:
```scala
name := "Simple Project"

version := "1.1"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.1.1",
  "org.apache.spark" %% "spark-streaming" % "1.1.1",
  "org.apache.spark" %% "spark-streaming-kafka" % "1.1.1"
)

resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
```
And I submit it with:
```
bin/spark-submit \
  --class "KafkaApp" \
  --master local[4] \
  target/scala-2.10/simple-project_2.10-1.1.jar
```
The error:
```
14/12/30 19:44:57 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@192.168.5.252:65077/user/HeartbeatReceiver
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/streaming/kafka/KafkaUtils$
    at KafkaApp$.main(KafkaApp.scala:28)
    at KafkaApp.main(KafkaApp.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:329)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaUtils$
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
```
9 Answers
sg2wtvxw1#
Use the `--packages` argument with `spark-submit`; it takes a comma-separated list of Maven coordinates in the form `group:artifact:version,...`.
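For example, matching the artifacts in the question's `build.sbt` (note that `--packages` is available in Spark 1.3 and later):

```
bin/spark-submit \
  --packages org.apache.spark:spark-streaming-kafka_2.10:1.1.1 \
  --class "KafkaApp" \
  --master local[4] \
  target/scala-2.10/simple-project_2.10-1.1.jar
```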
polkgigr2#
Use the following in your `build.sbt`; it should fix the problem.
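The snippet itself did not survive here; a minimal sketch of one such `build.sbt`, assuming the usual fix of marking Spark's own artifacts as `provided` (the Spark installation already ships them) so that only the Kafka connector needs to travel with the application jar:

```scala
name := "Simple Project"

version := "1.1"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  // Provided by the Spark installation at runtime; keep them out of the application jar.
  "org.apache.spark" %% "spark-core"            % "1.1.1" % "provided",
  "org.apache.spark" %% "spark-streaming"       % "1.1.1" % "provided",
  // Not shipped with Spark, so it must be bundled with the application.
  "org.apache.spark" %% "spark-streaming-kafka" % "1.1.1"
)
```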
w6lpcovy3#
The following `build.sbt` worked for me. It also requires you to put the `sbt-assembly` plugin in a file under the `project/` directory.

`build.sbt`:
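The file body was not preserved; a minimal sketch of the assembly-specific piece such a build typically adds on top of the dependency list already shown in the question (an assumption, not the author's exact file):

```scala
// Spark's transitive dependencies contain duplicate files; pick a
// deterministic copy instead of letting `sbt assembly` abort on the conflict.
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _                             => MergeStrategy.first
}
```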
`project/plugins.sbt`:

```scala
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.1")
```
2g32fytz4#
Instead of gambling on whether some `build.sbt` will work, you can also just download the jar file and put it into Spark's jars folder, because it is not installed along with Spark.

http://central.maven.org/maven2/org/apache/spark/spark-streaming-kafka-0-10_2.10/2.1.1/spark-streaming-kafka-0-10_2.10-2.1.1.jar

Copy it into:

`/usr/local/spark/spark-2.1.0-bin-hadoop2.6/jars/`
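Using the URL and path from this answer, that comes down to something like:

```
wget http://central.maven.org/maven2/org/apache/spark/spark-streaming-kafka-0-10_2.10/2.1.1/spark-streaming-kafka-0-10_2.10-2.1.1.jar
cp spark-streaming-kafka-0-10_2.10-2.1.1.jar /usr/local/spark/spark-2.1.0-bin-hadoop2.6/jars/
```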
hwamh0ep5#
Added the dependencies externally in the IDE: Project -> Properties -> Java Build Path -> Libraries -> Add External JARs, and added the required jars.

That solved my problem.
13z8s7eq6#
Try including all the dependency jars when submitting the application:

```
./spark-submit --name "SampleApp" --deploy-mode client --master spark://host:7077 \
  --class com.stackexchange.SampleApp \
  --jars $SPARK_INSTALL_DIR/spark-streaming-kafka_2.10-1.3.0.jar,\
$KAFKA_INSTALL_DIR/libs/kafka_2.10-0.8.2.0.jar,\
$KAFKA_INSTALL_DIR/libs/metrics-core-2.2.0.jar,\
$KAFKA_INSTALL_DIR/libs/zkclient-0.3.jar \
  spark-example-1.0-SNAPSHOT.jar
```
irtuqstp7#
`spark-submit` does not automatically pull in the package that contains `KafkaUtils`; you need to have it inside your project JAR. For that you need to build an all-inclusive uber-jar with `sbt assembly`. Here is an example `build.sbt`:

https://github.com/tdas/spark-streaming-external-projects/blob/master/kafka/build.sbt

Obviously you also need to add the assembly plugin to sbt:

https://github.com/tdas/spark-streaming-external-projects/tree/master/kafka/project
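With the plugin in place, the workflow is simply to build the fat jar and submit it instead of the thin `sbt package` artifact (the assembly jar name below is an assumption; sbt prints the actual path):

```
sbt assembly
bin/spark-submit --class "KafkaApp" --master local[4] \
  target/scala-2.10/simple-project-assembly-1.1.jar
```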
kiz8lqtg8#
Using Spark 1.6 did the job for me, without the hassle of handling so many external jars... which can get quite complicated to manage.
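Presumably this means depending on the matching 1.6 artifacts in `build.sbt`, along the lines of (version numbers assumed):

```scala
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"            % "1.6.0",
  "org.apache.spark" %% "spark-streaming"       % "1.6.0",
  "org.apache.spark" %% "spark-streaming-kafka" % "1.6.0"
)
```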
knpiaxh19#
Ran into the same problem; I solved it by building the jar with dependencies.

Add the code below to `pom.xml`:
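The snippet itself is missing here, but the `jar-with-dependencies` suffix in the command below points at the `maven-assembly-plugin` with its built-in descriptor of that name; a minimal sketch:

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-assembly-plugin</artifactId>
      <version>2.4.1</version>
      <configuration>
        <descriptorRefs>
          <!-- Produces <artifactId>-<version>-jar-with-dependencies.jar -->
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```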
Build with `mvn package` and submit the resulting `example-jar-with-dependencies.jar`.