Understanding build.sbt with the sbt-spark-package plugin

Asked by liwlm1x9 on 2021-07-14 (Java)

I'm new to Scala and sbt build files. According to the introductory tutorials, adding the Spark dependencies to a Scala project via the sbt-spark-package plugin should be straightforward, but I am getting the following error:

[error] (run-main-0) java.lang.NoClassDefFoundError: org/apache/spark/SparkContext

Please point me to resources that explain what might be causing this error, as I would like to understand the build process more thoroughly.

Code:

import org.apache.spark.sql.SparkSession

trait SparkSessionWrapper {

  lazy val spark: SparkSession = {
    SparkSession
      .builder()
      .master("local")
      .appName("spark citation graph")
      .getOrCreate()
  }

  val sc = spark.sparkContext

}

import org.apache.spark.graphx.GraphLoader

object Test extends SparkSessionWrapper {

  def main(args: Array[String]) {
    println("Testing, testing, testing, testing...")

    val filePath = "Desktop/citations.txt"
    val citeGraph = GraphLoader.edgeListFile(sc, filePath)
    println(citeGraph.vertices.take(1))
  }
}

plugins.sbt

resolvers += "bintray-spark-packages" at "https://dl.bintray.com/spark-packages/maven/"

addSbtPlugin("org.spark-packages" % "sbt-spark-package" % "0.2.6")

build.sbt (working). Why does adding the libraryDependencies block make it run?

spName := "yewno/citation_graph"

version := "0.1"

scalaVersion := "2.11.12"

sparkVersion := "2.2.0"

sparkComponents ++= Seq("core", "sql", "graphx")

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.2.0",
  "org.apache.spark" %% "spark-sql" % "2.2.0",
  "org.apache.spark" %% "spark-graphx" % "2.2.0"
)

build.sbt (not working). I would expect this to compile and run correctly:

spName := "yewno/citation_graph"

version := "0.1"

scalaVersion := "2.11.12"

sparkVersion := "2.2.0"

sparkComponents ++= Seq("core", "sql", "graphx")

Extra explanation plus links to resources for learning more about the sbt build process, jar files, and anything else that can help me get up to speed would be appreciated!


Answer 1 (by 6rqinv9w)

The sbt-spark-package plugin adds the Spark dependencies in the `provided` scope:

sparkComponentSet.map { component =>
  "org.apache.spark" %% s"spark-$component" % sparkVersion.value % "provided"
}.toSeq
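
For comparison, this is equivalent to declaring the dependencies by hand with the `provided` configuration (a sketch using the versions from the question):

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"   % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-sql"    % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-graphx" % "2.2.0" % "provided"
)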

We can confirm this by running `show libraryDependencies` from the sbt shell:

[info] * org.scala-lang:scala-library:2.11.12
[info] * org.apache.spark:spark-core:2.2.0:provided
[info] * org.apache.spark:spark-sql:2.2.0:provided
[info] * org.apache.spark:spark-graphx:2.2.0:provided
The `provided` scope means:

"The dependency will be part of compilation and test, but excluded from the runtime."

That is why `sbt run` throws `java.lang.NoClassDefFoundError: org/apache/spark/SparkContext`. If you really want to include `provided` dependencies on the `run` classpath, @douglaz suggests:

run in Compile := Defaults.runTask(fullClasspath in Compile, mainClass in (Compile, run), runner in (Compile, run)).evaluated
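
Putting it together, the non-working build.sbt with this override added could look like the following sketch (it keeps the plugin-managed dependencies and only appends the run task override):

spName := "yewno/citation_graph"

version := "0.1"

scalaVersion := "2.11.12"

sparkVersion := "2.2.0"

sparkComponents ++= Seq("core", "sql", "graphx")

// put the provided Spark dependencies back on the classpath used by `sbt run`
run in Compile := Defaults.runTask(fullClasspath in Compile, mainClass in (Compile, run), runner in (Compile, run)).evaluated

Note that `provided` remains the right scope for cluster deployment, where spark-submit supplies the Spark jars at runtime; the override above only affects the local `run` classpath.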
