How do I resolve the harmless "java.nio.file.NoSuchFileException: xxx/hadoop-client-api-3..." error in Spark when running sbt run?

htzpubme · posted on 2023-04-28 in Java
3 answers · 131 views

I have a simple Spark application written in Scala 2.12.

My application

find-retired-people-scala/project/build.properties

sbt.version=1.8.2

find-retired-people-scala/src/main/scala/com/hongbomiao/FindRetiredPeople.scala

package com.hongbomiao

import org.apache.spark.sql.{DataFrame, SparkSession}

object FindRetiredPeople {
  def main(args: Array[String]): Unit = {
    val people = Seq(
      (1, "Alice", 25),
      (2, "Bob", 30),
      (3, "Charlie", 80),
      (4, "Dave", 40),
      (5, "Eve", 45)
    )

    val spark: SparkSession = SparkSession.builder()
      .master("local[*]")
      .appName("find-retired-people-scala")
      .config("spark.ui.port", "4040")
      .getOrCreate()

    import spark.implicits._
    val df: DataFrame = people.toDF("id", "name", "age")
    df.createOrReplaceTempView("people")

    val retiredPeople: DataFrame = spark.sql("SELECT name, age FROM people WHERE age >= 67")
    retiredPeople.show()

    spark.stop()
  }
}

find-retired-people-scala/build.sbt

name := "FindRetiredPeople"
version := "1.0"
scalaVersion := "2.12.17"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.4.0",
  "org.apache.spark" %% "spark-sql" % "3.4.0",
)

find-retired-people-scala/.jvmopts (this file is only added for Java 17; remove it for older Java versions)

--add-exports=java.base/sun.nio.ch=ALL-UNNAMED
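
If you hit further module-access errors on Java 17 (for example InaccessibleObjectException), note that Spark's own launcher normally passes a longer list of --add-opens flags than the single line above. A sketch of an extended .jvmopts, under the assumption that your run needs them (the exact set depends on the Spark version; Spark's launcher module is the authoritative list):

--add-exports=java.base/sun.nio.ch=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.nio=ALL-UNNAMED
--add-opens=java.base/java.util=ALL-UNNAMED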

The problem

The application runs successfully when I run sbt run for testing and debugging during local development.
However, I still get a harmless error:

➜ sbt run
[info] welcome to sbt 1.8.2 (Homebrew Java 17.0.7)
# ...
+-------+---+
|   name|age|
+-------+---+
|Charlie| 80|
+-------+---+

23/04/24 13:56:40 INFO SparkUI: Stopped Spark web UI at http://10.10.8.125:4040
23/04/24 13:56:40 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/04/24 13:56:40 INFO MemoryStore: MemoryStore cleared
23/04/24 13:56:40 INFO BlockManager: BlockManager stopped
23/04/24 13:56:40 INFO BlockManagerMaster: BlockManagerMaster stopped
23/04/24 13:56:40 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/04/24 13:56:40 INFO SparkContext: Successfully stopped SparkContext
[success] Total time: 8 s, completed Apr 24, 2023, 1:56:40 PM
Exception in thread "Thread-1" java.lang.RuntimeException: java.nio.file.NoSuchFileException: find-retired-people-scala/target/bg-jobs/sbt_e60ef8a3/target/3d275f27/dbc63e3b/hadoop-client-api-3.3.2.jar
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3089)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:3036)
    at org.apache.hadoop.conf.Configuration.loadProps(Configuration.java:2914)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2896)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1246)
    at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1863)
    at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1840)
    at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183)
    at org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145)
    at org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102)
Caused by: java.nio.file.NoSuchFileException: find-retired-people-scala/target/bg-jobs/sbt_e60ef8a3/target/3d275f27/dbc63e3b/hadoop-client-api-3.3.2.jar
    at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
    at java.base/sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
    at java.base/sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:148)
    at java.base/java.nio.file.Files.readAttributes(Files.java:1851)
    at java.base/java.util.zip.ZipFile$Source.get(ZipFile.java:1264)
    at java.base/java.util.zip.ZipFile$CleanableResource.<init>(ZipFile.java:709)
    at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:243)
    at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:172)
    at java.base/java.util.jar.JarFile.<init>(JarFile.java:347)
    at java.base/sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:103)
    at java.base/sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:72)
    at java.base/sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:168)
    at java.base/sun.net.www.protocol.jar.JarFileFactory.getOrCreate(JarFileFactory.java:91)
    at java.base/sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:132)
    at java.base/sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:175)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:3009)
    at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3105)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3063)
    ... 10 more

Process finished with exit code 0

My question

My question is: how can I hide this harmless error?

My attempt

Based on this, I tried adding hadoop-client by updating build.sbt:

name := "FindRetiredPeople"
version := "1.0"
scalaVersion := "2.12.17"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.4.0",
  "org.apache.spark" %% "spark-sql" % "3.4.0",
  "org.apache.hadoop" %% "hadoop-client" % "3.3.4",
)

However, the error then became:

sbt run
[info] welcome to sbt 1.8.2 (Homebrew Java 17.0.6)
[info] loading project definition from hongbomiao.com/hm-spark/applications/find-retired-people-scala/project

  | => find-retired-people-scala-build / Compile / compileIncremental 0s
[info] loading settings for project find-retired-people-scala from build.sbt ...
[info] set current project to FindRetiredPeople (in build file:hongbomiao.com/hm-spark/applications/find-retired-people-scala/)

  | => find-retired-people-scala / update 0s
[info] Updating 

  | => find-retired-people-scala / update 0s
https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-client_2.12/3.3.4/hadoo…
    0.0% [          ] 0B (0B / s)

  | => find-retired-people-scala / update 0s
https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-client_2.12/3.3.4/hadoo…
    0.0% [          ] 0B (0B / s)
[info] Resolved  dependencies

  | => find-retired-people-scala / update 0s
[warn] 

  | => find-retired-people-scala / update 0s
[warn]  Note: Unresolved dependencies path:

  | => find-retired-people-scala / update 0s
[error] sbt.librarymanagement.ResolveException: Error downloading org.apache.hadoop:hadoop-client_2.12:3.3.4
[error]   Not found
[error]   Not found
[error]   not found: /Users/hongbo-miao/.ivy2/localorg.apache.hadoop/hadoop-client_2.12/3.3.4/ivys/ivy.xml
[error]   not found: https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-client_2.12/3.3.4/hadoop-client_2.12-3.3.4.pom
[error]     at lmcoursier.CoursierDependencyResolution.unresolvedWarningOrThrow(CoursierDependencyResolution.scala:344)
[error]     at lmcoursier.CoursierDependencyResolution.$anonfun$update$38(CoursierDependencyResolution.scala:313)
[error]     at scala.util.Either$LeftProjection.map(Either.scala:573)
[error]     at lmcoursier.CoursierDependencyResolution.update(CoursierDependencyResolution.scala:313)
[error]     at sbt.librarymanagement.DependencyResolution.update(DependencyResolution.scala:60)
[error]     at sbt.internal.LibraryManagement$.resolve$1(LibraryManagement.scala:59)
[error]     at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$12(LibraryManagement.scala:133)
[error]     at sbt.util.Tracked$.$anonfun$lastOutput$1(Tracked.scala:73)
[error]     at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$20(LibraryManagement.scala:146)
[error]     at scala.util.control.Exception$Catch.apply(Exception.scala:228)
[error]     at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$11(LibraryManagement.scala:146)
[error]     at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$11$adapted(LibraryManagement.scala:127)
[error]     at sbt.util.Tracked$.$anonfun$inputChangedW$1(Tracked.scala:219)
[error]     at sbt.internal.LibraryManagement$.cachedUpdate(LibraryManagement.scala:160)
[error]     at sbt.Classpaths$.$anonfun$updateTask0$1(Defaults.scala:3687)
[error]     at scala.Function1.$anonfun$compose$1(Function1.scala:49)
[error]     at sbt.internal.util.$tilde$greater.$anonfun$$u2219$1(TypeFunctions.scala:62)
[error]     at sbt.std.Transform$$anon$4.work(Transform.scala:68)
[error]     at sbt.Execute.$anonfun$submit$2(Execute.scala:282)
[error]     at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:23)
[error]     at sbt.Execute.work(Execute.scala:291)
[error]     at sbt.Execute.$anonfun$submit$1(Execute.scala:282)
[error]     at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:265)
[error]     at sbt.CompletionService$$anon$2.call(CompletionService.scala:64)
[error]     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[error]     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
[error]     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[error]     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[error]     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[error]     at java.base/java.lang.Thread.run(Thread.java:833)
[error] (update) sbt.librarymanagement.ResolveException: Error downloading org.apache.hadoop:hadoop-client_2.12:3.3.4
[error]   Not found
[error]   Not found
[error]   not found: /Users/hongbo-miao/.ivy2/localorg.apache.hadoop/hadoop-client_2.12/3.3.4/ivys/ivy.xml
[error]   not found: https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-client_2.12/3.3.4/hadoop-client_2.12-3.3.4.pom
[error] Total time: 2 s, completed Apr 19, 2023, 4:52:07 PM
make: *** [sbt-run] Error 1

Note that in the log it looks for hadoop-client_2.12-3.3.4.pom. The _2.12 suffix refers to Scala 2.12. However, hadoop-client has no Scala 2.12 build.

For comparison, this is how sbt looks up a library that does have Scala-versioned builds, such as spark-sql.
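
For instance, spark-sql resolves with the Scala suffix appended by %%, at a URL like this (illustrative; this artifact does exist on Maven Central):

https://repo1.maven.org/maven2/org/apache/spark/spark-sql_2.12/3.4.0/spark-sql_2.12-3.4.0.pom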

What can I do to resolve this? Thanks!

UPDATE 1 (4/19/2023):

Based on Dmytro Mitin's answer, I updated "org.apache.hadoop" %% "hadoop-client" % "3.3.4" to "org.apache.hadoop" % "hadoop-client" % "3.3.4" in build.sbt, so now it looks like:

name := "FindRetiredPeople"
version := "1.0"
scalaVersion := "2.12.17"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.4.0",
  "org.apache.spark" %% "spark-sql" % "3.4.0",
  "org.apache.hadoop" % "hadoop-client" % "3.3.4"
)

Now the error becomes:

➜ sbt run
[info] welcome to sbt 1.8.2 (Amazon.com Inc. Java 17.0.6)
[info] loading project definition from find-retired-people-scala/project
[info] loading settings for project find-retired-people-scala from build.sbt ...
[info] set current project to FindRetiredPeople (in build file:find-retired-people-scala/)
[info] running com.hongbomiao.FindRetiredPeople
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
23/04/19 19:28:46 INFO SparkContext: Running Spark version 3.4.0
23/04/19 19:28:46 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
# ...
+-------+---+
|   name|age|
+-------+---+
|Charlie| 80|
+-------+---+

23/04/19 19:28:48 INFO SparkContext: SparkContext is stopping with exitCode 0.
23/04/19 19:28:48 INFO SparkUI: Stopped Spark web UI at http://10.0.0.135:4040
23/04/19 19:28:48 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/04/19 19:28:48 INFO MemoryStore: MemoryStore cleared
23/04/19 19:28:48 INFO BlockManager: BlockManager stopped
23/04/19 19:28:48 INFO BlockManagerMaster: BlockManagerMaster stopped
23/04/19 19:28:48 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/04/19 19:28:48 INFO SparkContext: Successfully stopped SparkContext
[success] Total time: 7 s, completed Apr 19, 2023, 7:28:48 PM
23/04/19 19:28:49 INFO ShutdownHookManager: Shutdown hook called
23/04/19 19:28:49 INFO ShutdownHookManager: Deleting directory /private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/spark-6cd93a01-3109-4ecd-aca2-a21a9921ecf8
23/04/19 19:28:49 ERROR Configuration: error parsing conf core-default.xml
java.nio.file.NoSuchFileException: find-retired-people-scala/target/bg-jobs/sbt_20affd86/target/f5c922ec/359669fc/hadoop-client-api-3.3.4.jar
    at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
    at java.base/sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
    at java.base/sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:148)
    at java.base/java.nio.file.Files.readAttributes(Files.java:1851)
    at java.base/java.util.zip.ZipFile$Source.get(ZipFile.java:1264)
    at java.base/java.util.zip.ZipFile$CleanableResource.<init>(ZipFile.java:709)
    at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:243)
    at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:172)
    at java.base/java.util.jar.JarFile.<init>(JarFile.java:347)
    at java.base/sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:103)
    at java.base/sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:72)
    at java.base/sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:168)
    at java.base/sun.net.www.protocol.jar.JarFileFactory.getOrCreate(JarFileFactory.java:91)
    at java.base/sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:132)
    at java.base/sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:175)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:3009)
    at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3105)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3063)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:3036)
    at org.apache.hadoop.conf.Configuration.loadProps(Configuration.java:2914)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2896)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1246)
    at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1863)
    at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1840)
    at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183)
    at org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145)
    at org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102)
Exception in thread "Thread-1" java.lang.RuntimeException: java.nio.file.NoSuchFileException: find-retired-people-scala/target/bg-jobs/sbt_20affd86/target/f5c922ec/359669fc/hadoop-client-api-3.3.4.jar
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3089)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:3036)
    at org.apache.hadoop.conf.Configuration.loadProps(Configuration.java:2914)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2896)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1246)
    at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1863)
    at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1840)
    at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183)
    at org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145)
    at org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102)
Caused by: java.nio.file.NoSuchFileException: find-retired-people-scala/target/bg-jobs/sbt_20affd86/target/f5c922ec/359669fc/hadoop-client-api-3.3.4.jar
    at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
    at java.base/sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
    at java.base/sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:148)
    at java.base/java.nio.file.Files.readAttributes(Files.java:1851)
    at java.base/java.util.zip.ZipFile$Source.get(ZipFile.java:1264)
    at java.base/java.util.zip.ZipFile$CleanableResource.<init>(ZipFile.java:709)
    at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:243)
    at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:172)
    at java.base/java.util.jar.JarFile.<init>(JarFile.java:347)
    at java.base/sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:103)
    at java.base/sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:72)
    at java.base/sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:168)
    at java.base/sun.net.www.protocol.jar.JarFileFactory.getOrCreate(JarFileFactory.java:91)
    at java.base/sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:132)
    at java.base/sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:175)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:3009)
    at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3105)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3063)
    ... 10 more

My target/bg-jobs folder is actually empty (presumably sbt cleans up these background-job jar copies as soon as the run finishes, before Hadoop's shutdown hook tries to re-read hadoop-client-api from there).

UPDATE 2 (4/19/2023):

I tried Java 8 (and similarly Java 11), but it still shows the same error:

➜ sbt run
[info] welcome to sbt 1.8.2 (Amazon.com Inc. Java 1.8.0_372)
[info] loading project definition from find-retired-people-scala/project
[info] loading settings for project find-retired-people-scala from build.sbt ...
[info] set current project to FindRetiredPeople (in build file:find-retired-people-scala/)
[info] compiling 1 Scala source to find-retired-people-scala/target/scala-2.12/classes ...
[info] running com.hongbomiao.FindRetiredPeople
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
23/04/19 19:26:25 INFO SparkContext: Running Spark version 3.4.0
23/04/19 19:26:25 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
# ...
+-------+---+
|   name|age|
+-------+---+
|Charlie| 80|
+-------+---+

23/04/19 19:26:28 INFO SparkContext: SparkContext is stopping with exitCode 0.
23/04/19 19:26:28 INFO SparkUI: Stopped Spark web UI at http://10.0.0.135:4040
23/04/19 19:26:28 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/04/19 19:26:28 INFO MemoryStore: MemoryStore cleared
23/04/19 19:26:28 INFO BlockManager: BlockManager stopped
23/04/19 19:26:28 INFO BlockManagerMaster: BlockManagerMaster stopped
23/04/19 19:26:28 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/04/19 19:26:28 INFO SparkContext: Successfully stopped SparkContext
[success] Total time: 12 s, completed Apr 19, 2023 7:26:28 PM
23/04/19 19:26:29 INFO ShutdownHookManager: Shutdown hook called
23/04/19 19:26:29 INFO ShutdownHookManager: Deleting directory /private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/spark-9d00ad16-6f44-495b-b561-4d7bba1f7918
23/04/19 19:26:29 ERROR Configuration: error parsing conf core-default.xml
java.io.FileNotFoundException: find-retired-people-scala/target/bg-jobs/sbt_e72285b9/target/f5c922ec/359669fc/hadoop-client-api-3.3.4.jar (No such file or directory)
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:231)
    at java.util.zip.ZipFile.<init>(ZipFile.java:157)
    at java.util.jar.JarFile.<init>(JarFile.java:171)
    at java.util.jar.JarFile.<init>(JarFile.java:108)
    at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:93)
    at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:69)
    at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:99)
    at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
    at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:152)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:3009)
    at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3105)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3063)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:3036)
    at org.apache.hadoop.conf.Configuration.loadProps(Configuration.java:2914)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2896)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1246)
    at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1863)
    at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1840)
    at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183)
    at org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145)
    at org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102)
Exception in thread "Thread-3" java.lang.RuntimeException: java.io.FileNotFoundException: find-retired-people-scala/target/bg-jobs/sbt_e72285b9/target/f5c922ec/359669fc/hadoop-client-api-3.3.4.jar (No such file or directory)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3089)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:3036)
    at org.apache.hadoop.conf.Configuration.loadProps(Configuration.java:2914)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2896)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1246)
    at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1863)
    at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1840)
    at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183)
    at org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145)
    at org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102)
Caused by: java.io.FileNotFoundException: find-retired-people-scala/target/bg-jobs/sbt_e72285b9/target/f5c922ec/359669fc/hadoop-client-api-3.3.4.jar (No such file or directory)
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:231)
    at java.util.zip.ZipFile.<init>(ZipFile.java:157)
    at java.util.jar.JarFile.<init>(JarFile.java:171)
    at java.util.jar.JarFile.<init>(JarFile.java:108)
    at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:93)
    at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:69)
    at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:99)
    at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
    at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:152)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:3009)
    at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3105)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3063)
    ... 10 more

UPDATE 3 (4/24/2023):

I also tried adding hadoop-common and hadoop-client-api, with no luck; the same error remains:

name := "FindRetiredPeople"
version := "1.0"
scalaVersion := "2.12.17"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.4.0",
  "org.apache.spark" %% "spark-sql" % "3.4.0",
  "org.apache.hadoop" % "hadoop-client" % "3.3.4",  
  "org.apache.hadoop" % "hadoop-client-api" % "3.3.4",
  "org.apache.hadoop" % "hadoop-common" % "3.3.4"
)
Answer 1 (wfveoks0)

Add Hadoop with % instead of %% (as in the link you mentioned):

"org.apache.hadoop" % "hadoop-client" % "3.3.4"

https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-client/
https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client
It is a Java library, not a Scala library (like spark-sql etc.):
https://github.com/apache/hadoop
%% appends a Scala-version suffix (_2.13, _2.12, _2.11, etc.) to the artifact name. That is irrelevant for Java libraries.
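
To make the difference concrete, here is a minimal sketch of what the two operators expand to (standard sbt behavior, using the coordinates from this question):

// %% appends the project's Scala binary version to the artifact name,
// so with scalaVersion 2.12.x this line:
"org.apache.spark" %% "spark-sql" % "3.4.0"
// is equivalent to:
"org.apache.spark" % "spark-sql_2.12" % "3.4.0"

// A plain Java artifact must use a single %, with no suffix:
"org.apache.hadoop" % "hadoop-client" % "3.3.4"
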
The java.base/... frames in your stack trace show you are on Java 9+. Try switching to Java 8.
Actually, it looks like you are even using Java 17:

[info] welcome to sbt 1.8.2 (Homebrew Java 17.0.6)

However, per Hadoop's "Supported Java Versions":

  • Apache Hadoop 3.3 and upper supports Java 8 and Java 11 (runtime only)

https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Java+Versions
https://issues.apache.org/jira/browse/HADOOP-17177
How a spark application starts using sbt run.
Difference in running a spark application with sbt run or with spark-submit script
You can also try turning on fork := true in build.sbt (see the sketch after the links below):
https://www.scala-sbt.org/1.x/docs/Forking.html#Forking
Why do we need to add "fork in run := true" when running Spark SBT application?
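
A minimal sketch of where that setting goes (a top-level fork := true forks both run and test; the scoped form in the comment limits it to run only):

// in build.sbt
fork := true
// or, to fork only the run task:
// Compile / run / fork := true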

Answer 2 (oyxsuwqo)

Regarding the first error: sbt run (after compile/package) is not how you run a Spark application... spark-submit sets up the client classpath to include the Spark (and Hadoop) libraries. sbt should only be used for compiling your code and the classes it will use.
The only reason sbt run works locally is that the classpath is pre-configured for you, together with any extra environment variables Spark might need, such as SPARK_CONF_DIR for additional runtime configuration.
For the remaining errors, you need sbt assembly to create an uber jar containing all transitive dependencies (after you add % "provided" to the Spark dependencies and remove hadoop-client, since it is a transitive dependency...), as mentioned in the answer to your previous Spark/sbt question.
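
A minimal sketch of that setup, assuming the sbt-assembly plugin (the plugin version below is an assumption; check the sbt-assembly releases for a current one):

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.1.1")

// build.sbt
name := "FindRetiredPeople"
version := "1.0"
scalaVersion := "2.12.17"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.4.0" % "provided",
  "org.apache.spark" %% "spark-sql" % "3.4.0" % "provided"
)

You would then build the jar with sbt assembly and hand it to spark-submit, which supplies the Spark/Hadoop classpath at runtime.
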
There is also a plugin that makes Spark easier: https://github.com/alonsodomin/sbt-spark

Answer 3 (h79rfbju)

(Note: this answer resolves the original harmless error in my question, but in my case it introduced some new harmless errors.)
Thanks @Dmytro Mitin!
fork := true helps sbt run get rid of the original harmless error in my UPDATE 1. build.sbt now looks like this:

name := "FindRetiredPeople"
version := "1.0"
scalaVersion := "2.12.17"
fork := true
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.3.2",
  "org.apache.spark" %% "spark-sql" % "3.3.2",
)

Note that when I run sbt assembly, I still use the following (no fork := true, and with "provided" added):

name := "FindRetiredPeople"
version := "1.0"
scalaVersion := "2.12.17"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.3.2" % "provided",
  "org.apache.spark" %% "spark-sql" % "3.3.2" % "provided"
)

Note that even after adding fork := true, it only works with Java 11 and Java 1.8.
Now the sbt run log looks like below. It removes the harmless error from UPDATE 1, but introduces some new harmless errors.

➜ sbt run
[info] welcome to sbt 1.8.2 (Amazon.com Inc. Java 11.0.17)
[info] loading settings for project find-retired-people-scala-build from plugins.sbt ...
[info] loading project definition from find-retired-people-scala/project
[info] loading settings for project find-retired-people-scala from build.sbt ...
[info] set current project to FindRetiredPeople (in build file:find-retired-people-scala/)
[info] running (fork) com.hongbomiao.FindRetiredPeople
[error] SLF4J: Class path contains multiple SLF4J bindings.
[error] SLF4J: Found binding in [jar:file:find-retired-people-scala/target/bg-jobs/sbt_21530e9/target/90e7c350/f67577ef/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
[error] SLF4J: Found binding in [jar:file:find-retired-people-scala/target/bg-jobs/sbt_21530e9/target/163709c3/4a423a38/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
[error] SLF4J: Found binding in [jar:file:find-retired-people-scala/target/bg-jobs/sbt_21530e9/target/d9dbc825/8c0ceb42/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
[error] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
[error] SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
[error] Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
[info] 23/04/24 13:19:11 WARN Utils: Your hostname, Hongbos-MacBook-Pro-2021.local resolves to a loopback address: 127.0.0.1; using 10.10.8.125 instead (on interface en0)
[info] 23/04/24 13:19:11 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
[info] 23/04/24 13:19:11 INFO SparkContext: Running Spark version 3.3.2
[info] 23/04/24 13:19:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[info] 23/04/24 13:19:11 INFO ResourceUtils: ==============================================================
[info] 23/04/24 13:19:11 INFO ResourceUtils: No custom resources configured for spark.driver.
[info] 23/04/24 13:19:11 INFO ResourceUtils: ==============================================================
[info] 23/04/24 13:19:11 INFO SparkContext: Submitted application: find-retired-people-scala
[info] 23/04/24 13:19:11 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
[info] 23/04/24 13:19:11 INFO ResourceProfile: Limiting resource is cpu
[info] 23/04/24 13:19:11 INFO ResourceProfileManager: Added ResourceProfile id: 0
[info] 23/04/24 13:19:11 INFO SecurityManager: Changing view acls to: hongbo-miao
[info] 23/04/24 13:19:11 INFO SecurityManager: Changing modify acls to: hongbo-miao
[info] 23/04/24 13:19:11 INFO SecurityManager: Changing view acls groups to:
[info] 23/04/24 13:19:11 INFO SecurityManager: Changing modify acls groups to:
[info] 23/04/24 13:19:11 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(hongbo-miao); groups with view permissions: Set(); users  with modify permissions: Set(hongbo-miao); groups with modify permissions: Set()
[info] 23/04/24 13:19:11 INFO Utils: Successfully started service 'sparkDriver' on port 62133.
[info] 23/04/24 13:19:11 INFO SparkEnv: Registering MapOutputTracker
[error] WARNING: An illegal reflective access operation has occurred
[error] WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:find-retired-people-scala/target/bg-jobs/sbt_21530e9/target/795ef2be/5f387cad/spark-unsafe_2.12-3.3.2.jar) to constructor java.nio.DirectByteBuffer(long,int)
[error] WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
[error] WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
[error] WARNING: All illegal access operations will be denied in a future release
[info] 23/04/24 13:19:11 INFO SparkEnv: Registering BlockManagerMaster
[info] 23/04/24 13:19:11 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
[info] 23/04/24 13:19:11 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
[info] 23/04/24 13:19:11 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
[info] 23/04/24 13:19:11 INFO DiskBlockManager: Created local directory at /private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/blockmgr-8cc28404-cae7-4673-85c2-b29fc5dd4823
[info] 23/04/24 13:19:11 INFO MemoryStore: MemoryStore started with capacity 9.4 GiB
[info] 23/04/24 13:19:11 INFO SparkEnv: Registering OutputCommitCoordinator
[info] 23/04/24 13:19:11 INFO Utils: Successfully started service 'SparkUI' on port 4040.
[info] 23/04/24 13:19:11 INFO Executor: Starting executor ID driver on host 10.10.8.125
[info] 23/04/24 13:19:11 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): ''
[info] 23/04/24 13:19:11 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 62134.
[info] 23/04/24 13:19:11 INFO NettyBlockTransferService: Server created on 10.10.8.125:62134
[info] 23/04/24 13:19:11 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
[info] 23/04/24 13:19:11 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.10.8.125, 62134, None)
[info] 23/04/24 13:19:11 INFO BlockManagerMasterEndpoint: Registering block manager 10.10.8.125:62134 with 9.4 GiB RAM, BlockManagerId(driver, 10.10.8.125, 62134, None)
[info] 23/04/24 13:19:11 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.10.8.125, 62134, None)
[info] 23/04/24 13:19:11 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.10.8.125, 62134, None)
[info] 23/04/24 13:19:14 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir.
[info] 23/04/24 13:19:14 INFO SharedState: Warehouse path is 'file:find-retired-people-scala/spark-warehouse'.
[info] 23/04/24 13:19:14 INFO CodeGenerator: Code generated in 87.737958 ms
[info] 23/04/24 13:19:15 INFO CodeGenerator: Code generated in 2.845959 ms
[info] 23/04/24 13:19:15 INFO CodeGenerator: Code generated in 4.161 ms
[info] 23/04/24 13:19:15 INFO CodeGenerator: Code generated in 5.464375 ms
[info] +-------+---+
[info] |   name|age|
[info] +-------+---+
[info] |Charlie| 80|
[info] +-------+---+
[info] 23/04/24 13:19:15 INFO SparkUI: Stopped Spark web UI at http://10.10.8.125:4040
[info] 23/04/24 13:19:15 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
[info] 23/04/24 13:19:15 INFO MemoryStore: MemoryStore cleared
[info] 23/04/24 13:19:15 INFO BlockManager: BlockManager stopped
[info] 23/04/24 13:19:15 INFO BlockManagerMaster: BlockManagerMaster stopped
[info] 23/04/24 13:19:15 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
[info] 23/04/24 13:19:15 INFO SparkContext: Successfully stopped SparkContext
[info] 23/04/24 13:19:15 INFO ShutdownHookManager: Shutdown hook called
[info] 23/04/24 13:19:15 INFO ShutdownHookManager: Deleting directory /private/var/folders/22/ntjwd5dx691gvkktkspl0f_00000gq/T/spark-2a1a3b4a-efa4-4f43-9a0e-babff52aa946
[success] Total time: 9 s, completed Apr 24, 2023, 1:19:15 PM

I think I will stick with the original harmless error from UPDATE 1, without fork := true, because I do want to use Java 17.
