I am trying to run a Spark application on Dataproc that writes data to and reads data from Cloud Bigtable.
Initially, I got this exception: java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument. Then I learned about the underlying dependency conflicts from this Google doc: [Manage Java and Scala dependencies for Apache Spark][1].
Following its instructions, I changed my build.sbt file to shade the conflicting jars:
assembly / assemblyShadeRules := Seq(
ShadeRule.rename("com.google.common.**" -> "repackaged.com.google.common.@1").inAll,
ShadeRule.rename("com.google.protobuf.**" -> "repackaged.com.google.protobuf.@1").inAll,
ShadeRule.rename("io.grpc.**" -> "repackaged.io.grpc.@1").inAll
)
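For clarity, the @1 in each rule substitutes whatever the ** wildcard matched, so classes are relocated along these lines (illustrative examples only):

com.google.common.base.Preconditions -> repackaged.com.google.common.base.Preconditions
io.grpc.ManagedChannelProvider -> repackaged.io.grpc.ManagedChannelProvider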
Then I got this error:
repackaged.io.grpc.ManagedChannelProvider$ProviderNotFoundException: No functional channel service provider found. Try adding a dependency on the grpc-okhttp, grpc-netty, or grpc-netty-shaded artifact
at repackaged.io.grpc.ManagedChannelProvider.provider(ManagedChannelProvider.java:45)
at repackaged.io.grpc.ManagedChannelBuilder.forAddress(ManagedChannelBuilder.java:39)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.createSingleChannel(InstantiatingGrpcChannelProvider.java:353)
at com.google.api.gax.grpc.ChannelPool.<init>(ChannelPool.java:107)
at com.google.api.gax.grpc.ChannelPool.create(ChannelPool.java:85)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.createChannel(InstantiatingGrpcChannelProvider.java:237)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.getTransportChannel(InstantiatingGrpcChannelProvider.java:231)
at com.google.api.gax.rpc.ClientContext.create(ClientContext.java:201)
at com.google.cloud.bigtable.data.v2.stub.EnhancedBigtableStub.create(EnhancedBigtableStub.java:175)
at com.google.cloud.bigtable.data.v2.BigtableDataClient.create(BigtableDataClient.java:165)
at com.groupon.crm.BigtableClient$.getDataClient(BigtableClient.scala:59)
... 44 elided
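From what I can tell, gRPC discovers channel implementations at runtime through java.util.ServiceLoader, which reads registration files named after the provider interface under META-INF/services. A minimal sketch of that lookup, using the unshaded io.grpc names for illustration:

import java.util.ServiceLoader
import io.grpc.ManagedChannelProvider

// ServiceLoader reads META-INF/services/io.grpc.ManagedChannelProvider from the
// classpath; grpc-netty ships such a file. After shading, the lookup key becomes
// repackaged.io.grpc.ManagedChannelProvider, so the original registration files
// are no longer found.
val it = ServiceLoader.load(classOf[ManagedChannelProvider]).iterator()
while (it.hasNext) println(it.next().getClass.getName)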
Next, I added the grpc-netty dependency to my build.sbt file:
libraryDependencies += "io.grpc" % "grpc-netty" % "1.49.2"
But I still got the same error.
Environment details
Dataproc configuration:
"software_config": {
"image_version": "1.5-debian10",
"properties": {
"dataproc:dataproc.logging.stackdriver.job.driver.enable": "true",
"dataproc:dataproc.logging.stackdriver.enable": "true",
"dataproc:jobs.file-backed-output.enable": "true",
"dataproc:dataproc.logging.stackdriver.job.yarn.container.enable": "true",
"capacity-scheduler:yarn.scheduler.capacity.resource-calculator" : "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator",
"hive:hive.server2.materializedviews.cache.at.startup": "false",
"spark:spark.jars":"XXXX"
},
"optional_components": ["ZEPPELIN","ANACONDA","JUPYTER"]
}
Spark job details (build.sbt):
val sparkVersion = "2.4.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion % "provided"
libraryDependencies += "org.apache.spark" %% "spark-hive" % sparkVersion % "provided"
libraryDependencies += "com.google.cloud" % "google-cloud-bigtable" % "2.23.1"
libraryDependencies += "com.google.auth" % "google-auth-library-oauth2-http" % "1.17.0"
libraryDependencies += "io.grpc" % "grpc-netty" % "1.49.2"
1 Answer
In the end, I solved the issue myself by following these steps:
1. Under src/main/resources, add a META-INF directory, and inside that folder add a services directory.
2. In the src/main/resources/META-INF/services directory, add 2 files: io.grpc.LoadBalancerProvider and io.grpc.NameResolverProvider.
3. Add the following content to the io.grpc.LoadBalancerProvider file: io.grpc.internal.PickFirstLoadBalancerProvider
4. Add the following content to the io.grpc.NameResolverProvider file: io.grpc.internal.DnsNameResolverProvider
5. Finally, make a corresponding change to build.sbt (a sketch of one typical change follows this list).
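A common approach for step 5 (an assumption here, since the exact snippet depends on the project) is to tell sbt-assembly to concatenate the META-INF/services registration files when building the fat jar, instead of letting the default merge strategy keep only one of them:

assembly / assemblyMergeStrategy := {
  // Concatenate ServiceLoader registration files so the entries contributed by
  // grpc-netty and the other gRPC artifacts all survive in the fat jar.
  case PathList("META-INF", "services", _*) => MergeStrategy.concat
  case x =>
    val oldStrategy = (assembly / assemblyMergeStrategy).value
    oldStrategy(x)
}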