We recently upgraded our ETL project from Spark 2.4.2 to 2.4.5. After deploying the change and running the job, I see the following error:
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:65)
at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V
at com.advisory.pic.etl.utils.OracleDialect$.<init>(OracleDialect.scala:12)
at com.advisory.pic.etl.utils.OracleDialect$.<clinit>(OracleDialect.scala)
at com.advisory.pic.etl.drivers.BaseDriver.$init$(BaseDriver.scala:19)
at com.advisory.pic.etl.drivers.PASLoadDriver$.<init>(PASLoadDriver.scala:19)
at com.advisory.pic.etl.drivers.PASLoadDriver$.<clinit>(PASLoadDriver.scala)
at com.advisory.pic.etl.drivers.PASLoadDriver.main(PASLoadDriver.scala)
... 6 more
I read online that this can be caused by a library version mismatch, but I cannot find any such conflict in build.gradle, apart from one testImplementation dependency, which I have since upgraded to the correct version; I doubt that was the root cause, though.
Here is the dependency snippet from the build.gradle file:
dependencies {
    def hadoopClientVersion = '2.7.1'
    def hadoopCommonsVersion = '2.7.1'
    def sparkVersion = '2.4.5'
    def sparkTestingVersion = '2.4.5'
    provided group: 'org.apache.hadoop', name: 'hadoop-client', version: hadoopClientVersion
    provided group: 'org.apache.hadoop', name: 'hadoop-common', version: hadoopCommonsVersion
    implementation("org.apache.spark:spark-sql_2.12:$sparkVersion") {
        // Excluding causes issues when running through the IDE, as the Spark libraries are not available at run-time.
        // We can comment these back in for deployment if we face jar conflict issues.
        //exclude module: 'spark-core_2.10'
        //exclude module: 'spark-catalyst_2.10'
    }
    implementation group: 'org.apache.spark', name: 'spark-core_2.12', version: sparkVersion
    testImplementation group: 'org.apache.spark', name: 'spark-core_2.12', version: sparkVersion
    // spark-sql with avro
    implementation("com.databricks:spark-avro_2.11:4.0.0")
    // joda-time
    implementation 'com.github.nscala-time:nscala-time_2.12:2.22.0'
    // configuration object
    implementation group: 'com.typesafe', name: 'config', version: '1.2.1'
    implementation "ch.qos.logback:logback-classic:1.1.3"
    implementation "org.slf4j:log4j-over-slf4j:1.7.13"
    // Libraries needed for the Scala API
    implementation 'org.scala-lang:scala-library:2.12.0'
    implementation 'org.scala-lang:scala-compiler:2.12.0'
    testImplementation 'org.scalatest:scalatest_2.12:3.0.5'
    implementation 'com.oracle:ojdbc7:12.1.0.1'
    testImplementation group: 'com.h2database', name: 'h2', version: '1.4.196'
    testImplementation 'com.holdenkarau:spark-testing-base_2.12:' + sparkTestingVersion + '_0.12.0'
    itestCompile 'org.scala-lang:scala-library:2.12.0'
    itestCompile 'org.scalatest:scalatest_2.12:3.0.5'
    testImplementation 'org.scalamock:scalamock_2.12:4.3.0'
}
Any suggestions on why this is happening, and on how to verify a version mismatch?
1 Answer
I think this is caused by a mismatch between the Scala version your code was compiled with and the Scala version available at runtime.
Spark 2.4.2 was pre-built with Scala 2.12, but Spark 2.4.5 is pre-built with Scala 2.11; see https://spark.apache.org/downloads.html.
The problem should go away if you use Spark libraries compiled with Scala 2.11. Alternatively, since every Spark artifact in your build.gradle already carries the _2.12 suffix, you could deploy the Scala 2.12 build of Spark 2.4.5 so that the runtime matches what the jar was compiled against.
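To verify the mismatch at runtime, here is a minimal sketch (my own illustration, not part of the original answer; the object name VersionCheck is arbitrary). Bundle it into the same jar and submit it exactly like PASLoadDriver; it prints which Scala and Spark versions the driver actually loaded:

object VersionCheck {
  def main(args: Array[String]): Unit = {
    // Plain string concatenation on purpose: it keeps the bytecode surface
    // small enough that this class loads even on a mismatched Scala runtime.
    println("Scala runtime: " + scala.util.Properties.versionString)
    println("Spark version: " + org.apache.spark.SPARK_VERSION)
    // Report which physical jar the scala.Product trait was loaded from;
    // getCodeSource can return null for bootstrap classes, hence the Option.
    val src = Option(classOf[scala.Product].getProtectionDomain.getCodeSource)
    println("scala-library: " + src.map(_.getLocation.toString).getOrElse("unknown"))
  }
}

If the first line prints "version 2.11.x" while the jar was built against scala-library 2.12.0, the NoSuchMethodError on scala.Product.$init$ is expected: that method only exists in Scala 2.12 bytecode. On the build side, ./gradlew dependencies --configuration runtimeClasspath prints the resolved dependency tree; note, for instance, that the snippet above pulls in spark-avro_2.11 next to _2.12 artifacts, which is exactly the kind of suffix mismatch to look for.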