Impala JDBC connection issue in Spark cluster mode

cygmwpex  posted on 2021-06-26  in  Impala

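When I run a Spark job in cluster mode, the Impala JDBC connection throws the exception below. The Spark job creates a Hive table and then uses JDBC to invalidate/refresh the Impala table. The same job runs successfully in Spark client mode.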

java.sql.SQLException: [Simba][ImpalaJDBCDriver](500164) Error initialized or created transport for authentication: [Simba][ImpalaJDBCDriver](500169) Unable to connect to server: GSS initiate failed.
    at com.cloudera.hivecommon.api.HiveServer2ClientFactory.createTransport(Unknown Source)
    at com.cloudera.hivecommon.api.HiveServer2ClientFactory.createClient(Unknown Source)
    at com.cloudera.hivecommon.core.HiveJDBCCommonConnection.connect(Unknown Source)
    at com.cloudera.impala.core.ImpalaJDBCConnection.connect(Unknown Source)
    at com.cloudera.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
    at com.cloudera.jdbc.common.AbstractDriver.connect(Unknown Source)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:270)
xkrw2x1b1#

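import java.security.PrivilegedAction
import java.sql.{Connection, DriverManager}
import org.apache.hadoop.security.UserGroupInformation

protected def getImpalaConnection(impalaJdbcDriver: String, impalaJdbcUrl: String): Connection = {
  // No driver class configured, so there is nothing to connect with.
  if (impalaJdbcDriver.isEmpty) return null
  try {
    // Load the Impala JDBC driver class so it registers with DriverManager.
    Class.forName(impalaJdbcDriver).newInstance
    // Open the connection inside doAs so it runs as the Hadoop login user.
    UserGroupInformation.getLoginUser.doAs(
      new PrivilegedAction[Connection] {
        override def run(): Connection = DriverManager.getConnection(impalaJdbcUrl)
      }
    )
  } catch {
    case e: Exception =>
      // Log the failure with its stack trace, then rethrow it to the caller.
      println(e.toString + " --> " + e.getStackTrace.mkString("\n"))
      throw e
  }
}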

val impalaJdbcDriver = "com.cloudera.impala.jdbc41.Driver"

val impalaJdbcUrl = "jdbc:impala://<Impala_Host>:21050/default;AuthMech=1;SSL=1;KrbRealm=HOST.COM;KrbHostFQDN=_HOST;KrbServiceName=impala;REQUEST_POOL=xyz"

println("Start impala connection")

val impalaConnection = getImpalaConnection(impalaJdbcDriver, impalaJdbcUrl)

val result = impalaConnection.createStatement.executeQuery("SELECT COUNT(1) FROM testTable")
println("End impala connection")

Build a fat JAR and submit it with the spark-submit command below. If needed, you can pass additional options such as --files and --jars.
Spark submit command:

spark-submit --master yarn-cluster --keytab /home/testuser/testuser.keytab --principal testuser@host.COM --queue xyz --class com.dim.UpdateImpala

Change the call according to your Spark version:
For Spark 1: UserGroupInformation.getLoginUser
For Spark 2: UserGroupInformation.getCurrentUser
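
The job in the question invalidates/refreshes an Impala table over JDBC, while the snippet above only runs a SELECT. As a rough sketch of that step, assuming a connection obtained from getImpalaConnection: the object ImpalaMetadata, its two methods, and the newlyCreated flag are hypothetical names introduced here for illustration, not part of the original job. INVALIDATE METADATA and REFRESH are the standard Impala statements for this purpose.

```scala
import java.sql.Connection

// Hypothetical helper, not part of the original job.
object ImpalaMetadata {
  // Chooses the Impala statement for a table:
  // INVALIDATE METADATA for a table Impala has not seen yet (for example, one just created through Hive),
  // REFRESH for an existing table whose data files changed.
  def metadataStatement(table: String, newlyCreated: Boolean): String = {
    // Accept only "table" or "db.table" made of letters, digits and underscores.
    require(table.matches("[A-Za-z0-9_]+(\\.[A-Za-z0-9_]+)?"), s"unexpected table name: $table")
    if (newlyCreated) s"INVALIDATE METADATA $table" else s"REFRESH $table"
  }

  // Executes the statement on the given connection and always closes the Statement.
  def syncTable(conn: Connection, table: String, newlyCreated: Boolean): Unit = {
    val stmt = conn.createStatement()
    try {
      stmt.execute(metadataStatement(table, newlyCreated))
      ()
    } finally {
      stmt.close()
    }
  }
}
```

With the connection from the answer, a call could look like ImpalaMetadata.syncTable(impalaConnection, "default.testTable", newlyCreated = true).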
