Spark: programmatically connecting to a remote Kerberos-secured Hive

Posted by mmvthczy on 2021-05-17 in Spark

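I am trying to connect to and query Hive tables on a remote Hadoop cluster that is secured with Kerberos. My user has the necessary privileges on the cluster, and I have the keytab and krb5.conf files at hand. I have already connected successfully over JDBC, but I am now trying this alternative approach. I have added hive-site.xml, core-site.xml, and the other configuration files to the classpath in IntelliJ.
This is what I have tried so far.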

package org.nexus.spark.kerberos

import com.typesafe.scalalogging.Logger
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

object SparkHiveConnect {
  val logger: Logger = Logger("SparkHiveConnect")

  def main(args: Array[String]): Unit = {
    System.setProperty("java.security.auth.login.config", "jaas.conf")
    System.setProperty("sun.security.jgss.debug", "true")
    System.setProperty("java.security.debug", "gssloginconfig,configfile,configparser,logincontext")
    System.setProperty("javax.security.auth.useSubjectCredsOnly", "false")
    System.setProperty("java.security.krb5.conf", "krb5.conf")
    System.setProperty("hive.metastore.sasl.enabled", "true")
    System.setProperty("hive.security.authorization.enabled", "false")
    System.setProperty("hive.metastore.kerberos.principal", "hive/_PRINCIPAL")
    System.setProperty("hive.metastore.execute.setugi", "true")
    System.setProperty("hadoop.security.authentication", "kerberos")
    System.setProperty("hadoop.home.dir", "HADOOP_HOME")
    System.setProperty("spark.home", "SPARK_HOME")

    val spark = SparkSession
      .builder()
      .master("local[1]")
      .appName("Scala Spark Hive Example")
      .config("spark.hadoop.hive.metastore.uris", "thrift://hostname:9083")
      .config("spark.yarn.keytab", "my-keytab")
      .config("spark.yarn.principal", "user@PRINCIPAL")
      .config("spark.hadoop.spark.sql.warehouse.dir", "/tmp/warehouse")
      .enableHiveSupport()
      .getOrCreate();

    spark.sql("select * from testtable").show()
  }
}
The `jaas.conf` file is as follows:

com.sun.security.jgss.krb5.initiate {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=false
doNotPrompt=true
useKeyTab=true
keyTab="file:/my-ketab"
principal="user@PRINCIPAL"
debug=true;
};

The code above starts executing and even connects to the metastore successfully, but then it hangs:

19:12:34.103 [main] INFO o.s.j.server.handler.ContextHandler - Started o.s.j.s.ServletContextHandler@7a24eb3{/SQL/execution/json,null,AVAILABLE,@Spark}
19:12:34.104 [main] INFO o.s.j.server.handler.ContextHandler - Started o.s.j.s.ServletContextHandler@4fe875be{/static/sql,null,AVAILABLE,@Spark}
19:12:34.804 [main] INFO org.apache.spark.sql.hive.HiveUtils - Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
19:12:34.825 [main] INFO o.a.s.sql.hive.client.HiveClientImpl - Attempting to login to Kerberos using principal: user@PRINCIPAL and keytab: my-keytab.keytab
19:12:35.255 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://hostname:9083
Search Subject for Kerberos V5 INIT cred (<>, sun.security.jgss.krb5.Krb5InitCredential)
Debug is true storeKey false useTicketCache false useKeyTab true doNotPrompt true ticketCache is null isInitiator true KeyTab is my-keytab.keytab refreshKrb5Config is false principal is user@PRINCIPAL tryFirstPass is false useFirstPass is false storePass is false clearPass is false
principal is user@PRINCIPAL
Will use keytab
Commit Succeeded

19:12:38.672 [main] INFO hive.metastore - Connected to metastore.

When running in debug mode, I see the following details printed in an infinite loop:

19:15:05.278 [main] INFO hive.metastore - Connected to metastore.
19:15:07.244 [main] DEBUG o.a.thrift.transport.TSaslTransport - CLIENT: reading data length: 34
19:15:07.244 [main] DEBUG o.a.thrift.transport.TSaslTransport - writing data length: 42
19:15:07.536 [main] DEBUG o.a.thrift.transport.TSaslTransport - CLIENT: reading data length: 34
19:15:07.536 [main] DEBUG o.a.thrift.transport.TSaslTransport - writing data length: 48
19:15:07.827 [main] DEBUG o.a.thrift.transport.TSaslTransport - CLIENT: reading data length: 34
19:15:07.827 [main] DEBUG o.a.thrift.transport.TSaslTransport - writing data length: 48
19:15:08.178 [main] DEBUG o.a.thrift.transport.TSaslTransport - CLIENT: reading data length: 34

Any suggestions about what I might be doing wrong would be greatly appreciated.
