I am trying to connect to and query Hive tables on a remote Hadoop cluster that is secured by Kerberos. My user has the necessary privileges on the cluster, and I have the keytab and krb5.conf files at hand. I have already been successful connecting over JDBC, but I am now trying this alternate approach through Spark.

I have added hive-site.xml, core-site.xml, and related files to the classpath in IntelliJ.
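If the project is built with sbt, a minimal dependency set for this kind of setup might look like the following (a sketch only; the use of sbt and the exact versions are assumptions, not stated above):

```scala
// build.sbt -- hedged sketch; sbt usage and versions are assumptions
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"  % "2.4.8",
  "org.apache.spark" %% "spark-hive" % "2.4.8",  // required for enableHiveSupport()
  "com.typesafe.scala-logging" %% "scala-logging" % "3.9.2"
)
```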
Here is what I have tried so far:
```scala
package org.nexus.spark.kerberos

import com.typesafe.scalalogging.Logger
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

object SparkHiveConnect {

  val logger: Logger = Logger("SparkHiveConnect")

  def main(args: Array[String]): Unit = {
    System.setProperty("java.security.auth.login.config", "jaas.conf")
    System.setProperty("sun.security.jgss.debug", "true")
    System.setProperty("java.security.debug", "gssloginconfig,configfile,configparser,logincontext")
    System.setProperty("javax.security.auth.useSubjectCredsOnly", "false")
    System.setProperty("java.security.krb5.conf", "krb5.conf")
    System.setProperty("hive.metastore.sasl.enabled", "true")
    System.setProperty("hive.security.authorization.enabled", "false")
    System.setProperty("hive.metastore.kerberos.principal", "hive/_PRINCIPAL")
    System.setProperty("hive.metastore.execute.setugi", "true")
    System.setProperty("hadoop.security.authentication", "kerberos")
    System.setProperty("hadoop.home.dir", "HADOOP_HOME")
    System.setProperty("spark.home", "SPARK_HOME")

    val spark = SparkSession
      .builder()
      .master("local[1]")
      .appName("Scala Spark Hive Example")
      .config("spark.hadoop.hive.metastore.uris", "thrift://hostname:9083")
      .config("spark.yarn.keytab", "my-keytab")
      .config("spark.yarn.principal", "user@PRINCIPAL")
      .config("spark.hadoop.spark.sql.warehouse.dir", "/tmp/warehouse")
      .enableHiveSupport()
      .getOrCreate()

    spark.sql("select * from testtable").show()
  }
}
```

The `jaas.conf` file looks like this:

```
com.sun.security.jgss.krb5.initiate {
  com.sun.security.auth.module.Krb5LoginModule required
  useTicketCache=false
  doNotPrompt=true
  useKeyTab=true
  keyTab="file:/my-ketab"
  principal="user@PRINCIPAL"
  debug=true;
};
```
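Since the SASL handshake depends on the JVM finding and parsing the JAAS entry, it can help to confirm that resolution works before involving Spark at all. Below is a minimal, KDC-free sketch using only the JDK; `countLoginModules` is a hypothetical helper I wrote for illustration, not part of any Spark or Hadoop API, and the entry name and file contents simply mirror the setup above:

```scala
import java.nio.file.Files
import javax.security.auth.login.Configuration

object JaasConfigCheck {

  // Hypothetical helper: write the given JAAS text to a temp file, point the
  // JVM at it, and return how many login modules the named entry resolves to.
  // No KDC is contacted; this only exercises JAAS config parsing.
  def countLoginModules(jaasText: String, entryName: String): Int = {
    val conf = Files.createTempFile("jaas", ".conf")
    Files.write(conf, jaasText.getBytes("UTF-8"))
    System.setProperty("java.security.auth.login.config", conf.toString)
    Configuration.setConfiguration(null) // drop any cached config so the property is re-read
    val entries = Configuration.getConfiguration.getAppConfigurationEntry(entryName)
    if (entries == null) 0 else entries.length
  }

  def main(args: Array[String]): Unit = {
    val jaas =
      """com.sun.security.jgss.krb5.initiate {
        |  com.sun.security.auth.module.Krb5LoginModule required
        |  useKeyTab=true
        |  keyTab="file:/my-ketab"
        |  principal="user@PRINCIPAL"
        |  debug=true;
        |};
        |""".stripMargin
    println(countLoginModules(jaas, "com.sun.security.jgss.krb5.initiate"))
  }
}
```

If this prints `0` or throws, the problem is in the JAAS file itself rather than in Spark or the metastore connection.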
The code above starts executing and even connects to the metastore successfully, but then it gets stuck:

```
19:12:34.103 [main] INFO o.s.j.server.handler.ContextHandler - Started o.s.j.s.ServletContextHandler@7a24eb3{/SQL/execution/json,null,AVAILABLE,@Spark}
19:12:34.104 [main] INFO o.s.j.server.handler.ContextHandler - Started o.s.j.s.ServletContextHandler@4fe875be{/static/sql,null,AVAILABLE,@Spark}
19:12:34.804 [main] INFO org.apache.spark.sql.hive.HiveUtils - Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
19:12:34.825 [main] INFO o.a.s.sql.hive.client.HiveClientImpl - Attempting to login to Kerberos using principal: user@PRINCIPAL and keytab: my-keytab.keytab
19:12:35.255 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://hostname:9083
Search Subject for Kerberos V5 INIT cred (<>, sun.security.jgss.krb5.Krb5InitCredential)
Debug is true storeKey false useTicketCache false useKeyTab true doNotPrompt true ticketCache is null isInitiator true KeyTab is my-keytab.keytab refreshKrb5Config is false principal is user@PRINCIPAL tryFirstPass is false useFirstPass is false storePass is false clearPass is false
principal is user@PRINCIPAL
Will use keytab
Commit Succeeded
19:12:38.672 [main] INFO hive.metastore - Connected to metastore.
```
When running in debug mode, I see the following lines printed in an infinite loop:

```
19:15:05.278 [main] INFO hive.metastore - Connected to metastore.
19:15:07.244 [main] DEBUG o.a.thrift.transport.TSaslTransport - CLIENT: reading data length: 34
19:15:07.244 [main] DEBUG o.a.thrift.transport.TSaslTransport - writing data length: 42
19:15:07.536 [main] DEBUG o.a.thrift.transport.TSaslTransport - CLIENT: reading data length: 34
19:15:07.536 [main] DEBUG o.a.thrift.transport.TSaslTransport - writing data length: 48
19:15:07.827 [main] DEBUG o.a.thrift.transport.TSaslTransport - CLIENT: reading data length: 34
19:15:07.827 [main] DEBUG o.a.thrift.transport.TSaslTransport - writing data length: 48
19:15:08.178 [main] DEBUG o.a.thrift.transport.TSaslTransport - CLIENT: reading data length: 34
```
Any suggestions about what I might be doing wrong would be much appreciated.