We have upgraded our HDP cluster to 3.1.1.3.0.1.0-187 and have discovered that:
Hive has a new metastore location
Spark can't see Hive databases
In fact we see:
org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database ... not found
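A minimal way to hit this from spark-shell on the cluster (a sketch; mydb stands in for one of the existing Hive database names):

// inside spark-shell (Hive support is enabled by default in the HDP build)
spark.sql("SHOW DATABASES").show()  // only "default" is listed, none of the Hive databases
spark.sql("USE mydb")               // throws NoSuchDatabaseException: Database 'mydb' not found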
Can you help me understand what happened and how to solve this?
Update:
The configuration:
(spark.sql.warehouse.dir,/warehouse/tablespace/external/hive/)
(spark.admin.acls,)
(spark.yarn.dist.files,file:///opt/folder/config.yml,file:///opt/jdk1.8.0_172/jre/lib/security/cacerts)
(spark.history.kerberos.keytab,/etc/security/keytab/spark.service.keytab)
(spark.io.compression.lz4.blockSize,128kb)
(spark.executor.extraJavaOptions,-Djavax.net.ssl.trustStore=cacerts)
(spark.history.fs.logDirectory,hdfs:///spark2-history/)
(spark.io.encryption.keygen.algorithm,HmacSHA1)
(spark.sql.autoBroadcastJoinThreshold,26214400)
(spark.eventLog.enabled,true)
(spark.shuffle.service.enabled,true)
(spark.driver.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64)
(spark.ssl.keyStore,/etc/security/serverKeys/server-keystore.jks)
(spark.yarn.queue,default)
(spark.jars,file:/opt/folder/component-assembly-0.1.0-SNAPSHOT.jar)
(spark.ssl.enabled,true)
(spark.sql.orc.filterPushdown,true)
(spark.shuffle.unsafe.file.output.buffer,5m)
(spark.yarn.historyServer.address,master2.env.project:18481)
(spark.ssl.trustStore,/etc/security/clientKeys/all.jks)
(spark.app.name,com.company.env.component.MyClass)
(spark.sql.hive.metastore.jars,/usr/hdp/current/spark2-client/standalone-metastore/)
(spark.io.encryption.keySizeBits,128)
(spark.driver.memory,2g)
(spark.executor.instances,10)
(spark.history.kerberos.principal,spark/edge.env.project@ENV.PROJECT)
(spark.ssl.keyPassword,*********(redacted))
(spark.ssl.keyStorePassword,*********(redacted))
(spark.history.fs.cleaner.enabled,true)
(spark.shuffle.io.serverThreads,128)
(spark.sql.hive.convertMetastoreOrc,true)
(spark.submit.deployMode,client)
(spark.sql.orc.char.enabled,true)
(spark.master,yarn)
(spark.authenticate.enableSaslEncryption,true)
(spark.history.fs.cleaner.interval,7d)
(spark.authenticate,true)
(spark.history.fs.cleaner.maxAge,90d)
(spark.history.ui.acls.enable,true)
(spark.acls.enable,true)
(spark.history.provider,org.apache.spark.deploy.history.FsHistoryProvider)
(spark.executor.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64)
(spark.executor.memory,2g)
(spark.io.encryption.enabled,true)
(spark.shuffle.file.buffer,1m)
(spark.eventLog.dir,hdfs:///spark2-history/)
(spark.ssl.protocol,TLS)
(spark.dynamicAllocation.enabled,true)
(spark.executor.cores,3)
(spark.history.ui.port,18081)
(spark.sql.statistics.fallBackToHdfs,true)
(spark.repl.local.jars,file:///opt/folder/postgresql-42.2.2.jar,file:///opt/folder/ojdbc6.jar)
(spark.ssl.trustStorePassword,*********(redacted))
(spark.history.ui.admin.acls,)
(spark.history.kerberos.enabled,true)
(spark.shuffle.io.backLog,8192)
(spark.sql.orc.impl,native)
(spark.ssl.enabledAlgorithms,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA)
(spark.sql.orc.enabled,true)
(spark.yarn.dist.jars,file:///opt/folder/postgresql-42.2.2.jar,file:///opt/folder/ojdbc6.jar)
(spark.sql.hive.metastore.version,3.0)
From hive-site.xml:
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/warehouse/tablespace/managed/hive</value>
</property>
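So the two configs point at different tablespaces: HDP 3 splits the warehouse into a managed location (/warehouse/tablespace/managed/hive, owned by Hive) and an external one (/warehouse/tablespace/external/hive). The mismatch is easy to confirm from inside a session (a sketch; the output comments reflect the configs above):

// What Spark thinks the warehouse is vs. what hive-site.xml says
println(spark.conf.get("spark.sql.warehouse.dir"))
// -> /warehouse/tablespace/external/hive/
println(spark.sparkContext.hadoopConfiguration.get("hive.metastore.warehouse.dir"))
// -> /warehouse/tablespace/managed/hive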
The code looks like this:
val spark = SparkSession
.builder()
.appName(getClass.getSimpleName)
.enableHiveSupport()
.getOrCreate()
...
dataFrame.write
.format("orc")
.options(Map("spark.sql.hive.convertMetastoreOrc" -> true.toString))
.mode(SaveMode.Append)
.saveAsTable("name")
spark-submit:
--master yarn \
--deploy-mode client \
--driver-memory 2g \
--driver-cores 4 \
--executor-memory 2g \
--num-executors 10 \
--executor-cores 3 \
--conf "spark.dynamicAllocation.enabled=true" \
--conf "spark.shuffle.service.enabled=true" \
--conf "spark.executor.extraJavaOptions=-Djavax.net.ssl.trustStore=cacerts" \
--conf "spark.sql.warehouse.dir=/warehouse/tablespace/external/hive/" \
--jars postgresql-42.2.2.jar,ojdbc6.jar \
--files config.yml,/opt/jdk1.8.0_172/jre/lib/security/cacerts \
--verbose \
component-assembly-0.1.0-SNAPSHOT.jar
2 Answers
tez616oj1#
I have a hack for this one, though, disclaimer: it bypasses Ranger permissions (don't blame me if you incur the wrath of an admin).
Works with spark-shell (see the sketch below)
Works with sparklyr
It should work with the Thrift server, but I haven't tested that.
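A sketch of the trick, assuming the setting in question is metastore.catalog.default: the Hive 3 metastore in HDP keeps separate "spark" and "hive" catalogs, and flipping Spark to the "hive" catalog makes the managed databases visible. That is also why Ranger is bypassed: Spark then reads the warehouse files directly instead of going through HiveServer2.

# spark-shell: point Spark's embedded Hive client at the "hive" catalog
spark-shell --conf spark.hadoop.metastore.catalog.default=hive

# the same conf can be passed to spark-submit
spark-submit --conf spark.hadoop.metastore.catalog.default=hive ...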
0x6upsns2#
It looks like this is a Spark feature that simply isn't implemented yet. As far as I can tell, the only supported way to work with Hive from Spark since 3.0 is Hortonworks' HiveWarehouseConnector. Here are the docs, and a good guide from the Hortonworks community. I'm leaving the question open until the Spark developers ship a solution of their own.
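For reference, a minimal sketch of what using the connector looks like (an assumption-laden example, not the asker's code: it presumes the hive-warehouse-connector assembly jar is on the classpath and spark.sql.hive.hiveserver2.jdbc.url is configured; names follow the Hortonworks docs, and "mydb"/"name" are illustrative):

import com.hortonworks.hwc.HiveWarehouseSession

// Build an HWC session on top of the existing SparkSession.
val hive = HiveWarehouseSession.session(spark).build()

// Databases and tables are now resolved through HiveServer2 (and Ranger),
// not through Spark's own catalog.
hive.showDatabases().show()
hive.setDatabase("mydb")
val df = hive.executeQuery("SELECT * FROM name")

// Writes go through the connector's data source instead of saveAsTable:
df.write
  .format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR)
  .option("table", "name")
  .mode("append")
  .save()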