Scala - Unable to access Hive tables through HiveContext from an Eclipse Maven project

g6baxovj  posted 2021-05-29 in Hadoop

This question already has answers here:

The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------ (4 answers)
Closed 2 years ago.
I am trying to access Hive tables from an Eclipse Maven project with the Scala nature.
I tried to fetch the Hive database details using a HiveContext as shown below, but I run into the error that follows.
I can run the same code from the spark-shell CLI, but the same code fails from the Eclipse Scala IDE even though the Maven dependencies are added.
Below is my code:

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.sql.hive._

object readHiveTable {
  def main(args: Array[String]){
    val conf = new SparkConf().setAppName("Read Hive Table").setMaster("local")
    conf.set("spark.ui.port","4041")
    val sc = new SparkContext(conf)
    //val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val hc = new HiveContext(sc)
    hc.setConf("hive.metastore.uris","thrift://127.0.0.1:9083")
    hc.sql("use default")
    val a = hc.sql("show tables")
    a.show
  }
}

Below is the error I get in the console window:

18/02/04 19:58:15 INFO SparkUI: Started SparkUI at http://192.168.0.10:4041
18/02/04 19:58:15 INFO Executor: Starting executor ID driver on host localhost
18/02/04 19:58:15 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 36099.
18/02/04 19:58:15 INFO NettyBlockTransferService: Server created on 36099
18/02/04 19:58:15 INFO BlockManagerMaster: Trying to register BlockManager
18/02/04 19:58:15 INFO BlockManagerMasterEndpoint: Registering block manager localhost:36099 with 744.4 MB RAM, BlockManagerId(driver, localhost, 36099)
18/02/04 19:58:15 INFO BlockManagerMaster: Registered BlockManager
18/02/04 19:58:17 INFO HiveContext: Initializing execution hive, version 1.2.1
18/02/04 19:58:17 INFO ClientWrapper: Inspected Hadoop version: 2.2.0
18/02/04 19:58:17 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.2.0
18/02/04 19:58:17 INFO deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
18/02/04 19:58:17 INFO deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
18/02/04 19:58:17 INFO deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
18/02/04 19:58:17 INFO deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
18/02/04 19:58:17 INFO deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
18/02/04 19:58:17 INFO deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
18/02/04 19:58:17 INFO deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
18/02/04 19:58:17 INFO deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
18/02/04 19:58:17 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
18/02/04 19:58:17 INFO ObjectStore: ObjectStore, initialize called
18/02/04 19:58:17 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
18/02/04 19:58:17 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
18/02/04 19:58:28 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
18/02/04 19:58:30 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
18/02/04 19:58:30 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
18/02/04 19:58:38 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
18/02/04 19:58:38 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
18/02/04 19:58:39 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
18/02/04 19:58:39 INFO ObjectStore: Initialized ObjectStore
18/02/04 19:58:40 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
18/02/04 19:58:40 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
18/02/04 19:58:41 INFO HiveMetaStore: Added admin role in metastore
18/02/04 19:58:41 INFO HiveMetaStore: Added public role in metastore
18/02/04 19:58:41 INFO HiveMetaStore: No user is added in admin role, since config is empty
18/02/04 19:58:41 INFO HiveMetaStore: 0: get_all_databases
18/02/04 19:58:41 INFO audit: ugi=chaithu   ip=unknown-ip-addr  cmd=get_all_databases   
18/02/04 19:58:41 INFO HiveMetaStore: 0: get_functions: db=default pat=*
18/02/04 19:58:41 INFO audit: ugi=chaithu   ip=unknown-ip-addr  cmd=get_functions: db=default pat=* 
18/02/04 19:58:41 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
    at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:194)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:238)
    at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:218)
    at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:208)
    at org.apache.spark.sql.hive.HiveContext.functionRegistry$lzycompute(HiveContext.scala:462)
    at org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:461)
    at org.apache.spark.sql.UDFRegistration.<init>(UDFRegistration.scala:40)
    at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:330)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
    at com.CITIGenesis.readHiveTable$.main(readHiveTable.scala:13)
    at com.CITIGenesis.readHiveTable.main(readHiveTable.scala)
Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------
    at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612)
    at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
    ... 12 more
18/02/04 19:58:43 INFO SparkContext: Invoking stop() from shutdown hook
18/02/04 19:58:43 INFO SparkUI: Stopped Spark web UI at http://192.168.0.10:4041
18/02/04 19:58:43 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/02/04 19:58:43 INFO MemoryStore: MemoryStore cleared
18/02/04 19:58:43 INFO BlockManager: BlockManager stopped
18/02/04 19:58:43 INFO BlockManagerMaster: BlockManagerMaster stopped
18/02/04 19:58:43 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/02/04 19:58:43 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
18/02/04 19:58:43 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
18/02/04 19:58:43 INFO SparkContext: Successfully stopped SparkContext
18/02/04 19:58:43 INFO ShutdownHookManager: Shutdown hook called
18/02/04 19:58:43 INFO ShutdownHookManager: Deleting directory /tmp/spark-0ec5892a-1d53-4721-b770-d16e8757865d
18/02/04 19:58:43 INFO ShutdownHookManager: Deleting directory /tmp/spark-0ca97c02-57c7-400b-b552-44f6d7813da5

HDFS directories:

chaithu@localhost:~$ hadoop fs -ls /tmp
Found 3 items
d---------   - hdfs   supergroup          0 2018-02-04 14:15 /tmp/.cloudera_health_monitoring_canary_files
drwxrwxrwx   - hdfs   supergroup          0 2018-01-31 11:42 /tmp/hive
drwxrwxrwt   - mapred hadoop              0 2018-01-31 11:25 /tmp/logs
chaithu@localhost:~$ hadoop fs -ls /user/
Found 6 items
drwxrwxrwx   - chaithu supergroup          0 2018-02-04 19:34 /user/chaithu
drwxrwxrwx   - mapred  hadoop              0 2018-01-31 11:25 /user/history
drwxrwxr-t   - hive    hive                0 2018-01-31 11:31 /user/hive
drwxrwxr-x   - hue     hue                 0 2018-01-31 11:38 /user/hue
drwxrwxr-x   - oozie   oozie               0 2018-01-31 11:34 /user/oozie
drwxr-x--x   - spark   spark               0 2018-01-31 22:39 /user/spark
hlswsv35  1#

for Hadoop version 2.2.0 — assuming that is really the Spark version, you should use SparkSession together with enableHiveSupport(); the spark.sql method will then work just as it does in the spark-shell (see the sketch below).
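
A minimal sketch of what that could look like, assuming a Spark 2.x spark-hive dependency is on the Maven classpath (the object name here is just a placeholder):

import org.apache.spark.sql.SparkSession

object ReadHiveTableSession {
  def main(args: Array[String]): Unit = {
    // Build a SparkSession with Hive support instead of creating a HiveContext
    val spark = SparkSession.builder()
      .appName("Read Hive Table")
      .master("local[*]")
      .config("spark.ui.port", "4041")
      .enableHiveSupport()
      .getOrCreate()

    // spark.sql behaves the same way it does in the spark-shell
    spark.sql("use default")
    spark.sql("show tables").show()

    spark.stop()
  }
}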
HiveContext / SQLContext are kept only for backwards compatibility; new Spark code should not use them.

underlying DB is DERBY — to me this line means that:

- Hive is using the default metastore configuration, and
- Spark is not connecting to your metastore and has created a local Derby database instead. That also matches Failed to get database default. In the latter case, check the /tmp folder on your local filesystem.

See the various solutions here for how to connect to the metastore:
How to connect to a Hive metastore programmatically in SparkSQL?
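
A minimal sketch of one such option, assuming the thrift URI from the question (placing a hive-site.xml on the classpath is the more usual approach):

import org.apache.spark.sql.SparkSession

// Point the session at the remote metastore explicitly so Spark does not
// fall back to a local Derby database.
val spark = SparkSession.builder()
  .appName("Read Hive Table")
  .master("local[*]")
  .config("hive.metastore.uris", "thrift://127.0.0.1:9083")
  .enableHiveSupport()
  .getOrCreate()

spark.sql("show databases").show()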
