Hive CLI cannot create a table from another table

14ifxucb  asked 2021-06-03  in Hadoop

I have been trying to create a table from a column of another table, but the Hive CLI consistently fails to do so.
Here is the query:

CREATE TABLE tweets_id_sample AS
SELECT
   id
FROM tweets_sample;

The CLI error accompanying this Hive query is:

Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201310250853_0023, Tracking URL = http://sandbox:50030/jobdetails.jsp?jobid=job_201310250853_0023
Kill Command = /usr/lib/hadoop/libexec/../bin/hadoop job  -kill job_201310250853_0023
Hadoop job information for Stage-1: number of mappers: 7; number of reducers: 0
2013-10-26 07:40:37,273 Stage-1 map = 0%,  reduce = 0%
2013-10-26 07:41:21,570 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201310250853_0023 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://sandbox:50030/jobdetails.jsp?jobid=job_201310250853_0023
Examining task ID: task_201310250853_0023_m_000008 (and more) from job job_201310250853_0023
Examining task ID: task_201310250853_0023_m_000000 (and more) from job job_201310250853_0023

Task with the most failures(4):
-----
Task ID:
  task_201310250853_0023_m_000000

URL:
  http://sandbox:50030/taskdetails.jsp?jobid=job_201310250853_0023&tipid=task_201310250853_0023_m_000000
-----
Diagnostic Messages for this Task:

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 7   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

After checking the job tracker, the task and all of its attempts (up until the job was killed) show the following error:

java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:425)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:121)
    ... 22 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: org.openx.data.jsonserde.JsonSerDe
    at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:463)
    at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:479)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:90)
    ... 22 more
Caused by: java.lang.ClassNotFoundException: org.openx.data.jsonserde.JsonSerDe
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
    at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:422)
    ... 24 more

The same query works fine in Hive Beeswax.
I have been running these kinds of queries in Beeswax successfully all along. The identical query above (with a different table name) works correctly there, with the following log:

13/10/26 07:51:30 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
13/10/26 07:51:30 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
13/10/26 07:51:30 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
13/10/26 07:51:30 INFO ql.Driver: <PERFLOG method=Driver.run>
13/10/26 07:51:30 INFO ql.Driver: <PERFLOG method=TimeToSubmit>
13/10/26 07:51:30 INFO ql.Driver: <PERFLOG method=compile>
13/10/26 07:51:30 INFO parse.ParseDriver: Parsing command: use default
13/10/26 07:51:30 INFO parse.ParseDriver: Parse Completed
13/10/26 07:51:30 INFO ql.Driver: Semantic Analysis Completed
13/10/26 07:51:30 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
13/10/26 07:51:30 INFO ql.Driver: </PERFLOG method=compile start=1382799090878 end=1382799090880 duration=2>
13/10/26 07:51:30 INFO ql.Driver: <PERFLOG method=Driver.execute>
13/10/26 07:51:30 INFO ql.Driver: Starting command: use default
13/10/26 07:51:30 INFO ql.Driver: </PERFLOG method=TimeToSubmit start=1382799090878 end=1382799090880 duration=2>
13/10/26 07:51:30 INFO ql.Driver: </PERFLOG method=Driver.execute start=1382799090880 end=1382799090924 duration=44>
OK
13/10/26 07:51:30 INFO ql.Driver: OK
13/10/26 07:51:30 INFO ql.Driver: <PERFLOG method=releaseLocks>
13/10/26 07:51:30 INFO ql.Driver: </PERFLOG method=releaseLocks start=1382799090924 end=1382799090924 duration=0>
13/10/26 07:51:30 INFO ql.Driver: </PERFLOG method=Driver.run start=1382799090878 end=1382799090924 duration=46>
13/10/26 07:51:30 INFO ql.Driver: <PERFLOG method=compile>
13/10/26 07:51:30 INFO parse.ParseDriver: Parsing command: CREATE TABLE tweets_id_sample_ui AS
   SELECT
      id
FROM tweets_sample
13/10/26 07:51:30 INFO parse.ParseDriver: Parse Completed
13/10/26 07:51:30 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
13/10/26 07:51:30 INFO parse.SemanticAnalyzer: Creating table tweets_id_sample_ui position=13
13/10/26 07:51:30 INFO parse.SemanticAnalyzer: Completed phase 1 of Semantic Analysis
13/10/26 07:51:30 INFO parse.SemanticAnalyzer: Get metadata for source tables
13/10/26 07:51:31 INFO parse.SemanticAnalyzer: Get metadata for subqueries
13/10/26 07:51:31 INFO parse.SemanticAnalyzer: Get metadata for destination tables
13/10/26 07:51:31 INFO parse.SemanticAnalyzer: Completed getting MetaData in Semantic Analysis
13/10/26 07:51:31 INFO ppd.OpProcFactory: Processing for FS(286)
13/10/26 07:51:31 INFO ppd.OpProcFactory: Processing for SEL(285)
13/10/26 07:51:31 INFO ppd.OpProcFactory: Processing for TS(284)
13/10/26 07:51:31 INFO optimizer.GenMRFileSink1: using CombineHiveInputformat for the merge job
13/10/26 07:51:31 INFO physical.MetadataOnlyOptimizer: Looking for table scans where optimization is applicable
13/10/26 07:51:31 INFO physical.MetadataOnlyOptimizer: Found 0 metadata only table scans
13/10/26 07:51:31 INFO parse.SemanticAnalyzer: Completed plan generation
13/10/26 07:51:31 INFO ql.Driver: Semantic Analysis Completed
13/10/26 07:51:31 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:id, type:bigint, comment:null)], properties:null)
13/10/26 07:51:31 INFO ql.Driver: </PERFLOG method=compile start=1382799090924 end=1382799091259 duration=335>
13/10/26 07:51:31 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
13/10/26 07:51:31 INFO ql.Driver: <PERFLOG method=Driver.execute>
13/10/26 07:51:31 INFO ql.Driver: Starting command: CREATE TABLE tweets_id_sample_ui AS
   SELECT
      id
FROM tweets_sample
Total MapReduce jobs = 3
13/10/26 07:51:31 INFO ql.Driver: Total MapReduce jobs = 3
13/10/26 07:51:31 INFO ql.Driver: </PERFLOG method=TimeToSubmit end=1382799091337>
Launching Job 1 out of 3
13/10/26 07:51:31 INFO ql.Driver: Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
13/10/26 07:51:31 INFO exec.Task: Number of reduce tasks is set to 0 since there's no reduce operator
13/10/26 07:51:31 INFO exec.ExecDriver: Using org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
13/10/26 07:51:31 INFO exec.ExecDriver: adding libjars: file:///usr/lib//hcatalog/share/hcatalog/hcatalog-core.jar,file:///usr/lib/hive/lib/json-serde-1.1.4-jar-with-dependencies.jar
13/10/26 07:51:31 INFO exec.ExecDriver: Processing alias tweets_sample
13/10/26 07:51:31 INFO exec.ExecDriver: Adding input file hdfs://sandbox:8020/data/oct25_tweets
13/10/26 07:51:31 INFO exec.Utilities: Content Summary not cached for hdfs://sandbox:8020/data/oct25_tweets
13/10/26 07:51:35 INFO exec.ExecDriver: Making Temp Directory: hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/-ext-10002
13/10/26 07:51:35 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/10/26 07:51:35 INFO io.CombineHiveInputFormat: CombineHiveInputSplit creating pool for hdfs://sandbox:8020/data/oct25_tweets; using filter path hdfs://sandbox:8020/data/oct25_tweets
13/10/26 07:51:35 INFO mapred.FileInputFormat: Total input paths to process : 964
13/10/26 07:51:39 INFO io.CombineHiveInputFormat: number of splits 7
Starting Job = job_201310250853_0024, Tracking URL = http://sandbox:50030/jobdetails.jsp?jobid=job_201310250853_0024
13/10/26 07:51:39 INFO exec.Task: Starting Job = job_201310250853_0024, Tracking URL = http://sandbox:50030/jobdetails.jsp?jobid=job_201310250853_0024
Kill Command = /usr/lib/hadoop/libexec/../bin/hadoop job  -kill job_201310250853_0024
13/10/26 07:51:39 INFO exec.Task: Kill Command = /usr/lib/hadoop/libexec/../bin/hadoop job  -kill job_201310250853_0024
Hadoop job information for Stage-1: number of mappers: 7; number of reducers: 0
13/10/26 07:51:48 INFO exec.Task: Hadoop job information for Stage-1: number of mappers: 7; number of reducers: 0
2013-10-26 07:51:48,788 Stage-1 map = 0%,  reduce = 0%
13/10/26 07:51:48 INFO exec.Task: 2013-10-26 07:51:48,788 Stage-1 map = 0%,  reduce = 0%
2013-10-26 07:52:00,853 Stage-1 map = 1%,  reduce = 0%
13/10/26 07:52:00 INFO exec.Task: 2013-10-26 07:52:00,853 Stage-1 map = 1%,  reduce = 0%
2013-10-26 07:52:02,037 Stage-1 map = 2%,  reduce = 0%
13/10/26 07:52:02 INFO exec.Task: 2013-10-26 07:52:02,037 Stage-1 map = 2%,  reduce = 0%
2013-10-26 07:52:04,048 Stage-1 map = 3%,  reduce = 0%
13/10/26 07:52:04 INFO exec.Task: 2013-10-26 07:52:04,048 Stage-1 map = 3%,  reduce = 0%
...
2013-10-26 07:54:30,400 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 141.58 sec
13/10/26 07:54:30 INFO exec.Task: 2013-10-26 07:54:30,400 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 141.58 sec
MapReduce Total cumulative CPU time: 2 minutes 21 seconds 580 msec
13/10/26 07:54:30 INFO exec.Task: MapReduce Total cumulative CPU time: 2 minutes 21 seconds 580 msec
Ended Job = job_201310250853_0024
13/10/26 07:54:30 INFO exec.Task: Ended Job = job_201310250853_0024
13/10/26 07:54:30 INFO exec.FileSinkOperator: Moving tmp dir: hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/_tmp.-ext-10002 to: hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/_tmp.-ext-10002.intermediate
13/10/26 07:54:30 INFO exec.FileSinkOperator: Moving tmp dir: hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/_tmp.-ext-10002.intermediate to: hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/-ext-10002
Stage-4 is filtered out by condition resolver.
13/10/26 07:54:30 INFO exec.Task: Stage-4 is filtered out by condition resolver.
Stage-3 is selected by condition resolver.
13/10/26 07:54:30 INFO exec.Task: Stage-3 is selected by condition resolver.
Stage-5 is filtered out by condition resolver.
13/10/26 07:54:30 INFO exec.Task: Stage-5 is filtered out by condition resolver.
Launching Job 3 out of 3
13/10/26 07:54:30 INFO ql.Driver: Launching Job 3 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
13/10/26 07:54:30 INFO exec.Task: Number of reduce tasks is set to 0 since there's no reduce operator
13/10/26 07:54:30 INFO exec.ExecDriver: Using org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
13/10/26 07:54:30 INFO exec.ExecDriver: adding libjars: file:///usr/lib//hcatalog/share/hcatalog/hcatalog-core.jar,file:///usr/lib/hive/lib/json-serde-1.1.4-jar-with-dependencies.jar
13/10/26 07:54:30 INFO exec.ExecDriver: Processing alias hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/-ext-10002
13/10/26 07:54:30 INFO exec.ExecDriver: Adding input file hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/-ext-10002
13/10/26 07:54:30 INFO exec.Utilities: Content Summary not cached for hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/-ext-10002
13/10/26 07:54:30 INFO exec.ExecDriver: Making Temp Directory: hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/-ext-10001
13/10/26 07:54:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/10/26 07:54:30 INFO mapred.FileInputFormat: Total input paths to process : 7
13/10/26 07:54:30 INFO io.CombineHiveInputFormat: number of splits 1
Starting Job = job_201310250853_0025, Tracking URL = http://sandbox:50030/jobdetails.jsp?jobid=job_201310250853_0025
13/10/26 07:54:31 INFO exec.Task: Starting Job = job_201310250853_0025, Tracking URL = http://sandbox:50030/jobdetails.jsp?jobid=job_201310250853_0025
Kill Command = /usr/lib/hadoop/libexec/../bin/hadoop job  -kill job_201310250853_0025
13/10/26 07:54:31 INFO exec.Task: Kill Command = /usr/lib/hadoop/libexec/../bin/hadoop job  -kill job_201310250853_0025
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
13/10/26 07:54:39 INFO exec.Task: Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2013-10-26 07:54:39,392 Stage-3 map = 0%,  reduce = 0%
13/10/26 07:54:39 INFO exec.Task: 2013-10-26 07:54:39,392 Stage-3 map = 0%,  reduce = 0%
2013-10-26 07:54:48,505 Stage-3 map = 87%,  reduce = 0%
13/10/26 07:54:48 INFO exec.Task: 2013-10-26 07:54:48,505 Stage-3 map = 87%,  reduce = 0%
2013-10-26 07:54:49,510 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 6.95 sec
13/10/26 07:54:49 INFO exec.Task: 2013-10-26 07:54:49,510 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 6.95 sec
2013-10-26 07:54:50,517 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 6.95 sec
13/10/26 07:54:50 INFO exec.Task: 2013-10-26 07:54:50,517 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 6.95 sec
2013-10-26 07:54:51,525 Stage-3 map = 100%,  reduce = 100%, Cumulative CPU 6.95 sec
13/10/26 07:54:51 INFO exec.Task: 2013-10-26 07:54:51,525 Stage-3 map = 100%,  reduce = 100%, Cumulative CPU 6.95 sec
MapReduce Total cumulative CPU time: 6 seconds 950 msec
13/10/26 07:54:51 INFO exec.Task: MapReduce Total cumulative CPU time: 6 seconds 950 msec
Ended Job = job_201310250853_0025
13/10/26 07:54:51 INFO exec.Task: Ended Job = job_201310250853_0025
13/10/26 07:54:51 INFO exec.FileSinkOperator: Moving tmp dir: hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/_tmp.-ext-10001 to: hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/_tmp.-ext-10001.intermediate
13/10/26 07:54:51 INFO exec.FileSinkOperator: Moving tmp dir: hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/_tmp.-ext-10001.intermediate to: hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/-ext-10001
Moving data to: hdfs://sandbox:8020/apps/hive/warehouse/tweets_id_sample_ui
13/10/26 07:54:51 INFO exec.Task: Moving data to: hdfs://sandbox:8020/apps/hive/warehouse/tweets_id_sample_ui from hdfs://sandbox:8020/tmp/hive-beeswax-hue/hive_2013-10-26_07-51-30_924_8805518057234020615/-ext-10001
13/10/26 07:54:51 INFO exec.DDLTask: Default to LazySimpleSerDe for table tweets_id_sample_ui
13/10/26 07:54:51 INFO hive.metastore: Trying to connect to metastore with URI thrift://sandbox:9083
13/10/26 07:54:51 INFO hive.metastore: Waiting 1 seconds before next connection attempt.
13/10/26 07:54:52 INFO hive.metastore: Connected to metastore.
13/10/26 07:54:53 INFO exec.StatsTask: Executing stats task
Table default.tweets_id_sample_ui stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 10972500, raw_data_size: 0]
13/10/26 07:54:54 INFO exec.Task: Table default.tweets_id_sample_ui stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 10972500, raw_data_size: 0]
13/10/26 07:54:54 INFO ql.Driver: </PERFLOG method=Driver.execute start=1382799091328 end=1382799294689 duration=203361>
MapReduce Jobs Launched: 
13/10/26 07:54:54 INFO ql.Driver: MapReduce Jobs Launched: 
Job 0: Map: 7   Cumulative CPU: 141.58 sec   HDFS Read: 1762842930 HDFS Write: 10972500 SUCCESS
13/10/26 07:54:54 INFO ql.Driver: Job 0: Map: 7   Cumulative CPU: 141.58 sec   HDFS Read: 1762842930 HDFS Write: 10972500 SUCCESS
Job 1: Map: 1   Cumulative CPU: 6.95 sec   HDFS Read: 10973519 HDFS Write: 10972500 SUCCESS
13/10/26 07:54:54 INFO ql.Driver: Job 1: Map: 1   Cumulative CPU: 6.95 sec   HDFS Read: 10973519 HDFS Write: 10972500 SUCCESS
Total MapReduce CPU Time Spent: 2 minutes 28 seconds 530 msec
13/10/26 07:54:54 INFO ql.Driver: Total MapReduce CPU Time Spent: 2 minutes 28 seconds 530 msec
OK
13/10/26 07:54:54 INFO ql.Driver: OK
13/10/26 07:54:56 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
13/10/26 07:54:56 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.

Here is what does work with my Hive CLI:
The query above also works if a view is created instead of a table.
Empty tables can be created.
Tables can be created from HDFS files (for example, the tweets_sample table seen in the first code block was created from an HDFS file).
Here is the query that was executed via the Hive CLI to create the tweets table:

CREATE EXTERNAL TABLE tweets_sample (
   id BIGINT,
   created_at STRING,
   source STRING,
   favorited BOOLEAN,
   retweet_count INT,
   retweeted_status STRUCT<
      text:STRING,
      user:STRUCT<screen_name:STRING,name:STRING>>,
   entities STRUCT<
      urls:ARRAY<STRUCT<expanded_url:STRING>>,
      user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
      hashtags:ARRAY<STRUCT<text:STRING>>>,
   text STRING,
   user STRUCT<
      screen_name:STRING,
      name:STRING,
      friends_count:INT,
      followers_count:INT,
      statuses_count:INT,
      verified:BOOLEAN,
      utc_offset:STRING, -- was INT but nulls are strings
      time_zone:STRING>,
   in_reply_to_screen_name STRING,
   year int,
   month int,
   day int,
   hour int
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/data/oct25_tweets'
;

At the moment I am stuck on how to solve this.
Other notes:
The environment I am working in is:
Hortonworks Sandbox v1.3 on Oracle VM VirtualBox
I am following Hortonworks tutorial #13
Hive Beeswax queries are executed through the Hue UI as user "hue"
Hive CLI queries are executed as user "root" (also tested as user "hue")


5kgi1eie — answer 1

Solution:
This can be solved by having Hive add the jar to its classpath, from within the Hive CLI, as follows:

hive> ADD JAR [path to JSON SerDe jar file];

For example:

hive> ADD JAR /usr/lib/hive/lib/json-serde-1.1.4-jar-with-dependencies.jar;

Hive confirms the addition by returning the following statements:

Added /usr/lib/hive/lib/json-serde-1.1.4-jar-with-dependencies.jar to class path
Added resource: /usr/lib/hive/lib/json-serde-1.1.4-jar-with-dependencies.jar
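
To double-check what the current session has registered, Hive's LIST JARS command prints the resources added so far. The transcript below is what I would expect on this setup, not output from the original post, and the exact formatting varies between Hive versions:

hive> LIST JARS;
/usr/lib/hive/lib/json-serde-1.1.4-jar-with-dependencies.jar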

The above must be executed at the start of every Hive session.
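If re-typing ADD JAR every session is a nuisance, the step can be automated. One option (a sketch, assuming the jar path from above, and relying on the Hive CLI reading a .hiverc file from the user's home directory at startup) is:

```shell
# Register the JSON SerDe jar in every new Hive CLI session by appending
# the ADD JAR statement to ~/.hiverc, which the Hive CLI runs at startup.
echo 'ADD JAR /usr/lib/hive/lib/json-serde-1.1.4-jar-with-dependencies.jar;' >> ~/.hiverc
```

Alternatively, the hive.aux.jars.path property in hive-site.xml can point at the jar so it lands on the classpath for every session; both are configuration changes, so check the documentation for your Hive version.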
Explanation:
The query from the original question produces the error because of its SELECT ... FROM clause. The same error occurs if the following query is submitted on its own to the Hive CLI:

SELECT
   id
FROM tweets_sample;

The rows of the source table tweets_sample are stored in a JSON SerDe format, as can be seen from the query at the end of the question that created tweets_sample:

CREATE EXTERNAL TABLE tweets_sample (
   id BIGINT,
   ...
   hour int
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/data/oct25_tweets';

By default, Hive does not know how to parse rows in this format or extract columns from them. Note that even before the JSON SerDe jar is added, the following query does work:

SELECT *
FROM tweets_sample;

This query works because Hive does not need to extract elements from specific columns of each row, and therefore does not need to know the row format.
By registering the JSON SerDe jar, as shown in the solution above, before executing any query that depends on the JSON SerDe format, Hive knows how to execute such queries.
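
As a general diagnostic (standard HiveQL, not from the original post), DESCRIBE FORMATTED shows which SerDe a table was defined with, so you can tell in advance whether a query against it will need an extra jar. Output abridged; the exact layout varies by Hive version:

hive> DESCRIBE FORMATTED tweets_sample;
...
SerDe Library:          org.openx.data.jsonserde.JsonSerDe
...

Any SerDe class that is not on Hive's classpath will trigger the same ClassNotFoundException pattern during map-task initialization.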
