I have installed CDH 5.6 and Spark on my virtual machine.
When I submit any query in Hive with the mr
execution engine it runs fine, but when I switch the engine to spark,
the log shows the job being submitted and then it loops forever like this:
hive> set hive.execution.engine=spark;
hive> create table landing.tmp as select * from landing.employee;
Query ID = root_20171203165454_64ca3a41-25af-4cd5-a2c1-9b7c9e8f49cd
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Spark Job = cf7a1eff-7ba6-48f5-90dd-f5de3794c36e
Query Hive on Spark job[0] stages:
0
Status: Running (Hive on Spark job[0])
Job Progress Format
CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount [StageCost]
2017-12-03 16:55:27,468 Stage-0_0: 0/1
2017-12-03 16:55:30,528 Stage-0_0: 0/1
2017-12-03 16:55:33,585 Stage-0_0: 0/1
2017-12-03 16:55:37,557 Stage-0_0: 0/1
2017-12-03 16:55:40,617 Stage-0_0: 0/1
2017-12-03 16:55:43,683 Stage-0_0: 0/1
2017-12-03 16:55:46,764 Stage-0_0: 0/1
2017-12-03 16:55:49,822 Stage-0_0: 0/1
2017-12-03 16:55:52,900 Stage-0_0: 0/1
2017-12-03 16:55:55,945 Stage-0_0: 0/1
2017-12-03 16:55:58,999 Stage-0_0: 0/1
2017-12-03 16:56:02,077 Stage-0_0: 0/1
2017-12-03 16:56:05,134 Stage-0_0: 0/1
2017-12-03 16:56:08,196 Stage-0_0: 0/1
2017-12-03 16:56:11,238 Stage-0_0: 0/1
2017-12-03 16:56:14,280 Stage-0_0: 0/1
2017-12-03 16:56:17,345 Stage-0_0: 0/1
2017-12-03 16:56:20,380 Stage-0_0: 0/1
2017-12-03 16:56:23,405 Stage-0_0: 0/1
2017-12-03 16:56:26,464 Stage-0_0: 0/1
2017-12-03 16:56:29,534 Stage-0_0: 0/1
2017-12-03 16:56:32,598 Stage-0_0: 0/1
2017-12-03 16:56:35,627 Stage-0_0: 0/1
2017-12-03 16:56:38,661 Stage-0_0: 0/1
2017-12-03 16:56:41,718 Stage-0_0: 0/1
....
...
I have already added the spark-assembly xx*… jar under the /usr/lib/hive/lib
path.
I have also added the following properties to hive-site.xml
:
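For reference, a quick way to confirm that the jar actually ended up on Hive's classpath (the path here is taken from the post; the check itself is just an illustrative sketch):

```shell
# Hypothetical sanity check: list any Spark jars visible in Hive's lib dir.
# If nothing matches, Hive on Spark cannot load the Spark classes it needs.
ls /usr/lib/hive/lib/ 2>/dev/null | grep -i spark \
  || echo "no spark jar found in /usr/lib/hive/lib"
```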
<property>
<name>spark.master</name>
<value>spark://192.168.190.128:7077</value>
</property>
<property>
<name>spark.home</name>
<value>/usr/lib/spark</value>
</property>
<property>
<name>spark.eventLog.enabled</name>
<value>true</value>
</property>
<property>
<name>spark.eventLog.dir</name>
<value>/usr/lib/hive/spark_log</value>
</property>
<property>
<name>spark.executor.memory</name>
<value>512m</value>
</property>
<property>
<name>spark.serializer</name>
<value>org.apache.spark.serializer.KryoSerializer</value>
</property>
<property>
<name>spark.executor.cores</name>
<value>3</value>
</property>
<property>
<name>spark.driver.memory</name>
<value>1024m</value>
</property>
I also created the directory for the property <name>spark.eventLog.dir</name>
as given by its value <value>/usr/lib/hive/spark_log</value>
.
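A sketch of that directory setup, assuming the Hive CLI runs as root as in the session above; an event-log directory the submitting user cannot write to is one possible cause of a silently stalled job:

```shell
# Hypothetical: create the local event-log directory from spark.eventLog.dir
# and verify the current user can write to it.
mkdir -p /usr/lib/hive/spark_log 2>/dev/null \
  || echo "could not create /usr/lib/hive/spark_log (are you root?)"
if [ -w /usr/lib/hive/spark_log ]; then
  echo "spark_log directory is writable"
else
  echo "spark_log directory is missing or not writable"
fi
```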
I even tried setting spark.master
to spark://localhost.localdomain:7077
.
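One thing worth ruling out is basic connectivity to the standalone master, and whether any worker has registered with enough resources: a job that sits at 0/1 forever often means the application was granted no executors, for example because no worker can satisfy the requested spark.executor.cores=3 with 512m on a small VM. The probe below is only a sketch, assuming the master address from the hive-site.xml above:

```shell
# Hypothetical connectivity probe for the standalone Spark master port.
# Uses bash's /dev/tcp redirection so it needs no extra tools;
# adjust host/port to match your spark.master value.
MASTER_HOST=192.168.190.128
MASTER_PORT=7077
if timeout 2 bash -c "echo > /dev/tcp/${MASTER_HOST}/${MASTER_PORT}" 2>/dev/null; then
  echo "spark master port is reachable"
else
  echo "spark master port is NOT reachable - is the master running?"
fi
```

If the port is open, the master's web UI (port 8080 by default in standalone mode) should also show at least one ALIVE worker whose free cores and memory cover the executor settings requested in hive-site.xml.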
Am I missing something?