How to control the number of containers in Hive on Tez

Posted by iklwldmw on 2021-06-24 in Hive

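I'm new to the Tez engine. I'm running Hive queries on Tez, and a query seems to use all of the available resources. Is there a way to control the number of containers, similar to how Spark lets you control this with the --executor-cores and --num-executors options?
I have searched but could not find anything specific. Also, I don't want to separate workloads by queue (I run this on EMR with the scaling option enabled, and defining scaling across multiple queues would complicate the setup).
Update 1: vertex information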

VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
------------------------------------------------------------------------------------
Map 1    container       RUNNING     17          0       11        6       0       0
------------------------------------------------------------------------------------

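The query above launches a single vertex in which 11 tasks run in parallel (using all 11 available resources in the cluster). I want to limit the number of tasks that run concurrently within a vertex (in this case, from 11 down to 3).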

Hive yarn amazon-emr amazon-web-services apache-tez

Source: https://stackoverflow.com/questions/63233839/how-to-control-number-of-container-in-hive-on-tez

1 Answer

fykwrbwg1#

Settings for queries on small datasets:

set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;
set hive.exec.parallel=true;
set hive.auto.convert.join=true;
set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stats.fetch.column.stats=true;
set hive.stats.fetch.partition.stats=true;
set hive.exec.compress.output=true;
set hive.exec.compress.intermediate=true;
set hive.tez.container.size=10240;
set hive.tez.java.opts=-Xmx8192m;
set tez.runtime.io.sort.mb=4096;
set tez.grouping.min-size=16777216;
set tez.grouping.max-size=1073741824; 
set tez.grouping.split-count=8;
set hive.exec.reducers.bytes.per.reducer=256000000;
set hive.exec.reducers.max=10;
set hive.tez.auto.reducer.parallelism = true;
set tez.runtime.unordered.output.buffer.size-mb=1024;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

Settings for queries on larger datasets:

set hive.execution.engine=tez;
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;
set hive.exec.parallel=true;
set hive.auto.convert.join=true;
set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stats.fetch.column.stats=true;
set hive.stats.fetch.partition.stats=true;
set hive.exec.compress.output=true;
set hive.exec.compress.intermediate=true;
set hive.tez.container.size=10240;
set hive.tez.java.opts=-Xmx8192m;
set tez.runtime.io.sort.mb=4096;
set tez.runtime.unordered.output.buffer.size-mb=1024;
set tez.grouping.min-size=1073741824;
set tez.grouping.max-size=1073741824;
set tez.grouping.split-count=16;
set hive.exec.reducers.bytes.per.reducer=512000000;
set hive.exec.reducers.max=10;
set hive.tez.auto.reducer.parallelism = true;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

Note: Depending on your Hive or Tez version and your platform permissions, some of the settings listed above may not be supported.
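
Of the settings listed above, only a few influence how many tasks (and therefore containers) a query asks for. As a minimal sketch, assuming the single map vertex shown in the question, they could be isolated as follows. The values are illustrative assumptions rather than tested recommendations, and these settings shape the number of tasks per vertex; they do not impose a hard cap on concurrently running containers.

```sql
-- Illustrative values only; tune them for your data and cluster.

-- Map side: Tez groups input splits into tasks. Larger groups mean
-- fewer map tasks, and so fewer containers requested at once.
set tez.grouping.min-size=268435456;    -- 256 MB lower bound per grouped split
set tez.grouping.max-size=1073741824;   -- 1 GB upper bound per grouped split
set tez.grouping.split-count=3;         -- desired number of grouped splits

-- Reduce side: upper bound on the number of reducers Hive plans.
set hive.exec.reducers.max=3;

-- Memory requested per container, in MB. Bigger containers mean
-- fewer of them fit into the cluster's memory at the same time.
set hive.tez.container.size=10240;
```

Fewer tasks also means each task processes more data, so it is worth re-running the query and checking the TOTAL and RUNNING columns of the vertex table to confirm the effect on your Hive and Tez versions.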
