DataStax Hive query performance issue

Posted by osh3o9ms on 2021-06-26 in Hive

I have noticed a performance problem when running queries through Spark/Hive on DSE.
Table schema

CREATE TABLE tests(
    id text,
    user text,
    aname text,
    iname text,
    gentime bigint,
    snapdate bigint,
    action text,
    acount bigint,
    line bigint,
    PRIMARY KEY ((id, user), aname, iname, gentime, snapdate, action)
) WITH CLUSTERING ORDER BY (aname ASC, iname ASC, gentime ASC, snapdate ASC, action ASC)

Query

select * 
from tests 
where (id='a37') and (aname='ABC') and (iname = 'ABC1') and (user is not null) and (gentime = 1520985600000) 
group by user, snapdate
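
One thing that may be worth checking here: the partition key is (id, user), but the WHERE clause only pins id with an equality. In Cassandra a partition can be located directly only when every partition-key column is restricted, so this query probably cannot be narrowed to a single partition and the job may end up scanning the whole table. A minimal Python sketch of that check follows; the helper name is hypothetical and not part of any DSE or Cassandra API.

```python
# Hypothetical helper for illustration only; not a DSE or Cassandra API.
PARTITION_KEY = ("id", "user")  # from PRIMARY KEY ((id, user), ...)


def unrestricted_partition_columns(partition_key, equality_filters):
    """Return the partition-key columns that have no equality restriction."""
    return [col for col in partition_key if col not in equality_filters]


# Equality predicates copied from the WHERE clause of the query above.
# `user is not null` is not an equality restriction, so `user` is absent.
equality_filters = {
    "id": "a37",
    "aname": "ABC",
    "iname": "ABC1",
    "gentime": 1520985600000,
}

missing = unrestricted_partition_columns(PARTITION_KEY, equality_filters)
print(missing)  # ['user']
```

If the list is non-empty, the predicates on the clustering columns (aname, iname, gentime) likely cannot be used to skip data, which would be consistent with the high cumulative CPU time.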

Problem

2018-03-16 05:44:43,404 Stage-1 map = 100%,  reduce = 32%, Cumulative CPU 4110.32 sec
2018-03-16 05:44:44,407 Stage-1 map = 100%,  reduce = 32%, Cumulative CPU 4110.32 sec
2018-03-16 05:44:45,410 Stage-1 map = 100%,  reduce = 32%, Cumulative CPU 4110.32 sec
2018-03-16 05:44:46,412 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 4110.32 sec
2018-03-16 05:44:47,416 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 4110.32 sec
2018-03-16 05:44:48,420 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 4110.32 sec
2018-03-16 05:44:49,426 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4111.27 sec
2018-03-16 05:44:50,428 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4111.27 sec
2018-03-16 05:44:51,431 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4111.27 sec
2018-03-16 05:44:52,434 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4111.27 sec
2018-03-16 05:44:53,437 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4111.27 sec
MapReduce Total cumulative CPU time: 0 days 1 hours 8 minutes 31 seconds 270 msec
Ended Job = job_201801161541_0002
MapReduce Jobs Launched: 
Job 0: Map: 167  Reduce: 1   Cumulative CPU: 4111.27 sec   HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 days 1 hours 8 minutes 31 seconds 270 msec

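To find out why there are so many mappers and so much CPU time, I started reading about DSE configuration.
So far I have found the vnodes configuration in DSE Cassandra, which may be the underlying problem.
I also checked num_tokens in the production cassandra.yaml, and it is set. I don't know the reasoning behind the configured value.
My questions:
Why are there so many mappers (the log shows Map: 167)? Why is one executor overloaded with multiple tasks?
Do we need to disable vnodes to fix the large number of mappers?
CassandraSQLContext vs. HiveContext: is there any difference in performance?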
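
To judge whether vnodes are a plausible explanation for the mapper count, it may help to compare it with the number of token ranges in the ring. Each token owned by a node starts one range, so the total is the sum of num_tokens over all nodes. The sketch below is a rough Python illustration; the node count and the num_tokens value of 256 are hypothetical, since the post gives neither, and whether the input format creates at least one split per range is an assumption to verify against the DSE documentation.

```python
def total_token_ranges(num_tokens_per_node):
    """One token range starts at each token, so the ring has one range per token."""
    return sum(num_tokens_per_node)


# Hypothetical cluster: three nodes, each with num_tokens: 256 in cassandra.yaml.
# Replace these with the real values from each node's configuration.
ranges = total_token_ranges([256, 256, 256])

observed_mappers = 167  # "Map: 167" from the job summary in the post

print(ranges)                     # 768
print(ranges > observed_mappers)  # True
```

If the real range count is far from 167, the number of mappers is probably driven by something else as well, such as the configured input split size.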
