incubator-doris [Bug] BE process exits abnormally when parallel SQL statements push BE memory usage to the mem_limit ceiling

Posted by voase2hg on 2022-04-22 in Java

Search before asking

  • I had searched in the issues and found no similar issues.

Version

0.14.0

What's Wrong?

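1. Run several hash SQL statements in parallel to push BE memory usage up to the mem_limit ceiling.
The SQL statements then return an error like the following: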
ERROR 1064 (HY000): errCode = 2, detailMessage = Could not allocate aggregate expression intermediate value ExecNode Exprs could not allocate 16.00 B without exceeding limit. Error occurred on backend 192.168.255.203 by fragment fe17cbf462934c66-9f146e738ac954fa Memory left in process limit: -39723616.00 B ExecEnv root: memory limit exceeded. Limit=3.00 GB Total=3.04 GB Peak=3.34 GB RuntimeState: query Fragment deab64ae21664eda-b3567d2268c5ac1f: Limit=3.00 GB Reservation=0 ReservationLimit=3.00 GB OtherMemory=8.00 KB Total=
2. Then, tracing the BE process with gdb shows that the abnormal exit occurs in incubator-doris-0.14.0/be/src/runtime/mem_tracker.cpp, at this line:
int64_t limit = reservation_counters->reservation_limit->value();

gdb trace:
#0 0x000000000114e38a in doris::MemTracker::LogUsage (this=this@entry=0x17e139c80, max_recursive_depth=max_recursive_depth@entry=1, prefix=..., logged_consumption=logged_consumption@entry=0x7f21178744a0) at /*/incubator-doris-0.14.0/be/src/runtime/mem_tracker.cpp:346
#1 0x00000000011513f1 in doris::MemTracker::LogUsage (max_recursive_depth=max_recursive_depth@entry=1, prefix=..., trackers=..., logged_consumption=logged_consumption@entry=0x7f21178746f8) at /*/incubator-doris-0.14.0/be/src/runtime/mem_tracker.cpp:404
#2 0x000000000114f54a in doris::MemTracker::LogUsage (this=this@entry=0x64c1980, max_recursive_depth=max_recursive_depth@entry=2, prefix=..., logged_consumption=logged_consumption@entry=0x0) at /*/incubator-doris-0.14.0/be/src/runtime/mem_tracker.cpp:376
#3 0x0000000001152a30 in doris::MemTracker::MemLimitExceeded (mtracker=<optimized out>, state=state@entry=0x1eed83100, details=..., failed_allocation_size=failed_allocation_size@entry=21632) at /*/incubator-doris-0.14.0/be/src/runtime/mem_tracker.cpp:497
#4 0x0000000001715628 in doris::MemTracker::MemLimitExceeded (failed_allocation=21632, details=..., state=0x1eed83100, this=<optimized out>) at /*/incubator-doris-0.14.0/be/src/runtime/mem_tracker.h:369
#5 doris::PartitionedHashTableCtx::ExprValuesCache::Init (this=this@entry=0x17f1c1c20, state=state@entry=0x1eed83100, tracker=..., build_exprs=...) at /*/incubator-doris-0.14.0/be/src/exec/partitioned_hash_table.cc:323
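
Frame #0 stops inside LogUsage, on the line that reads reservation_limit. One plausible reading, which this report does not confirm, is that a counter pointer is null for some tracker when the usage line is being built. The sketch below only illustrates the general guard pattern: Counter, ReservationCounters and format_reservation are hypothetical stand-ins written for this example and are not the Doris classes or a proposed patch.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-ins for illustration only; NOT the Doris classes.
struct Counter {
    int64_t v;
    int64_t value() const { return v; }
};

struct ReservationCounters {
    const Counter* peak_reservation = nullptr;   // may be unset
    const Counter* reservation_limit = nullptr;  // may be unset
};

// Formats the reservation part of a usage line. Every pointer is checked
// before it is dereferenced, so a missing counter shortens the output
// instead of crashing the process.
std::string format_reservation(const ReservationCounters* counters) {
    std::ostringstream ss;
    if (counters == nullptr) return ss.str();
    if (counters->peak_reservation != nullptr) {
        ss << " Reservation=" << counters->peak_reservation->value();
    }
    if (counters->reservation_limit != nullptr) {
        ss << " ReservationLimit=" << counters->reservation_limit->value();
    }
    return ss.str();
}

// Example checks: missing counters produce an empty or partial line.
void example_checks() {
    assert(format_reservation(nullptr).empty());
    ReservationCounters rc;
    assert(format_reservation(&rc).empty());
    Counter peak{0};
    rc.peak_reservation = &peak;
    assert(format_reservation(&rc) == " Reservation=0");
}
```

Whether the real crash is a null counter, a dangling tracker, or something else would need to be checked against the 0.14.0 source and a core dump.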

What You Expected?

When BE process memory reaches the limit, the statement should return an error without affecting the stability of the BE process.

How to Reproduce?

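1. Set up a cluster with 3 FEs and 3 BEs.
2. Create a partitioned table and load roughly 50 million rows.
3. The queries are essentially hash queries over a few partitions.
4. Lower the BE mem_limit so the problem is easier to reproduce; on my test cluster the BE mem_limit is set to 3G:
mem_limit = 3G
5. Keep running 10 hash SQL statements in parallel; the BE process exits very quickly.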
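
Step 5 can be driven by a small load generator. The following is a minimal sketch, assuming a caller-supplied query callable that submits one hash SQL statement and returns true on success; RunStats, run_parallel and that callable are hypothetical names for this example, and the actual SQL and client connection are not shown in the report.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Totals collected across all worker threads.
struct RunStats {
    int ok;
    int failed;
};

// Starts `workers` threads; each calls `query` `rounds` times.
// `query` must be safe to call concurrently and returns true on success.
RunStats run_parallel(int workers, int rounds,
                      const std::function<bool()>& query) {
    assert(workers > 0 && rounds > 0);
    std::atomic<int> ok{0};
    std::atomic<int> failed{0};
    std::vector<std::thread> threads;
    threads.reserve(static_cast<std::size_t>(workers));
    for (int w = 0; w < workers; ++w) {
        threads.emplace_back([&ok, &failed, &query, rounds] {
            for (int r = 0; r < rounds; ++r) {
                if (query()) {
                    ++ok;
                } else {
                    ++failed;
                }
            }
        });
    }
    // Wait for every worker before reading the totals.
    for (auto& t : threads) t.join();
    return RunStats{ok.load(), failed.load()};
}
```

With workers set to 10, a rising failed count would correspond to the ERROR 1064 responses described above, while connection errors from every worker at once would suggest the BE process has exited.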

Anything Else?

None.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

hvvq6cgz #1

We had the same problem.
Has the problem been fixed?
