Describe the Bug
The documentation for the Python communication API `alltoall_single` states that the length of `in_split_sizes` only needs to be divisible by `world_size`. However, the backend implementation only supports the case `len(in_split_sizes) == world_size`. This is inconsistent with the documentation, and it also fails to align with the corresponding torch API.
The corresponding backend code:
https://github.com/PaddlePaddle/Paddle/blob/incubate/new_frl/paddle/fluid/distributed/collective/process_group_nccl.cc#L250
Runtime error:
----------------------
Error Message Summary:
----------------------
InvalidArgumentError: The length of size_on_each_rank must be equal to world_size.
[Hint: Expected length_size_on_each_rank == world_size, but received length_size_on_each_rank:16 != world_size:8.] (at /work/Paddle/paddle/fluid/distributed/collective/process_group_nccl.cc:222)
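For context, here is a minimal repro sketch that matches the numbers in the error message (16 splits on 8 ranks, so the documented divisibility contract is satisfied). The argument order of `alltoall_single` differs across Paddle versions; this sketch assumes the 2.x form where `in_tensor` comes first, and the tensor shapes are illustrative only:

```python
# Launch on 8 ranks, e.g.:
#   python -m paddle.distributed.launch --nproc_per_node 8 repro.py
import paddle
import paddle.distributed as dist

dist.init_parallel_env()
world_size = dist.get_world_size()  # assumed to be 8, matching the error message

# 16 splits on 8 ranks: 16 % 8 == 0, so the documented contract
# ("length divisible by world_size") is satisfied, but the NCCL backend
# rejects it because it requires len(in_split_sizes) == world_size.
in_split_sizes = [1] * (2 * world_size)
out_split_sizes = [1] * (2 * world_size)

in_tensor = paddle.ones([2 * world_size, 4], dtype='float32')
out_tensor = paddle.empty([2 * world_size, 4], dtype='float32')

# Expected to raise:
#   InvalidArgumentError: The length of size_on_each_rank must be equal to world_size.
dist.alltoall_single(in_tensor, out_tensor, in_split_sizes, out_split_sizes)
```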
Additional Supplementary Information
No response
1 Answer
@LiYuRio Could you please take a look at this API issue? Thanks.