Return if no data to allreduce
Summary: When the input tensor is empty, return immediately. Otherwise the thread count is computed as 0 and the CUDA kernel fails to launch.

Reviewed By: feikou

Differential Revision: D68318641
xw285cornell authored and facebook-github-bot committed Jan 17, 2025
1 parent 21d1260 commit eb1e962
Showing 2 changed files with 5 additions and 1 deletion.
4 changes: 4 additions & 0 deletions fbgemm_gpu/experimental/gen_ai/src/comm/car.cu
@@ -480,6 +480,10 @@ void one_shot_car_allreduce(
   TORCH_CHECK(y.numel() % 8 == 0);
   TORCH_CHECK(y.numel() < kMaxCAR);
   const auto N = y.numel();
+  if (N == 0) {
+    // no data to allreduce, return
+    return;
+  }
   if (z) {
     TORCH_CHECK(z->numel() == y.numel());
   }
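
Why the guard is needed can be sketched in a few lines. Note that `N == 0` passes both `TORCH_CHECK`s (0 is a multiple of 8 and is below `kMaxCAR`), so without the early return an empty tensor reaches the kernel launch. The launch-size arithmetic below is an assumption for illustration only: the kernel's real configuration is not shown in this diff, and `launch_config`, `elems_per_thread`, and `max_threads_per_block` are hypothetical names.

```python
def launch_config(numel, elems_per_thread=8, max_threads_per_block=1024):
    """Hypothetical launch-size calculation (not the actual car.cu logic)."""
    work_items = numel // elems_per_thread           # one thread per 8 elements
    threads = min(work_items, max_threads_per_block)
    # Ceiling division for the block count; 0 threads means 0 blocks.
    blocks = 0 if threads == 0 else -(-work_items // threads)
    return blocks, threads


def guarded_allreduce_config(numel):
    """Mirror of the early return: skip the launch when there is no data."""
    if numel == 0:
        return None                                  # nothing to allreduce
    return launch_config(numel)


print(launch_config(0))             # an empty input yields a zero-sized launch
print(guarded_allreduce_config(0))  # the guard avoids computing a launch at all
print(guarded_allreduce_config(16))
```

CUDA rejects a launch whose grid or block dimension is 0 as an invalid configuration, which is why returning before the launch is the simplest fix.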
@@ -246,7 +246,7 @@ def _run_oneshot_car_stress_inner(path: str) -> None:
     torch.distributed.barrier()
 
     ITER = 1000
-    for idx, N in enumerate(np.logspace(4, 24, num=20, base=2).tolist()):
+    for idx, N in enumerate([0] + np.logspace(4, 24, num=20, base=2).tolist()):
         N = int(N)
 
         def round_up(a: int, b: int) -> int:
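
The sizes covered by the updated loop can be sketched without NumPy as follows. This is an approximation for illustration: the exponent formula stands in for `np.logspace(4, 24, num=20, base=2)`, and the `round_up` body plus the choice to round each size to a multiple of 8 are assumptions, since the rest of the test is not shown in this hunk.

```python
def round_up(a: int, b: int) -> int:
    """Assumed helper body: round a up to the next multiple of b."""
    return ((a + b - 1) // b) * b


def sweep_sizes(num: int = 20, lo_exp: int = 4, hi_exp: int = 24) -> list:
    """Sizes visited by the stress loop: 0 first, then 2**4 .. 2**24."""
    span = hi_exp - lo_exp
    # Log-spaced values between 2**lo_exp and 2**hi_exp, endpoints included.
    raw = [2.0 ** (lo_exp + span * i / (num - 1)) for i in range(num)]
    # Prepend the empty case, truncate to int as the test does, and round to
    # a multiple of 8 (assumed, to satisfy `y.numel() % 8 == 0`).
    return [round_up(int(n), 8) for n in [0] + raw]


sizes = sweep_sizes()
print(len(sizes), sizes[0], sizes[1], sizes[-1])
```

Putting 0 at the front of the list means the empty-tensor path is exercised on the first iteration, before any larger size runs.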