#14406: Add perf test for reduce scatter #14838

Aswinmcw · 2024-11-07T11:22:23Z

Ticket

#14406

What's changed

Adds perf tests for reduce scatter: T3k ring and line, N300 ring and line.

Checklist

Post commit CI passes
Blackhole Post commit (if applicable)
Model regression CI testing passes (if applicable)
Device performance regression CI testing passes (if applicable)
New/Existing tests provide coverage for changes

SeanNijjar · 2024-11-07T15:48:41Z

Looks good overall, but two things look potentially off. I'm not sure which data format is being used for each test (can you please add that?), and I think the op BW equation is bugged. I'll revisit it today and get back to you with the right equation, because op BW should always be >= link BW and right now it is much lower.

SeanNijjar · 2024-11-08T03:00:49Z

I double-checked the equation in the issue for ring reduce scatter op bandwidth, and it was incorrect. I corrected it to input_tensor_volume / longest_device_fw_time for the "per chip" op bandwidth. Total op BW is a little squishy for the full cluster. That may eventually end up being a more useful measurement, but the per-chip number is useful to track right now (perhaps a later iteration of this work can also express cluster-level op BW).

SeanNijjar

Approving now pending updates to BW calculations:

For ring reduce scatter, op BW should match line reduce scatter: input_tensor_volume / longest_device_fw_time

Also, I realized the reduce scatter eth BW is incorrect. Since we send/receive the full input tensor volume through each chip's ethernet, we should just do input_tensor_volume / longest_erisc_fw_time
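The two formulas from this review can be sketched as below. This is only an illustration, not the code merged in this PR: the function names, the choice of bytes and nanoseconds as units, and the sample timings are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def bandwidth_gb_per_s(volume_bytes: int, duration_ns: float) -> float:
    """Bytes per nanosecond is numerically equal to GB/s (1e9 bytes/s)."""
    if duration_ns <= 0:
        raise ValueError("duration_ns must be positive")
    return volume_bytes / duration_ns


def reduce_scatter_bandwidths(
    input_tensor_volume_bytes: int,
    device_fw_times_ns: Sequence[float],
    erisc_fw_times_ns: Sequence[float],
) -> Tuple[float, float]:
    """Return (per-chip op BW, eth BW) in GB/s.

    Hypothetical helper: both values divide the full input tensor volume
    by the slowest measured duration, as described in the review comments.
    """
    # Per-chip op BW: input_tensor_volume / longest_device_fw_time
    longest_device_fw_time = max(device_fw_times_ns)
    # Eth BW: input_tensor_volume / longest_erisc_fw_time
    longest_erisc_fw_time = max(erisc_fw_times_ns)
    op_bw = bandwidth_gb_per_s(input_tensor_volume_bytes, longest_device_fw_time)
    eth_bw = bandwidth_gb_per_s(input_tensor_volume_bytes, longest_erisc_fw_time)
    return op_bw, eth_bw


# Made-up sample values, for illustration only.
op_bw, eth_bw = reduce_scatter_bandwidths(
    input_tensor_volume_bytes=1_000_000,
    device_fw_times_ns=[50_000, 40_000],
    erisc_fw_times_ns=[25_000, 20_000],
)
```

The input tensor volume in bytes depends on the data format of each test (element count times bytes per element), which is one reason to record the data format alongside the results.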

* tenstorrent#14406: Add perf test for reduce scatter
* tenstorrent#14406: Add perf test for N300 reduce scatter
* tenstorrent#14406: Fix BW computation

Aswinmcw requested a review from SeanNijjar November 7, 2024 11:59

Aswinmcw assigned SeanNijjar and Aswinmcw Nov 7, 2024

SeanNijjar approved these changes Nov 8, 2024

Aswinmcw added 2 commits November 8, 2024 03:52

#14406: Add perf test for reduce scatter

e460d7b

#14406: Add perf test for N300 reduce scatter

3ba4c54

Aswinmcw force-pushed the Aswinmcw/ccl_reduce_scatter_perf branch from a263609 to c43d160 Compare November 8, 2024 06:06

#14406: Fix BW computation

fefe768

Aswinmcw force-pushed the Aswinmcw/ccl_reduce_scatter_perf branch from c43d160 to fefe768 Compare November 8, 2024 06:07

Aswinmcw marked this pull request as ready for review November 8, 2024 07:14

Aswinmcw requested a review from jvegaTT as a code owner November 8, 2024 07:14

Aswinmcw merged commit bdf1f06 into main Nov 8, 2024
134 of 137 checks passed

Aswinmcw deleted the Aswinmcw/ccl_reduce_scatter_perf branch November 8, 2024 07:14
