Commit
Apply Various 1D Fabric Optimizations - Improve Performance by ~500 MB/s for 4k packet size (#18186)

Apply various small optimizations. The transformations and their performance deltas are listed below.

Note that the measurements below were taken with -O3 enabled for the fabric kernel build, even though -Os is used in main. The reason is that -O3 will be enabled later this week (it is currently blocked by some dependencies), so this is the most representative performance delta. Baselining and measuring at -Os would not be representative.

```
Baseline
unicast 112816548 -> 15.43 GB/s
mcast 274540294 -> 12.68 GB/s

# Cache noc addr:
110155221 -> 15.8 GB/s
276839301 -> 12.57 GB/s

## Flatten main loop sender, 1st branch nest:
107584162 unicast -> 16.18 GB/s
269844156 mcast -> 12.9 GB/s

## Flatten receiver last branch nest:
106827158 unicast -> 16.3
267551029 mcast -> 13.0 GB/s

Swapping fwd vs local noc write order to do forwarding write first:
104042988 unicast -> 16.7 GB/s
258379905 mcast -> 13.47 GB/s
```

Note that the cached noc addr change showed a minor perf degradation for mcast, although there is no reason it should cause a slowdown. I did try dropping that commit while keeping the rest of the change sequence and saw a net perf degradation of 1-3%, so I think the cached_noc_addr change was probably perturbing other code indirectly and causing the degradation. When applied as the last commit, there is an improvement.

Update after rebase on top of @tt-aho's latest changes to routing fields in the packet header: the new number for mcast is 13.81 GB/s, up from 13.3 GB/s.
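The first and last measurements in the commit message can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of the commit. It assumes the large integers are elapsed cycle counts for a fixed workload, which the message does not state outright, and the function and variable names are made up for this example. Under that assumption, the speedup computed from cycles should roughly match the speedup computed from the reported GB/s figures.

```python
# Values copied from the commit message: (count, reported GB/s).
# Assumption: the integer is an elapsed cycle count for a fixed amount of data.
MEASUREMENTS = {
    "unicast": {"baseline": (112816548, 15.43), "final": (104042988, 16.7)},
    "mcast": {"baseline": (274540294, 12.68), "final": (258379905, 13.47)},
}


def speedup_from_cycles(baseline_cycles, final_cycles):
    """Fewer cycles for the same work means a proportionally higher rate."""
    return baseline_cycles / final_cycles


def speedup_from_bandwidth(baseline_gbps, final_gbps):
    """Ratio of the reported bandwidth figures."""
    return final_gbps / baseline_gbps


def summarize(measurements=MEASUREMENTS):
    """Return {kind: (speedup_by_cycles, speedup_by_bandwidth)}."""
    summary = {}
    for kind, m in measurements.items():
        (cycles0, gbps0), (cycles1, gbps1) = m["baseline"], m["final"]
        summary[kind] = (
            speedup_from_cycles(cycles0, cycles1),
            speedup_from_bandwidth(gbps0, gbps1),
        )
    return summary


if __name__ == "__main__":
    for kind, (by_cycles, by_bw) in summarize().items():
        print(f"{kind}: {by_cycles:.3f}x by cycles, {by_bw:.3f}x by bandwidth")
```

The two ratios agree closely for both traffic types, at about 8% for unicast and 6% for mcast from baseline to the final change. That agreement is consistent with the reported GB/s values being derived from the cycle counts over a constant transfer size.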
1 parent 4fb909f, commit 42adc10
Showing 2 changed files with 46 additions and 44 deletions.