Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#15221: Post completion messages to dispatch_s #16187

Merged
merged 1 commit into from
Dec 19, 2024
Merged

Conversation

jbaumanTT
Copy link
Contributor

Ticket

#15221

Problem description

We're seeing a latency of around 500ns from when the dispatch message is sent to when it's received, in the case where a many are sent at the same time.

What's changed

We never wait on the acks from these completion messages, so make them posted to avoid contention from a lot of replies being sent at once. In the case where every worker is sending them at the same time, this can halve the latency from 500ns to 250ns (on wormhole).

Checklist

  • Post commit CI passes
  • Blackhole Post commit (if applicable)
  • Model regression CI testing passes (if applicable)
  • Device performance regression CI testing passes (if applicable)
  • (For models and ops writers) Full new models tests passes
  • New/Existing tests provide coverage for changes

We never wait on the acks from these completion messages, so make them posted
to avoid contention from a lot of replies being sent at once. In the case where
every worker is sending them at the same time, this can halve the latency from
500ns to 250ns (on wormhole).
@@ -406,7 +406,8 @@ int main() {
NOC_UNICAST_WRITE_VC,
1,
31 /*wrap*/,
false /*linked*/);
false /*linked*/,
true /*posted*/);
Copy link
Contributor

@tt-asaigal tt-asaigal Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to your change but a potential bug with the non-posted atomic incs previously: Looking at the additional args we're now setting.. I realize that we never correctly configured the return addr of the non-posted atomic inc.

The pre-incremented value would have been written over the NOC at worker address 0. Seems like we were lucky that this address wasn't being actively used?

This is not an issue with your change, since response traffic for the atomic inc won't be generated anymore.

Edit: Ah this issue is only with DYNAMIC_NOC. I guess this feature was never enabled.

@jbaumanTT jbaumanTT merged commit 3f54011 into main Dec 19, 2024
187 of 189 checks passed
@jbaumanTT jbaumanTT deleted the jbauman/postnotify branch December 19, 2024 18:53
@ubcheema
Copy link
Contributor

I am assuming this is just speeding up the noc itself.
I don't see any atomic barrier, so fw/kernel does not seem to be held up due to an inflight response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants