#13690: Allow buffers to be allocated when a trace is live on device #13696

Merged
merged 1 commit into from
Oct 11, 2024

tt-asaigal
Contributor

Ticket

#13690

Problem description

  • The trace infrastructure does not allow users to interleave traced and untraced workloads, even if the intermediates/outputs of the untraced workload are fully consumed before a trace is run
  • Untraced workloads allocate memory on device. While a live trace exists, allocations are disabled, so the allocation hits an assert
  • This blocks the vLLM effort, where model writers try to interleave untraced Prefill with traced Decode
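
To make the failure mode concrete, here is a minimal Python sketch of the pre-change policy. `ToyDevice`, `capture_trace`, `allocate`, and `run_prefill_untraced` are hypothetical names invented for illustration and are not the tt-metal API; the only behaviour taken from the description above is that an allocation fails while a trace is live.

```python
class ToyDevice:
    """Hypothetical stand-in for a device; not the tt-metal API."""

    def __init__(self):
        self.live_traces = 0  # traces captured and not yet released
        self.next_addr = 0    # trivial bump-pointer allocator state

    def capture_trace(self):
        self.live_traces += 1

    def allocate(self, size):
        # Pre-change policy (modelled): allocating while a trace is live is fatal.
        if self.live_traces > 0:
            raise RuntimeError("allocation attempted while a trace is live")
        addr = self.next_addr
        self.next_addr += size
        return addr


def run_prefill_untraced(device):
    # An untraced workload allocates its intermediates/outputs on the fly.
    return device.allocate(1024)


device = ToyDevice()
device.capture_trace()            # traced Decode has been captured
try:
    run_prefill_untraced(device)  # untraced Prefill now needs device memory
    prefill_failed = False
except RuntimeError:
    prefill_failed = True
```

In this model the untraced Prefill step can never run once Decode has been captured, which is the interleaving pattern the vLLM work needs.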

What's changed

  • Instead of asserting out, print a warning informing the user that mixing modes can lead to data corruption
  • Allows interleaving of traced and untraced workloads (this is safe as long as untraced outputs/intermediates are fully consumed before a trace is run)
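
The following Python sketch illustrates the new policy and the safety condition under stated assumptions. All names (`ToyDevice`, `capture_trace`, `allocate`, `run_trace`) are hypothetical and not the tt-metal API. The corruption mechanism modelled here, a trace replaying a write to a slot that was freed after capture and later handed to an untraced allocation, is one plausible way the hazard could arise and is assumed for illustration only.

```python
import warnings


class ToyDevice:
    """Hypothetical toy model; names and memory layout are illustrative only."""

    def __init__(self, num_slots=4):
        self.memory = [0] * num_slots
        self.free_slots = list(range(num_slots))
        self.trace_slot = None  # slot the live trace writes to on replay

    def capture_trace(self):
        # Assumption: the trace records a write to a slot that is freed
        # again once capture ends, so the allocator may hand it out later.
        slot = self.free_slots.pop(0)
        self.free_slots.insert(0, slot)
        self.trace_slot = slot

    def allocate(self):
        # Post-change policy (modelled): warn instead of failing.
        if self.trace_slot is not None:
            warnings.warn(
                "allocating while a trace is live can corrupt data",
                RuntimeWarning,
            )
        return self.free_slots.pop(0)

    def run_trace(self):
        self.memory[self.trace_slot] = 99  # replayed write


device = ToyDevice()
device.capture_trace()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out = device.allocate()    # untraced output aliases the trace's slot

device.memory[out] = 7         # untraced Prefill writes its result
consumed = device.memory[out]  # safe: consumed before the trace runs
device.run_trace()
stale = device.memory[out]     # unsafe: read after the trace has run
```

Here `consumed` holds the Prefill result (7), while `stale` holds the value written by the trace replay (99), which is why untraced outputs must be fully consumed before a trace is run.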

Checklist

  • Post commit CI passes
  • Blackhole Post commit (if applicable)
  • Model regression CI testing passes (if applicable)
  • Device performance regression CI testing passes (if applicable)
  • New/Existing tests provide coverage for changes

@cfjchu
Contributor

cfjchu commented Oct 10, 2024

Can we add some unit testing for this feature?

@tt-asaigal
Contributor Author

> Can we add some unit testing for this feature?

Added a basic test for this. It's a fairly low-level feature, so it's an FD test.

  - Instead of asserting out, print a warning informing the user that
    this can lead to data corruption
  - Allows interleaving of traced and untraced workloads (this is safe
    as long as untraced outputs/intermediates are fully consumed before
    a trace is run)
@tt-asaigal tt-asaigal force-pushed the asaigal/buffer_alloc_warning branch from 62b1cfa to 334e28f Compare October 11, 2024 17:30
@tt-asaigal tt-asaigal merged commit f0b2483 into main Oct 11, 2024
6 checks passed
@tt-asaigal tt-asaigal deleted the asaigal/buffer_alloc_warning branch October 11, 2024 17:50
yan-zaretskiy pushed a commit that referenced this pull request Oct 18, 2024
…13696)

ct-clmsn pushed a commit to ct-clmsn/tt-metal that referenced this pull request Nov 12, 2024
… on device (tenstorrent#13696)

3 participants