TTDevice init #359

Merged
merged 2 commits into from
Dec 6, 2024

Conversation

broskoTT
Contributor

@broskoTT broskoTT commented Dec 1, 2024

Issue

Related to #98

Description

Initial TTDevice class. It holds PCIDevice and arch implementation.
Gradually, I'd like to move all branches on arch type from cluster and pci_device to tt_device, so that TTDevice becomes the only class in the stack that offers a different implementation for each arch. This follows the design in: https://docs.google.com/drawings/d/1-m1azdsBqMA0A6ATYRMfkhyeuOJuGCEI62N5a96LXj0/edit

List of the changes

  • Created the TTDevice class.
  • Created arch-specific TTDevice classes, which currently don't hold much implementation.
  • architecture_implementation is moved to TTDevice, along with ~half of PCIDevice. PCIDevice should hold only non-arch-specific code, except for getting the arch itself. The functions moved from PCIDevice to TTDevice received only mild, compile-related changes.
  • cluster.cpp and tests changed accordingly.
  • read_checking_offset moved to architecture_implementation.
  • Blackhole-specific destructor code moved from PCIDevice to BlackholeTTDevice.
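
To make the intended layering concrete, here is a minimal sketch of the shape described above, not the code from this PR. It assumes a base TTDevice that owns a PCIDevice, one subclass per arch, and a factory that is the only place branching on arch. The `Arch` enum, `WormholeTTDevice`, `create()`, `name()`, `get_pci_device()` and `cleanup_count` are hypothetical names introduced only for illustration; the real classes hold far more state.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical arch enum, reduced to two entries for the sketch.
enum class Arch { Wormhole, Blackhole };

// Arch-agnostic PCIe handle: the only arch-related thing it does is report the arch.
class PCIDevice {
public:
    explicit PCIDevice(Arch arch) : arch_(arch) {}
    Arch get_arch() const { return arch_; }

private:
    Arch arch_;
};

// Base class: owns the PCIDevice and is the one layer where behaviour differs per arch.
class TTDevice {
public:
    explicit TTDevice(std::unique_ptr<PCIDevice> pci_device) : pci_device_(std::move(pci_device)) {
        assert(pci_device_ != nullptr);
    }
    virtual ~TTDevice() = default;

    virtual std::string name() const = 0;  // hypothetical arch-specific hook
    PCIDevice& get_pci_device() { return *pci_device_; }

    // Hypothetical factory: the single switch on arch type.
    static std::unique_ptr<TTDevice> create(std::unique_ptr<PCIDevice> pci_device);

protected:
    std::unique_ptr<PCIDevice> pci_device_;
};

class WormholeTTDevice : public TTDevice {
public:
    using TTDevice::TTDevice;
    std::string name() const override { return "wormhole"; }
};

class BlackholeTTDevice : public TTDevice {
public:
    using TTDevice::TTDevice;
    // Arch-specific teardown lives here instead of in PCIDevice.
    // The counter only makes the cleanup observable in this sketch.
    ~BlackholeTTDevice() override { ++cleanup_count; }
    std::string name() const override { return "blackhole"; }

    static int cleanup_count;
};

int BlackholeTTDevice::cleanup_count = 0;

std::unique_ptr<TTDevice> TTDevice::create(std::unique_ptr<PCIDevice> pci_device) {
    if (!pci_device) {
        throw std::invalid_argument("null PCIDevice");
    }
    switch (pci_device->get_arch()) {
        case Arch::Wormhole:
            return std::make_unique<WormholeTTDevice>(std::move(pci_device));
        case Arch::Blackhole:
            return std::make_unique<BlackholeTTDevice>(std::move(pci_device));
    }
    throw std::invalid_argument("unsupported arch");
}
```

Under these assumptions, a caller such as cluster.cpp would hold a `std::unique_ptr<TTDevice>` obtained from `TTDevice::create(std::make_unique<PCIDevice>(Arch::Blackhole))` and never branch on arch itself. Destroying that pointer runs the Blackhole-specific cleanup through the virtual destructor.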

Testing

Existing CI tests should be enough.

API Changes

There are no API changes in this PR, but I scheduled post-commit tests just to be sure.

Contributor

@pjanevskiTT pjanevskiTT left a comment

Nice, left a few nit comments

@pjanevskiTT
Contributor

Even though this PR doesn't change the API for clients, I would like to see the tt-metal post-commit pipeline pass with all these changes. I think our tests cover correctness well enough (apart from Galaxy), but I want to make sure that perf is not worse with all of this layering, and metal has a lot of tests that would confirm this.

Contributor

@joelsmithTT joelsmithTT left a comment

> PCIDevice should hold only non-arch specific code, except getting the arch itself

The conventions for how BAR0 and BAR4 are mapped are architecture-specific. Is the plan for TTDevice implementations to manage these mappings?

If KMD starts enforcing proper resource management for inbound TLB windows and the firmware messaging interface, is the TTDevice abstraction still necessary?

@broskoTT
Contributor Author

broskoTT commented Dec 4, 2024

> Even though this PR doesn't change the API for clients, I would like to see the tt-metal post-commit pipeline pass with all these changes. I think our tests cover correctness well enough (apart from Galaxy), but I want to make sure that perf is not worse with all of this layering, and metal has a lot of tests that would confirm this.

Scheduled.

> PCIDevice should hold only non-arch specific code, except getting the arch itself
>
> The conventions for how BAR0 and BAR4 are mapped are architecture-specific. Is the plan for TTDevice implementations to manage these mappings?
>
> If KMD starts enforcing proper resource management for inbound TLB windows and the firmware messaging interface, is the TTDevice abstraction still necessary?

For BARs, I'll have to think about it more. The mapping itself: no. How the BARs are used (e.g. which one is used for system_register access): possibly yes.
If KMD implements that, we should remove it from TTDevice. If TTDevice becomes so small that it no longer makes sense, we'll remove it.

@broskoTT broskoTT enabled auto-merge (squash) December 6, 2024 08:22
@broskoTT broskoTT merged commit c853fdf into main Dec 6, 2024
22 checks passed
@broskoTT broskoTT deleted the brosko/ttdevicenew branch December 6, 2024 08:23