Grim/explorer memory vis #2319

Open · wants to merge 7 commits into main

Conversation

@jgrimTT commented on Feb 27, 2025

Ticket

Link to Github Issue

Problem description

Creating the memory visualization function.

@vprajapati-tt (Contributor) left a comment:

Great stuff! Overall, some changes requested on where in the code the attribute gets added.

I know we collect the attributes for each op in the get_attributes function, but you should be able to expose the attribute list to some helper functions and append to it through them. In OpHandler, add an add_attribute function and call it before the line graph_node = operation.make_graph_node() to add the memory attributes (if present).
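A rough sketch of what that could look like (not from the PR itself; it assumes OpHandler keeps an _extra_attrs list initialized to [] in __init__, and that graph_builder, utils, and the existing get_attributes/get_namespace helpers behave as used elsewhere in this diff):

    # Sketch only: add an accumulator for extra attributes on OpHandler.
    class OpHandler:
        def add_attribute(self, attr):
            # attr is a graph_builder.KeyValue, optionally decorated with a
            # display_type via utils.add_to_dataclass.
            self._extra_attrs.append(attr)

        def make_graph_node(self):
            # Append any accumulated extras after the regular attributes.
            attrs = self.get_attributes() + self._extra_attrs
            return graph_builder.GraphNode(
                id=self.id,
                label=str(self.op.name),
                namespace=self.get_namespace(),
                attrs=attrs,
            )

build_graph would then call operation.add_attribute(...) for the DRAM/L1 memory entries before graph_node = operation.make_graph_node().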

Comment on lines 109 to 111
#if memory_data:
# graph = utils.add_to_dataclass(graph, "memory", memory_data)

@vprajapati-tt (Contributor):

Try to clean up unused code if it's not going to be needed in the future.

Comment on lines 491 to 492
except:
pass
@vprajapati-tt (Contributor):

I don't think we need the try/except block if the handler just passes. This case should ideally succeed every time; otherwise we should guard it with a strict check on exactly how it can go wrong and report specific telemetry to the user when it does.
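One illustrative shape for that (not from the PR; the block at lines 491-492 isn't shown in full here, and the open/json.load calls are assumptions):

    # Illustrative only -- requires `import json` and `import logging`.
    # Catch the specific failure we expect and tell the user, instead of
    # a bare `except: pass`.
    try:
        with open(mem_file) as f:
            memory_data = json.load(f)
    except (OSError, json.JSONDecodeError) as e:
        logging.warning("Could not load memory results from %s: %s", mem_file, e)
        memory_data = None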

f"{self.model_state[model_path].model_output_dir}/run/program_0/memory_results.json"
)
if not os.path.exists(mem_file):
raise FileNotFoundError(f"Memory file {mem_file} not found. Memory file may not have been created. Try running command: ttrt run out.ttnn --memory --save-artifacts")
@vprajapati-tt (Contributor):

The explorer user will often be unable to directly access the flatbuffer being executed, nor should we expect them to manipulate this. The error message is good, but we don't need the last sentence.
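Concretely, the raise from the excerpt above could be trimmed to:

    if not os.path.exists(mem_file):
        raise FileNotFoundError(
            f"Memory file {mem_file} not found. Memory file may not have been created."
        )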

Comment on lines 470 to 489
result.append(
    utils.add_to_dataclass(
        graph_builder.KeyValue(
            key="dram_memory",
            value=str(memory_data[str(node_id_number)]["dram"]),
        ),
        'display_type',
        'memory'
    )
)

result.append(
    utils.add_to_dataclass(
        graph_builder.KeyValue(
            key="l1_memory",
            value=str(memory_data[str(node_id_number)]["l1"]),
        ),
        'display_type',
        'memory'
    )
@vprajapati-tt (Contributor):

While these do need to be added to each op, I don't think parse_ttnn_ttnn_layout is the best place to keep this. It also forces the global variables to be present, which doesn't scale well. This code would be better off added directly in the build_graph function.

@jgrimTT (Author):

I tried adding this code to the build_graph function, since that's definitely more intuitive and eliminates the need for the global variables, but I couldn't figure out how to make it work. When I did try it, the memory attributes showed up under the outputs section of explorer instead of under attributes. If you have any thoughts on how to resolve that, lmk.

@vprajapati-tt (Contributor):

Yeah, check the parent comment on this review. Essentially you need to modify the OpHandler class to expose the attributes array before it gets added to the node. An example modification could be changing the make_graph_node function to take extra attributes and append them to the end of the get_attributes results.

    # Example
    def make_graph_node(self, extra_attrs=None):
        attrs = self.get_attributes()
        if extra_attrs is not None:
            attrs.extend(extra_attrs)
        return graph_builder.GraphNode(
            id=self.id,
            label=str(self.op.name),
            namespace=self.get_namespace(),
            attrs=attrs,
        )
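With that signature, build_graph could pass the memory attributes in directly, something along these lines (the memory_data/node_id_number lookups mirror the excerpt above and are illustrative):

    # Sketch: build the extra attrs in build_graph and hand them to the node.
    extra_attrs = None
    if memory_data and str(node_id_number) in memory_data:
        extra_attrs = [
            utils.add_to_dataclass(
                graph_builder.KeyValue(
                    key="dram_memory",
                    value=str(memory_data[str(node_id_number)]["dram"]),
                ),
                "display_type",
                "memory",
            ),
            utils.add_to_dataclass(
                graph_builder.KeyValue(
                    key="l1_memory",
                    value=str(memory_data[str(node_id_number)]["l1"]),
                ),
                "display_type",
                "memory",
            ),
        ]
    graph_node = operation.make_graph_node(extra_attrs)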

Comment on lines 13 to 14
global memory_data
global node_id_number
@vprajapati-tt (Contributor):

See above: We shouldn't be using global variables but instead doing all of the processing within the build_graph scope.

def test_execute_and_check_memory_data_exists():
    execute_command_and_wait(
        MNIST_SHARDING_PATH,
        {"optimizationPolicy": "DF Sharding"},
@vprajapati-tt (Contributor):

Awesome, I think we should keep the default for tests as Optimizer Disabled.
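i.e., something like (only the policy value changes; the remaining arguments stay as in the excerpt above):

    def test_execute_and_check_memory_data_exists():
        execute_command_and_wait(
            MNIST_SHARDING_PATH,
            {"optimizationPolicy": "Optimizer Disabled"},
            # ... rest of the call unchanged ...
        )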

Successfully merging this pull request may close these issues:

Add memory report visualization into tt explorer