Grim/explorer memory vis #2319
base: main
Conversation
Great stuff! Overall, some changes requested on where in the code the attribute gets added. I know we collect the attributes for each op in the get_attributes function, but you should be able to expose the attribute list to some helper functions and append to it through them. In OpHandler, add a function add_attribute and call it before the line graph_node = operation.make_graph_node() to add the memory attributes (if present); a rough sketch follows this comment.
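Purely as an illustration, one shape this could take (the extra_attrs field and the commented call-site lines are assumptions, not existing code):

# Hypothetical sketch only -- OpHandler's real fields and call sites may differ.
class OpHandler:
    def __init__(self, op):
        self.op = op
        self.extra_attrs = []  # assumed new field holding appended attributes

    def add_attribute(self, attr):
        # attr would be a graph_builder.KeyValue entry, e.g. the memory values
        self.extra_attrs.append(attr)

# At the call site, before the node is built:
#   if memory_data:
#       operation.add_attribute(dram_attr)  # dram_attr: hypothetical KeyValue
#   graph_node = operation.make_graph_node()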
#if memory_data:
#    graph = utils.add_to_dataclass(graph, "memory", memory_data)
Try to clean up unused code if it's not going to be used in the future.
except:
    pass
I don't think we need the try/except block if the exception just passes. This case should ideally succeed every time, or we should guard it with a strict check on how exactly it can go wrong, with specific telemetry to the user if it does.
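For example, something along these lines instead of the bare except (the logging call and the key checks are illustrative assumptions, not existing code):

import logging

# Validate the memory entry explicitly instead of swallowing every exception,
# and surface a message to the user when the data is missing.
entry = memory_data.get(str(node_id_number))
if entry is None or "dram" not in entry or "l1" not in entry:
    logging.warning(
        "No memory data found for node %s; skipping memory attributes.",
        node_id_number,
    )
else:
    ...  # append the dram/l1 attributes as before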
f"{self.model_state[model_path].model_output_dir}/run/program_0/memory_results.json" | ||
) | ||
if not os.path.exists(mem_file): | ||
raise FileNotFoundError(f"Memory file {mem_file} not found. Memory file may not have been created. Try running command: ttrt run out.ttnn --memory --save-artifacts") |
The explorer user will often be unable to directly access the flatbuffer being executed, nor should we expect them to manipulate this. The error message is good, but we don't need the last sentence.
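i.e. the same check with the command suggestion dropped:

if not os.path.exists(mem_file):
    raise FileNotFoundError(
        f"Memory file {mem_file} not found. Memory file may not have been created."
    )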
result.append(
    utils.add_to_dataclass(
        graph_builder.KeyValue(
            key="dram_memory",
            value=str(memory_data[str(node_id_number)]["dram"]),
        ),
        "display_type",
        "memory",
    )
)

result.append(
    utils.add_to_dataclass(
        graph_builder.KeyValue(
            key="l1_memory",
            value=str(memory_data[str(node_id_number)]["l1"]),
        ),
        "display_type",
        "memory",
    )
)
While these do need to be added to each op, I don't think the best location to keep this is in parse_ttnn_ttnn_layout. This also forces the global variables to be present, which doesn't scale well. This code would be better off being added directly in the build_graph function.
I tried adding this code to the build_graph function, since that's definitely more intuitive and eliminates the need for the global variables, but I couldn't figure out how to make it work. When I tried it, the memory attributes showed up under the outputs section of explorer instead of under attributes. If you have any thoughts on how to resolve that, lmk.
Yeah, check the parent comment to this review. Essentially you need to modify the OpHandler class to expose the attributes array before it gets added in. An example modification could be changing the make_graph_node function to take extra attributes and append them to the end of the get_attributes results.
# Example
def make_graph_node(self, extra_attrs=None):
    attrs = self.get_attributes()
    if extra_attrs is not None:
        attrs.extend(extra_attrs)
    return graph_builder.GraphNode(
        id=self.id,
        label=str(self.op.name),
        namespace=self.get_namespace(),
        attrs=attrs,
    )
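A hypothetical call from build_graph could then look roughly like this, reusing the keys from the snippets above (the variable names and surrounding control flow are assumptions):

# Sketch: look up this node's memory entry locally and hand it to the handler.
extra_attrs = None
node_memory = memory_data.get(str(node_id_number)) if memory_data else None
if node_memory is not None:
    extra_attrs = [
        utils.add_to_dataclass(
            graph_builder.KeyValue(key="dram_memory", value=str(node_memory["dram"])),
            "display_type",
            "memory",
        ),
        utils.add_to_dataclass(
            graph_builder.KeyValue(key="l1_memory", value=str(node_memory["l1"])),
            "display_type",
            "memory",
        ),
    ]
graph_node = operation.make_graph_node(extra_attrs)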
global memory_data
global node_id_number
See above: We shouldn't be using global variables but instead doing all of the processing within the build_graph scope.
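One way to keep that local, sketched under the assumption that build_graph has (or can be given) the memory file path used above:

import json
import os

# Inside build_graph (sketch): load the memory results once into a local
# variable instead of relying on module-level globals.
memory_data = {}
if os.path.exists(mem_file):  # mem_file built from memory_results.json as above
    with open(mem_file) as f:
        memory_data = json.load(f)
# memory_data can then be passed to the per-op handling (e.g. via the
# extra_attrs mechanism sketched earlier) without any global state.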
def test_execute_and_check_memory_data_exists():
    execute_command_and_wait(
        MNIST_SHARDING_PATH,
        {"optimizationPolicy": "DF Sharding"},
Awesome, I think we should keep the default for tests as Optimizer Disabled.
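i.e. the policy argument in the quoted test would presumably become the following, with the value taken verbatim from the comment above and the rest of the call unchanged:

{"optimizationPolicy": "Optimizer Disabled"},  # suggested test default per the review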
Ticket
Link to GitHub Issue
Problem description
Creating the memory visualization function.