
Copy of the inputs for the evaluation during compilation #1150

Open · wants to merge 2 commits into main from vkovinic/eval_tensor_copy

Conversation

@vkovinicTT (Member) commented Jan 31, 2025

Summary

This PR ensures that a copy of the input tensors is created when running forward during compilation (e.g. when trying to verify outputs using forward pass of the framework model). This prevents unintended modifications to the original input tensors if in-place operations are performed during the forward pass.

For this to work, a corresponding PR in tt-tvm needs to be merged first.

Why is this needed?

Some models perform in-place operations on input tensors during the forward pass, which can lead to unintended changes in the original inputs. By making a copy of the inputs, we ensure correctness and avoid potential issues when running forward during compilation.

Example test:

    class Inplace(torch.nn.Module):
        def __init__(self):
            super().__init__()

        def forward(self, x):
            y = x + 1
            x += 2  # in-place operation on the input that causes the inputs to change during forge.compile(...)
            return x + y

    shape = (1, 32)  # example shape
    input = torch.zeros(shape, requires_grad=False)
    framework_input = input.detach().clone()
    tt_inputs = [input]

    framework_model = Inplace()
    y = framework_model(framework_input)

    compiled_model = forge.compile(framework_model, sample_inputs=tt_inputs, module_name="inplace")
    tty = compiled_model(*tt_inputs)[0]

    compare_with_golden(golden=y, calculated=tty)  # without this change, this would fail
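The failure above can be avoided by cloning the inputs before running the framework model's forward pass. Below is a minimal, self-contained sketch of that approach in plain PyTorch; the helper `run_forward_with_copies` is hypothetical and is not the actual tt-forge-fe implementation:

```python
import torch

class Inplace(torch.nn.Module):
    def forward(self, x):
        y = x + 1
        x += 2  # in-place mutation of the input
        return x + y

def run_forward_with_copies(model, inputs):
    # Clone every input so in-place ops inside forward()
    # cannot touch the caller's tensors.
    copies = [t.detach().clone() for t in inputs]
    return model(*copies)

inp = torch.zeros(4)
out = run_forward_with_copies(Inplace(), [inp])
assert torch.equal(inp, torch.zeros(4))  # the original input is untouched
```

With this wrapper, the golden output can be computed without invalidating the inputs that are later fed to the compiled module.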

❗❗❗ IMPORTANT NOTES ❗❗❗

In training mode, if the inputs require grad, PyTorch will raise a runtime error during the forward pass: "a leaf Variable that requires grad is being used in an in-place operation." The compiled model, on the other hand, silently computes the gradients; this discrepancy needs to be addressed.

With this change, our compiler will not modify its inputs, so it will behave like TensorFlow (which does not mutate its input tensors), but it will no longer be aligned with PyTorch, which allows in-place changes to the input tensor.
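The training-mode error mentioned above is easy to reproduce in plain PyTorch; a minimal sketch:

```python
import torch

x = torch.zeros(4, requires_grad=True)  # leaf tensor tracked by autograd
try:
    x += 1  # in-place op on a leaf that requires grad
    raised = False
except RuntimeError:
    # PyTorch raises: "a leaf Variable that requires grad is being
    # used in an in-place operation."
    raised = True
assert raised
```

Any input-copying policy for compilation has to decide whether to mirror this PyTorch behavior or to treat inputs as immutable, as TensorFlow does.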

CI: TT-Forge-FE Tests — 505 ran, 446 passed, 59 skipped, 0 failed
CI: TT-Forge-FE Tests — 564 ran, 485 passed, 79 skipped, 0 failed

@vkovinicTT force-pushed the vkovinic/eval_tensor_copy branch from 363f22e to 3413ebf on January 31, 2025 10:44
CI: TT-Forge-FE Tests — 507 ran, 445 passed, 62 skipped, 0 failed
CI: TT-Forge-FE Tests — 565 ran, 488 passed, 77 skipped, 0 failed

@vkovinicTT force-pushed the vkovinic/eval_tensor_copy branch 3 times, most recently from 94ab912 to 4cf2184 on January 31, 2025 17:57
@vkovinicTT changed the title from "Vkovinic/eval tensor copy" to "Copy of the inputs for the evaluation during compilation" on Jan 31, 2025
@vkovinicTT vkovinicTT marked this pull request as ready for review January 31, 2025 18:00
CI: TT-Forge-FE Tests — 565 ran, 488 passed, 77 skipped, 0 failed
CI: TT-Forge-FE Tests — 507 ran, 445 passed, 62 skipped, 0 failed

@vkovinicTT force-pushed the vkovinic/eval_tensor_copy branch from 4cf2184 to 9565677 on February 4, 2025 13:31
github-actions bot commented Feb 4, 2025: TT-Forge-FE Tests — 511 ran, 451 passed, 60 skipped, 0 failed
github-actions bot commented Feb 4, 2025: TT-Forge-FE Tests — 570 ran, 491 passed, 79 skipped, 0 failed

return x + y

input = torch.zeros(shape, requires_grad=False)
framework_input = input.detach().clone()
Contributor: Why are we doing detach and clone here?

Member (Author): Because if we run inference on the torch module and then run it on the compiled module, the inputs for the compiled module would differ due to the changes made by torch (by default torch is not functional, meaning it may mutate its inputs).
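The point about detach and clone can be illustrated with a small sketch: the clone gets its own storage and no autograd history, so mutating it leaves the original intact.

```python
import torch

a = torch.zeros(3)
b = a.detach().clone()  # independent copy: new storage, no autograd history
b += 5                  # in-place change to the copy only
assert torch.equal(a, torch.zeros(3))       # original unchanged
assert torch.equal(b, torch.full((3,), 5.0))
```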

compiled_model = forge.compile(framework_model, sample_inputs=tt_inputs, module_name="inplace")
tty = compiled_model(*tt_inputs)[0]

compare_with_golden(golden=y, calculated=tty)
Contributor: Should we use the standard verify function?

Member (Author): Due to the nature of the in-place problem we are facing, I have separate inputs for the compiled module and the torch module, so I can't pass them to the verify function.

tty = compiled_model(*tt_inputs)[0]

compare_with_golden(golden=y, calculated=tty)
print(framework_input)
Contributor: Do we need these prints?

Member (Author): Not really.

# convert tensor from tf to torch
y = torch.tensor(y.numpy())

compare_with_golden(golden=y, calculated=tty)
Contributor: Similar comments as for the PT example.

@vkovinicTT (Member, Author): I would still hold this PR until we resolve which policy we want to follow (functional, or per-framework).
