
#16720: and #14898 update output dims for argmax and move pad for generic reduce #16989

Merged
bbradelTT merged 10 commits into main from bbradel-16720_arg on Jan 25, 2025

Conversation

@bbradelTT (Contributor) commented on Jan 22, 2025

Ticket

Link to Github Issues #16720 and #14898

Problem description

  • when a dim is specified, the argmax output tensor is not one rank smaller than the input
  • transpose seems to insert its own padding, which affects the early tensor dimensions

What's changed

  • for argmax, change the shape of the output tensor to have the right rank (see the sketch below)
  • for generic reduce, move pad filling to right before the reduce op is called
  • also update tests and mark another specialized reduce function as deprecated
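For illustration, a minimal PyTorch sketch of the intended rank behaviour; the ttnn shape in the comment is inferred from the shape-computation diff quoted later in the thread, not taken from the tt-metal API:

```python
import torch

x = torch.randn(2, 3, 32, 64)

# Reducing over the last dim (keepdim defaults to False) drops that dim,
# so a rank-4 input yields a rank-3 golden result.
golden = torch.argmax(x, dim=-1)
assert golden.shape == (2, 3, 32)

# Before this change the ttnn argmax output kept a rank-4 shape
# (roughly [2, 3, 1, 32]); after it, the output rank matches the golden.
```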

Checklist

@bbradelTT mentioned this pull request on Jan 22, 2025
@bbradelTT changed the title from "#16720: update output dims for argmax and update reduce tests" to "#16720: and #14898 update output dims for argmax and move pad for generic reduce" on Jan 22, 2025
@bbradelTT marked this pull request as ready for review on January 22, 2025 22:07
@bbradelTT (Contributor Author) commented:

Depends on #17048 for full functionality.
Will wait for that PR before running more tests / pipelines.

@@ -58,7 +58,7 @@ std::vector<TensorSpec> ArgMax::compute_output_specs(
     ttnn::SimpleShape output_shape({1, 1, 1, 1});
     if (this->dim.has_value()) {
         auto input_shape = input_tensors[0].get_logical_shape();
-        output_shape = ttnn::SimpleShape{input_shape[0], input_shape[1], 1, input_shape[2]};
+        output_shape = ttnn::SimpleShape{input_shape[0], input_shape[1], input_shape[2]};
Contributor
is keepdim always False?

Contributor Author

argmax does not have keepdim

pwd
/Users/bbradel/tt-metal/ttnn/cpp/ttnn/operations/reduction/argmax
bbradel@Borys-Bradel's-Mac argmax % grep keep -r * | wc
       0       0       0


pt_out_tensor = golden_tensor
tt_out_tensor = tt_output_tensor_on_device.cpu().to(ttnn.ROW_MAJOR_LAYOUT).to_torch()
comp_pass, comp_out = comparison_funcs.comp_pcc(pt_out_tensor, tt_out_tensor, pcc=0.99)
assert_with_pcc(pt_out_tensor, tt_out_tensor)
Contributor

please compare shapes too. I found wrong shapes at #16922

Contributor Author

That's why I switched to assert_with_pcc:
tests/ttnn/utils_for_testing.py

def assert_with_pcc(expected_pytorch_result, actual_pytorch_result, pcc=0.9999):
    assert list(expected_pytorch_result.shape) == list(
        actual_pytorch_result.shape
    ), f"list(expected_pytorch_result.shape)={list(expected_pytorch_result.shape)} vs list(actual_pytorch_result.shape)={list(actual_pytorch_result.shape)}"

@bbradelTT force-pushed the bbradel-16720_arg branch 3 times, most recently from e2040ce to 8d7c692 on January 24, 2025 14:25
@bbradelTT merged commit 83145d2 into main on Jan 25, 2025
215 of 217 checks passed
@bbradelTT deleted the bbradel-16720_arg branch on January 25, 2025 04:57
tt-rkim pushed a commit that referenced this pull request Jan 26, 2025
@bbradelTT restored the bbradel-16720_arg branch on January 26, 2025 23:36
bbradelTT added a commit that referenced this pull request Jan 27, 2025
### Ticket
Link to Github Issue #14898 

Subset of the previous PR (#16989), which caused a hang in the
(Single-card) Demo tests and got reverted. Verified that this
pipeline passes for this subset of changes:
https://github.com/tenstorrent/tt-metal/actions/runs/12992459972

### Problem description
- transpose was filling non-logical areas with the default pad value when
called from reduce

### What's changed
- pass in an appropriate pad value for transpose to use (see the sketch below)
- also mark a method that should only be used by pool as deprecated
to deter other uses
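As a rough illustration of why the pad value has to match the reduce op (`pad_for_reduce` and its parameters are hypothetical, not the tt-metal API):

```python
import torch

def pad_for_reduce(t, pad_to, reduce_kind):
    # Use the identity element of the reduction so padded lanes
    # cannot change the result.
    pad_value = {"sum": 0.0, "max": float("-inf"), "min": float("inf")}[reduce_kind]
    return torch.nn.functional.pad(t, (0, pad_to - t.shape[-1]), value=pad_value)

x = torch.tensor([[-3.0, -2.0, -1.0]])
padded = pad_for_reduce(x, pad_to=32, reduce_kind="max")
assert padded.amax(dim=-1).item() == -1.0  # -inf padding leaves the max intact
# Padding with the default 0.0 would have produced 0.0 here, a wrong result.
```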

### Checklist
- [x] Post commit CI passes
https://github.com/tenstorrent/tt-metal/actions/runs/12992465641
- [ ] Blackhole Post commit (if applicable)
- [ ] Model regression CI testing passes (if applicable)
- [ ] Device performance regression CI testing passes (if applicable)
- [ ] **(For models and ops writers)** Full [new
models](https://github.com/tenstorrent/tt-metal/actions/workflows/full-new-models-suite.yaml)
tests pass
- [x] New/Existing tests provide coverage for changes
williamlyTT pushed a commit that referenced this pull request Jan 30, 2025
#16720: and #14898 update output dims for argmax and move pad for generic reduce (#16989)

### Ticket
Link to Github Issues #16720 and #14898

### Problem description
- when a dim is specified, the argmax output tensor is not one rank
smaller than the input
- transpose seems to insert its own padding, which affects the early
tensor dimensions

### What's changed
- for argmax, change the shape of the output tensor to have the right rank
- for generic reduce, move pad filling to right before the reduce op is
called
- also update tests and mark another specialized reduce function as
deprecated

### Checklist
- [x] Post commit CI passes
https://github.com/tenstorrent/tt-metal/actions/runs/12956428377
- [x] Blackhole Post commit (if applicable)
https://github.com/tenstorrent/tt-metal/actions/runs/12939182511
- [x] Model regression CI testing passes (if applicable)
https://github.com/tenstorrent/tt-metal/actions/runs/12939185477/job/36091291360
in line with main
https://github.com/tenstorrent/tt-metal/actions/runs/12937069874/job/36084729557
- [x] Device performance regression CI testing passes (if applicable)
https://github.com/tenstorrent/tt-metal/actions/runs/12939183976
- [ ] **(For models and ops writers)** Full [new
models](https://github.com/tenstorrent/tt-metal/actions/workflows/full-new-models-suite.yaml)
tests pass
- [x] New/Existing tests provide coverage for changes
williamlyTT pushed a commit that referenced this pull request Jan 30, 2025
yieldthought pushed a commit that referenced this pull request Jan 31, 2025
yieldthought pushed a commit that referenced this pull request Jan 31, 2025
hschoi4448 pushed a commit that referenced this pull request Feb 20, 2025
hschoi4448 pushed a commit that referenced this pull request Feb 20, 2025
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues: #16720, #14898

5 participants