Merge branch 'main' into fix-test-hand_landmark
ayerofieiev-tt authored Sep 13, 2024
2 parents ce5e787 + 4d6729e commit 9621163
Showing 158 changed files with 32,874 additions and 14,347 deletions.
4 changes: 4 additions & 0 deletions .github/actions/common_cleanup/action.yaml
Original file line number Diff line number Diff line change
@@ -6,6 +6,10 @@ runs:
- name: Cleanup model cache
shell: bash
run: |
df -h
python3 -m pip cache purge
python3 tools/huggingface_delete_cache.py
rm -rf ~/.torch/models
rm -rf ~/.cache/custom_weights
free -h
df -h
8 changes: 5 additions & 3 deletions .github/actions/common_repo_setup/action.yaml
@@ -10,15 +10,17 @@ runs:
uses: actions/setup-python@v5
with:
python-version: '3.8'
cache: 'pip'
cache-dependency-path: |
requirements-dev.txt
#cache: 'pip'
#cache-dependency-path: |
# requirements-dev.txt

- name: Install Dependencies
shell: bash
run: |
df -h
python3 -m pip install --upgrade pip
python3 -m pip install -r requirements-dev.txt
python3 -m pip install pytest-github-report
df -h
- uses: ./.github/actions/common_cleanup
3 changes: 1 addition & 2 deletions .github/workflows/before_merge.yaml
@@ -26,8 +26,7 @@ jobs:
runs-on: ["in-service", "n150"]
steps:
- uses: actions/checkout@v4
- uses: ./.github/actions/common_repo_setup

- uses: ./.github/actions/common_repo_setup
- name: Run Tools Tests
run: |
python3 -m pytest --github-report tests/tools/ -s
1,808 changes: 1,681 additions & 127 deletions README.md

Large diffs are not rendered by default.

42 changes: 42 additions & 0 deletions docs/KnownIssues.md
@@ -0,0 +1,42 @@
Known Issues
============

This document lists the known issues in the current release of the PyTorch 2.0 TTNN Compiler. Each issue concerns a model characteristic that the current compiler cannot process.

## 1. Complex dynamic shape.

Torch Dynamo can handle dynamic shapes to a certain extent, which is critical for models like YOLO that process inputs of varying sizes. However, we encountered a model with dynamic shapes that our compiler was unable to handle. We have not investigated in detail, but the most likely cause is a "complex" use of dynamic shapes. Specifically, it may involve **data-dependent shapes**, where the shape of a tensor is determined not only by the dimensions of the input but also by the values in the input tensors.

This type of dynamic shape makes inference particularly challenging because:

* The shape cannot be determined or inferred statically before execution.
* The model needs to perform some operations, sometimes on the input data itself, to deduce the final shape.
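
To make the idea concrete, here is a minimal pure-Python sketch of a data-dependent shape. The helper `nonzero_indices` is hypothetical and written only for illustration; in PyTorch, `torch.nonzero` behaves analogously, since the size of its result depends on how many entries are non-zero. Whether the failing model hits this exact pattern has not been confirmed.

```python
def nonzero_indices(values):
    """Return the positions of the non-zero entries in `values`.

    Hypothetical helper for illustration; not part of the compiler.
    """
    return [i for i, v in enumerate(values) if v != 0]


# Both inputs have the same static shape: four elements.
a = [0, 3, 0, 7]
b = [1, 2, 0, 4]

out_a = nonzero_indices(a)
out_b = nonzero_indices(b)

# The output lengths differ even though the input lengths match,
# so the output shape cannot be inferred from the input shape alone.
print(len(a), len(out_a))  # 4 2
print(len(b), len(out_b))  # 4 3
```

A compiler that sees only the input shape `(4,)` has to represent the output length symbolically until the data is available.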

We have a [test case](../tests/models/openpose/test_openpose.py) that reproduces this issue. The test triggers the following error message, indicating that the Dynamo backend compiler expects a `Tensor` type but encounters a `SymInt` type instead.

```
torch._dynamo.exc.BackendCompilerFailed: backend='ttnn_backend' raised:
E RuntimeError: aten::clone() Expected a value of type 'Tensor' for argument 'self' but instead found type 'SymInt'.
E Position: 0
E Value: s1
E Declaration: aten::clone(Tensor self, *, MemoryFormat? memory_format=None) -> Tensor
E Cast error details: Unable to cast s1 to Tensor
```

Why does a `SymInt` remain in the model? A likely reason is that its value cannot be determined statically before execution. This is consistent with data-dependent shapes, where the tensor's shape depends on the values in the input rather than only on the input dimensions.

This test serves as a useful starting point for further investigation.

Also refer to [PR 132](https://github.com/tenstorrent/pytorch2.0_ttnn/pull/132)

## 2. Pipeline abstraction.

A pipeline refers to a high-level abstraction that simplifies the process of using pre-trained models. A pipeline handles the entire workflow of passing input data through the model, managing preprocessing and postprocessing steps, and returning results in a user-friendly way.

It is understandable that the current compiler does not support the entire pipeline. Traditionally, we compile the model itself but leave the preprocessing and postprocessing algorithms out of the compilation process. These algorithms tend to involve more control logic and less data parallelism, which makes them better suited to execution on the CPU. We have a [test case](../tests/models/autoencoder_conv/test_autoencoder_conv.py) that reproduces this issue. It fails with the error "Unable to create tensor", which likely indicates that the failure occurs in code outside the model.

```
E ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`input_ids` in this case) have excessive nesting (inputs type `list` where type `int` is expected).
```
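
The split described above can be sketched in plain Python as follows. All names here (`PAD_ID`, `preprocess`, `compiled_core`, `postprocess`, `run_pipeline`) are hypothetical stand-ins, not the pipeline or compiler API; `compiled_core` only mimics a compiled model by insisting on a rectangular batch. The sketch assumes that preprocessing, including padding and truncation, stays on the host and that only the core is compiled.

```python
from typing import List

PAD_ID = 0  # hypothetical padding token id


def preprocess(batch: List[List[int]], max_len: int) -> List[List[int]]:
    """Host-side step: truncate, then pad, so every row has max_len ids."""
    padded = []
    for row in batch:
        row = row[:max_len]
        padded.append(row + [PAD_ID] * (max_len - len(row)))
    return padded


def compiled_core(batch: List[List[int]]) -> List[int]:
    """Stand-in for the compiled model: requires rows of equal length."""
    if len({len(row) for row in batch}) > 1:
        raise ValueError("ragged batch: rows must have the same length")
    return [sum(row) for row in batch]


def postprocess(scores: List[int]) -> List[str]:
    """Host-side step: turn raw outputs into user-facing labels."""
    return ["high" if s > 5 else "low" for s in scores]


def run_pipeline(batch: List[List[int]], max_len: int = 4) -> List[str]:
    # Only compiled_core would be handed to the compiler; the rest runs on CPU.
    return postprocess(compiled_core(preprocess(batch, max_len)))


ragged = [[1, 2], [4, 5, 6, 7, 8]]
print(run_pipeline(ragged))  # ['low', 'high']
```

Passing `ragged` straight to `compiled_core` raises a `ValueError`, which loosely mirrors the failure above: the input reaches the tensor-creation step before it has been padded to a uniform length.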

Also refer to [PR 127](https://github.com/tenstorrent/pytorch2.0_ttnn/pull/127)