Squashed commit of the following:
commit 0354abe136dfb37ea243aaceeed6a5accb12d738
Author: Lisa Ong <[email protected]>
Date:   Fri Nov 18 10:09:41 2022 +0000

    Merged PR 2953: Workaround debug mode failures with dimension argument ordering

    * Order dimension arguments before Array args to work around this lowering issue in Debug mode (until Debug mode is fixed)

    ```
    test_all_dynamic_sizes_static_unroll_matmul_llvm.mlir:236:28: error: use of value '%7' expects different type than prior uses: 'i64' vs '!llvm.struct<(ptr<f32>, ptr<f32>, i64, array<2 x i64>, array<2 x i64>)>'
        %42 = llvm.insertvalue %7, %41[3, 0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<2 x i64>, array<2 x i64>)>
                               ^
    /Users/lisaong/work/staging/Accera/build/lib.macosx-11.1-arm64-3.10/test_acccgen/test_all_dynamic_sizes_static_unroll_matmul/_tmp/test_all_dynamic_sizes_static_unroll_matmul/test_all_dynamic_sizes_static_unroll_matmul_llvm.mlir:201:5: note: prior use here
        %7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<2 x i64>, array<2 x i64>)>
        ^
    ```

    * Enable DEV_MODE tests in one CI pipeline so that we can catch these in the future
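
The test changes in this commit list the dimension arguments ahead of the array arguments and permute the `pre`/`post` check lists to match. A minimal sketch of that bookkeeping in plain Python, with no Accera dependency; `order_dims_first`, `reorder_values`, and the `(kind, name)` tags are hypothetical helpers for illustration, not part of the Accera API:

```python
from typing import Any, List, Sequence, Tuple

# Hypothetical tag: each argument is a (kind, name) pair, kind is "dim" or "array".
Arg = Tuple[str, str]


def order_dims_first(args: Sequence[Arg]) -> List[Arg]:
    """Return args with every "dim" entry ahead of every "array" entry.

    Relative order inside each group is preserved (a stable partition).
    """
    dims = [a for a in args if a[0] == "dim"]
    arrays = [a for a in args if a[0] == "array"]
    return dims + arrays


def reorder_values(args: Sequence[Arg], ordered: Sequence[Arg], values: Sequence[Any]) -> List[Any]:
    """Permute values (given in the order of args) to match the order of ordered."""
    by_name = {name: value for (_, name), value in zip(args, values)}
    return [by_name[name] for _, name in ordered]


original = [("array", "A"), ("array", "B"), ("array", "C"),
            ("dim", "M"), ("dim", "N"), ("dim", "K")]
ordered = order_dims_first(original)
names = [name for _, name in ordered]

# Placeholder values stand in for the NumPy arrays and np.int64 sizes used by the tests.
values = reorder_values(original, ordered, ["a", "b", "c", 123, 234, 345])
```

Deriving the check lists from the same ordering as the function arguments avoids the two drifting apart if the workaround is later reverted.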

commit 5d365527d0b17c6d147086e71f958ed8ec2a7c56
Author: Lisa Ong <[email protected]>
Date:   Wed Nov 16 06:20:47 2022 +0000

    Merged PR 2950: [Release] Rev docs to v1.2.12

    In preparation for 1.2.12 release EOW

commit e9cca1927e98ff5a8c8af16f8746c317ba5998d8
Author: Mason Remy <[email protected]>
Date:   Mon Oct 31 20:57:36 2022 +0000

    Merged PR 2946: Fix hierarchical partial fusing

    Fix hierarchical partial fusing

    Index attributes in fragment predicate ops weren't getting updated as
    part of fusion mapping old indices to new fused indices. This fix is a
    quick change to recursively walk predicates and update their index
    attributes manually.
    In the future we could use SymbolicIndexOps and rely on
    BlockAndValueMapping replacements in clone; however, this would also
    require that we don't create as many duplicate SymbolicIndexOps for the
    same Index.
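
The recursive walk described above can be pictured with a small stand-alone model. This is only a sketch under assumptions: the `Predicate` class, its `kind` strings, and `remap_indices` are hypothetical and do not mirror Accera's C++ predicate ops, and the sketch returns a copy where the actual fix may update attributes in place.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Predicate:
    """Hypothetical predicate node: leaves name an index, compounds hold children."""
    kind: str                       # e.g. "first", "last", "and", "or"
    index: Optional[str] = None     # set on leaf predicates only
    children: List["Predicate"] = field(default_factory=list)


def remap_indices(pred: Predicate, mapping: Dict[str, str]) -> Predicate:
    """Return a copy of pred with every index replaced via mapping.

    Indices missing from mapping are kept unchanged.
    """
    new_index = None if pred.index is None else mapping.get(pred.index, pred.index)
    new_children = [remap_indices(child, mapping) for child in pred.children]
    return Predicate(pred.kind, new_index, new_children)


pred = Predicate("and", children=[
    Predicate("first", index="i0"),
    Predicate("or", children=[
        Predicate("last", index="j0"),
        Predicate("first", index="k"),
    ]),
])

# Fusing i0 and j0 into a single fused index "f" (names are illustrative).
fused = remap_indices(pred, {"i0": "f", "j0": "f"})
```

The point of the recursion is that nested predicates are reached as well; updating only the top-level node would leave stale indices in the children.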

commit ecbc5b63ea6d879437e2ae08a1fd0edf05505032
Author: Mason Remy <[email protected]>
Date:   Thu Oct 27 21:45:54 2022 +0000

    Merged PR 2942: Hold onto intermediate split indices when fusing

    Hold onto intermediate split indices when fusing

    When we split a loop multiple times, the outer index references the
    inner intermediate split indices in affine expressions, even if those
    indices get further split and are no longer loop indices. We have been
    discarding them because they aren't loop indices or dimension indices,
    but they wound up getting re-added to the transformed domain by
    serialization and this led to fusion bugs.
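
To see why the intermediate indices still matter, consider a toy model of repeated splitting. It assumes, purely for illustration, that a split index equals the sum of its outer and inner parts; the `splits` table and `resolve` function are hypothetical and not Accera code.

```python
from typing import Dict, Tuple

# Hypothetical model: each split records parent -> (outer, inner),
# under the assumption that parent = outer + inner.
splits: Dict[str, Tuple[str, str]] = {}


def split(index: str, outer: str, inner: str) -> None:
    """Record that index was split into an outer and an inner index."""
    splits[index] = (outer, inner)


def resolve(index: str, loop_values: Dict[str, int]) -> int:
    """Evaluate index from the current loop index values.

    Indices that are not loops themselves are resolved through the split table,
    so a discarded intermediate index surfaces as a KeyError.
    """
    if index in loop_values:
        return loop_values[index]
    outer, inner = splits[index]
    return resolve(outer, loop_values) + resolve(inner, loop_values)


split("i", "i_o", "i_i")        # first split
split("i_i", "i_io", "i_ii")    # second split: i_i is now an intermediate, not a loop

value = resolve("i", {"i_o": 32, "i_io": 8, "i_ii": 3})
```

Here `i_i` is neither a loop index nor a dimension index after the second split, yet the expression for `i` cannot be evaluated without it.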

commit f38adc9b18c234e56c5fa4a213df51e5d0d1d033
Author: JUBI TANEJA <[email protected]>
Date:   Thu Oct 27 07:07:34 2022 +0000

    Merged PR 2834: match and rewrite a pattern to vectorize int16 matmul

    This rewrite rule matches the `jj` and `kk` loops in int16 matmul, where an outer loop `jj` `{0..8}` is followed by an inner loop `kk` `{0..2}`. It vectorizes the `jj` and `kk` loops and replaces each affine op with a vectorized op. At the end, it generates a `vpmaddwd` instruction for MatMul.
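
As a rough illustration of why an 8-wide `jj` loop over a 2-wide `kk` loop fits `vpmaddwd`, the sketch below emulates the instruction's lane semantics in plain Python: adjacent pairs of signed 16-bit values are multiplied and each pair of products is summed into one 32-bit lane. The interleaved data layout, the sample values, and the reading of `{0..8}` and `{0..2}` as 8 and 2 iterations are assumptions made for this example; the commit message does not specify them.

```python
from typing import List


def to_int32(x: int) -> int:
    """Wrap a Python int to signed 32-bit two's complement."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x


def pmaddwd(a: List[int], b: List[int]) -> List[int]:
    """Emulate pmaddwd lanes: out[i] = a[2i]*b[2i] + a[2i+1]*b[2i+1], as int32."""
    assert len(a) == len(b) and len(a) % 2 == 0
    return [to_int32(a[2 * i] * b[2 * i] + a[2 * i + 1] * b[2 * i + 1])
            for i in range(len(a) // 2)]


# One row of A restricted to the kk tile, and the matching 2 x 8 tile of B.
a_row = [3, -2]                                   # A[i, kk], kk in 0..1
b_tile = [[1, 2, 3, 4, 5, 6, 7, 8],               # B[kk=0, jj], jj in 0..7
          [10, 20, 30, 40, 50, 60, 70, 80]]       # B[kk=1, jj]

# Assumed layout: each int16 pair holds the two kk values for one jj.
a_vec = a_row * 8
b_vec = [b_tile[kk][jj] for jj in range(8) for kk in range(2)]

c = pmaddwd(a_vec, b_vec)                         # 8 int32 partial sums, one per jj
```

Each output lane matches the scalar reduction over `kk` for one `jj`, which is what lets the two loops collapse into a single multiply-add.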

commit 7a1597d88e216bfb6df5780092b61711d6a36657
Author: Mason Remy <[email protected]>
Date:   Thu Oct 27 01:24:41 2022 +0000

    Merged PR 2918: Support vectorization and static size caching for split dynamic range

    Support vectorization and static size caching for split dynamic range
    loops

commit b6ebe0df5222184221f1a5cf7d99cac60131eadb
Author: Mason Remy <[email protected]>
Date:   Thu Oct 27 00:44:13 2022 +0000

    Merged PR 2914: Support static loop splits of dynamic sized ranges

    Support static loop splits of dynamic sized ranges

    This change creates a specialization of the AffineConstraintsHelper that
    works with Loopnest concepts and uses that in LoopNestBuilder to update
    the loop split generation
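
The idea of splitting a runtime-sized range by a static split size can be sketched as below. This is a plain-Python illustration under assumptions, not the `AffineConstraintsHelper` or `LoopNestBuilder` logic: `split_dynamic_range` is a hypothetical name, and the real implementation works on affine constraints rather than concrete integers.

```python
from typing import Iterator, Tuple


def split_dynamic_range(n: int, split_size: int) -> Iterator[Tuple[int, int, bool]]:
    """Yield (start, size, is_full_block) tuples that cover range(n).

    split_size plays the role of the static split size; n stands in for a
    size that is only known at run time.
    """
    if n < 0 or split_size <= 0:
        raise ValueError("n must be >= 0 and split_size must be > 0")
    main_end = (n // split_size) * split_size
    # Main part: every block has exactly split_size iterations, so the inner
    # loop has a static trip count.
    for start in range(0, main_end, split_size):
        yield start, split_size, True
    # Boundary part: fewer than split_size iterations remain.
    if main_end < n:
        yield main_end, n - main_end, False


blocks = list(split_dynamic_range(10, 4))
```

Separating the full blocks from the boundary block is what makes the inner range statically sized in the common case, even though the total range is dynamic.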

commit e5fa02d9b734dba2d24c10424ef947db7c5a3052
Author: Mason Remy <[email protected]>
Date:   Wed Oct 26 19:29:49 2022 +0000

    Merged PR 2911: Support dynamic ranges in ScheduledLoopOp

    Support dynamic ranges in ScheduledLoopOp

commit 2a9cf40237814b8ea793d4020ec0b5be1633306a
Author: Mason Remy <[email protected]>
Date:   Wed Oct 26 18:54:49 2022 +0000

    Merged PR 2907: Implement initial affine constraint helper for dynamic size loop

    Implement initial affine constraint helper for dynamic size loop
    handling

    Implements a wrapper around mlir::FlatAffineValueConstraints and a set
    of low-level tests using it that enable static-sized splitting of
    dynamic loop ranges

commit 5784d4483a1f03308b2ece7337fdf6d6b96fd757
Author: Captain Jack Sparrow <[email protected]>
Date:   Fri Oct 21 21:45:03 2022 +0000

    Merged PR 2935: Remove thread coarsening factor > 4 from GPU benchmarks

    Remove thread coarsening factor > 4 from GPU benchmarks

commit 432f88a6fbc5898ba93d85e87502f0421a68dbe4
Author: Captain Jack Sparrow <[email protected]>
Date:   Thu Oct 20 00:13:50 2022 +0000

    Merged PR 2932: Upgrade to CUDA 11.8

    Upgrade to CUDA 11.8

commit b2669c511c2cca8bef2991f74cfed3aba5d39440
Author: Captain Jack Sparrow <[email protected]>
Date:   Wed Oct 19 17:14:40 2022 +0000

    Merged PR 2931: Update to ROCm 5.3

    Update to ROCm 5.3

commit 1472394dcf14a881d5336447abd307ee5ba77209
Author: Lisa Ong <[email protected]>
Date:   Wed Oct 19 00:37:10 2022 +0000

    Merged PR 2926: Plumb parameter usages to emitted HAT files

commit 7c786ce8f0b337e7af7c281db69436f53c74d4ca
Author: Captain Jack Sparrow <[email protected]>
Date:   Tue Oct 18 23:02:21 2022 +0000

    Merged PR 2927: Reduce benchmark configs using thread coarsening

    Reduce benchmark configs using thread coarsening

commit 5f21639e49373112c5c42875b84ed2b74e58616b
Author: Captain Jack Sparrow <[email protected]>
Date:   Tue Oct 18 20:06:42 2022 +0000

    Merged PR 2925: Add optional optimization hint for number of thread blocks per SM

    Add optional optimization hint for number of thread blocks per SM

    Related work items: #3736
Lisa Ong committed Nov 21, 2022
1 parent c363c3a commit 711af89
Showing 2 changed files with 20 additions and 13 deletions.
5 changes: 4 additions & 1 deletion .azure/linux-pr.yml
@@ -87,9 +87,12 @@ steps:
python -m pip install pytest-azurepipelines
ctest -C Debug -T test -VV -LE benchmark -j $(PARALLEL) --progress
displayName: Run all ctest targets
continueOnError: false
workingDirectory: "$(Build.SourcesDirectory)/build"
+- bash: python -m unittest discover accera/test *.py
+displayName: Run tests in DEV_MODE
+workingDirectory: "$(Build.SourcesDirectory)/build/lib.linux-x86_64-3.9"

- task: CopyFiles@2
condition: always()
inputs:
28 changes: 16 additions & 12 deletions accera/python/accera/test/dsl_tests.py
@@ -920,7 +920,8 @@ def _():
# plan.vectorize(jjjj)

package = Package()
-function = package.add(plan, args=(A, B, C, K), base_name=test_name)
+# BUGBUG: dim args ordered first due to issue with Debug mode
+function = package.add(plan, args=(K, A, B, C), base_name=test_name)

M_test = M
N_test = N
@@ -929,8 +930,8 @@ def _():
B_test = np.random.random((K_test, N_test)).astype(np.uint8)
C_test = np.random.random((M_test, N_test)).astype(np.int32)
correctness_check_values = {
-"pre": [A_test, B_test, C_test, K_test],
-"post": [A_test, B_test, C_test + A_test @ B_test, K_test],
+"pre": [K_test, A_test, B_test, C_test],
+"post": [K_test, A_test, B_test, C_test + A_test @ B_test],
}

self._verify_helper(package, test_name, function.name, correctness_check_values)
@@ -965,7 +966,8 @@ def _():
plan.unroll(jj)

package = Package()
-function = package.add(plan, args=(A, B, C, M, N, K), base_name=test_name)
+# BUGBUG: dim args ordered first due to issue with Debug mode
+function = package.add(plan, args=(M, N, K, A, B, C), base_name=test_name)

M_test = np.int64(123)
N_test = np.int64(234)
@@ -974,8 +976,8 @@ def _():
B_test = np.random.random((K_test, N_test)).astype(np.float32)
C_test = np.random.random((M_test, N_test)).astype(np.float32)
correctness_check_values = {
-"pre": [A_test, B_test, C_test, M_test, N_test, K_test],
-"post": [A_test, B_test, C_test + A_test @ B_test, M_test, N_test, K_test],
+"pre": [M_test, N_test, K_test, A_test, B_test, C_test],
+"post": [M_test, N_test, K_test, A_test, B_test, C_test + A_test @ B_test],
}

self._verify_helper(package, test_name, function.name, correctness_check_values)
@@ -1010,7 +1012,8 @@ def _():
plan.vectorize(jj)

package = Package()
-function = package.add(plan, args=(A, B, C, M, N, K), base_name=test_name)
+# BUGBUG: dim args ordered first due to issue with Debug mode
+function = package.add(plan, args=(M, N, K, A, B, C), base_name=test_name)

M_test = np.int64(123)
N_test = np.int64(234)
@@ -1019,8 +1022,8 @@ def _():
B_test = np.random.random((K_test, N_test)).astype(np.float32)
C_test = np.random.random((M_test, N_test)).astype(np.float32)
correctness_check_values = {
-"pre": [A_test, B_test, C_test, M_test, N_test, K_test],
-"post": [A_test, B_test, C_test + A_test @ B_test, M_test, N_test, K_test],
+"pre": [M_test, N_test, K_test, A_test, B_test, C_test],
+"post": [M_test, N_test, K_test, A_test, B_test, C_test + A_test @ B_test],
}

self._verify_helper(package, test_name, function.name, correctness_check_values)
@@ -1068,7 +1071,8 @@ def _():
plan.vectorize(jjjj)

package = Package()
-function = package.add(plan, args=(A, B, C, M, N, K), base_name=test_name)
+# BUGBUG: dim args ordered first due to issue with Debug mode
+function = package.add(plan, args=(M, N, K, A, B, C), base_name=test_name)

M_test = np.int64(123)
N_test = np.int64(234)
@@ -1077,8 +1081,8 @@ def _():
B_test = np.random.random((K_test, N_test)).astype(np.float32)
C_test = np.random.random((M_test, N_test)).astype(np.float32)
correctness_check_values = {
-"pre": [A_test, B_test, C_test, M_test, N_test, K_test],
-"post": [A_test, B_test, C_test + A_test @ B_test, M_test, N_test, K_test],
+"pre": [M_test, N_test, K_test, A_test, B_test, C_test],
+"post": [M_test, N_test, K_test, A_test, B_test, C_test + A_test @ B_test],
}

self._verify_helper(package, test_name, function.name, correctness_check_values)