v1.2.28
What's Changed
-
Merged PR 3199: Rename _slice to slice and add docs. [Captain Jack Sparrow]
-
Merged PR 3197: Preserve dest memref shape during SliceOp to SubViewOp lowering. [Captain Jack Sparrow]
Without this change, the subview op would discard the dest memref type required by the slice op. For example,
%7 = "accv.slice"(%arg0, %6) {sliceDimensions = [0]} : (memref<1x30x256xui8>, index) -> memref<30x256xui8, affine_map<...>>
would get lowered to:
%4 = memref.subview %arg0[%3, 0, 0] [1, 30, 256] [1, 1, 1] : memref<1x30x256xui8> to memref<1x30x256xui8, affine_map<...>>
%5 = memref.cast %4 : memref<1x30x256xui8, affine_map<...>> to memref<?x?x?xui8, affine_map<...>>
which does not drop the first dimension as expected. With this fix, the slice op correctly lowers to:
%4 = memref.subview %arg0[%3, 0, 0] [1, 30, 256] [1, 1, 1] : memref<1x30x256xui8> to memref<30x256xui8, affine_map<...>>
%5 = memref.cast %4 : memref<30x256xui8, affine_map<...>> to memref<30x256xui8, affine_map<...>>
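The difference between the two lowerings can be illustrated with a loose NumPy analogy. This is not Accera or MLIR code; the helper names `rank_preserving_subview` and `rank_reducing_slice` are hypothetical and exist only for this sketch. A size-1 window keeps the leading unit dimension, while a slice at an index drops it:

```python
import numpy as np


def rank_preserving_subview(a, index):
    """Take a size-1 window along dim 0; the unit dimension is kept."""
    return a[index:index + 1]


def rank_reducing_slice(a, index):
    """Select one position along dim 0; that dimension is dropped."""
    return a[index]


# Same shape and element type as %arg0 in the example above.
arg0 = np.zeros((1, 30, 256), dtype=np.uint8)

before_fix = rank_preserving_subview(arg0, 0)
after_fix = rank_reducing_slice(arg0, 0)

print(before_fix.shape)  # (1, 30, 256)
print(after_fix.shape)   # (30, 256)
```

Both results are views into the original buffer; only the result shape differs, which mirrors the result type that the fixed lowering preserves.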
-
Merged PR 3194: Reorder the ops in GetTimeOpLowering to improve the timing accuracy. [Denny Sun]
To get the most accurate timing, the operations are reordered so that the GetTime() calls sit directly around the logic being profiled, from
    Independent logic
    GetTime()
    Independent logic
    Main logic to profile
    Independent logic
    GetTime()
    Independent logic
to
    Independent logic
    Independent logic
    GetTime()
    Main logic to profile
    GetTime()
    Independent logic
    Independent logic
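The same ordering principle can be sketched in plain Python. This is an illustration only, not the GetTimeOpLowering code: `timed_region` and `fake_clock` are hypothetical names, and the fake clock is injected just to make the call order visible.

```python
import time


def timed_region(independent_before, main_logic, independent_after,
                 clock=time.perf_counter):
    """Run unrelated work outside the two clock reads, so only main_logic is timed."""
    for step in independent_before:
        step()
    start = clock()
    main_logic()
    end = clock()
    for step in independent_after:
        step()
    return end - start


# Record the order of events with a fake clock (an assumption for this demo).
events = []


def fake_clock():
    events.append("GetTime()")
    return float(len(events))


elapsed = timed_region(
    [lambda: events.append("independent"), lambda: events.append("independent")],
    lambda: events.append("main"),
    [lambda: events.append("independent"), lambda: events.append("independent")],
    clock=fake_clock,
)
print(events)
```

With the default `clock=time.perf_counter`, the returned value is the elapsed wall-clock time of `main_logic` alone.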
-
Merged PR 3187: Fully dynamic split_dimension op. [Denny Sun]
This change enables Accera to split a dynamic dimension by a dynamic size:
    M, N, MN = create_dimensions()

    Input = Array(role=Role.INPUT, element_type=ScalarType.float32, shape=(MN, ))
    Output = Array(role=Role.INPUT_OUTPUT, element_type=ScalarType.float32, shape=(M, N))

    nest = Nest(shape=(M, N))
    i, j = nest.get_indices()

    @nest.iteration_logic
    def _():
        split_input = Input._split_dimension(0, N)
        Output[i, j] = split_input[i, j]

    package.add(nest, args=(MN, M, N, Input, Output), base_name=f"{test_name}_fn")
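For intuition, the effect of splitting a dimension of runtime size MN by a runtime size N can be mimicked in NumPy. This is an analogy, not the Accera implementation: the `split_dimension` helper below is hypothetical, and it assumes a row-major split in which element `[i, j]` of the result maps to element `i * N + j` of the input.

```python
import numpy as np


def split_dimension(flat, n):
    """Reshape a 1-D array of length M*N into (M, N), with n known only at run time."""
    mn = flat.shape[0]
    if n <= 0 or mn % n != 0:
        raise ValueError(f"cannot split a dimension of size {mn} by {n}")
    return flat.reshape(mn // n, n)


# M and N are ordinary runtime values here, standing in for dynamic dimensions.
M, N = 3, 4
inp = np.arange(M * N, dtype=np.float32)
out = split_dimension(inp, N)

print(out.shape)  # (3, 4)
```

The divisibility check stands in for the implicit requirement that MN equals M times N; how Accera validates this is not stated in these notes.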
-
Merged PR 3185: [nfc] Adds tests for vectorization, fast_exp_sum. [Kern Handa]
-
Merged PR 3168: [docs] Tensorization tutorials and type name updates.
[Captain Jack Sparrow]
Full Changelog: v1.2.27...v1.2.28