
Releases: ROCm/Tensile

v2.2.3 - SplitU and WorkGroupMapping

30 Mar 17:18

SplitU
If you have a large summation but a small C tensor, you can create extra parallelism by splitting up the summation. This allows smaller C tensors to fill up larger GPUs.
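As a rough illustration of the idea (not Tensile's kernel code), the sketch below computes C = A * B in plain Python by cutting the summation index into `split_u` slices, producing one partial C per slice, and then adding the partials together. The function name `matmul_split_u` and the strided way the summation is partitioned are assumptions made for this example; on a GPU each slice could be handled by separate threads or work-groups.

```python
def matmul_split_u(A, B, split_u):
    """Multiply A (M x K) by B (K x N), splitting the K summation into split_u slices.

    Hypothetical helper for illustration; the strided partition is an assumption.
    """
    M, K, N = len(A), len(B), len(B[0])

    # One partial result per slice of the summation. The slices are independent,
    # so this is the extra parallelism: a small C still yields split_u * M * N
    # units of work.
    partials = []
    for s in range(split_u):
        k_slice = range(s, K, split_u)  # slice s takes k = s, s + split_u, ...
        partial = [
            [sum(A[i][k] * B[k][j] for k in k_slice) for j in range(N)]
            for i in range(M)
        ]
        partials.append(partial)

    # Final reduction: add the partial C tensors element by element.
    return [
        [sum(p[i][j] for p in partials) for j in range(N)]
        for i in range(M)
    ]


# A 1x1 C tensor with a summation of length 4: unsplit, there is only one
# output element to work on; with split_u=2 there are two independent slices.
A = [[1, 2, 3, 4]]
B = [[1], [2], [3], [4]]
C_plain = matmul_split_u(A, B, 1)
C_split = matmul_split_u(A, B, 2)
```

The trade-off is the extra reduction step at the end, so splitting tends to pay off only when C alone is too small to occupy the device.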

WorkGroupMapping
Changes which work-groups operate on which tiles of tensor C. This can help performance by improving cache reuse.
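To make the idea concrete, here is a minimal sketch of one possible mapping from a linear work-group id to a tile of C. It is an assumption for illustration, not necessarily the mapping Tensile implements: tiles are visited in vertical bands that are `group_width` tiles wide, so work-groups with nearby ids touch nearby columns of C. The names `tile_for_work_group` and `group_width` are hypothetical.

```python
def tile_for_work_group(wg_id, tiles_m, tiles_n, group_width):
    """Return (tile_row, tile_col) of C for a linear work-group id.

    Illustrative banded mapping (an assumption, not Tensile's actual scheme).
    """
    band_size = group_width * tiles_m           # tiles in one full band
    band = wg_id // band_size                   # which band this id falls in
    first_col = band * group_width              # leftmost tile column of the band
    width = min(group_width, tiles_n - first_col)  # the last band may be narrower
    local = wg_id - band * band_size            # position inside the band
    return local // width, first_col + local % width


# C is covered by a 2 x 3 grid of tiles, walked in bands two tiles wide.
order = [tile_for_work_group(w, 2, 3, 2) for w in range(6)]

# Setting group_width to the full width falls back to plain row-major order.
row_major = [tile_for_work_group(w, 2, 3, 3) for w in range(6)]
```

Whatever the mapping, every tile must be assigned to exactly one work-group; only the order in which tiles are visited changes, which in turn changes which parts of the input tensors are likely to be cached when neighboring work-groups run.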

v2.2.0 - Recursive Solution Selection Logic

13 Mar 16:46

Rather than choosing solutions based on size=M*N, the recursive solution selection logic (SSL) now chooses solutions based on M, N and K, by recursively partitioning the dimensions.
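One way to picture recursive partitioning is as a decision tree whose internal nodes split on one dimension at a threshold. The sketch below is a hypothetical model, not Tensile's actual logic format: the tuple layout, the thresholds, and the solution names are all invented for illustration.

```python
def select_solution(rule, size):
    """Walk a nested rule until a solution name is reached.

    rule is either a solution name (str) or a tuple
    (dimension, threshold, rule_if_below, rule_if_at_or_above).
    Hypothetical structure for illustration only.
    """
    if isinstance(rule, str):
        return rule
    dim, threshold, below, at_or_above = rule
    branch = below if size[dim] < threshold else at_or_above
    return select_solution(branch, size)


# Invented rules: partition on M first, then on N or K within each part.
rules = (
    "M", 128,
    ("N", 1024, "tiny_kernel", "skinny_kernel"),
    ("K", 1024, "short_k_kernel", "long_k_kernel"),
)

# Both problems have M*N = 262144, so a rule keyed on M*N alone could not
# tell them apart, yet they reach different leaves here.
skinny = select_solution(rules, {"M": 64, "N": 4096, "K": 32})
square = select_solution(rules, {"M": 512, "N": 512, "K": 4096})
```

Here `skinny` resolves to "skinny_kernel" and `square` to "long_k_kernel", which shows why selecting on M, N and K separately can distinguish problems that share the same M*N.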

v2.0.0 - Benchmarking Overhaul: Faster, Simpler, Programmable

23 Feb 21:12

The benchmarking protocol has been completely redesigned to use config.yaml files rather than requiring applications to generate problem.xml files.

Tensile is now an installable Python module.

Please read the wiki to understand all the new features.

v1.1.0 - Bug Fixes

26 Jan 17:18

Several bug fixes for rocBLAS.

v0.1 - Preview Release

15 Aug 20:53
Pre-release

Full support for tensor contractions for BLAS and DNN.