v2.4.1 Pack LayerNorm

@DefTruth released this 25 Sep 06:07 · 238 commits to main since this release · 4667308

What's Changed

  • [Nsight] Add nsys/ncu usage, ptx/sass by @DefTruth in #44
  • [DotProd][FP16] support f16x8_pack kernel by @DefTruth in #45
  • [LayerNorm][FP16] Add pack support for f16x8 LD/ST by @DefTruth in #46

Full Changelog: v2.4...v2.4.1