forked from NVIDIA/TransformerEngine
Commit
TP-RS overlap with send/recv ring-exchange (NVIDIA#724)
* TP-RS overlap with send/recv

Atomic GEMM based TP-RS overlap with send/recv

Signed-off-by: Sangkug Lym <[email protected]>

Specify userbuffer overlap method of each overlap instance

Signed-off-by: Sangkug Lym <[email protected]>

P2P TP-RS overlap with fp8 GEMM outputs

Signed-off-by: Sangkug Lym <[email protected]>

Fix TP-RS overlap with send/recv

Signed-off-by: Sangkug Lym <[email protected]>

* cleanup

Signed-off-by: Sangkug Lym <[email protected]>

* cleanup

Signed-off-by: Sangkug Lym <[email protected]>

* linting

Signed-off-by: Sangkug Lym <[email protected]>

* fix typo

Signed-off-by: Sangkug Lym <[email protected]>

---------

Signed-off-by: Sangkug Lym <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Showing 11 changed files with 497 additions and 268 deletions.