Add mx_fp8_bf16 kernel #1637
base: drisspg/stack/30
Conversation
nice! if CI is green - looks good! I think this should have at least one numerical test though. Can be a follow-up PR if needed.
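A numerical test along those lines could look roughly like the sketch below. The op name torch.ops.torchao.mx_fp8_bf16, the uint8 E8M0 scale encoding, the argument order, and the import path for the to_blocked helper are all assumptions for illustration, not something this PR confirms.

```python
import torch

# Helper linked in the PR description; module path is an assumption.
from transformer_nuggets.mx.to_blocked import to_blocked

GROUP_SIZE = 32


def quantize_mx_fp8(x: torch.Tensor):
    """Toy MX quantization: per-32-element power-of-two scales + e4m3 data."""
    rows, cols = x.shape
    xg = x.float().reshape(rows, cols // GROUP_SIZE, GROUP_SIZE)
    amax = xg.abs().amax(dim=-1, keepdim=True).clamp(min=2.0 ** -127)
    # Power-of-two scale so each group fits within e4m3's max normal (448).
    exp = torch.log2(amax / 448.0).ceil().clamp(-127, 127)
    scale = torch.exp2(exp)
    data = (xg / scale).reshape(rows, cols).to(torch.float8_e4m3fn)
    # E8M0-style biased-exponent encoding packed into uint8 (assumed input format).
    e8m0 = (exp.squeeze(-1) + 127).to(torch.uint8)
    return data, scale.squeeze(-1), e8m0  # scales: [rows, cols // 32]


def test_mx_fp8_bf16_vs_reference():
    M, K, N = 128, 256, 128
    a = torch.randn(M, K, device="cuda", dtype=torch.bfloat16)
    b = torch.randn(N, K, device="cuda", dtype=torch.bfloat16)

    a_fp8, a_scale, a_e8m0 = quantize_mx_fp8(a)
    b_fp8, b_scale, b_e8m0 = quantize_mx_fp8(b)

    # Reference: dequantize the same fp8 data and matmul in fp32,
    # so only accumulation/output precision differs from the kernel.
    a_ref = (a_fp8.float().reshape(M, -1, GROUP_SIZE) * a_scale.unsqueeze(-1)).reshape(M, K)
    b_ref = (b_fp8.float().reshape(N, -1, GROUP_SIZE) * b_scale.unsqueeze(-1)).reshape(N, K)
    ref = a_ref @ b_ref.t()

    # Kernel under test, with scales in the blocked cublasLt layout (op name illustrative).
    out = torch.ops.torchao.mx_fp8_bf16(a_fp8, b_fp8, to_blocked(a_e8m0), to_blocked(b_e8m0))

    torch.testing.assert_close(out.float(), ref, atol=1e-1, rtol=1e-1)
```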
Very cool!
Stacked PRs:
Add mx_fp8_bf16 kernel
Will flesh this out more, but this moves over the kernel from here: https://github.com/drisspg/driss_torch/blob/2813322f0b0f9a0f0fc8d382090ad0aaecf3468a/src/mx_fp8_bf16.cu#L162
This does fp8 x fp8 with E8M0 scales and group_size hard-coded to 32. The scale format is the same one cublasLt expects. I have created a PyTorch function that converts the [n_rows, n_cols//32] scales into the expected format:
https://github.com/drisspg/transformer_nuggets/blob/382cb0f19a5f615827174289b8ef552419d51fea/transformer_nuggets/mx/to_blocked.py#L11
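For context, the conversion pads the [n_rows, n_cols//32] scale tensor up to whole (128, 4) tiles and swizzles each tile into the blocked layout cublasLt consumes. Below is a rough sketch of that helper; the linked file is authoritative, and the exact swizzle details here may differ slightly.

```python
import torch


def ceil_div(a: int, b: int) -> int:
    return (a + b - 1) // b


def to_blocked(scales: torch.Tensor) -> torch.Tensor:
    """Rearrange a [rows, cols] scale matrix into the blocked layout used by
    cublasLt block-scaled matmuls (sketch of the linked helper)."""
    rows, cols = scales.shape
    n_row_blocks = ceil_div(rows, 128)
    n_col_blocks = ceil_div(cols, 4)

    # Pad up to whole (128, 4) tiles.
    padded = scales
    if (rows, cols) != (n_row_blocks * 128, n_col_blocks * 4):
        padded = torch.zeros(
            (n_row_blocks * 128, n_col_blocks * 4),
            device=scales.device,
            dtype=scales.dtype,
        )
        padded[:rows, :cols] = scales

    # View as (128, 4) tiles, then swizzle each tile into a 32x16 pattern
    # (four 32-row groups interleaved) and lay the tiles out contiguously.
    blocks = padded.view(n_row_blocks, 128, n_col_blocks, 4).permute(0, 2, 1, 3)
    rearranged = blocks.reshape(-1, 4, 32, 4).transpose(1, 2).reshape(-1, 32, 16)
    return rearranged.flatten()
```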
This was surprisingly hard-fought and would not have been possible without @albanD 😊
This allows PR #1625 to avoid any dependency on PyTorch core updates while we add the required dtypes and cuBLAS bindings: pytorch/pytorch#145562
Follow-up
Config needs more tuning