Fix PrecomputedValues::bindTensorMetaData for DID loop split #3854

wujingyue · 2025-02-08T07:21:41Z

This fixes PrecomputedValues::bindTensorMetaData in the same way as ExpressionEvaluator::bindTensorDomain and several other places were fixed. Caveat: having to apply the same fix in multiple places probably indicates pre-existing duplication of logic.

Fixes #3817

wujingyue · 2025-02-08T07:36:34Z

!test

github-actions · 2025-02-08T07:36:54Z

Description

Fix PrecomputedValues::bindTensorMetaData for DID loop split
Update tensor size handling in bindTensorMetaData
Modify test case test_sdpa_loop_split for better parameterization
Add unshardedSizes utility function usage

Changes walkthrough 📝

Relevant files

Bug fix

evaluator_common.cpp (csrc/evaluator_common.cpp, +6/-9): Update tensor size handling in `bindTensorMetaData`
Include `multidevice/utils.h`
Update `bindTensorMetaData` to use `unshardedSizes` for tensor sizes
Simplify device dimension handling in `bindTensorMetaData`

Enhancement

test_multidevice.py (tests/python/test_multidevice.py, +6/-5): Update `test_sdpa_loop_split` for better parameterization
Update `test_sdpa_loop_split` to use parameterized tensor shapes
Reorganize tensor shape definition in `test_sdpa_loop_split`

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
⚡ Recommended focus areas for review

Possible Issue: The new code introduces a dependency on `multidevice/utils.h` which was not previously included. Ensure that this inclusion is necessary and does not introduce any unintended side effects.

    #include <multidevice/utils.h>

Logic Change: The logic for binding tensor metadata has been changed to use `unshardedSizes` and `logical_sizes`. Verify that this change does not alter the behavior for non-multidevice scenarios.

    std::vector<int64_t> logical_sizes = unshardedSizes(tv, tensor.sizes());
    for (const auto dim : c10::irange(static_cast<int64_t>(logical_domain.size()))) {
      IterDomain* id = logical_domain[dim];
      const auto dim_size = logical_sizes.at(dim);
      if (id->isBroadcast()) {

Test Changes: The test `test_sdpa_loop_split` has been modified to use a different approach for defining tensor shapes. Ensure that these changes do not inadvertently skip valid test cases.

    d = multidevice_test.size
    mesh = nvfuser.DeviceMesh(range(d))

    class Model(FusionDefinition):
        def __init__(self, qkv_format: QkvFormat):
            super().__init__()
            self._qkv_format = qkv_format

        def definition(self) -> None:
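To make the "Logic Change" item concrete, here is a minimal Python sketch of the idea behind binding unsharded (logical) sizes instead of per-device sizes. It is not nvFuser's implementation: the function name `unsharded_sizes_sketch` and the `sharded_dims` mapping are hypothetical, and the sketch assumes that each device-sharded dimension is split evenly, so its logical extent is the local extent times the number of devices.

```python
from typing import Dict, List


def unsharded_sizes_sketch(
    local_sizes: List[int], sharded_dims: Dict[int, int]
) -> List[int]:
    """Recover logical sizes from the sizes of one device's local shard.

    `sharded_dims` is a hypothetical representation: it maps a logical
    dimension index to the number of devices that dimension is split across.
    Dimensions that are not listed are assumed to be unsharded.
    """
    logical = list(local_sizes)
    for dim, num_devices in sharded_dims.items():
        # Assumption: an even split, so logical extent = local extent * devices.
        logical[dim] = local_sizes[dim] * num_devices
    return logical


# Single-device case: nothing is sharded, so the sizes pass through unchanged.
single_device = unsharded_sizes_sketch([2, 3, 4], {})

# Dimension 1 is split across 4 devices, so its logical extent is 3 * 4.
multi_device = unsharded_sizes_sketch([2, 3, 4], {1: 4})
```

In this model the non-multidevice case is the empty mapping, which returns the input sizes untouched. That is the property the reviewer guide asks to be verified for the real `unshardedSizes`.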

wujingyue linked an issue Feb 8, 2025 that may be closed by this pull request

test_sdpa_loop_split errors out with dynamic shape #3817

Open

wujingyue added 2 commits February 7, 2025 23:28

Add a repro

afcb246

Fix PrecomputedValues::bindTensorMetaData for DID loop split

8725b94

in the same way as ExpressionEvaluator::bindTensorDomain and several other places. Caveat: having to fix multiple places in the same way probably indicates a pre-existing duplication of logic.

wujingyue force-pushed the bug3817 branch from a98e962 to 8725b94 Compare February 8, 2025 07:36

wujingyue requested review from Priya2698 and naoyam February 8, 2025 07:36
