Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix PrecomputedValues::bindTensorMetaData for DID loop split #3854

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

wujingyue
Copy link
Collaborator

@wujingyue wujingyue commented Feb 8, 2025

in the same way as ExpressionEvaluator::bindTensorDomain and several other places. Caveat: having to fix multiple places in the same way probably indicates a pre-existing duplication of logic.

Fixes #3817

@wujingyue wujingyue linked an issue Feb 8, 2025 that may be closed by this pull request
in the same way as ExpressionEvaluator::bindTensorDomain and several
other places. Caveat: having to fix multiple places in the same way
probably indicates a pre-existing duplication of logic.
@wujingyue
Copy link
Collaborator Author

!test

Copy link

github-actions bot commented Feb 8, 2025

Description

  • Fix PrecomputedValues::bindTensorMetaData for DID loop split

  • Update tensor size handling in bindTensorMetaData

  • Modify test case test_sdpa_loop_split for better parameterization

  • Add unshardedSizes utility function usage


Changes walkthrough 📝

Relevant files
Bug fix
evaluator_common.cpp
Update tensor size handling in bindTensorMetaData               

csrc/evaluator_common.cpp

  • Include multidevice/utils.h
  • Update bindTensorMetaData to use unshardedSizes for tensor sizes
  • Simplify device dimension handling in bindTensorMetaData
  • +6/-9     
    Enhancement
    test_multidevice.py
    Update test_sdpa_loop_split for better parameterization   

    tests/python/test_multidevice.py

  • Update test_sdpa_loop_split to use parameterized tensor shapes
  • Reorganize tensor shape definition in test_sdpa_loop_split
  • +6/-5     

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    ⚡ Recommended focus areas for review

    Possible Issue

    The new code introduces a dependency on multidevice/utils.h which was not previously included. Ensure that this inclusion is necessary and does not introduce any unintended side effects.

    #include <multidevice/utils.h>
    Logic Change

    The logic for binding tensor metadata has been changed to use unshardedSizes and logical_sizes. Verify that this change does not alter the behavior for non-multidevice scenarios.

    std::vector<int64_t> logical_sizes = unshardedSizes(tv, tensor.sizes());
    for (const auto dim :
         c10::irange(static_cast<int64_t>(logical_domain.size()))) {
      IterDomain* id = logical_domain[dim];
      const auto dim_size = logical_sizes.at(dim);
      if (id->isBroadcast()) {
    Test Changes

    The test test_sdpa_loop_split has been modified to use a different approach for defining tensor shapes. Ensure that these changes do not inadvertently skip valid test cases.

    d = multidevice_test.size
    mesh = nvfuser.DeviceMesh(range(d))
    
    class Model(FusionDefinition):
        def __init__(self, qkv_format: QkvFormat):
            super().__init__()
            self._qkv_format = qkv_format
    
        def definition(self) -> None:

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    None yet
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    test_sdpa_loop_split errors out with dynamic shape
    1 participant