Conv2d fail in DETR model #794

Open
ayerofieiev-tt opened this issue Feb 26, 2025 · 2 comments
ayerofieiev-tt commented Feb 26, 2025

The Conv2d call in the DETR model fails with an out-of-memory error on the latest main.
This started happening at some point during the past 3 weeks.

More info in this CI run:
https://github.com/tenstorrent/pytorch2.0_ttnn/actions/runs/13539288917/job/37836823583

pytest models/detr/test_detr.py 

Log

    def conv2d(
        *,
        input_tensor: ttnn.Tensor,  # may or may not be sharded
        weight_tensor: ttnn.Tensor,
        device: ttnn.Device,
        in_channels: int,
        out_channels: int,
        batch_size: int,
        input_height: int,
        input_width: int,
        kernel_size: Union[int, Tuple[int, int]],
        stride: Union[int, Tuple[int, int]],
        padding: Union[int, Tuple[int, int]],
        dilation: Union[int, Tuple[int, int]] = (1, 1),
        groups: int = 1,
        bias_tensor: ttnn.Tensor = None,
        conv_config: Conv2dConfig = None,  # config overrides by user
        compute_config=None,  # compute config overrides by user
        memory_config: ttnn.MemoryConfig = None,  # memory config overrides by user
        conv_op_cache={},  # basic conv object caching in python needed for intermediate refactoring. Not needed after full op refactoring in C++.
        debug=False,  # ignored
        return_output_dim=False,
        return_weights_and_bias=False,
    ) -> Tuple[ttnn.Tensor, int, int, ttnn.Tensor, ttnn.Tensor]:
        (
            conv_output,
            output_height,
            output_width,
            prepared_device_weight,
            prepared_device_bias,
>       ) = ttnn._ttnn.operations.conv.conv2d(
            input_tensor=input_tensor,
            weight_tensor=weight_tensor,
            device=device,
            in_channels=in_channels,
            out_channels=out_channels,
            batch_size=batch_size,
            input_height=input_height,
            input_width=input_width,
            kernel_size=kernel_size,
            stride=stride,
            padding=padding,
            dilation=dilation,
            groups=groups,
            bias_tensor=bias_tensor,
            conv_config=conv_config,
            compute_config=compute_config,
            memory_config=memory_config,
        )
E       RuntimeError: TT_THROW @ /work/tt_metal/impl/program/program.cpp:905: tt::exception
E       info:
E       Statically allocated circular buffers in program 2372 clash with L1 buffers on core range [(x=0,y=0) - (x=7,y=3)]. L1 buffer allocated at 498368 and static circular buffer region ends at 560352
E       backtrace:
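
For a sense of scale, the two addresses in the error message can be compared directly. This is only a minimal sketch of the arithmetic, assuming the clash means the static circular buffer region extends past the address where the L1 buffer begins. The helper name `cb_l1_overlap_bytes` is hypothetical and is not part of ttnn or tt-metal.

```python
def cb_l1_overlap_bytes(cb_region_end: int, l1_buffer_start: int) -> int:
    """Bytes by which the circular buffer region runs past the L1 buffer start.

    Hypothetical helper for illustration only. Assumes a clash occurs when
    cb_region_end is greater than l1_buffer_start; returns 0 otherwise.
    """
    return max(0, cb_region_end - l1_buffer_start)


# Values copied from the error message above.
L1_BUFFER_START = 498368
CB_REGION_END = 560352

overlap = cb_l1_overlap_bytes(CB_REGION_END, L1_BUFFER_START)
print(f"overlap: {overlap} bytes ({overlap / 1024:.1f} KiB)")
```

Under that assumption, the circular buffers would need to shrink by about 60.5 KiB (61984 bytes), or the L1 buffer would need to be placed higher by that amount, for this program to fit on the reported core range.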
@ayerofieiev-tt

@jmalone-tt fyi

@ayerofieiev-tt

@pavlejosipovic, can you please take a look at this one?
