#15868: use a buffer's size when creating its CB in groupnorm (#16093)
### Ticket
Link to GitHub Issue
#15868

### Problem description
- Some group norm tests fail in Debug builds with an assert
about CB size
- The underlying reason is that the in0 CB was always sized to the
nearest tile, even when the input was row major and therefore smaller

### What's changed
- Set the in0 CB size from the input buffer's per-bank size
(`aligned_size_per_bank()`), which gives the right size for both
row-major and tile input
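
To illustrate the size mismatch, here is a minimal sketch of the arithmetic. The helper names, the 32x32 tile shape, the 2-byte element size, the 32-byte alignment, and the 8x64 shard are all assumptions chosen for the example and are not tt-metal APIs. The rule that a CB must not be larger than the buffer backing it is likewise an assumption about what the Debug assert checks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; these are not tt-metal APIs.
constexpr uint32_t kTileDim = 32;  // assumed 32x32 tile

constexpr uint32_t div_up(uint32_t a, uint32_t b) { return (a + b - 1) / b; }

constexpr uint32_t align_up(uint32_t n, uint32_t alignment) {
    return div_up(n, alignment) * alignment;
}

// Bytes needed when the shard is padded out to whole tiles
// (the old in0 sizing: block tiles * single tile size).
constexpr uint32_t tile_rounded_bytes(uint32_t rows, uint32_t cols, uint32_t elem_bytes) {
    return div_up(rows, kTileDim) * div_up(cols, kTileDim) * kTileDim * kTileDim * elem_bytes;
}

// Bytes a row-major shard actually occupies, aligned up
// (stands in for the buffer's aligned per-bank size).
constexpr uint32_t row_major_bytes(uint32_t rows, uint32_t cols, uint32_t elem_bytes,
                                   uint32_t alignment) {
    return align_up(rows * cols * elem_bytes, alignment);
}

// Assumed invariant: a CB may not claim more bytes than the buffer behind it.
constexpr bool cb_fits(uint32_t cb_bytes, uint32_t buffer_bytes) {
    return cb_bytes <= buffer_bytes;
}

// An 8x64 shard of 2-byte elements: one tile row by two tile columns.
static_assert(tile_rounded_bytes(8, 64, 2) == 4096, "2 tiles * 2048 bytes");
static_assert(row_major_bytes(8, 64, 2, 32) == 1024, "8 * 64 * 2 bytes, already aligned");
static_assert(!cb_fits(4096, 1024), "tile-rounded CB overruns the row-major buffer");
static_assert(cb_fits(1024, 1024), "buffer-sized CB fits");
```

Under these assumptions the tile-rounded size is four times the row-major shard, so sizing the CB from the buffer itself avoids the overrun for row-major input while giving the same result for tile input whose dimensions are already tile multiples.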

### Checklist
- [x] Post commit CI passes: across
https://github.com/tenstorrent/tt-metal/actions/runs/12360041457 and
https://github.com/tenstorrent/tt-metal/actions/runs/12374697868, all
sub-jobs pass at least once
- [x] Blackhole Post commit (if applicable)
https://github.com/tenstorrent/tt-metal/actions/runs/12360044285
- [x] Model regression CI testing passes (if applicable)
https://github.com/tenstorrent/tt-metal/actions/runs/12360048449
- [x] Device performance regression CI testing passes (if applicable)
https://github.com/tenstorrent/tt-metal/actions/runs/12360046481
- [ ] **(For models and ops writers)** Full [new
models](https://github.com/tenstorrent/tt-metal/actions/workflows/full-new-models-suite.yaml)
tests pass
- [x] New/Existing tests provide coverage for changes (in Debug)
bbradelTT authored Jan 8, 2025
1 parent 204ed99 commit 5f48c90
Showing 1 changed file with 2 additions and 2 deletions.
@@ -320,7 +320,8 @@ operation::ProgramWithCallbacks groupnorm_multi_core_sharded(
 ////////////////////////////////////////////////////////////////////////////
 // block size for in0 (tensor a)
 uint32_t in0_block_tiles = per_core_Nt * per_core_Mt;
-uint32_t in0_CB_size = in0_block_tiles * in_single_tile_size;
+uint32_t in0_CB_size = a.buffer()->aligned_size_per_bank(); // use buffer size to handle both RM and Tile
+uint32_t in_CB_size = in0_block_tiles * in_single_tile_size;
 // in2 - scaler
 uint32_t in2_CB_size = single_tile_size;
 // in3 - eps
@@ -338,7 +339,6 @@ operation::ProgramWithCallbacks groupnorm_multi_core_sharded(
 uint32_t repack_CB_size = per_core_Nt * in_single_tile_size * 2; // double buffer
 // itermediate buffers
 uint32_t interm_block_tiles = block_ht * block_wt;
-uint32_t in_CB_size = in0_CB_size;
 uint32_t im_out_CB_size = out_single_tile_size * interm_block_tiles;
 uint32_t x_CB_size = interm_block_tiles * single_tile_size;
 uint32_t xmm_CB_size = interm_block_tiles * single_tile_size;