#0: Added a naive batching mechanism for some models in the tests folder #740
Open
jbedichekTT wants to merge 47 commits into main from jb/adding-batching-to-tests
+372 −87
Changes from 42 commits
Commits
bd6ec3c  #0: Added a naive batching mechanism for some models in the tests folder (jbedichekTT)
1b79c98  Added workflow for testing max batch sizes (jbedichekTT)
485a032  Empty workflow placeholder (jbedichekTT)
48e330e  Reduce verbosity of tests (jbedichekTT)
d616377  Merge branch 'main' into jb/adding-batching-to-tests (jbedichekTT)
c4a6a5a  First max batch size search (unoptimzed) (jbedichekTT)
253f534  Fixed issue with Github triggering (jbedichekTT)
9a7438d  Workflow troubleshooting (jbedichekTT)
1f2b60f  Fixed PYTHONPATH issue (jbedichekTT)
2bb52a2  Updated workflow enviornment var (jbedichekTT)
28a6369  Fixing workflow config (jbedichekTT)
4748595  Fixing variable scope (jbedichekTT)
f8f7112  Adjust bounds of batch search (jbedichekTT)
de0e23d  added adaptive search bounds (jbedichekTT)
47e2ea0  converting end-to-end test (jbedichekTT)
c8c0af6  hand_landmark fix (jbedichekTT)
e11510d  syntax fix for search script (jbedichekTT)
bf0027d  syntax fix for search script (jbedichekTT)
4e31900  Update dependencies to 0.56.0-rc9 (#745) (ayerofieiev-tt)
ea160c9  Update requirements.txt (ayerofieiev-tt)
b406a8c  Update requirements.txt (ayerofieiev-tt)
250cddc  Update requirements.txt (ayerofieiev-tt)
fe6bb36  Update update-ttnn-wheel.yaml (ayerofieiev-tt)
a757c30  Update requirements.txt (ayerofieiev-tt)
f6c0ded  deleted unnecessary files (jbedichekTT)
caa813e  reformatting and adding default batch size (jbedichekTT)
13964e8  run additional tests (jbedichekTT)
036e966  refactoring (jbedichekTT)
907751f  removing extraneous imports (jbedichekTT)
74dd352  removing extreneous imports (jbedichekTT)
bf2cd30  conftest argument fix (jbedichekTT)
3ba6ef6  utils typo fix (jbedichekTT)
d92db1f  reconfigure test (jbedichekTT)
68fbb73  refactoring tests and only searching even batches (jbedichekTT)
8ffaeda  search script fix (jbedichekTT)
5d710ec  search script grouping fix (jbedichekTT)
894b00f  reduced verbosity (jbedichekTT)
153b274  further reducing verbsoity (jbedichekTT)
e3a5db9  modified search to exit on highest even value (jbedichekTT)
bce3f43  reduced num iterations (jbedichekTT)
1f6a8c2  exclude tests (jbedichekTT)
1204887  pruning tests for next iteration (jbedichekTT)
8403376  next batch of tests (jbedichekTT)
95f3367  bounds reconfig (jbedichekTT)
15c386e  arithmetic fix (jbedichekTT)
6f70217  reset tests (jbedichekTT)
b2493e3  isolating BERT test (jbedichekTT)
@@ -1,17 +1,205 @@
 name: Maximum Batch Size Experiment

 on:
   workflow_dispatch:
     inputs:
       branch:
         description: "Branch name"
         required: true
         type: string

-jobs:
-  say-hello:
-    runs-on: ubuntu-latest
+permissions:
+  actions: read
+  contents: write
+  pages: write
+  id-token: write
+  pull-requests: write
+
+jobs:
+  model-tests:
+    runs-on: ["in-service"]
+    strategy:
+      matrix:
+        group: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
+    env:
+      pytest_verbosity: 0
+      pytest_report_title: "⭐️ Model Tests"
+      PYTHONPATH: ${{ github.workspace }}
     steps:
-      - name: Print Hello
-        run: echo "hello"
+      - uses: actions/checkout@v4
+        with:
+          lfs: true
+      - uses: ./.github/actions/common_repo_setup
+
+      - name: Run Model Tests in Parallel
+        shell: bash
+        run: |
+          set +e
+          num_iterations=1
+          file_prefix="tests/"
+
+          # Assign tests dynamically based on matrix group
+          TOTAL_GROUPS=24
+          CURRENT_GROUP=${{ matrix.group }}
+
+          mapfile -t test_ids_verbose < <(pytest --collect-only -q -m converted_end_to_end --ignore=tests/models/autoencoder_conv tests/models/ | awk -F '::' '{print $1}' | sort -u)
+          test_ids=()
+
+          # Array of tests to exclude
+          exclude_tests=(
+            "models/mnist/test_mnist.py"
+            "models/MobileNetV2/test_MobileNetV2.py"
+            "models/openpose/test_openpose_v2.py"
+            "models/resnet/test_resnet.py"
+            "models/resnet50/test_resnet50.py"
+            "models/roberta/test_roberta.py"
+            "models/unet/test_unet.py"
+            "models/hand_landmark/test_hand_landmark.py"
+            "models/squeeze_bert/test_squeeze_bert.py"
+            "models/llama/test_llama.py"
+            "models/timm/test_timm_image_classification.py"
+            "models/torchvision/test_torchvision_image_classification.py"
+            "models/albert/test_albert_question_answering.py"
+            "models/albert/test_albert_sequence_classification.py"
+            "models/albert/test_albert_token_classification.py"
+            "models/albert/test_albert_masked_lm.py"
+            "models/unet_carvana/test_unet_carvana.py"
+            "models/autoencoder_linear/test_autoencoder_linear.py"
+            "models/perceiver_io/test_perceiver_io.py"
+          )
+
+          # Preprocess file paths
+          for file in "${test_ids_verbose[@]}"; do
+            if [[ "$file" == models/* ]]; then
+              # Check if the file is in the exclude_tests array
+              skip=false
+              for exclude_test in "${exclude_tests[@]}"; do
+                if [[ "$file" == "$exclude_test" ]]; then
+                  skip=true
+                  break
+                fi
+              done
+
+              if ! $skip; then
+                test_ids+=("$file")
+              fi
+            fi
+          done
+
+          TOTAL_TESTS=${#test_ids[@]}
+          TESTS_PER_GROUP=1
+
+          START_INDEX=$(( (CURRENT_GROUP - 1) * TESTS_PER_GROUP ))
+          END_INDEX=$(( CURRENT_GROUP * TESTS_PER_GROUP ))
+
+          if (( END_INDEX > TOTAL_TESTS )); then
+            END_INDEX=$TOTAL_TESTS
+          fi
+
+          # Slice the test array for the current group
+          group_test_ids=("${test_ids[@]:START_INDEX:TESTS_PER_GROUP}")
+          echo "All tests ($TOTAL_TESTS): ${test_ids[*]}"
+          echo "Running tests in group $CURRENT_GROUP..."
+          echo "Tests assigned to this group: ${group_test_ids[*]}"
+
+          failed_batch_and_test_array=()
+          max_batch_sizes_array=()
+          counter=0
+
+          # Define function for finding max batch size when uninitialized
+          find_max_batch_size_uninitialized() {
+            local test_path=$1
+            local batch_range_lower=$2
+            local batch_range_upper=$3
+            local batch_range=("$batch_range_lower" "$batch_range_upper")
+
+            local not_found=1
+            local min_failed_batch=0
+            local max_successful_batch=0
+            local batch_size_to_test=0
+            local exit_code
+
+            while (( not_found )); do
+              # Double the upper bound once the last tested size sits just below it
+              if (( batch_size_to_test == batch_range_upper - 2 )); then
+                batch_range_upper=$(( batch_range_upper * 2 ))
+                batch_range[1]=$batch_range_upper # Update the upper bound in the array
+                echo "Expanding upper bound to: $batch_range_upper" # Optional logging
+              fi
+              batch_size_to_test=$(( (batch_range[0] + batch_range[1]) / 2 ))
+              # Only even batch sizes are searched, so round odd midpoints down
+              if (( batch_size_to_test % 2 != 0 )); then
+                batch_size_to_test=$(( batch_size_to_test - 1 ))
+              fi
+
+              echo "Testing with batch size $batch_size_to_test"
+
+              python3 -m pytest "$test_path" -s --batch_size "$batch_size_to_test" --report_nth_iteration "$num_iterations"
+              exit_code=$?
+
+              if (( exit_code != 0 )); then
+                batch_range[1]=$batch_size_to_test
+                min_failed_batch=$batch_size_to_test
+              else
+                batch_range[0]=$batch_size_to_test
+                max_successful_batch=$batch_size_to_test
+              fi
+
+              # Stop when the smallest failure is the next even size after the largest success
+              if (( min_failed_batch - max_successful_batch == 2 )); then
+                not_found=0
+              fi
+            done
+            echo "min failed batch: $min_failed_batch"
+            echo "Max batch size for $test_path: $max_successful_batch"
+            max_batch_sizes_array+=("$max_successful_batch")
+            failed_batch_and_test_array+=("$max_successful_batch $test_path")
+          }
+
+          # Define function for finding max batch size when initialized
+          find_max_batch_size_initialized() {
+            test_path=$1
+            batch_range_lower=$2
+            batch_range_upper=$3
+            batch_range=($batch_range_lower $batch_range_upper)
+
+            prior_batches_array=($(printf "%s\n" "${max_batch_sizes_array[@]}" | sort -n))
+            not_found=1
+            min_failed_batch=0
+            max_successful_batch=0
+            first_iter=1
+
+            while (( not_found )); do
+              if (( first_iter )); then
+                median_index=$(( ${#prior_batches_array[@]} / 2 ))
+                batch_size_to_test=${prior_batches_array[$median_index]}
+                first_iter=0
+              else
+                batch_size_to_test=$(( (batch_range[0] + batch_range[1]) / 2 ))
+              fi
+
+              echo "Testing with batch size $batch_size_to_test"
+              python3 -m pytest "$test_path" -s --batch_size $batch_size_to_test --report_nth_iteration $num_iterations
+              exit_code=$?
+
+              if (( exit_code != 0 )); then
+                batch_range[1]=$batch_size_to_test
+                min_failed_batch=$batch_size_to_test
+              else
+                batch_range[0]=$batch_size_to_test
+                max_successful_batch=$batch_size_to_test
+              fi
+
+              if (( min_failed_batch - max_successful_batch == 1 )); then
+                not_found=0
+              fi
+            done
+
+            echo "Max batch size $max_successful_batch found for $test_path"
+            max_batch_sizes_array+=("$max_successful_batch")
+            failed_batch_and_test_array+=("$max_successful_batch $test_path")
+          }
+
+          # Main loop to distribute test runs across groups
+          for t in "${group_test_ids[@]}"; do
+            if [ -z "$t" ]; then
+              continue
+            fi
+            echo "Running test: $t"
+            file_path="${file_prefix}${t%%::*}"
+            l_bound=1
+            u_bound=256
+            find_max_batch_size_uninitialized "$file_path" "$l_bound" "$u_bound"
+          done
+
+          echo "Final Max Batches: ${failed_batch_and_test_array[@]}"
+          exit 0
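
The even-batch search above can be tried without hardware by swapping the pytest call for a stub. The following is a minimal sketch, not the PR's loop: it doubles a passing size until the first failure and then bisects over even sizes only. `passes`, `LIMIT`, and `find_max_even_batch` are hypothetical names that do not appear in the PR, and the cap is assumed to be even.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the pytest run: succeeds when the batch size is
# at or below an assumed device limit.
LIMIT=${LIMIT:-20}
passes() { [ "$1" -le "$LIMIT" ]; }

# Print the largest even batch size in [2, cap] for which `passes` succeeds,
# or 0 if even a batch size of 2 fails. `cap` is assumed to be even.
find_max_even_batch() {
  local cap=${1:-256} lo=0 hi=2 mid
  # Phase 1: double the candidate until it fails or exceeds the cap.
  while [ "$hi" -le "$cap" ] && passes "$hi"; do
    lo=$hi
    hi=$(( hi * 2 ))
  done
  # Sizes above the cap are never run, so treat cap + 2 as the first failure.
  if [ "$hi" -gt "$cap" ]; then
    hi=$(( cap + 2 ))
  fi
  # Phase 2: bisect over even sizes; lo always passes (or is 0), hi never does.
  while [ $(( hi - lo )) -gt 2 ]; do
    mid=$(( (lo + hi) / 2 ))
    mid=$(( mid - mid % 2 ))
    if passes "$mid"; then lo=$mid; else hi=$mid; fi
  done
  echo "$lo"
}

find_max_even_batch 256   # prints 20 with the default LIMIT of 20
```

Starting the doubling at 2 keeps the number of runs small for models whose limit is low, which is where each failed run is cheapest to avoid. In the workflow, `passes` would wrap the `python3 -m pytest ... --batch_size` call and return its exit status.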
Empty file.
@@ -0,0 +1,24 @@
+tests/models/MobileNetV2/test_MobileNetV2.py
+tests/models/albert/test_albert_masked_lm.py
+tests/models/albert/test_albert_question_answering.py
+tests/models/albert/test_albert_sequence_classification.py
+tests/models/albert/test_albert_token_classification.py
+tests/models/autoencoder_linear/test_autoencoder_linear.py
+tests/models/beit/test_beit_image_classification.py
+tests/models/bert/test_bert.py
+tests/models/bloom/test_bloom.py
+tests/models/distilbert/test_distilbert.py
+tests/models/dpr/test_dpr.py
+tests/models/llama/test_llama.py
+tests/models/mlpmixer/test_mlpmixer.py
+tests/models/mnist/test_mnist.py
+tests/models/openpose/test_openpose_v2.py
+tests/models/perceiver_io/test_perceiver_io.py
+tests/models/resnet/test_resnet.py
+tests/models/resnet50/test_resnet50.py
+tests/models/roberta/test_roberta.py
+tests/models/squeeze_bert/test_squeeze_bert.py
+tests/models/unet/test_unet.py
+tests/models/unet_brain/test_unet_brain.py
+tests/models/unet_carvana/test_unet_carvana.py
+tests/models/yolov5/test_yolov5.py
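
The list has 24 entries, the same as the number of matrix groups in the workflow. The diff does not show this file's name or where it is read, and the workflow in this PR collects tests through pytest instead. Assuming a list like this were used to assign tests to groups, a minimal sketch of the slicing could look as follows; `tests_for_group` and the file name in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Print the tests assigned to one matrix group, reading one path per line.
# Arguments: list file, 1-based group number, tests per group (default 1).
tests_for_group() {
  local list_file=$1 group=$2 per_group=${3:-1}
  local start=$(( (group - 1) * per_group + 1 ))
  local end=$(( group * per_group ))
  # sed prints nothing when the range lies past the end of the file,
  # so groups beyond the last test simply receive no work.
  sed -n "${start},${end}p" "$list_file"
}

# Hypothetical usage inside a job with matrix.group = 8:
#   tests_for_group test_list.txt 8
```

Reading from a fixed list keeps the group assignment stable between runs, whereas collecting with pytest shifts the assignment whenever a test file is added or excluded.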
@@ -1,7 +1,10 @@
-torch@https://download.pytorch.org/whl/cpu/torch-2.2.1%2Bcpu-cp38-cp38-linux_x86_64.whl
-torchvision@https://download.pytorch.org/whl/cpu/torchvision-0.17.1%2Bcpu-cp38-cp38-linux_x86_64.whl
+torch==2.2.1+cpu
+torchvision==0.17.1+cpu
+
 tabulate==0.9.0
 networkx==3.1
 graphviz
 matplotlib==3.7.1
-ttnn@https://github.com/tenstorrent/tt-metal/releases/download/v0.56.0-rc9/ttnn-0.56.0rc9+any-cp38-cp38-linux_x86_64.whl
+
+ttnn @ https://github.com/tenstorrent/tt-metal/releases/download/v0.56.0-rc37/ttnn-0.56.0rc37+any-cp38-cp38-linux_x86_64.whl ; python_version=="3.8"
+ttnn @ https://github.com/tenstorrent/tt-metal/releases/download/v0.56.0-rc37/ttnn-0.56.0rc37+any-cp310-cp310-linux_x86_64.whl ; python_version=="3.10"
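
The two ttnn lines differ only in the CPython tag and the matching `python_version` marker, so pip installs whichever wheel fits the running interpreter. As a rough sketch, a line of this shape could be generated per Python version. The URL pattern is inferred from the lines above and may not hold for other releases, and `ttnn_requirement` is a hypothetical helper, not part of the PR.

```shell
#!/usr/bin/env bash
# Build one ttnn requirement line, following the URL pattern of the lines above.
# Arguments: base version (0.56.0), release-candidate number (37),
# Python version (3.8 or 3.10).
ttnn_requirement() {
  local base=$1 rc=$2 py=$3
  # CPython wheel tag, for example 3.10 -> cp310
  local tag="cp$(printf '%s' "$py" | tr -d '.')"
  printf 'ttnn @ https://github.com/tenstorrent/tt-metal/releases/download/v%s-rc%s/ttnn-%src%s+any-%s-%s-linux_x86_64.whl ; python_version=="%s"\n' \
    "$base" "$rc" "$base" "$rc" "$tag" "$tag" "$py"
}

ttnn_requirement 0.56.0 37 3.8
ttnn_requirement 0.56.0 37 3.10
```

With these arguments the two calls print the same two lines that the diff adds.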
restore?
Will restore