docs: Benchmarks #92
Conversation
docs/docs/benchmarks/memory-usage.md
Outdated
| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] |
| ----------------------------------------------------------------------------------------------- | ---------------------- | ----------------- |
| STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 |
Let's split it, one model per line. This looks a bit off to me.
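A possible layout after the split (assuming the per-model figures match the combined row above; the actual per-model values may differ):

```
| Model                       | Android (XNNPack) [MB] | iOS (CoreML) [MB] |
| --------------------------- | ---------------------- | ----------------- |
| STYLE_TRANSFER_CANDY        | 950                    | 350               |
| STYLE_TRANSFER_MOSAIC       | 950                    | 350               |
| STYLE_TRANSFER_UDNIE        | 950                    | 350               |
| STYLE_TRANSFER_RAIN_PRINCESS| 950                    | 350               |
```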
@@ -0,0 +1,39 @@
---
title: Inference Time
I think it would be better to only list the 'consecutive' value. Firstly, it might be a bit confusing what 'first' actually means, and also who knows when ExecuTorch and the system might decide to reload the model, causing a new 'first' run. Better to put a warning banner at the top of the page with a notice that initial runs can be significantly (even 2x) slower due to loading the model into memory.
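If the docs site supports Docusaurus-style admonitions (an assumption based on the `docs/docs/` layout), the banner could be a sketch like:

```
:::warning
The initial run of a model can be significantly (even 2x) slower than subsequent runs, because the model is being loaded into memory. The times reported below are for consecutive runs.
:::
```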
| LLAMA3_2_1B_QLORA | 31.8 | 11.4 | 11.2 | 37.3 | 44.4 |
| LLAMA3_2_3B | ❌ | ❌ | ❌ | ❌ | 7.1 |
| LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | 19.4 |
| LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | 18.1 |
Here, below this table, add a description of why we have ❌'s in some places (not enough memory).
Description
Add model benchmarks (memory usage, inference time, model size).
Type of change
Checklist