docs: Benchmarks #92

Merged
merged 6 commits into from
Feb 3, 2025

Conversation

Member

@jakmro jakmro commented Jan 31, 2025

Description

Add model benchmarks (memory usage, inference time, model size).

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update (improves or adds clarity to existing documentation)

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

@jakmro jakmro changed the title from "@jakmro/benchmarks" to "docs: Benchmarks" Jan 31, 2025
@jakmro jakmro requested a review from mkopcins January 31, 2025 16:08

| Model | Android (XNNPack) [MB] | iOS (CoreML) [MB] |
| ----------------------------------------------------------------------------------------------- | ---------------------- | ----------------- |
| STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 |
Collaborator

Let's split it, one model per line. This looks a bit off to me.
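For illustration, the split suggested above could be done mechanically. This is only a sketch: `split_combined_row` is a hypothetical helper, not part of this PR, and it assumes every model listed in the combined row shares the same measured values.

```python
def split_combined_row(row: str) -> list[str]:
    """Expand a Markdown table row whose first cell lists several
    comma-separated models into one row per model."""
    # Drop the outer pipes, then split the row into trimmed cells.
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    models, values = cells[0], cells[1:]
    # Emit one row per model, repeating the shared values (assumption).
    return [
        "| " + " | ".join([model.strip(), *values]) + " |"
        for model in models.split(",")
        if model.strip()
    ]


combined = (
    "| STYLE_TRANSFER_CANDY, STYLE_TRANSFER_MOSAIC, "
    "STYLE_TRANSFER_UDNIE, STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 |"
)
rows = split_combined_row(combined)
for line in rows:
    print(line)
```

Each style transfer model then gets its own row with the same two values, which is easier to scan and leaves room for the numbers to diverge later.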

@@ -0,0 +1,39 @@
---
title: Inference Time
Collaborator

I think it would be better to list only the 'consecutive' value. First, it might be a bit confusing what it actually means; second, who knows when ExecuTorch or the system might decide to reload the model, causing a new 'first' run. Better to put a warning banner at the top of the page noting that initial runs can be significantly (even 2x) slower because the model is being loaded into memory.
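To make the 'first' versus 'consecutive' distinction concrete, here is a minimal timing sketch. It is not the benchmark code used for this PR: `time_runs` and `FakeModel` are hypothetical names, and the fake model only simulates a one-time load cost with `time.sleep`.

```python
import time
from statistics import mean
from typing import Callable


def time_runs(run_inference: Callable[[], object], runs: int = 5) -> tuple[float, float]:
    """Return (first_run_ms, mean_consecutive_ms) over `runs` calls."""
    if runs < 2:
        raise ValueError("need at least 2 runs to separate first from consecutive")
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        run_inference()
        durations.append((time.perf_counter() - start) * 1000.0)
    # The first call typically includes model loading, so report it separately.
    return durations[0], mean(durations[1:])


class FakeModel:
    """Stand-in for a real model: the first call pays a simulated load cost."""

    def __init__(self) -> None:
        self.loaded = False

    def __call__(self) -> None:
        if not self.loaded:
            time.sleep(0.05)  # simulated one-time load into memory
            self.loaded = True
        time.sleep(0.005)  # simulated inference


first_ms, consecutive_ms = time_runs(FakeModel(), runs=5)
print(f"first: {first_ms:.1f} ms, consecutive: {consecutive_ms:.1f} ms")
```

Reporting only the consecutive mean, as suggested, keeps the table stable even if the runtime reloads the model and a later call becomes a new 'first' run.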

| LLAMA3_2_1B_QLORA | 31.8 | 11.4 | 11.2 | 37.3 | 44.4 |
| LLAMA3_2_3B | ❌ | ❌ | ❌ | ❌ | 7.1 |
| LLAMA3_2_3B_SPINQUANT | 17.2 | 8.2 | ❌ | 16.2 | 19.4 |
| LLAMA3_2_3B_QLORA | 14.5 | ❌ | ❌ | 14.8 | 18.1 |
Collaborator

Here, below this table, add a description of why we have x's in some places (not enough memory).

@jakmro jakmro requested a review from mkopcins February 3, 2025 14:04
@mkopcins mkopcins merged commit c2eee13 into main Feb 3, 2025
2 checks passed
@mkopcins mkopcins deleted the @jakmro/benchmarks branch February 3, 2025 19:27