Test file for copilot review
johnbradley committed Dec 30, 2024
1 parent 386f630 commit 1abc003
Showing 1 changed file with 22 additions and 0 deletions.
22 changes: 22 additions & 0 deletions examples/FineTuneSVM.md
@@ -0,0 +1,22 @@
# Fine-tune with an SVM
Creates a model that combines pybioclip image embeddings with an SVM, using images from the [Somnath01/Birds_Species](https://huggingface.co/datasets/Somnath01/Birds_Species) dataset. This dataset contains 1000 train images, 403 test images, and 50 validation images. This notebook uses only the train and test images. The dataset was chosen for convenience; its suitability has not been analyzed.

When running this notebook in Colab, change the _runtime type_ to a GPU type to speed up processing. Additionally, when running the next step in Colab you may see an error about the installed version of `fsspec`. This error doesn't seem to cause any problems with this notebook.

## Load dataset
This step takes around 7 minutes the first time it is run, because the images must be downloaded.

## Set up an SVM model
The `init_svc()` function is copied from [biobench newt](https://github.com/samuelstevens/biobench/blob/637432bfda2b567d966d49bf8c4b37b339d4dc2a/biobench/newt/__init__.py#L247-L262), created by [@samuelstevens](https://github.com/samuelstevens).
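
For orientation, here is a minimal sketch of what a scikit-learn SVM setup can look like. It is not a copy of the linked `init_svc()` function: the name `init_svc_sketch` and the default `C` and `kernel` values are assumptions chosen for illustration.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def init_svc_sketch(C=1.0, kernel="rbf"):
    """Hypothetical stand-in for init_svc(): standardize features, then classify with an SVC."""
    # StandardScaler rescales each embedding dimension to zero mean and unit variance,
    # which generally helps SVMs whose kernels are sensitive to feature scale.
    return make_pipeline(StandardScaler(), SVC(C=C, kernel=kernel))


model = init_svc_sketch()
```

Wrapping the scaler and the classifier in one pipeline means the same scaling learned from the train images is applied automatically at prediction time.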

## Train the SVM model
Trains the SVM using the train dataset. This step takes about 10 minutes on CPU and about 1 minute on GPU.
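
As a rough illustration of the training step, the sketch below fits a scikit-learn `SVC` on synthetic vectors that stand in for image embeddings. The embedding width of 512, the two-class setup, and the linear kernel are assumptions for this example only; in the notebook the features would come from pybioclip instead.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins for image embeddings: two well-separated clusters.
n_per_class, dim = 20, 512  # dim is an assumed embedding width
class_a = rng.normal(loc=-1.0, scale=0.1, size=(n_per_class, dim))
class_b = rng.normal(loc=1.0, scale=0.1, size=(n_per_class, dim))

train_features = np.vstack([class_a, class_b])
train_labels = np.array([0] * n_per_class + [1] * n_per_class)

# Fit the classifier on (features, labels), one row per training image.
clf = SVC(kernel="linear")
clf.fit(train_features, train_labels)

# Fraction of training rows the fitted model classifies correctly.
train_accuracy = clf.score(train_features, train_labels)
```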


## Create predictions
Predicts species for the test dataset. This step takes about 5 minutes on CPU and about 1 minute on GPU.
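
Once predictions exist, a common next step is to map predicted label ids back to species names and score them against the true labels. The sketch below shows that arithmetic in plain Python; the species names and id lists are made-up placeholders, not values from the dataset.

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of positions where the prediction matches the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical label names and ids, for illustration only.
label_names = ["species_a", "species_b", "species_c"]
true_ids = [0, 2, 2, 1]
predicted_ids = [0, 2, 1, 1]

true_species = [label_names[i] for i in true_ids]
predicted_species = [label_names[i] for i in predicted_ids]

test_accuracy = accuracy(true_species, predicted_species)  # 3 of 4 match
```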

## Compare against untrained pybioclip model
Predicts species for the same test dataset using pybioclip alone, without the SVM, for comparison. This step takes about 6 minutes on CPU and about 1 minute on GPU.
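
One way to make such a comparison concrete is to score both sets of predictions against the same true labels and count the images where only one approach is right. The function name, dictionary keys, and sample labels below are hypothetical; this is a sketch of the bookkeeping, not the notebook's actual comparison code.

```python
def compare_predictions(true_labels, svm_preds, baseline_preds):
    """Summarize how two sets of predictions fare against the same true labels."""
    if not (len(true_labels) == len(svm_preds) == len(baseline_preds)):
        raise ValueError("all label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")

    n = len(true_labels)
    svm_hits = [t == p for t, p in zip(true_labels, svm_preds)]
    base_hits = [t == p for t, p in zip(true_labels, baseline_preds)]
    return {
        "svm_accuracy": sum(svm_hits) / n,
        "baseline_accuracy": sum(base_hits) / n,
        # Images that only one of the two approaches labeled correctly.
        "svm_only_correct": sum(s and not b for s, b in zip(svm_hits, base_hits)),
        "baseline_only_correct": sum(b and not s for s, b in zip(svm_hits, base_hits)),
    }


# Made-up labels for illustration.
true_labels = ["a", "b", "c", "a"]
svm_preds = ["a", "b", "c", "b"]
baseline_preds = ["a", "c", "c", "a"]

summary = compare_predictions(true_labels, svm_preds, baseline_preds)
```

Equal accuracies can hide different mistakes, which is why the sketch also reports the per-approach disagreement counts.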
