# BioAcoustic Collection Pipeline
This repository aims to streamline the generation and testing of embeddings using a large variety of bioacoustic models.
The image below compares UMAP embeddings produced by 16 different bioacoustic models, evaluated on the DCASE Task 5 few-shot learning dataset. This repository enables a comparison of bioacoustic models based on their embedding spaces, rather than their classification results.
## Installation

Create a virtual environment using Python 3.11 or 3.10 and virtualenv:

```bash
python3.11 -m virtualenv env_bacpipe
```

Activate the environment:

```bash
source env_bacpipe/bin/activate
```
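On Windows, virtualenv places the activation script under `Scripts` instead:

```
env_bacpipe\Scripts\activate
```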
Before installing the requirements, note the following:

- for `fairseq` to install you will need Python headers: `sudo apt-get install python3.11-dev`
- pip version 24.0 is needed (`pip install pip==24.0`), because omegaconf 2.0.6 has a non-standard dependency specifier `PyYAML>=5.1.*`; pip 24.1 will enforce this behaviour change and installation would thus fail.
Then install the requirements:

```bash
pip install -r requirements.txt
```

On Windows, use the dedicated requirements file instead:

```bash
pip install -r requirements_windows.txt
```
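Putting these steps together, a typical install on Linux might look like this (a sketch consolidating the commands above, assuming Python 3.11):

```bash
# Consolidated install sketch (Linux, Python 3.11):
sudo apt-get install python3.11-dev   # headers needed to build fairseq
python3.11 -m virtualenv env_bacpipe
source env_bacpipe/bin/activate
pip install pip==24.0                 # pip 24.1 rejects omegaconf 2.0.6's PyYAML specifier
pip install -r requirements.txt
```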
If you do not have admin rights and encounter a permission denied error when using `pip install`, use `python -m pip install ...` instead.
Run the tests to verify that everything works; by doing so you will also ensure that the directory structure for the model checkpoints is created:

```bash
python -m pytest -v --disable-warnings bacpipe/tests/*
```
Download the model checkpoints that are available from here, create directories corresponding to the pipeline names, and place the checkpoints within them.
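For example (hypothetical names: the checkpoint folder and filename below are illustrative, and the directory must match the pipeline name exactly):

```bash
# Hypothetical example: a checkpoint directory for a pipeline called "birdnet".
mkdir -p model_checkpoints/birdnet
mv ~/Downloads/BirdNET_checkpoint.tflite model_checkpoints/birdnet/
```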
## Usage

Modify the `config.yaml` file in the root directory to specify the path to your dataset. Define which models to run by specifying the strings in the `embedding_model` list (copy and paste as needed). If you want to run a dimensionality reduction model, specify its name in the `dim_reduction_model` variable.
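A minimal sketch of what the configuration could look like (the dataset-path key and the model strings here are assumptions for illustration; use the keys and names that ship with the repository's `config.yaml`):

```yaml
# Illustrative config.yaml sketch -- the dataset-path key name is an assumption.
audio_dir: /path/to/your/dataset
embedding_model:            # list of model strings, copy and paste as needed
  - birdnet
  - vggish
dim_reduction_model: umap   # optional dimensionality reduction
```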
Once the configuration is complete, execute the `run_pipeline.py` file (make sure the environment is activated):

```bash
python run_pipeline.py
```
While the scripts are executed, directories will be created in the `bacpipe/evaluation` directory. Embeddings will be saved in `bacpipe/evaluation/embeddings` (see here for more info) and, if selected, reduced-dimensionality embeddings will be saved in `bacpipe/evaluation/dim_reduced_embeddings` (see here for more info).
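To sanity-check the results you could then load the saved embeddings, for instance like this (a sketch that assumes the embeddings are stored as `.npy` arrays, one per processed audio file; check the linked documentation for the actual format):

```python
from pathlib import Path

import numpy as np

# Sketch: walk the embeddings directory and print array shapes.
# Assumes one .npy file per audio file -- an assumption, not a documented guarantee.
emb_dir = Path("bacpipe/evaluation/embeddings")
for path in sorted(emb_dir.rglob("*.npy")):
    embedding = np.load(path)
    print(f"{path.name}: shape={embedding.shape}")
```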
The models can be evaluated in different ways, all of which are explained here.
Each model has model-specific code to ensure inference runs smoothly. More info on the models and their pipelines can be found here.
Models currently include:
Name | Reference (paper) | Reference (code) | Sampling rate | Input length | Embedding dimension |
---|---|---|---|---|---|
Animal2vec_XC | paper | code | 24 kHz | 5 s | 768 |
Animal2vec_MK | paper | code | 8 kHz | 10 s | 1024 |
AudioMAE | paper | code | 16 kHz | 10 s | 768 |
AVES_ESpecies | paper | code | 16 kHz | 1 s | 768 |
BioLingual | paper | code | 48 kHz | 10 s | 512 |
BirdAVES_ESpecies | paper | code | 16 kHz | 1 s | 1024 |
BirdNET | paper | code | 48 kHz | 3 s | 1024 |
AvesEcho_PASST | paper | code | 32 kHz | 3 s | 768 |
HumpbackNET | paper | code | 2 kHz | 3.9124 s | 2048 |
Insect66NET | paper | code | 44.1 kHz | 5.5 s | 1280 |
Insect459NET | paper | pending | 44.1 kHz | 5.5 s | 1280 |
Mix2 | paper | code | 16 kHz | 3 s | 960 |
Perch_Bird | paper | code | 32 kHz | 5 s | 1280 |
ProtoCLR | paper | code | 16 kHz | 6 s | 384 |
RCL_FS_BSED | paper | code | 22.05 kHz | 0.2 s | 2048 |
SurfPerch | paper | code | 32 kHz | 5 s | 1280 |
Google_Whale | paper | code | 24 kHz | 5 s | 1280 |
VGGish | paper | code | 16 kHz | 0.96 s | 128 |
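The sampling rate and input length columns together determine how many raw audio samples each model consumes per window (samples = sampling rate × input length). A quick check in Python, using values taken straight from the table:

```python
# Samples per input window = sampling rate (Hz) * input length (s),
# using values from the table above.
models = {
    "BirdNET": (48_000, 3.0),
    "VGGish": (16_000, 0.96),
    "HumpbackNET": (2_000, 3.9124),
}
for name, (rate_hz, length_s) in models.items():
    print(f"{name}: {round(rate_hz * length_s)} samples per window")
```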
Given that this repository compiles a large number of very different deep learning models with different requirements, some issues have been noted.
Please raise issues if you have questions or find bugs.
Also, please cite the authors of the respective models; all models are referenced in the table above.