bioacoustic-ai/bacpipe

bacpipe

BioAcoustic Collection Pipeline

This repository aims to streamline the generation and testing of embeddings using a large variety of bioacoustic models.

The image below shows a comparison of UMAP embeddings based on 16 different bioacoustic models, evaluated on the DCASE Task 5 Few-Shot learning dataset. This repository is an attempt to enable a comparison of bioacoustic models based on their embedding spaces rather than their classification results.

Installation

Create and activate your environment

Create a virtual environment using python3.11 or python3.10 and virtualenv:

python3.11 -m virtualenv env_bacpipe

Activate the environment:

source env_bacpipe/bin/activate

Ensure you have the following before installing the requirements.

  • Python headers, which fairseq needs in order to install: sudo apt-get install python3.11-dev
  • pip version 24.0 (pip install pip==24.0). omegaconf 2.0.6 has a non-standard dependency specifier, PyYAML>=5.1.*. pip 24.1 enforces a behaviour change that rejects this specifier, so installation fails with it.
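Both prerequisites can be checked before running the install. The following is a minimal sketch using only the standard library. The helper names (`parse_version`, `python_is_supported`, `pip_is_compatible`) are illustrative and not part of bacpipe, and the pip rule assumes that every version below 24.1 still accepts the non-standard specifier.

```python
import sys
from importlib import metadata


def parse_version(text):
    """Turn '24.0' or '24.1.2' into a tuple of ints, stopping at non-numeric parts."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def python_is_supported(version_info=sys.version_info):
    """The README names python3.10 and python3.11 for the environment."""
    return tuple(version_info[:2]) in {(3, 10), (3, 11)}


def pip_is_compatible(pip_version):
    """Assumption: any pip below 24.1 still accepts the omegaconf 2.0.6 specifier."""
    return parse_version(pip_version) < (24, 1)


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    try:
        installed_pip = metadata.version("pip")
    except metadata.PackageNotFoundError:
        print("pip is not installed in this environment")
    else:
        print("pip", installed_pip, "compatible:", pip_is_compatible(installed_pip))
```

Running the script inside the activated environment prints one line per check. If the pip check reports False, pinning pip as described above should resolve it.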

Install the dependencies once the prerequisites are satisfied.

pip install -r requirements.txt

On Windows, use the Windows-specific requirements:

pip install -r requirements_windows.txt

If you do not have admin rights and encounter a permission denied error when using pip install, use python -m pip install ... instead.

To test the installation, you can execute the test suite. Doing so also creates the directory structure for the model checkpoints.

python -m pytest -v --disable-warnings bacpipe/tests/*

Add the model checkpoints that are not included by default.

Download the ones that are available from here, create directories corresponding to the pipeline names, and place the checkpoints within them.
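Creating one directory per pipeline name could be scripted as follows. This is a sketch only: the function `make_checkpoint_dirs`, the root folder passed to it, and the example names `birdnet` and `vggish` are hypothetical. The real location and directory names should be taken from the structure that the test suite creates.

```python
import tempfile
from pathlib import Path


def make_checkpoint_dirs(checkpoint_root, pipeline_names):
    """Create one sub-directory per pipeline name and return the paths."""
    root = Path(checkpoint_root)
    created = []
    for name in pipeline_names:
        target = root / name
        # exist_ok=True keeps directories already created by the test suite.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


if __name__ == "__main__":
    # Demonstrated in a temporary folder so nothing in the repository is touched.
    with tempfile.TemporaryDirectory() as tmp:
        for path in make_checkpoint_dirs(tmp, ["birdnet", "vggish"]):
            print(path.name, path.is_dir())
```

Because existing directories are left alone, the function can be re-run safely after adding further names to the list.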

Usage

Modify the config.yaml file in the root directory to specify the path to your dataset. Define which models to run by listing their names as strings in the embedding_model list (copy and paste as needed). If you want to run a dimensionality reduction model, specify its name in the dim_reduction_model variable.
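Before launching a run it can help to sanity-check these settings. The sketch below validates a configuration held in a plain Python dictionary. `embedding_model` and `dim_reduction_model` are the keys named above, while the dataset key `audio_dir`, the helper `check_config`, and the lower-case model strings are assumptions made for illustration and may not match the real config.yaml.

```python
def check_config(config):
    """Return a list of problems; an empty list means the settings look usable."""
    problems = []

    # Hypothetical key name for the dataset path.
    if not config.get("audio_dir"):
        problems.append("audio_dir: path to the dataset is missing")

    models = config.get("embedding_model")
    if not isinstance(models, list) or not models:
        problems.append("embedding_model: expected a non-empty list of model names")
    elif not all(isinstance(m, str) for m in models):
        problems.append("embedding_model: every entry should be a string")

    # Dimensionality reduction is optional, so a missing value is accepted.
    reducer = config.get("dim_reduction_model")
    if reducer is not None and not isinstance(reducer, str):
        problems.append("dim_reduction_model: expected a single name or nothing")

    return problems


if __name__ == "__main__":
    example = {
        "audio_dir": "/path/to/dataset",
        "embedding_model": ["birdnet", "vggish"],
        "dim_reduction_model": "umap",
    }
    print(check_config(example))
```

The check only looks at the shape of the values. Whether a given model or reduction name is actually supported is left to the pipeline itself.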

Once the configuration is complete, execute the run_pipeline.py file (make sure the environment is activated):

python run_pipeline.py

While the scripts are executed, directories will be created in the bacpipe/evaluation directory. Embeddings will be saved in bacpipe/evaluation/embeddings (see here for more info) and, if selected, reduced-dimensionality embeddings will be saved in bacpipe/evaluation/dim_reduced_embeddings (see here for more info).
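To confirm that a run produced output, the two directories can be inspected with a few lines of Python. This is a minimal sketch: the function `count_output_files` is illustrative, and since the layout and file format inside these directories are not described here, it only counts files recursively.

```python
from pathlib import Path

# Sub-directories of bacpipe/evaluation named in the text above.
OUTPUT_DIRS = ("embeddings", "dim_reduced_embeddings")


def count_output_files(evaluation_dir="bacpipe/evaluation"):
    """Count the files below each output directory; missing directories count as 0."""
    counts = {}
    for sub in OUTPUT_DIRS:
        folder = Path(evaluation_dir) / sub
        if folder.is_dir():
            counts[sub] = sum(1 for p in folder.rglob("*") if p.is_file())
        else:
            counts[sub] = 0
    return counts


if __name__ == "__main__":
    for name, n_files in count_output_files().items():
        print(f"{name}: {n_files} file(s)")
```

A count of 0 for dim_reduced_embeddings is expected when no dimensionality reduction model was selected.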

Evaluation

The models can be evaluated in different ways, all of which are explained here.

Available models

Each model has its own model-specific code to ensure inference runs smoothly. More info on the models and their pipelines can be found here.

Models currently include:

| Name              | ref paper | ref code | sampling rate | input length | embedding dimension |
|-------------------|-----------|----------|---------------|--------------|---------------------|
| Animal2vec_XC     | paper     | code     | 24 kHz        | 5 s          | 768                 |
| Animal2vec_MK     | paper     | code     | 8 kHz         | 10 s         | 1024                |
| AudioMAE          | paper     | code     | 16 kHz        | 10 s         | 768                 |
| AVES_ESpecies     | paper     | code     | 16 kHz        | 1 s          | 768                 |
| BioLingual        | paper     | code     | 48 kHz        | 10 s         | 512                 |
| BirdAVES_ESpecies | paper     | code     | 16 kHz        | 1 s          | 1024                |
| BirdNET           | paper     | code     | 48 kHz        | 3 s          | 1024                |
| AvesEcho_PASST    | paper     | code     | 32 kHz        | 3 s          | 768                 |
| HumpbackNET       | paper     | code     | 2 kHz         | 3.9124 s     | 2048                |
| Insect66NET       | paper     | code     | 44.1 kHz      | 5.5 s        | 1280                |
| Insect459NET      | paper     | pending  | 44.1 kHz      | 5.5 s        | 1280                |
| Mix2              | paper     | code     | 16 kHz        | 3 s          | 960                 |
| Perch_Bird        | paper     | code     | 32 kHz        | 5 s          | 1280                |
| ProtoCLR          | paper     | code     | 16 kHz        | 6 s          | 384                 |
| RCL_FS_BSED       | paper     | code     | 22.05 kHz     | 0.2 s        | 2048                |
| SurfPerch         | paper     | code     | 32 kHz        | 5 s          | 1280                |
| Google_Whale      | paper     | code     | 24 kHz        | 5 s          | 1280                |
| VGGish            | paper     | code     | 16 kHz        | 0.96 s       | 128                 |
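The sampling rate and input length in the table determine how many audio samples each model consumes per embedding. As a rough illustration, the sketch below computes that number for a few rows and estimates the shape of the embedding array a clip would yield. It assumes non-overlapping windows with the trailing partial window dropped, which may differ from how an individual model pipeline segments audio. The function names are illustrative and not part of bacpipe.

```python
# name: (sampling rate in Hz, input length in s, embedding dimension), from the table
MODELS = {
    "BirdNET": (48_000, 3.0, 1024),
    "Perch_Bird": (32_000, 5.0, 1280),
    "VGGish": (16_000, 0.96, 128),
    "RCL_FS_BSED": (22_050, 0.2, 2048),
}


def samples_per_window(name):
    """Number of audio samples in one model input: sampling rate x input length."""
    rate, length, _ = MODELS[name]
    return round(rate * length)


def embedding_shape(name, clip_seconds):
    """Estimated (windows, dimension), assuming non-overlapping windows."""
    rate, _, dim = MODELS[name]
    n_windows = int(clip_seconds * rate) // samples_per_window(name)
    return (n_windows, dim)


if __name__ == "__main__":
    for model in MODELS:
        print(model, samples_per_window(model), embedding_shape(model, 60))
```

For example, BirdNET at 48 kHz with 3 s inputs uses 144000 samples per window, so a 60 s clip would give 20 windows and an estimated array of shape (20, 1024) under these assumptions.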

Known issues

Given that this repository compiles a large number of very different deep learning models with different requirements, some issues have been noted.

Please raise issues if there are questions or bugs.

Also, please cite the authors of the respective models; all models are referenced in the table above.
