diff --git a/_images/meter_analysis_27_2.png b/_images/meter_analysis_27_2.png
new file mode 100644
index 0000000..7636d3b
Binary files /dev/null and b/_images/meter_analysis_27_2.png differ
diff --git a/_images/pitch-extraction_22_0.png b/_images/pitch-extraction_24_0.png
similarity index 100%
rename from _images/pitch-extraction_22_0.png
rename to _images/pitch-extraction_24_0.png
diff --git a/_images/singing-voice-extraction_24_0.png b/_images/singing-voice-extraction_24_0.png
new file mode 100644
index 0000000..a93c088
Binary files /dev/null and b/_images/singing-voice-extraction_24_0.png differ
diff --git a/_sources/melodic_analysis/pitch-extraction.ipynb b/_sources/melodic_analysis/pitch-extraction.ipynb
index 3e0be66..9dd1ed3 100644
--- a/_sources/melodic_analysis/pitch-extraction.ipynb
+++ b/_sources/melodic_analysis/pitch-extraction.ipynb
@@ -239,21 +239,54 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "tags": []
+ "tags": [
+ "remove-output"
+ ]
},
"outputs": [],
"source": [
"# Initializing an FTANet instance\n",
"ftanet_carnatic = compiam.load_model(\"melody:ftanet-carnatic\")\n",
"\n",
+ "# Importing and initializing again a melodia instance of comparison\n",
+ "from compiam.melody.pitch_extraction import Melodia\n",
+ "melodia = Melodia() "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We will now initialize Saraga Carnatic dataset to get an example track, load it, and run inference using FTANet-Carnatic and Melodia, both accessed through `compiam`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": [
+ "remove-output"
+ ]
+ },
+ "outputs": [],
+ "source": [
+ "import librosa\n",
+ "from compiam import load_dataset\n",
+ "\n",
+ "saraga_carnatic = load_dataset(\n",
+ " \"saraga_carnatic\",\n",
+ " data_home=os.path.join(\"..\", \"audio\", \"mir_datasets\")\n",
+ ")\n",
+ "saraga_tracks = saraga_carnatic.load_tracks()\n",
+ "example = saraga_tracks[\"109_Sri_Raghuvara_Sugunaalaya\"]\n",
+ "\n",
+ "# Loading the audio\n",
+ "y, sr = librosa.load(example.audio_path)\n",
+ "y = y[:3*sr] # Getting just the first 3 seconds\n",
+ "\n",
"# Predict!\n",
- "ftanet_pitch_track = ftanet_carnatic.predict(\n",
- " os.path.join(\n",
- " \"..\", \"audio\", \"mir_datasets\", \"saraga1.5_carnatic\",\n",
- " \"Live at Vani Mahal by Sanjay Subrahmanyan\", \"Sri Raghuvara Sugunaalaya\",\n",
- " \"Sanjay Subrahmanyan - Sri Raghuvara Sugunaalaya.mp3\"\n",
- " ),\n",
- ")"
+ "ftanet_pitch_track = ftanet_carnatic.predict(example.audio_path)\n",
+ "melodia_pitch_track = melodia.extract(example.audio_path)"
]
},
{
@@ -274,15 +307,6 @@
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
- "y, sr = librosa.load(\n",
- " os.path.join(\n",
- " \"..\", \"audio\", \"mir_datasets\", \"saraga1.5_carnatic\",\n",
- " \"Live at Vani Mahal by Sanjay Subrahmanyan\", \"Sri Raghuvara Sugunaalaya\",\n",
- " \"Sanjay Subrahmanyan - Sri Raghuvara Sugunaalaya.mp3\"\n",
- " )\n",
- ")\n",
- "y = y[:3*sr] # Getting just the first 3 seconds for better visualizatiom\n",
- "\n",
"fig, ax = plt.subplots(nrows=1, ncols=1, sharex=True, figsize=(15, 12))\n",
"D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)\n",
"img = librosa.display.specshow(D, y_axis='linear', x_axis='time', sr=sr, ax=ax);\n",
diff --git a/_sources/melodic_analysis/raga-recognition.ipynb b/_sources/melodic_analysis/raga-recognition.ipynb
index 5da48c9..0332250 100644
--- a/_sources/melodic_analysis/raga-recognition.ipynb
+++ b/_sources/melodic_analysis/raga-recognition.ipynb
@@ -93,6 +93,15 @@
"deepsrgm = compiam.load_model(\"melody:deepsrgm\")"
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "```{important}\n",
+ "Deep Learning model checkpoints tend to be large in size, therefore storing these in `compiam` may become unsustainable. For that reason, we store the checkpoints in the cloud and download these when the user initializes a model using the wrapper. Note that you can specify to which location the checkpoint should be donwloaded by specifying `data_home` argument in `load_model()`.\n",
+ "```"
+ ]
+ },
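+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a minimal sketch (the folder below is only a placeholder path), the download location could be set like this:\n",
+ "\n",
+ "```\n",
+ "deepsrgm = compiam.load_model(\n",
+ "    \"melody:deepsrgm\",\n",
+ "    data_home=\"/path/to/your/models\"  # placeholder location for the downloaded checkpoint\n",
+ ")\n",
+ "```"
+ ]
+ },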
{
"cell_type": "code",
"execution_count": null,
@@ -179,7 +188,7 @@
"deepsrgm.predict(feat)\n",
"```\n",
"\n",
- "**Oops! Why is this line of code not running?** This ``.predict()`` method runs inference with the DEEPSRGM model using the passed features. As you may now, DL models tend to occupy an important amount of memory from your computer to run, especially in the training stage. Some light-weight models can be run from a conventional laptop, but in some other cases you might need a machine with enough power to run your models and perform inference. \n",
+ "**Oops! Why is this line of code not being executed?** This ``.predict()`` method runs inference with the DEEPSRGM model using the passed features. As you may now, DL models tend to occupy an important amount of memory from your computer to run, especially in the training stage. Some light-weight models can be run from a conventional laptop, but in some other cases you might need a machine with enough power to run your models and perform inference. \n",
"\n",
"```{tip}\n",
"As mentioned in the [introduction](google-collab), Google Collab has GPU and TPU access which may allow you running certain models that are too big for your machine.\n",
diff --git a/_sources/resources/citing.md b/_sources/resources/citing.md
index b7d15ed..96a07d2 100644
--- a/_sources/resources/citing.md
+++ b/_sources/resources/citing.md
@@ -22,7 +22,7 @@ The cite the book as a whole please cite:
We now explicitly attribute each author to their contribution, in case you need to cite bits of text or code instead of the book as a while.
* [Setup](welcome-setup), [Python](python) and [compIAM](compiam), [corpora](corpora) and [datasets](datasets), and tool walkthroughs: **Thomas Nuttall, Genís Plaja-Roglans, Lara Pearson and Brindha Manickavasakan, Xavier Serra**
* [What is Indian Art Music?](indian-art-music) and introduction to Carnatic Music ([instrumentation](carnatic-instrumentation), [format](carnatic-formats), and [melodic concepts](carnatic-melodic-concepts)): **Lara Pearson and Brindha Manickavasakan**
-* Introduction to Hindustani Music ([instrumentation](hindustani-instrumentation), [format](hindustani-formats), and [melodic concepts](hindustani-melodic-concepts)): **Kaustuv Kanti Ganguli**
+* [Introduction to Hindustani Music](hindustani-music): **Kaustuv Kanti Ganguli**
* [Carnatic Rhythm](carnatic-rhythm) and [Hindustani Rhythm](hindustani-rhythm): **Ajay Srinivasamurthy**
If you use `compiam` please cite as:
diff --git a/_sources/rhythmic_analysis/meter_analysis.ipynb b/_sources/rhythmic_analysis/meter_analysis.ipynb
index daa4013..6e6f314 100644
--- a/_sources/rhythmic_analysis/meter_analysis.ipynb
+++ b/_sources/rhythmic_analysis/meter_analysis.ipynb
@@ -19,7 +19,11 @@
},
"outputs": [],
"source": [
- "## Importing compiam to the project\n",
+ "## Installing (if not) and importing compiam to the project\n",
+ "import importlib.util\n",
+ "if importlib.util.find_spec('compiam') is None:\n",
+ " ## Bear in mind this will only run in a jupyter notebook / Collab session\n",
+ " %pip install compiam\n",
"import compiam\n",
"\n",
"# Import extras and supress warnings to keep the tutorial clean\n",
@@ -238,7 +242,7 @@
"metadata": {},
"outputs": [],
"source": [
- "predicted_aksharas[:20]"
+ "predicted_aksharas"
]
},
{
@@ -277,7 +281,7 @@
"\n",
"# And we plot!\n",
"plot_waveform(\n",
- " file_path=track.audio_path,\n",
+ " input_data=track.audio_path,\n",
" t1=0,\n",
" t2=4,\n",
" labels=predicted_beats_dict,\n",
@@ -293,7 +297,7 @@
},
"language_info": {
"name": "python",
- "version": "3.9.4"
+ "version": "3.11.6"
},
"orig_nbformat": 4,
"vscode": {
diff --git a/_sources/separation/singing-voice-extraction.ipynb b/_sources/separation/singing-voice-extraction.ipynb
new file mode 100644
index 0000000..df62ef4
--- /dev/null
+++ b/_sources/separation/singing-voice-extraction.ipynb
@@ -0,0 +1,356 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "(singing-voice-extraction)=\n",
+ "# Pitch extraction\n",
+ "\n",
+ "As seen in the melodic introduction of [Carnatic](carnatic-melodic-concepts) and [Hindustani](hindustani-raga), pitch time-series is a very relevant feature to tackle the melodic analysis of these music traditions, showing utility for many musicologically-relevant problems. Given the shortage of recordings for isolated predominant melodic instruments, this task has had to be tackled directly from mixture recordings. In this context, the task is referred as *predominant* pitch extraction (otherwise commonly referred as *vocal* pitch extraction when the source is the singing voice).\n",
+ "\n",
+ "Historically, in the literature of computational analysis of Indian Art Music, this task has been tackled using knowledge-based approaches {cite}`rao_pitch_2010, salamon_pitch_2012`. Melodia {cite}`salamon_pitch_2012`, typically combined with additional post-processing steps {cite}`gulati_patterns_2016`, has been the preferred approach until today."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": [
+ "remove-output"
+ ]
+ },
+ "outputs": [],
+ "source": [
+ "## Installing (if not) and importing compiam to the project\n",
+ "import importlib.util\n",
+ "if importlib.util.find_spec('compiam') is None:\n",
+ " ## Bear in mind this will only run in a jupyter notebook / Collab session\n",
+ " %pip install compiam\n",
+ "import compiam\n",
+ "\n",
+ "# Import extras and supress warnings to keep the tutorial clean\n",
+ "import os\n",
+ "import numpy as np\n",
+ "from pprint import pprint\n",
+ "\n",
+ "import warnings\n",
+ "warnings.filterwarnings('ignore')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Let's first print out the available tools we do have available to extract the pitch from Indian Art Music recordings."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "pprint(compiam.melody.pitch_extraction.list_tools())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "\n",
+ "## Melodia\n",
+ "\n",
+ "Let's extract the pitch from an audio sample using Melodia {cite}`salamon_pitch_2012`. This is a salience-based method, meaning that it captures the most salient melodic segments in the time-frequency representation of the input signal, keeping and connecting the segments that are more likely to belong to the predominant melody, using heuristic rules. Make sure you check out the paper for further detail!\n",
+ "\n",
+ "We first need to install `essentia`, which is the optional dependency required to load this tool."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": [
+ "remove-output"
+ ]
+ },
+ "outputs": [],
+ "source": [
+ "%pip install essentia"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": [
+ "remove-output"
+ ]
+ },
+ "outputs": [],
+ "source": [
+ "# Importing and initializing a melodia instance\n",
+ "from compiam.melody.pitch_extraction import Melodia\n",
+ "melodia = Melodia() \n",
+ "\n",
+ "# Running extraction for an example track\n",
+ "melodia_pitch_track = melodia.extract(\n",
+ " os.path.join(\n",
+ " \"..\", \"audio\", \"mir_datasets\", \"saraga1.5_carnatic\",\n",
+ " \"Live at Vani Mahal by Sanjay Subrahmanyan\", \"Sri Raghuvara Sugunaalaya\",\n",
+ " \"Sanjay Subrahmanyan - Sri Raghuvara Sugunaalaya.mp3\"\n",
+ " )\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "print(\"Shape of the output pitch:\", np.shape(melodia_pitch_track))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We can infer from the shape the number of estimated pitch values, while the dimension 2 refers to ``(frequency values, time-stamps)``. Let us show the first 5 time-stamps here."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "pprint(list(melodia_pitch_track[:5, 0]))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Let's now print out the final 5 pitch values."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "pprint(list(melodia_pitch_track[-5:, 1]))"
+ ]
+ },
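+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a quick sanity check (a minimal sketch, assuming a uniform analysis hop), the time-stamp column also lets us estimate the hop between consecutive pitch estimates:\n",
+ "\n",
+ "```\n",
+ "# Column 0 holds time-stamps (seconds), column 1 holds pitch values (Hz)\n",
+ "time_step = np.diff(melodia_pitch_track[:, 0]).mean()\n",
+ "print(\"Hop between pitch estimates (in seconds):\", time_step)\n",
+ "```"
+ ]
+ },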
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Melodia has been found, in the original paper experiments and also in the [MIREX campaign](https://nema.lis.illinois.edu/nema_out/mirex2011/results/ame/indian08/sg1results.html), to decently work on Indian Art Music samples. However, recent DL-based models have claimed the state-of-the-art for the task of pitch extraction. \n",
+ "\n",
+ "\n",
+ "## FTANet-Carnatic\n",
+ "\n",
+ "**Maybe we can use a Carnatic-trained version of one of these models to extract the pitch?** Let's now import a DL model that learns to automatically extract the predominant melody from audio recordings. We will use FTANet {cite}`yu_pitch_2021`. This is an attention-based network that laverages and fuses information from frequency and periodicity domains to capture the right pitch values for the predominant source. It learns to focus on a particular source, using an additional branch that helps reducing the false alarm rate (detecting pitch values that do not correspond to the source we target).\n",
+ "\n",
+ "To train this model we need a dataset that includes mixture recordings plus annotated pitch time-series for the source we aim at capturing the pitch from. This FTANet instance is trained with the [Saraga Carnatic Melody Synth (SMCS)](https://zenodo.org/record/5553925).\n",
+ "\n",
+ "In the documentation we observe that this model is based on `tensorflow`, therefore we must install this dependency before importing it."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": [
+ "remove-output"
+ ]
+ },
+ "outputs": [],
+ "source": [
+ "%pip install tensorflow==2.9.3"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from compiam.melody.pitch_extraction import FTANetCarnatic"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "```{admonition} This is a non-trained instance!\n",
+ "Directly importing a ML/DL-based tool from the corresponding task initialises a non-trained instance. However, if a ``*`` appears at the end of the tool name when running ``.list_tools()``, the pre-trained weights can be loaded using the ``compiam.load_model()`` wrapper.\n",
+ "```\n",
+ "\n",
+ "In ths case, we are only interested in inference. Therefore, we might be able to load FTANet as an already trained model. For that, let's print the models with available weights to load in `compiam`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "pprint(compiam.list_models())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Cool! A Carnatic-trained FTANet is there.** Therefore, let's load it and run inference on an example track."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Disabling tensorflow warnings and debugging info\n",
+ "import os \n",
+ "os.environ[\"TF_CPP_MIN_LOG_LEVEL\"] = \"3\" \n",
+ "\n",
+ "# Importing tensorflow and disabling GPU usage\n",
+ "import tensorflow as tf\n",
+ "tf.config.set_visible_devices([], \"GPU\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Let's first deactivate the GPU usage, since we assume no CUDA-capable GPU is available in most of the cases. We import `tensorflow` and set the visible GPU devices to none. We also disable the `tensorflow` warnings in order to keep the tutorial clean.\n",
+ "\n",
+ "```{note}\n",
+ "If you have an available GPU to allocate the model, get the index of the GPU (probably 0 if you have only a single instance) and change ``tf.config.set_visible_devices([], \"GPU\")`` for ``os.environ[\"CUDA_VISIBLE_DEVICES\"] = \"0\"``\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": [
+ "remove-output"
+ ]
+ },
+ "outputs": [],
+ "source": [
+ "# Initializing an FTANet instance\n",
+ "ftanet_carnatic = compiam.load_model(\"melody:ftanet-carnatic\")\n",
+ "\n",
+ "# Importing and initializing again a melodia instance of comparison\n",
+ "from compiam.melody.pitch_extraction import Melodia\n",
+ "melodia = Melodia() "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We will now initialize Saraga Carnatic dataset to get an example track, load it, and run inference using FTANet-Carnatic and Melodia, both accessed through `compiam`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": [
+ "remove-output"
+ ]
+ },
+ "outputs": [],
+ "source": [
+ "import librosa\n",
+ "from compiam import load_dataset\n",
+ "\n",
+ "saraga_carnatic = load_dataset(\n",
+ " \"saraga_carnatic\",\n",
+ " data_home=os.path.join(\"..\", \"audio\", \"mir_datasets\")\n",
+ ")\n",
+ "saraga_tracks = saraga_carnatic.load_tracks()\n",
+ "example = saraga_tracks[\"109_Sri_Raghuvara_Sugunaalaya\"]\n",
+ "\n",
+ "# Loading the audio\n",
+ "y, sr = librosa.load(example.audio_path)\n",
+ "y = y[:3*sr] # Getting just the first 3 seconds\n",
+ "\n",
+ "# Predict!\n",
+ "ftanet_pitch_track = ftanet_carnatic.predict(example.audio_path)\n",
+ "melodia_pitch_track = melodia.extract(example.audio_path)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Let's visualise the extracted pitch tracks on top of the spectrogram of the input signal."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import librosa\n",
+ "import librosa.display\n",
+ "import numpy as np\n",
+ "import matplotlib.pyplot as plt\n",
+ "\n",
+ "fig, ax = plt.subplots(nrows=1, ncols=1, sharex=True, figsize=(15, 12))\n",
+ "D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)\n",
+ "img = librosa.display.specshow(D, y_axis='linear', x_axis='time', sr=sr, ax=ax);\n",
+ "ax.set_ylim(0, 2000)\n",
+ "plt.plot(\n",
+ " melodia_pitch_track[:, 0], melodia_pitch_track[:, 1],\n",
+ " color=\"white\", label=\"Melodia\",\n",
+ ")\n",
+ "plt.plot(\n",
+ " ftanet_pitch_track[:, 0], ftanet_pitch_track[:, 1],\n",
+ " color=\"black\",label=\"FTANet-Carnatic\",\n",
+ ")\n",
+ "plt.legend()\n",
+ "plt.show()"
+ ]
+ },
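+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "If ground-truth pitch annotations were available for this recording, we could also score the extractors with standard melody-evaluation metrics. As a minimal sketch (assuming `mir_eval` is installed), we can at least quantify how much the two estimates agree by treating the Melodia output as a stand-in reference; this is an illustration, not a proper evaluation:\n",
+ "\n",
+ "```\n",
+ "import mir_eval\n",
+ "\n",
+ "agreement = mir_eval.melody.evaluate(\n",
+ "    melodia_pitch_track[:, 0], melodia_pitch_track[:, 1],\n",
+ "    ftanet_pitch_track[:, 0], ftanet_pitch_track[:, 1],\n",
+ ")\n",
+ "print(agreement[\"Raw Pitch Accuracy\"], agreement[\"Overall Accuracy\"])\n",
+ "```"
+ ]
+ },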
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "\n",
+ "## Some final thoughts\n",
+ "\n",
+ "Again, going through the melodic introduction of [Carnatic](carnatic-melodic-concepts) and Hindustani Music, we note the importance of a reliable pitch extraction for the computational analysis of Indian Art Music. The task of pitch extraction has received quite a lot of attention recently and the state-of-the-art has been continuously moved forward, especially given the use of DL techniques. However, to the best of our knowledge, no DL-based pitch extraction has been trained or evaluated on Indian Art Music signals, the lack of data being the principal cause. The FTA-Net trained for Carnatic Music that we present in this walkthrough and that we have included in `compiam` has been trained with the [Saraga Carnatic Melody Synth (SMCS)](https://zenodo.org/record/5553925), a recently released dataset that includes several hours of Carnatic Music recordings annotated with ground-truth vocal pitch data that have been artificially compiled. "
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3.9.4 64-bit",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "name": "python",
+ "version": "3.11.6"
+ },
+ "orig_nbformat": 4,
+ "vscode": {
+ "interpreter": {
+ "hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49"
+ }
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/_sources/structure_analysis/music-segmentation.ipynb b/_sources/structure_analysis/music-segmentation.ipynb
index 5199687..208a2ca 100644
--- a/_sources/structure_analysis/music-segmentation.ipynb
+++ b/_sources/structure_analysis/music-segmentation.ipynb
@@ -19,7 +19,11 @@
},
"outputs": [],
"source": [
- "## Importing compiam to the project\n",
+ "## Installing (if not) and importing compiam to the project\n",
+ "import importlib.util\n",
+ "if importlib.util.find_spec('compiam') is None:\n",
+ " ## Bear in mind this will only run in a jupyter notebook / Collab session\n",
+ " %pip install compiam\n",
"import compiam\n",
"\n",
"# Import extras and supress warnings to keep the tutorial clean\n",
@@ -80,13 +84,25 @@
"```"
]
},
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": [
+ "remove-output"
+ ]
+ },
+ "outputs": [],
+ "source": [
+ "dbs = compiam.load_model(\"structure:dhrupad-bandish-segmentation\")"
+ ]
+ },
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
- "dbs = compiam.load_model(\"structure:dhrupad-bandish-segmentation\")\n",
"help(dbs)"
]
},
@@ -340,7 +356,7 @@
"outputs": [],
"source": [
"dbs.predict_stm(\n",
- " file_path=os.path.join(\n",
+ " input_data=os.path.join(\n",
" \"..\", \"audio\", \"59c88c32-0bde-433b-b194-0f65281e5714\", \"vocals.wav\"\n",
" )\n",
")"
@@ -395,7 +411,9 @@
"\n",
"What is more, the standardized source separation models target whether *vocals* and *accompaniment*, or *vocals*, *bass*, *drums*, and *other*. While to separate the singing voice from an accompaniment is OK, the *4 stem* configuration is far from being representative of the actual Carnatic and Hindustani Music arrangements.\n",
"\n",
- "As a final note, another factor that is currently blocking the research on music source separation for Indian Art Music is the shortage of available datasets for this task. We have observed that the Saraga Carnatic collection has multi-track audio, but this has leakage (it has been recorded in live performances). In such case, a leakage-aware approach would be needed to use this data. Alternatively, a music source separation dataset including completely isolated and aligned tracks, which to the best of our knowledge is unavailable as of now, would open the door the music source separation research on Indian Art Music."
+ "As a final note, another factor that is currently blocking the research on music source separation for Indian Art Music is the shortage of available datasets for this task. We have observed that the Saraga Carnatic collection has multi-track audio, but this has leakage (it has been recorded in live performances). In such case, a leakage-aware approach would be needed to use this data. Alternatively, a music source separation dataset including completely isolated and aligned tracks, which to the best of our knowledge is unavailable as of now, would open the door the music source separation research on Indian Art Music.\n",
+ "\n",
+ "**Nov 2023 Update:** A Carnatic-specific singing voice separation model has been developed and presented at ISMIR 2023 in Milan, Italy. See [the separation walkthrough](singing-voice-extraction) for an example."
]
}
],
@@ -407,7 +425,7 @@
},
"language_info": {
"name": "python",
- "version": "3.9.4"
+ "version": "3.11.6"
},
"orig_nbformat": 4,
"vscode": {
diff --git a/_sources/timbre_analysis/stroke-classification.ipynb b/_sources/timbre_analysis/stroke-classification.ipynb
index a727874..a8cb331 100644
--- a/_sources/timbre_analysis/stroke-classification.ipynb
+++ b/_sources/timbre_analysis/stroke-classification.ipynb
@@ -21,7 +21,11 @@
},
"outputs": [],
"source": [
- "## Importing compiam to the project\n",
+ "## Installing (if not) and importing compiam to the project\n",
+ "import importlib.util\n",
+ "if importlib.util.find_spec('compiam') is None:\n",
+ " ## Bear in mind this will only run in a jupyter notebook / Collab session\n",
+ " %pip install compiam\n",
"import compiam\n",
"\n",
"# Import extras and supress warnings to keep the tutorial clean\n",
@@ -263,7 +267,7 @@
},
"language_info": {
"name": "python",
- "version": "3.9.4"
+ "version": "3.11.6"
},
"orig_nbformat": 4,
"vscode": {
diff --git a/corpora_and_datasets/corpora.html b/corpora_and_datasets/corpora.html
index 4970819..65434f5 100644
--- a/corpora_and_datasets/corpora.html
+++ b/corpora_and_datasets/corpora.html
@@ -246,6 +246,18 @@
+
+
+ Music Source Separation
+
+
+
Resources
diff --git a/corpora_and_datasets/datasets.html b/corpora_and_datasets/datasets.html
index 384c74a..bcc048c 100644
--- a/corpora_and_datasets/datasets.html
+++ b/corpora_and_datasets/datasets.html
@@ -246,6 +246,18 @@
+
+
+ Music Source Separation
+
+
+
Resources
diff --git a/genindex.html b/genindex.html
index 1fbf6cb..4bff972 100644
--- a/genindex.html
+++ b/genindex.html
@@ -243,6 +243,18 @@
+
+
+ Music Source Separation
+
+
+
Resources
diff --git a/indian_art_music/carnatic-music.html b/indian_art_music/carnatic-music.html
index 165d06b..e6a5580 100644
--- a/indian_art_music/carnatic-music.html
+++ b/indian_art_music/carnatic-music.html
@@ -246,6 +246,18 @@
+
+
+ Music Source Separation
+
+
+
Resources
diff --git a/indian_art_music/hindustani-music.html b/indian_art_music/hindustani-music.html
index cda1aa4..a2a9617 100644
--- a/indian_art_music/hindustani-music.html
+++ b/indian_art_music/hindustani-music.html
@@ -246,6 +246,18 @@
+
+
+ Music Source Separation
+
+
+
Resources
diff --git a/indian_art_music/iam.html b/indian_art_music/iam.html
index 5c57b6c..e734282 100644
--- a/indian_art_music/iam.html
+++ b/indian_art_music/iam.html
@@ -246,6 +246,18 @@
+
+
+ Music Source Separation
+
+
+
Resources
diff --git a/introduction/compiam.html b/introduction/compiam.html
index 0526c9e..1c67b97 100644
--- a/introduction/compiam.html
+++ b/introduction/compiam.html
@@ -246,6 +246,18 @@
+
+
+ Music Source Separation
+
+
+
Resources
diff --git a/introduction/python.html b/introduction/python.html
index d14de1f..f3c5610 100644
--- a/introduction/python.html
+++ b/introduction/python.html
@@ -248,6 +248,18 @@
+
+
+ Music Source Separation
+
+
+
Resources
diff --git a/introduction/setup.html b/introduction/setup.html
index d8e0459..746d81d 100644
--- a/introduction/setup.html
+++ b/introduction/setup.html
@@ -246,6 +246,18 @@
+
+
+ Music Source Separation
+
+
+
Resources
diff --git a/landing.html b/landing.html
index cbcc389..3da695c 100644
--- a/landing.html
+++ b/landing.html
@@ -245,6 +245,18 @@
+
+
+ Music Source Separation
+
+
+
Resources
@@ -561,6 +573,8 @@ Reference this Book