
Defined in DEC.py

Index

  - class ResidualAutoEncoder()
  - def load_encoder()
  - class ClusteringModule()
  - class DEC()
  - def diarizationDEC()

class ResidualAutoEncoder()

class ResidualAutoEncoder(ip_features,
                          hidden_dims=[500, 500, 2000, 30])

Defined in DEC.py

Creates a torch.nn.Module for a deep autoencoder composed of Residual Neural Network (ResNet) blocks as the encoder and decoder layers. The activation used is ReLU. The bottleneck encoder output and the final decoder output are not activated, to avoid data loss due to the ReLU activation.
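For intuition, the sketch below shows one way a fully connected residual block of this kind can be built. The block structure, skip-connection projection, and layer sizes here are illustrative assumptions, not necessarily the exact blocks implemented in DEC.py.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: a fully connected residual block of the kind the
# encoder/decoder layers could be built from. The actual block in DEC.py may
# differ (normalisation, dropout, how the skip connection is projected, etc.).
class ResidualBlock(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)
        # Project the skip connection when the dimensions change.
        self.skip = (nn.Identity() if in_features == out_features
                     else nn.Linear(in_features, out_features, bias=False))
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.fc(x) + self.skip(x))
```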

Parameters:

| Argument | Detail |
| --- | --- |
| ip_features | _int_, Input feature size |
| hidden_dims | _list of int_, List of hidden dimension sizes. The last element of the list is the output dimension of the autoencoder bottleneck |

Returns:

| Variable | Detail |
| --- | --- |
| z | _torch.Tensor_, Output from the bottleneck encoder of the deep autoencoder network |
| xo | _list of torch.Tensor_, Outputs from each encoder layer except the bottleneck encoder of the deep autoencoder. The first item of the list is the input given to the network |
| xr | _list of torch.Tensor_, Reconstructions of the inputs to each encoder layer of the autoencoder. xr is reversed so that the i-th item of xr is the reconstruction of the i-th item of xo. E.g., the first item of xo is the input to the ResidualAutoEncoder network, and the first item of xr is the reconstruction produced by the ResidualAutoEncoder network |
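A minimal usage sketch of how these returns fit together, assuming the forward pass returns the tuple (z, xo, xr) described above and using 192 input features as in load_encoder() below:

```python
import torch
from DEC import ResidualAutoEncoder

# Assumes forward() returns (z, xo, xr) as documented in the table above.
ae = ResidualAutoEncoder(ip_features=192, hidden_dims=[500, 500, 2000, 30])
x = torch.randn(8, 192)        # batch of 8 input vectors
z, xo, xr = ae(x)

print(z.shape)                 # (8, 30): bottleneck embedding
print(xo[0].shape)             # (8, 192): first item of xo is the input itself
print(xr[0].shape)             # (8, 192): reconstruction of the input
```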

def load_encoder()

def load_encoder():

Defined in DEC.py

Loads the weights of the ResidualAutoEncoder trained on the training data and returns the resulting model.

Returns:

| Variable | Detail |
| --- | --- |
| model | _ResidualAutoEncoder_, Model with an input feature size of 192 and hidden layers of size 500, 500, 2000, 30. The weights of the model are initialized to the weights of the autoencoder trained on the training data |
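A short usage sketch, assuming the same forward interface as described for ResidualAutoEncoder above:

```python
import torch
from DEC import load_encoder

model = load_encoder()          # pre-trained ResidualAutoEncoder
x = torch.randn(4, 192)         # 192 input features, as documented above
z, xo, xr = model(x)            # z holds the 30-dimensional latent embeddings
```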

class ClusteringModule()

class ClusteringModule(nn.Module):
    def __init__(self,
                 num_clusters,
                 encoder, data,
                 cinit = "KMeans"):

Defined in DEC.py

Clustering module of the deep embedding clustering (DEC) algorithm. It uses the trained encoder of the ResidualAutoEncoder to initialize the DEC clustering network. KMeans is used to initialize the centroids in the latent space.

Parameters:

| Argument | Detail |
| --- | --- |
| num_clusters | _str_, Number of clusters to create from the algorithm |
| encoder | _nn.Module_, Pre-trained encoder for initializing the centroids. The encoder transforms the data to the latent space for clustering |
| data | _torch.Tensor_, Data used to initialize the cluster centroids in the latent space |
| cinit | _str_, Initialization method for the cluster centroids. Default KMeans |

Returns:

| Variable | Detail |
| --- | --- |
| q | _torch.Tensor_, Tensor of similarities between the embedded points z_i and the centroids mu_j. Assumes a Student's t-distribution as the kernel |
| p | _torch.Tensor_, Tensor of the target distribution based on the soft assignments q_i |
| xo[0] | _torch.Tensor_, Input data to the ResidualAutoEncoder |
| xr[0] | _torch.Tensor_, Reconstruction of the input by the ResidualAutoEncoder |
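For reference, the sketch below shows the standard DEC soft-assignment and target-distribution formulas (Xie et al., 2016) with the Student's t degree of freedom alpha = 1. It is assumed, not verified, that ClusteringModule computes q and p this way.

```python
import torch

def soft_assignment(z, mu, alpha=1.0):
    # q_ij ∝ (1 + ||z_i - mu_j||^2 / alpha) ** -((alpha + 1) / 2)
    dist_sq = torch.cdist(z, mu) ** 2
    q = (1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(dim=1, keepdim=True)

def target_distribution(q):
    # p_ij = (q_ij^2 / f_j) / sum_j' (q_ij'^2 / f_j'), with f_j = sum_i q_ij
    weight = q ** 2 / q.sum(dim=0)
    return weight / weight.sum(dim=1, keepdim=True)
```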

Class Functions:

  1. init_centroid:
def init_centroid(self,
                  data,
                  method = "KMeans")

Calculates the optimal number of speakers using the eigen-gap method, then clusters the data with the specified method and returns the resulting centroids.

Parameters:

| Argument | Detail |
| --- | --- |
| data | _torch.Tensor_, Input data to be clustered |
| method | _str_, Clustering method. Default KMeans. Options: KMeans/Spectral |

Returns:

| Variable | Detail |
| --- | --- |
| output | _torch.Tensor_, Tensor containing the initialized centroids for the dataset |
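A hedged sketch of the idea behind this initialization: estimate the number of clusters from the eigen-gap of a normalized graph Laplacian, then initialize the centroids with KMeans. The cosine-similarity affinity used here is an illustrative choice and may differ from what DEC.py builds.

```python
import numpy as np
from sklearn.cluster import KMeans

def estimate_num_clusters(X, max_clusters=10):
    # Cosine-similarity affinity (illustrative choice), clipped to be non-negative.
    X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-8)
    A = np.clip(X @ X.T, 0.0, None)
    d = A.sum(axis=1)
    # Normalized graph Laplacian: L = I - D^{-1/2} A D^{-1/2}
    L = np.eye(len(A)) - A / np.sqrt(np.outer(d, d))
    eigvals = np.sort(np.linalg.eigvalsh(L))
    gaps = np.diff(eigvals[:max_clusters + 1])
    return int(np.argmax(gaps)) + 1          # index of the largest eigen-gap

def init_centroids(X, num_clusters=None):
    k = num_clusters or estimate_num_clusters(X)
    km = KMeans(n_clusters=k, n_init=10).fit(X)
    return km.cluster_centers_
```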

class DEC()

class DEC(num_clusters,
          encoder, data,
          cinit = "KMeans")

Defined in DEC.py

Deep embedding clustering (DEC) algorithm. It uses the trained encoder of the ResidualAutoEncoder to initialize the DEC clustering network, and calls the ClusteringModule class to initialize the centroids.

Parameters:

| Argument | Detail |
| --- | --- |
| encoder | _nn.Module_, Pre-trained encoder for initializing the centroids. The encoder transforms the data to the latent space for clustering |
| num_clusters | _str_, Number of clusters to create from the algorithm. Default None uses the eigen-gap method to determine the number of clusters |
| data | _torch.Tensor_, Data passed to the ClusteringModule to initialize the cluster centroids |
| cinit | _str_, Initialization method for the cluster centroids. Default KMeans. Options: KMeans/Spectral |

Class Functions:

  1. fit:
def fit(self,
        data,
        y_true = None,
        niter = 150,
        lrEnc = 1e-4,
        lrCC = 1e-4,
        verbose = False)

Trains the algorithm by measuring the KL divergence between the target and observed distributions. It also updates the ResidualAutoEncoder in parallel using an MSE loss, to improve the latent space projection of the data for better clustering. Both updates use the Adam optimizer, and the objective function is a linear combination of the KL divergence between the target and observed distributions and the MSE loss between the input data and its reconstruction by the ResidualAutoEncoder. A minimal sketch of this joint update appears after the parameter table below.

Parameters:

| Argument | Detail |
| --- | --- |
| data | _torch.Tensor_, Input data to be clustered |
| y_true | _numpy.ndarray_, True labels of the data we aim to cluster. The predict() and clusterAccuracy() functions are invoked only if y_true is not None |
| niter | _int_, Number of epochs to train the model for |
| lrEnc | _float_, Learning rate for updating the encoder |
| lrCC | _float_, Learning rate for updating the cluster centres |
| verbose | _bool_, True activates the tqdm progress bar while training. False prints no updates during training |
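The sketch below illustrates the joint update described above: a KL(p || q) clustering loss plus an MSE reconstruction loss, each with its own Adam optimizer using lrEnc and lrCC. The combination weight gamma, the parameter grouping, and the update order are illustrative assumptions, not necessarily those used in DEC.py.

```python
import torch
import torch.nn.functional as F

def fit_sketch(cluster_module, encoder, data, niter=150,
               lrEnc=1e-4, lrCC=1e-4, gamma=0.1):
    optEnc = torch.optim.Adam(encoder.parameters(), lr=lrEnc)
    # Cluster-centre parameters; the selection here is simplified.
    optCC = torch.optim.Adam(cluster_module.parameters(), lr=lrCC)
    for _ in range(niter):
        # ClusteringModule returns (q, p, xo[0], xr[0]); see its Returns table.
        q, p, x_in, x_rec = cluster_module(data)
        kl = F.kl_div(q.log(), p, reduction="batchmean")
        mse = F.mse_loss(x_rec, x_in)
        loss = kl + gamma * mse            # linear combination of the two losses
        optEnc.zero_grad(); optCC.zero_grad()
        loss.backward()
        optEnc.step(); optCC.step()
```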
  2. predict:
def predict(self, data)

Predicts the cluster label for each data point by taking the label at which the observed distribution is maximized.

Parameters:

| Argument | Detail |
| --- | --- |
| data | _torch.Tensor_, Input data to be labelled after clustering |

Returns:

| Variable | Detail |
| --- | --- |
| y_pred | _numpy.ndarray_, Predicted cluster labels of the data, taken from the soft assignment distribution |
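Concretely, the prediction is the argmax over the soft assignment (illustrative one-liner; assumes q is the soft-assignment tensor described in the ClusteringModule returns):

```python
# Hard cluster labels from the soft assignment q (num_points x num_clusters).
y_pred = q.argmax(dim=1).cpu().numpy()
```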
  3. clusterAccuracy:
def clusterAccuracy(self, y_pred, y_true)

Computes the cluster assignment accuracy as the maximum accuracy between y_pred and y_true over all permutations of y_pred. The optimal permutation is found with scipy's linear_sum_assignment optimization function.

Parameters:

| Argument | Detail |
| --- | --- |
| y_pred | _numpy.ndarray_, Labels predicted by the DEC algorithm |
| y_true | _numpy.ndarray_, True labels of the data |

Returns:

| Variable | Detail |
| --- | --- |
| accuracy | _float_, Cluster assignment accuracy |
| reassignment | _dict_, Dictionary with row indices as keys and column indices as values for the optimal assignment |
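A sketch of the computation described above, using the standard confusion-matrix plus linear_sum_assignment approach; it is assumed to correspond to what clusterAccuracy does, though the in-code details may differ.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def cluster_accuracy_sketch(y_pred, y_true):
    # Confusion matrix between predicted and true labels.
    D = max(y_pred.max(), y_true.max()) + 1
    cost = np.zeros((D, D), dtype=np.int64)
    for p, t in zip(y_pred, y_true):
        cost[p, t] += 1
    # linear_sum_assignment minimizes, so negate counts to maximize matches.
    rows, cols = linear_sum_assignment(cost.max() - cost)
    reassignment = dict(zip(rows, cols))
    accuracy = cost[rows, cols].sum() / y_pred.size
    return accuracy, reassignment
```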

def diarizationDEC()

def diarizationDEC(audio_dataset,
                   num_spkr = None,
                   hypothesis_dir = None)

Defined in DEC.py

Computes diarization labels. If num_spkr = 'oracle', the oracle number of speakers is used; this serves as an optimal benchmark for the performance of DEC. If num_spkr = None, eigen-gap maximization in the ClusteringModule is used to determine the number of speakers.

Parameters:

| Argument | Detail |
| --- | --- |
| audio_dataset | _utils.DiarizationDataset_, Test diarization dataset |
| num_spkr | _str_, None to calculate the optimal number of speakers by eigen-gap maximization; 'oracle' to use the number of speakers given with the data for each window |
| hypothesis_dir | _str_, Directory in which to store the predicted speaker labels for the audio segments as RTTM files. None stores them in the ./rttm_output/ directory |

Returns:

| Variable | Detail |
| --- | --- |
| hypothesis_dir | _str_, Directory containing the RTTM files with the predicted speaker labels and their timestamps |
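A usage sketch of the two modes described above. The DiarizationDataset construction is hypothetical (its actual arguments are defined in utils); only the diarizationDEC call itself follows the documented signature.

```python
from DEC import diarizationDEC
from utils import DiarizationDataset

# Hypothetical dataset construction; see utils for the real arguments.
test_set = DiarizationDataset()

# Oracle number of speakers, as an optimal benchmark for DEC.
hyp_dir = diarizationDEC(test_set, num_spkr="oracle", hypothesis_dir="./rttm_output/")

# Eigen-gap estimate of the number of speakers (num_spkr=None, the default).
hyp_dir = diarizationDEC(test_set)
print(hyp_dir)   # directory containing the predicted RTTM files
```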