
bernardinelli/gmm_anyk


gmm_anyk

Gaussian Mixture Model package accounting for measurement uncertainties, selection effects, and an arbitrary number of components

Introduction

This repository implements a few modifications of the standard Gaussian Mixture Model algorithm that allow one to use datasets with measurement uncertainties and selection effects, and to determine, during training, the optimal number of Gaussians required to describe the data.
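For orientation, the standard algorithm that these modifications build on is expectation-maximization (EM). The following is a minimal sketch of one textbook EM iteration with no uncertainties or selection effects. It is not this package's code: the function em_step and the toy two-cluster data are illustrative assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal


def em_step(x, weights, means, covs):
    """One EM iteration of a standard GMM (hypothetical helper, not part of gmm_anyk).

    Returns the updated parameters and the log-likelihood of the data
    under the parameters that were passed in.
    """
    n, d = x.shape
    k = len(weights)

    # E-step: log of (weight * Gaussian density) for every point and component
    log_r = np.empty((n, k))
    for j in range(k):
        log_r[:, j] = np.log(weights[j]) + multivariate_normal.logpdf(x, means[j], covs[j])
    log_norm = np.logaddexp.reduce(log_r, axis=1)  # per-point log-likelihood
    resp = np.exp(log_r - log_norm[:, None])       # responsibilities, rows sum to 1

    # M-step: responsibility-weighted estimates of weights, means, and covariances
    nk = resp.sum(axis=0)
    new_weights = nk / n
    new_means = (resp.T @ x) / nk[:, None]
    new_covs = np.empty((k, d, d))
    for j in range(k):
        diff = x - new_means[j]
        new_covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j]

    return new_weights, new_means, new_covs, log_norm.sum()


# Toy data: two well-separated 2D clusters
rng = np.random.default_rng(0)
x = np.vstack([
    rng.multivariate_normal([-3.0, 0.0], np.identity(2), size=300),
    rng.multivariate_normal([3.0, 0.0], np.identity(2), size=300),
])

weights = np.array([0.5, 0.5])
means = np.array([[-1.0, 0.0], [1.0, 0.0]])
covs = np.array([np.identity(2), np.identity(2)])

logliks = []
for _ in range(50):
    weights, means, covs, ll = em_step(x, weights, means, covs)
    logliks.append(ll)

print(means)
```

The log-likelihood stored in logliks should never decrease from one iteration to the next, which is the property that a tolerance-based stopping rule relies on.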

The following references detail the procedure:

The full implementation will be described in Bernardinelli et al., 2025, arXiv:2501.01551.

Basic usage

Suppose we have a 3D dataset x. In this example, the dataset contains a single cluster:

import numpy as np
import gmm_anyk as ga

# 1000 samples from a 3D Gaussian centered at (0, 2, 3) with identity covariance
x = np.random.multivariate_normal(np.array([0., 2., 3.]), np.identity(3), size=1000)

gmm = ga.GMM(1, 3)  # number of clusters, number of dimensions
gmm.fit(x,               # your data
        scale=1,         # initial guess for the scale of the standard deviation
        tolerance=1e-5,  # log-likelihood tolerance at which training is considered successful
        maxiter=10000,   # maximum number of iterations
        miniter=1000)    # minimum number of iterations
print(gmm.mean_best)  # best-fit mean; it should be close to the input mean
print(gmm.cov_best)   # best-fit covariance
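As a sanity check on a fit like this, note that for a single Gaussian with no measurement uncertainties the maximum-likelihood solution is the sample mean and the 1/N-normalized sample covariance. The sketch below computes those reference values with numpy alone. It assumes the fit has converged to the maximum-likelihood solution, and it makes no assumption about the array shapes of gmm.mean_best or gmm.cov_best.

```python
import numpy as np

# Same kind of dataset as above, with a fixed seed for reproducibility
rng = np.random.default_rng(42)
true_mean = np.array([0., 2., 3.])
x = rng.multivariate_normal(true_mean, np.identity(3), size=1000)

# Closed-form maximum-likelihood estimates for a single Gaussian
sample_mean = x.mean(axis=0)
sample_cov = np.cov(x, rowvar=False, bias=True)  # bias=True gives the 1/N normalization

print(sample_mean)
print(sample_cov)
```

A converged one-component fit should agree with these values up to the tolerance of the optimizer. A large discrepancy suggests the training stopped early or the initialization was poor.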

Beyond the basic GMM class, GMMNoise includes measurement uncertainties, AdaptiveGMM determines the number of components during training, and IncompleteGMM implements a stochastic selection effect. IncompleteAdaptiveGMMNoise implements all three of these effects at once. The classes also include options for numerical regularization and other initialization techniques.
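To illustrate why measurement uncertainties matter, a common way to include them is to add each point's noise covariance to the component covariance when evaluating the likelihood. The sketch below does this with numpy. It is a generic illustration and not necessarily the exact formulation used by GMMNoise: the function noisy_gmm_loglike, the noise model, and the toy data are all assumptions.

```python
import numpy as np


def noisy_gmm_loglike(x, noise_covs, weights, means, covs):
    """Log-likelihood of a GMM where point i has its own noise covariance.

    Hypothetical helper: component j is evaluated at point i with total
    covariance covs[j] + noise_covs[i].
    """
    n, d = x.shape
    log_terms = np.empty((n, len(weights)))
    for j in range(len(weights)):
        total = covs[j][None, :, :] + noise_covs           # shape (n, d, d)
        diff = x - means[j]
        sol = np.linalg.solve(total, diff[:, :, None])[:, :, 0]
        maha = np.einsum("ni,ni->n", diff, sol)            # Mahalanobis distances
        _, logdet = np.linalg.slogdet(total)
        log_terms[:, j] = np.log(weights[j]) - 0.5 * (maha + logdet + d * np.log(2 * np.pi))
    return np.logaddexp.reduce(log_terms, axis=1).sum()


# Toy data: intrinsic unit Gaussian, observed with a different noise level per point
rng = np.random.default_rng(1)
n, d = 2000, 2
sigma = rng.uniform(0.2, 1.0, size=n)                      # per-point noise scale
noise_covs = sigma[:, None, None] ** 2 * np.identity(d)    # shape (n, d, d)
truth = rng.multivariate_normal(np.zeros(d), np.identity(d), size=n)
x = truth + sigma[:, None] * rng.standard_normal((n, d))

weights = np.array([1.0])
means = np.zeros((1, d))

# Intrinsic covariance used to generate the data
ll_true = noisy_gmm_loglike(x, noise_covs, weights, means, np.identity(d)[None])

# Naive estimate: the sample covariance of the noisy data, which is inflated by the noise
naive_cov = np.cov(x, rowvar=False, bias=True)
ll_naive = noisy_gmm_loglike(x, noise_covs, weights, means, naive_cov[None])

print(ll_true, ll_naive)
```

In this setup the naive covariance absorbs the noise and overestimates the intrinsic spread, so the noise-aware likelihood favors the true intrinsic covariance over it.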

Dependencies

  • numpy
  • numba
  • scipy
  • Optional: compress_pickle (allows one to save and load the GMM class)
