This project provides tools to compare images using different image quality metrics and color spaces.
- Compute Image Quality Assessment Metrics: Assess quality with multiple full-reference and no-reference metrics
- Image Difference: Generate thresholded difference images to highlight significant differences between two images
- Heatmaps Generation: Generate metric maps visualizing the spatial distribution of metric values across the image
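The image-difference mode above can be sketched in a few lines. This is a minimal numpy illustration of the idea (per-pixel absolute difference, binarized at a threshold), not necessarily LIBRA's exact implementation:

```python
import numpy as np

def thresholded_difference(reference, distorted, threshold=10):
    """Return a binary mask highlighting pixels whose absolute difference
    exceeds `threshold` (a sketch of the idea, not LIBRA's exact code)."""
    # Cast to a signed type so the subtraction cannot wrap around.
    diff = np.abs(reference.astype(np.int16) - distorted.astype(np.int16))
    # Collapse color channels (if any) by taking the maximum difference.
    if diff.ndim == 3:
        diff = diff.max(axis=2)
    return (diff > threshold).astype(np.uint8) * 255

# Tiny example: two 2x2 grayscale images differing in one pixel.
a = np.array([[10, 10], [10, 10]], dtype=np.uint8)
b = np.array([[10, 10], [10, 40]], dtype=np.uint8)
mask = thresholded_difference(a, b, threshold=10)
```

The signed cast before subtracting matters: subtracting `uint8` arrays directly would wrap around instead of going negative.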
It supports:
- 18 full-reference metrics
- 5 no-reference metrics
- image diffs in 8 different color spaces with flexible thresholding
The full-reference and no-reference metrics come from the piq, pyiqa, and ImageHash Python packages (MSE is implemented natively; the tables below list the package backing each metric).
Some example comparison databases are available here: https://lanl.github.io/libra/
- Clone the repository:
git clone https://github.com/lanl/libra
- Install dependencies:
pip install opencv-python-headless numpy matplotlib scikit-image torch piq pyiqa ImageHash
Note: some dependencies are not available through conda. We recommend using virtual environments for now.
A command line interface is provided, which is accessible as follows:
python src/app.py -h
There are three modes to use the tool:
- using a JSON file
- using the command line interface
- as a library
The JSON configuration file should contain the following keys:
- reference_image_path (str): Path to the reference image.
- distorted_image_path (str): Path to the distorted image.
- output_directory (str): Path to the output directory where the CSV file and metric maps will be saved.
- output_filename (str, optional): Name of the output CSV file (default: "metrics.csv").
- generate_metrics (bool, optional): Flag to generate metrics (default: False).
- generate_maps (bool, optional): Flag to generate metric maps (default: False).
- generate_image_difference (bool, optional): Flag to generate thresholded difference images (default: False).
- difference_threshold (int, optional): Threshold value for generating thresholded difference images (default: 10).
- metrics (list of str, optional): List of metrics to compute.
- color_spaces (list of str, optional): List of color spaces to use for computing metrics (default: ["RGB"]).
- map_window_size (int, optional): Window size for computing metric maps (default: 161).
- map_step_size (int, optional): Step size for computing metric maps (default: 50).
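The map_window_size and map_step_size keys describe a sliding-window pass: the chosen metric is evaluated on window_size × window_size patches placed every step_size pixels, and the per-window values form the metric map. A rough numpy sketch, using MSE as the per-window metric (the project's own windowing details may differ):

```python
import numpy as np

def metric_map(reference, distorted, window_size=161, step_size=50):
    """Slide a window over both images and record a per-window metric
    (MSE here), yielding a coarse spatial map of local quality."""
    h, w = reference.shape[:2]
    rows = []
    for y in range(0, h - window_size + 1, step_size):
        row = []
        for x in range(0, w - window_size + 1, step_size):
            ref = reference[y:y + window_size, x:x + window_size].astype(float)
            dst = distorted[y:y + window_size, x:x + window_size].astype(float)
            row.append(np.mean((ref - dst) ** 2))
        rows.append(row)
    return np.array(rows)

# 300x300 images, 100-pixel windows every 100 pixels -> a 3x3 map.
m = metric_map(np.zeros((300, 300)), np.ones((300, 300)),
               window_size=100, step_size=100)
```

The resulting low-resolution grid is what the heatmap mode renders with a colormap.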
Here is an example of a JSON configuration, also available in the samples folder:
{
"reference_image_path": "tests/data/test/orig.png",
"distorted_image_path": "tests/data/test/compressed.png",
"output_directory": "test_output",
"output_filename": "metrics.csv",
"generate_maps": true,
"generate_metrics": true,
"generate_image_difference": true,
"difference_threshold": 10,
"metrics": ["PSNR", "SSIM", "VSI", "GMSD", "MSE", "DSS"],
"color_spaces": ["RGB", "HSV", "LAB"],
"map_window_size": 161,
"map_step_size": 50
}
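For scripted use, such a configuration can be parsed and sanity-checked with the standard library before the tool runs. The key names and defaults below come from the list above; the load_config helper itself is illustrative, not part of LIBRA's API:

```python
import json

# Keys the documentation marks as required (not "optional").
REQUIRED_KEYS = {"reference_image_path", "distorted_image_path", "output_directory"}

def load_config(text):
    """Parse a LIBRA-style JSON configuration and apply the documented defaults."""
    config = json.loads(text)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"missing required keys: {sorted(missing)}")
    config.setdefault("output_filename", "metrics.csv")
    config.setdefault("difference_threshold", 10)
    config.setdefault("color_spaces", ["RGB"])
    config.setdefault("map_window_size", 161)
    config.setdefault("map_step_size", 50)
    return config

cfg = load_config('{"reference_image_path": "a.png", '
                  '"distorted_image_path": "b.png", '
                  '"output_directory": "out"}')
```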
It can be run from the repository's root directory as follows:
python src/app.py -j samples/sample_input.json
The command line interface is useful for quick comparisons between two images, for example:
python src/app.py -r tests/data/test/orig.png -c tests/data/test/compressed.png -m SSIM -p
To use the tool as a library, refer to the example.ipynb notebook in the samples folder.
This example evaluates the visualization quality of an isotropic turbulence dataset subjected to tensor compression with a maximum Peak Signal-to-Noise Ratio (PSNR) of 40. The assessment focuses on how effectively the tensor compression retains the visual fidelity of the turbulence data.
References
Dataset: https://klacansky.com/open-scivis-datasets/
Compression Technique: https://github.com/rballester/tthresh
(Figure: the reference image alongside the compressed image at PSNR 40.)
The supported color spaces are:

Color Space | Description |
---|---|
RGB | Standard color space with three primary colors: Red, Green, and Blue. Commonly used in digital images and displays. |
HSV | Stands for Hue, Saturation, and Value. Often used in image processing and computer vision because it separates color. |
HLS | Stands for Hue, Lightness, and Saturation. Similar to HSV but with a different way of representing colors. |
LAB | Consists of three components: Lightness (L*), a* (green to red), and b* (blue to yellow). Mimics human vision. |
XYZ | A linear color space derived from the CIE 1931 color matching functions. Basis for many other color spaces. |
LUV | Similar to LAB but with a different chromaticity component. Used in color difference calculations and image analysis. |
YCbCr | Color space used in video compression. Separates the image into luminance (Y) and chrominance (Cb and Cr) components. |
YUV | Used in analog television and some digital video formats. Separates image into luminance (Y) and chrominance (U and V). |
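To see concretely how HSV "separates color" from intensity, the standard library's colorsys module converts individual RGB triples (OpenCV's cv2.cvtColor performs the equivalent per-pixel conversion on whole images):

```python
import colorsys

# RGB components are given in [0, 1]; the result is (hue, saturation, value).
red_hsv = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)    # pure red
gray_hsv = colorsys.rgb_to_hsv(0.5, 0.5, 0.5)   # a mid gray

# Red: full saturation at hue 0. Gray: zero saturation, so all of its
# information lives in the value (intensity) channel alone.
```

Computing a metric in HSV or LAB instead of RGB can therefore weight chromatic and intensity differences differently.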
The following colormaps are available for rendering metric maps:

- AUTUMN
- BONE
- JET
- WINTER
- RAINBOW
- OCEAN
- SUMMER
- SPRING
- COOL
- HSV
- PINK
- HOT
- PARULA
- MAGMA
- INFERNO
- PLASMA
- VIRIDIS
- CIVIDIS
- TWILIGHT
- TWILIGHT_SHIFTED
- TURBO
- DEEPGREEN
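These names mirror OpenCV's cv2.applyColorMap constants (e.g. cv2.COLORMAP_VIRIDIS). Applying a colormap is just a 256-entry lookup from gray value to color; here is a toy numpy sketch with a hand-rolled blue-to-red ramp standing in for a real table:

```python
import numpy as np

def apply_colormap(gray, lut):
    """Map each 8-bit gray value through a (256, 3) lookup table --
    which is all a colormap application does."""
    return lut[gray]

# Toy LUT: a linear ramp from blue (low values) to red (high values),
# a stand-in for real tables such as JET or VIRIDIS.
ramp = np.linspace(0, 255, 256, dtype=np.uint8)
lut = np.stack([ramp, np.zeros(256, dtype=np.uint8), ramp[::-1]], axis=1)

gray = np.array([[0, 255]], dtype=np.uint8)   # darkest and brightest pixels
heat = apply_colormap(gray, lut)              # (1, 2, 3) RGB heatmap
```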
The supported full-reference metrics are:

Metric | Python Package | Description | Value Ranges |
---|---|---|---|
MSE | libra | Measures the average squared difference between the reference and test images. | Range: [0, ∞). Lower MSE indicates higher similarity. |
SSIM | piq | Assesses the structural similarity between images considering luminance, contrast, and structure. | Range: [-1, 1]. Higher values indicate better similarity. |
PSNR | piq | Represents the ratio between the maximum possible power of a signal and the power of corrupting noise. | Range: [0, ∞) dB. Higher values indicate better image quality. |
FSIM | piq | Evaluates image quality based on feature similarity considering phase congruency and gradient magnitude. | Range: [0, 1]. Higher values indicate better feature similarity. |
MS-SSIM | piq | Extension of SSIM that evaluates image quality at multiple scales. | Range: [0, 1]. Higher values indicate better structural similarity. |
VSI | piq | Measures image quality based on visual saliency. | Range: [0, 1]. Higher values indicate better visual similarity. |
SR-SIM | piq | Assesses image quality using spectral residual information. | Range: [0, 1]. Higher values indicate better visual similarity. |
MS-GMSD | piq | Evaluates image quality based on gradient magnitude similarity across multiple scales. | Range: [0, ∞). Lower values indicate higher similarity. |
LPIPS | piq | Uses deep learning models to assess perceptual similarity. | Range: [0, 1]. Lower values indicate higher similarity. |
PieAPP | piq | Deep learning-based metric for perceptual image quality. | Range: [0, 1]. Lower values indicate higher quality. |
DISTS | piq | Combines deep learning features to evaluate image quality based on structure and texture similarity. | Range: [0, 1]. Lower values indicate higher similarity. |
MDSI | piq | Measures image quality based on mean deviation similarity index. | Range: [0, ∞). Lower values indicate better quality. |
DSS | piq | Computes image quality using a detailed similarity structure. | Range: [0, 1]. Higher values indicate better similarity. |
IW-SSIM | piq | Information-weighted SSIM that emphasizes important regions in images. | Range: [0, 1]. Higher values indicate better structural similarity. |
VIFp | piq | Measures image quality based on visual information fidelity. | Range: [0, 1]. Higher values indicate better preservation of information. |
GMSD | piq | Gradient Magnitude Similarity Deviation metric for assessing image quality. | Range: [0, ∞). Lower values indicate higher similarity. |
HaarPSI | piq | Uses Haar wavelet-based perceptual similarity index to evaluate image quality. | Range: [0, 1]. Higher values indicate better perceptual similarity. |
pHash | ImageHash | Generates a compact perceptual hash of each image; images are compared via the Hamming distance between their hashes. | Range: [0, ∞). Higher values indicate worse perceptual similarity. |
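Two of these metrics can be checked by hand: MSE is the mean squared pixel difference, and PSNR follows from it as 10·log10(MAX²/MSE) for peak pixel value MAX. A self-contained numpy sketch consistent with the table's definitions:

```python
import numpy as np

def mse(reference, distorted):
    """Mean squared error between two images (lower is more similar)."""
    return np.mean((reference.astype(float) - distorted.astype(float)) ** 2)

def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(reference, distorted)
    if error == 0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / error)

# A uniform offset of 10 gray levels gives MSE = 100
# and PSNR = 10 * log10(255^2 / 100) ~= 28.13 dB.
a = np.full((4, 4), 100, dtype=np.uint8)
b = np.full((4, 4), 110, dtype=np.uint8)
```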
The supported no-reference metrics are:

Metric | Python Package | Description | Value Ranges |
---|---|---|---|
BRISQUE | pyiqa | Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) uses natural scene statistics to measure image quality. | Range: [0, 100]. Lower values indicate better quality. |
CLIP-IQA | piq | Image quality metric that utilizes the CLIP model to assess the visual quality of images based on their similarity to predefined text prompts. | Range: [0, 1]. Higher values indicate better quality. |
NIQE | pyiqa | Natural Image Quality Evaluator. It assesses image quality based on statistical features derived from natural scene statistics. | Range: [0, 100]. Lower values indicate better quality. |
MUSIQ | pyiqa | Multi-Scale Image Quality. An advanced metric that evaluates image quality across multiple scales to better capture perceptual quality. | Range: [0, 1]. Higher values indicate better quality. |
NIMA | pyiqa | Neural Image Assessment. A deep learning-based model that predicts the aesthetic and technical quality of images. | Range: [0, 10]. Higher values indicate better quality. |
This Cinema database shows the results of a small lossy-compression study.