Support encoding 10-bit input data #94

Merged (3 commits), Dec 26, 2024
212 changes: 77 additions & 135 deletions ravif/src/av1encoder.rs
@@ -5,7 +5,7 @@ use crate::error::Error;
use crate::rayoff as rayon;
use imgref::{Img, ImgVec};
use rav1e::prelude::*;
use rgb::{RGB8, RGBA8};
use rgb::{Rgb, Rgba};

/// For [`Encoder::with_internal_color_model`]
#[derive(Debug, Copy, Clone, Eq, PartialEq)]
@@ -38,13 +38,25 @@ pub enum AlphaColorMode {
Premultiplied,
}

/// The 8-bit mode only exists as a historical curiosity caused by lack of interoperability with old Safari versions.
/// There's no other reason to use it. 8 bits internally isn't precise enough for a complex codec like AV1, and 10 bits always compresses much better (even if the input and output are 8-bit sRGB).
/// The workaround for Safari is no longer needed, and the 8-bit encoding is planned to be deleted in a few months when usage of the oldest Safari versions becomes negligible.
/// https://github.com/kornelski/cavif-rs/pull/94#discussion_r1883073823
#[derive(Default, Debug, Copy, Clone, Eq, PartialEq)]
pub enum BitDepth {
#[default]
Eight,
#[default]
Ten,
/// Pick 8 or 10 depending on image format and decoder compatibility
Auto,
}

impl BitDepth {
/// Returns the bit depth in usize, this can currently be either `8` or `10`.
fn to_usize(self) -> usize {
match self {
BitDepth::Eight => 8,
BitDepth::Ten => 10,
}
}
}

/// The newly-created image file + extra info FYI
@@ -110,11 +122,12 @@ impl Encoder {
#[doc(hidden)]
#[deprecated(note = "Renamed to with_bit_depth")]
pub fn with_depth(self, depth: Option<u8>) -> Self {
self.with_bit_depth(depth.map(|d| if d >= 10 { BitDepth::Ten } else { BitDepth::Eight }).unwrap_or(BitDepth::Auto))
self.with_bit_depth(depth.map(|d| if d >= 10 { BitDepth::Ten } else { BitDepth::Eight }).unwrap_or(BitDepth::Ten))
}

/// Depth 8 or 10.
/// Depth 8 or 10-bit, default is 10-bit, even when 8 bit input data is provided.
#[inline(always)]
#[track_caller]
#[must_use]
pub fn with_bit_depth(mut self, depth: BitDepth) -> Self {
self.depth = depth;
@@ -154,7 +167,6 @@ impl Encoder {
}

#[doc(hidden)]
#[deprecated = "Renamed to `with_internal_color_model()`"]
pub fn with_internal_color_space(self, color_model: ColorModel) -> Self {
self.with_internal_color_model(color_model)
}
@@ -199,13 +211,11 @@ impl Encoder {
///
/// If all pixels are opaque, the alpha channel will be left out automatically.
///
/// This function takes 8-bit inputs, but will generate an AVIF file using 10-bit depth.
///
/// returns AVIF file with info about sizes about AV1 payload.
pub fn encode_rgba(&self, in_buffer: Img<&[rgb::RGBA<u8>]>) -> Result<EncodedImage, Error> {
pub fn encode_rgba<P: Pixel + Default>(&self, in_buffer: Img<&[Rgba<P>]>) -> Result<EncodedImage, Error> {
let new_alpha = self.convert_alpha(in_buffer);
let buffer = new_alpha.as_ref().map(|b| b.as_ref()).unwrap_or(in_buffer);
let use_alpha = buffer.pixels().any(|px| px.a != 255);
let buffer = new_alpha.as_ref().map(|b: &Img<Vec<Rgba<P>>>| b.as_ref()).unwrap_or(in_buffer);
let use_alpha = buffer.pixels().any(|px| px.a != P::cast_from(255));
Review comment (Owner):

cast_from doesn't seem to be applying any scaling. For 10-bit depth, the opaque alpha is 1023, so I don't see how this would work.

Reply (Contributor Author):

This change only switches the type to rav1e::Pixel, since that is what is eventually expected anyway. I wanted to keep the alpha changes for a separate PR so this one stays reviewable; this doesn't change the functionality. Hardcoding this based on the format didn't seem like a great solution to me, since 12-bit values are also valid.
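
To make the scaling concern in this thread concrete, here is a minimal sketch of how the fully opaque alpha value depends on the bit depth. The helper names `max_sample_value` and `is_opaque` are hypothetical and are not part of this PR or of rav1e; the sketch only assumes that samples fit in a `u16`.

```rust
/// Largest sample value for a given bit depth (hypothetical helper, not from the PR).
fn max_sample_value(bit_depth: u32) -> u16 {
    assert!((1..=16).contains(&bit_depth), "bit depth must be in 1..=16");
    ((1u32 << bit_depth) - 1) as u16
}

/// True when `alpha` is fully opaque at the given bit depth.
/// Comparing against a fixed 255 would only be correct for 8-bit samples.
fn is_opaque(alpha: u16, bit_depth: u32) -> bool {
    alpha == max_sample_value(bit_depth)
}
```

With these helpers, `is_opaque(255, 10)` is false while `is_opaque(1023, 10)` is true, which is the mismatch the Owner points out; a depth of 12 works the same way with 4095.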

if !use_alpha {
return self.encode_rgb_internal(buffer.width(), buffer.height(), buffer.pixels().map(|px| px.rgb()));
}
@@ -216,49 +226,31 @@ impl Encoder {
ColorModel::YCbCr => MatrixCoefficients::BT601,
ColorModel::RGB => MatrixCoefficients::Identity,
};
match self.depth {
BitDepth::Eight | BitDepth::Auto => {
let planes = buffer.pixels().map(|px| {
let (y, u, v) = match self.color_model {
ColorModel::YCbCr => rgb_to_8_bit_ycbcr(px.rgb(), BT601),
ColorModel::RGB => rgb_to_8_bit_gbr(px.rgb()),
};
[y, u, v]
});
let alpha = buffer.pixels().map(|px| px.a);
self.encode_raw_planes_8_bit(width, height, planes, Some(alpha), PixelRange::Full, matrix_coefficients)
},
BitDepth::Ten => {
let planes = buffer.pixels().map(|px| {
let (y, u, v) = match self.color_model {
ColorModel::YCbCr => rgb_to_10_bit_ycbcr(px.rgb(), BT601),
ColorModel::RGB => rgb_to_10_bit_gbr(px.rgb()),
};
[y, u, v]
});
let alpha = buffer.pixels().map(|px| to_ten(px.a));
self.encode_raw_planes_10_bit(width, height, planes, Some(alpha), PixelRange::Full, matrix_coefficients)
},
}
let planes = buffer.pixels().map(|px| match self.color_model {
ColorModel::YCbCr => rgb_to_ycbcr(px.rgb(), self.depth, BT601),
ColorModel::RGB => [px.g, px.b, px.r],
});
let alpha = buffer.pixels().map(|px| px.a);
self.encode_raw_planes(width, height, planes, Some(alpha), PixelRange::Full, matrix_coefficients)
}

fn convert_alpha(&self, in_buffer: Img<&[RGBA8]>) -> Option<ImgVec<RGBA8>> {
fn convert_alpha<P: Pixel + Default>(&self, in_buffer: Img<&[Rgba<P>]>) -> Option<ImgVec<Rgba<P>>> {
let max_value = (1 << self.depth.to_usize()) -1;
match self.alpha_color_mode {
AlphaColorMode::UnassociatedDirty => None,
AlphaColorMode::UnassociatedClean => blurred_dirty_alpha(in_buffer),
AlphaColorMode::Premultiplied => {
let prem = in_buffer.pixels()
.filter(|px| px.a != 255)
let prem = in_buffer
.pixels()
.filter(|px| px.a != P::cast_from(max_value))
.map(|px| {
if px.a == 0 {
RGBA8::default()
if Into::<u32>::into(px.a) == 0 {
Rgba::new(px.a, px.a, px.a, px.a)
} else {
RGBA8::new(
(u16::from(px.r) * 255 / u16::from(px.a)) as u8,
(u16::from(px.g) * 255 / u16::from(px.a)) as u8,
(u16::from(px.b) * 255 / u16::from(px.a)) as u8,
px.a,
)
let r = px.r * P::cast_from(max_value) / px.a;
let g = px.g * P::cast_from(max_value) / px.a;
let b = px.b * P::cast_from(max_value) / px.a;
Rgba::new(r, g, b, px.a)
}
})
.collect();
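
Note on the hunk above: the replaced code widened each channel to `u16` before multiplying by 255, while the new lines multiply in the sample type `P` itself. If `P` is `u8`, a product such as `px.r * 255` may not fit in the sample type. The following is a minimal sketch of the same division by alpha with explicit widening; `unpremultiply` is a hypothetical helper using plain `u16` samples, not the function from the PR.

```rust
/// Divide a colour sample by its alpha and rescale to `max_value`.
/// Hypothetical helper for illustration; not part of the PR.
fn unpremultiply(sample: u16, alpha: u16, max_value: u16) -> u16 {
    if alpha == 0 {
        // Fully transparent: the colour carries no information.
        return 0;
    }
    // Widen to u32 so that `sample * max_value` cannot overflow
    // (1023 * 1023 already exceeds u16::MAX).
    let wide = u32::from(sample) * u32::from(max_value) / u32::from(alpha);
    // Clamp in case the input had sample > alpha.
    wide.min(u32::from(max_value)) as u16
}
```

The clamp is an addition in this sketch; whether the encoder should clamp or reject such input is a design choice the diff does not settle.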
@@ -284,76 +276,49 @@ impl Encoder {
///
/// returns AVIF file, size of color metadata
#[inline]
pub fn encode_rgb(&self, buffer: Img<&[RGB8]>) -> Result<EncodedImage, Error> {
pub fn encode_rgb<P: Pixel + Default>(&self, buffer: Img<&[Rgb<P>]>) -> Result<EncodedImage, Error> {
self.encode_rgb_internal(buffer.width(), buffer.height(), buffer.pixels())
}

fn encode_rgb_internal(&self, width: usize, height: usize, pixels: impl Iterator<Item = RGB8> + Send + Sync) -> Result<EncodedImage, Error> {
fn encode_rgb_internal<P: Pixel + Default>(
&self, width: usize, height: usize, pixels: impl Iterator<Item = Rgb<P>> + Send + Sync,
) -> Result<EncodedImage, Error> {
let matrix_coefficients = match self.color_model {
ColorModel::YCbCr => MatrixCoefficients::BT601,
ColorModel::RGB => MatrixCoefficients::Identity,
};

match self.depth {
BitDepth::Eight => {
let planes = pixels.map(|px| {
let (y, u, v) = match self.color_model {
ColorModel::YCbCr => rgb_to_8_bit_ycbcr(px, BT601),
ColorModel::RGB => rgb_to_8_bit_gbr(px),
};
[y, u, v]
});
self.encode_raw_planes_8_bit(width, height, planes, None::<[_; 0]>, PixelRange::Full, matrix_coefficients)
},
BitDepth::Ten | BitDepth::Auto => {
let planes = pixels.map(|px| {
let (y, u, v) = match self.color_model {
ColorModel::YCbCr => rgb_to_10_bit_ycbcr(px, BT601),
ColorModel::RGB => rgb_to_10_bit_gbr(px),
};
[y, u, v]
});
self.encode_raw_planes_10_bit(width, height, planes, None::<[_; 0]>, PixelRange::Full, matrix_coefficients)
},
}
}
let is_eight_bit = std::mem::size_of::<P>() == 1;
let input_bit_depth = if is_eight_bit { BitDepth::Eight } else { BitDepth::Ten };

/// Encodes AVIF from 3 planar channels that are in the color space described by `matrix_coefficients`,
/// with sRGB transfer characteristics and color primaries.
///
/// Alpha always uses full range. Chroma subsampling is not supported, and it's a bad idea for AVIF anyway.
/// If there's no alpha, use `None::<[_; 0]>`.
///
/// returns AVIF file, size of color metadata, size of alpha metadata overhead
#[inline]
pub fn encode_raw_planes_8_bit(
&self, width: usize, height: usize, planes: impl IntoIterator<Item = [u8; 3]> + Send, alpha: Option<impl IntoIterator<Item = u8> + Send>,
color_pixel_range: PixelRange, matrix_coefficients: MatrixCoefficients,
) -> Result<EncodedImage, Error> {
self.encode_raw_planes(width, height, planes, alpha, color_pixel_range, matrix_coefficients, 8)
// First convert from RGB to GBR or YCbCr
let planes = pixels.map(|px| match self.color_model {
ColorModel::YCbCr => rgb_to_ycbcr(px, input_bit_depth, BT601),
ColorModel::RGB => [px.g, px.b, px.r],
});

// Then convert the bit depth when needed.
if self.depth != BitDepth::Eight && is_eight_bit {
let planes_u16 = planes.map(|px| [to_ten(px[0]), to_ten(px[1]), to_ten(px[2])]);
self.encode_raw_planes(width, height, planes_u16, None::<[_; 0]>, PixelRange::Full, matrix_coefficients)
} else {
self.encode_raw_planes(width, height, planes, None::<[_; 0]>, PixelRange::Full, matrix_coefficients)
}
}

/// Encodes AVIF from 3 planar channels that are in the color space described by `matrix_coefficients`,
/// with sRGB transfer characteristics and color primaries.
///
/// The pixels are 10-bit (values `0.=1023`).
/// If pixels are 10-bit values range from `0.=1023`.
///
/// Alpha always uses full range. Chroma subsampling is not supported, and it's a bad idea for AVIF anyway.
/// If there's no alpha, use `None::<[_; 0]>`.
///
/// returns AVIF file, size of color metadata, size of alpha metadata overhead
#[inline]
pub fn encode_raw_planes_10_bit(
&self, width: usize, height: usize, planes: impl IntoIterator<Item = [u16; 3]> + Send, alpha: Option<impl IntoIterator<Item = u16> + Send>,
color_pixel_range: PixelRange, matrix_coefficients: MatrixCoefficients,
) -> Result<EncodedImage, Error> {
self.encode_raw_planes(width, height, planes, alpha, color_pixel_range, matrix_coefficients, 10)
}

#[inline(never)]
fn encode_raw_planes<P: rav1e::Pixel + Default>(
fn encode_raw_planes<P: Pixel + Default>(
&self, width: usize, height: usize, planes: impl IntoIterator<Item = [P; 3]> + Send, alpha: Option<impl IntoIterator<Item = P> + Send>,
color_pixel_range: PixelRange, matrix_coefficients: MatrixCoefficients, bit_depth: u8,
color_pixel_range: PixelRange, matrix_coefficients: MatrixCoefficients
) -> Result<EncodedImage, Error> {
let color_description = Some(ColorDescription {
transfer_characteristics: TransferCharacteristics::SRGB,
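
Note on the hunk above: the "convert the bit depth when needed" step relies on `to_ten`, which widens an 8-bit sample to 10 bits by bit replication. As a small illustration, this is the non-generic form of that helper as it appears in the removed lines of this diff, shown standalone.

```rust
/// Widen an 8-bit sample to 10 bits by bit replication:
/// shift left by 2 and copy the top 2 bits into the new low bits.
fn to_ten(x: u8) -> u16 {
    (u16::from(x) << 2) | (u16::from(x) >> 6)
}
```

Replicating the top bits, rather than only shifting, maps 0 to 0 and 255 to 1023, so both ends of the 8-bit range land on the ends of the 10-bit range.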
@@ -370,7 +335,7 @@ impl Encoder {
&Av1EncodeConfig {
width,
height,
bit_depth: bit_depth.into(),
bit_depth: self.depth.to_usize(),
quantizer: self.quantizer.into(),
speed: SpeedTweaks::from_my_preset(self.speed, self.quantizer),
threads,
@@ -387,7 +352,7 @@ impl Encoder {
&Av1EncodeConfig {
width,
height,
bit_depth: bit_depth.into(),
bit_depth: self.depth.to_usize(),
quantizer: self.alpha_quantizer.into(),
speed: SpeedTweaks::from_my_preset(self.speed, self.alpha_quantizer),
threads,
@@ -417,7 +382,7 @@ impl Encoder {
_ => return Err(Error::Unsupported("matrix coefficients")),
})
.premultiplied_alpha(self.premultiplied_alpha)
.to_vec(&color, alpha.as_deref(), width as u32, height as u32, bit_depth);
.to_vec(&color, alpha.as_deref(), width as u32, height as u32, self.depth.to_usize() as u8);
let color_byte_size = color.len();
let alpha_byte_size = alpha.as_ref().map_or(0, |a| a.len());

@@ -427,45 +392,24 @@ impl Encoder {
}
}

#[inline(always)]
fn to_ten(x: u8) -> u16 {
(u16::from(x) << 2) | (u16::from(x) >> 6)
}

#[inline(always)]
fn rgb_to_10_bit_gbr(px: rgb::RGB<u8>) -> (u16, u16, u16) {
(to_ten(px.g), to_ten(px.b), to_ten(px.r))
}
// const REC709: [f32; 3] = [0.2126, 0.7152, 0.0722];
const BT601: [f32; 3] = [0.2990, 0.5870, 0.1140];

#[inline(always)]
fn rgb_to_8_bit_gbr(px: rgb::RGB<u8>) -> (u8, u8, u8) {
(px.g, px.b, px.r)
fn to_ten<P: Pixel + Default>(x: P) -> u16 {
(u16::cast_from(x) << 2) | (u16::cast_from(x) >> 6)
}

// const REC709: [f32; 3] = [0.2126, 0.7152, 0.0722];
const BT601: [f32; 3] = [0.2990, 0.5870, 0.1140];

#[inline(always)]
fn rgb_to_ycbcr(px: rgb::RGB<u8>, depth: u8, matrix: [f32; 3]) -> (f32, f32, f32) {
fn rgb_to_ycbcr<P: Pixel + Default>(px: Rgb<P>, bit_depth: BitDepth, matrix: [f32; 3]) -> [P; 3] {
let depth = bit_depth.to_usize();
let max_value = ((1 << depth) - 1) as f32;
let scale = max_value / 255.;
let shift = (max_value * 0.5).round();
let y = scale * matrix[0] * f32::from(px.r) + scale * matrix[1] * f32::from(px.g) + scale * matrix[2] * f32::from(px.b);
let cb = (f32::from(px.b) * scale - y).mul_add(0.5 / (1. - matrix[2]), shift);
let cr = (f32::from(px.r) * scale - y).mul_add(0.5 / (1. - matrix[0]), shift);
(y.round(), cb.round(), cr.round())
}

#[inline(always)]
fn rgb_to_10_bit_ycbcr(px: rgb::RGB<u8>, matrix: [f32; 3]) -> (u16, u16, u16) {
let (y, u, v) = rgb_to_ycbcr(px, 10, matrix);
(y as u16, u as u16, v as u16)
}

#[inline(always)]
fn rgb_to_8_bit_ycbcr(px: rgb::RGB<u8>, matrix: [f32; 3]) -> (u8, u8, u8) {
let (y, u, v) = rgb_to_ycbcr(px, 8, matrix);
(y as u8, u as u8, v as u8)
let y = scale * matrix[0] * u32::cast_from(px.r) as f32 + scale * matrix[1] * u32::cast_from(px.g) as f32 + scale * matrix[2] * u32::cast_from(px.b) as f32;
let cb = P::cast_from((u32::cast_from(px.b) as f32 * scale - y).mul_add(0.5 / (1. - matrix[2]), shift).round() as u16);
let cr = P::cast_from((u32::cast_from(px.r) as f32 * scale - y).mul_add(0.5 / (1. - matrix[0]), shift).round() as u16);
[P::cast_from(y.round() as u16), cb, cr]
}

fn quality_to_quantizer(quality: f32) -> u8 {
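
Note on the hunk above: the conversion in `rgb_to_ycbcr` can be tried out in isolation. The sketch below follows the same formula for 8-bit RGB input and a target depth, using concrete types instead of the generic `Pixel` bound. The name `rgb8_to_ycbcr` and the final clamp are assumptions of this sketch and do not appear in the PR.

```rust
const BT601: [f32; 3] = [0.2990, 0.5870, 0.1140];

/// Convert one 8-bit RGB pixel to full-range YCbCr at `depth` bits.
/// Hypothetical standalone variant of the formula in the diff.
fn rgb8_to_ycbcr(rgb: [u8; 3], depth: u32, matrix: [f32; 3]) -> [u16; 3] {
    let max_value = ((1u32 << depth) - 1) as f32;
    let scale = max_value / 255.0; // 8-bit input scaled to the target range
    let shift = (max_value * 0.5).round(); // chroma midpoint
    let [r, g, b] = rgb.map(f32::from);
    let y = scale * (matrix[0] * r + matrix[1] * g + matrix[2] * b);
    let cb = (b * scale - y).mul_add(0.5 / (1.0 - matrix[2]), shift);
    let cr = (r * scale - y).mul_add(0.5 / (1.0 - matrix[0]), shift);
    // Clamp after rounding so values stay inside 0..=max_value (added in this sketch).
    let quantize = |v: f32| v.round().clamp(0.0, max_value) as u16;
    [quantize(y), quantize(cb), quantize(cr)]
}
```

For example, white `[255, 255, 255]` maps to full luma with both chroma channels at the midpoint, at either 8 or 10 bits. Pure red at 8 bits puts Cr right at the top of the range, which is the case the clamp guards against.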
@@ -652,9 +596,7 @@ fn rav1e_config(p: &Av1EncodeConfig) -> Config {
}
}

fn init_frame_3<P: rav1e::Pixel + Default>(
width: usize, height: usize, planes: impl IntoIterator<Item = [P; 3]> + Send, frame: &mut Frame<P>,
) -> Result<(), Error> {
fn init_frame_3<P: Pixel + Default>(width: usize, height: usize, planes: impl IntoIterator<Item = [P; 3]> + Send, frame: &mut Frame<P>) -> Result<(), Error> {
let mut f = frame.planes.iter_mut();
let mut planes = planes.into_iter();

@@ -677,7 +619,7 @@ fn init_frame_3<P: rav1e::Pixel + Default>(
Ok(())
}

fn init_frame_1<P: rav1e::Pixel + Default>(width: usize, height: usize, planes: impl IntoIterator<Item = P> + Send, frame: &mut Frame<P>) -> Result<(), Error> {
fn init_frame_1<P: Pixel + Default>(width: usize, height: usize, planes: impl IntoIterator<Item = P> + Send, frame: &mut Frame<P>) -> Result<(), Error> {
let mut y = frame.planes[0].mut_slice(Default::default());
let mut planes = planes.into_iter();

@@ -691,7 +633,7 @@ fn init_frame_1<P: rav1e::Pixel + Default>(width: usize, height: usize, planes:
}

#[inline(never)]
fn encode_to_av1<P: rav1e::Pixel>(p: &Av1EncodeConfig, init: impl FnOnce(&mut Frame<P>) -> Result<(), Error>) -> Result<Vec<u8>, Error> {
fn encode_to_av1<P: Pixel>(p: &Av1EncodeConfig, init: impl FnOnce(&mut Frame<P>) -> Result<(), Error>) -> Result<Vec<u8>, Error> {
let mut ctx: Context<P> = rav1e_config(p).new_context()?;
let mut frame = ctx.new_frame();
