Headers and crate documentation #16

Merged
merged 7 commits into from Jan 2, 2023
9 changes: 9 additions & 0 deletions Cargo.toml
@@ -11,10 +11,19 @@ repository = "https://github.com/diba-io/carbonado.git"
[dependencies]
anyhow = "1"
bao = "0.12.1"
bech32 = "0.9"
bitmask-enum = "2.1.0"
combination = "0.2.2"
ecies = { version = "0.2.2", default-features = false, features = ["pure"] }
hex = "0.4"
log = "0.4"
pretty_env_logger = "0.4"
secp256k1 = { version = "0.25.0", features = [
"global-context",
"rand-std",
"bitcoin-hashes-std",
"serde",
] }
serde = "1"
snap = "1"
zfec-rs = "0.1.0"
4 changes: 4 additions & 0 deletions README.md
@@ -21,6 +21,10 @@ Carbonado has features to make it resistant against:

All without needing a blockchain; however, blockchains can be useful for periodically checkpointing data in a durable place.

### Documentation

More detailed information on formats and operations can be found in the [carbonado crate docs](https://docs.rs/carbonado), hosted on [docs.rs](https://docs.rs).

### Checkpoints

Carbonado supports an optional Bitcoin-compatible HD wallet with a specific derivation path that can be used to secure timestamped Carbonado Checkpoints using an on-chain OP_RETURN.
52 changes: 51 additions & 1 deletion src/constants.rs
@@ -1,3 +1,53 @@
pub const SLICE_LEN: usize = 1024; // Bao slice length
use bitmask_enum::bitmask;

pub const SLICE_LEN: u16 = 1024; // Bao slice length
pub const FEC_K: usize = 4; // Zfec chunks needed
pub const FEC_M: usize = 8; // Zfec chunks produced

/// ## Bitmask for Carbonado formats c0-c15
///
/// | Format | Encryption | Compression | Verifiability | Error correction | Use-cases |
/// |-----|----|----|----|----|----|
/// | c0 | | | | | Marks a file as scanned by Carbonado |
/// | c1 | ✅ | | | | Encrypted incompressible throwaway append-only data streams such as CCTV footage |
/// | c2 | | ✅ | | | Rotating public logs |
/// | c3 | ✅ | ✅ | | | Private archives |
/// | c4 | | | ✅ | | Unencrypted incompressible data such as NFT/UDA image assets |
/// | c5 | ✅ | | ✅ | | Private media backups |
/// | c6 | | ✅ | ✅ | | Compiled binaries |
/// | c7 | ✅ | ✅ | ✅ | | Full drive backups |
/// | c8 | | | | ✅ | ??? |
/// | c9 | ✅ | | | ✅ | ??? |
/// | c10 | | ✅ | | ✅ | ??? |
/// | c11 | ✅ | ✅ | | ✅ | Encrypted device-local Catalogs |
/// | c12 | | | ✅ | ✅ | Publicly-available archival media |
/// | c13 | ✅ | | ✅ | ✅ | Georedundant private media backups |
/// | c14 | | ✅ | ✅ | ✅ | Source code, token genesis |
/// | c15 | ✅ | ✅ | ✅ | ✅ | Contract data |
///
/// These operations correspond to the following implementations:
///
/// | Implementation | Operation |
/// |-------|-------|
/// | ecies | Encryption |
/// | snap | Compression |
/// | bao | Verifiability |
/// | zfec | Error correction |
///
/// While the implementations are called in a different order, as outlined in [encoding::encode](crate::encode), operations are ordered this way in the bitmask to make the format more intuitive.
///
/// Verifiability is needed to pay others for storing or hosting your files, but it inhibits use-cases for mutable or append-only data other than snapshots, since the hash changes so frequently. Bao encoding adds relatively little overhead, about 5% at most.
///
/// Any data that is verifiable but also unencrypted is instead signed by the local key. This is good for signed compiled binaries or hosted webpages.
#[bitmask(u8)]
pub enum Format {
Ecies,
Snappy,
Bao,
Zfec,
}

/// "Magic number" used by the Carbonado file format.
pub const MAGICNO: [u8; 12] = [
b'C', b'A', b'R', b'B', b'O', b'N', b'A', b'D', b'O', b'0', b'0', b'\n',
];
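To make the table above concrete, here is a minimal sketch of how the flags compose into the c0-c15 format numbers (it assumes `constants` is exported as a public module; the `|` composition and `contains()` checks mirror what this PR already relies on):

```rust
use carbonado::constants::Format;

fn main() {
    // c11 = Ecies | Snappy | Zfec (bits 1 + 2 + 8): encrypted, compressed,
    // error-corrected, but not Bao-verifiable
    let c11 = Format::Ecies | Format::Snappy | Format::Zfec;
    assert!(c11.contains(Format::Ecies));
    assert!(!c11.contains(Format::Bao));

    // c15 = all four operations enabled (bits 1 + 2 + 4 + 8)
    let c15 = Format::Ecies | Format::Snappy | Format::Bao | Format::Zfec;
    assert!(c15.contains(Format::Snappy) && c15.contains(Format::Zfec));
}
```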
75 changes: 51 additions & 24 deletions src/decode.rs → src/decoding.rs
@@ -12,24 +12,24 @@ use snap::read::FrameDecoder;
use zfec_rs::{Chunk, Fec};

use crate::{
constants::{FEC_K, FEC_M, SLICE_LEN},
encode,
constants::{Format, FEC_K, FEC_M, SLICE_LEN},
encoding,
structs::EncodeInfo,
utils::decode_bao_hash,
};

fn zfec_chunks(chunked_bytes: Vec<Vec<u8>>, padding: usize) -> Result<Vec<u8>> {
fn zfec_chunks(chunked_bytes: Vec<Vec<u8>>, padding: u32) -> Result<Vec<u8>> {
let mut zfec_chunks = vec![];
for (i, chunk) in chunked_bytes.into_iter().enumerate() {
zfec_chunks.push(Chunk::new(chunk, i));
}
let fec = Fec::new(FEC_K, FEC_M)?;
let decoded = fec.decode(&zfec_chunks, padding)?;
let decoded = fec.decode(&zfec_chunks, padding as usize)?;
Ok(decoded)
}

/// Zfec forward error correction decoding
pub fn zfec(input: &[u8], padding: usize) -> Result<Vec<u8>> {
pub fn zfec(input: &[u8], padding: u32) -> Result<Vec<u8>> {
let input_len = input.len();
if input_len % FEC_M != 0 {
return Err(anyhow!(
@@ -48,41 +48,68 @@ pub fn zfec(input: &[u8], padding: usize) -> Result<Vec<u8>> {
}

/// Bao stream extraction
pub fn bao(decoded: &[u8], hash: &[u8]) -> Result<Vec<u8>> {
pub fn bao(input: &[u8], hash: &[u8]) -> Result<Vec<u8>> {
let hash = decode_bao_hash(hash)?;
let decoded = bao_decode(decoded, &hash)?;
let decoded = bao_decode(input, &hash)?;

Ok(decoded)
}

/// Ecies decryption
pub fn ecies(decoded: &[u8], privkey: &[u8]) -> Result<Vec<u8>> {
let decrypted = decrypt(privkey, decoded)?;
pub fn ecies(input: &[u8], secret_key: &[u8]) -> Result<Vec<u8>> {
let decrypted = decrypt(secret_key, input)?;

Ok(decrypted)
}

/// Snappy decompression
pub fn snap(decrypted: &[u8]) -> Result<Vec<u8>> {
pub fn snap(input: &[u8]) -> Result<Vec<u8>> {
let mut decompressed = vec![];
FrameDecoder::new(decrypted).read_to_end(&mut decompressed)?;
FrameDecoder::new(input).read_to_end(&mut decompressed)?;

Ok(decompressed)
}

/// Decode data from Carbonado format in reverse order:
/// bao -> zfec -> ecies -> snap
pub fn decode(privkey: &[u8], hash: &[u8], input: &[u8], padding: usize) -> Result<Vec<u8>> {
let verified = bao(input, hash)?;
let decoded = zfec(&verified, padding)?;
let decrypted = ecies(&decoded, privkey)?;
let decompressed = snap(&decrypted)?;
pub fn decode(
secret_key: &[u8],
hash: &[u8],
input: &[u8],
padding: u32,
format: u8,
) -> Result<Vec<u8>> {
let format = Format::try_from(format)?;

let verified = if format.contains(Format::Bao) {
bao(input, hash)?
} else {
input.to_owned()
};

let decoded = if format.contains(Format::Zfec) {
zfec(&verified, padding)?
} else {
verified
};

let decrypted = if format.contains(Format::Ecies) {
ecies(&decoded, secret_key)?
} else {
decoded
};

let decompressed = if format.contains(Format::Snappy) {
snap(&decrypted)?
} else {
decrypted
};

Ok(decompressed)
}
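As a reviewer aid, a hypothetical round trip through the new signatures could look like the sketch below. The `carbonado::encoding` / `carbonado::decoding` module paths, the public `padding` field on `EncodeInfo`, and the key handling are assumptions for illustration, not part of this diff:

```rust
use anyhow::Result;
use carbonado::{decoding, encoding};

// Encode then decode a payload using format c15 (Ecies | Snappy | Bao | Zfec).
fn roundtrip(pubkey: &[u8], secret_key: &[u8], data: &[u8]) -> Result<Vec<u8>> {
    let format: u8 = 15;
    let (encoded, hash, info) = encoding::encode(pubkey, data, format)?;

    // decode() only reverses the steps the format byte says were applied;
    // this assumes the hash is passed as its raw 32 bytes
    decoding::decode(secret_key, hash.as_bytes(), &encoded, info.padding, format)
}
```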

/// Extract a 1KB slice of a Bao stream at a specific index, after decoding it from zfec
pub fn extract_slice(encoded: &[u8], index: usize) -> Result<Vec<u8>> {
pub fn extract_slice(encoded: &[u8], index: u16) -> Result<Vec<u8>> {
let slice_start = index * SLICE_LEN;
let encoded_cursor = Cursor::new(&encoded);
let mut extractor = SliceExtractor::new(encoded_cursor, slice_start as u64, SLICE_LEN as u64);
@@ -93,7 +120,7 @@ pub fn extract_slice(encoded: &[u8], index: usize) -> Result<Vec<u8>> {
}

/// Verify a number of 1KB slices of a Bao stream starting at a specific index
pub fn verify_slice(hash: &Hash, input: &[u8], index: usize, count: usize) -> Result<Vec<u8>> {
pub fn verify_slice(hash: &Hash, input: &[u8], index: u16, count: u16) -> Result<Vec<u8>> {
let slice_start = index * SLICE_LEN;
let slice_len = count * SLICE_LEN;
trace!("Verify slice start: {slice_start} len: {slice_len}");
@@ -113,7 +140,7 @@ pub fn scrub(input: &[u8], hash: &[u8], encode_info: &EncodeInfo) -> Result<Vec<
let hash = decode_bao_hash(hash)?;
let chunk_size = encode_info.chunk_size;
let padding = encode_info.padding;
let slices_per_chunk = chunk_size / SLICE_LEN;
let slices_per_chunk = (chunk_size / SLICE_LEN as u32) as u16;

match bao_decode(input, &hash) {
Ok(_decoded) => Err(anyhow!("Data does not need to be scrubbed.")),
@@ -122,7 +149,7 @@ pub fn scrub(input: &[u8], hash: &[u8], encode_info: &EncodeInfo) -> Result<Vec<
let mut chunks = vec![];

for i in 0..FEC_M {
let slice_index = i * slices_per_chunk;
let slice_index = i as u16 * slices_per_chunk;
match verify_slice(&hash, input, slice_index, slices_per_chunk) {
Ok(chunk) => chunks.push(chunk),
Err(e) => {
@@ -134,20 +161,20 @@ pub fn scrub(input: &[u8], hash: &[u8], encode_info: &EncodeInfo) -> Result<Vec<
info!("{} good chunks found, of {FEC_K} needed.", chunks.len());

let mut decoded = zfec_chunks(chunks, padding)?;
decoded.truncate(encode_info.bytes_encoded - padding);
decoded.truncate((encode_info.bytes_ecc - padding) as usize);
assert_eq!(
encode_info.bytes_encrypted,
decoded.len(),
decoded.len() as u32,
"Byte lengths match"
);

let (scrubbed, scrubbed_padding, _) = encode::zfec(&decoded)?;
let (scrubbed, scrubbed_padding, _) = encoding::zfec(&decoded)?;
assert_eq!(
padding, scrubbed_padding,
"Scrubbed padding should remain 0"
);

let (verified, scrubbed_hash) = encode::bao(&scrubbed)?;
let (verified, scrubbed_hash) = encoding::bao(&scrubbed)?;
assert_eq!(
hash, scrubbed_hash,
"Scrubbed hash is equal to original hash"
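A hypothetical repair flow built on `scrub` might look like this; the `decoding::scrub` path and the public `EncodeInfo` fields are assumptions based on this diff:

```rust
use anyhow::Result;
use carbonado::{decoding, structs::EncodeInfo};

// Attempt to repair a possibly bit-rotted Carbonado payload.
fn repair(stored: &[u8], bao_hash: &[u8], info: &EncodeInfo) -> Result<Vec<u8>> {
    match decoding::scrub(stored, bao_hash, info) {
        // Ok means corruption was detected and a clean copy was rebuilt
        // from the remaining good zfec chunks
        Ok(scrubbed) => Ok(scrubbed),
        // scrub() also errors when the data still verifies and needs no repair;
        // a real caller would distinguish that case from unrecoverable corruption
        Err(e) => {
            println!("scrub not performed: {e}");
            Ok(stored.to_vec())
        }
    }
}
```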
77 changes: 58 additions & 19 deletions src/encode.rs → src/encoding.rs
@@ -8,7 +8,7 @@ use snap::write::FrameEncoder;
use zfec_rs::Fec;

use crate::{
constants::{FEC_K, FEC_M},
constants::{Format, FEC_K, FEC_M},
structs::EncodeInfo,
utils::calc_padding_len,
};
@@ -25,8 +25,8 @@ pub fn snap(input: &[u8]) -> Result<Vec<u8>> {
}

/// Ecies encryption
pub fn ecies(pubkey: &[u8], compressed: &[u8]) -> Result<Vec<u8>> {
let encrypted = encrypt(pubkey, compressed)?;
pub fn ecies(pubkey: &[u8], input: &[u8]) -> Result<Vec<u8>> {
let encrypted = encrypt(pubkey, input)?;

Ok(encrypted)
}
@@ -39,11 +39,11 @@ pub fn bao(input: &[u8]) -> Result<(Vec<u8>, Hash)> {
}

/// Zfec forward error correction encoding
pub fn zfec(input: &[u8]) -> Result<(Vec<u8>, usize, usize)> {
pub fn zfec(input: &[u8]) -> Result<(Vec<u8>, u32, u32)> {
let input_len = input.len();
let (padding_len, chunk_size) = calc_padding_len(input_len);
// TODO: CSPRNG padding
let mut padding_bytes = vec![0u8; padding_len];
let mut padding_bytes = vec![0u8; padding_len as usize];
let mut padded_input = Vec::from(input);
padded_input.append(&mut padding_bytes);
debug!(
@@ -64,7 +64,7 @@ pub fn zfec(input: &[u8]) -> Result<(Vec<u8>, usize, usize)> {
for chunk in &mut encoded_chunks {
assert_eq!(
chunk_size,
chunk.data.len(),
chunk.data.len() as u32,
"Chunk size should be as calculated"
);
encoded.append(&mut chunk.data);
@@ -74,22 +74,61 @@ pub fn zfec(input: &[u8]) -> Result<(Vec<u8>, usize, usize)> {
}

/// Encode data into Carbonado format in this order:
/// snap -> ecies -> zfec -> bao
///
/// `snap -> ecies -> zfec -> bao`
///
/// It performs compression, encryption, error correction coding, and verifiable stream encoding, in that order.
pub fn encode(pubkey: &[u8], input: &[u8]) -> Result<(Vec<u8>, Hash, EncodeInfo)> {
let input_len = input.len();

let compressed = snap(input)?;
let bytes_compressed = compressed.len();
pub fn encode(pubkey: &[u8], input: &[u8], format: u8) -> Result<(Vec<u8>, Hash, EncodeInfo)> {
let input_len = input.len() as u32;
let format = Format::try_from(format)?;

let compressed;
let encrypted;
let encoded;
let padding;
let chunk_size;
let verifiable;
let hash;

let bytes_compressed;
let bytes_encrypted;
let bytes_ecc;
let bytes_verifiable;

if format.contains(Format::Snappy) {
compressed = snap(input)?;
bytes_compressed = compressed.len() as u32;
} else {
compressed = input.to_owned();
bytes_compressed = 0;
}

let encrypted = ecies(pubkey, &compressed)?;
let bytes_encrypted = encrypted.len();
if format.contains(Format::Ecies) {
encrypted = ecies(pubkey, &compressed)?;
bytes_encrypted = encrypted.len() as u32;
} else {
encrypted = compressed;
bytes_encrypted = 0;
}

let (encoded, padding, chunk_size) = zfec(&encrypted)?;
let bytes_encoded = encoded.len();
if format.contains(Format::Zfec) {
(encoded, padding, chunk_size) = zfec(&encrypted)?;
bytes_ecc = encoded.len() as u32;
} else {
encoded = encrypted;
padding = 0;
chunk_size = 0;
bytes_ecc = 0;
}

let (verifiable, hash) = bao(&encoded)?;
let bytes_verifiable = verifiable.len();
if format.contains(Format::Bao) {
(verifiable, hash) = bao(&encoded)?;
bytes_verifiable = verifiable.len() as u32;
} else {
verifiable = encoded;
hash = Hash::from([0; 32]);
bytes_verifiable = 0;
}

// Calculate totals
let compression_factor = bytes_compressed as f32 / input_len as f32;
@@ -102,7 +141,7 @@ pub fn encode(pubkey: &[u8], input: &[u8]) -> Result<(Vec<u8>, Hash, EncodeInfo)
input_len,
bytes_compressed,
bytes_encrypted,
bytes_encoded,
bytes_ecc,
bytes_verifiable,
compression_factor,
amplification_factor,
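To show the effect of the new `format` parameter, here is a hypothetical comparison of two format bytes; the module path, the public fields on `EncodeInfo`, and the `Display` impl on the returned hash are assumptions for illustration:

```rust
use anyhow::Result;
use carbonado::encoding;

fn compare_formats(pubkey: &[u8], payload: &[u8]) -> Result<()> {
    // c4: Bao verifiability only (unencrypted, uncompressed public assets)
    let (_c4_stream, c4_hash, c4_info) = encoding::encode(pubkey, payload, 4)?;

    // c15: compressed, encrypted, error-corrected, and verifiable (contract data)
    let (_c15_stream, c15_hash, c15_info) = encoding::encode(pubkey, payload, 15)?;

    // The hashes differ because Bao runs over different bytes in each pipeline
    println!("c4  hash: {c4_hash}, verifiable bytes: {}", c4_info.bytes_verifiable);
    println!("c15 hash: {c15_hash}, verifiable bytes: {}", c15_info.bytes_verifiable);
    Ok(())
}
```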