no one reads this anyway
0o-de-lally committed Oct 5, 2024
1 parent 68065b5 commit f9e67ac
Showing 2 changed files with 26 additions and 14 deletions.
31 changes: 18 additions & 13 deletions compatibility/src/README.md
@@ -1,35 +1,40 @@

# Backwards Compatibility
# Legacy Backwards Compatibility

## TL;DR
The encoding of the bytes in libra uses `BCS` which is a de/serialization implementation using `serde`. Libra Version Six and beyond cannot decode transaction logs or state snapshots from V5 without these tools.
# TL;DR
The encoding of the bytes in libra uses `BCS` which is a de/serialization implementation using `serde`. Libra Version Six and beyond cannot decode transaction logs or state snapshots from Libra Version Five without these tools.

# Explain

V6 was a kitchen sink upgrade with a new genesis, since there were upgrades throughout the stack that would have created a discontinuity in blocks anyhow.
Version Six was a kitchen sink upgrade with a new genesis since there were upgrades throughout the stack that would have created a discontinuity in blocks anyhow.

The bytes present in prior db, logs, and backups prior to V6 had different memory layouts. For every K-V structure the keys had different hashes, and the values had different encoding layouts.
The bytes present in prior db, logs, and backups prior to V6 had different memory layouts. For any K-V structure used the keys would have had different hashes, and the value bytes had different encoding layouts.

Also, looking up data in K-V representations of bytes is done with byte encoded access_paths. Since the Move language address format and data structure names have changed, nothing can be found, and you will receive a `remaining input` error. I gift you this koan.
Looking up data in K-V representations of bytes is done with byte-encoded access_paths. Since the Move language address format and data structure names have changed, nothing can be found. You will instead receive a `remaining input` error. I gift you this koan.
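
A toy sketch of how a layout mismatch surfaces, using only the standard library. Everything here (`DecodeError`, `decode_fixed`, the 16 and 32 byte widths) is a hypothetical stand-in and not the `bcs` crate's API. The idea: a strict decoder that expects a fixed number of bytes reports leftover bytes when the stored value is longer than expected, and runs out of input when it is shorter.

```rust
// Hypothetical, std-only illustration. The real tooling decodes with the
// `bcs` crate; none of these names come from it.

#[derive(Debug, PartialEq)]
enum DecodeError {
    /// The input ended before the expected number of bytes was read.
    UnexpectedEof,
    /// The value was decoded but this many bytes were left over.
    RemainingInput(usize),
}

/// Decode exactly `N` bytes into a fixed-size array.
/// Short input and trailing input are both rejected.
fn decode_fixed<const N: usize>(bytes: &[u8]) -> Result<[u8; N], DecodeError> {
    if bytes.len() < N {
        return Err(DecodeError::UnexpectedEof);
    }
    if bytes.len() > N {
        return Err(DecodeError::RemainingInput(bytes.len() - N));
    }
    let mut out = [0u8; N];
    out.copy_from_slice(bytes);
    Ok(out)
}
```

In this toy, calling `decode_fixed::<16>` on 32 stored bytes yields `RemainingInput(16)`, while `decode_fixed::<32>` on 16 stored bytes yields `UnexpectedEof`. Which failure you see depends on which side of the mismatch is longer.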

This compatibility library ports some V5 Rust code so that certain elemental types (StructTag, TypeTag, HashValue, AccountAddress) use the correct layout.
This compatibility library ports some V5 Rust code so that certain elemental types will use the correct V5 layout, e.g.: StructTag, TypeTag, HashValue, AccountAddress.

# Principal PITAs

1. Backup Manifests have changed layout. State Snapshot Manifests have changed ever so slightly: they previously did not include "epoch" keys. Reading V5 backup archive manifests would fail with V6+ tooling.
1. Backup Manifests have changed layout. State Snapshot Manifest JSON files have changed ever so slightly: they previously did not include the `epoch` field. Reading V5 backup archive manifests would fail with V6+ tooling.

1. `AccountStateBlob` stored bytes are not what they seem: In the State Snapshot backup files, each chunk is represented by a tuple of `(HashedValue, AccountStateBlob)`. For clarity we added a definition of `AccountStateBlobChunkV5`.
2. `AccountStateBlob` stored bytes in records are not what they seem. Vendor gifts you this koan: "What's the sound of recursion that goes nowhere?". In the State Snapshot backup files, each chunk is represented by a tuple of `(HashedValue, AccountStateBlob)`. However, AccountStateBlob already includes a `hash` field for HashedValue. For reasons, this field was flagged to be skipped by the de/serializer. In practice the bytes at rest are prepended by the hash, and not appended. For clarity we added a definition of `AccountStateBlobRecord`.

3. `HashValue` is evil: The HashValue layout has not changed, but it invokes `loop garoo`, and the handcrafted deserializer of `HashedValue` uses a different intermediary representation for the byte layout.

1. `HashValue` is evil: The HashValue layout has not changed, but it invokes loup garou voodoo, and the custom deserializer of HashedValue uses a different intermediary representation for the byte layout.
```
// V5:
#[derive(::serde::Deserialize)]
#[serde(rename = "HashValue")]
struct Value<'a>(&'a [u8]);
let value = Value::deserialize(deserializer)?;
Self::from_slice(value.0).map_err(<D::Error as ::serde::de::Error>::custom)
// V6:
struct Value<'a> {
    hash: &'a [u8; HashValue::LENGTH],
}
```
1. `AccountAddress` makes everything fail: fixed lengths have changed. From V5 to V6 the addresses doubled in size (16 to 32 bytes, i.e. 32 to 64 hex characters). No KV lookup will work because the byte-encoded key always has the Core Code Address (0x1), which changed from being left-padded with zeros to 16 bytes, to being left-padded to 32 bytes. So all language_storage.rs structs are changed to use `LegacyAddressV5`.

4. `AccountAddress` makes everything fail: fixed lengths have changed. From V5 to V6 the addresses doubled in size (16 to 32 bytes, i.e. 32 to 64 hex characters). No KV lookup will work because the byte-encoded key always has the Core Code Address (0x1), which changed from being left-padded with zeros to 16 bytes, to being left-padded to 32 bytes. So all language_storage.rs structs are changed to use `LegacyAddressV5`.
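
The address point above can be sketched with the standard library alone. This is an illustration under assumptions, not the library's code: `pad_address`, `lookup_key`, `to_hex`, and the struct name `Account` are hypothetical, and the real access path encoding is more involved than concatenating bytes. It only shows that the same logical address (0x1) produces different key bytes once the fixed width changes.

```rust
// Hypothetical, std-only illustration of the address width change.
const V5_ADDRESS_LEN: usize = 16; // bytes, i.e. 32 hex characters
const V6_ADDRESS_LEN: usize = 32; // bytes, i.e. 64 hex characters

/// Left-pad a short big-endian address (such as 0x1) with zeros to `N` bytes.
fn pad_address<const N: usize>(short: &[u8]) -> [u8; N] {
    assert!(short.len() <= N, "address longer than target width");
    let mut out = [0u8; N];
    out[N - short.len()..].copy_from_slice(short);
    out
}

/// Toy key builder: address bytes followed by the struct name bytes.
/// This is NOT the real access path format, only a stand-in.
fn lookup_key(address: &[u8], struct_name: &str) -> Vec<u8> {
    let mut key = address.to_vec();
    key.extend_from_slice(struct_name.as_bytes());
    key
}

/// Render bytes as lowercase hex, two characters per byte.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

With these helpers, `lookup_key(&pad_address::<V5_ADDRESS_LEN>(&[1]), "Account")` and `lookup_key(&pad_address::<V6_ADDRESS_LEN>(&[1]), "Account")` differ in both length and content, so a key built with one width can never match a key stored with the other.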

# Tests
The principal tests to run are in `state_snapshot_v5.rs`, where we try to sanity test the encoding and decoding of structs for the v5 language elements.
9 changes: 8 additions & 1 deletion compatibility/src/version_five/state_snapshot_v5.rs
@@ -34,6 +34,13 @@ pub struct StateSnapshotBackupV5 {
    pub proof: FileHandle,
}

/// The tuple in which all account bytes are stored.
/// All backup records are stored with the data byte length. See ReadRecordBytes.
/// An account's on-chain bytes are represented in storage files as an AccountStateBlob. However, the chunks are stored as a tuple, with the HashValue of the bytes placed before the bytes themselves.
// NOTE: Paradoxically the data layout of the AccountStateBlob also has a `hash` field, but this one is not serialized. It is unclear why the tuple is needed when the blob could have been de/serialized fully. Alas.

pub struct AccountStateBlobRecord(HashValue, AccountStateBlob);

////// SNAPSHOT FILE IO //////
/// read snapshot manifest file into object
pub fn v5_read_from_snapshot_manifest(path: &PathBuf) -> Result<StateSnapshotBackupV5, Error> {
@@ -51,7 +58,7 @@ pub fn v5_read_from_snapshot_manifest(path: &PathBuf) -> Result<StateSnapshotBac
pub async fn read_account_state_chunk(
    file_handle: FileHandle,
    archive_path: &PathBuf,
) -> Result<Vec<(HashValue, AccountStateBlob)>, Error> {
) -> Result<Vec<AccountStateBlobRecord>, Error> {
    let full_handle = archive_path
        .parent()
        .expect("could not read archive path")