diff --git a/compatibility/src/README.md b/compatibility/src/README.md
index a0185e2c2..a788c54ff 100644
--- a/compatibility/src/README.md
+++ b/compatibility/src/README.md
@@ -1,35 +1,40 @@
-# Backwards Compatibility
+# Legacy Backwards Compatibility
-## TL;DR
-The encoding of the bytes in libra uses `BCS` which is a de/serialization implementation using `serde`. Libra Version Six and beyond cannot decode transaction logs or state snapshots from V5 without these tools.
+# TL;DR
+The encoding of the bytes in libra uses `BCS` which is a de/serialization implementation using `serde`. Libra Version Six and beyond cannot decode transaction logs or state snapshots from Libra Version Five without these tools.
 # Explain
-V6 was a kitchen sink upgrade with a new genesis, since there were upgrades throughout the stack that would have created a discontinuity in blocks anyhow.
+Version Six was a kitchen sink upgrade with a new genesis since there were upgrades throughout the stack that would have created a discontinuity in blocks anyhow.
-The bytes present in prior db, logs, and backups prior to V6 had different memory layouts. For every K-V structure the keys had different hashes, and the values had different encoding layouts.
+The bytes present in prior db, logs, and backups prior to V6 had different memory layouts. For any K-V structure used, the keys would have had different hashes, and the value bytes had different encoding layouts.
-Also, looking up data in K-V representations of bytes is done with byte encoded access_paths. Since the Move language address format and data structure names have changed, nothing can be found, and you will receive a `remaining input` error. I gift you this koan.
+Looking up data in K-V representations of bytes is done with byte encoded access_paths. Since the Move language address format and data structure names have changed, nothing can be found. You will instead receive a `remaining input` error. I gift you this koan.
-This compatibility library ports the some V5 Rust code so that certain elemental types (StructTag, TypeTag, HashValue, AccountAddress), use the correct layout.
+This compatibility library ports some V5 Rust code so that certain elemental types will use the correct V5 layout, e.g.: StructTag, TypeTag, HashValue, AccountAddress.
 # Principal PITAs
-1. Backup Manifests have changed layout. State Snapshot Manifests have changed ever so slightly, they previously did not include "epoch" keys. Reading V5 backup archive manifests would fail with V6+ tooling.
+1. Backup Manifests have changed layout. State Snapshot Manifest JSON files have changed ever so slightly: they previously did not include the `epoch` field. Reading V5 backup archive manifests would fail with V6+ tooling.
-1. `AccountStateBlob` stored bytes are not what they seem: In the State Snapshot backup files, each chunk is represented by a tuple of `(HashedValue, AccountStateBlob)`. For clarity we added a definition of `AccountStateBlobChunkV5`.
+2. `AccountStateBlob` stored bytes in records are not what they seem. Vendor gifts you this koan: "What's the sound of recursion that goes nowhere?". In the State Snapshot backup files, each chunk is represented by a tuple of `(HashedValue, AccountStateBlob)`. However, AccountStateBlob already includes a `hash` field for HashedValue. For reasons, this field was flagged to be skipped by the de/serializer. In practice the bytes at rest are prepended by the hash, and not post-pended. For clarity we added a definition of `AccountStateBlobRecord`.
+
+3. `HashValue` is evil: The HashValue layout has not changed, but it invokes `loop garoo`, and the handcrafted deserializer of `HashedValue` uses a different intermediary representation for the byte layout.
-1. `HashValue` is evil: The HashValue layout has not changed, but it invokes loup garou vodoo, and the custom deserializer of HashedValue uses a different intermediary representation for the byte layout
 ```
+// V5:
 #[derive(::serde::Deserialize)]
 #[serde(rename = "HashValue")]
 struct Value<'a>(&'a [u8]);
-let value = Value::deserialize(deserializer)?;
-Self::from_slice(value.0).map_err(::custom)
+// V6:
+struct Value<'a> {
+    hash: &'a [u8; HashValue::LENGTH],
+}
 ```
-1. `AccountAddress` makes everything fail: fixed lengths have changed, from V5 to V6 the addresses doubled in size (32 to 64 bits). No KV lookup will work because the byte-encoded key always has the Core Code Address, (0x1) which changed from being prepended with 16 zeros, to 32 zeros. So all language_storage.rs structs are changed to use `LegacyAddressV5`.
+
+4. `AccountAddress` makes everything fail: fixed lengths have changed, from V5 to V6 the addresses doubled in size (16 to 32 bytes, i.e. 32 to 64 hex characters). No KV lookup will work because the byte-encoded key always has the Core Code Address (0x1), which changed from being prepended with 16 zeros, to 32 zeros. So all language_storage.rs structs are changed to use `LegacyAddressV5`.
 # Tests
 The principal tests to run are in `state_snapshot_v5.rs`, where we try to sanity test the encoding and decoding of structs for the v5 language elements.
diff --git a/compatibility/src/version_five/state_snapshot_v5.rs b/compatibility/src/version_five/state_snapshot_v5.rs
index 4cf406db2..4916c34e6 100644
--- a/compatibility/src/version_five/state_snapshot_v5.rs
+++ b/compatibility/src/version_five/state_snapshot_v5.rs
@@ -34,6 +34,13 @@ pub struct StateSnapshotBackupV5 {
     pub proof: FileHandle,
 }
+/// The tuple in which all account bytes are stored.
+/// All backup records are stored with the data byte length. See ReadRecordBytes.
+/// An account's on-chain bytes are represented in storage files as an AccountStateBlob. However, the chunks are stored with a tuple of HashValue of the bytes prior to the bytes themselves.
+// NOTE: Paradoxically the data layout of the AccountStateBlob also has a `hash` field, but this one is not serialized. Unclear why the tuple is needed when the blob could have been de/serialized fully. Alas.
+
+pub struct AccountStateBlobRecord(HashValue, AccountStateBlob);
+
 ////// SNAPSHOT FILE IO //////
 /// read snapshot manifest file into object
 pub fn v5_read_from_snapshot_manifest(path: &PathBuf) -> Result {
@@ -51,7 +58,7 @@ pub fn v5_read_from_snapshot_manifest(path: &PathBuf) -> Result Result, Error> {
+) -> Result, Error> {
     let full_handle = archive_path
         .parent()
         .expect("could not read archive path")