docs: update doc (#132)

* docs: update doc

* refactor: apply suggestions

Co-authored-by: wwared <[email protected]>

* refactor: second batch of suggestion

Co-authored-by: wwared <[email protected]>

---------

Co-authored-by: wwared <[email protected]>
argumentcomputer · Jul 30, 2024 · beb555e · beb555e
1 parent 72b84b3
commit beb555e
Showing 19 changed files with 368 additions and 23 deletions.
diff --git a/aptos/docs/src/components/client.md b/aptos/docs/src/components/client.md
@@ -21,4 +21,4 @@ the proofs generation in parallel. This flow happens during initialization where
 Aptos network while producing an inclusion proof for a given account at the latest block.
 
 The bundled example client currently only requests and verifies STARK proofs. The proof servers have support for generating
-and verifying SNARK proofs, but the example client does not yet make use of this.
+and verifying SNARK proofs, but the example client does not yet make use of this.
diff --git a/aptos/docs/src/design/overview.md b/aptos/docs/src/design/overview.md
@@ -26,7 +26,7 @@ to some account needs to be validated.
 The current Verifying Key Hashes which uniquely identify the specific RISC-V binaries for the proof programs, located in the
 [`aptos/aptos-programs/artifacts/`](https://github.com/lurk-lab/zk-light-clients/tree/dev/aptos/aptos-programs/artifacts)
 directory are:
-* `epoch_change`: `0x00eea0650222f7e5bb6a2fe57c0e0e504d1df8b3d848d5116174a8703d228c94`
+* `epoch_change`: `0x008f0133dc5a02eb31ac769e9e3a2f34da1af34c963bf3ee9a058982a2978cc9`
 * `inclusion`: `0x00336c570224c00161ca7b3c275c24f3968aa09086c31d09d98691bce109f4f6`
 
-These values are also present in and used by the [solidity fixtures](../benchmark/on_chain.md).
+These values are also present in and used by the [solidity fixtures](../benchmark/on_chain.md).
diff --git a/aptos/docs/src/misc/release.md b/aptos/docs/src/misc/release.md
@@ -8,29 +8,29 @@ run it.
 The release process is mostly automated through the usage of GitHub Actions.
 
 A release should be initiated through the manually triggered GitHub Action **Bump Version**. When triggering a release,
-the reference base that should be chosen is the `dev` branch, with a `release` type and the desired release version. The
+the reference base that should be chosen is the `dev` branch, with a `release` type, `aptos` light-client and the desired release version. The
 specified release version should follow [the Semver standard](https://semver.org/).
 
-This action opens a new PR from a branch named `release/<release-version>` with `dev` as its base. A commit is
+This action opens a new PR from a branch named `release/aptos-v<release-version>` with `dev` as its base. A commit is
 automatically applied to bump all the `Cargo.toml` version of the relevant crates. The developer in charge of the
 release should use this branch to make any necessary updates to the codebase and documentation to have the release
 ready.
 
 Once all the changes are done, the PR can be squash and merged in `dev`. This will trigger the **Tag release** action
-that is charged with the publication of a release and a tag named `<release-version>`.
+that is charged with the publication of a release and a tag named `v<release-version>`.
 
 ## Hotfix process
 
 The hotfix process is quite similar to the release one.
 
-**Bump Version** should also be triggered, but with the desired `release/<release-to-fix>` as reference. A PR will be
-opened from a branch named `hotfix/<hotfix-version>` with the base `release/<release-to-fix>`. A commit is automatically
+**Bump Version** should also be triggered, but with the desired `release/aptos-v<release-to-fix>` as reference. A PR will be
+opened from a branch named `hotfix/aptos-v<hotfix-version>` with the base `release/aptos-v<release-to-fix>`. A commit is automatically
 applied to bump all the `Cargo.toml` version of the relevant crates. The developer in charge of the
 hotfix should use this branch to make any necessary updates to the codebase and documentation to have the hotfix
 ready.
 
 Once all the changes are done, the PR can be squash and merged in `release/<release-to-fix>`. This will trigger the
-**Tag release** action that is charged with the publication of a release and a tag named `<hotfix-version>`.
+**Tag release** action that is charged with the publication of a release and a tag named `v<hotfix-version>`.
 
 Finally, the developer will also need to port the changes made to `dev` so that they are reflected on the latest
 development stage of the Light Client.

diff --git a/ethereum/docs/src/README.md b/ethereum/docs/src/README.md
@@ -0,0 +1,38 @@
+<img src="images/ethereum.png" style="border-radius: 20px">
+
+> **Note**
+>
+> The following documentation has been written with the supposition that the
+> reader is already knowledgeable about the synchronisation protocol for Light
+> Clients implemented on Ethereum.
+>
+> To read about it, refer
+> to [the Light Client section](https://ethereum.org/en/developers/docs/nodes-and-clients/light-clients/)
+> of the Ethereum documentation
+> and
+> the [Sync Protocol specification](https://github.com/ethereum/consensus-specs/blob/v1.4.0/specs/altair/light-client/sync-protocol.md).
+
+The Ethereum Light Client (LC) provides a streamlined and efficient way to verify blockchain state transitions and
+proofs without needing to store or synchronize the entire blockchain.
+
+The following documentation aims to provide a high-level overview of the Ethereum LC and its components, along with a
+guide
+on how to set up and run or benchmark the Ethereum LC.
+
+### Sections
+
+**[High-level design](./design/overview.md)**
+
+An overview of what the Light Client is and the feature set it provides.
+
+**[Components](./components/overview.md)**
+
+A detailed description of the components that make up the Light Client.
+
+**[Run the Light Client](./run/overview.md)**
+
+A guide on how to set up and run the Light Client.
+
+**[Benchmark the Light Client](./benchmark/overview.md)**
+
+A guide on how to benchmark the Light Client.
diff --git a/ethereum/docs/src/SUMMARY.md b/ethereum/docs/src/SUMMARY.md
@@ -38,7 +38,6 @@ This section goes over how to run the benchmarks to measure the performances of
 - [Overview](./benchmark/overview.md)
 - [Configuration](./benchmark/configuration.md)
 - [Benchmark individual proofs](./benchmark/proof.md)
-- [E2E benchmarks](./benchmark/e2e.md)
 - [On-chain verification benchmarks](./benchmark/on_chain.md)
 
 # Miscellaneous

diff --git a/ethereum/docs/src/benchmark/on_chain.md b/ethereum/docs/src/benchmark/on_chain.md
@@ -1,7 +1,5 @@
 # Benchmark on-chain verification
 
-# Benchmark on-chain verification
-
 Our Light Client is able to produce SNARK proofs that can be verified on-chain. This section will cover how to run the
 benchmarks for the on-chain verification.
 

diff --git a/ethereum/docs/src/benchmark/overview.md b/ethereum/docs/src/benchmark/overview.md
@@ -0,0 +1,12 @@
+# Benchmark proving time
+
+There are two types of benchmarks that can be used to get insight on the proving time necessary for each kind of proof
+generated by the proof server. The first type will generate STARK core proofs, and represents the time it takes to
+generate and prove execution of one of the programs. The second type will generate a SNARK proof that can be verified
+on-chain, and represents the end-to-end time it takes to generate a proof that can be verified directly on-chain.
+Due to the SNARK compression, the SNARK proofs take longer to generate and require more resources.
+
+## GPU acceleration
+
+Currently, the Sphinx prover is **CPU-only**; no GPU acceleration is integrated yet. We plan to integrate GPU
+acceleration as soon as we can to improve the overall proving time.
diff --git a/ethereum/docs/src/benchmark/proof.md b/ethereum/docs/src/benchmark/proof.md
@@ -4,6 +4,9 @@ In this section we will cover how to run the benchmarks for the individual proof
 the `light-client` crate folder. Those benchmarks are associated with programs that are meant to reproduce
 a production environment settings. They are meant to measure performance for a complete end-to-end flow.
 
+The numbers we've measured using our [production configuration](../run/overview.md) are further detailed in the
+following section.
+
 ## Sync committee change
 
 Benchmark that will run a proof generation for the sync committee change program. This program will execute a hash for
@@ -14,32 +17,52 @@ new `LightClientStore::current_sync_committee`.
 On our [production configuration](../run/overview.md), we currently get the following results for SNARK generation for
 this benchmark:
 
+For STARKs:
+```json
+{
+  // Time in milliseconds, ~2 minutes 13s
+  "proving_time":133511,
+  "verification_time":2469
+}
+```
+
+For SNARKs:
 ```json
 {
-  // Time in milliseconds, 13~ minutes
-  "proving_time": 791696,
-  "verification_time": 1
+  // Time in milliseconds, ~13 minutes 13s
+  "proving_time":793280,
+  "verification_time":1
 }
 ```
 
 ## Storage inclusion
 
 Benchmark that will run a proof generation for the storage inclusion program. This program will execute a hash for the
-received `LightClientStore::current_sync_committee` to ensure that the signature is from the current known sync
+received `CompactStore::sync_committee` to ensure that the signature is from the current known sync
 committee
-set, execute a `LightClientStore::validate_light_client_update` to confirm that the received block information is one
+set, execute a `CompactStore::validate_compact_update` to confirm that the received block information is one
 signed
 by the committee, and finally run an `EIP1186Proof::verify` against the state root of the finalized execution block
 header.
 
 On our [production configuration](../run/overview.md), we currently get the following results for SNARK generation for
 this benchmark:
 
+For STARKs:
+```json
+{
+  // Time in milliseconds, ~1 minute 23s
+  "proving_time":83383,
+  "verification_time":2178
+}
+```
+
+For SNARKs:
 ```json
 {
-  // Time in milliseconds, 7~ minutes
-  "proving_time": 441123,
-  "verification_time": 2
+  // Time in milliseconds, ~11 minutes 35s
+  "proving_time":695637,
+  "verification_time":1
 }
 ```
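The human-readable durations in the comments above follow directly from the millisecond values. A minimal sketch of that conversion, assuming a hypothetical helper named `human_duration` (it is not part of the benchmark code):

```rust
/// Converts a duration in milliseconds to a "M min S s" string,
/// truncating any fractional second. Hypothetical helper for illustration.
fn human_duration(ms: u64) -> String {
    let total_secs = ms / 1000;
    format!("{} min {} s", total_secs / 60, total_secs % 60)
}

fn main() {
    // Proving times reported in this section, in milliseconds.
    let results = [
        ("sync committee change, STARK", 133_511u64),
        ("sync committee change, SNARK", 793_280),
        ("storage inclusion, STARK", 83_383),
        ("storage inclusion, SNARK", 695_637),
    ];
    for (name, ms) in results {
        println!("{name}: {}", human_duration(ms));
    }
}
```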
 

diff --git a/ethereum/docs/src/components/client.md b/ethereum/docs/src/components/client.md
@@ -0,0 +1,17 @@
+# Client
+
+The client is the coordinator of the Light Client. It is responsible for orchestrating the communication between the
+Proof Server and the Ethereum nodes. In our current example implementation it can also serve as a drop-in replacement
+for what an on-chain verifier would be responsible for.
+
+The client also demonstrates how to request data from the Ethereum nodes' endpoints, how to forward it to the proof servers
+using the simple binary RPC protocol, and how to parse the responses received from the server. See
+[the source](https://github.com/lurk-lab/zk-light-clients/blob/dev/ethereum/light-client/src/bin/client.rs)
+for more details.
+
+The client has two phases:
+
+- **Initialization**: The client fetches the initial data from the Ethereum nodes and generates the initial state for
+  itself and the verifier.
+- **Main Loop**: The client listens for new data from the Ethereum nodes and generates proofs for the verifier to verify.
+  This includes new proofs for epoch changes.
diff --git a/ethereum/docs/src/components/eth_nodes.md b/ethereum/docs/src/components/eth_nodes.md
@@ -0,0 +1,25 @@
+# Ethereum Nodes
+
+In order to generate the two proofs composing the Light Client, data must be
+fetched from the Ethereum network. To retrieve this data, the Light Client
+needs to interact with both a node from [the Beacon chain](https://ethereum.org/en/roadmap/beacon-chain/)
+and from [the execution chain](https://ethereum.org/en/developers/docs/nodes-and-clients/#execution-clients).
+
+## Beacon Chain Node
+
+The Beacon Node is responsible for providing the Light Client with the necessary data to handle the parts of the proving
+related to consensus on the chain. There are multiple ways to get such an endpoint, such as leveraging one provided by
+an infrastructure company (such as [Ankr](https://www.ankr.com/docs/rpc-service/chains/chains-api/eth-beacon/)) or
+leveraging a public one, such as the one provided by [a16z](https://www.lightclientdata.org).
+
+## Execution RPC Endpoint
+
+The Execution RPC endpoint is responsible for providing the Light Client with the necessary data to prove value
+inclusion
+in the state of the chain. The Light Client needs to connect to an Ethereum node that exposes the necessary RPC
+endpoints.
+
+The RPC endpoint to be used to fetch this data is [`eth_getProof`](https://eips.ethereum.org/EIPS/eip-1186). This RPC
+endpoint can be accessed through various RPC providers such
+as [Infura](https://docs.infura.io/api/networks/polygon-pos/json-rpc-methods/eth_getproof)
+or [Chainstack](https://docs.chainstack.com/reference/getproof).
diff --git a/ethereum/docs/src/components/overview.md b/ethereum/docs/src/components/overview.md
@@ -0,0 +1,19 @@
+# Architecture components
+
+Light clients can be seen as lightweight nodes that enable users to interact with the blockchain without needing to
+download the entire blockchain history. They **rely on full nodes to provide necessary data**, such as block headers,
+and use cryptographic proofs to verify transactions and maintain security.
+
+There are four core components that need to exist to have a functional light client bridge running:
+
+- [**Source Chain Node**](./eth_nodes.md): A full node of the source chain from which it is possible to fetch the
+  necessary data to generate our proofs.
+- [**Coordinator Middleware**](./client.md): This middleware is responsible for orchestrating the other components that are part of the
+  architecture.
+- [**Light Client Proving Servers**](./proof_server.md): The core service developed by Lurk. It contains all the
+  necessary logic to generate the necessary proofs and exposes it through a simple RPC endpoint.
+- [**Verifier**](../benchmark/on_chain.md): A software that can verify the proofs generated by the Light Client. This
+  verification can happen in a regular computer, using a Rust verifier exposed by the Proof servers, or it can be
+  implemented as a smart contract living on a destination chain.
+
+<img src="../images/lc-arch.png">
diff --git a/ethereum/docs/src/components/proof_server.md b/ethereum/docs/src/components/proof_server.md
@@ -0,0 +1,50 @@
+# Proof Server
+
+The Proof Server is a component of the Light Client that is responsible for generating and serving proofs to the client.
+The server is designed to be stateless and can be scaled horizontally to handle a large number of requests. The Proof
+Server can be divided into two distinct implementations:
+
+- **Proof programs**: The proof program contains the logic that will be executed by our Proof server, generating
+  the succinct proof to be verified. The proof programs are run inside the Sphinx zkVM and prover.
+- **Server**: The server is a layer added on top of the proving service that makes it available to external users via a
+  simple protocol.
+
+## Proof programs
+
+This layer of the Proof Server corresponds to the code for which the execution has to be proven. Its logic is the core
+of our whole implementation and ensures the correctness of what we are trying to achieve. The programs are written in Rust
+and leverage the [`lurk-lab/sphinx`](https://github.com/lurk-lab/sphinx) zkVM to generate the proofs and verify them.
+
+In the design documents for both the [Sync Committee change proof](../design/committee_change_proof.md) and
+the [inclusion proof](../design/inclusion_proof.md), we describe what each program has to prove. Most computations
+performed by the proof programs are directed towards cryptographic operations, such as verifying signatures on the block
+header.
+
+To accelerate those operations, we leverage some out-of-VM circuits called **pre-compiles** that are optimized for those
+specific operations. The following libraries that make use of our pre-compiles are used in our codebase:
+
+- [bls12_381](https://github.com/lurk-lab/bls12_381/tree/zkvm): A library for BLS12-381 operations based on
+  [`zkcrypto/bls12_381`](https://github.com/zkcrypto/bls12_381) making use of pre-compiles for non-native arithmetic. Used
+  for verifying the signatures over the block header.
+- [sha2](https://github.com/sp1-patches/RustCrypto-hashes/tree/patch-v0.10.8): A library for SHA-256 hashing making use of
+  pre-compiles for the compression function. Used to reconstruct the Merkle Root from a Merkle Proof.
+- [tiny-keccak](https://github.com/sp1-patches/tiny-keccak/tree/patch-v2.0.2): A library for SHA-3 hashing, making use of
+  pre-compiles for the compression function. Used for hashing the sync committee data.
+
+The code to be proven is written in Rust and then compiled to RISC-V binaries, stored in `ethereum/ethereum-programs/artifacts/`.
+We then use Sphinx to generate the proofs and verify them based on those binaries. The generated proofs can be STARKs, which
+are faster to generate but cannot be verified directly on-chain, or wrapped in a SNARK, which takes longer to generate but can
+be verified cheaply on-chain.
+
+## Server
+
+The server is a layer added on top of the proving service that makes it available to external users. It is a simple
+TCP server that is open to incoming connections on a port specified at runtime.
+
+The server is divided in two, with one main entrypoint. This allows us to handle the worst-case scenario of having to
+generate both proofs in parallel, since each server handles one proof at a time. It is possible to generate and verify
+both STARK core proofs and SNARK proofs.
+
+The RPC protocol used by the servers is a very simple length-prefixed protocol passing serialized messages back and forth.
+The messages are defined in [`light-client/src/types/network.rs`](https://github.com/lurk-lab/zk-light-clients/blob/dev/ethereum/light-client/src/types/network.rs).
+See also the documentation on the [client](./client.md).
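A length-prefixed protocol such as the one described above can be sketched as follows. This is a minimal illustration and not the proof server's actual wire format: the 4-byte big-endian prefix and the names `write_frame` and `read_frame` are assumptions made for the example, and the payload stands in for an already serialized message.

```rust
use std::io::{self, Cursor, Read, Write};

/// Writes one frame: a 4-byte big-endian length followed by the payload.
/// The prefix width and byte order are assumptions made for this sketch.
fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "payload too large for a 4-byte length prefix",
        ));
    }
    writer.write_all(&(payload.len() as u32).to_be_bytes())?;
    writer.write_all(payload)?;
    writer.flush()
}

/// Reads one frame produced by `write_frame` and returns its payload.
fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut len_bytes = [0u8; 4];
    reader.read_exact(&mut len_bytes)?;
    let len = u32::from_be_bytes(len_bytes) as usize;
    let mut payload = vec![0u8; len];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}

fn main() -> io::Result<()> {
    // An in-memory buffer stands in for the TCP stream.
    let mut wire = Vec::new();
    write_frame(&mut wire, b"serialized request")?;
    let mut reader = Cursor::new(wire);
    let received = read_frame(&mut reader)?;
    println!("received {} bytes", received.len());
    Ok(())
}
```

Because both functions are generic over `Read` and `Write`, the same code would work over a `std::net::TcpStream`.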
diff --git a/ethereum/docs/src/design/committee_change_proof.md b/ethereum/docs/src/design/committee_change_proof.md
@@ -0,0 +1,36 @@
+# Sync committee change proof
+
+The Ethereum chain has (at any given time) a committee of 512 validators that
+is randomly selected every sync committee period (~1 day), and while a validator
+is part of the currently active sync committee they are expected to continually
+sign the block header that is the new head of the chain at each slot.
+
+At the start of each period \\(N\\) any Light Client can trustfully know and verify
+the current valid sync committee and the one for the period \\(N+1\\). The Light Client
+needs to keep track of those two hashes.
+
+For a given period \\(N\\) with a set of validators \\(V_n\\), it is expected to be able to
+find a block containing information about the new validator set \\(V_{n+1}\\) signed by
+\\(V_n\\).
+
+It is the job of the light client to produce a proof at least every other period to
+verify the signature for the next validator set. This is handled by the Sync
+Committee Change program.
+
+## Sync Committee Change program IO
+
+[Program reference](https://github.com/lurk-lab/zk-light-clients/blob/dev/ethereum/programs/committee-change/src/main.rs)
+
+### Inputs
+
+The following data structures are required for proof generation:
+
+- **`LightClientStore`**: The current state of the Light Client, containing information about the latest handled finalized block and the known committees.
+- **`Update`**: A Light Client update, containing information about a change of the Sync Committee.
+
+### Outputs
+
+- **Finalized header slot**: The slot of the finalized beacon header.
+- **Hash of the signing sync committee**: The hash of the signing committee for the finalized beacon block.
+- **Hash of the new sync committee**: The hash of the new sync committee set in the store.
+- **Hash of the new sync committee for the next period**: The hash of the new sync committee for the following period set in the update.
diff --git a/ethereum/docs/src/design/edge_cases.md b/ethereum/docs/src/design/edge_cases.md
@@ -0,0 +1,13 @@
+# Edge cases
+
+Latency-wise, Ethereum Light Clients do not have a worst-case scenario. As we
+explained earlier, it is possible for a Light Client to know at any given time in a
+period \\(N\\) the current valid sync committee and the one for the period \\(N+1\\).
+
+This allows the Light Client to generate inclusion proofs for both the current period and the
+one after if the Sync Committee Change proof has yet to be generated.
+
+This effectively means that the Light Client has 2 periods (~2 days) to generate the Sync
+Committee Change proof, which is more than enough time to generate the proof.
+It also means that the Light Client can generate the Inclusion Proof at any time, even
+when the Sync Committee Change proof is being generated.
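To make "more than enough time" concrete, the sketch below compares the two-period window against the SNARK proving time reported in the benchmark section. It assumes the mainnet consensus constants (12-second slots, 32 slots per epoch, 256 epochs per sync committee period); the function names are hypothetical and the proving time is specific to the production configuration used for those benchmarks.

```rust
// Mainnet consensus constants (assumed here; other networks may differ).
const SECONDS_PER_SLOT: u64 = 12;
const SLOTS_PER_EPOCH: u64 = 32;
const EPOCHS_PER_SYNC_COMMITTEE_PERIOD: u64 = 256;

/// SNARK proving time for the sync committee change benchmark, in
/// milliseconds, as reported in the benchmark documentation.
const COMMITTEE_CHANGE_SNARK_MS: u64 = 793_280;

/// Length of one sync committee period in seconds.
fn period_seconds() -> u64 {
    SECONDS_PER_SLOT * SLOTS_PER_EPOCH * EPOCHS_PER_SYNC_COMMITTEE_PERIOD
}

/// Number of sequential proofs of `proving_ms` each that fit in `periods` periods.
fn proofs_per_window(periods: u64, proving_ms: u64) -> u64 {
    periods * period_seconds() * 1000 / proving_ms
}

fn main() {
    let hours = period_seconds() as f64 / 3600.0;
    println!("one period lasts {} s (about {:.1} h)", period_seconds(), hours);
    println!(
        "sequential SNARK proofs that fit in two periods: {}",
        proofs_per_window(2, COMMITTEE_CHANGE_SNARK_MS)
    );
}
```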