diff --git a/404.html b/404.html index 72b240e90..7aa2a60eb 100644 --- a/404.html +++ b/404.html @@ -11,7 +11,7 @@ - + diff --git a/assets/js/b2f554cd.29cd5ada.js b/assets/js/b2f554cd.29cd5ada.js deleted file mode 100644 index 9f851eae5..000000000 --- a/assets/js/b2f554cd.29cd5ada.js +++ /dev/null @@ -1 +0,0 @@ -"use strict";(self.webpackChunkformal_land=self.webpackChunkformal_land||[]).push([[5894],{6042:e=>{e.exports=JSON.parse('{"blogPosts":[{"id":"/2025/01/30/links-for-rust-in-rocq","metadata":{"permalink":"/blog/2025/01/30/links-for-rust-in-rocq","source":"@site/blog/2025-01-30-links-for-rust-in-rocq.md","title":"\ud83e\udd80 Typing and naming of Rust code in Rocq (1/3)","description":"In this article we show how we re-build the type and naming information of \ud83e\udd80 Rust code in  Rocq/Coq, the formal verification system we use. A challenge is to be able to represent arbitrary Rust programs, including the standard library of Rust and the whole of Revm, a virtual machine to run EVM programs.","date":"2025-01-30T00:00:00.000Z","formattedDate":"January 30, 2025","tags":[{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"links","permalink":"/blog/tags/links"},{"label":"simulations","permalink":"/blog/tags/simulations"}],"readingTime":7.485,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Typing and naming of Rust code in Rocq (1/3)","tags":["Rust","links","simulations"],"authors":[]},"unlisted":false,"nextItem":{"title":"\ud83e\udd16 Designing a coding assistant for Rocq","permalink":"/blog/2025/01/21/designing-a-coding-assistant-for-rocq"}},"content":"In this article we show how we re-build the type and naming information of [\ud83e\udd80 Rust](https://www.rust-lang.org/) code in [ Rocq/Coq](https://rocq-prover.org/), the formal verification system we use. 
A challenge is to be able to represent arbitrary Rust programs, including the standard library of Rust and the whole of [Revm](https://github.com/bluealloy/revm), a virtual machine to run [EVM](https://en.wikipedia.org/wiki/Ethereum#Virtual_machine) programs.\\n\\n\x3c!-- truncate --\x3e\\n\\nThis is the continuation of the following article:\\n\\n- [\ud83e\udd80 Translation of the Rust\'s core and alloc crates](/blog/2024/04/26/translation-core-alloc-crates)\\n\\n:::success Ask for the highest security!\\n\\nWhen millions are at stake, bug bounties are not enough. How do you ensure your security audits are exhaustive?\\n\\nThe best way is to use **formal verification**.\\n\\n**Contact us** at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land) to make sure your code is safe! \ud83d\udee1\ufe0f\\n\\nWe cover **Rust**, **Solidity**, and **ZK systems**.\\n\\n:::\\n\\n
\\n ![Green forest](2025-01-30/green-forest.webp)\\n
\\n\\n## \ud83c\udfaf The challenge\\n\\nOur goal is to be able to formally verify large Rust codebases, counting thousands of lines, and without having to modify the code to make it more amenable to formal verification. Our concrete example is the verification of the Revm that includes about 10,000 lines of Rust code, depending on how far we include the dependencies.\\n\\nThis requires to have a methodology of verification that both:\\n\\n- Scales with the size of the codebase. Rust programs often use a lot of abstractions, and we make the choice to keep these abstractions in the formal model. Combined with the expressivity of the Rocq prover, we hope this will ensure we can scale our reasoning.\\n- Supports most of the Rust language, noting that Rust is a complex and feature-rich language.\\n\\nTo make sure our translation from the Rust language to the Rocq system has good support, we generate a translation that is very verbose and rather low-level without interpreting the meaning of the various Rust primitives too much. For example, our translation tool is only about 5,000 lines long. It is written in Rust and uses the APIs of the `rustc` compiler.\\n\\nThis approach leaves the burdens of defining the semantics of Rust and designing the reasoning primitives on the Rocq side.\\n\\n## \ud83d\udedd Strategy\\n\\nWe plan to reason on the translated Rust code with two intermediate steps:\\n\\n1. **Links** These represent a complete rewriting of the translated code, adding type and naming information that are erased during the translation to Rocq. We also prove that this rewriting is equivalent to the initial translation. We hope to automate this step as much as possible.\\n2. **Simulations** In this step we make the less obvious transformations, in particular representing the memory mutations in a clean and custom state monad, as well as various optimizations such as collapsing all the integer types if it helps for the proofs later. 
We also prove that this rewriting is equivalent to the links.\\n\\nAt the end of the **Simulations** step, we should obtain a purely functional and idiomatic representation of the original Rust code in Rocq. This representation should be easier to reason about, and we will be able to formally verify properties of the code.\\n\\nAs a summary, here are the steps we want to follow:\\n\\n
\\n ![Compilation steps](2025-01-30/compilation-steps.svg)\\n
\\n\\n## \ud83e\uddea Example\\n\\nHere is an example from the standard library of Rust, which is used to define other comparison operators:\\n\\n```rust\\npub fn max_by Ordering>(v1: T, v2: T, compare: F) -> T {\\n match compare(&v1, &v2) {\\n Ordering::Less | Ordering::Equal => v2,\\n Ordering::Greater => v1,\\n }\\n}\\n```\\n\\nThis example is interesting as it uses some abstractions, with polymorphism, traits, closures, and a bit of pointer manipulations. Ideally, we should be able to represent it with a Rocq code of a similar size, without the explicit references `&` that are mostly useless in a purely functional setting. But here is the Rocq code we obtain after running [coq-of-rust](https://github.com/formal-land/coq-of-rust):\\n\\n```coq\\nDefinition max_by (\u03b5 : list Value.t) (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \u03b5, \u03c4, \u03b1 with\\n | [], [ T; F ], [ v1; v2; compare ] =>\\n ltac:(M.monadic\\n (let v1 := M.alloc (| v1 |) in\\n let v2 := M.alloc (| v2 |) in\\n let compare := M.alloc (| compare |) in\\n M.read (|\\n M.match_operator (|\\n M.alloc (|\\n M.call_closure (|\\n M.get_trait_method (|\\n \\"core::ops::function::FnOnce\\",\\n F,\\n [],\\n [ Ty.tuple [ Ty.apply (Ty.path \\"&\\") [] [ T ]; Ty.apply (Ty.path \\"&\\") [] [ T ] ] ],\\n \\"call_once\\",\\n [],\\n []\\n |),\\n [\\n M.read (| compare |);\\n Value.Tuple\\n [\\n M.borrow (|\\n Pointer.Kind.Ref,\\n M.deref (| M.borrow (| Pointer.Kind.Ref, v1 |) |)\\n |);\\n M.borrow (|\\n Pointer.Kind.Ref,\\n M.deref (| M.borrow (| Pointer.Kind.Ref, v2 |) |)\\n |)\\n ]\\n ]\\n |)\\n |),\\n [\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (M.find_or_pattern (|\\n \u03b3,\\n [\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (let _ := M.is_struct_tuple (| \u03b3, \\"core::cmp::Ordering::Less\\" |) in\\n Value.Tuple []));\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (let _ := M.is_struct_tuple (| \u03b3, \\"core::cmp::Ordering::Equal\\" |) in\\n Value.Tuple []))\\n ],\\n fun \u03b3 =>\\n 
ltac:(M.monadic\\n match \u03b3 with\\n | [] => ltac:(M.monadic v2)\\n | _ => M.impossible \\"wrong number of arguments\\"\\n end)\\n |)));\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (let _ := M.is_struct_tuple (| \u03b3, \\"core::cmp::Ordering::Greater\\" |) in\\n v1))\\n ]\\n |)\\n |)))\\n | _, _, _ => M.impossible \\"wrong number of arguments\\"\\n end.\\n```\\n\\nThis is extremely verbose and not idiomatic for Rocq! We can see some of the Rust features that are made explicit:\\n\\n- The list of constant generics `\u03b5`, the list of type generics `\u03c4`, and the list of arguments `\u03b1`.\\n- The memory operations `alloc` and `read`, and the pointers manipulations `borrow` and `deref`.\\n- The trait instance resolution with `M.get_trait_method`.\\n- The decomposition of the pattern matching in more elementary operations like `M.is_struct_tuple`.\\n\\nMost of this information comes from the [THIR intermediate representation](https://rustc-dev-guide.rust-lang.org/thir.html) of the code as provided by the Rust compiler.\\n\\nHere is the link definition we will write, proven equivalent to the code above by construction:\\n\\n```coq\\nDefinition run_max_by {T F : Set} `{Link T} `{Link F}\\n (Run_FnOnce_for_F :\\n function.FnOnce.Run\\n F\\n (Ref.t Pointer.Kind.Ref T * Ref.t Pointer.Kind.Ref T)\\n (Output := Ordering.t)\\n )\\n (v1 v2 : T) (compare : F) :\\n {{ cmp.max_by [] [ \u03a6 T; \u03a6 F ] [ \u03c6 v1; \u03c6 v2; \u03c6 compare ] \ud83d\udd3d T }}.\\nProof.\\n destruct Run_FnOnce_for_F as [[call_once [H_call_once run_call_once]]].\\n run_symbolic.\\n eapply Run.CallPrimitiveGetTraitMethod. {\\n apply H_call_once.\\n }\\n run_symbolic.\\n eapply Run.CallClosure. {\\n apply (run_call_once compare (Ref.immediate _ v1, Ref.immediate _ v2)).\\n }\\n intros [ordering |]; cbn; [|run_symbolic].\\n destruct ordering; run_symbolic.\\nDefined.\\n```\\n\\nThe beginning of the definition corresponds to the trait resolution and calls to the `compare` function. 
The last part with `destruct ordering` is the representation of the `match` statement in the Rust code. With this definition, we add explicit Rocq types instead of the universal `Value.t` type of the translated code and make explicit the trait resolution. The trait instance has to be provided as an explicit parameter with the `Run_FnOnce_for_F` argument.\\n\\nWith the statement:\\n\\n```coq\\n{{ cmp.max_by [] [ \u03a6 T; \u03a6 F ] [ \u03c6 v1; \u03c6 v2; \u03c6 compare ] \ud83d\udd3d T }}\\n```\\n\\nwe say that the translated function `cmp.max_by` has a \\"link\\" definition, built implicitly in the proof, returning a value of type `T`. We can extract the definition of this function calling the primitive:\\n\\n```coq\\nevaluate : forall {Output : Set} `{Link Output} {e : M},\\n {{ e \ud83d\udd3d Output }} ->\\n LowM.t (Output.t Output)\\n```\\n\\nIt returns a \\"link\\" computation in the `LowM.t` monad. The output is often unreadable as it is, but we can step through it by symbolic execution. This will be useful for the next step to define and prove equivalent the \\"simulations\\".\\n\\n## \ud83d\udd2e Link monad\\n\\nLike the monad used for the translation of Rust programs by `coq-of-rust`, the link\'s monad is a free monad but with fewer primitive operations. The primitive operations are only related to the memory handling:\\n\\n```coq\\nInductive t : Set -> Set :=\\n| StateAlloc {A : Set} `{Link A} (value : A) : t (Ref.Core.t A)\\n| StateRead {A : Set} `{Link A} (ref_core : Ref.Core.t A) : t A\\n| StateWrite {A : Set} `{Link A} (ref_core : Ref.Core.t A) (value : A) : t unit\\n| GetSubPointer {A Sub_A : Set} `{Link A} `{Link Sub_A}\\n (ref_core : Ref.Core.t A) (runner : SubPointer.Runner.t A Sub_A) :\\n t (Ref.Core.t Sub_A).\\n```\\n\\nCompared to the side effects in the generated translation, we eliminate all the operations related to name handling (trait resolution, function calls, etc.). 
We also always use explicit types instead of the universal `Value.t` type and get rid of the `M.impossible` operation that was necessary to represent impossible branches in the absence of types.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have presented our general strategy to formally verify large Rust codebases. In the next blog posts, we will go into more details to look at the definition of the proof of equivalence for the links, and at how we automate the most repetitive parts of the proofs.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2025/01/21/designing-a-coding-assistant-for-rocq","metadata":{"permalink":"/blog/2025/01/21/designing-a-coding-assistant-for-rocq","source":"@site/blog/2025-01-21-designing-a-coding-assistant-for-rocq.md","title":"\ud83e\udd16 Designing a coding assistant for Rocq","description":"This blog post provides a review of the existing literature on agent-based systems for automated theorem proving, while presenting a general approach to the problem. 
Additionally, it serves as an informal specification outlining the requirements for a future system we intend to develop.","date":"2025-01-21T00:00:00.000Z","formattedDate":"January 21, 2025","tags":[{"label":"llm","permalink":"/blog/tags/llm"},{"label":"ai","permalink":"/blog/tags/ai"}],"readingTime":9.29,"hasTruncateMarker":true,"authors":[{"name":"Andrea Delmastro","url":"https://github.com/andreadlm","imageURL":"https://github.com/andreadlm.png","key":"andrea_delmastro"}],"frontMatter":{"title":"\ud83e\udd16 Designing a coding assistant for Rocq","tags":["llm","ai"],"authors":["andrea_delmastro"]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Typing and naming of Rust code in Rocq (1/3)","permalink":"/blog/2025/01/30/links-for-rust-in-rocq"},"nextItem":{"title":"\ud83e\udd80 Verification of one instruction of the Move\'s type-checker","permalink":"/blog/2025/01/13/verification-one-instruction-sui"}},"content":"This blog post provides a review of the existing literature on agent-based systems for automated theorem proving, while presenting a general approach to the problem. Additionally, it serves as an informal specification outlining the requirements for a future system we intend to develop.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Ask for the highest security!\\n\\nTo ensure your code is fully secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nWe exclusively focus on formal verification to offer you the highest degree of security for your application.\\n\\nWe cover **Rust**, **Solidity**, and **zero-knowledge** projects.\\n\\n:::\\n\\n## \ud83c\udfaf Our goal\\nWe aim to develop an integrated coding assistant for the proof assistant [ Rocq/Coq](https://rocq-prover.org/) within [Visual Studio Code](https://code.visualstudio.com/). 
Despite recent advancements in artificial intelligence, the challenge of creating systems that effectively assist users in writing formal verification code remains unresolved. Our primary focus is on providing support for theorem proving, which we consider the most compelling aspect of the task; other functionalities, such as definition writing, may be explored in future work.\\n\\n## \ud83c\udf33 Automated theorem proving as a search in a state space\\nA coding assistant for a proof assistant can take advantage of a fundamental property that is not possessed by traditional programming languages: it is always possible to deterministically verify whether the code generated for a demonstration is correct (or simply, not incorrect). It only requires the code to be well-typed. More broadly, the assistant can track the progress of the solution.\\nA proof can be seen as a sequence of tactics, each of which modifies the current goal. Consequently, the proof construction process can be framed as a search through a state space. Using classical terminology for such problems, we can categorize the components of our system as follows:\\n\\n* **state**: enriched representation of the current goal\\n* **starting state**: initial goal associated with the theorem (its definition)\\n* **arrival state**: closed goal\\n* **actions**: tactics\\n\\n
\\n
\\n ![Tree search simple](2025-01-21/tree_search_simple.svg)\\n
\\n
\\n\\nCertain states can be pruned if they do not meet some conditions, such as error states, those where with certainty no progress has been made (e.g., the [copra](https://github.com/trishullab/copra) system proposes a simple symbolic approach to recognize some trivial cases) or if too many attempts have already been made at a certain node.\\n\\nThe set of tactics that can be applied in a given state is potentially infinite. To guide the search, one or more oracles are queried, which provide suggestions on applicable tactics. These oracles can be either LLM-based agents or traditional symbolic procedures (e.g., [CoqHammer](https://coqhammer.github.io/)). Multiple oracles may coexist, offering alternative solutions. Two examples of LLM-based oracles are:\\n\\n* oracle that produces a list of $k$ possible alternative tactics to be applied at a given goal;\\n* oracle that produces a complete demonstration for a given goal.\\n\\nIt is not obvious whether a procedure that proceeds in depth or one that proceeds in breadth is preferable. As is often the case with research problems, a \\"hybrid\\" approach might be preferable. In any case, one could imagine ordering the frontier on the basis of how promising a certain state is, thereby guiding the search process. The problem of determining whether one state is more promising than another through heuristics (a kind of \\"distance\\" from the successful state) is certainly interesting and would merit future study.\\n\\nOne can imagine two procedures, one in breadth and one in depth. From the union of the two, a hybrid solution could be devised.\\n\\n
\\n
\\n ![Depth search](2025-01-21/depth_tree_search.svg)\\n
Depth search
\\n
\\n
\\n\\n
\\n
\\n ![Breadth search](2025-01-21/breadth_tree_search.svg)\\n
Breadth search by beam search: in this specific case, a heuristic is employed to limit the number of expanded nodes
\\n
\\n
\\n\\nThe system should be flexible enough and allow for the implementation of different versions of the search algorithm that could be refined as the work progresses.\\n\\n### Learning from errors\\nBy leveraging the ability of an LLM to generate an infinite number of tactics and possibly update the prompt to refine the query, errors can be exploited to generate new tactics. For example, a node (state) might not be closed as soon as it is expanded, but it could be re-expanded in the future, enriched with the knowledge of past errors.\\n\\n
\\n
\\n ![Tree errors](2025-01-21/tree_errors.svg)\\n
\\n
\\n\\n## \ud83d\udc68\ud83c\udffb\u200d\ud83d\udcbb Integration with the user\\nThe system must integrate forms of communication and interaction with the user, which guide the user\'s construction of the proof. In this context, the coding assistant is not envisioned as a fully automated proof tool, but rather as a coding companion that leverages the support of the human developer. For instance, such a companion system does not necessarily need to complete the proof, but could instead generate partial solutions, offering the user multiple incomplete options and allowing them to select the one they deem most appropriate as a starting point.\\n\\n## \ud83d\udd27 The technology stack\\nThe system is designed to be distributed as a Visual Studio Code extension. This approach offers several advantages, including access to the editor\'s extensive ecosystem of APIs, which facilitates seamless integration into standard development workflows and user interactions. Additionally, it simplifies the publishing and installation processes.\\n\\n
\\n
\\n ![Tech stack](2025-01-21/tech_stack.svg)\\n
\\n
\\n\\n### Large language models\\nModels and ad-hoc architectures for theorem proving have been proposed in the literature, including [ReProver](https://github.com/lean-dojo/ReProver) (for [Lean](https://lean-lang.org/)). The cost of maintaining and the complexity of configuring and adapting these systems is generally high. More simply, commercial versions of the most common LLMs (GPTs, , ...) can be used, leveraging prompt-engineering techniques. Several papers demonstrate that comparable results to state-of-the-art models can be achieved using such approaches.\\n\\n### The VSCode API ecosystem\\nVSCode offers a rich ecosystem of APIs that can be used to integrate your extension with common development processes, simplify user interaction, and communicate with external tools. In particular, the new [Language Model API](https://code.visualstudio.com/api/extension-guides/language-model) is particularly useful for our purposes. It offers a common interface as well as tools to simplify communication with popular LLMs. Through the use of [Proposed API](https://github.com/microsoft/vscode/blob/main/src/vscode-dts/vscode.proposed.chatProvider.d.ts), it is also possible to integrate local models, which are useful mainly in the testing phase of the first iterations. VSCode\'s [Language API](https://code.visualstudio.com/api/references/vscode-api#languages) simplifies the development and integration of an LSP client.\\n\\n\\n### Language server\\nLanguage analysis capabilities are provided by the language server [Coq-LSP](https://github.com/ejgallego/coq-lsp), which has recently been released as part of the [P\xe9tanque](https://github.com/ejgallego/coq-lsp/tree/main/petanque) project, a lightweight environment for intensive applications targeted at automated theorem-proving projects and especially at agent-based systems. 
P\xe9tanque operates as a [Gymnasium](https://gymnasium.farama.org/index.html) environment and has already been successfully used in the [NLIR](https://github.com/LLM4Coq/nlir) system. \\nAt the architectural level, Coq-LSP (and P\xe9tanque) operates as a server towards the coding assistant (the client), providing some functionality in the form of an API via an extended version of the [LSP](https://microsoft.github.io/language-server-protocol/) protocol, including:\\n\\n* obtaining the current goal for a given theorem,\\n* obtaining the location of a given theorem\'s definition,\\n\\nand many other functionalities commonly accessible through IDEs.\\n\\n## \ud83e\udde0 The agent perspective\\nAn alternative description of the system can be accomplished from the agent\'s perspective. We understand an _agent_ as a software system whose behavior is conditioned by an environment that it can actively alter by performing some actions whose effects condition subsequent observations and, consequently, its future choices.\\n\\n
\\n
\\n ![Agent RL](2025-01-21/agent_rl.svg)\\n
\\n
\\n\\nBased on the above definition, we can attempt to classify the components of our system within a classic agent context as follows:\\n\\n* **Agent**: prompting + large language model + parsing\\n* **Environment**: user, language server and search algorithm\\n* **Actions**: tactics\\n* **Observations**: current goal, examples, definitions, ...\\n\\n
\\n ![Agent loop](2025-01-21/agent_loop.svg)\\n
\\n\\nLet us recall that in the proposed general architecture, the agent is only one of the possible types of _oracle_ from which we can obtain useful information to advance the demonstration (albeit the most interesting one), and that multiple _oracles_ can coexist at the same time, e.g., agents implementing different resolution strategies or configurations.\\n\\nThe agent is designed to interact with various components of the environment, for example, by requesting examples, additional information, or simple advice from the user; the current goal; the list of previously attempted and failed tactics for the search algorithm; or semantic data from the language server. The interaction process could be deliberative (guided by the LLM\'s reasoning) or, more simply, a form of abstraction for a predetermined set of information we intend to request. This interaction with the environment is crucial for defining the goal, enriching the agent\'s context, and generating input for the prompt. In a recent [blog post](/blog/2025/01/06/annotating-what-we-are-doing), we documented our interest in internally gathering as much information as possible about the recurring human processes of building a demonstration in order to standardize and emulate them within the agent\'s interaction logic with the environment.\\n\\nSeveral prompting techniques can be tried. In the [NLIR](https://github.com/LLM4Coq/nlir) system, the prompt is gradually refined through a chain-of-thought approach: first, a natural language response is requested from the LLM, and this response is then used to generate a more precise Rocq code response.\\n\\nOnce the LLM produces an output\u2014whether a tactic, a list of tactics, or a complete proof, depending on the type of agent\u2014the response is parsed and executed within the environment via P\xe9tanque. 
If an error occurs, the environment is either reverted to its previous state (backtracking), or an error recovery technique is applied (e.g., replacing the problematic code with `admit.`).\\n\\n## \ud83d\udcca Evaluation strategies\\nBenchmarks used for evaluating automated demonstration systems tend to be limited to classical mathematics, focusing on demonstration systems that are _completely automated_ rather than on tools that _support_ demonstration writing.\\nAs a result, these benchmarks can be misleading with regard to the practical utility of such tools in real-world contexts. In addition to traditional evaluation methods, the system should be tested in practical scenarios, such as by applying it to ongoing formal verification projects within [Formal Land](/).\\n\\nA second critical consideration when evaluating a practical support tool is the cost per request. Integrated LLMs should not be viewed as infinite resources, but rather as constrained resources whose usage must be optimized and minimized, even if that means sacrificing some efficiency.\\n\\n## \ud83d\uddc2\ufe0f Similar projects and resources\\nThe research field of fully autonomous automated theorem proving using proof assistants is very active and has received a strong boost since the advent of LLMs. 
The proposed system architecture has been influenced by the following works:\\n- LeanCopilot ([code](https://github.com/lean-dojo/LeanCopilot), [paper](https://arxiv.org/abs/2404.12534))\\n- copra ([code](https://github.com/trishullab/copra), [paper](https://arxiv.org/abs/2310.04353))\\n- CoqPilot ([code](https://github.com/JetBrains-Research/coqpilot), [paper](https://arxiv.org/abs/2410.19605))\\n- NLIR ([code](https://github.com/LLM4Coq/nlir), [paper](https://openreview.net/forum?id=QzOc0tpdef))\\n\\nOther interesting resources to further explore this topic:\\n- \ud83d\udcfd\ufe0f [Lean Together 2025: Jason Rute, The last mile](https://www.youtube.com/watch?v=Yr8dzfVkeHg)\\n\\n## \ud83e\udd61 Key takeaway\\n* The agent perspective and the search perspective are here complemented in a single system\\n* The automatic demonstration process can be seen as a sophisticated search in a space of states\\n* The system must be flexible overall and adapt to different refinements that might be decided in the process\\n* In a support tool, completeness of proof is not mandatory\\n* User interaction is crucial\\n* Evaluation must be carried out in a practical context\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! 
Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2025/01/13/verification-one-instruction-sui","metadata":{"permalink":"/blog/2025/01/13/verification-one-instruction-sui","source":"@site/blog/2025-01-13-verification-one-instruction-sui.md","title":"\ud83e\udd80 Verification of one instruction of the Move\'s type-checker","description":"This is the last article of a series of blog posts presenting our formal verification effort in  Rocq/Coq to ensure the correctness of the type-checker of the Move language for Sui.","date":"2025-01-13T00:00:00.000Z","formattedDate":"January 13, 2025","tags":[{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Move","permalink":"/blog/tags/move"},{"label":"Sui","permalink":"/blog/tags/sui"},{"label":"type-checker","permalink":"/blog/tags/type-checker"}],"readingTime":5.73,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Verification of one instruction of the Move\'s type-checker","tags":["Rust","Move","Sui","type-checker"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd16 Designing a coding assistant for Rocq","permalink":"/blog/2025/01/21/designing-a-coding-assistant-for-rocq"},"nextItem":{"title":"\ud83e\udd16 Annotating what we are doing for an LLM to pick up","permalink":"/blog/2025/01/06/annotating-what-we-are-doing"}},"content":"This is the last article of a series of blog posts presenting our formal verification effort in [ Rocq/Coq](https://rocq-prover.org/) to ensure the correctness of the type-checker of the [Move language](https://sui.io/move) for [Sui](https://sui.io/).\\n\\nHere we show how the formal proof works to check that the type-checker is correct on a particular instruction, for any possible initial state. 
The general idea is to symbolically execute the code step by step on the type-checker side, accumulating properties about the stack assuming the type-checker succeeds, and then to show that the interpreter will produce a stack of the expected type as a result.\\n\\n\x3c!-- truncate --\x3e\\n\\nPrevious post:\\n\\n- [\ud83e\udd80 Example of verification for the Move\'s checker of Sui](/blog/2024/11/14/sui-move-checker-abstract-stack)\\n\\n:::success Ask for the highest security!\\n\\nWhen millions are at stake, bug bounties are not enough.\\n\\nHow do you ensure your security audits are exhaustive?\\n\\nThe best way to do this is to use **formal verification**.\\n\\nThis is what we provide as a service. **Contact us** at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land) to make sure your code is safe! \ud83d\udee1\ufe0f\\n\\nWe cover **Rust**, **Solidity**, and **ZK systems**.\\n\\n:::\\n\\n
\\n ![Green forest with water](2025-01-13/green-forest-with-water.webp)\\n
\\n\\n## \ud83e\udd80 The Rust code\\n\\nWe are verifying the type-checking for the Move bytecode instruction `CastU8`. This instruction takes the top-most element of the stack, checks that it is an integer, and pushes it back on the stack as a `U8` if it is in the right range or fails with an error `StatusCode::ARITHMETIC_ERROR` otherwise.\\n\\nHere is the code of the interpreter:\\n\\n```rust\\nBytecode::CastU8 => {\\n gas_meter.charge_simple_instr(S::CastU8)?;\\n let integer_value =\\n interpreter.operand_stack.pop_as::()?;\\n interpreter\\n .operand_stack\\n .push(Value::u8(integer_value.cast_u8()?))?;\\n}\\n```\\n\\nWe ignore the gas metering for now. The `pop_as` method pops the top-most element of the stack and checks that it is an integer. The `cast_u8` method checks that the integer is in the right range (`0` to `255`) and returns the value as a `U8`. The `push` method pushes the value back on the stack. The question mark operator `?` is used to propagate errors.\\n\\nHere is the corresponding code in the type-checker:\\n\\n```rust\\nBytecode::CastU8 => {\\n let operand = safe_unwrap_err!(verifier.stack.pop());\\n if !operand.is_integer() {\\n return Err(verifier.error(\\n StatusCode::INTEGER_OP_TYPE_MISMATCH_ERROR,\\n offset,\\n ));\\n }\\n verifier.push(meter, ST::U8)?;\\n}\\n```\\n\\nIt pops the top-most element of the stack of types (we do not have values here) and checks that it is an integer type. If it is not, it returns an error. Otherwise, it pushes the type `U8` on the stack. Note that there are no ways to know, in the type-checker, if the value is in the right range.\\n\\n## \ud83d\udc26\u200d\u2b1b The Rocq translation\\n\\nIn previous posts, we covered our manual translation of the Rust code in Rocq. We repeat it here. The interpreter code in Rocq:\\n\\n```coq\\n| Bytecode.CastU8 =>\\n letS!? integer_value := liftS! State.Lens.interpreter (\\n liftS! Interpreter.Lens.operand_stack $\\n Stack.Impl_Stack.pop_as IntegerValue.t\\n ) in\\n letS!? 
integer_value :=\\n returnS! $ IntegerValue.cast_u8 integer_value in\\n doS!? liftS! State.Lens.interpreter (\\n liftS! Interpreter.Lens.operand_stack $\\n Stack.Impl_Stack.push $\\n ValueImpl.U8 integer_value\\n ) in\\n returnS!? InstrRet.Ok\\n```\\n\\nThe type-checker code in Rocq:\\n\\n```coq\\n| Bytecode.CastU8 => \\n letS! operand :=\\n liftS! TypeSafetyChecker.lens_self_stack AbstractStack.pop in\\n letS! operand := return!toS! $ safe_unwrap_err operand in\\n if negb $ SignatureToken.is_integer operand then\\n returnS! $\\n Result.Err $\\n TypeSafetyChecker.Impl_TypeSafetyChecker.error\\n verifier StatusCode.INTEGER_OP_TYPE_MISMATCH_ERROR offset\\n else\\n TypeSafetyChecker.Impl_TypeSafetyChecker.push SignatureToken.U8\\n```\\n\\n## \ud83d\udcdc Formal statement\\n\\nHere is the formal statement of the property we want to prove to ensure the correctness of the type-checker:\\n\\n```coq\\nLemma progress\\n (* [...] parameters and hypothesis *)\\n (* We assume that the initial state is well-typed *)\\n IsInterpreterContextOfType.t locals interpreter type_safety_checker ->\\n match\\n verify_instr instruction pc type_safety_checker,\\n execute_instruction ty_args function resolver instruction state\\n with\\n | Panic.Value (Result.Ok _, type_safety_checker\'),\\n Panic.Value (Result.Ok _, state\') =>\\n let \'{|\\n State.pc := _;\\n State.locals := locals\';\\n State.interpreter := interpreter\';\\n |} := state\' in\\n IsInterpreterContextOfType.t locals\' interpreter\' type_safety_checker\'\\n (* If the type-checker succeeds, then the interpreter cannot return a panic *)\\n | Panic.Value (Result.Ok _, _), Panic.Panic _ => False\\n (* Other errors are allowed *)\\n | Panic.Value (Result.Ok _, _), Panic.Value (Result.Err _, _)\\n | Panic.Value (Result.Err _, _), _\\n | Panic.Panic _, _ => True\\n end.\\n```\\n\\nThis lemma is in the file 
[proofs/move_bytecode_verifier/type_safety.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/move_sui/proofs/move_bytecode_verifier/type_safety.v). It compares the behavior of the type-checker and the interpreter when executing an instruction. If the type-checker succeeds, then the interpreter cannot return a panic. If the interpreter also succeeds, then the new state is well-typed according to the types returned by the type-checker.\\n\\n## \ud83d\udee1\ufe0f Proof time\\n\\nWe prove the statement above by reasoning about all possible instructions. For the `CastU8` instruction, the Rocq proof is as follows:\\n\\n```coq\\n{ guard_instruction Bytecode.CastU8.\\n destruct_abstract_pop.\\n step; cbn; [exact I|].\\n destruct_abstract_push.\\n step; cbn; (try easy); (try now destruct operand_ty);\\n repeat (step; cbn; try easy);\\n constructor; cbn; try assumption;\\n sauto lq: on.\\n}\\n```\\n\\nHere is what this script does:\\n\\n- `guard_instruction Bytecode.CastU8` checks that the current instruction is `CastU8`. This helps debugging if we are not at the right place.\\n- `destruct_abstract_pop` pops the top-most element of the stack of types and gives it the name `operand_ty`. It handles the cases where the stack is empty.\\n- `step; cbn; [exact I|]` is a command to handle the next `if` in the code of the type-checker. We are only interested in the success branch (`else` branch in this case).\\n- `destruct_abstract_push` pushes the type `U8` on the stack of types.\\n\\nThen, there is a set of automated tactics iterating over all the possible types of values that can be on the stack. 
Just before the end of the proof, we have the following proof state:\\n\\n```coq\\n---------------------------------------\\n(1/6)\\nList.Forall2 IsValueImplOfType.t (ValueImpl.U8 z :: x0)\\n (SignatureToken.U8 :: AbstractStack.flatten stack_ty0)\\n```\\n\\nThis proof state is repeated identically six times, once for each possible integer type (`U8`, `U16`, `U32`, `U64`, `U128`, `U256`). It says that the stack of values:\\n\\n```coq\\nValueImpl.U8 z :: x0\\n```\\n\\nmust have the stack of types:\\n\\n```coq\\nSignatureToken.U8 :: AbstractStack.flatten stack_ty0\\n```\\n\\nThe value `z` is the result of the `cast_u8` function in the interpreter. The `flatten` function is used to flatten the stack of types that may contain duplicates.\\n\\nFor the head of the stack, the property is trivially true. For the tail of the stack, we use one of the hypotheses from the context, coming from the fact that the stack was initially well-typed and that we did not modify the tail.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have shown how to formally verify that the type-checker for Move\'s bytecode virtual machine is correct on a simple instruction, `CastU8`. This is part of a larger effort to ensure the correctness of the whole type-checker.\\n\\nOther instructions operating on atomic types (integers, booleans, addresses) are similar to this one. The most complex instructions are the ones operating on references and data structures like vectors and structs. These require more work, and we have not yet tackled them.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! 
Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2025/01/06/annotating-what-we-are-doing","metadata":{"permalink":"/blog/2025/01/06/annotating-what-we-are-doing","source":"@site/blog/2025-01-06-annotating-what-we-are-doing.md","title":"\ud83e\udd16 Annotating what we are doing for an LLM to pick up","description":"We want to write a series of blog posts about our efforts to use LLMs to formally verify code faster with the  Rocq/Coq theorem prover. Here, we present an experiment consisting of writing all that we are doing so that we can document our reasoning and help LLMs to pick up human techniques.","date":"2025-01-06T00:00:00.000Z","formattedDate":"January 6, 2025","tags":[{"label":"llm","permalink":"/blog/tags/llm"},{"label":"ai","permalink":"/blog/tags/ai"}],"readingTime":3.795,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd16 Annotating what we are doing for an LLM to pick up","tags":["llm","ai"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Verification of one instruction of the Move\'s type-checker","permalink":"/blog/2025/01/13/verification-one-instruction-sui"},"nextItem":{"title":"\ud83e\udd84 Mutually recursive functions with notation","permalink":"/blog/2024/12/26/mutually-recursive-functions-with-notation"}},"content":"We want to write a series of blog posts about our efforts to use LLMs to formally verify code faster with the [ Rocq/Coq](https://rocq-prover.org/) theorem prover. Here, we present an experiment consisting of writing all that we are doing so that we can document our reasoning and help LLMs to pick up human techniques.\\n\\nAccording to many publications about using generative AI to help formal verification, it is almost impossible to find a proof in \\"one shot\\". So, one certainly has to interact with the system, maybe by following the human way. 
Here we aim to document this \\"human way\\" of writing proofs.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Ask for the highest security!\\n\\nHow do you ensure your security audits are exhaustive?\\n\\nWhen millions are at stake, bug bounties are not enough.\\n\\nThe only way to do this is to use **formal verification** to _prove_ your code is correct.\\n\\nThis is what we provide as a service. **Contact us** at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land) to ensure your code is safe! \ud83d\ude80\\n\\nWe cover **Rust**, **Solidity**, and soon **zk circuits**.\\n\\n:::\\n\\n
\\n ![Robot](2025-01-06/robot-forest.webp)\\n
\\n\\n## \ud83d\udd0d Example\\n\\nWe take as an example our verification effort for the type-checker of the Move language. We have a big lemma to verify with 77 cases, one per Move instruction. We now write everything we do in a single linear document [what_we_do.md](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/what_we_do.md). Here is an extract:\\n\\n>Now a previous case is failing:\\n>\\n>```\\n>hauto l: on.\\n>```\\n>\\n>with:\\n>\\n>```\\n>Error: hauto failed\\n>```\\n>\\n>As this is a tactic generated by `best`, we try to use `best` again. It works! We continue and arrive at our current goal. Out of curiosity, we try `best` again. It works! The idea is that since we made weaker the definition of what we want to prove, maybe we can now solve it automatically.\\n>\\n>We have six cases which are solved by `best`:\\n>\\n>```\\n>{ best. }\\n>{ best. }\\n>{ best. }\\n>{ best. }\\n>{ best. }\\n>{ best. }\\n>```\\n>\\n>We replace it by `; best` after the block of previous tactics:\\n>\\n>```\\n>step; cbn; (try easy); (try now destruct operand_ty);\\n> repeat (step; cbn; try easy);\\n> constructor; cbn; try assumption;\\n> best.\\n>```\\n>\\n>It works! By running `make` again we get that we can replace the `best` by `\\n>\\n>So now we have done the `Bytecode.CastU8` case.\\n\\nWe document both our successes and failures, as this is what we do when we interact with the system to try to find the proof of a property.\\n\\n## \ud83d\udc06 Quick takeaways\\n\\nThis is time-consuming. Hopefully, this pays off in the long run. There may be a way to automatically record what we are doing, by recording the user interactions in a VSCode plugin. 
In addition, when writing what we do by hand, we might forget, out of laziness, to write down some important but seemingly obvious steps, like checking another file.\\n\\nThe autocomplete from GitHub Copilot, while writing the document, already generated the right steps to do from the journal we are writing, like \\"compile the project again\\" or a good tactic to try.\\n\\nWe realize that we have a lot to write in consolidated documents and that much of what we do follows coding conventions we have adopted. These might not be the ones used by everyone, so we have to distinguish between our conventions and general Rocq knowledge.\\n\\nThere is a lot of domain-specific knowledge that only a human can provide and that is specific to each project. For example, here, a human has to give hints related to how the Move type-checker is implemented, which can only be understood by reading the source code.\\n\\nHere we try to give some sense of mid-level intuitions: how to navigate the project, go to a definition, add a new property, ... We do not focus too much on the details of the tactics to use (more low-level), or on the high-level intuition behind the proof, which might be better provided by a human.\\n\\nThis helps to understand how an LLM thinks and which information it has access to.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have quickly presented the idea of writing what we are doing along the way to help LLMs understand how to verify some code.\\n\\nPlease tell us what you think or if you have some ideas for improving this process!\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! 
Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/12/26/mutually-recursive-functions-with-notation","metadata":{"permalink":"/blog/2024/12/26/mutually-recursive-functions-with-notation","source":"@site/blog/2024-12-26-mutually-recursive-functions-with-notation.md","title":"\ud83e\udd84 Mutually recursive functions with notation","description":"In this blog post, we present a technique with the  Rocq/Coq theorem prover to define mutually recursive functions using a notation. This is sometimes convenient for types defined using a container type, such as types depending on a list of itself.","date":"2024-12-26T00:00:00.000Z","formattedDate":"December 26, 2024","tags":[{"label":"recursion","permalink":"/blog/tags/recursion"},{"label":"notation","permalink":"/blog/tags/notation"},{"label":"mutual","permalink":"/blog/tags/mutual"}],"readingTime":3.735,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd84 Mutually recursive functions with notation","tags":["recursion","notation","mutual"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd16 Annotating what we are doing for an LLM to pick up","permalink":"/blog/2025/01/06/annotating-what-we-are-doing"},"nextItem":{"title":"\ud83d\udc7b Translation of Circom to Coq","permalink":"/blog/2024/12/20/translation-of-circom-to-coq"}},"content":"In this blog post, we present a technique with the [ Rocq/Coq](https://rocq-prover.org/) theorem prover to define mutually recursive functions using a notation. This is sometimes convenient for types defined using a container type, such as types depending on a list of itself.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Ask for the highest security!\\n\\nTo ensure your code is fully secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! 
\ud83d\ude80\\n\\nWe exclusively focus on formal verification to offer you the highest degree of security for your application.\\n\\nWe currently work with some of the leading blockchain entities, such as:\\n\\n- The [Ethereum Foundation](https://ethereum.foundation/)\\n- The [Sui Foundation](https://sui.io/about)\\n- Previously, the [Aleph Zero](https://alephzero.org/) and [Tezos](https://tezos.com/) foundations\\n\\n:::\\n\\n
\\n ![Forest](2024-12-26/two-trees.jpg)\\n
\\n\\n## \ud83d\udd0d Example\\n\\nHere is a typical example of a type defined using a container of itself, written in [\ud83e\udd80 Rust](https://www.rust-lang.org/):\\n\\n```rust\\nstruct Trees<A>(Vec<Tree<A>>);\\n\\nenum Tree<A> {\\n Leaf,\\n Node { data: A, children: Trees<A> },\\n}\\n```\\n\\nThese two definitions are mutually dependent. We choose to represent them in Rocq/Coq with the following definitions:\\n\\n```coq\\nInductive Tree (A : Set) : Set :=\\n| Leaf : Tree A\\n| Node : A -> list (Tree A) -> Tree A.\\n\\nDefinition Trees (A : Set) : Set :=\\n list (Tree A).\\n```\\n\\nIf we define a recursive function on this type, for example, to compute the sum of all the values in the tree, we would naturally write a function that iterates both on:\\n\\n- The tree constructors,\\n- The list of the `Node` case.\\n\\n## \ud83d\udcdd First solution\\n\\nHere is a first attempt to define a `sum` function that adds all the elements of the tree:\\n\\n```coq\\nFixpoint sum_tree {A : Set} (f : A -> nat) (t : Tree A) : nat :=\\n match t with\\n | Leaf => 0\\n | Node a ts => f a + sum_trees f ts\\n end\\n\\nwith sum_trees {A : Set} (f : A -> nat) (ts : Trees A) : nat :=\\n match ts with\\n | nil => 0\\n | t :: ts => sum_tree f t + sum_trees f ts\\n end.\\n```\\n\\nThis definition does not work as the `Tree` type is not mutually recursive, but the function `sum_tree` is. 
The error message is:\\n\\n```\\nError: Cannot guess decreasing argument of fix.\\n```\\n\\nA first solution is to define the function `sum_trees` as a local definition in `sum_tree`:\\n\\n```coq\\nFixpoint sum_tree {A : Set} (f : A -> nat) (t : Tree A) : nat :=\\n let fix sum_trees (ts : Trees A) : nat :=\\n match ts with\\n | nil => 0\\n | t :: ts => sum_tree f t + sum_trees ts\\n end in\\n match t with\\n | Leaf => 0\\n | Node a ts => f a + sum_trees ts\\n end.\\n```\\n\\nThis definition gets accepted by the prover!\\n\\n## \ud83d\ude80 Second solution\\n\\nAn issue is that we cannot call `sum_trees` directly as its definition is hidden in the one of `sum_tree`. This is a problem if further top-level definitions depend on `sum_trees`, or if we want to verify intermediate properties about `sum_trees` itself.\\n\\nA solution we use for this kind of problem is to add a notation to make `sum_trees` a top-level definition while keeping the mutual recursion with `sum_tree`:\\n\\n```coq\\nReserved Notation \\"\'sum_trees\\".\\n\\nFixpoint sum_tree {A : Set} (f : A -> nat) (t : Tree A) : nat :=\\n match t with\\n | Leaf => 0\\n | Node a ts => f a + \'sum_trees _ f ts\\n end\\n\\nwhere \\"\'sum_trees\\" := (fix sum_trees (A : Set) (f : A -> nat) (ts : Trees A) : nat :=\\n match ts with\\n | nil => 0\\n | t :: ts => sum_tree f t + sum_trees _ f ts\\n end).\\n\\nDefinition sum_trees {A : Set} := \'sum_trees A.\\n```\\n\\nHere, both `sum_tree` and `sum_trees` are defined as top-level, and the mutually recursive definition is accepted. Note that we have to make the type `A` explicit in the notation, as implicit parameters are not allowed there.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have shown a technique that is sometimes useful for us to define complex, mutually dependent data structures. 
This was recently useful for defining the `ValueImpl` type in the type-checker of [Move](https://sui.io/move) for the blockchain [Sui](https://sui.io/).\\n\\nYou can tell us what you think or if you prefer another way to define mutually recursive functions!\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/12/20/translation-of-circom-to-coq","metadata":{"permalink":"/blog/2024/12/20/translation-of-circom-to-coq","source":"@site/blog/2024-12-20-translation-of-circom-to-coq.md","title":"\ud83d\udc7b Translation of Circom to Coq","description":"In this post, we present the beginning of our work to translate programs written in the Circom circuit language to the \ud83d\udc13 Coq proof assistant. This work is part of our research on the formal verification of zero-knowledge systems.","date":"2024-12-20T00:00:00.000Z","formattedDate":"December 20, 2024","tags":[{"label":"Circom","permalink":"/blog/tags/circom"},{"label":"zero-knowledge","permalink":"/blog/tags/zero-knowledge"}],"readingTime":10.84,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc7b Translation of Circom to Coq","tags":["Circom","zero-knowledge"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd84 Mutually recursive functions with notation","permalink":"/blog/2024/12/26/mutually-recursive-functions-with-notation"},"nextItem":{"title":"\ud83e\udd84 How does formal verification of smart contracts work?","permalink":"/blog/2024/12/20/what-is-formal-verification-of-smart-contracts"}},"content":"In this post, we present the beginning of our work to translate programs written in the [Circom](https://iden3.io/circom) circuit language to the [\ud83d\udc13 Coq](https://coq.inria.fr/) proof assistant. 
This work is part of our research on the formal verification of zero-knowledge systems.\\n\\nWe will aim to write more regularly about what we are doing, even if the posts are then shorter. Here, we focus on the translation part for a simple example without defining a semantics for the generated Coq code.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Ask for the highest security!\\n\\nTo ensure your code is fully secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nWe exclusively focus on formal verification to offer you the highest degree of security for your application.\\n\\nWe are already working with some of the leading blockchain entities, such as:\\n\\n- The [Ethereum Foundation](https://ethereum.foundation/)\\n- The [Sui Foundation](https://sui.io/about)\\n- Previously, the [Aleph Zero](https://alephzero.org/) and [Tezos](https://tezos.com/) foundations\\n\\n:::\\n\\n
\\n ![Forest](2024-12-20/ghost-forest.webp)\\n
\\n\\n## \ud83d\udc7b The Circom language\\n\\nThis is a language to write composable and optimized zero-knowledge circuits. It has been in use for quite some time, and there are a lot of examples of Circom programs implementing common cryptographic primitives, such as hash functions. See, for example, the [github.com/iden3/circomlib](https://github.com/iden3/circomlib) repository. It is referenced by many projects; see, for example, the blog post [How zkLogin Made Cryptography Faster and More Secure](https://www.mystenlabs.com/blog/how-zklogin-made-cryptography-faster-and-more-secure) from the development team of [Sui](https://sui.io/), which mentions Circom.\\n\\nHere is the example which we consider to add `ops` numbers of `n` bits:\\n\\n```circom\\nfunction nbits(a) {\\n var n = 1;\\n var r = 0;\\n while (n-1<a) {\\n r++;\\n n *= 2;\\n }\\n return r;\\n}\\n\\n\\ntemplate BinSum(n, ops) {\\n var nout = nbits((2**n -1)*ops);\\n signal input in[ops][n];\\n signal output out[nout];\\n\\n var lin = 0;\\n var lout = 0;\\n\\n var k;\\n var j;\\n\\n var e2;\\n\\n e2 = 1;\\n for (k=0; k<n; k++) {\\n for (j=0; j<ops; j++) {\\n lin += in[j][k] * e2;\\n }\\n e2 = e2 + e2;\\n }\\n\\n e2 = 1;\\n for (k=0; k<nout; k++) {\\n out[k] <-- (lin >> k) & 1;\\n\\n // Ensure out is binary\\n out[k] * (out[k] - 1) === 0;\\n\\n lout += out[k] * e2;\\n\\n e2 = e2+e2;\\n }\\n\\n // Ensure the sum;\\n\\n lin === lout;\\n}\\n```\\n\\nYou can find this example in [github.com/iden3/circomlib/blob/master/circuits/binsum.circom](https://github.com/iden3/circomlib/blob/master/circuits/binsum.circom).\\n\\nIt contains a function `nbits` to compute the number of bits needed to represent a number, and a template `BinSum` to add `ops` numbers of `n` bits. The function `nbits` does not perform any operation related to zero-knowledge. It is a simple imperative function with a loop and mutable variables `n` and `r`.\\n\\nThe template `BinSum` defines what is required to instantiate a new component, with input signals `in` and output signals `out`. With the equality assertion `===`, it ensures that the output signals must be the bits of the sum of the input signals, and cannot be anything else. This is the property that we will want to formally verify to ensure that it is not possible to provide a proof of this circuit which does not compute the addition. 
This is called verifying that the circuit is not _underconstrained_.\\n\\n## \ud83d\udc13 The Coq proof assistant\\n\\nThe [Coq proof assistant](https://coq.inria.fr/), which we use exclusively at [Formal Land](https://formal.land), is a generic formal verification system. You can use it to verify any kind of maths or programs. You can never be stuck in the verification of a property, thanks to its interactive mode to refine proofs step by step. You can express almost any kind of property as it is based on the very expressive [Calculus Of Constructions](https://en.wikipedia.org/wiki/Calculus_of_constructions) logic.\\n\\nIts community focuses a lot on the verification of programs, the most notable example being the full verification of the C compiler [CompCert](https://compcert.org/).\\n\\nOur strategy is always the same: finding a nice embedding of a language in Coq, so that we can formally verify programs written in this language and reuse all the existing Coq tools and libraries.\\n\\n## \ud83d\ude80 Translation of Circom to Coq\\n\\nThe Circom compiler is written in [\ud83e\udd80 Rust](https://www.rust-lang.org/). Generally, a compiler is composed of many intermediate languages, starting from a language that is essentially a representation of what the parser returns, down to some form of assembly or circuit language.\\n\\nIf you translate a high-level intermediate language to a proof system, you retain a lot of information from the original program and the specifications/proofs tend to be simpler. If you translate a low-level language, the translation itself will be simpler and more trustworthy, but the verification part will be harder. 
As Circom is a rather small language (compared to a full programming language), we choose to translate its high-level representation to Coq.\\n\\n### To JSON\\n\\nWe write our translation tool in \ud83d\udc0d Python for simplicity, reading a JSON export of the [abstract syntax tree](https://github.com/iden3/circom/blob/master/program_structure/src/abstract_syntax_tree/ast.rs) of Circom. The quickest way to export data from Rust is to use [Serde](https://serde.rs/) to generate pretty-printing functions to JSON. We have done it in this [pull request](https://github.com/formal-land/circom/pull/1) from a fork of the Circom compiler.\\n\\nHere is, for example, the beginning of the JSON version of the `nbits` function above:\\n\\n```json\\n\\"Function\\": {\\n \\"meta\\": {},\\n \\"name\\": \\"nbits\\",\\n \\"args\\": [\\n \\"a\\"\\n ],\\n \\"arg_location\\": {\\n \\"start\\": 1571,\\n \\"end\\": 1572\\n },\\n \\"body\\": {\\n \\"Block\\": {\\n \\"meta\\": {},\\n \\"stmts\\": [\\n {\\n \\"InitializationBlock\\": {\\n \\"meta\\": {},\\n \\"xtype\\": \\"Var\\",\\n \\"initializations\\": [\\n {\\n \\"Declaration\\": {\\n \\"meta\\": {},\\n \\"xtype\\": \\"Var\\",\\n \\"name\\": \\"n\\",\\n \\"dimensions\\": [],\\n \\"is_constant\\": true\\n }\\n },\\n {\\n \\"Substitution\\": {\\n \\"meta\\": {},\\n \\"var\\": \\"n\\",\\n \\"access\\": [],\\n \\"op\\": \\"AssignVar\\",\\n \\"rhe\\": {\\n \\"Number\\": [\\n {},\\n [\\n 1,\\n [\\n 1\\n ]\\n ]\\n```\\n\\n### To Coq\\n\\nWe iterate over each node of this JSON file to produce a corresponding Coq file with the Python script [scripts/coq_of_circom.py](https://github.com/formal-land/garden/blob/main/scripts/coq_of_circom.py). 
Here is a short extract from this script:\\n\\n```python\\n\\"\\"\\"\\npub enum Access {\\n ComponentAccess(String),\\n ArrayAccess(Expression),\\n}\\n\\"\\"\\"\\ndef to_coq_access(node) -> str:\\n if \\"ComponentAccess\\" in node:\\n return f\\"Access.Component ({node[\'ComponentAccess\']})\\"\\n if \\"ArrayAccess\\" in node:\\n return f\\"Access.Array ({to_coq_expression(node[\'ArrayAccess\'])})\\"\\n return f\\"Unknown access: {node}\\"\\n```\\n\\nFor every node type in the Circom AST, we copy its Rust type in triple quotes in Python and let GitHub Copilot write a conversion function, which we complete and fix by hand. That way, we quickly cover all syntax with a reasonable Coq output.\\n\\nWe put all the code of our translation in a monad in Coq to represent the side effects, which are mainly here:\\n\\n- imperative effects such as mutations and potentially non-terminating loops,\\n- the instantiation of components with signals, and the enforcement of the equality constraints.\\n\\nHere is the Coq translation for the Circom example above, given in [Garden/Circom/Example/binsum.v](https://github.com/formal-land/garden/blob/main/Garden/Circom/Example/binsum.v):\\n\\n```coq\\n(* Function *)\\nDefinition nbits (a : F.t) : M.t F.t :=\\n M.function_body (\\n (* Var *)\\n do~ M.declare_var \\"n\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"n\\" [[ 1 ]] in\\n (* Var *)\\n do~ M.declare_var \\"r\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"r\\" [[ 0 ]] in\\n do~ M.while [[ InfixOp.lesser ~(| InfixOp.sub ~(| M.var ~(| \\"n\\" |), 1 |), M.var ~(| \\"a\\" |) |) ]] (\\n do~ M.substitute_var \\"r\\" [[ InfixOp.add ~(| M.var ~(| \\"r\\" |), 1 |) ]] in\\n do~ M.substitute_var \\"n\\" [[ InfixOp.mul ~(| M.var ~(| \\"n\\" |), 1 |) ]] in\\n M.pure BlockUnit.Tt\\n ) in\\n do~ M.return_ [[ M.var ~(| \\"r\\" |) ]] in\\n M.pure BlockUnit.Tt\\n ).\\n\\n(* Template *)\\nDefinition BinSum (n ops : F.t) : M.t BlockUnit.t :=\\n (* Var *)\\n do~ M.declare_var \\"nout\\" 
[[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"nout\\" [[ nbits ~(| InfixOp.mul ~(| InfixOp.sub ~(| InfixOp.pow ~(| 1, M.var ~(| \\"n\\" |) |), 1 |), M.var ~(| \\"ops\\" |) |) |) ]] in\\n (* Signal Input *)\\n do~ M.declare_signal \\"in\\" [[ [M.var ~(| \\"ops\\" |); M.var ~(| \\"n\\" |)] ]] in\\n (* Signal Output *)\\n do~ M.declare_signal \\"out\\" [[ [M.var ~(| \\"nout\\" |)] ]] in\\n (* Var *)\\n do~ M.declare_var \\"lin\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"lin\\" [[ 0 ]] in\\n (* Var *)\\n do~ M.declare_var \\"lout\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"lout\\" [[ 0 ]] in\\n (* Var *)\\n do~ M.declare_var \\"k\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"k\\" [[ 0 ]] in\\n (* Var *)\\n do~ M.declare_var \\"j\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"j\\" [[ 0 ]] in\\n (* Var *)\\n do~ M.declare_var \\"e2\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"e2\\" [[ 0 ]] in\\n do~ M.substitute_var \\"e2\\" [[ 1 ]] in\\n do~ M.substitute_var \\"k\\" [[ 0 ]] in\\n do~ M.while [[ InfixOp.lesser ~(| M.var ~(| \\"k\\" |), M.var ~(| \\"n\\" |) |) ]] (\\n do~ M.substitute_var \\"j\\" [[ 0 ]] in\\n do~ M.while [[ InfixOp.lesser ~(| M.var ~(| \\"j\\" |), M.var ~(| \\"ops\\" |) |) ]] (\\n do~ M.substitute_var \\"lin\\" [[ InfixOp.add ~(| M.var ~(| \\"lin\\" |), InfixOp.mul ~(| M.var_access ~(| \\"in\\", [Access.Array (M.var ~(| \\"j\\" |)); Access.Array (M.var ~(| \\"k\\" |))] |), M.var ~(| \\"e2\\" |) |) |) ]] in\\n do~ M.substitute_var \\"j\\" [[ InfixOp.add ~(| M.var ~(| \\"j\\" |), 1 |) ]] in\\n M.pure BlockUnit.Tt\\n ) in\\n do~ M.substitute_var \\"e2\\" [[ InfixOp.add ~(| M.var ~(| \\"e2\\" |), M.var ~(| \\"e2\\" |) |) ]] in\\n do~ M.substitute_var \\"k\\" [[ InfixOp.add ~(| M.var ~(| \\"k\\" |), 1 |) ]] in\\n M.pure BlockUnit.Tt\\n ) in\\n do~ M.substitute_var \\"e2\\" [[ 1 ]] in\\n do~ M.substitute_var \\"k\\" [[ 0 ]] in\\n do~ M.while [[ InfixOp.lesser ~(| M.var ~(| \\"k\\" |), M.var ~(| \\"nout\\" 
|) |) ]] (\\n do~ M.substitute_var \\"out\\" [[ InfixOp.bitand ~(| InfixOp.shiftr ~(| M.var ~(| \\"lin\\" |), M.var ~(| \\"k\\" |) |), 1 |) ]] in\\n do~ M.equality_constraint\\n [[ InfixOp.mul ~(| M.var_access ~(| \\"out\\", [Access.Array (M.var ~(| \\"k\\" |))] |), InfixOp.sub ~(| M.var_access ~(| \\"out\\", [Access.Array (M.var ~(| \\"k\\" |))] |), 1 |) |) ]]\\n [[ 0 ]]\\n in\\n do~ M.substitute_var \\"lout\\" [[ InfixOp.add ~(| M.var ~(| \\"lout\\" |), InfixOp.mul ~(| M.var_access ~(| \\"out\\", [Access.Array (M.var ~(| \\"k\\" |))] |), M.var ~(| \\"e2\\" |) |) |) ]] in\\n do~ M.substitute_var \\"e2\\" [[ InfixOp.add ~(| M.var ~(| \\"e2\\" |), M.var ~(| \\"e2\\" |) |) ]] in\\n do~ M.substitute_var \\"k\\" [[ InfixOp.add ~(| M.var ~(| \\"k\\" |), 1 |) ]] in\\n M.pure BlockUnit.Tt\\n ) in\\n do~ M.equality_constraint\\n [[ M.var ~(| \\"lin\\" |) ]]\\n [[ M.var ~(| \\"lout\\" |) ]]\\n in\\n M.pure BlockUnit.Tt.\\n```\\n\\nIf you compare the translated code to the original Circom code, you will see that the two are very similar, up to a more verbose syntax in Coq. This is because we translate the high-level representation of Circom to Coq.\\n\\n### Free monad\\n\\nEven if we do not define the Circom semantics for now, we need to write a few Coq definitions so that the code above can be type-checked. As in the translations we make for other languages, we use a free-monad to represent side effects. This is convenient, as we can first express which are the various \\"special\\" operators of the language (declaring a signal, instantiating a template, ...) and define their behavior in a second step. The behavior might be defined in a computational or relational way later.\\n\\nThe definitions of the monad are in [Garden/Garden.v](https://github.com/formal-land/garden/blob/main/Garden/Garden.v). Here is an extract:\\n\\n```coq\\nModule Primitive.\\n (** We group together primitives that share being impure functions operating over the state. 
*)\\n Inductive t : Set -> Set :=\\n | OpenScope : t unit\\n | CloseScope : t unit\\n | DeclareVar (name : string) (value : F.t) : t unit\\n | DeclareSignal (name : string) (dimensions : list F.t) : t unit\\n | SubstituteVar (name : string) (value : F.t) : t unit\\n | GetVarAccess (name : string) (access : list Access.t) : t F.t\\n | GetPrime : t F.t\\n | EqualityConstraint (value1 value2 : F.t) : t unit.\\nEnd Primitive.\\n```\\n\\nThese are some of the primitives from the Circom language, which we needed for our example (we will add more as we cover more of the language). For example:\\n\\n```coq\\n| DeclareVar (name : string) (value : F.t) : t unit\\n```\\n\\ncorresponds to the Circom construction:\\n\\n```circom\\nvar n = 1;\\n```\\n\\nwith `name` the string `\\"n\\"` and `value` the field element `1` in this case. We will use a scope of local variables for each function, as a dictionary from the name of the variable to its value. All local variables might be mutated. Compared to more complex languages, such as Rust, we do not need to handle the notion of pointer, which simplifies many things.\\n\\nNote that with the free monad, we only give the list of primitive operations with their signatures, but we have not yet defined how to evaluate them.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have seen how to define a first translation from the Circom language to Coq, for one example, with the goal of having a translation that is well-typed.\\n\\nIn the following article, we will explore how to give a meaning to the primitive operations of Circom, such as signals and constraints, in order to formally verify that the `BinSum` template is correct.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! 
Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/12/20/what-is-formal-verification-of-smart-contracts","metadata":{"permalink":"/blog/2024/12/20/what-is-formal-verification-of-smart-contracts","source":"@site/blog/2024-12-20-what-is-formal-verification-of-smart-contracts.md","title":"\ud83e\udd84 How does formal verification of smart contracts work?","description":"We make here a general presentation about how the formal verification of smart contracts works by explaining:","date":"2024-12-20T00:00:00.000Z","formattedDate":"December 20, 2024","tags":[{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"smart contract","permalink":"/blog/tags/smart-contract"},{"label":"audit","permalink":"/blog/tags/audit"}],"readingTime":8.275,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd84 How does formal verification of smart contracts work?","tags":["Solidity","smart contract","audit"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83d\udc7b Translation of Circom to Coq","permalink":"/blog/2024/12/20/translation-of-circom-to-coq"},"nextItem":{"title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 2","permalink":"/blog/2024/11/15/tool-for-noir-2"}},"content":"We make here a general presentation about how the formal verification of smart contracts works by explaining:\\n\\n- How people secure their smart contracts without formal verification.\\n- How do formal tools typically work?\\n- How our solution [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) works on a short example (an [ERC-20](https://ethereum.org/en/developers/docs/standards/tokens/erc-20/) contract).\\n- Where LLMs could be the most useful, according to us, for formal verification work.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Ask for the highest security!\\n\\nTo ensure your code is fully secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! 
\ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. It can also be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe are already working with some of the leading blockchain entities such as:\\n\\n- The [Ethereum Foundation](https://ethereum.foundation/)\\n- The [Sui Foundation](https://sui.io/about)\\n- Previously, the [Aleph Zero](https://alephzero.org/) and [Tezos](https://tezos.com/) foundations\\n\\n:::\\n\\n
\\n ![Forest](2024-12-20/forest.webp)\\n
\\n\\n## \ud83d\udee1\ufe0f Securing smart contracts, the common way\\n\\nSmart contracts are short programs, typically fewer than 5,000 lines of code, running \\"on the blockchain\\" to implement transaction rules. Examples include virtual marketplaces to trade cryptocurrencies, virtual dollar coins, traceability databases, and NFTs. Most of the smart contracts are written in [Solidity](https://soliditylang.org/), a JavaScript-like language, and some are in [Rust](https://www.rust-lang.org/).\\n\\nTo know what a smart contract looks like, you can find a list of the biggest ones (in terms of users) on [shafu0x/awesome-smart-contracts](https://github.com/shafu0x/awesome-smart-contracts). A popular library to write smart contracts is [OpenZeppelin](https://www.openzeppelin.com/solidity-contracts). You can also search for the [Solidity language](https://github.com/search?q=lang%3ASolidity%20&type=repositories) on GitHub to find repositories with Solidity code.\\n\\nSmart contracts are most of the time open-source, as it is important for the users to know the rules that handle their money. If a contract is not open-source, _it is probably a scam_.\\n\\nSecuring smart contracts is very important as a single bug can mean that an attacker can steal all the funds of the users who deposited money on the contract, or just block it to compromise the service. Millions of dollars are stolen every month due to bugs in the contracts, and some projects lose almost everything in such attacks. A historically important attack is the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) where $60 million was stolen, leading to a hard fork of the [Ethereum](https://ethereum.org/) blockchain.\\n\\nNow, how do people secure their code? First of all, most projects are well aware that software security is important, and if they want to raise money or advertise their product, they need to show that they are secure. 
They typically do the following:\\n\\n- **Audits** Projects require a few audits, which are made by specialized companies or individuals, to review the code of a smart contract and find bugs or vulnerabilities. The issues are classified into categories of importance: informational, low, medium, high, or critical. The highest categories mean it is possible to steal all of the funds. Lower categories are mostly remarks about coding style or missing documentation. At the end of an audit, a **report** is published together with the corrections for the vulnerabilities that were discovered. As an example, [here](https://github.com/trailofbits/publications/tree/master/reviews) is a list of audit reports from the company [Trail of Bits](https://www.trailofbits.com/).\\n- **Competitions** They enable anyone, during a predefined period of about a month, to look for bugs in a smart contract. At the end of the competition, a prize pot is shared among the people who found the most bugs. A typical prize pot is $100,000, and some large competitions can go above $1,000,000 \ud83d\udcb0. You can see the list of all ongoing competitions on [www.dailywarden.com](https://www.dailywarden.com/).\\n- **Bounties** Finally, bounties are like competitions but always live. The aim is to reward the reporting of critical vulnerabilities, so that there is an incentive to report a bug instead of exploiting it. A popular platform is [Immunefi](https://immunefi.com/).\\n\\nTo give an idea of the amounts that are at risk of attacks on the blockchain, the total valuation of Ethereum, the main smart contracts platform, is estimated at more than 300 billion dollars! 
Attacks are believed to be carried out mainly by \ud83c\uddf0\ud83c\uddf5 North Korean agents, but sometimes the attackers are single, clever individuals.\\n\\n## \ud83d\udee0\ufe0f Formal verification tools\\n\\nSo, where does formal verification stand in all that?\\n\\nSince the idea is to reason mathematically about code to show the total absence of bugs in a protocol, formal verification seems to be the ideal tool to ensure the absence of vulnerabilities in a smart contract. In fact, most popular platforms do not take the risk of deploying new versions without a formal verification step, mainly with the leading tool [Certora](https://www.certora.com/).\\n\\nAs the verification is, at the end of the day, a mathematical proof, we can be sure that the code is correct for any possible user inputs, for a given and explicit _specification_. In addition, when the code changes, there is no need to review everything again: you can just formally verify the code that changed, as you would do when writing tests. This saves you time and money.\\n\\n## \ud83d\udea7 Limitations\\n\\nSo, what are the limitations? Here are a few:\\n\\n- **Cost** You need to pay more than with traditional audits. Although the rewards are probably there, given the quantity of funds at risk in a smart contract, many small companies take the risk.\\n- **Time** Sometimes, time is an issue, even if the verification can be done in a continuous manner.\\n- **Specification** You still have to write the correct specification of your code! This is what defines what is a bug and what is a feature.\\n- **Complexity** Formal verification requires some specific knowledge which most developers do not have (this is why we are here to help you! \ud83d\ude04).\\n\\nAnother one is completeness. Some formal verification tools aim to _fully automate_ the proof part, so that you only need to write the specifications. 
But then:\\n\\n- Some properties are unprovable, or need to be cut into smaller ones in non-trivial ways.\\n- Some parts of the code are not verified. Typically, loops are only unrolled a few times (two or three times), instead of covering all the possible iterations.\\n- Some properties cannot even be expressed!\\n\\nThis is a _real_ concern, according to the security teams of a few blockchain companies we talked to. Popular tools such as Certora or [Halmos](https://github.com/a16z/halmos) fall into this category.\\n\\n## \ud83c\udf1f How to do better?\\n\\nUsing interactive theorem provers, such as [\ud83d\udc13 Coq](https://coq.inria.fr/) or Lean, you overcome the limitations of automated provers as presented above. Here are a few tools you can use:\\n\\n- [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) using the Coq theorem prover. This is the tool we made! \ud83c\udf89\\n- [Clear](https://github.com/NethermindEth/Clear) using the Lean theorem prover. This is a tool made by the company [Nethermind](https://nethermind.io/).\\n\\n[Kontrol](https://kontrol.runtimeverification.com/) from Runtime Verification is another verification tool providing ways to go further than automated tools.\\n\\nFor the question of correct and complete specifications, here is our idea:\\n\\n> Build a set of high-level primitives encoding ideas such as \\"identity\\", \\"value\\", \\"ownership\\", \\"exact calculation\\", ... which are not necessarily definable in a programming language but can be axiomatized in a proof system. Use them to give the business rules of a contract in a clear manner and to express meta-properties such as \\"it is impossible to steal\\" \ud83d\ude93.\\n\\n## \ud83d\udd27 Technical pipeline\\n\\nSolidity is a complex language. All the tools we mentioned above translate the code into a formal language to reason about it. They never take the Solidity code as it is. 
Instead, they first translate it to a simpler language, generally EVM bytecode (the assembly language for Solidity) or [Yul](https://docs.soliditylang.org/en/latest/yul.html) which is slightly higher level.\\n\\nThen, they run several steps to first \\"\ud83e\uddfc clean up the code\\" and obtain a representation that is high-level again. See, for example, the [Practical Verification of Smart Contracts using Memory Splitting](https://dl.acm.org/doi/10.1145/3689796) article from Certora about optimizing the memory representation of EVM code to retrieve some properties from the Solidity representation.\\n\\nIn `coq-of-solidity`, we call this step of going from low-level to high-level writing a \\"simulation\\", which is a high-level representation of the low-level code. This task is time-consuming. An alternative would be to use LLMs to generate it. We can check that the simulation is equivalent to the low-level version, either by writing a formal proof or by testing.\\n\\nAs an example, here is what we get for the verification of an ERC-20 smart contract with `coq-of-solidity` (you can click on the links to see the code):\\n\\n- [the ERC-20 Solidity contract](https://github.com/formal-land/coq-of-solidity/blob/develop/coq/CoqOfSolidity/contracts/erc20/contract.sol)\\n- [the low-level version (in Yul, generated)](https://github.com/formal-land/coq-of-solidity/blob/develop/coq/CoqOfSolidity/contracts/erc20/contract.yul)\\n- [the Coq translation of the low-level version (generated)](https://github.com/formal-land/coq-of-solidity/blob/develop/coq/CoqOfSolidity/contracts/erc20/shallow.v)\\n- [the simulation in Coq (hand-written)](https://github.com/formal-land/coq-of-solidity/blob/develop/coq/CoqOfSolidity/contracts/erc20/simulations/contract.v)\\n- [the formal proof that the two are equivalent (hand-written)](https://github.com/formal-land/coq-of-solidity/blob/develop/coq/CoqOfSolidity/contracts/erc20/proofs/contract.v)\\n\\n## \ud83e\udde0 Use cases for LLMs\\n\\nHere 
are a few areas where LLMs can be useful:\\n\\n1. Writing formal **specifications** from the code of a smart contract, its documentation, and a dataset of known vulnerabilities. We can find such datasets on the Internet or by reading audits and competition reports.\\n2. Writing **formal proofs** for the specifications.\\n3. Writing a **high-level representation** of a smart contract in a formal system.\\n4. Writing a formal proof that a **high-level representation is valid**.\\n\\nThe fact that most of the smart contracts are open-source should also help running learning algorithms. We hope to explore this area more in the future or give ideas to others.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have made a general presentation of security challenges around the deployment of smart contracts and how formal verification works and helps to secure smart contracts even more. We also presented a few ways to potentially improve current tooling.\\n\\nWe hope that this article will help you understand the importance of formal verification and how it can be used to secure your smart contracts. Please contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land) if you need formal verification services or advice!\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/11/15/tool-for-noir-2","metadata":{"permalink":"/blog/2024/11/15/tool-for-noir-2","source":"@site/blog/2024-11-15-tool-for-noir-2.md","title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 2","description":"In this blog post, we continue our presentation about our formal verification tool for \u25fc\ufe0f Noir programs coq-of-noir. Noir is a Rust-like language to write programs designed to run efficiently in zero-knowledge environments. 
It has a growing popularity and a focus on providing optimized libraries for common needs, such as a base64 library using \ud83e\udde0 field arithmetic that we use in this series of blog posts.","date":"2024-11-15T00:00:00.000Z","formattedDate":"November 15, 2024","tags":[{"label":"Noir","permalink":"/blog/tags/noir"},{"label":"smart contract","permalink":"/blog/tags/smart-contract"},{"label":"circuits","permalink":"/blog/tags/circuits"}],"readingTime":8.895,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 2","tags":["Noir","smart contract","circuits"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd84 How does formal verification of smart contracts work?","permalink":"/blog/2024/12/20/what-is-formal-verification-of-smart-contracts"},"nextItem":{"title":"\ud83e\udd80 Example of verification for the Move\'s checker of Sui","permalink":"/blog/2024/11/14/sui-move-checker-abstract-stack"}},"content":"In this blog post, we continue our presentation about our formal verification tool for [\u25fc\ufe0f Noir](https://noir-lang.org/) programs [coq-of-noir](https://github.com/formal-land/coq-of-noir). Noir is a Rust-like language to write programs designed to run efficiently in zero-knowledge environments. It has a growing popularity and a focus on providing optimized libraries for common needs, such as a [base64](https://github.com/noir-lang/noir_base64) library using \ud83e\udde0 field arithmetic that we use in this series of blog posts.\\n\\nHere we present the details of our semantic rules to show that a Noir program has an expected behavior for any possible parameters. We focus, in particular, on our memory-handling approach and the definition of loops.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Require the strongest security!\\n\\nTo ensure your code is fully secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! 
\ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. It can be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually **impossible to happen again**.\\n\\n:::\\n\\n
\\n ![Noir](2024-11-15/noir.webp)\\n
\\n\\n## \u2699\ufe0f Semantic rules\\n\\nIn the previous blog post [\u25fc\ufe0f A formal verification tool for Noir \u2013 1](/blog/2024/11/01/tool-for-noir-1) we presented our general translation from the Noir syntax to [\ud83d\udc13 Coq](https://coq.inria.fr/), as well as the free monad we use to represent side-effects such as mutations. We now need to define semantic rules to be able to say that a particular translated Noir program evaluates to a certain value.\\n\\nFor expressions that do not have side effects we rely on the usual reduction rules of Coq. This is really convenient as we can then reuse the existing Coq tactics and automation to reason about pure expressions.\\n\\nFor side-effects like mutations or function calls, which we also consider as side-effects as there might be infinite recursion, we use a big-step semantics with the following predicate:\\n\\n```coq\\n{{ p, state_in | e \u21d3 output | state_out }}\\n```\\n\\nIt says that for a certain prime number $p$ which is the size of the arithmetic field, for an initial state `state_in`, the expression `e` evaluates to the output `output` and the final state `state_out`.\\n\\nWe define this rule with a Coq `Inductive` with one case per case in our free monad for effects. This is similar to the work we have done for Rust with [coq-of-rust](https://github.com/formal-land/coq-of-rust). Here are the relevant rules.\\n\\n- `Pure`\\n Expressions without side effects evaluate to their value and do not change the state. Note that in Coq, we do not distinguish between expressions and values, as all values are equal modulo evaluation rules, so we can directly use the expression as the output.\\n ```coq\\n | Pure :\\n {{ p, state_out | LowM.Pure output \u21d3 output | state_out }}\\n ```\\n- `GetFieldPrime`\\n To obtain the current size of the field $p$ we use the `GetFieldPrime` primitive. This is a side-effect as it depends on the current settings to compile the Noir program in circuits. 
We use this operation as an internal operation to define the arithmetic operations in the field by computing modulo $p$.\\n ```coq\\n | CallPrimitiveGetFieldPrime\\n (k : Z -> M.t)\\n (state_in : State) :\\n {{ p, state_in | k p \u21d3 output | state_out }} ->\\n {{ p, state_in |\\n LowM.CallPrimitive Primitive.GetFieldPrime k \u21d3 output\\n | state_out }}\\n ```\\n We use a semantics by continuation with a continuation `k` for most of the operations of the monad. Instead of directly returning some result, we pass it to the continuation and evaluate it. In our experience, this simplifies the reasoning on code instead of having to use another monadic operation to pass this value.\\n- `CallClosure`\\n We define a closure as a function from a list of values to some monadic expression. In our translation, terms are totally untyped; in particular we do not enforce any arity for the functions. In case a wrong number of arguments is passed to a function, we will have a runtime error. This is a trade-off to keep the translation simple and to avoid having to define a type system for Noir.\\n ```coq\\n | CallClosure\\n (f : list Value.t -> M.t) (args : list Value.t)\\n (k : Result.t -> M.t)\\n (output_inter : Result.t)\\n (state_in state_inter : State) :\\n let closure := Value.Closure (existS (_, _) f) in\\n {{ p, state_in | f args \u21d3 output_inter | state_inter }} ->\\n {{ p, state_inter | k output_inter \u21d3 output | state_out }} ->\\n {{ p, state_in | LowM.CallClosure closure args k \u21d3 output | state_out }}\\n ```\\n To call a function, we first evaluate its body on the arguments and then the continuation `k`. If the result is some `output` and `state_out`, we can say that the whole expression evaluates to `output` and `state_out`.\\n- `Let`\\n The `Let` primitive is the monadic bind. It allows to sequentially compose the execution of two expressions. 
We first evaluate the first expression, then the second one with the result of the first one.\\n ```coq\\n | Let\\n (e : M.t)\\n (k : Result.t -> M.t)\\n (output_inter : Result.t)\\n (state_in state_inter : State) :\\n {{ p, state_in | e \u21d3 output_inter | state_inter }} ->\\n {{ p, state_inter | k output_inter \u21d3 output | state_out }} ->\\n {{ p, state_in | LowM.Let e k \u21d3 output | state_out }}\\n ```\\n\\n## \ud83d\udc18 Memory handling\\n\\nIn Noir, you can make a new variable mutable with the keyword `let mut`:\\n\\n```rust\\nlet mut result: [u8; InputElements] = [0; InputElements];\\n```\\n\\nThen you can assign a new value to this variable or its content with the `=` operator:\\n\\n```rust\\nresult[i] = Base64Decoder.get(input_byte as Field);\\n```\\n\\nThere is basic pointer manipulation with the `&` operator to get a reference to a variable and the `*` operator to dereference a pointer. You can even pass a mutable reference to a function to modify the value of a variable. There is no deallocation of memory, which entirely removes the need for a garbage collector or deallocation strategy. This is because Noir programs are supposed to be very short-lived.\\n\\nTo handle all expressions in a uniform way, we consider that each Noir expression is an address to its content. For most (intermediate) values, which are not mutable, the address is the value itself. For mutable values, we use a fresh address for each `let mut` assignment.\\n\\n:::info Thanks\\n\\nAs [GitHub Copilot](https://github.com/features/copilot) correctly suggested to me, this is similar to the approach we have taken for Rust in `coq-of-rust`. Thanks for following what we are doing! \ud83d\ude4f\\n\\n:::\\n\\nTo simplify the proofs, we let the user provide a memory model of their choice. The only constraint is to provide memory operations for `read`, `write`, and `alloc`, and to make sure that these operations are consistent. 
Once it is done, here are the rules for the memory handling of mutable references:\\n\\n- `StateAlloc`\\n ```coq\\n | CallPrimitiveStateAlloc\\n (value : Value.t)\\n (address : Address)\\n (k : Value.t -> M.t)\\n (state_in state_in\' : State) :\\n let pointer := Pointer.Mutable (Pointer.Mutable.Make address []) in\\n State.read address state_in = None ->\\n State.alloc_write address state_in value = Some state_in\' ->\\n {{ p, state_in\' | k (Value.Pointer pointer) \u21d3 output | state_out }} ->\\n {{ p, state_in | LowM.CallPrimitive (Primitive.StateAlloc value) k \u21d3 output | state_out }}\\n ```\\n- `StateRead`\\n ```coq\\n | CallPrimitiveStateRead\\n (address : Address)\\n (value : Value.t)\\n (k : Value.t -> M.t)\\n (state_in : State) :\\n State.read address state_in = Some value ->\\n {{ p, state_in | k value \u21d3 output | state_out }} ->\\n {{ p, state_in | LowM.CallPrimitive (Primitive.StateRead address) k \u21d3 output | state_out }}\\n ```\\n- `StateWrite`\\n ```coq\\n | CallPrimitiveStateWrite\\n (value : Value.t)\\n (address : Address)\\n (k : unit -> M.t)\\n (state_in state_in\' : State) :\\n State.alloc_write address state_in value = Some state_in\' ->\\n {{ p, state_in\' | k tt \u21d3 output | state_out }} ->\\n {{ p, state_in |\\n LowM.CallPrimitive (Primitive.StateWrite address value) k \u21d3 output\\n | state_out }}\\n ```\\n\\nWhen using these rules to show that a certain Noir program evaluates to an expression, one has to make the right choice for the address used to allocate the value. This choice is arbitrary but can make the proof more or less complex later. The read and write operations are deterministic.\\n\\n## \u27b0 Loops\\n\\nThere is only one kind of loop in Noir, bounded `for` loops:\\n\\n```rust\\nfor i in 0..InputElements {\\n let input_byte = input[i];\\n result[i] = Base64Decoder.get(input_byte as Field);\\n}\\n```\\n\\nThe index `i` evolves in between statically known bounds. 
As such, these loops always terminate, which is a requirement for formal verification to proceed! As a result, we do not need to introduce a dedicated monadic primitive for the loops and can define them with a recursive function:\\n\\n```coq\\nFixpoint for_nat (end_ : Z) (fuel : nat) (body : Z -> M.t) {struct fuel} : M.t :=\\n match fuel with\\n | O => pure (Value.Tuple [])\\n | S fuel\' =>\\n let* _ := body (end_ - Z.of_nat fuel) in\\n for_nat end_ fuel\' body\\n end.\\n\\nDefinition for_Z (start end_ : Z) (body : Z -> M.t) : M.t :=\\n for_nat end_ (Z.to_nat (end_ - start)) body.\\n```\\n\\nNote that we do not handle `break` or `continue` yet but propagate assert failures with `let*`. We _prove_ the following reasoning rule for loops:\\n\\n```coq\\nLemma For {State Address : Set} `{State.Trait State Address}\\n (p : Z) (state_in : State)\\n (integer_kind : IntegerKind.t) (start : Z) (len : nat) (body : Value.t -> M.t)\\n {Accumulator : Set}\\n (inject : State -> Accumulator -> State)\\n (accumulator_in : Accumulator)\\n (body_expression : Z -> MS! 
Accumulator unit)\\n (H_body : forall (accumulator_in : Accumulator) (i : Z),\\n let output_accumulator_out := body_expression i accumulator_in in\\n {{ p, inject state_in accumulator_in |\\n body (M.alloc (Value.Integer integer_kind i)) \u21d3\\n Panic.to_result (fst output_accumulator_out)\\n | inject state_in (snd output_accumulator_out) }}\\n ) :\\n let output_accumulator_out :=\\n foldS!\\n tt\\n (List.map (fun offset => start + Z.of_nat offset) (List.seq 0 len))\\n (fun (_ : unit) => body_expression)\\n accumulator_in in\\n {{ p, inject state_in accumulator_in |\\n M.for_\\n (Value.Integer integer_kind start)\\n (Value.Integer integer_kind (start + Z.of_nat len))\\n body \u21d3\\n Panic.to_result (fst output_accumulator_out)\\n | inject state_in (snd output_accumulator_out) }}.\\n```\\n\\nIt is a little bit involved but basically says that if the body of the loop evaluates to an expression for each possible iteration, then the whole loop evaluates to the recursive function `foldS!` using the modified memory as an accumulator.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have shown how we define the semantic rules for the Noir language in Coq, for the general monadic primitives, memory, and loops.\\n\\nIn the next blog post, we will apply these reasoning principles to give a semantics to the `base64` library of Noir.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! 
Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/11/14/sui-move-checker-abstract-stack","metadata":{"permalink":"/blog/2024/11/14/sui-move-checker-abstract-stack","source":"@site/blog/2024-11-14-sui-move-checker-abstract-stack.md","title":"\ud83e\udd80 Example of verification for the Move\'s checker of Sui","description":"We are continuing our formal verification work for the implementation of the type-checker of the Move language in the \ud83d\udca7 Sui blockchain. We verify a manual translation in the proof system \ud83d\udc13 Coq of the \ud83e\udd80 Rust code of the Move checker as available on GitHub.","date":"2024-11-14T00:00:00.000Z","formattedDate":"November 14, 2024","tags":[{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Move","permalink":"/blog/tags/move"},{"label":"Sui","permalink":"/blog/tags/sui"},{"label":"type-checker","permalink":"/blog/tags/type-checker"}],"readingTime":7.74,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Example of verification for the Move\'s checker of Sui","tags":["Rust","Move","Sui","type-checker"],"authors":[]},"unlisted":false,"prevItem":{"title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 2","permalink":"/blog/2024/11/15/tool-for-noir-2"},"nextItem":{"title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 1","permalink":"/blog/2024/11/01/tool-for-noir-1"}},"content":"We are continuing our formal verification work for the implementation of the type-checker of the [Move](https://sui.io/move) language in the [\ud83d\udca7 Sui](https://sui.io/) blockchain. 
We verify a manual translation in the proof system [\ud83d\udc13 Coq](https://coq.inria.fr/) of the [\ud83e\udd80 Rust](https://www.rust-lang.org/) code of the Move checker as available on [GitHub](https://github.com/move-language/move-sui/tree/main/crates/move-bytecode-verifier).\\n\\nIn this blog post, we present in detail the verification of a particular function `AbstractStack::pop_eq_n` that manipulates \ud83d\udcda stacks of types to show that it is equivalent to its naive implementation.\\n\\nAll the code presented here is on our GitHub at [github.com/formal-land/coq-of-rust](https://github.com/formal-land/coq-of-rust) \ud83e\uddd1\u200d\ud83c\udfeb.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. It can be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually **impossible to happen again**.\\n\\n:::\\n\\n
\\n ![Water in forest](2024-11-14/water-in-forest.webp)\\n
\\n\\n## \ud83d\udd75\ufe0f The code to verify\\n\\nHere is the definition in Rust of an `AbstractStack`, from the file [move-abstract-stack/src/lib.rs](https://github.com/move-language/move-sui/blob/main/crates/move-abstract-stack/src/lib.rs):\\n\\n```rust\\n/// An abstract value that compresses runs of the same value to reduce space usage\\npub struct AbstractStack<T> {\\n values: Vec<(u64, T)>,\\n len: u64,\\n}\\n```\\n\\nIt says that a stack of elements of type `T` is a vector of pairs of a number and a value. The number is the number of times the value is repeated in the stack. The field `len` is the total number of elements in the stack. This representation is more efficient than a naive stack, in case the stack contains many repeated values.\\n\\nHere is one of the primitives to remove elements from this stack:\\n\\n```rust\\n/// Pops n values off the stack, erroring if there are not enough items or if the n items are\\n/// not equal\\npub fn pop_eq_n(&mut self, n: NonZeroU64) -> Result<T, AbsStackError> {\\n let n: u64 = n.get();\\n if self.is_empty() || n > self.len {\\n return Err(AbsStackError::Underflow);\\n }\\n let (count, last) = self.values.last_mut().unwrap();\\n debug_assert!(*count > 0);\\n let ret = match (*count).cmp(&n) {\\n Ordering::Less => return Err(AbsStackError::ElementNotEqual),\\n Ordering::Equal => {\\n let (_, last) = self.values.pop().unwrap();\\n last\\n }\\n Ordering::Greater => {\\n *count -= n;\\n last.clone()\\n }\\n };\\n self.len -= n;\\n Ok(ret)\\n}\\n```\\n\\nThis function removes `n` elements from the stack, returning the value of the removed elements. 
It returns an error if there are not enough elements in the stack or if the `n` last items are not grouped as equal elements.\\n\\nOur goal is to **show that this function is equal to the naive pop function with repetition** on flattened stacks.\\n\\n## \u2696\ufe0f Specification\\n\\nHere is the property we want to verify in the formal language Coq:\\n\\n```coq\\nLemma flatten_pop_eq_n {A : Set} `{Eq.Trait A} (n : Z) (stack : AbstractStack.t A)\\n (H_n : n > 0) :\\n match AbstractStack.pop_eq_n n stack with\\n | Panic.Value (Result.Ok item, stack\') =>\\n flatten stack = List.repeat item (Z.to_nat n) ++ flatten stack\'\\n | _ => True\\n end.\\n```\\n\\nIt says that for any possible `stack` and `n` greater than 0, if we remove `n` elements from the stack and when the execution succeeds, the flattened stack is equal to the repetition of the removed element `n` times followed by the flattened stack.\\n\\nHow did we get from the Rust code above to the expression of this property? We manually converted the Rust code above in Coq with the following definitions:\\n\\n```coq\\nModule AbstractStack.\\n Record t (A : Set) : Set := {\\n values : list (Z * A);\\n len : Z;\\n }.\\n```\\n\\nfor the `AbstractStack` type, and:\\n\\n```coq\\nDefinition pop_eq_n {A : Set} (n : Z) : MS! (t A) (Result.t A AbsStackError.t) :=\\n fun (self : t A) =>\\n if (is_empty self || (n >? len self))%bool then\\n return! (Result.Err AbsStackError.Underflow, self)\\n else\\n let! (count, last) := Option.unwrap (List.hd_error self.(values)) in\\n if count panic! \\"unreachable\\"\\n | (_, last) :: values => return! ((count - n, last) :: values)\\n end in\\n let self := {|\\n values := values;\\n len := self.(len) - n\\n |} in\\n return! (Result.Ok last, self).\\n```\\n\\nfor the `pop_eq_n` function. Note that this definition uses a lot of user-defined notations, such as `let!`, that we made in order to simplify the expression of effects in Coq. 
You can read more about these notations on our previous blog post [\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 2](/blog/2024/10/14/verification-move-sui-type-checker-2). We checked by testing that our translation above behaves as the original Rust code, as explained in our blog post [\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 3](/blog/2024/10/15/verification-move-sui-type-checker-3). It is not necessary to understand the translation in detail, as its verification will flow naturally.\\n\\nWe define the `flatten` function to translate a stack with repetitions to a flat stack as:\\n\\n```coq\\nDefinition flatten {A : Set} (abstract_stack : AbstractStack.t A) : list A :=\\n List.flat_map (fun \'(n, v) => List.repeat v (Z.to_nat n)) abstract_stack.(AbstractStack.values).\\n```\\n\\nIt duplicates all the elements `n` times with `List.repeat v (Z.to_nat n)` and concatenates them with `List.flat_map`.\\n\\n## \ud83e\udd13 Proof\\n\\nTo show that the specification above is correct for any stacks, we cannot test it as it will only cover a finite amount of cases. We must write a Coq proof showing by mathematical reasoning that the code is always correct.\\n\\nHere is our full proof:\\n\\n```coq\\nProof.\\n destruct stack as [stack].\\n unfold AbstractStack.pop_eq_n, flatten.\\n (* if (is_empty self || (n >? 
len self))%bool then *)\\n destruct (_ || _); simpl; [reflexivity|].\\n unfold List.hd_error.\\n (* Option.unwrap (List.hd_error self.(values)) *)\\n destruct stack as [|[count last] stack]; simpl; [reflexivity|].\\n (* if count = 0\\nHeqb: (count List.repeat v (Z.to_nat n0)) stack =\\nList.repeat last (Z.to_nat n) ++ List.flat_map (fun \'(n0, v) => List.repeat v (Z.to_nat n0)) stack\\n\\n--------------------------------------\\n\\n2/2\\nList.repeat last (Z.to_nat count) ++ List.flat_map (fun \'(n0, v) => List.repeat v (Z.to_nat n0)) stack =\\nList.repeat last (Z.to_nat n) ++\\nList.repeat last (Z.to_nat (count - n)) ++ List.flat_map (fun \'(n0, v) => List.repeat v (Z.to_nat n0)) stack\\n```\\n\\nThis is how we can progress in the proof and know which command to type. We see two sub-goals `(1/2)` and `(2/2)` for each branch explored by the last `destruct`. In both cases, we need to show an equality:\\n\\n1. The first one is solved by the fact that `count = n` in this branch.\\n2. The second one is solved by the fact that `count > n` in this branch, so that we can group the `List.repeat last (Z.to_nat n)` with `List.repeat last (Z.to_nat (count - n))` (repeating a \\"negative\\" number of times is the empty list so we need to make sure that `count - n` is not negative).\\n\\n## \u2712\ufe0f Conclusion\\n\\nIn this example, we have seen how to verify that the `pop_eq_n` function of the `AbstractStack` type in the Move checker of Sui is equivalent to the naive pop function with repetition on flattened stacks. As this is a formal proof, we are sure that this property holds for any possible stack and value of `n`.\\n\\nWe are continuing the work to verify the other functions of the project, with the final aim to verify the whole type-checker. 
We will keep you updated on our progress in the next blog posts \ud83d\ude80.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/11/01/tool-for-noir-1","metadata":{"permalink":"/blog/2024/11/01/tool-for-noir-1","source":"@site/blog/2024-11-01-tool-for-noir-1.md","title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 1","description":"In this series of blog posts, we present our development of a formal verification tool for the \u25fc\ufe0f Noir smart contract language. It is particularly suited to writing zero-knowledge applications, providing primitive constructs such as a Field type to write programs that run efficiently as circuits. Having a formal verification for Noir enables the development of applications holding a large amount of money in this language, as it ensures that the code is correct with a mathematical level of certainty.","date":"2024-11-01T00:00:00.000Z","formattedDate":"November 1, 2024","tags":[{"label":"Noir","permalink":"/blog/tags/noir"},{"label":"smart contract","permalink":"/blog/tags/smart-contract"},{"label":"circuits","permalink":"/blog/tags/circuits"}],"readingTime":11.94,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 1","tags":["Noir","smart contract","circuits"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Example of verification for the Move\'s checker of Sui","permalink":"/blog/2024/11/14/sui-move-checker-abstract-stack"},"nextItem":{"title":"\u2688 Verification of the Smoo.th library \u2013 2","permalink":"/blog/2024/10/28/verification-smooth-library-2"}},"content":"In this series of blog posts, we present our development of a formal verification tool for the [\u25fc\ufe0f Noir](https://noir-lang.org/) smart contract language. 
It is particularly suited to writing zero-knowledge applications, providing primitive constructs such as a `Field` type to write programs that run efficiently as circuits. Having a formal verification tool for Noir enables the development of applications holding a large amount of money in this language, as it ensures that the code is correct with a mathematical level of certainty.\\n\\nIn this first post, we present how we translate Noir code to the [\ud83d\udc13 Coq](https://coq.inria.fr/) proof system. We explore a translation after monomorphization and then at the HIR level. Note that we are interested in verifying programs _written in Noir_. The verification of the Noir compiler itself is a separate topic.\\n\\nAll our code is available as open source on [github.com/formal-land/coq-of-noir](https://github.com/formal-land/coq-of-noir), and you are welcome to use it. We also provide all-inclusive audit services to formally verify your smart contracts using `coq-of-noir`.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. It can be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually **impossible to happen again**.\\n\\n:::\\n\\n
\\n ![Noir](2024-11-01/noir.webp)\\n
\\n\\n## \u25fc\ufe0f Quick presentation of Noir\\n\\nNoir is designed as a small version of [\ud83e\udd80 Rust](https://www.rust-lang.org/) with many built-in constructs to make it more amenable to efficient compilation to zero-knowledge circuits. Because Noir is a smaller version of Rust, developing tooling for it is simpler, as the surface of the language is reduced. In addition, as it shares similarities with Rust, we can reuse our knowledge from [coq-of-rust](https://github.com/formal-land/coq-of-rust), a formal verification tool for Rust, to propose an equivalent tool for Noir.\\n\\nA notable difference between Rust and Noir is that Noir has a much simpler memory management model: nothing is ever deallocated! As a result, the various kinds of pointers that exist in Rust (`Rc`, `RefCell`, ...) are not present in Noir. Most of the data is immutable, and mutation is encouraged only on local variables.\\n\\nThe loops are restricted to `for` loops with bounds known at compile time, which simplifies the reasoning about them. For example, we are sure that all the loops terminate, which is required for the verification of the code.\\n\\nHere is an example of a Noir program that we will use in this series of blog posts. It showcases the use of mutable variables in a loop, as well as generic values such as `InputElements` that are known at compile time and specialized during the monomorphization phase to compile the code down to a circuit. 
It is part of the [noir_base64](https://github.com/noir-lang/noir_base64) library to encode an array of ASCII values into base64 values using finite field operations to stay efficient.\\n\\n```rust\\n/**\\n * @brief Take an array of ASCII values and convert into base64 values\\n **/\\npub fn base64_encode_elements<let InputElements: u32>(\\n input: [u8; InputElements]\\n) -> [u8; InputElements] {\\n let mut Base64Encoder = Base64EncodeBE::new();\\n let mut result: [u8; InputElements] = [0; InputElements];\\n\\n for i in 0..InputElements {\\n result[i] = Base64Encoder.get(input[i] as Field);\\n }\\n\\n result\\n}\\n```\\n\\n## 1\ufe0f\u20e3 Monomorphization\\n\\nIn this phase of compilation, all generic types and values are instantiated with their concrete values, as well as trait instances. The resulting code is much simpler as it only contains functions and types. If we translate the code to an untyped representation in Coq, we can even consider that the monomorphized code only contains functions. Thus, for convenience, we started doing our translation from the monomorphized level.\\n\\nThe [abstract syntax tree](https://en.wikipedia.org/wiki/Abstract_syntax_tree) for this level is in the Rust file [compiler/noirc_frontend/src/monomorphization/ast.rs](https://github.com/formal-land/coq-of-noir/blob/master/compiler/noirc_frontend/src/monomorphization/ast.rs) from the Noir compiler. As an example, here is how the expressions are represented:\\n\\n```rust\\npub enum Expression {\\n Ident(Ident),\\n Literal(Literal),\\n Block(Vec<Expression>),\\n Unary(Unary),\\n Binary(Binary),\\n Index(Index),\\n Cast(Cast),\\n For(For),\\n If(If),\\n Tuple(Vec<Expression>),\\n ExtractTupleField(Box<Expression>, usize),\\n Call(Call),\\n Let(Let),\\n Constrain(Box<Expression>, Location, Option<Box<(Expression, HirType)>>),\\n Assign(Assign),\\n Semi(Box<Expression>),\\n Break,\\n Continue,\\n}\\n```\\n\\nIf you look at the various constructors of this enum, they correspond to the language\'s primitives presented in the reference manual of Noir. Expressions (`Ident`, `Binary`, `Call`, ...) 
and statements (`If`, `Let`, `Break`, ...) are mixed together. If we look at the definition of `Ident`:\\n\\n```rust\\npub struct Ident {\\n pub location: Option<Location>,\\n pub definition: Definition,\\n pub mutable: bool,\\n pub name: String,\\n pub typ: Type,\\n}\\n```\\n\\nand then at the definition of `Definition`:\\n\\n```rust\\npub enum Definition {\\n Local(LocalId),\\n Function(FuncId),\\n Builtin(String),\\n LowLevel(String),\\n // used as a foreign/externally defined unconstrained function\\n Oracle(String),\\n}\\n```\\n\\nwe see that most of the names have an associated _id_ that is a unique number. This is because in the monomorphization phase, we duplicate a lot of the definitions (once for each instantiation of a generic type), so we have to give them a unique id to distinguish them.\\n\\n### Translation\\n\\nWe translate the monomorphized code to Coq in two steps:\\n\\n1. Extracting it to JSON thanks to the `serde` serialization library in Rust.\\n2. Pretty-printing the resulting JSON to a Coq file with a Python script.\\n\\nWe find this development process to be rather efficient as the Python language is quite flexible and allows us to manipulate the JSON data easily. Compared to the work of a full compiler, which can be rather expensive computationally, what we do is mostly a translation from one syntax to another, and Python is a good fit.\\n\\nOur Noir example is monomorphized to the following code, which can be displayed with the development option `--show-monomorphized` of `nargo`:\\n\\n```rust\\nfn base64_encode_elements$f4(input$l26: [u8; 36]) -> [u8; 36] {\\n let Base64Encoder$27 = new$f6();\\n let result$28 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0];\\n for i$29 in 0 .. 
36 {\\n result$l28[i$l29] = get$f7(Base64Encoder$l27, (input$l26[i$l29] as Field))\\n };\\n result$l28\\n}\\n```\\n\\nWe see that the generic variable `InputElements` is replaced by the constant value `36` as this is the value we use in the example we translate. All the identifiers have an additional `$...` suffix to make them unique. Thanks to the serialization library `serde`, we automatically get the JSON representation of this code that starts with:\\n\\n```json\\n{\\n \\"id\\": 4,\\n \\"name\\": \\"base64_encode_elements\\",\\n \\"parameters\\": [\\n [\\n 49,\\n false,\\n \\"input\\",\\n {\\n \\"Array\\": [\\n 118,\\n {\\n \\"Integer\\": [\\n \\"Unsigned\\",\\n \\"Eight\\"\\n ]\\n }\\n ]\\n }\\n ]\\n ],\\n \\"body\\": {\\n \\"Block\\": [\\n {\\n \\"Let\\": {\\n \\"id\\": 27,\\n \\"mutable\\": true,\\n \\"name\\": \\"Base64Encoder\\",\\n \\"expression\\": {\\n \\"Call\\": {\\n \\"func\\": {\\n \\"Ident\\": {\\n \\"location\\": {\\n \\"span\\": {\\n \\"start\\": 5312,\\n \\"end\\": 5315\\n },\\n \\"file\\": 70\\n },\\n \\"definition\\": {\\n \\"Function\\": 6\\n },\\n \\"mutable\\": false,\\n \\"name\\": \\"new\\",\\n \\"typ\\": {\\n \\"Function\\": [\\n [],\\n {\\n \\"Tuple\\": [\\n {\\n \\"Array\\": [\\n 64,\\n {\\n \\"Integer\\": [\\n \\"Unsigned\\",\\n \\"Eight\\"\\n ]\\n }\\n ]\\n }\\n ]\\n },\\n // much more JSON code\\n```\\n\\nThis is extremely verbose, and there is some information that we do not need, such as the locations of some of the items in the source. The advantage of JSON is that it is easy to parse and handle in most programming languages. 
In our case, here is an extract of the Python script that translates this JSON to Coq:\\n\\n```python\\n\'\'\'\\npub enum Expression {\\n Ident(Ident),\\n Literal(Literal),\\n Block(Vec<Expression>),\\n Unary(Unary),\\n Binary(Binary),\\n Index(Index),\\n Cast(Cast),\\n For(For),\\n If(If),\\n Tuple(Vec<Expression>),\\n ExtractTupleField(Box<Expression>, usize),\\n Call(Call),\\n Let(Let),\\n Constrain(Box<Expression>, Location, Option<Box<(Expression, HirType)>>),\\n Assign(Assign),\\n Semi(Box<Expression>),\\n Break,\\n Continue,\\n}\\n\'\'\'\\ndef expression_to_coq(node) -> str:\\n node_type: str = list(node.keys())[0]\\n\\n if node_type == \\"Ident\\":\\n node = node[\\"Ident\\"]\\n return ident_to_coq(node)\\n\\n if node_type == \\"Literal\\":\\n node = node[\\"Literal\\"]\\n return alloc(literal_to_coq(node))\\n\\n if node_type == \\"Block\\":\\n node = node[\\"Block\\"]\\n return \\\\\\n \\"\\\\n\\".join(\\n expression_inside_block_to_coq(expression, index == len(node) - 1)\\n for index, expression in enumerate(node)\\n )\\n\\n if node_type == \\"Unary\\":\\n node = node[\\"Unary\\"]\\n return unary_to_coq(node)\\n\\n if node_type == \\"Binary\\":\\n node = node[\\"Binary\\"]\\n return binary_to_coq(node)\\n\\n # more cases...\\n```\\n\\nFor each kind of node in the AST, we write the original Rust type in comments, then let GitHub Copilot write the Python code and refine it. 
Here is the final Coq code that we get for this example:\\n\\n```coq\\nDefinition base64_encode_elements\u2084 (\u03b1 : list Value.t) : M.t :=\\n match \u03b1 with\\n | [input] =>\\n let input := M.alloc input in\\n let* result :=\\n let~ Base64Encoder := [[ M.copy_mutable (|\\n M.alloc (M.call_closure (|\\n M.read (| M.get_function (| \\"new\\", 6 |) |),\\n []\\n |))\\n |) ]] in\\n let~ result := [[ M.copy_mutable (|\\n M.alloc (Value.Array [\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer 
IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |)\\n ])\\n |) ]] in\\n do~ [[\\n M.for_ (|\\n M.read (| M.alloc (Value.Integer IntegerKind.U32 0) |),\\n M.read (| M.alloc (Value.Integer IntegerKind.U32 36) |),\\n fun (i : Value.t) =>\\n [[\\n M.alloc (M.assign (|\\n M.read (| M.alloc (M.index (|\\n M.read (| M.alloc (result) |),\\n M.read (| i |)\\n |)) |),\\n M.read (| M.alloc (M.call_closure (|\\n M.read (| M.get_function (| \\"get\\", 7 |) |),\\n [\\n M.read (| Base64Encoder |);\\n M.read (| M.alloc (M.cast (|\\n M.read (| M.alloc (M.index (|\\n M.read (| input |),\\n M.read (| i |)\\n |)) |),\\n IntegerKind.Field\\n |)) |)\\n ]\\n |)) |)\\n |))\\n ]]\\n |)\\n ]] in\\n [[\\n result\\n ]] in\\n M.read result\\n | _ => M.impossible \\"wrong number of arguments\\"re\\n end.\\n```\\n\\nIf you attentively compare this Coq code to the original Noir version, you will see that the two are similar, although the Coq version is much more verbose with all the explicit memory allocations and reads. You might be wondering why we are choosing this specific representation. How did we know we had to use `M.for_`, for example, to represent the loops?\\n\\n### Semantics\\n\\nThis is where the semantics comes in. In the semantics phase, we define the meaning of each construct of the language in Coq. 
We reused our experience in building [coq-of-rust](https://github.com/formal-land/coq-of-rust) and [coq-of-solidity](https://github.com/formal-land/coq-of-solidity), where we also had to define the semantics of imperative languages in Coq.\\n\\nWe remove all the type information to avoid the differences between the type system of Coq and the type system of Noir. All the values have the same type `Value.t`:\\n\\n```coq\\nModule Value.\\n Inductive t : Set :=\\n | Bool (b : bool)\\n | Integer (kind : IntegerKind.t) (integer : Z)\\n | String (s : string)\\n | FmtStr : string -> Z -> t -> t\\n | Pointer (pointer : Pointer.t t)\\n | Array (values : list t)\\n | Slice (values : list t)\\n | Tuple (values : list t)\\n | Closure : {\'(Value, M) : (Set * Set) @ list Value -> M} -> t.\\nEnd Value.\\n```\\n\\nWe have a monad `M.t` to represent the side-effects of Noir in Coq (memory mutation, non-termination for recursive calls, ...). We define this monad as the composition of two monads:\\n\\n- A free monad `LowM.t` that contains all the effects we cannot directly represent in Coq.\\n- An error monad `Result.t` to represent special control-flow operations, such as `break` and `continue`, which have to interrupt the execution of the current loop prematurely, and a panic value in case of assert failure, which must propagate up to the main function.\\n\\nThe definition of these types is as follows:\\n\\n- The free monad:\\n ```coq\\n Module LowM.\\n Inductive t (A : Set) : Set :=\\n | Pure (value : A)\\n | CallPrimitive {B : Set} (primitive : Primitive.t B) (k : B -> t A)\\n | CallClosure (closure : Value.t) (args : list Value.t) (k : A -> t A)\\n | Let (e : t A) (k : A -> t A)\\n | Loop (body : t A) (k : A -> t A)\\n | Impossible (message : string).\\n End LowM.\\n ```\\n- The error monad:\\n ```coq\\n Module Result.\\n Inductive t : Set :=\\n | Ok (value : Value.t)\\n | Break\\n | Continue\\n | Panic {A : Set} (payload : A).\\n End Result.\\n ```\\n- The composition of 
the two monads:\\n ```coq\\n Module M.\\n Definition t : Set :=\\n LowM.t Result.t.\\n End M.\\n ```\\n\\nNote that since our type of values is always `Value.t`, we do not parameterize the monad `M.t` by the type of values.\\n\\n## \u2712\ufe0f Conclusion\\n\\nThanks to all the work above, we obtain a translation for a large subset of the Noir language to the Coq proof system, which type-checks and has a semantics. A difficulty with handling the code we produce from monomorphization is the unique identifier added after each name to make them unique. These identifiers are generated in a rather non-deterministic way that can depend on the machine that runs the compiler. In addition, they change every time we make changes to the source code.\\n\\nIn the next blog post, we will see how we prevent the identifiers from appearing in the generated code by working at a higher level than the monomorphization phase.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/10/28/verification-smooth-library-2","metadata":{"permalink":"/blog/2024/10/28/verification-smooth-library-2","source":"@site/blog/2024-10-28-verification-smooth-library-2.md","title":"\u2688 Verification of the Smoo.th library \u2013 2","description":"In this blog post, we detail the continuation of our work to formally verify the \u2688 Smoo.th library, which is an optimized implementation of elliptic curve operations in Solidity. 
We use our tool coq-of-solidity, representing any Solidity code in the generic proof assistant \ud83d\udc13 Coq, to verify the code for any execution path.","date":"2024-10-28T00:00:00.000Z","formattedDate":"October 28, 2024","tags":[{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Yul","permalink":"/blog/tags/yul"},{"label":"elliptic curves","permalink":"/blog/tags/elliptic-curves"}],"readingTime":6.86,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\u2688 Verification of the Smoo.th library \u2013 2","tags":["Solidity","Yul","elliptic curves"],"authors":[]},"unlisted":false,"prevItem":{"title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 1","permalink":"/blog/2024/11/01/tool-for-noir-1"},"nextItem":{"title":"\ud83c\udf32 What we bring you","permalink":"/blog/2024/10/22/what-we-bring-to-you"}},"content":"In this blog post, we detail the continuation of our work to formally verify the [\u2688 Smoo.th](https://smoo.th/) library, which is an optimized implementation of elliptic curve operations in Solidity. We use our tool [coq-of-solidity](https://github.com/formal-land/coq-of-solidity), representing any Solidity code in the generic proof assistant [\ud83d\udc13 Coq](https://coq.inria.fr/), to verify the code for any execution path.\\n\\nIn particular, we cover the changes we made to use unoptimized Yul code and how we made a functional representation of the loop to compute the most significant bit of the scalars.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. 
It can be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually **impossible to happen again**.\\n\\n:::\\n\\n
\\n ![Smooth in forest](2024-10-28/forest-smooth.webp)\\n
\\n\\n## \ud83d\udc0c Unoptimized Yul\\n\\nWe are now verifying the code based on the unoptimized [Yul](https://docs.soliditylang.org/en/latest/yul.html) output of the Solidity compiler instead of the optimized one. As a consequence the code is a little bit more verbose, although in our present case the difference is limited as we are verifying a code that is already hand-optimized. The main advantage is that the variables are preserved instead of being moved to locations in the memory, which makes the verification easier, especially when handling loop invariants. A downside is that we now have to trust the correctness of the Solidity compiler\'s optimization passes.\\n\\nAs an example, here is how we now translate in Coq the loop to compute the most significant bit of the scalars with the unoptimized Yul code:\\n\\n```coq\\nlet~ var_ZZZ_83 := [[ 0 ]] in\\nlet_state~ \'(var_ZZZ_83, var_mask_63) :=\\n (* for loop *)\\n Shallow.for_\\n (* init state *)\\n (var_ZZZ_83, var_mask_63)\\n (* condition *)\\n (fun \'(var_ZZZ_83, var_mask_63) => [[\\n iszero ~(| var_ZZZ_83 |)\\n ]])\\n (* body *)\\n (fun \'(var_ZZZ_83, var_mask_63) =>\\n Shallow.lift_state_update\\n (fun var_ZZZ_83 => (var_ZZZ_83, var_mask_63))\\n (let~ var_ZZZ_83 := [[ add ~(| add ~(| sub ~(| 1, iszero ~(| and ~(| var_scalar_u_55, var_mask_63 |) |) |), shl ~(| 1, sub ~(| 1, iszero ~(| and ~(| shr ~(| 128, var_scalar_u_55 |), var_mask_63 |) |) |) |) |), add ~(| shl ~(| 2, sub ~(| 1, iszero ~(| and ~(| var_scalar_v_57, var_mask_63 |) |) |) |), shl ~(| 3, sub ~(| 1, iszero ~(| and ~(| shr ~(| 128, var_scalar_v_57 |), var_mask_63 |) |) |) |) |) |) ]] in\\n M.pure (BlockUnit.Tt, var_ZZZ_83)))\\n (* post *)\\n (fun \'(var_ZZZ_83, var_mask_63) =>\\n Shallow.lift_state_update\\n (fun var_mask_63 => (var_ZZZ_83, var_mask_63))\\n (let~ var_mask_63 := [[ shr ~(| 1, var_mask_63 |) ]] in\\n M.pure (BlockUnit.Tt, var_mask_63)))\\n```\\n\\nAs a reference, here is the original smart contract code, in hand-written 
Yul:\\n\\n```go\\nZZZ := 0\\nfor {} iszero(ZZZ) { mask := shr(1, mask) } {\\n ZZZ := add(\\n add(\\n sub(1, iszero(and(scalar_u, mask))),\\n shl(1, sub(1, iszero(and(shr(128, scalar_u), mask))))\\n ),\\n add(\\n shl(2, sub(1, iszero(and(scalar_v, mask)))),\\n shl(3, sub(1, iszero(and(shr(128, scalar_v), mask))))\\n )\\n )\\n}\\n```\\n\\nWe recognize the variables `var_ZZZ_83` and `var_mask_63`, corresponding to `ZZZ` and `mask` in the original code. They are made explicit in a state monad with the state `(var_ZZZ_83, var_mask_63)` for the loop.\\n\\nSome constructs were not yet handled by `coq-of-solidity`, as they appear in the unoptimized code but not in the optimized one. An example is the initialization part of the `for` loop, which seems to always be moved away in the optimized code. We added those missing cases to our tool to be able to translate the unoptimized Yul code of Smoo.th.\\n\\n## \ud83c\udf97\ufe0f Verification of the loop\\n\\nVerifying the `for` loop above can be challenging. Automated verification tools for Solidity typically do not fully handle loops, and instead unroll them three or four times to check the first iterations, which can miss some bugs.\\n\\nThe first step is to prove the loop is equivalent to a recursive function, as this will simplify reasoning. 
Here is a recursive function that computes the most significant bit of the scalars `u` and `v`:\\n\\n```coq\\nFixpoint get\\n (u_low u_high v_low v_high : U128.t) (over_index : nat) :\\n PointsSelector.t * nat :=\\n match over_index with\\n | O =>\\n (* We should never reach this case if the scalars\\n are not all zero *)\\n (PointsSelector.Build_t false false false false, O)\\n | S index =>\\n let selector := HighLow.get_selector\\n u_low u_high v_low v_high (Z.of_nat index) in\\n if PointsSelector.is_zero selector then\\n let new_over_index := index in\\n get u_low u_high v_low v_high new_over_index\\n else\\n let next_over_index := index in\\n (selector, next_over_index)\\n end.\\n```\\n\\nHere are some notable changes compared to the original `for` loop:\\n\\n- We decompose the scalars `u` and `v` of 256 bits into their high and low parts, `u_low`, `u_high`, `v_low`, and `v_high` of 128 bits each.\\n- We make explicit the scalars that we select with the `PointsSelector` type, which is a record with four boolean fields. In the original code, the `ZZZ` variable is used to group these four booleans into a single integer.\\n- We use a natural number `over_index` to represent the mask. We decrement it at each iteration until it reaches zero, proving by construction the termination of the function. The relation with the mask is:\\n\\n$$\\n\\\\text{mask} = \\\\lfloor 2^{\\\\text{over\\\\_index} - 1} \\\\rfloor\\n$$\\n\\nNote that this means that when the `over_index` is zero, then the `mask` is zero. This corresponds to the last case of the loop. 
We use the variable name `over_index` so that if we define:\\n\\n$$\\n\\\\text{over\\\\_index} = \\\\text{index} + 1\\n$$\\n\\nthen the relation with the mask is:\\n\\n$$\\n\\\\text{mask} = 2^{\\\\text{index}}\\n$$\\n\\nfor all cases except the last one.\\n\\n## \ud83d\udca1 Reasoning rule\\n\\nHere is the reasoning rule for the smart contract loops in Coq:\\n\\n```coq\\nLemma LoopStep codes environment {In Out : Set}\\n (init : In)\\n (body : In -> LowM.t Out)\\n (break_with : Out -> In + Out)\\n (k : Out -> LowM.t Out)\\n (output output_inter : Out)\\n state state_inter state\'\\n (H_body :\\n {{? codes, environment, state |\\n body init \u21d3 output_inter\\n | state_inter ?}}\\n )\\n (H_break_with :\\n match break_with output_inter with\\n | inr output_inter\' =>\\n {{? codes, environment, state_inter |\\n k output_inter\' \u21d3 output\\n | state\' ?}}\\n | inl next_init =>\\n {{? codes, environment, state_inter |\\n LowM.Loop next_init body break_with k \u21d3 output\\n | state\' ?}}\\n end\\n ) :\\n {{? codes, environment, state |\\n LowM.Loop init body break_with k \u21d3 output\\n | state\' ?}}.\\n```\\n\\nThis rule, to be used in combination with some reasoning by induction, allows us to verify that a certain property is true for any number of iterations of the loop. In the present case, we use it to prove that the recursive function `get` is equivalent to the `for` loop. 
Basically, it states that:\\n\\n- Assuming that the `body` of the loop evaluates to some output `output_inter`,\\n- if the `break_with` helper, which wraps the end of the loop to either continue or break it, evaluates to `output`,\\n- then the whole loop evaluates to `output`.\\n\\nHere, the output of the body of the loop contains the state of the state monad, that is to say, the two variables `ZZZ` and `mask`, and a special variable to break or continue the `for` loop iterations.\\n\\nDue to a lack of time, we only made a sketch of the proof of evaluation of this loop, admitting some intermediate lemmas about identities over the selector function. This work is available in the file [coq/CoqOfSolidity/contracts/scl/mulmuladdX_fullgen_b4/run.v](https://github.com/formal-land/coq-of-solidity/blob/develop/coq/CoqOfSolidity/contracts/scl/mulmuladdX_fullgen_b4/run.v).\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have seen how to reason about loops with `coq-of-solidity`. This example with bit-level arithmetic was rather complex, but the general idea is still to reason by induction, showing the equivalence with a recursive function, using the reasoning rule `LoopStep` above to step through the loop.\\n\\nIf you have smart contracts that you need to secure, talk to us! \ud83e\udd1d The cost of an attack always far outweighs the cost of an audit, and our solution, with full formal verification, is the most extensive in terms of coverage.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! 
Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/10/22/what-we-bring-to-you","metadata":{"permalink":"/blog/2024/10/22/what-we-bring-to-you","source":"@site/blog/2024-10-22-what-we-bring-to-you.md","title":"\ud83c\udf32 What we bring you","description":"We bring you the highest possible level of security \ud83e\uddb8 for your blockchain applications by using formal verification \u2728 optimized by AI solutions to keep the cost down. We believe that for systems holding a lot of value \ud83d\udcb0, it is necessary to use the most advanced techniques \u269b\ufe0f to ensure their security; otherwise attackers with large means (like North Korea \ud83c\uddf0\ud83c\uddf5, but not only) will be able to steal or damage the system by using these techniques themselves.","date":"2024-10-22T00:00:00.000Z","formattedDate":"October 22, 2024","tags":[],"readingTime":3.42,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83c\udf32 What we bring you","tags":[],"authors":[]},"unlisted":false,"prevItem":{"title":"\u2688 Verification of the Smoo.th library \u2013 2","permalink":"/blog/2024/10/28/verification-smooth-library-2"},"nextItem":{"title":"\u2688 Verification of the Smoo.th library \u2013 1","permalink":"/blog/2024/10/21/verification-smooth-library-1"}},"content":"We bring you the **highest possible level of security \ud83e\uddb8** for your blockchain applications by using **formal verification \u2728** optimized by **AI solutions** to keep the cost down. 
We believe that for systems **holding a lot of value \ud83d\udcb0**, it is necessary to use the most advanced techniques \u269b\ufe0f to ensure their security; otherwise attackers with large means (like **North Korea \ud83c\uddf0\ud83c\uddf5**, but not only) will be able to **steal or damage** the system by using these techniques themselves.\\n\\nIn this blog post we present how we work with customers to integrate full formal verification in their workflow and ensure that their code is **secure** in the best possible way.\\n\\n\x3c!-- It is possible to have a system which is fully secured \ud83d\udcaf once you have a mathematical proof of its security that is itself verified by a computer. This is what we provide with **formal verification**. In some sense this is good to know there is an end to the quest of finding security vulnerabilities . The issue is that formal verification is only as good as the **scope** of the code we verify, and the quality of the **security predicates** that we use. --\x3e\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. It can be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually **impossible to happen again**.\\n\\n:::\\n\\n
\\n ![Network in forest](2024-10-22/network-in-forest.webp)\\n
\\n\\n## \ud83d\udee1\ufe0f Why Formal Verification Matters\\n\\nSecurity is central to the long term success of decentralized platforms. Traditional testing or security audits can catch many issues, but are not enough to guarantee the absence of bugs. Formal verification is a technique that **checks every possible input** of your program to ensure that it is always correct, for a given set of security properties. It works by mathematically reasoning about the code constructs and then checking this reasoning with a computer.\\n\\n## \ud83d\udd04 Our Process\\n\\nOur process is as follows:\\n\\n1. **Understanding Your Needs** We start by meeting with you to understand your system and your security requirements.\\n2. **Formal Modeling** We then create a formal model of your system in a proof assistant, using automated translation tools to make sure we make no mistakes.\\n3. **Proof Generation** We then generate mathematical proofs that your system satisfies the security properties you require, using the latest techniques in proof automation to reduce the cost.\\n4. 
**Seamless Integration** We help you integrate the proofs into your CI pipeline to ensure that every commit is automatically checked for correctness.\\n\\n## \ud83c\udf81 Benefits You Can Expect\\n\\n* **Enhanced Security** You improve the security of your system by showing that whole classes of bugs are impossible.\\n* **Cost Savings** You prevent costly security incidents and reduce the need for extensive manual audits.\\n* **Investor Confidence** You demonstrate that your system is secure and that you protect your users.\\n* **Regulatory Compliance** Finally, you show that you have taken all necessary steps to meet regulatory requirements.\\n\\n## \ud83c\udf10 Why Choose Us?\\n\\n* **Expert Team** Our team has years of experience in formal verification, cryptography, and AI, with publications in all of these domains.\\n* **Cutting-Edge Tool** We use and develop the latest tools in formal verification to ensure we can provide the best possible service cost-effectively.\\n* **Customized Solutions** We customize our solutions to your system. You made a new language for zk-circuits or smart contracts and want the technology to verify it? We can help you.\\n\\n## \ud83e\udd1d Get in Touch\\n\\nReady to take your application\'s security to the next level? Reach out to us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land), and let\'s build a secure future together! \ud83d\ude80\\n\\n:::success Stay Tuned\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more insights into formal verification, blockchain security, and how AI is changing the field. 
We share our case studies, tutorials, and the latest industry news to keep you ahead of the curve._\\n\\n:::"},{"id":"/2024/10/21/verification-smooth-library-1","metadata":{"permalink":"/blog/2024/10/21/verification-smooth-library-1","source":"@site/blog/2024-10-21-verification-smooth-library-1.md","title":"\u2688 Verification of the Smoo.th library \u2013 1","description":"In this blog post, we present the formal verification effort we started to show the absence of bugs in the \u2688 Smoo.th library, a library for optimized \u3030\ufe0f elliptic curve operations in Solidity. We are using our tool coq-of-solidity to make this non-trivial verification using the generic proof assistant \ud83d\udc13 Coq.","date":"2024-10-21T00:00:00.000Z","formattedDate":"October 21, 2024","tags":[{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Yul","permalink":"/blog/tags/yul"},{"label":"elliptic curves","permalink":"/blog/tags/elliptic-curves"}],"readingTime":10.45,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\u2688 Verification of the Smoo.th library \u2013 1","tags":["Solidity","Yul","elliptic curves"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83c\udf32 What we bring you","permalink":"/blog/2024/10/22/what-we-bring-to-you"},"nextItem":{"title":"\ud83e\ude81 Enhancements to coq-of-solidity \u2013 1","permalink":"/blog/2024/10/16/coq-of-solidity-enhanced-version-1"}},"content":"In this blog post, we present the formal verification effort we started to show the absence of bugs in the [\u2688 Smoo.th](https://smoo.th/) library, a library for optimized [\u3030\ufe0f elliptic curve](https://en.wikipedia.org/wiki/Elliptic_curve) operations in [Solidity](https://soliditylang.org/). 
We are using our tool [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) to make this non-trivial verification using the generic proof assistant [\ud83d\udc13 Coq](https://coq.inria.fr/).\\n\\nThe **Smoo.th** library is interesting as elliptic curves are at the core of many cryptographic protocols, including authentication protocols, and having a generic and fast implementation simplifies the development of [dApps](https://en.wikipedia.org/wiki/Decentralized_application) in environments with missing pre-compiles (like L1s) or missing circuits (like zero-knowledge layers).\\n\\nFrom a verification point of view, it is very challenging as it combines low-level operations (hand-optimized [Yul](https://docs.soliditylang.org/en/latest/yul.html) code with bit shifts, inlined functions, ...) with higher-level reasoning on elliptic curves and arithmetic \ud83d\udcaa.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. It can be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually **impossible to happen again**.\\n\\n:::\\n\\n
\\n ![Panda in forest](2024-10-21/panda-in-forest.webp)\\n
\\n\\n## \ud83d\uddfa\ufe0f Design of the library\\n\\nThe library is implemented in [SCL_mulmuladdX_fullgen_b4.sol](https://github.com/get-smooth/crypto-lib/blob/main/src/elliptic/SCL_mulmuladdX_fullgen_b4.sol) mostly in Yul. Given two points $G$ and $Q$ on an elliptic curve in the field $\\\\mathbb{F}_p$ and two scalars $u$ and $v$, it computes the following operation:\\n\\n$$\\nu \\\\cdot G + v \\\\cdot Q\\n$$\\n\\nwhere the points are represented as $(x, y)$ coordinates, the scalars are integers, and the curve is described in the short Weierstrass form.\\n\\nHere is a diagram to summarize the workflow of the library \ud83e\udd13:\\n\\n
\\n ![Smoo.th workflow](2024-10-21/smoo-th-diagram.svg)\\n
\\n\\nYou can find more details about the algorithms used in the library in the complete [audit report](https://github.com/get-smooth/crypto-lib/blob/main/doc/Audits/CRX_smooth_report_2024_07_11_v1.2.pdf) by [CryptoExperts](https://www.cryptoexperts.com/).\\n\\nOur goal is to show that all these steps are equivalent to the naive operation of adding the points $u \\\\cdot G$ and $v \\\\cdot Q$ on the elliptic curve, ignoring the higher gas consumption of the naive version, and that the library is therefore free of bugs. Note that there are a few exceptional points, for example, when $G$ is the opposite of $Q$, where the library does not work as it is and runs another algorithm instead. We need to make these points explicit in the proof and assume we are not in these special cases.\\n\\n## \ud83d\udc13 Translation to Coq\\n\\nIn order to formally verify that the code is correct for all possible inputs, we first need to translate it to a proof language, in our case Coq. We run our tool `coq-of-solidity` on the optimized Yul code as generated by the Solidity compiler, which further optimizes the already hand-optimized code of the library. 
All our verification work is available on GitHub in the folder [coq/CoqOfSolidity/contracts/scl/mulmuladdX_fullgen_b4](https://github.com/formal-land/coq-of-solidity/tree/develop/coq/CoqOfSolidity/contracts/scl/mulmuladdX_fullgen_b4) of the [coq-of-solidity\'s repository](https://github.com/formal-land/coq-of-solidity).\\n\\nHere is an example of hand-written Yul code from the contract, to compute the most-significant bit from the scalars:\\n\\n```go\\nZZZ := 0\\nfor {} iszero(ZZZ) { mask := shr(1, mask) } {\\n ZZZ := add(\\n add(\\n sub(1, iszero(and(scalar_u, mask))),\\n shl(1, sub(1, iszero(and(shr(128, scalar_u), mask))))\\n ),\\n add(\\n shl(2, sub(1, iszero(and(scalar_v, mask)))),\\n shl(3, sub(1, iszero(and(shr(128, scalar_v), mask))))\\n )\\n )\\n}\\n```\\n\\nThe Yul code after optimization by the Solidity compiler is:\\n\\n```go\\nmstore(0xe0, 0)\\nfor { } iszero(mload(0xe0)) { mstore(0x01a0, shr(1, mload(0x01a0))) } {\\n mstore(0xe0, add(\\n add(\\n sub(1, iszero(and(mload(0x0120), mload(0x01a0)))),\\n shl(1, sub(1, iszero(and(shr(128, mload(0x0120)), mload(0x01a0)))))\\n ),\\n add(\\n shl(2, sub(1, iszero(and(mload(0x0160), mload(0x01a0))))),\\n shl(3, sub(1, iszero(and(shr(128, mload(0x0160)), mload(0x01a0)))))\\n )\\n ))\\n}\\n```\\n\\nAs we can see, the variable names were replaced by fixed memory addresses, which will make the verification more complex. 
The Coq code that we generate with `coq-of-solidity` is:\\n\\n```coq\\ndo~ [[ mstore ~(| 0xe0, 0 |) ]] in\\nlet_state~ \'tt :=\\n (* for loop *)\\n Shallow.for_\\n (* init state *)\\n tt\\n (* condition *)\\n (fun \'tt => [[\\n iszero ~(| mload ~(| 0xe0 |) |)\\n ]])\\n (* body *)\\n (fun \'tt =>\\n do~ [[\\n mstore ~(| 0xe0, add ~(|\\n add ~(|\\n sub ~(| 1, iszero ~(| and ~(| mload ~(| 0x0120 |), mload ~(| 0x01a0 |) |) |) |),\\n shl ~(| 1, sub ~(| 1, iszero ~(| and ~(| shr ~(| 128, mload ~(| 0x0120 |) |), mload ~(| 0x01a0 |) |) |) |) |)\\n |),\\n add ~(|\\n shl ~(| 2, sub ~(| 1, iszero ~(| and ~(| mload ~(| 0x0160 |), mload ~(| 0x01a0 |) |) |) |) |),\\n shl ~(| 3, sub ~(| 1, iszero ~(| and ~(| shr ~(| 128, mload ~(| 0x0160 |) |), mload ~(| 0x01a0 |) |) |) |) |)\\n |)\\n |) |)\\n ]] in\\n M.pure (BlockUnit.Tt, tt))\\n (* post *)\\n (fun \'tt =>\\n do~ [[ mstore ~(| 0x01a0, shr ~(| 1, mload ~(| 0x01a0 |) |) |) ]] in\\n M.pure (BlockUnit.Tt, tt))\\ndefault~ tt in\\n```\\n\\nWe use a monadic notation `f ~(| x1, ..., xn |)` to represent the side-effects of the EVM, such as memory reads and writes with `mload` and `mstore`. The function `Shallow.for_` represents a for loop with an initial state, a condition, a body, and a post-action. We implement it using a primitive from our monad to represent potentially non-terminating loops.\\n\\nHere the proper state of the loop is empty (value `tt`) and we instead modify the memory with `mstore`. Ideally we should have `(ZZZ, mask)` as the state of the loop to simplify the verification. For our next attempt at verifying this code, we will look at the Yul code generated before optimizations by the Solidity compiler in order to keep these variables.\\n\\n## \ud83d\udd2c What we verified\\n\\nWe are not done yet with the verification of this library. 
For now, we have verified that:\\n\\n- The addition operation `ecAddn2` is implemented as specified.\\n- The doubling and negation operation `ecDblNeg` is implemented as in the specification, in an inlined manner.\\n- The pre-computations of the sums of the possible combinations of points are correct.\\n- The retrieval of the pre-computed sums from the current bits of the scalars is correct.\\n\\nFor example, here is our statement for the execution of the `ecAddn2` operation:\\n\\n```coq\\nLemma run_usr\'dollar\'ecAddn2 codes environment state\\n (P1_X P1_Y P1_ZZ P1_ZZZ P2_X P2_Y : U256.t) (p : U256.t) :\\n let output :=\\n ecAddn2 p\\n {| PZZ.X := P1_X; PZZ.Y := P1_Y; PZZ.ZZ := P1_ZZ; PZZ.ZZZ := P1_ZZZ |}\\n {| PA.X := P2_X; PA.Y := P2_Y |} in\\n let output := Result.Ok (output.(PZZ.X), output.(PZZ.Y), output.(PZZ.ZZ), output.(PZZ.ZZZ)) in\\n {{? codes, environment, Some state |\\n Contract_91.Contract_91_deployed.usr\'dollar\'ecAddn2 P1_X P1_Y P1_ZZ P1_ZZZ P2_X P2_Y p \u21d3\\n output\\n | Some state ?}}.\\n```\\n\\nIt says that in a given environment (`codes`, `environment`, `state`), the execution of the translated function `Contract_91.Contract_91_deployed.usr\'dollar\'ecAddn2` gives the same result as a hand-written purely functional version `ecAddn2` operating on data types directly representing the curve points (`PZZ.t` and `PA.t`).\\n\\nWe verify this execution in a straightforward way by unfolding the definition and executing it step by step:\\n\\n```coq\\nProof.\\n simpl.\\n unfold Contract_91.Contract_91_deployed.usr\'dollar\'ecAddn2.\\n l. 
{\\n repeat (l; [repeat cu; p|]).\\n p.\\n }\\n p.\\nQed.\\n```\\n\\nFor the verification of the inlined`ecDblNeg` operation, here is the memory state just after computing the coordinates of the doubled point:\\n\\n```coq\\n[\\n mem0; mem1; Pure.add 0 2048; mem3; mem4;\\n Pure.addmod\\n (Pure.mulmod\\n (Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)\\n (Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p)\\n (Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)\\n (Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p) p)\\n (Pure.mulmod (Pure.sub p 2)\\n (Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p) p) p;\\n Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p;\\n Pure.mulmod\\n (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p)\\n P_127.(PZZ.ZZZ) p;\\n Pure.addmod\\n (Pure.mulmod\\n (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p)\\n P_127.(PZZ.Y) p)\\n (Pure.mulmod\\n (Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)\\n (Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p)\\n (Pure.addmod\\n (Pure.addmod\\n (Pure.mulmod\\n (Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)\\n (Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p)\\n (Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)\\n (Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p) p)\\n (Pure.mulmod (Pure.sub p 2)\\n (Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p) p) p)\\n (Pure.sub p (Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) 
p))\\n p) p) p;\\n HighLow.merge u_high u_low; 480; HighLow.merge v_high v_low; Pure.add 0 2048; 2 ^ 126;\\n Pure.mulmod (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) P_127.(PZZ.ZZ) p;\\n p; Q.(PA.Y); Q\'.(PA.X); Q\'.(PA.Y); p; a; G.(PA.X); G.(PA.Y); G\'.(PA.X); G\'.(PA.Y);\\n 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;\\n 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;\\n P0.(PZZ.X); P0.(PZZ.Y); P0.(PZZ.ZZ); P0.(PZZ.ZZZ);\\n P1.(PZZ.X); P1.(PZZ.Y); P1.(PZZ.ZZ); P1.(PZZ.ZZZ);\\n P2.(PZZ.X); P2.(PZZ.Y); P2.(PZZ.ZZ); P2.(PZZ.ZZZ);\\n P3.(PZZ.X); P3.(PZZ.Y); P3.(PZZ.ZZ); P3.(PZZ.ZZZ);\\n P4.(PZZ.X); P4.(PZZ.Y); P4.(PZZ.ZZ); P4.(PZZ.ZZZ);\\n P5.(PZZ.X); P5.(PZZ.Y); P5.(PZZ.ZZ); P5.(PZZ.ZZZ);\\n P6.(PZZ.X); P6.(PZZ.Y); P6.(PZZ.ZZ); P6.(PZZ.ZZZ);\\n P7.(PZZ.X); P7.(PZZ.Y); P7.(PZZ.ZZ); P7.(PZZ.ZZZ);\\n P8.(PZZ.X); P8.(PZZ.Y); P8.(PZZ.ZZ); P8.(PZZ.ZZZ);\\n P9.(PZZ.X); P9.(PZZ.Y); P9.(PZZ.ZZ); P9.(PZZ.ZZZ);\\n P10.(PZZ.X); P10.(PZZ.Y); P10.(PZZ.ZZ); P10.(PZZ.ZZZ);\\n P11.(PZZ.X); P11.(PZZ.Y); P11.(PZZ.ZZ); P11.(PZZ.ZZZ);\\n P12.(PZZ.X); P12.(PZZ.Y); P12.(PZZ.ZZ); P12.(PZZ.ZZZ);\\n P13.(PZZ.X); P13.(PZZ.Y); P13.(PZZ.ZZ); P13.(PZZ.ZZZ);\\n P14.(PZZ.X); P14.(PZZ.Y); P14.(PZZ.ZZ); P14.(PZZ.ZZZ);\\n P15.(PZZ.X); P15.(PZZ.Y); P15.(PZZ.ZZ); P15.(PZZ.ZZZ);\\n 0; p\\n]\\n```\\n\\nThe state is very large as we are verifying a large function (250 lines) directly mutating the memory. We recognize the parameters of the function (`Q`, `Q\'`, `G`, `G\'`) as well as the pre-computed points (`P0`, `P1`, `P2`, ..., `P16`). 
We also see the computation of the coordinates of the doubled point, stored at fixed memory addresses.\\n\\nWe define the `dbl_neg_P_127` point as:\\n\\n```coq\\nset (dbl_neg_P_127 := ecDblNeg a p P_127).\\n```\\n\\nWe then rewrite the memory locations of the doubled point with the coordinates of `dbl_neg_P_127`:\\n\\n```coq\\napply_memory_update_at P_127_X_address dbl_neg_P_127.(PZZ.X); [reflexivity|].\\napply_memory_update_at P_127_Y_address dbl_neg_P_127.(PZZ.Y); [reflexivity|].\\napply_memory_update_at P_127_ZZ_address dbl_neg_P_127.(PZZ.ZZ); [reflexivity|].\\napply_memory_update_at P_127_ZZZ_address dbl_neg_P_127.(PZZ.ZZZ); [reflexivity|].\\n```\\n\\ngiving us the new state:\\n\\n```coq\\n[\\n mem0; mem1; Pure.add 0 2048; mem3; mem4; dbl_neg_P_127.(PZZ.X);\\n Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p;\\n dbl_neg_P_127.(PZZ.ZZZ); dbl_neg_P_127.(PZZ.Y);\\n HighLow.merge u_high u_low; 480; HighLow.merge v_high v_low; Pure.add 0 2048; 2 ^ 126;\\n dbl_neg_P_127.(PZZ.ZZ);\\n p; Q.(PA.Y); Q\'.(PA.X); Q\'.(PA.Y); p; a; G.(PA.X); G.(PA.Y); G\'.(PA.X); G\'.(PA.Y);\\n 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;\\n 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;\\n P0.(PZZ.X); P0.(PZZ.Y); P0.(PZZ.ZZ); P0.(PZZ.ZZZ);\\n P1.(PZZ.X); P1.(PZZ.Y); P1.(PZZ.ZZ); P1.(PZZ.ZZZ);\\n P2.(PZZ.X); P2.(PZZ.Y); P2.(PZZ.ZZ); P2.(PZZ.ZZZ);\\n P3.(PZZ.X); P3.(PZZ.Y); P3.(PZZ.ZZ); P3.(PZZ.ZZZ);\\n P4.(PZZ.X); P4.(PZZ.Y); P4.(PZZ.ZZ); P4.(PZZ.ZZZ);\\n P5.(PZZ.X); P5.(PZZ.Y); P5.(PZZ.ZZ); P5.(PZZ.ZZZ);\\n P6.(PZZ.X); P6.(PZZ.Y); P6.(PZZ.ZZ); P6.(PZZ.ZZZ);\\n P7.(PZZ.X); P7.(PZZ.Y); P7.(PZZ.ZZ); P7.(PZZ.ZZZ);\\n P8.(PZZ.X); P8.(PZZ.Y); P8.(PZZ.ZZ); P8.(PZZ.ZZZ);\\n P9.(PZZ.X); P9.(PZZ.Y); P9.(PZZ.ZZ); P9.(PZZ.ZZZ);\\n P10.(PZZ.X); P10.(PZZ.Y); P10.(PZZ.ZZ); P10.(PZZ.ZZZ);\\n P11.(PZZ.X); P11.(PZZ.Y); P11.(PZZ.ZZ); P11.(PZZ.ZZZ);\\n P12.(PZZ.X); P12.(PZZ.Y); P12.(PZZ.ZZ); P12.(PZZ.ZZZ);\\n P13.(PZZ.X); 
P13.(PZZ.Y); P13.(PZZ.ZZ); P13.(PZZ.ZZZ);\\n P14.(PZZ.X); P14.(PZZ.Y); P14.(PZZ.ZZ); P14.(PZZ.ZZZ);\\n P15.(PZZ.X); P15.(PZZ.Y); P15.(PZZ.ZZ); P15.(PZZ.ZZZ);\\n 0; p\\n]\\n```\\n\\nStill large but much cleaner!\\n\\n## \ud83d\udc40 What remains to be done\\n\\nThere are two main parts that remain to be done in order to have a full formal verification of the library:\\n\\n1. We need to complete the proof stating that the execution of the smart contract is equivalent to the execution of a purely functional version written in Coq, especially using recursive functions instead of `for` loops. Reasoning on the loops is complex; in the current version, we unroll the loops once in order to have a first step towards the full proof. As the memory used by the main function is quite large, we will first need to change the code we verify by looking at the Yul code generated before optimizations by the Solidity compiler.\\n2. We need to show that the purely functional version of the library is equivalent to the plain addition and scalar multiplication. We have only started this work. The main challenge is to show that we can remove the loop by doing the bitwise addition. This will require some bit-arithmetic reasoning, as well as field arithmetic for the operations modulo the prime number $p$.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have seen how the **Smoo.th** library works at a high level, how we can start verifying it, and what challenges we face. This is also an interesting example to improve our tool `coq-of-solidity` and develop reasoning primitives for cryptographic code. We will continue this work in the coming weeks to verify more parts of this library.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land), or comment on this post below! 
Feel free to DM us for any formal verification services you need._\\n\\n:::"},{"id":"/2024/10/16/coq-of-solidity-enhanced-version-1","metadata":{"permalink":"/blog/2024/10/16/coq-of-solidity-enhanced-version-1","source":"@site/blog/2024-10-16-coq-of-solidity-enhanced-version-1.md","title":"\ud83e\ude81 Enhancements to coq-of-solidity \u2013 1","description":"We present improvements we made to our tool coq-of-solidity to formally verify Solidity smart contracts for any advanced properties, relying on the proof assistant \ud83d\udc13 Coq. The idea is to be able to prove the full absence of bugs \u2728 in very complex contracts, like L1 verifiers for zero-knowledge L2s \ud83d\udd75\ufe0f, or contracts with very large amounts of money \ud83d\udcb0 (in the billions).","date":"2024-10-16T00:00:00.000Z","formattedDate":"October 16, 2024","tags":[{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"monad","permalink":"/blog/tags/monad"},{"label":"effects","permalink":"/blog/tags/effects"},{"label":"Yul","permalink":"/blog/tags/yul"},{"label":"loops","permalink":"/blog/tags/loops"},{"label":"mutations","permalink":"/blog/tags/mutations"}],"readingTime":8.82,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\ude81 Enhancements to coq-of-solidity \u2013 1","tags":["Solidity","monad","effects","Yul","loops","mutations"],"authors":[]},"unlisted":false,"prevItem":{"title":"\u2688 Verification of the Smoo.th library \u2013 1","permalink":"/blog/2024/10/21/verification-smooth-library-1"},"nextItem":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 3","permalink":"/blog/2024/10/15/verification-move-sui-type-checker-3"}},"content":"We present improvements we made to our tool [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) to formally verify [Solidity](https://soliditylang.org/) smart contracts for any advanced properties, relying on the proof assistant [\ud83d\udc13 Coq](https://coq.inria.fr/). 
The idea is to be able to prove the **full absence of bugs \u2728** in **very complex contracts**, like L1 verifiers for **zero-knowledge L2s \ud83d\udd75\ufe0f**, or contracts with **very large amounts of money \ud83d\udcb0** (in the billions).\\n\\nIn this blog post, we present how we developed an effect inference mechanism to translate optimized [Yul](https://docs.soliditylang.org/en/latest/yul.html) code combining variable mutations and control flow with loops and nested premature returns (`break`, `continue`, and `leave`) to a clean \ud83e\uddfc purely functional representation in the proof system Coq.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::info\\n\\nWe will be talking about this work at the [Encode London Conference](https://lu.ma/encode-london-24) on Friday, October 25, 2024 \ud83d\udce2.\\n\\n:::\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds. It can be integrated into your CI pipeline to make sure that every commit is correct without running a full audit again.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually impossible to happen again.\\n\\n:::\\n\\n
\\n ![Frozen Solidity rock](2024-10-16/frozen-solidity.webp)\\n
\\n\\n## \ud83e\udde8 The issue\\n\\nYul is the intermediate language of the Solidity compiler that we translate to the Coq proof system to formally verify properties of smart contracts. The issue is that it has slightly different behaviors from the Coq language. In particular, it allows for variable mutations and imperative loops (`for` loops) with premature exits that have no native equivalents in purely functional languages like the ones used for formal verification.\\n\\nHere is a short example of Yul code that is impossible to translate to Coq as it is:\\n\\n```go\\nfunction rugby() -> x {\\n let i := 0\\n x := 0\\n for { } lt(i, 10) { i := add(i, 1) } {\\n x := add(x, i)\\n if eq(i, 5) {\\n leave\\n }\\n }\\n}\\n```\\n\\nIt uses the variable `x` to store the sum of the increasing sequence of integers `1`, `2`, `3`, ... but prematurely stops the loop when `i` reaches `5` and returns the final value of `x`.\\n\\nTo represent this code in a purely functional language, we need to:\\n\\n- Make explicit the fact that we operate on a local state, that is to say, the pair of variables `i` and `x`.\\n- Represent the control flow of the loop, which repeats its body until the condition `eq(i, 5)` is satisfied and then bubbles up to the body of the function to return the final result `x`.\\n\\n## Why is it important?\\n\\nHaving a purely functional representation of the Yul code is important as verifying functional programs is easier than verifying imperative ones, especially in the case of a system like Coq that is based on functional programming even at the logical level.\\n\\nIdeally, such a translation should be done automatically so that we are not at risk of making mistakes and can focus our time on the verification work. This would allow us to formally verify properties of smart contracts or similar imperative programs more efficiently. 
Note that in Yul, in addition to mutations on variables, there are also mutations on the contract\'s memory and storage, which we do not cover here.\\n\\n## The solution\\n\\nOur solution is a tool that does an effect inference on the Yul code to determine which variables might be mutated at each point of the program, and then propagates the results in two cases: when the execution continues to the next instruction and when it bubbles up.\\n\\n### \ud83c\udfd7\ufe0f The tool\\n\\nWe wrote our tool in \ud83d\udc0d Python, for ease of development, parsing the Yul code from the JSON output of the Solidity compiler and outputting a Coq file that represents the functional version of the code. Yul is a rather pleasant language, optimized for formal verification and with very few constructs. Our code is available on our GitHub repository [github.com/formal-land/coq-of-solidity](https://github.com/formal-land/coq-of-solidity), in a pull request that is about to be merged.\\n\\nHere is the header of our main Python function, which translates Yul statements to Coq:\\n\\n```python\\ndef statement_to_coq(node) -> tuple[Callable[[set[str]], str], set[str], set[str]]:\\n```\\n\\nIt takes a JSON `node` corresponding to a statement (assignment, `if`, `for`, `leave`, ...) and returns a triple with:\\n\\n1. A function that takes the yet-to-be-determined mutated variables in the surrounding block and returns the Coq code of the statement.\\n2. The set of newly declared variables.\\n3. 
The set of mutated variables.\\n\\nFrom this information we can infer the variables that are mutated at each point of the program and propagate them.\\n\\n### \ud83d\udd0d Example\\n\\nAs an example, here is the generated Coq translation of our \ud83c\udfc9 `rugby` example above:\\n\\n```coq showLineNumbers\\nDefinition rugby : M.t U256.t :=\\n let~ \'(_, result) :=\\n let~ i := [[ 0 ]] in\\n let~ x := [[ 0 ]] in\\n let_state~ \'(i, x) :=\\n (* for loop *)\\n Shallow.for_\\n (* init state *)\\n (i, x)\\n (* condition *)\\n (fun \'(i, x) => [[\\n lt ~(| i, 10 |)\\n ]])\\n (* body *)\\n (fun \'(i, x) =>\\n Shallow.lift_state_update\\n (fun x => (i, x))\\n (let~ x := [[ add ~(| x, i |) ]] in\\n let_state~ \'tt := [[\\n Shallow.if_ (|\\n eq ~(| i, 5 |),\\n M.pure (BlockUnit.Leave, tt),\\n tt\\n |)\\n ]] default~ x in\\n M.pure (BlockUnit.Tt, x)))\\n (* post *)\\n (fun \'(i, x) =>\\n Shallow.lift_state_update\\n (fun i => (i, x))\\n (let~ i := [[ add ~(| i, 1 |) ]] in\\n M.pure (BlockUnit.Tt, i)))\\n default~ x in\\n M.pure (BlockUnit.Tt, x)\\n in\\n M.pure result.\\n```\\n\\nOn lines `3` and `4` we see that we use normal `let` declarations for the variables `i` and `x`:\\n\\n```coq\\nlet~ i := [[ 0 ]] in\\nlet~ x := [[ 0 ]] in\\n```\\n\\nThe notation `let~` is a monadic notation to represent the side-effects of the EVM (storage updates, contract calls, ...) but the variables `i` and `x` are plain Coq variables, which will facilitate the formal verification process later.\\n\\nIn line `5`, we see that we consider the `for` loop to have a two-variable state `(i, x)`:\\n\\n```coq\\nlet_state~ \'(i, x) :=\\n (* for loop *)\\n Shallow.for_\\n (* init state *)\\n (i, x)\\n```\\n\\nThe condition depends on the whole state, even if it only uses a part of it:\\n\\n```coq\\n(* condition *)\\n(fun \'(i, x) => [[\\n lt ~(| i, 10 |)\\n]])\\n```\\nThe body is more interesting. 
We only modify the variable `x` but we need to read and return the whole state `(i, x)`, so we start with a lift operation:\\n\\n```coq\\n(* body *)\\n(fun \'(i, x) =>\\n Shallow.lift_state_update\\n (fun x => (i, x))\\n```\\n\\nThen we update the variable `x` with a standard variable declaration as if the variable was immutable:\\n\\n```coq\\n(let~ x := [[ add ~(| x, i |) ]] in\\n```\\n\\nThe updated value of the variable `x` is propagated at the end of the body:\\n\\n```coq\\nM.pure (BlockUnit.Tt, x)))\\n```\\n\\nThis is how we translate the inner `if`:\\n\\n```coq\\nlet_state~ \'tt := [[\\n Shallow.if_ (|\\n eq ~(| i, 5 |),\\n M.pure (BlockUnit.Leave, tt),\\n tt\\n |)\\n]] default~ x in\\n```\\n\\nIf the condition is satisfied, we return the special value `BlockUnit.Leave` that will be interpreted as a premature exit of the function and activate the bubble-up mechanism. The associated state is the special empty value `tt` as there are no mutations in the `if` statement. We use `default~ x` at the next line to say that we complete the `tt` state with the value `x` if we are bubbling up.\\n\\nThe binding of the expression of `default~` is done after the `let_state~` to be able to retrieve parts of the state that might have been modified, if needed. 
This is, for example, the case for the `for` loop where we say that we first get the values of the two variables `i` and `x`:\\n\\n```coq\\nlet_state~ \'(i, x) :=\\n (* for loop *)\\n```\\n\\nand then propagate only the state `x` in case of a premature exit:\\n\\n```coq\\ndefault~ x in\\n```\\n\\nat the line `33`.\\n\\n### \ud83d\udd2e Monad\\n\\nThe [monad](https://en.wikipedia.org/wiki/Monad_(functional_programming)) we use to represent the bubble-up mechanism is the following:\\n\\n```coq\\nModule Shallow.\\n Definition t (State : Set) : Set :=\\n M.t (BlockUnit.t * State).\\n```\\n\\nwhere:\\n\\n- `M.t` is the monad representing the side-effects of the EVM,\\n- `BlockUnit.t` is a type representing the different modes of the bubble-up mechanism: no bubble-up, or a bubble-up with a `break`, `continue`, or `leave` instruction,\\n- `State` is the type of the current state that we might be writing to.\\n\\nWe define the notation `let_state~ ... default~ ... in` with:\\n\\n```coq\\nNotation \\"\'let_state~\' pattern \':=\' e \'default~\' state \'in\' k\\" :=\\n (let_state e (fun pattern => (state, k)))\\n```\\n\\nand the function:\\n\\n```coq\\nDefinition let_state {State1 State2 : Set}\\n (expression : t State1) (body : State1 -> State2 * t State2) :\\n t State2 :=\\n M.strong_let_ expression (fun value =>\\n let \'(mode, state1) := value in\\n match mode with\\n (* no bubble-up, do not use the default state *)\\n | BlockUnit.Tt => snd (body state1)\\n (* bubble-up, use the default state and keep the same bubble-up mode *)\\n | _ => M.pure (mode, fst (body state1))\\n end).\\n```\\n\\nYou can also look at the definitions of the `Shallow.if_` and `Shallow.for_` functions in our code. For loops, we use a non-termination effect of the underlying monad `M.t`. 
This is because loops may not terminate, and Coq only accepts terminating functions.\\n\\n## Application\\n\\nWe are using the new translation above to formally verify the implementation of hand-optimized Yul code using loops and mutations to implement cryptographic operations in an efficient way. We believe that this translation would work as well for any other example of Yul code, enabling the formal verification of arbitrary Solidity or Yul code in a more functional way.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have shown how we can automatically translate arbitrary Yul code into a purely functional form \ud83c\udf1f, excluding mutations of the memory and the storage, in order to simplify further formal verification operations \ud83d\ude42.\\n\\nOne task left is to prove that this transformation is correct, by showing it equivalent to our initial and simpler Yul semantics where variables are represented as string keys in a map. We believe this is possible by generating a proof on a case-by-case basis for each transformed program, working by unification and exploring all the branches. But this remains to be done.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land), or comment on this post below! Feel free to DM us for any formal verification services you need._\\n\\n:::"},{"id":"/2024/10/15/verification-move-sui-type-checker-3","metadata":{"permalink":"/blog/2024/10/15/verification-move-sui-type-checker-3","source":"@site/blog/2024-10-15-verification-move-sui-type-checker-3.md","title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 3","description":"In the previous blog post, we have seen how we represent side-effects from the Rust code of the Sui\'s Move type-checker of bytecode in Coq. This translation represents about 3,200 lines of Coq code excluding comments. 
We need to trust that this translation is faithful to the original Rust code, as we generate it by hand or with GitHub Copilot.","date":"2024-10-15T00:00:00.000Z","formattedDate":"October 15, 2024","tags":[{"label":"monad","permalink":"/blog/tags/monad"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Sui","permalink":"/blog/tags/sui"}],"readingTime":5.795,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 3","tags":["monad","Rust","Sui"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\ude81 Enhancements to coq-of-solidity \u2013 1","permalink":"/blog/2024/10/16/coq-of-solidity-enhanced-version-1"},"nextItem":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 2","permalink":"/blog/2024/10/14/verification-move-sui-type-checker-2"}},"content":"In the [previous blog post](/blog/2024/10/14/verification-move-sui-type-checker-2), we have seen how we represent side-effects from the Rust code of the [Sui](https://sui.io/)\'s [Move](https://sui.io/move) type-checker of bytecode in Coq. This translation represents about 3,200 lines of Coq code excluding comments. We need to trust that this translation is faithful to the original Rust code, as we generate it by hand or with GitHub Copilot.\\n\\nIn this blog post, we present how we test this translation to ensure it is correct by running the type-checker on each opcode of the Move bytecode and comparing the results with the Rust code, testing the success and error cases.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds. 
It can be integrated into your CI pipeline to make sure that every commit is correct without running a full audit again.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually impossible to happen again.\\n\\n:::\\n\\n
\\n ![Forge in forest](2024-10-15/rock-with-mirror.webp)\\n
\\n\\n## The type-checker\\n\\nThe type-checker of Move Sui is a large piece of Rust code with a core function `verify_instr` in [move-bytecode-verifier/src/type_safety.rs](https://github.com/formal-land/move-sui/blob/main/crates/move-bytecode-verifier/src/type_safety.rs) that type-checks each individual instruction in Move bytecode. There are exactly `77` different opcodes. To give you an example, here is how it type-checks the opcode `Add`:\\n\\n```rust\\nlet operand1 = safe_unwrap_err!(verifier.stack.pop());\\nlet operand2 = safe_unwrap_err!(verifier.stack.pop());\\nif operand1.is_integer() && operand1 == operand2 {\\n verifier.push(meter, operand1)?;\\n} else {\\n return Err(verifier.error(StatusCode::INTEGER_OP_TYPE_MISMATCH_ERROR, offset));\\n}\\n```\\n\\nThe Move virtual machine is stack-based. The type-checker maintains a stack of types, corresponding to the types of the values that should be on the stack at the current point of the execution. For the `Add` operation it pops the last two types from the stack, checks that they are integers and equal, and pushes the result type on the stack. The result of an addition is of the same type as the operands. In case of an error, it returns the status code `INTEGER_OP_TYPE_MISMATCH_ERROR`.\\n\\nWe translate this code to Coq in the following way:\\n\\n```coq\\nletS! operand1 :=\\n liftS! TypeSafetyChecker.lens_self_stack AbstractStack.pop in\\nletS! operand1 := return!toS! $ safe_unwrap_err operand1 in\\nletS! operand2 :=\\n liftS! TypeSafetyChecker.lens_self_stack AbstractStack.pop in\\nletS! operand2 := return!toS! $ safe_unwrap_err operand2 in\\nif andb\\n (SignatureToken.is_integer operand1)\\n (SignatureToken.t_beq operand1 operand2)\\nthen\\n TypeSafetyChecker.Impl_TypeSafetyChecker.push operand1\\nelse\\n returnS! 
$ Result.Err $ TypeSafetyChecker.Impl_TypeSafetyChecker.error\\n verifier StatusCode.INTEGER_OP_TYPE_MISMATCH_ERROR offset\\n```\\n\\n## Tests\\n\\nThe two code extracts above seem very similar, but how to make sure that they are indeed the same, and that we made no typos or misunderstanding in the 3,200 lines of translation?\\n\\nTo answer that question, we choose to write unit tests on the Rust side covering all the execution paths (success and error, all the opcodes) and to run the same tests on the Coq side after a manual/AI assisted translation of these tests. We will compare the results of the tests to ensure that the Coq code behaves exactly like the Rust code.\\n\\nThe tests on the Rust side are in the file [move-bytecode-verifier/src/type_safety_tests/mod.rs](https://github.com/formal-land/move-sui/blob/main/crates/move-bytecode-verifier/src/type_safety_tests/mod.rs), which is a 3,000-line file with 176 tests. For example, for the addition we have:\\n\\n```rust\\n#[test]\\nfn test_arithmetic_correct_types() {\\n for instr in vec![\\n Bytecode::Add,\\n Bytecode::Sub,\\n Bytecode::Mul,\\n Bytecode::Mod,\\n Bytecode::Div,\\n Bytecode::BitOr,\\n Bytecode::BitAnd,\\n Bytecode::Xor,\\n ] {\\n for push_ty_instr in vec![\\n Bytecode::LdU8(42),\\n Bytecode::LdU16(257),\\n Bytecode::LdU32(89),\\n Bytecode::LdU64(94),\\n Bytecode::LdU128(Box::new(9999)),\\n Bytecode::LdU256(Box::new(U256::from(745_u32))),\\n ] {\\n let code = vec![push_ty_instr.clone(), push_ty_instr.clone(), instr.clone()];\\n let module = make_module(code);\\n let fun_context = get_fun_context(&module);\\n let result = type_safety::verify(&module, &fun_context, &mut DummyMeter);\\n assert!(result.is_ok());\\n }\\n }\\n}\\n```\\n\\nThere are four other tests covering the error cases (missing arguments, wrong types, ...).\\n\\nOne of the difficulties in these tests, apart from their size, is that we need to initialize the `module` variable with the proper content to be able to type-check some of the 
instructions. We defined some helpers for that, such as:\\n\\n```rust\\nfn add_simple_struct_with_abilities(module: &mut CompiledModule, abilities: AbilitySet) {\\n let struct_def = StructDefinition {\\n struct_handle: StructHandleIndex(0),\\n field_information: StructFieldInformation::Declared(vec![FieldDefinition {\\n name: IdentifierIndex(5),\\n signature: TypeSignature(SignatureToken::U32),\\n }]),\\n };\\n\\n let struct_handle = StructHandle {\\n module: ModuleHandleIndex(0),\\n name: IdentifierIndex(0),\\n abilities: abilities,\\n type_parameters: vec![],\\n };\\n\\n module.struct_defs.push(struct_def);\\n module.struct_handles.push(struct_handle);\\n}\\n```\\n\\nthat is used in `26` tests involving struct data structures.\\n\\n## Translation of the tests\\n\\nWe translated the tests using the same approach as for the type-checker, with the same monadic representation of effects. For example, we represent in Coq the arithmetic test above as:\\n\\n```coq\\nDefinition test_arithmetic_correct_types\\n (instr push_ty_instr : Bytecode.t) :\\n M!? PartialVMError.t unit :=\\n let code := [push_ty_instr; push_ty_instr; instr] in\\n let module := make_module code in\\n let! fun_context := get_fun_context module in\\n verify module fun_context.\\n\\nGoal List.Forall\\n (fun instr =>\\n List.Forall\\n (fun push_ty_instr =>\\n test_arithmetic_correct_types instr push_ty_instr = return!? tt\\n )\\n [\\n Bytecode.LdU8 42;\\n Bytecode.LdU16 257;\\n Bytecode.LdU32 89;\\n Bytecode.LdU64 94;\\n Bytecode.LdU128 9999;\\n Bytecode.LdU256 745\\n ]\\n )\\n [\\n Bytecode.Add;\\n Bytecode.Sub;\\n Bytecode.Mul;\\n Bytecode.Mod;\\n Bytecode.Div;\\n Bytecode.BitOr;\\n Bytecode.BitAnd;\\n Bytecode.Xor\\n ].\\nProof.\\n repeat constructor.\\nQed.\\n```\\n\\nWe convert the test that iterates assertions to an anonymous proof goal that uses the `List.Forall` predicate to verify a series of equalities. 
The `List.Forall` predicate is defined as \\"the following property is valid for all elements of the list\\".\\n\\nFortunately for us, GitHub Copilot was extremely efficient in the translation of these tests with a success rate of about 95% (we did not make a precise measurement). The end result is in [move_sui/simulations/move_bytecode_verifier/type_safety_tests/mod.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/move_sui/simulations/move_bytecode_verifier/type_safety_tests/mod.v), which contains more than 6,000 lines of Coq code excluding comments.\\n\\n## Detected issues\\n\\nAbout 20% of our translated Coq tests failed \ud83d\udca5, which we actually consider a very good result \ud83d\udcaa as the translated Coq code of the type-checker had never been run before. Apart from one misunderstanding of the Rust code, all the issues were due to typos in the translation. We had about a dozen of them, such as a missing negation in a condition, some of them generating multiple test failures. It took about one day to fix all of them by changing our Coq translation of the type-checker accordingly. Now all the tests pass \ud83c\udf89!\\n\\nA few errors were also due to incorrectly translated tests, typically with a missing line. We did a manual review, but we do not know for sure if there are tests with a mistake that by chance compensates for an error in the translation of the type-checker. We have not seen any such case yet.\\n\\n## Conclusion\\n\\nWe now have an idiomatic \ud83d\udc13 Coq translation of the type-checker of the Move bytecode in Rust. In addition, we test the result of this translation for every opcode and error case.\\n\\nNow that we are confident enough in the translation, we can start the specification and formal verification of the type-checker. 
This will involve reasoning on both the type-checker and the bytecode interpreter, showing that:\\n\\n- \u2705 The interpreter preserves the well-typedness of the code as it steps through the opcodes.\\n- \u2705 When a program is accepted by the type checker, the interpreter will not fail at runtime with a type error.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land), or comment on this post below! Feel free to DM us for any services you need._\\n\\n:::"},{"id":"/2024/10/14/verification-move-sui-type-checker-2","metadata":{"permalink":"/blog/2024/10/14/verification-move-sui-type-checker-2","source":"@site/blog/2024-10-14-verification-move-sui-type-checker-2.md","title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 2","description":"We are working on formally verifying the \ud83e\udd80 Rust implementation of the Move type-checker for bytecode in the proof system \ud83d\udc13 Coq. 
You can find the code of this type-checker in the crate move-bytecode-verifier.","date":"2024-10-14T00:00:00.000Z","formattedDate":"October 14, 2024","tags":[{"label":"monad","permalink":"/blog/tags/monad"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Sui","permalink":"/blog/tags/sui"}],"readingTime":9.045,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 2","tags":["monad","Rust","Sui"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 3","permalink":"/blog/2024/10/15/verification-move-sui-type-checker-3"},"nextItem":{"title":"\ud83c\udf32 What we do at Formal Land","permalink":"/blog/2024/10/13/class-what-we-do"}},"content":"We are working on formally verifying the [\ud83e\udd80 Rust](https://www.rust-lang.org/) implementation of the [Move](https://sui.io/move) type-checker for bytecode in the proof system [\ud83d\udc13 Coq](https://coq.inria.fr/). You can find the code of this type-checker in the crate [move-bytecode-verifier](https://github.com/move-language/move-sui/tree/main/crates/move-bytecode-verifier).\\n\\nThis requires translating all the Rust code in idiomatic Coq on which we will write our specifications and proofs. We write this translation by hand relying as much as possible on generative AI tools such as [GitHub Copilot](https://github.com/features/copilot), as there are many particular cases. We plan, eventually, to prove it equivalent to the translation automatically generated by [coq-of-rust](https://github.com/formal-land/coq-of-rust).\\n\\nIn this blog post we present how we organize our \ud83d\udd2e monad to represent the side-effects used in this Rust code. 
We believe this organization should work for other Rust projects as well.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds. It can be integrated into your CI pipeline to make sure that every commit is correct without running a full audit again.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually impossible to happen again.\\n\\n:::\\n\\n
\\n ![Forge in forest](2024-10-14/symbol-in-forest.webp)\\n
\\n\\n## Primitive effects\\n\\nIn functional programming, effects (or side-effects) are every operation that cannot be directly represented as a mathematical function, that is to say, a procedure that returns an output purely based on the value of its inputs and does nothing else. For example, the function returning the current time makes an effect as it depends on a hidden state (the current time) that is not passed as an argument. The function printing a message to the console makes an effect as it modifies the state of the console, in addition to returning a value that is generally either empty or a confirmation of the printing. Arithmetic operations (`+`, `*`, ...) are an example of pure functions.\\n\\nWe consider three primitive effects in our Rust code:\\n\\n- **Panic** For many reasons, a Rust program can panic, as a result of an out-of-bounds access to an array or a wrong [unwrap](https://doc.rust-lang.org/core/option/enum.Option.html#method.unwrap), for example. This is an effect as no outputs are returned in case of a panic.\\n- **Result** The `Result` type is used to represent the result of a computation that can fail. It is a sum type with two constructors: `Ok` for the successful result and `Err` for the error. The [Rust operator `?`](https://doc.rust-lang.org/rust-by-example/std/result/question_mark.html) is used to propagate errors in a function that returns a `Result`. This is another effect for us.\\n- **State** Finally, we consider the functions that mutate one of their arguments as effectful. 
The mutated parameter is generally typed as a mutable reference `&mut`.\\n\\nAll our Coq definitions to represent the effects are in the file [simulations/M.v](https://github.com/formal-land/coq-of-rust/blob/guillaume-claret%40fix-remaining-tests/CoqOfRust/simulations/M.v).\\n\\n### Panic\\n\\nWe define a monad `Panic.t` to represent the effect of a panic with:\\n\\n```coq\\nModule Panic.\\n Inductive t (A : Set) : Set :=\\n | Value : A -> t A\\n | Panic {Error : Set} : Error -> t A.\\n```\\n\\nNote that the type `Error` in this position is an existential type. This has a few consequences:\\n\\n- We do not need to annotate the type `Panic.t` with the type of the error.\\n- We can use any type for `Error` when we trigger a panic operation. This is useful for debugging, as we can add any payload to the panic message to help us understand what went wrong.\\n- We cannot compute on the panic payload. We do not consider this as a limitation, as panics should not be caught and handled in a Rust program, only propagated.\\n\\nWe define the monadic _return_ and _bind_ operations as usual:\\n\\n```coq\\nDefinition return_ {A : Set} (value : A) : t A := Value value.\\n\\nDefinition bind {A B : Set} (value : t A) (f : A -> t B) : t B :=\\n match value with\\n | Value value => f value\\n | Panic error => Panic error\\n end.\\n```\\n\\nWe introduce notations based on the exclamation mark `!` to make the code more readable:\\n\\n```coq\\nNotation \\"M!\\" := Panic.t.\\n\\nNotation \\"return!\\" := Panic.return_.\\n\\nNotation \\"\'let!\' x \':=\' X \'in\' Y\\" :=\\n (Panic.bind X (fun x => Y))\\n (at level 200, x pattern, X at level 100, Y at level 200).\\n```\\n\\n### Result\\n\\nWe define the monad `Result.t` to represent the propagation of errors with the `?` operator with:\\n\\n```coq\\nModule Result.\\n Inductive t (A Error : Set) : Set :=\\n | Ok : A -> t A Error\\n | Err : Error -> t A Error.\\n```\\n\\nThe difference with the `Panic.t` monad is that the error type is not 
existential anymore. This is because we want to be able to compute on the error payload, as some functions depend on the error value.\\n\\nWe define the _return_ and _bind_ operations as:\\n\\n```coq\\nDefinition return_ {A Error : Set} (value : A) : t A Error := Ok value.\\n\\nDefinition bind {Error A B : Set} (value : t A Error) (f : A -> t B Error) : t B Error :=\\n match value with\\n | Ok value => f value\\n | Err error => Err error\\n end.\\n```\\n\\nThe _bind_ corresponds to the question mark operator `?` in Rust. We also introduce notation to make the code more readable:\\n\\n```coq\\nNotation \\"M?\\" := (fun A Error => Result.t Error A).\\n\\nNotation \\"return?\\" := Result.return_.\\n\\nNotation \\"\'let?\' x \':=\' X \'in\' Y\\" := ...\\n```\\n\\n### State\\n\\nFinally, we define the monad `State.t` \ud83c\uddfa\ud83c\uddf8 to represent the effect of one or several mutable references with a mutable state type `S`:\\n\\n```coq\\nModule State.\\n Definition t (State A : Set) : Set := State -> A * State.\\n\\n Definition return_ {State A : Set} (value : A) : t State A :=\\n fun state => (value, state).\\n\\n Definition bind {State A B : Set} (value : t State A) (f : A -> t State B) : t State B :=\\n fun state =>\\n let (value, state) := value state in\\n f value state.\\n```\\n\\nThe state `S` will typically be the tuple of all the current mutable references in the Rust code. We use notations based on the letter `S`.\\n\\nWe also introduce lens operations that mimic how we can extract a mutable reference to the part of a data structure from a mutable reference to the whole data structure in Rust. Here is the definition of the lens type:\\n\\n```coq\\nRecord t {Big_A A : Set} : Set := {\\n read : Big_A -> M! A;\\n write : Big_A -> A -> M! Big_A\\n}.\\n```\\n\\nThe `read` and `write` operations correspond to the dereferencing and the assignment of a mutable reference in Rust. 
The type `Big_A` is the type of the whole data structure, and the type `A` is the type of the part that we are referencing. These primitives might fail (they are in the panic monad) if the mutable reference is not valid, for example, for an out-of-bounds access in an array or an invalid case in an enum.\\n\\nWe can use a lens to lift a computation that operates on a part of a data structure to a computation that operates on the whole data structure. We provide various _lift_ operators to help with this.\\n\\n## Combinations\\n\\nDepending on the Rust code we want to translate, we might need to use none, one, or several of the effects above. We explicitly define all the possible combinations of the above monads, as well as return operations to go from one monad to another, more general monad.\\n\\nThe special case is the combination of the panic and state effects. When a panic occurs, we do not return the resulting state, as we are not supposed to continue the evaluation after a panic, so the current state should not be relevant. We lose the information about the state of the program when a panic occurs, which can be a limitation for debugging, but:\\n\\n- It simplifies some definitions of simulations, and forces us not to specify the state after a panic, which should not be relevant.\\n- We can still return the current state as an additional payload in the panic operator. This is actually what our panic operator does by default.\\n\\nThe most complete monad combines all the effects:\\n\\n```coq\\nModule StatePanicResult.\\n Definition t (State Error A : Set) : Set :=\\n MS! State (M? Error A).\\n\\n Definition return_ {State Error A : Set} (value : A) : t State Error A :=\\n returnS! (Result.Ok value).\\n\\n Definition bind {State Error A B : Set}\\n (value : t State Error A)\\n (f : A -> t State Error B) :\\n t State Error B :=\\n letS! value := value in\\n match value with\\n | Result.Ok value => f value\\n | Result.Err error => returnS! 
(Result.Err error)\\n end.\\n```\\n\\nwith the notations:\\n\\n```coq\\nNotation \\"MS!?\\" := StatePanicResult.t.\\n\\nNotation \\"returnS!?\\" := StatePanicResult.return_.\\n\\nNotation \\"\'letS!?\' x \':=\' X \'in\' Y\\" := ...\\n```\\n\\n:::info\\n\\nWe are repeating our notations a lot, as our three effects and their combinations are very similar. In addition, we always have to explicitly choose in our code which monad we use and add explicit conversions to go from one to another. A future enhancement could be to add some automation at this level, through the use of type-classes, for example, to automatically infer the monad to use based on the operations used in the code \ud83e\uddbe. For now, we prefer to stay explicit.\\n\\n:::\\n\\n## Iterations\\n\\nTo convert code involving `for` loops \ud83d\udd01 or manipulations with the `.map` method of iterators, we introduce the effectful version of the `for` loop (_fold_ or _reduce_ in functional languages) and the `map` method. For example, for the folding operation:\\n\\n```coq\\n(** The order of parameters is the same as in the source `for` loops. *)\\nDefinition fold_left {State Error A B : Set}\\n (init : A)\\n (l : list B)\\n (f : A -> B -> t State Error A) :\\n t State Error A :=\\n List.fold_left (fun acc x => bind acc (fun acc => f acc x)) l (return_ init).\\n```\\n\\nwith the notation:\\n\\n```coq\\nNotation \\"foldS!?\\" := StatePanicResult.fold_left.\\n```\\n\\n## Conclusion\\n\\nThanks to the definitions and notations above, we were able to translate (manually/with GitHub Copilot) all the code of the type-checker for the Move bytecode to Coq in an idiomatic Coq code of a size roughly similar to the original Rust code. 
This translation is available in our folder [move_sui/simulations/move_bytecode_verifier](https://github.com/formal-land/coq-of-rust/tree/guillaume-claret%40fix-remaining-tests/CoqOfRust/move_sui/simulations/move_bytecode_verifier) \ud83d\ude80.\\n\\nIn the next post we will present how we tested this translation to be faithful to the original Rust code, waiting to have an efficient way to prove it equivalent.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land), or comment on this post below! Feel free to DM us for any services you need._\\n\\n:::"},{"id":"/2024/10/13/class-what-we-do","metadata":{"permalink":"/blog/2024/10/13/class-what-we-do","source":"@site/blog/2024-10-13-class-what-we-do.md","title":"\ud83c\udf32 What we do at Formal Land","description":"In this blog post, we present what we do at Formal Land \ud83c\udf32, what tools and services we are developing to provide more security for our customers \ud83e\uddb8. 
We believe that for critical applications such as blockchains (L1, L2, dApps) you should always use the most advanced technologies to find bugs, otherwise bad actors will do and overtake you in the never-ending race for security \ud83c\udfce\ufe0f.","date":"2024-10-13T00:00:00.000Z","formattedDate":"October 13, 2024","tags":[{"label":"security","permalink":"/blog/tags/security"},{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"interactive theorem proving","permalink":"/blog/tags/interactive-theorem-proving"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Solidity","permalink":"/blog/tags/solidity"}],"readingTime":6.75,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83c\udf32 What we do at Formal Land","tags":["security","formal verification","interactive theorem proving","Rust","Solidity"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 2","permalink":"/blog/2024/10/14/verification-move-sui-type-checker-2"},"nextItem":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 1","permalink":"/blog/2024/08/19/verification-move-sui-type-checker-1"}},"content":"In this blog post, we present what we do at Formal Land \ud83c\udf32, what tools and services we are developing to provide more security for our customers \ud83e\uddb8. We believe that for critical applications such as blockchains (L1, L2, dApps) you should always **use the most advanced technologies to find bugs, otherwise bad actors will do** and overtake you in the never-ending race for security \ud83c\udfce\ufe0f.\\n\\n**Formal verification** is one of the best techniques to ensure that your code is correct, as it **checks every possible input \u2728** of your program. For a long time, formal verification was reserved for specific fields, such as the space industry \ud83e\uddd1\u200d\ud83d\ude80. 
We are making this technology accessible for the blockchain industry and general programming thanks to tools and services we develop, like [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) and [coq-of-rust](https://github.com/formal-land/coq-of-rust).\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds. It can be integrated into your CI pipeline to make sure that every commit is correct without running a full audit again.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually impossible to happen again.\\n\\n:::\\n\\n
\\n ![Forge in forest](2024-10-13/forge.webp)\\n
\\n\\n## Company\\n\\nWe have existed for **three years**, focusing on formal verification for the web3 industry to validate software \ud83d\udee1\ufe0f where safety is of paramount importance. **Formal verification** is a technique to analyze the code of a program, which relies on making a **mathematical proof that the code is correct**, proof that is furthermore checked by a computer \ud83e\udd13 to make sure there are absolutely no missing cases! As programs are made of 0 and 1 and fully deterministic, obtaining perfect programs is something we can reach.\\n\\nWe need to rely on a proof system. We exclusively use the [\ud83d\udc13 Coq](https://coq.inria.fr/) proof system as it is both:\\n\\n- **\ud83c\udf0c A generic proof system** We can represent any programming languages and security properties in Coq.\\n- **\ud83d\udc95 A well known system** Coq is taught in many universities and has a large community of users, with complex software such as the C compiler [CompCert](https://en.wikipedia.org/wiki/CompCert) fully implemented and verified in it.\\n\\nWe choose to verify **existing \ud83d\uddff** code rather than to develop new code written in a style simplifying formal verification. This is generally harder, but it is also more useful for many of our customers who have already written code and want to ensure it is correct without rewriting it. Verifying the existing code also enables the verification of the optimizations, which generally involve low-level operations that would be forbidden when rewriting the code in a formal verification language.\\n\\nWe verify the actual **\ud83c\udf0d implementation** of programs rather than a **\ud83d\uddfa\ufe0f model** of them. This is to capture all the implementation details, such as integer overflows or the use of specific data structures or libraries. We believe that a lot of bugs are hidden in the details (the devil is in the details), in addition to the high-level bugs of design. 
Verifying the implementation also helps to **follow code updates \ud83e\ude9c** as we are able to say that we verified the code for a precise commit hash.\\n\\n## Tools\\n\\n### \ud83d\udc2b coq-of-ocaml\\n\\nThe tool [coq-of-ocaml](https://github.com/formal-land/coq-of-ocaml) was our first product to analyze [\ud83d\udc2b OCaml](https://ocaml.org/) programs by translating the code to Coq. The translation is almost one-to-one in terms of size, which keeps the verification work as simple as possible. It was initially developed as part of a PhD at [Inria](https://inria.fr/) and then at the [Nomadic Labs](https://www.nomadic-labs.com/) company.\\n\\nWe use it to verify properties of the code of the Layer 1 of [Tezos](https://tezos.com/) with the project [Coq Tezos of OCaml](https://formal-land.gitlab.io/coq-tezos-of-ocaml/). We analyzed a code base of more than 100,000 lines of OCaml code, for which we made a full and automatic translation to the proof system Coq that can be maintained as the code evolves. We verified various properties, including:\\n\\n- The compatibility of the serialization/deserialization functions.\\n- The adequacy of the smart contract interpreter with the existing smart contract semantics.\\n- The preservation of various invariants on the data structures.\\n\\nMany more properties are yet to be verified, but the project is currently on hold. You can find more information on the [project blog](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog)!\\n\\n### \ud83e\udd80 coq-of-rust\\n\\nOur second project is [coq-of-rust](https://github.com/formal-land/coq-of-rust), a tool to verify Rust programs. Rust is an interesting target as more and more programs are being written in it, especially for projects where security is critical. 
Even if Rust offers a strong type system, with memory-safe programs by design, many bugs can still happen, such as logical bugs or code panicking (a sudden stop of the program) in production due to an out-of-bounds access in an array.\\n\\nThe project `coq-of-rust` was funded by [Aleph Zero](https://alephzero.org/) to verify the code of their smart contracts.\\n\\nWe manage to translate most Rust programs to the Coq proof system, including the `core` library \ud83c\udf89, which is the standard library of Rust. To our knowledge, we are the only ones who have achieved such a translation of the standard library. The generated Coq code is about ten times the size of the initial Rust code. This verbosity is mainly due to:\\n\\n- the expansion of macros,\\n- the expansion of referencing/dereferencing operations that are often implicit in the source code,\\n- the expansion of `match` patterns to primitive patterns.\\n\\nWe have a semantics for the translated code, and are working on reasoning principles to show that this translated code is equivalent to a much simpler version (simulations) on which to reason.\\n\\nAs an example, here is the Coq translation of one of the functions of the [revm](https://github.com/bluealloy/revm), a Rust implementation of the Ethereum Virtual Machine:\\n\\n```rust\\n(*\\npub fn add(interpreter: &mut Interpreter, _host: &mut H) {\\n gas!(interpreter, gas::VERYLOW);\\n pop_top!(interpreter, op1, op2);\\n *op2 = op1.wrapping_add( *op2);\\n}\\n*)\\nDefinition add (\u03b5 : list Value.t) (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \u03b5, \u03c4, \u03b1 with\\n | [], [ H ], [ interpreter; _host ] =>\\n ltac:(M.monadic\\n (let interpreter := M.alloc (| interpreter |) in\\n let _host := M.alloc (| _host |) in\\n M.catch_return (|\\n ltac:(M.monadic\\n (M.read (|\\n let~ _ :=\\n M.match_operator (|\\n M.alloc (| Value.Tuple [] |),\\n [\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (let \u03b3 :=\\n 
M.use\\n (M.alloc (|\\n UnOp.not (|\\n M.call_closure (|\\n M.get_associated_function (|\\n Ty.path \\"revm_interpreter::gas::Gas\\",\\n \\"record_cost\\",\\n []\\n |),\\n [\\n M.SubPointer.get_struct_record_field (|\\n M.read (| interpreter |),\\n \\"revm_interpreter::interpreter::Interpreter\\",\\n \\"gas\\"\\n |);\\n M.read (|\\n M.get_constant (|\\n \\"revm_interpreter::gas::constants::VERYLOW\\"\\n |)\\n |)\\n ]\\n |)\\n |)\\n |)) in\\n let _ :=\\n M.is_constant_or_break_match (| M.read (| \u03b3 |), Value.Bool true |) in\\n (* ... more code ... *)\\n```\\n\\n### \ud83e\ude81 coq-of-solidity\\n\\nLast but not least, the tool [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) to translate [Solidity](https://soliditylang.org/) smart contracts to Coq. We use the Yul intermediate language of the Solidity compiler to do our translation, with roughly a three times size increase in the translated code.\\n\\nWe support most of the Solidity instructions, passing 90% of tests of the Solidity compiler. We recently developed a new translation mode that can represent arbitrary Solidity code, or Yul written by hand, in a nice monad, even in case of complex control flow like nested loops with `break` and `continue` instructions and variable mutations. This is done thanks to our new effect inference engine in `coq-of-solidity` to always give a purely functional representation of imperative code.\\n\\nCompared to other formal analysis tools for Solidity, the strength is to be able to **verify arbitrary complex properties**. This is crucial for the verification of cryptographic operations (**elliptic curve** implementations, **zero-knowledge verifiers** linking the L1 to the L2s, ...) that are out of reach of standard verification tools. 
For example, we are currently verifying a [hand-optimized Yul implementation](https://github.com/get-smooth/crypto-lib/blob/main/src/elliptic/SCL_mulmuladdX_fullgen_b4.sol) of elliptic curve operations.\\n\\n## Conclusion\\n\\nWe have seen what we are proposing at Formal Land to enhance the security of your applications to the best possible level \ud83c\udf1f, with security of mathematical certainty. Next time, we will see how to use the Coq proof system to verify simple properties by following the [Coq in a Hurry](https://cel.hal.science/inria-00001173v6/file/coq-hurry.pdf) tutorial \ud83d\ude80.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land), or comment on this post below! Feel free to DM us for any services you need._\\n\\n:::"},{"id":"/2024/08/19/verification-move-sui-type-checker-1","metadata":{"permalink":"/blog/2024/08/19/verification-move-sui-type-checker-1","source":"@site/blog/2024-08-19-verification-move-sui-type-checker-1.md","title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 1","description":"In this blog post, we present our project to formally verify the implementation of the type checker for smart contracts of the \ud83d\udca7 Sui blockchain. The Sui blockchain uses the Move language to express smart contracts. 
This language is implemented in \ud83e\udd80 Rust and compiles down to the Move bytecode that is loaded in memory when executing the smart contracts.","date":"2024-08-19T00:00:00.000Z","formattedDate":"August 19, 2024","tags":[{"label":"Sui","permalink":"/blog/tags/sui"},{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Move","permalink":"/blog/tags/move"},{"label":"type checker","permalink":"/blog/tags/type-checker"}],"readingTime":2.575,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 1","tags":["Sui","formal verification","Coq","Rust","Move","type checker"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83c\udf32 What we do at Formal Land","permalink":"/blog/2024/10/13/class-what-we-do"},"nextItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 4","permalink":"/blog/2024/08/13/coq-of-solidity-4"}},"content":"In this blog post, we present our project to formally verify the implementation of the type checker for smart contracts of the [\ud83d\udca7 Sui blockchain](https://sui.io/). The Sui blockchain uses the [Move](https://sui.io/move) language to express smart contracts. This language is implemented in [\ud83e\udd80 Rust](https://www.rust-lang.org/) and compiles down to the Move bytecode that is loaded in memory when executing the smart contracts.\\n\\nWe will formally verify the part that checks that the bytecode is well-typed, so that when a smart contract is executed it cannot encounter critical errors. 
The [type checker itself](https://github.com/move-language/move-sui/blob/main/crates/move-bytecode-verifier/src/type_safety.rs) is also written in Rust, and we will verify it using the proof assistant [Coq \ud83d\udc13](https://coq.inria.fr/) and our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) that translates Rust programs to Coq.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo formally verify your Rust code and ensure it contains no bugs or vulnerabilities, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land).\\n\\nThe cost is \u20ac10 per line of Rust code (excluding comments) and \u20ac20 per line for concurrent code.\\n\\n:::\\n\\n
\\n ![Sui in forest](2024-08-19/sui-in-forest.webp)\\n
\\n\\n## Plan\\n\\nThe plan for this project is as follows:\\n\\n1. **Write simulations \ud83e\uddee** of the Rust code we want to verify in Coq, namely the [type checker](https://github.com/move-language/move-sui/blob/main/crates/move-bytecode-verifier/src/type_safety.rs) and the [interpreter of bytecode](https://github.com/move-language/move-sui/blob/main/crates/move-vm-runtime/src/interpreter.rs). The simulations are functions that are equivalent to the ones in the original Rust program, but written in a style that is more amenable to formal verification. The changes can be:\\n - very simple, for example renaming variables to avoid name collisions in Coq,\\n - more involved like solving the trait instances or replacing Rust references with purely functional code, or\\n - specific to the project, like reversing the order of the bytecode\'s stack to simplify the proofs.\\n2. **Test these simulations \ud83d\udd0d** to show they behave like the original Rust code on examples covering all the opcodes of the Move bytecode.\\n3. **Prove the equivalence \ud83d\udff0** between the Coq simulations and the Rust code as translated to Coq by `coq-of-rust`. This part will give more precise results than the tests, as it will cover all possible inputs and states of the program. The complexity of this part is to go through all the details that exist in the Rust code, like the use of references to manipulate the memory, the macros after expansion, and the parts of the Rust standard library that the code depends on.\\n4. **Prove that the type checker is correct \ud83d\udee1\ufe0f** in Coq. 
The main properties we want to check are:\\n - the interpreter preserves the well-typedness of the code as it steps through the opcodes,\\n - when a program is accepted by the type checker, the interpreter will not fail at runtime with a type error.\\n\\n## What is done\\n\\nFor now, we have written a simulation for the type checker in [CoqOfRust/move_sui/simulations/move_bytecode_verifier/type_safety.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/move_sui/simulations/move_bytecode_verifier/type_safety.v). We are now:\\n\\n- adding tests to compare this simulation with the original Rust code,\\n- writing the simulation for the interpreter of the Move bytecode.\\n\\nIn the following blog posts, we will describe how we structured the simulations and how we are testing or verifying them.\\n\\n:::success Thanks\\n\\n_This project is kindly funded by the [Sui Foundation](https://sui.io/about), which we thank for their trust and support._\\n\\n:::"},{"id":"/2024/08/13/coq-of-solidity-4","metadata":{"permalink":"/blog/2024/08/13/coq-of-solidity-4","source":"@site/blog/2024-08-13-coq-of-solidity-4.md","title":"\ud83e\ude81 Coq of Solidity \u2013 part 4","description":"In this blog post we explain how we specify and formally verify a whole ERC-20 smart contract using our tool coq-of-solidity, which translates Solidity code to the proof assistant Coq \ud83d\udc13.","date":"2024-08-13T00:00:00.000Z","formattedDate":"August 13, 2024","tags":[{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Yul","permalink":"/blog/tags/yul"}],"readingTime":6.49,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 4","tags":["formal verification","Coq","Solidity","Yul"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 
1","permalink":"/blog/2024/08/19/verification-move-sui-type-checker-1"},"nextItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 3","permalink":"/blog/2024/08/12/coq-of-solidity-3"}},"content":"In this blog post we explain how we specify and formally verify a whole [ERC-20 smart contract](https://github.com/ethereum/solidity/blob/develop/test/libsolidity/semanticTests/various/erc20.sol) using our tool [coq-of-solidity](https://github.com/formal-land/coq-of-solidity), which translates [Solidity](https://soliditylang.org/) code to the proof assistant [Coq \ud83d\udc13](https://coq.inria.fr/).\\n\\nThe proofs are still tedious for now, as there are around 1,000 lines of proofs for 100 lines of Solidity. We plan to automate this work as much as possible in the subsequent iterations of the tool. One good thing about the interactive theorem prover Coq is that we know we can never be stuck, so we can always make progress in our proof techniques and verify complex properties even if it takes time \u2728.\\n\\nFormal verification with an interactive proof assistant is the strongest way to verify programs since:\\n\\n- it covers all possible inputs and program states,\\n- it checks any kind of property.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo audit your smart contracts and make sure they contain no bugs, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land).\\n\\nWe refund our work if we missed a high/critical severity bug.\\n\\n:::\\n\\n
\\n ![Ethereum in forest](2024-08-13/ethereum-in-forest.webp)\\n
\\n\\n## Functional specification\\n\\nWe specify the ERC-20 smart contract by writing an equivalent version in Coq that acts as a functional specification. In this specification, we ignore the `emit` operations that are logging events in Solidity and the precise payload of revert operations (we only say that \\"a revert occurs\\"). We make all our arithmetic operations on `Z` the type of unbounded integers with explicit overflow checks.\\n\\nFor example, here is the `_transfer` function of the Solidity smart contract:\\n```solidity\\nfunction _transfer(address from, address to, uint256 value) internal {\\n require(to != address(0), \\"ERC20: transfer to the zero address\\");\\n\\n // The subtraction and addition here will revert on overflow.\\n _balances[from] = _balances[from] - value;\\n _balances[to] = _balances[to] + value;\\n emit Transfer(from, to, value);\\n}\\n```\\nWe specify it in the file [erc20.v](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret%40verify-erc20/CoqOfSolidity/simulations/erc20.v) by:\\n```coq\\nDefinition _transfer (from to : Address.t) (value : U256.t) (s : Storage.t)\\n : Result.t Storage.t :=\\n if to =? 0 then\\n revert_address_null\\n else if balanceOf s from in\\n if balanceOf s to + value >=? 2 ^ 256 then\\n revert_arithmetic\\n else\\n Result.Success s <| Storage.balances :=\\n Dict.declare_or_assign s.(Storage.balances) to (balanceOf s to + value)\\n |>.\\n```\\nWith the Coq notation:\\n```coq\\nstorage <| field := new_value |>\\n```\\nwe modify a storage element as in the equivalent Solidity:\\n```solidity\\nfield = new_value;\\n```\\nWith the two tests:\\n```coq\\nif balanceOf s from =? 2 ^ 256 then\\n```\\nwe make explicit the overflow checks that are implicit in the Solidity code.\\n\\n## Dispatch to the entrypoints\\n\\nA Solidity smart contract has two public functions:\\n\\n1. 
One is the deployment code, which essentially initializes the storage of the smart contract and loads the rest of the code in memory,\\n2. The other one is executed when a transaction is sent to the smart contract, which is dispatched to the relevant entrypoint according to the payload of the transaction.\\n\\nWe will focus on the second one. It takes the contract\'s payload in a specific format:\\n\\n1. The first four bytes are the function selector, which is the first four bytes of the hash of the function signature,\\n2. The rest of the payload is the arguments of the function, following the ABI ([Application Binary Interface](https://en.wikipedia.org/wiki/Application_binary_interface)) of Solidity.\\n\\nThis blog article [Deconstructing a Solidity Contract\u200a-\u200aPart III: The Function Selector](https://blog.openzeppelin.com/deconstructing-a-solidity-contract-part-iii-the-function-selector-6a9b6886ea49) from OpenZeppelin gives more information about it. In Coq, we represent the payload of a contract with a sum type:\\n```coq\\nModule Payload.\\n Inductive t : Set :=\\n | Transfer (to: Address.t) (value: U256.t)\\n | Approve (spender: Address.t) (value: U256.t)\\n | TransferFrom (from: Address.t) (to: Address.t) (value: U256.t)\\n | IncreaseAllowance (spender: Address.t) (addedValue: U256.t)\\n | DecreaseAllowance (spender: Address.t) (subtractedValue: U256.t)\\n | TotalSupply\\n | BalanceOf (owner: Address.t)\\n | Allowance (owner: Address.t) (spender: Address.t).\\nEnd Payload.\\n```\\nWe define how to get this payload from the binary representation:\\n```coq\\nDefinition of_calldata (callvalue : U256.t) (calldata: list U256.t) :\\n option Payload.t :=\\n if Z.of_nat (List.length calldata) None\\n | erc20.Result.Success (memory_end_beginning, memory_end_end, s) =>\\n Some (make_state environment state\\n (memory_end_beginning ++ memory_end_middle ++ memory_end_end)\\n (SimulatedStorage.of_erc20_state s)\\n )\\n end in\\n {{? 
codes, environment, Some state_start |\\n // highlight-next-line\\n The original code here:\\n ERC20_403.ERC20_403_deployed.body \u21d3\\n match output with\\n | erc20.Result.Revert p s => Result.Revert p s\\n | erc20.Result.Success (_, memory_end_end, _) =>\\n Result.Return memoryguard (32 * Z.of_nat (List.length memory_end_end))\\n end\\n | state_end ?}}.\\n```\\nThe proof is done in the same way as in the previous blog post [\ud83e\ude81 Coq of Solidity \u2013 part 3](/blog/2024/08/12/coq-of-solidity-3) about the verification of the `_approve` function. The body of the contract calls all the other functions of the contract, and we reuse the equivalence proofs for the other functions here.\\n\\nThe main difficulty we encountered in the proof was missing information in the specification. For example, our predicate of equivalence requires the memory of the smart contract to have the exact same value as its specification at the end of execution, except in case of revert. This means we needed to add the final state of the memory to the specification as well, even if this is an implementation detail. We will refine our equivalence statement in the future to avoid this kind of issue.\\n\\nFor most of the proof, the work was about stepping through both codes and making sure, by automatic unification, that the two are indeed equal.\\n\\n:::success AlephZero\\n\\n_The development of `coq-of-solidity` is made possible thanks to the [AlephZero](https://alephzero.org/) project. We thank the AlephZero Foundation for their support \ud83d\ude4f._\\n\\n:::\\n\\n## Conclusion\\n\\nWe have presented how to specify and formally verify a typical smart contract in Solidity, the ERC-20 token, using our tool `coq-of-solidity` (open-source). 
In the next post, we will see how to verify an invariant on the code and how the proof system Coq reacts if we introduce a bug."},{"id":"/2024/08/12/coq-of-solidity-3","metadata":{"permalink":"/blog/2024/08/12/coq-of-solidity-3","source":"@site/blog/2024-08-12-coq-of-solidity-3.md","title":"\ud83e\ude81 Coq of Solidity \u2013 part 3","description":"We continue to strengthen the security of smart contracts with our tool coq-of-solidity \ud83d\udee0\ufe0f. It checks for vulnerabilities or bugs in Solidity code. It uses formal verification with an interactive theorem prover (Coq \ud83d\udc13) to make sure that we cover:","date":"2024-08-12T00:00:00.000Z","formattedDate":"August 12, 2024","tags":[{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Yul","permalink":"/blog/tags/yul"}],"readingTime":10.83,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 3","tags":["formal verification","Coq","Solidity","Yul"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 4","permalink":"/blog/2024/08/13/coq-of-solidity-4"},"nextItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 2","permalink":"/blog/2024/08/07/coq-of-solidity-2"}},"content":"We continue to strengthen the security of smart contracts with our tool [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) \ud83d\udee0\ufe0f. It checks for vulnerabilities or bugs in [Solidity](https://soliditylang.org/) code. 
It uses formal verification with an interactive theorem prover ([Coq \ud83d\udc13](https://coq.inria.fr/)) to make sure that we cover:\\n\\n- all possible user inputs/storage states, even if there are infinite possibilities,\\n- for any security properties.\\n\\nThis is very important as a single bug can lead to the loss of millions of dollars in smart contracts, as we have regularly seen in the past, and we can never be sure that a human review of the code did not miss anything.\\n\\nOur tool `coq-of-solidity` is one of the only tools using an interactive theorem prover for Solidity, together with [Clear](https://github.com/NethermindEth/Clear) from [Nethermind](https://www.nethermind.io/). This might be the most powerful approach to making code without bugs, as exemplified in this [PLDI paper](https://users.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf) comparing the reliability of various C compilers. They found numerous bugs in each compiler except in the [formally verified one](https://github.com/AbsInt/CompCert)!\\n\\nIn this blog post we show how we functionally specify and verify the `_approve` function of an [ERC-20 smart contract](https://github.com/ethereum/solidity/blob/develop/test/libsolidity/semanticTests/various/erc20.sol). We will see how we prove that a refined version of the function is equivalent to the original one.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success AlephZero\\n\\n_The development of `coq-of-solidity` is made possible thanks to the [AlephZero](https://alephzero.org/) project. We thank the AlephZero Foundation for their support \ud83d\ude4f._\\n\\n:::\\n\\n
\\n ![Ethereum in forest](2024-08-12/ethereum-in-forest.webp)\\n
\\n\\n## Functional specification\\n\\nHere is the `_approve` function of the Solidity smart contract that we want to specify:\\n\\n```solidity\\nmapping (address => mapping (address => uint256)) private _allowances;\\n\\nfunction _approve(address owner, address spender, uint256 value) internal {\\n require(owner != address(0), \\"ERC20: approve from the zero address\\");\\n require(spender != address(0), \\"ERC20: approve to the zero address\\");\\n\\n _allowances[owner][spender] = value;\\n emit Approval(owner, spender, value);\\n}\\n```\\n\\nIt modifies an item in the `_allowances` map and emits an `Approval` event after a few sanity checks. We will now write a functional specification of this function in Coq. The idea is to explain what this function is supposed to do describing its behavior with an idiomatic Coq code. This will be useful to make sure there are no mistakes in the smart contract, although here we have a very simple example. From the functional specification, we will also be able to check higher-level properties of the smart contract, such as the fact that the total amount of tokens is always conserved.\\n\\nHere is the Coq version of the `_approve` function:\\n```coq\\nModule Storage.\\n Record t := {\\n allowances : Dict.t (Address.t * Address.t) U256.t;\\n (* other fields *)\\n }.\\nEnd Storage.\\n\\nDefinition _approve (owner spender : Address.t) (value : U256.t) (s : Storage.t) :\\n Result.t Storage.t :=\\n if (owner =? 0) || (spender =? 0) then\\n revert_address_null\\n else\\n Result.Success s <| Storage.allowances :=\\n Dict.declare_or_assign s.(Storage.allowances) (owner, spender) value\\n |>.\\n```\\nIt takes the same parameters as the Solidity code: `owner`, `spender`, `value`, and the current state `s` of the storage. It returns a `Result.t Storage.t` type, which is either a `Result.Success` with the new storage after the execution of the `_approve` function, or a `revert_address_null` if the `owner` or `spender` is the null address. 
To create the new storage, we use the corresponding Coq notation and function to update the `_allowances` map.\\n\\n:::info\\n\\nWe ignore the `emit` primitives for now.\\n\\n:::\\n\\nNow let us show that, for any possible `owner`, `spender`, `value`, and storage state `s`, the `_approve` function in Solidity will behave exactly as the Coq specification.\\n\\n## Approve function\\n\\nHere is the Coq translation of the `_approve` function as generated by `coq-of-solidity`:\\n```coq\\nDefinition fun_approve (var_owner : U256.t) (var_spender : U256.t) (var_value : U256.t) : M.t unit :=\\n let~ _1 := [[ and ~(| var_owner, (sub ~(| (shl ~(| 160, 1 |)), 1 |)) |) ]] in\\n do~ [[\\n M.if_unit (| iszero ~(| _1 |),\\n let~ memPtr := [[ mload ~(| 64 |) ]] in\\n do~ [[ mstore ~(| memPtr, (shl ~(| 229, 4594637 |)) |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr, 4 |)), 32 |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr, 36 |)), 36 |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr, 68 |)), 0x45524332303a20617070726f76652066726f6d20746865207a65726f20616464 |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr, 100 |)), 0x7265737300000000000000000000000000000000000000000000000000000000 |) ]] in\\n do~ [[ revert ~(| memPtr, 132 |) ]] in\\n M.pure tt\\n |)\\n ]] in\\n let~ _2 := [[ and ~(| var_spender, (sub ~(| (shl ~(| 160, 1 |)), 1 |)) |) ]] in\\n do~ [[\\n M.if_unit (| iszero ~(| _2 |),\\n let~ memPtr_1 := [[ mload ~(| 64 |) ]] in\\n do~ [[ mstore ~(| memPtr_1, (shl ~(| 229, 4594637 |)) |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr_1, 4 |)), 32 |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr_1, 36 |)), 34 |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr_1, 68 |)), 0x45524332303a20617070726f766520746f20746865207a65726f206164647265 |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr_1, 100 |)), 0x7373000000000000000000000000000000000000000000000000000000000000 |) ]] in\\n do~ [[ revert ~(| memPtr_1, 132 |) ]] in\\n M.pure tt\\n |)\\n ]] in\\n do~ [[ mstore ~(| 0x00, _1 |) ]] in\\n do~ [[ mstore ~(| 0x20, 
0x01 |) ]] in\\n let~ dataSlot := [[ keccak256 ~(| 0x00, 0x40 |) ]] in\\n let~ dataSlot_1 := [[ 0 ]] in\\n do~ [[ mstore ~(| 0, _2 |) ]] in\\n do~ [[ mstore ~(| 0x20, dataSlot |) ]] in\\n let~ dataSlot_1 := [[ keccak256 ~(| 0, 0x40 |) ]] in\\n do~ [[ sstore ~(| dataSlot_1, var_value |) ]] in\\n let~ _3 := [[ mload ~(| 0x40 |) ]] in\\n do~ [[ mstore ~(| _3, var_value |) ]] in\\n do~ [[ log3 ~(| _3, 0x20, 0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925, _1, _2 |) ]] in\\n M.pure tt.\\n```\\nWe plug into the Solidity compiler and translate the intermediate representation [Yul](https://docs.soliditylang.org/en/latest/yul.html) that `solc` uses to generate EVM bytecode. We automatically refine the Yul generated by the Solidity compiler but for now this refinement is limited.\\n\\nThe two `M.if_unit` at the beginning correspond to the `require` statements in the Solidity code. The `revert` statements are used to return an error message to the caller. The `mstore` and `sstore` functions are used to store values in the memory and the storage of the EVM. The `keccak256` function encodes the storage addresses to access the `_allowances` map. The `log3` function is used to emit an event at the end.\\n\\nThis representation of the `_approve` function is very verbose as it corresponds exactly to what the source code does and contains a lot of implementation details. 
Our goal now is to show that this version is equivalent to the functional specification that we wrote by hand.\\n\\n## Equivalence\\n\\nWe express that the functional specification is equivalent to the original one with this lemma:\\n```coq\\nLemma run_fun_approve codes environment state\\n (owner spender : Address.t) (value : U256.t) (s : erc20.Storage.t)\\n (mem_0 mem_1 mem_3 mem_4 : U256.t)\\n (H_owner : Address.Valid.t owner)\\n (H_spender : Address.Valid.t spender) :\\n let memoryguard := 0x80 in\\n let memory_start :=\\n [mem_0; mem_1; memoryguard; mem_3; mem_4] in\\n let state_start :=\\n make_state environment state memory_start (SimulatedStorage.of_erc20_state s) in\\n let output :=\\n erc20._approve owner spender value s in\\n let memory_end :=\\n [spender; erc20.keccak256_tuple2 owner 1; memoryguard; mem_3; value] in\\n let state_end :=\\n match output with\\n | erc20.Result.Revert _ _ => None\\n | erc20.Result.Success s =>\\n Some (make_state environment state memory_end (SimulatedStorage.of_erc20_state s))\\n end in\\n {{? 
codes, environment, Some state_start |\\n ERC20_403.ERC20_403_deployed.fun_approve owner spender value \u21d3\\n match output with\\n | erc20.Result.Revert p s => Result.Revert p s\\n | erc20.Result.Success _ => Result.Ok tt\\n end\\n | state_end ?}}.\\n```\\n\\nThis lemma of equivalence requires some parameters:\\n\\n- an initial `codes`, `environment`, and `state` values, that describe the state of the blockchain before the execution of the `_approve` function,\\n- a `memoryguard` value that gives a memory zone that we are safe to use,\\n- some `mem_i` variables, as we do not know the exact values of the memory slots before the execution of the function,\\n- an `owner`, `spender`, and `value` that are the parameters of the `_approve` function,\\n- an `s` that is the state of storage of the smart contract before the execution of the `_approve` function,\\n- an `H_owner` and `H_spender` proofs that the `owner` and `spender` are valid addresses. These two proofs are required to execute the function as expected and always available, thanks to runtime checks made at the entrypoints of the smart contract.\\n\\nThe lemma will hold for any possible values of the parameters above, even if there are infinite possibilities. This is the power of formal verification: we can prove that our smart contract is correct for all possible inputs and states.\\n\\nThe core statement uses the predicate:\\n```coq\\n{{? codes, environment, start_state |\\n original_code \u21d3\\n refined_code\\n| end_state ?}}\\n```\\nIt says that some `original_code` executed in the `start_state` environment will give the same output as the `refined_code` and will result in the final state `end_state`. The state is an option type: either `Some` state or `None` if the execution reverted. 
That way we do not have to describe the state after a contract revert, which resets the storage anyway.\\n\\n:::info\\n\\nThe statement of equivalence is relatively verbose so there could be mistakes in the way it is stated. This is not really an issue, as the `_approve` function is an intermediate function, so the only statement that really matters is the one on the main function of the contract that dispatches to the relevant entrypoint according to the payload of the transaction. There could also be mistakes there, but perhaps we can automatically generate this statement from the Solidity code.\\n\\n:::\\n\\n## Proof of equivalence\\n\\nThe way we write the proof is interesting. We use Coq as a symbolic debugger, where we execute both the original code and the functional specification until we reach the end of execution for all the branches, always with the same result.\\n\\nHere is an example of a debugging step (in the proof mode of Coq):\\n```coq\\n{{?codes, environment,\\nSome\\n (make_state environment state [spender; erc20.keccak256_tuple2 owner 1; 128; mem_3; mem_4]\\n [IsStorable.IMap.(IsStorable.to_storable_value) s.(erc20.Storage.balances);\\n StorableValue.Map2\\n (Dict.declare_or_assign s.(erc20.Storage.allowances) (owner, spender) value);\\n StorableValue.U256 s.(erc20.Storage.total_supply)])\\n|\\n // highlight-next-line\\n The original code here:\\n do~ call (Stdlib.mstore 128 value)\\n in (do~ call\\n (Stdlib.log3 128 32\\n 63486140976153616755203102783360879283472101686154884697241723088393386309925\\n owner spender) in LowM.Pure (Result.Ok tt)) \u21d3\\n // highlight-next-line\\n The functional specification here:\\n Result.Ok tt\\n| Some\\n (make_state environment state [spender; erc20.keccak256_tuple2 owner 1; 128; mem_3; value]\\n (SimulatedStorage.of_erc20_state\\n s<|erc20.Storage.allowances:= Dict.declare_or_assign s.(erc20.Storage.allowances)\\n (owner, spender) value|>))?}}\\n```\\nOn the original code side we can 
recognize:\\n```coq\\ndo~ [[ mstore ~(| _3, var_value |) ]] in\\ndo~ [[ log3 ~(| _3, 0x20, 0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925, _1, _2 |) ]] in\\nM.pure tt\\n```\\nthat corresponds to the end of the execution of the `_approve` function. On the functional specification, we have:\\n```coq\\nResult.Ok tt\\n```\\nthat ends the execution successfully but does not return anything. This is because we ignore the `emit` operation, translated as a `log3` Yul primitive. We also ignore the `mstore` call as it is only used to fill information for the `log3` call.\\n\\nHere are the various commands to step through the code, encoded as Coq tactics:\\n\\n- `p`: final **P**ure expression\\n- `pn`: final **P**ure expression ignoring the resulting state with a **N**one (for a revert)\\n- `pe`: final **P**ure expression with non-trivial **E**quality of results\\n- `pr`: Yul **PR**imitive\\n- `prn`: Yul **PR**imitive ignoring the resulting state with a **N**one\\n- `l`: step in a **L**et\\n- `lu`: step in a **L**et by **U**nfolding\\n- `c`: step in a function **C**all\\n- `cu`: step in a function **C**all by **U**nfolding\\n- `s`: **S**implify the goal\\n\\nThese commands verify that the two programs are equivalent as we step through them. As a reference, the proof is in [CoqOfSolidity/proofs/ERC20_functional.v](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret%40verify-erc20/CoqOfSolidity/proofs/ERC20_functional.v):\\n\\n```coq\\nProof.\\n simpl.\\n unfold ERC20_403.ERC20_403_deployed.fun_approve, erc20._approve.\\n l. {\\n now apply run_is_non_null_address.\\n }\\n unfold Stdlib.Pure.iszero.\\n lu.\\n c; [p|].\\n s.\\n unfold Stdlib.Pure.iszero.\\n destruct (owner =? 0); s. {\\n change (true || _) with true; s.\\n lu; c. {\\n apply_run_mload.\\n }\\n repeat (\\n lu ||\\n cu ||\\n (prn; intro) ||\\n s ||\\n p\\n ).\\n }\\n l. 
{\\n now apply run_is_non_null_address.\\n }\\n lu.\\n c; [p|]; s.\\n unfold Stdlib.Pure.iszero.\\n change (false || ?e) with e; s.\\n destruct (spender =? 0); s. {\\n lu; c. {\\n apply_run_mload.\\n }\\n repeat (\\n lu ||\\n cu ||\\n (prn; intro) ||\\n s ||\\n p\\n ).\\n }\\n lu; c. {\\n apply_run_mstore.\\n }\\n CanonizeState.execute.\\n lu; c. {\\n apply_run_mstore.\\n }\\n CanonizeState.execute.\\n lu; c. {\\n apply_run_keccak256_tuple2.\\n }\\n lu.\\n lu; c. {\\n apply_run_mstore.\\n }\\n CanonizeState.execute.\\n lu; c. {\\n apply_run_mstore.\\n }\\n CanonizeState.execute.\\n lu; c. {\\n apply_run_keccak256_tuple2.\\n }\\n lu; c. {\\n apply_run_sstore_map2_u256.\\n }\\n CanonizeState.execute.\\n lu; c. {\\n apply_run_mload.\\n }\\n s.\\n lu; c. {\\n apply_run_mstore.\\n }\\n CanonizeState.execute.\\n lu; c. {\\n p.\\n }\\n p.\\nQed.\\n```\\n\\n:::success Get started\\n\\nTo audit your smart contracts and make sure they contain no bugs, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land).\\n\\nWe refund our work in case we missed any high/critical severity bugs.\\n\\n:::\\n\\n## Conclusion\\n\\nWe have presented how to functionally specify a function with `coq-of-solidity`. In the next blog post we will see how to extend this proof and specification to the entire ERC-20 smart contract."},{"id":"/2024/08/07/coq-of-solidity-2","metadata":{"permalink":"/blog/2024/08/07/coq-of-solidity-2","source":"@site/blog/2024-08-07-coq-of-solidity-2.md","title":"\ud83e\ude81 Coq of Solidity \u2013 part 2","description":"We continue to work on our open source formal verification tool for Solidity named coq-of-solidity \ud83d\udee0\ufe0f. Formal verification is the strongest form of code audits, as we verify that the code behaves correctly in all possible execution cases \ud83d\udd0d. 
We use the interactive theorem prover Coq to express and verify any kinds of properties.","date":"2024-08-07T00:00:00.000Z","formattedDate":"August 7, 2024","tags":[{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Yul","permalink":"/blog/tags/yul"}],"readingTime":6.36,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 2","tags":["formal verification","Coq","Solidity","Yul"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 3","permalink":"/blog/2024/08/12/coq-of-solidity-3"},"nextItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 1","permalink":"/blog/2024/06/28/coq-of-solidity-1"}},"content":"We continue to work on our open source **formal verification** tool for [Solidity](https://soliditylang.org/) named [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) \ud83d\udee0\ufe0f. Formal verification is the strongest form of code audits, as we verify that the code behaves correctly in all possible execution cases \ud83d\udd0d. We use the **interactive theorem prover** [Coq](https://coq.inria.fr/) to express and verify any kinds of properties.\\n\\nWe work by translating the [Yul](https://docs.soliditylang.org/en/latest/yul.html) version of a smart contract to the formal language Coq \ud83d\udc13, in which we then express the code specifications/security properties and formally verify them \ud83d\udd04. The Yul language is an intermediate language used by the Solidity compiler and others to generate EVM bytecode. 
Yul is simpler than Solidity and at a higher level than the EVM bytecode, making it a good target for formal verification.\\n\\nIn this blog post we present the recent developments we made to simplify the reasoning \ud83e\udde0 about Yul programs once translated in Coq.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success AlephZero\\n\\n_This development is made possible thanks to [AlephZero](https://alephzero.org/). We thank the Aleph Zero Foundation for their support to bring more security to the Web3 space \ud83d\ude4f._\\n\\n:::\\n\\n
\\n ![Ethereum in forest](2024-08-07/ethereum-in-forest.webp)\\n
\\n\\n## Workflow\\n\\nWe present here the general workflow to use `coq-of-solidity` to make sure your smart contracts contain no bugs \ud83d\udc1b.\\n\\n
\\n ![Workflow](2024-08-07/workflow.png)\\n
\\n\\nThe workflow is as follows:\\n\\n1. We start with a Solidity smart contract.\\n2. The Solidity compiler translates it to the intermediate language Yul.\\n3. The `coq-of-yul` tool generates a first Coq version. This version is very low-level, with, for example, variable names represented by the string of their names.\\n4. The `prepare.py` script makes as many refinements as possible in the Coq code to make it more readable and easier to reason about. For example, we order the functions definitions by the order in which they are used and replace the Yul variables by standard Coq variables.\\n5. As we are not fully automated yet for the refinements, we add another manual step where we, for example, name the memory locations so that they appear as variables instead of fixed integers.\\n6. We write in Coq the formal specification of what we expect our smart contract to do or not do. A formal specification is like a test but expressed with quantifiers (\u2200, \u2203) so that we cover all execution cases.\\n7. We write a formal proof showing that our smart contract indeed validates the formal specification for any user inputs and blockchain states.\\n8. You can now deploy your smart contract, having followed one of the most secure development methodologies.\\n\\n## Refinement step\\n\\nThe code that `coq-of-solidity` generates is very verbose. For example, for this Yul function generated by the Solidity compiler to make an addition with overflow check:\\n\\n```go\\nfunction checked_add_uint256(x) -> sum\\n{\\n sum := add(x, /** @src 0:419:421 \\"20\\" */ 0x14)\\n /// @src 0:33:3484 \\"contract ERC20 {...\\"\\n if gt(x, sum)\\n {\\n mstore(0, shl(224, 0x4e487b71))\\n mstore(4, 0x11)\\n revert(0, 0x24)\\n }\\n}\\n```\\nwe get a Coq translation:\\n```coq\\nCode.Function.make (\\n \\"checked_add_uint256\\",\\n [\\"x\\"],\\n [\\"sum\\"],\\n M.scope (\\n do! 
ltac:(M.monadic (\\n M.assign (|\\n [\\"sum\\"],\\n Some (M.call (|\\n \\"add\\",\\n [\\n M.get_var (| \\"x\\" |);\\n [Literal.number 0x14]\\n ]\\n |))\\n |)\\n )) in\\n do! ltac:(M.monadic (\\n M.if_ (|\\n M.call (|\\n \\"gt\\",\\n [\\n M.get_var (| \\"x\\" |);\\n M.get_var (| \\"sum\\" |)\\n ]\\n |),\\n M.scope (\\n do! ltac:(M.monadic (\\n M.expr_stmt (|\\n M.call (|\\n \\"mstore\\",\\n [\\n [Literal.number 0];\\n M.call (|\\n \\"shl\\",\\n [\\n [Literal.number 224];\\n [Literal.number 0x4e487b71]\\n ]\\n |)\\n ]\\n |)\\n |)\\n )) in\\n do! ltac:(M.monadic (\\n M.expr_stmt (|\\n M.call (|\\n \\"mstore\\",\\n [\\n [Literal.number 4];\\n [Literal.number 0x11]\\n ]\\n |)\\n |)\\n )) in\\n do! ltac:(M.monadic (\\n M.expr_stmt (|\\n M.call (|\\n \\"revert\\",\\n [\\n [Literal.number 0];\\n [Literal.number 0x24]\\n ]\\n |)\\n |)\\n )) in\\n M.pure BlockUnit.Tt\\n )\\n |)\\n )) in\\n M.pure BlockUnit.Tt\\n )\\n)\\n```\\n\\nThis is quite long to follow, and even harder to use to write formal proofs. We made a script [prepare.py](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret%40verify-erc20/CoqOfSolidity/test/libsolidity/semanticTests/various/erc20/prepare.py) that simplifies the code above to:\\n```coq\\nDefinition checked_add_uint256 (x : U256.t) : M.t U256.t :=\\n let~ sum := [[ add ~(| x, 0x14 |) ]] in\\n do~ [[\\n M.if_unit (| gt ~(| x, sum |),\\n do~ [[ mstore ~(| 0, (shl ~(| 224, 0x4e487b71 |)) |) ]] in\\n do~ [[ mstore ~(| 4, 0x11 |) ]] in\\n do~ [[ revert ~(| 0, 0x24 |) ]] in\\n M.pure tt\\n |)\\n ]] in\\n M.pure sum.\\n```\\nThis is much more readable. We have monadic notations to compose all the primitive Yul functions such as `mstore` and `revert`, that may cause side effects such as memory mutation or premature return. 
The code uses standard Coq variables and functions instead of strings, which simplifies the proofs.\\n\\nTo make sure that this transformation is correct, we also generate a Coq proof file that shows that our transformation is correct and that the original and transformed code from `prepare.py` are equivalent \u2714\ufe0f.\\n\\n### Next\\n\\nWe can simplify the code even further. For example:\\n\\n- We know that the functions `add`, `gt`, and `shl` are purely functional, so we could explicit this property in the Coq code. For now they are called as monadic functions with the notation `f ~(| arg1, ..., argn |)` even if they never make side effects.\\n- The `mstore` function stores values at fixed addresses in the memory, here `0` and `4`. We could remove these memory operations by introducing named variables to hold the results instead.\\n\\nWe hope to be able to automate as many refinements as possible in the future, but for now we have to do some manual work \ud83d\udd27.\\n\\n## Manual refinements\\n\\nWe manually refine the code by showing that it returns the same result, for every possible input and initial memory state, as a simplified code written by hand. For the `checked_add_uint256` function above we use:\\n```coq\\nDefinition simulation_checked_add_uint256 (x y : Z) : Result.t Z :=\\n if x + y >=? 2 ^ 256 then\\n Result.Revert 0 0x24\\n else\\n Result.Ok (x + y).\\n```\\nHere, all the computations are made with the `Z` type of unbounded integers that are simpler to manipulate for the proofs. We use an `if` statement to explicitly detect the overflows. The revert statement has the same parameters as in the original code, but we do not fill the memory area `0` to `0x24` anymore. The reason is that we ignore what the `revert` returned in our specifications as this is not relevant for now and also simplifies the proofs.\\n\\nIn the code above we do not manipulate the memory anymore. 
In general, we do the following kinds of refinements:\\n\\n- Using unbounded integers with explicit overflow checks instead of the fixed-size integers of the EVM.\\n- Using side effects only when necessary, for example for the `revert` statement.\\n- Removing memory operations by introducing named variables to hold the results.\\n- Simplifying the storage accesses by using explicit arrays or maps instead of the `keccak256` hash encoding of the addresses.\\n- Using explicit names for the entrypoints instead of binary encoding with the `keccak256` function.\\n\\nFor now these transformations are manual or semi-automated, but we hope to automate them as much as possible in the future. By proving that `simulation_checked_add_uint256` behaves like the original `checked_add_uint256` function, we are sure that we can reason on the simplified code instead of the original one without losing any information \ud83d\udd0d.\\n\\n:::success Get started\\n\\nTo audit your smart contracts with the method above, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land).\\n\\nCompared to other auditing methods, formal verification has the strong advantage of covering all possible execution cases \ud83d\udcaa.\\n\\n:::\\n\\n## Conclusion\\n\\nWe have presented the current status of our work to formally verify smart contracts, especially the refinement steps that make the reasoning possible. In our next posts we will continue seeing how we can verify a full smart contract \ud83d\udd2e."},{"id":"/2024/06/28/coq-of-solidity-1","metadata":{"permalink":"/blog/2024/06/28/coq-of-solidity-1","source":"@site/blog/2024-06-28-coq-of-solidity-1.md","title":"\ud83e\ude81 Coq of Solidity \u2013 part 1","description":"Solidity is the most widely used smart contract language on the blockchain. 
As smart contracts are critical software handling a lot of money, there is a huge interest in finding all possible bugs before putting them into production.","date":"2024-06-28T00:00:00.000Z","formattedDate":"June 28, 2024","tags":[{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Yul","permalink":"/blog/tags/yul"}],"readingTime":16.26,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 1","tags":["formal verification","Coq","Solidity","Yul"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 2","permalink":"/blog/2024/08/07/coq-of-solidity-2"},"nextItem":{"title":"\ud83e\udd84 Software correctness from first principles","permalink":"/blog/2024/06/05/software-correctness-from-first-principles"}},"content":"[Solidity](https://soliditylang.org/) is the most widely used **smart contract language** on the blockchain. As smart contracts are **critical software** handling a lot of money, there is a huge interest in finding **all possible bugs** before putting them into production.\\n\\n**Formal verification** is a technique to test a program on all possible entries, even when there are **infinitely many**. This contrasts with the traditional test techniques, which can only execute a finite set of scenarios. As such, it appears to be an ideal way to bring more security to smart contract audits.\\n\\nIn this blog post, we present the **formal verification tool `coq-of-solidity`** that we are developing for Solidity. Its specificities are that:\\n\\n1. It is open-source (GPL-3 for the translation, MIT for the proofs).\\n2. 
It uses an interactive theorem prover, the system Coq, to verify arbitrarily complex properties.\\n\\nHere, we present how we translate Solidity code into Coq using the intermediate language [Yul](https://docs.soliditylang.org/en/latest/yul.html). We explain the semantics we use and what remains to be done.\\n\\nThe code is available in our fork of the Solidity compiler at [github.com/formal-land/coq-of-solidity](https://github.com/formal-land/coq-of-solidity).\\n\\n\x3c!-- truncate --\x3e\\n\\n:::info AlephZero\\n\\n_We are happy to be working with [AlephZero](https://alephzero.org/) to develop tools to bring more security for the audit of Solidity smart contracts, thanks to the use of formal verification and the interactive theorem prover [Coq](https://coq.inria.fr/). We thank the Aleph Zero Foundation for their support._\\n\\n:::\\n\\n
\\n ![Ethereum in forest](2024-06-28/ethereum-in-forest.webp)\\n
\\n\\n## Architecture of the tool\\n\\nWe reuse the code of the standard Solidity compiler `solc` in order to make sure that we can stay in sync with the evolutions of the language and be compatible with all the Solidity features. Thus, our most straightforward path to implementing a translation tool from Solidity to Coq was to fork the C++ code of `solc` in [github.com/formal-land/coq-of-solidity](https://github.com/formal-land/coq-of-solidity). We add a new `solc`\'s flag `--ir-coq` that tells the compiler to also generate a Coq output in addition to the expected EVM bytecode.\\n\\nAt first, we looked at the direct translation from the Solidity language to Coq, but this was getting too complex. We changed our strategy to instead target the Yul language, an intermediate language used by the Solidity compiler to have an intermediate step in its translation to the EVM bytecode. The Yul language is simpler than Solidity and still has a higher level than the EVM bytecode, making it a good target for formal verification. In contrast to the EVM bytecode, there are no explicit stack-manipulation or `goto` instructions in Yul simplifying formal verification.\\n\\nTo give an idea of the size difference between Solidity and Yul, here are the files to export these languages to JSON in the Solidity compiler:\\n\\n- [ast/ASTJsonExporter.cpp](https://github.com/ethereum/solidity/blob/develop/libsolidity/ast/ASTJsonExporter.cpp): Solidity to JSON, 1127 lines\\n- [libyul/AsmJsonConverter.cpp](https://github.com/ethereum/solidity/blob/develop/libyul/AsmJsonConverter.cpp): Yul to JSON, 205 lines\\n\\nThe Solidity language appears as more complex than Yul as the code to translate it to JSON is five times longer.\\n\\nWe copied the file `libyul/AsmJsonConverter.cpp` above to make a version that translates Yul to Coq: [libyul/AsmCoqConverter.cpp](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret@experiments-with-yul/libyul/AsmCoqConverter.cpp). 
We reused the code for compilation flags to add a new option `--ir-coq`, which runs the conversion to Coq instead of the conversion to JSON.\\n\\n## Translation of Yul\\n\\nTo limit the size of the generated Coq code, we translate the Yul code after the optimization passes. This helps to remove boilerplate code but may make the Yul code less relatable to the Solidity sources. Thankfully, the optimized Yul code is still readable in our tests, and the Solidity compiler can pretty-print a version of the optimized Yul code with comments to quote the corresponding Solidity source code.\\n\\nAs an example, here is how we translate the [if keyword](https://docs.soliditylang.org/en/latest/yul.html#if) of Yul:\\n\\n```cpp\\nstd::string AsmCoqConverter::operator()(If const& _node)\\n{\\n\\tyulAssert(_node.condition, \\"Invalid if condition.\\");\\n\\tstd::string ret = \\"M.if_ (|\\\\n\\";\\n\\tm_indent++;\\n\\tret += indent() + std::visit(*this, *_node.condition) + \\",\\\\n\\";\\n\\tret += indent() + (*this)(_node.body) + \\"\\\\n\\";\\n\\tm_indent--;\\n\\tret += indent() + \\"|)\\";\\n\\n\\treturn ret;\\n}\\n```\\n\\nWe convert each Yul `_node` to an `std::string` that represents the Coq code. We use the `m_indent` variable to keep track of the indentation level, and the `indent()` function to add the right number of spaces at the beginning of each line. 
We do not need to add extra parenthesis to disambiguate priorities, as the Yul language is simple enough.\\n\\nHere is the generated Coq code for the beginning of the [erc20.sol](https://github.com/ethereum/solidity/blob/develop/test/libsolidity/semanticTests/various/erc20.sol) example from the Solidity compiler\'s test suite:\\n\\n```coq\\n(* Generated by solc *)\\nRequire Import CoqOfSolidity.CoqOfSolidity.\\n\\nModule ERC20_403.\\n Definition code : M.t BlockUnit.t :=\\n do* ltac:(M.monadic (\\n M.function (|\\n \\"allocate_unbounded\\",\\n [],\\n [\\"memPtr\\"],\\n do* ltac:(M.monadic (\\n M.assign (|\\n [\\"memPtr\\"],\\n Some (M.call (|\\n \\"mload\\",\\n [\\n [Literal.number 64]\\n ]\\n |))\\n |)\\n )) in\\n M.od\\n |)\\n )) in\\n do* ltac:(M.monadic (\\n M.function (|\\n \\"revert_error_ca66f745a3ce8ff40e2ccaf1ad45db7774001b90d25810abd9040049be7bf4bb\\",\\n [],\\n [],\\n do* ltac:(M.monadic (\\n M.expr_stmt (|\\n M.call (|\\n \\"revert\\",\\n [\\n [Literal.number 0];\\n [Literal.number 0]\\n ]\\n |)\\n |)\\n )) in\\n M.od\\n |)\\n )) in\\n (* ... 6,000 remaining lines ... *)\\n```\\n\\nThis code is quite verbose, for an original smart contract size of 100 lines of Solidity. As a reference, the corresponding Yul code is 1,000 lines long and starts with:\\n\\n```go\\n/// @use-src 0:\\"erc20.sol\\"\\nobject \\"ERC20_403\\" {\\n code {\\n function allocate_unbounded() -> memPtr\\n { memPtr := mload(64) }\\n function revert_error_ca66f745a3ce8ff40e2ccaf1ad45db7774001b90d25810abd9040049be7bf4bb()\\n { revert(0, 0) }\\n // ... 1,000 remaining lines ...\\n```\\n\\nThe content is actually the same up to the notations, but we use many more line breaks and keywords in the Coq version.\\n\\n## Runtime in Coq\\n\\nNow that the code is translated in Coq, we need to define a _runtime_ for the Coq code. This means giving a definition for all the functions and types that are used in the generated code, like `M.t BlockUnit.t`, `M.monadic`, `M.function`, ... 
This runtime gives the semantics of the Yul language, that is to say, the meaning of all the primitives of the language.\\n\\n### Notation\\n\\nWe first define a monadic notation `ltac:(M.monadic ...)` to make a [monadic transformation](https://xavierleroy.org/mpri/2-4/monads.pdf) on the generated code. We reuse here what we have done for our [Rust translation to Coq](https://github.com/formal-land/coq-of-rust), which we describe in our blog post [\ud83e\udd80 Monadic notation for the Rust translation](/blog/2024/04/03/monadic-notation-for-rust-translation). The notation:\\n\\n```coq\\nf (| x_1, ..., x_n |)\\n```\\n\\ncorresponds to the call of a monadic function. The tactic `M.monadic` automatically chains all these calls using the monadic bind operator.\\n\\nThe `do* ... in ...` is another monadic notation to chain monadic expressions, directly calling the monadic bind. This notation is more explicit, and we use it in combination with the `ltac:(M.monadic ...)` notation as it might be more efficient to type-check very large files.\\n\\n### Monad\\n\\nTo represent the side effects in Yul, we use the following Coq monad, that we define in [CoqOfSolidity/CoqOfSolidity.v](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret%40experiments-with-yul/CoqOfSolidity/CoqOfSolidity.v):\\n\\n\\n```coq\\nModule U256.\\n Definition t := Z.\\nEnd U256.\\n\\nModule Environment.\\n Record t : Set := {\\n caller : U256.t;\\n (** Amount of wei sent to the current contract *)\\n callvalue : U256.t;\\n calldata : list Z;\\n (** The address of the contract. *)\\n address : U256.t;\\n }.\\nEnd Environment.\\n\\nModule BlockUnit.\\n (** The return value of a code block. 
*)\\n Inductive t : Set :=\\n (** The default value in case of success *)\\n | Tt\\n (** The instruction `break` was called *)\\n | Break\\n (** The instruction `continue` was called *)\\n | Continue\\n (** The instruction `leave` was called *)\\n | Leave.\\nEnd BlockUnit.\\n\\nModule Result.\\n (** A wrapper for the result of an expression or a code block. We can either return a normal value\\n with [Ok], or a special instruction [Return] that will stop the execution of the contract. *)\\n Inductive t (A : Set) : Set :=\\n | Ok (output : A)\\n | Return (p s : U256.t)\\n | Revert (p s : U256.t).\\n Arguments Ok {_}.\\n Arguments Return {_}.\\n Arguments Revert {_}.\\nEnd Result.\\n\\nModule Primitive.\\n (** We group together primitives that share being impure functions operating over the state. *)\\n Inductive t : Set -> Set :=\\n | OpenScope : t unit\\n | CloseScope : t unit\\n | GetVar (name : string) : t U256.t\\n | DeclareVars (names : list string) (values : list U256.t) : t unit\\n | AssignVars (names : list string) (values : list U256.t) : t unit\\n | MLoad (address length : U256.t) : t (list Z)\\n | MStore (address : U256.t) (bytes : list Z) : t unit\\n | SLoad (address : U256.t) : t U256.t\\n | SStore (address value : U256.t) : t unit\\n | RLoad : t (list Z)\\n | TLoad (address : U256.t) : t U256.t\\n | TStore (address value : U256.t) : t unit\\n | Log (topics : list U256.t) (payload : list Z) : t unit\\n | GetEnvironment : t Environment.t\\n | GetNonce : t U256.t\\n | GetCodedata (address : U256.t) : t (list Z)\\n | CreateAccount (address code : U256.t) (codedata : list Z) : t unit\\n | UpdateCodeForDeploy (address code : U256.t) : t unit\\n | LoadImmutable (name : U256.t) : t U256.t\\n | SetImmutable (name value : U256.t) : t unit\\n (** The call stack is there to debug the semantics of Yul. 
*)\\n | CallStackPush (name : string) (arguments : list (string * U256.t)) : t unit\\n | CallStackPop : t unit.\\nEnd Primitive.\\n\\nModule LowM.\\n Inductive t (A : Set) : Set :=\\n | Pure (output : A)\\n | Primitive {B : Set}\\n (primitive : Primitive.t B)\\n (k : B -> t A)\\n | DeclareFunction\\n (name : string)\\n (body : list U256.t -> t (Result.t (list U256.t)))\\n (k : t A)\\n | CallFunction\\n (name : string)\\n (arguments : list U256.t)\\n (k : Result.t (list U256.t) -> t A)\\n | Loop {B : Set}\\n (body : t B)\\n (** The final value to return if we decide to break of the loop. *)\\n (break_with : B -> option B)\\n (k : B -> t A)\\n | CallContract\\n (address : U256.t)\\n (value : U256.t)\\n (input : list Z)\\n (k : U256.t -> t A)\\n (** Explicit cut in the monadic expressions, to provide better composition for the proofs. *)\\n | Let {B : Set} (e1 : t B) (k : B -> t A)\\n | Impossible (message : string).\\nEnd LowM.\\n\\nModule M.\\n Definition t (A : Set) := LowM.t (Result.t A).\\n```\\n\\nThe only type for values in Yul is the 256-bit unsigned integer `U256.t` that we represent with the `Z` type of Coq. The `BlockUnit.t` type represents the possible outcomes of a block of code:\\n\\n- `Tt` for the normal ending;\\n- `Break` or `Continue` to propagate a premature return from a call to the `break` or `continue` primitives;\\n- `Leave` to propagate the call to the `leave` primitive to terminate a function.\\n\\nWe define the monad in two steps. First, we define the `LowM.t` monad parameterized by the type of output `A`. 
The monad has the following constructors:\\n\\n- `Pure` to return a value without side effects;\\n- `Primitive` to execute one of the primitives, which are functions operating over the state (defined later);\\n- `DeclareFunction` to declare a function with a name and a body, which is a function taking a list of arguments and returning a list of results, as is the case in Yul;\\n- `CallFunction` to call a function by its name with a list of arguments;\\n- `Loop` to execute a block of code in a loop, with a function to decide if we should break the loop, helpful to implement the `for` construct;\\n- `CallContract`, a dedicated primitive to implement the `call` instruction of the EVM to call another contract located at a certain address;\\n- `Let` to compose two monadic expressions in a more explicit way than using the continuations;\\n- `Impossible` to signal an unexpected branch in the code.\\n\\nThis monad is purely descriptive. We give the list of primitives but we do not explain here how each operator behaves. Most of the primitives take a continuation `k`, which is a function from the output of the primitive to the rest of the code. This is a way to chain the primitives together. For convenience we define a monadic bind `let_` that composes these continuations to sequence two monadic expressions.\\n\\nThen we define a monad `M.t` as:\\n```coq\\nModule M.\\n Definition t (A : Set) := LowM.t (Result.t A).\\n```\\n\\nto represent calculations that return a `Result.t` to take into account that a contract might return or revert at any point in its execution.\\n\\nFinally, we define the Yul keywords from these primitives. 
For example, for the `if` keyword:\\n\\n```coq\\nDefinition if_ (condition : list U256.t) (success : t BlockUnit.t) : t BlockUnit.t :=\\n match condition with\\n | [0] => pure BlockUnit.Tt\\n | [_] => success\\n | _ => LowM.Impossible \\"if: expected a single value as condition\\"\\n end.\\n```\\n\\n### Evaluation rules\\n\\nTo define how to run the primitives of the Yul\'s monad, we use evaluation rules in [CoqOfSolidity/simulations/CoqOfSolidity.v](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret%40experiments-with-yul/CoqOfSolidity/simulations/CoqOfSolidity.v):\\n\\n```coq\\nModule Run.\\n Reserved Notation \\"{{ environment , state | e \u21d3 output | state\' }}\\"\\n (at level 70, no associativity).\\n\\n Inductive t {A : Set} (environment : Environment.t) (state : State.t) (output : A) :\\n LowM.t A -> State.t -> Prop :=\\n | Pure : {{ environment, state | LowM.Pure output \u21d3 output | state }}\\n | Primitive {B : Set} (primitive : Primitive.t B) (k : B -> LowM.t A) value state_inter state\' :\\n inl (value, state_inter) = eval_primitive environment primitive state ->\\n {{ environment, state_inter | k value \u21d3 output | state\' }} ->\\n {{ environment, state | LowM.Primitive primitive k \u21d3 output | state\' }}\\n | DeclareFunction name body k stack_inter state\' :\\n inl stack_inter = Stack.declare_function state.(State.stack) name body ->\\n let state_inter := state <| State.stack := stack_inter |> in\\n {{ environment, state_inter | k \u21d3 output | state\' }} ->\\n {{ environment, state | LowM.DeclareFunction name body k \u21d3 output | state\' }}\\n | CallFunction name arguments k results state_inter state\' :\\n let function := Stack.get_function state.(State.stack) name in\\n {{ environment, state | function arguments \u21d3 results | state_inter }} ->\\n {{ environment, state_inter | k results \u21d3 output | state\' }} ->\\n {{ environment, state | LowM.CallFunction name arguments k \u21d3 output | state\' }}\\n | Let {B 
: Set} (e1 : LowM.t B) k state_inter output_inter state\' :\\n {{ environment, state | e1 \u21d3 output_inter | state_inter }} ->\\n {{ environment, state_inter | k output_inter \u21d3 output | state\' }} ->\\n {{ environment, state | LowM.Let e1 k \u21d3 output | state\' }}\\n\\n where \\"{{ environment , state | e \u21d3 output | state\' }}\\" :=\\n (t environment state output e state\').\\nEnd Run.\\n```\\n\\nWe use the notation:\\n\\n```coq\\n{{ environment , state | e \u21d3 output | state\' }}\\n```\\n\\nto say that a certain monadic expression `e` evaluates to the value `output`, with the environment `environment`, the initial state `state`, and the final state `state\'`. We define the evaluation rules for each primitive of the monad.\\n\\n### Evaluation function\\n\\nWe also define an evaluation function that will be useful in further tests to extract the Coq code back to OCaml and run tests to compare its behavior with the original Yul code. We define the evaluation function as follows:\\n\\n```coq\\n(** A function to evaluate an expression given enough [fuel]. *)\\nFixpoint eval {A : Set}\\n (fuel : nat)\\n (environment : Environment.t)\\n (e : LowM.t A) :\\n State.t -> (A + string) * State.t :=\\n match fuel with\\n | O => fun state => (inr \\"out of fuel\\", state)\\n | S fuel =>\\n match e with\\n | LowM.Pure output => fun state => (inl output, state)\\n | LowM.Primitive primitive k =>\\n fun state =>\\n let value_state := eval_primitive environment primitive state in\\n match value_state with\\n | inl (value, state) => eval fuel environment (k value) state\\n | inr error => (inr error, state)\\n end\\n | LowM.DeclareFunction name body k =>\\n (* ... other cases ... *)\\n```\\n\\nIt uses a `fuel` parameter to make sure that the evaluation terminates. For a monadic expression `e` and an initial state and environment, it returns either the value of the expression or an error message, as well as a final state. 
The error might be due to an unexpected branch in the code, like a `break` outside a loop, or to a lack of fuel. We plan to prove that it is equivalent to the evaluation rules defined above.\\n\\n## Testing\\n\\nTo test that our translation works, we ran it on all the Solidity files in the test suite of the Solidity compiler. There are, at the time of writing, 4856 `.sol` example files in the [semanticTests](https://github.com/ethereum/solidity/tree/develop/test/libsolidity/semanticTests) and [syntaxTests](https://github.com/ethereum/solidity/tree/develop/test/libsolidity/syntaxTests) folders. On each of them we run the Solidity compiler with the `--ir-coq` flag to generate the Coq code. This works for most of the test files, although some of the test files have a special format that combines several Solidity files into one file that we do not handle yet. We then type-check the generated code with Coq, which succeeds for all the Solidity files we translate.\\n\\nA more complex check is to ensure that our semantics is correct, that is to say that when we run our `eval` function in Coq on a smart contract, we get the same output as running this smart contract on an actual EVM once compiled with the Solidity compiler. We have a mechanism to convert the expected execution traces in the semantic tests into equivalent checks in Coq. We succeed in more than 90% of the test cases now. There are still a few builtin functions that we need to implement, like pre-compiled contracts.\\n\\n## Existing solutions\\n\\nThere are already a few formal verification tools for Solidity, as smart contracts are an important kind of program to check. A few of them, like the [Certora Prover](https://www.certora.com/), are closed source. Most work at the EVM bytecode level, as the semantics of the EVM is simpler than the semantics of Solidity. A disadvantage of working at the EVM level is that this is a low-level language, so the code is hard to understand (explicit stack manipulations, ...). 
This is the reason why we believe this approach is mostly used with automated verification tools.\\n\\nIt is hard to reach fairly complete support for the Solidity language, despite many attempts, including [one of ours](https://gitlab.com/formal-land/coq-of-solidity). We can cite the [Verisol](https://github.com/microsoft/verisol) project from Microsoft to verify Solidity programs.\\n\\nThe Yul language offers a good compromise between the high-level Solidity language and the low-level EVM bytecode. It was actually designed with *formal verification in mind*, according to its documentation. These [notes](https://hackmd.io/@FranckC/BJz02K4Za) from [Franck Cassez](https://franck44.github.io/) give a good overview of the formal verification efforts for Yul. One of the conclusions is that a lot of the existing work is either incomplete/unmaintained or not designed for the formal verification of smart contracts, but rather to verify the Yul language itself. As a result, they propose a formal verification framework for Yul in [Dafny](https://dafny.org/) with [yul-dafny](https://github.com/franck44/yul-dafny).\\n\\n:::warning For more\\n\\nIf you have smart contract projects that you want to formally verify, going further than a manual audit to find bugs, contact us at [contact@formal.land](mailto:contact@formal.land)! Formal verification has the strong advantage of covering all possible execution cases.\\n\\n:::\\n\\n## Conclusion\\n\\nWe have presented our ongoing development of a formal verification tool for Solidity using the Coq proof assistant. We have briefly shown how we translate Solidity code to Coq using the Yul intermediate language and how we define the semantics of Yul in Coq. We have tested our tool on the examples of the Solidity compiler\'s test suite to check that our formalization is correct.\\n\\nOur next steps will be to:\\n\\n1. Complete our definitions of Ethereum\'s primitives, to reach a 100% success rate on the Solidity test suite.\\n2. 
Formally specify and verify an example of contract, looking at the [erc20.sol](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret%40experiments-with-yul/test/libsolidity/semanticTests/various/erc20.sol) example."},{"id":"/2024/06/05/software-correctness-from-first-principles","metadata":{"permalink":"/blog/2024/06/05/software-correctness-from-first-principles","source":"@site/blog/2024-06-05-software-correctness-from-first-principles.md","title":"\ud83e\udd84 Software correctness from first principles","description":"Formal verification is a technique to verify the absence of bugs in a program by reasoning from first principles. Instead of testing a program on examples, what covers a finite number of cases, formal verification checks all possible cases. It does so by going back to the definition of programming languages, showing why the whole code is correct given how each individual keyword behaves.","date":"2024-06-05T00:00:00.000Z","formattedDate":"June 5, 2024","tags":[{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"software correctness","permalink":"/blog/tags/software-correctness"},{"label":"first principles","permalink":"/blog/tags/first-principles"},{"label":"example","permalink":"/blog/tags/example"},{"label":"Python","permalink":"/blog/tags/python"}],"readingTime":7.425,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd84 Software correctness from first principles","tags":["formal verification","software correctness","first principles","example","Python"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 1","permalink":"/blog/2024/06/28/coq-of-solidity-1"},"nextItem":{"title":"\ud83d\udc0d Simulation of Python code from traces in Coq","permalink":"/blog/2024/05/22/translation-of-python-code-simulations-from-trace"}},"content":"**Formal verification** is a technique to verify the **absence of bugs** in a program by reasoning from 
**first principles**. Instead of testing a program on examples, which covers a finite number of cases, formal verification checks **all possible cases**. It does so by going back to the **definition of programming languages**, showing why the whole code is correct given how each individual keyword behaves.\\n\\nWe will present this idea in detail and illustrate how it works for a very simple example.\\n\\n\x3c!-- truncate --\x3e\\n\\n## Use of formal verification\\n\\nWe typically use formal verification for critical applications, where either:\\n\\n- life is at stake, like in the case of trains, airplanes, medical devices, or\\n- money is at stake, like in the case of financial applications.\\n\\nWith formal verification, in theory, **we can guarantee that the software will never fail**, as we can check **all possible cases** for a given property. A property can be that no non-admin users can read sensitive data, or that a program never fails with uncaught exceptions. For that to be truly the case, we need to verify the whole software stack for all the relevant properties.\\n\\nIn the research paper [Finding and Understanding Bugs in C Compilers](https://users.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf), no bugs were found in the middle-end of the formally verified [CompCert](https://en.wikipedia.org/wiki/CompCert) C compiler, while the other C compilers (GCC, LLVM, ...) all contained subtle bugs. This illustrates that formal verification can be an effective way to make complex software with zero bugs!\\n\\n## Definition of programming languages\\n\\nTo reason about a program, we go back to the definition of programming languages. Programming languages (C, JavaScript, Python, ...) are generally defined with a precise set of rules. 
For example, in Python, the `if` statement is [defined in the reference manual](https://docs.python.org/3/reference/compound_stmts.html#if) by:\\n\\n```python\\nif_stmt ::= \\"if\\" assignment_expression \\":\\" suite\\n (\\"elif\\" assignment_expression \\":\\" suite)*\\n [\\"else\\" \\":\\" suite]\\n```\\n> It selects exactly one of the suites by evaluating the expressions one by one until one is found to be true (see section Boolean operations for the definition of true and false); then that suite is executed (and no other part of the if statement is executed or evaluated). If all expressions are false, the suite of the else clause, if present, is executed.\\n>\\n> — The Python\'s reference manual\\n\\nThis means that the Python code:\\n\\n```python\\nif condition:\\n a\\nelse:\\n b\\n```\\n\\nwill execute `a` when the `condition` is true, and `b` otherwise. There are similar rules for all other program constructs (loops, function definitions, classes, ...).\\n\\nTo make these rules more manageable, we generally split them into two parts:\\n\\n- The syntax part, that defines what is a valid program in the language. For example, in Python, the syntax is defined by the [grammar](https://docs.python.org/3/reference/grammar.html).\\n- The semantics part, that defines what a program does. This is what we have seen above with the description of the behavior of the `if` statement.\\n\\nIn formal verification, we will focus on the semantics of programs, assuming that the syntax is already verified by the compiler or interpreter, generating \\"syntax errors\\" in case of ill-formed programs.\\n\\n## Example to verify\\n\\nWe consider this short Python example of a function returning the maximum number in a list:\\n\\n```python\\ndef my_max(l):\\n m = l[0]\\n for x in l:\\n if x > m:\\n m = x\\n return m\\n```\\n\\nWe assume that the list `l` is not empty and only contains integers. 
If we run it on a few examples:\\n\\n```python\\nmy_max([1, 2, 3]) # => 3\\nmy_max([3, 2, 1]) # => 3\\nmy_max([1, 3, 2]) # => 3\\n```\\n\\nit always returns `3`, the biggest number in the list! But can we make sure this is always the case?\\n\\nWe can certainly not run `my_max` on all possible lists of integers, as there are infinitely many of them. We need to reason from the definition of the Python language, which is what we call formal verification reasoning.\\n\\n## Formal verification\\n\\nHere is a general specification that we give of the `my_max` function above:\\n\\n```python\\nforall (index : int) (l : list[int]),\\n 0 \u2264 index < len(l) \u21d2\\n l[index] \u2264 my_max(l)\\n```\\n\\nIt says that for all integer `index` and list of integers `l`, if the index is valid (between `0` and the length of the list), then the element at this index is less than or equal to the maximum of the list that we compute.\\n\\nTo verify this property for all possible list `l`, we reason by induction. A non-empty list is either:\\n\\n- a list with one element, where the maximum is the only element, or\\n- a list with at least two elements, where the maximum is either the last element or the maximum of the rest of the list.\\n\\nAt the start of the code, we will always have:\\n\\n```python\\ndef my_max(l):\\n m = l[0]\\n```\\n\\nwith `m` being equal to the first item of the list. Then:\\n\\n- If the list has only one element, we iterate only once in the `for` loop, with `x` equal to `l[0]`. The condition:\\n ```python\\n if x > m:\\n ```\\n is then equivalent to:\\n ```python\\n if l[0] > l[0]:\\n ```\\n and is always false. We then return `m = l[0]`, which is the only element of the list, and it verifies our property as:\\n ```python\\n l[0] \u2264 l[0]\\n ```\\n- If the list has at least two elements, we unroll the code execution of the `for` loop and iterate over all the elements until the last one. 
Our induction hypothesis tells us that the property we verify is true for the first part of the list, excluding the last element. This means that:\\n ```python\\n l[index] \u2264 m\\n ```\\n for all `index` between `0` and `len(l) - 2`. When we reach the last element, we have:\\n ```python\\n if x > m:\\n m = x\\n ```\\n with `x` being `l[len(l) - 1]`. There are two possibilities. Either *(i)* `x` is less than or equal to `m`, and we do not update `m`, or *(ii)* `x` is greater than `m`, and we update `m` to `x`. In both cases, the property is verified for the last element of the list, as:\\n 1. In the first case, `m` stays the same, so it is still larger than or equal to all the elements of the list except the last one, as well as larger than or equal to the last one according to this last `if` statement.\\n 2. In the second case, `m` is updated to `x`, which is the last element of the list and a greater value than the original `m`. This means that `m` is still larger than or equal to all the elements of the list except the last one, being larger than the original `m`, and larger than or equal to the last one as it is in fact equal to the last one.\\n\\nWe have now closed our induction proof and verified that our property is true for all possible lists of integers! The reasoning above is rather verbose but should actually correspond to the intuition of most programmers when reading this code.\\n\\nIn practice, with formal verification, the reasoning above is done in a proof assistant such as [Coq](https://coq.inria.fr/) to help make sure that we did not forget any case, and to automatically solve simple cases for us. 
Having a proof written in a proof language like Coq also allows us to re-run it to check that it is still valid after a change in the code, and allows third parties to check it without reading all the details.\\n\\n## Completing the property\\n\\nAn additional property that we did not verify is:\\n\\n```python\\nforall (l : list[int]),\\n exists (index : int),\\n 0 \u2264 index < len(l) and\\n l[index] = my_max(l)\\n```\\n\\nIt says that the value returned by `my_max` is actually an element of the list. We can verify it by induction in the same way as we did for the first property. We leave the details of this verification as an exercise.\\n\\n:::info For more\\n\\nIf you want to go into more detail on the formal verification of Python programs, you can look at our [coq-of-python](https://github.com/formal-land/coq-of-python) project, where we define the semantics of Python in Coq and verify properties of Python programs (ongoing project!). We also provide formal verification services for [Rust](https://github.com/formal-land/coq-of-rust) and other languages like [OCaml](https://github.com/formal-land/coq-of-ocaml). Contact us at [contact@formal.land](mailto:contact@formal.land) to discuss if you have critical applications to check!\\n\\n:::\\n\\n## Conclusion\\n\\nWe have presented here the idea of **formal verification**, a technique to verify the absence of bugs in a program by reasoning from **first principles**. We have illustrated this idea for a simple Python example, showing how we can verify that a function computing the maximum of a list is correct **for all possible lists of integers**.\\n\\nWe will continue with more blog posts explaining what we can do with formal verification and why it matters. 
Feel free to share this post and to tell us what subjects you want to see covered!"},{"id":"/2024/05/22/translation-of-python-code-simulations-from-trace","metadata":{"permalink":"/blog/2024/05/22/translation-of-python-code-simulations-from-trace","source":"@site/blog/2024-05-22-translation-of-python-code-simulations-from-trace.md","title":"\ud83d\udc0d Simulation of Python code from traces in Coq","description":"In order to formally verify Python code in Coq our approach is the following:","date":"2024-05-22T00:00:00.000Z","formattedDate":"May 22, 2024","tags":[{"label":"coq-of-python","permalink":"/blog/tags/coq-of-python"},{"label":"Python","permalink":"/blog/tags/python"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"},{"label":"Ethereum","permalink":"/blog/tags/ethereum"},{"label":"simulation","permalink":"/blog/tags/simulation"},{"label":"trace","permalink":"/blog/tags/trace"}],"readingTime":8.59,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc0d Simulation of Python code from traces in Coq","tags":["coq-of-python","Python","Coq","translation","Ethereum","simulation","trace"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd84 Software correctness from first principles","permalink":"/blog/2024/06/05/software-correctness-from-first-principles"},"nextItem":{"title":"\ud83d\udc0d Simulation of Python code in Coq","permalink":"/blog/2024/05/14/translation-of-python-code-simulations"}},"content":"In order to formally verify Python code in Coq our approach is the following:\\n\\n1. Import Python code in Coq by running [coq-of-python](https://github.com/formal-land/coq-of-python).\\n2. Write a purely functional simulation in Coq of the code.\\n3. Show that this simulation is equivalent to the translation.\\n4. Verify the simulation.\\n\\nWe will show in this article how we can merge the steps 2. and 3. to save time in the verification process. 
We do so by relying on the proof mode of Coq and unification.\\n\\nOur mid-term goal is to formally specify the [Ethereum Virtual Machine](https://ethereum.org/en/developers/docs/evm/) (EVM) and prove that this specification is correct with respect to the [reference implementation of the EVM](https://github.com/ethereum/execution-specs) in Python. This would ensure that the specification is always up to date and exhaustive. The code of this project is open-source and available on GitHub: [formal-land/coq-of-python](https://github.com/formal-land/coq-of-python).\\n\\n\x3c!-- truncate --\x3e\\n\\n
\\n ![Python at work](2024-05-22/python.webp)\\n
\\n\\n## Our Python\'s monad \ud83d\udc0d\\n\\nWe put the Python code that we import in Coq in a monad `M` to represent all the features that are hard to express in Coq, mainly the side effects. This monad is a combination of two levels:\\n\\n- `LowM` for the side effects except the control flow.\\n- `M` that adds an error monad on top of `LowM` to handle the control flow (exceptions, `break` instruction, ...).\\n\\n### LowM\\n\\nHere is the definition of the `LowM` monad in [CoqOfPython.v](https://github.com/formal-land/coq-of-python/blob/main/CoqOfPython/CoqOfPython.v):\\n\\n```coq\\nModule Primitive.\\n Inductive t : Set -> Set :=\\n | StateAlloc (object : Object.t Value.t) : t (Pointer.t Value.t)\\n | StateRead (mutable : Pointer.Mutable.t Value.t) : t (Object.t Value.t)\\n | StateWrite (mutable : Pointer.Mutable.t Value.t) (update : Object.t Value.t) : t unit\\n | GetInGlobals (globals : Globals.t) (name : string) : t Value.t.\\nEnd Primitive.\\n\\nModule LowM.\\n Inductive t (A : Set) : Set :=\\n | Pure (a : A)\\n | CallPrimitive {B : Set} (primitive : Primitive.t B) (k : B -> t A)\\n | CallClosure {B : Set} (closure : Data.t Value.t) (args kwargs : Value.t) (k : B -> t A)\\n | Impossible.\\n Arguments Pure {_}.\\n Arguments CallPrimitive {_ _}.\\n Arguments CallClosure {_ _}.\\n Arguments Impossible {_}.\\n\\n Fixpoint bind {A B : Set} (e1 : t A) (e2 : A -> t B) : t B :=\\n match e1 with\\n | Pure a => e2 a\\n | CallPrimitive primitive k => CallPrimitive primitive (fun v => bind (k v) e2)\\n | CallClosure closure args kwargs k => CallClosure closure args kwargs (fun a => bind (k a) e2)\\n | Impossible => Impossible\\n end.\\nEnd LowM.\\n```\\n\\nThis is a monad defined by continuation (the variable `k`):\\n\\n- We terminate a computation with the primitive `Pure` and some result `a`, that can be any purely functional expression.\\n- We can call some primitives grouped in `Primitive.t` that are side effects:\\n - `StateAlloc` to allocate a new object in the 
memory,\\n - `StateRead` to read an object from the memory,\\n - `StateWrite` to write an object in the memory,\\n - `GetInGlobals` to read a global variable, doing name resolution. This is a side effect as function definitions in Python do not need to be ordered.\\n- We can call a closure (an anonymous function) with `CallClosure`. This is required for termination, as we cannot define an eval function on the type of Python values since some of them do not terminate, like the [\u03a9 expression](https://medium.com/@dkeout/why-you-must-actually-understand-the-%CF%89-and-y-combinators-c9204241da7a). See our previous post [Translation of Python code to Coq](/blog/2024/05/10/translation-of-python-code) for our definition of Python values. The combinator `CallClosure` is also very convenient to modularize our proofs: we reason on each closure independently.\\n- We can mark a code path as unreachable with `Impossible`.\\n\\n### M\\n\\nThe final monad `M` is defined as:\\n\\n```coq\\nDefinition M : Set :=\\n LowM.t (Value.t + Exception.t).\\n```\\n\\nIt has no type parameters, as Python is dynamically typed, so all expressions have the same result type:\\n\\n- either a success value of type `Value.t`,\\n- or an exception of type `Exception.t`, with some special cases to represent a `return`, a `break`, or a `continue` instruction.\\n\\nWe define the monadic bind of `M` as for the error monad:\\n\\n```coq\\nDefinition bind (e1 : M) (e2 : Value.t -> M) : M :=\\n LowM.bind e1 (fun v => match v with\\n | inl v => e2 v\\n | inr e => LowM.Pure (inr e)\\n end).\\n```\\n\\n## Traces \ud83d\udc3e\\n\\nWe define our semantics of a computation `e` of type `M` in [simulations/proofs/CoqOfPython.v](https://github.com/formal-land/coq-of-python/blob/main/CoqOfPython/simulations/proofs/CoqOfPython.v) with the predicate:\\n\\n```coq\\n{{ stack, heap | e \u21d3 to_value | P_stack, P_heap }}\\n```\\n\\nthat we call a _run_ or a _trace_, saying that:\\n\\n- starting from the initial state `stack`, `heap`,\\n- the 
computation `e` terminates with a value,\\n- that is in the image of the function `to_value`,\\n- and with a final stack and heap that satisfy the predicates `P_stack` and `P_heap`.\\n\\nNote that we do not make explicit the resulting value and memory state of a computation in this predicate. We only say that they exist and satisfy a few properties, which are here for compositionality. We have a pure function `evaluate` that can derive the result of a run of a computation:\\n\\n```coq\\nevaluate :\\n forall `{Heap.Trait} {A B : Set}\\n {stack : Stack.t} {heap : Heap} {e : LowM.t B}\\n {to_value : A -> B} {P_stack : Stack.t -> Prop} {P_heap : Heap -> Prop}\\n (run : {{ stack, heap | e \u21d3 to_value | P_stack, P_heap }}),\\n A * { stack : Stack.t | P_stack stack } * { heap : Heap | P_heap heap }\\n```\\n\\nThe function `evaluate` is defined in Coq by a `Fixpoint`. Its result is what we call a _simulation_, which is a purely functional definition equivalent to the original computation `e` from Python. 
It is equivalent by construction.\\n\\n## Building a trace \ud83d\udd28\\n\\nA trace is an inductive in `Set` that we can build with the following constructors:\\n\\n```coq\\nInductive t `{Heap.Trait} {A B : Set}\\n (stack : Stack.t) (heap : Heap)\\n (to_value : A -> B) (P_stack : Stack.t -> Prop) (P_heap : Heap -> Prop) :\\n LowM.t B -> Set :=\\n(* [Pure] primitive *)\\n| Pure\\n (result : A)\\n (result\' : B) :\\n result\' = to_value result ->\\n P_stack stack ->\\n P_heap heap ->\\n {{ stack, heap |\\n LowM.Pure result\' \u21d3\\n to_value\\n | P_stack, P_heap }}\\n(* [StateRead] primitive *)\\n| CallPrimitiveStateRead\\n (mutable : Pointer.Mutable.t Value.t)\\n (object : Object.t Value.t)\\n (k : Object.t Value.t -> LowM.t B) :\\n IsRead.t stack heap mutable object ->\\n {{ stack, heap |\\n k object \u21d3\\n to_value\\n | P_stack, P_heap }} ->\\n {{ stack, heap |\\n LowM.CallPrimitive (Primitive.StateRead mutable) k \u21d3\\n to_value\\n | P_stack, P_heap }}\\n(* [CallClosure] primitive *)\\n| CallClosure {C : Set}\\n (f : Value.t -> Value.t -> M)\\n (args kwargs : Value.t)\\n (to_value_inter : C -> Value.t + Exception.t)\\n (P_stack_inter : Stack.t -> Prop) (P_heap_inter : Heap -> Prop)\\n (k : Value.t + Exception.t -> LowM.t B) :\\n let closure := Data.Closure f in\\n {{ stack, heap |\\n f args kwargs \u21d3\\n to_value_inter\\n | P_stack_inter, P_heap_inter }} ->\\n (* We quantify over every possible values as we cannot compute the result of the closure here.\\n We only know that it exists and respects some constraints in this inductive definition. *)\\n (forall value_inter stack_inter heap_inter,\\n P_stack_inter stack_inter ->\\n P_heap_inter heap_inter ->\\n {{ stack_inter, heap_inter |\\n k (to_value_inter value_inter) \u21d3\\n to_value\\n | P_stack, P_heap }}\\n ) ->\\n {{ stack, heap |\\n LowM.CallClosure closure args kwargs k \u21d3\\n to_value\\n | P_stack, P_heap }}\\n(* ...cases for the other primitives of the monad... 
*)\\n```\\n\\n### Pure\\n\\nIn the `Pure` case we return the final result of the computation. We check the state fulfills the predicate `P_stack` and `P_heap`, and that the result is the image by the function `to_value` of some `result`.\\n\\n### CallPrimitiveStateRead\\n\\nTo read a value in memory, we rely on another predicate `IsRead` that checks if the `mutable` pointer is valid in the `stack` or `heap` and that the `object` is the value at this pointer. We then call the continuation `k` with this object. We have similar rules for allocating a new object in memory and writing at a pointer.\\n\\nNote that we parameterize all our semantics by `` `{Heap.Trait}`` that provides a specific `Heap` type with read and write primitives. We can choose the implementation of the memory model that we want to use in our simulations in order to simplify the reasoning.\\n\\n### CallClosure\\n\\nTo call a closure, we first evaluate the closure with the arguments and keyword arguments. We then call the continuation `k` with the result of the closure. We quantify over all possible results of the closure, as we cannot compute it here. This would require to be able to define `Fixpoint` together with `Inductive`, which is not possible in Coq. So we only know that the result of the closure exists, and can use the constraints on its result (the function `to_value` and the predicates `P_stack_inter` and `P_heap_inter`) to build a run of the continuation.\\n\\nThe other constructors are not presented here but are similar to the above. We will also add a monadic primitive for loops with the following idea: we show that a loop terminates by building a trace, as traces are `Inductive` so must be finite. 
We have no rules for the `Impossible` case, so building the trace of a computation also shows that the `Impossible` calls are in unreachable paths.\\n\\n## Example \ud83d\udd0d\\n\\nWe have applied this technique to a small code example with allocation, memory read, and closure call primitives. We were able to show that the resulting simulation obtained by running `evaluate` on the trace is equal to a simulation written by hand. The proof was just the tactic `reflexivity`. We believe that we can automate most of the tactics used to build a run, except for the allocations where the user needs to make a choice (immediate, stack, or heap allocation, which address, ...).\\n\\nTo continue our experiments we now need to complete our semantics of Python, especially to take into account method and operator calls.\\n\\n## Conclusion\\n\\nWe have presented an alternative way to build simulations of imperative Python code in purely functional Coq code. The idea is to enable faster reasoning over Python code by removing the need to build explicit simulations. We plan to port this technique to other tools like [coq-of-rust](https://github.com/formal-land/coq-of-rust) as well.\\n\\nTo see what we can do for you, talk with us at [contact@formal.land](mailto:contact@formal.land) \ud83c\udfc7. For our previous projects, see our [formal verification of Tezos\' L1](https://formal-land.gitlab.io/coq-tezos-of-ocaml/)!"},{"id":"/2024/05/14/translation-of-python-code-simulations","metadata":{"permalink":"/blog/2024/05/14/translation-of-python-code-simulations","source":"@site/blog/2024-05-14-translation-of-python-code-simulations.md","title":"\ud83d\udc0d Simulation of Python code in Coq","description":"We are continuing to specify the Ethereum Virtual Machine (EVM) in the formal verification language Coq. 
We are working from the automatic translation in Coq of the reference implementation of the EVM, which is written in the language Python.","date":"2024-05-14T00:00:00.000Z","formattedDate":"May 14, 2024","tags":[{"label":"coq-of-python","permalink":"/blog/tags/coq-of-python"},{"label":"Python","permalink":"/blog/tags/python"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"},{"label":"Ethereum","permalink":"/blog/tags/ethereum"}],"readingTime":6.63,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc0d Simulation of Python code in Coq","tags":["coq-of-python","Python","Coq","translation","Ethereum"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83d\udc0d Simulation of Python code from traces in Coq","permalink":"/blog/2024/05/22/translation-of-python-code-simulations-from-trace"},"nextItem":{"title":"\ud83d\udc0d Translation of Python code to Coq","permalink":"/blog/2024/05/10/translation-of-python-code"}},"content":"We are continuing to specify the [Ethereum Virtual Machine](https://ethereum.org/en/developers/docs/evm/) (EVM) in the formal verification language [Coq](https://coq.inria.fr/). We are working from the [automatic translation in Coq](https://github.com/formal-land/coq-of-python/tree/main/CoqOfPython/ethereum) of the [reference implementation of the EVM](https://github.com/ethereum/execution-specs), which is written in the language [Python](https://www.python.org/).\\n\\nIn this article, we will see how we specify the EVM in Coq by writing an interpreter that closely mimics the behavior of the Python code. We call that implementation a _simulation_ as it aims to reproduce the behavior of the Python code, the reference.\\n\\nIn contrast to the automatic translation from Python, the simulation is a manual translation written in idiomatic Coq. We expect it to be ten times smaller in lines compared to the automatic translation, and of about the same size as the Python code. 
This is because the automatic translation needs to encode all the Python specific features in Coq, like variable mutations and the class system.\\n\\nIn the following article, we will show how we can prove that the simulation is correct, meaning that it behaves exactly as the automatic translation.\\n\\nThe code of this project is open-source and available on GitHub: [formal-land/coq-of-python](https://github.com/formal-land/coq-of-python). This work follows a call from [Vitalik Buterin](https://en.wikipedia.org/wiki/Vitalik_Buterin) for more formal verification of the Ethereum\'s code.\\n\\n\x3c!-- truncate --\x3e\\n\\n
\\n ![Python writing simulations](2024-05-14/python_simulation.webp)\\n
\\n\\n## The `add` function \ud83e\uddee\\n\\nWe focus on a simulation for the `add` function in [vm/instructions/arithmetic.py](https://github.com/ethereum/execution-specs/blob/master/src/ethereum/paris/vm/instructions/arithmetic.py) that implements the addition primitive of the EVM. The Python code is:\\n\\n```python\\ndef add(evm: Evm) -> None:\\n \\"\\"\\"\\n Adds the top two elements of the stack together, and pushes the result back\\n on the stack.\\n\\n Parameters\\n ----------\\n evm :\\n The current EVM frame.\\n\\n \\"\\"\\"\\n # STACK\\n x = pop(evm.stack)\\n y = pop(evm.stack)\\n\\n # GAS\\n charge_gas(evm, GAS_VERY_LOW)\\n\\n # OPERATION\\n result = x.wrapping_add(y)\\n\\n push(evm.stack, result)\\n\\n # PROGRAM COUNTER\\n evm.pc += 1\\n```\\n\\nMost of the functions of the interpreter are written in this style. They take the global state of the interpreter, called `Evm` as input, and mutate it with the effect of the current instruction.\\n\\nThe `Evm` structure is defined as:\\n\\n```python\\n@dataclass\\nclass Evm:\\n \\"\\"\\"The internal state of the virtual machine.\\"\\"\\"\\n\\n pc: Uint\\n stack: List[U256]\\n memory: bytearray\\n code: Bytes\\n gas_left: Uint\\n env: Environment\\n valid_jump_destinations: Set[Uint]\\n logs: Tuple[Log, ...]\\n refund_counter: int\\n running: bool\\n message: Message\\n output: Bytes\\n accounts_to_delete: Set[Address]\\n touched_accounts: Set[Address]\\n return_data: Bytes\\n error: Optional[Exception]\\n accessed_addresses: Set[Address]\\n accessed_storage_keys: Set[Tuple[Address, Bytes32]]\\n```\\n\\nIt contains the current instruction pointer `pc`, the stack of the EVM, the memory, the code, the gas left, ...\\n\\nAs the EVM is a stack-based machine, the addition function does the following:\\n\\n1. It pops the two top elements of the stack `x` and `y`,\\n2. It charges a very low amount of gas,\\n3. It computes the result of the addition `result = x + y`,\\n4. It pushes the result back on the stack,\\n5. 
It increments the program counter `pc`.\\n\\nNote that all these operations might fail and raise an exception, for example, if the stack is empty when we pop `x` and `y` at the beginning.\\n\\n## Monad for the simulations \ud83e\uddea\\n\\nThe main side-effects that we want to integrate into the Coq simulations are:\\n\\n- the mutation of the global state `Evm`,\\n- the raising of exceptions.\\n\\nFor that, we use a state and error monad `MS?`:\\n\\n```coq\\nModule StateError.\\n Definition t (State Error A : Set) : Set :=\\n State -> (A + Error) * State.\\n\\n Definition return_ {State Error A : Set}\\n (value : A) :\\n t State Error A :=\\n fun state => (inl value, state).\\n\\n Definition bind {State Error A B : Set}\\n (value : t State Error A)\\n (f : A -> t State Error B) :\\n t State Error B :=\\n fun state =>\\n let (value, state) := value state in\\n match value with\\n | inl value => f value state\\n | inr error => (inr error, state)\\n end.\\nEnd StateError.\\n\\nNotation \\"MS?\\" := StateError.t.\\n```\\n\\nWe parametrize it by an equivalent definition in Coq of the type `Evm` and the type of exceptions that we might raise.\\n\\nIn Python, exceptions are classes that are extended as needed to add new kinds of exceptions. We use a closed sum type in Coq to represent all the possible exceptions that might happen in the EVM interpreter.\\n\\nFor the `Evm` state, some functions might actually only modify a part of it. For example, the `pop` function only modifies the `stack` field. We use a [lens](https://medium.com/javascript-scene/lenses-b85976cb0534) mechanism to specialize the state monad to only modify a part of the state. For example, the `pop` function has the type:\\n\\n```coq\\npop : MS? (list U256.t) Exception.t U256.t\\n```\\n\\nwhere `list U256.t` is the type of the stack, while the `add` function has type:\\n\\n```coq\\nadd : MS? 
Evm.t Exception.t unit\\n```\\n\\nWe define a lens for the stack in the `Evm` type with:\\n\\n```coq\\nModule Lens.\\n Record t (Big_A A : Set) : Set := {\\n read : Big_A -> A;\\n write : Big_A -> A -> Big_A\\n }.\\nEnd Lens.\\n\\nModule Evm.\\n Module Lens.\\n Definition stack : Lens.t Evm.t (list U256.t) := {|\\n Lens.read := (* ... *);\\n Lens.write := (* ... *);\\n |}.\\n```\\n\\nWe can then lift the `pop` function to be used in a context where the `Evm` state is modified with:\\n\\n```coq\\nletS? x := StateError.lift_lens Evm.Lens.stack pop in\\n```\\n\\n## Typing discipline \ud83d\udc6e\\n\\nWe keep in Coq all the type names from the Python source code. When a new class is created we create a new Coq type. When the class inherits from another one, we add a field in the Coq type to represent the parent class. Thus we work by composition rather than inheritance.\\n\\nHere is an example of the primitive types defined in [base_types.py](https://github.com/ethereum/execution-specs/blob/master/src/ethereum/base_types.py):\\n\\n```python\\nclass FixedUint(int):\\n MAX_VALUE: ClassVar[\\"FixedUint\\"]\\n\\n # ...\\n\\n def __add__(self: T, right: int) -> T:\\n # ...\\n\\nclass U256(FixedUint):\\n MAX_VALUE = 2**256 - 1\\n\\n # ...\\n```\\n\\nWe simulate it by:\\n\\n```coq\\nModule FixedUint.\\n Record t : Set := {\\n MAX_VALUE : Z;\\n value : Z;\\n }.\\n\\n Definition __add__ (self right_ : t) : M? Exception.t t :=\\n (* ... *).\\nEnd FixedUint.\\n\\nModule U256.\\n Inductive t : Set :=\\n | Make (value : FixedUint.t).\\n\\n Definition of_Z (value : Z) : t :=\\n Make {|\\n FixedUint.MAX_VALUE := 2^256 - 1;\\n FixedUint.value := value;\\n |}.\\n\\n (* ... 
*)\\nEnd U256.\\n```\\n\\nFor the imports, that are generally written with an explicit list of names:\\n\\n```python\\nfrom ethereum.base_types import U255_CEIL_VALUE, U256, U256_CEIL_VALUE, Uint\\n```\\n\\nwe follow the same pattern in Coq:\\n\\n```coq\\nRequire ethereum.simulations.base_types.\\nDefinition U255_CEIL_VALUE := base_types.U255_CEIL_VALUE.\\nModule U256 := base_types.U256.\\nDefinition U256_CEIL_VALUE := base_types.U256_CEIL_VALUE.\\nModule Uint := base_types.Uint.\\n```\\n\\nThis is a bit more verbose than the usual way in Coq to import a module, but it makes the translation more straightforward.\\n\\n## Final simulation \ud83e\udeb6\\n\\nFinally, our Coq simulation of the `add` function is the following:\\n\\n```coq\\nDefinition add : MS? Evm.t Exception.t unit :=\\n (* STACK *)\\n letS? x := StateError.lift_lens Evm.Lens.stack pop in\\n letS? y := StateError.lift_lens Evm.Lens.stack pop in\\n\\n (* GAS *)\\n letS? _ := charge_gas GAS_VERY_LOW in\\n\\n (* OPERATION *)\\n let result := U256.wrapping_add x y in\\n\\n letS? _ := StateError.lift_lens Evm.Lens.stack (push result) in\\n\\n (* PROGRAM COUNTER *)\\n letS? _ := StateError.lift_lens Evm.Lens.pc (fun pc =>\\n (inl tt, Uint.__add__ pc (Uint.Make 1))) in\\n\\n returnS? tt.\\n```\\n\\nWe believe that it has a size and readability close to the original Python code. You can look at this definition in [vm/instructions/simulations/arithmetic.v](https://github.com/formal-land/coq-of-python/blob/main/CoqOfPython/ethereum/paris/vm/instructions/simulations/arithmetic.v). As a reference, the automatic translation is 65 lines long and in [vm/instructions/arithmetic.v](https://github.com/formal-land/coq-of-python/blob/main/CoqOfPython/ethereum/paris/vm/instructions/arithmetic.v).\\n\\n## Conclusion\\n\\nWe have seen how to write a simulation for one example of a Python function. We now need to do it for the rest of the code of the interpreter. 
We will also see in a following article how to prove that the simulation behaves as the automatic translation of the Python code in Coq.\\n\\nFor our formal verification services, reach us at [contact@formal.land](mailto:contact@formal.land) \ud83c\udfc7! To know more about what we have done, see [our previous project](https://formal-land.gitlab.io/coq-tezos-of-ocaml/) on the verification of the L1 of Tezos."},{"id":"/2024/05/10/translation-of-python-code","metadata":{"permalink":"/blog/2024/05/10/translation-of-python-code","source":"@site/blog/2024-05-10-translation-of-python-code.md","title":"\ud83d\udc0d Translation of Python code to Coq","description":"We are starting to work on a new product, coq-of-python. The idea of this tool is, as you can guess, to translate Python code to the proof system Coq.","date":"2024-05-10T00:00:00.000Z","formattedDate":"May 10, 2024","tags":[{"label":"coq-of-python","permalink":"/blog/tags/coq-of-python"},{"label":"Python","permalink":"/blog/tags/python"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"},{"label":"Ethereum","permalink":"/blog/tags/ethereum"}],"readingTime":10.445,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc0d Translation of Python code to Coq","tags":["coq-of-python","Python","Coq","translation","Ethereum"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83d\udc0d Simulation of Python code in Coq","permalink":"/blog/2024/05/14/translation-of-python-code-simulations"},"nextItem":{"title":"\ud83e\udd80 Translation of the Rust\'s core and alloc crates","permalink":"/blog/2024/04/26/translation-core-alloc-crates"}},"content":"We are starting to work on a new product, [coq-of-python](https://github.com/formal-land/coq-of-python). 
The idea of this tool is, as you can guess, to translate Python code to the [proof system Coq](https://coq.inria.fr/).\\n\\nWe want to import specifications written in Python to a formal system like Coq. In particular, we are interested in the [reference specification](https://github.com/ethereum/execution-specs) of [Ethereum](https://ethereum.org/), which describes how [EVM smart contracts](https://ethereum.org/en/developers/docs/evm/) run. Then, we will be able to use this specification to either formally verify the various implementations of the EVM or smart contracts.\\n\\nAll this effort follows [a Tweet](https://twitter.com/VitalikButerin/status/1759369749887332577) from [Vitalik Buterin](https://en.wikipedia.org/wiki/Vitalik_Buterin) hoping for more formal verification of the Ethereum\'s code:\\n\\n> One application of AI that I am excited about is AI-assisted formal verification of code and bug finding.\\n>\\n> Right now ethereum\'s biggest technical risk probably is bugs in code, and anything that could significantly change the game on that would be amazing.\\n>\\n> — Vitalik Buterin\\n\\nWe will now describe the technical development of `coq-of-python`. For the curious, all the code is on GitHub: [formal-land/coq-of-python](https://github.com/formal-land/coq-of-python).\\n\\n\x3c!-- truncate --\x3e\\n\\n
\\n ![Python with a rooster](2024-05-10/python_rooster.webp)\\n \x3c!--
A python with a rooster
--\x3e\\n
\\n\\n## Reading Python code \ud83d\udcd6\\n\\nA first step we need to do to translate Python code is to read it in a programmatic way. For simplicity and better integration, we chose to write `coq-of-python` in Python.\\n\\nWe use the [ast](https://docs.python.org/3/library/ast.html) module to parse the code and get an abstract syntax tree (AST) of the code. This is a tree representation of the code that we can manipulate in Python. We could have used other representations, such as the Python bytecode, but it seemed too low-level to be understandable by a human.\\n\\nGiven the path to a Python file, we get its AST with the following code:\\n\\n```python\\nimport ast\\n\\ndef read_python_file(path: str) -> ast.Module:\\n with open(path, \\"r\\") as file:\\n return ast.parse(file.read())\\n```\\n\\nThis code is very short, and we benefit from the general elegance of Python. There is no typing or advanced data types in Python, keeping the AST rather small. Here is an extract of it:\\n\\n```\\nexpr = BoolOp(boolop op, expr* values)\\n | NamedExpr(expr target, expr value)\\n | BinOp(expr left, operator op, expr right)\\n | UnaryOp(unaryop op, expr operand)\\n | Lambda(arguments args, expr body)\\n | IfExp(expr test, expr body, expr orelse)\\n | Dict(expr* keys, expr* values)\\n | Set(expr* elts)\\n | ListComp(expr elt, comprehension* generators)\\n | SetComp(expr elt, comprehension* generators)\\n | ... more cases ...\\n```\\n\\nAn expression is described as being of one of several kinds. For example, the application of a binary operator such as:\\n\\n```python\\n1 + 2\\n```\\n\\ncorresponds to the case `BinOp` with `1` as the `left` expression, `+` as the `op` operator, and `2` as the `right` expression.\\n\\n## Outputting Coq code \ud83d\udcdd\\n\\nWe translate each element of the Python\'s AST into a string of Coq code. We keep track of the current indentation level in order to present a nice output. 
Here is the code to translate the binary operator expressions:\\n\\n```python\\ndef generate_expr(indent, is_with_paren, node: ast.expr):\\n if isinstance(node, ast.BoolOp):\\n ...\\n elif isinstance(node, ast.BinOp):\\n return paren(\\n is_with_paren,\\n generate_operator(node.op) + \\" (|\\\\n\\" +\\n generate_indent(indent + 1) +\\n generate_expr(indent + 1, False, node.left) + \\",\\\\n\\" +\\n generate_indent(indent + 1) +\\n generate_expr(indent + 1, False, node.right) + \\"\\\\n\\" +\\n generate_indent(indent) + \\"|)\\"\\n )\\n elif ...\\n```\\n\\nWe have the current number of indentation levels in the `indent` variable. We use the flag `is_with_paren` to know whether we should add parentheses around the current expression if it is the sub-expression of another one.\\n\\nWe apply the `node.op` operator on the two parameters `node.left` and `node.right`. For example, the translation of the Python code `1 + 2` will be:\\n\\n```coq\\nBinOp.add (|\\n Constant.int 1,\\n Constant.int 2\\n|)\\n```\\n\\nWe use a special notation `f (| x1, ..., xn |)` to represent a function application in a monadic context. In the next section, we explain why we need this notation.\\n\\n## Monad and values \ud83d\udd2e\\n\\nOne of the difficulties in translating some code to a language such as Coq is that Coq is purely functional. This means that a function can never modify a variable or raise an exception. The non-purely functional actions are called side-effects.\\n\\nTo solve this issue, we represent the side-effects of the Python code in a [monad]() in Coq. A monad is a special data structure representing the side-effects of a computation. 
We can chain monadic actions together to represent a sequence of side-effects.\\n\\nWe thus have two Coq types:\\n\\n- `Value.t` for the Python values (there is only one type for all values, as Python is a dynamically typed language),\\n- `M` for the monadic expressions.\\n\\nNote that we do not need to parametrize the monad by the type of the values, as we only have one type of value.\\n\\n### Values\\n\\nAccording to the reference manual of Python on the [data model](https://docs.python.org/3/reference/datamodel.html):\\n\\n> All data in a Python program is represented by objects or by relations between objects.\\n\\n> Every object has an identity, a type and a value. An object\u2019s identity never changes once it has been created; you may think of it as the object\u2019s address in memory.\\n\\n> Like its identity, an object\u2019s type is also unchangeable.\\n\\n> The value of some objects can change. Objects whose value can change are said to be mutable; objects whose value is unchangeable once they are created are called immutable.\\n\\nBy following this description, we propose this formalization for the values:\\n\\n```coq\\nModule Data.\\n Inductive t (Value : Set) : Set :=\\n | Ellipsis\\n | Bool (b : bool)\\n | Integer (z : Z)\\n | Tuple (items : list Value)\\n (* ... various other primitive types like lists, ... 
*)\\n | Closure {Value M : Set} (f : Value -> Value -> M)\\n | Klass {Value M : Set}\\n (bases : list (string * string))\\n (class_methods : list (string * (Value -> Value -> M)))\\n (methods : list (string * (Value -> Value -> M))).\\nEnd Data.\\n\\nModule Object.\\n Record t {Value : Set} : Set := {\\n internal : option (Data.t Value);\\n fields : list (string * Value);\\n }.\\nEnd Object.\\n\\nModule Pointer.\\n Inductive t (Value : Set) : Set :=\\n | Imm (data : Object.t Value)\\n | Mutable {Address A : Set}\\n (address : Address)\\n (to_object : A -> Object.t Value).\\nEnd Pointer.\\n\\nModule Value.\\n Inductive t : Set :=\\n | Make (globals : string) (klass : string) (value : Pointer.t t).\\nEnd Value.\\n```\\n\\nWe describe a `Value.t` by:\\n\\n- its type, given by a class name `klass` and a module name `globals` from which the class is defined,\\n- its value, given by a pointer to an object.\\n\\nA `Pointer.t` is either an immutable object `Imm` or a mutable object `Mutable` with an address and a function to get the object from what is stored in the memory. This function `to_object` is required as we plan to allow the user to provide its own custom memory model.\\n\\nAn `Object.t` has a list of named fields that we can populate in the `__init__` method of a class. It also has a special `internal` field that we can use to store special kinds of data, like primitive values.\\n\\nIn `Data.t`, we list the various primitive values that we use to define the primitive types of the Python language. 
We have:\\n\\n- atomic values such as booleans, integers, strings, ...\\n- composite values such as tuples, lists, dictionaries, ...\\n- closures with a function that takes the two arguments `*args` and `**kwargs` and returns a monadic value,\\n- classes with their bases, class methods, and instance methods.\\n\\n### Monad\\n\\nFor now, we axiomatize the monad `M`:\\n\\n```coq\\nParameter M : Set.\\n```\\n\\nWe will see later how to define it, probably by taking some inspiration from our monad from our similar project [coq-of-rust](https://github.com/formal-land/coq-of-rust).\\n\\nTo make the monadic code less heavy, we use a notation inspired by the `async/await` notation of many languages. We believe it to be less heavy than the monadic notation of languages like [Haskell](https://www.haskell.org/). We note:\\n\\n```coq\\nf (| x1, ..., xn |)\\n```\\n\\nto call a function `f` of type:\\n\\n```coq\\nValue.t -> ... -> Value.t -> M\\n```\\n\\nwith the arguments `x1`, ..., `xn` of type `Value.t` and binds its result to the current continuation in the context of the tactic `ltac:(M.monadic ...)`. See our blog post [Monadic notation for the Rust translation](/blog/2024/04/03/monadic-notation-for-rust-translation) for more information.\\n\\nIn summary:\\n\\n- `f (| x1, ..., xn |)` is like `await`,\\n- `ltac:(M.monadic ...)` is like `async`.\\n\\n## Handling of the names \ud83c\udff7\ufe0f\\n\\nNow we talk about how we handle the variable names and link them to their definitions. In the reference manual of Python, the part [Execution model](https://docs.python.org/3/reference/executionmodel.html) gives some information.\\n\\nFor now, we distinguish between two scopes, the global one (top-level definitions) and the local one for variables defined in a function. We might introduce a stack of local scopes to handle nested functions.\\n\\nWe name the global scope with a string, that is the path of the current file. 
Having absolute names helps us translate each file independently. The only file that a translated file requires is `CoqOfPython.CoqOfPython`, to have the definition of the values and the monad.\\n\\nTo translate `import` statements, we use assertions:\\n\\n```coq\\nAxiom ethereum_crypto_imports_elliptic_curve :\\n IsImported globals \\"ethereum.crypto\\" \\"elliptic_curve\\".\\nAxiom ethereum_crypto_imports_finite_field :\\n IsImported globals \\"ethereum.crypto\\" \\"finite_field\\".\\n```\\n\\nThis represents:\\n\\n```python\\nfrom . import elliptic_curve, finite_field\\n```\\n\\nIt means that in the current global scope `globals` we can use the name `\\"elliptic_curve\\"` from the other global scope `\\"ethereum.crypto\\"`.\\n\\nWe set the local scope at the entry of a function with the call:\\n\\n```coq\\nM.set_locals (| args, kwargs, [ \\"x1\\"; ...; \\"xn\\" ] |)\\n```\\n\\nfor a function whose parameter names are `x1`, ..., `xn`. For uniformity, we always group the function\'s parameters as `*args` and `**kwargs`. We do not yet handle the default values.\\n\\nWhen a user creates or updates a local variable `x` with a value `value`, we run:\\n\\n```coq\\nM.assign_local \\"x\\" value : M\\n```\\n\\nTo read a variable, we have a primitive:\\n\\n```coq\\nM.get_name : string -> string -> M\\n```\\n\\nIt takes as parameters the name of the current global scope and the name of the variable we are reading. The local scope should be accessible from the monad. For now, all these primitives are axiomatized.\\n\\n## Some numbers \ud83d\udcca\\n\\nThe code base that we analyze, the Python specification of Ethereum, contains _28,455 lines_ of Python, excluding comments. When we translate it to Coq we obtain _299,484 lines_. This is roughly a tenfold increase.\\n\\nThe generated code completely compiles. For now, we avoid some complex Python expressions, like list comprehensions, by generating a dummy expression instead. 
Having all the code that compiles will allow us to iterate and add support for more Python features with a simple check: making sure that all the code still compiles.\\n\\nAs an example, we translate the following function:\\n\\n```python\\ndef bnf2_to_bnf12(x: BNF2) -> BNF12:\\n \\"\\"\\"\\n Lift a field element in `BNF2` to `BNF12`.\\n \\"\\"\\"\\n return BNF12.from_int(x[0]) + BNF12.from_int(x[1]) * (\\n BNF12.i_plus_9 - BNF12.from_int(9)\\n )\\n```\\n\\nto the Coq code:\\n\\n```coq\\nDefinition bnf2_to_bnf12 : Value.t -> Value.t -> M :=\\n fun (args kwargs : Value.t) => ltac:(M.monadic (\\n let _ := M.set_locals (| args, kwargs, [ \\"x\\" ] |) in\\n let _ := Constant.str \\"\\n Lift a field element in `BNF2` to `BNF12`.\\n \\" in\\n let _ := M.return_ (|\\n BinOp.add (|\\n M.call (|\\n M.get_field (| M.get_name (| globals, \\"BNF12\\" |), \\"from_int\\" |),\\n make_list [\\n M.get_subscript (|\\n M.get_name (| globals, \\"x\\" |),\\n Constant.int 0\\n |)\\n ],\\n make_dict []\\n |),\\n BinOp.mult (|\\n M.call (|\\n M.get_field (| M.get_name (| globals, \\"BNF12\\" |), \\"from_int\\" |),\\n make_list [\\n M.get_subscript (|\\n M.get_name (| globals, \\"x\\" |),\\n Constant.int 1\\n |)\\n ],\\n make_dict []\\n |),\\n BinOp.sub (|\\n M.get_field (| M.get_name (| globals, \\"BNF12\\" |), \\"i_plus_9\\" |),\\n M.call (|\\n M.get_field (| M.get_name (| globals, \\"BNF12\\" |), \\"from_int\\" |),\\n make_list [\\n Constant.int 9\\n ],\\n make_dict []\\n |)\\n |)\\n |)\\n |)\\n |) in\\n M.pure Constant.None_)).\\n```\\n\\n## Conclusion\\n\\nWe continue working on the translation from Python to Coq, especially to now add a semantics to the translation. Our next goal is to have a version, written in idiomatic Coq, of the file [src/ethereum/paris/vm/instructions/arithmetic.py](https://github.com/ethereum/execution-specs/blob/master/src/ethereum/paris/vm/instructions/arithmetic.py), and proven equal to the original code. 
This will open the door to making a Coq specification of the EVM that is always synchronized to the Python\'s version.\\n\\nFor our services, reach us at [contact@formal.land](mailto:contact@formal.land) \ud83c\udfc7! We want to ensure the blockchain\'s L1 and L2 are bug-free, thanks to a mathematical analysis of the code. See [our previous project](https://formal-land.gitlab.io/coq-tezos-of-ocaml/) on the L1 of Tezos."},{"id":"/2024/04/26/translation-core-alloc-crates","metadata":{"permalink":"/blog/2024/04/26/translation-core-alloc-crates","source":"@site/blog/2024-04-26-translation-core-alloc-crates.md","title":"\ud83e\udd80 Translation of the Rust\'s core and alloc crates","description":"We continue our work on formal verification of Rust programs with our tool coq-of-rust, to translate Rust code to the formal proof system Coq. One of the limitation we had was the handling of primitive constructs from the standard library of Rust, like Option::unwrapordefault or all other primitive functions. For each of these functions, we had to make a Coq definition to represent its behavior. 
This is both tedious and error prone.","date":"2024-04-26T00:00:00.000Z","formattedDate":"April 26, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"},{"label":"core","permalink":"/blog/tags/core"},{"label":"alloc","permalink":"/blog/tags/alloc"}],"readingTime":5.365,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Translation of the Rust\'s core and alloc crates","tags":["coq-of-rust","Rust","Coq","translation","core","alloc"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83d\udc0d Translation of Python code to Coq","permalink":"/blog/2024/05/10/translation-of-python-code"},"nextItem":{"title":"\ud83e\udd80 Monadic notation for the Rust translation","permalink":"/blog/2024/04/03/monadic-notation-for-rust-translation"}},"content":"We continue our work on formal verification of [Rust](https://www.rust-lang.org/) programs with our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust), to translate Rust code to the formal proof system [Coq](https://coq.inria.fr/). One of the limitations we had was the handling of primitive constructs from the standard library of Rust, like [Option::unwrap_or_default](https://doc.rust-lang.org/core/option/enum.Option.html#method.unwrap_or_default) or all other primitive functions. For each of these functions, we had to make a Coq definition to represent its behavior. This is both tedious and error-prone.\\n\\nTo solve this issue, we worked on the translation of the [core](https://doc.rust-lang.org/core/) and [alloc](https://doc.rust-lang.org/alloc/) crates of Rust using `coq-of-rust`. These are very large code bases, with a lot of unsafe or advanced Rust code. We present what we did to have a \\"best effort\\" translation of these crates. 
The resulting translation is in the following folders:\\n\\n- [CoqOfRust/alloc](https://github.com/formal-land/coq-of-rust/tree/main/CoqOfRust/alloc)\\n- [CoqOfRust/core](https://github.com/formal-land/coq-of-rust/tree/main/CoqOfRust/core)\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Contact\\n\\nThis work is funded by the [Aleph Zero](https://alephzero.org/) crypto-currency to verify their Rust smart contracts. You can [follow us on X](https://twitter.com/FormalLand) to get our updates. We propose tools and services to make your codebase bug-free with [formal verification](https://en.wikipedia.org/wiki/Formal_verification).\\n\\nContact us at [contact@formal.land](mailto:contact@formal.land) to chat \u260e\ufe0f!\\n\\n:::\\n\\n
\\n ![Crab with a pen](2024-04-26/crab-in-library.webp)\\n
A crab in a library
\\n
\\n\\n## Initial run \ud83d\udc25\\n\\nAn initial run of `coq-of-rust` on the `alloc` and `core` crates of Rust generated two files of a few hundred thousand lines of Coq corresponding to the whole translation of these crates. This is a first piece of good news, as it means the tool runs on these large code bases. However, the generated Coq code does not compile, even if the errors are very rare (one every few thousand lines).\\n\\nTo get an idea, here is the size of the input Rust code as given by the `cloc` command:\\n\\n- `alloc`: 26,299 lines of Rust code\\n- `core`: 54,192 lines of Rust code\\n\\nGiven that this code uses macros that we expand in our translation, the actual size that we have to translate is even bigger.\\n\\n## Splitting the generated code \ud83e\ude93\\n\\nThe main change we made was to split the output generated by `coq-of-rust` with one file for each input Rust file. This is possible because our translation is insensitive to the order of definitions and context-free. So, even if there are typically cyclic dependencies between the files in Rust, something that is forbidden in Coq, we can still split them.\\n\\nWe get the following sizes as output:\\n\\n- `alloc`: 54 Coq files, 171,783 lines of Coq code\\n- `core`: 190 Coq files, 592,065 lines of Coq code\\n\\nThe advantages of having the code split are:\\n\\n- it is easier to read and navigate in the generated code\\n- it is easier to compile as we can parallelize the compilation\\n- it is easier to debug as we can focus on one file at a time\\n- it is easier to ignore files that do not compile\\n- it will be easier to maintain, as it is easier to follow the diff of a single file\\n\\n## Fixing some bugs \ud83d\udc1e\\n\\nWe had some bugs related to the collisions between module names. These can occur when we choose a name for the module for an `impl` block. We fixed these by adding more information in the module names to make them more unique, like the `where` clauses that were missing. 
For example, for the implementation of the `Default` trait for the `Mapping` type:\\n\\n```rust\\n#[derive(Default)]\\nstruct Mapping {\\n // ...\\n}\\n```\\n\\nwe were generating the following Coq code:\\n\\n```coq\\nModule Impl_core_default_Default_for_dns_Mapping_K_V.\\n (* ...trait implementation ... *)\\nEnd Impl_core_default_Default_for_dns_Mapping_K_V.\\n```\\n\\nWe now generate:\\n\\n```coq\\nModule Impl_core_default_Default_where_core_default_Default_K_where_core_default_Default_V_for_dns_Mapping_K_V.\\n (* ... *)\\n```\\n\\nwith a module name that includes the `where` clauses of the `impl` block, stating that both `K` and `V` should implement the `Default` trait.\\n\\nHere is the list of files that do not compile in Coq, as of today:\\n\\n- `alloc/boxed.v`\\n- `core/any.v`\\n- `core/array/mod.v`\\n- `core/cmp/bytewise.v`\\n- `core/error.v`\\n- `core/escape.v`\\n- `core/iter/adapters/flatten.v`\\n- `core/net/ip_addr.v`\\n\\nThis represents 4% of the files. Note that in the files that compile there are some unhandled Rust constructs that are axiomatized, so this does not give the whole picture of what we do not support.\\n\\n## Example \ud83d\udd0e\\n\\nHere is the source code of the `unwrap_or_default` method for the `Option` type:\\n\\n```rust\\npub fn unwrap_or_default(self) -> T\\nwhere\\n T: Default,\\n{\\n match self {\\n Some(x) => x,\\n None => T::default(),\\n }\\n}\\n```\\n\\nWe translate it to:\\n\\n```coq\\nDefinition unwrap_or_default (T : Ty.t) (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\\n let Self : Ty.t := Self T in\\n match \u03c4, \u03b1 with\\n | [], [ self ] =>\\n ltac:(M.monadic\\n (let self := M.alloc (| self |) in\\n M.read (|\\n M.match_operator (|\\n self,\\n [\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (let \u03b30_0 :=\\n M.get_struct_tuple_field_or_break_match (|\\n \u03b3,\\n \\"core::option::Option::Some\\",\\n 0\\n |) in\\n let x := M.copy (| \u03b30_0 |) in\\n x));\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (M.alloc (|\\n 
M.call_closure (|\\n M.get_trait_method (| \\"core::default::Default\\", T, [], \\"default\\", [] |),\\n []\\n |)\\n |)))\\n ]\\n |)\\n |)))\\n | _, _ => M.impossible\\n end.\\n```\\n\\nWe prove that it is equivalent to the simpler functional code:\\n\\n```coq\\nDefinition unwrap_or_default {T : Set}\\n {_ : core.simulations.default.Default.Trait T}\\n (self : Self T) :\\n T :=\\n match self with\\n | None => core.simulations.default.Default.default (Self := T)\\n | Some x => x\\n end.\\n```\\n\\nThis simpler definition is what we use when verifying code. The proof of equivalence is in [CoqOfRust/core/proofs/option.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/core/proofs/option.v). In case the original source code changes, we are sure to capture these changes thanks to our proof. Because the translation of the `core` library was done automatically, we trust the generated definitions more than definitions that would be done by hand. However, there can still be mistakes or incompleteness in `coq-of-rust`, so we still need to check at proof time that the code makes sense.\\n\\n## Conclusion\\n\\nWe can now work on the verification of Rust programs with more trust in our formalization of the standard library. Our next target is to simplify our proof process, which is still tedious. In particular, showing that simulations are equivalent to the original Rust code requires doing the name resolution, introduction of high-level types, and removal of the side-effects. We would like to split these steps.\\n\\nIf you are interested in formally verifying your Rust projects, do not hesitate to get in touch with us at [contact@formal.land](mailto:contact@formal.land) \ud83d\udc8c! 
Formal verification provides the highest level of safety for critical applications, with a mathematical guarantee of the absence of bugs for a given specification."},{"id":"/2024/04/03/monadic-notation-for-rust-translation","metadata":{"permalink":"/blog/2024/04/03/monadic-notation-for-rust-translation","source":"@site/blog/2024-04-03-monadic-notation-for-rust-translation.md","title":"\ud83e\udd80 Monadic notation for the Rust translation","description":"At Formal Land our mission is to reduce the cost of finding bugs in software. We use formal verification, that is to say mathematical reasoning on code, to make sure we find more bugs than with testing. As part of this effort, we are working on a tool coq-of-rust to translate Rust code to Coq, a proof assistant, to analyze Rust programs. Here we present a technical improvement we made in this tool.","date":"2024-04-03T00:00:00.000Z","formattedDate":"April 3, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"},{"label":"monad","permalink":"/blog/tags/monad"}],"readingTime":5.185,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Monadic notation for the Rust translation","tags":["coq-of-rust","Rust","Coq","translation","monad"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Translation of the Rust\'s core and alloc crates","permalink":"/blog/2024/04/26/translation-core-alloc-crates"},"nextItem":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 3","permalink":"/blog/2024/03/22/improvements-rust-translation-part-3"}},"content":"At Formal Land our mission is to reduce the cost of finding bugs in software. 
We use [formal verification](https://runtimeverification.com/blog/formal-verification-lore), that is to say mathematical reasoning on code, to make sure we find more bugs than with testing. As part of this effort, we are working on a tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) to translate Rust code to Coq, a proof assistant, to analyze Rust programs. Here we present a technical improvement we made in this tool.\\n\\nOne of the challenges of our translation from Rust to Coq is that the generated code is very verbose. The size increase is about tenfold in our examples. One reason is that we use a monad to represent side effects in Coq, so we need to name each intermediate result and apply the `bind` operator. Here, we will present a monadic notation that avoids naming intermediate results to make the code more readable.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Contact\\n\\nThis work is funded by the [Aleph Zero](https://alephzero.org/) crypto-currency to verify their Rust smart contracts. You can [follow us on X](https://twitter.com/FormalLand) to get our updates. We propose tools and services to make your codebase bug-free with [formal verification](https://en.wikipedia.org/wiki/Formal_verification).\\n\\nContact us at [contact@formal.land](mailto:contact@formal.land) to chat \u260e\ufe0f!\\n\\n:::\\n\\n
\\n ![Crab with a pen](2024-04-03/crab-writing.webp)\\n
\\n\\n## Example \ud83d\udd0e\\n\\nHere is the Rust source code that we consider:\\n\\n```rust\\nfn add(a: i32, b: i32) -> i32 {\\n a + b\\n}\\n```\\n\\nBefore, we were generating the following Coq code, with `let*` as the notation for the bind:\\n\\n```coq\\nDefinition add (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \u03c4, \u03b1 with\\n | [], [ a; b ] =>\\n let* a := M.alloc a in\\n let* b := M.alloc b in\\n let* \u03b10 := M.read a in\\n let* \u03b11 := M.read b in\\n BinOp.Panic.add \u03b10 \u03b11\\n | _, _ => M.impossible\\n end.\\n```\\n\\nNow, with the new monadic notation, we generate:\\n\\n```coq\\nDefinition add (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \u03c4, \u03b1 with\\n | [], [ a; b ] =>\\n ltac:(M.monadic\\n (let a := M.alloc (| a |) in\\n let b := M.alloc (| b |) in\\n BinOp.Panic.add (| M.read (| a |), M.read (| b |) |)))\\n | _, _ => M.impossible\\n end.\\n```\\n\\nThe main change is that we do not need to introduce intermediate `let*` expressions with generated names. The code structure is more similar to the original Rust code, with additional calls to memory primitives such as `M.alloc` and `M.read`.\\n\\nThe notation `f (| x1, ..., xn |)` represents the call to the function `f` with the arguments `x1`, ..., `xn` returning a monadic result. We bind the result with the current continuation that goes up to the wrapping `ltac:(M.monadic ...)` tactic. We automatically transform the `let` into a `let*` with the `M.monadic` tactic when needed.\\n\\n## Where do we use this notation? \ud83e\udd14\\n\\nWe use this notation in all the function bodies that we generate, that are all in a monad to represent side effects. We call the `ltac:(M.monadic ...)` tactic at the start of the functions, as well as at the start of closure bodies that are defined inside functions. 
This also applies to the translation of `if`, `match`, and `loop` expressions, as we represent their bodies as functions.\\n\\nHere is an example of code with a `match` expression:\\n\\n```rust\\nfn add(a: i32, b: i32) -> i32 {\\n match a - b {\\n 0 => a + b,\\n _ => a - b,\\n }\\n}\\n```\\n\\nWe translate it to:\\n\\n```coq\\nDefinition add (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \u03c4, \u03b1 with\\n | [], [ a; b ] =>\\n ltac:(M.monadic\\n (let a := M.alloc (| a |) in\\n let b := M.alloc (| b |) in\\n M.read (|\\n M.match_operator (|\\n M.alloc (| BinOp.Panic.sub (| M.read (| a |), M.read (| b |) |) |),\\n [\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (let _ :=\\n M.is_constant_or_break_match (|\\n M.read (| \u03b3 |),\\n Value.Integer Integer.I32 0\\n |) in\\n M.alloc (|\\n BinOp.Panic.add (| M.read (| a |), M.read (| b |) |)\\n |)));\\n fun \u03b3 =>\\n ltac:(M.monadic (\\n M.alloc (|\\n BinOp.Panic.sub (| M.read (| a |), M.read (| b |) |)\\n |)\\n ))\\n ]\\n |)\\n |)))\\n | _, _ => M.impossible\\n end.\\n```\\n\\nWe see that we call the tactic `M.monadic` for each branch of the `match` expression.\\n\\n## How does it work? \ud83d\udee0\ufe0f\\n\\nThe `M.monadic` tactic is defined in [M.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/M.v). The main part is:\\n\\n```coq showLineNumbers\\nLtac monadic e :=\\n lazymatch e with\\n (* ... *)\\n | context ctxt [M.run ?x] =>\\n lazymatch context ctxt [M.run x] with\\n | M.run x => monadic x\\n | _ =>\\n refine (M.bind _ _);\\n [ monadic x\\n | let v := fresh \\"v\\" in\\n intro v;\\n let y := context ctxt [v] in\\n monadic y\\n ]\\n end\\n (* ... *)\\n end.\\n```\\n\\nIn our translation of Rust, all of the values have the common type `Value.t`. The monadic bind is of type `M -> (Value.t -> M) -> M` where `M` is the type of the monad. The `M.run` function is an axiom that we use as a marker to know where we need to apply `M.bind`. 
The type of `M.run` is:\\n\\n```coq\\nAxiom run : M -> Value.t.\\n```\\n\\nThe notation for monadic function calls is defined using the `M.run` axiom with:\\n\\n```coq\\nNotation \\"e (| e1 , .. , en |)\\" := (M.run ((.. (e e1) ..) en)).\\n```\\n\\nWhen we encounter an `M.run` (line 4) we apply the `M.bind` (line 8) to the monadic expression `x` (line 9) and its continuation `ctxt` that we obtain thanks to the `context` keyword (line 4) of the matching of expressions in Ltac.\\n\\nThere is another case in the `M.monadic` tactic to handle the `let` expressions, which is not shown here.\\n\\n## Conclusion\\n\\nThanks to this new monadic notation, the generated Coq code is more readable and closer to the original Rust code. This should simplify our work in writing proofs on the generated code, as well as debugging the translation.\\n\\nIf you are interested in formally verifying your Rust projects, do not hesitate to get in touch with us at [contact@formal.land](mailto:contact@formal.land) \ud83d\udc8c! Formal verification provides the highest level of safety for critical applications, with a mathematical guarantee of the absence of bugs for a given specification."},{"id":"/2024/03/22/improvements-rust-translation-part-3","metadata":{"permalink":"/blog/2024/03/22/improvements-rust-translation-part-3","source":"@site/blog/2024-03-22-improvements-rust-translation-part-3.md","title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 3","description":"We explained how we started updating our translation tool coq-of-rust in our previous blog post, to support more of the Rust language. Our goal is to provide formal verification for the Rust \ud83e\udd80 language, relying on the proof system Coq \ud83d\udc13. 
We will see in this post how we continue implementing changes in coq-of-rust to:","date":"2024-03-22T00:00:00.000Z","formattedDate":"March 22, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"}],"readingTime":10.105,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 3","tags":["coq-of-rust","Rust","Coq","translation"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Monadic notation for the Rust translation","permalink":"/blog/2024/04/03/monadic-notation-for-rust-translation"},"nextItem":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 2","permalink":"/blog/2024/03/08/improvements-rust-translation-part-2"}},"content":"We explained how we started updating our translation tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) in our [previous blog post](/blog/2024/03/08/improvements-rust-translation-part-2), to support more of the Rust language. Our goal is to provide formal verification for the Rust \ud83e\udd80 language, relying on the proof system Coq \ud83d\udc13. We will see in this post how we continue implementing changes in `coq-of-rust` to:\\n\\n1. remove the types from the translation,\\n2. be independent of the ordering of the definitions.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::info\\n\\n- Previous post: [Improvements in the Rust translation to Coq, part 2](/blog/2024/03/08/improvements-rust-translation-part-2)\\n\\n:::\\n\\n:::tip Contact\\n\\nThis work is funded by the [Aleph Zero](https://alephzero.org/) crypto-currency to verify their Rust smart contracts. You can [follow us on X](https://twitter.com/FormalLand) to get our updates. 
We propose tools and services to make your codebase bug-free with [formal verification](https://en.wikipedia.org/wiki/Formal_verification).\\n\\nContact us at [contact@formal.land](mailto:contact@formal.land) to chat \u260e\ufe0f!\\n\\n:::\\n\\n## Translating the `dns` example \ud83d\ude80\\n\\nWe continue with our previous example [dns.rs](https://github.com/formal-land/coq-of-rust/blob/main/examples/ink_contracts/dns.rs), which is composed of around 200 lines of Rust code.\\n\\n### Borrow and dereference\\n\\nThe next error that we encounter when type-checking the Coq translation of `dns.rs` is:\\n\\n```\\nFile \\"./examples/default/examples/ink_contracts/dns.v\\", line 233, characters 22-27:\\nError: The reference deref was not found in the current environment.\\n```\\n\\nIn Rust, we can either take the address of a value with `&`, or dereference a reference with `*`. In our translation, we do not distinguish between the four following pointer types:\\n\\n- `&`\\n- `&mut`\\n- `*const`\\n- `*mut`\\n\\nWe let the user handle these in different ways if it can simplify their proofs, especially regarding the distinction between mutable and non-mutable pointers. It simplifies the definition of our borrowing and dereferencing operators, as we need only two to cover all cases. We even go further: we remove these two operators in the translation, as they are the identity in our case!\\n\\nTo better understand why they are the identity, we need to see that there are two kinds of Rust values in our representation:\\n\\n- the value itself and\\n- the value with its address.\\n\\nThe value itself is useful to compute over the values. For example, we use it to define the primitive addition over integers. The value with its address corresponds to the final Rust expression. Indeed, we can take the address of any sub-expression in Rust with the `&` operator, so each sub-expression should come with its address. 
When we take the address of an expression, we:\\n\\n- start from a value with its address and go to\\n- a value that is an address to the value above, which we will need to allocate to have an address for it also.\\n\\nThus, the `&` operator behaves as the identity function followed by an allocation. Similarly, the `*` is a memory read followed by the identity function. Since we already use the alloc and read operations to go from a value to a value with its address and the other way around, we do not need to define the `*` and `&` operators in our translation and remove them.\\n\\n### Primitive operators\\n\\nWe now need to distinguish between the function calls, that use the primitive:\\n\\n```coq\\nM.get_function : string -> M\\n```\\n\\nto find the right function to call when defining the semantics of the program (even if the function is defined later), and the calls to primitive operators (`+`, `*`, `!`, ...) that we define in our base library for Rust in Coq. The full list of primitive operators is given by:\\n\\n- [rustc_middle::mir::syntax::BinOp](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/mir/syntax/enum.BinOp.html)\\n- [rustc_middle::thir::LogicalOp](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/thir/enum.LogicalOp.html) (with lazy evaluation of the parameters)\\n- [rustc_middle::mir::syntax::UnOp](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/mir/syntax/enum.UnOp.html)\\n\\nWe adapted the handling of primitive operators from the code we had before and added a few other fixes so that now the `dns.rs` example type-checks in Coq \ud83c\udf8a! We will now focus on fixing the other examples.\\n\\n## Cleaning the code \ud83e\uddfc\\n\\nBut let us first clean the code a bit. 
All the expressions in the internal [AST](https://en.wikipedia.org/wiki/Abstract_syntax_tree) of `coq-of-rust` are in a wrapper with the current type of the expression:\\n\\n```rust\\npub(crate) struct Expr {\\n pub(crate) kind: Rc,\\n pub(crate) ty: Option>,\\n}\\n\\npub(crate) enum ExprKind {\\n Pure(Rc),\\n LocalVar(String),\\n Var(Path),\\n Constructor(Path),\\n // ... all the cases\\n```\\n\\nHaving access to the type of each sub-expression was useful before annotating the `let` expressions. This is not required anymore, as all the values have the type `Value.t`. Thus, we remove the wrapper `Expr` and rename `ExprKind` into `Expr`. The resulting code is easier to read, as wrapping everything with a type was verbose sometimes.\\n\\nWe also cleaned some translated types that were not used anymore in the code, removed unused `Derive` traits, and removed the monadic translation on the types.\\n\\n
\\n ![Crab in space](2024-03-22/crab-in-space.webp)\\n
A crab safely walking in space thanks to formal verification.
\\n
\\n\\n## Handling the remaining examples\\n\\nTo handle the remaining examples of our test suite (extracted from the snippets of the [Rust by Example](https://doc.rust-lang.org/rust-by-example/) book), we mainly needed to re-implement the pattern matching on the new untyped values. Here is an example of Rust code with matching:\\n\\n```rust\\nfn matching(tuple: (i32, i32)) -> i32 {\\n match tuple {\\n (0, 0) => 0,\\n (_, _) => 1,\\n }\\n}\\n```\\n\\nwith its translation in Coq:\\n\\n```coq showLineNumbers\\nDefinition matching (\ud835\udf0f : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \ud835\udf0f, \u03b1 with\\n | [], [ tuple ] =>\\n let* tuple := M.alloc tuple in\\n let* \u03b10 :=\\n match_operator\\n tuple\\n [\\n fun \u03b3 =>\\n let* \u03b30_0 := M.get_tuple_field \u03b3 0 in\\n let* \u03b30_1 := M.get_tuple_field \u03b3 1 in\\n let* _ :=\\n let* \u03b10 := M.read \u03b30_0 in\\n M.is_constant_or_break_match \u03b10 (Value.Integer Integer.I32 0) in\\n let* _ :=\\n let* \u03b10 := M.read \u03b30_1 in\\n M.is_constant_or_break_match \u03b10 (Value.Integer Integer.I32 0) in\\n M.alloc (Value.Integer Integer.I32 0);\\n fun \u03b3 =>\\n let* \u03b30_0 := M.get_tuple_field \u03b3 0 in\\n let* \u03b30_1 := M.get_tuple_field \u03b3 1 in\\n M.alloc (Value.Integer Integer.I32 1)\\n ] in\\n M.read \u03b10\\n | _, _ => M.impossible\\n end.\\n```\\n\\nHere is a breakdown of how it works:\\n\\n- On line 6 we call the `match_operator` primitive that takes a value to match on, `tuple`, and a list of functions that try to match the value with a pattern and execute some code in case of success. We execute the matching functions successively until one succeeds and we stop. There should be at least one succeeding function as pattern-match in Rust is exhaustive.\\n- On line 10 we get the first element of the tuple. 
Note that, more precisely, what we get is the address of the first element of `\u03b3` that is the address of the tuple `tuple` given as parameter to the function. Having the address might be required for some operations, like doing subsequent matching by reference or using the `&` operator in the `match`\'s body.\\n- On line 11 we do the same with the second element of the tuple. The indices for `\u03b3` are generated to avoid name clashes. They correspond to the depth of the sub-pattern being considered, followed by the index of the current item in this sub-pattern.\\n- On line 14, we check that the first element of the tuple is `0`. We use the `M.is_constant_or_break_match` primitive that checks if the value is a constant and if it is equal to the expected value. If it is not the case, it exits the current matching function, and the `match_operator` primitive will evaluate the next one, going to line 19.\\n- On line 24 we return the final result. Note that we always do a `M.alloc` followed by `M.read` to return the result. This could be simplified, as immediately reading an allocated value is like running the identity function.\\n\\nBy implementing the new version of the pattern-matching, as well as a few other smaller fixes, we were able to make all the examples type-check again! We now need to fix the proofs we had on the [erc20.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/examples/default/examples/ink_contracts/erc20.v) example, as the generated code changed a lot.\\n\\n## Updating the proofs \ud83d\udc69\u200d\ud83d\ude80\\n\\nUnfortunately, all these changes in the generated code are breaking our proofs. We still want to write our specifications and proofs by first showing a simulation of the Rust code with a simpler and functional definition. 
Before, with our simulations, we were:\\n\\n- replacing the management of pointers by either stateless functions or functions in a state monad;\\n- simplifying the error handling, especially for code that cannot panic.\\n\\nNow we also have to:\\n\\n- define the types;\\n- add the typing information;\\n- add the trait constraints and resolve the trait instances;\\n- resolve the function or associated function calls.\\n\\nWe have not finished updating the proofs but still merged our work in `main` with the pull request [#472](https://github.com/formal-land/coq-of-rust/pull/472) as this was taking too long. The proof that we want to update is in the file [proofs/erc20.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/examples/default/examples/ink_contracts/proofs/erc20.v) and is about the smart contract [erc20.rs](https://github.com/formal-land/coq-of-rust/blob/main/examples/ink_contracts/erc20.rs).\\n\\n### Phi operators \ud83c\udfa0\\n\\nOur basic strategy for the proof, in order to handle the untyped Rust values of the new translation, is to define various `\u03c6` operators going from a user-defined Coq type to a Rust value of type `Value.t`. These translate the data types that we define to represent the Rust types of the original program. Note that we previously had trouble translating the Rust types in the general case, especially for mutually recursive types or types involving a lot of trait manipulations.\\n\\nMore formally, we introduce the Coq typeclass:\\n\\n```coq\\nClass ToValue (A : Set) : Set := {\\n \u03a6 : Ty.t;\\n \u03c6 : A -> Value.t;\\n}.\\nArguments \u03a6 _ {_}.\\n```\\n\\nThis describes how to go from a user-defined type in Coq to the equivalent representation in `Value.t`. In addition to the `\u03c6` operator, we also define the `\u03a6` operator that gives the Rust type of the Coq type. This type is required for polymorphic definitions.\\n\\nWe always go from user-defined types to `Value.t`. 
We write our simulation statements like this:\\n\\n```coq\\n{{env, state |\\n code.example.get_at_index [] [\u03c6 vector; \u03c6 index] \u21d3\\n inl (\u03c6 (simulations.example.get_at_index vector index))\\n| state\'}}\\n```\\n\\nwhere:\\n\\n```coq\\n{{env, state | rust_program \u21d3 simulation_result | state\'}}\\n```\\n\\nis our predicate to state an evaluation of a Rust program to a simulation result. We apply the `\u03c6` operator to the arguments of the Rust program and to the result of the simulation. In some proofs, we set this operator as `Opaque` in order to keep track of it and avoid unwanted reductions.\\n\\n### Traits\\n\\nThe trait definitions, as well as trait constraints, are absent from the generated Coq code. For now, we add them back as follows, for the example of the `Default` trait:\\n\\n1. We define a `Default` typeclass in Coq:\\n\\n ```coq\\n Module Default.\\n Class Trait (Self : Set) : Set := {\\n default : Self;\\n }.\\n End Default.\\n ```\\n\\n2. We define what it means to implement the `Default` trait and have a corresponding simulation:\\n\\n ```coq\\n Module Default.\\n Record TraitHasRun (Self : Set)\\n `{ToValue Self}\\n `{core.simulations.default.Default.Trait Self} :\\n Prop := {\\n default :\\n exists default,\\n IsTraitMethod\\n \\"core::default::Default\\" (\u03a6 Self) []\\n \\"default\\" default /\\\\\\n Run.pure\\n (default [] [])\\n (inl (\u03c6 core.simulations.default.Default.default));\\n }.\\n End Default.\\n ```\\n\\n where `Run.pure` is our simulation predicate for the case where the `state` does not change.\\n\\n3. Finally, we use the `TraitHasRun` predicate as an additional hypothesis for simulation proofs on functions that depend on the `Default` trait in Rust:\\n\\n ```coq\\n (** Simulation proof for `unwrap_or_default` on the type `Option`. 
*)\\n Lemma run_unwrap_or_default {T : Set}\\n {_ : ToValue T}\\n {_ : core.simulations.default.Default.Trait T}\\n (self : option T) :\\n core.proofs.default.Default.TraitHasRun T ->\\n Run.pure\\n (core.option.Impl_Option_T.unwrap_or_default (\u03a6 T) [] [\u03c6 self])\\n (inl (\u03c6 (core.simulations.option.Impl_Option_T.unwrap_or_default self))).\\n Proof.\\n (* ... *)\\n Qed.\\n ```\\n\\n## Conclusion \u270d\ufe0f\\n\\nWe still have a lot to do, especially in finding the right approach to verify the newly generated Rust code. But we have finalized our new translation mode without types and ordering, which helps to successfully translate many more Rust examples. We also do not need to translate the dependencies of a project anymore before compiling it.\\n\\nOur next target is to translate the whole of Rust\'s standard library (with the help of some axioms for the expressions which we do not handle yet), in order to have a faithful definition of the Rust primitives, such as functions of the [option](https://doc.rust-lang.org/core/option/) and [vec](https://doc.rust-lang.org/alloc/vec/) modules.\\n\\nIf you are interested in formally verifying your Rust projects, do not hesitate to get in touch with us at [contact@formal.land](mailto:contact@formal.land) \ud83d\udc8c! Formal verification provides the highest level of safety for critical applications, with a mathematical guarantee of the absence of bugs for a given specification."},{"id":"/2024/03/08/improvements-rust-translation-part-2","metadata":{"permalink":"/blog/2024/03/08/improvements-rust-translation-part-2","source":"@site/blog/2024-03-08-improvements-rust-translation-part-2.md","title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 2","description":"In our previous blog post, we stated our plan to improve our translation of Rust \ud83e\udd80 to Coq \ud83d\udc13 with coq-of-rust. 
We also provided a new definition for our Rust monad in Coq, and the definition of a unified type to represent any Rust values. We will now see how we modify the Rust implementation of coq-of-rust to make the generated code use these new definitions.","date":"2024-03-08T00:00:00.000Z","formattedDate":"March 8, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"}],"readingTime":9.055,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 2","tags":["coq-of-rust","Rust","Coq","translation"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 3","permalink":"/blog/2024/03/22/improvements-rust-translation-part-3"},"nextItem":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 1","permalink":"/blog/2024/02/29/improvements-rust-translation"}},"content":"In our [previous blog post](/blog/2024/02/29/improvements-rust-translation), we stated our plan to improve our translation of Rust \ud83e\udd80 to Coq \ud83d\udc13 with [coq-of-rust](https://github.com/formal-land/coq-of-rust). We also provided a new definition for our Rust monad in Coq, and the definition of a unified type to represent any Rust values. We will now see how we modify the Rust implementation of `coq-of-rust` to make the generated code use these new definitions.\\n\\nWith this new translation strategy, to support more Rust code, we want:\\n\\n1. to remove the types from the translation,\\n2. 
to avoid the need to order the definitions in the generated Coq code.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::info\\n\\n- Next post: [Improvements in the Rust translation to Coq, part 3](/blog/2024/03/22/improvements-rust-translation-part-3)\\n- Previous post: [Improvements in the Rust translation to Coq, part 1](/blog/2024/02/29/improvements-rust-translation)\\n\\n:::\\n\\n:::tip Contact\\n\\nThis work is funded by the [Aleph Zero](https://alephzero.org/) crypto-currency to verify their Rust smart contracts. You can [follow us on X](https://twitter.com/FormalLand) to get our updates. We propose tools and services to make your codebase bug-free with [formal verification](https://en.wikipedia.org/wiki/Formal_verification).\\n\\nContact us at [contact@formal.land](mailto:contact@formal.land) to chat!\\n\\n:::\\n\\n## Implementation of the monad\\n\\nWe implemented the new monad and the type `Value.t` holding any kind of Rust values as described in the previous blog post. For now, we have removed the definitions related to the standard library of Rust (everything except the base definitions such as the integer types). This should not be an issue to type-check the generated Coq code, as the new code should be independent of the ordering of definitions: in particular, it should type-check even if the needed definitions are not yet there.\\n\\nWe added some definitions for the primitive unary and binary operators. 
These include some operations on the integers such as arithmetic operations (with or without overflow, depending on the compilation mode), as well as comparisons (equality, less than or equal to, ...).\\n\\nNow that the main library file [CoqOfRust/CoqOfRust.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/CoqOfRust.v) compiles in Coq, we can start to test the translation on our examples.\\n\\n## Generating the tests\\n\\nWe generate new snapshots for our translations with:\\n\\n```sh\\ncargo build && time python run_tests.py\\n```\\n\\nThis builds the project `coq-of-rust` (with many warnings about unused code for now) and re-generates our snapshots: for each Rust file in the [examples](https://github.com/formal-land/coq-of-rust/tree/main/examples) directory, we generate a Coq file with the same name but the extension `.v`. We generate two versions:\\n\\n- one in axiom mode, where all definitions are axiomatized, to translate libraries, for example, and\\n- one in full definition mode, where we also translate the bodies of the function definitions.\\n\\n## Axiom mode\\n\\nWe first try to type-check and fix the code generated in axiom mode.\\n\\n### Type aliases\\n\\nWe have a first error for type aliases that we do not translate properly. We need access to the fully qualified name of the alias. 
We do that by combining calls to the functions:\\n\\n- [crate_name](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/ty/context/struct.TyCtxt.html#method.crate_name) to get the name of the current crate and\\n- [def_path](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/ty/context/struct.TyCtxt.html#method.def_path) to get the whole definition path without the crate name.\\n\\nAs a result, for the file [examples/ink_contracts/basic_contract_caller.rs](https://github.com/formal-land/coq-of-rust/blob/main/examples/ink_contracts/basic_contract_caller.rs), we translate the type alias:\\n\\n```rust\\ntype Hash = [u8; 32];\\n```\\n\\ninto the Coq code:\\n\\n```coq\\nAxiom Hash :\\n (Ty.path \\"basic_contract_caller::Hash\\") =\\n (Ty.apply (Ty.path \\"array\\") [Ty.path \\"u8\\"]).\\n```\\n\\nThen, during the proofs, we will be able to substitute the type `Hash` by its definition when it appears. Note that we now translate types by values of the type `Ty.t`, so there should be no difficulties in rewriting types.\\n\\nWe should add the length of the array in the type. This is not done yet.\\n\\n### Traits\\n\\nIn axiom mode, we remove most of the trait definitions. Instead, with our new translation model, the traits are mostly unique names (the absolute path of the trait definition). The main use of traits is to distinguish them from other traits, to know which trait implementation to use when calling a trait\'s method. We still translate the provided methods (that are default methods in the trait definition) to axioms and add a predicate stating that they are associated with the current trait. 
For example, we translate the following Rust trait:\\n\\n```rust\\n// crate `my_crate`\\n\\ntrait Animal {\\n fn new(name: &\'static str) -> Self;\\n\\n fn name(&self) -> &\'static str;\\n fn noise(&self) -> &\'static str;\\n\\n fn talk(&self) {\\n println!(\\"{} says {}\\", self.name(), self.noise());\\n }\\n}\\n```\\n\\nto the Coq code:\\n\\n```coq\\n(* Trait *)\\nModule Animal.\\n Parameter talk : (list Ty.t) -> (list Value.t) -> M.\\n\\n Axiom ProvidedMethod_talk : M.IsProvidedMethod \\"my_crate::Animal\\" talk.\\nEnd Animal.\\n```\\n\\nWe realize with this example that the translation in axiom mode generates very few errors, as we remove all the type definitions and all the function axioms have the same signature:\\n\\n```coq\\n(* A list of types that can be empty for non-polymorphic functions,\\n a list of parameters, and a return value in the monad `M`. *)\\nlist Ty.t -> list Value.t -> M\\n```\\n\\nso the type-checking of these axioms never fails. We thus jump to the full definition mode as this is where our new approach might fail.\\n\\n## Definition mode\\n\\nWe now try to type-check the generated Coq code in full definition mode. We start with the [dns.rs](https://github.com/formal-land/coq-of-rust/blob/main/examples/ink_contracts/dns.rs) smart contract example.\\n\\n### Polymorphic trait implementation\\n\\nThis example is interesting, as it contains polymorphic implementations, such as for the [mock](https://en.wikipedia.org/wiki/Mock_object) type `Mapping`:\\n\\n```rust\\n#[derive(Default)]\\nstruct Mapping<K, V> {\\n _key: core::marker::PhantomData<K>,\\n _value: core::marker::PhantomData<V>,\\n}\\n```\\n\\nthat implements the [Default](https://doc.rust-lang.org/core/default/trait.Default.html) trait on the type `Mapping` for two type parameters `K` and `V`. 
We translate it to:\\n\\n```coq showLineNumbers\\n(* Struct Mapping *)\\n\\nModule Impl_core_default_Default_for_dns_Mapping_K_V.\\n (*\\n Default\\n *)\\n Definition default (\ud835\udf0f : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \ud835\udf0f, \u03b1 with\\n | [ Self; K; V ], [] =>\\n let* \u03b10 :=\\n M.get_method\\n \\"core::default::Default\\"\\n \\"default\\"\\n [ (* Self *) Ty.apply (Ty.path \\"core::marker::PhantomData\\") [ K ] ] in\\n let* \u03b11 := M.call \u03b10 [] in\\n let* \u03b12 :=\\n M.get_method\\n \\"core::default::Default\\"\\n \\"default\\"\\n [ (* Self *) Ty.apply (Ty.path \\"core::marker::PhantomData\\") [ V ] ] in\\n let* \u03b13 := M.call \u03b12 [] in\\n M.pure\\n (Value.StructRecord \\"dns::Mapping\\" [ (\\"_key\\", \u03b11); (\\"_value\\", \u03b13) ])\\n | _, _ => M.impossible\\n end.\\n\\n Axiom Implements :\\n forall (K V : Ty.t),\\n M.IsTraitInstance\\n \\"core::default::Default\\"\\n (* Self *) (Ty.apply (Ty.path \\"dns::Mapping\\") [ K; V ])\\n []\\n [ (\\"default\\", InstanceField.Method default) ]\\n [ K; V ].\\nEnd Impl_core_default_Default_for_dns_Mapping_K_V.\\n```\\n\\nHere are the interesting bits of this code:\\n\\n- On line 1, we translate the `Mapping` type into a single comment, as the types disappear in our translation and become just markers. The marker for `Mapping` is its absolute name `Ty.path \\"dns::Mapping\\"`.\\n- On line 7, the function `default` takes a list of types `\ud835\udf0f` as a parameter in case it is polymorphic. Here, this method is not polymorphic, but we still add the `\ud835\udf0f` parameter for uniformity. We also take three additional type parameters:\\n\\n - `Self`\\n - `K`\\n - `V`\\n\\n that represent the `Self` type on which the trait is implemented, and the two type parameters of the `Mapping` type. 
These will be provided when calling the `default` method.\\n\\n- On line 11, we use the primitive `M.get_method` (axiomatized for now) to get the method `default` of the trait `core::default::Default` for the type `core::marker::PhantomData`. Here, we see that having access to the type `K` in the body of the `default` function is useful, as it helps us to disambiguate between the various implementations of the `Default` trait instances that we call. Here, we provide the `Self` type of the trait in a list of a single element. If the `Default` trait or the `default` method were polymorphic, we would also append these type parameters in this list.\\n- On line 15, we call the `default` method instance that we found with an empty list of arguments.\\n- On line 23, we build a value of type `Mapping` with the two fields `_key` and `_value` initialized with the results of the two calls to the `default` method. We use the `Value.StructRecord` constructor to build the value, and its result is of type `Value.t` like all other Rust values.\\n- On line 24, we eliminate a case with a wrong number of type and value arguments. This should never happen as the arity of all the function calls is checked by the Rust type-checker.\\n- On line 27, we state that we have a new instance of the `Default` trait for the `Mapping` type, with the `default` method implemented by the `default` function. 
This is true for any values of the types `K` and `V`.\\n- On line 34, we specify that `[K, V]` are the type parameters of this implementation that should be given as extra parameters when calling the `default` method of this instance, together with the `Self` type.\\n\\n### Polymorphic implementation\\n\\nNext, we have a polymorphic implementation of mock associated functions for the `Mapping` type:\\n\\n```rust\\nimpl Mapping {\\n fn contains(&self, _key: &K) -> bool {\\n unimplemented!()\\n }\\n\\n // ...\\n```\\n\\nWe translate it to:\\n\\n```coq showLineNumbers\\nModule Impl_dns_Mapping_K_V.\\n Definition Self (K V : Ty.t) : Ty.t :=\\n Ty.apply (Ty.path \\"dns::Mapping\\") [ K; V ].\\n\\n (*\\n fn contains(&self, _key: &K) -> bool {\\n unimplemented!()\\n }\\n *)\\n Definition contains (\ud835\udf0f : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \ud835\udf0f, \u03b1 with\\n | [ Self; K; V ], [ self; _key ] =>\\n let* self := M.alloc self in\\n let* _key := M.alloc _key in\\n let* \u03b10 := M.var \\"core::panicking::panic\\" in\\n let* \u03b11 := M.read (mk_str \\"not implemented\\") in\\n let* \u03b12 := M.call \u03b10 [ \u03b11 ] in\\n never_to_any \u03b12\\n | _, _ => M.impossible\\n end.\\n\\n Axiom AssociatedFunction_contains :\\n forall (K V : Ty.t),\\n M.IsAssociatedFunction (Self K V) \\"contains\\" contains [ K; V ].\\n\\n (* ... *)\\n```\\n\\nWe follow a similar approach as for the translation of trait implementations, especially regarding the handling of polymorphic type variables. Here are some differences:\\n\\n- On line 2, we define a `Self` type as a function of the type parameters `K` and `V`. This is useful for avoiding repeating the same type expression later.\\n- On line 22, we use the predicate `M.IsAssociatedFunction` to state that we have a new associated function `contains` for the `Mapping` type, with the `contains` method implemented by the `contains` function. This is true for any values of the types `K` and `V`. 
Like for the trait implementations, we explicit the list `[K, V]` that will be given as an extra parameter to the function `contains`.\\n\\n## Conclusion\\n\\nIn the next blog post, we will see how we continue to translate the examples in full definition mode. There is still a lot to do to get to the same level of Rust support as before, but we are hopeful that our new approach will be more robust and easier to maintain.\\n\\nIf you are interested in formally verifying your Rust projects, do not hesitate to get in touch with us at [contact@formal.land](mailto:contact@formal.land)! Formal verification provides the highest level of safety for critical applications. See the [White House report on secure software development](https://www.whitehouse.gov/wp-content/uploads/2024/02/Final-ONCD-Technical-Report.pdf) for more on the importance of formal verification."},{"id":"/2024/02/29/improvements-rust-translation","metadata":{"permalink":"/blog/2024/02/29/improvements-rust-translation","source":"@site/blog/2024-02-29-improvements-rust-translation.md","title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 1","description":"Our tool coq-of-rust is translating Rust \ud83e\udd80 programs to the proof system Coq \ud83d\udc13 to do formal verification on Rust programs. 
Even if we are able to verify realistic code, such as an ERC-20 smart contract, coq-of-rust still has some limitations:","date":"2024-02-29T00:00:00.000Z","formattedDate":"February 29, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"}],"readingTime":12.655,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 1","tags":["coq-of-rust","Rust","Coq","translation"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 2","permalink":"/blog/2024/03/08/improvements-rust-translation-part-2"},"nextItem":{"title":"\ud83e\uddab Translating Go to Coq, part 1","permalink":"/blog/2024/02/22/journey-coq-of-go"}},"content":"Our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) is translating Rust \ud83e\udd80 programs to the proof system Coq \ud83d\udc13 to do formal verification on Rust programs. Even if we are able to verify realistic code, such as an [ERC-20 smart contract](/blog/2023/12/13/rust-verify-erc-20-smart-contract), `coq-of-rust` still has some limitations:\\n\\n- fragile trait handling\\n- difficulties in ordering the definitions, in their order of dependencies as required by Coq\\n\\nWe will present how we plan to improve our tool to address these limitations.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::info\\n\\n- Next post: [Improvements in the Rust translation to Coq, part 2](/blog/2024/03/08/improvements-rust-translation-part-2)\\n\\n:::\\n\\n## Introduction\\n\\nAs emphasized in the [recent report from the White House](https://www.whitehouse.gov/wp-content/uploads/2024/02/Final-ONCD-Technical-Report.pdf), memory safety and formal verification are keys to ensure secure and correct software. 
Rust provides memory safety and we provide formal verification on top of it with `coq-of-rust`.\\n\\nWe will take the Rust [serde](https://github.com/serde-rs/serde) serialization library to have an example of code to translate in Coq. This is a popular Rust library that is used in almost all projects, either as a direct or transitive dependency. Serialization has a simple specification (being a bijection between the data and its serialized form) and is a good candidate for formal verification. We might verify this library afterwards if there is a need.\\n\\n:::tip Contact\\n\\nThis work is funded by the [Aleph Zero](https://alephzero.org/) crypto-currency in order to verify their Rust smart contracts. You can [follow us on X](https://twitter.com/FormalLand) to get our updates. We propose tools and services to make your codebase totally bug-free. Contact us at [contact@formal.land](mailto:contact@formal.land) to chat! We offer a free audit to assess the feasibility of formal verification on your case.\\n\\n:::\\n\\n:::note Goal\\n\\nOur company goal is to make formal verification accessible to all projects, reducing its cost to 20% of the development cost. There should be no reason to have bugs in end-user products!\\n\\n:::\\n\\n## Warnings\\n\\nWe start by running the command:\\n\\n```sh\\ncargo coq-of-rust\\n```\\n\\nin the `serde` directory. We get a lot of warnings, but the translation does not panic as it tries to always produce something for debugging purposes. We have two kinds of warnings.\\n\\n### Constants in patterns\\n\\nThe warning is the following:\\n\\n```\\nwarning: Constants in patterns are not yet supported.\\n --\x3e serde/src/de/mod.rs:2277:13\\n |\\n2277 | 0 => panic!(), // special case elsewhere\\n | ^\\n```\\n\\nThe reason why we did not handle constants in patterns is that they are represented in a special format in the Rust compiler that was not obvious to handle. 
The definition of [rustc_middle::mir::consts::Const](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/mir/consts/enum.Const.html) representing the constants in patterns is:\\n\\n```rust\\npub enum Const<\'tcx> {\\n Ty(Const<\'tcx>),\\n Unevaluated(UnevaluatedConst<\'tcx>, Ty<\'tcx>),\\n Val(ConstValue<\'tcx>, Ty<\'tcx>),\\n}\\n```\\n\\nThere are three cases, and each contains several more cases. To fix this issue, we added the code to handle the signed and unsigned integers, which are enough for our `serde` example. We will need to add other cases later, especially for the strings. This allowed us to discover and fix a bug in our handling of patterns for tuples with elision `..`, like in the example:\\n\\n```rust\\nfn main() {\\n let triple = (0, -2, 3);\\n\\n match triple {\\n (0, y, z) => println!(\\"First is `0`, `y` is {:?}, and `z` is {:?}\\", y, z),\\n (1, ..) => println!(\\"First is `1` and the rest doesn\'t matter\\"),\\n (.., 2) => println!(\\"last is `2` and the rest doesn\'t matter\\"),\\n (3, .., 4) => println!(\\"First is `3`, last is `4`, and the rest doesn\'t matter\\"),\\n _ => println!(\\"It doesn\'t matter what they are\\"),\\n }\\n}\\n```\\n\\nThese changes are in the pull-request [coq-of-rust#470](https://github.com/formal-land/coq-of-rust/pull/470).\\n\\n### Unimplemented `parent_kind`\\n\\nWe get a second form of warning:\\n\\n```\\nunimplemented parent_kind: Struct\\nexpression: Expr {\\n kind: ZstLiteral {\\n user_ty: None,\\n },\\n ty: FnDef(\\n DefId(2:31137 ~ core[10bc]::cmp::Reverse::{constructor#0}),\\n [\\n T/#1,\\n ],\\n ),\\n temp_lifetime: Some(\\n Node(14),\\n ),\\n span: serde/src/de/impls.rs:778:22: 778:29 (#0),\\n}\\n```\\n\\nThis is for some cases of expressions [rustc_middle::thir::ExprKind::ZstLiteral](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/thir/enum.ExprKind.html#variant.ZstLiteral) in the Rust\'s [THIR representation](https://rustc-dev-guide.rust-lang.org/thir.html) that we do not handle. 
If we look at the `span` field, we see that it appears in the source in the file `serde/src/de/impls.rs` at line 778:\\n\\n```rust\\nforwarded_impl! {\\n (T), Reverse, Reverse // Here is the error\\n}\\n```\\n\\nThis is not very informative as this code is generated by a macro. Another similar kind of expression appears later:\\n\\n```rust\\nimpl<\'de, T> Deserialize<\'de> for Wrapping\\nwhere\\n T: Deserialize<\'de>,\\n{\\n fn deserialize(deserializer: D) -> Result\\n where\\n D: Deserializer<\'de>,\\n {\\n Deserialize::deserialize(deserializer).map(\\n // Here is the error:\\n Wrapping\\n )\\n }\\n}\\n```\\n\\nThe `Wrapping` term is the constructor of a structure, used as a function. We add the support of this case in the pull-request [coq-of-rust#471](https://github.com/formal-land/coq-of-rust/pull/471).\\n\\n## Coq errors\\n\\nWhen we type-check the generated Coq code, we quickly get an error:\\n\\n```coq\\n(* Generated by coq-of-rust *)\\nRequire Import CoqOfRust.CoqOfRust.\\n\\nModule lib.\\n Module core.\\n\\n End core.\\nEnd lib.\\n\\nModule macros.\\n\\nEnd macros.\\n\\nModule integer128.\\n\\nEnd integer128.\\n\\nModule de.\\n Module value.\\n Module Error.\\n Section Error.\\n Record t : Set := {\\n (* Here is the error: *)\\n err : ltac:(serde.de.value.ErrorImpl);\\n }.\\n\\n (* 180.000 more lines! *)\\n```\\n\\nThe reason is that `serde.de.value.ErrorImpl` is not yet defined here. In Coq, we must order the definitions in the order of dependencies to ensure that there are no non-terminating definitions with infinite recursive calls and to preserve the consistency of the system.\\n\\nThis issue does not seem easy to us, as in a Rust crate, everything can depend on each other:\\n\\n- types\\n- definitions\\n- traits\\n- `impl` blocks\\n\\nOur current solutions are:\\n\\n1. **To reorder the definitions in the source Rust code**, so that they appear in the right order for Coq. 
This is technically the simplest solution (no changes in `coq-of-rust`), but it is not very practical. Indeed, reordering elements in a big project generates a lot of conflicts in the version control system, especially if we cannot upstream the changes to the original project.\\n2. **To use a configuration file** to specify the order of the definitions. This works in a lot of cases, but we need to write this file manually and have it complete to compile the whole crate in Coq, even if we are interested in verifying a small part of the code. There are also some cases that are hard to disentangle, in particular with traits that can depend on both types and definitions, that themselves may depend on traits.\\n\\nIn order to handle large projects, such as `serde`, we need to find a more definitive solution to handle the order of dependencies.\\n\\n## Plan for the order of definitions\\n\\nOur idea is to use a more verbose, but simpler translation, to generate Coq code that is not sensitive to the ordering of definitions in Rust. In addition, we should have a more robust mechanism for the traits, as there are still some edge cases that we do not handle well.\\n\\nOur main ingredients are:\\n\\n1. Generating untyped code, where all Rust values become part of a single and shared `Value` type. With this approach, we can represent mutually recursive Rust types, that are generally hard to translate in a sound manner to Coq. We should also avoid a lot of errors on the Coq side related to type inference.\\n2. Adding an indirection level to all function calls, as any function call might refer to a definition that appears later in the code.\\n\\nThese ingredients have some drawbacks:\\n\\n- By removing the types, we will obtain code that is less readable. It might contain translation errors that will be harder to spot. 
We will need to add the types back during the specification of the code.\\n- We will need to add error cases corresponding to type errors at runtime, as we will not have the type system to ensure that functions expecting a certain type of value receive it. We know from the Rust type checker that these errors should not happen, but we will need to prove it in Coq.\\n- We will have to resolve the indirections in the calls at proof time, or with other mechanisms, that will be more complex than the current translation.\\n- We will still need to have a translation of the types (as values), to guide the inference of trait instances.\\n\\n## Definition of a new monad\\n\\nWe rework our definitions of values, pointers and monad to represent the effects, taking into account the fact that we remove the types from the translation. Here are the main definitions that we are planning to use. We have not tested them yet as we need to update the translation to Coq to use them. We will do that just after.\\n\\n### Pointers\\n\\n```coq\\nModule Pointer.\\n Module Index.\\n Inductive t : Set :=\\n | Tuple (index : Z)\\n | Array (index : Z)\\n | StructRecord (constructor field : string)\\n | StructTuple (constructor : string) (index : Z).\\n End Index.\\n\\n Module Path.\\n Definition t : Set := list Index.t.\\n End Path.\\n\\n Inductive t (Value : Set) : Set :=\\n | Immediate (value : Value)\\n | Mutable {Address : Set} (address : Address) (path : Path.t).\\n Arguments Immediate {_}.\\n Arguments Mutable {_ _}.\\nEnd Pointer.\\n```\\n\\nA pointer is either:\\n\\n- a pointer to an immutable data, that is directly represented by its data;\\n- a pointer to a mutable data, that is inside a cell at a certain address in the memory. 
The exact location in the cell is given by the path.\\n\\nThe type of `Address` is not enforced yet, but we will do it when defining the semantics.\\n\\n### Values\\n\\n```coq\\nModule Value.\\n Inductive t : Set :=\\n | Bool : bool -> t\\n | Integer : Integer.t -> Z -> t\\n (** For now we do not know how to represent floats so we use a string *)\\n | Float : string -> t\\n | UnicodeChar : Z -> t\\n | String : string -> t\\n | Tuple : list t -> t\\n | Array : list t -> t\\n | StructRecord : string -> list (string * t) -> t\\n | StructTuple : string -> list t -> t\\n | Pointer : Pointer.t t -> t\\n (** The two existential types of the closure must be [Value.t] and [M]. We\\n cannot enforce this constraint there yet, but we will do when defining the\\n semantics. *)\\n | Closure : {\'(t, M) : Set * Set @ t -> M} -> t.\\nEnd Value.\\n```\\n\\nHere, this type aims to represent any Rust value. We might add a few cases later to represent the `dyn` values, for example. Most of the cases of this type are as expected:\\n\\n- The constructor `StructRecord` is for constructors of `struct` or `enum` with named fields.\\n- The constructor `StructTuple` is for constructors of `struct` or `enum` with unnamed fields.\\n- The constructor `Pointer` is for pointers to data, that could be either `&`, `&mut`, `*const`, or `*mut`.\\n- The constructor `Closure` is for closures (anonymous functions). To prevent errors with the positivity checker of Coq, we use an existential type for the type `Value.t` (as well as `M`, which will be defined later). Note that we are using impredicative `Set` in Coq, and `{A : Set @ P A}` is our notation for existential `Set` in `Set`. Without impredicative sets, we could have issues with the universe levels. 
The fact that these existential types are always `Value.t` and `M` will be enforced when defining the semantics.\\n\\n### Monad\'s primitives\\n\\n```coq\\nModule Primitive.\\n Inductive t : Set :=\\n | StateAlloc (value : Value.t)\\n | StateRead {Address : Set} (address : Address)\\n | StateWrite {Address : Set} (address : Address) (value : Value.t)\\n | EnvRead.\\nEnd Primitive.\\n```\\n\\nHere are the IO calls to the system that the monad can make. This list might be extended later. For now, we mainly have primitives to access the memory.\\n\\n### Monad: base\\n\\n```coq\\nModule LowM.\\n Inductive t (A : Set) : Set :=\\n | Pure : A -> t A\\n | CallPrimitive : Primitive.t -> (Value.t -> t A) -> t A\\n | Loop : t A -> (A -> bool) -> (A -> t A) -> t A\\n | Impossible : t A\\n (** This constructor is not strictly necessary, but is used as a marker for\\n functions calls in the generated code, to help the tactics to recognize\\n points where we can compose about functions. *)\\n | Call : t A -> (A -> t A) -> t A.\\n Arguments Pure {_}.\\n Arguments CallPrimitive {_}.\\n Arguments Loop {_}.\\n Arguments Impossible {_}.\\n Arguments Call {_}.\\n\\n Fixpoint let_ {A : Set} (e1 : t A) (f : A -> t A) : t A :=\\n match e1 with\\n | Pure v => f v\\n | CallPrimitive primitive k =>\\n CallPrimitive primitive (fun v => let_ (k v) f)\\n | Loop body is_break k =>\\n Loop body is_break (fun v => let_ (k v) f)\\n | Impossible => Impossible\\n | Call e k =>\\n Call e (fun v => let_ (k v) f)\\n end.\\nEnd LowM.\\n```\\n\\nThis is the first layer of our monad, very similar to what we had before. We remove the cast operation, as now everything has the same type. We use a style by continuation, but we also define a `let_` function to have a \\"bind\\" operator. 
Note that we always have the same type as parameter, so this is not really a monad as the \\"bind\\" operator should have the type:\\n\\n```coq\\nforall {A B : Set}, M A -> (A -> M B) -> M B\\n```\\n\\nAlways having the same type is enough for us as we use a single type of all Rust values.\\n\\n### Monad: with exceptions\\n\\nWe have the same type as before for the exceptions, representing the panics and all the special control flow operations such as `continue`, `return`, and `break`:\\n\\n```coq\\nModule Exception.\\n Inductive t : Set :=\\n (** exceptions for Rust\'s `return` *)\\n | Return : Value.t -> t\\n (** exceptions for Rust\'s `continue` *)\\n | Continue : t\\n (** exceptions for Rust\'s `break` *)\\n | Break : t\\n (** escape from a match branch once we know that it is not valid *)\\n | BreakMatch : t\\n | Panic : string -> t.\\nEnd Exception.\\n```\\n\\nOur final monad definition is a thin wrapper around `LowM`, to add an error monad to propagate the exceptions:\\n\\n```coq\\nDefinition M : Set :=\\n LowM.t (Value.t + Exception.t).\\n\\nDefinition let_ (e1 : M) (e2 : Value.t -> M) : M :=\\n LowM.let_ e1 (fun v1 =>\\n match v1 with\\n | inl v1 => e2 v1\\n | inr error => LowM.Pure (inr error)\\n end).\\n```\\n\\nOnce again, this is not really a monad as the type of the values that we compute is always the same, and we do not need more. 
Having a definition in two steps (`LowM` and `M`) is useful to separate the part that can be defined by computation (the `M` part) from the part whose semantics can only be given by inductive predicates (the `LowM` part).\\n\\n## Conclusion\\n\\nNext, we will see how we can use this new definition of Rust values, whether it works to translate our examples, and most importantly, how to modify `coq-of-rust` to generate terms without types.\\n\\nIf you are interested in formally verifying Rust projects, do not hesitate to get in touch with us at [contact@formal.land](mailto:contact@formal.land) or go to our [GitHub repository](https://github.com/formal-land/coq-of-rust) for `coq-of-rust`."},{"id":"/2024/02/22/journey-coq-of-go","metadata":{"permalink":"/blog/2024/02/22/journey-coq-of-go","source":"@site/blog/2024-02-22-journey-coq-of-go.md","title":"\ud83e\uddab Translating Go to Coq, part 1","description":"In this blog post, we present our development steps to build a tool to translate Go programs to the proof system Coq.","date":"2024-02-22T00:00:00.000Z","formattedDate":"February 22, 2024","tags":[{"label":"coq-of-go","permalink":"/blog/tags/coq-of-go"},{"label":"Go","permalink":"/blog/tags/go"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"}],"readingTime":12.03,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\uddab Translating Go to Coq, part 1","tags":["coq-of-go","Go","Coq","translation"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 1","permalink":"/blog/2024/02/29/improvements-rust-translation"},"nextItem":{"title":"\ud83e\uddee Experiment on translation from Haskell to Coq","permalink":"/blog/2024/02/14/experiment-coq-of-hs"}},"content":"In this blog post, we present our development steps to build a tool to translate Go programs to the proof system Coq.\\n\\nThe goal is to formally verify Go programs to make them 
totally bug-free. It is actually possible to make a program totally bug-free, as [formal verification](https://en.wikipedia.org/wiki/Formal_verification) can cover all execution cases and kinds of properties thanks to the use of mathematical methods. This corresponds to the highest level of the [Evaluation Assurance Levels](https://en.wikipedia.org/wiki/Evaluation_Assurance_Level) used for critical applications, such as the space industry.\\n\\nAll the code of our work is available on GitHub at [github.com/formal-land/coq-of-go](https://github.com/formal-land/coq-of-go).\\n\\n\x3c!-- truncate --\x3e\\n\\n## Introduction\\n\\nWe believe that there are not yet a lot of formal verification tools for Go. We can cite [Goose](https://github.com/tchajed/goose), which works by translation from Go to the proof system Coq. We will follow a similar approach, translating the Go language to our favorite proof system Coq. In contrast to Goose, we plan to support the whole Go language, even at the expense of the simplicity of the translation.\\n\\nFor that, we target the translation of the [SSA form of Go](https://pkg.go.dev/golang.org/x/tools/go/ssa) instead of the [Go AST](https://pkg.go.dev/go/ast). The SSA form is a lower-level representation of Go, so we hope to capture the semantics of the whole Go language more easily. 
This should be at the expense of the simplicity of the generated translation, but we hope that having full language support outweighs this.\\n\\nGo is an interesting target as:\\n\\n- this is quite a popular language,\\n- it is focusing on simplicity, with a reduced set of language features,\\n- a lot of critical backend applications are written in Go, including for very large companies (Google, Netflix, Uber, Twitch, etc.).\\n\\nAmong interesting properties that we can verify are:\\n\\n- the absence of reachable `panic` in the code,\\n- the absence of race conditions or deadlocks,\\n- the backward compatibility from release to release, for parts of the code whose behavior is not supposed to change,\\n- the strict application of business rules.\\n\\n:::tip Contact\\n\\nYou can [follow us on X](https://twitter.com/FormalLand) to get our updates. We propose tools and services to make your codebase totally bug-free. Contact us at [contact@formal.land](mailto:contact@formal.land) to chat! We offer a free audit to assess the feasibility of formal verification on your case.\\n\\n:::\\n\\n:::note Goal\\n\\nOur company goal is to make formal verification accessible to all projects, reducing its cost to 20% of the development cost. There should be no reason to have bugs in end-user products!\\n\\n:::\\n\\n![Mole and Rooster](2024-02-22/mole_rooster.webp)\\n\\n## First target\\n\\nOur first target is to achieve the formal verification _including all the dependencies_ of the hello world program:\\n\\n```go\\npackage main\\n\\nimport \\"fmt\\"\\n\\nfunc main() {\\n\\tfmt.Println(\\"Hello, World!\\")\\n}\\n```\\n\\nWhat we want to show about this code is that it does a single and only thing: outputting the string \\"Hello, World!\\" to the standard output. 
Its only dependency is the `fmt` package, but when we look at the transitive dependencies of this package:\\n\\n```sh\\ngo list -f \'{{ .Deps }}\' fmt\\n```\\n\\nwe get around forty packages:\\n\\n```\\nerrors\\ninternal/abi\\ninternal/bytealg\\ninternal/coverage/rtcov\\ninternal/cpu\\ninternal/fmtsort\\ninternal/goarch\\ninternal/godebugs\\ninternal/goexperiment\\ninternal/goos\\ninternal/itoa\\ninternal/oserror\\ninternal/poll\\ninternal/race\\ninternal/reflectlite\\ninternal/safefilepath\\ninternal/syscall/execenv\\ninternal/syscall/unix\\ninternal/testlog\\ninternal/unsafeheader\\nio\\nio/fs\\nmath\\nmath/bits\\nos\\npath\\nreflect\\nruntime\\nruntime/internal/atomic\\nruntime/internal/math\\nruntime/internal/sys\\nruntime/internal/syscall\\nsort\\nstrconv\\nsync\\nsync/atomic\\nsyscall\\ntime\\nunicode\\nunicode/utf8\\nunsafe\\n```\\n\\nWe will need to translate all these packages to meaningful Coq code.\\n\\n## The start\\n\\nWe made the `coq-of-go` tool, with everything in a single file [main.go](https://github.com/formal-land/coq-of-go/blob/main/main.go) for now. We retrieve the SSA form of a Go package provided as a command line parameter (code without the error handling):\\n\\n```go\\nfunc main() {\\n\\tpackageToTranslate := os.Args[1]\\n\\tcfg := &packages.Config{Mode: packages.LoadSyntax}\\n\\tinitial, _ := packages.Load(cfg, packageToTranslate)\\n\\t_, pkgs := ssautil.Packages(initial, 0)\\n\\tpkgs[0].Build()\\n\\tmembers := pkgs[0].Members\\n```\\n\\n:::note SSA form\\n\\nThe [SSA form](https://en.wikipedia.org/wiki/Static_single-assignment_form) of a program is generally used internally by compilers to have a simple representation to work on. The [LLVM](https://llvm.org/) language is such an example. In SSA, each variable is assigned exactly once and the control flow is explicit, with jumps or conditional jumps to labels. 
There are no `for` loops, `if` statements, or non-primitive expressions.\\n\\n:::\\n\\nThen we iterate over all the SSA `members`, and directly print the corresponding Coq code to the standard output. We do not use an intermediate representation or make intermediate passes. We do not even do pretty-printing (splitting lines that are too long at the right place, and introducing indentation)! This should not be necessary as the SSA code cannot nest sub-expressions or statements. We still try to print a readable Coq code, as it will be used in the proofs.\\n\\nThere are four kinds of SSA members:\\n\\n- named constants,\\n- globals,\\n- types,\\n- functions.\\n\\nNamed constants and globals are similar, and are for top-level variables whose value is either known at compile-time or computed at the program\'s init. Types are for type definitions. We will focus on functions, as this is where the code is.\\n\\n## Functions\\n\\nThe SSA functions in Go are described by the type [`ssa.Function`](https://pkg.go.dev/golang.org/x/tools/go/ssa#Function):\\n\\n```go\\ntype Function struct {\\n\\tSignature *types.Signature\\n\\n\\t// source information\\n\\tSynthetic string // provenance of synthetic function; \\"\\" for true source functions\\n\\n\\tPkg *Package // enclosing package; nil for shared funcs (wrappers and error.Error)\\n\\tProg *Program // enclosing program\\n\\n\\tParams []*Parameter // function parameters; for methods, includes receiver\\n\\tFreeVars []*FreeVar // free variables whose values must be supplied by closure\\n\\tLocals []*Alloc // frame-allocated variables of this function\\n\\tBlocks []*BasicBlock // basic blocks of the function; nil => external\\n\\tRecover *BasicBlock // optional; control transfers here after recovered panic\\n\\tAnonFuncs []*Function // anonymous functions directly beneath this one\\n\\t// contains filtered or unexported fields\\n}\\n```\\n\\nThe main part of interest for us is `Blocks`. 
A block is a sequence of instructions, and the control flow is explicit. The last instruction of a block is a jump to another block, or a return. The first instructions of a block can be the special `Phi` instruction, which is used to merge control flow from different branches.\\n\\nWe decided to write a first version to see what the SSA code of Go looks like when printed in Coq, without thinking about generating a well-typed code. This looks like this:\\n\\n```coq\\nwith MakeUint64 (\u03b1 : list Val.t) : M (list Val.t) :=\\n M.Thunk (\\n match \u03b1 with\\n | [x] =>\\n M.Thunk (M.EvalBody [(0,\\n let* \\"t0\\" := Instr.BinOp x \\"<\\" (Val.Lit (Lit.Int 9223372036854775808)) in\\n Instr.If (Register.read \\"t0\\") 1 2\\n );\\n (1,\\n let* \\"t1\\" := Instr.Convert x in\\n let* \\"t2\\" := Instr.ChangeType (Register.read \\"t1\\") in\\n let* \\"t3\\" := Instr.MakeInterface (Register.read \\"t2\\") in\\n M.Return [(Register.read \\"t3\\")]\\n );\\n (2,\\n let* \\"t4\\" := Instr.Alloc (* complit *) Alloc.Local \\"*go/constant.intVal\\" in\\n let* \\"t5\\" := Instr.FieldAddr (Register.read \\"t4\\") 0 in\\n let* \\"t6\\" := Instr.Call (CallKind.Function (newInt [])) in\\n let* \\"t7\\" := Instr.Call (CallKind.Function (TODO_method [(Register.read \\"t6\\"); x])) in\\n do* Instr.Store (Register.read \\"t5\\") (Register.read \\"t7\\") in\\n let* \\"t8\\" := Instr.UnOp \\"*\\" (Register.read \\"t4\\") in\\n let* \\"t9\\" := Instr.MakeInterface (Register.read \\"t8\\") in\\n M.Return [(Register.read \\"t9\\")]\\n )])\\n | _ => M.Thunk (M.EvalBody [])\\n end)\\n```\\n\\nfor a source Go code (from the [go/constant](https://pkg.go.dev/go/constant) package):\\n\\n```go\\n// MakeUint64 returns the [Int] value for x.\\nfunc MakeUint64(x uint64) Value {\\n\\tif x < 1<<63 {\\n\\t\\treturn int64Val(int64(x))\\n\\t}\\n\\treturn intVal{newInt().SetUint64(x)}\\n}\\n```\\n\\nThere are three blocks of code, labeled with `0`, `1`, and `2`. 
The first block ends with a conditional jump `If` corresponding to the `if` statement in the Go code. The following blocks are corresponding to the two possible branches of the `if` statement. They both end with a `Return` instruction, corresponding to the `return` statement in the Go code. They run various primitive instructions that we have translated as we can.\\n\\nThe generated Coq code is still readable but more verbose than the original Go code. We will later develop proof techniques using simulations to enable the user to define equivalent but simpler versions of the translation. Being able to define simulations of an imperative program is also important for the proofs, as we can rewrite the code in functional style to make it easier to reason about.\\n\\n## Type-checking\\n\\nFrom there, a second step is to have a generated code that type-checks, forgetting about making a code with sound semantics for now. We generate the various Coq definitions that are needed in a header of the generated code, using axioms for all the definitions. For example, for the allocations we do:\\n\\n```coq\\nModule Alloc.\\n Inductive t : Set :=\\n | Heap\\n | Local.\\nEnd Alloc.\\n\\nModule Instr.\\n Parameter Alloc : Alloc.t -> string -> M Val.t.\\n```\\n\\nThe `Inductive` keyword in Coq defines a type with two constructors `Heap` and `Local`. The `Parameter` keyword defines an axiomatized definition, where we only provide the type but not the definition itself. The `Instr.Alloc` instruction takes as parameters an allocation mode `Alloc.t` and a string and returns an `M Val.t` value.\\n\\n### Representation of values\\n\\nWe make the choice to remove the types while doing the translation, as the type system of Go is probably incompatible with the one of Coq in many ways. We thus translate everything to a single type `Val.t` in Coq to represent all kinds of possible Go values. 
The downside of this approach is that it makes the generated code less readable and less safe, as types are useful to track the correct use of values.\\n\\nFor now, we define the `Val.t` type as:\\n\\n```coq\\nModule Val.\\n Inductive t : Set :=\\n | Lit (_ : Lit.t)\\n | Tuple (_ : list t).\\nEnd Val.\\n```\\n\\nwith the literals `Lit.t` as:\\n\\n```coq\\nModule Lit.\\n Inductive t : Set :=\\n | Bool (_ : bool)\\n | Int (_ : Z)\\n | Float (_ : Rational)\\n | Complex (_ _ : Rational)\\n | String (_ : string)\\n | Nil.\\nEnd Lit.\\n```\\n\\nWe plan to refine this type and add more cases as we improve `coq-of-go`. Structures, pointers, and closures are missing for now.\\n\\n### Monadic style\\n\\nIn order to represent the side-effects of the Go code, we use a [monadic style](). This is a standard approach to represent side-effects like mutations, exceptions, or non-termination in a purely functional language such as Coq. We choose to use:\\n\\n- A free monad, where all the primitives are constructors of the inductive type `M` of the monad. This simplifies the manipulation of the monad by allowing us to compute on it and by deferring the actual implementation of the monadic primitives.\\n- A co-inductive type, to allow potentially non-terminating programs. 
Co-inductive types are like lazy definitions in Haskell where it is possible to make an infinite list for example, as long as only a finite number of elements are consumed.\\n\\nIn that sense, we follow the approach in the paper [Modular, Compositional, and Executable Formal Semantics for LLVM IR](https://cambium.inria.fr/~eyoon/paper/vir.pdf), which uses a co-inductive free monad (interaction tree) to formalize a reasonable subset of the LLVM language that is also an SSA representation but with more low-level instructions than Go.\\n\\nOur definition for `M` for now is:\\n\\n```coq\\nModule M.\\n CoInductive t (A : Set) : Set :=\\n | Return (_ : A)\\n | Bind {B : Set} (_ : t B) (_ : B -> t A)\\n | Thunk (_ : t A)\\n | EvalBody (_ : list (Z * t A)).\\n Arguments Return {A}.\\n Arguments Bind {A B}.\\n Arguments Thunk {A}.\\n Arguments EvalBody {A}.\\nEnd M.\\nDefinition M : Set -> Set := M.t.\\n```\\n\\nWe define all the functions that we translate as mutually recursive with the `CoFixpoint ... with ...` keyword of Coq. Thus, we do not have to preserve the ordering of definitions that is required by Coq or care for recursive or mutually recursive functions in Go.\\n\\nHowever, we did not manage to make the type-checker of Coq happy for our `CoFixpoint` as many definitions are axiomatized, and the type-checker of Coq wants their definitions to know if they produce co-inductive constructors. 
So, for now, we admit this step by disabling the termination checker with this flag:\\n\\n```coq\\nLocal Unset Guard Checking.\\n```\\n\\n## Next\\n\\nWhen we translate our hello world example we get the Coq code:\\n\\n```coq\\nCoFixpoint Main (\u03b1 : list Val.t) : M (list Val.t) :=\\n M.Thunk (\\n match \u03b1 with\\n | [] =>\\n M.Thunk (M.EvalBody [(0,\\n let* \\"t0\\" := Instr.Alloc (* varargs *) Alloc.Heap \\"*[1]any\\" in\\n let* \\"t1\\" := Instr.IndexAddr (Register.read \\"t0\\") (Val.Lit (Lit.Int 0)) in\\n let* \\"t2\\" := Instr.MakeInterface (Val.Lit (Lit.String \\"Hello, World!\\")) in\\n do* Instr.Store (Register.read \\"t1\\") (Register.read \\"t2\\") in\\n let* \\"t3\\" := Instr.Slice (Register.read \\"t0\\") None None in\\n let* \\"t4\\" := Instr.Call (CallKind.Function (fmt.Println [(Register.read \\"t3\\")])) in\\n M.Return []\\n )])\\n | _ => M.Thunk (M.EvalBody [])\\n end)\\n\\nwith init (\u03b1 : list Val.t) : M (list Val.t) :=\\n M.Thunk (\\n match \u03b1 with\\n | [] =>\\n M.Thunk (M.EvalBody [(0,\\n let* \\"t0\\" := Instr.UnOp \\"*\\" (Register.read \\"init$guard\\") in\\n Instr.If (Register.read \\"t0\\") 2 1\\n );\\n (1,\\n do* Instr.Store (Register.read \\"init$guard\\") (Val.Lit (Lit.Bool true)) in\\n let* \\"t1\\" := Instr.Call (CallKind.Function (fmt.init [])) in\\n Instr.Jump 2\\n );\\n (2,\\n M.Return []\\n )])\\n | _ => M.Thunk (M.EvalBody [])\\n end).\\n```\\n\\nThe `init` function, which is automatically generated by the Go compiler to initialize global variables, does not do much here. It checks whether it was already called or not reading the `init$guard` variable, and if not, it calls the `fmt.init` function. The `Main` function is the one that we are interested in. It allocates a variable to store the string \\"Hello, World!\\", and then calls the `fmt.Println` function to print it.\\n\\nFrom there, to continue the project we have two possibilities:\\n\\n1. 
Give actual definitions to each primitive instruction that is used in this example (for now, everything is axiomatized).\\n2. Translate all the transitive dependencies of the hello world program to Coq, and make sure that we can compile everything together.\\n\\nFor the next step, we choose to follow the second possibility as we are more confident in being able to define the semantics of the instructions, which is purely done on the Coq side, than in being able to use the Go compiler\'s APIs to retrieve the definitions of all the dependencies and relate them together.\\n\\n## Conclusion\\n\\nWe have presented the beginning of our journey to translate Go programs to Coq, to build a formal verification tool for Go. The translation type-checks on the few examples we have tried but has no semantics. We will continue by handling the translation of dependencies of a package.\\n\\nIf you are interested in this project, please contact us at [contact@formal.land](mailto:contact@formal.land) or go to our [GitHub repository](https://github.com/formal-land/coq-of-go)."},{"id":"/2024/02/14/experiment-coq-of-hs","metadata":{"permalink":"/blog/2024/02/14/experiment-coq-of-hs","source":"@site/blog/2024-02-14-experiment-coq-of-hs.md","title":"\ud83e\uddee Experiment on translation from Haskell to Coq","description":"We present an experiment coq-of-hs that we have made on the translation of Haskell programs to the proof system Coq \ud83d\udc13. 
The goal is to formally verify Haskell programs to make them totally bug-free.","date":"2024-02-14T00:00:00.000Z","formattedDate":"February 14, 2024","tags":[{"label":"coq-of-hs","permalink":"/blog/tags/coq-of-hs"},{"label":"Haskell","permalink":"/blog/tags/haskell"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"}],"readingTime":4.365,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\uddee Experiment on translation from Haskell to Coq","tags":["coq-of-hs","Haskell","Coq","translation"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\uddab Translating Go to Coq, part 1","permalink":"/blog/2024/02/22/journey-coq-of-go"},"nextItem":{"title":"\ud83e\udd84 The importance of formal verification","permalink":"/blog/2024/02/02/formal-verification-for-aleph-zero"}},"content":"We present an experiment [coq-of-hs](https://github.com/formal-land/coq-of-hs-experiment) that we have made on the translation of [Haskell](https://www.haskell.org/) programs to the proof system [Coq \ud83d\udc13](https://coq.inria.fr/). The goal is to formally verify Haskell programs to make them totally bug-free.\\n\\nIndeed, even with the use of a strict type system, there can still be bugs for properties that cannot be expressed with types. An example of such a property is the backward compatibility of an API endpoint for the new release of a web service when there has been code refactoring. Only formal verification can cover all execution cases and kinds of properties.\\n\\nThe code of the tool is at: [github.com/formal-land/coq-of-hs-experiment](https://github.com/formal-land/coq-of-hs-experiment) (AGPL license)\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Contact\\n\\nWe propose tools to make your codebase totally bug-free. Contact us at [contact@formal.land](mailto:contact@formal.land) for more information! 
We offer a free audit to assess the feasibility of formal verification for your case.\\n\\n:::\\n\\n:::info Info\\n\\nWe estimate that the cost of formal verification should be 20% of the development cost. There are no reasons to still have bugs today!\\n\\n:::\\n\\n![Haskell Logo](2024-02-14/haskell_logo.svg)\\n\\n## Goal of the experiment\\n\\nThere are already some tools to formally verify Haskell programs:\\n\\n- [\ud83d\udc13 hs-to-coq](https://github.com/plclub/hs-to-coq) translation from Haskell to Coq\\n- [\ud83d\udca7 Liquid Haskell](https://en.wikipedia.org/wiki/Liquid_Haskell) verification using [SMT solvers](https://en.wikipedia.org/wiki/Satisfiability_modulo_theories)\\n\\nIn this experiment, we want to check the feasibility of translation from Haskell to Coq:\\n\\n- \ud83d\udc4d covering all the language without manual configuration or code changes,\\n- \ud83d\udc4e even if this is at the cost of a more verbose and low-level translation.\\n\\n## Example\\n\\nHere is an example of a Haskell function:\\n\\n```haskell\\nfixObvious :: (a -> a) -> a\\nfixObvious f = f (fixObvious f)\\n```\\n\\nthat `coq-of-hs` translates to this valid Coq code:\\n\\n```coq\\nCoFixpoint fixObvious : Val.t :=\\n (Val.Lam (fun (f : Val.t) => (Val.App f (Val.App fixObvious f)))).\\n```\\n\\n## Infrastructure\\n\\nWe read the [Haskell Core](https://serokell.io/blog/haskell-to-core) representation of Haskell using the GHC plugin system. Thus, we read the exact same code version as the one that is compiled down to assembly code by [GHC](https://www.haskell.org/ghc/), to take into account all compilation options.\\n\\nHaskell Core is an intermediate representation of Haskell that is close to the lambda calculus and used by the Haskell compiler for various optimizations passes. 
Here are all the constructors of the `Expr` type of Haskell Core:\\n\\n```haskell\\ndata Expr b\\n = Var Id\\n | Lit Literal\\n | App (Expr b) (Arg b)\\n | Lam b (Expr b)\\n | Let (Bind b) (Expr b)\\n | Case (Expr b) b Type [Alt b]\\n | Cast (Expr b) Coercion\\n | Tick (Tickish Id) (Expr b)\\n | Type Type\\n | Coercion Coercion\\n```\\n\\nThis paper [System FC, as implemented in GHC](https://repository.brynmawr.edu/cgi/viewcontent.cgi?article=1015&context=compsci_pubs) presents it as [System F](https://en.wikipedia.org/wiki/System_F) plus coercions. We translate Haskell code to an untyped version of the lambda calculus in Coq, with co-induction to allow for infinite data structures:\\n\\n```coq\\nModule Val.\\n #[bypass_check(positivity)]\\n CoInductive t : Set :=\\n | Lit (_ : Lit.t)\\n | Con (_ : string) (_ : list t)\\n | App (_ _ : t)\\n | Lam (_ : t -> t)\\n | Case (_ : t) (_ : t -> list (Case.t t))\\n | Impossible.\\nEnd Val.\\n```\\n\\nWe make the translation by induction over the Haskell Core representation, and we translate each constructor to a corresponding constructor of the Coq representation. We pretty-print the Coq code directly without using an intermediate representation. We use the [prettyprinter](https://github.com/quchen/prettyprinter) package with the two main following primitives:\\n\\n```haskell\\nconcatNest :: [Doc ()] -> Doc ()\\nconcatNest = group . nest 2 . vsep\\n\\nconcatGroup :: [Doc ()] -> Doc ()\\nconcatGroup = group . vsep\\n```\\n\\nto display a sub-term with or without indentation when splitting lines that are too long. This translation works well on all the Haskell expressions that we have tested.\\n\\n## Missing features\\n\\n### Semantics\\n\\nWe have not yet defined a semantics. For now, the terms that we generate in Coq are purely descriptive. We will wait to have examples of things to verify to define semantics that are practical to use.\\n\\n### Type-classes\\n\\nWe have not yet translated typeclasses. 
The Haskell Core language hides most of the typeclass-related code. For example, it represents instances as additional function parameters for functions that have a typeclass constraint. But we still need to declare the functions corresponding to the members of the typeclasses, which we have not done yet.\\n\\n### Multi-file projects\\n\\nWe have not yet implemented the translation of multi-file projects. We have only tested the translation of a single-file project.\\n\\n### Standard library\\n\\nSimilarly to the handling of multi-file projects, we have not yet tested the translation of projects using external libraries or the translation of the base library of Haskell.\\n\\n### Strict positivity\\n\\nWe had to turn off the strict positivity condition for the definition of `Val.t` in Coq with:\\n\\n```coq\\n#[bypass_check(positivity)]\\n```\\n\\nThis is due to the case:\\n\\n```coq\\n| Lam (_ : t -> t)\\n```\\n\\nwhere `t` appears as a parameter of a function (negative position). We do not know if this causes any problem in practice, on values that correspond to well-typed Haskell programs.\\n\\n## Conclusion\\n\\nWe have presented an experiment on the translation of Haskell programs to Coq. 
If you are interested in this project, please get in touch with us at [contact@formal.land](mailto:contact@formal.land) or go to the [GitHub repository](https://github.com/formal-land/coq-of-hs-experiment) of the project."},{"id":"/2024/02/02/formal-verification-for-aleph-zero","metadata":{"permalink":"/blog/2024/02/02/formal-verification-for-aleph-zero","source":"@site/blog/2024-02-02-formal-verification-for-aleph-zero.md","title":"\ud83e\udd84 The importance of formal verification","description":"Ensuring Flawless Software in a Flawed World","date":"2024-02-02T00:00:00.000Z","formattedDate":"February 2, 2024","tags":[],"readingTime":5.53,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd84 The importance of formal verification","authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\uddee Experiment on translation from Haskell to Coq","permalink":"/blog/2024/02/14/experiment-coq-of-hs"},"nextItem":{"title":"\ud83e\udd80 Upgrade the Rust version of coq-of-rust","permalink":"/blog/2024/01/18/update-coq-of-rust"}},"content":"> Ensuring Flawless Software in a Flawed World\\n\\nIn this blog post, we present what formal verification is and why this is such a valuable tool to improve the security of your applications.\\n\\n\x3c!-- truncate --\x3e\\n\\n![Formal verification](2024-02-02/formal_verification.png)\\n\\n:::tip Contact\\n\\nIf you want to formally verify your codebase to improve the security of your application, contact us at [contact@formal.land](mailto:contact@formal.land)! We offer a free audit of your codebase to assess the feasibility of formal verification.\\n\\n:::\\n\\n:::info Thanks\\n\\nThe current development of our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust), for the formal verification of Rust code, is made possible thanks to the [Aleph Zero](https://alephzero.org/)\'s Foundation and its [Ecosystem Funding Program](https://alephzero.org/ecosystem-funding-program). 
The aim is to develop an extra safe platform to build decentralized applications with formally verified smart contracts.\\n\\n:::\\n\\n## What is formal verification?\\n\\nFormal verification is a set of techniques to check for the complete correctness of a program, reasoning at a symbolic level rather than executing a particular instance of the code. By symbolic reasoning, we mean following the values of the variables by tracking their names and constraints, without necessarily giving them an example value. This is what we would do in our heads to understand a piece of code where a variable `username` appears, following which functions it is given to, to know where we use the user name. The concrete user name that we consider is irrelevant, although some people prefer to think with an example.\\n\\nIn formal verification, we rely on precise mathematical reasoning to make sure that there are no mistakes or missing cases. We check this reasoning with a dedicated program ([SMT](https://en.wikipedia.org/wiki/Satisfiability_modulo_theories) solver, [Coq](https://coq.inria.fr/) proof system, ...). Indeed, as programs grow in complexity, it could be easy to forget an `if` branch or an error case.\\n\\nFor example, to say that the following Rust program is valid:\\n\\n```rust\\n/// Return the maximum of [a] and [b]\\nfn get_max(a: u128, b: u128) -> u128 {\\n if a > b {\\n a\\n } else {\\n b\\n }\\n}\\n```\\n\\nwe reason on two cases (reasoning by disjunction):\\n\\n- `a > b` where `a` is the maximum,\\n- `a <= b` where `b` is the maximum,\\n\\nwith the values of `a` and `b` being irrelevant (symbolic). In both cases, we can conclude that `get_max` returns the maximum.\\n\\nThis is in contrast with testing, where we need to execute the program with all possible instances of `a` and `b` to check that the program is correct with 100% certainty. 
This is infeasible in this case as the type `u128` is too large to be tested exhaustively: there are `2^256` possible values for `a` and `b`, meaning `115792089237316195423570985008687907853269984665640564039457584007913129639936` possible values!\\n\\nA program is shown correct with respect to an expected behavior, called a _formal specification_. This is expressed in a mathematical language to be non-ambiguous. For example, we can specify the behavior of the previous program as:\\n\\n```\\nFORALL (a b : u128),\\n (get_max a b = a OR get_max a b = b) AND\\n (get_max a b >= a AND get_max a b >= b)\\n```\\n\\nstating that we indeed return the maximum of `a` and `b`.\\n\\nWhen a program is formally verified, we are mathematically sure it will always follow its specifications. This is a way to eliminate all bugs, as long as we have a complete specification of what it is supposed to do or not do. This corresponds to the highest level of Evaluation Assurance Level, [EAL7](https://en.wikipedia.org/wiki/Evaluation_Assurance_Level#EAL7:_Formally_Verified_Design_and_Tested). This is used for critical applications, such as space rocket software, where a single bug can be extremely expensive (the loss of a rocket!).\\n\\nThere are various formal verification tools, such as the proof system [Coq](https://coq.inria.fr/). The C compiler [CompCert](https://en.wikipedia.org/wiki/CompCert) is an example of large software verified in Coq. It is proven correct, in contrast to most other C compilers that contain [subtle bugs](https://users.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf). CompCert is now used by Airbus to compile C programs embedded in planes \ud83d\udeeb.\\n\\n## Why is it such a useful tool?\\n\\nFormal verification is extremely useful as it can anticipate all the bugs by exploring all possible execution cases of a program. Here is a quote from [Edsger W. 
Dijkstra](https://en.wikipedia.org/wiki/Formal_verification):\\n\\n> Program testing can be used to show the presence of bugs, but never to show their absence!\\n\\nIt offers the possibility to make software that never fails. This is often required for applications with human life at stake, such as planes or medical devices. But it can also be useful for applications where a single bug can be extremely expensive, such as financial applications.\\n\\nSmart contracts are a good example of such applications. They are programs that are executed on a blockchain and are used to manage assets worth billions of dollars. A single bug in a smart contract can lead to the loss of all the assets managed by the contract. In the first half of 2023, some estimate that attacks on web3 platforms resulted in a loss of [$655.61 million](https://www.linkedin.com/pulse/h1-2023-global-web3-security-report-aml-analysis-crypto-regulatory/), with most of these losses due to bugs in smart contracts. These bugs could be prevented using formally verified smart contracts.\\n\\nFinally, formal verification is useful to improve the quality of a program by enforcing the need to use:\\n\\n- clear programming constructs,\\n- an explicit specification of the behavior of the program.\\n\\n## Comparison of formal verification and testing\\n\\nCompared to testing, formal verification is more complex as:\\n\\n- it typically takes much more time to formally verify a program than to test it on a reasonable set of inputs,\\n- it requires a formal specification of the program, which is not always available,\\n- it requires some specific expertise to use the formal verification tools and to write the specifications.\\n\\nIn addition, formal verification assumes a certain model of the environment of the program, which is not always accurate. When actually executing the code, we also exercise all the dependencies (libraries, operating system, network, ...) 
that might cause issues at runtime.\\n\\nHowever, formal verification is the only way to have an exhaustive check of the program. It verifies all corner cases, such as integer overflows, or hard-to-reproduce issues, such as concurrency bugs. We recommend combining both approaches as they do not catch the same kinds of bugs.\\n\\nAt [Formal Land](https://formal.land/), we consider it critical to lower the cost of formal verification to apply it to a larger scope of programs and prevent more bugs and attacks. We work on the formal verification of Rust with [coq-of-rust](https://github.com/formal-land/coq-of-rust) and OCaml with [coq-of-ocaml](https://github.com/formal-land/coq-of-ocaml).\\n\\n## Conclusion\\n\\nFormal verification is a powerful tool to improve the security of your applications. It is the only way to prevent all bugs by exploring all possible executions of your programs. It complements existing testing methods. It is particularly useful for critical applications, such as smart contracts, where a single bug can be extremely expensive."},{"id":"/2024/01/18/update-coq-of-rust","metadata":{"permalink":"/blog/2024/01/18/update-coq-of-rust","source":"@site/blog/2024-01-18-update-coq-of-rust.md","title":"\ud83e\udd80 Upgrade the Rust version of coq-of-rust","description":"We continue our work on the coq-of-rust tool to formally verify Rust programs with the Coq proof assistant. 
We have upgraded the Rust version that we support, simplified the translation of the traits, and are adding better support for the standard library of Rust.","date":"2024-01-18T00:00:00.000Z","formattedDate":"January 18, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Aleph-Zero","permalink":"/blog/tags/aleph-zero"}],"readingTime":3.5,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Upgrade the Rust version of coq-of-rust","tags":["coq-of-rust","Rust","Coq","Aleph-Zero"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd84 The importance of formal verification","permalink":"/blog/2024/02/02/formal-verification-for-aleph-zero"},"nextItem":{"title":"\ud83e\udd80 Translating Rust match patterns to Coq with coq-of-rust","permalink":"/blog/2024/01/04/rust-translating-match"}},"content":"We continue our work on the [coq-of-rust](https://github.com/formal-land/coq-of-rust) tool to formally verify Rust programs with the [Coq proof assistant](https://coq.inria.fr/). We have upgraded the Rust version that we support, simplified the translation of the traits, and are adding better support for the standard library of Rust.\\n\\nOverall, we are now able to translate **about 80%** of the Rust examples from the [Rust by Example](https://doc.rust-lang.org/stable/rust-by-example/) book into valid Coq files. This means we support a large subset of the Rust language.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Purchase\\n\\nTo formally verify your Rust codebase and improve the security of your application, email us at [contact@formal.land](mailto:contact@formal.land)! 
Formal verification is the only way to prevent all bugs by exploring all possible executions of your programs \ud83c\udfaf.\\n\\n:::\\n\\n:::info Thanks\\n\\nThis work and the development of [coq-of-rust](https://github.com/formal-land/coq-of-rust) is made possible thanks to the [Aleph Zero](https://alephzero.org/)\'s Foundation, to develop an extra safe platform to build decentralized applications with formally verified smart contracts.\\n\\n:::\\n\\n![Rust rooster](2024-01-18/rooster.png)\\n\\n## Upgrade of the Rust version\\n\\nThe tool `coq-of-rust` is tied to a particular version of the Rust compiler that we use to parse and type-check a `cargo` project. We now support the `nightly-2023-12-15` version of Rust, up from `nightly-2023-04-30`. Most of the changes were minor, but it is good to handle these regularly to have smooth upgrades. The corresponding pull request is [coq-of-rust/pull/445](https://github.com/formal-land/coq-of-rust/pull/445). We also got more [Clippy](https://github.com/rust-lang/rust-clippy) warnings thanks to the new version of Rust.\\n\\n## Simplify the translation of traits\\n\\nThe traits of Rust are similar to the [type-classes of Coq](https://coq.inria.fr/refman/addendum/type-classes.html). This is how we translate traits to Coq.\\n\\nBut there are a lot of subtle differences between the two languages. The type-class inference mechanism of Coq does not work all the time on generated Rust code, even when adding a lot of code annotations. We think that the only reliable way to translate Rust traits would be to make explicit the implementations inferred by the Rust compiler, but the Rust compiler currently throws away this information.\\n\\nInstead, our new solution is to use a Coq tactic:\\n\\n```coq\\n(** Try first to infer the trait instance, and if unsuccessful, delegate it at\\n proof time. 
*)\\nLtac get_method method :=\\n exact (M.pure (method _)) ||\\n exact (M.get_method method).\\n```\\n\\nthat first tries to infer the trait instance for a particular method, and if it fails, delegates its definition to the user at proof time. This is a bit unsafe, as a user could provide invalid instances at proof time, by giving some custom instance definitions instead of the ones generated by `coq-of-rust`. So, one should be careful to only apply generated instances to fill the hole made by this tactic in case of failure. We believe this to be a reasonable assumption that we could enforce someday if needed.\\n\\nWe are also starting to remove the trait constraints on polymorphic functions (the `where` clauses). We start by doing it in our manual definition of the standard library of Rust. The rationale is that we can provide the actual trait instances at proof time by having the right hypothesis replicating the constraints of the `where` clauses. Having fewer `where` clauses reduces the complexity of the type inference of Coq on the generated code. There are still some cases that we need to clarify, for example, the handling of [associated types](https://doc.rust-lang.org/rust-by-example/generics/assoc_items/types.html) in the absence of traits.\\n\\n## Handling more of the standard library\\n\\nWe have a definition of the standard library of Rust, mainly composed of axiomatized[^1] definitions, in these three folders:\\n\\n- [CoqOfRust/alloc](https://github.com/formal-land/coq-of-rust/tree/main/CoqOfRust/alloc)\\n- [CoqOfRust/core](https://github.com/formal-land/coq-of-rust/tree/main/CoqOfRust/core)\\n- [CoqOfRust/std](https://github.com/formal-land/coq-of-rust/tree/main/CoqOfRust/std)\\n\\nBy adding more of these axioms, as well as with some small changes to the `coq-of-rust` tool, we are now able to successfully translate around 80% of the examples of the [Rust by Example](https://doc.rust-lang.org/stable/rust-by-example/) book. 
There can still be some challenges on larger programs, but this showcases the good support of `coq-of-rust` for the Rust language.\\n\\n## Conclusion\\n\\nWe are continuing to improve our tool `coq-of-rust` to support more of the Rust language and are making good progress. If you need to improve the security of critical applications written in Rust, contact us at [contact@formal.land](mailto:contact@formal.land) to start formally verifying your code!\\n\\n[^1]: An axiom in Coq is either a theorem whose proof is admitted, or a function/constant definition left for later. This is the equivalent in Rust of the `todo!` macro."},{"id":"/2024/01/04/rust-translating-match","metadata":{"permalink":"/blog/2024/01/04/rust-translating-match","source":"@site/blog/2024-01-04-rust-translating-match.md","title":"\ud83e\udd80 Translating Rust match patterns to Coq with coq-of-rust","description":"Our tool coq-of-rust enables formal verification of \ud83e\udd80 Rust code to make sure that a program has no bugs. This technique checks all possible execution paths using mathematical techniques. 
This is important for example to ensure the security of smart contracts written in Rust language.","date":"2024-01-04T00:00:00.000Z","formattedDate":"January 4, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Aleph-Zero","permalink":"/blog/tags/aleph-zero"}],"readingTime":6.005,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Translating Rust match patterns to Coq with coq-of-rust","tags":["coq-of-rust","Rust","Coq","Aleph-Zero"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Upgrade the Rust version of coq-of-rust","permalink":"/blog/2024/01/18/update-coq-of-rust"},"nextItem":{"title":"\ud83e\udd80 Verifying an ERC-20 smart contract in Rust","permalink":"/blog/2023/12/13/rust-verify-erc-20-smart-contract"}},"content":"Our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) enables [formal verification](https://en.wikipedia.org/wiki/Formal_verification) of [\ud83e\udd80 Rust](https://www.rust-lang.org/) code to make sure that a program has no bugs. This technique checks all possible execution paths using mathematical techniques. This is important for example to ensure the security of smart contracts written in Rust language.\\n\\nOur tool `coq-of-rust` works by translating Rust programs to the general proof system [\ud83d\udc13 Coq](https://coq.inria.fr/). Here we explain how we translate[ `match` patterns](https://doc.rust-lang.org/book/ch06-02-match.html) from Rust to Coq. The specificity of Rust patterns is to be able to match values either by value or reference.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Purchase\\n\\nTo formally verify your Rust codebase and improve the security of your application, email us at [contact@formal.land](mailto:contact@formal.land)! 
Formal verification is the only way to prevent all bugs by exploring all possible executions of your program.\\n\\n:::\\n\\n:::info Thanks\\n\\nThis work and the development of [coq-of-rust](https://github.com/formal-land/coq-of-rust) is made possible thanks to the [Aleph Zero](https://alephzero.org/)\'s Foundation, to develop an extra safe platform to build decentralized applications with formally verified smart contracts.\\n\\n:::\\n\\n![Rust rooster](2024-01-04/rust-rooster.png)\\n\\n## Rust example \ud83e\udd80\\n\\nTo illustrate the pattern matching in Rust, we will use the following example featuring a match by reference:\\n\\n```rust\\npub(crate) fn is_option_equal<A>
(\\n is_equal: fn(x: &A, y: &A) -> bool,\\n lhs: Option<A>,\\n rhs: &A,\\n) -> bool {\\n match lhs {\\n None => false,\\n Some(ref value) => is_equal(value, rhs),\\n }\\n}\\n```\\n\\nWe take a function `is_equal` as a parameter, operating only on references to the type `A`. We apply it to compare two values `lhs` and `rhs`:\\n\\n- if `lhs` is `None`, we return `false`,\\n- if `lhs` is `Some`, we get its value by reference and apply `is_equal`.\\n\\nWhen we apply the pattern:\\n\\n```rust\\nSome(ref value) => ...\\n```\\n\\nwe do something interesting: we read the value of `lhs` to know if we are in a `Some` case but leave it in place and return `value`, the reference to its content.\\n\\nTo simulate this behavior in Coq, we need to match in two steps:\\n\\n1. match the value of `lhs` to know if we are in a `Some` case or not,\\n2. if we are in a `Some` case, create the reference to the content of a `Some` case based on the reference to `lhs`.\\n\\n## Coq translation \ud83d\udc13\\n\\nThe Coq translation that our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) generates is the following:\\n\\n```coq\\nDefinition is_option_equal\\n {A : Set}\\n (is_equal : (ref A) -> (ref A) -> M bool.t)\\n (lhs : core.option.Option.t A)\\n (rhs : ref A)\\n : M bool.t :=\\n let* is_equal := M.alloc is_equal in\\n let* lhs := M.alloc lhs in\\n let* rhs := M.alloc rhs in\\n let* \u03b10 : M.Val bool.t :=\\n match_operator\\n lhs\\n [\\n fun \u03b3 =>\\n (let* \u03b10 := M.read \u03b3 in\\n match \u03b10 with\\n | core.option.Option.None => M.alloc false\\n | _ => M.break_match\\n end) :\\n M (M.Val bool.t);\\n fun \u03b3 =>\\n (let* \u03b10 := M.read \u03b3 in\\n match \u03b10 with\\n | core.option.Option.Some _ =>\\n let \u03b30_0 := \u03b3.[\\"Some.0\\"] in\\n let* value := M.alloc (borrow \u03b30_0) in\\n let* \u03b10 : (ref A) -> (ref A) -> M bool.t := M.read is_equal in\\n let* \u03b11 : ref A := M.read value in\\n let* \u03b12 : ref A := M.read rhs in\\n let* \u03b13 : 
bool.t := M.call (\u03b10 \u03b11 \u03b12) in\\n M.alloc \u03b13\\n | _ => M.break_match\\n end) :\\n M (M.Val bool.t)\\n ] in\\n M.read \u03b10.\\n```\\n\\nWe run the `match_operator` on `lhs` and the two branches of the `match`. This operator is of type:\\n\\n```coq\\nDefinition match_operator {A B : Set}\\n (scrutinee : A)\\n (arms : list (A -> M B)) :\\n M B :=\\n ...\\n```\\n\\nIt takes a `scrutinee` value to match as a parameter, and runs a sequence of functions `arms` on it. Each function `arms` takes the value of the `scrutinee` and returns a monadic value `M B`. This monadic value can either be a success value if the pattern matches, or a special failure value if the pattern does not match. We evaluate the branches until one succeeds.\\n\\n### `None` branch\\n\\nThe `None` branch is the simplest one. We read the value at the address given by `lhs` (we represent each Rust variable by its address) and match it with the `None` constructor:\\n\\n```coq\\nfun \u03b3 =>\\n (let* \u03b10 := M.read \u03b3 in\\n match \u03b10 with\\n | core.option.Option.None => M.alloc false\\n | _ => M.break_match\\n end) :\\n M (M.Val bool.t)\\n```\\n\\nIf it matches, we return `false`. 
If it does not, we return the special value `M.break_match` to indicate that the pattern does not match.\\n\\n### `Some` branch\\n\\nIn the `Some` branch, we first also read the value at the address given by `lhs` and match it with the `Some` constructor:\\n\\n```coq\\nfun \u03b3 =>\\n (let* \u03b10 := M.read \u03b3 in\\n match \u03b10 with\\n | core.option.Option.Some _ =>\\n let \u03b30_0 := \u03b3.[\\"Some.0\\"] in\\n let* value := M.alloc (borrow \u03b30_0) in\\n let* \u03b10 : (ref A) -> (ref A) -> M bool.t := M.read is_equal in\\n let* \u03b11 : ref A := M.read value in\\n let* \u03b12 : ref A := M.read rhs in\\n let* \u03b13 : bool.t := M.call (\u03b10 \u03b11 \u03b12) in\\n M.alloc \u03b13\\n | _ => M.break_match\\n end) :\\n M (M.Val bool.t)\\n```\\n\\nIf we are in that case, we create the value:\\n\\n```coq\\nlet \u03b30_0 := \u03b3.[\\"Some.0\\"] in\\n```\\n\\nwith the address of the first field of the `Some` constructor, relative to the address of `lhs` given in `\u03b3`. We define the operator `.[\\"Some.0\\"]` when we define the option type and generate such definitions for all user-defined enum types.\\n\\nWe then encapsulate the address `\u03b30_0` in a proper Rust reference:\\n\\n```coq\\nlet* value := M.alloc (borrow \u03b30_0) in\\n```\\n\\nof type `ref A` in the original Rust code. Finally, we call the function `is_equal` on the two references `value` and `rhs`, with some boilerplate code to read and allocate the variables.\\n\\n## General translation\\n\\nWe generalize this translation to all patterns by:\\n\\n- flattening all the or patterns `|` so that only patterns with a single choice remain,\\n- evaluating each match branch in order with the `match_operator` operator,\\n- in each branch, evaluating the inner patterns in order. This evaluation might fail at any point if the pattern does not match. 
In this case, we return the special value `M.break_match` and continue with the next branch.\\n\\nAt least one branch should succeed as the Rust compiler checks that all cases are covered. We still have a special value `M.impossible` in Coq for the case where no pattern matches, in order to satisfy the type checker.\\n\\nWe distinguish and handle the following kinds of patterns (and all their combinations):\\n\\n- wild patterns `_`,\\n- binding patterns `(ref) name` or `(ref) name @ pattern` (the `ref` keyword is optional),\\n- struct patterns `Name { field1: pattern1, ... }` or `Name(pattern1, ...)`,\\n- tuple patterns `(pattern1, ...)`,\\n- literal patterns `12`, `true`, ...,\\n- slice patterns `[first, second, tail @ ..]`,\\n- dereference patterns `&pattern`.\\n\\nThis was enough to cover all of our examples. The Rust compiler can also automatically add some `ref` patterns when matching on references. We do not need to handle this case as this is automatically done by the Rust compiler during its compilation to the intermediate [THIR](https://rustc-dev-guide.rust-lang.org/thir.html) representation, and we directly read the THIR code.\\n\\n## Conclusion\\n\\nIn this blog post, we have presented how we translate Rust patterns to the proof system Coq. The difficult part is handling the `ref` patterns, which we do by matching in two steps: matching on the values and then computing the addresses of the sub-fields.\\n\\nIf you have Rust smart contracts or programs to verify, feel free to email us at [contact@formal.land](mailto:contact@formal.land). 
We will be happy to help!"},{"id":"/2023/12/13/rust-verify-erc-20-smart-contract","metadata":{"permalink":"/blog/2023/12/13/rust-verify-erc-20-smart-contract","source":"@site/blog/2023-12-13-rust-verify-erc-20-smart-contract.md","title":"\ud83e\udd80 Verifying an ERC-20 smart contract in Rust","description":"Our tool coq-of-rust enables formal verification of \ud83e\udd80 Rust code to make sure that a program has no bugs given a precise specification. We work by translating Rust programs to the general proof system \ud83d\udc13 Coq.","date":"2023-12-13T00:00:00.000Z","formattedDate":"December 13, 2023","tags":[{"label":"Aleph-Zero","permalink":"/blog/tags/aleph-zero"},{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"ERC-20","permalink":"/blog/tags/erc-20"},{"label":"ink!","permalink":"/blog/tags/ink"}],"readingTime":20.12,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Verifying an ERC-20 smart contract in Rust","tags":["Aleph-Zero","coq-of-rust","Rust","Coq","ERC-20","ink!"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Translating Rust match patterns to Coq with coq-of-rust","permalink":"/blog/2024/01/04/rust-translating-match"},"nextItem":{"title":"\ud83e\udd80 Translation of function bodies from Rust to Coq","permalink":"/blog/2023/11/26/rust-function-body"}},"content":"Our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) enables formal verification of [\ud83e\udd80 Rust](https://www.rust-lang.org/) code to make sure that a program has no bugs given a precise specification. 
We work by translating Rust programs to the general proof system [\ud83d\udc13 Coq](https://coq.inria.fr/).\\n\\nHere, we show how we formally verify an [ERC-20 smart contract](https://github.com/use-ink/ink/blob/master/integration-tests/public/erc20/lib.rs) written in Rust for the [Aleph Zero](https://alephzero.org/) blockchain. [ERC-20](https://en.wikipedia.org/wiki/Ethereum#ERC20) smart contracts are used to create new kinds of tokens in an existing blockchain. Examples are stable coins such as the [\ud83d\udcb2USDT](https://tether.to/).\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Purchase\\n\\nTo formally verify your Rust codebase and improve the security of your application, email us at [contact@formal.land](mailto:contact@formal.land)! Formal verification is the only way to prevent all bugs by exploring all possible executions of your program.\\n\\n:::\\n\\n:::info Thanks\\n\\nThis work and the development of [coq-of-rust](https://github.com/formal-land/coq-of-rust) are made possible thanks to the [Aleph Zero](https://alephzero.org/) Foundation, to develop an extra safe platform to build decentralized applications with formally verified smart contracts.\\n\\n:::\\n\\n![Rooster verifying](2023-12-13/rooster-verifying.png)\\n\\n## Smart contract code \ud83e\udd80\\n\\nHere is the Rust code of the smart contract that we want to verify:\\n\\n```rust\\n#[ink::contract]\\nmod erc20 {\\n use ink::storage::Mapping;\\n\\n #[ink(storage)]\\n #[derive(Default)]\\n pub struct Erc20 {\\n total_supply: Balance,\\n balances: Mapping<AccountId, Balance>,\\n allowances: Mapping<(AccountId, AccountId), Balance>,\\n }\\n\\n #[ink(event)]\\n pub struct Transfer {\\n // ...\\n }\\n\\n #[ink(event)]\\n pub struct Approval {\\n // ...\\n }\\n\\n #[derive(Debug, PartialEq, Eq)]\\n #[ink::scale_derive(Encode, Decode, TypeInfo)]\\n pub enum Error {\\n // ...\\n }\\n\\n pub type Result<T> = core::result::Result<T, Error>;\\n\\n impl Erc20 {\\n #[ink(constructor)]\\n pub fn new(total_supply: Balance) -> Self {\\n let 
mut balances = Mapping::default();\\n let caller = Self::env().caller();\\n balances.insert(caller, &total_supply);\\n Self::env().emit_event(Transfer {\\n from: None,\\n to: Some(caller),\\n value: total_supply,\\n });\\n Self {\\n total_supply,\\n balances,\\n allowances: Default::default(),\\n }\\n }\\n\\n #[ink(message)]\\n pub fn total_supply(&self) -> Balance {\\n self.total_supply\\n }\\n\\n #[ink(message)]\\n pub fn balance_of(&self, owner: AccountId) -> Balance {\\n self.balance_of_impl(&owner)\\n }\\n\\n #[inline]\\n fn balance_of_impl(&self, owner: &AccountId) -> Balance {\\n self.balances.get(owner).unwrap_or_default()\\n }\\n\\n #[ink(message)]\\n pub fn allowance(&self, owner: AccountId, spender: AccountId) -> Balance {\\n self.allowance_impl(&owner, &spender)\\n }\\n\\n #[inline]\\n fn allowance_impl(&self, owner: &AccountId, spender: &AccountId) -> Balance {\\n self.allowances.get((owner, spender)).unwrap_or_default()\\n }\\n\\n #[ink(message)]\\n pub fn transfer(&mut self, to: AccountId, value: Balance) -> Result<()> {\\n let from = self.env().caller();\\n self.transfer_from_to(&from, &to, value)\\n }\\n\\n #[ink(message)]\\n pub fn approve(&mut self, spender: AccountId, value: Balance) -> Result<()> {\\n let owner = self.env().caller();\\n self.allowances.insert((&owner, &spender), &value);\\n self.env().emit_event(Approval {\\n owner,\\n spender,\\n value,\\n });\\n Ok(())\\n }\\n\\n #[ink(message)]\\n pub fn transfer_from(\\n &mut self,\\n from: AccountId,\\n to: AccountId,\\n value: Balance,\\n ) -> Result<()> {\\n let caller = self.env().caller();\\n let allowance = self.allowance_impl(&from, &caller);\\n if allowance < value {\\n return Err(Error::InsufficientAllowance)\\n }\\n self.transfer_from_to(&from, &to, value)?;\\n // We checked that allowance >= value\\n #[allow(clippy::arithmetic_side_effects)]\\n self.allowances\\n .insert((&from, &caller), &(allowance - value));\\n Ok(())\\n }\\n\\n fn transfer_from_to(\\n &mut self,\\n from: 
&AccountId,\\n to: &AccountId,\\n value: Balance,\\n ) -> Result<()> {\\n let from_balance = self.balance_of_impl(from);\\n if from_balance < value {\\n return Err(Error::InsufficientBalance)\\n }\\n // We checked that from_balance >= value\\n #[allow(clippy::arithmetic_side_effects)]\\n self.balances.insert(from, &(from_balance - value));\\n let to_balance = self.balance_of_impl(to);\\n self.balances\\n .insert(to, &(to_balance.checked_add(value).unwrap()));\\n self.env().emit_event(Transfer {\\n from: Some(*from),\\n to: Some(*to),\\n value,\\n });\\n Ok(())\\n }\\n }\\n}\\n```\\n\\nThis whole code is rather short and contains no loops, which will simplify our verification process. It uses a lot of macros, such as `#[ink(message)]`, that are specific to the [ink!](https://use.ink/) language for smart contracts, built on top of Rust. To verify this smart contract, we removed all the macros and added a mock of the dependencies, such as `ink::storage::Mapping` to get a map data structure.\\n\\n## The Coq translation \ud83d\udc13\\n\\nBy running our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) we automatically obtain the corresponding Coq code for the contract [erc20.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/examples/default/examples/ink_contracts/erc20.v). 
Here is an extract for the `transfer` function:\\n\\n```coq\\n(*\\n fn transfer(&mut self, to: AccountId, value: Balance) -> Result<()> {\\n let from = self.env().caller();\\n self.transfer_from_to(&from, &to, value)\\n }\\n*)\\nDefinition transfer\\n (self : mut_ref ltac:(Self))\\n (to : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance))\\n : M ltac:(erc20.Result unit) :=\\n let* self : M.Val (mut_ref ltac:(Self)) := M.alloc self in\\n let* to : M.Val erc20.AccountId.t := M.alloc to in\\n let* value : M.Val ltac:(erc20.Balance) := M.alloc value in\\n let* from : M.Val erc20.AccountId.t :=\\n let* \u03b10 : mut_ref erc20.Erc20.t := M.read self in\\n let* \u03b11 : erc20.Env.t :=\\n M.call (erc20.Erc20.t::[\\"env\\"] (borrow (deref \u03b10))) in\\n let* \u03b12 : M.Val erc20.Env.t := M.alloc \u03b11 in\\n let* \u03b13 : erc20.AccountId.t :=\\n M.call (erc20.Env.t::[\\"caller\\"] (borrow \u03b12)) in\\n M.alloc \u03b13 in\\n let* \u03b10 : mut_ref erc20.Erc20.t := M.read self in\\n let* \u03b11 : u128.t := M.read value in\\n let* \u03b12 : core.result.Result.t unit erc20.Error.t :=\\n M.call\\n (erc20.Erc20.t::[\\"transfer_from_to\\"] \u03b10 (borrow from) (borrow to) \u03b11) in\\n let* \u03b10 : M.Val (core.result.Result.t unit erc20.Error.t) := M.alloc \u03b12 in\\n M.read \u03b10.\\n```\\n\\nMore details of the translation are given in previous blog posts, but basically:\\n\\n- we make explicit all memory and implicit operations (like borrowing and dereferencing),\\n- we apply a monadic translation to chain the primitive operations with `let*`.\\n\\n## Proof strategy\\n\\n![Proof strategy](2023-12-13/proof-strategy.png)\\n\\nWe verify the code in two steps:\\n\\n1. Show that a simpler, purely functional Coq code can simulate all the smart contract code.\\n2. Show that the simulation is correct.\\n\\nThat way, we can eliminate all the memory-related operations by showing the equivalence with a simulation. 
Then, we can focus on the functional code, which is more straightforward to reason about. We can cite another project, [Aeneas](https://github.com/AeneasVerif/aeneas), which proposes to do the first step (removing memory operations) automatically.\\n\\n## Simulations\\n\\n### Simulation code\\n\\nWe will work on the example of the `transfer` function. We define the simulations in [Simulations/erc20.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/examples/default/examples/ink_contracts/Simulations/erc20.v). For the `transfer` function this is:\\n\\n```coq\\nDefinition transfer\\n (env : erc20.Env.t)\\n (to : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance)) :\\n MS? State.t ltac:(erc20.Result unit) :=\\n transfer_from_to (Env.caller env) to value.\\n```\\n\\nThe function `transfer` is a wrapper around `transfer_from_to`, using the smart contract caller as the `from` account. The monad `MS?` combines the state and error effect. The state is given by the `State.t` type:\\n\\n```coq\\nModule State.\\n Definition t : Set := erc20.Erc20.t * list erc20.Event.t.\\nEnd State.\\n```\\n\\nIt combines the state of the contract (type `Self` in the Rust code) and a list of events to represent the logs. The errors of the monad include panic errors, as well as control flow primitives such as `return` or `break` that we implement with exceptions.\\n\\n### Equivalence statement\\n\\nWe write all our proofs in [Proofs/erc20.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/examples/default/examples/ink_contracts/Proofs/erc20.v). 
The lemma stating that the simulation is equivalent to the original code is:\\n\\n```coq\\nLemma run_transfer\\n (env : erc20.Env.t)\\n (storage : erc20.Erc20.t)\\n (to : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance))\\n (H_storage : Erc20.Valid.t storage)\\n (H_value : Integer.Valid.t value) :\\n let state := State.of_storage storage in\\n let self := Ref.mut_ref Address.storage in\\n let simulation :=\\n lift_simulation\\n (Simulations.erc20.transfer env to value) storage in\\n {{ Environment.of_env env, state |\\n erc20.Impl_erc20_Erc20_t_2.transfer self to value \u21d3\\n simulation.(Output.result)\\n | simulation.(Output.state) }}.\\n```\\n\\nThe main predicate is:\\n\\n```coq\\n{{ env, state | translated_code \u21d3 result | final_state }}.\\n```\\n\\nThis predicate defines our semantics, explaining how to evaluate a translated Rust code in an environment `env` and a state `state`, to obtain a result `result` and a final state `final_state`. We use an environment in addition to a state to initialize various globals and other information related to the execution context. 
For example, here, we use the environment to store the `caller` of the contract and the pointer to the list of logs.\\n\\n### Semantics\\n\\nWe define our monad for the translated code `M A` in continuation-passing style:\\n\\n```coq\\nInductive t (A : Set) : Set :=\\n| Pure : A -> t A\\n| CallPrimitive {B : Set} : Primitive.t B -> (B -> t A) -> t A\\n| Cast {B1 B2 : Set} : B1 -> (B2 -> t A) -> t A\\n| Impossible : t A.\\nArguments Pure {_}.\\nArguments CallPrimitive {_ _}.\\nArguments Cast {_ _ _}.\\nArguments Impossible {_}.\\n```\\n\\nFor now, we use the primitives to access the memory and the environment:\\n\\n```coq\\nModule Primitive.\\n Inductive t : Set -> Set :=\\n | StateAlloc {A : Set} : A -> t (Ref.t A)\\n | StateRead {Address A : Set} : Address -> t A\\n | StateWrite {Address A : Set} : Address -> A -> t unit\\n | EnvRead {A : Set} : t A.\\nEnd Primitive.\\n```\\n\\nFor each of our monad constructs, we add a case to our evaluation predicate, which we describe below:\\n\\n- `Pure` The result is the value itself, and the state is unchanged:\\n ```coq\\n | Pure :\\n {{ env, state\' | LowM.Pure result \u21d3 result | state\' }}\\n ```\\n- `Cast` The evaluation is only possible when `B1` and `B2` are the same type `B`:\\n ```coq\\n | Cast {B : Set} (state : State) (v : B) (k : B -> LowM A) :\\n {{ env, state | k v \u21d3 result | state\' }} ->\\n {{ env, state | LowM.Cast v k \u21d3 result | state\' }}\\n ```\\n In this case, we return the result of the continuation `k` of the cast. We do not change the state in the cast.\\n- We read the state using the primitive `State.read`, checking that the `address` is indeed allocated (it returns `None` otherwise). Note that the type of `v` depends on its address. 
We directly allocate values with their original type, to avoid serializations/deserializations to represent the state.\\n ```coq\\n | CallPrimitiveStateRead\\n (address : Address) (v : State.get_Set address)\\n (state : State)\\n (k : State.get_Set address -> LowM A) :\\n State.read address state = Some v ->\\n {{ env, state | k v \u21d3 result | state\' }} ->\\n {{ env, state |\\n LowM.CallPrimitive (Primitive.StateRead address) k \u21d3 result\\n | state\' }}\\n ```\\n- Similarly, we write into the state with `State.alloc_write`, that only succeeds for allocated addresses:\\n ```coq\\n | CallPrimitiveStateWrite\\n (address : Address) (v : State.get_Set address)\\n (state state_inter : State)\\n (k : unit -> LowM A) :\\n State.alloc_write address state v = Some state_inter ->\\n {{ env, state_inter | k tt \u21d3 result | state\' }} ->\\n {{ env, state |\\n LowM.CallPrimitive (Primitive.StateWrite address v) k \u21d3 result\\n | state\' }}\\n ```\\n- To allocate a new value in memory, we have to make a choice depending on whether we want this value to be writable or not. For immutable values, we do not create a new address and instead say that the address is the value itself:\\n ```coq\\n | CallPrimitiveStateAllocNone {B : Set}\\n (state : State) (v : B)\\n (k : Ref B -> LowM A) :\\n {{ env, state | k (Ref.Imm v) \u21d3 result | state\' }} ->\\n {{ env, state |\\n LowM.CallPrimitive (Primitive.StateAlloc v) k \u21d3 result\\n | state\' }}\\n ```\\n If we later attempt to update this value, it will not be possible to define a semantics and we will be stuck. It is up to the user to correctly anticipate if a value will be updated or not to define the semantics. 
For values that might be updated, we use:\\n ```coq\\n | CallPrimitiveStateAllocSome\\n (address : Address) (v : State.get_Set address)\\n (state : State)\\n (k : Ref (State.get_Set address) -> LowM A) :\\n let r :=\\n Ref.MutRef (A := State.get_Set address) (B := State.get_Set address)\\n address (fun full_v => full_v) (fun v _full_v => v) in\\n State.read address state = None ->\\n State.alloc_write address state v = Some state\' ->\\n {{ env, state | k r \u21d3 result | state\' }} ->\\n {{ env, state |\\n LowM.CallPrimitive (Primitive.StateAlloc v) k \u21d3 result\\n | state\' }}\\n ```\\n We need to provide an address not already allocated: `State.read` should return `None`. At this point, we can make any choice of unallocated address in order to simplify the proofs later.\\n- Finally, we read the whole environment with:\\n ```coq\\n | CallPrimitiveEnvRead\\n (state : State) (k : Env -> LowM A) :\\n {{ env, state | k env \u21d3 result | state\' }} ->\\n {{ env, state |\\n LowM.CallPrimitive Primitive.EnvRead k \u21d3 result\\n | state\' }}\\n ```\\n\\n### Semantics remarks\\n\\nWe can make a few remarks about our semantics:\\n\\n- There are no cases for `M.Impossible` as this primitive corresponds to impossible branches in the code.\\n- The semantics is not computable, in the sense that we cannot define a function `run` to evaluate a monadic program in a certain environment and state. Indeed, the user needs to make a choice during the allocation of new values, to know if we allocate the value as immutable or mutable, and with which address. The `M.Cast` operator is also not computable, as we cannot decide if two types are equal.\\n- We can choose the type that we use for the `State`, as well as the primitives `State.read` and `State.alloc_write`, as long as they verify well-formedness properties. For example, reading after a write at the same address should return the written value. One should choose a `State` that simplifies its proofs the most. 
To verify the smart contract, we have taken a record with two fields:\\n 1. the storage of the contract (the `Self` type in Rust),\\n 2. the list of events logged by the contract.\\n- Even if the monad is in continuation-passing style, we add a primitive `M.Call` corresponding to a bind, to make explicit the points in the code where we call user-defined functions. This is not necessary but helpful to track things in the proofs. Otherwise, the monadic bind is defined as a fixpoint with:\\n ```coq\\n Fixpoint bind {A B : Set} (e1 : t A) (f : A -> t B) : t B :=\\n match e1 with\\n | Pure v => f v\\n | CallPrimitive primitive k =>\\n CallPrimitive primitive (fun v => bind (k v) f)\\n | Cast v k =>\\n Cast v (fun v\' => bind (k v\') f)\\n | Impossible => Impossible\\n end.\\n ```\\n- To handle the panic and `return`/`continue`/`break` exceptions, we wrap our monad into an error monad:\\n ```coq\\n Definition M (A : Set) : Set :=\\n LowM (A + Exception.t).\\n ```\\n where `LowM` is the monad without errors as defined above and `Exception.t` is:\\n ```coq\\n Module Exception.\\n Inductive t : Set :=\\n (** exceptions for Rust\'s `return` *)\\n | Return {A : Set} : A -> t\\n (** exceptions for Rust\'s `continue` *)\\n | Continue : t\\n (** exceptions for Rust\'s `break` *)\\n | Break : t\\n | Panic : Coq.Strings.String.string -> t.\\n End Exception.\\n ```\\n\\n### Proof of equivalence\\n\\nTo prove that the equivalence between the simulation and the original code holds, we proceed by induction on the monadic code. This corresponds to symbolically evaluating the monadic code, in the proof mode of Coq, applying the primitives of the semantics predicate at each step. 
We use the following tactic to automate this work:\\n\\n```coq\\nrun_symbolic.\\n```\\n\\nWe manually handle the following cases:\\n\\n- branching (`if` or `match`),\\n- external function calls: generally, we apply an existing equivalence proof for a call to another function instead of doing the symbolic evaluation of the function,\\n- memory allocations: we need to choose the type of allocation (mutable or immutable) and the address of the allocation for mutable ones.\\n\\nHere is the proof for the `transfer` function:\\n\\n```coq\\nProof.\\n unfold erc20.Impl_erc20_Erc20_t_2.transfer,\\n Simulations.erc20.transfer,\\n lift_simulation.\\n Opaque erc20.transfer_from_to.\\n run_symbolic.\\n eapply Run.Call. {\\n apply run_env.\\n }\\n run_symbolic.\\n eapply Run.Call. {\\n apply Env.run_caller.\\n }\\n run_symbolic.\\n eapply Run.Call. {\\n now apply run_transfer_from_to.\\n }\\n unfold lift_simulation.\\n destruct erc20.transfer_from_to as [[] [?storage ?logs]]; run_symbolic.\\n Transparent erc20.transfer_from_to.\\nQed.\\n```\\n\\n## Proofs\\n\\n### Handling of integers\\n\\nWe distinguish the various types of integers used in Rust:\\n\\n- unsigned ones: `u8`, `u16`, `u32`, `u64`, `u128`, `usize`,\\n- signed ones: `i8`, `i16`, `i32`, `i64`, `i128`, `isize`.\\n\\nWe define a separate type for each of them, that is to say, a wrapper around the `Z` type of unbounded integers from Coq:\\n\\n```coq\\nModule u8.\\n Inductive t : Set := Make (z : Z) : t.\\nEnd u8.\\n```\\n\\nTo enforce the bounds, we define a validity predicate for each type:\\n\\n```coq\\nModule Valid.\\n Definition t {A : Set} `{Integer.C A} (v : A) : Prop :=\\n Integer.min <= Integer.to_Z v <= Integer.max.\\nEnd Valid.\\n```\\n\\nAll integer types are instances of the class `Integer.C`, which provides the `min`, `max`, and `to_Z` functions. We do not embed this predicate in the integer type (as a [refinement type](https://en.wikipedia.org/wiki/Refinement_type)) to avoid mixing proofs and code. 
We pay a cost by having to handle the values and the validity proofs separately.\\n\\nDepending on the configuration mode of Rust, integer operations can overflow or panic. We have several implementations of the arithmetic operations, depending on the mode:\\n\\n```coq\\nModule BinOp.\\n (** Operators with panic, in the monad. *)\\n Module Panic.\\n Definition add {A : Set} `{Integer.C A} (v1 v2 : A) : M A :=\\n (* ... *)\\n\\n Definition sub (* ... *)\\n End Panic.\\n\\n (** Operators with overflow, outside of the monad as\\n there cannot be any errors. *)\\n Module Wrap.\\n Definition add {A : Set} `{Integer.C A} (v1 v2 : A) : A :=\\n (* ... *)\\n\\n Definition sub (* ... *)\\n End Wrap.\\nEnd BinOp.\\n```\\n\\nWe also have additional operators, useful for the definition of simulations:\\n\\n- optimistic operators, operating on `Z` without checking the bounds of the result (for cases where we can prove that the result is never out of bounds),\\n- operators returning in the option monad, to handle the case where the result is out of bounds.\\n\\nNote that the comparison operators (`=`, `<`, ...) never panic or overflow. In the context of these smart contracts, the arithmetic operators are panicking in case of overflow.\\n\\n### Definition of messages\\n\\nWe can call the smart contract with three read primitives (`total_supply`, `balance_of`, `allowance`) and three write primitives (`transfer`, `approve`, `transfer_from`). We define two message types to formalize these access points. This will later allow us to express properties over all possible read and write messages:\\n\\n```coq\\nModule ReadMessage.\\n (** The type parameter is the type of result of the call. 
*)\\n Inductive t : Set -> Set :=\\n | total_supply :\\n t ltac:(erc20.Balance)\\n | balance_of\\n (owner : erc20.AccountId.t) :\\n t ltac:(erc20.Balance)\\n | allowance\\n (owner : erc20.AccountId.t)\\n (spender : erc20.AccountId.t) :\\n t ltac:(erc20.Balance).\\nEnd ReadMessage.\\n\\nModule WriteMessage.\\n Inductive t : Set :=\\n | transfer\\n (to : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance)) :\\n t\\n | approve\\n (spender : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance)) :\\n t\\n | transfer_from\\n (from : erc20.AccountId.t)\\n (to : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance)) :\\n t.\\nEnd WriteMessage.\\n```\\n\\n### No panics on read messages\\n\\nWe show that for all possible read messages, the smart contract does not panic:\\n\\n```coq\\nLemma read_message_no_panic\\n (env : erc20.Env.t)\\n (message : ReadMessage.t ltac:(erc20.Balance))\\n (storage : erc20.Erc20.t) :\\n let state := State.of_storage storage in\\n exists result,\\n {{ Environment.of_env env, state |\\n ReadMessage.dispatch message \u21d3\\n (* [inl] means success (no panics) *)\\n inl result\\n | state }}.\\n```\\n\\nThis is done by symbolic evaluation of the simulations:\\n\\n```coq\\nProof.\\n destruct message; simpl.\\n { eexists.\\n apply run_total_supply.\\n }\\n { eexists.\\n apply run_balance_of.\\n }\\n { eexists.\\n apply run_allowance.\\n }\\nQed.\\n```\\n\\n### Invariants\\n\\nThe data structure of the storage of the smart contract is as follows:\\n\\n```rust\\npub struct Erc20 {\\n total_supply: Balance,\\n balances: Mapping<AccountId, Balance>,\\n allowances: Mapping<(AccountId, AccountId), Balance>,\\n}\\n```\\n\\nAn invariant is that the total supply is always equal to the sum of all the balances in the `balances` mapping. 
We define this invariant in Coq as:\\n\\n```coq\\nDefinition sum_of_money (storage : erc20.Erc20.t) : Z :=\\n Lib.Mapping.sum Integer.to_Z storage.(erc20.Erc20.balances).\\n\\nModule Valid.\\n Definition t (storage : erc20.Erc20.t) : Prop :=\\n Integer.to_Z storage.(erc20.Erc20.total_supply) =\\n sum_of_money storage.\\nEnd Valid.\\n```\\n\\nWe show that this invariant holds for any output of the write messages, given that it holds for the input storage:\\n\\n```coq\\nLemma write_dispatch_is_valid\\n (env : erc20.Env.t)\\n (storage : erc20.Erc20.t)\\n (write_message : WriteMessage.t)\\n (H_storage : Erc20.Valid.t storage)\\n (H_write_message : WriteMessage.Valid.t write_message) :\\n let state := State.of_storage storage in\\n let \'(result, (storage, _)) :=\\n WriteMessage.simulation_dispatch env write_message (storage, []) in\\n match result with\\n | inl _ => Erc20.Valid.t storage\\n | _ => True\\n end.\\n```\\n\\nWe assume that the initial storage is valid with the hypothesis:\\n\\n```coq\\n(H_storage : Erc20.Valid.t storage)\\n```\\n\\nWe show the property in the case without panics with:\\n\\n```coq\\nmatch result with\\n | inl _ => ...\\n```\\n\\nWhen the smart contract panics (integer overflow), the storage is discarded anyway, and it might actually be invalid. For example, in the `transfer_from_to` function we have:\\n\\n```rust\\nself.balances.insert(*from, from_balance - value);\\nlet to_balance = self.balance_of_impl(to);\\nself.balances.insert(*to, to_balance + value);\\n```\\n\\nSo if there is a panic during the addition `+`, like an overflow, the final storage can have the `from` account modified but not the `to` account. So here, the balance sum is no longer equal to the total supply.\\n\\n### Total supply is constant\\n\\nWe show that the total supply is also a constant, meaning that no calls to the smart contract can modify its value. 
The statement is the following:\\n\\n```coq\\nLemma write_dispatch_is_constant\\n (env : erc20.Env.t)\\n (storage : erc20.Erc20.t)\\n (write_message : WriteMessage.t) :\\n let state := State.of_storage storage in\\n let \'(result, (storage\', _)) :=\\n WriteMessage.simulation_dispatch env write_message (storage, []) in\\n match result with\\n | inl _ =>\\n storage.(erc20.Erc20.total_supply) =\\n storage\'.(erc20.Erc20.total_supply)\\n | _ => True\\n end.\\n```\\n\\nIt says that for any initial `storage` and `write_message` sent to the smart contract, if we return a result without panicking (`inl _`), then the total supply in the final storage `storage\'` is equal to the initial one. We verify this fact by symbolic evaluation of all the branches of the simulation. There are no difficulties in this proof as the code never modifies the `total_supply`.\\n\\n### Action from the logs\\n\\nWe infer the action of the smart contract on the storage from its logs. This characterizes exactly which modifications of the storage we can deduce from the logs. We define an action as a function from the storage to a set of possible new storages, given the knowledge of the logs of the contract:\\n\\n```coq\\nModule Action.\\n Definition t : Type := erc20.Erc20.t -> erc20.Erc20.t -> Prop.\\nEnd Action.\\n```\\n\\nThe main statement is the following:\\n\\n```coq\\nLemma retrieve_action_from_logs\\n (env : erc20.Env.t)\\n (storage : erc20.Erc20.t)\\n (write_message : WriteMessage.t)\\n (events : list erc20.Event.t) :\\n match\\n WriteMessage.simulation_dispatch env write_message (storage, [])\\n with\\n | (inl (result.Result.Ok tt), (storage\', events)) =>\\n action_of_events events storage storage\'\\n | _ => True\\n end.\\n```\\n\\nThis relates the final storage `storage\'` to the initial storage `storage` using the logs `events` when there are no panics. 
We define the `action_of_events` predicate as the successive application of the `action_of_event` predicate, which is defined as:\\n\\n```coq\\nDefinition action_of_event (event : erc20.Event.t) : Action.t :=\\n fun storage storage\' =>\\n match event with\\n | erc20.Event.Transfer (erc20.Transfer.Build_t\\n (option.Option.Some from)\\n (option.Option.Some to)\\n value\\n ) =>\\n (* In case of transfer event, we do not know how the allowances are\\n updated. *)\\n exists allowances\',\\n storage\' =\\n storage <|\\n erc20.Erc20.balances := balances_of_transfer storage from to value\\n |> <|\\n erc20.Erc20.allowances := allowances\'\\n |>\\n | erc20.Event.Transfer (erc20.Transfer.Build_t _ _ _) => False\\n | erc20.Event.Approval (erc20.Approval.Build_t owner spender value) =>\\n storage\' =\\n storage <|\\n erc20.Erc20.allowances :=\\n Lib.Mapping.insert (owner, spender) value\\n storage.(erc20.Erc20.allowances)\\n |>\\n end.\\n```\\n\\nWhen the `event` in the logs is of kind `erc20.Event.Transfer`, the resulting storage has:\\n\\n- the `balances` updated according to the function `balances_of_transfer`;\\n- the `allowances` updated to an unknown value `allowances\'`.\\n\\nWhen the `event` in the logs is of kind `erc20.Event.Approval`, the resulting storage has:\\n\\n- the `allowances` updated calling `Lib.Mapping.insert` on `(owner, spender)`;\\n- the `balances` unchanged.\\n\\n### Approve only on caller\\n\\nWe added one last proof to say that when the `approve` function succeeds, it only modifies the allowance of the caller:\\n\\n```coq\\nLemma approve_only_changes_owner_allowance\\n (env : erc20.Env.t)\\n (storage : erc20.Erc20.t)\\n (spender : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance)) :\\n let \'(result, (storage\', _)) :=\\n Simulations.erc20.approve env spender value (storage, []) in\\n match result with\\n | inl (result.Result.Ok tt) =>\\n forall owner spender,\\n Integer.to_Z (Simulations.erc20.allowance storage\' owner spender) <>\\n 
Integer.to_Z (Simulations.erc20.allowance storage owner spender) ->\\n owner = Simulations.erc20.Env.caller env\\n | _ => True\\n end.\\n```\\n\\nIf an allowance changes after the call to `approve`, then the owner of the allowance is the caller of the smart contract. This is done by symbolic evaluation of the simulation.\\n\\n## Conclusion\\n\\nIn this example, we have shown how we formally verify the ERC-20 smart contract written in Rust for the [Aleph Zero](https://alephzero.org/) project. Formally verifying smart contracts is extremely important as they can hold a lot of money, and a single bug can prove fatal as recent attacks continue to show: [List of crypto hacks in 2023](https://www.ccn.com/education/crypto-hacks-2023-full-list-of-scams-and-exploits-as-millions-go-missing/).\\n\\nIf you have Rust smart contracts to verify, feel free to email us at [contact@formal.land](mailto:contact@formal.land). We will be happy to help!"},{"id":"/2023/11/26/rust-function-body","metadata":{"permalink":"/blog/2023/11/26/rust-function-body","source":"@site/blog/2023-11-26-rust-function-body.md","title":"\ud83e\udd80 Translation of function bodies from Rust to Coq","description":"Our tool coq-of-rust enables formal verification of \ud83e\udd80 Rust code, to make sure that a program has no bugs given a precise specification. 
We work by translating Rust programs to the general proof system \ud83d\udc13 Coq.","date":"2023-11-26T00:00:00.000Z","formattedDate":"November 26, 2023","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"}],"readingTime":4.975,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Translation of function bodies from Rust to Coq","tags":["coq-of-rust","Rust","Coq"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Verifying an ERC-20 smart contract in Rust","permalink":"/blog/2023/12/13/rust-verify-erc-20-smart-contract"},"nextItem":{"title":"\ud83e\udd80 Optimizing Rust translation to Coq with THIR and bundled traits","permalink":"/blog/2023/11/08/rust-thir-and-bundled-traits"}},"content":"Our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) enables formal verification of [\ud83e\udd80 Rust](https://www.rust-lang.org/) code, to make sure that a program has no bugs given a precise specification. We work by translating Rust programs to the general proof system [\ud83d\udc13 Coq](https://coq.inria.fr/).\\n\\nHere, we present how we translate function bodies from Rust to Coq in an example. We also show some of the optimizations we made to reduce the size of the translation.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Purchase\\n\\nIf you need to formally verify your Rust codebase to improve the security of your application, email us at [contact@formal.land](mailto:contact@formal.land)!\\n\\n:::\\n\\n![Rust and Coq](2023-11-26/rust_and_coq.png)\\n\\n## Translating a function body\\n\\nWe take the following Rust example as input:\\n\\n```rust\\n// fn balance_of_impl(&self, owner: &AccountId) -> Balance { ... 
}\\n\\nfn balance_of(&self, owner: AccountId) -> Balance {\\n self.balance_of_impl(&owner)\\n}\\n```\\n\\nHere is the corresponding Coq code that `coq-of-rust` generates _without optimizations_:\\n\\n```coq\\nDefinition balance_of\\n (self : ref ltac:(Self))\\n (owner : erc20.AccountId.t)\\n : M ltac:(erc20.Balance) :=\\n let* self : M.Val (ref ltac:(Self)) := M.alloc self in\\n let* owner : M.Val erc20.AccountId.t := M.alloc owner in\\n let* \u03b10 : ref erc20.Erc20.t := M.read self in\\n let* \u03b11 : M.Val erc20.Erc20.t := deref \u03b10 in\\n let* \u03b12 : ref erc20.Erc20.t := borrow \u03b11 in\\n let* \u03b13 : M.Val (ref erc20.Erc20.t) := M.alloc \u03b12 in\\n let* \u03b14 : ref erc20.Erc20.t := M.read \u03b13 in\\n let* \u03b15 : ref erc20.AccountId.t := borrow owner in\\n let* \u03b16 : M.Val (ref erc20.AccountId.t) := M.alloc \u03b15 in\\n let* \u03b17 : ref erc20.AccountId.t := M.read \u03b16 in\\n let* \u03b18 : M.Val erc20.AccountId.t := deref \u03b17 in\\n let* \u03b19 : ref erc20.AccountId.t := borrow \u03b18 in\\n let* \u03b110 : M.Val (ref erc20.AccountId.t) := M.alloc \u03b19 in\\n let* \u03b111 : ref erc20.AccountId.t := M.read \u03b110 in\\n let* \u03b112 : u128.t := erc20.Erc20.t::[\\"balance_of_impl\\"] \u03b14 \u03b111 in\\n let* \u03b113 : M.Val u128.t := M.alloc \u03b112 in\\n M.read \u03b113.\\n```\\n\\nThis code is much more verbose than the original Rust code as we make all pointer manipulations explicit. We will see just after how to simplify it. We start with the function declaration:\\n\\n```coq\\nDefinition balance_of\\n (self : ref ltac:(Self))\\n (owner : erc20.AccountId.t)\\n : M ltac:(erc20.Balance) :=\\n```\\n\\nthat repeats the parameters in the Rust source. Note that the final result is wrapped into the monad type `M`. This is a monad representing all the side-effects used in Rust programs (state, panic, non-termination, ...). 
Then, we allocate all the function parameters:\\n\\n```coq\\n let* self : M.Val (ref ltac:(Self)) := M.alloc self in\\n let* owner : M.Val erc20.AccountId.t := M.alloc owner in\\n```\\n\\nThis ensures that both `self` and `owner` have an address in memory, in case we borrow them later. This allocation is also fresh, so we cannot access the address of the values from the caller by mistake. We use the monadic let `let*` as allocations can modify the memory state.\\n\\nThen we start by the body of the function itself. We do all the necessary pointer manipulations to compute the parameters `self` and `&owner` of the function `balance_of_impl`. These representations are directly taken from the abstract syntax tree of the Rust compiler (using the [THIR](https://rustc-dev-guide.rust-lang.org/thir.html) version).\\n\\nFor example, for the first parameter `self`, named `\u03b14` in this translation, we do:\\n\\n```coq\\n let* \u03b10 : ref erc20.Erc20.t := M.read self in\\n let* \u03b11 : M.Val erc20.Erc20.t := deref \u03b10 in\\n let* \u03b12 : ref erc20.Erc20.t := borrow \u03b11 in\\n let* \u03b13 : M.Val (ref erc20.Erc20.t) := M.alloc \u03b12 in\\n let* \u03b14 : ref erc20.Erc20.t := M.read \u03b13 in\\n```\\n\\nWe combine the operators:\\n\\n- `M.read`: to get a value of type `A` from a value with an address `M.Val`,\\n- `deref`: to get the value with an address `M.Val A` pointed by a reference `ref A`,\\n- `borrow`: to get the reference `ref A` to a value with an address `M.Val A`,\\n- `M.alloc`: to allocate a new value `A` in memory, returning a value with address `M.Val A`.\\n\\nWe do the same to compute the second parameter `&owner` of `balance_of_impl` with:\\n\\n```coq\\n let* \u03b15 : ref erc20.AccountId.t := borrow owner in\\n let* \u03b16 : M.Val (ref erc20.AccountId.t) := M.alloc \u03b15 in\\n let* \u03b17 : ref erc20.AccountId.t := M.read \u03b16 in\\n let* \u03b18 : M.Val erc20.AccountId.t := deref \u03b17 in\\n let* \u03b19 : ref erc20.AccountId.t := 
borrow \u03b18 in\\n let* \u03b110 : M.Val (ref erc20.AccountId.t) := M.alloc \u03b19 in\\n let* \u03b111 : ref erc20.AccountId.t := M.read \u03b110 in\\n```\\n\\nFinally, we call the `balance_of_impl` function and return the result:\\n\\n```coq\\n let* \u03b112 : u128.t := erc20.Erc20.t::[\\"balance_of_impl\\"] \u03b14 \u03b111 in\\n let* \u03b113 : M.Val u128.t := M.alloc \u03b112 in\\n M.read \u03b113.\\n```\\n\\nWe do not keep the address of the result, as it will be allocated again by the caller function.\\n\\n## Optimizations\\n\\nSome operations can always be removed, namely:\\n\\n- `M.read (M.alloc v) ==> v`: we do not need to allocate and give an address to a value if it will be immediately read,\\n- `deref (borrow v) ==> v` and `borrow (deref v) ==> v`: the borrowing and dereferencing operators are inverses of each other, so they cancel out. We need to be careful of the mutability status of the borrowing and dereferencing.\\n\\nApplying these simplification rules, we get the following slimmed-down translation:\\n\\n```coq\\nDefinition balance_of\\n (self : ref ltac:(Self))\\n (owner : erc20.AccountId.t)\\n : M ltac:(erc20.Balance) :=\\n let* self : M.Val (ref ltac:(Self)) := M.alloc self in\\n let* owner : M.Val erc20.AccountId.t := M.alloc owner in\\n let* \u03b10 : ref erc20.Erc20.t := M.read self in\\n let* \u03b11 : ref erc20.AccountId.t := borrow owner in\\n erc20.Erc20.t::[\\"balance_of_impl\\"] \u03b10 \u03b11.\\n```\\n\\nThis is much shorter and easier to verify!\\n\\n## Conclusion\\n\\nWe have illustrated with an example how we translate a simple function from Rust to Coq. In this example, we saw how the pointer operations are made explicit in the abstract syntax tree of Rust, and how we simplify them in the frequent cases.\\n\\nIf you have any comments or suggestions, feel free to email us at [contact@formal.land](mailto:contact@formal.land). 
In future posts, we will go into more detail about the verification process itself."},{"id":"/2023/11/08/rust-thir-and-bundled-traits","metadata":{"permalink":"/blog/2023/11/08/rust-thir-and-bundled-traits","source":"@site/blog/2023-11-08-rust-thir-and-bundled-traits.md","title":"\ud83e\udd80 Optimizing Rust translation to Coq with THIR and bundled traits","description":"We continued our work on coq-of-rust, a tool to formally verify Rust programs using the proof system Coq \ud83d\udc13. This tool translates Rust programs to an equivalent Coq program, which can then be verified using Coq\'s proof assistant. It opens the door to building mathematically proven bug-free Rust programs.","date":"2023-11-08T00:00:00.000Z","formattedDate":"November 8, 2023","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"trait","permalink":"/blog/tags/trait"},{"label":"THIR","permalink":"/blog/tags/thir"},{"label":"HIR","permalink":"/blog/tags/hir"}],"readingTime":5.22,"hasTruncateMarker":true,"authors":[{"name":"Guillaume Claret"}],"frontMatter":{"title":"\ud83e\udd80 Optimizing Rust translation to Coq with THIR and bundled traits","tags":["coq-of-rust","Rust","Coq","trait","THIR","HIR"],"author":"Guillaume Claret"},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Translation of function bodies from Rust to Coq","permalink":"/blog/2023/11/26/rust-function-body"},"nextItem":{"title":"\ud83e\udd80 Trait representation in Coq","permalink":"/blog/2023/08/25/trait-representation-in-coq"}},"content":"We continued our work on [coq-of-rust](https://github.com/formal-land/coq-of-rust), a tool to formally verify [Rust](https://www.rust-lang.org/) programs using the proof system [Coq \ud83d\udc13](https://coq.inria.fr/). This tool translates Rust programs to an equivalent Coq program, which can then be verified using Coq\'s proof assistant. 
It opens the door to building mathematically proven bug-free Rust programs.\\n\\nWe present two main improvements we made to `coq-of-rust`:\\n\\n- Using the THIR intermediate language of Rust to have more information during the translation to Coq.\\n- Bundling the type-classes representing the traits of Rust to have faster type-checking in Coq.\\n\\n\x3c!-- truncate --\x3e\\n\\n![Rust and Coq](2023-11-08/rust_and_coq.png)\\n\\n## THIR intermediate language\\n\\nTo translate Rust programs to Coq, we plug into the compiler of Rust, which operates on a series of intermediate languages:\\n\\n- source code (`.rs` files);\\n- abstract syntax tree (AST): immediately after parsing;\\n- [High-Level Intermediate Representation](https://rustc-dev-guide.rust-lang.org/hir.html) (HIR): after macro expansion, with name resolution and close to the AST;\\n- [Typed High-Level Intermediate Representation](https://rustc-dev-guide.rust-lang.org/thir.html) (THIR): after the type-checking;\\n- [Mid-level Intermediate Representation](https://rustc-dev-guide.rust-lang.org/mir/index.html) (MIR): low-level representation based on a [control-flow graph](https://en.wikipedia.org/wiki/Control-flow_graph), inlining traits and polymorphic functions, and with [borrow checking](https://doc.rust-lang.org/book/ch04-02-references-and-borrowing.html);\\n- machine code (assembly, LLVM IR, ...).\\n\\nWe were previously using the HIR language to start our translation to Coq, because it is not too low-level and close to what the user has originally in the `.rs` file. This helps relate the generated Coq code to the original Rust code.\\n\\nHowever, at the level of HIR, there is still a lot of implicit information. For example, Rust has [automatic dereferencing rules](https://users.rust-lang.org/t/automatic-dereferencing/53828) that are not yet explicit in HIR. 
In order not to make any mistakes during our translation to Coq, we prefer to use the next representation, THIR, that makes explicit such rules.\\n\\nIn addition, the THIR representation shows when a method call is from a trait (and which trait) or from a standalone `impl` block. Given that we still have trouble translating the traits with [type-classes](https://coq.inria.fr/doc/V8.18.0/refman/addendum/type-classes.html) that are inferrable by Coq, this helps a lot.\\n\\nA downside of the THIR representation is that it is much more verbose. For example, here is a formatting function generated from HIR:\\n\\n```coq\\nDefinition fmt\\n `{\u210b : State.Trait}\\n (self : ref Self)\\n (f : mut_ref core.fmt.Formatter)\\n : M core.fmt.Result :=\\n let* \u03b10 := format_argument::[\\"new_display\\"] (addr_of self.[\\"radius\\"]) in\\n let* \u03b11 :=\\n format_arguments::[\\"new_v1\\"]\\n (addr_of [ \\"Circle of radius \\" ])\\n (addr_of [ \u03b10 ]) in\\n f.[\\"write_fmt\\"] \u03b11.\\n```\\n\\nThis is the kind of functions generated by the `#[derive(Debug)]` macro of Rust, to implement a formatting function on a type. 
Here is the version translated from THIR, with explicit borrowing and dereferencing:\\n\\n```coq\\nDefinition fmt\\n `{\u210b : State.Trait}\\n (self : ref Self)\\n (f : mut_ref core.fmt.Formatter)\\n : M ltac:(core.fmt.Result) :=\\n let* \u03b10 := deref f core.fmt.Formatter in\\n let* \u03b11 := borrow_mut \u03b10 core.fmt.Formatter in\\n let* \u03b12 := borrow [ mk_str \\"Circle of radius \\" ] (list (ref str)) in\\n let* \u03b13 := deref \u03b12 (list (ref str)) in\\n let* \u03b14 := borrow \u03b13 (list (ref str)) in\\n let* \u03b15 := pointer_coercion \\"Unsize\\" \u03b14 in\\n let* \u03b16 := deref self converting_to_string.Circle in\\n let* \u03b17 := \u03b16.[\\"radius\\"] in\\n let* \u03b18 := borrow \u03b17 i32 in\\n let* \u03b19 := deref \u03b18 i32 in\\n let* \u03b110 := borrow \u03b19 i32 in\\n let* \u03b111 := core.fmt.rt.Argument::[\\"new_display\\"] \u03b110 in\\n let* \u03b112 := borrow [ \u03b111 ] (list core.fmt.rt.Argument) in\\n let* \u03b113 := deref \u03b112 (list core.fmt.rt.Argument) in\\n let* \u03b114 := borrow \u03b113 (list core.fmt.rt.Argument) in\\n let* \u03b115 := pointer_coercion \\"Unsize\\" \u03b114 in\\n let* \u03b116 := core.fmt.Arguments::[\\"new_v1\\"] \u03b15 \u03b115 in\\n core.fmt.Formatter::[\\"write_fmt\\"] \u03b11 \u03b116.\\n```\\n\\nWe went from a function having two intermediate variables to seventeen intermediate variables. This code is much more verbose, but it is also more explicit. In particular, it details when the:\\n\\n- borrowing (going from a value of type `T` to `&T`), and the\\n- dereferencing (going from a value of type `&T` to `T`)\\n\\noccur. It also shows that the method `write_fmt` is a method from the implementation of the type `core.fmt.Formatter`, generating:\\n\\n```coq\\ncore.fmt.Formatter::[\\"write_fmt\\"] \u03b11 \u03b116\\n```\\n\\ninstead of:\\n\\n```coq\\nf.[\\"write_fmt\\"] \u03b11\\n```\\n\\n## Bundled traits\\n\\nSome Rust codebases can have a lot of traits. 
For example in [paritytech/ink/crates/env/src/types.rs](https://github.com/paritytech/ink/blob/ccb38d2c3ac27523fe3108f2bb7bffbbe908cdb7/crates/env/src/types.rs#L120) the trait `Environment` references more than forty other traits:\\n\\n```rust\\npub trait Environment: Clone {\\n const MAX_EVENT_TOPICS: usize;\\n\\n type AccountId: \'static\\n + scale::Codec\\n + CodecAsType\\n + Clone\\n + PartialEq\\n + ...;\\n\\n type Balance: \'static\\n + scale::Codec\\n + CodecAsType\\n + ...;\\n\\n ...\\n```\\n\\nWe first used an unbundled approach to represent this trait by a type-class in Coq, as it felt more natural:\\n\\n```coq\\nModule Environment.\\n Class Trait (Self : Set) `{Clone.Trait Self}\\n {AccountId : Set}\\n `{scale.Codec.Trait AccountId}\\n `{CodecAsType AccountId}\\n `{Clone AccountId}\\n `{PartialEq AccountId}\\n ...\\n```\\n\\nHowever, the backquote operator generated too many implicit arguments, and the type-checker of Coq was very slow. We then switched to a bundled approach, as advocated in this blog post: [Exponential blowup when using unbundled typeclasses to model algebraic hierarchies](https://www.ralfj.de/blog/2019/05/15/typeclasses-exponential-blowup.html). The Coq code for this trait now looks like this:\\n\\n```coq\\nModule Environment.\\n Class Trait `{\u210b : State.Trait} (Self : Set) : Type := {\\n \u210b_0 :: Clone.Trait Self;\\n MAX_EVENT_TOPICS : usize;\\n AccountId : Set;\\n \u2112_0 :: parity_scale_codec.codec.Codec.Trait AccountId;\\n \u2112_1 :: ink_env.types.CodecAsType.Trait AccountId;\\n \u2112_2 :: core.clone.Clone.Trait AccountId;\\n \u2112_3 ::\\n core.cmp.PartialEq.Trait AccountId\\n (Rhs := core.cmp.PartialEq.Default.Rhs AccountId);\\n ...;\\n Balance : Set;\\n \u2112_8 :: parity_scale_codec.codec.Codec.Trait Balance;\\n \u2112_9 :: ink_env.types.CodecAsType.Trait Balance;\\n ...;\\n\\n ...\\n```\\n\\nWe use the notation `::` for fields that are trait instances. 
With this approach, traits have types as parameters but no other traits.\\n\\nThe type-checking is now much faster, and in particular, we avoid some cases with exponential blowup or non-terminating type-checking. But this is not a perfect solution as we still have cases where the instance inference does not terminate or fails with hard-to-understand error messages.\\n\\n## Conclusion\\n\\nWe have illustrated here some improvements we recently made to our [coq-of-rust](https://github.com/formal-land/coq-of-rust) translator for two key areas:\\n\\n- the translation of traits;\\n- the translation of the implicit borrowing and dereferencing, that can occur every time we call a function.\\n\\nThese improvements will allow us to formally verify some more complex Rust codebases. In particular, we are applying `coq-of-rust` to verify smart contracts written for the [ink!](https://use.ink/) platform, that is a subset of Rust.\\n\\n:::tip Contact\\n\\nIf you have comments, similar experiences to share, or wish to formally verify your codebase to improve the security of your application, contact us at [contact@formal.land](mailto:contact@formal.land)!\\n\\n:::"},{"id":"/2023/08/25/trait-representation-in-coq","metadata":{"permalink":"/blog/2023/08/25/trait-representation-in-coq","source":"@site/blog/2023-08-25-trait-representation-in-coq.md","title":"\ud83e\udd80 Trait representation in Coq","description":"In our project coq-of-rust we translate programs written in Rust to equivalent programs in the language of the proof system Coq \ud83d\udc13, which will later allow us to formally verify them.","date":"2023-08-25T00:00:00.000Z","formattedDate":"August 25, 2023","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"trait","permalink":"/blog/tags/trait"}],"readingTime":7.58,"hasTruncateMarker":true,"authors":[{"name":"Bart\u0142omiej 
Kr\xf3likowski"}],"frontMatter":{"title":"\ud83e\udd80 Trait representation in Coq","tags":["coq-of-rust","Rust","Coq","trait"],"author":"Bart\u0142omiej Kr\xf3likowski"},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Optimizing Rust translation to Coq with THIR and bundled traits","permalink":"/blog/2023/11/08/rust-thir-and-bundled-traits"},"nextItem":{"title":"\ud83e\udd80 Monad for side effects in Rust","permalink":"/blog/2023/05/28/monad-for-side-effects-in-rust"}},"content":"In our project [coq-of-rust](https://github.com/formal-land/coq-of-rust) we translate programs written in [Rust](https://www.rust-lang.org/) to equivalent programs in the language of the proof system [Coq \ud83d\udc13](https://coq.inria.fr/), which will later allow us to formally verify them.\\nBoth Coq and Rust have many unique features, and there are many differences between them, so in the process of translation we need to treat the case of each language construction separately.\\nIn this post, we discuss how we translate the most complicated one: [traits](https://doc.rust-lang.org/book/ch10-02-traits.html).\\n\\n\x3c!-- truncate --\x3e\\n\\n## \ud83e\udd80 Traits in Rust\\n\\nTrait is the way to define a shared behaviour for a group of types in Rust.\\nTo define a trait we have to specify a list of signatures of the methods we want to be implemented for the types implementing our trait.\\nWe can also create a generic definition of a trait with the same syntax as in every Rust definition.\\nOptionally, we can add a default implementation to any method or extend the list with associated types.\\nTraits can also extend a behaviour of one or more other traits, in which case, to implement a trait for a type we would have to implement all its supertraits first.\\n\\nConsider the following example (adapted from the [Rust Book](https://doc.rust-lang.org/book/)):\\n\\n```rust\\nstruct Sheep {\\n naked: bool,\\n name: &\'static str,\\n}\\n\\ntrait Animal {\\n // Associated function 
signature; `Self` refers to the implementor type.\\n fn new(name: &\'static str) -> Self;\\n\\n // Method signatures; these will return a string.\\n fn name(&self) -> &\'static str;\\n fn noise(&self) -> &\'static str;\\n\\n // Traits can provide default method definitions.\\n fn talk(&self) {\\n println!(\\"{} says {}\\", self.name(), self.noise());\\n }\\n}\\n\\nimpl Sheep {\\n fn is_naked(&self) -> bool {\\n self.naked\\n }\\n}\\n\\n// Implement the `Animal` trait for `Sheep`.\\nimpl Animal for Sheep {\\n // `Self` is the implementor type: `Sheep`.\\n fn new(name: &\'static str) -> Sheep {\\n Sheep {\\n name: name,\\n naked: false,\\n }\\n }\\n\\n fn name(&self) -> &\'static str {\\n self.name\\n }\\n\\n fn noise(&self) -> &\'static str {\\n if self.is_naked() {\\n \\"baaaaah?\\"\\n } else {\\n \\"baaaaah!\\"\\n }\\n }\\n\\n // Default trait methods can be overridden.\\n fn talk(&self) {\\n // For example, we can add some quiet contemplation.\\n println!(\\"{} pauses briefly... {}\\", self.name, self.noise());\\n }\\n}\\n\\nimpl Sheep {\\n fn shear(&mut self) {\\n if self.is_naked() {\\n // Implementor methods can use the implementor\'s trait methods.\\n println!(\\"{} is already naked...\\", self.name());\\n } else {\\n println!(\\"{} gets a haircut!\\", self.name);\\n\\n self.naked = true;\\n }\\n }\\n}\\n\\nfn main() {\\n // Type annotation is necessary in this case.\\n let mut dolly = Animal::new(\\"Dolly\\"): Sheep;\\n\\n dolly.talk();\\n dolly.shear();\\n dolly.talk();\\n}\\n```\\n\\nWe have a type `Sheep`, a trait `Animal`, and an implementation of `Animal` for `Sheep`.\\nAs we can see in `main`, after a trait is implemented for a type, we can use the methods of the trait like normal methods of the type.\\n\\n## Our translation\\n\\nRust notion of trait is very similar to the concept of [typeclasses](https://en.wikipedia.org/wiki/Type_class) in [functional programming](https://en.wikipedia.org/wiki/Functional_programming).\\nTypeclasses are also present 
in Coq, so translation of this construction is quite straightforward.\\n\\nFor a given trait we create a typeclass with fields being just translated signatures of the methods of the trait.\\nTo allow for the use of method syntax, we also define instances of `Notation.Dot` for every method name of the trait.\\nWe also add a parameter of type `Set` for every type parameter of the trait and translate trait bounds of the types into equivalent typeclass parameters.\\n\\n## Translation of associated types\\n\\nAssociated types are a bit harder than methods to translate, because it is possible to use `::` notation to access them.\\nFor that purpose, we created another typeclass in `Notation` module:\\n\\n```coq\\nClass DoubleColonType {Kind : Type} (type : Kind) (name : string) : Type := {\\n double_colon_type : Set;\\n}.\\n```\\n\\nwith a notation:\\n\\n```coq\\nNotation \\"e1 ::type[ e2 ]\\" := (Notation.double_colon_type e1 e2)\\n (at level 0).\\n```\\n\\nFor every associated type, we create a parameter and a field of the typeclass resulting from the trait translation, and below, we create an instance of `Notation.DoubleColonType`.\\n\\n## The example in Coq\\n\\nHere is our Coq translation of the example code above:\\n\\n```coq\\n(* Generated by coq-of-rust *)\\nRequire Import CoqOfRust.CoqOfRust.\\n\\nModule Sheep.\\n Unset Primitive Projections.\\n Record t : Set := {\\n naked : bool;\\n name : ref str;\\n }.\\n Global Set Primitive Projections.\\n\\n Global Instance Get_naked : Notation.Dot \\"naked\\" := {\\n Notation.dot \'(Build_t x0 _) := x0;\\n }.\\n Global Instance Get_name : Notation.Dot \\"name\\" := {\\n Notation.dot \'(Build_t _ x1) := x1;\\n }.\\nEnd Sheep.\\nDefinition Sheep : Set := @Sheep.t.\\n\\nModule Animal.\\n Class Trait (Self : Set) : Set := {\\n new `{H : State.Trait} : (ref str) -> (M (H := H) Self);\\n name `{H : State.Trait} : (ref Self) -> (M (H := H) (ref str));\\n noise `{H : State.Trait} : (ref Self) -> (M (H := H) (ref str));\\n 
}.\\n\\n Global Instance Method_new `{H : State.Trait} `(Trait)\\n : Notation.Dot \\"new\\" := {\\n Notation.dot := new;\\n }.\\n Global Instance Method_name `{H : State.Trait} `(Trait)\\n : Notation.Dot \\"name\\" := {\\n Notation.dot := name;\\n }.\\n Global Instance Method_noise `{H : State.Trait} `(Trait)\\n : Notation.Dot \\"noise\\" := {\\n Notation.dot := noise;\\n }.\\n Global Instance Method_talk `{H : State.Trait} `(Trait)\\n : Notation.Dot \\"talk\\" := {\\n Notation.dot (self : ref Self):=\\n (let* _ :=\\n let* _ :=\\n let* \u03b10 := self.[\\"name\\"] in\\n let* \u03b11 := format_argument::[\\"new_display\\"] (addr_of \u03b10) in\\n let* \u03b12 := self.[\\"noise\\"] in\\n let* \u03b13 := format_argument::[\\"new_display\\"] (addr_of \u03b12) in\\n let* \u03b14 :=\\n format_arguments::[\\"new_v1\\"]\\n (addr_of [ \\"\\"; \\" says \\"; \\"\\n\\" ])\\n (addr_of [ \u03b11; \u03b13 ]) in\\n std.io.stdio._print \u03b14 in\\n Pure tt in\\n Pure tt\\n : M (H := H) unit);\\n }.\\nEnd Animal.\\n\\nModule Impl_traits_Sheep.\\n Definition Self := traits.Sheep.\\n\\n Definition is_naked `{H : State.Trait} (self : ref Self) : M (H := H) bool :=\\n Pure self.[\\"naked\\"].\\n\\n Global Instance Method_is_naked `{H : State.Trait} :\\n Notation.Dot \\"is_naked\\" := {\\n Notation.dot := is_naked;\\n }.\\nEnd Impl_traits_Sheep.\\n\\nModule Impl_traits_Animal_for_traits_Sheep.\\n Definition Self := traits.Sheep.\\n\\n Definition new\\n `{H : State.Trait}\\n (name : ref str)\\n : M (H := H) traits.Sheep :=\\n Pure {| traits.Sheep.name := name; traits.Sheep.naked := false; |}.\\n\\n Global Instance AssociatedFunction_new `{H : State.Trait} :\\n Notation.DoubleColon Self \\"new\\" := {\\n Notation.double_colon := new;\\n }.\\n\\n Definition name `{H : State.Trait} (self : ref Self) : M (H := H) (ref str) :=\\n Pure self.[\\"name\\"].\\n\\n Global Instance Method_name `{H : State.Trait} : Notation.Dot \\"name\\" := {\\n Notation.dot := name;\\n }.\\n\\n Definition noise\\n 
`{H : State.Trait}\\n (self : ref Self)\\n : M (H := H) (ref str) :=\\n let* \u03b10 := self.[\\"is_naked\\"] in\\n if (\u03b10 : bool) then\\n Pure \\"baaaaah?\\"\\n else\\n Pure \\"baaaaah!\\".\\n\\n Global Instance Method_noise `{H : State.Trait} : Notation.Dot \\"noise\\" := {\\n Notation.dot := noise;\\n }.\\n\\n Definition talk `{H : State.Trait} (self : ref Self) : M (H := H) unit :=\\n let* _ :=\\n let* _ :=\\n let* \u03b10 := format_argument::[\\"new_display\\"] (addr_of self.[\\"name\\"]) in\\n let* \u03b11 := self.[\\"noise\\"] in\\n let* \u03b12 := format_argument::[\\"new_display\\"] (addr_of \u03b11) in\\n let* \u03b13 :=\\n format_arguments::[\\"new_v1\\"]\\n (addr_of [ \\"\\"; \\" pauses briefly... \\"; \\"\\n\\" ])\\n (addr_of [ \u03b10; \u03b12 ]) in\\n std.io.stdio._print \u03b13 in\\n Pure tt in\\n Pure tt.\\n\\n Global Instance Method_talk `{H : State.Trait} : Notation.Dot \\"talk\\" := {\\n Notation.dot := talk;\\n }.\\n\\n Global Instance I : traits.Animal.Trait Self := {\\n traits.Animal.new `{H : State.Trait} := new;\\n traits.Animal.name `{H : State.Trait} := name;\\n traits.Animal.noise `{H : State.Trait} := noise;\\n }.\\nEnd Impl_traits_Animal_for_traits_Sheep.\\n\\nModule Impl_traits_Sheep_3.\\n Definition Self := traits.Sheep.\\n\\n Definition shear `{H : State.Trait} (self : mut_ref Self) : M (H := H) unit :=\\n let* \u03b10 := self.[\\"is_naked\\"] in\\n if (\u03b10 : bool) then\\n let* _ :=\\n let* _ :=\\n let* \u03b10 := self.[\\"name\\"] in\\n let* \u03b11 := format_argument::[\\"new_display\\"] (addr_of \u03b10) in\\n let* \u03b12 :=\\n format_arguments::[\\"new_v1\\"]\\n (addr_of [ \\"\\"; \\" is already naked...\\n\\" ])\\n (addr_of [ \u03b11 ]) in\\n std.io.stdio._print \u03b12 in\\n Pure tt in\\n Pure tt\\n else\\n let* _ :=\\n let* _ :=\\n let* \u03b10 := format_argument::[\\"new_display\\"] (addr_of self.[\\"name\\"]) in\\n let* \u03b11 :=\\n format_arguments::[\\"new_v1\\"]\\n (addr_of [ \\"\\"; \\" gets a haircut!\\n\\" 
])\\n (addr_of [ \u03b10 ]) in\\n std.io.stdio._print \u03b11 in\\n Pure tt in\\n let* _ := assign self.[\\"naked\\"] true in\\n Pure tt.\\n\\n Global Instance Method_shear `{H : State.Trait} : Notation.Dot \\"shear\\" := {\\n Notation.dot := shear;\\n }.\\nEnd Impl_traits_Sheep_3.\\n\\n(* #[allow(dead_code)] - function was ignored by the compiler *)\\nDefinition main `{H : State.Trait} : M (H := H) unit :=\\n let* dolly :=\\n let* \u03b10 := traits.Animal.new \\"Dolly\\" in\\n Pure (\u03b10 : traits.Sheep) in\\n let* _ := dolly.[\\"talk\\"] in\\n let* _ := dolly.[\\"shear\\"] in\\n let* _ := dolly.[\\"talk\\"] in\\n Pure tt.\\n```\\n\\nAs we can see, the trait `Animal` is translated to a module `Animal`. Every time we want to refer to the trait we use the name `Trait` or `Animal.Trait`, depending on whether we do it inside or outside its module.\\n\\n## Conclusion\\n\\nTraits are similar enough to Coq classes to make the translation relatively intuitive.\\nThe only hard case is a translation of associated types, for which we need a special notation.\\n\\n:::tip Contact\\n\\nIf you have a Rust codebase that you wish to formally verify, or need advice in your work, contact us at [contact@formal.land](mailto:contact@formal.land). We will be happy to set up a call with you.\\n\\n:::"},{"id":"/2023/05/28/monad-for-side-effects-in-rust","metadata":{"permalink":"/blog/2023/05/28/monad-for-side-effects-in-rust","source":"@site/blog/2023-05-28-monad-for-side-effects-in-rust.md","title":"\ud83e\udd80 Monad for side effects in Rust","description":"To formally verify Rust programs, we are building coq-of-rust, a translator from Rust \ud83e\udd80 code to the proof system Coq \ud83d\udc13. We generate Coq code that is as similar as possible to the original Rust code, so that the user can easily understand the generated code and write proofs about it. 
In this blog post, we explain how we are representing side effects in Coq.","date":"2023-05-28T00:00:00.000Z","formattedDate":"May 28, 2023","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"monad","permalink":"/blog/tags/monad"},{"label":"side effects","permalink":"/blog/tags/side-effects"}],"readingTime":5.03,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Monad for side effects in Rust","tags":["coq-of-rust","Rust","Coq","monad","side effects"]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Trait representation in Coq","permalink":"/blog/2023/08/25/trait-representation-in-coq"},"nextItem":{"title":"\ud83e\udd80 Representation of Rust methods in Coq","permalink":"/blog/2023/04/26/representation-of-rust-methods-in-coq"}},"content":"To formally verify Rust programs, we are building [coq-of-rust](https://github.com/formal-land/coq-of-rust), a translator from Rust \ud83e\udd80 code to the proof system [Coq \ud83d\udc13](https://coq.inria.fr/). We generate Coq code that is as similar as possible to the original Rust code, so that the user can easily understand the generated code and write proofs about it. In this blog post, we explain how we are representing side effects in Coq.\\n\\n\x3c!-- truncate --\x3e\\n\\n## \ud83e\udd80 Side effects in Rust\\n\\nIn programming, [side effects]() are everything that cannot be represented by pure functions (mathematical functions, that is, functions that always return the same output for given input parameters). 
In Rust there are various kinds of side effects:\\n\\n- errors (the [panic!](https://doc.rust-lang.org/core/macro.panic.html) macro) that propagate and do not appear in the return type of functions,\\n- non-termination, with some potentially non-terminating loops (never returning a result is considered a side effect),\\n- control-flow, with the `break`, `continue`, `return` keywords, that can jump to a different part of the code,\\n- memory allocations and memory mutations,\\n- I/O, with for example the [println!](https://doc.rust-lang.org/std/macro.println.html) macro, that prints a message to the standard output,\\n- concurrency, with the [thread::spawn](https://doc.rust-lang.org/std/thread/fn.spawn.html) function, that creates a new thread.\\n\\n## \ud83d\udc13 Coq, a purely functional language\\n\\nLike most proof systems, Coq is a purely functional language. This means we need to find an encoding for the side effects. Most proof systems forbid side effects in order to stay logically consistent. Otherwise, it would be easy to write a proof of `False` by writing a term that does not terminate, for example.\\n\\n## \ud83d\udd2e Monads in Coq\\n\\nMonads are a common way to represent side effects in a functional language. A monad is a type constructor `M`:\\n\\n```coq\\nDefinition M (A : Set) : Set :=\\n ...\\n```\\n\\nrepresenting computations returning values of type `A`. As an example we can take the error monad of computations that can fail with an error message, using the [Result](https://doc.rust-lang.org/std/result/enum.Result.html) type like in Rust:\\n\\n```coq\\nDefinition M (A : Set) : Set :=\\n Result A string.\\n```\\n\\nA monad must have two operators, `Pure` and `Bind`.\\n\\n### The `Pure` operator\\n\\nThe `Pure` operator has type:\\n\\n```coq\\nDefinition Pure {A : Set} (v : A) : M A :=\\n ...\\n```\\n\\nIt lifts a pure value `v` into the monad. 
For our error monad, the `Pure` operator is:\\n\\n```coq\\nDefinition Pure {A : Set} (v : A) : M A :=\\n Ok v.\\n```\\n\\n### The `Bind` operator\\n\\nThe `Bind` operator has type:\\n\\n```coq\\nDefinition Bind {A B : Set} (e1 : M A) (f : A -> M B) : M B :=\\n ...\\n```\\n\\nIt sequences two computations `e1` with `f`, where `f` is a function that takes the result of `e1` as input and returns a new computation. We also write the `Bind` operator as:\\n\\n```coq\\nlet* x := e1 in\\ne2\\n```\\n\\nassuming that `f` is a function that takes `x` as input and returns `e2`. Requiring this operator for all monads shows that sequencing computations is a very fundamental operation for side effects.\\n\\nFor our error monad, the `Bind` operator is:\\n\\n```coq\\nDefinition Bind {A B : Set} (e1 : M A) (f : A -> M B) : M B :=\\n match e1 with\\n | Ok v => f v\\n | Err msg => Err msg\\n end.\\n```\\n\\n## \ud83d\udea7 State, exceptions, non-termination, control-flow\\n\\nWe use a single monad to represent all the side effects that interest us in Rust. This monad is called `M` and is defined as follows:\\n\\n```coq\\nDefinition RawMonad `{State.Trait} :=\\n ...\\n\\nModule Exception.\\n Inductive t (R : Set) : Set :=\\n | Return : R -> t R\\n | Continue : t R\\n | Break : t R\\n | Panic {A : Set} : A -> t R.\\n Arguments Return {_}.\\n Arguments Continue {_}.\\n Arguments Break {_}.\\n Arguments Panic {_ _}.\\nEnd Exception.\\nDefinition Exception := Exception.t.\\n\\nDefinition Monad `{State.Trait} (R A : Set) : Set :=\\n nat -> State -> RawMonad ((A + Exception R) * State).\\n\\nDefinition M `{State.Trait} (A : Set) : Set :=\\n Monad Empty_set A.\\n```\\n\\nWe assume the definition of some `RawMonad` for memory handling that we will describe in a later post. Our monad `M` is a particular case of the monad `Monad` with `R = Empty_set`. It is a combination of four monads:\\n\\n1. The `RawMonad`.\\n2. A state monad, that takes a `State` as input and returns an updated state as output. 
The trait `State.Trait` provides read/write operations on the `State` type.\\n3. An error monad with errors of type `Exception R`. These errors include the `Return`, `Continue`, `Break` and `Panic` constructors. The `Return` constructor is used to return a value from a function. The `Continue` constructor is used to continue the execution of a loop. The `Break` constructor is used to break the execution of a loop. The `Panic` constructor is used to panic with an error message. We implement all these operations as exceptions, even if only `Panic` is really an error, as they behave in the same way: interrupting the execution of the current sub-expression to bubble up to a certain level.\\n4. A fuel monad for non-termination, with the additional `nat` parameter.\\n\\nThe parameter `R` of the type constructor `Monad` is used to represent the type of values that can be returned in the body of a function. It is the same as the return type of the function. So for a function returning a value of type `A`, we define its body in `Monad A A`. Then, we wrap it in an operator:\\n\\n```coq\\nDefinition catch_return {A : Set} (e : Monad A A) : M A :=\\n ...\\n```\\n\\nthat catches the `Return` exceptions and returns the value.\\n\\n## Conclusion\\n\\nWe will see in the next post how we define the `RawMonad` to handle the Rust state of a program and memory allocation.\\n\\n:::tip Contact\\n\\nIf you have a Rust codebase that you wish to formally verify, or need advice in your work, contact us at [contact@formal.land](mailto:contact@formal.land). 
We will be happy to set up a call with you.\\n\\n:::"},{"id":"/2023/04/26/representation-of-rust-methods-in-coq","metadata":{"permalink":"/blog/2023/04/26/representation-of-rust-methods-in-coq","source":"@site/blog/2023-04-26-representation-of-rust-methods-in-coq.md","title":"\ud83e\udd80 Representation of Rust methods in Coq","description":"With our project coq-of-rust we aim to translate high-level Rust code to similar-looking Coq code, to formally verify Rust programs. One of the important constructs in the Rust language is the method syntax. In this post, we present our technique to translate Rust methods using type-classes in Coq.","date":"2023-04-26T00:00:00.000Z","formattedDate":"April 26, 2023","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"}],"readingTime":4.57,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Representation of Rust methods in Coq","tags":["coq-of-rust","Rust","Coq"]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Monad for side effects in Rust","permalink":"/blog/2023/05/28/monad-for-side-effects-in-rust"},"nextItem":{"title":"\ud83e\udd84 Our current formal verification efforts","permalink":"/blog/2023/01/24/current-verification-efforts"}},"content":"With our project [coq-of-rust](https://github.com/formal-land/coq-of-rust) we aim to translate high-level Rust code to similar-looking [Coq](https://coq.inria.fr/) code, to [formally verify](https://en.wikipedia.org/wiki/Formal_verification) Rust programs. One of the important constructs in the Rust language is the [method syntax](https://doc.rust-lang.org/book/ch05-03-method-syntax.html). 
In this post, we present our technique to translate Rust methods using type-classes in Coq.\\n\\n\x3c!-- truncate --\x3e\\n\\n## Rust Code To Translate\\n\\nConsider the following Rust example, which contains a method (adapted from the [Rust Book](https://doc.rust-lang.org/book/)):\\n\\n```rust\\nstruct Rectangle {\\n width: u32,\\n height: u32,\\n}\\n\\nimpl Rectangle {\\n // Here \\"area\\" is a method\\n fn area(&self) -> u32 {\\n self.width * self.height\\n }\\n}\\n\\nfn main() {\\n let rect1 = Rectangle {\\n width: 30,\\n height: 50,\\n };\\n\\n println!(\\n \\"The area of the rectangle is {} square pixels.\\",\\n // We are calling this method there\\n rect1.area()\\n );\\n}\\n```\\n\\nThe Rust compiler can find the implementation of the `.area()` method call because it knows that the type of `rect1` is `Rectangle`. There could be other `area` methods defined for different types, and the code would still compile calling the `area` method of `Rectangle`.\\n\\nCoq has no direct equivalent for calling a function based on its name and type.\\n\\n## Our Translation\\n\\nHere is our Coq translation of the code above:\\n\\n```coq\\n 1: (* Generated by coq-of-rust *)\\n 2: Require Import CoqOfRust.CoqOfRust.\\n 3:\\n 4: Import Root.std.prelude.rust_2015.\\n 5:\\n 6: Module Rectangle.\\n 7: Record t : Set := {\\n 8: width : u32;\\n 9: height : u32;\\n10: }.\\n11:\\n12: Global Instance Get_width : Notation.Dot \\"width\\" := {\\n13: Notation.dot \'(Build_t x0 _) := x0;\\n14: }.\\n15: Global Instance Get_height : Notation.Dot \\"height\\" := {\\n16: Notation.dot \'(Build_t _ x1) := x1;\\n17: }.\\n18: End Rectangle.\\n19: Definition Rectangle : Set := Rectangle.t.\\n20:\\n21: Module ImplRectangle.\\n22: Definition Self := Rectangle.\\n23:\\n24: Definition area (self : ref Self) : u32 :=\\n25: self.[\\"width\\"].[\\"mul\\"] self.[\\"height\\"].\\n26:\\n27: Global Instance Method_area : Notation.Dot \\"area\\" := {\\n28: Notation.dot := area;\\n29: }.\\n30: End 
ImplRectangle.\\n31:\\n32: Definition main (_ : unit) : unit :=\\n33: let rect1 := {| Rectangle.width := 30; Rectangle.height := 50; |} in\\n34: _crate.io._print\\n35: (_crate.fmt.Arguments::[\\"new_v1\\"]\\n36: [ \\"The area of the rectangle is \\"; \\" square pixels.\\\\n\\" ]\\n37: [ _crate.fmt.ArgumentV1::[\\"new_display\\"] rect1.[\\"area\\"] ]) ;;\\n38: tt ;;\\n39: tt.\\n```\\n\\nOn line `24` we define the `area` function. On line `27` we declare that `area` is a method. On line `37` we call the `area` method on `rect1` with:\\n\\n```coq\\nrect1.[\\"area\\"]\\n```\\n\\nwhich closely resembles the source Rust code:\\n\\n```rust\\nrect1.area()\\n```\\n\\nCoq can automatically find the code of the `area` method to call.\\n\\n## How It Works\\n\\nThe code:\\n\\n```coq\\nrect1.[\\"area\\"]\\n```\\n\\nis actually a notation for:\\n\\n```coq\\nNotation.dot \\"area\\" rect1\\n```\\n\\nThen we leverage the inference mechanism of type-classes in Coq to find the code of the `area` method:\\n\\n```coq\\nModule Notation.\\n (** A class to represent the notation [e1.e2]. This is mainly used to call\\n methods, or access to named or indexed fields of structures.\\n The kind is either a string or an integer. *)\\n Class Dot {Kind : Set} (name : Kind) {T : Set} : Set := {\\n dot : T;\\n }.\\n Arguments dot {Kind} name {T Dot}.\\nEnd Notation.\\n```\\n\\nThe `Dot` class has three parameters: `Kind`, `name`, and `T`. `Kind` is the type of the name of the method (generally a string but it could be an integer in rare cases), `name` is the name of the method, and `T` is the type of the method. 
The `dot` field of the class is the code of the method.\\n\\nWhen we define the class instance:\\n\\n```coq\\n27: Global Instance Method_area : Notation.Dot \\"area\\" := {\\n28: Notation.dot := area;\\n29: }.\\n```\\n\\nwe instantiate the class `Notation.Dot` with three parameters:\\n\\n- `Kind` (inferred) is `string` because the name of the method is a string,\\n- `name` is `\\"area\\"` because the name of the method is `area`,\\n- `T` (inferred) is `ref Rectangle -> u32` because the method is declared as `fn area(&self) -> u32`.\\n\\nThen we define the `dot` field of the class instance to be the `area` function.\\n\\nWhen we call:\\n\\n```coq\\nNotation.dot \\"area\\" rect1\\n```\\n\\nCoq will automatically find the class instance `Method_area` because the type of `rect1` is `Rectangle` and the name of the method is `\\"area\\"`.\\n\\n## Other Use Cases\\n\\nThe `Dot` class is also used to access to named or indexed fields of structures or traits. We use a similar mechanism for associated functions. For example, the Rust code:\\n\\n```rust\\nlet rect1 = Rectangle::square(3);\\n```\\n\\nis translated to:\\n\\n```coq\\nlet rect1 := Rectangle::[\\"square\\"] 3 in\\n```\\n\\nwith a type-class for the `type::[name]` notation as follows:\\n\\n```coq\\nModule Notation.\\n (** A class to represent associated functions (the notation [e1::e2]). The\\n kind might be [Set] for functions associated to a type,\\n or [Set -> Set] for functions associated to a trait. *)\\n Class DoubleColon {Kind : Type} (type : Kind) (name : string) {T : Set} :\\n Set := {\\n double_colon : T;\\n }.\\n Arguments double_colon {Kind} type name {T DoubleColon}.\\nEnd Notation.\\n```\\n\\n## In Conclusion\\n\\nThe type-classes mechanism of Coq appears flexible enough to represent our current use cases involving methods and associated functions. 
It remains to be seen whether this approach will suffice for future use cases.\\n\\n:::tip Contact\\n\\nIf you have a Rust codebase that you wish to formally verify, or need advice in your work, contact us at [contact@formal.land](mailto:contact@formal.land). We will be happy to set up a call with you.\\n\\n:::"},{"id":"/2023/01/24/current-verification-efforts","metadata":{"permalink":"/blog/2023/01/24/current-verification-efforts","source":"@site/blog/2023-01-24-current-verification-efforts.md","title":"\ud83e\udd84 Our current formal verification efforts","description":"We are diversifying ourselves to apply formal verification on 3\ufe0f\u20e3 new languages with Solidity, Rust, and TypeScript. In this article we describe our approach. For these three languages, we translate the code to the proof system \ud83d\udc13 Coq. We generate the cleanest \ud83e\uddfc possible output to simplify the formal verification \ud83d\udcd0 effort that comes after.","date":"2023-01-24T00:00:00.000Z","formattedDate":"January 24, 2023","tags":[{"label":"coq-of-ocaml","permalink":"/blog/tags/coq-of-ocaml"},{"label":"OCaml","permalink":"/blog/tags/o-caml"},{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"TypeScript","permalink":"/blog/tags/type-script"}],"readingTime":4.89,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd84 Our current formal verification efforts","tags":["coq-of-ocaml","OCaml","Solidity","Rust","TypeScript"]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Representation of Rust methods in Coq","permalink":"/blog/2023/04/26/representation-of-rust-methods-in-coq"},"nextItem":{"title":"\ud83d\udc2b Latest blog posts on our formal verification effort on Tezos","permalink":"/blog/2022/12/13/latest-blog-posts-on-tezos"}},"content":"We are diversifying ourselves to apply [formal verification](https://en.wikipedia.org/wiki/Formal_verification) on 3\ufe0f\u20e3 new languages with 
**Solidity**, **Rust**, and **TypeScript**. In this article we describe our approach. For these three languages, we translate the code to the proof system [\ud83d\udc13 Coq](https://coq.inria.fr/). We generate the cleanest \ud83e\uddfc possible output to simplify the formal verification \ud83d\udcd0 effort that comes after.\\n\\n> Formal verification is a way to ensure that a program follows its specification in \ud83d\udcaf% of cases thanks to the use of mathematical methods. It removes far more bugs and security issues than testing, and is necessary to deliver software of the highest quality \ud83d\udc8e.\\n\\n\x3c!-- truncate --\x3e\\n\\n## \ud83d\uddfa\ufe0f General plan\\nTo apply formal verification to real-sized applications, we need to handle thousands of lines of code in a seamless way. We rely on the proof system Coq to write our proofs, as it has a mature ecosystem, and automated (SMT) and interactive ways to write proofs. To keep the proofs simple, we must find an efficient way to convert an existing and evolving codebase to Coq.\\n\\nFor example, given the following TypeScript code:\\n```typescript\\nexport function checkIfEnoughCredits(user: User, credits: number): boolean {\\n if (user.isAdmin) {\\n return credits >= 0;\\n }\\n\\n return credits >= 1000;\\n}\\n```\\nwe want to generate the corresponding Coq code in an automated way:\\n```coq\\nDefinition checkIfEnoughCredits (user : User) (credits : number) : bool :=\\n if user.(User.isAdmin) then\\n credits >= 0\\n else\\n credits >= 1000.\\n```\\nThis is the exact equivalent written using the Coq syntax, where we check the `credits` condition depending on the user\'s status. This is the `checkIfEnoughCredits` definition a Coq developer would directly write, in an idiomatic way.\\n\\nWe make some hypotheses on the input code. In TypeScript we assume the code does not contain mutations, which is often the case to simplify asynchronous code. 
In Rust we have other hypotheses, as making safe mutations is one of the key features of the language and a frequent pattern. For each language we look for a correct subset to work on, to support common use cases and still generate clean Coq code.\\n\\n## \ud83c\uddf8 Solidity\\n\u27a1\ufe0f [Project page](/docs/verification/solidity) \u2b05\ufe0f\\n\\nThe [Solidity language](https://soliditylang.org/) is the main language to write smart contracts on the [Ethereum](https://ethereum.org/) blockchain. As smart contracts cannot be easily updated and handle a large amount of money, it is critical to formally verify them to prevent bugs.\\n\\nOur strategy is to develop a translator [coq-of-solidity](https://gitlab.com/formal-land/coq-of-solidity) from Solidity to Coq. We are using an implementation of an [ERC-20](https://en.wikipedia.org/wiki/Ethereum#ERC20) smart contract as an example to guide our translation. The two main difficulties in the translation of Solidity programs are:\\n* the use of object-oriented programming with inheritance on classes,\\n* the use of mutations and errors, that need to be handled in a monad.\\n\\nWe are still trying various approaches to handle these difficulties and generate a clean Coq output for most cases.\\n\\nIn addition to our work on Solidity, we are looking at the [EVM code](https://ethereum.org/en/developers/docs/evm/) that is the assembly language of Ethereum. It has the advantage of being more stable and having simpler semantics than Solidity. However, it is not as expressive and programs in EVM are much harder to read. We have a prototype translator from EVM to Coq named [ethereum-vm-to-coq](https://gitlab.com/formal-land/ethereum-vm-to-coq). 
An interesting goal will be to connect the translation of Solidity and of EVM in Coq to show that they have the same semantics on a given smart contract.\\n\\nNote that EVM is the target language of many verification projects on Ethereum such as [Certora](https://www.certora.com/) or static analyzers. We prefer to target Solidity as it is more expressive and the generated code in Coq will thus be easier to verify.\\n\\n## \ud83e\udd80 Rust\\n\u27a1\ufe0f [Project page](/docs/verification/rust) \u2b05\ufe0f\\n\\nThe [Rust language](https://www.rust-lang.org/) is a modern systems programming language that is gaining popularity. It is a safe language that prevents many common errors such as buffer overflows or use-after-free. It is also a language that is used to write low-level code, such as drivers or operating systems. As such, it is critical to formally verify Rust programs to prevent bugs.\\n\\nWe work in collaboration with the team developing the [Aeneas](https://github.com/AeneasVerif) project, with people from Inria and Microsoft. The aim is to translate Rust code with mutations to a purely functional form in Coq (without mutations) to simplify the verification effort and avoid the need for separation logic. The idea of this translation is explained in the [Aeneas paper](https://dl.acm.org/doi/abs/10.1145/3547647).\\n\\nThere are two steps in the translation:\\n1. **From [MIR](https://rustc-dev-guide.rust-lang.org/mir/index.html) (low-level intermediate form of Rust) to LLBC.** This is a custom language for the project that contains all the information of MIR but is better suited for analysis. For example, instead of using a control-flow graph it uses control structures and an abstract syntax tree. This step is implemented in Rust.\\n2. **From LLBC to Coq.** This is the heart of the project and is implemented in OCaml. 
This is where the translation from mutations to a purely functional form occurs.\\n\\nFor now we are focusing on adding new features to LLBC and improving the user experience: better error messages, generation of an output with holes for unhandled Rust features.\\n\\n## \ud83c\udf10 TypeScript\\n\u27a1\ufe0f [Project page](/docs/verification/typescript) \u2b05\ufe0f\\n\\nWe have a [\ud83d\udcfd\ufe0f demo project](https://formal-land.github.io/coq-of-js/) to showcase the translation of a purely functional subset of JavaScript to Coq. We handle functions and basic data types such as records, enums and discriminated unions. We are now porting the code to TypeScript in [coq-of-ts](https://github.com/formal-land/coq-of-ts). We prefer to work on TypeScript rather than JavaScript as type information is useful to guide the translation, and avoids the need for additional annotations on the source code.\\n\\nOur next target will be to make `coq-of-ts` usable on real-life projects.\\n\\n:::info Social media\\nFollow us on [Twitter](https://twitter.com/FormalLand) \ud83d\udc26 and [Telegram](https://t.me/formal_land) to get the latest news about our projects. If you think our work is interesting, please share it with your friends and colleagues. \ud83d\ude4f\\n:::"},{"id":"/2022/12/13/latest-blog-posts-on-tezos","metadata":{"permalink":"/blog/2022/12/13/latest-blog-posts-on-tezos","source":"@site/blog/2022-12-13-latest-blog-posts-on-tezos.md","title":"\ud83d\udc2b Latest blog posts on our formal verification effort on Tezos","description":"Here we recall some blog articles that we have written since this summer, on the formal verification of the protocol of Tezos. For this project, we are verifying a code base of around 100,000 lines of OCaml code. We automatically convert the OCaml code to the proof system Coq using the converter coq-of-ocaml. 
We then apply various proof techniques to make sure that the protocol of Tezos does not contain bugs.","date":"2022-12-13T00:00:00.000Z","formattedDate":"December 13, 2022","tags":[{"label":"coq-tezos-of-ocaml","permalink":"/blog/tags/coq-tezos-of-ocaml"},{"label":"Tezos","permalink":"/blog/tags/tezos"},{"label":"coq-of-ocaml","permalink":"/blog/tags/coq-of-ocaml"}],"readingTime":1.755,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc2b Latest blog posts on our formal verification effort on Tezos","tags":["coq-tezos-of-ocaml","Tezos","coq-of-ocaml"]},"unlisted":false,"prevItem":{"title":"\ud83e\udd84 Our current formal verification efforts","permalink":"/blog/2023/01/24/current-verification-efforts"},"nextItem":{"title":"\ud83d\udc2b Upgrade coq-of-ocaml to OCaml 4.14","permalink":"/blog/2022/06/23/upgrade-coq-of-ocaml-4.14"}},"content":"Here we recall some blog articles that we have written since this summer, on the [formal verification of the protocol of Tezos](https://formal-land.gitlab.io/coq-tezos-of-ocaml/). For this project, we are verifying a code base of around 100,000 lines of OCaml code. We automatically convert the OCaml code to the proof system Coq using the converter [coq-of-ocaml](https://github.com/formal-land/coq-of-ocaml). We then apply various proof techniques to make sure that the protocol of Tezos does not contain bugs.\\n\\n\x3c!-- truncate --\x3e\\n\\n## Blog articles \ud83d\udcdd\\nHere is the list of articles about the work we have done since this summer. 
We believe that some of this work is very unique and specific to Tezos.\\n\\n* [The error monad, internal errors and validity predicates, step-by-step](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/12/12/internal-errors-step-by-step/) by *Pierre Vial*: a detailed explanation of what we are doing to verify the absence of unexpected errors in the whole code base;\\n* [Absence of internal errors](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/10/18/absence-of-internal-errors/) by *Guillaume Claret*: the current state of our proofs to verify the absence of unexpected errors;\\n* [Skip-list verification. Using inductive predicates](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/10/03/verifying-the-skip-list-inductive-predicates/) by *Bart\u0142omiej Kr\xf3likowski* and *Natalie Klaus*: a presentation of our verification effort on the skip-list algorithm implementation (part 2);\\n* [Verifying the skip-list](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/10/03/verifying-the-skip-list/) by *Natalie Klaus* and *Bart\u0142omiej Kr\xf3likowski*: a presentation of our verification effort on the skip-list algorithm implementation (part 1);\\n* [Verifying json-data-encoding](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/08/15/verify-json-data-encoding/) by *Tait van Strien*: our work to verify an external library used by the Tezos protocol, to safely serialize data to JSON values;\\n* [Fixing reused proofs](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/07/19/fixing-proofs/) by *Bart\u0142omiej Kr\xf3likowski*: a presentation, with examples, of the work we do to maintain existing proofs and specifications as the code evolves;\\n* [Formal verification of property based tests](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/06/07/formal-verification-of-property-based-tests/) by *Guillaume Claret*: the principle and status of our work to formally verify the generalized case of property-based 
tests;\\n* [Plan for backward compatibility verification](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/06/02/plan-backward-compatibility) by *Guillaume Claret*: an explanation of the strategy we use to show that two successive versions of the Tezos protocol are fully backward compatible.\\n\\nTo follow more of our activity, feel free to follow our [Twitter account \ud83d\udc26](https://twitter.com/FormalLand)! If you need services or advice to formally verify your code base, you can drop us an [email \ud83d\udce7](mailto:contact@formal.land)!"},{"id":"/2022/06/23/upgrade-coq-of-ocaml-4.14","metadata":{"permalink":"/blog/2022/06/23/upgrade-coq-of-ocaml-4.14","source":"@site/blog/2022-06-23-upgrade-coq-of-ocaml-4.14.md","title":"\ud83d\udc2b Upgrade coq-of-ocaml to OCaml 4.14","description":"In an effort to support the latest version of the protocol of Tezos we upgraded coq-of-ocaml to add compatibility with OCaml 4.14. The result is available in the branch ocaml-4.14. 
We describe here how we made this upgrade.","date":"2022-06-23T00:00:00.000Z","formattedDate":"June 23, 2022","tags":[{"label":"coq-of-ocaml","permalink":"/blog/tags/coq-of-ocaml"},{"label":"ocaml","permalink":"/blog/tags/ocaml"},{"label":"4.14","permalink":"/blog/tags/4-14"}],"readingTime":2.195,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc2b Upgrade coq-of-ocaml to OCaml 4.14","tags":["coq-of-ocaml","ocaml","4.14"]},"unlisted":false,"prevItem":{"title":"\ud83d\udc2b Latest blog posts on our formal verification effort on Tezos","permalink":"/blog/2022/12/13/latest-blog-posts-on-tezos"},"nextItem":{"title":"\ud83d\udc2b Status update on the verification of Tezos","permalink":"/blog/2022/06/15/status update-tezos"}},"content":"In an effort to support the latest version of the [protocol of Tezos](https://gitlab.com/tezos/tezos/-/tree/master/src/proto_alpha/lib_protocol) we upgraded [`coq-of-ocaml`](https://github.com/formal-land/coq-of-ocaml) to add compatibility with OCaml 4.14. The result is available in the branch [`ocaml-4.14`](https://github.com/formal-land/coq-of-ocaml/pull/217). We describe here how we made this upgrade.\\n\\n\x3c!-- truncate --\x3e\\n\\n## Usage of Merlin\\nIn `coq-of-ocaml` we are using [Merlin](https://github.com/ocaml/merlin) to get the typed [abstract syntax tree](https://en.wikipedia.org/wiki/Abstract_syntax_tree) of OCaml files. We see the AST through the [Typedtree](https://docs.mirage.io/ocaml/Typedtree/index.html) interface, together with an access to all the definitions of the current compilation environment. Merlin computes the current environment by understanding how an OCaml project is configured and connecting to the [dune](https://dune.build/) build system. 
The environment is mandatory for certain transformations in `coq-of-ocaml`, like:\\n* finding a canonical name for module types;\\n* propagating phantom types.\\n\\nIn order to use Merlin as a library (rather than as a daemon), we vendor the [LSP version](https://github.com/rgrinberg/merlin/tree/lsp) of [rgrinberg](https://github.com/rgrinberg) in the folder [`vendor/`](https://github.com/formal-land/coq-of-ocaml/tree/master/vendor). This vendored version works with no extra configurations.\\n\\n## Upgrade\\nWhen a new version of OCaml is out, we upgrade our vendored version of Merlin to a compatible one. Then we do the necessary changes to `coq-of-ocaml`, as the interface of the AST generally evolves with small changes. For OCaml 4.14, the main change was some types becoming abstract such as `Types.type_expr`. To access the fields of these types, we now need to use a specific getter and make changes such as:\\n```diff\\n- match typ.desc with\\n+ match Types.get_desc typ with\\n```\\nThis made some patterns in `match` expressions more complex, but otherwise the changes were minimal. We ran all the unit-tests of `coq-of-ocaml` after the upgrade and they were still valid.\\n\\n## Git submodule or copy & paste?\\nTo vendor Merlin we have two possibilities:\\n1. Using a [Git submodule](https://git-scm.com/book/en/v2/Git-Tools-Submodules).\\n2. Doing a copy & paste of the code.\\n\\nThe first possibility is more efficient in terms of space, but there are a few disadvantages:\\n* we cannot make small modifications if needed;\\n* the archives generated by GitHub do not contain the code of the submodules (see this [issue](https://github.com/dear-github/dear-github/issues/214));\\n* if a commit in the repository for the submodule disappears, then the submodule is unusable.\\n\\nThe last reason forced us to do a copy & paste for OCaml 4.14. 
We now have to be cautious not to commit the generated `.ml` file for the OCaml parser.\\n\\n## Next\\nThe next change will be doing the upgrade to OCaml 5. There should be many more changes, and in particular a new way of handling the effects. We do not know yet if it will be possible to translate the effect handlers to Coq in a nice way."},{"id":"/2022/06/15/status update-tezos","metadata":{"permalink":"/blog/2022/06/15/status update-tezos","source":"@site/blog/2022-06-15-status update-tezos.md","title":"\ud83d\udc2b Status update on the verification of Tezos","description":"Here we give an update on our verification effort on the protocol of Tezos. We add the marks:","date":"2022-06-15T00:00:00.000Z","formattedDate":"June 15, 2022","tags":[{"label":"tezos","permalink":"/blog/tags/tezos"},{"label":"coq-of-ocaml","permalink":"/blog/tags/coq-of-ocaml"},{"label":"coq","permalink":"/blog/tags/coq"}],"readingTime":7.53,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc2b Status update on the verification of Tezos","tags":["tezos","coq-of-ocaml","coq"]},"unlisted":false,"prevItem":{"title":"\ud83d\udc2b Upgrade coq-of-ocaml to OCaml 4.14","permalink":"/blog/2022/06/23/upgrade-coq-of-ocaml-4.14"},"nextItem":{"title":"\ud83d\udc2b Make Tezos the first formally verified cryptocurrency","permalink":"/blog/2022/02/02/make-tezos-a-formally-verified-crypto"}},"content":"Here we give an update on our [verification effort](https://formal-land.gitlab.io/coq-tezos-of-ocaml/) on the protocol of Tezos. 
We add the marks:\\n* \u2705 for \\"rather done\\"\\n* \ud83c\udf0a for \\"partially done\\"\\n* \u274c for \\"most is yet to do\\"\\n\\nOn the website of the project, we also automatically generate pages such as [Compare](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/status/compare/) to follow the status of the tasks.\\n\\n\x3c!-- truncate --\x3e\\n\\n## Maintenance of the translation \u2705\\nWe were able to maintain most of the translation from OCaml to Coq of the protocol of Tezos using [coq-of-ocaml](https://github.com/formal-land/coq-of-ocaml), including all the translation of the Michelson interpreter. There was an increase in the size of the OCaml code base in recent months, due to new features added in Tezos like the [rollups](https://research-development.nomadic-labs.com/tezos-is-scaling.html). Here are the numbers of lines of code (`.ml` and `.mli` files) for the various protocol versions:\\n* protocol H: `51147`\\n* protocol I: `59535`\\n* protocol J: `83271` (increase mainly due to the rollups)\\n* protocol Alpha (development version of K): `90716`\\n\\nWe still translate most of the protocol code up to version J. We stayed on version J for a while as we wanted to add as many proofs as possible before doing a proof of backward compatibility between J and K. We are currently updating the translation to support the protocol version Alpha, preparing for the translation of K.\\n\\nFor protocol J, we needed to add a [blacklist.txt](https://gitlab.com/nomadic-labs/coq-tezos-of-ocaml/-/blob/master/blacklist.txt) of files that we do not support. Indeed, we need to make new changes to `coq-of-ocaml` to support these or make hard-to-maintain changes to [our fork](https://gitlab.com/tezos/tezos/-/merge_requests/3303) of the Tezos protocol. 
We plan to complete the translation and remove this black-list for the protocol J soon (in a week or two).\\n\\n## Size of the proofs \u2705\\nOne of our plans is to have a reasonable quantity of proofs, to cover a reasonable quantity of code and properties from the protocol. We believe we have a good quantity of proofs now, as we have more than 50,000 lines of Coq code (for an OCaml codebase of 80,000 lines).\\n\\nIn addition to our main targets, we verify many \\"smaller\\" properties, such as:\\n* conversion functions are inverses (when there are two `to_int` and `of_int` functions in a file, we show that they are inverses);\\n* the `compare` functions, to order elements, are well defined (see our blog post [Verifying the compare functions of OCaml](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/04/04/verifying-the-compare-functions));\\n* invariants are preserved. For example, [here](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/proofs/carbonated_map#Make.update_is_valid) we show that updating a carbonated map preserves the property of having a size field actually equal to the number of elements.\\n\\nWe should note that the size of Coq proofs tends to grow faster than the size of the verified code. We have no coverage metrics to know how much of the code is covered by these proofs.\\n\\n## Data-encodings \ud83c\udf0a\\nThe [data-encoding](https://gitlab.com/nomadic-labs/data-encoding) library is a set of combinators to write serialization/de-serialization functions. We verify that the encodings defined for each protocol data type are bijective. The good thing we have is a semi-automated tactic to verify the use of the `data-encoding` primitives. We detail this approach in our blog post [Automation of `data_encoding` proofs](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2021/11/22/data-encoding-automation). We can verify most of the encoding functions that we encounter. 
From there, we also express the **invariant** associated with each data type, which the encodings generally check at runtime. The invariants are then the domain of definition of the encodings.\\n\\nHowever, we have a hole: we do not verify the `data-encoding` library itself. Thus the [axioms we made](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/environment/proofs/data_encoding) on the data-encoding primitives may have approximations. And indeed, we missed one issue in the development code of the protocol. This is thus a new high-priority target to verify the `data-encoding` library itself. One of the challenges for the proof is the use of side-effects (references and exceptions) in this library.\\n\\n## Property-based tests \ud83c\udf0a\\nThe property-based tests on the protocol are located in [`src/proto_alpha/lib_protocol/test/pbt`](https://gitlab.com/tezos/tezos/-/tree/master/src/proto_alpha/lib_protocol/test/pbt). These tests are composed of:\\n* a generator, generating random inputs of a certain shape;\\n* a property function, a boolean function taking a generated input and supposed to always answer `true`.\\n\\nWe translated a part of these tests to Coq, to convert them to theorems and have specifications extracted from the code. The result of this work is summarized in this blog post: [Formal verification of property based tests](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/06/07/formal-verification-of-property-based-tests). We have fully translated and verified four test files over a total of twelve. We are continuing the work of translations and proofs.\\n\\nHowever, we found that for some of the files the proofs were taking a long time to write compared to the gains in safety. Indeed, the statements made in the tests are sometimes too complex when translated into general theorems. 
For example, for [test_carbonated_map.ml](https://gitlab.com/tezos/tezos/-/blob/master/src/proto_alpha/lib_protocol/test/pbt/test_carbonated_map.ml) we have to deal with:\\n* gas exhaustion (seemingly impossible in the tests);\\n* data structures of size greater than `max_int` (impossible in practice).\\n\\nAll of this complicates the proofs for little gain in safety. So I would say that not all the property-based tests have a nice and useful translation to Coq. We should still note that for some of the tests, like with saturation arithmetic, we have proofs that work well. For these, we rely on the automated linear arithmetic tactic [`lia`](https://coq.inria.fr/refman/addendum/micromega.html) of Coq to verify properties over integer overflows.\\n\\n## Storage system \ud83c\udf0a\\nBy \\"storage system\\" we mean the whole set of functors defined in [`storage_functors.ml`](https://gitlab.com/tezos/tezos/-/blob/master/src/proto_alpha/lib_protocol/storage_functors.ml) and how we apply them to define the protocol storage in [`storage.ml`](https://gitlab.com/tezos/tezos/-/blob/master/src/proto_alpha/lib_protocol/storage.ml). 
These functors create sub-storages with signatures such as:\\n```ocaml\\nmodule type Non_iterable_indexed_data_storage = sig\\n type t\\n type context = t\\n type key\\n type value\\n val mem : context -> key -> bool Lwt.t\\n val get : context -> key -> value tzresult Lwt.t\\n val find : context -> key -> value option tzresult Lwt.t\\n val update : context -> key -> value -> Raw_context.t tzresult Lwt.t\\n val init : context -> key -> value -> Raw_context.t tzresult Lwt.t\\n val add : context -> key -> value -> Raw_context.t Lwt.t\\n val add_or_remove : context -> key -> value option -> Raw_context.t Lwt.t\\n val remove_existing : context -> key -> Raw_context.t tzresult Lwt.t\\n val remove : context -> key -> Raw_context.t Lwt.t\\nend\\n```\\nThis `Non_iterable_indexed_data_storage` API looks like the API of OCaml\'s [Map](https://v2.ocaml.org/api/Map.Make.html). As a result, our goal for the storage is to show that it can be simulated by standard OCaml data structures such as sets and maps. This is a key step to unlock further reasoning about code using the storage.\\n\\nUnfortunately, we were not able to verify the whole storage system yet. Among the difficulties are that:\\n* there are many layers in the definition of the storage;\\n* the storage functors use a lot of abstractions, and sometimes it is unclear how to specify them in the general case.\\n\\nStill, we have verified some of the functors as seen in [`Proofs/Storage_functors.v`](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/proofs/storage_functors) and specified the `storage.ml` file in [`Proofs/Storage.v`](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/storage). We believe we now have the correct specifications for all of the storage abstractions. We plan to complete all these proofs later.\\n\\n## Michelson\\nThe verification of the Michelson interpreter is what occupied most of our time. 
By considering the OCaml files whose name starts by `script_`, the size of the Michelson interpreter is around 20,000 lines of OCaml code.\\n\\n### Simulations \ud83c\udf0a\\nThe interpreter relies heavily on [GADTs](https://v2.ocaml.org/manual/gadts.html) in OCaml. Because these do not translate nicely in Coq, we need to write simulations in dependent types of the interpreter functions, and prove them correct in Coq. We describe this process in our [Michelson Guide](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/guides/michelson).\\n\\nThe main difficulties we encountered are:\\n* the number of simulations to write (covering the 20,000 lines of OCaml);\\n* the execution time of the proof of correctness of the simulations. This is due to the large size of the inductive types describing the Michelson AST, and the use of dependent types generating large proof terms. For example, there are around 30 cases for the types and 150 for the instructions node in the AST.\\n\\nWhen writing the simulations, we are also verifying the termination of all the functions and the absence of reachable `assert false`. We have defined the simulation of many functions, but are still missing important ones such as [`parse_instr_aux`](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/script_ir_translator/#parse_instr_aux) to parse Michelson programs.\\n\\n### Mi-Cho-Coq \ud83c\udf0a\\nWe have a project to verify that the [Mi-Cho-Coq](https://gitlab.com/nomadic-labs/mi-cho-coq) framework, used to formally verify smart contracts written in Michelson, is compatible with the implementation of the Michelson interpreter in OCaml. We have a partial proof of compatibility in [Micho_to_dep.v](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/simulations/micho_to_dep). We still need to complete this proof, especially to handle instructions with loops. 
Our goal is to show a complete inclusion of the semantics of Mi-Cho-Coq into the semantics of the implementation.\\n\\n### Parse/unparse \u274c\\nWe wanted to verify that the various parsing and unparsing functions over Michelson are inverses. These functions exist for:\\n* comparable types\\n* types\\n* comparable data\\n* data\\n\\nBecause we are still focused on writing, verifying or updating the simulations, we are still not done for this task.\\n\\n## Conclusion\\nWe have many ongoing projects but few fully completed tasks. We will focus more on having terminated proofs."},{"id":"/2022/02/02/make-tezos-a-formally-verified-crypto","metadata":{"permalink":"/blog/2022/02/02/make-tezos-a-formally-verified-crypto","source":"@site/blog/2022-02-02-make-tezos-a-formally-verified-crypto.md","title":"\ud83d\udc2b Make Tezos the first formally verified cryptocurrency","description":"Elephants","date":"2022-02-02T00:00:00.000Z","formattedDate":"February 2, 2022","tags":[{"label":"tezos","permalink":"/blog/tags/tezos"},{"label":"coq-of-ocaml","permalink":"/blog/tags/coq-of-ocaml"},{"label":"coq","permalink":"/blog/tags/coq"}],"readingTime":3.675,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc2b Make Tezos the first formally verified cryptocurrency","tags":["tezos","coq-of-ocaml","coq"]},"unlisted":false,"prevItem":{"title":"\ud83d\udc2b Status update on the verification of Tezos","permalink":"/blog/2022/06/15/status update-tezos"},"nextItem":{"title":"\ud83d\udc2b New blog posts and Meetup talk","permalink":"/blog/2021/11/12/new-blog-posts-and-meetup-talk"}},"content":"![Elephants](elephants-elmira-gokoryan.webp)\\n\\nOur primary goal at [Formal Land \ud83c\udf32](https://formal.land/) is to make [Tezos](https://tezos.com/) the first crypto-currency with a formally verified implementation. 
With [formal verification](https://en.wikipedia.org/wiki/Formal_verification), thanks to mathematical methods, we can check that a program behaves as expected for all possible inputs. Formal verification goes beyond what testing can do, as testing can only handle a finite amount of cases. That is critical as cryptocurrencies hold a large amount of money (around $3B for Tezos today). The current result of our verification project is available on [nomadic-labs.gitlab.io/coq-tezos-of-ocaml](https://formal-land.gitlab.io/coq-tezos-of-ocaml/). Formal verification is also key to allowing Tezos to evolve constantly in a safe and backward compatible manner.\\n\\n\x3c!-- truncate --\x3e\\n\\nWe proceed in two steps:\\n1. we translate the code of Tezos, written in [OCaml](https://ocaml.org/), to the proof language [Coq](https://coq.inria.fr/) using the translator [coq-of-ocaml](https://github.com/foobar-land/coq-of-ocaml);\\n2. we write our specifications and proofs in the Coq language.\\n\\nWe believe this is one of the most efficient ways to proceed, as we can work on an almost unmodified version of the codebase and use the full power of the mature proof system Coq. The code of Tezos is composed of around:\\n* 50,000 lines for the protocol (the kernel of Tezos), and\\n* 200,000 lines for the shell (everything else, including the peer-to-peer layer and the storage backend).\\n\\nWe are currently focusing on verifying the protocol for the following modules.\\n\\n## Data-encoding\\nThe [data-encoding](https://gitlab.com/nomadic-labs/data-encoding) library offers serialization and deserialization to binary and JSON formats. It is used in various parts of the Tezos protocol, especially on all the data types ending up in the storage system. In practice, many encodings are defined in the OCaml files named `*_repr.ml`. We verify that the `data-encoding` library is correctly used to define the encodings. 
We check that converting a value to binary format and back returns the initial value. We make explicit the domain of validity of such conversions. This verification work generally reveals and propagates invariants about the data structures of the protocol. As an example of an invariant, all the account amounts should always be positive. Having these invariants will be helpful for the verification of higher-level layers of the protocol.\\n\\n## Michelson smart contracts\\nThe smart contract language of Tezos is [Michelson](https://tezos.gitlab.io/active/michelson.html). The interpreter and type-checker of smart contracts is one of the most complex and critical parts of the protocol. We are verifying two things about this code:\\n* The equivalence of the interpreter and the Coq semantics for Michelson defined in the project [Mi-Cho-Coq](https://gitlab.com/nomadic-labs/mi-cho-coq). Thanks to this equivalence, we can make sure that the formal verification of smart contracts is sound for the current version of the protocol.\\n* The compatibility of the parsing and unparsing functions for the Michelson types and values. The parsing functions take care of the type-checking and do a lot of sanity checks on Michelson expressions with appropriate error messages. Showing that the parsing and unparsing functions are inverses is important for security reasons. The Michelson values are always unparsed at the end of a smart contract execution to be stored on disk.\\n\\nTo do these proofs, we also give a new semantics of Michelson, expressed using dependent types rather than the [GADTs](https://ocaml.org/manual/gadts-tutorial.html) used in the OCaml implementation.\\n\\n## Storage system\\nCryptocurrencies typically take a lot of space on disk (in the hundreds of gigabytes). In Tezos, we use the key-value database [Irmin](https://irmin.org/). 
The protocol provides a lot of [abstractions](https://gitlab.com/tezos/tezos/-/blob/master/src/proto_alpha/lib_protocol/storage_functors.ml) over this database to expose higher-level interfaces with set and map-like APIs. We verify that these abstractions are valid doing a proof by simulation, where we show that the whole system is equivalent to an [in-memory database](https://en.wikipedia.org/wiki/In-memory_database) using simpler data structures. Thanks to this simulation, we will be able to reason about code using the storage as if we were using the simpler in-memory version.\\n\\n## In addition\\nWe also plan to verify:\\n* The implementation of the `data-encoding` library itself. This code is challenging for formal verification as it contains many imperative features. Another specificity of this library is that it sits outside of the protocol of Tezos, and we might need to adapt `coq-of-ocaml` to support it.\\n* The [property-based tests of the protocol](https://gitlab.com/tezos/tezos/-/tree/master/src/proto_alpha/lib_protocol/test/pbt). These tests are written as boolean functions (or functions raising exceptions), which must return `true` on any possible inputs. 
We will verify them in the general case by importing their definitions to Coq and verifying with mathematical proofs that they are always correct.\\n\\n:::tip Contact\\nFor any questions or remarks, contact us on \ud83d\udc49 [contact@formal.land](mailto:contact@formal.land) \ud83d\udc48.\\n:::"},{"id":"/2021/11/12/new-blog-posts-and-meetup-talk","metadata":{"permalink":"/blog/2021/11/12/new-blog-posts-and-meetup-talk","source":"@site/blog/2021-11-12-new-blog-posts-and-meetup-talk.md","title":"\ud83d\udc2b New blog posts and Meetup talk","description":"Recently, we added two new blog posts about the verification of the crypto-currency Tezos:","date":"2021-11-12T00:00:00.000Z","formattedDate":"November 12, 2021","tags":[{"label":"tezos","permalink":"/blog/tags/tezos"},{"label":"mi-cho-coq","permalink":"/blog/tags/mi-cho-coq"},{"label":"coq-of-ocaml","permalink":"/blog/tags/coq-of-ocaml"},{"label":"meetup","permalink":"/blog/tags/meetup"}],"readingTime":0.58,"hasTruncateMarker":false,"authors":[],"frontMatter":{"title":"\ud83d\udc2b New blog posts and Meetup talk","tags":["tezos","mi-cho-coq","coq-of-ocaml","meetup"]},"unlisted":false,"prevItem":{"title":"\ud83d\udc2b Make Tezos the first formally verified cryptocurrency","permalink":"/blog/2022/02/02/make-tezos-a-formally-verified-crypto"},"nextItem":{"title":"\ud83d\udc2b Verification of the use of data-encoding","permalink":"/blog/2021/10/27/verification-data-encoding"}},"content":"Recently, we added two new blog posts about the verification of the crypto-currency [Tezos](https://tezos.com/):\\n* [Verify the Michelson types of Mi-Cho-Coq](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2021/11/01/verify-michelson-types-mi-cho-coq/) to compare the types defined in the Tezos code for the [Michelson](http://tezos.gitlab.io/active/michelson.html) interpreter and in the [Mi-Cho-Coq library](https://gitlab.com/nomadic-labs/mi-cho-coq) to verify smart contracts;\\n* [Translate the Tenderbake\'s code to 
Coq](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2021/11/08/translate-tenderbake/) to explain how we translated the recent changes in Tezos to Coq using [coq-of-ocaml](https://github.com/foobar-land/coq-of-ocaml). In particular, we translated the code of the new [Tenderbake](https://research-development.nomadic-labs.com/a-look-ahead-to-tenderbake.html) consensus algorithm.\\n\\nWe also talked at the [Lambda Lille Meetup](https://www.meetup.com/LambdaLille/events/281374644/) (in French) to present our work on `coq-of-ocaml` for Tezos. A video on the [YouTube channel](https://www.youtube.com/channel/UC-hC7y_ilQBq0QCa9xDu1iA) of the Meetup should be available shortly. We thank the organizers for hosting the talk."},{"id":"/2021/10/27/verification-data-encoding","metadata":{"permalink":"/blog/2021/10/27/verification-data-encoding","source":"@site/blog/2021-10-27-verification-data-encoding.md","title":"\ud83d\udc2b Verification of the use of data-encoding","description":"We added a blog post about the verification of the use of data-encodings in the protocol of Tezos. Currently, we work on the verification of Tezos and publish our blog articles there. 
We use coq-of-ocaml to translate the OCaml code to Coq and do our verification effort.","date":"2021-10-27T00:00:00.000Z","formattedDate":"October 27, 2021","tags":[{"label":"data-encoding","permalink":"/blog/tags/data-encoding"}],"readingTime":0.235,"hasTruncateMarker":false,"authors":[],"frontMatter":{"title":"\ud83d\udc2b Verification of the use of data-encoding","tags":["data-encoding"]},"unlisted":false,"prevItem":{"title":"\ud83d\udc2b New blog posts and Meetup talk","permalink":"/blog/2021/11/12/new-blog-posts-and-meetup-talk"},"nextItem":{"title":"\ud83d\ude00 Welcome","permalink":"/blog/2021/10/10/welcome"}},"content":"We added a blog post about the [verification of the use of data-encodings](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2021/10/20/data-encoding-usage) in the protocol of Tezos. Currently, we work on the verification of Tezos and publish our blog articles there. We use [coq-of-ocaml](https://foobar-land.github.io/coq-of-ocaml/) to translate the OCaml code to Coq and do our verification effort."},{"id":"/2021/10/10/welcome","metadata":{"permalink":"/blog/2021/10/10/welcome","source":"@site/blog/2021-10-10-welcome.md","title":"\ud83d\ude00 Welcome","description":"Welcome to the blog of Formal Land. Here we will post various updates about the work we are doing.","date":"2021-10-10T00:00:00.000Z","formattedDate":"October 10, 2021","tags":[{"label":"Welcome","permalink":"/blog/tags/welcome"}],"readingTime":0.095,"hasTruncateMarker":false,"authors":[],"frontMatter":{"title":"\ud83d\ude00 Welcome","tags":["Welcome"]},"unlisted":false,"prevItem":{"title":"\ud83d\udc2b Verification of the use of data-encoding","permalink":"/blog/2021/10/27/verification-data-encoding"}},"content":"Welcome to the blog of [Formal Land](/). 
Here we will post various updates about the work we are doing."}]}')}}]); \ No newline at end of file diff --git a/assets/js/b2f554cd.a9c17cae.js b/assets/js/b2f554cd.a9c17cae.js new file mode 100644 index 000000000..ec4724777 --- /dev/null +++ b/assets/js/b2f554cd.a9c17cae.js @@ -0,0 +1 @@ +"use strict";(self.webpackChunkformal_land=self.webpackChunkformal_land||[]).push([[5894],{6042:e=>{e.exports=JSON.parse('{"blogPosts":[{"id":"/2025/01/30/links-for-rust-in-rocq","metadata":{"permalink":"/blog/2025/01/30/links-for-rust-in-rocq","source":"@site/blog/2025-01-30-links-for-rust-in-rocq.md","title":"\ud83e\udd80 Typing and naming of Rust code in Rocq (1/3)","description":"In this article we show how we re-build the type and naming information of \ud83e\udd80 Rust code in  Rocq/Coq, the formal verification system we use. A challenge is to be able to represent arbitrary Rust programs, including the standard library of Rust and the whole of Revm, a virtual machine to run EVM programs.","date":"2025-01-30T00:00:00.000Z","formattedDate":"January 30, 2025","tags":[{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"links","permalink":"/blog/tags/links"},{"label":"simulations","permalink":"/blog/tags/simulations"}],"readingTime":7.485,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Typing and naming of Rust code in Rocq (1/3)","tags":["Rust","links","simulations"],"authors":[]},"unlisted":false,"nextItem":{"title":"\ud83e\udd16 Designing a coding assistant for Rocq","permalink":"/blog/2025/01/21/designing-a-coding-assistant-for-rocq"}},"content":"In this article we show how we re-build the type and naming information of [\ud83e\udd80 Rust](https://www.rust-lang.org/) code in [ Rocq/Coq](https://rocq-prover.org/), the formal verification system we use. 
A challenge is to be able to represent arbitrary Rust programs, including the standard library of Rust and the whole of [Revm](https://github.com/bluealloy/revm), a virtual machine to run [EVM](https://en.wikipedia.org/wiki/Ethereum#Virtual_machine) programs.\\n\\n\x3c!-- truncate --\x3e\\n\\nThis is the continuation of the following article:\\n\\n- [\ud83e\udd80 Translation of the Rust\'s core and alloc crates](/blog/2024/04/26/translation-core-alloc-crates)\\n\\n:::success Ask for the highest security!\\n\\nWhen millions are at stake, bug bounties are not enough. How do you ensure your security audits are exhaustive?\\n\\nThe best way is to use **formal verification**.\\n\\n**Contact us** at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land) to make sure your code is safe! \ud83d\udee1\ufe0f\\n\\nWe cover **Rust**, **Solidity**, and **ZK systems**.\\n\\n:::\\n\\n
\\n ![Green forest](2025-01-30/green-forest.webp)\\n
\\n\\n## \ud83c\udfaf The challenge\\n\\nOur goal is to be able to formally verify large Rust codebases, counting thousands of lines, and without having to modify the code to make it more amenable to formal verification. Our concrete example is the verification of the Revm that includes about 10,000 lines of Rust code, depending on how far we include the dependencies.\\n\\nThis requires to have a methodology of verification that both:\\n\\n- Scales with the size of the codebase. Rust programs often use a lot of abstractions, and we make the choice to keep these abstractions in the formal model. Combined with the expressivity of the Rocq prover, we hope this will ensure we can scale our reasoning.\\n- Supports most of the Rust language, noting that Rust is a complex and feature-rich language.\\n\\nTo make sure our translation from the Rust language to the Rocq system has good support, we generate a translation that is very verbose and rather low-level without interpreting the meaning of the various Rust primitives too much. For example, our translation tool is only about 5,000 lines long. It is written in Rust and uses the APIs of the `rustc` compiler.\\n\\nThis approach leaves the burdens of defining the semantics of Rust and designing the reasoning primitives on the Rocq side.\\n\\n## \ud83d\udedd Strategy\\n\\nWe plan to reason on the translated Rust code with two intermediate steps:\\n\\n1. **Links** These represent a complete rewriting of the translated code, adding type and naming information that are erased during the translation to Rocq. We also prove that this rewriting is equivalent to the initial translation. We hope to automate this step as much as possible.\\n2. **Simulations** In this step we make the less obvious transformations, in particular representing the memory mutations in a clean and custom state monad, as well as various optimizations such as collapsing all the integer types if it helps for the proofs later. 
We also prove that this rewriting is equivalent to the links.\\n\\nAt the end of the **Simulations** step, we should obtain a purely functional and idiomatic representation of the original Rust code in Rocq. This representation should be easier to reason about, and we will be able to formally verify properties of the code.\\n\\nAs a summary, here are the steps we want to follow:\\n\\n
\\n ![Compilation steps](2025-01-30/compilation-steps.svg)\\n
\\n\\n## \ud83e\uddea Example\\n\\nHere is an example from the standard library of Rust, which is used to define other comparison operators:\\n\\n```rust\\npub fn max_by<T, F: FnOnce(&T, &T) -> Ordering>(v1: T, v2: T, compare: F) -> T {\\n match compare(&v1, &v2) {\\n Ordering::Less | Ordering::Equal => v2,\\n Ordering::Greater => v1,\\n }\\n}\\n```\\n\\nThis example is interesting as it uses some abstractions, with polymorphism, traits, closures, and a bit of pointer manipulation. Ideally, we should be able to represent it with Rocq code of a similar size, without the explicit references `&` that are mostly useless in a purely functional setting. But here is the Rocq code we obtain after running [coq-of-rust](https://github.com/formal-land/coq-of-rust):\\n\\n```coq\\nDefinition max_by (\u03b5 : list Value.t) (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \u03b5, \u03c4, \u03b1 with\\n | [], [ T; F ], [ v1; v2; compare ] =>\\n ltac:(M.monadic\\n (let v1 := M.alloc (| v1 |) in\\n let v2 := M.alloc (| v2 |) in\\n let compare := M.alloc (| compare |) in\\n M.read (|\\n M.match_operator (|\\n M.alloc (|\\n M.call_closure (|\\n M.get_trait_method (|\\n \\"core::ops::function::FnOnce\\",\\n F,\\n [],\\n [ Ty.tuple [ Ty.apply (Ty.path \\"&\\") [] [ T ]; Ty.apply (Ty.path \\"&\\") [] [ T ] ] ],\\n \\"call_once\\",\\n [],\\n []\\n |),\\n [\\n M.read (| compare |);\\n Value.Tuple\\n [\\n M.borrow (|\\n Pointer.Kind.Ref,\\n M.deref (| M.borrow (| Pointer.Kind.Ref, v1 |) |)\\n |);\\n M.borrow (|\\n Pointer.Kind.Ref,\\n M.deref (| M.borrow (| Pointer.Kind.Ref, v2 |) |)\\n |)\\n ]\\n ]\\n |)\\n |),\\n [\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (M.find_or_pattern (|\\n \u03b3,\\n [\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (let _ := M.is_struct_tuple (| \u03b3, \\"core::cmp::Ordering::Less\\" |) in\\n Value.Tuple []));\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (let _ := M.is_struct_tuple (| \u03b3, \\"core::cmp::Ordering::Equal\\" |) in\\n Value.Tuple []))\\n ],\\n fun \u03b3 =>\\n 
ltac:(M.monadic\\n match \u03b3 with\\n | [] => ltac:(M.monadic v2)\\n | _ => M.impossible \\"wrong number of arguments\\"\\n end)\\n |)));\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (let _ := M.is_struct_tuple (| \u03b3, \\"core::cmp::Ordering::Greater\\" |) in\\n v1))\\n ]\\n |)\\n |)))\\n | _, _, _ => M.impossible \\"wrong number of arguments\\"\\n end.\\n```\\n\\nThis is extremely verbose and not idiomatic for Rocq! We can see some of the Rust features that are made explicit:\\n\\n- The list of constant generics `\u03b5`, the list of type generics `\u03c4`, and the list of arguments `\u03b1`.\\n- The memory operations `alloc` and `read`, and the pointers manipulations `borrow` and `deref`.\\n- The trait instance resolution with `M.get_trait_method`.\\n- The decomposition of the pattern matching in more elementary operations like `M.is_struct_tuple`.\\n\\nMost of this information comes from the [THIR intermediate representation](https://rustc-dev-guide.rust-lang.org/thir.html) of the code as provided by the Rust compiler.\\n\\nHere is the link definition we will write, proven equivalent to the code above by construction:\\n\\n```coq\\nDefinition run_max_by {T F : Set} `{Link T} `{Link F}\\n (Run_FnOnce_for_F :\\n function.FnOnce.Run\\n F\\n (Ref.t Pointer.Kind.Ref T * Ref.t Pointer.Kind.Ref T)\\n (Output := Ordering.t)\\n )\\n (v1 v2 : T) (compare : F) :\\n {{ cmp.max_by [] [ \u03a6 T; \u03a6 F ] [ \u03c6 v1; \u03c6 v2; \u03c6 compare ] \ud83d\udd3d T }}.\\nProof.\\n destruct Run_FnOnce_for_F as [[call_once [H_call_once run_call_once]]].\\n run_symbolic.\\n eapply Run.CallPrimitiveGetTraitMethod. {\\n apply H_call_once.\\n }\\n run_symbolic.\\n eapply Run.CallClosure. {\\n apply (run_call_once compare (Ref.immediate _ v1, Ref.immediate _ v2)).\\n }\\n intros [ordering |]; cbn; [|run_symbolic].\\n destruct ordering; run_symbolic.\\nDefined.\\n```\\n\\nThe beginning of the definition corresponds to the trait resolution and calls to the `compare` function. 
The last part with `destruct ordering` is the representation of the `match` statement in the Rust code. With this definition, we add explicit Rocq types instead of the universal `Value.t` type of the translated code and make explicit the trait resolution. The trait instance has to be provided as an explicit parameter with the `Run_FnOnce_for_F` argument.\\n\\nWith the statement:\\n\\n```coq\\n{{ cmp.max_by [] [ \u03a6 T; \u03a6 F ] [ \u03c6 v1; \u03c6 v2; \u03c6 compare ] \ud83d\udd3d T }}\\n```\\n\\nwe say that the translated function `cmp.max_by` has a \\"link\\" definition, built implicitly in the proof, returning a value of type `T`. We can extract the definition of this function calling the primitive:\\n\\n```coq\\nevaluate : forall {Output : Set} `{Link Output} {e : M},\\n {{ e \ud83d\udd3d Output }} ->\\n LowM.t (Output.t Output)\\n```\\n\\nIt returns a \\"link\\" computation in the `LowM.t` monad. The output is often unreadable as it is, but we can step through it by symbolic execution. This will be useful for the next step to define and prove equivalent the \\"simulations\\".\\n\\n## \ud83d\udd2e Link\'s monad\\n\\nLike the monad used for the translation of Rust programs by `coq-of-rust`, the link\'s monad is a free monad but with fewer primitive operations. The primitive operations are only related to the memory handling:\\n\\n```coq\\nInductive t : Set -> Set :=\\n| StateAlloc {A : Set} `{Link A} (value : A) : t (Ref.Core.t A)\\n| StateRead {A : Set} `{Link A} (ref_core : Ref.Core.t A) : t A\\n| StateWrite {A : Set} `{Link A} (ref_core : Ref.Core.t A) (value : A) : t unit\\n| GetSubPointer {A Sub_A : Set} `{Link A} `{Link Sub_A}\\n (ref_core : Ref.Core.t A) (runner : SubPointer.Runner.t A Sub_A) :\\n t (Ref.Core.t Sub_A).\\n```\\n\\nCompared to the side effects in the generated translation, we eliminate all the operations related to name handling (trait resolution, function calls, etc.). 
We also always use explicit types instead of the universal `Value.t` type and get rid of the `M.impossible` operation that was necessary to represent impossible branches in the absence of types.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have presented our general strategy to formally verify large Rust codebases. In the next blog posts, we will go into more details to look at the definition of the proof of equivalence for the links, and at how we automate the most repetitive parts of the proofs.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2025/01/21/designing-a-coding-assistant-for-rocq","metadata":{"permalink":"/blog/2025/01/21/designing-a-coding-assistant-for-rocq","source":"@site/blog/2025-01-21-designing-a-coding-assistant-for-rocq.md","title":"\ud83e\udd16 Designing a coding assistant for Rocq","description":"This blog post provides a review of the existing literature on agent-based systems for automated theorem proving, while presenting a general approach to the problem. 
Additionally, it serves as an informal specification outlining the requirements for a future system we intend to develop.","date":"2025-01-21T00:00:00.000Z","formattedDate":"January 21, 2025","tags":[{"label":"llm","permalink":"/blog/tags/llm"},{"label":"ai","permalink":"/blog/tags/ai"}],"readingTime":9.29,"hasTruncateMarker":true,"authors":[{"name":"Andrea Delmastro","url":"https://github.com/andreadlm","imageURL":"https://github.com/andreadlm.png","key":"andrea_delmastro"}],"frontMatter":{"title":"\ud83e\udd16 Designing a coding assistant for Rocq","tags":["llm","ai"],"authors":["andrea_delmastro"]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Typing and naming of Rust code in Rocq (1/3)","permalink":"/blog/2025/01/30/links-for-rust-in-rocq"},"nextItem":{"title":"\ud83e\udd80 Verification of one instruction of the Move\'s type-checker","permalink":"/blog/2025/01/13/verification-one-instruction-sui"}},"content":"This blog post provides a review of the existing literature on agent-based systems for automated theorem proving, while presenting a general approach to the problem. Additionally, it serves as an informal specification outlining the requirements for a future system we intend to develop.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Ask for the highest security!\\n\\nTo ensure your code is fully secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nWe exclusively focus on formal verification to offer you the highest degree of security for your application.\\n\\nWe cover **Rust**, **Solidity**, and **zero-knowledge** projects.\\n\\n:::\\n\\n## \ud83c\udfaf Our goal\\nWe aim to develop an integrated coding assistant for the proof assistant [ Rocq/Coq](https://rocq-prover.org/) within [Visual Studio Code](https://code.visualstudio.com/). 
Despite recent advancements in artificial intelligence, the challenge of creating systems that effectively assist users in writing formal verification code remains unresolved. Our primary focus is on providing support for theorem proving, which we consider the most compelling aspect of the task; other functionalities, such as definition writing, may be explored in future work.\\n\\n## \ud83c\udf33 Automated theorem proving as a search in a state space\\nA coding assistant for a proof assistant can take advantage of a fundamental property that is not possessed by traditional programming languages: it is always possible to deterministically verify whether the code generated for a demonstration is correct (or simply, not incorrect). It only requires the code to be well-typed. More broadly, the assistant can track the progress of the solution.\\nA proof can be seen as a sequence of tactics, each of which modifies the current goal. Consequently, the proof construction process can be framed as a search through a state space. Using classical terminology for such problems, we can categorize the components of our system as follows:\\n\\n* **state**: enriched representation of the current goal\\n* **starting state**: initial goal associated with the theorem (its definition)\\n* **arrival state**: closed goal\\n* **actions**: tactics\\n\\n
\\n
\\n ![Tree search simple](2025-01-21/tree_search_simple.svg)\\n
\\n
\\n\\nCertain states can be pruned if they do not meet some conditions, such as error states, those where with certainty no progress has been made (e.g., the [copra](https://github.com/trishullab/copra) system proposes a simple symbolic approach to recognize some trivial cases) or if too many attempts have already been made at a certain node.\\n\\nThe set of tactics that can be applied in a given state is potentially infinite. To guide the search, one or more oracles are queried, which provide suggestions on applicable tactics. These oracles can be either LLM-based agents or traditional symbolic procedures (e.g., [CoqHammer](https://coqhammer.github.io/)). Multiple oracles may coexist, offering alternative solutions. Two examples of LLM-based oracles are:\\n\\n* oracle that produces a list of $k$ possible alternative tactics to be applied at a given goal;\\n* oracle that produces a complete demonstration for a given goal.\\n\\nIt is not obvious whether a procedure that proceeds in depth or one that proceeds in breadth is preferable. As is often the case with research problems, a \\"hybrid\\" approach might be preferable. In any case, one could imagine ordering the frontier on the basis of how promising a certain state is, thereby guiding the search process. The problem of determining whether one state is more promising than another through heuristics (a kind of \\"distance\\" from the successful state) is certainly interesting and would merit future study.\\n\\nOne can imagine two procedures, one in breadth and one in depth. From the union of the two, a hybrid solution could be devised.\\n\\n
\\n
\\n ![Depth search](2025-01-21/depth_tree_search.svg)\\n
Depth search
\\n
\\n
\\n\\n
\\n
\\n ![Breadth search](2025-01-21/breadth_tree_search.svg)\\n
Breadth search by beam search: in this specific case, a heuristic is employed to limit the number of expanded nodes
\\n
\\n
\\n\\nThe system should be flexible enough and allow for the implementation of different versions of the search algorithm that could be refined as the work progresses.\\n\\n### Learning from errors\\nBy leveraging the ability of an LLM to generate an infinite number of tactics and possibly update the prompt to refine the query, errors can be exploited to generate new tactics. For example, a node (state) might not be closed as soon as it is expanded, but it could be re-expanded in the future, enriched with the knowledge of past errors.\\n\\n
\\n
\\n ![Tree errors](2025-01-21/tree_errors.svg)\\n
\\n
\\n\\n## \ud83d\udc68\ud83c\udffb\u200d\ud83d\udcbb Integration with the user\\nThe system must integrate forms of communication and interaction with the user, which guide the user\'s construction of the proof. In this context, the coding assistant is not envisioned as a fully automated proof tool, but rather as a coding companion that leverages the support of the human developer. For instance, such a companion system does not necessarily need to complete the proof, but could instead generate partial solutions, offering the user multiple incomplete options and allowing them to select the one they deem most appropriate as a starting point.\\n\\n## \ud83d\udd27 The technology stack\\nThe system is designed to be distributed as a Visual Studio Code extension. This approach offers several advantages, including access to the editor\'s extensive ecosystem of APIs, which facilitates seamless integration into standard development workflows and user interactions. Additionally, it simplifies the publishing and installation processes.\\n\\n
\\n
\\n ![Tech stack](2025-01-21/tech_stack.svg)\\n
\\n
\\n\\n### Large language models\\nModels and ad-hoc architectures for theorem proving have been proposed in the literature, including [ReProver](https://github.com/lean-dojo/ReProver) (for [Lean](https://lean-lang.org/)). The maintenance cost and the complexity of configuring and adapting these systems are generally high. More simply, commercial versions of the most common LLMs (GPTs, ...) can be used, leveraging prompt-engineering techniques. Several papers demonstrate that results comparable to those of state-of-the-art models can be achieved using such approaches.\\n\\n### The VSCode API ecosystem\\nVSCode offers a rich ecosystem of APIs that can be used to integrate an extension with common development processes, simplify user interaction, and communicate with external tools. The new [Language Model API](https://code.visualstudio.com/api/extension-guides/language-model) is particularly useful for our purposes. It offers a common interface as well as tools to simplify communication with popular LLMs. Through a [Proposed API](https://github.com/microsoft/vscode/blob/main/src/vscode-dts/vscode.proposed.chatProvider.d.ts), it is also possible to integrate local models, which are useful mainly in the testing phase of the first iterations. VSCode\'s [Language API](https://code.visualstudio.com/api/references/vscode-api#languages) simplifies the development and integration of an LSP client.\\n\\n\\n### Language server\\nLanguage analysis capabilities are provided by the language server [Coq-LSP](https://github.com/ejgallego/coq-lsp), which has recently been released as part of the [P\xe9tanque](https://github.com/ejgallego/coq-lsp/tree/main/petanque) project, a lightweight environment for intensive applications targeted at automated theorem-proving projects and especially at agent-based systems. 
P\xe9tanque operates as a [Gymnasium](https://gymnasium.farama.org/index.html) environment and has already been successfully used in the [NLIR](https://github.com/LLM4Coq/nlir) system. \\nAt the architectural level, Coq-LSP (and P\xe9tanque) operates as a server towards the coding assistant (the client), providing some functionality in the form of an API via an extended version of the [LSP](https://microsoft.github.io/language-server-protocol/) protocol, including:\\n\\n* obtaining the current goal for a given theorem,\\n* obtaining the location of a given theorem\'s definition,\\n\\nand many other functionalities commonly accessible through IDEs.\\n\\n## \ud83e\udde0 The agent perspective\\nAn alternative description of the system can be accomplished from the agent\'s perspective. We understand an _agent_ as a software system whose behavior is conditioned by an environment that it can actively alter by performing some actions whose effects condition subsequent observations and, consequently, its future choices.\\n\\n
\\n
\\n ![Agent RL](2025-01-21/agent_rl.svg)\\n
\\n
\\n\\nBased on the above definition, we can attempt to classify the components of our system within a classic agent context as follows:\\n\\n* **Agent**: prompting + large language model + parsing\\n* **Environment**: user, language server and search algorithm\\n* **Actions**: tactics\\n* **Observations**: current goal, examples, definitions, ...\\n\\n
\\n ![Agent loop](2025-01-21/agent_loop.svg)\\n
\\n\\nLet us recall that in the proposed general architecture, the agent is only one of the possible types of _oracle_ from which we can obtain useful information to advance the demonstration (albeit the most interesting one), and that multiple _oracles_ can coexist at the same time, e.g., agents implementing different resolution strategies or configurations.\\n\\nThe agent is designed to interact with various components of the environment, for example, by requesting examples, additional information, or simple advice from the user; the current goal; the list of previously attempted and failed tactics from the search algorithm; or semantic data from the language server. The interaction process could be deliberative (guided by the LLM\'s reasoning) or, more simply, a form of abstraction for a predetermined set of information we intend to request. This interaction with the environment is crucial for defining the goal, enriching the agent\'s context, and generating input for the prompt. In a recent [blog post](/blog/2025/01/06/annotating-what-we-are-doing), we documented our interest in internally gathering as much information as possible about the recurring human processes of building a demonstration in order to standardize and emulate them within the agent\'s interaction logic with the environment.\\n\\nSeveral prompting techniques can be tried. In the [NLIR](https://github.com/LLM4Coq/nlir) system, the prompt is gradually refined through a chain-of-thought approach: first, a natural language response is requested from the LLM, and this response is then used to generate a more precise Rocq code response.\\n\\nOnce the LLM produces an output (whether a tactic, a list of tactics, or a complete proof, depending on the type of agent), the response is parsed and executed within the environment via P\xe9tanque. 
If an error occurs, the environment is either reverted to its previous state (backtracking), or an error recovery technique is applied (e.g., replacing the problematic code with `admit.`).\\n\\n## \ud83d\udcca Evaluation strategies\\nBenchmarks used for evaluating automated theorem-proving systems tend to be limited to classical mathematics, focusing on systems that are _completely automated_ rather than on those that _support_ proof writing.\\nAs a result, these benchmarks can be misleading with regard to the practical utility of such tools in real-world contexts. In addition to traditional evaluation methods, the system should be tested in practical scenarios, such as by applying it to ongoing formal verification projects within [Formal Land](/).\\n\\nA second critical consideration when evaluating a practical support tool is the cost per request. Integrated LLMs should not be viewed as infinite resources, but rather as constrained resources whose usage must be optimized and minimized, even if that means sacrificing some efficiency.\\n\\n## \ud83d\uddc2\ufe0f Similar projects and resources\\nResearch on fully autonomous automated theorem proving with proof assistants is very active and has received a strong boost since the advent of LLMs. 
The proposed system architecture has been influenced by the following works:\\n- LeanCopilot ([code](https://github.com/lean-dojo/LeanCopilot), [paper](https://arxiv.org/abs/2404.12534))\\n- copra ([code](https://github.com/trishullab/copra), [paper](https://arxiv.org/abs/2310.04353))\\n- CoqPilot ([code](https://github.com/JetBrains-Research/coqpilot), [paper](https://arxiv.org/abs/2410.19605))\\n- NLIR ([code](https://github.com/LLM4Coq/nlir), [paper](https://openreview.net/forum?id=QzOc0tpdef))\\n\\nOther interesting resources to further explore this topic:\\n- \ud83d\udcfd\ufe0f [Lean Together 2025: Jason Rute, The last mile](https://www.youtube.com/watch?v=Yr8dzfVkeHg)\\n\\n## \ud83e\udd61 Key takeaway\\n* The agent perspective and the search perspective are here complemented in a single system\\n* The automatic demonstration process can be seen as a sophisticated search in a space of states\\n* The system must be flexible overall and adapt to different refinements that might be decided in the process\\n* In a support tool, completeness of proof is not mandatory\\n* User interaction is crucial\\n* Evaluation must be carried out in a practical context\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! 
Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2025/01/13/verification-one-instruction-sui","metadata":{"permalink":"/blog/2025/01/13/verification-one-instruction-sui","source":"@site/blog/2025-01-13-verification-one-instruction-sui.md","title":"\ud83e\udd80 Verification of one instruction of the Move\'s type-checker","description":"This is the last article of a series of blog posts presenting our formal verification effort in  Rocq/Coq to ensure the correctness of the type-checker of the Move language for Sui.","date":"2025-01-13T00:00:00.000Z","formattedDate":"January 13, 2025","tags":[{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Move","permalink":"/blog/tags/move"},{"label":"Sui","permalink":"/blog/tags/sui"},{"label":"type-checker","permalink":"/blog/tags/type-checker"}],"readingTime":5.73,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Verification of one instruction of the Move\'s type-checker","tags":["Rust","Move","Sui","type-checker"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd16 Designing a coding assistant for Rocq","permalink":"/blog/2025/01/21/designing-a-coding-assistant-for-rocq"},"nextItem":{"title":"\ud83e\udd16 Annotating what we are doing for an LLM to pick up","permalink":"/blog/2025/01/06/annotating-what-we-are-doing"}},"content":"This is the last article of a series of blog posts presenting our formal verification effort in [ Rocq/Coq](https://rocq-prover.org/) to ensure the correctness of the type-checker of the [Move language](https://sui.io/move) for [Sui](https://sui.io/).\\n\\nHere we show how the formal proof works to check that the type-checker is correct on a particular instruction, for any possible initial state. 
The general idea is to symbolically execute the code step by step on the type-checker side, accumulating properties about the stack assuming the type-checker succeeds, and then to show that the interpreter will produce a stack of the expected type as a result.\\n\\n\x3c!-- truncate --\x3e\\n\\nPrevious post:\\n\\n- [\ud83e\udd80 Example of verification for the Move\'s checker of Sui](/blog/2024/11/14/sui-move-checker-abstract-stack)\\n\\n:::success Ask for the highest security!\\n\\nWhen millions are at stake, bug bounties are not enough.\\n\\nHow do you ensure your security audits are exhaustive?\\n\\nThe best way to do this is to use **formal verification**.\\n\\nThis is what we provide as a service. **Contact us** at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land) to make sure your code is safe! \ud83d\udee1\ufe0f\\n\\nWe cover **Rust**, **Solidity**, and **ZK systems**.\\n\\n:::\\n\\n
\\n ![Green forest with water](2025-01-13/green-forest-with-water.webp)\\n
\\n\\n## \ud83e\udd80 The Rust code\\n\\nWe are verifying the type-checking for the Move bytecode instruction `CastU8`. This instruction takes the top-most element of the stack, checks that it is an integer, and pushes it back on the stack as a `U8` if it is in the right range or fails with an error `StatusCode::ARITHMETIC_ERROR` otherwise.\\n\\nHere is the code of the interpreter:\\n\\n```rust\\nBytecode::CastU8 => {\\n gas_meter.charge_simple_instr(S::CastU8)?;\\n let integer_value =\\n interpreter.operand_stack.pop_as::<IntegerValue>()?;\\n interpreter\\n .operand_stack\\n .push(Value::u8(integer_value.cast_u8()?))?;\\n}\\n```\\n\\nWe ignore the gas metering for now. The `pop_as` method pops the top-most element of the stack and checks that it is an integer. The `cast_u8` method checks that the integer is in the right range (`0` to `255`) and returns the value as a `U8`. The `push` method pushes the value back on the stack. The question mark operator `?` is used to propagate errors.\\n\\nHere is the corresponding code in the type-checker:\\n\\n```rust\\nBytecode::CastU8 => {\\n let operand = safe_unwrap_err!(verifier.stack.pop());\\n if !operand.is_integer() {\\n return Err(verifier.error(\\n StatusCode::INTEGER_OP_TYPE_MISMATCH_ERROR,\\n offset,\\n ));\\n }\\n verifier.push(meter, ST::U8)?;\\n}\\n```\\n\\nIt pops the top-most element of the stack of types (we do not have values here) and checks that it is an integer type. If it is not, it returns an error. Otherwise, it pushes the type `U8` on the stack. Note that there is no way to know, in the type-checker, if the value is in the right range.\\n\\n## \ud83d\udc26\u200d\u2b1b The Rocq translation\\n\\nIn previous posts, we covered our manual translation of the Rust code in Rocq. We repeat it here. The interpreter code in Rocq:\\n\\n```coq\\n| Bytecode.CastU8 =>\\n letS!? integer_value := liftS! State.Lens.interpreter (\\n liftS! Interpreter.Lens.operand_stack $\\n Stack.Impl_Stack.pop_as IntegerValue.t\\n ) in\\n letS!? 
integer_value :=\\n returnS! $ IntegerValue.cast_u8 integer_value in\\n doS!? liftS! State.Lens.interpreter (\\n liftS! Interpreter.Lens.operand_stack $\\n Stack.Impl_Stack.push $\\n ValueImpl.U8 integer_value\\n ) in\\n returnS!? InstrRet.Ok\\n```\\n\\nThe type-checker code in Rocq:\\n\\n```coq\\n| Bytecode.CastU8 => \\n letS! operand :=\\n liftS! TypeSafetyChecker.lens_self_stack AbstractStack.pop in\\n letS! operand := return!toS! $ safe_unwrap_err operand in\\n if negb $ SignatureToken.is_integer operand then\\n returnS! $\\n Result.Err $\\n TypeSafetyChecker.Impl_TypeSafetyChecker.error\\n verifier StatusCode.INTEGER_OP_TYPE_MISMATCH_ERROR offset\\n else\\n TypeSafetyChecker.Impl_TypeSafetyChecker.push SignatureToken.U8\\n```\\n\\n## \ud83d\udcdc Formal statement\\n\\nHere is the formal statement of the property we want to prove to ensure the correctness of the type-checker:\\n\\n```coq\\nLemma progress\\n (* [...] parameters and hypothesis *)\\n (* We assume that the initial state is well-typed *)\\n IsInterpreterContextOfType.t locals interpreter type_safety_checker ->\\n match\\n verify_instr instruction pc type_safety_checker,\\n execute_instruction ty_args function resolver instruction state\\n with\\n | Panic.Value (Result.Ok _, type_safety_checker\'),\\n Panic.Value (Result.Ok _, state\') =>\\n let \'{|\\n State.pc := _;\\n State.locals := locals\';\\n State.interpreter := interpreter\';\\n |} := state\' in\\n IsInterpreterContextOfType.t locals\' interpreter\' type_safety_checker\'\\n (* If the type-checker succeeds, then the interpreter cannot return a panic *)\\n | Panic.Value (Result.Ok _, _), Panic.Panic _ => False\\n (* Other errors are allowed *)\\n | Panic.Value (Result.Ok _, _), Panic.Value (Result.Err _, _)\\n | Panic.Value (Result.Err _, _), _\\n | Panic.Panic _, _ => True\\n end.\\n```\\n\\nThis lemma is in the file 
[proofs/move_bytecode_verifier/type_safety.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/move_sui/proofs/move_bytecode_verifier/type_safety.v). It compares the behavior of the type-checker and the interpreter when executing an instruction. If the type-checker succeeds, then the interpreter cannot return a panic. If the interpreter also succeeds, then the new state is well-typed according to the types returned by the type-checker.\\n\\n## \ud83d\udee1\ufe0f Proof time\\n\\nWe prove the statement above by reasoning about all possible instructions. For the `CastU8` instruction, the Rocq proof is as follows:\\n\\n```coq\\n{ guard_instruction Bytecode.CastU8.\\n destruct_abstract_pop.\\n step; cbn; [exact I|].\\n destruct_abstract_push.\\n step; cbn; (try easy); (try now destruct operand_ty);\\n repeat (step; cbn; try easy);\\n constructor; cbn; try assumption;\\n sauto lq: on.\\n}\\n```\\n\\nHere is what this script does:\\n\\n- `guard_instruction Bytecode.CastU8` checks that the current instruction is `CastU8`. This helps debugging if we are not at the right place.\\n- `destruct_abstract_pop` pops the top-most element of the stack of types and gives it the name `operand_ty`. It handles the cases where the stack is empty.\\n- `step; cbn; [exact I|]` is a command to handle the next `if` in the code of the type-checker. We are only interested in the success branch (`else` branch in this case).\\n- `destruct_abstract_push` pushes the type `U8` on the stack of types.\\n\\nThen, there is a set of automated tactics iterating over all the possible types of values that can be on the stack. 
Just before the end of the proof, we have the following proof state:\\n\\n```coq\\n---------------------------------------\\n(1/6)\\nList.Forall2 IsValueImplOfType.t (ValueImpl.U8 z :: x0)\\n (SignatureToken.U8 :: AbstractStack.flatten stack_ty0)\\n```\\n\\nThis proof state is repeated identically six times, once for each possible integer type (`U8`, `U16`, `U32`, `U64`, `U128`, `U256`). It says that the stack of values:\\n\\n```coq\\nValueImpl.U8 z :: x0\\n```\\n\\nmust have the stack of types:\\n\\n```coq\\nSignatureToken.U8 :: AbstractStack.flatten stack_ty0\\n```\\n\\nThe value `z` is the result of the `cast_u8` function in the interpreter. The `flatten` function is used to flatten the stack of types that may contain duplicates.\\n\\nFor the head of the stack the property is trivially true. For the tail of the stack, we use one of the hypotheses from the context, coming from the fact that the stack was initially well-typed and we did not modify the tail.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have shown how to formally verify that the type-checker for Move\'s bytecode virtual machine is correct on a simple instruction `CastU8`. This is part of a larger effort to ensure the correctness of the whole type-checker.\\n\\nOther instructions operating on atomic types (integers, booleans, addresses) are similar to this one. The most complex instructions are the ones operating on references and data structures like vectors and structs. These require more work, and we have not yet tackled them.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! 
Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2025/01/06/annotating-what-we-are-doing","metadata":{"permalink":"/blog/2025/01/06/annotating-what-we-are-doing","source":"@site/blog/2025-01-06-annotating-what-we-are-doing.md","title":"\ud83e\udd16 Annotating what we are doing for an LLM to pick up","description":"We want to write a series of blog posts about our efforts to use LLMs to formally verify code faster with the  Rocq/Coq theorem prover. Here, we present an experiment consisting of writing all that we are doing so that we can document our reasoning and help LLMs to pick up human techniques.","date":"2025-01-06T00:00:00.000Z","formattedDate":"January 6, 2025","tags":[{"label":"llm","permalink":"/blog/tags/llm"},{"label":"ai","permalink":"/blog/tags/ai"}],"readingTime":3.795,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd16 Annotating what we are doing for an LLM to pick up","tags":["llm","ai"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Verification of one instruction of the Move\'s type-checker","permalink":"/blog/2025/01/13/verification-one-instruction-sui"},"nextItem":{"title":"\ud83e\udd84 Mutually recursive functions with notation","permalink":"/blog/2024/12/26/mutually-recursive-functions-with-notation"}},"content":"We want to write a series of blog posts about our efforts to use LLMs to formally verify code faster with the [ Rocq/Coq](https://rocq-prover.org/) theorem prover. Here, we present an experiment consisting of writing all that we are doing so that we can document our reasoning and help LLMs to pick up human techniques.\\n\\nAccording to many publications about using generative AI to help formal verification, it is almost impossible to find a proof in \\"one shot\\". So, one certainly has to interact with the system, maybe by following the human way. 
Here we aim to document this \\"human way\\" of writing proofs.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Ask for the highest security!\\n\\nHow do you ensure your security audits are exhaustive?\\n\\nWhen millions are at stake, bug bounties are not enough.\\n\\nThe only way to do this is to use **formal verification** to _prove_ your code is correct.\\n\\nThis is what we provide as a service. **Contact us** at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land) to ensure your code is safe! \ud83d\ude80\\n\\nWe cover **Rust**, **Solidity**, and soon **zk circuits**.\\n\\n:::\\n\\n
\\n ![Robot](2025-01-06/robot-forest.webp)\\n
\\n\\n## \ud83d\udd0d Example\\n\\nWe take as an example our verification effort for the type-checker of the Move language. We have a big lemma to verify with 77 cases, one per Move instruction. We now write everything we do in a single linear document [what_we_do.md](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/what_we_do.md). Here is an extract:\\n\\n>Now a previous case is failing:\\n>\\n>```\\n>hauto l: on.\\n>```\\n>\\n>with:\\n>\\n>```\\n>Error: hauto failed\\n>```\\n>\\n>As this is a tactic generated by `best`, we try to use `best` again. It works! We continue and arrive at our current goal. Out of curiosity, we try `best` again. It works! The idea is that since we made weaker the definition of what we want to prove, maybe we can now solve it automatically.\\n>\\n>We have six cases which are solved by `best`:\\n>\\n>```\\n>{ best. }\\n>{ best. }\\n>{ best. }\\n>{ best. }\\n>{ best. }\\n>{ best. }\\n>```\\n>\\n>We replace it by `; best` after the block of previous tactics:\\n>\\n>```\\n>step; cbn; (try easy); (try now destruct operand_ty);\\n> repeat (step; cbn; try easy);\\n> constructor; cbn; try assumption;\\n> best.\\n>```\\n>\\n>It works! By running `make` again we get that we can replace the `best` by `\\n>\\n>So now we have done the `Bytecode.CastU8` case.\\n\\nWe document both our successes and failures, as this is what we do when we interact with the system to try to find the proof of a property.\\n\\n## \ud83d\udc06 Quick takeaways\\n\\nThis is time-consuming. Hopefully, this pays off in the long run. There may be a way to automatically record what we are doing, by recording the user interactions in a VSCode plugin. 
In addition, when writing what we do by hand, we might forget to write some important but seemingly obvious steps, like looking into another file, due to laziness.\\n\\nThe autocomplete from GitHub Copilot, while writing the document, already generated the right steps to do from the journal we are writing, like \\"compile the project again\\" or a good tactic to try.\\n\\nWe realize that we have a lot to write in consolidated documents and that much of what we do follows coding conventions we have adopted. These might not be the ones used by everyone, so we have to distinguish between our conventions and general Rocq knowledge.\\n\\nThere is a lot of domain-specific knowledge that only a human can provide and that is specific to each project. For example, here, a human has to give hints related to how the Move type-checker is implemented, which can only be understood by reading the source code.\\n\\nHere we try to give some sense of mid-level intuitions: how to navigate the project, go to a definition, add a new property, ... We do not focus too much on the details of the tactics to use (more low-level), or the high-level intuition behind the proof which might be better done by a human.\\n\\nThis helps to understand how an LLM thinks and which information it has access to.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have quickly presented the idea of writing what we are doing along the way to help LLMs understand how to verify some code.\\n\\nPlease tell us what you think or if you have some ideas for improving this process!\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! 
Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/12/26/mutually-recursive-functions-with-notation","metadata":{"permalink":"/blog/2024/12/26/mutually-recursive-functions-with-notation","source":"@site/blog/2024-12-26-mutually-recursive-functions-with-notation.md","title":"\ud83e\udd84 Mutually recursive functions with notation","description":"In this blog post, we present a technique with the  Rocq/Coq theorem prover to define mutually recursive functions using a notation. This is sometimes convenient for types defined using a container type, such as types depending on a list of itself.","date":"2024-12-26T00:00:00.000Z","formattedDate":"December 26, 2024","tags":[{"label":"recursion","permalink":"/blog/tags/recursion"},{"label":"notation","permalink":"/blog/tags/notation"},{"label":"mutual","permalink":"/blog/tags/mutual"}],"readingTime":3.735,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd84 Mutually recursive functions with notation","tags":["recursion","notation","mutual"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd16 Annotating what we are doing for an LLM to pick up","permalink":"/blog/2025/01/06/annotating-what-we-are-doing"},"nextItem":{"title":"\ud83d\udc7b Translation of Circom to Coq","permalink":"/blog/2024/12/20/translation-of-circom-to-coq"}},"content":"In this blog post, we present a technique with the [ Rocq/Coq](https://rocq-prover.org/) theorem prover to define mutually recursive functions using a notation. This is sometimes convenient for types defined using a container type, such as types depending on a list of itself.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Ask for the highest security!\\n\\nTo ensure your code is fully secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! 
\ud83d\ude80\\n\\nWe exclusively focus on formal verification to offer you the highest degree of security for your application.\\n\\nWe currently work with some of the leading blockchain entities, such as:\\n\\n- The [Ethereum Foundation](https://ethereum.foundation/)\\n- The [Sui Foundation](https://sui.io/about)\\n- Previously, the [Aleph Zero](https://alephzero.org/) and [Tezos](https://tezos.com/) foundations\\n\\n:::\\n\\n
\\n ![Forest](2024-12-26/two-trees.jpg)\\n
\\n\\n## \ud83d\udd0d Example\\n\\nHere is a typical example of a type defined using a container of itself, written in [\ud83e\udd80 Rust](https://www.rust-lang.org/):\\n\\n```rust\\nstruct Trees<A>(Vec<Tree<A>>);\\n\\nenum Tree<A> {\\n Leaf,\\n Node { data: A, children: Trees<A> },\\n}\\n```\\n\\nThese two definitions are mutually dependent. We choose to represent it in Rocq/Coq with the following definition:\\n\\n```coq\\nInductive Tree (A : Set) : Set :=\\n| Leaf : Tree A\\n| Node : A -> list (Tree A) -> Tree A.\\n\\nDefinition Trees (A : Set) : Set :=\\n list (Tree A).\\n```\\n\\nIf we define a recursive function on this type, for example, to compute the sum of all the values in the tree, we would naturally write a function that iterates both on:\\n\\n- The tree constructors,\\n- The list of the `Node` case.\\n\\n## \ud83d\udcdd First solution\\n\\nHere is a first attempt to define a `sum` function that adds all the elements of the tree:\\n\\n```coq\\nFixpoint sum_tree {A : Set} (f : A -> nat) (t : Tree A) : nat :=\\n match t with\\n | Leaf => 0\\n | Node a ts => f a + sum_trees f ts\\n end\\n\\nwith sum_trees {A : Set} (f : A -> nat) (ts : Trees A) : nat :=\\n match ts with\\n | nil => 0\\n | t :: ts => sum_tree f t + sum_trees f ts\\n end.\\n```\\n\\nThis definition does not work as the `Tree` type is not mutually recursive, but the function `sum_tree` is. The error message is:\\n\\n```\\nError: Cannot guess decreasing argument of fix.\\n```\\n\\nA first solution is to define the function `sum_trees` as a local definition in `sum_tree`:\\n\\n```coq\\nFixpoint sum_tree {A : Set} (f : A -> nat) (t : Tree A) : nat :=\\n let fix sum_trees (ts : Trees A) : nat :=\\n match ts with\\n | nil => 0\\n | t :: ts => sum_tree f t + sum_trees ts\\n end in\\n match t with\\n | Leaf => 0\\n | Node a ts => f a + sum_trees ts\\n end.\\n```\\n\\nThis definition gets accepted by the prover!\\n\\n## \ud83d\ude80 Second solution\\n\\nAn issue is that we cannot call `sum_trees` directly as its definition is hidden inside the definition of `sum_tree`. 
This is a problem if further top-level definitions depend on `sum_trees`, or if we want to verify intermediate properties about `sum_trees` itself.\\n\\nA solution we use for this kind of problem is to add a notation to make `sum_trees` a top-level definition while keeping the mutual recursion with `sum_tree`:\\n\\n```coq\\nReserved Notation \\"\'sum_trees\\".\\n\\nFixpoint sum_tree {A : Set} (f : A -> nat) (t : Tree A) : nat :=\\n match t with\\n | Leaf => 0\\n | Node a ts => f a + \'sum_trees _ f ts\\n end\\n\\nwhere \\"\'sum_trees\\" := (fix sum_trees (A : Set) (f : A -> nat) (ts : Trees A) : nat :=\\n match ts with\\n | nil => 0\\n | t :: ts => sum_tree f t + sum_trees _ f ts\\n end).\\n\\nDefinition sum_trees {A : Set} := \'sum_trees A.\\n```\\n\\nHere, both `sum_tree` and `sum_trees` are defined as top-level, and the mutually recursive definition is accepted. Note that we have to make the type `A` explicit in the notation, as implicit parameters are not allowed there.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have shown a technique that is sometimes useful for us to define complex, mutually dependent data structures. This was recently useful for defining the `ValueImpl` type in the type-checker of [Move](https://sui.io/move) for the blockchain [Sui](https://sui.io/).\\n\\nYou can tell us what you think or if you prefer another way to define mutually recursive functions!\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! 
Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/12/20/translation-of-circom-to-coq","metadata":{"permalink":"/blog/2024/12/20/translation-of-circom-to-coq","source":"@site/blog/2024-12-20-translation-of-circom-to-coq.md","title":"\ud83d\udc7b Translation of Circom to Coq","description":"In this post, we present the beginning of our work to translate programs written in the Circom circuit language to the \ud83d\udc13 Coq proof assistant. This work is part of our research on the formal verification of zero-knowledge systems.","date":"2024-12-20T00:00:00.000Z","formattedDate":"December 20, 2024","tags":[{"label":"Circom","permalink":"/blog/tags/circom"},{"label":"zero-knowledge","permalink":"/blog/tags/zero-knowledge"}],"readingTime":10.84,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc7b Translation of Circom to Coq","tags":["Circom","zero-knowledge"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd84 Mutually recursive functions with notation","permalink":"/blog/2024/12/26/mutually-recursive-functions-with-notation"},"nextItem":{"title":"\ud83e\udd84 How does formal verification of smart contracts work?","permalink":"/blog/2024/12/20/what-is-formal-verification-of-smart-contracts"}},"content":"In this post, we present the beginning of our work to translate programs written in the [Circom](https://iden3.io/circom) circuit language to the [\ud83d\udc13 Coq](https://coq.inria.fr/) proof assistant. This work is part of our research on the formal verification of zero-knowledge systems.\\n\\nWe will aim to write more regularly about what we are doing, even if the posts are then shorter. Here, we focus on the translation part for a simple example without defining a semantics for the generated Coq code.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Ask for the highest security!\\n\\nTo ensure your code is fully secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! 
\ud83d\ude80\\n\\nWe exclusively focus on formal verification to offer you the highest degree of security for your application.\\n\\nWe are already working with some of the leading blockchain entities, such as:\\n\\n- The [Ethereum Foundation](https://ethereum.foundation/)\\n- The [Sui Foundation](https://sui.io/about)\\n- Previously, the [Aleph Zero](https://alephzero.org/) and [Tezos](https://tezos.com/) foundations\\n\\n:::\\n\\n
\\n ![Forest](2024-12-20/ghost-forest.webp)\\n
\\n\\n## \ud83d\udc7b The Circom language\\n\\nThis is a language to write composable and optimized zero-knowledge circuits. It has been in use for quite some time, and there are a lot of examples of Circom programs implementing common cryptographic primitives, such as hash functions. See, for example, the [github.com/iden3/circomlib](https://github.com/iden3/circomlib) repository. Many projects refer to it; see, for example, the blog post [How zkLogin Made Cryptography Faster and More Secure](https://www.mystenlabs.com/blog/how-zklogin-made-cryptography-faster-and-more-secure) from the development team of [Sui](https://sui.io/), which mentions Circom.\\n\\nHere is the example we consider, which adds `ops` numbers of `n` bits:\\n\\n```circom\\nfunction nbits(a) {\\n var n = 1;\\n var r = 0;\\n while (n-1<a) {\\n r++;\\n n *= 2;\\n }\\n return r;\\n}\\n\\ntemplate BinSum(n, ops) {\\n var nout = nbits((2**n -1)*ops);\\n signal input in[ops][n];\\n signal output out[nout];\\n\\n var lin = 0;\\n var lout = 0;\\n\\n var k;\\n var j;\\n\\n var e2;\\n\\n e2 = 1;\\n for (k=0; k<n; k++) {\\n for (j=0; j<ops; j++) {\\n lin += in[j][k] * e2;\\n }\\n e2 = e2 + e2;\\n }\\n\\n e2 = 1;\\n for (k=0; k<nout; k++) {\\n out[k] <-- (lin >> k) & 1;\\n\\n // Ensure out is binary\\n out[k] * (out[k] - 1) === 0;\\n\\n lout += out[k] * e2;\\n\\n e2 = e2+e2;\\n }\\n\\n // Ensure the sum;\\n\\n lin === lout;\\n}\\n```\\n\\nYou can find this example in [github.com/iden3/circomlib/blob/master/circuits/binsum.circom](https://github.com/iden3/circomlib/blob/master/circuits/binsum.circom).\\n\\nIt contains a function `nbits` to compute the number of bits needed to represent a number, and a template `BinSum` to add `ops` numbers of `n` bits. The function `nbits` does not perform any operation related to zero-knowledge. It is a simple imperative function with a loop and mutable variables `n` and `r`.\\n\\nThe template `BinSum` defines what is required to instantiate a new component, with input signals `in` and output signals `out`. With the equality assertion `===` it ensures that the output signals must be the bits of the sum of the input signals, and cannot be anything else. This is the property that we will want to formally verify to ensure that it is not possible to provide a proof of this circuit which does not compute the addition. 
This is called verifying that the circuit is not _underconstrained_.\\n\\n## \ud83d\udc13 The Coq proof assistant\\n\\nThe [Coq proof assistant](https://coq.inria.fr/), which we use exclusively at [Formal Land](https://formal.land), is a generic formal verification system. You can use it to verify any kind of maths or programs. You can never be stuck in the verification of a property, thanks to its interactive mode to refine proofs step by step. You can express almost any kind of property as it is based on the very expressive [Calculus Of Constructions](https://en.wikipedia.org/wiki/Calculus_of_constructions) logic.\\n\\nIts community focuses a lot on the verification of programs, the most notable example being the full verification of the C compiler [CompCert](https://compcert.org/).\\n\\nOur strategy is always the same: finding a nice embedding of a language in Coq, so that we can formally verify programs written in this language and reuse all the existing Coq tools and libraries.\\n\\n## \ud83d\ude80 Translation of Circom to Coq\\n\\nThe Circom compiler is written in [\ud83e\udd80 Rust](https://www.rust-lang.org/). Generally, a compiler is composed of many intermediate languages, starting from a language that is essentially a representation of what the parser returns, down to some form of assembly or circuit language.\\n\\nIf you translate a high-level intermediate language to a proof system, you retain a lot of information from the original program and the specifications/proofs tend to be simpler. If you translate a low-level language, the translation itself will be simpler and more trustworthy, but the verification part will be harder. 
As Circom is a rather small language (compared to a full programming language), we choose to translate its high-level representation to Coq.\\n\\n### To JSON\\n\\nWe write our translation tool in \ud83d\udc0d Python for simplicity, reading a JSON export of the [abstract syntax tree](https://github.com/iden3/circom/blob/master/program_structure/src/abstract_syntax_tree/ast.rs) of Circom. The quickest way to export data from Rust is to use [Serde](https://serde.rs/) to generate the functions that serialize the syntax tree to JSON. We have done it in this [pull request](https://github.com/formal-land/circom/pull/1) from a fork of the Circom compiler.\\n\\nHere is, for example, the beginning of the JSON version of the `nbits` function above:\\n\\n```json\\n\\"Function\\": {\\n \\"meta\\": {},\\n \\"name\\": \\"nbits\\",\\n \\"args\\": [\\n \\"a\\"\\n ],\\n \\"arg_location\\": {\\n \\"start\\": 1571,\\n \\"end\\": 1572\\n },\\n \\"body\\": {\\n \\"Block\\": {\\n \\"meta\\": {},\\n \\"stmts\\": [\\n {\\n \\"InitializationBlock\\": {\\n \\"meta\\": {},\\n \\"xtype\\": \\"Var\\",\\n \\"initializations\\": [\\n {\\n \\"Declaration\\": {\\n \\"meta\\": {},\\n \\"xtype\\": \\"Var\\",\\n \\"name\\": \\"n\\",\\n \\"dimensions\\": [],\\n \\"is_constant\\": true\\n }\\n },\\n {\\n \\"Substitution\\": {\\n \\"meta\\": {},\\n \\"var\\": \\"n\\",\\n \\"access\\": [],\\n \\"op\\": \\"AssignVar\\",\\n \\"rhe\\": {\\n \\"Number\\": [\\n {},\\n [\\n 1,\\n [\\n 1\\n ]\\n ]\\n```\\n\\n### To Coq\\n\\nWe iterate over each node of this JSON file to produce a corresponding Coq file with the Python script [scripts/coq_of_circom.py](https://github.com/formal-land/garden/blob/main/scripts/coq_of_circom.py). 
Here is a short extract from this script:\\n\\n```python\\n\\"\\"\\"\\npub enum Access {\\n ComponentAccess(String),\\n ArrayAccess(Expression),\\n}\\n\\"\\"\\"\\ndef to_coq_access(node) -> str:\\n if \\"ComponentAccess\\" in node:\\n return f\\"Access.Component ({node[\'ComponentAccess\']})\\"\\n if \\"ArrayAccess\\" in node:\\n return f\\"Access.Array ({to_coq_expression(node[\'ArrayAccess\'])})\\"\\n return f\\"Unknown access: {node}\\"\\n```\\n\\nFor every node type in the Circom AST, we copy its Rust type in triple quotes in Python and let GitHub Copilot write a conversion function, which we complete and fix by hand. That way, we quickly cover all syntax with a reasonable Coq output.\\n\\nWe put all the code of our translation in a monad in Coq to represent the side effects, which are mainly here:\\n\\n- imperative effects such as mutations and potentially non-terminating loops,\\n- the instantiation of components with signals, and the enforcement of the equality constraints.\\n\\nHere is the Coq translation for the Circom example above, given in [Garden/Circom/Example/binsum.v](https://github.com/formal-land/garden/blob/main/Garden/Circom/Example/binsum.v):\\n\\n```coq\\n(* Function *)\\nDefinition nbits (a : F.t) : M.t F.t :=\\n M.function_body (\\n (* Var *)\\n do~ M.declare_var \\"n\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"n\\" [[ 1 ]] in\\n (* Var *)\\n do~ M.declare_var \\"r\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"r\\" [[ 0 ]] in\\n do~ M.while [[ InfixOp.lesser ~(| InfixOp.sub ~(| M.var ~(| \\"n\\" |), 1 |), M.var ~(| \\"a\\" |) |) ]] (\\n do~ M.substitute_var \\"r\\" [[ InfixOp.add ~(| M.var ~(| \\"r\\" |), 1 |) ]] in\\n do~ M.substitute_var \\"n\\" [[ InfixOp.mul ~(| M.var ~(| \\"n\\" |), 1 |) ]] in\\n M.pure BlockUnit.Tt\\n ) in\\n do~ M.return_ [[ M.var ~(| \\"r\\" |) ]] in\\n M.pure BlockUnit.Tt\\n ).\\n\\n(* Template *)\\nDefinition BinSum (n ops : F.t) : M.t BlockUnit.t :=\\n (* Var *)\\n do~ M.declare_var \\"nout\\" 
[[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"nout\\" [[ nbits ~(| InfixOp.mul ~(| InfixOp.sub ~(| InfixOp.pow ~(| 1, M.var ~(| \\"n\\" |) |), 1 |), M.var ~(| \\"ops\\" |) |) |) ]] in\\n (* Signal Input *)\\n do~ M.declare_signal \\"in\\" [[ [M.var ~(| \\"ops\\" |); M.var ~(| \\"n\\" |)] ]] in\\n (* Signal Output *)\\n do~ M.declare_signal \\"out\\" [[ [M.var ~(| \\"nout\\" |)] ]] in\\n (* Var *)\\n do~ M.declare_var \\"lin\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"lin\\" [[ 0 ]] in\\n (* Var *)\\n do~ M.declare_var \\"lout\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"lout\\" [[ 0 ]] in\\n (* Var *)\\n do~ M.declare_var \\"k\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"k\\" [[ 0 ]] in\\n (* Var *)\\n do~ M.declare_var \\"j\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"j\\" [[ 0 ]] in\\n (* Var *)\\n do~ M.declare_var \\"e2\\" [[ ([] : list F.t) ]] in\\n do~ M.substitute_var \\"e2\\" [[ 0 ]] in\\n do~ M.substitute_var \\"e2\\" [[ 1 ]] in\\n do~ M.substitute_var \\"k\\" [[ 0 ]] in\\n do~ M.while [[ InfixOp.lesser ~(| M.var ~(| \\"k\\" |), M.var ~(| \\"n\\" |) |) ]] (\\n do~ M.substitute_var \\"j\\" [[ 0 ]] in\\n do~ M.while [[ InfixOp.lesser ~(| M.var ~(| \\"j\\" |), M.var ~(| \\"ops\\" |) |) ]] (\\n do~ M.substitute_var \\"lin\\" [[ InfixOp.add ~(| M.var ~(| \\"lin\\" |), InfixOp.mul ~(| M.var_access ~(| \\"in\\", [Access.Array (M.var ~(| \\"j\\" |)); Access.Array (M.var ~(| \\"k\\" |))] |), M.var ~(| \\"e2\\" |) |) |) ]] in\\n do~ M.substitute_var \\"j\\" [[ InfixOp.add ~(| M.var ~(| \\"j\\" |), 1 |) ]] in\\n M.pure BlockUnit.Tt\\n ) in\\n do~ M.substitute_var \\"e2\\" [[ InfixOp.add ~(| M.var ~(| \\"e2\\" |), M.var ~(| \\"e2\\" |) |) ]] in\\n do~ M.substitute_var \\"k\\" [[ InfixOp.add ~(| M.var ~(| \\"k\\" |), 1 |) ]] in\\n M.pure BlockUnit.Tt\\n ) in\\n do~ M.substitute_var \\"e2\\" [[ 1 ]] in\\n do~ M.substitute_var \\"k\\" [[ 0 ]] in\\n do~ M.while [[ InfixOp.lesser ~(| M.var ~(| \\"k\\" |), M.var ~(| \\"nout\\" 
|) |) ]] (\\n do~ M.substitute_var \\"out\\" [[ InfixOp.bitand ~(| InfixOp.shiftr ~(| M.var ~(| \\"lin\\" |), M.var ~(| \\"k\\" |) |), 1 |) ]] in\\n do~ M.equality_constraint\\n [[ InfixOp.mul ~(| M.var_access ~(| \\"out\\", [Access.Array (M.var ~(| \\"k\\" |))] |), InfixOp.sub ~(| M.var_access ~(| \\"out\\", [Access.Array (M.var ~(| \\"k\\" |))] |), 1 |) |) ]]\\n [[ 0 ]]\\n in\\n do~ M.substitute_var \\"lout\\" [[ InfixOp.add ~(| M.var ~(| \\"lout\\" |), InfixOp.mul ~(| M.var_access ~(| \\"out\\", [Access.Array (M.var ~(| \\"k\\" |))] |), M.var ~(| \\"e2\\" |) |) |) ]] in\\n do~ M.substitute_var \\"e2\\" [[ InfixOp.add ~(| M.var ~(| \\"e2\\" |), M.var ~(| \\"e2\\" |) |) ]] in\\n do~ M.substitute_var \\"k\\" [[ InfixOp.add ~(| M.var ~(| \\"k\\" |), 1 |) ]] in\\n M.pure BlockUnit.Tt\\n ) in\\n do~ M.equality_constraint\\n [[ M.var ~(| \\"lin\\" |) ]]\\n [[ M.var ~(| \\"lout\\" |) ]]\\n in\\n M.pure BlockUnit.Tt.\\n```\\n\\nIf you compare the translated code to the original Circom code, you will see that the two are very similar, up to a more verbose syntax in Coq. This is because we translate the high-level representation of Circom to Coq.\\n\\n### Free monad\\n\\nEven if we do not define the Circom semantics for now, we need to write a few Coq definitions so that the code above can be type-checked. As in the translations we make for other languages, we use a free-monad to represent side effects. This is convenient, as we can first express which are the various \\"special\\" operators of the language (declaring a signal, instantiating a template, ...) and define their behavior in a second step. The behavior might be defined in a computational or relational way later.\\n\\nThe definitions of the monad are in [Garden/Garden.v](https://github.com/formal-land/garden/blob/main/Garden/Garden.v). Here is an extract:\\n\\n```coq\\nModule Primitive.\\n (** We group together primitives that share being impure functions operating over the state. 
*)\\n Inductive t : Set -> Set :=\\n | OpenScope : t unit\\n | CloseScope : t unit\\n | DeclareVar (name : string) (value : F.t) : t unit\\n | DeclareSignal (name : string) (dimensions : list F.t) : t unit\\n | SubstituteVar (name : string) (value : F.t) : t unit\\n | GetVarAccess (name : string) (access : list Access.t) : t F.t\\n | GetPrime : t F.t\\n | EqualityConstraint (value1 value2 : F.t) : t unit.\\nEnd Primitive.\\n```\\n\\nThese are some of the primitives from the Circom language, which we needed for our example (we will add more as we cover more of the language). For example:\\n\\n```coq\\n| DeclareVar (name : string) (value : F.t) : t unit\\n```\\n\\ncorresponds to the Circom construction:\\n\\n```circom\\nvar n = 1;\\n```\\n\\nwith `name` the string `\\"n\\"` and `value` the field element `1` in this case. We will use a scope of local variables for each function, as a dictionary from the name of the variable to its value. All local variables might be mutated. Compared to more complex languages, such as Rust, we do not need to handle the notion of pointer, which simplifies many things.\\n\\nNote that with the free monad, we only give the list of primitive operations with their signatures, but we have not yet defined how to evaluate them.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have seen how to define a first translation from the Circom language to Coq, for one example, with the goal of having a translation that is well-typed.\\n\\nIn the following article, we will explore how to give a meaning to the primitive operations of Circom, such as signals and constraints, in order to formally verify that the `BinSum` template is correct.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! 
Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/12/20/what-is-formal-verification-of-smart-contracts","metadata":{"permalink":"/blog/2024/12/20/what-is-formal-verification-of-smart-contracts","source":"@site/blog/2024-12-20-what-is-formal-verification-of-smart-contracts.md","title":"\ud83e\udd84 How does formal verification of smart contracts work?","description":"We make here a general presentation about how the formal verification of smart contracts works by explaining:","date":"2024-12-20T00:00:00.000Z","formattedDate":"December 20, 2024","tags":[{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"smart contract","permalink":"/blog/tags/smart-contract"},{"label":"audit","permalink":"/blog/tags/audit"}],"readingTime":8.275,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd84 How does formal verification of smart contracts work?","tags":["Solidity","smart contract","audit"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83d\udc7b Translation of Circom to Coq","permalink":"/blog/2024/12/20/translation-of-circom-to-coq"},"nextItem":{"title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 2","permalink":"/blog/2024/11/15/tool-for-noir-2"}},"content":"We make here a general presentation about how the formal verification of smart contracts works by explaining:\\n\\n- How people secure their smart contracts without formal verification.\\n- How do formal tools typically work?\\n- How our solution [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) works on a short example (an [ERC-20](https://ethereum.org/en/developers/docs/standards/tokens/erc-20/) contract).\\n- Where LLMs could be the most useful, according to us, for formal verification work.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Ask for the highest security!\\n\\nTo ensure your code is fully secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! 
\ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. It can also be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe are already working with some of the leading blockchain entities such as:\\n\\n- The [Ethereum Foundation](https://ethereum.foundation/)\\n- The [Sui Foundation](https://sui.io/about)\\n- Previously, the [Aleph Zero](https://alephzero.org/) and [Tezos](https://tezos.com/) foundations\\n\\n:::\\n\\n
\\n ![Forest](2024-12-20/forest.webp)\\n
\\n\\n## \ud83d\udee1\ufe0f Securing smart contracts, the common way\\n\\nSmart contracts are short programs, typically fewer than 5,000 lines of code, running \\"on the blockchain\\" to implement transaction rules. Examples include virtual marketplaces to trade cryptocurrencies, virtual dollar coins, traceability databases, and NFTs. Most smart contracts are written in [Solidity](https://soliditylang.org/), a JavaScript-like language, and some are in [Rust](https://www.rust-lang.org/).\\n\\nTo see what a smart contract looks like, you can find a list of the biggest ones (in terms of users) on [shafu0x/awesome-smart-contracts](https://github.com/shafu0x/awesome-smart-contracts). A popular library to write smart contracts is [OpenZeppelin](https://www.openzeppelin.com/solidity-contracts). You can also search for the [Solidity language](https://github.com/search?q=lang%3ASolidity%20&type=repositories) on GitHub to find repositories with Solidity code.\\n\\nSmart contracts are open-source most of the time, as it is important for users to know the rules that handle their money. If a contract is not open-source, _it is probably a scam_.\\n\\nSecuring smart contracts is very important as a single bug can mean that an attacker can steal all the funds of the users who deposited money on the contract, or just block it to compromise the service. Millions of dollars are stolen every month due to bugs in the contracts, and some projects lose almost everything in such attacks. A historically important attack is the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) where $60 million was stolen, leading to a hard fork of the [Ethereum](https://ethereum.org/) blockchain.\\n\\nNow, how do people secure their code? First of all, most projects are well aware that software security is important, and if they want to raise money or advertise their product, they need to show that they are secure. 
They typically do the following:\\n\\n- **Audits** Projects require a few audits, which are made by specialized companies or individuals, to review the code of a smart contract and find bugs or vulnerabilities. The issues are classified into categories of importance: informational, low, medium, high, or critical. The highest categories mean it is possible to steal all of the funds. Lower categories are mostly remarks about coding style or missing documentation. At the end of an audit, a **report** is published together with the corrections for the vulnerabilities that were discovered. As an example, [here](https://github.com/trailofbits/publications/tree/master/reviews) is a list of audit reports from the company [Trail of Bits](https://www.trailofbits.com/).\\n- **Competitions** They enable anyone, during a predefined period of about a month, to look for bugs in a smart contract. At the end of the competition, a prize pot is shared among the people who found the most bugs. A typical prize pot is $100,000, and some large competitions can go above $1,000,000 \ud83d\udcb0. You can see the list of all ongoing competitions on [www.dailywarden.com](https://www.dailywarden.com/).\\n- **Bounties** Finally, bounties are like competitions but always live. The aim is to reward the reporting of critical vulnerabilities, so that there is an incentive to report a bug instead of exploiting it. A popular platform is [Immunefi](https://immunefi.com/).\\n\\nTo give an idea of the amounts that are at risk of attacks on the blockchain, the total valuation of Ethereum, the main smart contracts platform, is estimated at more than 300 billion dollars! 
Attacks are believed to be mainly carried out by \ud83c\uddf0\ud83c\uddf5 North Korean agents, but sometimes the attackers are single, clever individuals.\\n\\n## \ud83d\udee0\ufe0f Formal verification tools\\n\\nSo, where does formal verification stand in all that?\\n\\nAs its purpose is to reason mathematically about code to show the total absence of bugs in a protocol, formal verification seems to be the ideal tool to ensure the absence of vulnerabilities in a smart contract. In fact, most popular platforms do not take the risk of deploying new versions without a formal verification step, mainly with the leading tool [Certora](https://www.certora.com/).\\n\\nAs the verification is, at the end of the day, a mathematical proof, we can be sure that the code is correct for any possible user inputs, for a given and explicit _specification_. In addition, when the code changes, there is no need to review everything again: you can just formally verify the code that changed, as you would do when writing tests. This saves you time and money.\\n\\n## \ud83d\udea7 Limitations\\n\\nSo, what are the limitations? Here are a few:\\n\\n- **Cost** You need to pay more than with traditional audits. Although the rewards are probably there, given the quantity of funds at risk in a smart contract, many small companies take the risk.\\n- **Time** Sometimes, time is an issue, even if the verification can be done in a continuous manner.\\n- **Specification** You still have to write the correct specification of your code! This is what defines what is a bug and what is a feature.\\n- **Complexity** Formal verification requires some specific knowledge which most developers do not have (this is why we are here to help you! \ud83d\ude04).\\n\\nAnother one is completeness. Some formal verification tools aim to _fully automate_ the proof part, so that you only need to write the specifications. 
But then:\\n\\n- Some properties are unprovable, or need to be cut into smaller ones in non-trivial ways.\\n- Some parts of the code are not verified. Typically, loops are only unrolled a few times (two or three times), instead of covering all the possible iterations.\\n- Some properties cannot even be expressed!\\n\\nThis is a _real_ concern, according to the security teams of a few blockchain companies we talked to. Popular tools such as Certora or [Halmos](https://github.com/a16z/halmos) fall into this category.\\n\\n## \ud83c\udf1f How to do better?\\n\\nUsing interactive theorem provers, such as [\ud83d\udc13 Coq](https://coq.inria.fr/) or Lean, you overcome the limitations of automated provers as presented above. Here are a few tools you can use:\\n\\n- [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) using the Coq theorem prover. This is the tool we made! \ud83c\udf89\\n- [Clear](https://github.com/NethermindEth/Clear) using the Lean theorem prover. This is a tool made by the company [Nethermind](https://nethermind.io/).\\n\\n[Kontrol](https://kontrol.runtimeverification.com/) from Runtime Verification is another verification tool providing ways to go further than automated tools.\\n\\nFor the question of correct and complete specifications, here is our idea:\\n\\n> Build a set of high-level primitives encoding ideas such as \\"identity\\", \\"value\\", \\"ownership\\", \\"exact calculation\\", ... which are not necessarily definable in a programming language but can be axiomatized in a proof system. Use them to give the business rules of a contract in a clear manner and to express meta-properties such as \\"it is impossible to steal\\" \ud83d\ude93.\\n\\n## \ud83d\udd27 Technical pipeline\\n\\nSolidity is a complex language. All the tools we mentioned above translate the code into a formal language to reason about it. They never take the Solidity code as it is. 
Instead, they first translate it to a simpler language, generally EVM bytecode (the assembly language for Solidity) or [Yul](https://docs.soliditylang.org/en/latest/yul.html) which is slightly higher level.\\n\\nThen, they run several steps to first \\"\ud83e\uddfc clean up the code\\" and obtain a representation that is high-level again. See, for example, the [Practical Verification of Smart Contracts using Memory Splitting](https://dl.acm.org/doi/10.1145/3689796) article from Certora about optimizing the memory representation of EVM code to retrieve some properties from the Solidity representation.\\n\\nIn `coq-of-solidity`, we call this step of going from low-level to high-level writing a \\"simulation\\", which is a high-level representation of the low-level code. This task is time-consuming. An alternative would be to use LLMs to generate it. We can check that the simulation is equivalent to the low-level version, either by writing a formal proof or by testing.\\n\\nAs an example, here is what we get for the verification of an ERC-20 smart contract with `coq-of-solidity` (you can click on the links to see the code):\\n\\n- [the ERC-20 Solidity contract](https://github.com/formal-land/coq-of-solidity/blob/develop/coq/CoqOfSolidity/contracts/erc20/contract.sol)\\n- [the low-level version (in Yul, generated)](https://github.com/formal-land/coq-of-solidity/blob/develop/coq/CoqOfSolidity/contracts/erc20/contract.yul)\\n- [the Coq translation of the low-level version (generated)](https://github.com/formal-land/coq-of-solidity/blob/develop/coq/CoqOfSolidity/contracts/erc20/shallow.v)\\n- [the simulation in Coq (hand-written)](https://github.com/formal-land/coq-of-solidity/blob/develop/coq/CoqOfSolidity/contracts/erc20/simulations/contract.v)\\n- [the formal proof that the two are equivalent (hand-written)](https://github.com/formal-land/coq-of-solidity/blob/develop/coq/CoqOfSolidity/contracts/erc20/proofs/contract.v)\\n\\n## \ud83e\udde0 Use cases for LLMs\\n\\nHere 
are a few areas where LLMs can be useful:\\n\\n1. Writing formal **specifications** from the code of a smart contract, its documentation, and a dataset of known vulnerabilities. We can find such datasets on the Internet or by reading audits and competition reports.\\n2. Writing **formal proofs** for the specifications.\\n3. Writing a **high-level representation** of a smart contract in a formal system.\\n4. Writing a formal proof that a **high-level representation is valid**.\\n\\nThe fact that most of the smart contracts are open-source should also help running learning algorithms. We hope to explore this area more in the future or give ideas to others.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have made a general presentation of security challenges around the deployment of smart contracts and how formal verification works and helps to secure smart contracts even more. We also presented a few ways to potentially improve current tooling.\\n\\nWe hope that this article will help you understand the importance of formal verification and how it can be used to secure your smart contracts. Please contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land) if you need formal verification services or advice!\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/11/15/tool-for-noir-2","metadata":{"permalink":"/blog/2024/11/15/tool-for-noir-2","source":"@site/blog/2024-11-15-tool-for-noir-2.md","title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 2","description":"In this blog post, we continue our presentation about our formal verification tool for \u25fc\ufe0f Noir programs coq-of-noir. Noir is a Rust-like language to write programs designed to run efficiently in zero-knowledge environments. 
It has a growing popularity and a focus on providing optimized libraries for common needs, such as a base64 library using \ud83e\udde0 field arithmetic that we use in this series of blog posts.","date":"2024-11-15T00:00:00.000Z","formattedDate":"November 15, 2024","tags":[{"label":"Noir","permalink":"/blog/tags/noir"},{"label":"smart contract","permalink":"/blog/tags/smart-contract"},{"label":"circuits","permalink":"/blog/tags/circuits"}],"readingTime":8.895,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 2","tags":["Noir","smart contract","circuits"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd84 How does formal verification of smart contracts work?","permalink":"/blog/2024/12/20/what-is-formal-verification-of-smart-contracts"},"nextItem":{"title":"\ud83e\udd80 Example of verification for the Move\'s checker of Sui","permalink":"/blog/2024/11/14/sui-move-checker-abstract-stack"}},"content":"In this blog post, we continue our presentation about our formal verification tool for [\u25fc\ufe0f Noir](https://noir-lang.org/) programs [coq-of-noir](https://github.com/formal-land/coq-of-noir). Noir is a Rust-like language to write programs designed to run efficiently in zero-knowledge environments. It has a growing popularity and a focus on providing optimized libraries for common needs, such as a [base64](https://github.com/noir-lang/noir_base64) library using \ud83e\udde0 field arithmetic that we use in this series of blog posts.\\n\\nHere we present the details of our semantic rules to show that a Noir program has an expected behavior for any possible parameters. We focus, in particular, on our memory-handling approach and the definition of loops.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Require the strongest security!\\n\\nTo ensure your code is fully secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! 
\ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. It can be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually **impossible to happen again**.\\n\\n:::\\n\\n
\\n ![Noir](2024-11-15/noir.webp)\\n
\\n\\n## \u2699\ufe0f Semantic rules\\n\\nIn the previous blog post [\u25fc\ufe0f A formal verification tool for Noir \u2013 1](/blog/2024/11/01/tool-for-noir-1) we presented our general translation from the Noir syntax to [\ud83d\udc13 Coq](https://coq.inria.fr/), as well as the free monad we use to represent side-effects such as mutations. We now need to define semantic rules to be able to say that a particular translated Noir program evaluates to a certain value.\\n\\nFor expressions that do not have side effects we rely on the usual reduction rules of Coq. This is really convenient as we can then reuse the existing Coq tactics and automation to reason about pure expressions.\\n\\nFor side-effects like mutations or function calls, which we also consider as side-effects as there might be infinite recursion, we use a big-step semantics with the following predicate:\\n\\n```coq\\n{{ p, state_in | e \u21d3 output | state_out }}\\n```\\n\\nIt says that for a certain prime number $p$ which is the size of the arithmetic field, for an initial state `state_in`, the expression `e` evaluates to the output `output` and the final state `state_out`.\\n\\nWe define this rule with a Coq `Inductive` with one case per case in our free monad for effects. This is similar to the work we have done for Rust with [coq-of-rust](https://github.com/formal-land/coq-of-rust). Here are the relevant rules.\\n\\n- `Pure`\\n Expressions without side effects evaluate to their value and do not change the state. Note that in Coq, we do not distinguish between expressions and values, as all values are equal modulo evaluation rules, so we can directly use the expression as the output.\\n ```coq\\n | Pure :\\n {{ p, state_out | LowM.Pure output \u21d3 output | state_out }}\\n ```\\n- `GetFieldPrime`\\n To obtain the current size of the field $p$ we use the `GetFieldPrime` primitive. This is a side-effect as it depends on the current settings to compile the Noir program in circuits. 
We use this operation as an internal operation to define the arithmetic operations in the field by computing modulo $p$.\\n ```coq\\n | CallPrimitiveGetFieldPrime\\n (k : Z -> M.t)\\n (state_in : State) :\\n {{ p, state_in | k p \u21d3 output | state_out }} ->\\n {{ p, state_in |\\n LowM.CallPrimitive Primitive.GetFieldPrime k \u21d3 output\\n | state_out }}\\n ```\\n We use a semantics by continuation with a continuation `k` for most of the operations of the monad. Instead of directly returning some result, we pass it to the continuation and evaluate it. In our experience, this simplifies the reasoning on code instead of having to use another monadic operation to pass this value.\\n- `CallClosure`\\n We define a closure as a function from a list of values to some monadic expression. In our translation, terms are totally untyped; in particular we do not enforce any arity for the functions. In case a wrong number of arguments is passed to a function, we will have a runtime error. This is a trade-off to keep the translation simple and to avoid having to define a type system for Noir.\\n ```coq\\n | CallClosure\\n (f : list Value.t -> M.t) (args : list Value.t)\\n (k : Result.t -> M.t)\\n (output_inter : Result.t)\\n (state_in state_inter : State) :\\n let closure := Value.Closure (existS (_, _) f) in\\n {{ p, state_in | f args \u21d3 output_inter | state_inter }} ->\\n {{ p, state_inter | k output_inter \u21d3 output | state_out }} ->\\n {{ p, state_in | LowM.CallClosure closure args k \u21d3 output | state_out }}\\n ```\\n To call a function, we first evaluate its body on the arguments and then the continuation `k`. If the result is some `output` and `state_out`, we can say that the whole expression evaluates to `output` and `state_out`.\\n- `Let`\\n The `Let` primitive is the monadic bind. It allows to sequentially compose the execution of two expressions. 
We first evaluate the first expression, then the second one with the result of the first one.\\n ```coq\\n | Let\\n (e : M.t)\\n (k : Result.t -> M.t)\\n (output_inter : Result.t)\\n (state_in state_inter : State) :\\n {{ p, state_in | e \u21d3 output_inter | state_inter }} ->\\n {{ p, state_inter | k output_inter \u21d3 output | state_out }} ->\\n {{ p, state_in | LowM.Let e k \u21d3 output | state_out }}\\n ```\\n\\n## \ud83d\udc18 Memory handling\\n\\nIn Noir, you can make a new variable mutable with the keyword `let mut`:\\n\\n```rust\\nlet mut result: [u8; InputElements] = [0; InputElements];\\n```\\n\\nThen you can assign a new value to this variable or its content with the `=` operator:\\n\\n```rust\\nresult[i] = Base64Decoder.get(input_byte as Field);\\n```\\n\\nThere is basic pointer manipulation with the `&` operator to get a reference to a variable and the `*` operator to dereference a pointer. You can even pass a mutable reference to a function to modify the value of a variable. There is no deallocation of memory, which entirely removes the need for a garbage collector or deallocation strategy. This is because Noir programs are supposed to be very short-lived.\\n\\nTo handle all expressions in a uniform way, we consider that each Noir expression is an address to its content. For most (intermediate) values, which are not mutable, the address is the value itself. For mutable values, we use a fresh address for each `let mut` assignment.\\n\\n:::info Thanks\\n\\nAs [GitHub Copilot](https://github.com/features/copilot) correctly suggested to me, this is similar to the approach we have taken for Rust in `coq-of-rust`. Thanks for following what we are doing! \ud83d\ude4f\\n\\n:::\\n\\nTo simplify the proofs, we let the user input a memory model of their choice. The only constraint is to provide memory operations for `read`, `write`, and `alloc`, and to make sure that these operations are consistent. 
Once it is done, here are the rules for the memory handling of mutable references:\\n\\n- `StateAlloc`\\n ```coq\\n | CallPrimitiveStateAlloc\\n (value : Value.t)\\n (address : Address)\\n (k : Value.t -> M.t)\\n (state_in state_in\' : State) :\\n let pointer := Pointer.Mutable (Pointer.Mutable.Make address []) in\\n State.read address state_in = None ->\\n State.alloc_write address state_in value = Some state_in\' ->\\n {{ p, state_in\' | k (Value.Pointer pointer) \u21d3 output | state_out }} ->\\n {{ p, state_in | LowM.CallPrimitive (Primitive.StateAlloc value) k \u21d3 output | state_out }}\\n ```\\n- `StateRead`\\n ```coq\\n | CallPrimitiveStateRead\\n (address : Address)\\n (value : Value.t)\\n (k : Value.t -> M.t)\\n (state_in : State) :\\n State.read address state_in = Some value ->\\n {{ p, state_in | k value \u21d3 output | state_out }} ->\\n {{ p, state_in | LowM.CallPrimitive (Primitive.StateRead address) k \u21d3 output | state_out }}\\n ```\\n- `StateWrite`\\n ```coq\\n | CallPrimitiveStateWrite\\n (value : Value.t)\\n (address : Address)\\n (k : unit -> M.t)\\n (state_in state_in\' : State) :\\n State.alloc_write address state_in value = Some state_in\' ->\\n {{ p, state_in\' | k tt \u21d3 output | state_out }} ->\\n {{ p, state_in |\\n LowM.CallPrimitive (Primitive.StateWrite address value) k \u21d3 output\\n | state_out }}\\n ```\\n\\nWhen using these rules to show that a certain Noir program evaluates to an expression, one has to make the right choice for the address used to allocate the value. This choice is arbitrary but can make the proof more or less complex later. The read and write operations are deterministic.\\n\\n## \u27b0 Loops\\n\\nThere is only one kind of loop in Noir, bounded `for` loops:\\n\\n```rust\\nfor i in 0..InputElements {\\n let input_byte = input[i];\\n result[i] = Base64Decoder.get(input_byte as Field);\\n}\\n```\\n\\nThe index `i` evolves in between statically known bounds. 
As such, these loops always terminate, which is a requirement for formal verification to proceed! As a result, we do not need to introduce a dedicated monadic primitive for the loops and can define them with a recursive function:\\n\\n```coq\\nFixpoint for_nat (end_ : Z) (fuel : nat) (body : Z -> M.t) {struct fuel} : M.t :=\\n match fuel with\\n | O => pure (Value.Tuple [])\\n | S fuel\' =>\\n let* _ := body (end_ - Z.of_nat fuel) in\\n for_nat end_ fuel\' body\\n end.\\n\\nDefinition for_Z (start end_ : Z) (body : Z -> M.t) : M.t :=\\n for_nat end_ (Z.to_nat (end_ - start)) body.\\n```\\n\\nNote that we do not handle `break` or `continue` yet but propagate assert failures with `let*`. We _prove_ the following reasoning rule for loops:\\n\\n```coq\\nLemma For {State Address : Set} `{State.Trait State Address}\\n (p : Z) (state_in : State)\\n (integer_kind : IntegerKind.t) (start : Z) (len : nat) (body : Value.t -> M.t)\\n {Accumulator : Set}\\n (inject : State -> Accumulator -> State)\\n (accumulator_in : Accumulator)\\n (body_expression : Z -> MS! 
Accumulator unit)\\n (H_body : forall (accumulator_in : Accumulator) (i : Z),\\n let output_accumulator_out := body_expression i accumulator_in in\\n {{ p, inject state_in accumulator_in |\\n body (M.alloc (Value.Integer integer_kind i)) \u21d3\\n Panic.to_result (fst output_accumulator_out)\\n | inject state_in (snd output_accumulator_out) }}\\n ) :\\n let output_accumulator_out :=\\n foldS!\\n tt\\n (List.map (fun offset => start + Z.of_nat offset) (List.seq 0 len))\\n (fun (_ : unit) => body_expression)\\n accumulator_in in\\n {{ p, inject state_in accumulator_in |\\n M.for_\\n (Value.Integer integer_kind start)\\n (Value.Integer integer_kind (start + Z.of_nat len))\\n body \u21d3\\n Panic.to_result (fst output_accumulator_out)\\n | inject state_in (snd output_accumulator_out) }}.\\n```\\n\\nIt is a little bit involved but basically says that if the body of the loop evaluates to an expression for each possible iteration, then the whole loop evaluates to the recursive function `foldS!` using the modified memory as an accumulator.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have shown how we define the semantic rules for the Noir language in Coq, for the general monadic primitives, memory, and loops.\\n\\nIn the next blog post, we will apply these reasoning principles to give a semantics to the `base64` library of Noir.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! 
Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/11/14/sui-move-checker-abstract-stack","metadata":{"permalink":"/blog/2024/11/14/sui-move-checker-abstract-stack","source":"@site/blog/2024-11-14-sui-move-checker-abstract-stack.md","title":"\ud83e\udd80 Example of verification for the Move\'s checker of Sui","description":"We are continuing our formal verification work for the implementation of the type-checker of the Move language in the \ud83d\udca7 Sui blockchain. We verify a manual translation in the proof system \ud83d\udc13 Coq of the \ud83e\udd80 Rust code of the Move checker as available on GitHub.","date":"2024-11-14T00:00:00.000Z","formattedDate":"November 14, 2024","tags":[{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Move","permalink":"/blog/tags/move"},{"label":"Sui","permalink":"/blog/tags/sui"},{"label":"type-checker","permalink":"/blog/tags/type-checker"}],"readingTime":7.74,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Example of verification for the Move\'s checker of Sui","tags":["Rust","Move","Sui","type-checker"],"authors":[]},"unlisted":false,"prevItem":{"title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 2","permalink":"/blog/2024/11/15/tool-for-noir-2"},"nextItem":{"title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 1","permalink":"/blog/2024/11/01/tool-for-noir-1"}},"content":"We are continuing our formal verification work for the implementation of the type-checker of the [Move](https://sui.io/move) language in the [\ud83d\udca7 Sui](https://sui.io/) blockchain. 
We verify a manual translation in the proof system [\ud83d\udc13 Coq](https://coq.inria.fr/) of the [\ud83e\udd80 Rust](https://www.rust-lang.org/) code of the Move checker as available on [GitHub](https://github.com/move-language/move-sui/tree/main/crates/move-bytecode-verifier).\\n\\nIn this blog post, we present in detail the verification of a particular function `AbstractStack::pop_eq_n` that manipulates \ud83d\udcda stacks of types to show that it is equivalent to its naive implementation.\\n\\nAll the code presented here is on our GitHub at [github.com/formal-land/coq-of-rust](https://github.com/formal-land/coq-of-rust) \ud83e\uddd1\u200d\ud83c\udfeb.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. It can be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually **impossible to happen again**.\\n\\n:::\\n\\n
\\n ![Water in forest](2024-11-14/water-in-forest.webp)\\n
\\n\\n## \ud83d\udd75\ufe0f The code to verify\\n\\nHere is the definition in Rust of an `AbstractStack`, from the file [move-abstract-stack/src/lib.rs](https://github.com/move-language/move-sui/blob/main/crates/move-abstract-stack/src/lib.rs):\\n\\n```rust\\n/// An abstract value that compresses runs of the same value to reduce space usage\\npub struct AbstractStack<T> {\\n values: Vec<(u64, T)>,\\n len: u64,\\n}\\n```\\n\\nIt says that a stack of elements of type `T` is a vector of pairs of a number and a value. The number is the number of times the value is repeated in the stack. The field `len` is the total number of elements in the stack. This representation is more efficient than a naive stack, in case the stack contains many repeated values.\\n\\nHere is one of the primitives to remove elements from this stack:\\n\\n```rust\\n/// Pops n values off the stack, erroring if there are not enough items or if the n items are\\n/// not equal\\npub fn pop_eq_n(&mut self, n: NonZeroU64) -> Result<T, AbsStackError> {\\n let n: u64 = n.get();\\n if self.is_empty() || n > self.len {\\n return Err(AbsStackError::Underflow);\\n }\\n let (count, last) = self.values.last_mut().unwrap();\\n debug_assert!(*count > 0);\\n let ret = match (*count).cmp(&n) {\\n Ordering::Less => return Err(AbsStackError::ElementNotEqual),\\n Ordering::Equal => {\\n let (_, last) = self.values.pop().unwrap();\\n last\\n }\\n Ordering::Greater => {\\n *count -= n;\\n last.clone()\\n }\\n };\\n self.len -= n;\\n Ok(ret)\\n}\\n```\\n\\nThis function removes `n` elements from the stack, returning the value of removed elements. 
It returns an error if there are not enough elements in the stack or if the `n` last items are not grouped as equal elements.\\n\\nOur goal is to **show that this function is equal to the naive pop function with repetition** on flattened stacks.\\n\\n## \u2696\ufe0f Specification\\n\\nHere is the property we want to verify in the formal language Coq:\\n\\n```coq\\nLemma flatten_pop_eq_n {A : Set} `{Eq.Trait A} (n : Z) (stack : AbstractStack.t A)\\n (H_n : n > 0) :\\n match AbstractStack.pop_eq_n n stack with\\n | Panic.Value (Result.Ok item, stack\') =>\\n flatten stack = List.repeat item (Z.to_nat n) ++ flatten stack\'\\n | _ => True\\n end.\\n```\\n\\nIt says that for any possible `stack` and `n` greater than 0, if we remove `n` elements from the stack and when the execution succeeds, the flattened stack is equal to the repetition of the removed element `n` times followed by the flattened stack.\\n\\nHow did we get from the Rust code above to the expression of this property? We manually converted the Rust code above in Coq with the following definitions:\\n\\n```coq\\nModule AbstractStack.\\n Record t (A : Set) : Set := {\\n values : list (Z * A);\\n len : Z;\\n }.\\n```\\n\\nfor the `AbstractStack` type, and:\\n\\n```coq\\nDefinition pop_eq_n {A : Set} (n : Z) : MS! (t A) (Result.t A AbsStackError.t) :=\\n fun (self : t A) =>\\n if (is_empty self || (n >? len self))%bool then\\n return! (Result.Err AbsStackError.Underflow, self)\\n else\\n let! (count, last) := Option.unwrap (List.hd_error self.(values)) in\\n if count panic! \\"unreachable\\"\\n | (_, last) :: values => return! ((count - n, last) :: values)\\n end in\\n let self := {|\\n values := values;\\n len := self.(len) - n\\n |} in\\n return! (Result.Ok last, self).\\n```\\n\\nfor the `pop_eq_n` function. Note that this definition uses a lot of user-defined notations, such as `let!`, that we made in order to simplify the expression of effects in Coq. 
You can read more about these notations in our previous blog post [\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 2](/blog/2024/10/14/verification-move-sui-type-checker-2). We checked by testing that our translation above behaves like the original Rust code, as explained in our blog post [\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 3](/blog/2024/10/15/verification-move-sui-type-checker-3). It is not necessary to understand the translation in detail, as its verification will flow naturally.\\n\\nWe define the `flatten` function to translate a stack with repetitions to a flat stack as:\\n\\n```coq\\nDefinition flatten {A : Set} (abstract_stack : AbstractStack.t A) : list A :=\\n List.flat_map (fun \'(n, v) => List.repeat v (Z.to_nat n)) abstract_stack.(AbstractStack.values).\\n```\\n\\nIt duplicates all the elements `n` times with `List.repeat v (Z.to_nat n)` and concatenates them with `List.flat_map`.\\n\\n## \ud83e\udd13 Proof\\n\\nTo show that the specification above is correct for any stack, we cannot rely on testing, as it only covers a finite number of cases. We must write a Coq proof showing by mathematical reasoning that the code is always correct.\\n\\nHere is our full proof:\\n\\n```coq\\nProof.\\n destruct stack as [stack].\\n unfold AbstractStack.pop_eq_n, flatten.\\n (* if (is_empty self || (n >? 
len self))%bool then *)\\n destruct (_ || _); simpl; [reflexivity|].\\n unfold List.hd_error.\\n (* Option.unwrap (List.hd_error self.(values)) *)\\n destruct stack as [|[count last] stack]; simpl; [reflexivity|].\\n (* if count = 0\\nHeqb: (count List.repeat v (Z.to_nat n0)) stack =\\nList.repeat last (Z.to_nat n) ++ List.flat_map (fun \'(n0, v) => List.repeat v (Z.to_nat n0)) stack\\n\\n--------------------------------------\\n\\n2/2\\nList.repeat last (Z.to_nat count) ++ List.flat_map (fun \'(n0, v) => List.repeat v (Z.to_nat n0)) stack =\\nList.repeat last (Z.to_nat n) ++\\nList.repeat last (Z.to_nat (count - n)) ++ List.flat_map (fun \'(n0, v) => List.repeat v (Z.to_nat n0)) stack\\n```\\n\\nThis is how we can progress in the proof and know which command to type. We see two sub-goals `(1/2)` and `(2/2)` for each branch explored by the last `destruct`. In both cases, we need to show an equality:\\n\\n1. The first one is solved by the fact that `count = n` in this branch.\\n2. The second one is solved by the fact that `count > n` in this branch, so that we can group the `List.repeat last (Z.to_nat n)` with `List.repeat last (Z.to_nat (count - n))` (repeating a \\"negative\\" number of times is the empty list so we need to make sure that `count - n` is not negative).\\n\\n## \u2712\ufe0f Conclusion\\n\\nIn this example, we have seen how to verify that the `pop_eq_n` function of the `AbstractStack` type in the Move checker of Sui is equivalent to the naive pop function with repetition on flattened stacks. As this is a formal proof, we are sure that this property holds for any possible stack and value of `n`.\\n\\nWe are continuing the work to verify the other functions of the project, with the final aim to verify the whole type-checker. 
We will keep you updated on our progress in the next blog posts \ud83d\ude80.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/11/01/tool-for-noir-1","metadata":{"permalink":"/blog/2024/11/01/tool-for-noir-1","source":"@site/blog/2024-11-01-tool-for-noir-1.md","title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 1","description":"In this series of blog posts, we present our development of a formal verification tool for the \u25fc\ufe0f Noir smart contract language. It is particularly suited to writing zero-knowledge applications, providing primitive constructs such as a Field type to write programs that run efficiently as circuits. Having a formal verification for Noir enables the development of applications holding a large amount of money in this language, as it ensures that the code is correct with a mathematical level of certainty.","date":"2024-11-01T00:00:00.000Z","formattedDate":"November 1, 2024","tags":[{"label":"Noir","permalink":"/blog/tags/noir"},{"label":"smart contract","permalink":"/blog/tags/smart-contract"},{"label":"circuits","permalink":"/blog/tags/circuits"}],"readingTime":11.94,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 1","tags":["Noir","smart contract","circuits"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Example of verification for the Move\'s checker of Sui","permalink":"/blog/2024/11/14/sui-move-checker-abstract-stack"},"nextItem":{"title":"\u2688 Verification of the Smoo.th library \u2013 2","permalink":"/blog/2024/10/28/verification-smooth-library-2"}},"content":"In this series of blog posts, we present our development of a formal verification tool for the [\u25fc\ufe0f Noir](https://noir-lang.org/) smart contract language. 
It is particularly suited to writing zero-knowledge applications, providing primitive constructs such as a `Field` type to write programs that run efficiently as circuits. Having a formal verification for Noir enables the development of applications holding a large amount of money in this language, as it ensures that the code is correct with a mathematical level of certainty.\\n\\nIn this first post, we present how we translate Noir code to the [\ud83d\udc13 Coq](https://coq.inria.fr/) proof system. We explore a translation after monomorphization and then at the HIR level. Note that we are interested in verifying programs _written in Noir_. The verification of the Noir compiler itself is a separate topic.\\n\\nAll our code is available as open-source on [github.com/formal-land/coq-of-noir](https://github.com/formal-land/coq-of-noir), and you are welcome to use it. We also provide all-included audit services to formally verify your smart contracts using `coq-of-noir`.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. It can be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually **impossible to happen again**.\\n\\n:::\\n\\n
\\n ![Noir](2024-11-01/noir.webp)\\n
\\n\\n## \u25fc\ufe0f Quick presentation of Noir\\n\\nNoir is designed as a small version of [\ud83e\udd80 Rust](https://www.rust-lang.org/) with many built-in constructs to make it more amenable to efficient compilation to zero-knowledge circuits. Because it is a smaller version of Rust, the surface of the language is reduced, which simplifies the development of tooling. In addition, as it shares similarities with Rust, we can reuse our knowledge from [coq-of-rust](https://github.com/formal-land/coq-of-rust), a formal verification tool for Rust, to propose an equivalent tool for Noir.\\n\\nA notable difference between Rust and Noir is that Noir has a much simpler memory management model: nothing is ever deallocated! As a result, the various kinds of pointers that exist in Rust (`Rc`, `RefCell`, ...) are not present in Noir. Most of the data is immutable, and mutations are encouraged to be done only on local variables.\\n\\nThe loops are restricted to `for` loops with bounds known at compile time, which simplifies the reasoning about them. For example, we are sure that all the loops terminate, which is required for the verification of the code.\\n\\nHere is an example of a Noir program that we will use in this series of blog posts. It showcases the use of mutable variables in a loop, as well as generic values such as `InputElements` that are known at compile time and specialized during the monomorphization phase to compile the code down to a circuit. 
It is part of the [noir_base64](https://github.com/noir-lang/noir_base64) library to encode an array of ASCII values into base64 values using finite field operations to stay efficient.\\n\\n```rust\\n/**\\n * @brief Take an array of ASCII values and convert into base64 values\\n **/\\npub fn base64_encode_elements(\\n input: [u8; InputElements]\\n) -> [u8; InputElements] {\\n let mut Base64Encoder = Base64EncodeBE::new();\\n let mut result: [u8; InputElements] = [0; InputElements];\\n\\n for i in 0..InputElements {\\n result[i] = Base64Encoder.get(input[i] as Field);\\n }\\n\\n result\\n}\\n```\\n\\n## 1\ufe0f\u20e3 Monomorphization\\n\\nIn this phase of compilation, all generic types and values are instantiated with their concrete values, as well as trait instances. The resulting code is much simpler as it only contains functions and types. If we translate the code to an untyped representation in Coq, we can even consider that the monomorphized code only contains functions. Thus, for convenience, we started doing our translation from the monomorphized level.\\n\\nThe [abstract syntax tree](https://en.wikipedia.org/wiki/Abstract_syntax_tree) for this level is in the Rust file [compiler/noirc_frontend/src/monomorphization/ast.rs](https://github.com/formal-land/coq-of-noir/blob/master/compiler/noirc_frontend/src/monomorphization/ast.rs) from the Noir\'s compiler. As an example, here is how the expressions are represented:\\n\\n```rust\\npub enum Expression {\\n Ident(Ident),\\n Literal(Literal),\\n Block(Vec),\\n Unary(Unary),\\n Binary(Binary),\\n Index(Index),\\n Cast(Cast),\\n For(For),\\n If(If),\\n Tuple(Vec),\\n ExtractTupleField(Box, usize),\\n Call(Call),\\n Let(Let),\\n Constrain(Box, Location, Option>),\\n Assign(Assign),\\n Semi(Box),\\n Break,\\n Continue,\\n}\\n```\\n\\nIf you look at the various constructors of this enum they correspond to the language\'s primitives presented in the reference manual of Noir. Expressions (`Ident`, `Binary`, `Call`, ...) 
and statements (`If`, `Let`, `Break`, ...) are mixed together. If we look at the definition of `Ident`:\\n\\n```rust\\npub struct Ident {\\n pub location: Option,\\n pub definition: Definition,\\n pub mutable: bool,\\n pub name: String,\\n pub typ: Type,\\n}\\n```\\n\\nand then at the definition of `Definition`:\\n\\n```rust\\npub enum Definition {\\n Local(LocalId),\\n Function(FuncId),\\n Builtin(String),\\n LowLevel(String),\\n // used as a foreign/externally defined unconstrained function\\n Oracle(String),\\n}\\n```\\n\\nwe get that most of the names have an associated _id_ that is a unique number. This is because in the monomorphization phase, we duplicate a lot of the definitions (once for each instantiation of a generic type), so we have to give them a unique id to distinguish them.\\n\\n### Translation\\n\\nWe translate the monomorphized code to Coq by doing:\\n\\n1. An extraction to JSON thanks to the `serde` serialization library in Rust.\\n2. Pretty-printing the resulting JSON to a Coq file with a Python script.\\n\\nWe find this development process to be rather efficient as the Python language is quite flexible and allows us to manipulate the JSON data easily. Compared to the work of a full compiler, which can be rather expensive computationally, what we do is mostly a translation from one syntax to another, and Python is a good fit.\\n\\nOur Noir example is monomorphized to the following code, which can be shown by the development option `--show-monomorphized` of `nargo`:\\n\\n```rust\\nfn base64_encode_elements$f4(input$l26: [u8; 36]) -> [u8; 36] {\\n let Base64Encoder$27 = new$f6();\\n let result$28 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0];\\n for i$29 in 0 .. 
36 {\\n result$l28[i$l29] = get$f7(Base64Encoder$l27, (input$l26[i$l29] as Field))\\n };\\n result$l28\\n}\\n```\\n\\nWe see that the generic variable `InputElements` is replaced by the constant value `36` as this is the value we use in the example we translate. All the identifiers have an additional `$...` suffix to make them unique. Thanks to the serialization library `serde`, we automatically get the JSON representation of this code that starts with:\\n\\n```json\\n{\\n \\"id\\": 4,\\n \\"name\\": \\"base64_encode_elements\\",\\n \\"parameters\\": [\\n [\\n 49,\\n false,\\n \\"input\\",\\n {\\n \\"Array\\": [\\n 118,\\n {\\n \\"Integer\\": [\\n \\"Unsigned\\",\\n \\"Eight\\"\\n ]\\n }\\n ]\\n }\\n ]\\n ],\\n \\"body\\": {\\n \\"Block\\": [\\n {\\n \\"Let\\": {\\n \\"id\\": 27,\\n \\"mutable\\": true,\\n \\"name\\": \\"Base64Encoder\\",\\n \\"expression\\": {\\n \\"Call\\": {\\n \\"func\\": {\\n \\"Ident\\": {\\n \\"location\\": {\\n \\"span\\": {\\n \\"start\\": 5312,\\n \\"end\\": 5315\\n },\\n \\"file\\": 70\\n },\\n \\"definition\\": {\\n \\"Function\\": 6\\n },\\n \\"mutable\\": false,\\n \\"name\\": \\"new\\",\\n \\"typ\\": {\\n \\"Function\\": [\\n [],\\n {\\n \\"Tuple\\": [\\n {\\n \\"Array\\": [\\n 64,\\n {\\n \\"Integer\\": [\\n \\"Unsigned\\",\\n \\"Eight\\"\\n ]\\n }\\n ]\\n }\\n ]\\n },\\n // much more JSON code\\n```\\n\\nThis is extremely verbose, and there is some information that we do not need, such as the locations of some of the items in the source. The advantage of JSON is that it is easy to parse and handle in most programming languages. 
In our case, here is an extract of the Python script that translates this JSON to Coq:\\n\\n```python\\n\'\'\'\\npub enum Expression {\\n Ident(Ident),\\n Literal(Literal),\\n Block(Vec),\\n Unary(Unary),\\n Binary(Binary),\\n Index(Index),\\n Cast(Cast),\\n For(For),\\n If(If),\\n Tuple(Vec),\\n ExtractTupleField(Box, usize),\\n Call(Call),\\n Let(Let),\\n Constrain(Box, Location, Option>),\\n Assign(Assign),\\n Semi(Box),\\n Break,\\n Continue,\\n}\\n\'\'\'\\ndef expression_to_coq(node) -> str:\\n node_type: str = list(node.keys())[0]\\n\\n if node_type == \\"Ident\\":\\n node = node[\\"Ident\\"]\\n return ident_to_coq(node)\\n\\n if node_type == \\"Literal\\":\\n node = node[\\"Literal\\"]\\n return alloc(literal_to_coq(node))\\n\\n if node_type == \\"Block\\":\\n node = node[\\"Block\\"]\\n return \\\\\\n \\"\\\\n\\".join(\\n expression_inside_block_to_coq(expression, index == len(node) - 1)\\n for index, expression in enumerate(node)\\n )\\n\\n if node_type == \\"Unary\\":\\n node = node[\\"Unary\\"]\\n return unary_to_coq(node)\\n\\n if node_type == \\"Binary\\":\\n node = node[\\"Binary\\"]\\n return binary_to_coq(node)\\n\\n # more cases...\\n```\\n\\nFor each kind of node in the AST, we write the original Rust type in comments, then let GitHub Copilot write the Python code and refine it. 
Here is the final Coq code that we get for this example:\\n\\n```coq\\nDefinition base64_encode_elements\u2084 (\u03b1 : list Value.t) : M.t :=\\n match \u03b1 with\\n | [input] =>\\n let input := M.alloc input in\\n let* result :=\\n let~ Base64Encoder := [[ M.copy_mutable (|\\n M.alloc (M.call_closure (|\\n M.read (| M.get_function (| \\"new\\", 6 |) |),\\n []\\n |))\\n |) ]] in\\n let~ result := [[ M.copy_mutable (|\\n M.alloc (Value.Array [\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer 
IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);\\n M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |)\\n ])\\n |) ]] in\\n do~ [[\\n M.for_ (|\\n M.read (| M.alloc (Value.Integer IntegerKind.U32 0) |),\\n M.read (| M.alloc (Value.Integer IntegerKind.U32 36) |),\\n fun (i : Value.t) =>\\n [[\\n M.alloc (M.assign (|\\n M.read (| M.alloc (M.index (|\\n M.read (| M.alloc (result) |),\\n M.read (| i |)\\n |)) |),\\n M.read (| M.alloc (M.call_closure (|\\n M.read (| M.get_function (| \\"get\\", 7 |) |),\\n [\\n M.read (| Base64Encoder |);\\n M.read (| M.alloc (M.cast (|\\n M.read (| M.alloc (M.index (|\\n M.read (| input |),\\n M.read (| i |)\\n |)) |),\\n IntegerKind.Field\\n |)) |)\\n ]\\n |)) |)\\n |))\\n ]]\\n |)\\n ]] in\\n [[\\n result\\n ]] in\\n M.read result\\n | _ => M.impossible \\"wrong number of arguments\\"\\n end.\\n```\\n\\nIf you carefully compare this Coq code to the original Noir version, you will see that the two are similar, although the Coq version is much more verbose with all the explicit memory allocations and reads. You might be wondering why we chose this specific representation. How did we know we had to use `M.for_`, for example, to represent the loops?\\n\\n### Semantics\\n\\nThis is where the semantics comes in. In the semantics phase, we define the meaning of each construct of the language in Coq. 
We reused our experience in building [coq-of-rust](https://github.com/formal-land/coq-of-rust) and [coq-of-solidity](https://github.com/formal-land/coq-of-solidity), where we also had to define the semantics of imperative languages in Coq.\\n\\nWe remove all the type information to avoid the differences between Coq\'s type system and the type system of Noir. All the values have the same type `Value.t`:\\n\\n```coq\\nModule Value.\\n Inductive t : Set :=\\n | Bool (b : bool)\\n | Integer (kind : IntegerKind.t) (integer : Z)\\n | String (s : string)\\n | FmtStr : string -> Z -> t -> t\\n | Pointer (pointer : Pointer.t t)\\n | Array (values : list t)\\n | Slice (values : list t)\\n | Tuple (values : list t)\\n | Closure : {\'(Value, M) : (Set * Set) @ list Value -> M} -> t.\\nEnd Value.\\n```\\n\\nWe have a monad `M.t` to represent the side-effects of Noir in Coq (memory mutation, non-termination for recursive calls, ...). We define this monad from the composition of two monads:\\n\\n- A free monad `LowM.t` that contains all the effects we cannot directly represent in Coq.\\n- An error monad `Result.t` to represent special control-flow operations, such as `break` and `continue`, which have to interrupt the execution of the current loop prematurely, and a panic value in case of assert failure, which must propagate up to the main function.\\n\\nThe definitions of these types are as follows:\\n\\n- The free monad:\\n ```coq\\n Module LowM.\\n Inductive t (A : Set) : Set :=\\n | Pure (value : A)\\n | CallPrimitive {B : Set} (primitive : Primitive.t B) (k : B -> t A)\\n | CallClosure (closure : Value.t) (args : list Value.t) (k : A -> t A)\\n | Let (e : t A) (k : A -> t A)\\n | Loop (body : t A) (k : A -> t A)\\n | Impossible (message : string).\\n End LowM.\\n ```\\n- The error monad:\\n ```coq\\n Module Result.\\n Inductive t : Set :=\\n | Ok (value : Value.t)\\n | Break\\n | Continue\\n | Panic {A : Set} (payload : A).\\n End Result.\\n ```\\n- The composition of 
the two monads:\\n ```coq\\n Module M.\\n Definition t : Set :=\\n LowM.t Result.t.\\n End M.\\n ```\\n\\nNote that since our type of values is always `Value.t`, we do not parameterize the monad `M.t` by the type of values.\\n\\n## \u2712\ufe0f Conclusion\\n\\nThanks to all the work above, we obtain a translation for a large subset of the Noir language to the Coq proof system, which type-checks and has a semantics. A difficulty with handling the code we produce from monomorphization is the unique identifier added after each name to make them unique. These identifiers are generated in a rather non-deterministic way that can depend on the machine that runs the compiler. In addition, they change every time we make changes to the source code.\\n\\nIn the next blog post, we will see how we prevent the identifiers from appearing in the generated code by working at a higher level than the monomorphization phase.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/10/28/verification-smooth-library-2","metadata":{"permalink":"/blog/2024/10/28/verification-smooth-library-2","source":"@site/blog/2024-10-28-verification-smooth-library-2.md","title":"\u2688 Verification of the Smoo.th library \u2013 2","description":"In this blog post, we detail the continuation of our work to formally verify the \u2688 Smoo.th library, which is an optimized implementation of elliptic curve operations in Solidity. 
We use our tool coq-of-solidity, representing any Solidity code in the generic proof assistant \ud83d\udc13 Coq, to verify the code for any execution path.","date":"2024-10-28T00:00:00.000Z","formattedDate":"October 28, 2024","tags":[{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Yul","permalink":"/blog/tags/yul"},{"label":"elliptic curves","permalink":"/blog/tags/elliptic-curves"}],"readingTime":6.86,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\u2688 Verification of the Smoo.th library \u2013 2","tags":["Solidity","Yul","elliptic curves"],"authors":[]},"unlisted":false,"prevItem":{"title":"\u25fc\ufe0f A formal verification tool for Noir \u2013 1","permalink":"/blog/2024/11/01/tool-for-noir-1"},"nextItem":{"title":"\ud83c\udf32 What we bring you","permalink":"/blog/2024/10/22/what-we-bring-to-you"}},"content":"In this blog post, we detail the continuation of our work to formally verify the [\u2688 Smoo.th](https://smoo.th/) library, which is an optimized implementation of elliptic curve operations in Solidity. We use our tool [coq-of-solidity](https://github.com/formal-land/coq-of-solidity), representing any Solidity code in the generic proof assistant [\ud83d\udc13 Coq](https://coq.inria.fr/), to verify the code for any execution path.\\n\\nIn particular, we cover the changes we made to use unoptimized Yul code and how we made a functional representation of the loop to compute the most significant bit of the scalars.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. 
It can be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually **impossible to happen again**.\\n\\n:::\\n\\n
\\n ![Smooth in forest](2024-10-28/forest-smooth.webp)\\n
\\n\\n## \ud83d\udc0c Unoptimized Yul\\n\\nWe are now verifying the code based on the unoptimized [Yul](https://docs.soliditylang.org/en/latest/yul.html) output of the Solidity compiler instead of the optimized one. As a consequence the code is a little bit more verbose, although in our present case the difference is limited as we are verifying a code that is already hand-optimized. The main advantage is that the variables are preserved instead of being moved to locations in the memory, which makes the verification easier, especially when handling loop invariants. A downside is that we now have to trust the correctness of the Solidity compiler\'s optimization passes.\\n\\nAs an example, here is how we now translate in Coq the loop to compute the most significant bit of the scalars with the unoptimized Yul code:\\n\\n```coq\\nlet~ var_ZZZ_83 := [[ 0 ]] in\\nlet_state~ \'(var_ZZZ_83, var_mask_63) :=\\n (* for loop *)\\n Shallow.for_\\n (* init state *)\\n (var_ZZZ_83, var_mask_63)\\n (* condition *)\\n (fun \'(var_ZZZ_83, var_mask_63) => [[\\n iszero ~(| var_ZZZ_83 |)\\n ]])\\n (* body *)\\n (fun \'(var_ZZZ_83, var_mask_63) =>\\n Shallow.lift_state_update\\n (fun var_ZZZ_83 => (var_ZZZ_83, var_mask_63))\\n (let~ var_ZZZ_83 := [[ add ~(| add ~(| sub ~(| 1, iszero ~(| and ~(| var_scalar_u_55, var_mask_63 |) |) |), shl ~(| 1, sub ~(| 1, iszero ~(| and ~(| shr ~(| 128, var_scalar_u_55 |), var_mask_63 |) |) |) |) |), add ~(| shl ~(| 2, sub ~(| 1, iszero ~(| and ~(| var_scalar_v_57, var_mask_63 |) |) |) |), shl ~(| 3, sub ~(| 1, iszero ~(| and ~(| shr ~(| 128, var_scalar_v_57 |), var_mask_63 |) |) |) |) |) |) ]] in\\n M.pure (BlockUnit.Tt, var_ZZZ_83)))\\n (* post *)\\n (fun \'(var_ZZZ_83, var_mask_63) =>\\n Shallow.lift_state_update\\n (fun var_mask_63 => (var_ZZZ_83, var_mask_63))\\n (let~ var_mask_63 := [[ shr ~(| 1, var_mask_63 |) ]] in\\n M.pure (BlockUnit.Tt, var_mask_63)))\\n```\\n\\nAs a reference, here is the original smart contract code, in hand-written 
Yul:\\n\\n```go\\nZZZ := 0\\nfor {} iszero(ZZZ) { mask := shr(1, mask) } {\\n ZZZ := add(\\n add(\\n sub(1, iszero(and(scalar_u, mask))),\\n shl(1, sub(1, iszero(and(shr(128, scalar_u), mask))))\\n ),\\n add(\\n shl(2, sub(1, iszero(and(scalar_v, mask)))),\\n shl(3, sub(1, iszero(and(shr(128, scalar_v), mask))))\\n )\\n )\\n}\\n```\\n\\nWe recognize the variables `var_ZZZ_83` and `var_mask_63`, corresponding to `ZZZ` and `mask` in the original code. They are made explicit in a state monad with the state `(var_ZZZ_83, var_mask_63)` for the loop.\\n\\nSome constructs were not yet handled by `coq-of-solidity`, as they appear in the unoptimized code but not in the optimized one. An example is the initialization part of the `for` loop, which always seems to be moved away in the optimized code. We added those missing cases to our tool to be able to translate the unoptimized Yul code of Smoo.th.\\n\\n## \ud83c\udf97\ufe0f Verification of the loop\\n\\nVerifying the `for` loop above can be challenging. Automated verification tools for Solidity typically do not fully handle loops, and instead unroll them three or four times to check the first iterations, which can miss some bugs.\\n\\nThe first step is to prove the loop is equivalent to a recursive function, as this will simplify reasoning. 
Here is a recursive function that computes the most significant bit of the scalars `u` and `v`:\\n\\n```coq\\nFixpoint get\\n (u_low u_high v_low v_high : U128.t) (over_index : nat) :\\n PointsSelector.t * nat :=\\n match over_index with\\n | O =>\\n (* We should never reach this case if the scalars\\n are not all zero *)\\n (PointsSelector.Build_t false false false false, O)\\n | S index =>\\n let selector := HighLow.get_selector\\n u_low u_high v_low v_high (Z.of_nat index) in\\n if PointsSelector.is_zero selector then\\n let new_over_index := index in\\n get u_low u_high v_low v_high new_over_index\\n else\\n let next_over_index := index in\\n (selector, next_over_index)\\n end.\\n```\\n\\nHere are some notable changes compared to the original `for` loop:\\n\\n- We decompose the scalars `u` and `v` of 256 bits into their high and low parts, `u_low`, `u_high`, `v_low`, and `v_high` of 128 bits each.\\n- We make explicit the scalars that we select with the `PointsSelector` type, which is a record with four boolean fields. In the original code, the `ZZZ` variable is used to group these four booleans into a single integer.\\n- We use a natural number `over_index` to represent the mask. We decrement it at each iteration until it reaches zero, proving by construction the termination of the function. The relation with the mask is:\\n\\n$$\\n\\\\text{mask} = \\\\lfloor 2^{\\\\text{over\\\\_index} - 1} \\\\rfloor\\n$$\\n\\nNote that this means that when the `over_index` is zero, then the `mask` is zero. This corresponds to the last case of the loop. 
We use the variable name `over_index` so that if we define:\\n\\n$$\\n\\\\text{over\\\\_index} = \\\\text{index} + 1\\n$$\\n\\nthen the relation with the mask is:\\n\\n$$\\n\\\\text{mask} = 2^{\\\\text{index}}\\n$$\\n\\nfor all cases except the last one.\\n\\n## \ud83d\udca1 Reasoning rule\\n\\nHere is the reasoning rule for the smart contract loops in Coq:\\n\\n```coq\\nLemma LoopStep codes environment {In Out : Set}\\n (init : In)\\n (body : In -> LowM.t Out)\\n (break_with : Out -> In + Out)\\n (k : Out -> LowM.t Out)\\n (output output_inter : Out)\\n state state_inter state\'\\n (H_body :\\n {{? codes, environment, state |\\n body init \u21d3 output_inter\\n | state_inter ?}}\\n )\\n (H_break_with :\\n match break_with output_inter with\\n | inr output_inter\' =>\\n {{? codes, environment, state_inter |\\n k output_inter\' \u21d3 output\\n | state\' ?}}\\n | inl next_init =>\\n {{? codes, environment, state_inter |\\n LowM.Loop next_init body break_with k \u21d3 output\\n | state\' ?}}\\n end\\n ) :\\n {{? codes, environment, state |\\n LowM.Loop init body break_with k \u21d3 output\\n | state\' ?}}.\\n```\\n\\nThis rule, to be used in combination with some reasoning by induction, allows us to verify that a certain property is true for any number of iterations of the loop. In the present case, we use it to prove that the recursive function `get` is equivalent to the `for` loop. 
Basically, it states that:\\n\\n- Assuming that the `body` of the loop evaluates to some output `output_inter`,\\n- if the rest of the execution, chosen by the `break_with` helper to either run the loop again from the next state or break out of it with the continuation `k`, evaluates to `output`,\\n- then the whole loop evaluates to `output`.\\n\\nHere, the output of the body of the loop contains the state of the state monad, that is to say, the two variables `ZZZ` and `mask`, and a special variable to break or continue the `for` loop iterations.\\n\\nDue to a lack of time, we only made a sketch of the proof of evaluation of this loop, admitting some intermediate lemmas about identities over the selector function. This work is available in the file [coq/CoqOfSolidity/contracts/scl/mulmuladdX_fullgen_b4/run.v](https://github.com/formal-land/coq-of-solidity/blob/develop/coq/CoqOfSolidity/contracts/scl/mulmuladdX_fullgen_b4/run.v).\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have seen how to reason about loops with `coq-of-solidity`. This example with bit-level arithmetic was rather complex, but the general idea is still to reason by induction, showing the equivalence with a recursive function, using the reasoning rule `LoopStep` above to step through the loop.\\n\\nIf you have smart contracts that you need to secure, talk to us! \ud83e\udd1d The cost of an attack always far outweighs the cost of an audit, and our solution, with full formal verification, is the most extensive in terms of coverage.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more, or comment on this post below! 
Feel free to DM us for any questions or requests!_\\n\\n:::"},{"id":"/2024/10/22/what-we-bring-to-you","metadata":{"permalink":"/blog/2024/10/22/what-we-bring-to-you","source":"@site/blog/2024-10-22-what-we-bring-to-you.md","title":"\ud83c\udf32 What we bring you","description":"We bring you the highest possible level of security \ud83e\uddb8 for your blockchain applications by using formal verification \u2728 optimized by AI solutions to keep the cost down. We believe that for systems holding a lot of value \ud83d\udcb0, it is necessary to use the most advanced techniques \u269b\ufe0f to ensure their security; otherwise attackers with large means (like North Korea \ud83c\uddf0\ud83c\uddf5, but not only) will be able to steal or damage the system by using these techniques themselves.","date":"2024-10-22T00:00:00.000Z","formattedDate":"October 22, 2024","tags":[],"readingTime":3.42,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83c\udf32 What we bring you","tags":[],"authors":[]},"unlisted":false,"prevItem":{"title":"\u2688 Verification of the Smoo.th library \u2013 2","permalink":"/blog/2024/10/28/verification-smooth-library-2"},"nextItem":{"title":"\u2688 Verification of the Smoo.th library \u2013 1","permalink":"/blog/2024/10/21/verification-smooth-library-1"}},"content":"We bring you the **highest possible level of security \ud83e\uddb8** for your blockchain applications by using **formal verification \u2728** optimized by **AI solutions** to keep the cost down. 
We believe that for systems **holding a lot of value \ud83d\udcb0**, it is necessary to use the most advanced techniques \u269b\ufe0f to ensure their security; otherwise attackers with large means (like **North Korea \ud83c\uddf0\ud83c\uddf5**, but not only) will be able to **steal or damage** the system by using these techniques themselves.\\n\\nIn this blog post we present how we work with customers to integrate full formal verification in their workflow and ensure that their code is **secure** in the best possible way.\\n\\n\x3c!-- It is possible to have a system which is fully secured \ud83d\udcaf once you have a mathematical proof of its security that is itself verified by a computer. This is what we provide with **formal verification**. In some sense this is good to know there is an end to the quest of finding security vulnerabilities . The issue is that formal verification is only as good as the **scope** of the code we verify, and the quality of the **security predicates** that we use. --\x3e\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. It can be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually **impossible to happen again**.\\n\\n:::\\n\\n
\\n ![Network in forest](2024-10-22/network-in-forest.webp)\\n
\\n\\n## \ud83d\udee1\ufe0f Why Formal Verification Matters\\n\\nSecurity is central to the long-term success of decentralized platforms. Traditional testing or security audits can catch many issues, but are not enough to guarantee the absence of bugs. Formal verification is a technique that **checks every possible input** of your program to ensure that it is always correct, for a given set of security properties. It works by mathematically reasoning about the code constructs and then checking this reasoning with a computer.\\n\\n## \ud83d\udd04 Our Process\\n\\nOur process is as follows:\\n\\n1. **Understanding Your Needs** We start by meeting with you to understand your system and your security requirements.\\n2. **Formal Modeling** We then create a formal model of your system in a proof assistant, using automated translation tools to avoid manual translation mistakes.\\n3. **Proof Generation** We then generate mathematical proofs that your system satisfies the security properties you require, using the latest techniques in proof automation to reduce the cost.\\n4. 
**Seamless Integration** We help you integrate the proofs into your CI pipeline to ensure that every commit is automatically checked for correctness.\\n\\n## \ud83c\udf81 Benefits You Can Expect\\n\\n* **Enhanced Security** You improve the security of your system by showing that whole classes of bugs are impossible.\\n* **Cost Savings** You prevent costly security incidents and reduce the need for extensive manual audits.\\n* **Investor Confidence** You demonstrate that your system is secure and that you protect your users.\\n* **Regulatory Compliance** Finally, you show that you have taken all necessary steps to meet regulatory requirements.\\n\\n## \ud83c\udf10 Why Choose Us?\\n\\n* **Expert Team** Our team has years of experience in formal verification, cryptography, and AI, with publications in all of these domains.\\n* **Cutting-Edge Tool** We use and develop the latest tools in formal verification to ensure we can provide the best possible service cost-effectively.\\n* **Customized Solutions** We customize our solutions to your system. You made a new language for zk-circuits or smart contracts and want the technology to verify it? We can help you.\\n\\n## \ud83e\udd1d Get in Touch\\n\\nReady to take your application\'s security to the next level? Reach out to us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land), and let\'s build a secure future together! \ud83d\ude80\\n\\n:::success Stay Tuned\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land) for more insights into formal verification, blockchain security, and how AI is changing the field. 
We share our case studies, tutorials, and the latest industry news to keep you ahead of the curve._\\n\\n:::"},{"id":"/2024/10/21/verification-smooth-library-1","metadata":{"permalink":"/blog/2024/10/21/verification-smooth-library-1","source":"@site/blog/2024-10-21-verification-smooth-library-1.md","title":"\u2688 Verification of the Smoo.th library \u2013 1","description":"In this blog post, we present the formal verification effort we started to show the absence of bugs in the \u2688 Smoo.th library, a library for optimized \u3030\ufe0f elliptic curve operations in Solidity. We are using our tool coq-of-solidity to make this non-trivial verification using the generic proof assistant \ud83d\udc13 Coq.","date":"2024-10-21T00:00:00.000Z","formattedDate":"October 21, 2024","tags":[{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Yul","permalink":"/blog/tags/yul"},{"label":"elliptic curves","permalink":"/blog/tags/elliptic-curves"}],"readingTime":10.45,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\u2688 Verification of the Smoo.th library \u2013 1","tags":["Solidity","Yul","elliptic curves"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83c\udf32 What we bring you","permalink":"/blog/2024/10/22/what-we-bring-to-you"},"nextItem":{"title":"\ud83e\ude81 Enhancements to coq-of-solidity \u2013 1","permalink":"/blog/2024/10/16/coq-of-solidity-enhanced-version-1"}},"content":"In this blog post, we present the formal verification effort we started to show the absence of bugs in the [\u2688 Smoo.th](https://smoo.th/) library, a library for optimized [\u3030\ufe0f elliptic curve](https://en.wikipedia.org/wiki/Elliptic_curve) operations in [Solidity](https://soliditylang.org/). 
We are using our tool [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) to carry out this non-trivial verification using the generic proof assistant [\ud83d\udc13 Coq](https://coq.inria.fr/).\\n\\nThe **Smoo.th** library is interesting as elliptic curves are at the core of many cryptographic protocols, including authentication protocols, and having a generic and fast implementation simplifies the development of [dApps](https://en.wikipedia.org/wiki/Decentralized_application) in environments with missing precompiles (like L1s) or missing circuits (like zero-knowledge layers).\\n\\nFrom a verification point of view, it is very challenging as it combines low-level operations (hand-optimized [Yul](https://docs.soliditylang.org/en/latest/yul.html) code with bit shifts, inlined functions, ...) with higher-level reasoning on elliptic curves and arithmetic \ud83d\udcaa.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udc8ccontact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to **mathematical reasoning on the code**. It can be integrated into your CI pipeline to check that every commit is fully correct **without doing a whole audit again**.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually **impossible to happen again**.\\n\\n:::\\n\\n
\\n ![Panda in forest](2024-10-21/panda-in-forest.webp)\\n
\\n\\n## \ud83d\uddfa\ufe0f Design of the library\\n\\nThe library is implemented in [SCL_mulmuladdX_fullgen_b4.sol](https://github.com/get-smooth/crypto-lib/blob/main/src/elliptic/SCL_mulmuladdX_fullgen_b4.sol) mostly in Yul. Given two points $G$ and $Q$ on an elliptic curve in the field $\\\\mathbb{F}_p$ and two scalars $u$ and $v$, it computes the following operation:\\n\\n$$\\nu \\\\cdot G + v \\\\cdot Q\\n$$\\n\\nwhere the points are represented as $(x, y)$ coordinates, the scalars are integers, and the curve is described in the short Weierstrass form.\\n\\nHere is a diagram to summarize the workflow of the library \ud83e\udd13:\\n\\n
\\n ![Smoo.th workflow](2024-10-21/smoo-th-diagram.svg)\\n
\\n\\nYou can find more details about the algorithms used in the library in the complete [audit report](https://github.com/get-smooth/crypto-lib/blob/main/doc/Audits/CRX_smooth_report_2024_07_11_v1.2.pdf) by [CryptoExperts](https://www.cryptoexperts.com/).\\n\\nOur goal is to show that all these steps are equivalent to the naive operation of adding the points $u \\\\cdot G$ and $v \\\\cdot Q$ on the elliptic curve, which would only consume more gas, and that the library is therefore free of bugs. Note that there are a few exceptional points, for example, when $G$ is the opposite of $Q$, where the library does not work as it is and runs another algorithm instead. We need to make these points explicit in the proof and assume we are not in these special cases.\\n\\n## \ud83d\udc13 Translation to Coq\\n\\nIn order to formally verify that the code is correct for any possible inputs, we need to first translate it to a proof language, in our case Coq. We run our tool `coq-of-solidity` on the optimized Yul code as generated by the Solidity compiler, which further optimizes the already hand-optimized code of the library. 
All our verification work is available on GitHub in the folder [coq/CoqOfSolidity/contracts/scl/mulmuladdX_fullgen_b4](https://github.com/formal-land/coq-of-solidity/tree/develop/coq/CoqOfSolidity/contracts/scl/mulmuladdX_fullgen_b4) of the [coq-of-solidity\'s repository](https://github.com/formal-land/coq-of-solidity).\\n\\nHere is an example of hand-written Yul code from the contract, to compute the most-significant bit from the scalars:\\n\\n```go\\nZZZ := 0\\nfor {} iszero(ZZZ) { mask := shr(1, mask) } {\\n ZZZ := add(\\n add(\\n sub(1, iszero(and(scalar_u, mask))),\\n shl(1, sub(1, iszero(and(shr(128, scalar_u), mask))))\\n ),\\n add(\\n shl(2, sub(1, iszero(and(scalar_v, mask)))),\\n shl(3, sub(1, iszero(and(shr(128, scalar_v), mask))))\\n )\\n )\\n}\\n```\\n\\nThe Yul code after optimization by the Solidity compiler is:\\n\\n```go\\nmstore(0xe0, 0)\\nfor { } iszero(mload(0xe0)) { mstore(0x01a0, shr(1, mload(0x01a0))) } {\\n mstore(0xe0, add(\\n add(\\n sub(1, iszero(and(mload(0x0120), mload(0x01a0)))),\\n shl(1, sub(1, iszero(and(shr(128, mload(0x0120)), mload(0x01a0)))))\\n ),\\n add(\\n shl(2, sub(1, iszero(and(mload(0x0160), mload(0x01a0))))),\\n shl(3, sub(1, iszero(and(shr(128, mload(0x0160)), mload(0x01a0)))))\\n )\\n ))\\n}\\n```\\n\\nAs we can see, the variable names were replaced by fixed memory addresses, which will make the verification more complex. 
The Coq code that we generate with `coq-of-solidity` is:\\n\\n```coq\\ndo~ [[ mstore ~(| 0xe0, 0 |) ]] in\\nlet_state~ \'tt :=\\n (* for loop *)\\n Shallow.for_\\n (* init state *)\\n tt\\n (* condition *)\\n (fun \'tt => [[\\n iszero ~(| mload ~(| 0xe0 |) |)\\n ]])\\n (* body *)\\n (fun \'tt =>\\n do~ [[\\n mstore ~(| 0xe0, add ~(|\\n add ~(|\\n sub ~(| 1, iszero ~(| and ~(| mload ~(| 0x0120 |), mload ~(| 0x01a0 |) |) |) |),\\n shl ~(| 1, sub ~(| 1, iszero ~(| and ~(| shr ~(| 128, mload ~(| 0x0120 |) |), mload ~(| 0x01a0 |) |) |) |) |)\\n |),\\n add ~(|\\n shl ~(| 2, sub ~(| 1, iszero ~(| and ~(| mload ~(| 0x0160 |), mload ~(| 0x01a0 |) |) |) |) |),\\n shl ~(| 3, sub ~(| 1, iszero ~(| and ~(| shr ~(| 128, mload ~(| 0x0160 |) |), mload ~(| 0x01a0 |) |) |) |) |)\\n |)\\n |) |)\\n ]] in\\n M.pure (BlockUnit.Tt, tt))\\n (* post *)\\n (fun \'tt =>\\n do~ [[ mstore ~(| 0x01a0, shr ~(| 1, mload ~(| 0x01a0 |) |) |) ]] in\\n M.pure (BlockUnit.Tt, tt))\\ndefault~ tt in\\n```\\n\\nWe use a monadic notation `f ~(| x1, ..., xn |)` to represent the side-effects of the EVM, such as memory read and write with `mload` and `mstore`. The function `Shallow.for_` represents a for loop with an initial state, a condition, a body, and a post-action. We implement it using a primitive from our monad to represent potentially non-terminating loops.\\n\\nHere the proper state of the loop is empty (value `tt`) and we instead read and modify the memory with `mload` and `mstore`. Ideally we should have `(ZZZ, mask)` as the state of the loop to simplify the verification. For our next attempt at verifying this code, we will look at the Yul code generated before optimizations by the Solidity compiler in order to keep these variables.\\n\\n## \ud83d\udd2c What we verified\\n\\nWe are not done yet with the verification of this library. 
For now, we have verified that:\\n\\n- The addition operation `ecAddn2` is implemented as specified.\\n- The doubling and negation operation `ecDblNeg` is implemented as in the specification, in an inlined manner.\\n- The pre-computations of the sums of the possible combinations of points are correct.\\n- The retrieval of the pre-computed sums from the current bits of the scalars is correct.\\n\\nFor example, here is our statement for the execution of the `ecAddn2` operation:\\n\\n```coq\\nLemma run_usr\'dollar\'ecAddn2 codes environment state\\n (P1_X P1_Y P1_ZZ P1_ZZZ P2_X P2_Y : U256.t) (p : U256.t) :\\n let output :=\\n ecAddn2 p\\n {| PZZ.X := P1_X; PZZ.Y := P1_Y; PZZ.ZZ := P1_ZZ; PZZ.ZZZ := P1_ZZZ |}\\n {| PA.X := P2_X; PA.Y := P2_Y |} in\\n let output := Result.Ok (output.(PZZ.X), output.(PZZ.Y), output.(PZZ.ZZ), output.(PZZ.ZZZ)) in\\n {{? codes, environment, Some state |\\n Contract_91.Contract_91_deployed.usr\'dollar\'ecAddn2 P1_X P1_Y P1_ZZ P1_ZZZ P2_X P2_Y p \u21d3\\n output\\n | Some state ?}}.\\n```\\n\\nIt says that in a given environment (`codes`, `environment`, `state`), the execution of the translated function `Contract_91.Contract_91_deployed.usr\'dollar\'ecAddn2` gives the same result as a hand-written purely functional version `ecAddn2` operating on data types directly representing the curve points (`PZZ.t` and `PA.t`).\\n\\nWe verify this execution in a straightforward way by unfolding the definition and executing it step by step:\\n\\n```coq\\nProof.\\n simpl.\\n unfold Contract_91.Contract_91_deployed.usr\'dollar\'ecAddn2.\\n l. 
{\\n repeat (l; [repeat cu; p|]).\\n p.\\n }\\n p.\\nQed.\\n```\\n\\nFor the verification of the inlined`ecDblNeg` operation, here is the memory state just after computing the coordinates of the doubled point:\\n\\n```coq\\n[\\n mem0; mem1; Pure.add 0 2048; mem3; mem4;\\n Pure.addmod\\n (Pure.mulmod\\n (Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)\\n (Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p)\\n (Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)\\n (Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p) p)\\n (Pure.mulmod (Pure.sub p 2)\\n (Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p) p) p;\\n Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p;\\n Pure.mulmod\\n (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p)\\n P_127.(PZZ.ZZZ) p;\\n Pure.addmod\\n (Pure.mulmod\\n (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p)\\n P_127.(PZZ.Y) p)\\n (Pure.mulmod\\n (Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)\\n (Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p)\\n (Pure.addmod\\n (Pure.addmod\\n (Pure.mulmod\\n (Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)\\n (Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p)\\n (Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)\\n (Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p) p)\\n (Pure.mulmod (Pure.sub p 2)\\n (Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p) p) p)\\n (Pure.sub p (Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) 
p))\\n p) p) p;\\n HighLow.merge u_high u_low; 480; HighLow.merge v_high v_low; Pure.add 0 2048; 2 ^ 126;\\n Pure.mulmod (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) P_127.(PZZ.ZZ) p;\\n p; Q.(PA.Y); Q\'.(PA.X); Q\'.(PA.Y); p; a; G.(PA.X); G.(PA.Y); G\'.(PA.X); G\'.(PA.Y);\\n 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;\\n 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;\\n P0.(PZZ.X); P0.(PZZ.Y); P0.(PZZ.ZZ); P0.(PZZ.ZZZ);\\n P1.(PZZ.X); P1.(PZZ.Y); P1.(PZZ.ZZ); P1.(PZZ.ZZZ);\\n P2.(PZZ.X); P2.(PZZ.Y); P2.(PZZ.ZZ); P2.(PZZ.ZZZ);\\n P3.(PZZ.X); P3.(PZZ.Y); P3.(PZZ.ZZ); P3.(PZZ.ZZZ);\\n P4.(PZZ.X); P4.(PZZ.Y); P4.(PZZ.ZZ); P4.(PZZ.ZZZ);\\n P5.(PZZ.X); P5.(PZZ.Y); P5.(PZZ.ZZ); P5.(PZZ.ZZZ);\\n P6.(PZZ.X); P6.(PZZ.Y); P6.(PZZ.ZZ); P6.(PZZ.ZZZ);\\n P7.(PZZ.X); P7.(PZZ.Y); P7.(PZZ.ZZ); P7.(PZZ.ZZZ);\\n P8.(PZZ.X); P8.(PZZ.Y); P8.(PZZ.ZZ); P8.(PZZ.ZZZ);\\n P9.(PZZ.X); P9.(PZZ.Y); P9.(PZZ.ZZ); P9.(PZZ.ZZZ);\\n P10.(PZZ.X); P10.(PZZ.Y); P10.(PZZ.ZZ); P10.(PZZ.ZZZ);\\n P11.(PZZ.X); P11.(PZZ.Y); P11.(PZZ.ZZ); P11.(PZZ.ZZZ);\\n P12.(PZZ.X); P12.(PZZ.Y); P12.(PZZ.ZZ); P12.(PZZ.ZZZ);\\n P13.(PZZ.X); P13.(PZZ.Y); P13.(PZZ.ZZ); P13.(PZZ.ZZZ);\\n P14.(PZZ.X); P14.(PZZ.Y); P14.(PZZ.ZZ); P14.(PZZ.ZZZ);\\n P15.(PZZ.X); P15.(PZZ.Y); P15.(PZZ.ZZ); P15.(PZZ.ZZZ);\\n 0; p\\n]\\n```\\n\\nThe state is very large as we are verifying a large function (250 lines) directly mutating the memory. We recognize the parameters of the function (`Q`, `Q\'`, `G`, `G\'`) as well as the pre-computed points (`P0`, `P1`, `P2`, ..., `P16`). 
We also see the computation of the coordinates of the doubled point, stored at fixed memory addresses.\\n\\nWe define the `dbl_neg_P_127` point as:\\n\\n```coq\\nset (dbl_neg_P_127 := ecDblNeg a p P_127).\\n```\\n\\nWe then rewrite the memory locations of the doubled point with the coordinates of `dbl_neg_P_127`:\\n\\n```coq\\napply_memory_update_at P_127_X_address dbl_neg_P_127.(PZZ.X); [reflexivity|].\\napply_memory_update_at P_127_Y_address dbl_neg_P_127.(PZZ.Y); [reflexivity|].\\napply_memory_update_at P_127_ZZ_address dbl_neg_P_127.(PZZ.ZZ); [reflexivity|].\\napply_memory_update_at P_127_ZZZ_address dbl_neg_P_127.(PZZ.ZZZ); [reflexivity|].\\n```\\n\\ngiving us the new state:\\n\\n```coq\\n[\\n mem0; mem1; Pure.add 0 2048; mem3; mem4; dbl_neg_P_127.(PZZ.X);\\n Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p;\\n dbl_neg_P_127.(PZZ.ZZZ); dbl_neg_P_127.(PZZ.Y);\\n HighLow.merge u_high u_low; 480; HighLow.merge v_high v_low; Pure.add 0 2048; 2 ^ 126;\\n dbl_neg_P_127.(PZZ.ZZ);\\n p; Q.(PA.Y); Q\'.(PA.X); Q\'.(PA.Y); p; a; G.(PA.X); G.(PA.Y); G\'.(PA.X); G\'.(PA.Y);\\n 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;\\n 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;\\n P0.(PZZ.X); P0.(PZZ.Y); P0.(PZZ.ZZ); P0.(PZZ.ZZZ);\\n P1.(PZZ.X); P1.(PZZ.Y); P1.(PZZ.ZZ); P1.(PZZ.ZZZ);\\n P2.(PZZ.X); P2.(PZZ.Y); P2.(PZZ.ZZ); P2.(PZZ.ZZZ);\\n P3.(PZZ.X); P3.(PZZ.Y); P3.(PZZ.ZZ); P3.(PZZ.ZZZ);\\n P4.(PZZ.X); P4.(PZZ.Y); P4.(PZZ.ZZ); P4.(PZZ.ZZZ);\\n P5.(PZZ.X); P5.(PZZ.Y); P5.(PZZ.ZZ); P5.(PZZ.ZZZ);\\n P6.(PZZ.X); P6.(PZZ.Y); P6.(PZZ.ZZ); P6.(PZZ.ZZZ);\\n P7.(PZZ.X); P7.(PZZ.Y); P7.(PZZ.ZZ); P7.(PZZ.ZZZ);\\n P8.(PZZ.X); P8.(PZZ.Y); P8.(PZZ.ZZ); P8.(PZZ.ZZZ);\\n P9.(PZZ.X); P9.(PZZ.Y); P9.(PZZ.ZZ); P9.(PZZ.ZZZ);\\n P10.(PZZ.X); P10.(PZZ.Y); P10.(PZZ.ZZ); P10.(PZZ.ZZZ);\\n P11.(PZZ.X); P11.(PZZ.Y); P11.(PZZ.ZZ); P11.(PZZ.ZZZ);\\n P12.(PZZ.X); P12.(PZZ.Y); P12.(PZZ.ZZ); P12.(PZZ.ZZZ);\\n P13.(PZZ.X); 
P13.(PZZ.Y); P13.(PZZ.ZZ); P13.(PZZ.ZZZ);\\n P14.(PZZ.X); P14.(PZZ.Y); P14.(PZZ.ZZ); P14.(PZZ.ZZZ);\\n P15.(PZZ.X); P15.(PZZ.Y); P15.(PZZ.ZZ); P15.(PZZ.ZZZ);\\n 0; p\\n]\\n```\\n\\nStill large but much cleaner!\\n\\n## \ud83d\udc40 What remains to be done\\n\\nThere are two main parts that remain to be done in order to have a full formal verification of the library:\\n\\n1. We need to complete the proof stating that the execution of the smart contract is equivalent to the execution of a purely functional version written in Coq, especially using recursive functions instead of `for` loops. Reasoning on the loops is complex; in the current version, we unroll the loops once in order to have a first step towards the full proof. As the memory used by the main function is quite large, we will first need to change the code we verify by looking at the Yul code generated before optimizations by the Solidity compiler.\\n2. We need to show that the purely functional version of the library is equivalent to the plain addition and scalar multiplication. We have only started this work. The main challenge is to show that we can remove the loop by doing the bitwise addition. This will require some bit-arithmetic reasoning, as well as field arithmetic for the operations modulo the prime number $p$.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have seen how the **Smoo.th** library works at a high level, how we can start verifying it, and what challenges we face. This is also an interesting example to improve our tool `coq-of-solidity` and develop reasoning primitives for cryptographic code. We will continue this work in the coming weeks to verify more parts of this library.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land), or comment on this post below! 
Feel free to DM us for any formal verification services you need._\\n\\n:::"},{"id":"/2024/10/16/coq-of-solidity-enhanced-version-1","metadata":{"permalink":"/blog/2024/10/16/coq-of-solidity-enhanced-version-1","source":"@site/blog/2024-10-16-coq-of-solidity-enhanced-version-1.md","title":"\ud83e\ude81 Enhancements to coq-of-solidity \u2013 1","description":"We present improvements we made to our tool coq-of-solidity to formally verify Solidity smart contracts for any advanced properties, relying on the proof assistant \ud83d\udc13 Coq. The idea is to be able to prove the full absence of bugs \u2728 in very complex contracts, like L1 verifiers for zero-knowledge L2s \ud83d\udd75\ufe0f, or contracts with very large amounts of money \ud83d\udcb0 (in the billions).","date":"2024-10-16T00:00:00.000Z","formattedDate":"October 16, 2024","tags":[{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"monad","permalink":"/blog/tags/monad"},{"label":"effects","permalink":"/blog/tags/effects"},{"label":"Yul","permalink":"/blog/tags/yul"},{"label":"loops","permalink":"/blog/tags/loops"},{"label":"mutations","permalink":"/blog/tags/mutations"}],"readingTime":8.82,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\ude81 Enhancements to coq-of-solidity \u2013 1","tags":["Solidity","monad","effects","Yul","loops","mutations"],"authors":[]},"unlisted":false,"prevItem":{"title":"\u2688 Verification of the Smoo.th library \u2013 1","permalink":"/blog/2024/10/21/verification-smooth-library-1"},"nextItem":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 3","permalink":"/blog/2024/10/15/verification-move-sui-type-checker-3"}},"content":"We present improvements we made to our tool [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) to formally verify [Solidity](https://soliditylang.org/) smart contracts for any advanced properties, relying on the proof assistant [\ud83d\udc13 Coq](https://coq.inria.fr/). 
The idea is to be able to prove the **full absence of bugs \u2728** in **very complex contracts**, like L1 verifiers for **zero-knowledge L2s \ud83d\udd75\ufe0f**, or contracts with **very large amounts of money \ud83d\udcb0** (in the billions).\\n\\nIn this blog post, we present how we developed an effect inference mechanism to translate optimized [Yul](https://docs.soliditylang.org/en/latest/yul.html) code combining variable mutations and control flow with loops and nested premature returns (`break`, `continue`, and `leave`) to a clean \ud83e\uddfc purely functional representation in the proof system Coq.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::info\\n\\nWe will be talking about this work at the [Encode London Conference](https://lu.ma/encode-london-24) on Friday, October 25, 2024 \ud83d\udce2.\\n\\n:::\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds. It can be integrated into your CI pipeline to make sure that every commit is correct without running a full audit again.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually impossible to happen again.\\n\\n:::\\n\\n
\\n ![Frozen Solidity rock](2024-10-16/frozen-solidity.webp)\\n
\\n\\n## \ud83e\udde8 The issue\\n\\nYul is the intermediate language of the Solidity compiler that we translate to the Coq proof system to formally verify properties of smart contracts. The issue is that it has slightly different behaviors than the Coq language. In particular, it allows for variable mutations and imperative loops (`for` loops) with premature exits that have no native equivalents in purely functional languages like the ones used for formal verification.\\n\\nHere is a short example of Yul code that is impossible to translate to Coq as it is:\\n\\n```go\\nfunction rugby() -> x {\\n let i := 0\\n x := 0\\n for { } lt(i, 10) { i := add(i, 1) } {\\n x := add(x, i)\\n if eq(i, 5) {\\n leave\\n }\\n }\\n}\\n```\\n\\nIt uses the variable `x` to store the sum of the increasing sequence of integers `1`, `2`, `3`, ... but prematurely stops the loop when `i` reaches `5` and returns the final value of `x`.\\n\\nTo represent this code in a purely functional language, we need to:\\n\\n- Make explicit the fact that we operate on a local state, that is to say, the couple of the two variables `i` and `x`.\\n- Represent the control flow of the loop, which repeats its body until the condition `eq(i, 5)` is satisfied and then bubbles up to the body of the function to return the final result `x`.\\n\\n## Why is it important?\\n\\nHaving a purely functional representation of the Yul code is important as verifying functional programs is easier than verifying imperative ones, especially in the case of a system like Coq that is based on functional programming even at the logical level.\\n\\nIdeally, such a translation should be done automatically so that we are not at risk of making mistakes and can focus our time on the verification work. This would allow us to formally verify properties of smart contracts or similar imperative programs more efficiently. 
Note that in Yul, in addition to mutations on variables, there are also mutations on the contract\'s memory and storage, which we do not cover here.\\n\\n## The solution\\n\\nOur solution is a tool that performs effect inference on the Yul code to determine which variables might be mutated at each point of the program, and then propagates the results in both cases: when the execution continues to the next instruction and when it bubbles up.\\n\\n### \ud83c\udfd7\ufe0f The tool\\n\\nWe wrote our tool in \ud83d\udc0d Python, for ease of development, parsing the Yul code from the JSON output of the Solidity compiler and outputting a Coq file that represents the functional version of the code. Yul is a rather pleasant language, optimized for formal verification and with very few constructs. Our code is available on our GitHub repository [github.com/formal-land/coq-of-solidity](https://github.com/formal-land/coq-of-solidity), in a pull request that is about to be merged.\\n\\nHere is the header of our main Python function, which translates Yul statements to Coq:\\n\\n```python\\ndef statement_to_coq(node) -> tuple[Callable[[set[str]], str], set[str], set[str]]:\\n```\\n\\nIt takes a JSON `node` corresponding to a statement (assignment, `if`, `for`, `leave`, ...) and returns a triple with:\\n\\n1. A function that takes the yet-to-be-determined mutated variables in the surrounding block and returns the Coq code of the statement.\\n2. The set of newly declared variables.\\n3. 
The set of mutated variables.\\n\\nFrom this information we can infer the variables that are mutated at each point of the program and propagate them.\\n\\n### \ud83d\udd0d Example\\n\\nAs an example, here is the generated Coq translation of our \ud83c\udfc9 `rugby` example above:\\n\\n```coq showLineNumbers\\nDefinition rugby : M.t U256.t :=\\n let~ \'(_, result) :=\\n let~ i := [[ 0 ]] in\\n let~ x := [[ 0 ]] in\\n let_state~ \'(i, x) :=\\n (* for loop *)\\n Shallow.for_\\n (* init state *)\\n (i, x)\\n (* condition *)\\n (fun \'(i, x) => [[\\n lt ~(| i, 10 |)\\n ]])\\n (* body *)\\n (fun \'(i, x) =>\\n Shallow.lift_state_update\\n (fun x => (i, x))\\n (let~ x := [[ add ~(| x, i |) ]] in\\n let_state~ \'tt := [[\\n Shallow.if_ (|\\n eq ~(| i, 5 |),\\n M.pure (BlockUnit.Leave, tt),\\n tt\\n |)\\n ]] default~ x in\\n M.pure (BlockUnit.Tt, x)))\\n (* post *)\\n (fun \'(i, x) =>\\n Shallow.lift_state_update\\n (fun i => (i, x))\\n (let~ i := [[ add ~(| i, 1 |) ]] in\\n M.pure (BlockUnit.Tt, i)))\\n default~ x in\\n M.pure (BlockUnit.Tt, x)\\n in\\n M.pure result.\\n```\\n\\nOn lines `3` and `4` we see that we use normal `let` declarations for the variables `i` and `x`:\\n\\n```coq\\nlet~ i := [[ 0 ]] in\\nlet~ x := [[ 0 ]] in\\n```\\n\\nThe notation `let~` is a monadic notation to represent the side-effects of the EVM (storage updates, contract calls, ...) but the variables `i` and `x` are plain Coq variables, which will facilitate the formal verification process later.\\n\\nIn line `5`, we see that we consider the `for` loop to have a two-variable state `(i, x)`:\\n\\n```coq\\nlet_state~ \'(i, x) :=\\n (* for loop *)\\n Shallow.for_\\n (* init state *)\\n (i, x)\\n```\\n\\nThe condition depends on the whole state, even if it only uses a part of it:\\n\\n```coq\\n(* condition *)\\n(fun \'(i, x) => [[\\n lt ~(| i, 10 |)\\n]])\\n```\\n\\nThe body is more interesting. 
We only modify the variable `x` but we need to read and return the whole state `(i, x)`, so we start with a lift operation:\\n\\n```coq\\n(* body *)\\n(fun \'(i, x) =>\\n Shallow.lift_state_update\\n (fun x => (i, x))\\n```\\n\\nThen we update the variable `x` with a standard variable declaration as if the variable was immutable:\\n\\n```coq\\n(let~ x := [[ add ~(| x, i |) ]] in\\n```\\n\\nThe updated value of the variable `x` is propagated at the end of the body:\\n\\n```coq\\nM.pure (BlockUnit.Tt, x)))\\n```\\n\\nThis is how we translate the inner `if`:\\n\\n```coq\\nlet_state~ \'tt := [[\\n Shallow.if_ (|\\n eq ~(| i, 5 |),\\n M.pure (BlockUnit.Leave, tt),\\n tt\\n |)\\n]] default~ x in\\n```\\n\\nIf the condition is satisfied, we return the special value `BlockUnit.Leave` that will be interpreted as a premature exit of the function and activate the bubble-up mechanism. The associated state is the special empty value `tt` as there are no mutations in the `if` statement. We use `default~ x` at the next line to say that we complete the `tt` state with the value `x` if we are bubbling up.\\n\\nThe binding of the expression of `default~` is done after the `let_state~` to be able to retrieve parts of the state that might have been modified, if needed. 
This is, for example, the case for the `for` loop where we say that we first get the values of the two variables `i` and `x`:\\n\\n```coq\\nlet_state~ \'(i, x) :=\\n (* for loop *)\\n```\\n\\nand then propagate only the state `x` in case of a premature exit:\\n\\n```coq\\ndefault~ x in\\n```\\n\\nat the line `33`.\\n\\n### \ud83d\udd2e Monad\\n\\nThe [monad](https://en.wikipedia.org/wiki/Monad_(functional_programming)) we use to represent the bubble-up mechanism is the following:\\n\\n```coq\\nModule Shallow.\\n Definition t (State : Set) : Set :=\\n M.t (BlockUnit.t * State).\\n```\\n\\nwhere:\\n\\n- `M.t` is the monad representing the side-effects of the EVM,\\n- `BlockUnit.t` is a type representing the different modes of the bubble-up mechanism: no bubble-up, or a bubble-up with a `break`, `continue`, or `leave` instruction,\\n- `State` is the type of the current state that we might be writing to.\\n\\nWe define the notation `let_state~ ... default~ ... in` with:\\n\\n```coq\\nNotation \\"\'let_state~\' pattern \':=\' e \'default~\' state \'in\' k\\" :=\\n (let_state e (fun pattern => (state, k)))\\n```\\n\\nand the function:\\n\\n```coq\\nDefinition let_state {State1 State2 : Set}\\n (expression : t State1) (body : State1 -> State2 * t State2) :\\n t State2 :=\\n M.strong_let_ expression (fun value =>\\n let \'(mode, state1) := value in\\n match mode with\\n (* no bubble-up, do not use the default state *)\\n | BlockUnit.Tt => snd (body state1)\\n (* bubble-up, use the default state and keep the same bubble-up mode *)\\n | _ => M.pure (mode, fst (body state1))\\n end).\\n```\\n\\nYou can also look at the definitions of the `Shallow.if_` and `Shallow.for_` functions in our code. For loops, we use a non-termination effect of the underlying monad `M.t`. 
This is because loops can be infinite, and this is not allowed in Coq.\\n\\n## Application\\n\\nWe are using the new translation above to formally verify the implementation of hand-optimized Yul code using loops and mutations to implement cryptographic operations in an efficient way. We believe that this translation would work equally well for any other example of Yul code, enabling the formal verification of arbitrary Solidity or Yul code in a more functional way.\\n\\n## \u2712\ufe0f Conclusion\\n\\nWe have shown how we can automatically translate arbitrary Yul code into a purely functional form \ud83c\udf1f, excluding mutations of the memory and the storage, in order to simplify further formal verification operations \ud83d\ude42.\\n\\nOne task left to be done is to prove that this transformation is correct, showing it equivalent to our initial and simpler Yul semantics where variables are represented as string keys in a map. We believe this is possible by generating a proof on a case-by-case basis for each transformed program, working by unification and exploring all the branches. But this remains to be done.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land), or comment on this post below! Feel free to DM us for any formal verification services you need._\\n\\n:::"},{"id":"/2024/10/15/verification-move-sui-type-checker-3","metadata":{"permalink":"/blog/2024/10/15/verification-move-sui-type-checker-3","source":"@site/blog/2024-10-15-verification-move-sui-type-checker-3.md","title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 3","description":"In the previous blog post, we have seen how we represent side-effects from the Rust code of the Sui\'s Move type-checker of bytecode in Coq. This translation represents about 3,200 lines of Coq code excluding comments. 
We need to trust that this translation is faithful to the original Rust code, as we generate it by hand or with GitHub Copilot.","date":"2024-10-15T00:00:00.000Z","formattedDate":"October 15, 2024","tags":[{"label":"monad","permalink":"/blog/tags/monad"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Sui","permalink":"/blog/tags/sui"}],"readingTime":5.795,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 3","tags":["monad","Rust","Sui"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\ude81 Enhancements to coq-of-solidity \u2013 1","permalink":"/blog/2024/10/16/coq-of-solidity-enhanced-version-1"},"nextItem":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 2","permalink":"/blog/2024/10/14/verification-move-sui-type-checker-2"}},"content":"In the [previous blog post](/blog/2024/10/14/verification-move-sui-type-checker-2), we have seen how we represent side-effects from the Rust code of the [Sui](https://sui.io/)\'s [Move](https://sui.io/move) type-checker of bytecode in Coq. This translation represents about 3,200 lines of Coq code excluding comments. We need to trust that this translation is faithful to the original Rust code, as we generate it by hand or with GitHub Copilot.\\n\\nIn this blog post, we present how we test this translation to ensure it is correct by running the type-checker on each opcode of the Move bytecode and comparing the results with the Rust code, testing the success and error cases.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds. 
It can be integrated into your CI pipeline to make sure that every commit is correct without running a full audit again.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually impossible to happen again.\\n\\n:::\\n\\n
\\n ![Forge in forest](2024-10-15/rock-with-mirror.webp)\\n
\\n\\n## The type-checker\\n\\nThe type-checker of Move Sui is a large piece of Rust code with a core function `verify_instr` in [move-bytecode-verifier/src/type_safety.rs](https://github.com/formal-land/move-sui/blob/main/crates/move-bytecode-verifier/src/type_safety.rs) that type-checks each individual instruction in a Move bytecode. There are exactly `77` different opcodes. To give you an example, here is how it type-checks the opcode `Add`:\\n\\n```rust\\nlet operand1 = safe_unwrap_err!(verifier.stack.pop());\\nlet operand2 = safe_unwrap_err!(verifier.stack.pop());\\nif operand1.is_integer() && operand1 == operand2 {\\n verifier.push(meter, operand1)?;\\n} else {\\n return Err(verifier.error(StatusCode::INTEGER_OP_TYPE_MISMATCH_ERROR, offset));\\n}\\n```\\n\\nThe Move virtual machine is stack-based. The type-checker maintains a stack of types, corresponding to the types of the values that should be on the stack at the current point of the execution. For the `Add` operation it pops the last two types from the stack, checks that they are integers and equal, and pushes the result type on the stack. The result of an addition is of the same type as the operands. In case of an error, it returns the status code `INTEGER_OP_TYPE_MISMATCH_ERROR`.\\n\\nWe translate this code to Coq in the following way:\\n\\n```coq\\nletS! operand1 :=\\n liftS! TypeSafetyChecker.lens_self_stack AbstractStack.pop in\\nletS! operand1 := return!toS! $ safe_unwrap_err operand1 in\\nletS! operand2 :=\\n liftS! TypeSafetyChecker.lens_self_stack AbstractStack.pop in\\nletS! operand2 := return!toS! $ safe_unwrap_err operand2 in\\nif andb\\n (SignatureToken.is_integer operand1)\\n (SignatureToken.t_beq operand1 operand2)\\nthen\\n TypeSafetyChecker.Impl_TypeSafetyChecker.push operand1\\nelse\\n returnS! 
$ Result.Err $ TypeSafetyChecker.Impl_TypeSafetyChecker.error\\n verifier StatusCode.INTEGER_OP_TYPE_MISMATCH_ERROR offset\\n```\\n\\n## Tests\\n\\nThe two code extracts above seem very similar, but how can we make sure that they are indeed the same, and that we made no typos or misunderstandings in the 3,200 lines of translation?\\n\\nTo answer that question, we choose to write unit tests on the Rust side covering all the execution paths (success and error, all the opcodes) and to run the same tests on the Coq side after a manual/AI-assisted translation of these tests. We will compare the results of the tests to ensure that the Coq code behaves exactly like the Rust code.\\n\\nThe tests on the Rust side are in the file [move-bytecode-verifier/src/type_safety_tests/mod.rs](https://github.com/formal-land/move-sui/blob/main/crates/move-bytecode-verifier/src/type_safety_tests/mod.rs), which is a 3,000-line file with 176 tests. For example, for the addition we have:\\n\\n```rust\\n#[test]\\nfn test_arithmetic_correct_types() {\\n for instr in vec![\\n Bytecode::Add,\\n Bytecode::Sub,\\n Bytecode::Mul,\\n Bytecode::Mod,\\n Bytecode::Div,\\n Bytecode::BitOr,\\n Bytecode::BitAnd,\\n Bytecode::Xor,\\n ] {\\n for push_ty_instr in vec![\\n Bytecode::LdU8(42),\\n Bytecode::LdU16(257),\\n Bytecode::LdU32(89),\\n Bytecode::LdU64(94),\\n Bytecode::LdU128(Box::new(9999)),\\n Bytecode::LdU256(Box::new(U256::from(745_u32))),\\n ] {\\n let code = vec![push_ty_instr.clone(), push_ty_instr.clone(), instr.clone()];\\n let module = make_module(code);\\n let fun_context = get_fun_context(&module);\\n let result = type_safety::verify(&module, &fun_context, &mut DummyMeter);\\n assert!(result.is_ok());\\n }\\n }\\n}\\n```\\n\\nThere are four other tests covering the error cases (missing arguments, wrong types, ...).\\n\\nOne of the difficulties in these tests, apart from their size, is that we need to initialize the `module` variable with the proper content to be able to type-check some of the 
instructions. We defined some helpers for that, such as:\\n\\n```rust\\nfn add_simple_struct_with_abilities(module: &mut CompiledModule, abilities: AbilitySet) {\\n let struct_def = StructDefinition {\\n struct_handle: StructHandleIndex(0),\\n field_information: StructFieldInformation::Declared(vec![FieldDefinition {\\n name: IdentifierIndex(5),\\n signature: TypeSignature(SignatureToken::U32),\\n }]),\\n };\\n\\n let struct_handle = StructHandle {\\n module: ModuleHandleIndex(0),\\n name: IdentifierIndex(0),\\n abilities: abilities,\\n type_parameters: vec![],\\n };\\n\\n module.struct_defs.push(struct_def);\\n module.struct_handles.push(struct_handle);\\n}\\n```\\n\\nthat is used in `26` tests involving struct data structures.\\n\\n## Translation of the tests\\n\\nWe translated the tests using the same approach as for the type-checker, with the same monadic representation of effects. For example, we represent in Coq the arithmetic test above as:\\n\\n```coq\\nDefinition test_arithmetic_correct_types\\n (instr push_ty_instr : Bytecode.t) :\\n M!? PartialVMError.t unit :=\\n let code := [push_ty_instr; push_ty_instr; instr] in\\n let module := make_module code in\\n let! fun_context := get_fun_context module in\\n verify module fun_context.\\n\\nGoal List.Forall\\n (fun instr =>\\n List.Forall\\n (fun push_ty_instr =>\\n test_arithmetic_correct_types instr push_ty_instr = return!? tt\\n )\\n [\\n Bytecode.LdU8 42;\\n Bytecode.LdU16 257;\\n Bytecode.LdU32 89;\\n Bytecode.LdU64 94;\\n Bytecode.LdU128 9999;\\n Bytecode.LdU256 745\\n ]\\n )\\n [\\n Bytecode.Add;\\n Bytecode.Sub;\\n Bytecode.Mul;\\n Bytecode.Mod;\\n Bytecode.Div;\\n Bytecode.BitOr;\\n Bytecode.BitAnd;\\n Bytecode.Xor\\n ].\\nProof.\\n repeat constructor.\\nQed.\\n```\\n\\nWe convert the test that iterates assertions to an anonymous proof goal that uses the `List.Forall` predicate to verify a series of equalities. 
The `List.Forall` predicate is defined as \\"the following property is valid for all elements of the list\\".\\n\\nFortunately for us, GitHub Copilot was extremely efficient in the translation of these tests with a success rate of about 95% (we did not make a precise measurement). The end result is in [move_sui/simulations/move_bytecode_verifier/type_safety_tests/mod.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/move_sui/simulations/move_bytecode_verifier/type_safety_tests/mod.v) that contains more than 6,000 lines of Coq code excluding comments.\\n\\n## Detected issues\\n\\nAbout 20% of our translated Coq tests failed \ud83d\udca5, which we actually consider a very good success \ud83d\udcaa as the translated Coq code of the type-checker was not run before. Apart from one misunderstanding of the Rust code, all the issues were due to typos in the translation. We had about a dozen of them, such as a missing negation in a condition, some of them generating multiple test failures. It took about one day to fix all of them by changing our Coq translation of the type-checker accordingly. Now all the tests work \ud83c\udf89!\\n\\nA few errors were also due to incorrectly translated tests, typically with a missing line. We did a manual review, but we do not know for sure if there are tests with a mistake that by chance fix an error in the translation of the type-checker. We have not seen any such case yet.\\n\\n## Conclusion\\n\\nWe now have an idiomatic \ud83d\udc13 Coq translation of the type-checker of the Move bytecode in Rust. In addition, we test the result of this translation for every opcode and error case.\\n\\nNow that we are confident enough in the translation, we can start the specification and formal verification of the type-checker. 
This will involve reasoning on both the type-checker and the bytecode interpreter, showing that:\\n\\n- \u2705 The interpreter preserves the well-typedness of the code as it steps through the opcodes.\\n- \u2705 When a program is accepted by the type checker, the interpreter will not fail at runtime with a type error.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land), or comment on this post below! Feel free to DM us for any services you need._\\n\\n:::"},{"id":"/2024/10/14/verification-move-sui-type-checker-2","metadata":{"permalink":"/blog/2024/10/14/verification-move-sui-type-checker-2","source":"@site/blog/2024-10-14-verification-move-sui-type-checker-2.md","title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 2","description":"We are working on formally verifying the \ud83e\udd80 Rust implementation of the Move type-checker for bytecode in the proof system \ud83d\udc13 Coq. 
You can find the code of this type-checker in the crate move-bytecode-verifier.","date":"2024-10-14T00:00:00.000Z","formattedDate":"October 14, 2024","tags":[{"label":"monad","permalink":"/blog/tags/monad"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Sui","permalink":"/blog/tags/sui"}],"readingTime":9.045,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 2","tags":["monad","Rust","Sui"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 3","permalink":"/blog/2024/10/15/verification-move-sui-type-checker-3"},"nextItem":{"title":"\ud83c\udf32 What we do at Formal Land","permalink":"/blog/2024/10/13/class-what-we-do"}},"content":"We are working on formally verifying the [\ud83e\udd80 Rust](https://www.rust-lang.org/) implementation of the [Move](https://sui.io/move) type-checker for bytecode in the proof system [\ud83d\udc13 Coq](https://coq.inria.fr/). You can find the code of this type-checker in the crate [move-bytecode-verifier](https://github.com/move-language/move-sui/tree/main/crates/move-bytecode-verifier).\\n\\nThis requires translating all the Rust code into idiomatic Coq, on which we will write our specifications and proofs. We write this translation by hand relying as much as possible on generative AI tools such as [GitHub Copilot](https://github.com/features/copilot), as there are many particular cases. We plan, eventually, to prove it equivalent to the translation automatically generated by [coq-of-rust](https://github.com/formal-land/coq-of-rust).\\n\\nIn this blog post we present how we organize our \ud83d\udd2e monad to represent the side-effects used in this Rust code. 
We believe this organization should work for other Rust projects as well.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds. It can be integrated into your CI pipeline to make sure that every commit is correct without running a full audit again.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually impossible to happen again.\\n\\n:::\\n\\n
\\n ![Forge in forest](2024-10-14/symbol-in-forest.webp)\\n
\\n\\n## Primitive effects\\n\\nIn functional programming, effects (or side-effects) are every operation that cannot be directly represented as a mathematical function, that is to say, a procedure that returns an output purely based on the value of its inputs and does nothing else. For example, the function returning the current time makes an effect as it depends on a hidden state (the current time) that is not passed as an argument. The function printing a message to the console makes an effect as it modifies the state of the console, in addition to returning a value that is generally either empty or a confirmation of the printing. Arithmetic operations (`+`, `*`, ...) are an example of pure functions.\\n\\nWe consider three primitive effects in our Rust code:\\n\\n- **Panic** For many reasons, a Rust program can panic, as a result of an out-of-bounds access to an array or a wrong [unwrap](https://doc.rust-lang.org/core/option/enum.Option.html#method.unwrap), for example. This is an effect as no outputs are returned in case of a panic.\\n- **Result** The `Result` type is used to represent the result of a computation that can fail. It is a sum type with two constructors: `Ok` for the successful result and `Err` for the error. The [Rust operator `?`](https://doc.rust-lang.org/rust-by-example/std/result/question_mark.html) is used to propagate errors in a function that returns a `Result`. This is another effect for us.\\n- **State** Finally, we consider the functions that mutate one of their arguments as effectful. 
The mutated parameter is generally typed as a mutable reference `&mut`.\\n\\nAll our Coq definitions to represent the effects are in the file [simulations/M.v](https://github.com/formal-land/coq-of-rust/blob/guillaume-claret%40fix-remaining-tests/CoqOfRust/simulations/M.v).\\n\\n### Panic\\n\\nWe define a monad `Panic.t` to represent the effect of a panic with:\\n\\n```coq\\nModule Panic.\\n Inductive t (A : Set) : Set :=\\n | Value : A -> t A\\n | Panic {Error : Set} : Error -> t A.\\n```\\n\\nNote that the type `Error` in this position is an existential type. This has a few consequences:\\n\\n- We do not need to annotate the type `Panic.t` with the type of the error.\\n- We can use any type for `Error` when we trigger a panic operation. This is useful for debugging, as we can add any payload to the panic message to help us understand what went wrong.\\n- We cannot compute on the panic payload. We do not consider this as a limitation, as panics should not be caught and handled in a Rust program, only propagated.\\n\\nWe define the monadic _return_ and _bind_ operations as usual:\\n\\n```coq\\nDefinition return_ {A : Set} (value : A) : t A := Value value.\\n\\nDefinition bind {A B : Set} (value : t A) (f : A -> t B) : t B :=\\n match value with\\n | Value value => f value\\n | Panic error => Panic error\\n end.\\n```\\n\\nWe introduce notations based on the exclamation mark `!` to make the code more readable:\\n\\n```coq\\nNotation \\"M!\\" := Panic.t.\\n\\nNotation \\"return!\\" := Panic.return_.\\n\\nNotation \\"\'let!\' x \':=\' X \'in\' Y\\" :=\\n (Panic.bind X (fun x => Y))\\n (at level 200, x pattern, X at level 100, Y at level 200).\\n```\\n\\n### Result\\n\\nWe define the monad `Result.t` to represent the propagation of errors with the `?` operator with:\\n\\n```coq\\nModule Result.\\n Inductive t (A Error : Set) : Set :=\\n | Ok : A -> t A Error\\n | Err : Error -> t A Error.\\n```\\n\\nThe difference with the `Panic.t` monad is that the error type is not 
existential anymore. This is because we want to be able to compute on the error payload, as some functions depend on the error value.\\n\\nWe define the _return_ and _bind_ operations as:\\n\\n```coq\\nDefinition return_ {A Error : Set} (value : A) : t A Error := Ok value.\\n\\nDefinition bind {Error A B : Set} (value : t A Error) (f : A -> t B Error) : t B Error :=\\n match value with\\n | Ok value => f value\\n | Err error => Err error\\n end.\\n```\\n\\nThe _bind_ corresponds to the question mark operator `?` in Rust. We also introduce notation to make the code more readable:\\n\\n```coq\\nNotation \\"M?\\" := (fun A Error => Result.t Error A).\\n\\nNotation \\"return?\\" := Result.return_.\\n\\nNotation \\"\'let?\' x \':=\' X \'in\' Y\\" := ...\\n```\\n\\n### State\\n\\nFinally, we define the monad `State.t` \ud83c\uddfa\ud83c\uddf8 to represent the effect of one or several mutable references with a mutable state type `S`:\\n\\n```coq\\nModule State.\\n Definition t (State A : Set) : Set := State -> A * State.\\n\\n Definition return_ {State A : Set} (value : A) : t State A :=\\n fun state => (value, state).\\n\\n Definition bind {State A B : Set} (value : t State A) (f : A -> t State B) : t State B :=\\n fun state =>\\n let (value, state) := value state in\\n f value state.\\n```\\n\\nThe state `S` will typically be the tuple of all the current mutable references in the Rust code. We use notations based on the letter `S`.\\n\\nWe also introduce lens operations that mimic how we can extract a mutable reference to the part of a data structure from a mutable reference to the whole data structure in Rust. Here is the definition of the lens type:\\n\\n```coq\\nRecord t {Big_A A : Set} : Set := {\\n read : Big_A -> M! A;\\n write : Big_A -> A -> M! Big_A\\n}.\\n```\\n\\nThe `read` and `write` operations correspond to the dereferencing and the assignment of a mutable reference in Rust. 
The type `Big_A` is the type of the whole data structure, and the type `A` is the type of the part that we are referencing. These primitives might fail (they are in the panic monad) if the mutable reference is not valid, for example, for an out-of-bounds access in an array or an invalid case in an enum.\\n\\nWe can use a lens to lift a computation that operates on a part of a data structure to a computation that operates on the whole data structure. We provide various _lift_ operators to help with this.\\n\\n## Combinations\\n\\nDepending on the Rust code we want to translate, we might need to use none, one, or several of the effects above. We explicitly define all the possible combinations of the above monads, as well as return operations to go from one monad to another, more general monad.\\n\\nThe special case is for the combination of the panic and state effects. When a panic occurs, we do not return the resulting state, as we are not supposed to continue the evaluation after a panic so the current state should not be relevant. We lose the information about the state of the program when a panic occurs, which can be a limitation for debugging, but:\\n\\n- It simplifies some definitions of simulations, and forces us not to speak about the specification of a state after a panic, which should not be relevant.\\n- We can still return the current state as an additional payload in the panic operator. This is actually what our panic operator does by default.\\n\\nThe most complete monad combines all the effects:\\n\\n```coq\\nModule StatePanicResult.\\n Definition t (State Error A : Set) : Set :=\\n MS! State (M? Error A).\\n\\n Definition return_ {State Error A : Set} (value : A) : t State Error A :=\\n returnS! (Result.Ok value).\\n\\n Definition bind {State Error A B : Set}\\n (value : t State Error A)\\n (f : A -> t State Error B) :\\n t State Error B :=\\n letS! value := value in\\n match value with\\n | Result.Ok value => f value\\n | Result.Err error => returnS! 
(Result.Err error)\\n end.\\n```\\n\\nwith the notations:\\n\\n```coq\\nNotation \\"MS!?\\" := StatePanicResult.t.\\n\\nNotation \\"returnS!?\\" := StatePanicResult.return_.\\n\\nNotation \\"\'letS!?\' x \':=\' X \'in\' Y\\" := ...\\n```\\n\\n:::info\\n\\nWe are repeating our notations a lot, as our three effects and their combinations are very similar. In addition, we always have to explicitly choose in our code which monad we use and add explicit conversions to go from one to another. A future enhancement could be to add some automation at this level, through the use of type-classes, for example, to automatically infer the monad to use based on the operations used in the code \ud83e\uddbe. For now, we prefer to stay explicit.\\n\\n:::\\n\\n## Iterations\\n\\nTo convert code involving `for` loops \ud83d\udd01 or manipulations with the `.map` method of iterators, we introduce the effectful version of the `for` loop (_fold_ or _reduce_ in functional languages) and the `map` method. For example, for the folding operation:\\n\\n```coq\\n(** The order of parameters is the same as in the source `for` loops. *)\\nDefinition fold_left {State Error A B : Set}\\n (init : A)\\n (l : list B)\\n (f : A -> B -> t State Error A) :\\n t State Error A :=\\n List.fold_left (fun acc x => bind acc (fun acc => f acc x)) l (return_ init).\\n```\\n\\nwith the notation:\\n\\n```coq\\nNotation \\"foldS!?\\" := StatePanicResult.fold_left.\\n```\\n\\n## Conclusion\\n\\nThanks to the definitions and notations above, we were able to translate (manually/with GitHub Copilot) all the code of the type-checker for the Move bytecode to Coq in an idiomatic Coq code of a size roughly similar to the original Rust code. 
This translation is available in our folder [move_sui/simulations/move_bytecode_verifier](https://github.com/formal-land/coq-of-rust/tree/guillaume-claret%40fix-remaining-tests/CoqOfRust/move_sui/simulations/move_bytecode_verifier) \ud83d\ude80.\\n\\nIn the next post, we will present how we tested that this translation is faithful to the original Rust code, until we have an efficient way to prove the two equivalent.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land), or comment on this post below! Feel free to DM us for any services you need._\\n\\n:::"},{"id":"/2024/10/13/class-what-we-do","metadata":{"permalink":"/blog/2024/10/13/class-what-we-do","source":"@site/blog/2024-10-13-class-what-we-do.md","title":"\ud83c\udf32 What we do at Formal Land","description":"In this blog post, we present what we do at Formal Land \ud83c\udf32, what tools and services we are developing to provide more security for our customers \ud83e\uddb8. 
We believe that for critical applications such as blockchains (L1, L2, dApps) you should always use the most advanced technologies to find bugs, otherwise bad actors will do so and overtake you in the never-ending race for security \ud83c\udfce\ufe0f.","date":"2024-10-13T00:00:00.000Z","formattedDate":"October 13, 2024","tags":[{"label":"security","permalink":"/blog/tags/security"},{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"interactive theorem proving","permalink":"/blog/tags/interactive-theorem-proving"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Solidity","permalink":"/blog/tags/solidity"}],"readingTime":6.75,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83c\udf32 What we do at Formal Land","tags":["security","formal verification","interactive theorem proving","Rust","Solidity"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 2","permalink":"/blog/2024/10/14/verification-move-sui-type-checker-2"},"nextItem":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 1","permalink":"/blog/2024/08/19/verification-move-sui-type-checker-1"}},"content":"In this blog post, we present what we do at Formal Land \ud83c\udf32, what tools and services we are developing to provide more security for our customers \ud83e\uddb8. We believe that for critical applications such as blockchains (L1, L2, dApps) you should always **use the most advanced technologies to find bugs, otherwise bad actors will do so** and overtake you in the never-ending race for security \ud83c\udfce\ufe0f.\\n\\n**Formal verification** is one of the best techniques to ensure that your code is correct, as it **checks every possible input \u2728** of your program. For a long time, formal verification was reserved for specific fields, such as the space industry \ud83e\uddd1\u200d\ud83d\ude80. 
We are making this technology accessible for the blockchain industry and general programming thanks to tools and services we develop, like [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) and [coq-of-rust](https://github.com/formal-land/coq-of-rust).\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo ensure your code is secure today, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land)! \ud83d\ude80\\n\\nFormal verification goes further than traditional audits to make 100% sure you cannot lose your funds. It can be integrated into your CI pipeline to make sure that every commit is correct without running a full audit again.\\n\\nWe make bugs such as the [DAO hack](https://www.gemini.com/fr-fr/cryptopedia/the-dao-hack-makerdao) ($60 million stolen) virtually impossible to happen again.\\n\\n:::\\n\\n
\\n ![Forge in forest](2024-10-13/forge.webp)\\n
\\n\\n## Company\\n\\nWe have existed for **three years**, focusing on formal verification for the web3 industry to validate software \ud83d\udee1\ufe0f where safety is of paramount importance. **Formal verification** is a technique to analyze the code of a program, which relies on making a **mathematical proof that the code is correct**, a proof that is furthermore checked by a computer \ud83e\udd13 to make sure there are absolutely no missing cases! As programs are made of 0s and 1s and are fully deterministic, obtaining perfect programs is something we can reach.\\n\\nWe need to rely on a proof system. We exclusively use the [\ud83d\udc13 Coq](https://coq.inria.fr/) proof system as it is both:\\n\\n- **\ud83c\udf0c A generic proof system** We can represent any programming language and security property in Coq.\\n- **\ud83d\udc95 A well-known system** Coq is taught in many universities and has a large community of users, with complex software such as the C compiler [CompCert](https://en.wikipedia.org/wiki/CompCert) fully implemented and verified in it.\\n\\nWe choose to verify **existing \ud83d\uddff** code rather than to develop new code written in a style simplifying formal verification. This is generally harder, but it is also more useful for many of our customers who have already written code and want to ensure it is correct without rewriting it. Verifying the existing code also enables the verification of the optimizations, which generally involve low-level operations that would be forbidden when rewriting the code in a formal verification language.\\n\\nWe verify the actual **\ud83c\udf0d implementation** of programs rather than a **\ud83d\uddfa\ufe0f model** of them. This is to capture all the implementation details, such as integer overflows or the use of specific data structures or libraries. We believe that a lot of bugs are hidden in the details (the devil is in the details), in addition to the high-level bugs of design. 
Verifying the implementation also helps to **follow code updates \ud83e\ude9c** as we are able to say that we verified the code for a precise commit hash.\\n\\n## Tools\\n\\n### \ud83d\udc2b coq-of-ocaml\\n\\nThe tool [coq-of-ocaml](https://github.com/formal-land/coq-of-ocaml) was our first product to analyze [\ud83d\udc2b OCaml](https://ocaml.org/) programs by translating the code to Coq. The translation is almost one-to-one in terms of size, which keeps the verification work as simple as possible. It was initially developed as part of a PhD at [Inria](https://inria.fr/) and then at the [ Nomadic Labs](https://www.nomadic-labs.com/) company.\\n\\nWe use it to verify properties of the code of the Layer 1 of [Tezos](https://tezos.com/) with the project [Coq Tezos of OCaml](https://formal-land.gitlab.io/coq-tezos-of-ocaml/). We analyzed a code base of more than 100,000 lines of OCaml code, for which we made a full and automatic translation to the proof system Coq that can be maintained as the code evolves. We verified various properties, including:\\n\\n- The compatibility of the serialization/deserialization functions.\\n- The adequacy of the smart contract interpreter with the existing smart contract semantics.\\n- The preservation of various invariants on the data structures.\\n\\nMany more properties are yet to be verified, but the project is currently on hold. You can find more information on the [project blog](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog)!\\n\\n### \ud83e\udd80 coq-of-rust\\n\\nOur second project is [coq-of-rust](https://github.com/formal-land/coq-of-rust) to verify Rust programs. Rust is an interesting target as more and more programs are being written in it, especially for projects where security is critical. 
Even if Rust offers a strong type system, with memory-safe programs by design, there are still many bugs that can happen, like logical bugs or code making a panic (sudden stop of the program) in production due to an out-of-bounds access in an array.\\n\\nThe project `coq-of-rust` was funded by [Aleph Zero](https://alephzero.org/) to verify the code of their smart contracts.\\n\\nWe manage to translate most Rust programs to the Coq proof system, including the `core` library \ud83c\udf89, which is the standard library of Rust. To our knowledge, we are the only ones who have achieved such a translation of the standard library. The generated Coq code is about ten times the size of the initial Rust code. This is quite verbose and related in particular to:\\n\\n- the expansion of macros,\\n- the expansion of referencing/dereferencing operations that are often implicit in the source code,\\n- the expansion of `match` patterns to primitive patterns.\\n\\nWe have a semantics for the translated code, and are working on reasoning principles to show that this translated code is equivalent to a much simpler version (simulations) on which to reason.\\n\\nAs an example, here is the Coq translation of one of the functions of the [revm](https://github.com/bluealloy/revm), a Rust implementation of the Ethereum Virtual Machine:\\n\\n```rust\\n(*\\npub fn add(interpreter: &mut Interpreter, _host: &mut H) {\\n gas!(interpreter, gas::VERYLOW);\\n pop_top!(interpreter, op1, op2);\\n *op2 = op1.wrapping_add( *op2);\\n}\\n*)\\nDefinition add (\u03b5 : list Value.t) (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \u03b5, \u03c4, \u03b1 with\\n | [], [ H ], [ interpreter; _host ] =>\\n ltac:(M.monadic\\n (let interpreter := M.alloc (| interpreter |) in\\n let _host := M.alloc (| _host |) in\\n M.catch_return (|\\n ltac:(M.monadic\\n (M.read (|\\n let~ _ :=\\n M.match_operator (|\\n M.alloc (| Value.Tuple [] |),\\n [\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (let \u03b3 :=\\n 
M.use\\n (M.alloc (|\\n UnOp.not (|\\n M.call_closure (|\\n M.get_associated_function (|\\n Ty.path \\"revm_interpreter::gas::Gas\\",\\n \\"record_cost\\",\\n []\\n |),\\n [\\n M.SubPointer.get_struct_record_field (|\\n M.read (| interpreter |),\\n \\"revm_interpreter::interpreter::Interpreter\\",\\n \\"gas\\"\\n |);\\n M.read (|\\n M.get_constant (|\\n \\"revm_interpreter::gas::constants::VERYLOW\\"\\n |)\\n |)\\n ]\\n |)\\n |)\\n |)) in\\n let _ :=\\n M.is_constant_or_break_match (| M.read (| \u03b3 |), Value.Bool true |) in\\n (* ... more code ... *)\\n```\\n\\n### \ud83e\ude81 coq-of-solidity\\n\\nLast but not least, the tool [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) to translate [Solidity](https://soliditylang.org/) smart contracts to Coq. We use the Yul intermediate language of the Solidity compiler to do our translation, with roughly a three times size increase in the translated code.\\n\\nWe support most of the Solidity instructions, passing 90% of tests of the Solidity compiler. We recently developed a new translation mode that can represent arbitrary Solidity code, or Yul written by hand, in a nice monad, even in case of complex control flow like nested loops with `break` and `continue` instructions and variable mutations. This is done thanks to our new effect inference engine in `coq-of-solidity` to always give a purely functional representation of imperative code.\\n\\nCompared to other formal analysis tools for Solidity, the strength is to be able to **verify arbitrary complex properties**. This is crucial for the verification of cryptographic operations (**elliptic curve** implementations, **zero-knowledge verifiers** linking the L1 to the L2s, ...) that are out of reach of standard verification tools. 
For example, we are currently verifying a [hand-optimized Yul implementation](https://github.com/get-smooth/crypto-lib/blob/main/src/elliptic/SCL_mulmuladdX_fullgen_b4.sol) of elliptic curve operations.\\n\\n## Conclusion\\n\\nWe have seen what we are proposing at Formal Land to enhance the security of your applications to the best possible level \ud83c\udf1f, with security of mathematical certainty. Next time, we will see how to use the Coq proof system to verify simple properties by following the [Coq in a Hurry](https://cel.hal.science/inria-00001173v6/file/coq-hurry.pdf) tutorial \ud83d\ude80.\\n\\n:::success For more\\n\\n_Follow us on [X](https://x.com/FormalLand) or [LinkedIn](https://fr.linkedin.com/company/formal-land), or comment on this post below! Feel free to DM us for any services you need._\\n\\n:::"},{"id":"/2024/08/19/verification-move-sui-type-checker-1","metadata":{"permalink":"/blog/2024/08/19/verification-move-sui-type-checker-1","source":"@site/blog/2024-08-19-verification-move-sui-type-checker-1.md","title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 1","description":"In this blog post, we present our project to formally verify the implementation of the type checker for smart contracts of the \ud83d\udca7 Sui blockchain. The Sui blockchain uses the Move language to express smart contracts. 
This language is implemented in \ud83e\udd80 Rust and compiles down to the Move bytecode that is loaded in memory when executing the smart contracts.","date":"2024-08-19T00:00:00.000Z","formattedDate":"August 19, 2024","tags":[{"label":"Sui","permalink":"/blog/tags/sui"},{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Move","permalink":"/blog/tags/move"},{"label":"type checker","permalink":"/blog/tags/type-checker"}],"readingTime":2.575,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 1","tags":["Sui","formal verification","Coq","Rust","Move","type checker"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83c\udf32 What we do at Formal Land","permalink":"/blog/2024/10/13/class-what-we-do"},"nextItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 4","permalink":"/blog/2024/08/13/coq-of-solidity-4"}},"content":"In this blog post, we present our project to formally verify the implementation of the type checker for smart contracts of the [\ud83d\udca7 Sui blockchain](https://sui.io/). The Sui blockchain uses the [Move](https://sui.io/move) language to express smart contracts. This language is implemented in [\ud83e\udd80 Rust](https://www.rust-lang.org/) and compiles down to the Move bytecode that is loaded in memory when executing the smart contracts.\\n\\nWe will formally verify the part that checks that the bytecode is well-typed, so that when a smart contract is executed it cannot encounter critical errors. 
The [type checker itself](https://github.com/move-language/move-sui/blob/main/crates/move-bytecode-verifier/src/type_safety.rs) is also written in Rust, and we will verify it using the proof assistant [Coq \ud83d\udc13](https://coq.inria.fr/) and our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) that translates Rust programs to Coq.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo formally verify your Rust code and ensure it contains no bugs or vulnerabilities, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land).\\n\\nThe cost is \u20ac10 per line of Rust code (excluding comments) and \u20ac20 per line for concurrent code.\\n\\n:::\\n\\n
\\n ![Sui in forest](2024-08-19/sui-in-forest.webp)\\n
\\n\\n## Plan\\n\\nThe plan for this project is as follows:\\n\\n1. **Write simulations \ud83e\uddee** of the Rust code we want to verify in Coq, namely the [type checker](https://github.com/move-language/move-sui/blob/main/crates/move-bytecode-verifier/src/type_safety.rs) and the [interpreter of bytecode](https://github.com/move-language/move-sui/blob/main/crates/move-vm-runtime/src/interpreter.rs). The simulations are functions that are equivalent to the ones in the original Rust program, but written in a style that is more amenable to formal verification. The changes can be:\\n - very simple, for example renaming variables to avoid name collisions in Coq,\\n - more involved like solving the trait instances or replacing Rust references with purely functional code, or\\n - specific to the project, like reversing the order of the bytecode\'s stack to simplify the proofs.\\n2. **Test these simulations \ud83d\udd0d** to show they behave like the original Rust code on examples covering all the opcodes of the Move bytecode.\\n3. **Prove the equivalence \ud83d\udff0** between the Coq simulations and the Rust code as translated to Coq by `coq-of-rust`. This part will give more precise results than the tests, as it will cover all possible inputs and states of the program. The complexity of this part is to go through all the details that exist in the Rust code, like the use of references to manipulate the memory, the macros after expansion, and the parts of the Rust standard library that the code depends on.\\n4. **Prove that the type checker is correct \ud83d\udee1\ufe0f** in Coq. 
The main properties we want to check are:\\n - the interpreter preserves the well-typedness of the code as it steps through the opcodes,\\n - when a program is accepted by the type checker, the interpreter will not fail at runtime with a type error.\\n\\n## What is done\\n\\nFor now, we have written a simulation for the type checker in [CoqOfRust/move_sui/simulations/move_bytecode_verifier/type_safety.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/move_sui/simulations/move_bytecode_verifier/type_safety.v). We are now:\\n\\n- adding tests to compare this simulation with the original Rust code,\\n- writing the simulation for the interpreter of the Move bytecode.\\n\\nIn the following blog posts, we will describe how we structured the simulations and how we are testing or verifying them.\\n\\n:::success Thanks\\n\\n_This project is kindly funded by the [Sui Foundation](https://sui.io/about) which we thank for their trust and support._\\n\\n:::"},{"id":"/2024/08/13/coq-of-solidity-4","metadata":{"permalink":"/blog/2024/08/13/coq-of-solidity-4","source":"@site/blog/2024-08-13-coq-of-solidity-4.md","title":"\ud83e\ude81 Coq of Solidity \u2013 part 4","description":"In this blog post we explain how we specify and formally verify a whole ERC-20 smart contract using our tool coq-of-solidity, which translates Solidity code to the proof assistant Coq \ud83d\udc13.","date":"2024-08-13T00:00:00.000Z","formattedDate":"August 13, 2024","tags":[{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Yul","permalink":"/blog/tags/yul"}],"readingTime":6.49,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 4","tags":["formal verification","Coq","Solidity","Yul"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Formal verification of the type checker of Sui \u2013 part 
1","permalink":"/blog/2024/08/19/verification-move-sui-type-checker-1"},"nextItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 3","permalink":"/blog/2024/08/12/coq-of-solidity-3"}},"content":"In this blog post we explain how we specify and formally verify a whole [ERC-20 smart contract](https://github.com/ethereum/solidity,/blob/develop/test/libsolidity/semanticTests/various/erc20.sol) using our tool [coq-of-solidity](https://github.com/formal-land/coq-of-solidity), which translates [Solidity](https://soliditylang.org/) code to the proof assistant [Coq \ud83d\udc13](https://coq.inria.fr/).\\n\\nThe proofs are still tedious for now, as there are around 1,000 lines of proofs for 100 lines of Solidity. We plan to automate this work as much as possible in the subsequent iterations of the tool. One good thing about the interactive theorem prover Coq is that we know we can never be stuck, so we can always make progress in our proof techniques and verify complex properties even if it takes time \u2728.\\n\\nFormal verification with an interactive proof assistant is the strongest way to verify programs since:\\n\\n- it covers all possible inputs and program states,\\n- it checks any kind of properties.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success Get started\\n\\nTo audit your smart contracts and make sure they contain no bugs, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land).\\n\\nWe refund our work if we missed a high/critical severity bug.\\n\\n:::\\n\\n
\\n ![Ethereum in forest](2024-08-13/ethereum-in-forest.webp)\\n
\\n\\n## Functional specification\\n\\nWe specify the ERC-20 smart contract by writing an equivalent version in Coq that acts as a functional specification. In this specification, we ignore the `emit` operations that are logging events in Solidity and the precise payload of revert operations (we only say that \\"a revert occurs\\"). We make all our arithmetic operations on `Z` the type of unbounded integers with explicit overflow checks.\\n\\nFor example, here is the `_transfer` function of the Solidity smart contract:\\n```solidity\\nfunction _transfer(address from, address to, uint256 value) internal {\\n require(to != address(0), \\"ERC20: transfer to the zero address\\");\\n\\n // The subtraction and addition here will revert on overflow.\\n _balances[from] = _balances[from] - value;\\n _balances[to] = _balances[to] + value;\\n emit Transfer(from, to, value);\\n}\\n```\\nWe specify it in the file [erc20.v](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret%40verify-erc20/CoqOfSolidity/simulations/erc20.v) by:\\n```coq\\nDefinition _transfer (from to : Address.t) (value : U256.t) (s : Storage.t)\\n : Result.t Storage.t :=\\n if to =? 0 then\\n revert_address_null\\n else if balanceOf s from in\\n if balanceOf s to + value >=? 2 ^ 256 then\\n revert_arithmetic\\n else\\n Result.Success s <| Storage.balances :=\\n Dict.declare_or_assign s.(Storage.balances) to (balanceOf s to + value)\\n |>.\\n```\\nWith the Coq notation:\\n```coq\\nstorage <| field := new_value |>\\n```\\nwe modify a storage element as in the equivalent Solidity:\\n```solidity\\nfield = new_value;\\n```\\nWith the two tests:\\n```coq\\nif balanceOf s from =? 2 ^ 256 then\\n```\\nwe make explicit the overflow checks that are implicit in the Solidity code.\\n\\n## Dispatch to the entrypoints\\n\\nA Solidity smart contract has two public functions:\\n\\n1. 
One is the deployment code, which essentially initializes the storage of the smart contract and loads the rest of the code in memory,\\n2. The other one is executed when a transaction is sent to the smart contract, which is dispatched to the relevant entrypoint according to the payload of the transaction.\\n\\nWe will focus on the second one. It takes the contract\'s payload in a specific format:\\n\\n1. The first four bytes are the function selector, which is the first four bytes of the hash of the function signature,\\n2. The rest of the payload is the arguments of the function, following the ABI ([Application Binary Interface](https://en.wikipedia.org/wiki/Application_binary_interface)) of Solidity.\\n\\nThis blog article [Deconstructing a Solidity Contract\u200a-\u200aPart III: The Function Selector](https://blog.openzeppelin.com/deconstructing-a-solidity-contract-part-iii-the-function-selector-6a9b6886ea49) from OpenZeppelin gives more information about it. In Coq, we represent the payload of a contract with a sum type:\\n```coq\\nModule Payload.\\n Inductive t : Set :=\\n | Transfer (to: Address.t) (value: U256.t)\\n | Approve (spender: Address.t) (value: U256.t)\\n | TransferFrom (from: Address.t) (to: Address.t) (value: U256.t)\\n | IncreaseAllowance (spender: Address.t) (addedValue: U256.t)\\n | DecreaseAllowance (spender: Address.t) (subtractedValue: U256.t)\\n | TotalSupply\\n | BalanceOf (owner: Address.t)\\n | Allowance (owner: Address.t) (spender: Address.t).\\nEnd Payload.\\n```\\nWe define how to get this payload from the binary representation:\\n```coq\\nDefinition of_calldata (callvalue : U256.t) (calldata: list U256.t) :\\n option Payload.t :=\\n if Z.of_nat (List.length calldata) None\\n | erc20.Result.Success (memory_end_beginning, memory_end_end, s) =>\\n Some (make_state environment state\\n (memory_end_beginning ++ memory_end_middle ++ memory_end_end)\\n (SimulatedStorage.of_erc20_state s)\\n )\\n end in\\n {{? 
codes, environment, Some state_start |\\n // highlight-next-line\\n The original code here:\\n ERC20_403.ERC20_403_deployed.body \u21d3\\n match output with\\n | erc20.Result.Revert p s => Result.Revert p s\\n | erc20.Result.Success (_, memory_end_end, _) =>\\n Result.Return memoryguard (32 * Z.of_nat (List.length memory_end_end))\\n end\\n | state_end ?}}.\\n```\\nThe proof is done in the same way as in the previous blog post [\ud83e\ude81 Coq of Solidity \u2013 part 3](/blog/2024/08/12/coq-of-solidity-3) about the verification of the `_approve` function. The body of the contract calls all the other functions of the contract, and we reuse the equivalence proofs for the other functions here.\\n\\nThe main difficulty we encountered in the proof was missing information in the specification. For example, our predicate of equivalence requires the memory of the smart contract to have exactly the same value as in its specification at the end of execution, except in case of revert. This means we needed to add the final state of the memory to the specification as well, even if this is an implementation detail. We will refine our equivalence statement in the future to avoid this kind of issue.\\n\\nFor most of the proof, the work was about stepping through both codes and making sure, by automatic unification, that the two are indeed equal.\\n\\n:::success AlephZero\\n\\n_The development of `coq-of-solidity` is made possible thanks to the [AlephZero](https://alephzero.org/) project. We thank the AlephZero Foundation for their support \ud83d\ude4f._\\n\\n:::\\n\\n## Conclusion\\n\\nWe have presented how to specify and formally verify a typical smart contract in Solidity, the ERC-20 token, using our tool `coq-of-solidity` (open-source). 
In the next post, we will see how to verify an invariant on the code and how the proof system Coq reacts if we introduce a bug."},{"id":"/2024/08/12/coq-of-solidity-3","metadata":{"permalink":"/blog/2024/08/12/coq-of-solidity-3","source":"@site/blog/2024-08-12-coq-of-solidity-3.md","title":"\ud83e\ude81 Coq of Solidity \u2013 part 3","description":"We continue to strengthen the security of smart contracts with our tool coq-of-solidity \ud83d\udee0\ufe0f. It checks for vulnerabilities or bugs in Solidity code. It uses formal verification with an interactive theorem prover (Coq \ud83d\udc13) to make sure that we cover:","date":"2024-08-12T00:00:00.000Z","formattedDate":"August 12, 2024","tags":[{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Yul","permalink":"/blog/tags/yul"}],"readingTime":10.83,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 3","tags":["formal verification","Coq","Solidity","Yul"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 4","permalink":"/blog/2024/08/13/coq-of-solidity-4"},"nextItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 2","permalink":"/blog/2024/08/07/coq-of-solidity-2"}},"content":"We continue to strengthen the security of smart contracts with our tool [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) \ud83d\udee0\ufe0f. It checks for vulnerabilities or bugs in [Solidity](https://soliditylang.org/) code. 
It uses formal verification with an interactive theorem prover ([Coq \ud83d\udc13](https://coq.inria.fr/)) to make sure that we cover:\\n\\n- all possible user inputs/storage states, even if there are infinite possibilities,\\n- for any security properties.\\n\\nThis is very important as a single bug can lead to the loss of millions of dollars in smart contracts, as we have regularly seen in the past, and we can never be sure that a human review of the code did not miss anything.\\n\\nOur tool `coq-of-solidity` is one of the only tools using an interactive theorem prover for Solidity, together with [Clear](https://github.com/NethermindEth/Clear) from [Nethermind](https://www.nethermind.io/). This might be the most powerful approach to making code without bugs, as exemplified in this [PLDI paper](https://users.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf) comparing the reliability of various C compilers. They found numerous bugs in each compiler except in the [formally verified one](https://github.com/AbsInt/CompCert)!\\n\\nIn this blog post we show how we functionally specify and verify the `_approve` function of an [ERC-20 smart contract](https://github.com/ethereum/solidity/blob/develop/test/libsolidity/semanticTests/various/erc20.sol). We will see how we prove that a refined version of the function is equivalent to the original one.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success AlephZero\\n\\n_The development of `coq-of-solidity` is made possible thanks to the [AlephZero](https://alephzero.org/) project. We thank the AlephZero Foundation for their support \ud83d\ude4f._\\n\\n:::\\n\\n
\\n ![Ethereum in forest](2024-08-12/ethereum-in-forest.webp)\\n
\\n\\n## Functional specification\\n\\nHere is the `_approve` function of the Solidity smart contract that we want to specify:\\n\\n```solidity\\nmapping (address => mapping (address => uint256)) private _allowances;\\n\\nfunction _approve(address owner, address spender, uint256 value) internal {\\n require(owner != address(0), \\"ERC20: approve from the zero address\\");\\n require(spender != address(0), \\"ERC20: approve to the zero address\\");\\n\\n _allowances[owner][spender] = value;\\n emit Approval(owner, spender, value);\\n}\\n```\\n\\nIt modifies an item in the `_allowances` map and emits an `Approval` event after a few sanity checks. We will now write a functional specification of this function in Coq. The idea is to explain what this function is supposed to do by describing its behavior in idiomatic Coq code. This will be useful to make sure there are no mistakes in the smart contract, although here we have a very simple example. From the functional specification, we will also be able to check higher-level properties of the smart contract, such as the fact that the total amount of tokens is always conserved.\\n\\nHere is the Coq version of the `_approve` function:\\n```coq\\nModule Storage.\\n Record t := {\\n allowances : Dict.t (Address.t * Address.t) U256.t;\\n (* other fields *)\\n }.\\nEnd Storage.\\n\\nDefinition _approve (owner spender : Address.t) (value : U256.t) (s : Storage.t) :\\n Result.t Storage.t :=\\n if (owner =? 0) || (spender =? 0) then\\n revert_address_null\\n else\\n Result.Success s <| Storage.allowances :=\\n Dict.declare_or_assign s.(Storage.allowances) (owner, spender) value\\n |>.\\n```\\nIt takes the same parameters as the Solidity code: `owner`, `spender`, `value`, and the current state `s` of the storage. It returns a value of type `Result.t Storage.t`, which is either a `Result.Success` with the new storage after the execution of the `_approve` function, or a `revert_address_null` if the `owner` or `spender` is the null address. 
To create the new storage, we use the corresponding Coq notation and function to update the `_allowances` map.\\n\\n:::info\\n\\nWe ignore the `emit` primitives for now.\\n\\n:::\\n\\nNow let us show that, for any possible `owner`, `spender`, `value`, and storage state `s`, the `_approve` function in Solidity will behave exactly as the Coq specification.\\n\\n## Approve function\\n\\nHere is the Coq translation of the `_approve` function as generated by `coq-of-solidity`:\\n```coq\\nDefinition fun_approve (var_owner : U256.t) (var_spender : U256.t) (var_value : U256.t) : M.t unit :=\\n let~ _1 := [[ and ~(| var_owner, (sub ~(| (shl ~(| 160, 1 |)), 1 |)) |) ]] in\\n do~ [[\\n M.if_unit (| iszero ~(| _1 |),\\n let~ memPtr := [[ mload ~(| 64 |) ]] in\\n do~ [[ mstore ~(| memPtr, (shl ~(| 229, 4594637 |)) |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr, 4 |)), 32 |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr, 36 |)), 36 |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr, 68 |)), 0x45524332303a20617070726f76652066726f6d20746865207a65726f20616464 |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr, 100 |)), 0x7265737300000000000000000000000000000000000000000000000000000000 |) ]] in\\n do~ [[ revert ~(| memPtr, 132 |) ]] in\\n M.pure tt\\n |)\\n ]] in\\n let~ _2 := [[ and ~(| var_spender, (sub ~(| (shl ~(| 160, 1 |)), 1 |)) |) ]] in\\n do~ [[\\n M.if_unit (| iszero ~(| _2 |),\\n let~ memPtr_1 := [[ mload ~(| 64 |) ]] in\\n do~ [[ mstore ~(| memPtr_1, (shl ~(| 229, 4594637 |)) |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr_1, 4 |)), 32 |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr_1, 36 |)), 34 |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr_1, 68 |)), 0x45524332303a20617070726f766520746f20746865207a65726f206164647265 |) ]] in\\n do~ [[ mstore ~(| (add ~(| memPtr_1, 100 |)), 0x7373000000000000000000000000000000000000000000000000000000000000 |) ]] in\\n do~ [[ revert ~(| memPtr_1, 132 |) ]] in\\n M.pure tt\\n |)\\n ]] in\\n do~ [[ mstore ~(| 0x00, _1 |) ]] in\\n do~ [[ mstore ~(| 0x20, 
0x01 |) ]] in\\n let~ dataSlot := [[ keccak256 ~(| 0x00, 0x40 |) ]] in\\n let~ dataSlot_1 := [[ 0 ]] in\\n do~ [[ mstore ~(| 0, _2 |) ]] in\\n do~ [[ mstore ~(| 0x20, dataSlot |) ]] in\\n let~ dataSlot_1 := [[ keccak256 ~(| 0, 0x40 |) ]] in\\n do~ [[ sstore ~(| dataSlot_1, var_value |) ]] in\\n let~ _3 := [[ mload ~(| 0x40 |) ]] in\\n do~ [[ mstore ~(| _3, var_value |) ]] in\\n do~ [[ log3 ~(| _3, 0x20, 0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925, _1, _2 |) ]] in\\n M.pure tt.\\n```\\nWe plug into the Solidity compiler and translate the intermediate representation [Yul](https://docs.soliditylang.org/en/latest/yul.html) that `solc` uses to generate EVM bytecode. We automatically refine the Yul generated by the Solidity compiler but for now this refinement is limited.\\n\\nThe two `M.if_unit` at the beginning correspond to the `require` statements in the Solidity code. The `revert` statements are used to return an error message to the caller. The `mstore` and `sstore` functions are used to store values in the memory and the storage of the EVM. The `keccak256` function encodes the storage addresses to access the `_allowances` map. The `log3` function is used to emit an event at the end.\\n\\nThis representation of the `_approve` function is very verbose as it corresponds exactly to what the source code does and contains a lot of implementation details. 
Our goal now is to show that this version is equivalent to the functional specification that we wrote by hand.\\n\\n## Equivalence\\n\\nWe express that the functional specification is equivalent to the original one with this lemma:\\n```coq\\nLemma run_fun_approve codes environment state\\n (owner spender : Address.t) (value : U256.t) (s : erc20.Storage.t)\\n (mem_0 mem_1 mem_3 mem_4 : U256.t)\\n (H_owner : Address.Valid.t owner)\\n (H_spender : Address.Valid.t spender) :\\n let memoryguard := 0x80 in\\n let memory_start :=\\n [mem_0; mem_1; memoryguard; mem_3; mem_4] in\\n let state_start :=\\n make_state environment state memory_start (SimulatedStorage.of_erc20_state s) in\\n let output :=\\n erc20._approve owner spender value s in\\n let memory_end :=\\n [spender; erc20.keccak256_tuple2 owner 1; memoryguard; mem_3; value] in\\n let state_end :=\\n match output with\\n | erc20.Result.Revert _ _ => None\\n | erc20.Result.Success s =>\\n Some (make_state environment state memory_end (SimulatedStorage.of_erc20_state s))\\n end in\\n {{? 
codes, environment, Some state_start |\\n ERC20_403.ERC20_403_deployed.fun_approve owner spender value \u21d3\\n match output with\\n | erc20.Result.Revert p s => Result.Revert p s\\n | erc20.Result.Success _ => Result.Ok tt\\n end\\n | state_end ?}}.\\n```\\n\\nThis lemma of equivalence requires some parameters:\\n\\n- an initial `codes`, `environment`, and `state` values, that describe the state of the blockchain before the execution of the `_approve` function,\\n- a `memoryguard` value that gives a memory zone that we are safe to use,\\n- some `mem_i` variables, as we do not know the exact values of the memory slots before the execution of the function,\\n- an `owner`, `spender`, and `value` that are the parameters of the `_approve` function,\\n- an `s` that is the state of storage of the smart contract before the execution of the `_approve` function,\\n- an `H_owner` and `H_spender` proofs that the `owner` and `spender` are valid addresses. These two proofs are required to execute the function as expected and always available, thanks to runtime checks made at the entrypoints of the smart contract.\\n\\nThe lemma will hold for any possible values of the parameters above, even if there are infinite possibilities. This is the power of formal verification: we can prove that our smart contract is correct for all possible inputs and states.\\n\\nThe core statement uses the predicate:\\n```coq\\n{{? codes, environment, start_state |\\n original_code \u21d3\\n refined_code\\n| end_state ?}}\\n```\\nIt says that some `original_code` executed in the `start_state` environment will give the same output as the `refined_code` and will result in the final state `end_state`. The state is an option type: either `Some` state or `None` if the execution reverted. 
That way we do not have to deal with describing the state after a contract revert, that will reset the storage anyways.\\n\\n:::info\\n\\nThe statement of equivalence is relatively verbose so there could be mistakes in the way it is stated. This is not really an issue, as the `_approve` function is an intermediate function, so the only statement that really matters is the one on the main function of the contract that dispatches to the relevant entrypoint according to the payload of the transaction. There could also be mistakes there, but perhaps we can automatically generate this statement from the Solidity code.\\n\\n:::\\n\\n## Proof of equivalence\\n\\nThe way we write the proof is interesting. We use Coq as a symbolic debugger, where we execute both the original code and the functional specification until we reach the end of execution for all the branches, always with the same result.\\n\\nHere is an example of a debugging step (in the proof mode of Coq):\\n```coq\\n{{?codes, environment,\\nSome\\n (make_state environment state [spender; erc20.keccak256_tuple2 owner 1; 128; mem_3; mem_4]\\n [IsStorable.IMap.(IsStorable.to_storable_value) s.(erc20.Storage.balances);\\n StorableValue.Map2\\n (Dict.declare_or_assign s.(erc20.Storage.allowances) (owner, spender) value);\\n StorableValue.U256 s.(erc20.Storage.total_supply)])\\n|\\n // highlight-next-line\\n The original code here:\\n do~ call (Stdlib.mstore 128 value)\\n in (do~ call\\n (Stdlib.log3 128 32\\n 63486140976153616755203102783360879283472101686154884697241723088393386309925\\n owner spender) in LowM.Pure (Result.Ok tt)) \u21d3\\n // highlight-next-line\\n The functional specification here:\\n Result.Ok tt\\n| Some\\n (make_state environment state [spender; erc20.keccak256_tuple2 owner 1; 128; mem_3; value]\\n (SimulatedStorage.of_erc20_state\\n s<|erc20.Storage.allowances:= Dict.declare_or_assign s.(erc20.Storage.allowances)\\n (owner, spender) value|>))?}}\\n```\\nOn the original code side we can 
recognize:\\n```coq\\ndo~ [[ mstore ~(| _3, var_value |) ]] in\\ndo~ [[ log3 ~(| _3, 0x20, 0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925, _1, _2 |) ]] in\\nM.pure tt\\n```\\nthat corresponds to the end of the execution of the `_approve` function. On the functional specification, we have:\\n```coq\\nResult.Ok tt\\n```\\nthat ends the execution successfully but does not return anything. This is because we ignore the `emit` operation, translated as a `log3` Yul primitive. We also ignore the `mstore` call as it is only used to fill information for the `log3` call.\\n\\nHere are the various commands to step through the code, encoded as Coq tactics:\\n\\n- `p`: final **P**ure expression\\n- `pn`: final **P**ure expression ignoring the resulting state with a **N**one (for a revert)\\n- `pe`: final **P**ure expression with non-trivial **E**quality of results\\n- `pr`: Yul **PR**imitive\\n- `prn`: Yul **PR**imitive ignoring the resulting state with a **N**one\\n- `l`: step in a **L**et\\n- `lu`: step in a **L**et by **U**nfolding\\n- `c`: step in a function **C**all\\n- `cu`: step in a function **C**all by **U**nfolding\\n- `s`: **S**implify the goal\\n\\nThese commands verify that the two programs are equivalent as we step through them. As a reference, the proof is in [CoqOfSolidity/proofs/ERC20_functional.v](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret%40verify-erc20/CoqOfSolidity/proofs/ERC20_functional.v):\\n\\n```coq\\nProof.\\n simpl.\\n unfold ERC20_403.ERC20_403_deployed.fun_approve, erc20._approve.\\n l. {\\n now apply run_is_non_null_address.\\n }\\n unfold Stdlib.Pure.iszero.\\n lu.\\n c; [p|].\\n s.\\n unfold Stdlib.Pure.iszero.\\n destruct (owner =? 0); s. {\\n change (true || _) with true; s.\\n lu; c. {\\n apply_run_mload.\\n }\\n repeat (\\n lu ||\\n cu ||\\n (prn; intro) ||\\n s ||\\n p\\n ).\\n }\\n l. 
{\\n now apply run_is_non_null_address.\\n }\\n lu.\\n c; [p|]; s.\\n unfold Stdlib.Pure.iszero.\\n change (false || ?e) with e; s.\\n destruct (spender =? 0); s. {\\n lu; c. {\\n apply_run_mload.\\n }\\n repeat (\\n lu ||\\n cu ||\\n (prn; intro) ||\\n s ||\\n p\\n ).\\n }\\n lu; c. {\\n apply_run_mstore.\\n }\\n CanonizeState.execute.\\n lu; c. {\\n apply_run_mstore.\\n }\\n CanonizeState.execute.\\n lu; c. {\\n apply_run_keccak256_tuple2.\\n }\\n lu.\\n lu; c. {\\n apply_run_mstore.\\n }\\n CanonizeState.execute.\\n lu; c. {\\n apply_run_mstore.\\n }\\n CanonizeState.execute.\\n lu; c. {\\n apply_run_keccak256_tuple2.\\n }\\n lu; c. {\\n apply_run_sstore_map2_u256.\\n }\\n CanonizeState.execute.\\n lu; c. {\\n apply_run_mload.\\n }\\n s.\\n lu; c. {\\n apply_run_mstore.\\n }\\n CanonizeState.execute.\\n lu; c. {\\n p.\\n }\\n p.\\nQed.\\n```\\n\\n:::success Get started\\n\\nTo audit your smart contracts and make sure they contain no bugs, contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land).\\n\\nWe refund our work in case we missed any high/critical severity bugs.\\n\\n:::\\n\\n## Conclusion\\n\\nWe have presented how to functionally specify a function with `coq-of-solidity`. In the next blog post we will see how to extend this proof and specification to the entire ERC-20 smart contract."},{"id":"/2024/08/07/coq-of-solidity-2","metadata":{"permalink":"/blog/2024/08/07/coq-of-solidity-2","source":"@site/blog/2024-08-07-coq-of-solidity-2.md","title":"\ud83e\ude81 Coq of Solidity \u2013 part 2","description":"We continue to work on our open source formal verification tool for Solidity named coq-of-solidity \ud83d\udee0\ufe0f. Formal verification is the strongest form of code audits, as we verify that the code behaves correctly in all possible execution cases \ud83d\udd0d. 
We use the interactive theorem prover Coq to express and verify any kinds of properties.","date":"2024-08-07T00:00:00.000Z","formattedDate":"August 7, 2024","tags":[{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Yul","permalink":"/blog/tags/yul"}],"readingTime":6.36,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 2","tags":["formal verification","Coq","Solidity","Yul"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 3","permalink":"/blog/2024/08/12/coq-of-solidity-3"},"nextItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 1","permalink":"/blog/2024/06/28/coq-of-solidity-1"}},"content":"We continue to work on our open source **formal verification** tool for [Solidity](https://soliditylang.org/) named [coq-of-solidity](https://github.com/formal-land/coq-of-solidity) \ud83d\udee0\ufe0f. Formal verification is the strongest form of code audits, as we verify that the code behaves correctly in all possible execution cases \ud83d\udd0d. We use the **interactive theorem prover** [Coq](https://coq.inria.fr/) to express and verify any kinds of properties.\\n\\nWe work by translating the [Yul](https://docs.soliditylang.org/en/latest/yul.html) version of a smart contract to the formal language Coq \ud83d\udc13, in which we then express the code specifications/security properties and formally verify them \ud83d\udd04. The Yul language is an intermediate language used by the Solidity compiler and others to generate EVM bytecode. 
Yul is simpler than Solidity and at a higher level than the EVM bytecode, making it a good target for formal verification.\\n\\nIn this blog post we present the recent developments we made to simplify the reasoning \ud83e\udde0 about Yul programs once translated in Coq.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::success AlephZero\\n\\n_This development is made possible thanks to [AlephZero](https://alephzero.org/). We thank the Aleph Zero Foundation for their support to bring more security to the Web3 space \ud83d\ude4f._\\n\\n:::\\n\\n
\\n ![Ethereum in forest](2024-08-07/ethereum-in-forest.webp)\\n
\\n\\n## Workflow\\n\\nWe present here the general workflow to use `coq-of-solidity` to make sure your smart contracts contain no bugs \ud83d\udc1b.\\n\\n
\\n ![Workflow](2024-08-07/workflow.png)\\n
\\n\\nThe workflow is as follows:\\n\\n1. We start with a Solidity smart contract.\\n2. The Solidity compiler translates it to the intermediate language Yul.\\n3. The `coq-of-yul` tool generates a first Coq version. This version is very low-level, with, for example, variable names represented by the string of their names.\\n4. The `prepare.py` script makes as many refinements as possible in the Coq code to make it more readable and easier to reason about. For example, we order the functions definitions by the order in which they are used and replace the Yul variables by standard Coq variables.\\n5. As we are not fully automated yet for the refinements, we add another manual step where we, for example, name the memory locations so that they appear as variables instead of fixed integers.\\n6. We write in Coq the formal specification of what we expect our smart contract to do or not do. A formal specification is like a test but expressed with quantifiers (\u2200, \u2203) so that we cover all execution cases.\\n7. We write a formal proof showing that our smart contract indeed validates the formal specification for any user inputs and blockchain states.\\n8. You can now deploy your smart contract, having followed one of the most secure development methodologies.\\n\\n## Refinement step\\n\\nThe code that `coq-of-solidity` generates is very verbose. For example, for this Yul function generated by the Solidity compiler to make an addition with overflow check:\\n\\n```go\\nfunction checked_add_uint256(x) -> sum\\n{\\n sum := add(x, /** @src 0:419:421 \\"20\\" */ 0x14)\\n /// @src 0:33:3484 \\"contract ERC20 {...\\"\\n if gt(x, sum)\\n {\\n mstore(0, shl(224, 0x4e487b71))\\n mstore(4, 0x11)\\n revert(0, 0x24)\\n }\\n}\\n```\\nwe get a Coq translation:\\n```coq\\nCode.Function.make (\\n \\"checked_add_uint256\\",\\n [\\"x\\"],\\n [\\"sum\\"],\\n M.scope (\\n do! 
ltac:(M.monadic (\\n M.assign (|\\n [\\"sum\\"],\\n Some (M.call (|\\n \\"add\\",\\n [\\n M.get_var (| \\"x\\" |);\\n [Literal.number 0x14]\\n ]\\n |))\\n |)\\n )) in\\n do! ltac:(M.monadic (\\n M.if_ (|\\n M.call (|\\n \\"gt\\",\\n [\\n M.get_var (| \\"x\\" |);\\n M.get_var (| \\"sum\\" |)\\n ]\\n |),\\n M.scope (\\n do! ltac:(M.monadic (\\n M.expr_stmt (|\\n M.call (|\\n \\"mstore\\",\\n [\\n [Literal.number 0];\\n M.call (|\\n \\"shl\\",\\n [\\n [Literal.number 224];\\n [Literal.number 0x4e487b71]\\n ]\\n |)\\n ]\\n |)\\n |)\\n )) in\\n do! ltac:(M.monadic (\\n M.expr_stmt (|\\n M.call (|\\n \\"mstore\\",\\n [\\n [Literal.number 4];\\n [Literal.number 0x11]\\n ]\\n |)\\n |)\\n )) in\\n do! ltac:(M.monadic (\\n M.expr_stmt (|\\n M.call (|\\n \\"revert\\",\\n [\\n [Literal.number 0];\\n [Literal.number 0x24]\\n ]\\n |)\\n |)\\n )) in\\n M.pure BlockUnit.Tt\\n )\\n |)\\n )) in\\n M.pure BlockUnit.Tt\\n )\\n)\\n```\\n\\nThis is quite long to follow, and even harder to use to write formal proofs. We made a script [prepare.py](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret%40verify-erc20/CoqOfSolidity/test/libsolidity/semanticTests/various/erc20/prepare.py) that simplifies the code above to:\\n```coq\\nDefinition checked_add_uint256 (x : U256.t) : M.t U256.t :=\\n let~ sum := [[ add ~(| x, 0x14 |) ]] in\\n do~ [[\\n M.if_unit (| gt ~(| x, sum |),\\n do~ [[ mstore ~(| 0, (shl ~(| 224, 0x4e487b71 |)) |) ]] in\\n do~ [[ mstore ~(| 4, 0x11 |) ]] in\\n do~ [[ revert ~(| 0, 0x24 |) ]] in\\n M.pure tt\\n |)\\n ]] in\\n M.pure sum.\\n```\\nThis is much more readable. We have monadic notations to compose all the primitive Yul functions such as `mstore` and `revert`, that may cause side effects such as memory mutation or premature return. 
The code uses standard Coq variables and functions instead of strings, which simplifies the proofs.\\n\\nTo make sure that this transformation is correct, we also generate a Coq proof file showing that the original and transformed code from `prepare.py` are equivalent \u2714\ufe0f.\\n\\n### Next\\n\\nWe can simplify the code even further. For example:\\n\\n- We know that the functions `add`, `gt`, and `shl` are purely functional, so we could make this property explicit in the Coq code. For now they are called as monadic functions with the notation `f ~(| arg1, ..., argn |)` even though they never have side effects.\\n- The `mstore` function stores values at fixed addresses in the memory, here `0` and `4`. We could remove these memory operations by introducing named variables to hold the results instead.\\n\\nWe hope to be able to automate as many refinements as possible in the future, but for now we have to do some manual work \ud83d\udd27.\\n\\n## Manual refinements\\n\\nWe manually refine the code by showing that it returns the same result, for every possible input and initial memory state, as a simplified code written by hand. For the `checked_add_uint256` function above we use:\\n```coq\\nDefinition simulation_checked_add_uint256 (x y : Z) : Result.t Z :=\\n if x + y >=? 2 ^ 256 then\\n Result.Revert 0 0x24\\n else\\n Result.Ok (x + y).\\n```\\nHere, all the computations are made with the `Z` type of unbounded integers that are simpler to manipulate for the proofs. We use an `if` statement to explicitly detect the overflows. The revert statement has the same parameters as in the original code, but we do not fill the memory area `0` to `0x24` anymore. The reason is that we ignore what `revert` returns in our specifications, as this is not relevant for now and also simplifies the proofs.\\n\\nIn the code above we do not manipulate the memory anymore. 
In general, we do the following kinds of refinements:\\n\\n- Using unbounded integers with explicit overflow checks instead of the fixed-size integers of the EVM.\\n- Using side effects only when necessary, for example for the `revert` statement.\\n- Removing memory operations by introducing named variables to hold the results.\\n- Simplifying the storage accesses by using explicit arrays or maps instead of the `keccak256` hash encoding of the addresses.\\n- Using explicit names for the entrypoints instead of binary encoding with the `keccak256` function.\\n\\nFor now these transformations are manual or semi-automated, but we hope to automate them as much as possible in the future. By proving that `simulation_checked_add_uint256` behaves as the original `checked_add_uint256` function we are sure that we can reason on the simplified code instead of the original one without losing any information \ud83d\udd0d.\\n\\n:::success Get started\\n\\nTo audit your smart contracts with the method above contact us at [ \ud83d\udce7contact@formal.land](mailto:contact@formal.land).\\n\\nCompared to other auditing methods, formal verification has the strong advantage of covering all possible execution cases \ud83d\udcaa.\\n\\n:::\\n\\n## Conclusion\\n\\nWe have presented the current status of our work to formally verify smart contracts, especially the refinement steps that make the reasoning possible. In our next posts we will continue to see how we can verify a full smart contract \ud83d\udd2e."},{"id":"/2024/06/28/coq-of-solidity-1","metadata":{"permalink":"/blog/2024/06/28/coq-of-solidity-1","source":"@site/blog/2024-06-28-coq-of-solidity-1.md","title":"\ud83e\ude81 Coq of Solidity \u2013 part 1","description":"Solidity is the most widely used smart contract language on the blockchain. 
As smart contracts are critical software handling a lot of money, there is a huge interest in finding all possible bugs before putting them into production.","date":"2024-06-28T00:00:00.000Z","formattedDate":"June 28, 2024","tags":[{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Yul","permalink":"/blog/tags/yul"}],"readingTime":16.26,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 1","tags":["formal verification","Coq","Solidity","Yul"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 2","permalink":"/blog/2024/08/07/coq-of-solidity-2"},"nextItem":{"title":"\ud83e\udd84 Software correctness from first principles","permalink":"/blog/2024/06/05/software-correctness-from-first-principles"}},"content":"[Solidity](https://soliditylang.org/) is the most widely used **smart contract language** on the blockchain. As smart contracts are **critical software** handling a lot of money, there is a huge interest in finding **all possible bugs** before putting them into production.\\n\\n**Formal verification** is a technique to test a program on all possible entries, even when there are **infinitely many**. This contrasts with the traditional test techniques, which can only execute a finite set of scenarios. As such, it appears to be an ideal way to bring more security to smart contract audits.\\n\\nIn this blog post, we present the **formal verification tool `coq-of-solidity`** that we are developing for Solidity. Its specificities are that:\\n\\n1. It is open-source (GPL-3 for the translation, MIT for the proofs).\\n2. 
It uses an interactive theorem prover, the system Coq, to verify arbitrarily complex properties.\\n\\nHere, we present how we translate Solidity code into Coq using the intermediate language [Yul](https://docs.soliditylang.org/en/latest/yul.html). We explain the semantics we use and what remains to be done.\\n\\nThe code is available in our fork of the Solidity compiler at [github.com/formal-land/coq-of-solidity](https://github.com/formal-land/coq-of-solidity).\\n\\n\x3c!-- truncate --\x3e\\n\\n:::info AlephZero\\n\\n_We are happy to be working with [AlephZero](https://alephzero.org/) to develop tools to bring more security for the audit of Solidity smart contracts, thanks to the use of formal verification and the interactive theorem prover [Coq](https://coq.inria.fr/). We thank the Aleph Zero Foundation for their support._\\n\\n:::\\n\\n
\\n ![Ethereum in forest](2024-06-28/ethereum-in-forest.webp)\\n
\\n\\n## Architecture of the tool\\n\\nWe reuse the code of the standard Solidity compiler `solc` in order to make sure that we can stay in sync with the evolution of the language and be compatible with all the Solidity features. Thus, our most straightforward path to implementing a translation tool from Solidity to Coq was to fork the C++ code of `solc` in [github.com/formal-land/coq-of-solidity](https://github.com/formal-land/coq-of-solidity). We add a new `solc` flag, `--ir-coq`, that tells the compiler to also generate a Coq output in addition to the expected EVM bytecode.\\n\\nAt first, we looked at the direct translation from the Solidity language to Coq, but this was getting too complex. We changed our strategy to instead target the Yul language, an intermediate language used by the Solidity compiler as an intermediate step in its translation to the EVM bytecode. The Yul language is simpler than Solidity and is still higher-level than the EVM bytecode, making it a good target for formal verification. In contrast to the EVM bytecode, there are no explicit stack-manipulation or `goto` instructions in Yul, which simplifies formal verification.\\n\\nTo give an idea of the size difference between Solidity and Yul, here are the files to export these languages to JSON in the Solidity compiler:\\n\\n- [ast/ASTJsonExporter.cpp](https://github.com/ethereum/solidity/blob/develop/libsolidity/ast/ASTJsonExporter.cpp): Solidity to JSON, 1127 lines\\n- [libyul/AsmJsonConverter.cpp](https://github.com/ethereum/solidity/blob/develop/libyul/AsmJsonConverter.cpp): Yul to JSON, 205 lines\\n\\nThe Solidity language appears more complex than Yul, as the code to translate it to JSON is more than five times longer.\\n\\nWe copied the file `libyul/AsmJsonConverter.cpp` above to make a version that translates Yul to Coq: [libyul/AsmCoqConverter.cpp](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret@experiments-with-yul/libyul/AsmCoqConverter.cpp). 
We reused the code for compilation flags to add a new option `--ir-coq`, which runs the conversion to Coq instead of the conversion to JSON.\\n\\n## Translation of Yul\\n\\nTo limit the size of the generated Coq code, we translate the Yul code after the optimization passes. This helps to remove boilerplate code but may make the Yul code less relatable to the Solidity sources. Thankfully, the optimized Yul code is still readable in our tests, and the Solidity compiler can pretty-print a version of the optimized Yul code with comments to quote the corresponding Solidity source code.\\n\\nAs an example, here is how we translate the [if keyword](https://docs.soliditylang.org/en/latest/yul.html#if) of Yul:\\n\\n```cpp\\nstd::string AsmCoqConverter::operator()(If const& _node)\\n{\\n\\tyulAssert(_node.condition, \\"Invalid if condition.\\");\\n\\tstd::string ret = \\"M.if_ (|\\\\n\\";\\n\\tm_indent++;\\n\\tret += indent() + std::visit(*this, *_node.condition) + \\",\\\\n\\";\\n\\tret += indent() + (*this)(_node.body) + \\"\\\\n\\";\\n\\tm_indent--;\\n\\tret += indent() + \\"|)\\";\\n\\n\\treturn ret;\\n}\\n```\\n\\nWe convert each Yul `_node` to an `std::string` that represents the Coq code. We use the `m_indent` variable to keep track of the indentation level, and the `indent()` function to add the right number of spaces at the beginning of each line. 
We do not need to add extra parentheses to disambiguate priorities, as the Yul language is simple enough.\\n\\nHere is the generated Coq code for the beginning of the [erc20.sol](https://github.com/ethereum/solidity/blob/develop/test/libsolidity/semanticTests/various/erc20.sol) example from the Solidity compiler\'s test suite:\\n\\n```coq\\n(* Generated by solc *)\\nRequire Import CoqOfSolidity.CoqOfSolidity.\\n\\nModule ERC20_403.\\n Definition code : M.t BlockUnit.t :=\\n do* ltac:(M.monadic (\\n M.function (|\\n \\"allocate_unbounded\\",\\n [],\\n [\\"memPtr\\"],\\n do* ltac:(M.monadic (\\n M.assign (|\\n [\\"memPtr\\"],\\n Some (M.call (|\\n \\"mload\\",\\n [\\n [Literal.number 64]\\n ]\\n |))\\n |)\\n )) in\\n M.od\\n |)\\n )) in\\n do* ltac:(M.monadic (\\n M.function (|\\n \\"revert_error_ca66f745a3ce8ff40e2ccaf1ad45db7774001b90d25810abd9040049be7bf4bb\\",\\n [],\\n [],\\n do* ltac:(M.monadic (\\n M.expr_stmt (|\\n M.call (|\\n \\"revert\\",\\n [\\n [Literal.number 0];\\n [Literal.number 0]\\n ]\\n |)\\n |)\\n )) in\\n M.od\\n |)\\n )) in\\n (* ... 6,000 remaining lines ... *)\\n```\\n\\nThis code is quite verbose, for an original smart contract size of 100 lines of Solidity. As a reference, the corresponding Yul code is 1,000 lines long and starts with:\\n\\n```go\\n/// @use-src 0:\\"erc20.sol\\"\\nobject \\"ERC20_403\\" {\\n code {\\n function allocate_unbounded() -> memPtr\\n { memPtr := mload(64) }\\n function revert_error_ca66f745a3ce8ff40e2ccaf1ad45db7774001b90d25810abd9040049be7bf4bb()\\n { revert(0, 0) }\\n // ... 1,000 remaining lines ...\\n```\\n\\nThe content is actually the same up to the notations, but we use many more line breaks and keywords in the Coq version.\\n\\n## Runtime in Coq\\n\\nNow that the code is translated to Coq, we need to define a _runtime_ for the Coq code. This means giving a definition for all the functions and types that are used in the generated code, like `M.t BlockUnit.t`, `M.monadic`, `M.function`, ... 
This runtime gives the semantics of the Yul language, that is to say, the meaning of all the primitives of the language.\\n\\n### Notation\\n\\nWe first define a monadic notation `ltac:(M.monadic ...)` to make a [monadic transformation](https://xavierleroy.org/mpri/2-4/monads.pdf) on the generated code. We reuse here what we have done for our [Rust translation to Coq](https://github.com/formal-land/coq-of-rust), which we describe in our blog post [\ud83e\udd80 Monadic notation for the Rust translation](/blog/2024/04/03/monadic-notation-for-rust-translation). The notation:\\n\\n```coq\\nf (| x_1, ..., x_n |)\\n```\\n\\ncorresponds to the call of a monadic function. The tactic `M.monadic` automatically chains all these calls using the monadic bind operator.\\n\\nThe `do* ... in ...` is another monadic notation to chain monadic expressions, directly calling the monadic bind. This notation is more explicit, and we use it in combination with the `ltac:(M.monadic ...)` notation as it might be more efficient to type-check very large files.\\n\\n### Monad\\n\\nTo represent the side effects in Yul, we use the following Coq monad, that we define in [CoqOfSolidity/CoqOfSolidity.v](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret%40experiments-with-yul/CoqOfSolidity/CoqOfSolidity.v):\\n\\n\\n```coq\\nModule U256.\\n Definition t := Z.\\nEnd U256.\\n\\nModule Environment.\\n Record t : Set := {\\n caller : U256.t;\\n (** Amount of wei sent to the current contract *)\\n callvalue : U256.t;\\n calldata : list Z;\\n (** The address of the contract. *)\\n address : U256.t;\\n }.\\nEnd Environment.\\n\\nModule BlockUnit.\\n (** The return value of a code block. 
*)\\n Inductive t : Set :=\\n (** The default value in case of success *)\\n | Tt\\n (** The instruction `break` was called *)\\n | Break\\n (** The instruction `continue` was called *)\\n | Continue\\n (** The instruction `leave` was called *)\\n | Leave.\\nEnd BlockUnit.\\n\\nModule Result.\\n (** A wrapper for the result of an expression or a code block. We can either return a normal value\\n with [Ok], or a special instruction [Return] that will stop the execution of the contract. *)\\n Inductive t (A : Set) : Set :=\\n | Ok (output : A)\\n | Return (p s : U256.t)\\n | Revert (p s : U256.t).\\n Arguments Ok {_}.\\n Arguments Return {_}.\\n Arguments Revert {_}.\\nEnd Result.\\n\\nModule Primitive.\\n (** We group together primitives that share being impure functions operating over the state. *)\\n Inductive t : Set -> Set :=\\n | OpenScope : t unit\\n | CloseScope : t unit\\n | GetVar (name : string) : t U256.t\\n | DeclareVars (names : list string) (values : list U256.t) : t unit\\n | AssignVars (names : list string) (values : list U256.t) : t unit\\n | MLoad (address length : U256.t) : t (list Z)\\n | MStore (address : U256.t) (bytes : list Z) : t unit\\n | SLoad (address : U256.t) : t U256.t\\n | SStore (address value : U256.t) : t unit\\n | RLoad : t (list Z)\\n | TLoad (address : U256.t) : t U256.t\\n | TStore (address value : U256.t) : t unit\\n | Log (topics : list U256.t) (payload : list Z) : t unit\\n | GetEnvironment : t Environment.t\\n | GetNonce : t U256.t\\n | GetCodedata (address : U256.t) : t (list Z)\\n | CreateAccount (address code : U256.t) (codedata : list Z) : t unit\\n | UpdateCodeForDeploy (address code : U256.t) : t unit\\n | LoadImmutable (name : U256.t) : t U256.t\\n | SetImmutable (name value : U256.t) : t unit\\n (** The call stack is there to debug the semantics of Yul. 
*)\\n | CallStackPush (name : string) (arguments : list (string * U256.t)) : t unit\\n | CallStackPop : t unit.\\nEnd Primitive.\\n\\nModule LowM.\\n Inductive t (A : Set) : Set :=\\n | Pure (output : A)\\n | Primitive {B : Set}\\n (primitive : Primitive.t B)\\n (k : B -> t A)\\n | DeclareFunction\\n (name : string)\\n (body : list U256.t -> t (Result.t (list U256.t)))\\n (k : t A)\\n | CallFunction\\n (name : string)\\n (arguments : list U256.t)\\n (k : Result.t (list U256.t) -> t A)\\n | Loop {B : Set}\\n (body : t B)\\n (** The final value to return if we decide to break out of the loop. *)\\n (break_with : B -> option B)\\n (k : B -> t A)\\n | CallContract\\n (address : U256.t)\\n (value : U256.t)\\n (input : list Z)\\n (k : U256.t -> t A)\\n (** Explicit cut in the monadic expressions, to provide better composition for the proofs. *)\\n | Let {B : Set} (e1 : t B) (k : B -> t A)\\n | Impossible (message : string).\\nEnd LowM.\\n\\nModule M.\\n Definition t (A : Set) := LowM.t (Result.t A).\\n```\\n\\nThe only type for values in Yul is the 256-bit unsigned integer `U256.t` that we represent with the `Z` type of Coq. The `BlockUnit.t` type represents the possible outcomes of a block of code:\\n\\n- `Tt` for the normal ending;\\n- `Break` or `Continue` to propagate a premature return from a call to the `break` or `continue` primitives;\\n- `Leave` to propagate the call to the `leave` primitive to terminate a function.\\n\\nWe define the monad in two steps. First, we define the `LowM.t` monad parameterized by the type of output `A`. 
The monad has the following constructors:\\n\\n- `Pure` to return a value without side effects;\\n- `Primitive` to execute one of the primitives, which are functions operating over the state (listed in `Primitive.t`);\\n- `DeclareFunction` to declare a function with a name and a body, which is a function taking a list of arguments and returning a list of results, as this is the case in Yul;\\n- `CallFunction` to call a function by its name with a list of arguments;\\n- `Loop` to execute a block of code in a loop, with a function to decide if we should break the loop, helpful to implement the `for` construct;\\n- `CallContract`, a dedicated primitive to implement the `call` instruction of the EVM to call another contract located at a certain address;\\n- `Let` to compose two monadic expressions in a more explicit way than using the continuations;\\n- `Impossible` to signal an unexpected branch in the code.\\n\\nThis monad is purely descriptive. We give the list of primitives but we do not explain here how each operator behaves. Most of the primitives take a continuation `k`, which is a function from the output of the primitive to the rest of the code. This is a way to chain the primitives together. For convenience we define a monadic bind `let_` that composes two monadic expressions by chaining these continuations.\\n\\nThen we define a monad `M.t` as:\\n\\n```coq\\nModule M.\\n Definition t (A : Set) := LowM.t (Result.t A).\\n```\\n\\nto represent calculations that return a `Result.t` to take into account that a contract might return or revert at any point in its execution.\\n\\nFinally, we define the Yul keywords from these primitives. 
For example, for the `if` keyword:\\n\\n```coq\\nDefinition if_ (condition : list U256.t) (success : t BlockUnit.t) : t BlockUnit.t :=\\n match condition with\\n | [0] => pure BlockUnit.Tt\\n | [_] => success\\n | _ => LowM.Impossible \\"if: expected a single value as condition\\"\\n end.\\n```\\n\\n### Evaluation rules\\n\\nTo define how to run the primitives of the Yul\'s monad, we use evaluation rules in [CoqOfSolidity/simulations/CoqOfSolidity.v](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret%40experiments-with-yul/CoqOfSolidity/simulations/CoqOfSolidity.v):\\n\\n```coq\\nModule Run.\\n Reserved Notation \\"{{ environment , state | e \u21d3 output | state\' }}\\"\\n (at level 70, no associativity).\\n\\n Inductive t {A : Set} (environment : Environment.t) (state : State.t) (output : A) :\\n LowM.t A -> State.t -> Prop :=\\n | Pure : {{ environment, state | LowM.Pure output \u21d3 output | state }}\\n | Primitive {B : Set} (primitive : Primitive.t B) (k : B -> LowM.t A) value state_inter state\' :\\n inl (value, state_inter) = eval_primitive environment primitive state ->\\n {{ environment, state_inter | k value \u21d3 output | state\' }} ->\\n {{ environment, state | LowM.Primitive primitive k \u21d3 output | state\' }}\\n | DeclareFunction name body k stack_inter state\' :\\n inl stack_inter = Stack.declare_function state.(State.stack) name body ->\\n let state_inter := state <| State.stack := stack_inter |> in\\n {{ environment, state_inter | k \u21d3 output | state\' }} ->\\n {{ environment, state | LowM.DeclareFunction name body k \u21d3 output | state\' }}\\n | CallFunction name arguments k results state_inter state\' :\\n let function := Stack.get_function state.(State.stack) name in\\n {{ environment, state | function arguments \u21d3 results | state_inter }} ->\\n {{ environment, state_inter | k results \u21d3 output | state\' }} ->\\n {{ environment, state | LowM.CallFunction name arguments k \u21d3 output | state\' }}\\n | Let {B 
: Set} (e1 : LowM.t B) k state_inter output_inter state\' :\\n {{ environment, state | e1 \u21d3 output_inter | state_inter }} ->\\n {{ environment, state_inter | k output_inter \u21d3 output | state\' }} ->\\n {{ environment, state | LowM.Let e1 k \u21d3 output | state\' }}\\n\\n where \\"{{ environment , state | e \u21d3 output | state\' }}\\" :=\\n (t environment state output e state\').\\nEnd Run.\\n```\\n\\nWe use the notation:\\n\\n```coq\\n{{ environment , state | e \u21d3 output | state\' }}\\n```\\n\\nto say that a certain monadic expression `e` evaluates to the value `output`, with the environment `environment`, the initial state `state`, and the final state `state\'`. We define the evaluation rules for each primitive of the monad.\\n\\n### Evaluation function\\n\\nWe also define an evaluation function that will be useful in further tests to extract the Coq code back to OCaml and run tests to compare its behavior with the original Yul code. We define the evaluation function as follows:\\n\\n```coq\\n(** A function to evaluate an expression given enough [fuel]. *)\\nFixpoint eval {A : Set}\\n (fuel : nat)\\n (environment : Environment.t)\\n (e : LowM.t A) :\\n State.t -> (A + string) * State.t :=\\n match fuel with\\n | O => fun state => (inr \\"out of fuel\\", state)\\n | S fuel =>\\n match e with\\n | LowM.Pure output => fun state => (inl output, state)\\n | LowM.Primitive primitive k =>\\n fun state =>\\n let value_state := eval_primitive environment primitive state in\\n match value_state with\\n | inl (value, state) => eval fuel environment (k value) state\\n | inr error => (inr error, state)\\n end\\n | LowM.DeclareFunction name body k =>\\n (* ... other cases ... *)\\n```\\n\\nIt uses a `fuel` parameter to make sure that the evaluation terminates. For a monadic expression `e` and an initial state and environment, it returns either the value of the expression or an error message, as well as a final state. 
The error might be due to an unexpected branch in the code, like a `break` outside a loop, or to a lack of fuel. We plan to prove that it is equivalent to the evaluation rules defined above.\\n\\n## Testing\\n\\nTo test that our translation works, we ran it on all the Solidity files in the test suite of the Solidity compiler. There are, at the time of writing, 4856 `.sol` example files in the [semanticTests](https://github.com/ethereum/solidity/tree/develop/test/libsolidity/semanticTests) and [syntaxTests](https://github.com/ethereum/solidity/tree/develop/test/libsolidity/syntaxTests) folders. On each of them we run the Solidity compiler with the `--ir-coq` flag to generate the Coq code. This works for most of the test files, although some of the test files have a special format that combines several Solidity files into one file, which we do not handle yet. We then type-check the generated code with Coq, which succeeds for all the Solidity files we translate.\\n\\nA more complex check is to ensure that our semantics is correct, that is to say that when we run our `eval` function in Coq on a smart contract, we get the same output as running this smart contract on an actual EVM once compiled with the Solidity compiler. We have a mechanism to convert the expected execution traces in the semantic tests into equivalent checks in Coq. We succeed in more than 90% of the test cases now. There are still a few builtin functions that we need to implement, like pre-compiled contracts.\\n\\n## Existing solutions\\n\\nThere are already a few formal verification tools for Solidity, as smart contracts are an important kind of program to check. A few of them, like the [Certora Prover](https://www.certora.com/), are closed source. Most work at the EVM bytecode level, as the semantics of the EVM is simpler than the semantics of Solidity. A disadvantage of working at the EVM level is that this is a low-level language, so the code is hard to understand (explicit stack manipulations, ...). 
This is the reason why we believe this approach is mostly used with automated verification tools.\\n\\nIt is hard to achieve reasonably complete support for the Solidity language, despite many attempts including [one of ours](https://gitlab.com/formal-land/coq-of-solidity). We can cite the [Verisol](https://github.com/microsoft/verisol) project from Microsoft to verify Solidity programs.\\n\\nThe Yul language offers a good compromise between the high-level Solidity language and the low-level EVM bytecode. It was actually designed with *formal verification in mind*, according to its documentation. These [notes](https://hackmd.io/@FranckC/BJz02K4Za) from [Franck Cassez](https://franck44.github.io/) give a good overview of the formal verification efforts for Yul. One of the conclusions is that a lot of the existing work is either incomplete/unmaintained or not designed for the formal verification of smart contracts, but rather to verify the Yul language itself. As a result, they propose a formal verification framework for Yul in [Dafny](https://dafny.org/) with [yul-dafny](https://github.com/franck44/yul-dafny).\\n\\n:::warning For more\\n\\nIf you have smart contract projects that you want to formally verify, going further than a manual audit to find bugs, contact us at [contact@formal.land](mailto:contact@formal.land)! Formal verification has the strong advantage of covering all possible execution cases.\\n\\n:::\\n\\n## Conclusion\\n\\nWe have presented our ongoing development of a formal verification tool for Solidity using the Coq proof assistant. We have briefly shown how we translate Solidity code to Coq using the Yul intermediate language and how we define the semantics of Yul in Coq. We have tested our tool on the examples of the Solidity compiler\'s test suite to check that our formalization is correct.\\n\\nOur next steps will be to:\\n\\n1. Complete our definitions of Ethereum\'s primitives, to reach 100% success on the Solidity test suite.\\n2. 
Formally specify and verify an example of contract, looking at the [erc20.sol](https://github.com/formal-land/coq-of-solidity/blob/guillaume-claret%40experiments-with-yul/test/libsolidity/semanticTests/various/erc20.sol) example."},{"id":"/2024/06/05/software-correctness-from-first-principles","metadata":{"permalink":"/blog/2024/06/05/software-correctness-from-first-principles","source":"@site/blog/2024-06-05-software-correctness-from-first-principles.md","title":"\ud83e\udd84 Software correctness from first principles","description":"Formal verification is a technique to verify the absence of bugs in a program by reasoning from first principles. Instead of testing a program on examples, what covers a finite number of cases, formal verification checks all possible cases. It does so by going back to the definition of programming languages, showing why the whole code is correct given how each individual keyword behaves.","date":"2024-06-05T00:00:00.000Z","formattedDate":"June 5, 2024","tags":[{"label":"formal verification","permalink":"/blog/tags/formal-verification"},{"label":"software correctness","permalink":"/blog/tags/software-correctness"},{"label":"first principles","permalink":"/blog/tags/first-principles"},{"label":"example","permalink":"/blog/tags/example"},{"label":"Python","permalink":"/blog/tags/python"}],"readingTime":7.425,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd84 Software correctness from first principles","tags":["formal verification","software correctness","first principles","example","Python"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\ude81 Coq of Solidity \u2013 part 1","permalink":"/blog/2024/06/28/coq-of-solidity-1"},"nextItem":{"title":"\ud83d\udc0d Simulation of Python code from traces in Coq","permalink":"/blog/2024/05/22/translation-of-python-code-simulations-from-trace"}},"content":"**Formal verification** is a technique to verify the **absence of bugs** in a program by reasoning from 
**first principles**. Instead of testing a program on examples, which covers a finite number of cases, formal verification checks **all possible cases**. It does so by going back to the **definition of programming languages**, showing why the whole code is correct given how each individual keyword behaves.\\n\\nWe will present this idea in detail and illustrate how it works for a very simple example.\\n\\n\x3c!-- truncate --\x3e\\n\\n## Use of formal verification\\n\\nWe typically use formal verification for critical applications, where either:\\n\\n- life is at stake, like in the case of trains, airplanes, medical devices, or\\n- money is at stake, like in the case of financial applications.\\n\\nWith formal verification, in theory, **we can guarantee that the software will never fail**, as we can check **all possible cases** for a given property. A property can be that no non-admin user can read sensitive data, or that a program never fails with uncaught exceptions. For that to be truly the case, we need to verify the whole software stack for all the relevant properties.\\n\\nIn the research paper [Finding and Understanding Bugs in C Compilers](https://users.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf), no bugs were found in the middle-end of the formally verified [CompCert](https://en.wikipedia.org/wiki/CompCert) C compiler, while the other C compilers (GCC, LLVM, ...) all contained subtle bugs. This illustrates that formal verification can be an effective way to make complex software with zero bugs!\\n\\n## Definition of programming languages\\n\\nTo be able to reason about a program we go back to the definition of programming languages. The programming languages (C, JavaScript, Python, ...) are generally defined with a precise set of rules. 
For example, in Python, the `if` statement is [defined in the reference manual](https://docs.python.org/3/reference/compound_stmts.html#if) by:\\n\\n```python\\nif_stmt ::= \\"if\\" assignment_expression \\":\\" suite\\n (\\"elif\\" assignment_expression \\":\\" suite)*\\n [\\"else\\" \\":\\" suite]\\n```\\n> It selects exactly one of the suites by evaluating the expressions one by one until one is found to be true (see section Boolean operations for the definition of true and false); then that suite is executed (and no other part of the if statement is executed or evaluated). If all expressions are false, the suite of the else clause, if present, is executed.\\n>\\n> — The Python\'s reference manual\\n\\nThis means that the Python code:\\n\\n```python\\nif condition:\\n a\\nelse:\\n b\\n```\\n\\nwill execute `a` when the `condition` is true, and `b` otherwise. There are similar rules for all other program constructs (loops, function definitions, classes, ...).\\n\\nTo make these rules more manageable, we generally split them into two parts:\\n\\n- The syntax part, that defines what is a valid program in the language. For example, in Python, the syntax is defined by the [grammar](https://docs.python.org/3/reference/grammar.html).\\n- The semantics part, that defines what a program does. This is what we have seen above with the description of the behavior of the `if` statement.\\n\\nIn formal verification, we will focus on the semantics of programs, assuming that the syntax is already verified by the compiler or interpreter, generating \\"syntax errors\\" in case of ill-formed programs.\\n\\n## Example to verify\\n\\nWe consider this short Python example of a function returning the maximum number in a list:\\n\\n```python\\ndef my_max(l):\\n m = l[0]\\n for x in l:\\n if x > m:\\n m = x\\n return m\\n```\\n\\nWe assume that the list `l` is not empty and only contains integers. 
If we run it on a few examples:\\n\\n```python\\nmy_max([1, 2, 3]) # => 3\\nmy_max([3, 2, 1]) # => 3\\nmy_max([1, 3, 2]) # => 3\\n```\\n\\nit always returns `3`, the biggest number in the list! But can we make sure this is always the case?\\n\\nWe certainly cannot run `my_max` on all possible lists of integers, as there are infinitely many of them. We need to reason from the definition of the Python language, which is what we call formal verification reasoning.\\n\\n## Formal verification\\n\\nHere is a general specification that we give of the `my_max` function above:\\n\\n```python\\nforall (index : int) (l : list[int]),\\n 0 \u2264 index < len(l) \u21d2\\n l[index] \u2264 my_max(l)\\n```\\n\\nIt says that for every integer `index` and list of integers `l`, if the index is valid (at least `0` and strictly less than the length of the list), then the element at this index is less than or equal to the maximum of the list that we compute.\\n\\nTo verify this property for all possible lists `l`, we reason by induction. A non-empty list is either:\\n\\n- a list with one element, where the maximum is the only element, or\\n- a list with at least two elements, where the maximum is either the last element or the maximum of the rest of the list.\\n\\nAt the start of the code, we will always have:\\n\\n```python\\ndef my_max(l):\\n m = l[0]\\n```\\n\\nwith `m` being equal to the first item of the list. Then:\\n\\n- If the list has only one element, we iterate only once in the `for` loop, with `x` equal to `l[0]`. The condition:\\n ```python\\n if x > m:\\n ```\\n is then equivalent to:\\n ```python\\n if l[0] > l[0]:\\n ```\\n and is always false. We then return `m = l[0]`, which is the only element of the list, and it verifies our property as:\\n ```python\\n l[0] \u2264 l[0]\\n ```\\n- If the list has at least two elements, we unroll the code execution of the `for` loop and iterate over all the elements until the last one. 
Our induction hypothesis tells us that the property we verify is true for the first part of the list, excluding the last element. This means that:\\n ```python\\n l[index] \u2264 m\\n ```\\n for all `index` between `0` and `len(l) - 2`. When we reach the last element, we have:\\n ```python\\n if x > m:\\n m = x\\n ```\\n with `x` being `l[len(l) - 1]`. There are two possibilities. Either *(i)* `x` is less than or equal to `m`, and we do not update `m`, or *(ii)* `x` is greater than `m`, and we update `m` to `x`. In both cases, the property is verified for the whole list, including its last element, as:\\n 1. In the first case, `m` stays the same, so it is still larger than or equal to all the elements of the list except the last one, as well as larger than or equal to the last one according to this last `if` statement.\\n 2. In the second case, `m` is updated to `x`, which is the last element of the list and a greater value than the original `m`. This means that `m` is still larger than or equal to all the elements of the list except the last one, being larger than the original `m`, and larger than or equal to the last one as it is in fact equal to the last one.\\n\\nWe have now closed our induction proof and verified that our property is true for all possible lists of integers! The reasoning above is rather verbose but should actually correspond to the intuition of most programmers when reading this code.\\n\\nIn practice, with formal verification, the reasoning above is done in a proof assistant such as [Coq](https://coq.inria.fr/) to help make sure that we did not forget any case, and to automatically solve simple cases for us. 
Having a proof written in a proof language like Coq also allows us to re-run it to check that it is still valid after a change in the code, and allows third parties to check it without reading all the details.\\n\\n## Completing the property\\n\\nAn additional property that we did not verify is:\\n\\n```python\\nforall (l : list[int]),\\n exists (index : int),\\n 0 \u2264 index < len(l) and\\n l[index] = my_max(l)\\n```\\n\\nIt says that the maximum of the list is actually in the list. We can verify it by induction in the same way as we did for the first property. You can work out this verification as an exercise.\\n\\n:::info For more\\n\\nIf you want to go into more detail on the formal verification of Python programs, you can look at our [coq-of-python](https://github.com/formal-land/coq-of-python) project, where we define the semantics of Python in Coq and verify properties of Python programs (ongoing project!). We also provide formal verification services for [Rust](https://github.com/formal-land/coq-of-rust) and other languages like [OCaml](https://github.com/formal-land/coq-of-ocaml). Contact us at [contact@formal.land](mailto:contact@formal.land) to discuss if you have critical applications to check!\\n\\n:::\\n\\n## Conclusion\\n\\nWe have presented here the idea of **formal verification**, a technique to verify the absence of bugs in a program by reasoning from **first principles**. We have illustrated this idea for a simple Python example, showing how we can verify that a function computing the maximum of a list is correct **for all possible lists of integers**.\\n\\nWe will continue with more blog posts explaining what we can do with formal verification and why it matters. 
Feel free to share this post and to tell us what subjects you want to see covered!"},{"id":"/2024/05/22/translation-of-python-code-simulations-from-trace","metadata":{"permalink":"/blog/2024/05/22/translation-of-python-code-simulations-from-trace","source":"@site/blog/2024-05-22-translation-of-python-code-simulations-from-trace.md","title":"\ud83d\udc0d Simulation of Python code from traces in Coq","description":"In order to formally verify Python code in Coq our approach is the following:","date":"2024-05-22T00:00:00.000Z","formattedDate":"May 22, 2024","tags":[{"label":"coq-of-python","permalink":"/blog/tags/coq-of-python"},{"label":"Python","permalink":"/blog/tags/python"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"},{"label":"Ethereum","permalink":"/blog/tags/ethereum"},{"label":"simulation","permalink":"/blog/tags/simulation"},{"label":"trace","permalink":"/blog/tags/trace"}],"readingTime":8.59,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc0d Simulation of Python code from traces in Coq","tags":["coq-of-python","Python","Coq","translation","Ethereum","simulation","trace"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd84 Software correctness from first principles","permalink":"/blog/2024/06/05/software-correctness-from-first-principles"},"nextItem":{"title":"\ud83d\udc0d Simulation of Python code in Coq","permalink":"/blog/2024/05/14/translation-of-python-code-simulations"}},"content":"In order to formally verify Python code in Coq our approach is the following:\\n\\n1. Import Python code in Coq by running [coq-of-python](https://github.com/formal-land/coq-of-python).\\n2. Write a purely functional simulation in Coq of the code.\\n3. Show that this simulation is equivalent to the translation.\\n4. Verify the simulation.\\n\\nWe will show in this article how we can merge the steps 2. and 3. to save time in the verification process. 
We do so by relying on the proof mode of Coq and unification.\\n\\nOur mid-term goal is to formally specify the [Ethereum Virtual Machine](https://ethereum.org/en/developers/docs/evm/) (EVM) and prove that this specification is correct according to the [reference implementation of the EVM](https://github.com/ethereum/execution-specs) in Python. This would ensure that it is always up to date and exhaustive. The code of this project is open-source and available on GitHub: [formal-land/coq-of-python](https://github.com/formal-land/coq-of-python).\\n\\n\x3c!-- truncate --\x3e\\n\\n
\\n ![Python at work](2024-05-22/python.webp)\\n
\\n\\n## Our Python\'s monad \ud83d\udc0d\\n\\nWe put the Python code that we import in Coq in a monad `M` to represent all the features that are hard to express in Coq, mainly the side effects. This monad is a combination of two levels:\\n\\n- `LowM` for the side effects except the control flow.\\n- `M` that adds an error monad on top of `LowM` to handle the control flow (exceptions, `break` instruction, ...).\\n\\n### LowM\\n\\nHere is the definition of the `LowM` monad in [CoqOfPython.v](https://github.com/formal-land/coq-of-python/blob/main/CoqOfPython/CoqOfPython.v):\\n\\n```coq\\nModule Primitive.\\n Inductive t : Set -> Set :=\\n | StateAlloc (object : Object.t Value.t) : t (Pointer.t Value.t)\\n | StateRead (mutable : Pointer.Mutable.t Value.t) : t (Object.t Value.t)\\n | StateWrite (mutable : Pointer.Mutable.t Value.t) (update : Object.t Value.t) : t unit\\n | GetInGlobals (globals : Globals.t) (name : string) : t Value.t.\\nEnd Primitive.\\n\\nModule LowM.\\n Inductive t (A : Set) : Set :=\\n | Pure (a : A)\\n | CallPrimitive {B : Set} (primitive : Primitive.t B) (k : B -> t A)\\n | CallClosure {B : Set} (closure : Data.t Value.t) (args kwargs : Value.t) (k : B -> t A)\\n | Impossible.\\n Arguments Pure {_}.\\n Arguments CallPrimitive {_ _}.\\n Arguments CallClosure {_ _}.\\n Arguments Impossible {_}.\\n\\n Fixpoint bind {A B : Set} (e1 : t A) (e2 : A -> t B) : t B :=\\n match e1 with\\n | Pure a => e2 a\\n | CallPrimitive primitive k => CallPrimitive primitive (fun v => bind (k v) e2)\\n | CallClosure closure args kwargs k => CallClosure closure args kwargs (fun a => bind (k a) e2)\\n | Impossible => Impossible\\n end.\\nEnd LowM.\\n```\\n\\nThis is a monad defined by continuation (the variable `k`):\\n\\n- We terminate a computation with the primitive `Pure` and some result `a`, that can be any purely functional expression.\\n- We can call some primitives grouped in `Primitive.t` that are side effects:\\n - `StateAlloc` to allocate a new object in the 
memory,\\n - `StateRead` to read an object from the memory,\\n - `StateWrite` to write an object in the memory,\\n - `GetInGlobals` to read a global variable, doing name resolution. This is a side effect, as function definitions in Python do not need to be ordered.\\n- We can call a closure (an anonymous function) with `CallClosure`. This is required for termination, as we cannot define an eval function on the type of Python values since some do not terminate, like the [\u03a9 expression](https://medium.com/@dkeout/why-you-must-actually-understand-the-%CF%89-and-y-combinators-c9204241da7a). See our previous post [Translation of Python code to Coq](/blog/2024/05/10/translation-of-python-code) for our definition of Python values. The combinator `CallClosure` is also very convenient to modularize our proofs: we reason on each closure independently.\\n- We can mark a code path as unreachable with `Impossible`.\\n\\n### M\\n\\nThe final monad `M` is defined as:\\n\\n```coq\\nDefinition M : Set :=\\n LowM.t (Value.t + Exception.t).\\n```\\n\\nIt has no parameters as Python is dynamically typed, so all expressions have the same result type:\\n\\n- either a success value of type `Value.t`,\\n- or an exception of type `Exception.t`, with some special cases to represent a `return`, a `break`, or a `continue` instruction.\\n\\nWe define the monadic bind of `M` as for the error monad:\\n\\n```coq\\nDefinition bind (e1 : M) (e2 : Value.t -> M) : M :=\\n LowM.bind e1 (fun v => match v with\\n | inl v => e2 v\\n | inr e => LowM.Pure (inr e)\\n end).\\n```\\n\\n## Traces \ud83d\udc3e\\n\\nWe define our semantics of a computation `e` of type `M` in [simulations/proofs/CoqOfPython.v](https://github.com/formal-land/coq-of-python/blob/main/CoqOfPython/simulations/proofs/CoqOfPython.v) with the predicate:\\n\\n```coq\\n{{ stack, heap | e \u21d3 to_value | P_stack, P_heap }}\\n```\\n\\nthat we call a _run_ or a _trace_, saying that:\\n\\n- starting from the initial state `stack`, `heap`,\\n- the 
computation `e` terminates with a value,\\n- that is in the image of the function `to_value`,\\n- and with a final stack and heap that satisfy the predicates `P_stack` and `P_heap`.\\n\\nNote that we do not make explicit the resulting value and memory state of a computation in this predicate. We only say that they exist and satisfy a few properties, which are here for compositionality. We have a purely functional function `evaluate` that can derive the result of a run of a computation:\\n\\n```coq\\nevaluate :\\n forall `{Heap.Trait} {A B : Set}\\n {stack : Stack.t} {heap : Heap} {e : LowM.t B}\\n {to_value : A -> B} {P_stack : Stack.t -> Prop} {P_heap : Heap -> Prop}\\n (run : {{ stack, heap | e \u21d3 to_value | P_stack, P_heap }}),\\n A * { stack : Stack.t | P_stack stack } * { heap : Heap | P_heap heap }\\n```\\n\\nThe function `evaluate` is defined in Coq by a `Fixpoint`. Its result is what we call a _simulation_, which is a purely functional definition equivalent to the original computation `e` from Python. 
It is equivalent by construction.\\n\\n## Building a trace \ud83d\udd28\\n\\nA trace is an inductive in `Set` that we can build with the following constructors:\\n\\n```coq\\nInductive t `{Heap.Trait} {A B : Set}\\n (stack : Stack.t) (heap : Heap)\\n (to_value : A -> B) (P_stack : Stack.t -> Prop) (P_heap : Heap -> Prop) :\\n LowM.t B -> Set :=\\n(* [Pure] primitive *)\\n| Pure\\n (result : A)\\n (result\' : B) :\\n result\' = to_value result ->\\n P_stack stack ->\\n P_heap heap ->\\n {{ stack, heap |\\n LowM.Pure result\' \u21d3\\n to_value\\n | P_stack, P_heap }}\\n(* [StateRead] primitive *)\\n| CallPrimitiveStateRead\\n (mutable : Pointer.Mutable.t Value.t)\\n (object : Object.t Value.t)\\n (k : Object.t Value.t -> LowM.t B) :\\n IsRead.t stack heap mutable object ->\\n {{ stack, heap |\\n k object \u21d3\\n to_value\\n | P_stack, P_heap }} ->\\n {{ stack, heap |\\n LowM.CallPrimitive (Primitive.StateRead mutable) k \u21d3\\n to_value\\n | P_stack, P_heap }}\\n(* [CallClosure] primitive *)\\n| CallClosure {C : Set}\\n (f : Value.t -> Value.t -> M)\\n (args kwargs : Value.t)\\n (to_value_inter : C -> Value.t + Exception.t)\\n (P_stack_inter : Stack.t -> Prop) (P_heap_inter : Heap -> Prop)\\n (k : Value.t + Exception.t -> LowM.t B) :\\n let closure := Data.Closure f in\\n {{ stack, heap |\\n f args kwargs \u21d3\\n to_value_inter\\n | P_stack_inter, P_heap_inter }} ->\\n (* We quantify over every possible values as we cannot compute the result of the closure here.\\n We only know that it exists and respects some constraints in this inductive definition. *)\\n (forall value_inter stack_inter heap_inter,\\n P_stack_inter stack_inter ->\\n P_heap_inter heap_inter ->\\n {{ stack_inter, heap_inter |\\n k (to_value_inter value_inter) \u21d3\\n to_value\\n | P_stack, P_heap }}\\n ) ->\\n {{ stack, heap |\\n LowM.CallClosure closure args kwargs k \u21d3\\n to_value\\n | P_stack, P_heap }}\\n(* ...cases for the other primitives of the monad... 
*)\\n```\\n\\n### Pure\\n\\nIn the `Pure` case we return the final result of the computation. We check that the state fulfills the predicates `P_stack` and `P_heap`, and that the result is the image under the function `to_value` of some `result`.\\n\\n### CallPrimitiveStateRead\\n\\nTo read a value in memory, we rely on another predicate `IsRead` that checks if the `mutable` pointer is valid in the `stack` or `heap` and that the `object` is the value at this pointer. We then call the continuation `k` with this object. We have similar rules for allocating a new object in memory and writing at a pointer.\\n\\nNote that we parameterize all our semantics by `` `{Heap.Trait}`` that provides a specific `Heap` type with read and write primitives. We can choose the implementation of the memory model that we want to use in our simulations in order to simplify the reasoning.\\n\\n### CallClosure\\n\\nTo call a closure, we first evaluate the closure with the arguments and keyword arguments. We then call the continuation `k` with the result of the closure. We quantify over all possible results of the closure, as we cannot compute them here. Computing them would require defining a `Fixpoint` together with an `Inductive`, which is not possible in Coq. So we only know that the result of the closure exists, and can use the constraints on its result (the function `to_value_inter` and the predicates `P_stack_inter` and `P_heap_inter`) to build a run of the continuation.\\n\\nThe other constructors are not presented here but are similar to the above. We will also add a monadic primitive for loops with the following idea: we show that a loop terminates by building a trace, as traces are `Inductive` and so must be finite. 
We have no rules for the `Impossible` case, so building the trace of a computation also shows that the `Impossible` calls are in unreachable paths.\\n\\n## Example \ud83d\udd0d\\n\\nWe have applied this technique to a small code example with allocation, memory read, and closure call primitives. We were able to show that the resulting simulation obtained by running `evaluate` on the trace is equal to a simulation written by hand. The proof was just the tactic `reflexivity`. We believe that we can automate most of the tactics used to build a run, except for the allocations, where the user needs to make a choice (immediate, stack, or heap allocation, which address, ...).\\n\\nTo continue our experiments we now need to complete our semantics of Python, especially to take into account method and operator calls.\\n\\n## Conclusion\\n\\nWe have presented an alternative way to build simulations of imperative Python code in purely functional Coq code. The idea is to enable faster reasoning over Python code by removing the need to build explicit simulations. We plan to port this technique to other tools like [coq-of-rust](https://github.com/formal-land/coq-of-rust) as well.\\n\\nTo see what we can do for you, talk with us at [contact@formal.land](mailto:contact@formal.land) \ud83c\udfc7. For our previous projects, see our [formal verification of the Tezos\' L1](https://formal-land.gitlab.io/coq-tezos-of-ocaml/)!"},{"id":"/2024/05/14/translation-of-python-code-simulations","metadata":{"permalink":"/blog/2024/05/14/translation-of-python-code-simulations","source":"@site/blog/2024-05-14-translation-of-python-code-simulations.md","title":"\ud83d\udc0d Simulation of Python code in Coq","description":"We are continuing to specify the Ethereum Virtual Machine (EVM) in the formal verification language Coq. 
We are working from the automatic translation in Coq of the reference implementation of the EVM, which is written in the language Python.","date":"2024-05-14T00:00:00.000Z","formattedDate":"May 14, 2024","tags":[{"label":"coq-of-python","permalink":"/blog/tags/coq-of-python"},{"label":"Python","permalink":"/blog/tags/python"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"},{"label":"Ethereum","permalink":"/blog/tags/ethereum"}],"readingTime":6.63,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc0d Simulation of Python code in Coq","tags":["coq-of-python","Python","Coq","translation","Ethereum"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83d\udc0d Simulation of Python code from traces in Coq","permalink":"/blog/2024/05/22/translation-of-python-code-simulations-from-trace"},"nextItem":{"title":"\ud83d\udc0d Translation of Python code to Coq","permalink":"/blog/2024/05/10/translation-of-python-code"}},"content":"We are continuing to specify the [Ethereum Virtual Machine](https://ethereum.org/en/developers/docs/evm/) (EVM) in the formal verification language [Coq](https://coq.inria.fr/). We are working from the [automatic translation in Coq](https://github.com/formal-land/coq-of-python/tree/main/CoqOfPython/ethereum) of the [reference implementation of the EVM](https://github.com/ethereum/execution-specs), which is written in the language [Python](https://www.python.org/).\\n\\nIn this article, we will see how we specify the EVM in Coq by writing an interpreter that closely mimics the behavior of the Python code. We call that implementation a _simulation_ as it aims to reproduce the behavior of the Python code, the reference.\\n\\nIn contrast to the automatic translation from Python, the simulation is a manual translation written in idiomatic Coq. We expect it to be ten times smaller in lines compared to the automatic translation, and of about the same size as the Python code. 
This is because the automatic translation needs to encode all the Python-specific features in Coq, like variable mutations and the class system.\\n\\nIn the following article, we will show how we can prove that the simulation is correct, meaning that it behaves exactly like the automatic translation.\\n\\nThe code of this project is open-source and available on GitHub: [formal-land/coq-of-python](https://github.com/formal-land/coq-of-python). This work follows a call from [Vitalik Buterin](https://en.wikipedia.org/wiki/Vitalik_Buterin) for more formal verification of Ethereum\'s code.\\n\\n\x3c!-- truncate --\x3e\\n\\n
\\n ![Python writing simulations](2024-05-14/python_simulation.webp)\\n
\\n\\n## The `add` function \ud83e\uddee\\n\\nWe focus on a simulation for the `add` function in [vm/instructions/arithmetic.py](https://github.com/ethereum/execution-specs/blob/master/src/ethereum/paris/vm/instructions/arithmetic.py) that implements the addition primitive of the EVM. The Python code is:\\n\\n```python\\ndef add(evm: Evm) -> None:\\n \\"\\"\\"\\n Adds the top two elements of the stack together, and pushes the result back\\n on the stack.\\n\\n Parameters\\n ----------\\n evm :\\n The current EVM frame.\\n\\n \\"\\"\\"\\n # STACK\\n x = pop(evm.stack)\\n y = pop(evm.stack)\\n\\n # GAS\\n charge_gas(evm, GAS_VERY_LOW)\\n\\n # OPERATION\\n result = x.wrapping_add(y)\\n\\n push(evm.stack, result)\\n\\n # PROGRAM COUNTER\\n evm.pc += 1\\n```\\n\\nMost of the functions of the interpreter are written in this style. They take the global state of the interpreter, called `Evm` as input, and mutate it with the effect of the current instruction.\\n\\nThe `Evm` structure is defined as:\\n\\n```python\\n@dataclass\\nclass Evm:\\n \\"\\"\\"The internal state of the virtual machine.\\"\\"\\"\\n\\n pc: Uint\\n stack: List[U256]\\n memory: bytearray\\n code: Bytes\\n gas_left: Uint\\n env: Environment\\n valid_jump_destinations: Set[Uint]\\n logs: Tuple[Log, ...]\\n refund_counter: int\\n running: bool\\n message: Message\\n output: Bytes\\n accounts_to_delete: Set[Address]\\n touched_accounts: Set[Address]\\n return_data: Bytes\\n error: Optional[Exception]\\n accessed_addresses: Set[Address]\\n accessed_storage_keys: Set[Tuple[Address, Bytes32]]\\n```\\n\\nIt contains the current instruction pointer `pc`, the stack of the EVM, the memory, the code, the gas left, ...\\n\\nAs the EVM is a stack-based machine, the addition function does the following:\\n\\n1. It pops the two top elements of the stack `x` and `y`,\\n2. It charges a very low amount of gas,\\n3. It computes the result of the addition `result = x + y`,\\n4. It pushes the result back on the stack,\\n5. 
It increments the program counter `pc`.\\n\\nNote that all these operations might fail and raise an exception, for example, if the stack is empty when we pop `x` and `y` at the beginning.\\n\\n## Monad for the simulations \ud83e\uddea\\n\\nThe main side-effects that we want to integrate into the Coq simulations are:\\n\\n- the mutation of the global state `Evm`,\\n- the raising of exceptions.\\n\\nFor that, we use a state and error monad `MS?`:\\n\\n```coq\\nModule StateError.\\n Definition t (State Error A : Set) : Set :=\\n State -> (A + Error) * State.\\n\\n Definition return_ {State Error A : Set}\\n (value : A) :\\n t State Error A :=\\n fun state => (inl value, state).\\n\\n Definition bind {State Error A B : Set}\\n (value : t State Error A)\\n (f : A -> t State Error B) :\\n t State Error B :=\\n fun state =>\\n let (value, state) := value state in\\n match value with\\n | inl value => f value state\\n | inr error => (inr error, state)\\n end.\\nEnd StateError.\\n\\nNotation \\"MS?\\" := StateError.t.\\n```\\n\\nWe parametrize it by an equivalent definition in Coq of the type `Evm` and the type of exceptions that we might raise.\\n\\nIn Python the exceptions are a class that is extended as needed to add new kinds of exceptions. We use a closed sum type in Coq to represent all the possible exceptions that might happen in the EVM interpreter.\\n\\nFor the `Evm` state, some functions might actually only modify a part of it. For example, the `pop` function only modifies the `stack` field. We use the mechanism of [lenses](https://medium.com/javascript-scene/lenses-b85976cb0534) to specialize the state monad to only modify a part of the state. For example, the `pop` function has the type:\\n\\n```coq\\npop : MS? (list U256.t) Exception.t U256.t\\n```\\n\\nwhere `list U256.t` is the type of the stack, while the `add` function has type:\\n\\n```coq\\nadd : MS? 
Evm.t Exception.t unit\\n```\\n\\nWe define a lens for the stack in the `Evm` type with:\\n\\n```coq\\nModule Lens.\\n Record t (Big_A A : Set) : Set := {\\n read : Big_A -> A;\\n write : Big_A -> A -> Big_A\\n }.\\nEnd Lens.\\n\\nModule Evm.\\n Module Lens.\\n Definition stack : Lens.t Evm.t (list U256.t) := {|\\n Lens.read := (* ... *);\\n Lens.write := (* ... *);\\n |}.\\n```\\n\\nWe can then lift the `pop` function to be used in a context where the `Evm` state is modified with:\\n\\n```coq\\nletS? x := StateError.lift_lens Evm.Lens.stack pop in\\n```\\n\\n## Typing discipline \ud83d\udc6e\\n\\nWe keep in Coq all the type names from the Python source code. When a new class is created we create a new Coq type. When the class inherits from another one, we add a field in the Coq type to represent the parent class. Thus we work by composition rather than inheritance.\\n\\nHere is an example of the primitive types defined in [base_types.py](https://github.com/ethereum/execution-specs/blob/master/src/ethereum/base_types.py):\\n\\n```python\\nclass FixedUint(int):\\n MAX_VALUE: ClassVar[\\"FixedUint\\"]\\n\\n # ...\\n\\n def __add__(self: T, right: int) -> T:\\n # ...\\n\\nclass U256(FixedUint):\\n MAX_VALUE = 2**256 - 1\\n\\n # ...\\n```\\n\\nWe simulate it by:\\n\\n```coq\\nModule FixedUint.\\n Record t : Set := {\\n MAX_VALUE : Z;\\n value : Z;\\n }.\\n\\n Definition __add__ (self right_ : t) : M? Exception.t t :=\\n (* ... *).\\nEnd FixedUint.\\n\\nModule U256.\\n Inductive t : Set :=\\n | Make (value : FixedUint.t).\\n\\n Definition of_Z (value : Z) : t :=\\n Make {|\\n FixedUint.MAX_VALUE := 2^256 - 1;\\n FixedUint.value := value;\\n |}.\\n\\n (* ... 
*)\\nEnd U256.\\n```\\n\\nFor the imports, that are generally written with an explicit list of names:\\n\\n```python\\nfrom ethereum.base_types import U255_CEIL_VALUE, U256, U256_CEIL_VALUE, Uint\\n```\\n\\nwe follow the same pattern in Coq:\\n\\n```coq\\nRequire ethereum.simulations.base_types.\\nDefinition U255_CEIL_VALUE := base_types.U255_CEIL_VALUE.\\nModule U256 := base_types.U256.\\nDefinition U256_CEIL_VALUE := base_types.U256_CEIL_VALUE.\\nModule Uint := base_types.Uint.\\n```\\n\\nThis is a bit more verbose than the usual way in Coq to import a module, but it makes the translation more straightforward.\\n\\n## Final simulation \ud83e\udeb6\\n\\nFinally, our Coq simulation of the `add` function is the following:\\n\\n```coq\\nDefinition add : MS? Evm.t Exception.t unit :=\\n (* STACK *)\\n letS? x := StateError.lift_lens Evm.Lens.stack pop in\\n letS? y := StateError.lift_lens Evm.Lens.stack pop in\\n\\n (* GAS *)\\n letS? _ := charge_gas GAS_VERY_LOW in\\n\\n (* OPERATION *)\\n let result := U256.wrapping_add x y in\\n\\n letS? _ := StateError.lift_lens Evm.Lens.stack (push result) in\\n\\n (* PROGRAM COUNTER *)\\n letS? _ := StateError.lift_lens Evm.Lens.pc (fun pc =>\\n (inl tt, Uint.__add__ pc (Uint.Make 1))) in\\n\\n returnS? tt.\\n```\\n\\nWe believe that it has a size and readability close to the original Python code. You can look at this definition in [vm/instructions/simulations/arithmetic.v](https://github.com/formal-land/coq-of-python/blob/main/CoqOfPython/ethereum/paris/vm/instructions/simulations/arithmetic.v). As a reference, the automatic translation is 65 lines long and in [vm/instructions/arithmetic.v](https://github.com/formal-land/coq-of-python/blob/main/CoqOfPython/ethereum/paris/vm/instructions/arithmetic.v).\\n\\n## Conclusion\\n\\nWe have seen how to write a simulation for one example of a Python function. We now need to do it for the rest of the code of the interpreter. 
We will also see in a following article how to prove that the simulation behaves as the automatic translation of the Python code in Coq.\\n\\nFor our formal verification services, reach us at [contact@formal.land](mailto:contact@formal.land) \ud83c\udfc7! To know more about what we have done, see [our previous project](https://formal-land.gitlab.io/coq-tezos-of-ocaml/) on the verification of the L1 of Tezos."},{"id":"/2024/05/10/translation-of-python-code","metadata":{"permalink":"/blog/2024/05/10/translation-of-python-code","source":"@site/blog/2024-05-10-translation-of-python-code.md","title":"\ud83d\udc0d Translation of Python code to Coq","description":"We are starting to work on a new product, coq-of-python. The idea of this tool is, as you can guess, to translate Python code to the proof system Coq.","date":"2024-05-10T00:00:00.000Z","formattedDate":"May 10, 2024","tags":[{"label":"coq-of-python","permalink":"/blog/tags/coq-of-python"},{"label":"Python","permalink":"/blog/tags/python"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"},{"label":"Ethereum","permalink":"/blog/tags/ethereum"}],"readingTime":10.445,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc0d Translation of Python code to Coq","tags":["coq-of-python","Python","Coq","translation","Ethereum"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83d\udc0d Simulation of Python code in Coq","permalink":"/blog/2024/05/14/translation-of-python-code-simulations"},"nextItem":{"title":"\ud83e\udd80 Translation of the Rust\'s core and alloc crates","permalink":"/blog/2024/04/26/translation-core-alloc-crates"}},"content":"We are starting to work on a new product, [coq-of-python](https://github.com/formal-land/coq-of-python). 
The idea of this tool is, as you can guess, to translate Python code to the [proof system Coq](https://coq.inria.fr/).\\n\\nWe want to import specifications written in Python to a formal system like Coq. In particular, we are interested in the [reference specification](https://github.com/ethereum/execution-specs) of [Ethereum](https://ethereum.org/), which describes how [EVM smart contracts](https://ethereum.org/en/developers/docs/evm/) run. Then, we will be able to use this specification to either formally verify the various implementations of the EVM or smart contracts.\\n\\nAll this effort follows [a Tweet](https://twitter.com/VitalikButerin/status/1759369749887332577) from [Vitalik Buterin](https://en.wikipedia.org/wiki/Vitalik_Buterin) hoping for more formal verification of the Ethereum\'s code:\\n\\n> One application of AI that I am excited about is AI-assisted formal verification of code and bug finding.\\n>\\n> Right now ethereum\'s biggest technical risk probably is bugs in code, and anything that could significantly change the game on that would be amazing.\\n>\\n> — Vitalik Buterin\\n\\nWe will now describe the technical development of `coq-of-python`. For the curious, all the code is on GitHub: [formal-land/coq-of-python](https://github.com/formal-land/coq-of-python).\\n\\n\x3c!-- truncate --\x3e\\n\\n
\\n ![Python with a rooster](2024-05-10/python_rooster.webp)\\n \x3c!--
A python with a rooster
--\x3e\\n
\\n\\n## Reading Python code \ud83d\udcd6\\n\\nA first step we need to do to translate Python code is to read it in a programmatic way. For simplicity and better integration, we chose to write `coq-of-python` in Python.\\n\\nWe use the [ast](https://docs.python.org/3/library/ast.html) module to parse the code and get an abstract syntax tree (AST) of the code. This is a tree representation of the code that we can manipulate in Python. We could have used other representations, such as the Python bytecode, but it seemed too low-level to be understandable by a human.\\n\\nGiven the path to a Python file, we get its AST with the following code:\\n\\n```python\\nimport ast\\n\\ndef read_python_file(path: str) -> ast.Module:\\n with open(path, \\"r\\") as file:\\n return ast.parse(file.read())\\n```\\n\\nThis code is very short, and we benefit from the general elegance of Python. There is no typing or advanced data types in Python, keeping the AST rather small. Here is an extract of it:\\n\\n```\\nexpr = BoolOp(boolop op, expr* values)\\n | NamedExpr(expr target, expr value)\\n | BinOp(expr left, operator op, expr right)\\n | UnaryOp(unaryop op, expr operand)\\n | Lambda(arguments args, expr body)\\n | IfExp(expr test, expr body, expr orelse)\\n | Dict(expr* keys, expr* values)\\n | Set(expr* elts)\\n | ListComp(expr elt, comprehension* generators)\\n | SetComp(expr elt, comprehension* generators)\\n | ... more cases ...\\n```\\n\\nAn expression is described as being of one of several kinds. For example, the application of a binary operator such as:\\n\\n```python\\n1 + 2\\n```\\n\\ncorresponds to the case `BinOp` with `1` as the `left` expression, `+` as the `op` operator, and `2` as the `right` expression.\\n\\n## Outputting Coq code \ud83d\udcdd\\n\\nWe translate each element of the Python\'s AST into a string of Coq code. We keep track of the current indentation level in order to present a nice output. 
Here is the code to translate the binary operator expressions:\\n\\n```python\\ndef generate_expr(indent, is_with_paren, node: ast.expr):\\n if isinstance(node, ast.BoolOp):\\n ...\\n elif isinstance(node, ast.BinOp):\\n return paren(\\n is_with_paren,\\n generate_operator(node.op) + \\" (|\\\\n\\" +\\n generate_indent(indent + 1) +\\n generate_expr(indent + 1, False, node.left) + \\",\\\\n\\" +\\n generate_indent(indent + 1) +\\n generate_expr(indent + 1, False, node.right) + \\"\\\\n\\" +\\n generate_indent(indent) + \\"|)\\"\\n )\\n elif ...\\n```\\n\\nWe have the current number of indentation levels in the `indent` variable. We use the flag `is_with_paren` to know whether we should add parentheses around the current expression if it is the sub-expression of another one.\\n\\nWe apply the `node.op` operator to the two operands `node.left` and `node.right`. For example, the translation of the Python code `1 + 2` will be:\\n\\n```coq\\nBinOp.add (|\\n Constant.int 1,\\n Constant.int 2\\n|)\\n```\\n\\nWe use a special notation `f (| x1, ..., xn |)` to represent a function application in a monadic context. In the next section, we explain why we need this notation.\\n\\n## Monad and values \ud83d\udd2e\\n\\nOne of the difficulties in translating some code to a language such as Coq is that Coq is purely functional. This means that a function can never modify a variable or raise an exception. The non-purely functional actions are called side-effects.\\n\\nTo solve this issue, we represent the side-effects of the Python code in a monad in Coq. A monad is a special data structure representing the side-effects of a computation. 
We can chain monadic actions together to represent a sequence of side-effects.\\n\\nWe thus have two Coq types:\\n\\n- `Value.t` for the Python values (there is only one type for all values, as Python is a dynamically typed language),\\n- `M` for the monadic expressions.\\n\\nNote that we do not need to parametrize the monad by the type of the values, as we only have one type of value.\\n\\n### Values\\n\\nAccording to the reference manual of Python on the [data model](https://docs.python.org/3/reference/datamodel.html):\\n\\n> All data in a Python program is represented by objects or by relations between objects.\\n\\n> Every object has an identity, a type and a value. An object\u2019s identity never changes once it has been created; you may think of it as the object\u2019s address in memory.\\n\\n> Like its identity, an object\u2019s type is also unchangeable.\\n\\n> The value of some objects can change. Objects whose value can change are said to be mutable; objects whose value is unchangeable once they are created are called immutable.\\n\\nBy following this description, we propose this formalization for the values:\\n\\n```coq\\nModule Data.\\n Inductive t (Value : Set) : Set :=\\n | Ellipsis\\n | Bool (b : bool)\\n | Integer (z : Z)\\n | Tuple (items : list Value)\\n (* ... various other primitive types like lists, ... 
*)\\n | Closure {Value M : Set} (f : Value -> Value -> M)\\n | Klass {Value M : Set}\\n (bases : list (string * string))\\n (class_methods : list (string * (Value -> Value -> M)))\\n (methods : list (string * (Value -> Value -> M))).\\nEnd Data.\\n\\nModule Object.\\n Record t {Value : Set} : Set := {\\n internal : option (Data.t Value);\\n fields : list (string * Value);\\n }.\\nEnd Object.\\n\\nModule Pointer.\\n Inductive t (Value : Set) : Set :=\\n | Imm (data : Object.t Value)\\n | Mutable {Address A : Set}\\n (address : Address)\\n (to_object : A -> Object.t Value).\\nEnd Pointer.\\n\\nModule Value.\\n Inductive t : Set :=\\n | Make (globals : string) (klass : string) (value : Pointer.t t).\\nEnd Value.\\n```\\n\\nWe describe a `Value.t` by:\\n\\n- its type, given by a class name `klass` and a module name `globals` in which the class is defined,\\n- its value, given by a pointer to an object.\\n\\nA `Pointer.t` is either an immutable object `Imm` or a mutable object `Mutable` with an address and a function to get the object from what is stored in the memory. This function `to_object` is required as we plan to allow users to provide their own custom memory model.\\n\\nAn `Object.t` has a list of named fields that we can populate in the `__init__` method of a class. It also has a special `internal` field that we can use to store special kinds of data, like primitive values.\\n\\nIn `Data.t`, we list the various primitive values that we use to define the primitive types of the Python language. 
We have:\\n\\n- atomic values such as booleans, integers, strings, ...\\n- composite values such as tuples, lists, dictionaries, ...\\n- closures with a function that takes the two arguments `*args` and `**kwargs` and returns a monadic value,\\n- classes with their bases, class methods, and instance methods.\\n\\n### Monad\\n\\nFor now, we axiomatize the monad `M`:\\n\\n```coq\\nParameter M : Set.\\n```\\n\\nWe will see later how to define it, probably by taking some inspiration from our monad from our similar project [coq-of-rust](https://github.com/formal-land/coq-of-rust).\\n\\nTo make the monadic code less heavy, we use a notation inspired by the `async/await` notation of many languages. We believe it to be less heavy than the monadic notation of languages like [Haskell](https://www.haskell.org/). We note:\\n\\n```coq\\nf (| x1, ..., xn |)\\n```\\n\\nto call a function `f` of type:\\n\\n```coq\\nValue.t -> ... -> Value.t -> M\\n```\\n\\nwith the arguments `x1`, ..., `xn` of type `Value.t` and binds its result to the current continuation in the context of the tactic `ltac:(M.monadic ...)`. See our blog post [Monadic notation for the Rust translation](/blog/2024/04/03/monadic-notation-for-rust-translation) for more information.\\n\\nIn summary:\\n\\n- `f (| x1, ..., xn |)` is like `await`,\\n- `ltac:(M.monadic ...)` is like `async`.\\n\\n## Handling of the names \ud83c\udff7\ufe0f\\n\\nNow we talk about how we handle the variable names and link them to their definitions. In the reference manual of Python, the part [Execution model](https://docs.python.org/3/reference/executionmodel.html) gives some information.\\n\\nFor now, we distinguish between two scopes, the global one (top-level definitions) and the local one for variables defined in a function. We might introduce a stack of local scopes to handle nested functions.\\n\\nWe name the global scope with a string, that is the path of the current file. 
Having absolute names helps us translate each file independently. The only file that a translated file requires is `CoqOfPython.CoqOfPython`, to have the definition of the values and the monad.\\n\\nTo translate `import` statements, we use assertions:\\n\\n```coq\\nAxiom ethereum_crypto_imports_elliptic_curve :\\n IsImported globals \\"ethereum.crypto\\" \\"elliptic_curve\\".\\nAxiom ethereum_crypto_imports_finite_field :\\n IsImported globals \\"ethereum.crypto\\" \\"finite_field\\".\\n```\\n\\nThis represents:\\n\\n```python\\nfrom . import elliptic_curve, finite_field\\n```\\n\\nIt means that in the current global scope `globals` we can use the name `\\"elliptic_curve\\"` from the other global scope `\\"ethereum.crypto\\"`.\\n\\nWe set the local scope at the entry of a function with the call:\\n\\n```coq\\nM.set_locals (| args, kwargs, [ \\"x1\\"; ...; \\"xn\\" ] |)\\n```\\n\\nfor a function whose parameter names are `x1`, ..., `xn`. For uniformity, we always group the function\'s parameters as `*args` and `**kwargs`. We do not yet handle the default values.\\n\\nWhen a user creates or updates a local variable `x` with a value `value`, we run:\\n\\n```coq\\nM.assign_local \\"x\\" value : M\\n```\\n\\nTo read a variable, we have a primitive:\\n\\n```coq\\nM.get_name : string -> string -> M\\n```\\n\\nIt takes as parameters the name of the current global scope and the name of the variable we are reading. The local scope should be accessible from the monad. For now, all these primitives are axiomatized.\\n\\n## Some numbers \ud83d\udcca\\n\\nThe code base that we analyze, the Python specification of Ethereum, contains _28,455 lines_ of Python, excluding comments. When we translate it to Coq we obtain _299,484 lines_. This is roughly a tenfold increase.\\n\\nThe generated code completely compiles. For now, we avoid some complex Python expressions, like list comprehensions, by generating a dummy expression instead.
Having all the code that compiles will allow us to iterate and add support for more Python features with a simple check: making sure that all the code still compiles.\\n\\nAs an example, we translate the following function:\\n\\n```python\\ndef bnf2_to_bnf12(x: BNF2) -> BNF12:\\n \\"\\"\\"\\n Lift a field element in `BNF2` to `BNF12`.\\n \\"\\"\\"\\n return BNF12.from_int(x[0]) + BNF12.from_int(x[1]) * (\\n BNF12.i_plus_9 - BNF12.from_int(9)\\n )\\n```\\n\\nto the Coq code:\\n\\n```coq\\nDefinition bnf2_to_bnf12 : Value.t -> Value.t -> M :=\\n fun (args kwargs : Value.t) => ltac:(M.monadic (\\n let _ := M.set_locals (| args, kwargs, [ \\"x\\" ] |) in\\n let _ := Constant.str \\"\\n Lift a field element in `BNF2` to `BNF12`.\\n \\" in\\n let _ := M.return_ (|\\n BinOp.add (|\\n M.call (|\\n M.get_field (| M.get_name (| globals, \\"BNF12\\" |), \\"from_int\\" |),\\n make_list [\\n M.get_subscript (|\\n M.get_name (| globals, \\"x\\" |),\\n Constant.int 0\\n |)\\n ],\\n make_dict []\\n |),\\n BinOp.mult (|\\n M.call (|\\n M.get_field (| M.get_name (| globals, \\"BNF12\\" |), \\"from_int\\" |),\\n make_list [\\n M.get_subscript (|\\n M.get_name (| globals, \\"x\\" |),\\n Constant.int 1\\n |)\\n ],\\n make_dict []\\n |),\\n BinOp.sub (|\\n M.get_field (| M.get_name (| globals, \\"BNF12\\" |), \\"i_plus_9\\" |),\\n M.call (|\\n M.get_field (| M.get_name (| globals, \\"BNF12\\" |), \\"from_int\\" |),\\n make_list [\\n Constant.int 9\\n ],\\n make_dict []\\n |)\\n |)\\n |)\\n |)\\n |) in\\n M.pure Constant.None_)).\\n```\\n\\n## Conclusion\\n\\nWe continue working on the translation from Python to Coq, especially to now add a semantics to the translation. Our next goal is to have a version, written in idiomatic Coq, of the file [src/ethereum/paris/vm/instructions/arithmetic.py](https://github.com/ethereum/execution-specs/blob/master/src/ethereum/paris/vm/instructions/arithmetic.py), and proven equal to the original code. 
This will open the door to making a Coq specification of the EVM that is always synchronized to the Python\'s version.\\n\\nFor our services, reach us at [contact@formal.land](mailto:contact@formal.land) \ud83c\udfc7! We want to ensure the blockchain\'s L1 and L2 are bug-free, thanks to a mathematical analysis of the code. See [our previous project](https://formal-land.gitlab.io/coq-tezos-of-ocaml/) on the L1 of Tezos."},{"id":"/2024/04/26/translation-core-alloc-crates","metadata":{"permalink":"/blog/2024/04/26/translation-core-alloc-crates","source":"@site/blog/2024-04-26-translation-core-alloc-crates.md","title":"\ud83e\udd80 Translation of the Rust\'s core and alloc crates","description":"We continue our work on formal verification of Rust programs with our tool coq-of-rust, to translate Rust code to the formal proof system Coq. One of the limitation we had was the handling of primitive constructs from the standard library of Rust, like Option::unwrapordefault or all other primitive functions. For each of these functions, we had to make a Coq definition to represent its behavior. 
This is both tedious and error prone.","date":"2024-04-26T00:00:00.000Z","formattedDate":"April 26, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"},{"label":"core","permalink":"/blog/tags/core"},{"label":"alloc","permalink":"/blog/tags/alloc"}],"readingTime":5.365,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Translation of the Rust\'s core and alloc crates","tags":["coq-of-rust","Rust","Coq","translation","core","alloc"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83d\udc0d Translation of Python code to Coq","permalink":"/blog/2024/05/10/translation-of-python-code"},"nextItem":{"title":"\ud83e\udd80 Monadic notation for the Rust translation","permalink":"/blog/2024/04/03/monadic-notation-for-rust-translation"}},"content":"We continue our work on formal verification of [Rust](https://www.rust-lang.org/) programs with our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust), to translate Rust code to the formal proof system [Coq](https://coq.inria.fr/). One of the limitation we had was the handling of primitive constructs from the standard library of Rust, like [Option::unwrap_or_default](https://doc.rust-lang.org/core/option/enum.Option.html#method.unwrap_or_default) or all other primitive functions. For each of these functions, we had to make a Coq definition to represent its behavior. This is both tedious and error prone.\\n\\nTo solve this issue, we worked on the translation of the [core](https://doc.rust-lang.org/core/) and [alloc](https://doc.rust-lang.org/alloc/) crates of Rust using `coq-of-rust`. These are very large code bases, with a lot of unsafe or advanced Rust code. We present what we did to have a \\"best effort\\" translation of these crates. 
The resulting translation is in the following folders:\\n\\n- [CoqOfRust/alloc](https://github.com/formal-land/coq-of-rust/tree/main/CoqOfRust/alloc)\\n- [CoqOfRust/core](https://github.com/formal-land/coq-of-rust/tree/main/CoqOfRust/core)\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Contact\\n\\nThis work is funded by the [Aleph Zero](https://alephzero.org/) crypto-currency to verify their Rust smart contracts. You can [follow us on X](https://twitter.com/FormalLand) to get our updates. We propose tools and services to make your codebase bug-free with [formal verification](https://en.wikipedia.org/wiki/Formal_verification).\\n\\nContact us at [contact@formal.land](mailto:contact@formal.land) to chat \u260e\ufe0f!\\n\\n:::\\n\\n
\\n ![Crab with a pen](2024-04-26/crab-in-library.webp)\\n
A crab in a library
\\n
\\n\\n## Initial run \ud83d\udc25\\n\\nAn initial run of `coq-of-rust` on the `alloc` and `core` crates of Rust generated two files of a few hundred thousand lines of Coq corresponding to the whole translation of these crates. This is a first piece of good news, as it means the tool runs on these large code bases. However, the generated Coq code does not compile, even if the errors are very rare (one every few thousand lines).\\n\\nTo get an idea, here is the size of the input Rust code as given by the `cloc` command:\\n\\n- `alloc`: 26,299 lines of Rust code\\n- `core`: 54,192 lines of Rust code\\n\\nGiven that this code uses macros that we expand in our translation, the actual size that we have to translate is even bigger.\\n\\n## Splitting the generated code \ud83e\ude93\\n\\nThe main change we made was to split the output generated by `coq-of-rust` with one file for each input Rust file. This is possible because our translation is insensitive to the order of definitions and context-free. So, even if there are typically cyclic dependencies between the files in Rust, something that is forbidden in Coq, we can still split them.\\n\\nWe get the following sizes as output:\\n\\n- `alloc`: 54 Coq files, 171,783 lines of Coq code\\n- `core`: 190 Coq files, 592,065 lines of Coq code\\n\\nThe advantages of having the code split are:\\n\\n- it is easier to read and navigate in the generated code\\n- it is easier to compile as we can parallelize the compilation\\n- it is easier to debug as we can focus on one file at a time\\n- it is easier to ignore files that do not compile\\n- it will be easier to maintain, as it is easier to follow the diff of a single file\\n\\n## Fixing some bugs \ud83d\udc1e\\n\\nWe had some bugs related to collisions between module names. These can occur when we choose a name for the module for an `impl` block. We fixed these by adding more information in the module names to make them unique, like the `where` clauses that were missing.
For example, for the implementation of the `Default` trait for the `Mapping` type:\\n\\n```rust\\n#[derive(Default)]\\nstruct Mapping {\\n // ...\\n}\\n```\\n\\nwe were generating the following Coq code:\\n\\n```coq\\nModule Impl_core_default_Default_for_dns_Mapping_K_V.\\n (* ...trait implementation ... *)\\nEnd Impl_core_default_Default_for_dns_Mapping_K_V.\\n```\\n\\nWe now generate:\\n\\n```coq\\nModule Impl_core_default_Default_where_core_default_Default_K_where_core_default_Default_V_for_dns_Mapping_K_V.\\n (* ... *)\\n```\\n\\nwith a module name that includes the `where` clauses of the `impl` block, stating that both `K` and `V` should implement the `Default` trait.\\n\\nHere is the list of files that do not compile in Coq, as of today:\\n\\n- `alloc/boxed.v`\\n- `core/any.v`\\n- `core/array/mod.v`\\n- `core/cmp/bytewise.v`\\n- `core/error.v`\\n- `core/escape.v`\\n- `core/iter/adapters/flatten.v`\\n- `core/net/ip_addr.v`\\n\\nThis represents 4% of the files. Note that in the files that compile there are some unhandled Rust constructs that are axiomatized, so this does not give the whole picture of what we do not support.\\n\\n## Example \ud83d\udd0e\\n\\nHere is the source code of the `unwrap_or_default` method for the `Option` type:\\n\\n```rust\\npub fn unwrap_or_default(self) -> T\\nwhere\\n T: Default,\\n{\\n match self {\\n Some(x) => x,\\n None => T::default(),\\n }\\n}\\n```\\n\\nWe translate it to:\\n\\n```coq\\nDefinition unwrap_or_default (T : Ty.t) (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\\n let Self : Ty.t := Self T in\\n match \u03c4, \u03b1 with\\n | [], [ self ] =>\\n ltac:(M.monadic\\n (let self := M.alloc (| self |) in\\n M.read (|\\n M.match_operator (|\\n self,\\n [\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (let \u03b30_0 :=\\n M.get_struct_tuple_field_or_break_match (|\\n \u03b3,\\n \\"core::option::Option::Some\\",\\n 0\\n |) in\\n let x := M.copy (| \u03b30_0 |) in\\n x));\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (M.alloc (|\\n 
M.call_closure (|\\n M.get_trait_method (| \\"core::default::Default\\", T, [], \\"default\\", [] |),\\n []\\n |)\\n |)))\\n ]\\n |)\\n |)))\\n | _, _ => M.impossible\\n end.\\n```\\n\\nWe prove that it is equivalent to the simpler functional code:\\n\\n```coq\\nDefinition unwrap_or_default {T : Set}\\n {_ : core.simulations.default.Default.Trait T}\\n (self : Self T) :\\n T :=\\n match self with\\n | None => core.simulations.default.Default.default (Self := T)\\n | Some x => x\\n end.\\n```\\n\\nThis simpler definition is what we use when verifying code. The proof of equivalence is in [CoqOfRust/core/proofs/option.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/core/proofs/option.v). In case the original source code changes, we are sure to capture these changes thanks to our proof. Because the translation of the `core` library was done automatically, we trust the generated definitions more than definitions that would be done by hand. However, there can still be mistakes or incompleteness in `coq-of-rust`, so we still need to check at proof time that the code makes sense.\\n\\n## Conclusion\\n\\nWe can now work on the verification of Rust programs with more trust in our formalization of the standard library. Our next target is to simplify our proof process, which is still tedious. In particular, showing that simulations are equivalent to the original Rust code requires doing the name resolution, introduction of high-level types, and removal of the side-effects. We would like to split these steps.\\n\\nIf you are interested in formally verifying your Rust projects, do not hesitate to get in touch with us at [contact@formal.land](mailto:contact@formal.land) \ud83d\udc8c! 
Formal verification provides the highest level of safety for critical applications, with a mathematical guarantee of the absence of bugs for a given specification."},{"id":"/2024/04/03/monadic-notation-for-rust-translation","metadata":{"permalink":"/blog/2024/04/03/monadic-notation-for-rust-translation","source":"@site/blog/2024-04-03-monadic-notation-for-rust-translation.md","title":"\ud83e\udd80 Monadic notation for the Rust translation","description":"At Formal Land our mission is to reduce the cost of finding bugs in software. We use formal verification, that is to say mathematical reasoning on code, to make sure we find more bugs than with testing. As part of this effort, we are working on a tool coq-of-rust to translate Rust code to Coq, a proof assistant, to analyze Rust programs. Here we present a technical improvement we made in this tool.","date":"2024-04-03T00:00:00.000Z","formattedDate":"April 3, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"},{"label":"monad","permalink":"/blog/tags/monad"}],"readingTime":5.185,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Monadic notation for the Rust translation","tags":["coq-of-rust","Rust","Coq","translation","monad"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Translation of the Rust\'s core and alloc crates","permalink":"/blog/2024/04/26/translation-core-alloc-crates"},"nextItem":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 3","permalink":"/blog/2024/03/22/improvements-rust-translation-part-3"}},"content":"At Formal Land our mission is to reduce the cost of finding bugs in software. 
We use [formal verification](https://runtimeverification.com/blog/formal-verification-lore), that is to say mathematical reasoning on code, to make sure we find more bugs than with testing. As part of this effort, we are working on a tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) to translate Rust code to Coq, a proof assistant, to analyze Rust programs. Here we present a technical improvement we made in this tool.\\n\\nOne of the challenges of our translation from Rust to Coq is that the generated code is very verbose. The size increase is about tenfold in our examples. One reason is that we use a monad to represent side effects in Coq, so we need to name each intermediate result and apply the `bind` operator. Here, we will present a monadic notation that avoids naming intermediate results to make the code more readable.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Contact\\n\\nThis work is funded by the [Aleph Zero](https://alephzero.org/) crypto-currency to verify their Rust smart contracts. You can [follow us on X](https://twitter.com/FormalLand) to get our updates. We propose tools and services to make your codebase bug-free with [formal verification](https://en.wikipedia.org/wiki/Formal_verification).\\n\\nContact us at [contact@formal.land](mailto:contact@formal.land) to chat \u260e\ufe0f!\\n\\n:::\\n\\n
\\n ![Crab with a pen](2024-04-03/crab-writing.webp)\\n
\\n\\n## Example \ud83d\udd0e\\n\\nHere is the Rust source code that we consider:\\n\\n```rust\\nfn add(a: i32, b: i32) -> i32 {\\n a + b\\n}\\n```\\n\\nBefore, we were generating the following Coq code, with `let*` as the notation for the bind:\\n\\n```coq\\nDefinition add (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \u03c4, \u03b1 with\\n | [], [ a; b ] =>\\n let* a := M.alloc a in\\n let* b := M.alloc b in\\n let* \u03b10 := M.read a in\\n let* \u03b11 := M.read b in\\n BinOp.Panic.add \u03b10 \u03b11\\n | _, _ => M.impossible\\n end.\\n```\\n\\nNow, with the new monadic notation, we generate:\\n\\n```coq\\nDefinition add (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \u03c4, \u03b1 with\\n | [], [ a; b ] =>\\n ltac:(M.monadic\\n (let a := M.alloc (| a |) in\\n let b := M.alloc (| b |) in\\n BinOp.Panic.add (| M.read (| a |), M.read (| b |) |)))\\n | _, _ => M.impossible\\n end.\\n```\\n\\nThe main change is that we do not need to introduce intermediate `let*` expressions with generated names. The code structure is more similar to the original Rust code, with additional calls to memory primitives such as `M.alloc` and `M.read`.\\n\\nThe notation `f (| x1, ..., xn |)` represents the call to the function `f` with the arguments `x1`, ..., `xn` returning a monadic result. We bind the result with the current continuation that goes up to the wrapping `ltac:(M.monadic ...)` tactic. We automatically transform the `let` into a `let*` with the `M.monadic` tactic when needed.\\n\\n## Where do we use this notation? \ud83e\udd14\\n\\nWe use this notation in all the function bodies that we generate, that are all in a monad to represent side effects. We call the `ltac:(M.monadic ...)` tactic at the start of the functions, as well as at the start of closure bodies that are defined inside functions. 
This also applies to the translation of `if`, `match`, and `loop` expressions, as we represent their bodies as functions.\\n\\nHere is an example of code with a `match` expression:\\n\\n```rust\\nfn add(a: i32, b: i32) -> i32 {\\n match a - b {\\n 0 => a + b,\\n _ => a - b,\\n }\\n}\\n```\\n\\nWe translate it to:\\n\\n```coq\\nDefinition add (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \u03c4, \u03b1 with\\n | [], [ a; b ] =>\\n ltac:(M.monadic\\n (let a := M.alloc (| a |) in\\n let b := M.alloc (| b |) in\\n M.read (|\\n M.match_operator (|\\n M.alloc (| BinOp.Panic.sub (| M.read (| a |), M.read (| b |) |) |),\\n [\\n fun \u03b3 =>\\n ltac:(M.monadic\\n (let _ :=\\n M.is_constant_or_break_match (|\\n M.read (| \u03b3 |),\\n Value.Integer Integer.I32 0\\n |) in\\n M.alloc (|\\n BinOp.Panic.add (| M.read (| a |), M.read (| b |) |)\\n |)));\\n fun \u03b3 =>\\n ltac:(M.monadic (\\n M.alloc (|\\n BinOp.Panic.sub (| M.read (| a |), M.read (| b |) |)\\n |)\\n ))\\n ]\\n |)\\n |)))\\n | _, _ => M.impossible\\n end.\\n```\\n\\nWe see that we call the tactic `M.monadic` for each branch of the `match` expression.\\n\\n## How does it work? \ud83d\udee0\ufe0f\\n\\nThe `M.monadic` tactic is defined in [M.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/M.v). The main part is:\\n\\n```coq showLineNumbers\\nLtac monadic e :=\\n lazymatch e with\\n (* ... *)\\n | context ctxt [M.run ?x] =>\\n lazymatch context ctxt [M.run x] with\\n | M.run x => monadic x\\n | _ =>\\n refine (M.bind _ _);\\n [ monadic x\\n | let v := fresh \\"v\\" in\\n intro v;\\n let y := context ctxt [v] in\\n monadic y\\n ]\\n end\\n (* ... *)\\n end.\\n```\\n\\nIn our translation of Rust, all of the values have the common type `Value.t`. The monadic bind is of type `M -> (Value.t -> M) -> M` where `M` is the type of the monad. The `M.run` function is an axiom that we use as a marker to know where we need to apply `M.bind`. 
The type of `M.run` is:\\n\\n```coq\\nAxiom run : M -> Value.t.\\n```\\n\\nThe notation for monadic function calls is defined using the `M.run` axiom with:\\n\\n```coq\\nNotation \\"e (| e1 , .. , en |)\\" := (M.run ((.. (e e1) ..) en)).\\n```\\n\\nWhen we encounter an `M.run` (line 4) we apply the `M.bind` (line 8) to the monadic expression `x` (line 9) and its continuation `ctxt` that we obtain thanks to the `context` keyword (line 4) of the matching of expressions in Ltac.\\n\\nThere is another case in the `M.monadic` tactic to handle the `let` expressions, which is not shown here.\\n\\n## Conclusion\\n\\nThanks to this new monadic notation, the generated Coq code is more readable and closer to the original Rust code. This should simplify our work in writing proofs on the generated code, as well as debugging the translation.\\n\\nIf you are interested in formally verifying your Rust projects, do not hesitate to get in touch with us at [contact@formal.land](mailto:contact@formal.land) \ud83d\udc8c! Formal verification provides the highest level of safety for critical applications, with a mathematical guarantee of the absence of bugs for a given specification."},{"id":"/2024/03/22/improvements-rust-translation-part-3","metadata":{"permalink":"/blog/2024/03/22/improvements-rust-translation-part-3","source":"@site/blog/2024-03-22-improvements-rust-translation-part-3.md","title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 3","description":"We explained how we started updating our translation tool coq-of-rust in our previous blog post, to support more of the Rust language. Our goal is to provide formal verification for the Rust \ud83e\udd80 language, relying on the proof system Coq \ud83d\udc13.
We will see in this post how we continue implementing changes in coq-of-rust to:","date":"2024-03-22T00:00:00.000Z","formattedDate":"March 22, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"}],"readingTime":10.105,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 3","tags":["coq-of-rust","Rust","Coq","translation"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Monadic notation for the Rust translation","permalink":"/blog/2024/04/03/monadic-notation-for-rust-translation"},"nextItem":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 2","permalink":"/blog/2024/03/08/improvements-rust-translation-part-2"}},"content":"We explained how we started updating our translation tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) in our [previous blog post](/blog/2024/03/08/improvements-rust-translation-part-2), to support more of the Rust language. Our goal is to provide formal verification for the Rust \ud83e\udd80 language, relying on the proof system Coq \ud83d\udc13. We will see in this post how we continue implementing changes in `coq-of-rust` to:\\n\\n1. remove the types from the translation,\\n2. be independent of the ordering of the definitions.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::info\\n\\n- Previous post: [Improvements in the Rust translation to Coq, part 2](/blog/2024/03/08/improvements-rust-translation-part-2)\\n\\n:::\\n\\n:::tip Contact\\n\\nThis work is funded by the [Aleph Zero](https://alephzero.org/) crypto-currency to verify their Rust smart contracts. You can [follow us on X](https://twitter.com/FormalLand) to get our updates. 
We propose tools and services to make your codebase bug-free with [formal verification](https://en.wikipedia.org/wiki/Formal_verification).\\n\\nContact us at [contact@formal.land](mailto:contact@formal.land) to chat \u260e\ufe0f!\\n\\n:::\\n\\n## Translating the `dns` example \ud83d\ude80\\n\\nWe continue with our previous example [dns.rs](https://github.com/formal-land/coq-of-rust/blob/main/examples/ink_contracts/dns.rs), which is composed of around 200 lines of Rust code.\\n\\n### Borrow and dereference\\n\\nThe next error that we encounter when type-checking the Coq translation of `dns.rs` is:\\n\\n```\\nFile \\"./examples/default/examples/ink_contracts/dns.v\\", line 233, characters 22-27:\\nError: The reference deref was not found in the current environment.\\n```\\n\\nIn Rust, we can either take the address of a value with `&`, or dereference a reference with `*`. In our translation, we do not distinguish between the four following pointer types:\\n\\n- `&`\\n- `&mut`\\n- `*const`\\n- `*mut`\\n\\nWe let the user handle these in different ways if it can simplify their proofs, especially regarding the distinction between mutable and non-mutable pointers. It simplifies the definition of our borrowing and dereferencing operators, as we need only two to cover all cases. We even go further: we remove these two operators in the translation, as they are the identity in our case!\\n\\nTo better understand why they are the identity, we need to see that there are two kinds of Rust values in our representation:\\n\\n- the value itself and\\n- the value with its address.\\n\\nThe value itself is useful to compute over the values. For example, we use it to define the primitive addition over integers. The value with its address corresponds to the final Rust expression. Indeed, we can take the address of any sub-expression in Rust with the `&` operator, so each sub-expression should come with its address. 
When we take the address of an expression, we:\\n\\n- start from a value with its address and go to\\n- a value that is an address to the value above, which we will need to allocate to have an address for it also.\\n\\nThus, the `&` operator behaves as the identity function followed by an allocation. Similarly, the `*` is a memory read followed by the identity function. Since we already use the alloc and read operations to go from a value to a value with its address and the other way around, we do not need to define the `*` and `&` operators in our translation and remove them.\\n\\n### Primitive operators\\n\\nWe now need to distinguish between the function calls, that use the primitive:\\n\\n```coq\\nM.get_function : string -> M\\n```\\n\\nto find the right function to call when defining the semantics of the program (even if the function is defined later), and the calls to primitive operators (`+`, `*`, `!`, ...) that we define in our base library for Rust in Coq. The full list of primitive operators is given by:\\n\\n- [rustc_middle::mir::syntax::BinOp](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/mir/syntax/enum.BinOp.html)\\n- [rustc_middle::thir::LogicalOp](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/thir/enum.LogicalOp.html) (with lazy evaluation of the parameters)\\n- [rustc_middle::mir::syntax::UnOp](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/mir/syntax/enum.UnOp.html)\\n\\nWe adapted the handling of primitive operators from the code we had before and added a few other fixes so that now the `dns.rs` example type-checks in Coq \ud83c\udf8a! We will now focus on fixing the other examples.\\n\\n## Cleaning the code \ud83e\uddfc\\n\\nBut let us first clean the code a bit. 
All the expressions in the internal [AST](https://en.wikipedia.org/wiki/Abstract_syntax_tree) of `coq-of-rust` are in a wrapper with the current type of the expression:\\n\\n```rust\\npub(crate) struct Expr {\\n pub(crate) kind: Rc,\\n pub(crate) ty: Option>,\\n}\\n\\npub(crate) enum ExprKind {\\n Pure(Rc),\\n LocalVar(String),\\n Var(Path),\\n Constructor(Path),\\n // ... all the cases\\n```\\n\\nHaving access to the type of each sub-expression was useful before annotating the `let` expressions. This is not required anymore, as all the values have the type `Value.t`. Thus, we remove the wrapper `Expr` and rename `ExprKind` into `Expr`. The resulting code is easier to read, as wrapping everything with a type was verbose sometimes.\\n\\nWe also cleaned some translated types that were not used anymore in the code, removed unused `Derive` traits, and removed the monadic translation on the types.\\n\\n
\\n ![Crab in space](2024-03-22/crab-in-space.webp)\\n
A crab safely walking in space thanks to formal verification.
\\n
\\n\\n## Handling the remaining examples\\n\\nTo handle the remaining examples of our test suite (extracted from the snippets of the [Rust by Example](https://doc.rust-lang.org/rust-by-example/) book), we mainly needed to re-implement the pattern matching on the new untyped values. Here is an example of Rust code with matching:\\n\\n```rust\\nfn matching(tuple: (i32, i32)) -> i32 {\\n match tuple {\\n (0, 0) => 0,\\n (_, _) => 1,\\n }\\n}\\n```\\n\\nwith its translation in Coq:\\n\\n```coq showLineNumbers\\nDefinition matching (\ud835\udf0f : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \ud835\udf0f, \u03b1 with\\n | [], [ tuple ] =>\\n let* tuple := M.alloc tuple in\\n let* \u03b10 :=\\n match_operator\\n tuple\\n [\\n fun \u03b3 =>\\n let* \u03b30_0 := M.get_tuple_field \u03b3 0 in\\n let* \u03b30_1 := M.get_tuple_field \u03b3 1 in\\n let* _ :=\\n let* \u03b10 := M.read \u03b30_0 in\\n M.is_constant_or_break_match \u03b10 (Value.Integer Integer.I32 0) in\\n let* _ :=\\n let* \u03b10 := M.read \u03b30_1 in\\n M.is_constant_or_break_match \u03b10 (Value.Integer Integer.I32 0) in\\n M.alloc (Value.Integer Integer.I32 0);\\n fun \u03b3 =>\\n let* \u03b30_0 := M.get_tuple_field \u03b3 0 in\\n let* \u03b30_1 := M.get_tuple_field \u03b3 1 in\\n M.alloc (Value.Integer Integer.I32 1)\\n ] in\\n M.read \u03b10\\n | _, _ => M.impossible\\n end.\\n```\\n\\nHere is a breakdown of how it works:\\n\\n- On line 6 we call the `match_operator` primitive that takes a value to match on, `tuple`, and a list of functions that try to match the value with a pattern and execute some code in case of success. We execute the matching functions successively until one succeeds and we stop. There should be at least one succeeding function as pattern-match in Rust is exhaustive.\\n- On line 10 we get the first element of the tuple. 
Note that, more precisely, what we get is the address of the first element of `\u03b3` that is the address of the tuple `tuple` given as parameter to the function. Having the address might be required for some operations, like doing subsequent matching by reference or using the `&` operator in the `match`\'s body.\\n- On line 11 we do the same with the second element of the tuple. The indices for `\u03b3` are generated to avoid name clashes. They correspond to the depth of the sub-pattern being considered, followed by the index of the current item in this sub-pattern.\\n- On line 14, we check that the first element of the tuple is `0`. We use the `M.is_constant_or_break_match` primitive that checks if the value is a constant and if it is equal to the expected value. If it is not the case, it exits the current matching function, and the `match_operator` primitive will evaluate the next one, going to line 19.\\n- On line 24 we return the final result. Note that we always do a `M.alloc` followed by `M.read` to return the result. This could be simplified, as immediately reading an allocated value is like running the identity function.\\n\\nBy implementing the new version of the pattern-matching, as well as a few other smaller fixes, we were able to make all the examples type-check again! We now need to fix the proofs we had on the [erc20.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/examples/default/examples/ink_contracts/erc20.v) example, as the generated code changed a lot.\\n\\n## Updating the proofs \ud83d\udc69\u200d\ud83d\ude80\\n\\nUnfortunately, all these changes in the generated code are breaking our proofs. We still want to write our specifications and proofs by first showing a simulation of the Rust code with a simpler and functional definition. 
Before, with our simulations, we were:\\n\\n- replacing the management of pointers by either stateless functions or functions in a state monad;\\n- simplifying the error handling, especially for code that cannot panic.\\n\\nNow we also have to:\\n\\n- define the types;\\n- add the typing information;\\n- add the trait constraints and resolve the trait instances;\\n- resolve the function or associated function calls.\\n\\nWe have not finished updating the proofs but still merged our work into `main` with the pull request [#472](https://github.com/formal-land/coq-of-rust/pull/472) as this was taking too long. The proof that we want to update is in the file [proofs/erc20.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/examples/default/examples/ink_contracts/proofs/erc20.v) and is about the smart contract [erc20.rs](https://github.com/formal-land/coq-of-rust/blob/main/examples/ink_contracts/erc20.rs).\\n\\n### Phi operators \ud83c\udfa0\\n\\nOur basic strategy for the proof, in order to handle the untyped Rust values of the new translation, is to define various `\u03c6` operators going from a user-defined Coq type to a Rust value of type `Value.t`. These translate the data types that we define to represent the Rust types of the original program. Note that we previously had trouble translating the Rust types in the general case, especially for mutually recursive types or types involving a lot of trait manipulations.\\n\\nMore formally, we introduce the Coq typeclass:\\n\\n```coq\\nClass ToValue (A : Set) : Set := {\\n \u03a6 : Ty.t;\\n \u03c6 : A -> Value.t;\\n}.\\nArguments \u03a6 _ {_}.\\n```\\n\\nThis describes how to go from a user-defined type in Coq to the equivalent representation in `Value.t`. In addition to the `\u03c6` operator, we also define the `\u03a6` operator that gives the Rust type of the Coq type. This type must be given as an argument when calling polymorphic definitions.\\n\\nWe always go from user-defined types to `Value.t`. 
We write our simulation statements like this:\\n\\n```coq\\n{{env, state |\\n code.example.get_at_index [] [\u03c6 vector; \u03c6 index] \u21d3\\n inl (\u03c6 (simulations.example.get_at_index vector index))\\n| state\'}}\\n```\\n\\nwhere:\\n\\n```coq\\n{{env, state | rust_program \u21d3 simulation_result | state\'}}\\n```\\n\\nis our predicate to state an evaluation of a Rust program to a simulation result. We apply the `\u03c6` operator to the arguments of the Rust program and to the result of the simulation. In some proofs, we set this operator as `Opaque` in order to keep track of it and avoid unwanted reductions.\\n\\n### Traits\\n\\nThe trait definitions, as well as trait constraints, are absent from the generated Coq code. For now, we add them back as follows, for the example of the `Default` trait:\\n\\n1. We define a `Default` typeclass in Coq:\\n\\n ```coq\\n Module Default.\\n Class Trait (Self : Set) : Set := {\\n default : Self;\\n }.\\n End Default.\\n ```\\n\\n2. We define what it means to implement the `Default` trait and have a corresponding simulation:\\n\\n ```coq\\n Module Default.\\n Record TraitHasRun (Self : Set)\\n `{ToValue Self}\\n `{core.simulations.default.Default.Trait Self} :\\n Prop := {\\n default :\\n exists default,\\n IsTraitMethod\\n \\"core::default::Default\\" (\u03a6 Self) []\\n \\"default\\" default /\\\\\\n Run.pure\\n (default [] [])\\n (inl (\u03c6 core.simulations.default.Default.default));\\n }.\\n End Default.\\n ```\\n\\n where `Run.pure` is our simulation predicate for the case where the `state` does not change.\\n\\n3. Finally, we use the `TraitHasRun` predicate as an additional hypothesis for simulation proofs on functions that depend on the `Default` trait in Rust:\\n\\n ```coq\\n (** Simulation proof for `unwrap_or_default` on the type `Option`. 
*)\\n Lemma run_unwrap_or_default {T : Set}\\n {_ : ToValue T}\\n {_ : core.simulations.default.Default.Trait T}\\n (self : option T) :\\n core.proofs.default.Default.TraitHasRun T ->\\n Run.pure\\n (core.option.Impl_Option_T.unwrap_or_default (\u03a6 T) [] [\u03c6 self])\\n (inl (\u03c6 (core.simulations.option.Impl_Option_T.unwrap_or_default self))).\\n Proof.\\n (* ... *)\\n Qed.\\n ```\\n\\n## Conclusion \u270d\ufe0f\\n\\nWe still have a lot to do, especially in finding the right approach to verify the newly generated Rust code. But we have finalized our new translation mode without types and ordering, which helps to successfully translate many more Rust examples. We also do not need to translate the dependencies of a project anymore before compiling it.\\n\\nOur next target is to translate the whole of Rust\'s standard library (with the help of some axioms for the expressions which we do not handle yet), in order to have a faithful definition of the Rust primitives, such as functions of the [option](https://doc.rust-lang.org/core/option/) and [vec](https://doc.rust-lang.org/alloc/vec/) modules.\\n\\nIf you are interested in formally verifying your Rust projects, do not hesitate to get in touch with us at [contact@formal.land](mailto:contact@formal.land) \ud83d\udc8c! Formal verification provides the highest level of safety for critical applications, with a mathematical guarantee of the absence of bugs for a given specification."},{"id":"/2024/03/08/improvements-rust-translation-part-2","metadata":{"permalink":"/blog/2024/03/08/improvements-rust-translation-part-2","source":"@site/blog/2024-03-08-improvements-rust-translation-part-2.md","title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 2","description":"In our previous blog post, we stated our plan to improve our translation of Rust \ud83e\udd80 to Coq \ud83d\udc13 with coq-of-rust. 
We also provided a new definition for our Rust monad in Coq, and the definition of a unified type to represent any Rust values. We will now see how we modify the Rust implementation of coq-of-rust to make the generated code use these new definitions.","date":"2024-03-08T00:00:00.000Z","formattedDate":"March 8, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"}],"readingTime":9.055,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 2","tags":["coq-of-rust","Rust","Coq","translation"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 3","permalink":"/blog/2024/03/22/improvements-rust-translation-part-3"},"nextItem":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 1","permalink":"/blog/2024/02/29/improvements-rust-translation"}},"content":"In our [previous blog post](/blog/2024/02/29/improvements-rust-translation), we stated our plan to improve our translation of Rust \ud83e\udd80 to Coq \ud83d\udc13 with [coq-of-rust](https://github.com/formal-land/coq-of-rust). We also provided a new definition for our Rust monad in Coq, and the definition of a unified type to represent any Rust values. We will now see how we modify the Rust implementation of `coq-of-rust` to make the generated code use these new definitions.\\n\\nWith this new translation strategy, to support more Rust code, we want:\\n\\n1. to remove the types from the translation,\\n2. 
to avoid the need to order the definitions in the generated Coq code.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::info\\n\\n- Next post: [Improvements in the Rust translation to Coq, part 3](/blog/2024/03/22/improvements-rust-translation-part-3)\\n- Previous post: [Improvements in the Rust translation to Coq, part 1](/blog/2024/02/29/improvements-rust-translation)\\n\\n:::\\n\\n:::tip Contact\\n\\nThis work is funded by the [Aleph Zero](https://alephzero.org/) crypto-currency to verify their Rust smart contracts. You can [follow us on X](https://twitter.com/FormalLand) to get our updates. We propose tools and services to make your codebase bug-free with [formal verification](https://en.wikipedia.org/wiki/Formal_verification).\\n\\nContact us at [contact@formal.land](mailto:contact@formal.land) to chat!\\n\\n:::\\n\\n## Implementation of the monad\\n\\nWe implemented the new monad and the type `Value.t` holding any kind of Rust values as described in the previous blog post. For now, we have removed the definitions related to the standard library of Rust (everything except the base definitions such as the integer types). This should not be an issue to type-check the generated Coq code, as the new code should be independent of the ordering of definitions: in particular, it should type-check even if the needed definitions are not yet there.\\n\\nWe added some definitions for the primitive unary and binary operators. 
These include some operations on the integers such as arithmetic operations (with or without overflow, depending on the compilation mode), as well as comparisons (equality, less than or equal to, ...).\\n\\nNow that the main library file [CoqOfRust/CoqOfRust.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/CoqOfRust.v) compiles in Coq, we can start to test the translation on our examples.\\n\\n## Generating the tests\\n\\nWe generate new snapshots for our translations with:\\n\\n```sh\\ncargo build && time python run_tests.py\\n```\\n\\nThis builds the project `coq-of-rust` (with a lot of warnings about unused code for now) and re-generates our snapshots: for each Rust file in the [examples](https://github.com/formal-land/coq-of-rust/tree/main/examples) directory, we generate a Coq file with the same name but the extension `.v`. We generate two versions:\\n\\n- one in axiom mode, where all definitions are axiomatized, to translate libraries, for example, and\\n- one in full definition mode, where we also translate the bodies of the function definitions.\\n\\n## Axiom mode\\n\\nWe first try to type-check and fix the code generated in axiom mode.\\n\\n### Type aliases\\n\\nWe have a first error for type aliases that we do not translate properly. We need access to the fully qualified name of the alias. 
We do that by combining calls to the functions:\\n\\n- [crate_name](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/ty/context/struct.TyCtxt.html#method.crate_name) to get the name of the current crate and\\n- [def_path](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/ty/context/struct.TyCtxt.html#method.def_path) to get the whole definition path without the crate name.\\n\\nAs a result, for the file [examples/ink_contracts/basic_contract_caller.rs](https://github.com/formal-land/coq-of-rust/blob/main/examples/ink_contracts/basic_contract_caller.rs), we translate the type alias:\\n\\n```rust\\ntype Hash = [u8; 32];\\n```\\n\\ninto the Coq code:\\n\\n```coq\\nAxiom Hash :\\n (Ty.path \\"basic_contract_caller::Hash\\") =\\n (Ty.apply (Ty.path \\"array\\") [Ty.path \\"u8\\"]).\\n```\\n\\nThen, during the proofs, we will be able to substitute the type `Hash` by its definition when it appears. Note that we now translate types by values of the type `Ty.t`, so there should be no difficulties in rewriting types.\\n\\nWe should add the length of the array in the type. This is not done yet.\\n\\n### Traits\\n\\nIn axiom mode, we remove most of the trait definitions. Instead, with our new translation model, the traits are mostly unique names (the absolute path of the trait definition). The main use of traits is to distinguish them from other traits, to know which trait implementation to use when calling a trait\'s method. We still translate the provided methods (that are default methods in the trait definition) to axioms and add a predicate stating that they are associated with the current trait. 
For example, we translate the following Rust trait:\\n\\n```rust\\n// crate `my_crate`\\n\\ntrait Animal {\\n fn new(name: &\'static str) -> Self;\\n\\n fn name(&self) -> &\'static str;\\n fn noise(&self) -> &\'static str;\\n\\n fn talk(&self) {\\n println!(\\"{} says {}\\", self.name(), self.noise());\\n }\\n}\\n```\\n\\nto the Coq code:\\n\\n```coq\\n(* Trait *)\\nModule Animal.\\n Parameter talk : (list Ty.t) -> (list Value.t) -> M.\\n\\n Axiom ProvidedMethod_talk : M.IsProvidedMethod \\"my_crate::Animal\\" talk.\\nEnd Animal.\\n```\\n\\nWe realize with this example that the translation in axiom mode generates very few errors, as we remove all the type definitions and all the function axioms have the same signature:\\n\\n```coq\\n(* A list of types that can be empty for non-polymorphic functions,\\n a list of parameters, and a return value in the monad `M`. *)\\nlist Ty.t -> list Value.t -> M\\n```\\n\\nso the type-checking of these axioms never fails. We thus jump to the full definition mode as this is where our new approach might fail.\\n\\n## Definition mode\\n\\nWe now try to type-check the generated Coq code in full definition mode. We start with the [dns.rs](https://github.com/formal-land/coq-of-rust/blob/main/examples/ink_contracts/dns.rs) smart contract example.\\n\\n### Polymorphic trait implementation\\n\\nThis example is interesting, as it contains polymorphic implementations, such as for the [mock](https://en.wikipedia.org/wiki/Mock_object) type `Mapping`:\\n\\n```rust\\n#[derive(Default)]\\nstruct Mapping<K, V> {\\n _key: core::marker::PhantomData<K>,\\n _value: core::marker::PhantomData<V>,\\n}\\n```\\n\\nthat implements the [Default](https://doc.rust-lang.org/core/default/trait.Default.html) trait on the type `Mapping` for two type parameters `K` and `V`. 
We translate it to:\\n\\n```coq showLineNumbers\\n(* Struct Mapping *)\\n\\nModule Impl_core_default_Default_for_dns_Mapping_K_V.\\n (*\\n Default\\n *)\\n Definition default (\ud835\udf0f : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \ud835\udf0f, \u03b1 with\\n | [ Self; K; V ], [] =>\\n let* \u03b10 :=\\n M.get_method\\n \\"core::default::Default\\"\\n \\"default\\"\\n [ (* Self *) Ty.apply (Ty.path \\"core::marker::PhantomData\\") [ K ] ] in\\n let* \u03b11 := M.call \u03b10 [] in\\n let* \u03b12 :=\\n M.get_method\\n \\"core::default::Default\\"\\n \\"default\\"\\n [ (* Self *) Ty.apply (Ty.path \\"core::marker::PhantomData\\") [ V ] ] in\\n let* \u03b13 := M.call \u03b12 [] in\\n M.pure\\n (Value.StructRecord \\"dns::Mapping\\" [ (\\"_key\\", \u03b11); (\\"_value\\", \u03b13) ])\\n | _, _ => M.impossible\\n end.\\n\\n Axiom Implements :\\n forall (K V : Ty.t),\\n M.IsTraitInstance\\n \\"core::default::Default\\"\\n (* Self *) (Ty.apply (Ty.path \\"dns::Mapping\\") [ K; V ])\\n []\\n [ (\\"default\\", InstanceField.Method default) ]\\n [ K; V ].\\nEnd Impl_core_default_Default_for_dns_Mapping_K_V.\\n```\\n\\nHere are the interesting bits of this code:\\n\\n- On line 1, we translate the `Mapping` type into a single comment, as the types disappear in our translation and become just markers. The marker for `Mapping` is its absolute name `Ty.path \\"dns::Mapping\\"`.\\n- On line 7, the function `default` takes a list of types `\ud835\udf0f` as a parameter in case it is polymorphic. Here, this method is not polymorphic, but we still add the `\ud835\udf0f` parameter for uniformity. We also take three additional type parameters:\\n\\n - `Self`\\n - `K`\\n - `V`\\n\\n that represent the `Self` type on which the trait is implemented, and the two type parameters of the `Mapping` type. 
These will be provided when calling the `default` method.\\n\\n- On line 11, we use the primitive `M.get_method` (axiomatized for now) to get the method `default` of the trait `core::default::Default` for the type `core::marker::PhantomData`. Here, we see that having access to the type `K` in the body of the `default` function is useful, as it helps us to disambiguate between the various implementations of the `Default` trait instances that we call. Here, we provide the `Self` type of the trait in a list of a single element. If the `Default` trait or the `default` method were polymorphic, we would also append these type parameters in this list.\\n- On line 15, we call the `default` method instance that we found with an empty list of arguments.\\n- On line 23, we build a value of type `Mapping` with the two fields `_key` and `_value` initialized with the results of the two calls to the `default` method. We use the `Value.StructRecord` constructor to build the value, and its result is of type `Value.t` like all other Rust values.\\n- On line 24, we eliminate a case with a wrong number of type and value arguments. This should never happen as the arity of all the function calls is checked by the Rust type-checker.\\n- On line 27, we state that we have a new instance of the `Default` trait for the `Mapping` type, with the `default` method implemented by the `default` function. 
This is true for any values of the types `K` and `V`.\\n- On line 34, we specify that `[K, V]` are the type parameters of this implementation that should be given as extra parameters when calling the `default` method of this instance, together with the `Self` type.\\n\\n### Polymorphic implementation\\n\\nNext, we have a polymorphic implementation of mock associated functions for the `Mapping` type:\\n\\n```rust\\nimpl<K, V> Mapping<K, V> {\\n fn contains(&self, _key: &K) -> bool {\\n unimplemented!()\\n }\\n\\n // ...\\n```\\n\\nWe translate it to:\\n\\n```coq showLineNumbers\\nModule Impl_dns_Mapping_K_V.\\n Definition Self (K V : Ty.t) : Ty.t :=\\n Ty.apply (Ty.path \\"dns::Mapping\\") [ K; V ].\\n\\n (*\\n fn contains(&self, _key: &K) -> bool {\\n unimplemented!()\\n }\\n *)\\n Definition contains (\ud835\udf0f : list Ty.t) (\u03b1 : list Value.t) : M :=\\n match \ud835\udf0f, \u03b1 with\\n | [ Self; K; V ], [ self; _key ] =>\\n let* self := M.alloc self in\\n let* _key := M.alloc _key in\\n let* \u03b10 := M.var \\"core::panicking::panic\\" in\\n let* \u03b11 := M.read (mk_str \\"not implemented\\") in\\n let* \u03b12 := M.call \u03b10 [ \u03b11 ] in\\n never_to_any \u03b12\\n | _, _ => M.impossible\\n end.\\n\\n Axiom AssociatedFunction_contains :\\n forall (K V : Ty.t),\\n M.IsAssociatedFunction (Self K V) \\"contains\\" contains [ K; V ].\\n\\n (* ... *)\\n```\\n\\nWe follow an approach similar to the one for the translation of trait implementations, especially regarding the handling of polymorphic type variables. Here are some differences:\\n\\n- On line 2, we define a `Self` type as a function of the type parameters `K` and `V`. This is useful for avoiding repeating the same type expression later.\\n- On line 22, we use the predicate `M.IsAssociatedFunction` to state that we have a new associated function `contains` for the `Mapping` type, with the `contains` method implemented by the `contains` function. This is true for any values of the types `K` and `V`. 
As for the trait implementations, we make explicit the list `[K, V]` that will be given as an extra parameter to the function `contains`.\\n\\n## Conclusion\\n\\nIn the next blog post, we will see how we continue to translate the examples in full definition mode. There is still a lot to do to get to the same level of Rust support as before, but we are hopeful that our new approach will be more robust and easier to maintain.\\n\\nIf you are interested in formally verifying your Rust projects, do not hesitate to get in touch with us at [contact@formal.land](mailto:contact@formal.land)! Formal verification provides the highest level of safety for critical applications. See the [White House report on secure software development](https://www.whitehouse.gov/wp-content/uploads/2024/02/Final-ONCD-Technical-Report.pdf) for more on the importance of formal verification."},{"id":"/2024/02/29/improvements-rust-translation","metadata":{"permalink":"/blog/2024/02/29/improvements-rust-translation","source":"@site/blog/2024-02-29-improvements-rust-translation.md","title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 1","description":"Our tool coq-of-rust is translating Rust \ud83e\udd80 programs to the proof system Coq \ud83d\udc13 to do formal verification on Rust programs. 
Even if we are able to verify realistic code, such as an ERC-20 smart contract, coq-of-rust still has some limitations:","date":"2024-02-29T00:00:00.000Z","formattedDate":"February 29, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"}],"readingTime":12.655,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 1","tags":["coq-of-rust","Rust","Coq","translation"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 2","permalink":"/blog/2024/03/08/improvements-rust-translation-part-2"},"nextItem":{"title":"\ud83e\uddab Translating Go to Coq, part 1","permalink":"/blog/2024/02/22/journey-coq-of-go"}},"content":"Our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) is translating Rust \ud83e\udd80 programs to the proof system Coq \ud83d\udc13 to do formal verification on Rust programs. Even if we are able to verify realistic code, such as an [ERC-20 smart contract](/blog/2023/12/13/rust-verify-erc-20-smart-contract), `coq-of-rust` still has some limitations:\\n\\n- fragile trait handling\\n- difficulties in ordering the definitions, in their order of dependencies as required by Coq\\n\\nWe will present how we plan to improve our tool to address these limitations.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::info\\n\\n- Next post: [Improvements in the Rust translation to Coq, part 2](/blog/2024/03/08/improvements-rust-translation-part-2)\\n\\n:::\\n\\n## Introduction\\n\\nAs emphasized in the [recent report from the White House](https://www.whitehouse.gov/wp-content/uploads/2024/02/Final-ONCD-Technical-Report.pdf), memory safety and formal verification are keys to ensure secure and correct software. 
Rust provides memory safety and we provide formal verification on top of it with `coq-of-rust`.\\n\\nWe will take the Rust [serde](https://github.com/serde-rs/serde) serialization library to have an example of code to translate in Coq. This is a popular Rust library that is used in almost all projects, either as a direct or transitive dependency. Serialization has a simple specification (being a bijection between the data and its serialized form) and is a good candidate for formal verification. We might verify this library afterwards if there is a need.\\n\\n:::tip Contact\\n\\nThis work is funded by the [Aleph Zero](https://alephzero.org/) crypto-currency in order to verify their Rust smart contracts. You can [follow us on X](https://twitter.com/FormalLand) to get our updates. We propose tools and services to make your codebase totally bug-free. Contact us at [contact@formal.land](mailto:contact@formal.land) to chat! We offer a free audit to assess the feasibility of formal verification on your case.\\n\\n:::\\n\\n:::note Goal\\n\\nOur company goal is to make formal verification accessible to all projects, reducing its cost to 20% of the development cost. There should be no reason to have bugs in end-user products!\\n\\n:::\\n\\n## Warnings\\n\\nWe start by running the command:\\n\\n```sh\\ncargo coq-of-rust\\n```\\n\\nin the `serde` directory. We get a lot of warnings, but the translation does not panic as it tries to always produce something for debugging purposes. We have two kinds of warnings.\\n\\n### Constants in patterns\\n\\nThe warning is the following:\\n\\n```\\nwarning: Constants in patterns are not yet supported.\\n --\x3e serde/src/de/mod.rs:2277:13\\n |\\n2277 | 0 => panic!(), // special case elsewhere\\n | ^\\n```\\n\\nThe reason why we did not handle constants in patterns is that they are represented in a special format in the Rust compiler that was not obvious to handle. 
The definition of [rustc_middle::mir::consts::Const](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/mir/consts/enum.Const.html) representing the constants in patterns is:\\n\\n```rust\\npub enum Const<\'tcx> {\\n Ty(Const<\'tcx>),\\n Unevaluated(UnevaluatedConst<\'tcx>, Ty<\'tcx>),\\n Val(ConstValue<\'tcx>, Ty<\'tcx>),\\n}\\n```\\n\\nThere are three cases, and each contains several more cases. To fix this issue, we added the code to handle the signed and unsigned integers, which are enough for our `serde` example. We will need to add other cases later, especially for the strings. This allowed us to discover and fix a bug in our handling of patterns for tuples with elision `..`, like in the example:\\n\\n```rust\\nfn main() {\\n let triple = (0, -2, 3);\\n\\n match triple {\\n (0, y, z) => println!(\\"First is `0`, `y` is {:?}, and `z` is {:?}\\", y, z),\\n (1, ..) => println!(\\"First is `1` and the rest doesn\'t matter\\"),\\n (.., 2) => println!(\\"last is `2` and the rest doesn\'t matter\\"),\\n (3, .., 4) => println!(\\"First is `3`, last is `4`, and the rest doesn\'t matter\\"),\\n _ => println!(\\"It doesn\'t matter what they are\\"),\\n }\\n}\\n```\\n\\nThese changes are in the pull-request [coq-of-rust#470](https://github.com/formal-land/coq-of-rust/pull/470).\\n\\n### Unimplemented `parent_kind`\\n\\nWe get a second form of warning:\\n\\n```\\nunimplemented parent_kind: Struct\\nexpression: Expr {\\n kind: ZstLiteral {\\n user_ty: None,\\n },\\n ty: FnDef(\\n DefId(2:31137 ~ core[10bc]::cmp::Reverse::{constructor#0}),\\n [\\n T/#1,\\n ],\\n ),\\n temp_lifetime: Some(\\n Node(14),\\n ),\\n span: serde/src/de/impls.rs:778:22: 778:29 (#0),\\n}\\n```\\n\\nThis is for some cases of expressions [rustc_middle::thir::ExprKind::ZstLiteral](https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/thir/enum.ExprKind.html#variant.ZstLiteral) in the Rust\'s [THIR representation](https://rustc-dev-guide.rust-lang.org/thir.html) that we do not handle. 
If we look at the `span` field, we see that it appears in the source in the file `serde/src/de/impls.rs` at line 778:\\n\\n```rust\\nforwarded_impl! {\\n (T), Reverse<T>, Reverse // Here is the error\\n}\\n```\\n\\nThis is not very informative as this code is generated by a macro. Another similar kind of expression appears later:\\n\\n```rust\\nimpl<\'de, T> Deserialize<\'de> for Wrapping<T>\\nwhere\\n T: Deserialize<\'de>,\\n{\\n fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>\\n where\\n D: Deserializer<\'de>,\\n {\\n Deserialize::deserialize(deserializer).map(\\n // Here is the error:\\n Wrapping\\n )\\n }\\n}\\n```\\n\\nThe `Wrapping` term is the constructor of a structure, used as a function. We add support for this case in the pull-request [coq-of-rust#471](https://github.com/formal-land/coq-of-rust/pull/471).\\n\\n## Coq errors\\n\\nWhen we type-check the generated Coq code, we quickly get an error:\\n\\n```coq\\n(* Generated by coq-of-rust *)\\nRequire Import CoqOfRust.CoqOfRust.\\n\\nModule lib.\\n Module core.\\n\\n End core.\\nEnd lib.\\n\\nModule macros.\\n\\nEnd macros.\\n\\nModule integer128.\\n\\nEnd integer128.\\n\\nModule de.\\n Module value.\\n Module Error.\\n Section Error.\\n Record t : Set := {\\n (* Here is the error: *)\\n err : ltac:(serde.de.value.ErrorImpl);\\n }.\\n\\n (* 180.000 more lines! *)\\n```\\n\\nThe reason is that `serde.de.value.ErrorImpl` is not yet defined here. In Coq, we must order the definitions in the order of dependencies to ensure that there are no non-terminating definitions with infinite recursive calls and to preserve the consistency of the system.\\n\\nThis issue does not seem easy to us, as in a Rust crate, everything can depend on each other:\\n\\n- types\\n- definitions\\n- traits\\n- `impl` blocks\\n\\nOur current solutions are:\\n\\n1. **To reorder the definitions in the source Rust code**, so that they appear in the right order for Coq. 
This is technically the simplest solution (no changes in `coq-of-rust`), but it is not very practical. Indeed, reordering elements in a big project generates a lot of conflicts in the version control system, especially if we cannot upstream the changes to the original project.\\n2. **To use a configuration file** to specify the order of the definitions. This works in a lot of cases, but we need to write this file manually, and it must be complete before we can compile the whole crate in Coq, even if we are only interested in verifying a small part of the code. There are also some cases that are hard to untangle, in particular with traits that can depend on both types and definitions, which themselves may depend on traits.\\n\\nIn order to handle large projects, such as `serde`, we need to find a more definitive solution to handle the order of dependencies.\\n\\n## Plan for the order of definitions\\n\\nOur idea is to use a more verbose, but simpler translation, to generate Coq code that is not sensitive to the ordering of the definitions in the Rust code. In addition, we should have a more robust mechanism for the traits, as there are still some edge cases that we do not handle well.\\n\\nOur main ingredients are:\\n\\n1. Generating untyped code, where all Rust values become part of a single and shared `Value` type. With this approach, we can represent mutually recursive Rust types, which are generally hard to translate in a sound manner to Coq. We should also avoid a lot of errors on the Coq side related to type inference.\\n2. Adding an indirection level to all function calls, as any function call might refer to a definition that appears later in the code.\\n\\nThese ingredients have some drawbacks:\\n\\n- By removing the types, we will obtain code that is less readable. It might contain translation errors that will be harder to spot. 
We will need to add the types back during the specification of the code.\\n- We will need to add error cases corresponding to type errors at runtime, as we will not have the type system to ensure that functions expecting a certain type of value receive it. We know from the Rust type checker that these errors should not happen, but we will need to prove it in Coq.\\n- We will have to resolve the indirections in the calls at proof time, or with other mechanisms, which will be more complex than the current translation.\\n- We will still need to have a translation of the types (as values), to guide the inference of trait instances.\\n\\n## Definition of a new monad\\n\\nWe rework our definitions of the values, the pointers, and the monad representing the effects, taking into account the fact that we remove the types from the translation. Here are the main definitions that we are planning to use. We have not tested them yet as we need to update the translation to Coq to use them. We will do that next.\\n\\n### Pointers\\n\\n```coq\\nModule Pointer.\\n Module Index.\\n Inductive t : Set :=\\n | Tuple (index : Z)\\n | Array (index : Z)\\n | StructRecord (constructor field : string)\\n | StructTuple (constructor : string) (index : Z).\\n End Index.\\n\\n Module Path.\\n Definition t : Set := list Index.t.\\n End Path.\\n\\n Inductive t (Value : Set) : Set :=\\n | Immediate (value : Value)\\n | Mutable {Address : Set} (address : Address) (path : Path.t).\\n Arguments Immediate {_}.\\n Arguments Mutable {_ _}.\\nEnd Pointer.\\n```\\n\\nA pointer is either:\\n\\n- a pointer to immutable data, which is directly represented by the data itself;\\n- a pointer to mutable data, which is inside a cell at a certain address in the memory. 
The exact location in the cell is given by the path.\\n\\nThe type of `Address` is not enforced yet, but we will do it when defining the semantics.\\n\\n### Values\\n\\n```coq\\nModule Value.\\n Inductive t : Set :=\\n | Bool : bool -> t\\n | Integer : Integer.t -> Z -> t\\n (** For now we do not know how to represent floats so we use a string *)\\n | Float : string -> t\\n | UnicodeChar : Z -> t\\n | String : string -> t\\n | Tuple : list t -> t\\n | Array : list t -> t\\n | StructRecord : string -> list (string * t) -> t\\n | StructTuple : string -> list t -> t\\n | Pointer : Pointer.t t -> t\\n (** The two existential types of the closure must be [Value.t] and [M]. We\\n cannot enforce this constraint there yet, but we will do when defining the\\n semantics. *)\\n | Closure : {\'(t, M) : Set * Set @ t -> M} -> t.\\nEnd Value.\\n```\\n\\nHere, this type aims to represent any Rust value. We might add a few cases later to represent the `dyn` values, for example. Most of the cases of this type are as expected:\\n\\n- The constructor `StructRecord` is for constructors of `struct` or `enum` with named fields.\\n- The constructor `StructTuple` is for constructors of `struct` or `enum` with unnamed fields.\\n- The constructor `Pointer` is for pointers to data, that could be either `&`, `&mut`, `*const`, or `*mut`.\\n- The constructor `Closure` is for closures (anonymous functions). To prevent errors with the positivity checker of Coq, we use an existential type for the type `Value.t` (as well as `M`, which will be defined later). Note that we are using impredicative `Set` in Coq, and `{A : Set @ P A}` is our notation for existential `Set` in `Set`. Without impredicative sets, we could have issues with the universe levels. 
The fact that these existential types are always `Value.t` and `M` will be enforced when defining the semantics.\\n\\n### Monad\'s primitives\\n\\n```coq\\nModule Primitive.\\n Inductive t : Set :=\\n | StateAlloc (value : Value.t)\\n | StateRead {Address : Set} (address : Address)\\n | StateWrite {Address : Set} (address : Address) (value : Value.t)\\n | EnvRead.\\nEnd Primitive.\\n```\\n\\nHere are the IO calls to the system that the monad can make. This list might be extended later. For now, we mainly have primitives to access the memory.\\n\\n### Monad: base\\n\\n```coq\\nModule LowM.\\n Inductive t (A : Set) : Set :=\\n | Pure : A -> t A\\n | CallPrimitive : Primitive.t -> (Value.t -> t A) -> t A\\n | Loop : t A -> (A -> bool) -> (A -> t A) -> t A\\n | Impossible : t A\\n (** This constructor is not strictly necessary, but is used as a marker for\\n functions calls in the generated code, to help the tactics to recognize\\n points where we can compose about functions. *)\\n | Call : t A -> (A -> t A) -> t A.\\n Arguments Pure {_}.\\n Arguments CallPrimitive {_}.\\n Arguments Loop {_}.\\n Arguments Impossible {_}.\\n Arguments Call {_}.\\n\\n Fixpoint let_ {A : Set} (e1 : t A) (f : A -> t A) : t A :=\\n match e1 with\\n | Pure v => f v\\n | CallPrimitive primitive k =>\\n CallPrimitive primitive (fun v => let_ (k v) f)\\n | Loop body is_break k =>\\n Loop body is_break (fun v => let_ (k v) f)\\n | Impossible => Impossible\\n | Call e k =>\\n Call e (fun v => let_ (k v) f)\\n end.\\nEnd LowM.\\n```\\n\\nThis is the first layer of our monad, very similar to what we had before. We remove the cast operation, as now everything has the same type. We use a style by continuation, but we also define a `let_` function to have a \\"bind\\" operator. 
Note that we always have the same type as parameter, so this is not really a monad as the \\"bind\\" operator should have the type:\\n\\n```coq\\nforall {A B : Set}, M A -> (A -> M B) -> M B\\n```\\n\\nAlways having the same type is enough for us as we use a single type of all Rust values.\\n\\n### Monad: with exceptions\\n\\nWe have the same type as before for the exceptions, representing the panics and all the special control flow operations such as `continue`, `return`, and `break`:\\n\\n```coq\\nModule Exception.\\n Inductive t : Set :=\\n (** exceptions for Rust\'s `return` *)\\n | Return : Value.t -> t\\n (** exceptions for Rust\'s `continue` *)\\n | Continue : t\\n (** exceptions for Rust\'s `break` *)\\n | Break : t\\n (** escape from a match branch once we know that it is not valid *)\\n | BreakMatch : t\\n | Panic : string -> t.\\nEnd Exception.\\n```\\n\\nOur final monad definition is a thin wrapper around `LowM`, to add an error monad to propagate the exceptions:\\n\\n```coq\\nDefinition M : Set :=\\n LowM.t (Value.t + Exception.t).\\n\\nDefinition let_ (e1 : M) (e2 : Value.t -> M) : M :=\\n LowM.let_ e1 (fun v1 =>\\n match v1 with\\n | inl v1 => e2 v1\\n | inr error => LowM.Pure (inr error)\\n end).\\n```\\n\\nOnce again, this is not really a monad as the type of the values that we compute is always the same, and we do not need more. 
Having a definition in two steps (`LowM` and `M`) is useful to separate the part that can be defined by computation (the `M` part) from the part whose semantics can only be given by inductive predicates (the `LowM` part).\\n\\n## Conclusion\\n\\nNext, we will see how we can use this new definition of Rust values, whether it works to translate our examples, and most importantly, how to modify `coq-of-rust` to generate terms without types.\\n\\nIf you are interested in formally verifying Rust projects, do not hesitate to get in touch with us at [contact@formal.land](mailto:contact@formal.land) or go to our [GitHub repository](https://github.com/formal-land/coq-of-rust) for `coq-of-rust`."},{"id":"/2024/02/22/journey-coq-of-go","metadata":{"permalink":"/blog/2024/02/22/journey-coq-of-go","source":"@site/blog/2024-02-22-journey-coq-of-go.md","title":"\ud83e\uddab Translating Go to Coq, part 1","description":"In this blog post, we present our development steps to build a tool to translate Go programs to the proof system Coq.","date":"2024-02-22T00:00:00.000Z","formattedDate":"February 22, 2024","tags":[{"label":"coq-of-go","permalink":"/blog/tags/coq-of-go"},{"label":"Go","permalink":"/blog/tags/go"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"}],"readingTime":12.03,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\uddab Translating Go to Coq, part 1","tags":["coq-of-go","Go","Coq","translation"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Improvements in the Rust translation to Coq, part 1","permalink":"/blog/2024/02/29/improvements-rust-translation"},"nextItem":{"title":"\ud83e\uddee Experiment on translation from Haskell to Coq","permalink":"/blog/2024/02/14/experiment-coq-of-hs"}},"content":"In this blog post, we present our development steps to build a tool to translate Go programs to the proof system Coq.\\n\\nThe goal is to formally verify Go programs to make them 
totally bug-free. It is actually possible to make a program totally bug-free, as [formal verification](https://en.wikipedia.org/wiki/Formal_verification) can cover all execution cases and kinds of properties thanks to the use of mathematical methods. This corresponds to the highest level of the [Evaluation Assurance Levels](https://en.wikipedia.org/wiki/Evaluation_Assurance_Level) used for critical applications, such as the space industry.\\n\\nAll the code of our work is available on GitHub at [github.com/formal-land/coq-of-go](https://github.com/formal-land/coq-of-go).\\n\\n\x3c!-- truncate --\x3e\\n\\n## Introduction\\n\\nWe believe that there are not yet a lot of formal verification tools for Go. We can cite [Goose](https://github.com/tchajed/goose), which works by translating Go to the proof system Coq. We will follow a similar approach, translating the Go language to our favorite proof system Coq. In contrast to Goose, we plan to support the whole Go language, even at the expense of the simplicity of the translation.\\n\\nFor that, we target the translation of the [SSA form of Go](https://pkg.go.dev/golang.org/x/tools/go/ssa) instead of the [Go AST](https://pkg.go.dev/go/ast). The SSA form is a more low-level representation of Go, so we hope to capture the semantics of the whole Go language more easily. 
This should be at the expense of the simplicity of the generated translation, but we hope that having full language support outweighs this.\\n\\nGo is an interesting target as:\\n\\n- this is quite a popular language,\\n- it is focusing on simplicity, with a reduced set of language features,\\n- a lot of critical backend applications are written in Go, including for very large companies (Google, Netflix, Uber, Twitch, etc.).\\n\\nAmong interesting properties that we can verify are:\\n\\n- the absence of reachable `panic` in the code,\\n- the absence of race conditions or deadlocks,\\n- the backward compatibility from release to release, for parts of the code whose behavior is not supposed to change,\\n- the strict application of business rules.\\n\\n:::tip Contact\\n\\nYou can [follow us on X](https://twitter.com/FormalLand) to get our updates. We propose tools and services to make your codebase totally bug-free. Contact us at [contact@formal.land](mailto:contact@formal.land) to chat! We offer a free audit to assess the feasibility of formal verification on your case.\\n\\n:::\\n\\n:::note Goal\\n\\nOur company goal is to make formal verification accessible to all projects, reducing its cost to 20% of the development cost. There should be no reason to have bugs in end-user products!\\n\\n:::\\n\\n![Mole and Rooster](2024-02-22/mole_rooster.webp)\\n\\n## First target\\n\\nOur first target is to achieve the formal verification _including all the dependencies_ of the hello world program:\\n\\n```go\\npackage main\\n\\nimport \\"fmt\\"\\n\\nfunc main() {\\n\\tfmt.Println(\\"Hello, World!\\")\\n}\\n```\\n\\nWhat we want to show about this code is that it does a single and only thing: outputting the string \\"Hello, World!\\" to the standard output. 
Its only dependency is the `fmt` package, but when we look at the transitive dependencies of this package:\\n\\n```sh\\ngo list -f \'{{ .Deps }}\' fmt\\n```\\n\\nwe get around forty packages:\\n\\n```\\nerrors\\ninternal/abi\\ninternal/bytealg\\ninternal/coverage/rtcov\\ninternal/cpu\\ninternal/fmtsort\\ninternal/goarch\\ninternal/godebugs\\ninternal/goexperiment\\ninternal/goos\\ninternal/itoa\\ninternal/oserror\\ninternal/poll\\ninternal/race\\ninternal/reflectlite\\ninternal/safefilepath\\ninternal/syscall/execenv\\ninternal/syscall/unix\\ninternal/testlog\\ninternal/unsafeheader\\nio\\nio/fs\\nmath\\nmath/bits\\nos\\npath\\nreflect\\nruntime\\nruntime/internal/atomic\\nruntime/internal/math\\nruntime/internal/sys\\nruntime/internal/syscall\\nsort\\nstrconv\\nsync\\nsync/atomic\\nsyscall\\ntime\\nunicode\\nunicode/utf8\\nunsafe\\n```\\n\\nWe will need to translate all these packages to meaningful Coq code.\\n\\n## The start\\n\\nWe made the `coq-of-go` tool, with everything in a single file [main.go](https://github.com/formal-land/coq-of-go/blob/main/main.go) for now. We retrieve the SSA form of a Go package provided as a command line parameter (code without the error handling):\\n\\n```go\\nfunc main() {\\n\\tpackageToTranslate := os.Args[1]\\n\\tcfg := &packages.Config{Mode: packages.LoadSyntax}\\n\\tinitial, _ := packages.Load(cfg, packageToTranslate)\\n\\t_, pkgs := ssautil.Packages(initial, 0)\\n\\tpkgs[0].Build()\\n\\tmembers := pkgs[0].Members\\n```\\n\\n:::note SSA form\\n\\nThe [SSA form](https://en.wikipedia.org/wiki/Static_single-assignment_form) of a program is generally used internally by compilers to have a simple representation to work on. The [LLVM](https://llvm.org/) language is such an example. In SSA, each variable is assigned exactly once and the control flow is explicit, with jumps or conditional jumps to labels. 
There are no `for` loops, `if` statements, or non-primitive expressions.\\n\\n:::\\n\\nThen we iterate over all the SSA `members`, and directly print the corresponding Coq code to the standard output. We do not use an intermediate representation or make intermediate passes. We do not even do pretty-printing (splitting lines that are too long at the right place, and introducing indentation)! This should not be necessary as the SSA code cannot nest sub-expressions or statements. We still try to print a readable Coq code, as it will be used in the proofs.\\n\\nThere are four kinds of SSA members:\\n\\n- named constants,\\n- globals,\\n- types,\\n- functions.\\n\\nNamed constants and globals are similar, and are for top-level variables whose value is either known at compile-time or computed at the program\'s init. Types are for type definitions. We will focus on functions, as this is where the code is.\\n\\n## Functions\\n\\nThe SSA functions in Go are described by the type [`ssa.Function`](https://pkg.go.dev/golang.org/x/tools/go/ssa#Function):\\n\\n```go\\ntype Function struct {\\n\\tSignature *types.Signature\\n\\n\\t// source information\\n\\tSynthetic string // provenance of synthetic function; \\"\\" for true source functions\\n\\n\\tPkg *Package // enclosing package; nil for shared funcs (wrappers and error.Error)\\n\\tProg *Program // enclosing program\\n\\n\\tParams []*Parameter // function parameters; for methods, includes receiver\\n\\tFreeVars []*FreeVar // free variables whose values must be supplied by closure\\n\\tLocals []*Alloc // frame-allocated variables of this function\\n\\tBlocks []*BasicBlock // basic blocks of the function; nil => external\\n\\tRecover *BasicBlock // optional; control transfers here after recovered panic\\n\\tAnonFuncs []*Function // anonymous functions directly beneath this one\\n\\t// contains filtered or unexported fields\\n}\\n```\\n\\nThe main part of interest for us is `Blocks`. 
A block is a sequence of instructions, and the control flow is explicit. The last instruction of a block is a jump to another block, or a return. The first instructions of a block can be the special `Phi` instruction, which is used to merge control flow from different branches.\\n\\nWe decided to write a first version to see what the SSA code of Go looks like when printed in Coq, without thinking about generating a well-typed code. This looks like this:\\n\\n```coq\\nwith MakeUint64 (\u03b1 : list Val.t) : M (list Val.t) :=\\n M.Thunk (\\n match \u03b1 with\\n | [x] =>\\n M.Thunk (M.EvalBody [(0,\\n let* \\"t0\\" := Instr.BinOp x \\"<\\" (Val.Lit (Lit.Int 9223372036854775808)) in\\n Instr.If (Register.read \\"t0\\") 1 2\\n );\\n (1,\\n let* \\"t1\\" := Instr.Convert x in\\n let* \\"t2\\" := Instr.ChangeType (Register.read \\"t1\\") in\\n let* \\"t3\\" := Instr.MakeInterface (Register.read \\"t2\\") in\\n M.Return [(Register.read \\"t3\\")]\\n );\\n (2,\\n let* \\"t4\\" := Instr.Alloc (* complit *) Alloc.Local \\"*go/constant.intVal\\" in\\n let* \\"t5\\" := Instr.FieldAddr (Register.read \\"t4\\") 0 in\\n let* \\"t6\\" := Instr.Call (CallKind.Function (newInt [])) in\\n let* \\"t7\\" := Instr.Call (CallKind.Function (TODO_method [(Register.read \\"t6\\"); x])) in\\n do* Instr.Store (Register.read \\"t5\\") (Register.read \\"t7\\") in\\n let* \\"t8\\" := Instr.UnOp \\"*\\" (Register.read \\"t4\\") in\\n let* \\"t9\\" := Instr.MakeInterface (Register.read \\"t8\\") in\\n M.Return [(Register.read \\"t9\\")]\\n )])\\n | _ => M.Thunk (M.EvalBody [])\\n end)\\n```\\n\\nfor a source Go code (from the [go/constant](https://pkg.go.dev/go/constant) package):\\n\\n```go\\n// MakeUint64 returns the [Int] value for x.\\nfunc MakeUint64(x uint64) Value {\\n\\tif x < 1<<63 {\\n\\t\\treturn int64Val(int64(x))\\n\\t}\\n\\treturn intVal{newInt().SetUint64(x)}\\n}\\n```\\n\\nThere are three blocks of code, labeled with `0`, `1`, and `2`. 
The first block ends with a conditional jump `If` corresponding to the `if` statement in the Go code. The following blocks correspond to the two possible branches of the `if` statement. They both end with a `Return` instruction, corresponding to the `return` statement in the Go code. They run various primitive instructions that we have translated as best we can.\\n\\nThe generated Coq code is still readable but more verbose than the original Go code. We will later develop proof techniques using simulations to enable the user to define equivalent but simpler versions of the translation. Being able to define simulations of an imperative program is also important for the proofs, as we can rewrite the code in functional style to make it easier to reason about.\\n\\n## Type-checking\\n\\nFrom there, a second step is to generate code that type-checks, leaving aside sound semantics for now. We generate the various Coq definitions that are needed in a header of the generated code, using axioms for all the definitions. For example, for the allocations we do:\\n\\n```coq\\nModule Alloc.\\n Inductive t : Set :=\\n | Heap\\n | Local.\\nEnd Alloc.\\n\\nModule Instr.\\n Parameter Alloc : Alloc.t -> string -> M Val.t.\\n```\\n\\nThe `Inductive` keyword in Coq defines a type with two constructors `Heap` and `Local`. The `Parameter` keyword defines an axiomatized definition, where we only provide the type but not the definition itself. The `Instr.Alloc` instruction takes as parameters an allocation mode `Alloc.t` and a string and returns an `M Val.t` value.\\n\\n### Representation of values\\n\\nWe make the choice to remove the types while doing the translation, as the type system of Go is probably incompatible with the one of Coq in many ways. We thus translate everything to a single type `Val.t` in Coq to represent all kinds of possible Go values. 
The downside of this approach is that it makes the generated code less readable and less safe, as types are useful to track the correct use of values.\\n\\nFor now, we define the `Val.t` type as:\\n\\n```coq\\nModule Val.\\n Inductive t : Set :=\\n | Lit (_ : Lit.t)\\n | Tuple (_ : list t).\\nEnd Val.\\n```\\n\\nwith the literals `Lit.t` as:\\n\\n```coq\\nModule Lit.\\n Inductive t : Set :=\\n | Bool (_ : bool)\\n | Int (_ : Z)\\n | Float (_ : Rational)\\n | Complex (_ _ : Rational)\\n | String (_ : string)\\n | Nil.\\nEnd Lit.\\n```\\n\\nWe plan to refine this type and add more cases as we improve `coq-of-go`. Structures, pointers, and closures are missing for now.\\n\\n### Monadic style\\n\\nIn order to represent the side-effects of the Go code, we use a [monadic style](). This is a standard approach to represent side-effects like mutations, exceptions, or non-termination in a purely functional language such as Coq. We choose to use:\\n\\n- A free monad, where all the primitives are constructors of the inductive type `M` of the monad. This simplifies the manipulation of the monad by allowing us to compute on it and by deferring the actual implementation of the monadic primitives.\\n- A co-inductive type, to allow potentially non-terminating programs. 
Co-inductive types are like lazy definitions in Haskell where it is possible to make an infinite list for example, as long as only a finite number of elements are consumed.\\n\\nIn that sense, we follow the approach in the paper [Modular, Compositional, and Executable Formal Semantics for LLVM IR](https://cambium.inria.fr/~eyoon/paper/vir.pdf), that is using a co-inductive free monad (interaction tree) to formalize a reasonable subset of the LLVM language that is also an SSA representation but with more low-level instructions than Go.\\n\\nOur definition for `M` for now is:\\n\\n```coq\\nModule M.\\n CoInductive t (A : Set) : Set :=\\n | Return (_ : A)\\n | Bind {B : Set} (_ : t B) (_ : B -> t A)\\n | Thunk (_ : t A)\\n | EvalBody (_ : list (Z * t A)).\\n Arguments Return {A}.\\n Arguments Bind {A B}.\\n Arguments Thunk {A}.\\n Arguments EvalBody {A}.\\nEnd M.\\nDefinition M : Set -> Set := M.t.\\n```\\n\\nWe define all the functions that we translate as mutually recursive with the `CoFixpoint ... with ...` keyword of Coq. Thus, we do not have to preserve the ordering of definitions that is required by Coq or care for recursive or mutually recursive functions in Go.\\n\\nHowever, we did not achieve to make the type-checker of Coq happy for our `CoFixpoint` as many definitions are axiomatized, and the type-checker of Coq wants their definitions to know if they produce co-inductive constructors. 
So, for now, we admit this step by disabling the termination checker with this flag:\\n\\n```coq\\nLocal Unset Guard Checking.\\n```\\n\\n## Next\\n\\nWhen we translate our hello world example we get the Coq code:\\n\\n```coq\\nCoFixpoint Main (\u03b1 : list Val.t) : M (list Val.t) :=\\n M.Thunk (\\n match \u03b1 with\\n | [] =>\\n M.Thunk (M.EvalBody [(0,\\n let* \\"t0\\" := Instr.Alloc (* varargs *) Alloc.Heap \\"*[1]any\\" in\\n let* \\"t1\\" := Instr.IndexAddr (Register.read \\"t0\\") (Val.Lit (Lit.Int 0)) in\\n let* \\"t2\\" := Instr.MakeInterface (Val.Lit (Lit.String \\"Hello, World!\\")) in\\n do* Instr.Store (Register.read \\"t1\\") (Register.read \\"t2\\") in\\n let* \\"t3\\" := Instr.Slice (Register.read \\"t0\\") None None in\\n let* \\"t4\\" := Instr.Call (CallKind.Function (fmt.Println [(Register.read \\"t3\\")])) in\\n M.Return []\\n )])\\n | _ => M.Thunk (M.EvalBody [])\\n end)\\n\\nwith init (\u03b1 : list Val.t) : M (list Val.t) :=\\n M.Thunk (\\n match \u03b1 with\\n | [] =>\\n M.Thunk (M.EvalBody [(0,\\n let* \\"t0\\" := Instr.UnOp \\"*\\" (Register.read \\"init$guard\\") in\\n Instr.If (Register.read \\"t0\\") 2 1\\n );\\n (1,\\n do* Instr.Store (Register.read \\"init$guard\\") (Val.Lit (Lit.Bool true)) in\\n let* \\"t1\\" := Instr.Call (CallKind.Function (fmt.init [])) in\\n Instr.Jump 2\\n );\\n (2,\\n M.Return []\\n )])\\n | _ => M.Thunk (M.EvalBody [])\\n end).\\n```\\n\\nThe `init` function, which is automatically generated by the Go compiler to initialize global variables, does not do much here. It checks whether it was already called or not reading the `init$guard` variable, and if not, it calls the `fmt.init` function. The `Main` function is the one that we are interested in. It allocates a variable to store the string \\"Hello, World!\\", and then calls the `fmt.Println` function to print it.\\n\\nFrom there, to continue the project we have two possibilities:\\n\\n1. 
Give actual definitions to each primitive instruction that is used in this example (for now, everything is axiomatized).\\n2. Translate all the transitive dependencies of the hello world program to Coq, and make sure that we can compile everything together.\\n\\nFor the next step, we choose to follow the second possibility as we are more confident in being able to define the semantics of the instructions, which is purely done on the Coq side, than in being able to use the Go compiler\'s APIs to retrieve the definitions of all the dependencies and relate them together.\\n\\n## Conclusion\\n\\nWe have presented the beginning of our journey to translate Go programs to Coq, to build a formal verification tool for Go. The translation type-checks on the few examples we have tried but has no semantics. We will continue by handling the translation of the dependencies of a package.\\n\\nIf you are interested in this project, please contact us at [contact@formal.land](mailto:contact@formal.land) or go to our [GitHub repository](https://github.com/formal-land/coq-of-go)."},{"id":"/2024/02/14/experiment-coq-of-hs","metadata":{"permalink":"/blog/2024/02/14/experiment-coq-of-hs","source":"@site/blog/2024-02-14-experiment-coq-of-hs.md","title":"\ud83e\uddee Experiment on translation from Haskell to Coq","description":"We present an experiment coq-of-hs that we have made on the translation of Haskell programs to the proof system Coq \ud83d\udc13. 
The goal is to formally verify Haskell programs to make them totally bug-free.","date":"2024-02-14T00:00:00.000Z","formattedDate":"February 14, 2024","tags":[{"label":"coq-of-hs","permalink":"/blog/tags/coq-of-hs"},{"label":"Haskell","permalink":"/blog/tags/haskell"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"translation","permalink":"/blog/tags/translation"}],"readingTime":4.365,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\uddee Experiment on translation from Haskell to Coq","tags":["coq-of-hs","Haskell","Coq","translation"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\uddab Translating Go to Coq, part 1","permalink":"/blog/2024/02/22/journey-coq-of-go"},"nextItem":{"title":"\ud83e\udd84 The importance of formal verification","permalink":"/blog/2024/02/02/formal-verification-for-aleph-zero"}},"content":"We present an experiment [coq-of-hs](https://github.com/formal-land/coq-of-hs-experiment) that we have made on the translation of [Haskell](https://www.haskell.org/) programs to the proof system [Coq \ud83d\udc13](https://coq.inria.fr/). The goal is to formally verify Haskell programs to make them totally bug-free.\\n\\nIndeed, even with the use of a strict type system, there can still be bugs for properties that cannot be expressed with types. An example of such a property is the backward compatibility of an API endpoint for the new release of a web service when there has been code refactoring. Only formal verification can cover all execution cases and kinds of properties.\\n\\nThe code of the tool is at: [github.com/formal-land/coq-of-hs-experiment](https://github.com/formal-land/coq-of-hs-experiment) (AGPL license)\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Contact\\n\\nWe propose tools to make your codebase totally bug-free. Contact us at [contact@formal.land](mailto:contact@formal.land) for more information! 
We offer a free audit to assess the feasibility of formal verification for your case.\\n\\n:::\\n\\n:::info Info\\n\\nWe estimate that the cost of formal verification should be 20% of the development cost. There are no reasons to still have bugs today!\\n\\n:::\\n\\n![Haskell Logo](2024-02-14/haskell_logo.svg)\\n\\n## Goal of the experiment\\n\\nThere are already some tools to formally verify Haskell programs:\\n\\n- [\ud83d\udc13 hs-to-coq](https://github.com/plclub/hs-to-coq) translation from Haskell to Coq\\n- [\ud83d\udca7 Liquid Haskell](https://en.wikipedia.org/wiki/Liquid_Haskell) verification using [SMT solvers](https://en.wikipedia.org/wiki/Satisfiability_modulo_theories)\\n\\nIn this experiment, we want to check the feasibility of translation from Haskell to Coq:\\n\\n- \ud83d\udc4d covering all the language without manual configuration or code changes,\\n- \ud83d\udc4e even if this is at the cost of a more verbose and low-level translation.\\n\\n## Example\\n\\nHere is an example of a Haskell function:\\n\\n```haskell\\nfixObvious :: (a -> a) -> a\\nfixObvious f = f (fixObvious f)\\n```\\n\\nthat `coq-of-hs` translates to this valid Coq code:\\n\\n```coq\\nCoFixpoint fixObvious : Val.t :=\\n (Val.Lam (fun (f : Val.t) => (Val.App f (Val.App fixObvious f)))).\\n```\\n\\n## Infrastructure\\n\\nWe read the [Haskell Core](https://serokell.io/blog/haskell-to-core) representation of Haskell using the GHC plugin system. Thus, we read the exact same code version as the one that is compiled down to assembly code by [GHC](https://www.haskell.org/ghc/), to take into account all compilation options.\\n\\nHaskell Core is an intermediate representation of Haskell that is close to the lambda calculus and used by the Haskell compiler for various optimizations passes. 
Here are all the constructors of the `Expr` type of Haskell Core:\\n\\n```haskell\\ndata Expr b\\n = Var Id\\n | Lit Literal\\n | App (Expr b) (Arg b)\\n | Lam b (Expr b)\\n | Let (Bind b) (Expr b)\\n | Case (Expr b) b Type [Alt b]\\n | Cast (Expr b) Coercion\\n | Tick (Tickish Id) (Expr b)\\n | Type Type\\n | Coercion Coercion\\n```\\n\\nThis paper [System FC, as implemented in GHC](https://repository.brynmawr.edu/cgi/viewcontent.cgi?article=1015&context=compsci_pubs) presents it as [System F](https://en.wikipedia.org/wiki/System_F) plus coercions. We translate Haskell code to an untyped version of the lambda calculus in Coq, with co-induction to allow for infinite data structures:\\n\\n```coq\\nModule Val.\\n #[bypass_check(positivity)]\\n CoInductive t : Set :=\\n | Lit (_ : Lit.t)\\n | Con (_ : string) (_ : list t)\\n | App (_ _ : t)\\n | Lam (_ : t -> t)\\n | Case (_ : t) (_ : t -> list (Case.t t))\\n | Impossible.\\nEnd Val.\\n```\\n\\nWe make the translation by induction over the Haskell Core representation, and we translate each constructor to a corresponding constructor of the Coq representation. We pretty-print the Coq code directly without using an intermediate representation. We use the [prettyprinter](https://github.com/quchen/prettyprinter) package with the two main following primitives:\\n\\n```haskell\\nconcatNest :: [Doc ()] -> Doc ()\\nconcatNest = group . nest 2 . vsep\\n\\nconcatGroup :: [Doc ()] -> Doc ()\\nconcatGroup = group . vsep\\n```\\n\\nto display a sub-term with or without indentation when splitting lines that are too long. This translation works well on all the Haskell expressions that we have tested.\\n\\n## Missing features\\n\\n### Semantics\\n\\nWe have not yet defined a semantics. For now, the terms that we generate in Coq are purely descriptive. We will wait to have examples of things to verify to define semantics that are practical to use.\\n\\n### Type-classes\\n\\nWe have not yet translated typeclasses. 
The Haskell Core language hides most of the typeclass-related code. For example, it represents instances as additional function parameters for functions that have a typeclass constraint. But we still need to declare the functions corresponding to the members of the typeclasses, which we have not done yet.\\n\\n### Multi-file projects\\n\\nWe have not yet implemented the translation of multi-file projects. We have only tested the translation of a single-file project.\\n\\n### Standard library\\n\\nSimilarly to the handling of multi-file projects, we have not yet tested the translation of projects using external libraries or translating the base library of Haskell.\\n\\n### Strict positivity\\n\\nWe had to turn off the strict positivity condition for the definition of `Val.t` in Coq with:\\n\\n```coq\\n#[bypass_check(positivity)]\\n```\\n\\nThis is due to the case:\\n\\n```coq\\n| Lam (_ : t -> t)\\n```\\n\\nwhere `t` appears as a parameter of a function (negative position). We do not know if this causes any problem in practice, on values that correspond to well-typed Haskell programs.\\n\\n## Conclusion\\n\\nWe have presented an experiment on the translation of Haskell programs to Coq. 
If you are interested in this project, please get in touch with us at [contact@formal.land](mailto:contact@formal.land) or go to the [GitHub repository](https://github.com/formal-land/coq-of-hs-experiment) of the project."},{"id":"/2024/02/02/formal-verification-for-aleph-zero","metadata":{"permalink":"/blog/2024/02/02/formal-verification-for-aleph-zero","source":"@site/blog/2024-02-02-formal-verification-for-aleph-zero.md","title":"\ud83e\udd84 The importance of formal verification","description":"Ensuring Flawless Software in a Flawed World","date":"2024-02-02T00:00:00.000Z","formattedDate":"February 2, 2024","tags":[],"readingTime":5.53,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd84 The importance of formal verification","authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\uddee Experiment on translation from Haskell to Coq","permalink":"/blog/2024/02/14/experiment-coq-of-hs"},"nextItem":{"title":"\ud83e\udd80 Upgrade the Rust version of coq-of-rust","permalink":"/blog/2024/01/18/update-coq-of-rust"}},"content":"> Ensuring Flawless Software in a Flawed World\\n\\nIn this blog post, we present what formal verification is and why this is such a valuable tool to improve the security of your applications.\\n\\n\x3c!-- truncate --\x3e\\n\\n![Formal verification](2024-02-02/formal_verification.png)\\n\\n:::tip Contact\\n\\nIf you want to formally verify your codebase to improve the security of your application, contact us at [contact@formal.land](mailto:contact@formal.land)! We offer a free audit of your codebase to assess the feasibility of formal verification.\\n\\n:::\\n\\n:::info Thanks\\n\\nThe current development of our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust), for the formal verification of Rust code, is made possible thanks to the [Aleph Zero](https://alephzero.org/)\'s Foundation and its [Ecosystem Funding Program](https://alephzero.org/ecosystem-funding-program). 
The aim is to develop an extra safe platform to build decentralized applications with formally verified smart contracts.\\n\\n:::\\n\\n## What is formal verification?\\n\\nFormal verification is a set of techniques to check for the complete correctness of a program, reasoning at a symbolic level rather than executing a particular instance of the code. By symbolic reasoning, we mean following the values of the variables by tracking their names and constraints, without necessarily giving them an example value. This is what we would do in our heads to understand a code where a variable `username` appears, following which functions it is given to, to know where we use the user name. The concrete user name that we consider is irrelevant, although some people prefer to think with an example.\\n\\nIn formal verification, we rely on precise mathematical reasoning to make sure that there are no mistakes or missing cases. We check this reasoning with a dedicated program ([SMT](https://en.wikipedia.org/wiki/Satisfiability_modulo_theories) solver, [Coq](https://coq.inria.fr/) proof system, ...). Indeed, as programs grow in complexity, it becomes easy to forget an `if` branch or an error case.\\n\\nFor example, to say that the following Rust program is valid:\\n\\n```rust\\n/// Return the maximum of [a] and [b]\\nfn get_max(a: u128, b: u128) -> u128 {\\n if a > b {\\n a\\n } else {\\n b\\n }\\n}\\n```\\n\\nwe reason on two cases (reasoning by disjunction):\\n\\n- `a > b` where `a` is the maximum,\\n- `a <= b` where `b` is the maximum,\\n\\nwith the values of `a` and `b` being irrelevant (symbolic). In both cases, we can conclude that `get_max` returns the maximum.\\n\\nThis is in contrast with testing, where we need to execute the program with all possible instances of `a` and `b` to check that the program is correct with 100% certainty. 
This is infeasible in this case as the type `u128` is too large to be tested exhaustively: there are `2^256` possible values for `a` and `b`, meaning `115792089237316195423570985008687907853269984665640564039457584007913129639936` possible values!\\n\\nA program is shown correct with respect to an expected behavior, called a _formal specification_. This is expressed in a mathematical language to be non-ambiguous. For example, we can specify the behavior of the previous program as:\\n\\n```\\nFORALL (a b : u128),\\n (get_max a b = a OR get_max a b = b) AND\\n (get_max a b >= a AND get_max a b >= b)\\n```\\n\\nstating that we indeed return the maximum of `a` and `b`.\\n\\nWhen a program is formally verified, we are mathematically sure it will always follow its specifications. This is a way to eliminate all bugs, as long as we have a complete specification of what it is supposed to do or not do. This corresponds to the highest level of Evaluation Assurance Level, [EAL7](https://en.wikipedia.org/wiki/Evaluation_Assurance_Level#EAL7:_Formally_Verified_Design_and_Tested). This is used for critical applications, such as space rocket software, where a single bug can be extremely expensive (the loss of a rocket!).\\n\\nThere are various formal verification tools, such as the proof system [Coq](https://coq.inria.fr/). The C compiler [CompCert](https://en.wikipedia.org/wiki/CompCert) is an example of large software verified in Coq. It is proven correct, in contrast to most other C compilers that contain [subtle bugs](https://users.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf). CompCert is now used by Airbus to compile C programs embedded in planes \ud83d\udeeb.\\n\\n## Why is it such a useful tool?\\n\\nFormal verification is extremely useful as it can anticipate all the bugs by exploring all possible execution cases of a program. Here is a quote from [Edsger W. 
Dijkstra](https://en.wikipedia.org/wiki/Formal_verification):\\n\\n> Program testing can be used to show the presence of bugs, but never to show their absence!\\n\\nIt offers the possibility to make software that never fails. This is often required for applications with human life at stake, such as planes or medical devices. But it can also be useful for applications where a single bug can be extremely expensive, such as financial applications.\\n\\nSmart contracts are a good example of such applications. They are programs that are executed on a blockchain and are used to manage assets worth billions of dollars. A single bug in a smart contract can lead to the loss of all the assets managed by the contract. In the first half of 2023, some estimate that attacks on web3 platforms resulted in a loss of [$655.61 million](https://www.linkedin.com/pulse/h1-2023-global-web3-security-report-aml-analysis-crypto-regulatory/), with most of these losses due to bugs in smart contracts. These bugs could be prevented using formally verified smart contracts.\\n\\nFinally, formal verification is useful to improve the quality of a program by enforcing the need to use:\\n\\n- clear programming constructs,\\n- an explicit specification of the behavior of the program.\\n\\n## Comparison of formal verification and testing\\n\\nCompared to testing, formal verification is more complex as:\\n\\n- it typically takes much more time to formally verify a program than to test it on a reasonable set of inputs,\\n- it requires a formal specification of the program, which is not always available,\\n- it requires some specific expertise to use the formal verification tools and to write the specifications.\\n\\nIn addition, formal verification assumes a certain model of the environment of the program, which is not always accurate. When actually executing the code, we also exercise all the dependencies (libraries, operating system, network, ...) 
that might cause issues at runtime.\\n\\nHowever, formal verification is the only way to have an exhaustive check of the program. It verifies all corner cases, such as integer overflows, or hard-to-reproduce issues, such as concurrency bugs. We recommend combining both approaches as they do not catch the same kinds of bugs.\\n\\nAt [Formal Land](https://formal.land/), we consider it critical to lower the cost of formal verification to apply it to a larger scope of programs and prevent more bugs and attacks. We work on the formal verification of Rust with [coq-of-rust](https://github.com/formal-land/coq-of-rust) and OCaml with [coq-of-ocaml](https://github.com/formal-land/coq-of-ocaml).\\n\\n## Conclusion\\n\\nFormal verification is a powerful tool to improve the security of your applications. It is the only way to prevent all bugs by exploring all possible executions of your programs. It complements existing testing methods. It is particularly useful for critical applications, such as smart contracts, where a single bug can be extremely expensive."},{"id":"/2024/01/18/update-coq-of-rust","metadata":{"permalink":"/blog/2024/01/18/update-coq-of-rust","source":"@site/blog/2024-01-18-update-coq-of-rust.md","title":"\ud83e\udd80 Upgrade the Rust version of coq-of-rust","description":"We continue our work on the coq-of-rust tool to formally verify Rust programs with the Coq proof assistant. 
We have upgraded the Rust version that we support, simplified the translation of the traits, and are adding better support for the standard library of Rust.","date":"2024-01-18T00:00:00.000Z","formattedDate":"January 18, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Aleph-Zero","permalink":"/blog/tags/aleph-zero"}],"readingTime":3.5,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Upgrade the Rust version of coq-of-rust","tags":["coq-of-rust","Rust","Coq","Aleph-Zero"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd84 The importance of formal verification","permalink":"/blog/2024/02/02/formal-verification-for-aleph-zero"},"nextItem":{"title":"\ud83e\udd80 Translating Rust match patterns to Coq with coq-of-rust","permalink":"/blog/2024/01/04/rust-translating-match"}},"content":"We continue our work on the [coq-of-rust](https://github.com/formal-land/coq-of-rust) tool to formally verify Rust programs with the [Coq proof assistant](https://coq.inria.fr/). We have upgraded the Rust version that we support, simplified the translation of the traits, and are adding better support for the standard library of Rust.\\n\\nOverall, we are now able to translate **about 80%** of the Rust examples from the [Rust by Example](https://doc.rust-lang.org/stable/rust-by-example/) book into valid Coq files. This means we support a large subset of the Rust language.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Purchase\\n\\nTo formally verify your Rust codebase and improve the security of your application, email us at [contact@formal.land](mailto:contact@formal.land)! 
Formal verification is the only way to prevent all bugs by exploring all possible executions of your programs \ud83c\udfaf.\\n\\n:::\\n\\n:::info Thanks\\n\\nThis work and the development of [coq-of-rust](https://github.com/formal-land/coq-of-rust) is made possible thanks to the [Aleph Zero](https://alephzero.org/)\'s Foundation, to develop an extra safe platform to build decentralized applications with formally verified smart contracts.\\n\\n:::\\n\\n![Rust rooster](2024-01-18/rooster.png)\\n\\n## Upgrade of the Rust version\\n\\nThe tool `coq-of-rust` is tied to a particular version of the Rust compiler that we use to parse and type-check a `cargo` project. We now support the `nightly-2023-12-15` version of Rust, up from `nightly-2023-04-30`. Most of the changes were minor, but it is good to handle these regularly to have smooth upgrades. The corresponding pull request is [coq-of-rust/pull/445](https://github.com/formal-land/coq-of-rust/pull/445). We also got more [Clippy](https://github.com/rust-lang/rust-clippy) warnings thanks to the new version of Rust.\\n\\n## Simplify the translation of traits\\n\\nThe traits of Rust are similar to the [type-classes of Coq](https://coq.inria.fr/refman/addendum/type-classes.html). This is how we translate traits to Coq.\\n\\nBut there are a lot of subtle differences between the two languages. The type-class inference mechanism of Coq does not work all the time on generated Rust code, even when adding a lot of code annotations. We think that the only reliable way to translate Rust traits would be to make explicit the implementations inferred by the Rust compiler, but the Rust compiler currently throws away this information.\\n\\nInstead, our new solution is to use a Coq tactic:\\n\\n```coq\\n(** Try first to infer the trait instance, and if unsuccessful, delegate it at\\n proof time. 
*)\\nLtac get_method method :=\\n exact (M.pure (method _)) ||\\n exact (M.get_method method).\\n```\\n\\nthat first tries to infer the trait instance for a particular method, and if it fails, delegates its definition to the user at proof time. This is a bit unsafe, as a user could provide invalid instances at proof time, by giving some custom instance definitions instead of the ones generated by `coq-of-rust`. So, one should be careful to only apply generated instances to fill the hole made by this tactic in case of failure. We believe this to be a reasonable assumption that we could enforce someday if needed.\\n\\nWe are also starting to remove the trait constraints on polymorphic functions (the `where` clauses). We start by doing it in our manual definition of the standard library of Rust. The rationale is that we can provide the actual trait instances at proof time by having the right hypothesis replicating the constraints of the `where` clauses. Having fewer `where` clauses reduces the complexity of the type inference of Coq on the generated code. There are still some cases that we need to clarify, for example, the handling of [associated types](https://doc.rust-lang.org/rust-by-example/generics/assoc_items/types.html) in the absence of traits.\\n\\n## Handling more of the standard library\\n\\nWe have a definition of the standard library of Rust, mainly composed of axiomatized[^1] definitions, in these three folders:\\n\\n- [CoqOfRust/alloc](https://github.com/formal-land/coq-of-rust/tree/main/CoqOfRust/alloc)\\n- [CoqOfRust/core](https://github.com/formal-land/coq-of-rust/tree/main/CoqOfRust/core)\\n- [CoqOfRust/std](https://github.com/formal-land/coq-of-rust/tree/main/CoqOfRust/std)\\n\\nBy adding more of these axioms, as well as with some small changes to the `coq-of-rust` tool, we are now able to successfully translate around 80% of the examples of the [Rust by Example](https://doc.rust-lang.org/stable/rust-by-example/) book. 
There can still be some challenges on larger programs, but this showcases the good support of `coq-of-rust` for the Rust language.\\n\\n## Conclusion\\n\\nWe are continuing to improve our tool `coq-of-rust` to support more of the Rust language and are making good progress. If you need to improve the security of critical applications written in Rust, contact us at [contact@formal.land](mailto:contact@formal.land) to start formally verifying your code!\\n\\n[^1]: An axiom in Coq is either a theorem whose proof is admitted, or a function/constant definition left for later. This is the equivalent in Rust of the `todo!` macro."},{"id":"/2024/01/04/rust-translating-match","metadata":{"permalink":"/blog/2024/01/04/rust-translating-match","source":"@site/blog/2024-01-04-rust-translating-match.md","title":"\ud83e\udd80 Translating Rust match patterns to Coq with coq-of-rust","description":"Our tool coq-of-rust enables formal verification of \ud83e\udd80 Rust code to make sure that a program has no bugs. This technique checks all possible execution paths using mathematical techniques. 
This is important for example to ensure the security of smart contracts written in Rust language.","date":"2024-01-04T00:00:00.000Z","formattedDate":"January 4, 2024","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"Aleph-Zero","permalink":"/blog/tags/aleph-zero"}],"readingTime":6.005,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Translating Rust match patterns to Coq with coq-of-rust","tags":["coq-of-rust","Rust","Coq","Aleph-Zero"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Upgrade the Rust version of coq-of-rust","permalink":"/blog/2024/01/18/update-coq-of-rust"},"nextItem":{"title":"\ud83e\udd80 Verifying an ERC-20 smart contract in Rust","permalink":"/blog/2023/12/13/rust-verify-erc-20-smart-contract"}},"content":"Our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) enables [formal verification](https://en.wikipedia.org/wiki/Formal_verification) of [\ud83e\udd80 Rust](https://www.rust-lang.org/) code to make sure that a program has no bugs. This technique checks all possible execution paths using mathematical techniques. This is important for example to ensure the security of smart contracts written in Rust language.\\n\\nOur tool `coq-of-rust` works by translating Rust programs to the general proof system [\ud83d\udc13 Coq](https://coq.inria.fr/). Here we explain how we translate[ `match` patterns](https://doc.rust-lang.org/book/ch06-02-match.html) from Rust to Coq. The specificity of Rust patterns is to be able to match values either by value or reference.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Purchase\\n\\nTo formally verify your Rust codebase and improve the security of your application, email us at [contact@formal.land](mailto:contact@formal.land)! 
Formal verification is the only way to prevent all bugs by exploring all possible executions of your program.\\n\\n:::\\n\\n:::info Thanks\\n\\nThis work and the development of [coq-of-rust](https://github.com/formal-land/coq-of-rust) is made possible thanks to the [Aleph Zero](https://alephzero.org/)\'s Foundation, to develop an extra safe platform to build decentralized applications with formally verified smart contracts.\\n\\n:::\\n\\n![Rust rooster](2024-01-04/rust-rooster.png)\\n\\n## Rust example \ud83e\udd80\\n\\nTo illustrate the pattern matching in Rust, we will use the following example featuring a match by reference:\\n\\n```rust\\npub(crate) fn is_option_equal<A>
(\\n is_equal: fn(x: &A, y: &A) -> bool,\\n lhs: Option<A>,\\n rhs: &A,\\n) -> bool {\\n match lhs {\\n None => false,\\n Some(ref value) => is_equal(value, rhs),\\n }\\n}\\n```\\n\\nWe take a function `is_equal` as a parameter, operating only on references to the type `A`. We apply it to compare two values `lhs` and `rhs`:\\n\\n- if `lhs` is `None`, we return `false`,\\n- if `lhs` is `Some`, we get its value by reference and apply `is_equal`.\\n\\nWhen we apply the pattern:\\n\\n```rust\\nSome(ref value) => ...\\n```\\n\\nwe do something interesting: we read the value of `lhs` to know if we are in a `Some` case but leave it in place and return `value`, the reference to its content.\\n\\nTo simulate this behavior in Coq, we need to match in two steps:\\n\\n1. match the value of `lhs` to know if we are in a `Some` case or not,\\n2. if we are in a `Some` case, create the reference to the content of a `Some` case based on the reference to `lhs`.\\n\\n## Coq translation \ud83d\udc13\\n\\nThe Coq translation that our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) generates is the following:\\n\\n```coq\\nDefinition is_option_equal\\n {A : Set}\\n (is_equal : (ref A) -> (ref A) -> M bool.t)\\n (lhs : core.option.Option.t A)\\n (rhs : ref A)\\n : M bool.t :=\\n let* is_equal := M.alloc is_equal in\\n let* lhs := M.alloc lhs in\\n let* rhs := M.alloc rhs in\\n let* \u03b10 : M.Val bool.t :=\\n match_operator\\n lhs\\n [\\n fun \u03b3 =>\\n (let* \u03b10 := M.read \u03b3 in\\n match \u03b10 with\\n | core.option.Option.None => M.alloc false\\n | _ => M.break_match\\n end) :\\n M (M.Val bool.t);\\n fun \u03b3 =>\\n (let* \u03b10 := M.read \u03b3 in\\n match \u03b10 with\\n | core.option.Option.Some _ =>\\n let \u03b30_0 := \u03b3.[\\"Some.0\\"] in\\n let* value := M.alloc (borrow \u03b30_0) in\\n let* \u03b10 : (ref A) -> (ref A) -> M bool.t := M.read is_equal in\\n let* \u03b11 : ref A := M.read value in\\n let* \u03b12 : ref A := M.read rhs in\\n let* \u03b13 : 
bool.t := M.call (\u03b10 \u03b11 \u03b12) in\\n M.alloc \u03b13\\n | _ => M.break_match\\n end) :\\n M (M.Val bool.t)\\n ] in\\n M.read \u03b10.\\n```\\n\\nWe run the `match_operator` on `lhs` and the two branches of the `match`. This operator is of type:\\n\\n```coq\\nDefinition match_operator {A B : Set}\\n (scrutinee : A)\\n (arms : list (A -> M B)) :\\n M B :=\\n ...\\n```\\n\\nIt takes a `scrutinee` value to match as a parameter, and runs a sequence of functions `arms` on it. Each function `arms` takes the value of the `scrutinee` and returns a monadic value `M B`. This monadic value can either be a success value if the pattern matches, or a special failure value if the pattern does not match. We evaluate the branches until one succeeds.\\n\\n### `None` branch\\n\\nThe `None` branch is the simplest one. We read the value at the address given by `lhs` (we represent each Rust variable by its address) and match it with the `None` constructor:\\n\\n```coq\\nfun \u03b3 =>\\n (let* \u03b10 := M.read \u03b3 in\\n match \u03b10 with\\n | core.option.Option.None => M.alloc false\\n | _ => M.break_match\\n end) :\\n M (M.Val bool.t)\\n```\\n\\nIf it matches, we return `false`. 
If it does not, we return the special value `M.break_match` to indicate that the pattern does not match.\\n\\n### `Some` branch\\n\\nIn the `Some` branch, we first also read the value at the address given by `lhs` and match it with the `Some` constructor:\\n\\n```coq\\nfun \u03b3 =>\\n (let* \u03b10 := M.read \u03b3 in\\n match \u03b10 with\\n | core.option.Option.Some _ =>\\n let \u03b30_0 := \u03b3.[\\"Some.0\\"] in\\n let* value := M.alloc (borrow \u03b30_0) in\\n let* \u03b10 : (ref A) -> (ref A) -> M bool.t := M.read is_equal in\\n let* \u03b11 : ref A := M.read value in\\n let* \u03b12 : ref A := M.read rhs in\\n let* \u03b13 : bool.t := M.call (\u03b10 \u03b11 \u03b12) in\\n M.alloc \u03b13\\n | _ => M.break_match\\n end) :\\n M (M.Val bool.t)\\n```\\n\\nIf we are in that case, we create the value:\\n\\n```coq\\nlet \u03b30_0 := \u03b3.[\\"Some.0\\"] in\\n```\\n\\nwith the address of the first field of the `Some` constructor, relative to the address of `lhs` given in `\u03b3`. We define the operator `.[\\"Some.0\\"]` when we define the option type and generate such definitions for all user-defined enum types.\\n\\nWe then encapsulate the address `\u03b30_0` in a proper Rust reference:\\n\\n```coq\\nlet* value := M.alloc (borrow \u03b30_0) in\\n```\\n\\nof type `ref A` in the original Rust code. Finally, we call the function `is_equal` on the two references `value` and `rhs`, with some boilerplate code to read and allocate the variables.\\n\\n## General translation\\n\\nWe generalize this translation to all patterns by:\\n\\n- flattening all the or patterns `|` so that only patterns with a single choice remain,\\n- evaluating each match branch in order with the `match_operator` operator,\\n- in each branch, evaluating the inner patterns in order. This evaluation might fail at any point if the pattern does not match. 
In this case, we return the special value `M.break_match` and continue with the next branch.\\n\\nAt least one branch should succeed as the Rust compiler checks that all cases are covered. We still have a special value `M.impossible` in Coq for the case where no pattern matches, to satisfy the type checker.\\n\\nWe distinguish and handle the following kinds of patterns (and all their combinations):\\n\\n- wild patterns `_`,\\n- binding patterns `(ref) name` or `(ref) name as pattern` (the `ref` keyword is optional),\\n- struct patterns `Name { field1: pattern1, ... }` or `Name(pattern1, ...)`,\\n- tuple patterns `(pattern1, ...)`,\\n- literal patterns `12`, `true`, ...,\\n- slice patterns `[first, second, tail @ ..]`,\\n- dereference patterns `&pattern`.\\n\\nThis was enough to cover all of our examples. The Rust compiler can also automatically add some `ref` patterns when matching on references. We do not need to handle this case as this is automatically done by the Rust compiler during its compilation to the intermediate [THIR](https://rustc-dev-guide.rust-lang.org/thir.html) representation, and we directly read the THIR code.\\n\\n## Conclusion\\n\\nIn this blog post, we have presented how we translate Rust patterns to the proof system Coq. The difficult part is handling the `ref` patterns, which we do by matching in two steps: matching on the values and then computing the addresses of the sub-fields.\\n\\nIf you have Rust smart contracts or programs to verify, feel free to email us at [contact@formal.land](mailto:contact@formal.land). 
We will be happy to help!"},{"id":"/2023/12/13/rust-verify-erc-20-smart-contract","metadata":{"permalink":"/blog/2023/12/13/rust-verify-erc-20-smart-contract","source":"@site/blog/2023-12-13-rust-verify-erc-20-smart-contract.md","title":"\ud83e\udd80 Verifying an ERC-20 smart contract in Rust","description":"Our tool coq-of-rust enables formal verification of \ud83e\udd80 Rust code to make sure that a program has no bugs given a precise specification. We work by translating Rust programs to the general proof system \ud83d\udc13 Coq.","date":"2023-12-13T00:00:00.000Z","formattedDate":"December 13, 2023","tags":[{"label":"Aleph-Zero","permalink":"/blog/tags/aleph-zero"},{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"ERC-20","permalink":"/blog/tags/erc-20"},{"label":"ink!","permalink":"/blog/tags/ink"}],"readingTime":20.12,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Verifying an ERC-20 smart contract in Rust","tags":["Aleph-Zero","coq-of-rust","Rust","Coq","ERC-20","ink!"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Translating Rust match patterns to Coq with coq-of-rust","permalink":"/blog/2024/01/04/rust-translating-match"},"nextItem":{"title":"\ud83e\udd80 Translation of function bodies from Rust to Coq","permalink":"/blog/2023/11/26/rust-function-body"}},"content":"Our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) enables formal verification of [\ud83e\udd80 Rust](https://www.rust-lang.org/) code to make sure that a program has no bugs given a precise specification. 
We work by translating Rust programs to the general proof system [\ud83d\udc13 Coq](https://coq.inria.fr/).\\n\\nHere, we show how we formally verify an [ERC-20 smart contract](https://github.com/use-ink/ink/blob/master/integration-tests/public/erc20/lib.rs) written in Rust for the [Aleph Zero](https://alephzero.org/) blockchain. [ERC-20](https://en.wikipedia.org/wiki/Ethereum#ERC20) smart contracts are used to create new kinds of tokens in an existing blockchain. Examples are stable coins such as the [\ud83d\udcb2USDT](https://tether.to/).\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Purchase\\n\\nTo formally verify your Rust codebase and improve the security of your application, email us at [contact@formal.land](mailto:contact@formal.land)! Formal verification is the only way to prevent all bugs by exploring all possible executions of your program.\\n\\n:::\\n\\n:::info Thanks\\n\\nThis work and the development of [coq-of-rust](https://github.com/formal-land/coq-of-rust) is made possible thanks to the [Aleph Zero](https://alephzero.org/)\'s Foundation, to develop an extra safe platform to build decentralized applications with formally verified smart contracts.\\n\\n:::\\n\\n![Rooster verifying](2023-12-13/rooster-verifying.png)\\n\\n## Smart contract code \ud83e\udd80\\n\\nHere is the Rust code of the smart contract that we want to verify:\\n\\n```rust\\n#[ink::contract]\\nmod erc20 {\\n use ink::storage::Mapping;\\n\\n #[ink(storage)]\\n #[derive(Default)]\\n pub struct Erc20 {\\n total_supply: Balance,\\n balances: Mapping<AccountId, Balance>,\\n allowances: Mapping<(AccountId, AccountId), Balance>,\\n }\\n\\n #[ink(event)]\\n pub struct Transfer {\\n // ...\\n }\\n\\n #[ink(event)]\\n pub struct Approval {\\n // ...\\n }\\n\\n #[derive(Debug, PartialEq, Eq)]\\n #[ink::scale_derive(Encode, Decode, TypeInfo)]\\n pub enum Error {\\n // ...\\n }\\n\\n pub type Result<T> = core::result::Result<T, Error>;\\n\\n impl Erc20 {\\n #[ink(constructor)]\\n pub fn new(total_supply: Balance) -> Self {\\n let 
mut balances = Mapping::default();\\n let caller = Self::env().caller();\\n balances.insert(caller, &total_supply);\\n Self::env().emit_event(Transfer {\\n from: None,\\n to: Some(caller),\\n value: total_supply,\\n });\\n Self {\\n total_supply,\\n balances,\\n allowances: Default::default(),\\n }\\n }\\n\\n #[ink(message)]\\n pub fn total_supply(&self) -> Balance {\\n self.total_supply\\n }\\n\\n #[ink(message)]\\n pub fn balance_of(&self, owner: AccountId) -> Balance {\\n self.balance_of_impl(&owner)\\n }\\n\\n #[inline]\\n fn balance_of_impl(&self, owner: &AccountId) -> Balance {\\n self.balances.get(owner).unwrap_or_default()\\n }\\n\\n #[ink(message)]\\n pub fn allowance(&self, owner: AccountId, spender: AccountId) -> Balance {\\n self.allowance_impl(&owner, &spender)\\n }\\n\\n #[inline]\\n fn allowance_impl(&self, owner: &AccountId, spender: &AccountId) -> Balance {\\n self.allowances.get((owner, spender)).unwrap_or_default()\\n }\\n\\n #[ink(message)]\\n pub fn transfer(&mut self, to: AccountId, value: Balance) -> Result<()> {\\n let from = self.env().caller();\\n self.transfer_from_to(&from, &to, value)\\n }\\n\\n #[ink(message)]\\n pub fn approve(&mut self, spender: AccountId, value: Balance) -> Result<()> {\\n let owner = self.env().caller();\\n self.allowances.insert((&owner, &spender), &value);\\n self.env().emit_event(Approval {\\n owner,\\n spender,\\n value,\\n });\\n Ok(())\\n }\\n\\n #[ink(message)]\\n pub fn transfer_from(\\n &mut self,\\n from: AccountId,\\n to: AccountId,\\n value: Balance,\\n ) -> Result<()> {\\n let caller = self.env().caller();\\n let allowance = self.allowance_impl(&from, &caller);\\n if allowance < value {\\n return Err(Error::InsufficientAllowance)\\n }\\n self.transfer_from_to(&from, &to, value)?;\\n // We checked that allowance >= value\\n #[allow(clippy::arithmetic_side_effects)]\\n self.allowances\\n .insert((&from, &caller), &(allowance - value));\\n Ok(())\\n }\\n\\n fn transfer_from_to(\\n &mut self,\\n from: 
&AccountId,\\n to: &AccountId,\\n value: Balance,\\n ) -> Result<()> {\\n let from_balance = self.balance_of_impl(from);\\n if from_balance < value {\\n return Err(Error::InsufficientBalance)\\n }\\n // We checked that from_balance >= value\\n #[allow(clippy::arithmetic_side_effects)]\\n self.balances.insert(from, &(from_balance - value));\\n let to_balance = self.balance_of_impl(to);\\n self.balances\\n .insert(to, &(to_balance.checked_add(value).unwrap()));\\n self.env().emit_event(Transfer {\\n from: Some(*from),\\n to: Some(*to),\\n value,\\n });\\n Ok(())\\n }\\n }\\n}\\n```\\n\\nThis whole code is rather short and contains no loops, which will simplify our verification process. It uses a lot of macros, such as `#[ink(message)]`, that are specific to the [ink!](https://use.ink/) language for smart contracts, built on top of Rust. To verify this smart contract, we removed all the macros and added a mock of the dependencies, such as `ink::storage::Mapping` to get a map data structure.\\n\\n## The Coq translation \ud83d\udc13\\n\\nBy running our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) we automatically obtain the corresponding Coq code for the contract [erc20.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/examples/default/examples/ink_contracts/erc20.v). 
Here is an extract for the `transfer` function:\\n\\n```coq\\n(*\\n fn transfer(&mut self, to: AccountId, value: Balance) -> Result<()> {\\n let from = self.env().caller();\\n self.transfer_from_to(&from, &to, value)\\n }\\n*)\\nDefinition transfer\\n (self : mut_ref ltac:(Self))\\n (to : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance))\\n : M ltac:(erc20.Result unit) :=\\n let* self : M.Val (mut_ref ltac:(Self)) := M.alloc self in\\n let* to : M.Val erc20.AccountId.t := M.alloc to in\\n let* value : M.Val ltac:(erc20.Balance) := M.alloc value in\\n let* from : M.Val erc20.AccountId.t :=\\n let* \u03b10 : mut_ref erc20.Erc20.t := M.read self in\\n let* \u03b11 : erc20.Env.t :=\\n M.call (erc20.Erc20.t::[\\"env\\"] (borrow (deref \u03b10))) in\\n let* \u03b12 : M.Val erc20.Env.t := M.alloc \u03b11 in\\n let* \u03b13 : erc20.AccountId.t :=\\n M.call (erc20.Env.t::[\\"caller\\"] (borrow \u03b12)) in\\n M.alloc \u03b13 in\\n let* \u03b10 : mut_ref erc20.Erc20.t := M.read self in\\n let* \u03b11 : u128.t := M.read value in\\n let* \u03b12 : core.result.Result.t unit erc20.Error.t :=\\n M.call\\n (erc20.Erc20.t::[\\"transfer_from_to\\"] \u03b10 (borrow from) (borrow to) \u03b11) in\\n let* \u03b10 : M.Val (core.result.Result.t unit erc20.Error.t) := M.alloc \u03b12 in\\n M.read \u03b10.\\n```\\n\\nMore details of the translation are given in previous blog posts, but basically:\\n\\n- we make explicit all memory and implicit operations (like borrowing and dereferencing),\\n- we apply a monadic translation to chain the primitive operations with `let*`.\\n\\n## Proof strategy\\n\\n![Proof strategy](2023-12-13/proof-strategy.png)\\n\\nWe verify the code in two steps:\\n\\n1. Show that a simpler, purely functional Coq code can simulate all the smart contract code.\\n2. Show that the simulation is correct.\\n\\nThat way, we can eliminate all the memory-related operations by showing the equivalence with a simulation. 
Then, we can focus on the functional code, which is more straightforward to reason about. We can cite another project, [Aeneas](https://github.com/AeneasVerif/aeneas), which proposes to do the first step (removing memory operations) automatically.\\n\\n## Simulations\\n\\n### Simulation code\\n\\nWe will work on the example of the `transfer` function. We define the simulations in [Simulations/erc20.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/examples/default/examples/ink_contracts/Simulations/erc20.v). For the `transfer` function this is:\\n\\n```coq\\nDefinition transfer\\n (env : erc20.Env.t)\\n (to : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance)) :\\n MS? State.t ltac:(erc20.Result unit) :=\\n transfer_from_to (Env.caller env) to value.\\n```\\n\\nThe function `transfer` is a wrapper around `transfer_from_to`, using the smart contract caller as the `from` account. The monad `MS?` combines the state and error effect. The state is given by the `State.t` type:\\n\\n```coq\\nModule State.\\n Definition t : Set := erc20.Erc20.t * list erc20.Event.t.\\nEnd State.\\n```\\n\\nIt combines the state of the contract (type `Self` in the Rust code) and a list of events to represent the logs. The errors of the monad include panic errors, as well as control flow primitives such as `return` or `break` that we implement with exceptions.\\n\\n### Equivalence statement\\n\\nWe write all our proofs in [Proofs/erc20.v](https://github.com/formal-land/coq-of-rust/blob/main/CoqOfRust/examples/default/examples/ink_contracts/Proofs/erc20.v). 
The lemma stating that the simulation is equivalent to the original code is:\\n\\n```coq\\nLemma run_transfer\\n (env : erc20.Env.t)\\n (storage : erc20.Erc20.t)\\n (to : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance))\\n (H_storage : Erc20.Valid.t storage)\\n (H_value : Integer.Valid.t value) :\\n let state := State.of_storage storage in\\n let self := Ref.mut_ref Address.storage in\\n let simulation :=\\n lift_simulation\\n (Simulations.erc20.transfer env to value) storage in\\n {{ Environment.of_env env, state |\\n erc20.Impl_erc20_Erc20_t_2.transfer self to value \u21d3\\n simulation.(Output.result)\\n | simulation.(Output.state) }}.\\n```\\n\\nThe main predicate is:\\n\\n```coq\\n{{ env, state | translated_code \u21d3 result | final_state }}.\\n```\\n\\nThis predicate defines our semantics, explaining how to evaluate a translated Rust code in an environment `env` and a state `state`, to obtain a result `result` and a final state `final_state`. We use an environment in addition to a state to initialize various globals and other information related to the execution context. 
For example, here, we use the environment to store the `caller` of the contract and the pointer to the list of logs.\\n\\n### Semantics\\n\\nWe define our monad for the translated code `M A` in continuation-passing style:\\n\\n```coq\\nInductive t (A : Set) : Set :=\\n| Pure : A -> t A\\n| CallPrimitive {B : Set} : Primitive.t B -> (B -> t A) -> t A\\n| Cast {B1 B2 : Set} : B1 -> (B2 -> t A) -> t A\\n| Impossible : t A.\\nArguments Pure {_}.\\nArguments CallPrimitive {_ _}.\\nArguments Cast {_ _ _}.\\nArguments Impossible {_}.\\n```\\n\\nFor now, we use the primitives to access the memory and the environment:\\n\\n```coq\\nModule Primitive.\\n Inductive t : Set -> Set :=\\n | StateAlloc {A : Set} : A -> t (Ref.t A)\\n | StateRead {Address A : Set} : Address -> t A\\n | StateWrite {Address A : Set} : Address -> A -> t unit\\n | EnvRead {A : Set} : t A.\\nEnd Primitive.\\n```\\n\\nFor each of our monad constructs, we add a case to our evaluation predicate, which we describe here:\\n\\n- `Pure`: the result is the value itself, and the state is unchanged:\\n ```coq\\n | Pure :\\n {{ env, state\' | LowM.Pure result \u21d3 result | state\' }}\\n ```\\n- `Cast`: the evaluation is only possible when `B1` and `B2` are the same type `B`:\\n ```coq\\n | Cast {B : Set} (state : State) (v : B) (k : B -> LowM A) :\\n {{ env, state | k v \u21d3 result | state\' }} ->\\n {{ env, state | LowM.Cast v k \u21d3 result | state\' }}\\n ```\\n In this case, we return the result of the continuation `k` of the cast. We do not change the state in the cast.\\n- We read the state using the primitive `State.read`, checking that the `address` is indeed allocated (it returns `None` otherwise). Note that the type of `v` depends on its address. 
We directly allocate values with their original type, to avoid serializations/deserializations to represent the state.\\n ```coq\\n | CallPrimitiveStateRead\\n (address : Address) (v : State.get_Set address)\\n (state : State)\\n (k : State.get_Set address -> LowM A) :\\n State.read address state = Some v ->\\n {{ env, state | k v \u21d3 result | state\' }} ->\\n {{ env, state |\\n LowM.CallPrimitive (Primitive.StateRead address) k \u21d3 result\\n | state\' }}\\n ```\\n- Similarly, we write into the state with `State.alloc_write`, that only succeeds for allocated addresses:\\n ```coq\\n | CallPrimitiveStateWrite\\n (address : Address) (v : State.get_Set address)\\n (state state_inter : State)\\n (k : unit -> LowM A) :\\n State.alloc_write address state v = Some state_inter ->\\n {{ env, state_inter | k tt \u21d3 result | state\' }} ->\\n {{ env, state |\\n LowM.CallPrimitive (Primitive.StateWrite address v) k \u21d3 result\\n | state\' }}\\n ```\\n- To allocate a new value in memory, we have to make a choice depending on whether we want this value to be writable or not. For immutable values, we do not create a new address and instead say that the address is the value itself:\\n ```coq\\n | CallPrimitiveStateAllocNone {B : Set}\\n (state : State) (v : B)\\n (k : Ref B -> LowM A) :\\n {{ env, state | k (Ref.Imm v) \u21d3 result | state\' }} ->\\n {{ env, state |\\n LowM.CallPrimitive (Primitive.StateAlloc v) k \u21d3 result\\n | state\' }}\\n ```\\n If we later attempt to update this value, it will not be possible to define a semantics and we will be stuck. It is up to the user to correctly anticipate if a value will be updated or not to define the semantics. 
For values that might be updated, we use:\\n ```coq\\n | CallPrimitiveStateAllocSome\\n (address : Address) (v : State.get_Set address)\\n (state : State)\\n (k : Ref (State.get_Set address) -> LowM A) :\\n let r :=\\n Ref.MutRef (A := State.get_Set address) (B := State.get_Set address)\\n address (fun full_v => full_v) (fun v _full_v => v) in\\n State.read address state = None ->\\n State.alloc_write address state v = Some state\' ->\\n {{ env, state | k r \u21d3 result | state\' }} ->\\n {{ env, state |\\n LowM.CallPrimitive (Primitive.StateAlloc v) k \u21d3 result\\n | state\' }}\\n ```\\n We need to provide an address not already allocated: `State.read` should return `None`. At this point, we can make any choice of unallocated address in order to simplify the proofs later.\\n- Finally, we read the whole environment with:\\n ```coq\\n | CallPrimitiveEnvRead\\n (state : State) (k : Env -> LowM A) :\\n {{ env, state | k env \u21d3 result | state\' }} ->\\n {{ env, state |\\n LowM.CallPrimitive Primitive.EnvRead k \u21d3 result\\n | state\' }}\\n ```\\n\\n### Semantics remarks\\n\\nWe can make a few remarks about our semantics:\\n\\n- There are no cases for `M.Impossible` as this primitive corresponds to impossible branches in the code.\\n- The semantics is not computable, in the sense that we cannot define a function `run` to evaluate a monadic program in a certain environment and state. Indeed, the user needs to make a choice during the allocation of new values, to know if we allocate the value as immutable or mutable, and with which address. The `M.Cast` operator is also not computable, as we cannot decide if two types are equal.\\n- We can choose the type that we use for the `State`, as well as the primitives `State.read` and `State.alloc_write`, as long as they verify well-formedness properties. For example, reading after a write at the same address should return the written value. One should choose a `State` that simplifies its proofs the most. 
To verify the smart contract, we have taken a record with two fields:\\n 1. the storage of the contract (the `Self` type in Rust),\\n 2. the list of events logged by the contract.\\n- Even though the monad is in continuation-passing style, we add a primitive `M.Call` corresponding to a bind, to make explicit the points in the code where we call user-defined functions. This is not necessary but helpful to track things in the proofs. Otherwise, the monadic bind is defined as a fixpoint with:\\n ```coq\\n Fixpoint bind {A B : Set} (e1 : t A) (f : A -> t B) : t B :=\\n match e1 with\\n | Pure v => f v\\n | CallPrimitive primitive k =>\\n CallPrimitive primitive (fun v => bind (k v) f)\\n | Cast v k =>\\n Cast v (fun v\' => bind (k v\') f)\\n | Impossible => Impossible\\n end.\\n ```\\n- To handle the panic and `return`/`break` exceptions, we wrap our monad into an error monad:\\n ```coq\\n Definition M (A : Set) : Set :=\\n LowM (A + Exception.t).\\n ```\\n where `LowM` is the monad without errors as defined above and `Exception.t` is:\\n ```coq\\n Module Exception.\\n Inductive t : Set :=\\n (** exceptions for Rust\'s `return` *)\\n | Return {A : Set} : A -> t\\n (** exceptions for Rust\'s `continue` *)\\n | Continue : t\\n (** exceptions for Rust\'s `break` *)\\n | Break : t\\n | Panic : Coq.Strings.String.string -> t.\\n End Exception.\\n ```\\n\\n### Proof of equivalence\\n\\nTo prove that the equivalence between the simulation and the original code holds, we proceed by induction on the monadic code. This corresponds to symbolically evaluating the monadic code, in the proof mode of Coq, applying the primitives of the semantics predicate at each step. 
We use the following tactic to automate this work:\\n\\n```coq\\nrun_symbolic.\\n```\\n\\nWe manually handle the following cases:\\n\\n- branching (`if` or `match`),\\n- external function calls: generally, we apply an existing equivalence proof for a call to another function instead of doing the symbolic evaluation of the function,\\n- memory allocations: we need to choose the type of allocation (mutable or immutable) and the address of the allocation for mutable ones.\\n\\nHere is the proof for the `transfer` function:\\n\\n```coq\\nProof.\\n unfold erc20.Impl_erc20_Erc20_t_2.transfer,\\n Simulations.erc20.transfer,\\n lift_simulation.\\n Opaque erc20.transfer_from_to.\\n run_symbolic.\\n eapply Run.Call. {\\n apply run_env.\\n }\\n run_symbolic.\\n eapply Run.Call. {\\n apply Env.run_caller.\\n }\\n run_symbolic.\\n eapply Run.Call. {\\n now apply run_transfer_from_to.\\n }\\n unfold lift_simulation.\\n destruct erc20.transfer_from_to as [[] [?storage ?logs]]; run_symbolic.\\n Transparent erc20.transfer_from_to.\\nQed.\\n```\\n\\n## Proofs\\n\\n### Handling of integers\\n\\nWe distinguish the various types of integers used in Rust:\\n\\n- unsigned ones: `u8`, `u16`, `u32`, `u64`, `u128`, `usize`,\\n- signed ones: `i8`, `i16`, `i32`, `i64`, `i128`, `isize`.\\n\\nWe define a separate type for each of them, each being a wrapper around the `Z` type of unbounded integers from Coq:\\n\\n```coq\\nModule u8.\\n Inductive t : Set := Make (z : Z) : t.\\nEnd u8.\\n```\\n\\nTo enforce the bounds, we define a validity predicate for each type:\\n\\n```coq\\nModule Valid.\\n Definition t {A : Set} `{Integer.C A} (v : A) : Prop :=\\n Integer.min <= Integer.to_Z v <= Integer.max.\\nEnd Valid.\\n```\\n\\nAll integer types are instances of the class `Integer.C`, which provides `min`, `max`, and `to_Z`. We do not embed this predicate in the integer type ([refinement type](https://en.wikipedia.org/wiki/Refinement_type)) to avoid mixing proofs and code. 
We pay a cost by having to handle the values and the validity proofs separately.\\n\\nDepending on the configuration mode of Rust, integer operations can overflow or panic. We have several implementations of the arithmetic operations, depending on the mode:\\n\\n```coq\\nModule BinOp.\\n (** Operators with panic, in the monad. *)\\n Module Panic.\\n Definition add {A : Set} `{Integer.C A} (v1 v2 : A) : M A :=\\n (* ... *)\\n\\n Definition sub (* ... *)\\n End Panic.\\n\\n (** Operators with overflow, outside of the monad as\\n there cannot be any errors. *)\\n Module Wrap.\\n Definition add {A : Set} `{Integer.C A} (v1 v2 : A) : A :=\\n (* ... *)\\n\\n Definition sub (* ... *)\\n End Wrap.\\nEnd BinOp.\\n```\\n\\nWe also have additional operators, useful for the definition of simulations:\\n\\n- optimistic operators, operating on `Z` without checking the bounds of the result (for cases where we can prove that the result is never out of bounds),\\n- operators returning in the option monad, to handle the case where the result is out of bounds.\\n\\nNote that the comparison operators (`=`, `<`, ...) never panic or overflow. In the context of these smart contracts, the arithmetic operators are panicking in case of overflow.\\n\\n### Definition of messages\\n\\nWe can call the smart contract with three read primitives (`total_supply`, `balance_of`, `allowance`) and three write primitives (`transfer`, `approve`, `transfer_from`). We define two message types to formalize these access points. This will later allow us to express properties over all possible read and write messages:\\n\\n```coq\\nModule ReadMessage.\\n (** The type parameter is the type of result of the call. 
*)\\n Inductive t : Set -> Set :=\\n | total_supply :\\n t ltac:(erc20.Balance)\\n | balance_of\\n (owner : erc20.AccountId.t) :\\n t ltac:(erc20.Balance)\\n | allowance\\n (owner : erc20.AccountId.t)\\n (spender : erc20.AccountId.t) :\\n t ltac:(erc20.Balance).\\nEnd ReadMessage.\\n\\nModule WriteMessage.\\n Inductive t : Set :=\\n | transfer\\n (to : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance)) :\\n t\\n | approve\\n (spender : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance)) :\\n t\\n | transfer_from\\n (from : erc20.AccountId.t)\\n (to : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance)) :\\n t.\\nEnd WriteMessage.\\n```\\n\\n### No panics on read messages\\n\\nWe show that for all possible read messages, the smart contract does not panic:\\n\\n```coq\\nLemma read_message_no_panic\\n (env : erc20.Env.t)\\n (message : ReadMessage.t ltac:(erc20.Balance))\\n (storage : erc20.Erc20.t) :\\n let state := State.of_storage storage in\\n exists result,\\n {{ Environment.of_env env, state |\\n ReadMessage.dispatch message \u21d3\\n (* [inl] means success (no panics) *)\\n inl result\\n | state }}.\\n```\\n\\nThis is done by symbolic evaluation of the simulations:\\n\\n```coq\\nProof.\\n destruct message; simpl.\\n { eexists.\\n apply run_total_supply.\\n }\\n { eexists.\\n apply run_balance_of.\\n }\\n { eexists.\\n apply run_allowance.\\n }\\nQed.\\n```\\n\\n### Invariants\\n\\nThe data structure of the storage of the smart contract is as follows:\\n\\n```rust\\npub struct Erc20 {\\n total_supply: Balance,\\n balances: Mapping<AccountId, Balance>,\\n allowances: Mapping<(AccountId, AccountId), Balance>,\\n}\\n```\\n\\nAn invariant is that the total supply is always equal to the sum of all the balances in the `balances` mapping. 
We define this invariant in Coq as:\\n\\n```coq\\nDefinition sum_of_money (storage : erc20.Erc20.t) : Z :=\\n Lib.Mapping.sum Integer.to_Z storage.(erc20.Erc20.balances).\\n\\nModule Valid.\\n Definition t (storage : erc20.Erc20.t) : Prop :=\\n Integer.to_Z storage.(erc20.Erc20.total_supply) =\\n sum_of_money storage.\\nEnd Valid.\\n```\\n\\nWe show that this invariant holds for any output of the write messages, given that it holds for the input storage:\\n\\n```coq\\nLemma write_dispatch_is_valid\\n (env : erc20.Env.t)\\n (storage : erc20.Erc20.t)\\n (write_message : WriteMessage.t)\\n (H_storage : Erc20.Valid.t storage)\\n (H_write_message : WriteMessage.Valid.t write_message) :\\n let state := State.of_storage storage in\\n let \'(result, (storage, _)) :=\\n WriteMessage.simulation_dispatch env write_message (storage, []) in\\n match result with\\n | inl _ => Erc20.Valid.t storage\\n | _ => True\\n end.\\n```\\n\\nWe assume that the initial storage is valid with the hypothesis:\\n\\n```coq\\n(H_storage : Erc20.Valid.t storage)\\n```\\n\\nWe show the property in the case without panics with:\\n\\n```coq\\nmatch result with\\n | inl _ => ...\\n```\\n\\nWhen the smart contract panics (integer overflow), the storage is discarded anyway, and it might actually be invalid. For example, in the `transfer_from_to` function we have:\\n\\n```rust\\nself.balances.insert(*from, from_balance - value);\\nlet to_balance = self.balance_of_impl(to);\\nself.balances.insert(*to, to_balance + value);\\n```\\n\\nSo if there is a panic during the addition `+`, like an overflow, the final storage can have the `from` account modified but not the `to` account. In that case, the sum of the balances is no longer equal to the total supply.\\n\\n### Total supply is constant\\n\\nWe show that the total supply is also a constant, meaning that no calls to the smart contract can modify its value. 
The statement is the following:\\n\\n```coq\\nLemma write_dispatch_is_constant\\n (env : erc20.Env.t)\\n (storage : erc20.Erc20.t)\\n (write_message : WriteMessage.t) :\\n let state := State.of_storage storage in\\n let \'(result, (storage\', _)) :=\\n WriteMessage.simulation_dispatch env write_message (storage, []) in\\n match result with\\n | inl _ =>\\n storage.(erc20.Erc20.total_supply) =\\n storage\'.(erc20.Erc20.total_supply)\\n | _ => True\\n end.\\n```\\n\\nIt says that for any initial `storage` and `write_message` sent to the smart contract, if we return a result without panicking (`inl _`), then the total supply in the final storage `storage\'` is equal to the initial one. We verify this fact by symbolic evaluation of all the branches of the simulation. There are no difficulties in this proof as the code never modifies the `total_supply`.\\n\\n### Action from the logs\\n\\nWe infer the action of the smart contract on the storage from its logs. This characterizes exactly what modifications to the storage we can deduce from the logs. We define an action as a function from the storage to a set of possible new storages, that is, a relation between the initial and the final storage, given the knowledge of the logs of the contract:\\n\\n```coq\\nModule Action.\\n Definition t : Type := erc20.Erc20.t -> erc20.Erc20.t -> Prop.\\nEnd Action.\\n```\\n\\nThe main statement is the following:\\n\\n```coq\\nLemma retrieve_action_from_logs\\n (env : erc20.Env.t)\\n (storage : erc20.Erc20.t)\\n (write_message : WriteMessage.t)\\n (events : list erc20.Event.t) :\\n match\\n WriteMessage.simulation_dispatch env write_message (storage, [])\\n with\\n | (inl (result.Result.Ok tt), (storage\', events)) =>\\n action_of_events events storage storage\'\\n | _ => True\\n end.\\n```\\n\\nThis relates the final storage `storage\'` to the initial storage `storage` using the logs `events` when there are no panics. 
We define the `action_of_events` predicate as the successive application of the `action_of_event` predicate, which is defined as:\\n\\n```coq\\nDefinition action_of_event (event : erc20.Event.t) : Action.t :=\\n fun storage storage\' =>\\n match event with\\n | erc20.Event.Transfer (erc20.Transfer.Build_t\\n (option.Option.Some from)\\n (option.Option.Some to)\\n value\\n ) =>\\n (* In case of transfer event, we do not know how the allowances are\\n updated. *)\\n exists allowances\',\\n storage\' =\\n storage <|\\n erc20.Erc20.balances := balances_of_transfer storage from to value\\n |> <|\\n erc20.Erc20.allowances := allowances\'\\n |>\\n | erc20.Event.Transfer (erc20.Transfer.Build_t _ _ _) => False\\n | erc20.Event.Approval (erc20.Approval.Build_t owner spender value) =>\\n storage\' =\\n storage <|\\n erc20.Erc20.allowances :=\\n Lib.Mapping.insert (owner, spender) value\\n storage.(erc20.Erc20.allowances)\\n |>\\n end.\\n```\\n\\nWhen the `event` in the logs is of kind `erc20.Event.Transfer`, the resulting storage has:\\n\\n- the `balances` updated according to the function `balances_of_transfer`;\\n- the `allowances` updated to an unknown value `allowances\'`.\\n\\nWhen the `event` in the logs is of kind `erc20.Event.Approval`, the resulting storage has:\\n\\n- the `allowances` updated calling `Lib.Mapping.insert` on `(owner, spender)`;\\n- the `balances` unchanged.\\n\\n### Approve only on caller\\n\\nWe added one last proof to say that when the `approve` function succeeds, it only modifies the allowance of the caller:\\n\\n```coq\\nLemma approve_only_changes_owner_allowance\\n (env : erc20.Env.t)\\n (storage : erc20.Erc20.t)\\n (spender : erc20.AccountId.t)\\n (value : ltac:(erc20.Balance)) :\\n let \'(result, (storage\', _)) :=\\n Simulations.erc20.approve env spender value (storage, []) in\\n match result with\\n | inl (result.Result.Ok tt) =>\\n forall owner spender,\\n Integer.to_Z (Simulations.erc20.allowance storage\' owner spender) <>\\n 
Integer.to_Z (Simulations.erc20.allowance storage owner spender) ->\\n owner = Simulations.erc20.Env.caller env\\n | _ => True\\n end.\\n```\\n\\nIf an allowance changes after the call to `approve`, then the owner of the allowance is the caller of the smart contract. This is done by symbolic evaluation of the simulation.\\n\\n## Conclusion\\n\\nIn this example, we have shown how we formally verify the ERC-20 smart contract written in Rust for the [Aleph Zero](https://alephzero.org/) project. Formally verifying smart contracts is extremely important as they can hold a lot of money, and a single bug can prove fatal as recent attacks continue to show: [List of crypto hacks in 2023](https://www.ccn.com/education/crypto-hacks-2023-full-list-of-scams-and-exploits-as-millions-go-missing/).\\n\\nIf you have Rust smart contracts to verify, feel free to email us at [contact@formal.land](mailto:contact@formal.land). We will be happy to help!"},{"id":"/2023/11/26/rust-function-body","metadata":{"permalink":"/blog/2023/11/26/rust-function-body","source":"@site/blog/2023-11-26-rust-function-body.md","title":"\ud83e\udd80 Translation of function bodies from Rust to Coq","description":"Our tool coq-of-rust enables formal verification of \ud83e\udd80 Rust code, to make sure that a program has no bugs given a precise specification. 
We work by translating Rust programs to the general proof system \ud83d\udc13 Coq.","date":"2023-11-26T00:00:00.000Z","formattedDate":"November 26, 2023","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"}],"readingTime":4.975,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Translation of function bodies from Rust to Coq","tags":["coq-of-rust","Rust","Coq"],"authors":[]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Verifying an ERC-20 smart contract in Rust","permalink":"/blog/2023/12/13/rust-verify-erc-20-smart-contract"},"nextItem":{"title":"\ud83e\udd80 Optimizing Rust translation to Coq with THIR and bundled traits","permalink":"/blog/2023/11/08/rust-thir-and-bundled-traits"}},"content":"Our tool [coq-of-rust](https://github.com/formal-land/coq-of-rust) enables formal verification of [\ud83e\udd80 Rust](https://www.rust-lang.org/) code, to make sure that a program has no bugs given a precise specification. We work by translating Rust programs to the general proof system [\ud83d\udc13 Coq](https://coq.inria.fr/).\\n\\nHere, we present how we translate function bodies from Rust to Coq in an example. We also show some of the optimizations we made to reduce the size of the translation.\\n\\n\x3c!-- truncate --\x3e\\n\\n:::tip Purchase\\n\\nIf you need to formally verify your Rust codebase to improve the security of your application, email us at [contact@formal.land](mailto:contact@formal.land)!\\n\\n:::\\n\\n![Rust and Coq](2023-11-26/rust_and_coq.png)\\n\\n## Translating a function body\\n\\nWe take the following Rust example as input:\\n\\n```rust\\n// fn balance_of_impl(&self, owner: &AccountId) -> Balance { ... 
}\\n\\nfn balance_of(&self, owner: AccountId) -> Balance {\\n self.balance_of_impl(&owner)\\n}\\n```\\n\\nHere is the corresponding Coq code that `coq-of-rust` generates _without optimizations_:\\n\\n```coq\\nDefinition balance_of\\n (self : ref ltac:(Self))\\n (owner : erc20.AccountId.t)\\n : M ltac:(erc20.Balance) :=\\n let* self : M.Val (ref ltac:(Self)) := M.alloc self in\\n let* owner : M.Val erc20.AccountId.t := M.alloc owner in\\n let* \u03b10 : ref erc20.Erc20.t := M.read self in\\n let* \u03b11 : M.Val erc20.Erc20.t := deref \u03b10 in\\n let* \u03b12 : ref erc20.Erc20.t := borrow \u03b11 in\\n let* \u03b13 : M.Val (ref erc20.Erc20.t) := M.alloc \u03b12 in\\n let* \u03b14 : ref erc20.Erc20.t := M.read \u03b13 in\\n let* \u03b15 : ref erc20.AccountId.t := borrow owner in\\n let* \u03b16 : M.Val (ref erc20.AccountId.t) := M.alloc \u03b15 in\\n let* \u03b17 : ref erc20.AccountId.t := M.read \u03b16 in\\n let* \u03b18 : M.Val erc20.AccountId.t := deref \u03b17 in\\n let* \u03b19 : ref erc20.AccountId.t := borrow \u03b18 in\\n let* \u03b110 : M.Val (ref erc20.AccountId.t) := M.alloc \u03b19 in\\n let* \u03b111 : ref erc20.AccountId.t := M.read \u03b110 in\\n let* \u03b112 : u128.t := erc20.Erc20.t::[\\"balance_of_impl\\"] \u03b14 \u03b111 in\\n let* \u03b113 : M.Val u128.t := M.alloc \u03b112 in\\n M.read \u03b113.\\n```\\n\\nThis code is much more verbose than the original Rust code as we make all pointer manipulations explicit. We will see just after how to simplify it. We start with the function declaration:\\n\\n```coq\\nDefinition balance_of\\n (self : ref ltac:(Self))\\n (owner : erc20.AccountId.t)\\n : M ltac:(erc20.Balance) :=\\n```\\n\\nthat repeats the parameters in the Rust source. Note that the final result is wrapped into the monad type `M`. This is a monad representing all the side-effects used in Rust programs (state, panic, non-termination, ...). 
Then, we allocate all the function parameters:\\n\\n```coq\\n let* self : M.Val (ref ltac:(Self)) := M.alloc self in\\n let* owner : M.Val erc20.AccountId.t := M.alloc owner in\\n```\\n\\nThis ensures that both `self` and `owner` have an address in memory, in case we borrow them later. This allocation is also fresh, so we cannot access the address of the values from the caller by mistake. We use the monadic let `let*` as allocations can modify the memory state.\\n\\nThen we continue with the body of the function itself. We do all the necessary pointer manipulations to compute the parameters `self` and `&owner` of the function `balance_of_impl`. These representations are directly taken from the abstract syntax tree of the Rust compiler (using the [THIR](https://rustc-dev-guide.rust-lang.org/thir.html) version).\\n\\nFor example, for the first parameter `self`, named `\u03b14` in this translation, we do:\\n\\n```coq\\n let* \u03b10 : ref erc20.Erc20.t := M.read self in\\n let* \u03b11 : M.Val erc20.Erc20.t := deref \u03b10 in\\n let* \u03b12 : ref erc20.Erc20.t := borrow \u03b11 in\\n let* \u03b13 : M.Val (ref erc20.Erc20.t) := M.alloc \u03b12 in\\n let* \u03b14 : ref erc20.Erc20.t := M.read \u03b13 in\\n```\\n\\nWe combine the operators:\\n\\n- `M.read`: to get a value of type `A` from a value with an address `M.Val A`,\\n- `deref`: to get the value with an address `M.Val A` pointed to by a reference `ref A`,\\n- `borrow`: to get the reference `ref A` to a value with an address `M.Val A`,\\n- `M.alloc`: to allocate a new value `A` in memory, returning a value with address `M.Val A`.\\n\\nWe do the same to compute the second parameter `&owner` of `balance_of_impl` with:\\n\\n```coq\\n let* \u03b15 : ref erc20.AccountId.t := borrow owner in\\n let* \u03b16 : M.Val (ref erc20.AccountId.t) := M.alloc \u03b15 in\\n let* \u03b17 : ref erc20.AccountId.t := M.read \u03b16 in\\n let* \u03b18 : M.Val erc20.AccountId.t := deref \u03b17 in\\n let* \u03b19 : ref erc20.AccountId.t := 
borrow \u03b18 in\\n let* \u03b110 : M.Val (ref erc20.AccountId.t) := M.alloc \u03b19 in\\n let* \u03b111 : ref erc20.AccountId.t := M.read \u03b110 in\\n```\\n\\nFinally, we call the `balance_of_impl` function and return the result:\\n\\n```coq\\n let* \u03b112 : u128.t := erc20.Erc20.t::[\\"balance_of_impl\\"] \u03b14 \u03b111 in\\n let* \u03b113 : M.Val u128.t := M.alloc \u03b112 in\\n M.read \u03b113.\\n```\\n\\nWe do not keep the address of the result, as it will be allocated again by the caller function.\\n\\n## Optimizations\\n\\nSome operations can always be removed, namely:\\n\\n- `M.read (M.alloc v) ==> v`: we do not need to allocate and give an address to a value if it will be immediately read,\\n- `deref (borrow v) ==> v` and `borrow (deref v) ==> v`: the borrowing and dereferencing operators are inverses of each other, so they cancel out. We need to be careful about the mutability status of the borrowing and dereferencing.\\n\\nApplying these simplification rules, we get the following slimmed-down translation:\\n\\n```coq\\nDefinition balance_of\\n (self : ref ltac:(Self))\\n (owner : erc20.AccountId.t)\\n : M ltac:(erc20.Balance) :=\\n let* self : M.Val (ref ltac:(Self)) := M.alloc self in\\n let* owner : M.Val erc20.AccountId.t := M.alloc owner in\\n let* \u03b10 : ref erc20.Erc20.t := M.read self in\\n let* \u03b11 : ref erc20.AccountId.t := borrow owner in\\n erc20.Erc20.t::[\\"balance_of_impl\\"] \u03b10 \u03b11.\\n```\\n\\nThis is much shorter and easier to verify!\\n\\n## Conclusion\\n\\nWe have illustrated with an example how we translate a simple function from Rust to Coq. In this example, we saw how the pointer operations are made explicit in the abstract syntax tree of Rust, and how we simplify them for the frequent cases.\\n\\nIf you have any comments or suggestions, feel free to email us at [contact@formal.land](mailto:contact@formal.land). 
In future posts, we will go into more detail about the verification process itself."},{"id":"/2023/11/08/rust-thir-and-bundled-traits","metadata":{"permalink":"/blog/2023/11/08/rust-thir-and-bundled-traits","source":"@site/blog/2023-11-08-rust-thir-and-bundled-traits.md","title":"\ud83e\udd80 Optimizing Rust translation to Coq with THIR and bundled traits","description":"We continued our work on coq-of-rust, a tool to formally verify Rust programs using the proof system Coq \ud83d\udc13. This tool translates Rust programs to an equivalent Coq program, which can then be verified using Coq\'s proof assistant. It opens the door to building mathematically proven bug-free Rust programs.","date":"2023-11-08T00:00:00.000Z","formattedDate":"November 8, 2023","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"trait","permalink":"/blog/tags/trait"},{"label":"THIR","permalink":"/blog/tags/thir"},{"label":"HIR","permalink":"/blog/tags/hir"}],"readingTime":5.22,"hasTruncateMarker":true,"authors":[{"name":"Guillaume Claret"}],"frontMatter":{"title":"\ud83e\udd80 Optimizing Rust translation to Coq with THIR and bundled traits","tags":["coq-of-rust","Rust","Coq","trait","THIR","HIR"],"author":"Guillaume Claret"},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Translation of function bodies from Rust to Coq","permalink":"/blog/2023/11/26/rust-function-body"},"nextItem":{"title":"\ud83e\udd80 Trait representation in Coq","permalink":"/blog/2023/08/25/trait-representation-in-coq"}},"content":"We continued our work on [coq-of-rust](https://github.com/formal-land/coq-of-rust), a tool to formally verify [Rust](https://www.rust-lang.org/) programs using the proof system [Coq \ud83d\udc13](https://coq.inria.fr/). This tool translates Rust programs to an equivalent Coq program, which can then be verified using Coq\'s proof assistant. 
It opens the door to building mathematically proven bug-free Rust programs.\\n\\nWe present two main improvements we made to `coq-of-rust`:\\n\\n- Using the THIR intermediate language of Rust to have more information during the translation to Coq.\\n- Bundling the type-classes representing the traits of Rust to have faster type-checking in Coq.\\n\\n\x3c!-- truncate --\x3e\\n\\n![Rust and Coq](2023-11-08/rust_and_coq.png)\\n\\n## THIR intermediate language\\n\\nTo translate Rust programs to Coq, we plug into the compiler of Rust, which operates on a series of intermediate languages:\\n\\n- source code (`.rs` files);\\n- abstract syntax tree (AST): immediately after parsing;\\n- [High-Level Intermediate Representation](https://rustc-dev-guide.rust-lang.org/hir.html) (HIR): after macro expansion, with name resolution and close to the AST;\\n- [Typed High-Level Intermediate Representation](https://rustc-dev-guide.rust-lang.org/thir.html) (THIR): after the type-checking;\\n- [Mid-level Intermediate Representation](https://rustc-dev-guide.rust-lang.org/mir/index.html) (MIR): low-level representation based on a [control-flow graph](https://en.wikipedia.org/wiki/Control-flow_graph), inlining traits and polymorphic functions, and with [borrow checking](https://doc.rust-lang.org/book/ch04-02-references-and-borrowing.html);\\n- machine code (assembly, LLVM IR, ...).\\n\\nWe were previously using the HIR language to start our translation to Coq, because it is not too low-level and close to what the user has originally in the `.rs` file. This helps relate the generated Coq code to the original Rust code.\\n\\nHowever, at the level of HIR, there is still a lot of implicit information. For example, Rust has [automatic dereferencing rules](https://users.rust-lang.org/t/automatic-dereferencing/53828) that are not yet explicit in HIR. 
In order not to make any mistakes during our translation to Coq, we prefer to use the next representation, THIR, that makes explicit such rules.\\n\\nIn addition, the THIR representation shows when a method call is from a trait (and which trait) or from a standalone `impl` block. Given that we still have trouble translating the traits with [type-classes](https://coq.inria.fr/doc/V8.18.0/refman/addendum/type-classes.html) that are inferrable by Coq, this helps a lot.\\n\\nA downside of the THIR representation is that it is much more verbose. For example, here is a formatting function generated from HIR:\\n\\n```coq\\nDefinition fmt\\n `{\u210b : State.Trait}\\n (self : ref Self)\\n (f : mut_ref core.fmt.Formatter)\\n : M core.fmt.Result :=\\n let* \u03b10 := format_argument::[\\"new_display\\"] (addr_of self.[\\"radius\\"]) in\\n let* \u03b11 :=\\n format_arguments::[\\"new_v1\\"]\\n (addr_of [ \\"Circle of radius \\" ])\\n (addr_of [ \u03b10 ]) in\\n f.[\\"write_fmt\\"] \u03b11.\\n```\\n\\nThis is the kind of functions generated by the `#[derive(Debug)]` macro of Rust, to implement a formatting function on a type. 
Here is the version translated from THIR, with explicit borrowing and dereferencing:\\n\\n```coq\\nDefinition fmt\\n `{\u210b : State.Trait}\\n (self : ref Self)\\n (f : mut_ref core.fmt.Formatter)\\n : M ltac:(core.fmt.Result) :=\\n let* \u03b10 := deref f core.fmt.Formatter in\\n let* \u03b11 := borrow_mut \u03b10 core.fmt.Formatter in\\n let* \u03b12 := borrow [ mk_str \\"Circle of radius \\" ] (list (ref str)) in\\n let* \u03b13 := deref \u03b12 (list (ref str)) in\\n let* \u03b14 := borrow \u03b13 (list (ref str)) in\\n let* \u03b15 := pointer_coercion \\"Unsize\\" \u03b14 in\\n let* \u03b16 := deref self converting_to_string.Circle in\\n let* \u03b17 := \u03b16.[\\"radius\\"] in\\n let* \u03b18 := borrow \u03b17 i32 in\\n let* \u03b19 := deref \u03b18 i32 in\\n let* \u03b110 := borrow \u03b19 i32 in\\n let* \u03b111 := core.fmt.rt.Argument::[\\"new_display\\"] \u03b110 in\\n let* \u03b112 := borrow [ \u03b111 ] (list core.fmt.rt.Argument) in\\n let* \u03b113 := deref \u03b112 (list core.fmt.rt.Argument) in\\n let* \u03b114 := borrow \u03b113 (list core.fmt.rt.Argument) in\\n let* \u03b115 := pointer_coercion \\"Unsize\\" \u03b114 in\\n let* \u03b116 := core.fmt.Arguments::[\\"new_v1\\"] \u03b15 \u03b115 in\\n core.fmt.Formatter::[\\"write_fmt\\"] \u03b11 \u03b116.\\n```\\n\\nWe went from a function having two intermediate variables to seventeen intermediate variables. This code is much more verbose, but it is also more explicit. In particular, it details when the:\\n\\n- borrowing (going from a value of type `T` to `&T`), and the\\n- dereferencing (going from a value of type `&T` to `T`)\\n\\noccur. It also shows that the method `write_fmt` is a method from the implementation of the type `core.fmt.Formatter`, generating:\\n\\n```coq\\ncore.fmt.Formatter::[\\"write_fmt\\"] \u03b11 \u03b116\\n```\\n\\ninstead of:\\n\\n```coq\\nf.[\\"write_fmt\\"] \u03b11\\n```\\n\\n## Bundled traits\\n\\nSome Rust codebases can have a lot of traits. 
For example in [paritytech/ink/crates/env/src/types.rs](https://github.com/paritytech/ink/blob/ccb38d2c3ac27523fe3108f2bb7bffbbe908cdb7/crates/env/src/types.rs#L120) the trait `Environment` references more than forty other traits:\\n\\n```rust\\npub trait Environment: Clone {\\n const MAX_EVENT_TOPICS: usize;\\n\\n type AccountId: \'static\\n + scale::Codec\\n + CodecAsType\\n + Clone\\n + PartialEq\\n + ...;\\n\\n type Balance: \'static\\n + scale::Codec\\n + CodecAsType\\n + ...;\\n\\n ...\\n```\\n\\nWe first used an unbundled approach to represent this trait by a type-class in Coq, as it felt more natural:\\n\\n```coq\\nModule Environment.\\n Class Trait (Self : Set) `{Clone.Trait Self}\\n {AccountId : Set}\\n `{scale.Codec.Trait AccountId}\\n `{CodecAsType AccountId}\\n `{Clone AccountId}\\n `{PartialEq AccountId}\\n ...\\n```\\n\\nHowever, the backquote operator generated too many implicit arguments, and the type-checker of Coq was very slow. We then switched to a bundled approach, as advocated in this blog post: [Exponential blowup when using unbundled typeclasses to model algebraic hierarchies](https://www.ralfj.de/blog/2019/05/15/typeclasses-exponential-blowup.html). The Coq code for this trait now looks like this:\\n\\n```coq\\nModule Environment.\\n Class Trait `{\u210b : State.Trait} (Self : Set) : Type := {\\n \u210b_0 :: Clone.Trait Self;\\n MAX_EVENT_TOPICS : usize;\\n AccountId : Set;\\n \u2112_0 :: parity_scale_codec.codec.Codec.Trait AccountId;\\n \u2112_1 :: ink_env.types.CodecAsType.Trait AccountId;\\n \u2112_2 :: core.clone.Clone.Trait AccountId;\\n \u2112_3 ::\\n core.cmp.PartialEq.Trait AccountId\\n (Rhs := core.cmp.PartialEq.Default.Rhs AccountId);\\n ...;\\n Balance : Set;\\n \u2112_8 :: parity_scale_codec.codec.Codec.Trait Balance;\\n \u2112_9 :: ink_env.types.CodecAsType.Trait Balance;\\n ...;\\n\\n ...\\n```\\n\\nWe use the notation `::` for fields that are trait instances. 
With this approach, traits have types as parameters but no other traits.\\n\\nThe type-checking is now much faster, and in particular, we avoid some cases with exponential blowup or non-terminating type-checking. But this is not a perfect solution as we still have cases where the instance inference does not terminate or fails with hard-to-understand error messages.\\n\\n## Conclusion\\n\\nWe have illustrated here some improvements we recently made to our [coq-of-rust](https://github.com/formal-land/coq-of-rust) translator for two key areas:\\n\\n- the translation of traits;\\n- the translation of the implicit borrowing and dereferencing, that can occur every time we call a function.\\n\\nThese improvements will allow us to formally verify some more complex Rust codebases. In particular, we are applying `coq-of-rust` to verify smart contracts written for the [ink!](https://use.ink/) platform, that is a subset of Rust.\\n\\n:::tip Contact\\n\\nIf you have comments, similar experiences to share, or wish to formally verify your codebase to improve the security of your application, contact us at [contact@formal.land](mailto:contact@formal.land)!\\n\\n:::"},{"id":"/2023/08/25/trait-representation-in-coq","metadata":{"permalink":"/blog/2023/08/25/trait-representation-in-coq","source":"@site/blog/2023-08-25-trait-representation-in-coq.md","title":"\ud83e\udd80 Trait representation in Coq","description":"In our project coq-of-rust we translate programs written in Rust to equivalent programs in the language of the proof system Coq \ud83d\udc13, which will later allow us to formally verify them.","date":"2023-08-25T00:00:00.000Z","formattedDate":"August 25, 2023","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"trait","permalink":"/blog/tags/trait"}],"readingTime":7.58,"hasTruncateMarker":true,"authors":[{"name":"Bart\u0142omiej 
Kr\xf3likowski"}],"frontMatter":{"title":"\ud83e\udd80 Trait representation in Coq","tags":["coq-of-rust","Rust","Coq","trait"],"author":"Bart\u0142omiej Kr\xf3likowski"},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Optimizing Rust translation to Coq with THIR and bundled traits","permalink":"/blog/2023/11/08/rust-thir-and-bundled-traits"},"nextItem":{"title":"\ud83e\udd80 Monad for side effects in Rust","permalink":"/blog/2023/05/28/monad-for-side-effects-in-rust"}},"content":"In our project [coq-of-rust](https://github.com/formal-land/coq-of-rust) we translate programs written in [Rust](https://www.rust-lang.org/) to equivalent programs in the language of the proof system [Coq \ud83d\udc13](https://coq.inria.fr/), which will later allow us to formally verify them.\\nBoth Coq and Rust have many unique features, and there are many differences between them, so in the process of translation we need to treat the case of each language construction separately.\\nIn this post, we discuss how we translate the most complicated one: [traits](https://doc.rust-lang.org/book/ch10-02-traits.html).\\n\\n\x3c!-- truncate --\x3e\\n\\n## \ud83e\udd80 Traits in Rust\\n\\nTrait is the way to define a shared behaviour for a group of types in Rust.\\nTo define a trait we have to specify a list of signatures of the methods we want to be implemented for the types implementing our trait.\\nWe can also create a generic definition of a trait with the same syntax as in every Rust definition.\\nOptionally, we can add a default implementation to any method or extend the list with associated types.\\nTraits can also extend a behaviour of one or more other traits, in which case, to implement a trait for a type we would have to implement all its supertraits first.\\n\\nConsider the following example (adapted from the [Rust Book](https://doc.rust-lang.org/book/)):\\n\\n```rust\\nstruct Sheep {\\n naked: bool,\\n name: &\'static str,\\n}\\n\\ntrait Animal {\\n // Associated function 
signature; `Self` refers to the implementor type.\\n fn new(name: &\'static str) -> Self;\\n\\n // Method signatures; these will return a string.\\n fn name(&self) -> &\'static str;\\n fn noise(&self) -> &\'static str;\\n\\n // Traits can provide default method definitions.\\n fn talk(&self) {\\n println!(\\"{} says {}\\", self.name(), self.noise());\\n }\\n}\\n\\nimpl Sheep {\\n fn is_naked(&self) -> bool {\\n self.naked\\n }\\n}\\n\\n// Implement the `Animal` trait for `Sheep`.\\nimpl Animal for Sheep {\\n // `Self` is the implementor type: `Sheep`.\\n fn new(name: &\'static str) -> Sheep {\\n Sheep {\\n name: name,\\n naked: false,\\n }\\n }\\n\\n fn name(&self) -> &\'static str {\\n self.name\\n }\\n\\n fn noise(&self) -> &\'static str {\\n if self.is_naked() {\\n \\"baaaaah?\\"\\n } else {\\n \\"baaaaah!\\"\\n }\\n }\\n\\n // Default trait methods can be overridden.\\n fn talk(&self) {\\n // For example, we can add some quiet contemplation.\\n println!(\\"{} pauses briefly... {}\\", self.name, self.noise());\\n }\\n}\\n\\nimpl Sheep {\\n fn shear(&mut self) {\\n if self.is_naked() {\\n // Implementor methods can use the implementor\'s trait methods.\\n println!(\\"{} is already naked...\\", self.name());\\n } else {\\n println!(\\"{} gets a haircut!\\", self.name);\\n\\n self.naked = true;\\n }\\n }\\n}\\n\\nfn main() {\\n // Type annotation is necessary in this case.\\n let mut dolly: Sheep = Animal::new(\\"Dolly\\");\\n\\n dolly.talk();\\n dolly.shear();\\n dolly.talk();\\n}\\n```\\n\\nWe have a type `Sheep`, a trait `Animal`, and an implementation of `Animal` for `Sheep`.\\nAs we can see in `main`, after a trait is implemented for a type, we can use the methods of the trait like normal methods of the type.\\n\\n## Our translation\\n\\nRust\'s notion of a trait is very similar to the concept of [typeclasses](https://en.wikipedia.org/wiki/Type_class) in [functional programming](https://en.wikipedia.org/wiki/Functional_programming).\\nTypeclasses are also present 
in Coq, so translation of this construction is quite straightforward.\\n\\nFor a given trait we create a typeclass with fields being just translated signatures of the methods of the trait.\\nTo allow for the use of method syntax, we also define instances of `Notation.Dot` for every method name of the trait.\\nWe also add a parameter of type `Set` for every type parameter of the trait and translate trait bounds of the types into equivalent typeclass parameters.\\n\\n## Translation of associated types\\n\\nAssociated types are a bit harder than methods to translate, because it is possible to use `::` notation to access them.\\nFor that purpose, we created another typeclass in `Notation` module:\\n\\n```coq\\nClass DoubleColonType {Kind : Type} (type : Kind) (name : string) : Type := {\\n double_colon_type : Set;\\n}.\\n```\\n\\nwith a notation:\\n\\n```coq\\nNotation \\"e1 ::type[ e2 ]\\" := (Notation.double_colon_type e1 e2)\\n (at level 0).\\n```\\n\\nFor every associated type, we create a parameter and a field of the typeclass resulting from the trait translation, and below, we create an instance of `Notation.DoubleColonType`.\\n\\n## The example in Coq\\n\\nHere is our Coq translation of the example code above:\\n\\n```coq\\n(* Generated by coq-of-rust *)\\nRequire Import CoqOfRust.CoqOfRust.\\n\\nModule Sheep.\\n Unset Primitive Projections.\\n Record t : Set := {\\n naked : bool;\\n name : ref str;\\n }.\\n Global Set Primitive Projections.\\n\\n Global Instance Get_naked : Notation.Dot \\"naked\\" := {\\n Notation.dot \'(Build_t x0 _) := x0;\\n }.\\n Global Instance Get_name : Notation.Dot \\"name\\" := {\\n Notation.dot \'(Build_t _ x1) := x1;\\n }.\\nEnd Sheep.\\nDefinition Sheep : Set := @Sheep.t.\\n\\nModule Animal.\\n Class Trait (Self : Set) : Set := {\\n new `{H : State.Trait} : (ref str) -> (M (H := H) Self);\\n name `{H : State.Trait} : (ref Self) -> (M (H := H) (ref str));\\n noise `{H : State.Trait} : (ref Self) -> (M (H := H) (ref str));\\n 
}.\\n\\n Global Instance Method_new `{H : State.Trait} `(Trait)\\n : Notation.Dot \\"new\\" := {\\n Notation.dot := new;\\n }.\\n Global Instance Method_name `{H : State.Trait} `(Trait)\\n : Notation.Dot \\"name\\" := {\\n Notation.dot := name;\\n }.\\n Global Instance Method_noise `{H : State.Trait} `(Trait)\\n : Notation.Dot \\"noise\\" := {\\n Notation.dot := noise;\\n }.\\n Global Instance Method_talk `{H : State.Trait} `(Trait)\\n : Notation.Dot \\"talk\\" := {\\n Notation.dot (self : ref Self):=\\n (let* _ :=\\n let* _ :=\\n let* \u03b10 := self.[\\"name\\"] in\\n let* \u03b11 := format_argument::[\\"new_display\\"] (addr_of \u03b10) in\\n let* \u03b12 := self.[\\"noise\\"] in\\n let* \u03b13 := format_argument::[\\"new_display\\"] (addr_of \u03b12) in\\n let* \u03b14 :=\\n format_arguments::[\\"new_v1\\"]\\n (addr_of [ \\"\\"; \\" says \\"; \\"\\n\\" ])\\n (addr_of [ \u03b11; \u03b13 ]) in\\n std.io.stdio._print \u03b14 in\\n Pure tt in\\n Pure tt\\n : M (H := H) unit);\\n }.\\nEnd Animal.\\n\\nModule Impl_traits_Sheep.\\n Definition Self := traits.Sheep.\\n\\n Definition is_naked `{H : State.Trait} (self : ref Self) : M (H := H) bool :=\\n Pure self.[\\"naked\\"].\\n\\n Global Instance Method_is_naked `{H : State.Trait} :\\n Notation.Dot \\"is_naked\\" := {\\n Notation.dot := is_naked;\\n }.\\nEnd Impl_traits_Sheep.\\n\\nModule Impl_traits_Animal_for_traits_Sheep.\\n Definition Self := traits.Sheep.\\n\\n Definition new\\n `{H : State.Trait}\\n (name : ref str)\\n : M (H := H) traits.Sheep :=\\n Pure {| traits.Sheep.name := name; traits.Sheep.naked := false; |}.\\n\\n Global Instance AssociatedFunction_new `{H : State.Trait} :\\n Notation.DoubleColon Self \\"new\\" := {\\n Notation.double_colon := new;\\n }.\\n\\n Definition name `{H : State.Trait} (self : ref Self) : M (H := H) (ref str) :=\\n Pure self.[\\"name\\"].\\n\\n Global Instance Method_name `{H : State.Trait} : Notation.Dot \\"name\\" := {\\n Notation.dot := name;\\n }.\\n\\n Definition noise\\n 
`{H : State.Trait}\\n (self : ref Self)\\n : M (H := H) (ref str) :=\\n let* \u03b10 := self.[\\"is_naked\\"] in\\n if (\u03b10 : bool) then\\n Pure \\"baaaaah?\\"\\n else\\n Pure \\"baaaaah!\\".\\n\\n Global Instance Method_noise `{H : State.Trait} : Notation.Dot \\"noise\\" := {\\n Notation.dot := noise;\\n }.\\n\\n Definition talk `{H : State.Trait} (self : ref Self) : M (H := H) unit :=\\n let* _ :=\\n let* _ :=\\n let* \u03b10 := format_argument::[\\"new_display\\"] (addr_of self.[\\"name\\"]) in\\n let* \u03b11 := self.[\\"noise\\"] in\\n let* \u03b12 := format_argument::[\\"new_display\\"] (addr_of \u03b11) in\\n let* \u03b13 :=\\n format_arguments::[\\"new_v1\\"]\\n (addr_of [ \\"\\"; \\" pauses briefly... \\"; \\"\\n\\" ])\\n (addr_of [ \u03b10; \u03b12 ]) in\\n std.io.stdio._print \u03b13 in\\n Pure tt in\\n Pure tt.\\n\\n Global Instance Method_talk `{H : State.Trait} : Notation.Dot \\"talk\\" := {\\n Notation.dot := talk;\\n }.\\n\\n Global Instance I : traits.Animal.Trait Self := {\\n traits.Animal.new `{H : State.Trait} := new;\\n traits.Animal.name `{H : State.Trait} := name;\\n traits.Animal.noise `{H : State.Trait} := noise;\\n }.\\nEnd Impl_traits_Animal_for_traits_Sheep.\\n\\nModule Impl_traits_Sheep_3.\\n Definition Self := traits.Sheep.\\n\\n Definition shear `{H : State.Trait} (self : mut_ref Self) : M (H := H) unit :=\\n let* \u03b10 := self.[\\"is_naked\\"] in\\n if (\u03b10 : bool) then\\n let* _ :=\\n let* _ :=\\n let* \u03b10 := self.[\\"name\\"] in\\n let* \u03b11 := format_argument::[\\"new_display\\"] (addr_of \u03b10) in\\n let* \u03b12 :=\\n format_arguments::[\\"new_v1\\"]\\n (addr_of [ \\"\\"; \\" is already naked...\\n\\" ])\\n (addr_of [ \u03b11 ]) in\\n std.io.stdio._print \u03b12 in\\n Pure tt in\\n Pure tt\\n else\\n let* _ :=\\n let* _ :=\\n let* \u03b10 := format_argument::[\\"new_display\\"] (addr_of self.[\\"name\\"]) in\\n let* \u03b11 :=\\n format_arguments::[\\"new_v1\\"]\\n (addr_of [ \\"\\"; \\" gets a haircut!\\n\\" 
])\\n (addr_of [ \u03b10 ]) in\\n std.io.stdio._print \u03b11 in\\n Pure tt in\\n let* _ := assign self.[\\"naked\\"] true in\\n Pure tt.\\n\\n Global Instance Method_shear `{H : State.Trait} : Notation.Dot \\"shear\\" := {\\n Notation.dot := shear;\\n }.\\nEnd Impl_traits_Sheep_3.\\n\\n(* #[allow(dead_code)] - function was ignored by the compiler *)\\nDefinition main `{H : State.Trait} : M (H := H) unit :=\\n let* dolly :=\\n let* \u03b10 := traits.Animal.new \\"Dolly\\" in\\n Pure (\u03b10 : traits.Sheep) in\\n let* _ := dolly.[\\"talk\\"] in\\n let* _ := dolly.[\\"shear\\"] in\\n let* _ := dolly.[\\"talk\\"] in\\n Pure tt.\\n```\\n\\nAs we can see, the trait `Animal` is translated to a module `Animal`. Every time we want to refer to the trait we use the name `Trait` or `Animal.Trait`, depending on whether we do it inside or outside its module.\\n\\n## Conclusion\\n\\nTraits are similar enough to Coq classes to make the translation relatively intuitive.\\nThe only hard case is a translation of associated types, for which we need a special notation.\\n\\n:::tip Contact\\n\\nIf you have a Rust codebase that you wish to formally verify, or need advice in your work, contact us at [contact@formal.land](mailto:contact@formal.land). We will be happy to set up a call with you.\\n\\n:::"},{"id":"/2023/05/28/monad-for-side-effects-in-rust","metadata":{"permalink":"/blog/2023/05/28/monad-for-side-effects-in-rust","source":"@site/blog/2023-05-28-monad-for-side-effects-in-rust.md","title":"\ud83e\udd80 Monad for side effects in Rust","description":"To formally verify Rust programs, we are building coq-of-rust, a translator from Rust \ud83e\udd80 code to the proof system Coq \ud83d\udc13. We generate Coq code that is as similar as possible to the original Rust code, so that the user can easily understand the generated code and write proofs about it. 
In this blog post, we explain how we are representing side effects in Coq.","date":"2023-05-28T00:00:00.000Z","formattedDate":"May 28, 2023","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"},{"label":"monad","permalink":"/blog/tags/monad"},{"label":"side effects","permalink":"/blog/tags/side-effects"}],"readingTime":5.03,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Monad for side effects in Rust","tags":["coq-of-rust","Rust","Coq","monad","side effects"]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Trait representation in Coq","permalink":"/blog/2023/08/25/trait-representation-in-coq"},"nextItem":{"title":"\ud83e\udd80 Representation of Rust methods in Coq","permalink":"/blog/2023/04/26/representation-of-rust-methods-in-coq"}},"content":"To formally verify Rust programs, we are building [coq-of-rust](https://github.com/formal-land/coq-of-rust), a translator from Rust \ud83e\udd80 code to the proof system [Coq \ud83d\udc13](https://coq.inria.fr/). We generate Coq code that is as similar as possible to the original Rust code, so that the user can easily understand the generated code and write proofs about it. In this blog post, we explain how we are representing side effects in Coq.\\n\\n\x3c!-- truncate --\x3e\\n\\n## \ud83e\udd80 Side effects in Rust\\n\\nIn programming, [side effects]() are all what is not representable by pure functions (mathematical functions, functions that always return the same output for given input parameters). 
In Rust there are various kinds of side effects:\\n\\n- errors (the [panic!](https://doc.rust-lang.org/core/macro.panic.html) macro) that propagate and do not appear in the return type of functions,\\n- non-termination, with some potentially non-terminating loops (never returning a result is considered a side effect),\\n- control-flow, with the `break`, `continue`, `return` keywords, that can jump to a different part of the code,\\n- memory allocations and memory mutations,\\n- I/O, with for example the [println!](https://doc.rust-lang.org/std/macro.println.html) macro, that prints a message to the standard output,\\n- concurrency, with the [thread::spawn](https://doc.rust-lang.org/std/thread/fn.spawn.html) function, that creates a new thread.\\n\\n## \ud83d\udc13 Coq, a purely functional language\\n\\nLike most proof systems, Coq is a purely functional language. This means we need to find an encoding for the side effects. The reason for most proof systems to forbid side effects is to be logically consistent. Otherwise, it would be easy to write a proof of `False` by writing a term that does not terminate for example.\\n\\n## \ud83d\udd2e Monads in Coq\\n\\nMonads are a common way to represent side effects in a functional language. A monad is a type constructor `M`:\\n\\n```coq\\nDefinition M (A : Set) : Set :=\\n ...\\n```\\n\\nrepresenting computations returning values of type `A`. As an example we can take the error monad of computations that can fail with an error message, using the [Result](https://doc.rust-lang.org/std/result/enum.Result.html) type like in Rust:\\n\\n```coq\\nDefinition M (A : Set) : Set :=\\n Result A string.\\n```\\n\\nIt must have two operators, `Pure` and `Bind`.\\n\\n### The `Pure` operator\\n\\nThe `Pure` operator has type:\\n\\n```coq\\nDefinition Pure {A : Set} (v : A) : M A :=\\n ...\\n```\\n\\nIt lifts a pure value `v` into the monad. 
For our error monad, the `Pure` operator is:\\n\\n```coq\\nDefinition Pure {A : Set} (v : A) : M A :=\\n Ok v.\\n```\\n\\n### The `Bind` operator\\n\\nThe `Bind` operator has type:\\n\\n```coq\\nDefinition Bind {A B : Set} (e1 : M A) (f : A -> M B) : M B :=\\n ...\\n```\\n\\nIt sequences two computations `e1` with `f`, where `f` is a function that takes the result of `e1` as input and returns a new computation. We also write the `Bind` operator as:\\n\\n```coq\\nlet* x := e1 in\\ne2\\n```\\n\\nassuming that `f` is a function that takes `x` as input and returns `e2`. Requiring this operator for all monads shows that sequencing computations is a very fundamental operation for side effects.\\n\\nFor our error monad, the `Bind` operator is:\\n\\n```coq\\nDefinition Bind {A B : Set} (e1 : M A) (f : A -> M B) : M B :=\\n match e1 with\\n | Ok v => f v\\n | Err msg => Err msg\\n end.\\n```\\n\\n## \ud83d\udea7 State, exceptions, non-termination, control-flow\\n\\nWe use a single monad to represent all the side effects that interest us in Rust. This monad is called `M` and is defined as follows:\\n\\n```coq\\nDefinition RawMonad `{State.Trait} :=\\n ...\\n\\nModule Exception.\\n Inductive t (R : Set) : Set :=\\n | Return : R -> t R\\n | Continue : t R\\n | Break : t R\\n | Panic {A : Set} : A -> t R.\\n Arguments Return {_}.\\n Arguments Continue {_}.\\n Arguments Break {_}.\\n Arguments Panic {_ _}.\\nEnd Exception.\\nDefinition Exception := Exception.t.\\n\\nDefinition Monad `{State.Trait} (R A : Set) : Set :=\\n nat -> State -> RawMonad ((A + Exception R) * State).\\n\\nDefinition M `{State.Trait} (A : Set) : Set :=\\n Monad Empty_set A.\\n```\\n\\nWe assume the definition of some `RawMonad` for memory handling that we will describe in a later post. Our monad `M` is a particular case of the monad `Monad` with `R = Empty_set`. It is a combination of four monads:\\n\\n1. The `RawMonad`.\\n2. A state monad, that takes a `State` as input and returns an updated state as output. 
The trait `State.Trait` provides read/write operations on the `State` type.\\n3. An error monad with errors of type `Exception R`. These errors include the `Return`, `Continue`, `Break` and `Panic` constructors. The `Return` constructor is used to return a value from a function. The `Continue` constructor is used to continue the execution of a loop. The `Break` constructor is used to break the execution of a loop. The `Panic` constructor is used to panic with an error message. We implement all these operations as exceptions, even if only `Panic` is really an error, as they behave in the same way: interrupting the execution of the current sub-expression to bubble up to a certain level.\\n4. A fuel monad for non-termination, with the additional `nat` parameter.\\n\\nThe parameter `R` of the type constructor `Monad` is used to represent the type of values that can be returned in the body of a function. It is the same as the return type of the function. So for a function returning a value of type `A`, we define its body in `Monad A A`. Then, we wrap it in an operator:\\n\\n```coq\\nDefinition catch_return {A : Set} (e : Monad A A) : M A :=\\n ...\\n```\\n\\nthat catches the `Return` exceptions and returns the value.\\n\\n## Conclusion\\n\\nWe will see in the next post how we define the `RawMonad` to handle the Rust state of a program and memory allocation.\\n\\n:::tip Contact\\n\\nIf you have a Rust codebase that you wish to formally verify, or need advice in your work, contact us at [contact@formal.land](mailto:contact@formal.land). 
We will be happy to set up a call with you.\\n\\n:::"},{"id":"/2023/04/26/representation-of-rust-methods-in-coq","metadata":{"permalink":"/blog/2023/04/26/representation-of-rust-methods-in-coq","source":"@site/blog/2023-04-26-representation-of-rust-methods-in-coq.md","title":"\ud83e\udd80 Representation of Rust methods in Coq","description":"With our project coq-of-rust we aim to translate high-level Rust code to similar-looking Coq code, to formally verify Rust programs. One of the important constructs in the Rust language is the method syntax. In this post, we present our technique to translate Rust methods using type-classes in Coq.","date":"2023-04-26T00:00:00.000Z","formattedDate":"April 26, 2023","tags":[{"label":"coq-of-rust","permalink":"/blog/tags/coq-of-rust"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"Coq","permalink":"/blog/tags/coq"}],"readingTime":4.57,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd80 Representation of Rust methods in Coq","tags":["coq-of-rust","Rust","Coq"]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Monad for side effects in Rust","permalink":"/blog/2023/05/28/monad-for-side-effects-in-rust"},"nextItem":{"title":"\ud83e\udd84 Our current formal verification efforts","permalink":"/blog/2023/01/24/current-verification-efforts"}},"content":"With our project [coq-of-rust](https://github.com/formal-land/coq-of-rust) we aim to translate high-level Rust code to similar-looking [Coq](https://coq.inria.fr/) code, to [formally verify](https://en.wikipedia.org/wiki/Formal_verification) Rust programs. One of the important constructs in the Rust language is the [method syntax](https://doc.rust-lang.org/book/ch05-03-method-syntax.html). 
In this post, we present our technique to translate Rust methods using type-classes in Coq.\\n\\n\x3c!-- truncate --\x3e\\n\\n## Rust Code To Translate\\n\\nConsider the following Rust example, which contains a method (adapted from the [Rust Book](https://doc.rust-lang.org/book/)):\\n\\n```rust\\nstruct Rectangle {\\n width: u32,\\n height: u32,\\n}\\n\\nimpl Rectangle {\\n // Here \\"area\\" is a method\\n fn area(&self) -> u32 {\\n self.width * self.height\\n }\\n}\\n\\nfn main() {\\n let rect1 = Rectangle {\\n width: 30,\\n height: 50,\\n };\\n\\n println!(\\n \\"The area of the rectangle is {} square pixels.\\",\\n // We are calling this method there\\n rect1.area()\\n );\\n}\\n```\\n\\nThe Rust compiler can find the implementation of the `.area()` method call because it knows that the type of `rect1` is `Rectangle`. There could be other `area` methods defined for different types, and the code would still compile calling the `area` method of `Rectangle`.\\n\\nCoq has no direct equivalent for calling a function based on its name and type.\\n\\n## Our Translation\\n\\nHere is our Coq translation of the code above:\\n\\n```coq\\n 1: (* Generated by coq-of-rust *)\\n 2: Require Import CoqOfRust.CoqOfRust.\\n 3:\\n 4: Import Root.std.prelude.rust_2015.\\n 5:\\n 6: Module Rectangle.\\n 7: Record t : Set := {\\n 8: width : u32;\\n 9: height : u32;\\n10: }.\\n11:\\n12: Global Instance Get_width : Notation.Dot \\"width\\" := {\\n13: Notation.dot \'(Build_t x0 _) := x0;\\n14: }.\\n15: Global Instance Get_height : Notation.Dot \\"height\\" := {\\n16: Notation.dot \'(Build_t _ x1) := x1;\\n17: }.\\n18: End Rectangle.\\n19: Definition Rectangle : Set := Rectangle.t.\\n20:\\n21: Module ImplRectangle.\\n22: Definition Self := Rectangle.\\n23:\\n24: Definition area (self : ref Self) : u32 :=\\n25: self.[\\"width\\"].[\\"mul\\"] self.[\\"height\\"].\\n26:\\n27: Global Instance Method_area : Notation.Dot \\"area\\" := {\\n28: Notation.dot := area;\\n29: }.\\n30: End 
ImplRectangle.\\n31:\\n32: Definition main (_ : unit) : unit :=\\n33: let rect1 := {| Rectangle.width := 30; Rectangle.height := 50; |} in\\n34: _crate.io._print\\n35: (_crate.fmt.Arguments::[\\"new_v1\\"]\\n36: [ \\"The area of the rectangle is \\"; \\" square pixels.\\\\n\\" ]\\n37: [ _crate.fmt.ArgumentV1::[\\"new_display\\"] rect1.[\\"area\\"] ]) ;;\\n38: tt ;;\\n39: tt.\\n```\\n\\nOn line `24` we define the `area` function. On line `27` we declare that `area` is a method. On line `37` we call the `area` method on `rect1` with:\\n\\n```coq\\nrect1.[\\"area\\"]\\n```\\n\\nwhich closely resembles the source Rust code:\\n\\n```rust\\nrect1.area()\\n```\\n\\nCoq can automatically find the code of the `area` method to call.\\n\\n## How It Works\\n\\nThe code:\\n\\n```coq\\nrect1.[\\"area\\"]\\n```\\n\\nis actually a notation for:\\n\\n```coq\\nNotation.dot \\"area\\" rect1\\n```\\n\\nThen we leverage the inference mechanism of type-classes in Coq to find the code of the `area` method:\\n\\n```coq\\nModule Notation.\\n (** A class to represent the notation [e1.e2]. This is mainly used to call\\n methods, or access to named or indexed fields of structures.\\n The kind is either a string or an integer. *)\\n Class Dot {Kind : Set} (name : Kind) {T : Set} : Set := {\\n dot : T;\\n }.\\n Arguments dot {Kind} name {T Dot}.\\nEnd Notation.\\n```\\n\\nThe `Dot` class has three parameters: `Kind`, `name`, and `T`. `Kind` is the type of the name of the method (generally a string but it could be an integer in rare cases), `name` is the name of the method, and `T` is the type of the method. 
The `dot` field of the class is the code of the method.\\n\\nWhen we define the class instance:\\n\\n```coq\\n27: Global Instance Method_area : Notation.Dot \\"area\\" := {\\n28: Notation.dot := area;\\n29: }.\\n```\\n\\nwe instantiate the class `Notation.Dot` with three parameters:\\n\\n- `Kind` (inferred) is `string` because the name of the method is a string,\\n- `name` is `\\"area\\"` because the name of the method is `area`,\\n- `T` (inferred) is `ref Rectangle -> u32` because the method is declared as `fn area(&self) -> u32`.\\n\\nThen we define the `dot` field of the class instance to be the `area` function.\\n\\nWhen we call:\\n\\n```coq\\nNotation.dot \\"area\\" rect1\\n```\\n\\nCoq will automatically find the class instance `Method_area` because the type of `rect1` is `Rectangle` and the name of the method is `\\"area\\"`.\\n\\n## Other Use Cases\\n\\nThe `Dot` class is also used to access named or indexed fields of structures or traits. We use a similar mechanism for associated functions. For example, the Rust code:\\n\\n```rust\\nlet rect1 = Rectangle::square(3);\\n```\\n\\nis translated to:\\n\\n```coq\\nlet rect1 := Rectangle::[\\"square\\"] 3 in\\n```\\n\\nwith a type-class for the `type::[name]` notation as follows:\\n\\n```coq\\nModule Notation.\\n (** A class to represent associated functions (the notation [e1::e2]). The\\n kind might be [Set] for functions associated to a type,\\n or [Set -> Set] for functions associated to a trait. *)\\n Class DoubleColon {Kind : Type} (type : Kind) (name : string) {T : Set} :\\n Set := {\\n double_colon : T;\\n }.\\n Arguments double_colon {Kind} type name {T DoubleColon}.\\nEnd Notation.\\n```\\n\\n## In Conclusion\\n\\nThe type-classes mechanism of Coq appears flexible enough to represent our current use cases involving methods and associated functions. 
It remains to be seen whether this approach will suffice for future use cases.\\n\\n:::tip Contact\\n\\nIf you have a Rust codebase that you wish to formally verify, or need advice in your work, contact us at [contact@formal.land](mailto:contact@formal.land). We will be happy to set up a call with you.\\n\\n:::"},{"id":"/2023/01/24/current-verification-efforts","metadata":{"permalink":"/blog/2023/01/24/current-verification-efforts","source":"@site/blog/2023-01-24-current-verification-efforts.md","title":"\ud83e\udd84 Our current formal verification efforts","description":"We are diversifying ourselves to apply formal verification on 3\ufe0f\u20e3 new languages with Solidity, Rust, and TypeScript. In this article we describe our approach. For these three languages, we translate the code to the proof system \ud83d\udc13 Coq. We generate the cleanest \ud83e\uddfc possible output to simplify the formal verification \ud83d\udcd0 effort that comes after.","date":"2023-01-24T00:00:00.000Z","formattedDate":"January 24, 2023","tags":[{"label":"coq-of-ocaml","permalink":"/blog/tags/coq-of-ocaml"},{"label":"OCaml","permalink":"/blog/tags/o-caml"},{"label":"Solidity","permalink":"/blog/tags/solidity"},{"label":"Rust","permalink":"/blog/tags/rust"},{"label":"TypeScript","permalink":"/blog/tags/type-script"}],"readingTime":4.89,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83e\udd84 Our current formal verification efforts","tags":["coq-of-ocaml","OCaml","Solidity","Rust","TypeScript"]},"unlisted":false,"prevItem":{"title":"\ud83e\udd80 Representation of Rust methods in Coq","permalink":"/blog/2023/04/26/representation-of-rust-methods-in-coq"},"nextItem":{"title":"\ud83d\udc2b Latest blog posts on our formal verification effort on Tezos","permalink":"/blog/2022/12/13/latest-blog-posts-on-tezos"}},"content":"We are diversifying ourselves to apply [formal verification](https://en.wikipedia.org/wiki/Formal_verification) on 3\ufe0f\u20e3 new languages with 
**Solidity**, **Rust**, and **TypeScript**. In this article we describe our approach. For these three languages, we translate the code to the proof system [\ud83d\udc13 Coq](https://coq.inria.fr/). We generate the cleanest \ud83e\uddfc possible output to simplify the formal verification \ud83d\udcd0 effort that comes after.\\n\\n> Formal verification is a way to ensure that a program follows its specification in \ud83d\udcaf% of cases thanks to the use of mathematical methods. It removes far more bugs and security issues than testing, and is necessary to deliver software of the highest quality \ud83d\udc8e.\\n\\n\x3c!-- truncate --\x3e\\n\\n## \ud83d\uddfa\ufe0f General plan\\nTo apply formal verification to real-sized applications, we need to handle thousands of lines of code in a seamless way. We rely on the proof system Coq to write our proofs, as it has a mature ecosystem, and automated (SMT) and interactive ways to write proofs. To keep the proofs simple, we must find an efficient way to convert an existing and evolving codebase to Coq.\\n\\nFor example, given the following TypeScript code:\\n```typescript\\nexport function checkIfEnoughCredits(user: User, credits: number): boolean {\\n if (user.isAdmin) {\\n return credits >= 0;\\n }\\n\\n return credits >= 1000;\\n}\\n```\\nwe want to generate the corresponding Coq code in an automated way:\\n```coq\\nDefinition checkIfEnoughCredits (user : User) (credits : number) : bool :=\\n if user.(User.isAdmin) then\\n credits >= 0\\n else\\n credits >= 1000.\\n```\\nThis is the exact equivalent written using the Coq syntax, where we check the `credits` condition depending on the user\'s status. This is the `checkIfEnoughCredits` definition a Coq developer would directly write, in an idiomatic way.\\n\\nWe make some hypotheses on the input code. In TypeScript we assume the code does not contain mutations, which is often the case to simplify asynchronous code. 
In Rust we make different hypotheses, as making safe mutations is one of the key features of the language and a frequent pattern. For each language we look for a correct subset to work on, to support common use cases and still generate clean Coq code.\\n\\n## \ud83c\uddf8 Solidity\\n\u27a1\ufe0f [Project page](/docs/verification/solidity) \u2b05\ufe0f\\n\\nThe [Solidity language](https://soliditylang.org/) is the main language to write smart contracts on the [Ethereum](https://ethereum.org/) blockchain. As smart contracts cannot be easily updated and handle a large amount of money, it is critical to formally verify them to prevent bugs.\\n\\nOur strategy is to develop a translator [coq-of-solidity](https://gitlab.com/formal-land/coq-of-solidity) from Solidity to Coq. We are using an implementation of an [ERC-20](https://en.wikipedia.org/wiki/Ethereum#ERC20) smart contract as an example to guide our translation. Two top difficulties in the translation of Solidity programs are:\\n* the use of object-oriented programming with inheritance on classes,\\n* the use of mutations and errors, which need to be handled in a monad.\\n\\nWe are still trying various approaches to handle these difficulties and generate a clean Coq output for most cases.\\n\\nIn addition to our work on Solidity, we are looking at the [EVM code](https://ethereum.org/en/developers/docs/evm/), which is the assembly language of Ethereum. It has the advantage of being more stable and having simpler semantics than Solidity. However, it is not as expressive and programs in EVM are much harder to read. We have a prototype translator from EVM to Coq named [ethereum-vm-to-coq](https://gitlab.com/formal-land/ethereum-vm-to-coq). 
An interesting goal will be to connect the translation of Solidity and of EVM in Coq to show that they have the same semantics on a given smart contract.\\n\\nNote that EVM is the target language of many verification projects on Ethereum such as [Certora](https://www.certora.com/) or static analyzers. We prefer to target Solidity as it is more expressive and the generated code in Coq will thus be easier to verify.\\n\\n## \ud83e\udd80 Rust\\n\u27a1\ufe0f [Project page](/docs/verification/rust) \u2b05\ufe0f\\n\\nThe [Rust language](https://www.rust-lang.org/) is a modern systems programming language that is gaining popularity. It is a safe language that prevents many common errors such as buffer overflows or use-after-free. It is also a language that is used to write low-level code, such as drivers or operating systems. As such, it is critical to formally verify Rust programs to prevent bugs.\\n\\nWe work in collaboration with the team developing the [Aeneas](https://github.com/AeneasVerif) project, with people from Inria and Microsoft. The aim is to translate Rust code with mutations to a purely functional form in Coq (without mutations) to simplify the verification effort and avoid the need for separation logic. The idea of this translation is explained in the [Aeneas paper](https://dl.acm.org/doi/abs/10.1145/3547647).\\n\\nThere are two steps in the translation:\\n1. **From [MIR](https://rustc-dev-guide.rust-lang.org/mir/index.html) (low-level intermediate form of Rust) to LLBC.** This is a custom language for the project that contains all the information of MIR but is better suited for analysis. For example, instead of using a control-flow graph it uses control structures and an abstract syntax tree. This step is implemented in Rust.\\n2. **From LLBC to Coq.** This is the heart of the project and is implemented in OCaml. 
This is where the translation from mutations to a purely functional form occurs.\\n\\nFor now we are focusing on adding new features to LLBC and improving the user experience: better error messages, generation of an output with holes for unhandled Rust features.\\n\\n## \ud83c\udf10 TypeScript\\n\u27a1\ufe0f [Project page](/docs/verification/typescript) \u2b05\ufe0f\\n\\nWe have a [\ud83d\udcfd\ufe0f demo project](https://formal-land.github.io/coq-of-js/) to showcase the translation of a purely functional subset of JavaScript to Coq. We handle functions and basic data types such as records, enums and discriminated unions. We are now porting the code to TypeScript in [coq-of-ts](https://github.com/formal-land/coq-of-ts). We prefer to work on TypeScript rather than JavaScript as type information is useful to guide the translation, and avoids the need for additional annotations on the source code.\\n\\nOur next target will be to make `coq-of-ts` usable on real-life projects.\\n\\n:::info Social media\\nFollow us on [Twitter](https://twitter.com/FormalLand) \ud83d\udc26 and [Telegram](https://t.me/formal_land) to get the latest news about our projects. If you think our work is interesting, please share it with your friends and colleagues. \ud83d\ude4f\\n:::"},{"id":"/2022/12/13/latest-blog-posts-on-tezos","metadata":{"permalink":"/blog/2022/12/13/latest-blog-posts-on-tezos","source":"@site/blog/2022-12-13-latest-blog-posts-on-tezos.md","title":"\ud83d\udc2b Latest blog posts on our formal verification effort on Tezos","description":"Here we recall some blog articles that we have written since this summer, on the formal verification of the protocol of Tezos. For this project, we are verifying a code base of around 100,000 lines of OCaml code. We automatically convert the OCaml code to the proof system Coq using the converter coq-of-ocaml. 
We then apply various proof techniques to make sure that the protocol of Tezos does not contain bugs.","date":"2022-12-13T00:00:00.000Z","formattedDate":"December 13, 2022","tags":[{"label":"coq-tezos-of-ocaml","permalink":"/blog/tags/coq-tezos-of-ocaml"},{"label":"Tezos","permalink":"/blog/tags/tezos"},{"label":"coq-of-ocaml","permalink":"/blog/tags/coq-of-ocaml"}],"readingTime":1.755,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc2b Latest blog posts on our formal verification effort on Tezos","tags":["coq-tezos-of-ocaml","Tezos","coq-of-ocaml"]},"unlisted":false,"prevItem":{"title":"\ud83e\udd84 Our current formal verification efforts","permalink":"/blog/2023/01/24/current-verification-efforts"},"nextItem":{"title":"\ud83d\udc2b Upgrade coq-of-ocaml to OCaml 4.14","permalink":"/blog/2022/06/23/upgrade-coq-of-ocaml-4.14"}},"content":"Here we recall some blog articles that we have written since this summer, on the [formal verification of the protocol of Tezos](https://formal-land.gitlab.io/coq-tezos-of-ocaml/). For this project, we are verifying a code base of around 100,000 lines of OCaml code. We automatically convert the OCaml code to the proof system Coq using the converter [coq-of-ocaml](https://github.com/formal-land/coq-of-ocaml). We then apply various proof techniques to make sure that the protocol of Tezos does not contain bugs.\\n\\n\x3c!-- truncate --\x3e\\n\\n## Blog articles \ud83d\udcdd\\nHere is the list of articles about the work we have done since this summer. 
We believe that some of this work is very unique and specific to Tezos.\\n\\n* [The error monad, internal errors and validity predicates, step-by-step](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/12/12/internal-errors-step-by-step/) by *Pierre Vial*: a detailed explanation of what we are doing to verify the absence of unexpected errors in the whole code base;\\n* [Absence of internal errors](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/10/18/absence-of-internal-errors/) by *Guillaume Claret*: the current state of our proofs to verify the absence of unexpected errors;\\n* [Skip-list verification. Using inductive predicates](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/10/03/verifying-the-skip-list-inductive-predicates/) by *Bart\u0142omiej Kr\xf3likowski* and *Natalie Klaus*: a presentation of our verification effort on the skip-list algorithm implementation (part 2);\\n* [Verifying the skip-list](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/10/03/verifying-the-skip-list/) by *Natalie Klaus* and *Bart\u0142omiej Kr\xf3likowski*: a presentation of our verification effort on the skip-list algorithm implementation (part 1);\\n* [Verifying json-data-encoding](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/08/15/verify-json-data-encoding/) by *Tait van Strien*: our work to verify an external library used by the Tezos protocol, to safely serialize data to JSON values;\\n* [Fixing reused proofs](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/07/19/fixing-proofs/) by *Bart\u0142omiej Kr\xf3likowski*: a presentation, with examples, of the work we do to maintain existing proofs and specifications as the code evolves;\\n* [Formal verification of property based tests](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/06/07/formal-verification-of-property-based-tests/) by *Guillaume Claret*: the principle and status of our work to formally verify the generalized case of property-based 
tests;\\n* [Plan for backward compatibility verification](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/06/02/plan-backward-compatibility) by *Guillaume Claret*: an explanation of the strategy we use to show that two successive versions of the Tezos protocol are fully backward compatible.\\n\\nTo follow more of our activity, feel free to follow our [Twitter account \ud83d\udc26](https://twitter.com/FormalLand)! If you need services or advice to formally verify your code base, you can drop us an [email \ud83d\udce7](mailto:contact@formal.land)!"},{"id":"/2022/06/23/upgrade-coq-of-ocaml-4.14","metadata":{"permalink":"/blog/2022/06/23/upgrade-coq-of-ocaml-4.14","source":"@site/blog/2022-06-23-upgrade-coq-of-ocaml-4.14.md","title":"\ud83d\udc2b Upgrade coq-of-ocaml to OCaml 4.14","description":"In an effort to support the latest version of the protocol of Tezos we upgraded coq-of-ocaml to add compatibility with OCaml 4.14. The result is available in the branch ocaml-4.14. 
We describe here how we made this upgrade.","date":"2022-06-23T00:00:00.000Z","formattedDate":"June 23, 2022","tags":[{"label":"coq-of-ocaml","permalink":"/blog/tags/coq-of-ocaml"},{"label":"ocaml","permalink":"/blog/tags/ocaml"},{"label":"4.14","permalink":"/blog/tags/4-14"}],"readingTime":2.195,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc2b Upgrade coq-of-ocaml to OCaml 4.14","tags":["coq-of-ocaml","ocaml","4.14"]},"unlisted":false,"prevItem":{"title":"\ud83d\udc2b Latest blog posts on our formal verification effort on Tezos","permalink":"/blog/2022/12/13/latest-blog-posts-on-tezos"},"nextItem":{"title":"\ud83d\udc2b Status update on the verification of Tezos","permalink":"/blog/2022/06/15/status update-tezos"}},"content":"In an effort to support the latest version of the [protocol of Tezos](https://gitlab.com/tezos/tezos/-/tree/master/src/proto_alpha/lib_protocol) we upgraded [`coq-of-ocaml`](https://github.com/formal-land/coq-of-ocaml) to add compatibility with OCaml 4.14. The result is available in the branch [`ocaml-4.14`](https://github.com/formal-land/coq-of-ocaml/pull/217). We describe here how we made this upgrade.\\n\\n\x3c!-- truncate --\x3e\\n\\n## Usage of Merlin\\nIn `coq-of-ocaml` we are using [Merlin](https://github.com/ocaml/merlin) to get the typed [abstract syntax tree](https://en.wikipedia.org/wiki/Abstract_syntax_tree) of OCaml files. We see the AST through the [Typedtree](https://docs.mirage.io/ocaml/Typedtree/index.html) interface, together with an access to all the definitions of the current compilation environment. Merlin computes the current environment by understanding how an OCaml project is configured and connecting to the [dune](https://dune.build/) build system. 
The environment is mandatory for certain transformations in `coq-of-ocaml`, like:\\n* finding a canonical name for module types;\\n* propagating phantom types.\\n\\nIn order to use Merlin as a library (rather than as a daemon), we vendor the [LSP version](https://github.com/rgrinberg/merlin/tree/lsp) of [rgrinberg](https://github.com/rgrinberg) in the folder [`vendor/`](https://github.com/formal-land/coq-of-ocaml/tree/master/vendor). This vendored version works with no extra configuration.\\n\\n## Upgrade\\nWhen a new version of OCaml is out, we upgrade our vendored version of Merlin to a compatible one. Then we make the necessary changes to `coq-of-ocaml`, as the interface of the AST generally evolves with small changes. For OCaml 4.14, the main change was some types becoming abstract such as `Types.type_expr`. To access the fields of these types, we now need to use a specific getter and make changes such as:\\n```diff\\n- match typ.desc with\\n+ match Types.get_desc typ with\\n```\\nThis made some patterns in `match` expressions more complex, but otherwise the changes were minimal. We ran all the unit-tests of `coq-of-ocaml` after the upgrade and they were still valid.\\n\\n## Git submodule or copy & paste?\\nTo vendor Merlin we have two possibilities:\\n1. Using a [Git submodule](https://git-scm.com/book/en/v2/Git-Tools-Submodules).\\n2. Doing a copy & paste of the code.\\n\\nThe first possibility is more efficient in terms of space, but there are a few disadvantages:\\n* we cannot make small modifications if needed;\\n* the archives generated by GitHub do not contain the code of the submodules (see this [issue](https://github.com/dear-github/dear-github/issues/214));\\n* if a commit in the repository for the submodule disappears, then the submodule is unusable.\\n\\nThe last reason forced us to do a copy & paste for OCaml 4.14. 
We now have to be cautious not to commit the generated `.ml` file for the OCaml parser.\\n\\n## Next\\nThe next change will be doing the upgrade to OCaml 5. There should be many more changes, and in particular a new way of handling the effects. We do not know yet if it will be possible to translate the effect handlers to Coq in a nice way."},{"id":"/2022/06/15/status update-tezos","metadata":{"permalink":"/blog/2022/06/15/status update-tezos","source":"@site/blog/2022-06-15-status update-tezos.md","title":"\ud83d\udc2b Status update on the verification of Tezos","description":"Here we give an update on our verification effort on the protocol of Tezos. We add the marks:","date":"2022-06-15T00:00:00.000Z","formattedDate":"June 15, 2022","tags":[{"label":"tezos","permalink":"/blog/tags/tezos"},{"label":"coq-of-ocaml","permalink":"/blog/tags/coq-of-ocaml"},{"label":"coq","permalink":"/blog/tags/coq"}],"readingTime":7.53,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc2b Status update on the verification of Tezos","tags":["tezos","coq-of-ocaml","coq"]},"unlisted":false,"prevItem":{"title":"\ud83d\udc2b Upgrade coq-of-ocaml to OCaml 4.14","permalink":"/blog/2022/06/23/upgrade-coq-of-ocaml-4.14"},"nextItem":{"title":"\ud83d\udc2b Make Tezos the first formally verified cryptocurrency","permalink":"/blog/2022/02/02/make-tezos-a-formally-verified-crypto"}},"content":"Here we give an update on our [verification effort](https://formal-land.gitlab.io/coq-tezos-of-ocaml/) on the protocol of Tezos. 
We add the marks:\\n* \u2705 for \\"rather done\\"\\n* \ud83c\udf0a for \\"partially done\\"\\n* \u274c for \\"most is yet to do\\"\\n\\nOn the project website, we also automatically generate pages such as [Compare](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/status/compare/) to follow the status of the tasks.\\n\\n\x3c!-- truncate --\x3e\\n\\n## Maintenance of the translation \u2705\\nWe were able to maintain most of the translation from OCaml to Coq of the protocol of Tezos using [coq-of-ocaml](https://github.com/formal-land/coq-of-ocaml), including the whole translation of the Michelson interpreter. There was an increase in the size of the OCaml code base in recent months, due to new features added in Tezos like the [rollups](https://research-development.nomadic-labs.com/tezos-is-scaling.html). Here are the numbers of lines of code (`.ml` and `.mli` files) for the various protocol versions:\\n* protocol H: `51147`\\n* protocol I: `59535`\\n* protocol J: `83271` (increase mainly due to the rollups)\\n* protocol Alpha (development version of K): `90716`\\n\\nWe still translate most of the protocol code up to version J. We stayed on version J for a while as we wanted to add as many proofs as possible before doing a proof of backward compatibility between J and K. We are currently updating the translation to support the protocol version Alpha, preparing for the translation of K.\\n\\nFor protocol J, we needed to add a [blacklist.txt](https://gitlab.com/nomadic-labs/coq-tezos-of-ocaml/-/blob/master/blacklist.txt) of files that we do not support. Indeed, we need to add new changes to `coq-of-ocaml` to support these or do hard-to-maintain changes to [our fork](https://gitlab.com/tezos/tezos/-/merge_requests/3303) of the Tezos protocol. 
We plan to complete the translation and remove this black-list for the protocol J soon (in a week or two).\\n\\n## Size of the proofs \u2705\\nOne of our plans is to have a reasonable quantity of proofs, to cover a reasonable quantity of code and properties from the protocol. We believe we have a good quantity of proofs now, as we have more than 50,000 lines of Coq code (for an OCaml codebase of 80,000 lines).\\n\\nIn addition to our main targets, we verify many \\"smaller\\" properties, such as:\\n* conversion functions are inverses (when there are two `to_int` and `of_int` functions in a file, we show that they are inverses);\\n* the `compare` functions, to order elements, are well defined (see our blog post [Verifying the compare functions of OCaml](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/04/04/verifying-the-compare-functions));\\n* invariants are preserved. For example, [here](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/proofs/carbonated_map#Make.update_is_valid) we show that updating a carbonated map preserves the property of having a size field actually equal to the number of elements.\\n\\nWe should note that the size of Coq proofs tends to grow faster than the size of the verified code. We have no coverage metrics to know how much of the code is covered by these proofs.\\n\\n## Data-encodings \ud83c\udf0a\\nThe [data-encoding](https://gitlab.com/nomadic-labs/data-encoding) library is a set of combinators to write serialization/de-serialization functions. We verify that the encodings defined for each protocol data type are bijective. The good thing we have is a semi-automated tactic to verify the use of the `data-encoding` primitives. We detail this approach in our blog post [Automation of `data_encoding` proofs](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2021/11/22/data-encoding-automation). We can verify most of the encoding functions that we encounter. 
From there, we also express the **invariant** associated with each data type, which the encodings generally check at runtime. The invariants are then the domain of definition of the encodings.\\n\\nHowever, we have a hole: we do not verify the `data-encoding` library itself. Thus the [axioms we made](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/environment/proofs/data_encoding) on the data-encoding primitives may have approximations. And indeed, we missed one issue in the development code of the protocol. This is thus a new high-priority target to verify the `data-encoding` library itself. One of the challenges for the proof is the use of side-effects (references and exceptions) in this library.\\n\\n## Property-based tests \ud83c\udf0a\\nThe property-based tests on the protocol are located in [`src/proto_alpha/lib_protocol/test/pbt`](https://gitlab.com/tezos/tezos/-/tree/master/src/proto_alpha/lib_protocol/test/pbt). These tests are composed of:\\n* a generator, generating random inputs of a certain shape;\\n* a property function, a boolean function taking a generated input and supposed to always answer `true`.\\n\\nWe translated a part of these tests to Coq, to convert them to theorems and have specifications extracted from the code. The result of this work is summarized in this blog post: [Formal verification of property based tests](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2022/06/07/formal-verification-of-property-based-tests). We have fully translated and verified four test files over a total of twelve. We are continuing the work of translations and proofs.\\n\\nHowever, we found that for some of the files the proofs were taking a long time to write compared to the gains in safety. Indeed, the statements made in the tests are sometimes too complex when translated into general theorems. 
For example, for [test_carbonated_map.ml](https://gitlab.com/tezos/tezos/-/blob/master/src/proto_alpha/lib_protocol/test/pbt/test_carbonated_map.ml) we have to deal with:\\n* gas exhaustion (seemingly impossible in the tests);\\n* data structures of size greater than `max_int` (impossible in practice).\\n\\nAll of that complicates the proofs for little gain in safety. So I would say that not all the property-based tests have a nice and useful translation to Coq. We should still note that for some of the tests, like with saturation arithmetic, we have proofs that work well. For these, we rely on the automated linear arithmetic tactic [`lia`](https://coq.inria.fr/refman/addendum/micromega.html) of Coq to verify properties over integer overflows.\\n\\n## Storage system \ud83c\udf0a\\nBy \\"storage system\\" we understand the whole set of functors defined in [`storage_functors.ml`](https://gitlab.com/tezos/tezos/-/blob/master/src/proto_alpha/lib_protocol/storage_functors.ml) and how we apply them to define the protocol storage in [`storage.ml`](https://gitlab.com/tezos/tezos/-/blob/master/src/proto_alpha/lib_protocol/storage_functors.ml). 
These functors create sub-storages with signatures such as:\\n```ocaml\\nmodule type Non_iterable_indexed_data_storage = sig\\n type t\\n type context = t\\n type key\\n type value\\n val mem : context -> key -> bool Lwt.t\\n val get : context -> key -> value tzresult Lwt.t\\n val find : context -> key -> value option tzresult Lwt.t\\n val update : context -> key -> value -> Raw_context.t tzresult Lwt.t\\n val init : context -> key -> value -> Raw_context.t tzresult Lwt.t\\n val add : context -> key -> value -> Raw_context.t Lwt.t\\n val add_or_remove : context -> key -> value option -> Raw_context.t Lwt.t\\n val remove_existing : context -> key -> Raw_context.t tzresult Lwt.t\\n val remove : context -> key -> Raw_context.t Lwt.t\\nend\\n```\\nThis `Non_iterable_indexed_data_storage` API looks like the API of OCaml\'s [Map](https://v2.ocaml.org/api/Map.Make.html). As a result, our goal for the storage is to show that it can be simulated by standard OCaml data structures such as sets and maps. This is a key step to unlock further reasoning about code using the storage.\\n\\nUnfortunately, we were not able to verify the whole storage system yet. Among the difficulties are that:\\n* there are many layers in the definition of the storage;\\n* the storage functors use a lot of abstractions, and sometimes it is unclear how to specify them in the general case.\\n\\nStill, we have verified some of the functors as seen in [`Proofs/Storage_functors.v`](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/proofs/storage_functors) and specified the `storage.ml` file in [`Proofs/Storage.v`](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/storage). We believe we now have the correct specifications for all of the storage abstractions. We plan to complete all these proofs later.\\n\\n## Michelson\\nThe verification of the Michelson interpreter is what occupied most of our time. 
By considering the OCaml files whose name starts with `script_`, the size of the Michelson interpreter is around 20,000 lines of OCaml code.\\n\\n### Simulations \ud83c\udf0a\\nThe interpreter relies heavily on [GADTs](https://v2.ocaml.org/manual/gadts.html) in OCaml. Because these do not translate nicely to Coq, we need to write simulations in dependent types of the interpreter functions, and prove them correct in Coq. We describe this process in our [Michelson Guide](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/guides/michelson).\\n\\nThe main difficulties we encountered are:\\n* the number of simulations to write (covering the 20,000 lines of OCaml);\\n* the execution time of the proof of correctness of the simulations. This is due to the large size of the inductive types describing the Michelson AST, and the use of dependent types generating large proof terms. For example, there are around 30 cases for the types and 150 for the instruction nodes in the AST.\\n\\nWhen writing the simulations, we are also verifying the termination of all the functions and the absence of reachable `assert false`. We have defined the simulation of many functions, but are still missing important ones such as [`parse_instr_aux`](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/script_ir_translator/#parse_instr_aux) to parse Michelson programs.\\n\\n### Mi-Cho-Coq \ud83c\udf0a\\nWe have a project to verify that the [Mi-Cho-Coq](https://gitlab.com/nomadic-labs/mi-cho-coq) framework, used to formally verify smart contracts written in Michelson, is compatible with the implementation of the Michelson interpreter in OCaml. We have a partial proof of compatibility in [Micho_to_dep.v](https://formal-land.gitlab.io/coq-tezos-of-ocaml/docs/simulations/micho_to_dep). We still need to complete this proof, especially to handle instructions with loops. 
Our goal is to show a complete inclusion of the semantics of Mi-Cho-Coq into the semantics of the implementation.\\n\\n### Parse/unparse \u274c\\nWe wanted to verify that the various parsing and unparsing functions over Michelson are inverses. These functions exist for:\\n* comparable types\\n* types\\n* comparable data\\n* data\\n\\nBecause we are still focused on writing, verifying or updating the simulations, we are not yet done with this task.\\n\\n## Conclusion\\nWe have many ongoing projects but few fully completed tasks. We will focus more on having completed proofs."},{"id":"/2022/02/02/make-tezos-a-formally-verified-crypto","metadata":{"permalink":"/blog/2022/02/02/make-tezos-a-formally-verified-crypto","source":"@site/blog/2022-02-02-make-tezos-a-formally-verified-crypto.md","title":"\ud83d\udc2b Make Tezos the first formally verified cryptocurrency","description":"Elephants","date":"2022-02-02T00:00:00.000Z","formattedDate":"February 2, 2022","tags":[{"label":"tezos","permalink":"/blog/tags/tezos"},{"label":"coq-of-ocaml","permalink":"/blog/tags/coq-of-ocaml"},{"label":"coq","permalink":"/blog/tags/coq"}],"readingTime":3.675,"hasTruncateMarker":true,"authors":[],"frontMatter":{"title":"\ud83d\udc2b Make Tezos the first formally verified cryptocurrency","tags":["tezos","coq-of-ocaml","coq"]},"unlisted":false,"prevItem":{"title":"\ud83d\udc2b Status update on the verification of Tezos","permalink":"/blog/2022/06/15/status update-tezos"},"nextItem":{"title":"\ud83d\udc2b New blog posts and Meetup talk","permalink":"/blog/2021/11/12/new-blog-posts-and-meetup-talk"}},"content":"![Elephants](elephants-elmira-gokoryan.webp)\\n\\nOur primary goal at [Formal Land \ud83c\udf32](https://formal.land/) is to make [Tezos](https://tezos.com/) the first crypto-currency with a formally verified implementation. 
With [formal verification](https://en.wikipedia.org/wiki/Formal_verification), thanks to mathematical methods, we can check that a program behaves as expected for all possible inputs. Formal verification goes beyond what testing can do, as testing can only handle a finite amount of cases. That is critical as cryptocurrencies hold a large amount of money (around $3B for Tezos today). The current result of our verification project is available on [nomadic-labs.gitlab.io/coq-tezos-of-ocaml](https://formal-land.gitlab.io/coq-tezos-of-ocaml/). Formal verification is also key to allowing Tezos to evolve constantly in a safe and backward compatible manner.\\n\\n\x3c!-- truncate --\x3e\\n\\nWe proceed in two steps:\\n1. we translate the code of Tezos, written in [OCaml](https://ocaml.org/), to the proof language [Coq](https://coq.inria.fr/) using the translator [coq-of-ocaml](https://github.com/foobar-land/coq-of-ocaml);\\n2. we write our specifications and proofs in the Coq language.\\n\\nWe believe this is one of the most efficient ways to proceed, as we can work on an almost unmodified version of the codebase and use the full power of the mature proof system Coq. The code of Tezos is composed of around:\\n* 50,000 lines for the protocol (the kernel of Tezos), and\\n* 200,000 lines for the shell (everything else, including the peer-to-peer layer and the storage backend).\\n\\nWe are currently focusing on verifying the protocol for the following modules.\\n\\n## Data-encoding\\nThe [data-encoding](https://gitlab.com/nomadic-labs/data-encoding) library offers serialization and deserialization to binary and JSON formats. It is used in various parts of the Tezos protocol, especially on all the data types ending up in the storage system. In practice, many encodings are defined in the OCaml files named `*_repr.ml`. We verify that the `data-encoding` library is correctly used to define the encodings. 
We check that converting a value to binary format and from binary returns the initial value. We make the domain of validity of such conversions explicit. This verification work generally reveals and propagates invariants about the data structures of the protocol. As an invariant example, all the account amounts should always be positive. Having these invariants will be helpful for the verification of higher-level layers of the protocol.\\n\\n## Michelson smart contracts\\nThe smart contract language of Tezos is [Michelson](https://tezos.gitlab.io/active/michelson.html). The interpreter and type-checker of smart contracts is one of the most complex and critical parts of the protocol. We are verifying two things about this code:\\n* The equivalence of the interpreter and the Coq semantics for Michelson defined in the project [Mi-Cho-Coq](https://gitlab.com/nomadic-labs/mi-cho-coq). Thanks to this equivalence, we can make sure that the formal verification of smart contracts is sound for the current version of the protocol.\\n* The compatibility of the parsing and unparsing functions for the Michelson types and values. The parsing functions take care of the type-checking and do a lot of sanity checks on Michelson expressions with appropriate error messages. Showing that the parsing and unparsing functions are inverses is important for security reasons. The Michelson values are always unparsed at the end of a smart contract execution to be stored on disk.\\n\\nTo do these proofs, we also give a new semantics of Michelson, expressed using dependent types rather than [GADTs](https://ocaml.org/manual/gadts-tutorial.html) in the OCaml implementation.\\n\\n## Storage system\\nCryptocurrencies typically take a lot of space on disk (in the hundreds of gigabytes). In Tezos, we use the key-value database [Irmin](https://irmin.org/). 
The protocol provides a lot of [abstractions](https://gitlab.com/tezos/tezos/-/blob/master/src/proto_alpha/lib_protocol/storage_functors.ml) over this database to expose higher-level interfaces with set and map-like APIs. We verify that these abstractions are valid doing a proof by simulation, where we show that the whole system is equivalent to an [in-memory database](https://en.wikipedia.org/wiki/In-memory_database) using simpler data structures. Thanks to this simulation, we will be able to reason about code using the storage as if we were using the simpler in-memory version.\\n\\n## In addition\\nWe also plan to verify:\\n* The implementation of the `data-encoding` library itself. This code is challenging for formal verification as it contains many imperative features. Another specificity of this library is that it sits outside of the protocol of Tezos, and we might need to adapt `coq-of-ocaml` to support it.\\n* The [property-based tests of the protocol](https://gitlab.com/tezos/tezos/-/tree/master/src/proto_alpha/lib_protocol/test/pbt). These tests are written as boolean functions (or functions raising exceptions), which must return `true` on any possible inputs. 
We will verify them in the general case by importing their definitions to Coq and verifying with mathematical proofs that they are always correct.\\n\\n:::tip Contact\\nFor any questions or remarks, contact us on \ud83d\udc49 [contact@formal.land](mailto:contact@formal.land) \ud83d\udc48.\\n:::"},{"id":"/2021/11/12/new-blog-posts-and-meetup-talk","metadata":{"permalink":"/blog/2021/11/12/new-blog-posts-and-meetup-talk","source":"@site/blog/2021-11-12-new-blog-posts-and-meetup-talk.md","title":"\ud83d\udc2b New blog posts and Meetup talk","description":"Recently, we added two new blog posts about the verification of the crypto-currency Tezos:","date":"2021-11-12T00:00:00.000Z","formattedDate":"November 12, 2021","tags":[{"label":"tezos","permalink":"/blog/tags/tezos"},{"label":"mi-cho-coq","permalink":"/blog/tags/mi-cho-coq"},{"label":"coq-of-ocaml","permalink":"/blog/tags/coq-of-ocaml"},{"label":"meetup","permalink":"/blog/tags/meetup"}],"readingTime":0.58,"hasTruncateMarker":false,"authors":[],"frontMatter":{"title":"\ud83d\udc2b New blog posts and Meetup talk","tags":["tezos","mi-cho-coq","coq-of-ocaml","meetup"]},"unlisted":false,"prevItem":{"title":"\ud83d\udc2b Make Tezos the first formally verified cryptocurrency","permalink":"/blog/2022/02/02/make-tezos-a-formally-verified-crypto"},"nextItem":{"title":"\ud83d\udc2b Verification of the use of data-encoding","permalink":"/blog/2021/10/27/verification-data-encoding"}},"content":"Recently, we added two new blog posts about the verification of the crypto-currency [Tezos](https://tezos.com/):\\n* [Verify the Michelson types of Mi-Cho-Coq](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2021/11/01/verify-michelson-types-mi-cho-coq/) to compare the types defined in the Tezos code for the [Michelson](http://tezos.gitlab.io/active/michelson.html) interpreter and in the [Mi-Cho-Coq library](https://gitlab.com/nomadic-labs/mi-cho-coq) to verify smart contracts;\\n* [Translate the Tenderbake\'s code to 
Coq](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2021/11/08/translate-tenderbake/) to explain how we translated the recent changes in Tezos to the Coq using [coq-of-ocaml](https://github.com/foobar-land/coq-of-ocaml). In particular we translated the code of the new [Tenderbake](https://research-development.nomadic-labs.com/a-look-ahead-to-tenderbake.html) consensus algorithm.\\n\\nWe also talked at the [Lambda Lille Meetup](https://www.meetup.com/LambdaLille/events/281374644/) (in French) to present our work on `coq-of-ocaml` for Tezos. A video on the [Youtube channel](https://www.youtube.com/channel/UC-hC7y_ilQBq0QCa9xDu1iA) of the Meetup should be available shortly. We thanks the organizers for hosting the talk."},{"id":"/2021/10/27/verification-data-encoding","metadata":{"permalink":"/blog/2021/10/27/verification-data-encoding","source":"@site/blog/2021-10-27-verification-data-encoding.md","title":"\ud83d\udc2b Verification of the use of data-encoding","description":"We added a blog post about the verification of the use of data-encodings in the protocol of Tezos. Currently, we work on the verification of Tezos and publish our blog articles there. 
We use coq-of-ocaml to translate the OCaml code to Coq and do our verification effort.","date":"2021-10-27T00:00:00.000Z","formattedDate":"October 27, 2021","tags":[{"label":"data-encoding","permalink":"/blog/tags/data-encoding"}],"readingTime":0.235,"hasTruncateMarker":false,"authors":[],"frontMatter":{"title":"\ud83d\udc2b Verification of the use of data-encoding","tags":["data-encoding"]},"unlisted":false,"prevItem":{"title":"\ud83d\udc2b New blog posts and Meetup talk","permalink":"/blog/2021/11/12/new-blog-posts-and-meetup-talk"},"nextItem":{"title":"\ud83d\ude00 Welcome","permalink":"/blog/2021/10/10/welcome"}},"content":"We added a blog post about the [verification of the use of data-encodings](https://formal-land.gitlab.io/coq-tezos-of-ocaml/blog/2021/10/20/data-encoding-usage) in the protocol of Tezos. Currently, we work on the verification of Tezos and publish our blog articles there. We use [coq-of-ocaml](https://foobar-land.github.io/coq-of-ocaml/) to translate the OCaml code to Coq and do our verification effort."},{"id":"/2021/10/10/welcome","metadata":{"permalink":"/blog/2021/10/10/welcome","source":"@site/blog/2021-10-10-welcome.md","title":"\ud83d\ude00 Welcome","description":"Welcome to the blog of Formal Land. Here we will post various updates about the work we are doing.","date":"2021-10-10T00:00:00.000Z","formattedDate":"October 10, 2021","tags":[{"label":"Welcome","permalink":"/blog/tags/welcome"}],"readingTime":0.095,"hasTruncateMarker":false,"authors":[],"frontMatter":{"title":"\ud83d\ude00 Welcome","tags":["Welcome"]},"unlisted":false,"prevItem":{"title":"\ud83d\udc2b Verification of the use of data-encoding","permalink":"/blog/2021/10/27/verification-data-encoding"}},"content":"Welcome to the blog of [Formal Land](/). 
Here we will post various updates about the work we are doing."}]}')}}]); \ No newline at end of file diff --git a/assets/js/b6190d2c.256f86b9.js b/assets/js/b6190d2c.256f86b9.js deleted file mode 100644 index ce4ddc715..000000000 --- a/assets/js/b6190d2c.256f86b9.js +++ /dev/null @@ -1 +0,0 @@ -"use strict";(self.webpackChunkformal_land=self.webpackChunkformal_land||[]).push([[4559],{9552:(e,n,t)=>{t.r(n),t.d(n,{assets:()=>l,contentTitle:()=>r,default:()=>d,frontMatter:()=>s,metadata:()=>a,toc:()=>c});var i=t(4848),o=t(8453);const s={title:"\ud83e\udd80 Typing and naming of Rust code in Rocq (1/3)",tags:["Rust","links","simulations"],authors:[]},r=void 0,a={permalink:"/blog/2025/01/30/links-for-rust-in-rocq",source:"@site/blog/2025-01-30-links-for-rust-in-rocq.md",title:"\ud83e\udd80 Typing and naming of Rust code in Rocq (1/3)",description:"In this article we show how we re-build the type and naming information of \ud83e\udd80 Rust code in  Rocq/Coq, the formal verification system we use. 
A challenge is to be able to represent arbitrary Rust programs, including the standard library of Rust and the whole of Revm, a virtual machine to run EVM programs.",date:"2025-01-30T00:00:00.000Z",formattedDate:"January 30, 2025",tags:[{label:"Rust",permalink:"/blog/tags/rust"},{label:"links",permalink:"/blog/tags/links"},{label:"simulations",permalink:"/blog/tags/simulations"}],readingTime:7.485,hasTruncateMarker:!0,authors:[],frontMatter:{title:"\ud83e\udd80 Typing and naming of Rust code in Rocq (1/3)",tags:["Rust","links","simulations"],authors:[]},unlisted:!1,nextItem:{title:"\ud83e\udd16 Designing a coding assistant for Rocq",permalink:"/blog/2025/01/21/designing-a-coding-assistant-for-rocq"}},l={authorsImageUrls:[]},c=[{value:"\ud83c\udfaf The challenge",id:"-the-challenge",level:2},{value:"\ud83d\udedd Strategy",id:"-strategy",level:2},{value:"\ud83e\uddea Example",id:"-example",level:2},{value:"\ud83d\udd2e Link monad",id:"-link-monad",level:2},{value:"\u2712\ufe0f Conclusion",id:"\ufe0f-conclusion",level:2}];function h(e){const n={a:"a",admonition:"admonition",code:"code",em:"em",h2:"h2",img:"img",li:"li",ol:"ol",p:"p",pre:"pre",strong:"strong",ul:"ul",...(0,o.R)(),...e.components};return(0,i.jsxs)(i.Fragment,{children:[(0,i.jsxs)(n.p,{children:["In this article we show how we re-build the type and naming information of ",(0,i.jsx)(n.a,{href:"https://www.rust-lang.org/",children:"\ud83e\udd80\xa0Rust"})," code in ",(0,i.jsxs)(n.a,{href:"https://rocq-prover.org/",children:[(0,i.jsx)("img",{src:"https://raw.githubusercontent.com/coq/rocq-prover.org/refs/heads/main/rocq-id/logos/SVG/icon-rocq-orange.svg",height:"18px"}),"\xa0Rocq/Coq"]}),", the formal verification system we use. 
A challenge is to be able to represent arbitrary Rust programs, including the standard library of Rust and the whole of ",(0,i.jsx)(n.a,{href:"https://github.com/bluealloy/revm",children:"Revm"}),", a virtual machine to run ",(0,i.jsx)(n.a,{href:"https://en.wikipedia.org/wiki/Ethereum#Virtual_machine",children:"EVM"})," programs."]}),"\n",(0,i.jsx)(n.p,{children:"This is the continuation of the following article:"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsx)(n.li,{children:(0,i.jsx)(n.a,{href:"/blog/2024/04/26/translation-core-alloc-crates",children:"\ud83e\udd80 Translation of the Rust's core and alloc crates"})}),"\n"]}),"\n",(0,i.jsxs)(n.admonition,{title:"Ask for the highest security!",type:"success",children:[(0,i.jsx)(n.p,{children:"When millions are at stake, bug bounties are not enough. How do you ensure your security audits are exhaustive?"}),(0,i.jsxs)(n.p,{children:["The best way is to use ",(0,i.jsx)(n.strong,{children:"formal verification"}),"."]}),(0,i.jsxs)(n.p,{children:[(0,i.jsx)(n.strong,{children:"Contact us"})," at\xa0",(0,i.jsx)(n.a,{href:"mailto:contact@formal.land",children:"\xa0\ud83d\udc8ccontact@formal.land"})," to make sure your code is safe!\xa0\ud83d\udee1\ufe0f"]}),(0,i.jsxs)(n.p,{children:["We cover ",(0,i.jsx)(n.strong,{children:"Rust"}),", ",(0,i.jsx)(n.strong,{children:"Solidity"}),", and ",(0,i.jsx)(n.strong,{children:"ZK systems"}),"."]})]}),"\n",(0,i.jsx)("figure",{children:(0,i.jsx)(n.p,{children:(0,i.jsx)(n.img,{alt:"Green forest",src:t(3423).A+"",width:"1024",height:"1024"})})}),"\n",(0,i.jsx)(n.h2,{id:"-the-challenge",children:"\ud83c\udfaf The challenge"}),"\n",(0,i.jsx)(n.p,{children:"Our goal is to be able to formally verify large Rust codebases, counting thousands of lines, and without having to modify the code to make it more amenable to formal verification. 
Our concrete example is the verification of the Revm that includes about 10,000 lines of Rust code, depending on how far we include the dependencies."}),"\n",(0,i.jsx)(n.p,{children:"This requires to have a methodology of verification that both:"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsx)(n.li,{children:"Scales with the size of the codebase. Rust programs often use a lot of abstractions, and we make the choice to keep these abstractions in the formal model. Combined with the expressivity of the Rocq prover, we hope this will ensure we can scale our reasoning."}),"\n",(0,i.jsx)(n.li,{children:"Supports most of the Rust language, noting that Rust is a complex and feature-rich language."}),"\n"]}),"\n",(0,i.jsxs)(n.p,{children:["To make sure our translation from the Rust language to the Rocq system has good support, we generate a translation that is very verbose and rather low-level without interpreting the meaning of the various Rust primitives too much. For example, our translation tool is only about 5,000 lines long. It is written in Rust and uses the APIs of the ",(0,i.jsx)(n.code,{children:"rustc"})," compiler."]}),"\n",(0,i.jsx)(n.p,{children:"This approach leaves the burdens of defining the semantics of Rust and designing the reasoning primitives on the Rocq side."}),"\n",(0,i.jsx)(n.h2,{id:"-strategy",children:"\ud83d\udedd Strategy"}),"\n",(0,i.jsx)(n.p,{children:"We plan to reason on the translated Rust code with two intermediate steps:"}),"\n",(0,i.jsxs)(n.ol,{children:["\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Links"})," These represent a complete rewriting of the translated code, adding type and naming information that are erased during the translation to Rocq. We also prove that this rewriting is equivalent to the initial translation. 
We hope to automate this step as much as possible."]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Simulations"})," In this step we make the less obvious transformations, in particular representing the memory mutations in a clean and custom state monad, as well as various optimizations such as collapsing all the integer types if it helps for the proofs later. We also prove that this rewriting is equivalent to the links."]}),"\n"]}),"\n",(0,i.jsxs)(n.p,{children:["At the end of the ",(0,i.jsx)(n.strong,{children:"Simulations"})," step, we should obtain a purely functional and idiomatic representation of the original Rust code in Rocq. This representation should be easier to reason about, and we will be able to formally verify properties of the code."]}),"\n",(0,i.jsx)(n.p,{children:"As a summary, here are the steps we want to follow:"}),"\n",(0,i.jsx)("figure",{class:"text--center",children:(0,i.jsx)(n.p,{children:(0,i.jsx)(n.img,{alt:"Compilation steps",src:t(7867).A+"",width:"214",height:"696"})})}),"\n",(0,i.jsx)(n.h2,{id:"-example",children:"\ud83e\uddea Example"}),"\n",(0,i.jsx)(n.p,{children:"Here is an example from the standard library of Rust, which is used to define other comparison operators:"}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-rust",children:"pub fn max_by<T, F: FnOnce(&T, &T) -> Ordering>(v1: T, v2: T, compare: F) -> T {\n match compare(&v1, &v2) {\n Ordering::Less | Ordering::Equal => v2,\n Ordering::Greater => v1,\n }\n}\n"})}),"\n",(0,i.jsxs)(n.p,{children:["This example is interesting as it uses some abstractions, with polymorphism, traits, closures, and a bit of pointer manipulations. Ideally, we should be able to represent it with a Rocq code of a similar size, without the explicit references\xa0",(0,i.jsx)(n.code,{children:"&"})," that are mostly useless in a purely functional setting. 
But here is the Rocq code we obtain after running\xa0",(0,i.jsx)(n.a,{href:"https://github.com/formal-land/coq-of-rust",children:"coq-of-rust"}),":"]}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-coq",children:'Definition max_by (\u03b5 : list Value.t) (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\n match \u03b5, \u03c4, \u03b1 with\n | [], [ T; F ], [ v1; v2; compare ] =>\n ltac:(M.monadic\n (let v1 := M.alloc (| v1 |) in\n let v2 := M.alloc (| v2 |) in\n let compare := M.alloc (| compare |) in\n M.read (|\n M.match_operator (|\n M.alloc (|\n M.call_closure (|\n M.get_trait_method (|\n "core::ops::function::FnOnce",\n F,\n [],\n [ Ty.tuple [ Ty.apply (Ty.path "&") [] [ T ]; Ty.apply (Ty.path "&") [] [ T ] ] ],\n "call_once",\n [],\n []\n |),\n [\n M.read (| compare |);\n Value.Tuple\n [\n M.borrow (|\n Pointer.Kind.Ref,\n M.deref (| M.borrow (| Pointer.Kind.Ref, v1 |) |)\n |);\n M.borrow (|\n Pointer.Kind.Ref,\n M.deref (| M.borrow (| Pointer.Kind.Ref, v2 |) |)\n |)\n ]\n ]\n |)\n |),\n [\n fun \u03b3 =>\n ltac:(M.monadic\n (M.find_or_pattern (|\n \u03b3,\n [\n fun \u03b3 =>\n ltac:(M.monadic\n (let _ := M.is_struct_tuple (| \u03b3, "core::cmp::Ordering::Less" |) in\n Value.Tuple []));\n fun \u03b3 =>\n ltac:(M.monadic\n (let _ := M.is_struct_tuple (| \u03b3, "core::cmp::Ordering::Equal" |) in\n Value.Tuple []))\n ],\n fun \u03b3 =>\n ltac:(M.monadic\n match \u03b3 with\n | [] => ltac:(M.monadic v2)\n | _ => M.impossible "wrong number of arguments"\n end)\n |)));\n fun \u03b3 =>\n ltac:(M.monadic\n (let _ := M.is_struct_tuple (| \u03b3, "core::cmp::Ordering::Greater" |) in\n v1))\n ]\n |)\n |)))\n | _, _, _ => M.impossible "wrong number of arguments"\n end.\n'})}),"\n",(0,i.jsx)(n.p,{children:"This is extremely verbose and not idiomatic for Rocq! 
We can see some of the Rust features that are made explicit:"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:["The list of constant generics ",(0,i.jsx)(n.code,{children:"\u03b5"}),", the list of type generics ",(0,i.jsx)(n.code,{children:"\u03c4"}),", and the list of arguments ",(0,i.jsx)(n.code,{children:"\u03b1"}),"."]}),"\n",(0,i.jsxs)(n.li,{children:["The memory operations ",(0,i.jsx)(n.code,{children:"alloc"})," and ",(0,i.jsx)(n.code,{children:"read"}),", and the pointers manipulations ",(0,i.jsx)(n.code,{children:"borrow"})," and ",(0,i.jsx)(n.code,{children:"deref"}),"."]}),"\n",(0,i.jsxs)(n.li,{children:["The trait instance resolution with ",(0,i.jsx)(n.code,{children:"M.get_trait_method"}),"."]}),"\n",(0,i.jsxs)(n.li,{children:["The decomposition of the pattern matching in more elementary operations like ",(0,i.jsx)(n.code,{children:"M.is_struct_tuple"}),"."]}),"\n"]}),"\n",(0,i.jsxs)(n.p,{children:["Most of this information comes from the ",(0,i.jsx)(n.a,{href:"https://rustc-dev-guide.rust-lang.org/thir.html",children:"THIR intermediate representation"})," of the code as provided by the Rust compiler."]}),"\n",(0,i.jsx)(n.p,{children:"Here is the link definition we will write, proven equivalent to the code above by construction:"}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-coq",children:"Definition run_max_by {T F : Set} `{Link T} `{Link F}\n (Run_FnOnce_for_F :\n function.FnOnce.Run\n F\n (Ref.t Pointer.Kind.Ref T * Ref.t Pointer.Kind.Ref T)\n (Output := Ordering.t)\n )\n (v1 v2 : T) (compare : F) :\n {{ cmp.max_by [] [ \u03a6 T; \u03a6 F ] [ \u03c6 v1; \u03c6 v2; \u03c6 compare ] \ud83d\udd3d T }}.\nProof.\n destruct Run_FnOnce_for_F as [[call_once [H_call_once run_call_once]]].\n run_symbolic.\n eapply Run.CallPrimitiveGetTraitMethod. {\n apply H_call_once.\n }\n run_symbolic.\n eapply Run.CallClosure. 
{\n apply (run_call_once compare (Ref.immediate _ v1, Ref.immediate _ v2)).\n }\n intros [ordering |]; cbn; [|run_symbolic].\n destruct ordering; run_symbolic.\nDefined.\n"})}),"\n",(0,i.jsxs)(n.p,{children:["The beginning of the definition corresponds to the trait resolution and calls to the ",(0,i.jsx)(n.code,{children:"compare"})," function. The last part with ",(0,i.jsx)(n.code,{children:"destruct ordering"})," is the representation of the ",(0,i.jsx)(n.code,{children:"match"})," statement in the Rust code. With this definition, we add explicit Rocq types instead of the universal ",(0,i.jsx)(n.code,{children:"Value.t"})," type of the translated code and make explicit the trait resolution. The trait instance has to be provided as an explicit parameter with the ",(0,i.jsx)(n.code,{children:"Run_FnOnce_for_F"})," argument."]}),"\n",(0,i.jsx)(n.p,{children:"With the statement:"}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-coq",children:"{{ cmp.max_by [] [ \u03a6 T; \u03a6 F ] [ \u03c6 v1; \u03c6 v2; \u03c6 compare ] \ud83d\udd3d T }}\n"})}),"\n",(0,i.jsxs)(n.p,{children:["we say that the translated function ",(0,i.jsx)(n.code,{children:"cmp.max_by"}),' has a "link" definition, built implicitly in the proof, returning a value of type ',(0,i.jsx)(n.code,{children:"T"}),". We can extract the definition of this function calling the primitive:"]}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-coq",children:"evaluate : forall {Output : Set} `{Link Output} {e : M},\n {{ e \ud83d\udd3d Output }} ->\n LowM.t (Output.t Output)\n"})}),"\n",(0,i.jsxs)(n.p,{children:['It returns a "link" computation in the ',(0,i.jsx)(n.code,{children:"LowM.t"}),' monad. The output is often unreadable as it is, but we can step through it by symbolic execution. 
This will be useful for the next step to define and prove equivalent the "simulations".']}),"\n",(0,i.jsx)(n.h2,{id:"-link-monad",children:"\ud83d\udd2e Link monad"}),"\n",(0,i.jsxs)(n.p,{children:["Like the monad used for the translation of Rust programs by ",(0,i.jsx)(n.code,{children:"coq-of-rust"}),", the link's monad is a free monad but with fewer primitive operations. The primitive operations are only related to the memory handling:"]}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-coq",children:"Inductive t : Set -> Set :=\n| StateAlloc {A : Set} `{Link A} (value : A) : t (Ref.Core.t A)\n| StateRead {A : Set} `{Link A} (ref_core : Ref.Core.t A) : t A\n| StateWrite {A : Set} `{Link A} (ref_core : Ref.Core.t A) (value : A) : t unit\n| GetSubPointer {A Sub_A : Set} `{Link A} `{Link Sub_A}\n (ref_core : Ref.Core.t A) (runner : SubPointer.Runner.t A Sub_A) :\n t (Ref.Core.t Sub_A).\n"})}),"\n",(0,i.jsxs)(n.p,{children:["Compared to the side effects in the generated translation, we eliminate all the operations related to name handling (trait resolution, function calls, etc.). We also always use explicit types instead of the universal ",(0,i.jsx)(n.code,{children:"Value.t"})," type and get rid of the ",(0,i.jsx)(n.code,{children:"M.impossible"})," operation that was necessary to represent impossible branches in the absence of types."]}),"\n",(0,i.jsx)(n.h2,{id:"\ufe0f-conclusion",children:"\u2712\ufe0f Conclusion"}),"\n",(0,i.jsx)(n.p,{children:"We have presented our general strategy to formally verify large Rust codebases. 
In the next blog posts, we will go into more details to look at the definition of the proof of equivalence for the links, and at how we automate the most repetitive parts of the proofs."}),"\n",(0,i.jsx)(n.admonition,{title:"For more",type:"success",children:(0,i.jsx)(n.p,{children:(0,i.jsxs)(n.em,{children:["Follow us on ",(0,i.jsx)(n.a,{href:"https://x.com/FormalLand",children:"X"})," or ",(0,i.jsx)(n.a,{href:"https://fr.linkedin.com/company/formal-land",children:"LinkedIn"})," for more, or comment on this post below! Feel free to DM us for any questions or requests!"]})})})]})}function d(e={}){const{wrapper:n}={...(0,o.R)(),...e.components};return n?(0,i.jsx)(n,{...e,children:(0,i.jsx)(h,{...e})}):h(e)}},7867:(e,n,t)=>{t.d(n,{A:()=>i});const i=t.p+"assets/images/compilation-steps-28e6722120c86d9e17aa05e4d38f1515.svg"},3423:(e,n,t)=>{t.d(n,{A:()=>i});const i=t.p+"assets/images/green-forest-00d5ecf11d8ae6a919d6f5e8ce6aeb7e.webp"},8453:(e,n,t)=>{t.d(n,{R:()=>r,x:()=>a});var i=t(6540);const o={},s=i.createContext(o);function r(e){const n=i.useContext(s);return i.useMemo((function(){return"function"==typeof e?e(n):{...n,...e}}),[n,e])}function a(e){let n;return n=e.disableParentContext?"function"==typeof e.components?e.components(o):e.components||o:r(e.components),i.createElement(s.Provider,{value:n},e.children)}}}]); \ No newline at end of file diff --git a/assets/js/b6190d2c.d4342f7f.js b/assets/js/b6190d2c.d4342f7f.js new file mode 100644 index 000000000..7d48e38e4 --- /dev/null +++ b/assets/js/b6190d2c.d4342f7f.js @@ -0,0 +1 @@ +"use strict";(self.webpackChunkformal_land=self.webpackChunkformal_land||[]).push([[4559],{9552:(e,n,t)=>{t.r(n),t.d(n,{assets:()=>l,contentTitle:()=>r,default:()=>d,frontMatter:()=>s,metadata:()=>a,toc:()=>c});var i=t(4848),o=t(8453);const s={title:"\ud83e\udd80 Typing and naming of Rust code in Rocq (1/3)",tags:["Rust","links","simulations"],authors:[]},r=void 
0,a={permalink:"/blog/2025/01/30/links-for-rust-in-rocq",source:"@site/blog/2025-01-30-links-for-rust-in-rocq.md",title:"\ud83e\udd80 Typing and naming of Rust code in Rocq (1/3)",description:"In this article we show how we re-build the type and naming information of \ud83e\udd80 Rust code in  Rocq/Coq, the formal verification system we use. A challenge is to be able to represent arbitrary Rust programs, including the standard library of Rust and the whole of Revm, a virtual machine to run EVM programs.",date:"2025-01-30T00:00:00.000Z",formattedDate:"January 30, 2025",tags:[{label:"Rust",permalink:"/blog/tags/rust"},{label:"links",permalink:"/blog/tags/links"},{label:"simulations",permalink:"/blog/tags/simulations"}],readingTime:7.485,hasTruncateMarker:!0,authors:[],frontMatter:{title:"\ud83e\udd80 Typing and naming of Rust code in Rocq (1/3)",tags:["Rust","links","simulations"],authors:[]},unlisted:!1,nextItem:{title:"\ud83e\udd16 Designing a coding assistant for Rocq",permalink:"/blog/2025/01/21/designing-a-coding-assistant-for-rocq"}},l={authorsImageUrls:[]},c=[{value:"\ud83c\udfaf The challenge",id:"-the-challenge",level:2},{value:"\ud83d\udedd Strategy",id:"-strategy",level:2},{value:"\ud83e\uddea Example",id:"-example",level:2},{value:"\ud83d\udd2e Link's monad",id:"-links-monad",level:2},{value:"\u2712\ufe0f Conclusion",id:"\ufe0f-conclusion",level:2}];function h(e){const n={a:"a",admonition:"admonition",code:"code",em:"em",h2:"h2",img:"img",li:"li",ol:"ol",p:"p",pre:"pre",strong:"strong",ul:"ul",...(0,o.R)(),...e.components};return(0,i.jsxs)(i.Fragment,{children:[(0,i.jsxs)(n.p,{children:["In this article we show how we re-build the type and naming information of ",(0,i.jsx)(n.a,{href:"https://www.rust-lang.org/",children:"\ud83e\udd80\xa0Rust"})," code in 
",(0,i.jsxs)(n.a,{href:"https://rocq-prover.org/",children:[(0,i.jsx)("img",{src:"https://raw.githubusercontent.com/coq/rocq-prover.org/refs/heads/main/rocq-id/logos/SVG/icon-rocq-orange.svg",height:"18px"}),"\xa0Rocq/Coq"]}),", the formal verification system we use. A challenge is to be able to represent arbitrary Rust programs, including the standard library of Rust and the whole of ",(0,i.jsx)(n.a,{href:"https://github.com/bluealloy/revm",children:"Revm"}),", a virtual machine to run ",(0,i.jsx)(n.a,{href:"https://en.wikipedia.org/wiki/Ethereum#Virtual_machine",children:"EVM"})," programs."]}),"\n",(0,i.jsx)(n.p,{children:"This is the continuation of the following article:"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsx)(n.li,{children:(0,i.jsx)(n.a,{href:"/blog/2024/04/26/translation-core-alloc-crates",children:"\ud83e\udd80 Translation of the Rust's core and alloc crates"})}),"\n"]}),"\n",(0,i.jsxs)(n.admonition,{title:"Ask for the highest security!",type:"success",children:[(0,i.jsx)(n.p,{children:"When millions are at stake, bug bounties are not enough. 
How do you ensure your security audits are exhaustive?"}),(0,i.jsxs)(n.p,{children:["The best way is to use ",(0,i.jsx)(n.strong,{children:"formal verification"}),"."]}),(0,i.jsxs)(n.p,{children:[(0,i.jsx)(n.strong,{children:"Contact us"})," at\xa0",(0,i.jsx)(n.a,{href:"mailto:contact@formal.land",children:"\xa0\ud83d\udc8ccontact@formal.land"})," to make sure your code is safe!\xa0\ud83d\udee1\ufe0f"]}),(0,i.jsxs)(n.p,{children:["We cover ",(0,i.jsx)(n.strong,{children:"Rust"}),", ",(0,i.jsx)(n.strong,{children:"Solidity"}),", and ",(0,i.jsx)(n.strong,{children:"ZK systems"}),"."]})]}),"\n",(0,i.jsx)("figure",{children:(0,i.jsx)(n.p,{children:(0,i.jsx)(n.img,{alt:"Green forest",src:t(3423).A+"",width:"1024",height:"1024"})})}),"\n",(0,i.jsx)(n.h2,{id:"-the-challenge",children:"\ud83c\udfaf The challenge"}),"\n",(0,i.jsx)(n.p,{children:"Our goal is to be able to formally verify large Rust codebases, counting thousands of lines, and without having to modify the code to make it more amenable to formal verification. Our concrete example is the verification of the Revm that includes about 10,000 lines of Rust code, depending on how far we include the dependencies."}),"\n",(0,i.jsx)(n.p,{children:"This requires to have a methodology of verification that both:"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsx)(n.li,{children:"Scales with the size of the codebase. Rust programs often use a lot of abstractions, and we make the choice to keep these abstractions in the formal model. 
Combined with the expressivity of the Rocq prover, we hope this will ensure we can scale our reasoning."}),"\n",(0,i.jsx)(n.li,{children:"Supports most of the Rust language, noting that Rust is a complex and feature-rich language."}),"\n"]}),"\n",(0,i.jsxs)(n.p,{children:["To make sure our translation from the Rust language to the Rocq system has good support, we generate a translation that is very verbose and rather low-level without interpreting the meaning of the various Rust primitives too much. For example, our translation tool is only about 5,000 lines long. It is written in Rust and uses the APIs of the ",(0,i.jsx)(n.code,{children:"rustc"})," compiler."]}),"\n",(0,i.jsx)(n.p,{children:"This approach leaves the burdens of defining the semantics of Rust and designing the reasoning primitives on the Rocq side."}),"\n",(0,i.jsx)(n.h2,{id:"-strategy",children:"\ud83d\udedd Strategy"}),"\n",(0,i.jsx)(n.p,{children:"We plan to reason on the translated Rust code with two intermediate steps:"}),"\n",(0,i.jsxs)(n.ol,{children:["\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Links"})," These represent a complete rewriting of the translated code, adding type and naming information that are erased during the translation to Rocq. We also prove that this rewriting is equivalent to the initial translation. We hope to automate this step as much as possible."]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Simulations"})," In this step we make the less obvious transformations, in particular representing the memory mutations in a clean and custom state monad, as well as various optimizations such as collapsing all the integer types if it helps for the proofs later. We also prove that this rewriting is equivalent to the links."]}),"\n"]}),"\n",(0,i.jsxs)(n.p,{children:["At the end of the ",(0,i.jsx)(n.strong,{children:"Simulations"})," step, we should obtain a purely functional and idiomatic representation of the original Rust code in Rocq. 
This representation should be easier to reason about, and we will be able to formally verify properties of the code."]}),"\n",(0,i.jsx)(n.p,{children:"As a summary, here are the steps we want to follow:"}),"\n",(0,i.jsx)("figure",{class:"text--center",children:(0,i.jsx)(n.p,{children:(0,i.jsx)(n.img,{alt:"Compilation steps",src:t(7867).A+"",width:"214",height:"696"})})}),"\n",(0,i.jsx)(n.h2,{id:"-example",children:"\ud83e\uddea Example"}),"\n",(0,i.jsx)(n.p,{children:"Here is an example from the standard library of Rust, which is used to define other comparison operators:"}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-rust",children:"pub fn max_by<T, F: FnOnce(&T, &T) -> Ordering>(v1: T, v2: T, compare: F) -> T {\n match compare(&v1, &v2) {\n Ordering::Less | Ordering::Equal => v2,\n Ordering::Greater => v1,\n }\n}\n"})}),"\n",(0,i.jsxs)(n.p,{children:["This example is interesting as it uses some abstractions, with polymorphism, traits, closures, and a bit of pointer manipulations. Ideally, we should be able to represent it with a Rocq code of a similar size, without the explicit references\xa0",(0,i.jsx)(n.code,{children:"&"})," that are mostly useless in a purely functional setting. 
But here is the Rocq code we obtain after running\xa0",(0,i.jsx)(n.a,{href:"https://github.com/formal-land/coq-of-rust",children:"coq-of-rust"}),":"]}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-coq",children:'Definition max_by (\u03b5 : list Value.t) (\u03c4 : list Ty.t) (\u03b1 : list Value.t) : M :=\n match \u03b5, \u03c4, \u03b1 with\n | [], [ T; F ], [ v1; v2; compare ] =>\n ltac:(M.monadic\n (let v1 := M.alloc (| v1 |) in\n let v2 := M.alloc (| v2 |) in\n let compare := M.alloc (| compare |) in\n M.read (|\n M.match_operator (|\n M.alloc (|\n M.call_closure (|\n M.get_trait_method (|\n "core::ops::function::FnOnce",\n F,\n [],\n [ Ty.tuple [ Ty.apply (Ty.path "&") [] [ T ]; Ty.apply (Ty.path "&") [] [ T ] ] ],\n "call_once",\n [],\n []\n |),\n [\n M.read (| compare |);\n Value.Tuple\n [\n M.borrow (|\n Pointer.Kind.Ref,\n M.deref (| M.borrow (| Pointer.Kind.Ref, v1 |) |)\n |);\n M.borrow (|\n Pointer.Kind.Ref,\n M.deref (| M.borrow (| Pointer.Kind.Ref, v2 |) |)\n |)\n ]\n ]\n |)\n |),\n [\n fun \u03b3 =>\n ltac:(M.monadic\n (M.find_or_pattern (|\n \u03b3,\n [\n fun \u03b3 =>\n ltac:(M.monadic\n (let _ := M.is_struct_tuple (| \u03b3, "core::cmp::Ordering::Less" |) in\n Value.Tuple []));\n fun \u03b3 =>\n ltac:(M.monadic\n (let _ := M.is_struct_tuple (| \u03b3, "core::cmp::Ordering::Equal" |) in\n Value.Tuple []))\n ],\n fun \u03b3 =>\n ltac:(M.monadic\n match \u03b3 with\n | [] => ltac:(M.monadic v2)\n | _ => M.impossible "wrong number of arguments"\n end)\n |)));\n fun \u03b3 =>\n ltac:(M.monadic\n (let _ := M.is_struct_tuple (| \u03b3, "core::cmp::Ordering::Greater" |) in\n v1))\n ]\n |)\n |)))\n | _, _, _ => M.impossible "wrong number of arguments"\n end.\n'})}),"\n",(0,i.jsx)(n.p,{children:"This is extremely verbose and not idiomatic for Rocq! 
We can see some of the Rust features that are made explicit:"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:["The list of constant generics ",(0,i.jsx)(n.code,{children:"\u03b5"}),", the list of type generics ",(0,i.jsx)(n.code,{children:"\u03c4"}),", and the list of arguments ",(0,i.jsx)(n.code,{children:"\u03b1"}),"."]}),"\n",(0,i.jsxs)(n.li,{children:["The memory operations ",(0,i.jsx)(n.code,{children:"alloc"})," and ",(0,i.jsx)(n.code,{children:"read"}),", and the pointers manipulations ",(0,i.jsx)(n.code,{children:"borrow"})," and ",(0,i.jsx)(n.code,{children:"deref"}),"."]}),"\n",(0,i.jsxs)(n.li,{children:["The trait instance resolution with ",(0,i.jsx)(n.code,{children:"M.get_trait_method"}),"."]}),"\n",(0,i.jsxs)(n.li,{children:["The decomposition of the pattern matching in more elementary operations like ",(0,i.jsx)(n.code,{children:"M.is_struct_tuple"}),"."]}),"\n"]}),"\n",(0,i.jsxs)(n.p,{children:["Most of this information comes from the ",(0,i.jsx)(n.a,{href:"https://rustc-dev-guide.rust-lang.org/thir.html",children:"THIR intermediate representation"})," of the code as provided by the Rust compiler."]}),"\n",(0,i.jsx)(n.p,{children:"Here is the link definition we will write, proven equivalent to the code above by construction:"}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-coq",children:"Definition run_max_by {T F : Set} `{Link T} `{Link F}\n (Run_FnOnce_for_F :\n function.FnOnce.Run\n F\n (Ref.t Pointer.Kind.Ref T * Ref.t Pointer.Kind.Ref T)\n (Output := Ordering.t)\n )\n (v1 v2 : T) (compare : F) :\n {{ cmp.max_by [] [ \u03a6 T; \u03a6 F ] [ \u03c6 v1; \u03c6 v2; \u03c6 compare ] \ud83d\udd3d T }}.\nProof.\n destruct Run_FnOnce_for_F as [[call_once [H_call_once run_call_once]]].\n run_symbolic.\n eapply Run.CallPrimitiveGetTraitMethod. {\n apply H_call_once.\n }\n run_symbolic.\n eapply Run.CallClosure. 
{\n apply (run_call_once compare (Ref.immediate _ v1, Ref.immediate _ v2)).\n }\n intros [ordering |]; cbn; [|run_symbolic].\n destruct ordering; run_symbolic.\nDefined.\n"})}),"\n",(0,i.jsxs)(n.p,{children:["The beginning of the definition corresponds to the trait resolution and calls to the ",(0,i.jsx)(n.code,{children:"compare"})," function. The last part with ",(0,i.jsx)(n.code,{children:"destruct ordering"})," is the representation of the ",(0,i.jsx)(n.code,{children:"match"})," statement in the Rust code. With this definition, we add explicit Rocq types instead of the universal ",(0,i.jsx)(n.code,{children:"Value.t"})," type of the translated code and make explicit the trait resolution. The trait instance has to be provided as an explicit parameter with the ",(0,i.jsx)(n.code,{children:"Run_FnOnce_for_F"})," argument."]}),"\n",(0,i.jsx)(n.p,{children:"With the statement:"}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-coq",children:"{{ cmp.max_by [] [ \u03a6 T; \u03a6 F ] [ \u03c6 v1; \u03c6 v2; \u03c6 compare ] \ud83d\udd3d T }}\n"})}),"\n",(0,i.jsxs)(n.p,{children:["we say that the translated function ",(0,i.jsx)(n.code,{children:"cmp.max_by"}),' has a "link" definition, built implicitly in the proof, returning a value of type ',(0,i.jsx)(n.code,{children:"T"}),". We can extract the definition of this function calling the primitive:"]}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-coq",children:"evaluate : forall {Output : Set} `{Link Output} {e : M},\n {{ e \ud83d\udd3d Output }} ->\n LowM.t (Output.t Output)\n"})}),"\n",(0,i.jsxs)(n.p,{children:['It returns a "link" computation in the ',(0,i.jsx)(n.code,{children:"LowM.t"}),' monad. The output is often unreadable as it is, but we can step through it by symbolic execution. 
This will be useful for the next step to define and prove equivalent the "simulations".']}),"\n",(0,i.jsx)(n.h2,{id:"-links-monad",children:"\ud83d\udd2e Link's monad"}),"\n",(0,i.jsxs)(n.p,{children:["Like the monad used for the translation of Rust programs by ",(0,i.jsx)(n.code,{children:"coq-of-rust"}),", the link's monad is a free monad but with fewer primitive operations. The primitive operations are only related to the memory handling:"]}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-coq",children:"Inductive t : Set -> Set :=\n| StateAlloc {A : Set} `{Link A} (value : A) : t (Ref.Core.t A)\n| StateRead {A : Set} `{Link A} (ref_core : Ref.Core.t A) : t A\n| StateWrite {A : Set} `{Link A} (ref_core : Ref.Core.t A) (value : A) : t unit\n| GetSubPointer {A Sub_A : Set} `{Link A} `{Link Sub_A}\n (ref_core : Ref.Core.t A) (runner : SubPointer.Runner.t A Sub_A) :\n t (Ref.Core.t Sub_A).\n"})}),"\n",(0,i.jsxs)(n.p,{children:["Compared to the side effects in the generated translation, we eliminate all the operations related to name handling (trait resolution, function calls, etc.). We also always use explicit types instead of the universal ",(0,i.jsx)(n.code,{children:"Value.t"})," type and get rid of the ",(0,i.jsx)(n.code,{children:"M.impossible"})," operation that was necessary to represent impossible branches in the absence of types."]}),"\n",(0,i.jsx)(n.h2,{id:"\ufe0f-conclusion",children:"\u2712\ufe0f Conclusion"}),"\n",(0,i.jsx)(n.p,{children:"We have presented our general strategy to formally verify large Rust codebases. 
In the next blog posts, we will go into more details to look at the definition of the proof of equivalence for the links, and at how we automate the most repetitive parts of the proofs."}),"\n",(0,i.jsx)(n.admonition,{title:"For more",type:"success",children:(0,i.jsx)(n.p,{children:(0,i.jsxs)(n.em,{children:["Follow us on ",(0,i.jsx)(n.a,{href:"https://x.com/FormalLand",children:"X"})," or ",(0,i.jsx)(n.a,{href:"https://fr.linkedin.com/company/formal-land",children:"LinkedIn"})," for more, or comment on this post below! Feel free to DM us for any questions or requests!"]})})})]})}function d(e={}){const{wrapper:n}={...(0,o.R)(),...e.components};return n?(0,i.jsx)(n,{...e,children:(0,i.jsx)(h,{...e})}):h(e)}},7867:(e,n,t)=>{t.d(n,{A:()=>i});const i=t.p+"assets/images/compilation-steps-28e6722120c86d9e17aa05e4d38f1515.svg"},3423:(e,n,t)=>{t.d(n,{A:()=>i});const i=t.p+"assets/images/green-forest-00d5ecf11d8ae6a919d6f5e8ce6aeb7e.webp"},8453:(e,n,t)=>{t.d(n,{R:()=>r,x:()=>a});var i=t(6540);const o={},s=i.createContext(o);function r(e){const n=i.useContext(s);return i.useMemo((function(){return"function"==typeof e?e(n):{...n,...e}}),[n,e])}function a(e){let n;return n=e.disableParentContext?"function"==typeof e.components?e.components(o):e.components||o:r(e.components),i.createElement(s.Provider,{value:n},e.children)}}}]); \ No newline at end of file diff --git a/assets/js/runtime~main.3bd98707.js b/assets/js/runtime~main.1ba6b1ef.js similarity index 98% rename from assets/js/runtime~main.3bd98707.js rename to assets/js/runtime~main.1ba6b1ef.js index 62ba3b09d..eed6affc3 100644 --- a/assets/js/runtime~main.3bd98707.js +++ b/assets/js/runtime~main.1ba6b1ef.js @@ -1 +1 @@ -(()=>{"use strict";var e,a,c,b,f,d={},t={};function r(e){var a=t[e];if(void 0!==a)return a.exports;var c=t[e]={exports:{}};return d[e].call(c.exports,c,c.exports,r),c.exports}r.m=d,e=[],r.O=(a,c,b,f)=>{if(!c){var 
d=1/0;for(i=0;i=f)&&Object.keys(r.O).every((e=>r.O[e](c[o])))?c.splice(o--,1):(t=!1,f0&&e[i-1][2]>f;i--)e[i]=e[i-1];e[i]=[c,b,f]},r.n=e=>{var a=e&&e.__esModule?()=>e.default:()=>e;return r.d(a,{a:a}),a},c=Object.getPrototypeOf?e=>Object.getPrototypeOf(e):e=>e.__proto__,r.t=function(e,b){if(1&b&&(e=this(e)),8&b)return e;if("object"==typeof e&&e){if(4&b&&e.__esModule)return e;if(16&b&&"function"==typeof e.then)return e}var f=Object.create(null);r.r(f);var d={};a=a||[null,c({}),c([]),c(c)];for(var t=2&b&&e;"object"==typeof t&&!~a.indexOf(t);t=c(t))Object.getOwnPropertyNames(t).forEach((a=>d[a]=()=>e[a]));return d.default=()=>e,r.d(f,d),f},r.d=(e,a)=>{for(var c in a)r.o(a,c)&&!r.o(e,c)&&Object.defineProperty(e,c,{enumerable:!0,get:a[c]})},r.f={},r.e=e=>Promise.all(Object.keys(r.f).reduce(((a,c)=>(r.f[c](e,a),a)),[])),r.u=e=>"assets/js/"+({55:"601ef4a9",58:"a431aef6",65:"3926085a",111:"ae3cae59",127:"1993cab4",291:"0e068da8",367:"917ab7c2",440:"9cf8b934",468:"40a5438f",469:"7d9726a8",471:"4ab1c6f7",498:"ba3e77ec",501:"b985990b",567:"b488c124",574:"890e518c",621:"25e70bc2",640:"8ca4d6e2",656:"72c84e71",697:"ab9c6cc7",721:"bb3f9e72",727:"f813a603",801:"6fa80661",818:"f32fe326",827:"566fc268",863:"a487208b",878:"c7369102",909:"3c80015b",910:"eb6aa549",952:"282cd1c8",970:"3229a8e9",992:"be355487",1019:"720a56e4",1023:"b13ba252",1040:"e34d4f16",1061:"582c3eb6",1079:"e93a9b61",1130:"37dfae32",1131:"36f6f17a",1194:"11ce4159",1338:"99451683",1339:"f1787281",1377:"615dbb02",1425:"f7d79ded",1444:"e56edc49",1477:"2df98331",1543:"d73ecefa",1599:"cec855a0",1674:"f99ec706",1688:"89bd0f60",1691:"806c182d",1692:"bf1ed0c4",1703:"79c03ce4",1716:"523d8f7d",1801:"424e7040",1805:"c445085e",1826:"1ef21e38",1864:"d94865d5",1921:"59b56f69",1924:"8bd24425",1975:"0d56dc46",1979:"a173baa0",1991:"b2b675dd",2070:"aac014fc",2094:"861700e4",2109:"bd0ed3e1",2159:"7236e08c",2173:"2519c48f",2176:"b3ad03cc",2206:"940891e7",2257:"f3a15648",2258:"bdf7c199",2274:"054c34c3",2323:"70899ae5",2442:"d38380b3",244
5:"74ab7bc8",2446:"d127eec6",2458:"0c4678a6",2470:"b146e155",2499:"7f18eb30",2538:"04e6399c",2620:"649e05e5",2644:"1fb147df",2676:"e704f625",2711:"9e4087bc",2728:"aa1d233f",2792:"8000e7cb",2864:"586a608e",2875:"bc490233",2937:"74366370",2962:"eada5b94",2968:"39093f96",3094:"d8ad77ec",3096:"439d3734",3102:"b2203129",3185:"ed3f27c2",3186:"68b0dfa4",3187:"4688d424",3214:"08b4cf62",3249:"ccc49370",3255:"9398d9b4",3290:"b12698d1",3339:"262b9cbb",3362:"d90ac61a",3437:"d31ecc05",3478:"35bc1c84",3568:"7d628d86",3639:"cf149e64",3715:"04f3873c",3787:"9e4c2aa5",3879:"5dc20450",3909:"8e0f4841",3933:"4bbf9573",3956:"305566bb",4002:"452bbf79",4075:"033e8252",4084:"e141b4eb",4116:"fd759d37",4222:"48ac1a5d",4230:"9072ab31",4282:"4f24d1ff",4347:"9bab1bd5",4354:"2adc0ba4",4358:"ffcdbdee",4402:"0928b497",4412:"fe5bf14f",4414:"f1f43052",4462:"3c0ee67f",4467:"02e13650",4478:"98e459ed",4497:"4071a8ab",4543:"4c1a3a9e",4557:"28295219",4559:"b6190d2c",4566:"bdd52cb1",4576:"f1c3ede6",4583:"1df93b7f",4617:"ae7616e5",4696:"ab4c6d72",4753:"908d74e3",4774:"f261144a",4809:"c190410d",4813:"6875c492",4827:"cce51cf2",4828:"1680c68e",4845:"a7cac7dd",4900:"bd9520f3",4907:"29813ec3",4935:"762ff625",4936:"4e6ea248",4970:"c9adbec0",4997:"a8f0412d",5104:"c6618825",5129:"6870d8e6",5213:"c9eb5c8c",5275:"22b2e39c",5302:"a24beb83",5315:"8db8ac2d",5341:"4cee49ee",5345:"2d92dfb9",5361:"65cc9109",5391:"7c0fea77",5428:"8db1271b",5439:"4d658fd1",5440:"834fe8c4",5452:"e417fa07",5457:"ff6b4ecf",5461:"beec3e5b",5578:"a8e6f3e5",5712:"eec17282",5739:"69205f08",5743:"c4d14b59",5756:"560c153f",5767:"8eb4e46b",5801:"2836b1c5",5845:"e6b868b1",5877:"530ea6aa",5894:"b2f554cd",5916:"a2c017b9",5927:"ea11de6c",5941:"52cd65da",5996:"45d50612",6079:"1681b0a1",6103:"fc3deafd",6171:"3b73c59b",6204:"2240d8ed",6244:"63152ce2",6260:"1c25c29d",6375:"d0b22415",6385:"42af9969",6390:"278479fb",6487:"f8de77c0",6521:"b346e459",6722:"74ae6181",6739:"fde865fd",6804:"51bceda4",6820:"0fb5280d",6837:"232c92ba",6840:"db6bff56",6895:"f4e61408",692
4:"1e7a46a0",6940:"16329161",7054:"59025a76",7065:"3ba35b71",7097:"cf3d20aa",7098:"a7bd4aaa",7145:"7aba737b",7175:"f6f9690d",7199:"f21d102b",7224:"91ac000a",7261:"6fd34e84",7266:"094a728b",7358:"be4406bf",7381:"a926bf88",7422:"44c9a67a",7472:"814f3328",7520:"f6ba3702",7545:"60862750",7580:"97c52b50",7608:"8b760ba7",7630:"e566aea2",7643:"a6aa9e1f",7650:"ad4ab9ff",7802:"c5d15731",7816:"3169fffb",7829:"680bc8cf",7838:"ae69f024",7900:"a5d7f2f4",8e3:"0fc5ff8e",8005:"b422d039",8013:"4a2980b2",8033:"a9f31b53",8059:"026ddec0",8209:"01a85c17",8246:"638be742",8305:"396effda",8382:"d41fa627",8387:"8ee64c0c",8401:"17896441",8411:"c5903cab",8457:"a06ffc17",8468:"b6692631",8530:"2e0973e2",8540:"b44e231f",8543:"54a1f05d",8581:"935f2afb",8630:"24926dff",8655:"e494d26d",8674:"f099533d",8718:"ae41b95b",8755:"047b1aeb",8772:"82eef687",8790:"92999a1c",8806:"76bc59ef",8820:"bc196327",8821:"892f03bb",8824:"4a4ddbec",8827:"90c66bca",8924:"661d2f35",8958:"491dd2b3",9030:"9b60c9cc",9048:"a94703ab",9059:"84c5b4db",9089:"c98d3999",9111:"34d4b30d",9166:"90d4f0a7",9267:"a7023ddc",9317:"0228dac2",9324:"e32cc564",9328:"36cb36bb",9347:"d7c1c49e",9363:"09cb7d6a",9384:"13e421cf",9424:"b1f9f584",9453:"65fdface",9464:"0b0e8aeb",9607:"b6982b7f",9611:"0ab8b207",9615:"38e3802c",9623:"3ed9a774",9647:"5e95c892",9729:"0a509f49",9741:"20f81dd5",9780:"3621c1a8",9797:"bee11635",9835:"5ef0175c",9837:"95638c7c",9873:"07c12bb3",9888:"a8fa71a4",9890:"0b4df4b7",9984:"97b22f94"}[e]||e)+"."+{55:"0566bb39",58:"7f994ef6",65:"52bc7bda",111:"27061dfb",127:"3a9ef318",291:"d5641d1d",367:"6701a2e9",440:"76d29dee",468:"cb3b4242",469:"1ee5c6cd",471:"1f64822e",498:"4311351a",501:"cfab5552",567:"854d6339",574:"a4a0b2bd",621:"63a8f50d",640:"af8dbf98",656:"39500e75",697:"c9d01eeb",721:"3f073000",727:"4b9cd53e",801:"15602f23",818:"d751af84",827:"68b2268e",863:"1bc6d62d",878:"956e30ef",909:"a90a7aec",910:"a8e7640e",952:"9c43247e",970:"8091dd17",992:"4f66f214",1019:"1b167f27",1023:"05a5a112",1040:"b9281c81",1061:"0b2a818b",1079:"4cb
d2c0b",1130:"6803d05b",1131:"c3f65053",1135:"5688adaf",1194:"3ecfaa5f",1338:"d5971c44",1339:"adc7efb4",1377:"54a223a2",1425:"d7b146ab",1444:"d4cfc3c4",1477:"d132f837",1543:"c1ba3faf",1599:"2090aac8",1674:"de011b59",1688:"c1af74bd",1691:"307569a2",1692:"d37d0c29",1703:"445e72c6",1716:"44154781",1801:"550f1c7f",1805:"7a1e67a5",1826:"159246fb",1864:"ee525089",1921:"9a395a96",1924:"91e9f4e0",1975:"1b9a8408",1979:"af43b3f6",1991:"e21e49af",2070:"af98a9d7",2094:"33c113c1",2109:"55f0eabc",2159:"c01f14a4",2173:"689cb8cb",2176:"51ccbfcf",2206:"95957777",2237:"75d0c357",2257:"e1c4d741",2258:"5d9bb219",2274:"94ef6ea1",2323:"60f2dd29",2442:"30d2b1ab",2445:"8e057809",2446:"9fd4f11a",2458:"1bce34fb",2470:"f1e87858",2499:"49c841a6",2538:"62874827",2620:"b9024547",2644:"90914a6b",2676:"913e7e1e",2711:"abea12d8",2728:"d12ffd65",2778:"459d8f64",2792:"4b1657cb",2864:"9e36b9fc",2875:"1fcb03e0",2937:"19041835",2962:"62ae8c5d",2968:"4f1f8dbf",3094:"ba824bf4",3096:"65cc6bd7",3102:"d975815b",3185:"9f3bbfb3",3186:"3fa6f2d9",3187:"a6ffa19c",3214:"a39a8a85",3249:"c436d061",3255:"c1f5101c",3290:"12f9fc49",3339:"031ac7d5",3362:"3d9802c1",3437:"87d3f793",3478:"5f6dae6f",3568:"82871bec",3639:"3f898b18",3715:"f56fcc6f",3787:"fb5bcc44",3879:"9ee7e094",3909:"56628669",3933:"046d339f",3956:"fb412b55",4002:"532e2413",4075:"bee38e71",4084:"0fe5f8c1",4116:"18817edc",4222:"bba114da",4230:"f89c79b2",4282:"e8084434",4347:"917e1a75",4354:"85bd53f0",4358:"0a98e02f",4402:"fa1054b1",4412:"281f9e28",4414:"986a7113",4462:"b2e48da1",4467:"51aaa417",4478:"1ac289d7",4497:"e6c0ffb1",4543:"8bd6688c",4557:"69a51e2e",4559:"256f86b9",4566:"718e8bdd",4576:"83e373ba",4583:"1abdc564",4617:"7e2bba99",4696:"5597b9c4",4753:"7908f09d",4774:"3bb47763",4809:"3a8464b4",4813:"c9388d9f",4827:"cd88487a",4828:"86931fb1",4845:"eff847ea",4900:"00d2c290",4907:"c0a449f0",4935:"eefe61ce",4936:"808789e2",4970:"28925304",4997:"d5710eaa",5104:"6956da23",5129:"9c7f689f",5213:"a33d4f7b",5275:"8456260d",5302:"39c5171b",5315:"75ba66f0",5341:"b41
26614",5345:"f2e6fa73",5361:"93d0d593",5391:"dc63af7e",5428:"010b1de8",5439:"c4687111",5440:"788459cb",5452:"50caf0f1",5457:"c54ec846",5461:"79da6fc9",5578:"66de05ff",5712:"10aeb859",5739:"56ecc9eb",5743:"2a9344a6",5756:"41ded959",5767:"d24befa3",5801:"d99fa637",5845:"436010fd",5877:"cc097bef",5894:"29cd5ada",5916:"3a8cfe67",5927:"c2fc3c6e",5941:"eabf7ad4",5996:"af60c555",6079:"3a5db64e",6103:"a34db9bb",6171:"7f38da2f",6204:"7c254684",6244:"9c4cecc6",6260:"14f324e8",6375:"0ae76ec3",6385:"6035ca6c",6390:"bb0f3448",6487:"810ec2f9",6521:"0f6c907e",6722:"f27bb2e2",6739:"6a43a6f2",6804:"a85d6ede",6820:"2099001b",6837:"769080ce",6840:"e8c3bf9d",6895:"51d962f2",6924:"7222bda8",6940:"48a27581",7054:"30e28783",7065:"4977536c",7097:"dec12943",7098:"bd449a7d",7145:"d9ebecc2",7175:"aa0a4723",7199:"197997a1",7224:"301fb2b3",7261:"51e0f4f3",7266:"32ff0fa8",7358:"87c7aa6f",7381:"5d511376",7422:"98bbfa0c",7472:"3409d71b",7520:"a45cd4cc",7545:"5752fad7",7580:"cf02f41b",7608:"6af50133",7630:"73da8de5",7643:"19e21a39",7650:"5cd78671",7802:"32cee3b5",7816:"d7857f76",7829:"3c4c4e84",7838:"3194a050",7900:"fbc15c8f",8e3:"f9774b47",8005:"830866b6",8013:"973656cb",8033:"9f044d65",8059:"c98dbcf1",8209:"de3cf12b",8246:"4919ddae",8305:"8b1c4f04",8382:"c640f9c7",8387:"72e4c6f2",8401:"d4130c19",8411:"0416d517",8457:"1aa47368",8468:"3bedc53e",8530:"f3dd583a",8540:"b4a2a703",8543:"5780d3dc",8581:"62e80fa9",8630:"ef8131f6",8655:"68a422ef",8674:"88ad5309",8706:"959c499a",8718:"705b088b",8755:"971011ce",8772:"d8580b07",8790:"a3ef100d",8806:"16737830",8820:"f63bbeed",8821:"eb814a1d",8824:"29cb0317",8827:"a894595c",8924:"c9be2397",8958:"69886543",9030:"97d019ec",9048:"bfd106c6",9059:"8e27b63f",9089:"90feed96",9111:"b06aed8a",9166:"ca220213",9267:"0b0e547a",9317:"eb9a1920",9324:"dac5d5b3",9328:"85d65ba9",9347:"4d4ec702",9363:"6550bed7",9384:"6d9fb24e",9424:"5cae9820",9453:"bbf89ec9",9464:"c997cb3d",9607:"06adb013",9611:"d28c208b",9615:"05a9a47c",9623:"8386ecba",9647:"25919f8a",9729:"86f183e3",9741:"8580
9e97",9780:"a3411121",9797:"c8d6cb73",9835:"610603d3",9837:"e259b87d",9873:"40d473f0",9888:"8b907f5b",9890:"96c8132b",9984:"442a9d5f"}[e]+".js",r.miniCssF=e=>{},r.g=function(){if("object"==typeof globalThis)return globalThis;try{return this||new Function("return this")()}catch(e){if("object"==typeof window)return window}}(),r.o=(e,a)=>Object.prototype.hasOwnProperty.call(e,a),b={},f="formal-land:",r.l=(e,a,c,d)=>{if(b[e])b[e].push(a);else{var t,o;if(void 0!==c)for(var n=document.getElementsByTagName("script"),i=0;i{t.onerror=t.onload=null,clearTimeout(s);var f=b[e];if(delete b[e],t.parentNode&&t.parentNode.removeChild(t),f&&f.forEach((e=>e(c))),a)return a(c)},s=setTimeout(u.bind(null,void 0,{type:"timeout",target:t}),12e4);t.onerror=u.bind(null,t.onerror),t.onload=u.bind(null,t.onload),o&&document.head.appendChild(t)}},r.r=e=>{"undefined"!=typeof Symbol&&Symbol.toStringTag&&Object.defineProperty(e,Symbol.toStringTag,{value:"Module"}),Object.defineProperty(e,"__esModule",{value:!0})},r.p="/",r.gca=function(e){return 
e={16329161:"6940",17896441:"8401",28295219:"4557",60862750:"7545",74366370:"2937",99451683:"1338","601ef4a9":"55",a431aef6:"58","3926085a":"65",ae3cae59:"111","1993cab4":"127","0e068da8":"291","917ab7c2":"367","9cf8b934":"440","40a5438f":"468","7d9726a8":"469","4ab1c6f7":"471",ba3e77ec:"498",b985990b:"501",b488c124:"567","890e518c":"574","25e70bc2":"621","8ca4d6e2":"640","72c84e71":"656",ab9c6cc7:"697",bb3f9e72:"721",f813a603:"727","6fa80661":"801",f32fe326:"818","566fc268":"827",a487208b:"863",c7369102:"878","3c80015b":"909",eb6aa549:"910","282cd1c8":"952","3229a8e9":"970",be355487:"992","720a56e4":"1019",b13ba252:"1023",e34d4f16:"1040","582c3eb6":"1061",e93a9b61:"1079","37dfae32":"1130","36f6f17a":"1131","11ce4159":"1194",f1787281:"1339","615dbb02":"1377",f7d79ded:"1425",e56edc49:"1444","2df98331":"1477",d73ecefa:"1543",cec855a0:"1599",f99ec706:"1674","89bd0f60":"1688","806c182d":"1691",bf1ed0c4:"1692","79c03ce4":"1703","523d8f7d":"1716","424e7040":"1801",c445085e:"1805","1ef21e38":"1826",d94865d5:"1864","59b56f69":"1921","8bd24425":"1924","0d56dc46":"1975",a173baa0:"1979",b2b675dd:"1991",aac014fc:"2070","861700e4":"2094",bd0ed3e1:"2109","7236e08c":"2159","2519c48f":"2173",b3ad03cc:"2176","940891e7":"2206",f3a15648:"2257",bdf7c199:"2258","054c34c3":"2274","70899ae5":"2323",d38380b3:"2442","74ab7bc8":"2445",d127eec6:"2446","0c4678a6":"2458",b146e155:"2470","7f18eb30":"2499","04e6399c":"2538","649e05e5":"2620","1fb147df":"2644",e704f625:"2676","9e4087bc":"2711",aa1d233f:"2728","8000e7cb":"2792","586a608e":"2864",bc490233:"2875",eada5b94:"2962","39093f96":"2968",d8ad77ec:"3094","439d3734":"3096",b2203129:"3102",ed3f27c2:"3185","68b0dfa4":"3186","4688d424":"3187","08b4cf62":"3214",ccc49370:"3249","9398d9b4":"3255",b12698d1:"3290","262b9cbb":"3339",d90ac61a:"3362",d31ecc05:"3437","35bc1c84":"3478","7d628d86":"3568",cf149e64:"3639","04f3873c":"3715","9e4c2aa5":"3787","5dc20450":"3879","8e0f4841":"3909","4bbf9573":"3933","305566bb":"3956","452bbf79":"4002","033e8252":"4
075",e141b4eb:"4084",fd759d37:"4116","48ac1a5d":"4222","9072ab31":"4230","4f24d1ff":"4282","9bab1bd5":"4347","2adc0ba4":"4354",ffcdbdee:"4358","0928b497":"4402",fe5bf14f:"4412",f1f43052:"4414","3c0ee67f":"4462","02e13650":"4467","98e459ed":"4478","4071a8ab":"4497","4c1a3a9e":"4543",b6190d2c:"4559",bdd52cb1:"4566",f1c3ede6:"4576","1df93b7f":"4583",ae7616e5:"4617",ab4c6d72:"4696","908d74e3":"4753",f261144a:"4774",c190410d:"4809","6875c492":"4813",cce51cf2:"4827","1680c68e":"4828",a7cac7dd:"4845",bd9520f3:"4900","29813ec3":"4907","762ff625":"4935","4e6ea248":"4936",c9adbec0:"4970",a8f0412d:"4997",c6618825:"5104","6870d8e6":"5129",c9eb5c8c:"5213","22b2e39c":"5275",a24beb83:"5302","8db8ac2d":"5315","4cee49ee":"5341","2d92dfb9":"5345","65cc9109":"5361","7c0fea77":"5391","8db1271b":"5428","4d658fd1":"5439","834fe8c4":"5440",e417fa07:"5452",ff6b4ecf:"5457",beec3e5b:"5461",a8e6f3e5:"5578",eec17282:"5712","69205f08":"5739",c4d14b59:"5743","560c153f":"5756","8eb4e46b":"5767","2836b1c5":"5801",e6b868b1:"5845","530ea6aa":"5877",b2f554cd:"5894",a2c017b9:"5916",ea11de6c:"5927","52cd65da":"5941","45d50612":"5996","1681b0a1":"6079",fc3deafd:"6103","3b73c59b":"6171","2240d8ed":"6204","63152ce2":"6244","1c25c29d":"6260",d0b22415:"6375","42af9969":"6385","278479fb":"6390",f8de77c0:"6487",b346e459:"6521","74ae6181":"6722",fde865fd:"6739","51bceda4":"6804","0fb5280d":"6820","232c92ba":"6837",db6bff56:"6840",f4e61408:"6895","1e7a46a0":"6924","59025a76":"7054","3ba35b71":"7065",cf3d20aa:"7097",a7bd4aaa:"7098","7aba737b":"7145",f6f9690d:"7175",f21d102b:"7199","91ac000a":"7224","6fd34e84":"7261","094a728b":"7266",be4406bf:"7358",a926bf88:"7381","44c9a67a":"7422","814f3328":"7472",f6ba3702:"7520","97c52b50":"7580","8b760ba7":"7608",e566aea2:"7630",a6aa9e1f:"7643",ad4ab9ff:"7650",c5d15731:"7802","3169fffb":"7816","680bc8cf":"7829",ae69f024:"7838",a5d7f2f4:"7900","0fc5ff8e":"8000",b422d039:"8005","4a2980b2":"8013",a9f31b53:"8033","026ddec0":"8059","01a85c17":"8209","638be742":"8246","396effda":
"8305",d41fa627:"8382","8ee64c0c":"8387",c5903cab:"8411",a06ffc17:"8457",b6692631:"8468","2e0973e2":"8530",b44e231f:"8540","54a1f05d":"8543","935f2afb":"8581","24926dff":"8630",e494d26d:"8655",f099533d:"8674",ae41b95b:"8718","047b1aeb":"8755","82eef687":"8772","92999a1c":"8790","76bc59ef":"8806",bc196327:"8820","892f03bb":"8821","4a4ddbec":"8824","90c66bca":"8827","661d2f35":"8924","491dd2b3":"8958","9b60c9cc":"9030",a94703ab:"9048","84c5b4db":"9059",c98d3999:"9089","34d4b30d":"9111","90d4f0a7":"9166",a7023ddc:"9267","0228dac2":"9317",e32cc564:"9324","36cb36bb":"9328",d7c1c49e:"9347","09cb7d6a":"9363","13e421cf":"9384",b1f9f584:"9424","65fdface":"9453","0b0e8aeb":"9464",b6982b7f:"9607","0ab8b207":"9611","38e3802c":"9615","3ed9a774":"9623","5e95c892":"9647","0a509f49":"9729","20f81dd5":"9741","3621c1a8":"9780",bee11635:"9797","5ef0175c":"9835","95638c7c":"9837","07c12bb3":"9873",a8fa71a4:"9888","0b4df4b7":"9890","97b22f94":"9984"}[e]||e,r.p+r.u(e)},(()=>{var e={5354:0,1869:0};r.f.j=(a,c)=>{var b=r.o(e,a)?e[a]:void 0;if(0!==b)if(b)c.push(b[2]);else if(/^(1869|5354)$/.test(a))e[a]=0;else{var f=new Promise(((c,f)=>b=e[a]=[c,f]));c.push(b[2]=f);var d=r.p+r.u(a),t=new Error;r.l(d,(c=>{if(r.o(e,a)&&(0!==(b=e[a])&&(e[a]=void 0),b)){var f=c&&("load"===c.type?"missing":c.type),d=c&&c.target&&c.target.src;t.message="Loading chunk "+a+" failed.\n("+f+": "+d+")",t.name="ChunkLoadError",t.type=f,t.request=d,b[1](t)}}),"chunk-"+a,a)}},r.O.j=a=>0===e[a];var a=(a,c)=>{var b,f,d=c[0],t=c[1],o=c[2],n=0;if(d.some((a=>0!==e[a]))){for(b in t)r.o(t,b)&&(r.m[b]=t[b]);if(o)var i=o(r)}for(a&&a(c);n{"use strict";var e,a,c,b,f,d={},t={};function r(e){var a=t[e];if(void 0!==a)return a.exports;var c=t[e]={exports:{}};return d[e].call(c.exports,c,c.exports,r),c.exports}r.m=d,e=[],r.O=(a,c,b,f)=>{if(!c){var d=1/0;for(i=0;i=f)&&Object.keys(r.O).every((e=>r.O[e](c[o])))?c.splice(o--,1):(t=!1,f0&&e[i-1][2]>f;i--)e[i]=e[i-1];e[i]=[c,b,f]},r.n=e=>{var a=e&&e.__esModule?()=>e.default:()=>e;return 
r.d(a,{a:a}),a},c=Object.getPrototypeOf?e=>Object.getPrototypeOf(e):e=>e.__proto__,r.t=function(e,b){if(1&b&&(e=this(e)),8&b)return e;if("object"==typeof e&&e){if(4&b&&e.__esModule)return e;if(16&b&&"function"==typeof e.then)return e}var f=Object.create(null);r.r(f);var d={};a=a||[null,c({}),c([]),c(c)];for(var t=2&b&&e;"object"==typeof t&&!~a.indexOf(t);t=c(t))Object.getOwnPropertyNames(t).forEach((a=>d[a]=()=>e[a]));return d.default=()=>e,r.d(f,d),f},r.d=(e,a)=>{for(var c in a)r.o(a,c)&&!r.o(e,c)&&Object.defineProperty(e,c,{enumerable:!0,get:a[c]})},r.f={},r.e=e=>Promise.all(Object.keys(r.f).reduce(((a,c)=>(r.f[c](e,a),a)),[])),r.u=e=>"assets/js/"+({55:"601ef4a9",58:"a431aef6",65:"3926085a",111:"ae3cae59",127:"1993cab4",291:"0e068da8",367:"917ab7c2",440:"9cf8b934",468:"40a5438f",469:"7d9726a8",471:"4ab1c6f7",498:"ba3e77ec",501:"b985990b",567:"b488c124",574:"890e518c",621:"25e70bc2",640:"8ca4d6e2",656:"72c84e71",697:"ab9c6cc7",721:"bb3f9e72",727:"f813a603",801:"6fa80661",818:"f32fe326",827:"566fc268",863:"a487208b",878:"c7369102",909:"3c80015b",910:"eb6aa549",952:"282cd1c8",970:"3229a8e9",992:"be355487",1019:"720a56e4",1023:"b13ba252",1040:"e34d4f16",1061:"582c3eb6",1079:"e93a9b61",1130:"37dfae32",1131:"36f6f17a",1194:"11ce4159",1338:"99451683",1339:"f1787281",1377:"615dbb02",1425:"f7d79ded",1444:"e56edc49",1477:"2df98331",1543:"d73ecefa",1599:"cec855a0",1674:"f99ec706",1688:"89bd0f60",1691:"806c182d",1692:"bf1ed0c4",1703:"79c03ce4",1716:"523d8f7d",1801:"424e7040",1805:"c445085e",1826:"1ef21e38",1864:"d94865d5",1921:"59b56f69",1924:"8bd24425",1975:"0d56dc46",1979:"a173baa0",1991:"b2b675dd",2070:"aac014fc",2094:"861700e4",2109:"bd0ed3e1",2159:"7236e08c",2173:"2519c48f",2176:"b3ad03cc",2206:"940891e7",2257:"f3a15648",2258:"bdf7c199",2274:"054c34c3",2323:"70899ae5",2442:"d38380b3",2445:"74ab7bc8",2446:"d127eec6",2458:"0c4678a6",2470:"b146e155",2499:"7f18eb30",2538:"04e6399c",2620:"649e05e5",2644:"1fb147df",2676:"e704f625",2711:"9e4087bc",2728:"aa1d233f",2792:"8000e7cb
",2864:"586a608e",2875:"bc490233",2937:"74366370",2962:"eada5b94",2968:"39093f96",3094:"d8ad77ec",3096:"439d3734",3102:"b2203129",3185:"ed3f27c2",3186:"68b0dfa4",3187:"4688d424",3214:"08b4cf62",3249:"ccc49370",3255:"9398d9b4",3290:"b12698d1",3339:"262b9cbb",3362:"d90ac61a",3437:"d31ecc05",3478:"35bc1c84",3568:"7d628d86",3639:"cf149e64",3715:"04f3873c",3787:"9e4c2aa5",3879:"5dc20450",3909:"8e0f4841",3933:"4bbf9573",3956:"305566bb",4002:"452bbf79",4075:"033e8252",4084:"e141b4eb",4116:"fd759d37",4222:"48ac1a5d",4230:"9072ab31",4282:"4f24d1ff",4347:"9bab1bd5",4354:"2adc0ba4",4358:"ffcdbdee",4402:"0928b497",4412:"fe5bf14f",4414:"f1f43052",4462:"3c0ee67f",4467:"02e13650",4478:"98e459ed",4497:"4071a8ab",4543:"4c1a3a9e",4557:"28295219",4559:"b6190d2c",4566:"bdd52cb1",4576:"f1c3ede6",4583:"1df93b7f",4617:"ae7616e5",4696:"ab4c6d72",4753:"908d74e3",4774:"f261144a",4809:"c190410d",4813:"6875c492",4827:"cce51cf2",4828:"1680c68e",4845:"a7cac7dd",4900:"bd9520f3",4907:"29813ec3",4935:"762ff625",4936:"4e6ea248",4970:"c9adbec0",4997:"a8f0412d",5104:"c6618825",5129:"6870d8e6",5213:"c9eb5c8c",5275:"22b2e39c",5302:"a24beb83",5315:"8db8ac2d",5341:"4cee49ee",5345:"2d92dfb9",5361:"65cc9109",5391:"7c0fea77",5428:"8db1271b",5439:"4d658fd1",5440:"834fe8c4",5452:"e417fa07",5457:"ff6b4ecf",5461:"beec3e5b",5578:"a8e6f3e5",5712:"eec17282",5739:"69205f08",5743:"c4d14b59",5756:"560c153f",5767:"8eb4e46b",5801:"2836b1c5",5845:"e6b868b1",5877:"530ea6aa",5894:"b2f554cd",5916:"a2c017b9",5927:"ea11de6c",5941:"52cd65da",5996:"45d50612",6079:"1681b0a1",6103:"fc3deafd",6171:"3b73c59b",6204:"2240d8ed",6244:"63152ce2",6260:"1c25c29d",6375:"d0b22415",6385:"42af9969",6390:"278479fb",6487:"f8de77c0",6521:"b346e459",6722:"74ae6181",6739:"fde865fd",6804:"51bceda4",6820:"0fb5280d",6837:"232c92ba",6840:"db6bff56",6895:"f4e61408",6924:"1e7a46a0",6940:"16329161",7054:"59025a76",7065:"3ba35b71",7097:"cf3d20aa",7098:"a7bd4aaa",7145:"7aba737b",7175:"f6f9690d",7199:"f21d102b",7224:"91ac000a",7261:"6fd34e84",7266:"094a728b
",7358:"be4406bf",7381:"a926bf88",7422:"44c9a67a",7472:"814f3328",7520:"f6ba3702",7545:"60862750",7580:"97c52b50",7608:"8b760ba7",7630:"e566aea2",7643:"a6aa9e1f",7650:"ad4ab9ff",7802:"c5d15731",7816:"3169fffb",7829:"680bc8cf",7838:"ae69f024",7900:"a5d7f2f4",8e3:"0fc5ff8e",8005:"b422d039",8013:"4a2980b2",8033:"a9f31b53",8059:"026ddec0",8209:"01a85c17",8246:"638be742",8305:"396effda",8382:"d41fa627",8387:"8ee64c0c",8401:"17896441",8411:"c5903cab",8457:"a06ffc17",8468:"b6692631",8530:"2e0973e2",8540:"b44e231f",8543:"54a1f05d",8581:"935f2afb",8630:"24926dff",8655:"e494d26d",8674:"f099533d",8718:"ae41b95b",8755:"047b1aeb",8772:"82eef687",8790:"92999a1c",8806:"76bc59ef",8820:"bc196327",8821:"892f03bb",8824:"4a4ddbec",8827:"90c66bca",8924:"661d2f35",8958:"491dd2b3",9030:"9b60c9cc",9048:"a94703ab",9059:"84c5b4db",9089:"c98d3999",9111:"34d4b30d",9166:"90d4f0a7",9267:"a7023ddc",9317:"0228dac2",9324:"e32cc564",9328:"36cb36bb",9347:"d7c1c49e",9363:"09cb7d6a",9384:"13e421cf",9424:"b1f9f584",9453:"65fdface",9464:"0b0e8aeb",9607:"b6982b7f",9611:"0ab8b207",9615:"38e3802c",9623:"3ed9a774",9647:"5e95c892",9729:"0a509f49",9741:"20f81dd5",9780:"3621c1a8",9797:"bee11635",9835:"5ef0175c",9837:"95638c7c",9873:"07c12bb3",9888:"a8fa71a4",9890:"0b4df4b7",9984:"97b22f94"}[e]||e)+"."+{55:"0566bb39",58:"7f994ef6",65:"52bc7bda",111:"27061dfb",127:"3a9ef318",291:"d5641d1d",367:"6701a2e9",440:"76d29dee",468:"cb3b4242",469:"1ee5c6cd",471:"1f64822e",498:"4311351a",501:"cfab5552",567:"854d6339",574:"a4a0b2bd",621:"63a8f50d",640:"af8dbf98",656:"39500e75",697:"c9d01eeb",721:"3f073000",727:"4b9cd53e",801:"15602f23",818:"d751af84",827:"68b2268e",863:"1bc6d62d",878:"956e30ef",909:"a90a7aec",910:"a8e7640e",952:"9c43247e",970:"8091dd17",992:"4f66f214",1019:"1b167f27",1023:"05a5a112",1040:"b9281c81",1061:"0b2a818b",1079:"4cbd2c0b",1130:"6803d05b",1131:"c3f65053",1135:"5688adaf",1194:"3ecfaa5f",1338:"d5971c44",1339:"adc7efb4",1377:"54a223a2",1425:"d7b146ab",1444:"d4cfc3c4",1477:"d132f837",1543:"c1ba3faf",1599
:"2090aac8",1674:"de011b59",1688:"c1af74bd",1691:"307569a2",1692:"d37d0c29",1703:"445e72c6",1716:"44154781",1801:"550f1c7f",1805:"7a1e67a5",1826:"159246fb",1864:"ee525089",1921:"9a395a96",1924:"91e9f4e0",1975:"1b9a8408",1979:"af43b3f6",1991:"e21e49af",2070:"af98a9d7",2094:"33c113c1",2109:"55f0eabc",2159:"c01f14a4",2173:"689cb8cb",2176:"51ccbfcf",2206:"95957777",2237:"75d0c357",2257:"e1c4d741",2258:"5d9bb219",2274:"94ef6ea1",2323:"60f2dd29",2442:"30d2b1ab",2445:"8e057809",2446:"9fd4f11a",2458:"1bce34fb",2470:"f1e87858",2499:"49c841a6",2538:"62874827",2620:"b9024547",2644:"90914a6b",2676:"913e7e1e",2711:"abea12d8",2728:"d12ffd65",2778:"459d8f64",2792:"4b1657cb",2864:"9e36b9fc",2875:"1fcb03e0",2937:"19041835",2962:"62ae8c5d",2968:"4f1f8dbf",3094:"ba824bf4",3096:"65cc6bd7",3102:"d975815b",3185:"9f3bbfb3",3186:"3fa6f2d9",3187:"a6ffa19c",3214:"a39a8a85",3249:"c436d061",3255:"c1f5101c",3290:"12f9fc49",3339:"031ac7d5",3362:"3d9802c1",3437:"87d3f793",3478:"5f6dae6f",3568:"82871bec",3639:"3f898b18",3715:"f56fcc6f",3787:"fb5bcc44",3879:"9ee7e094",3909:"56628669",3933:"046d339f",3956:"fb412b55",4002:"532e2413",4075:"bee38e71",4084:"0fe5f8c1",4116:"18817edc",4222:"bba114da",4230:"f89c79b2",4282:"e8084434",4347:"917e1a75",4354:"85bd53f0",4358:"0a98e02f",4402:"fa1054b1",4412:"281f9e28",4414:"986a7113",4462:"b2e48da1",4467:"51aaa417",4478:"1ac289d7",4497:"e6c0ffb1",4543:"8bd6688c",4557:"69a51e2e",4559:"d4342f7f",4566:"718e8bdd",4576:"83e373ba",4583:"1abdc564",4617:"7e2bba99",4696:"5597b9c4",4753:"7908f09d",4774:"3bb47763",4809:"3a8464b4",4813:"c9388d9f",4827:"cd88487a",4828:"86931fb1",4845:"eff847ea",4900:"00d2c290",4907:"c0a449f0",4935:"eefe61ce",4936:"808789e2",4970:"28925304",4997:"d5710eaa",5104:"6956da23",5129:"9c7f689f",5213:"a33d4f7b",5275:"8456260d",5302:"39c5171b",5315:"75ba66f0",5341:"b4126614",5345:"f2e6fa73",5361:"93d0d593",5391:"dc63af7e",5428:"010b1de8",5439:"c4687111",5440:"788459cb",5452:"50caf0f1",5457:"c54ec846",5461:"79da6fc9",5578:"66de05ff",5712:"10aeb859",5739
:"56ecc9eb",5743:"2a9344a6",5756:"41ded959",5767:"d24befa3",5801:"d99fa637",5845:"436010fd",5877:"cc097bef",5894:"a9c17cae",5916:"3a8cfe67",5927:"c2fc3c6e",5941:"eabf7ad4",5996:"af60c555",6079:"3a5db64e",6103:"a34db9bb",6171:"7f38da2f",6204:"7c254684",6244:"9c4cecc6",6260:"14f324e8",6375:"0ae76ec3",6385:"6035ca6c",6390:"bb0f3448",6487:"810ec2f9",6521:"0f6c907e",6722:"f27bb2e2",6739:"6a43a6f2",6804:"a85d6ede",6820:"2099001b",6837:"769080ce",6840:"e8c3bf9d",6895:"51d962f2",6924:"7222bda8",6940:"48a27581",7054:"30e28783",7065:"4977536c",7097:"dec12943",7098:"bd449a7d",7145:"d9ebecc2",7175:"aa0a4723",7199:"197997a1",7224:"301fb2b3",7261:"51e0f4f3",7266:"32ff0fa8",7358:"87c7aa6f",7381:"5d511376",7422:"98bbfa0c",7472:"3409d71b",7520:"a45cd4cc",7545:"5752fad7",7580:"cf02f41b",7608:"6af50133",7630:"73da8de5",7643:"19e21a39",7650:"5cd78671",7802:"32cee3b5",7816:"d7857f76",7829:"3c4c4e84",7838:"3194a050",7900:"fbc15c8f",8e3:"f9774b47",8005:"830866b6",8013:"973656cb",8033:"9f044d65",8059:"c98dbcf1",8209:"de3cf12b",8246:"4919ddae",8305:"8b1c4f04",8382:"c640f9c7",8387:"72e4c6f2",8401:"d4130c19",8411:"0416d517",8457:"1aa47368",8468:"3bedc53e",8530:"f3dd583a",8540:"b4a2a703",8543:"5780d3dc",8581:"62e80fa9",8630:"ef8131f6",8655:"68a422ef",8674:"88ad5309",8706:"959c499a",8718:"705b088b",8755:"971011ce",8772:"d8580b07",8790:"a3ef100d",8806:"16737830",8820:"f63bbeed",8821:"eb814a1d",8824:"29cb0317",8827:"a894595c",8924:"c9be2397",8958:"69886543",9030:"97d019ec",9048:"bfd106c6",9059:"8e27b63f",9089:"90feed96",9111:"b06aed8a",9166:"ca220213",9267:"0b0e547a",9317:"eb9a1920",9324:"dac5d5b3",9328:"85d65ba9",9347:"4d4ec702",9363:"6550bed7",9384:"6d9fb24e",9424:"5cae9820",9453:"bbf89ec9",9464:"c997cb3d",9607:"06adb013",9611:"d28c208b",9615:"05a9a47c",9623:"8386ecba",9647:"25919f8a",9729:"86f183e3",9741:"85809e97",9780:"a3411121",9797:"c8d6cb73",9835:"610603d3",9837:"e259b87d",9873:"40d473f0",9888:"8b907f5b",9890:"96c8132b",9984:"442a9d5f"}[e]+".js",r.miniCssF=e=>{},r.g=function(){if("object"
==typeof globalThis)return globalThis;try{return this||new Function("return this")()}catch(e){if("object"==typeof window)return window}}(),r.o=(e,a)=>Object.prototype.hasOwnProperty.call(e,a),b={},f="formal-land:",r.l=(e,a,c,d)=>{if(b[e])b[e].push(a);else{var t,o;if(void 0!==c)for(var n=document.getElementsByTagName("script"),i=0;i{t.onerror=t.onload=null,clearTimeout(s);var f=b[e];if(delete b[e],t.parentNode&&t.parentNode.removeChild(t),f&&f.forEach((e=>e(c))),a)return a(c)},s=setTimeout(u.bind(null,void 0,{type:"timeout",target:t}),12e4);t.onerror=u.bind(null,t.onerror),t.onload=u.bind(null,t.onload),o&&document.head.appendChild(t)}},r.r=e=>{"undefined"!=typeof Symbol&&Symbol.toStringTag&&Object.defineProperty(e,Symbol.toStringTag,{value:"Module"}),Object.defineProperty(e,"__esModule",{value:!0})},r.p="/",r.gca=function(e){return e={16329161:"6940",17896441:"8401",28295219:"4557",60862750:"7545",74366370:"2937",99451683:"1338","601ef4a9":"55",a431aef6:"58","3926085a":"65",ae3cae59:"111","1993cab4":"127","0e068da8":"291","917ab7c2":"367","9cf8b934":"440","40a5438f":"468","7d9726a8":"469","4ab1c6f7":"471",ba3e77ec:"498",b985990b:"501",b488c124:"567","890e518c":"574","25e70bc2":"621","8ca4d6e2":"640","72c84e71":"656",ab9c6cc7:"697",bb3f9e72:"721",f813a603:"727","6fa80661":"801",f32fe326:"818","566fc268":"827",a487208b:"863",c7369102:"878","3c80015b":"909",eb6aa549:"910","282cd1c8":"952","3229a8e9":"970",be355487:"992","720a56e4":"1019",b13ba252:"1023",e34d4f16:"1040","582c3eb6":"1061",e93a9b61:"1079","37dfae32":"1130","36f6f17a":"1131","11ce4159":"1194",f1787281:"1339","615dbb02":"1377",f7d79ded:"1425",e56edc49:"1444","2df98331":"1477",d73ecefa:"1543",cec855a0:"1599",f99ec706:"1674","89bd0f60":"1688","806c182d":"1691",bf1ed0c4:"1692","79c03ce4":"1703","523d8f7d":"1716","424e7040":"1801",c445085e:"1805","1ef21e38":"1826",d94865d5:"1864","59b56f69":"1921","8bd24425":"1924","0d56dc46":"1975",a173baa0:"1979",b2b675dd:"1991",aac014fc:"2070","861700e4":"2094",bd0ed3e1:"210
9","7236e08c":"2159","2519c48f":"2173",b3ad03cc:"2176","940891e7":"2206",f3a15648:"2257",bdf7c199:"2258","054c34c3":"2274","70899ae5":"2323",d38380b3:"2442","74ab7bc8":"2445",d127eec6:"2446","0c4678a6":"2458",b146e155:"2470","7f18eb30":"2499","04e6399c":"2538","649e05e5":"2620","1fb147df":"2644",e704f625:"2676","9e4087bc":"2711",aa1d233f:"2728","8000e7cb":"2792","586a608e":"2864",bc490233:"2875",eada5b94:"2962","39093f96":"2968",d8ad77ec:"3094","439d3734":"3096",b2203129:"3102",ed3f27c2:"3185","68b0dfa4":"3186","4688d424":"3187","08b4cf62":"3214",ccc49370:"3249","9398d9b4":"3255",b12698d1:"3290","262b9cbb":"3339",d90ac61a:"3362",d31ecc05:"3437","35bc1c84":"3478","7d628d86":"3568",cf149e64:"3639","04f3873c":"3715","9e4c2aa5":"3787","5dc20450":"3879","8e0f4841":"3909","4bbf9573":"3933","305566bb":"3956","452bbf79":"4002","033e8252":"4075",e141b4eb:"4084",fd759d37:"4116","48ac1a5d":"4222","9072ab31":"4230","4f24d1ff":"4282","9bab1bd5":"4347","2adc0ba4":"4354",ffcdbdee:"4358","0928b497":"4402",fe5bf14f:"4412",f1f43052:"4414","3c0ee67f":"4462","02e13650":"4467","98e459ed":"4478","4071a8ab":"4497","4c1a3a9e":"4543",b6190d2c:"4559",bdd52cb1:"4566",f1c3ede6:"4576","1df93b7f":"4583",ae7616e5:"4617",ab4c6d72:"4696","908d74e3":"4753",f261144a:"4774",c190410d:"4809","6875c492":"4813",cce51cf2:"4827","1680c68e":"4828",a7cac7dd:"4845",bd9520f3:"4900","29813ec3":"4907","762ff625":"4935","4e6ea248":"4936",c9adbec0:"4970",a8f0412d:"4997",c6618825:"5104","6870d8e6":"5129",c9eb5c8c:"5213","22b2e39c":"5275",a24beb83:"5302","8db8ac2d":"5315","4cee49ee":"5341","2d92dfb9":"5345","65cc9109":"5361","7c0fea77":"5391","8db1271b":"5428","4d658fd1":"5439","834fe8c4":"5440",e417fa07:"5452",ff6b4ecf:"5457",beec3e5b:"5461",a8e6f3e5:"5578",eec17282:"5712","69205f08":"5739",c4d14b59:"5743","560c153f":"5756","8eb4e46b":"5767","2836b1c5":"5801",e6b868b1:"5845","530ea6aa":"5877",b2f554cd:"5894",a2c017b9:"5916",ea11de6c:"5927","52cd65da":"5941","45d50612":"5996","1681b0a1":"6079",fc3deafd:"6103","3b73c5
9b":"6171","2240d8ed":"6204","63152ce2":"6244","1c25c29d":"6260",d0b22415:"6375","42af9969":"6385","278479fb":"6390",f8de77c0:"6487",b346e459:"6521","74ae6181":"6722",fde865fd:"6739","51bceda4":"6804","0fb5280d":"6820","232c92ba":"6837",db6bff56:"6840",f4e61408:"6895","1e7a46a0":"6924","59025a76":"7054","3ba35b71":"7065",cf3d20aa:"7097",a7bd4aaa:"7098","7aba737b":"7145",f6f9690d:"7175",f21d102b:"7199","91ac000a":"7224","6fd34e84":"7261","094a728b":"7266",be4406bf:"7358",a926bf88:"7381","44c9a67a":"7422","814f3328":"7472",f6ba3702:"7520","97c52b50":"7580","8b760ba7":"7608",e566aea2:"7630",a6aa9e1f:"7643",ad4ab9ff:"7650",c5d15731:"7802","3169fffb":"7816","680bc8cf":"7829",ae69f024:"7838",a5d7f2f4:"7900","0fc5ff8e":"8000",b422d039:"8005","4a2980b2":"8013",a9f31b53:"8033","026ddec0":"8059","01a85c17":"8209","638be742":"8246","396effda":"8305",d41fa627:"8382","8ee64c0c":"8387",c5903cab:"8411",a06ffc17:"8457",b6692631:"8468","2e0973e2":"8530",b44e231f:"8540","54a1f05d":"8543","935f2afb":"8581","24926dff":"8630",e494d26d:"8655",f099533d:"8674",ae41b95b:"8718","047b1aeb":"8755","82eef687":"8772","92999a1c":"8790","76bc59ef":"8806",bc196327:"8820","892f03bb":"8821","4a4ddbec":"8824","90c66bca":"8827","661d2f35":"8924","491dd2b3":"8958","9b60c9cc":"9030",a94703ab:"9048","84c5b4db":"9059",c98d3999:"9089","34d4b30d":"9111","90d4f0a7":"9166",a7023ddc:"9267","0228dac2":"9317",e32cc564:"9324","36cb36bb":"9328",d7c1c49e:"9347","09cb7d6a":"9363","13e421cf":"9384",b1f9f584:"9424","65fdface":"9453","0b0e8aeb":"9464",b6982b7f:"9607","0ab8b207":"9611","38e3802c":"9615","3ed9a774":"9623","5e95c892":"9647","0a509f49":"9729","20f81dd5":"9741","3621c1a8":"9780",bee11635:"9797","5ef0175c":"9835","95638c7c":"9837","07c12bb3":"9873",a8fa71a4:"9888","0b4df4b7":"9890","97b22f94":"9984"}[e]||e,r.p+r.u(e)},(()=>{var e={5354:0,1869:0};r.f.j=(a,c)=>{var b=r.o(e,a)?e[a]:void 0;if(0!==b)if(b)c.push(b[2]);else if(/^(1869|5354)$/.test(a))e[a]=0;else{var f=new 
Promise(((c,f)=>b=e[a]=[c,f]));c.push(b[2]=f);var d=r.p+r.u(a),t=new Error;r.l(d,(c=>{if(r.o(e,a)&&(0!==(b=e[a])&&(e[a]=void 0),b)){var f=c&&("load"===c.type?"missing":c.type),d=c&&c.target&&c.target.src;t.message="Loading chunk "+a+" failed.\n("+f+": "+d+")",t.name="ChunkLoadError",t.type=f,t.request=d,b[1](t)}}),"chunk-"+a,a)}},r.O.j=a=>0===e[a];var a=(a,c)=>{var b,f,d=c[0],t=c[1],o=c[2],n=0;if(d.some((a=>0!==e[a]))){for(b in t)r.o(t,b)&&(r.m[b]=t[b]);if(o)var i=o(r)}for(a&&a(c);n - + diff --git a/blog/2021/10/10/welcome.html b/blog/2021/10/10/welcome.html index 48258d00b..3420b728e 100644 --- a/blog/2021/10/10/welcome.html +++ b/blog/2021/10/10/welcome.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2021/10/27/verification-data-encoding.html b/blog/2021/10/27/verification-data-encoding.html index 19f3fc1dc..9d6a2d237 100644 --- a/blog/2021/10/27/verification-data-encoding.html +++ b/blog/2021/10/27/verification-data-encoding.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2021/11/12/new-blog-posts-and-meetup-talk.html b/blog/2021/11/12/new-blog-posts-and-meetup-talk.html index dbffb8f20..5d3db40d2 100644 --- a/blog/2021/11/12/new-blog-posts-and-meetup-talk.html +++ b/blog/2021/11/12/new-blog-posts-and-meetup-talk.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2022/02/02/make-tezos-a-formally-verified-crypto.html b/blog/2022/02/02/make-tezos-a-formally-verified-crypto.html index f5633e59a..2ad04ec0a 100644 --- a/blog/2022/02/02/make-tezos-a-formally-verified-crypto.html +++ b/blog/2022/02/02/make-tezos-a-formally-verified-crypto.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2022/06/15/status update-tezos.html b/blog/2022/06/15/status update-tezos.html index 13b14a062..040b239bf 100644 --- a/blog/2022/06/15/status update-tezos.html +++ b/blog/2022/06/15/status update-tezos.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2022/06/23/upgrade-coq-of-ocaml-4.14.html b/blog/2022/06/23/upgrade-coq-of-ocaml-4.14.html index 4c1b174f6..4bed4c8f3 100644 --- 
a/blog/2022/06/23/upgrade-coq-of-ocaml-4.14.html +++ b/blog/2022/06/23/upgrade-coq-of-ocaml-4.14.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2022/12/13/latest-blog-posts-on-tezos.html b/blog/2022/12/13/latest-blog-posts-on-tezos.html index be9cc933b..2cef08f90 100644 --- a/blog/2022/12/13/latest-blog-posts-on-tezos.html +++ b/blog/2022/12/13/latest-blog-posts-on-tezos.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2023/01/24/current-verification-efforts.html b/blog/2023/01/24/current-verification-efforts.html index dd16f6b2c..e053d337f 100644 --- a/blog/2023/01/24/current-verification-efforts.html +++ b/blog/2023/01/24/current-verification-efforts.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2023/04/26/representation-of-rust-methods-in-coq.html b/blog/2023/04/26/representation-of-rust-methods-in-coq.html index fdaf232c3..3d1e1ea3d 100644 --- a/blog/2023/04/26/representation-of-rust-methods-in-coq.html +++ b/blog/2023/04/26/representation-of-rust-methods-in-coq.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2023/05/28/monad-for-side-effects-in-rust.html b/blog/2023/05/28/monad-for-side-effects-in-rust.html index b62ca22a6..aad380f3d 100644 --- a/blog/2023/05/28/monad-for-side-effects-in-rust.html +++ b/blog/2023/05/28/monad-for-side-effects-in-rust.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2023/08/25/trait-representation-in-coq.html b/blog/2023/08/25/trait-representation-in-coq.html index 9931269da..267faaadc 100644 --- a/blog/2023/08/25/trait-representation-in-coq.html +++ b/blog/2023/08/25/trait-representation-in-coq.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2023/11/08/rust-thir-and-bundled-traits.html b/blog/2023/11/08/rust-thir-and-bundled-traits.html index 19fe9cf7f..16073373d 100644 --- a/blog/2023/11/08/rust-thir-and-bundled-traits.html +++ b/blog/2023/11/08/rust-thir-and-bundled-traits.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2023/11/26/rust-function-body.html b/blog/2023/11/26/rust-function-body.html index ab5cadbd8..bcdb69855 100644 --- 
a/blog/2023/11/26/rust-function-body.html +++ b/blog/2023/11/26/rust-function-body.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2023/12/13/rust-verify-erc-20-smart-contract.html b/blog/2023/12/13/rust-verify-erc-20-smart-contract.html index 9fb05b68a..808323675 100644 --- a/blog/2023/12/13/rust-verify-erc-20-smart-contract.html +++ b/blog/2023/12/13/rust-verify-erc-20-smart-contract.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/01/04/rust-translating-match.html b/blog/2024/01/04/rust-translating-match.html index 20165c819..6dd899488 100644 --- a/blog/2024/01/04/rust-translating-match.html +++ b/blog/2024/01/04/rust-translating-match.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/01/18/update-coq-of-rust.html b/blog/2024/01/18/update-coq-of-rust.html index 825dd1197..966bc751b 100644 --- a/blog/2024/01/18/update-coq-of-rust.html +++ b/blog/2024/01/18/update-coq-of-rust.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/02/02/formal-verification-for-aleph-zero.html b/blog/2024/02/02/formal-verification-for-aleph-zero.html index ff619ace4..7ea60e02b 100644 --- a/blog/2024/02/02/formal-verification-for-aleph-zero.html +++ b/blog/2024/02/02/formal-verification-for-aleph-zero.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/02/14/experiment-coq-of-hs.html b/blog/2024/02/14/experiment-coq-of-hs.html index 2eaa62a84..62c3583fd 100644 --- a/blog/2024/02/14/experiment-coq-of-hs.html +++ b/blog/2024/02/14/experiment-coq-of-hs.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/02/22/journey-coq-of-go.html b/blog/2024/02/22/journey-coq-of-go.html index ac465f274..4cc02a314 100644 --- a/blog/2024/02/22/journey-coq-of-go.html +++ b/blog/2024/02/22/journey-coq-of-go.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/02/29/improvements-rust-translation.html b/blog/2024/02/29/improvements-rust-translation.html index 39bd49f02..64e960283 100644 --- a/blog/2024/02/29/improvements-rust-translation.html +++ b/blog/2024/02/29/improvements-rust-translation.html @@ -11,7 +11,7 @@ - + 
diff --git a/blog/2024/03/08/improvements-rust-translation-part-2.html b/blog/2024/03/08/improvements-rust-translation-part-2.html index 71c6920ec..1c6c0bc6e 100644 --- a/blog/2024/03/08/improvements-rust-translation-part-2.html +++ b/blog/2024/03/08/improvements-rust-translation-part-2.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/03/22/improvements-rust-translation-part-3.html b/blog/2024/03/22/improvements-rust-translation-part-3.html index 857f5761d..5fbeae157 100644 --- a/blog/2024/03/22/improvements-rust-translation-part-3.html +++ b/blog/2024/03/22/improvements-rust-translation-part-3.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/04/03/monadic-notation-for-rust-translation.html b/blog/2024/04/03/monadic-notation-for-rust-translation.html index 52ed1b8f1..41d4151aa 100644 --- a/blog/2024/04/03/monadic-notation-for-rust-translation.html +++ b/blog/2024/04/03/monadic-notation-for-rust-translation.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/04/26/translation-core-alloc-crates.html b/blog/2024/04/26/translation-core-alloc-crates.html index 1c3c320a3..55af61927 100644 --- a/blog/2024/04/26/translation-core-alloc-crates.html +++ b/blog/2024/04/26/translation-core-alloc-crates.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/05/10/translation-of-python-code.html b/blog/2024/05/10/translation-of-python-code.html index b2d8f6cb5..253441167 100644 --- a/blog/2024/05/10/translation-of-python-code.html +++ b/blog/2024/05/10/translation-of-python-code.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/05/14/translation-of-python-code-simulations.html b/blog/2024/05/14/translation-of-python-code-simulations.html index 885492262..412b8f811 100644 --- a/blog/2024/05/14/translation-of-python-code-simulations.html +++ b/blog/2024/05/14/translation-of-python-code-simulations.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/05/22/translation-of-python-code-simulations-from-trace.html b/blog/2024/05/22/translation-of-python-code-simulations-from-trace.html index 
e98be5f61..dd9784192 100644 --- a/blog/2024/05/22/translation-of-python-code-simulations-from-trace.html +++ b/blog/2024/05/22/translation-of-python-code-simulations-from-trace.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/06/05/software-correctness-from-first-principles.html b/blog/2024/06/05/software-correctness-from-first-principles.html index 5991658ea..056950ce3 100644 --- a/blog/2024/06/05/software-correctness-from-first-principles.html +++ b/blog/2024/06/05/software-correctness-from-first-principles.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/06/28/coq-of-solidity-1.html b/blog/2024/06/28/coq-of-solidity-1.html index f1dc0be0d..81446a3b1 100644 --- a/blog/2024/06/28/coq-of-solidity-1.html +++ b/blog/2024/06/28/coq-of-solidity-1.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/08/07/coq-of-solidity-2.html b/blog/2024/08/07/coq-of-solidity-2.html index 0f98b5d21..9f97a8fac 100644 --- a/blog/2024/08/07/coq-of-solidity-2.html +++ b/blog/2024/08/07/coq-of-solidity-2.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/08/12/coq-of-solidity-3.html b/blog/2024/08/12/coq-of-solidity-3.html index a2f7aba92..a8f517b2d 100644 --- a/blog/2024/08/12/coq-of-solidity-3.html +++ b/blog/2024/08/12/coq-of-solidity-3.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/08/13/coq-of-solidity-4.html b/blog/2024/08/13/coq-of-solidity-4.html index 5e466151a..100f521a9 100644 --- a/blog/2024/08/13/coq-of-solidity-4.html +++ b/blog/2024/08/13/coq-of-solidity-4.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/08/19/verification-move-sui-type-checker-1.html b/blog/2024/08/19/verification-move-sui-type-checker-1.html index 51661a71d..a7163c74e 100644 --- a/blog/2024/08/19/verification-move-sui-type-checker-1.html +++ b/blog/2024/08/19/verification-move-sui-type-checker-1.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/10/13/class-what-we-do.html b/blog/2024/10/13/class-what-we-do.html index 3a709a365..5e63b182c 100644 --- a/blog/2024/10/13/class-what-we-do.html +++ 
b/blog/2024/10/13/class-what-we-do.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/10/14/verification-move-sui-type-checker-2.html b/blog/2024/10/14/verification-move-sui-type-checker-2.html index bf0066845..131bf97d3 100644 --- a/blog/2024/10/14/verification-move-sui-type-checker-2.html +++ b/blog/2024/10/14/verification-move-sui-type-checker-2.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/10/15/verification-move-sui-type-checker-3.html b/blog/2024/10/15/verification-move-sui-type-checker-3.html index ee5c94f2f..bee242075 100644 --- a/blog/2024/10/15/verification-move-sui-type-checker-3.html +++ b/blog/2024/10/15/verification-move-sui-type-checker-3.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/10/16/coq-of-solidity-enhanced-version-1.html b/blog/2024/10/16/coq-of-solidity-enhanced-version-1.html index c4f9b8afe..3e4640814 100644 --- a/blog/2024/10/16/coq-of-solidity-enhanced-version-1.html +++ b/blog/2024/10/16/coq-of-solidity-enhanced-version-1.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/10/21/verification-smooth-library-1.html b/blog/2024/10/21/verification-smooth-library-1.html index b8eebf93c..3078146af 100644 --- a/blog/2024/10/21/verification-smooth-library-1.html +++ b/blog/2024/10/21/verification-smooth-library-1.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/10/22/what-we-bring-to-you.html b/blog/2024/10/22/what-we-bring-to-you.html index 11fcfd8bd..daadc11c2 100644 --- a/blog/2024/10/22/what-we-bring-to-you.html +++ b/blog/2024/10/22/what-we-bring-to-you.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/10/28/verification-smooth-library-2.html b/blog/2024/10/28/verification-smooth-library-2.html index cd4f64aa3..f1c4b19e2 100644 --- a/blog/2024/10/28/verification-smooth-library-2.html +++ b/blog/2024/10/28/verification-smooth-library-2.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/11/01/tool-for-noir-1.html b/blog/2024/11/01/tool-for-noir-1.html index c96b8a464..011ea75a1 100644 --- a/blog/2024/11/01/tool-for-noir-1.html +++ 
b/blog/2024/11/01/tool-for-noir-1.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/11/14/sui-move-checker-abstract-stack.html b/blog/2024/11/14/sui-move-checker-abstract-stack.html index 5463144e8..134a6e4d1 100644 --- a/blog/2024/11/14/sui-move-checker-abstract-stack.html +++ b/blog/2024/11/14/sui-move-checker-abstract-stack.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/11/15/tool-for-noir-2.html b/blog/2024/11/15/tool-for-noir-2.html index 8da2d25dc..03fafdedc 100644 --- a/blog/2024/11/15/tool-for-noir-2.html +++ b/blog/2024/11/15/tool-for-noir-2.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/12/20/translation-of-circom-to-coq.html b/blog/2024/12/20/translation-of-circom-to-coq.html index 823cc39ff..2db648236 100644 --- a/blog/2024/12/20/translation-of-circom-to-coq.html +++ b/blog/2024/12/20/translation-of-circom-to-coq.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/12/20/what-is-formal-verification-of-smart-contracts.html b/blog/2024/12/20/what-is-formal-verification-of-smart-contracts.html index 7cabda953..bc25814ad 100644 --- a/blog/2024/12/20/what-is-formal-verification-of-smart-contracts.html +++ b/blog/2024/12/20/what-is-formal-verification-of-smart-contracts.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2024/12/26/mutually-recursive-functions-with-notation.html b/blog/2024/12/26/mutually-recursive-functions-with-notation.html index f4fe4c60f..fc5b76fb2 100644 --- a/blog/2024/12/26/mutually-recursive-functions-with-notation.html +++ b/blog/2024/12/26/mutually-recursive-functions-with-notation.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2025/01/06/annotating-what-we-are-doing.html b/blog/2025/01/06/annotating-what-we-are-doing.html index 8bedafd77..7517ffc69 100644 --- a/blog/2025/01/06/annotating-what-we-are-doing.html +++ b/blog/2025/01/06/annotating-what-we-are-doing.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2025/01/13/verification-one-instruction-sui.html b/blog/2025/01/13/verification-one-instruction-sui.html index 8ff302f60..01b2f2f32 
100644 --- a/blog/2025/01/13/verification-one-instruction-sui.html +++ b/blog/2025/01/13/verification-one-instruction-sui.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2025/01/21/designing-a-coding-assistant-for-rocq.html b/blog/2025/01/21/designing-a-coding-assistant-for-rocq.html index cfb85b19f..38bbd7841 100644 --- a/blog/2025/01/21/designing-a-coding-assistant-for-rocq.html +++ b/blog/2025/01/21/designing-a-coding-assistant-for-rocq.html @@ -11,7 +11,7 @@ - + diff --git a/blog/2025/01/30/links-for-rust-in-rocq.html b/blog/2025/01/30/links-for-rust-in-rocq.html index dbda2bfc5..9328e2425 100644 --- a/blog/2025/01/30/links-for-rust-in-rocq.html +++ b/blog/2025/01/30/links-for-rust-in-rocq.html @@ -11,7 +11,7 @@ - + @@ -61,13 +61,13 @@

🧪 Example

we say that the translated function cmp.max_by has a "link" definition, built implicitly in the proof, returning a value of type T. We can extract the definition of this function by calling the primitive:

evaluate : forall {Output : Set} `{Link Output} {e : M},
  {{ e 🔽 Output }} ->
  LowM.t (Output.t Output)

It returns a "link" computation in the LowM.t monad. The raw output is often unreadable, but we can step through it by symbolic execution. This will be useful in the next step, where we define the "simulations" and prove them equivalent.

-
+

Like the monad that coq-of-rust uses for the translation of Rust programs, the link's monad is a free monad, but with fewer primitive operations. The only primitives are those for memory handling:

Inductive t : Set -> Set :=
| StateAlloc {A : Set} `{Link A} (value : A) : t (Ref.Core.t A)
| StateRead {A : Set} `{Link A} (ref_core : Ref.Core.t A) : t A
| StateWrite {A : Set} `{Link A} (ref_core : Ref.Core.t A) (value : A) : t unit
| GetSubPointer {A Sub_A : Set} `{Link A} `{Link Sub_A}
    (ref_core : Ref.Core.t A) (runner : SubPointer.Runner.t A Sub_A) :
  t (Ref.Core.t Sub_A).

Compared to the side effects in the generated translation, we eliminate all the operations related to name handling (trait resolution, function calls, etc.). We also always use explicit types instead of the universal Value.t type and get rid of the M.impossible operation that was necessary to represent impossible branches in the absence of types.
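To make the idea of a free monad restricted to memory primitives more concrete, here is a minimal, self-contained sketch. It is not the actual definition from coq-of-rust: the module name Sketch, the Heap type and the run interpreter are hypothetical, every memory cell holds a nat, references are plain list indices instead of typed Ref.Core.t values, and each primitive is inlined as its own constructor so that the example does not depend on the Link class or on sub-pointers.

```coq
Require Import Coq.Lists.List.
Import ListNotations.

Module Sketch.
  (* Assumption: a heap is a list of [nat] cells; a reference is an index. *)
  Definition Heap : Set := list nat.

  (* Free monad with one constructor per memory primitive. Each primitive
     carries a continuation [k] describing the rest of the computation. *)
  Inductive t (A : Set) : Set :=
  | Pure (value : A)
  | Alloc (value : nat) (k : nat -> t A)
  | Read (ref : nat) (k : nat -> t A)
  | Write (ref value : nat) (k : t A).
  Arguments Pure {A}.
  Arguments Alloc {A}.
  Arguments Read {A}.
  Arguments Write {A}.

  (* Sequencing: graft [f] at every [Pure] leaf of [e]. *)
  Fixpoint bind {A B : Set} (e : t A) (f : A -> t B) : t B :=
    match e with
    | Pure v => f v
    | Alloc v k => Alloc v (fun r => bind (k r) f)
    | Read r k => Read r (fun v => bind (k v) f)
    | Write r v k => Write r v (bind k f)
    end.

  (* Replace the cell at index [ref]; out-of-bounds writes are ignored. *)
  Fixpoint write (heap : Heap) (ref value : nat) : Heap :=
    match heap, ref with
    | [], _ => []
    | _ :: rest, O => value :: rest
    | x :: rest, S ref' => x :: write rest ref' value
    end.

  (* One possible interpreter: thread a concrete heap through the program.
     Reading an out-of-bounds reference returns the default value 0. *)
  Fixpoint run {A : Set} (e : t A) (heap : Heap) : A * Heap :=
    match e with
    | Pure v => (v, heap)
    | Alloc v k => run (k (length heap)) (heap ++ [v])
    | Read r k => run (k (nth r heap 0)) heap
    | Write r v k => run k (write heap r v)
    end.

  (* Allocate a cell, increment its content, and read it back. *)
  Definition increment : t nat :=
    Alloc 41 (fun r =>
    Read r (fun v =>
    Write r (v + 1) (
    Read r (fun v' =>
    Pure v')))).

  Example increment_ok : fst (run increment []) = 42.
  Proof. reflexivity. Qed.
End Sketch.
```

Because the computation is only a tree of primitive calls, stepping through it symbolically amounts to unfolding one constructor at a time, and the same tree can be given several interpretations. Here run is just one of them.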

✒️ Conclusion

We have presented our general strategy to formally verify large Rust codebases. In the next blog posts, we will look in more detail at the definition of the proof of equivalence for the links, and at how we automate the most repetitive parts of the proofs.

-
For more

Follow us on X or LinkedIn for more, or comment on this post below! Feel free to DM us for any questions or requests!