From c448ef89c926359ed5413a25652e91afb3405efe Mon Sep 17 00:00:00 2001 From: Maxime Buyse Date: Thu, 20 Feb 2025 11:54:41 +0100 Subject: [PATCH 1/6] Add 'hax for everyone' blog post. --- docs/blog/.authors.yml | 4 ++ docs/blog/posts/hax-for-everyone.md | 68 +++++++++++++++++++++++++++++ 2 files changed, 72 insertions(+) create mode 100644 docs/blog/posts/hax-for-everyone.md diff --git a/docs/blog/.authors.yml b/docs/blog/.authors.yml index 042212ee2..8017e41fd 100644 --- a/docs/blog/.authors.yml +++ b/docs/blog/.authors.yml @@ -7,3 +7,7 @@ authors: name: Lucas Franceschino description: Creator avatar: https://cryspen.com/images/lucas_hu1387702743353212719.jpg + maxime: + name: Maxime Buyse + description: Creator + diff --git a/docs/blog/posts/hax-for-everyone.md b/docs/blog/posts/hax-for-everyone.md new file mode 100644 index 000000000..dff104db5 --- /dev/null +++ b/docs/blog/posts/hax-for-everyone.md @@ -0,0 +1,68 @@ +--- +authors: + - maxime +title: "Hax for everyone" +date: 2025-02-20 +--- + +# Trying to make hax usable in more contexts +The hax toolchain has been successfully used to formally verify our cryptographic implementations for [ML-KEM](https://cryspen.com/post/ml-kem-verification/),[Bertie](https://cryspen.com/post/hax-pv/) and more. All these projects are internal to Cryspen which means the Rust implementation is developed with formal verification (using hax) in mind, and use a limited number of Rust features. +However, hax is under constant development and the improvements we bring are targeted at making it more usable. With these improvements we want to bring hax to a new kind of projects that are external to Cryspen and don’t have restrictions on the Rust patterns they use. We want hax to be usable in this context with minimal modifications to the code (ideally no modification at all). An example of such a project is the verification of [sandwich](https://github.com/sandbox-quantum/sandwich), a high-level cryptographic library that we are working on together with [SandboxAQ](https://cryspen.com/post/hax-sandbox/). This project revealed the weaknesses of hax in this context which brought us to implement some improvements that will be presented in this blog post. +## Challenges +The projects developed by Cryspen applying hax from the beginning can limit themselves to the subset of Rust supported by hax. Applying hax to a pre-existing project means some large parts are probably not supported. The challenge is then to identify which features to prioritize for support in hax (and adding support is yet another challenge), and which features have no short-term plan for support (mainly mutable references and raw pointers). For the latter we need to abstract out the code (if it is not relevant for proofs) or rewrite it (when possible, and ideally we should avoid this). +Having external users encourages us even more to make hax an easily-usable and well-documented tool. +## Frontend improvements +The hax frontend is mostly relying on rustc and cargo to extract intermediary representations of a Rust crate. It is supposed to produce a result for any Rust crate (restrictions on the available Rust features come later in the toolchain). However the information given by rustc is sometimes partial or lacks some parts that are needed for our translations. A crucial example of this is trait resolution as we need to know the trait derivation that is used by each call of a trait method. This is a part of the hax frontend that has proven tricky and still had many bugs a few months ago. At that time, launching it on a somehow complicated crate had big chances of resulting in a crash. As part of our effort to improve the usability of hax, many of these bugs have now been fixed which is a great step forward, especially as even for some projects that look simple, we need to extract part of the dependencies which are usually more problematic. +According to our tests on the top 500 crates (by number of downloads on crates.io), hax frontend succeeds without crashing or timing out on more than 99%. However we are still looking for a better way to measure the coverage of the Rust features, and identifying the situations where we can still improve. +## Recursive Bundles +Rust modules are more of a namespacing system than an actual module system as used in functional languages (like our backends). In particular our backends require the module dependency graph to be acyclic while Rust has no such restriction. It is quite common in Rust to \make use of this and create cyclic dependencies between modules which means it is necessary for us to have a solution for this problem. +Here is an example (you can open it in the hax playground to check the code hax generates out of it): +```rust + +pub struct Error(); + +mod private { + pub(crate) fn f() -> Result<(), super::Error> { + Ok(()) + } +} + +pub fn user_f() -> Result<(), Error> { + private::f() +} +``` +[Open this code snippet in the playground](https://hax-playground.cryspen.com/#fstar/b7fe08cccd/gist=fcb9cb9854c69ee6e2788648a380ff79) +In this example there is a dependency between the top level module and the `private` module. Our solution to break these cycles is simply to put the content of the cyclic modules in a single module (that we call bundle), and then re-exposing the items in their original modules. +This solution is not perfect because it changes the architecture of the generated code compared to the original code, and it could be improved by minimizing the content of the bundles (choosing a set of definitions to break the cycle instead of the full content of the modules). But so far it has proven very useful as it removes a big limitation on the Rust we support. +## Opaque items +Large projects usually contain code that we don’t support yet but we still want to reason about the rest of the project and have an abstract model (axiomatization) for the parts that we don’t support. We need to control which parts we want to fully extract and which parts we extract only as axioms. CLI include flags have been the solution for this but they only allow to choose at the model level, which is inconvenient for large projects. To make this more practical we added another way to specify inside the source with the attribute `hax_lib::opaque` makes an item axiomatized. There is still the problem of complicated `-i` flags which will be solved in the future by having the corresponding information in configuration files. +## Control flow rewriting without monads including inside loops +Translating imperative code to functional backends for verification implies some handling of side effects and transformation of control flow. A classic solution for this is to have a monadic encoding state which results in generated code that can be hard to read (and to reason about). This is the solution that was implemented (with some bugs) in hax but we decided to replace it with a solution without monads. The code we produce is simpler to read, but the main limitation is that there is code duplication which in some cases can lead to an extracted code that is exponentially bigger than the source. + +Here is a simple example of this: +```rust +fn f() -> i32{ + if true { + if true { + return 1 + } + } + 3 +} +``` +[Open this code snippet in the playground](https://hax-playground.cryspen.com/#fstar/b7fe08cccd/gist=078ca6da8dad17541533bb5a0724784b) + +The F* code extracted from this example is the following: +``` +let f (_: Prims.unit) : i32 = + if true + then if true then mk_i32 1 else mk_i32 3 + else mk_i32 3 +``` +Here the semantics is preserved, but adding the `else` branches results in a duplication of the return value `3`. +Our idea to improve in the future is to revive the monadic version, but use it only if the duplication is too big. +Support for control flow (`return`, `break` and `continue`) in loops has been added as well. In hax, loops are translated as a functional fold in which the accumulator keeps track of the modification of the environment done by the effectful operations in the source. This extension relies on a monadic encoding of the loop result, that is passed in the accumulator to deal with the specific cases of `return`, `break` and `continue`. +## Items sorting +A quality of life feature that we have been lacking for a long time is trying to respect, as much as possible, the same order of items in the generated code compared to the source. We need to modify the order because (as for modules), Rust allows items to be defined in any order, while our backends need items to be defined after the other items they depend on (except for mutual recursion). We rely on a graph topological sort to ensure this property, and now use a modified version of the stable topological sort provided by ocamlgraph, which produces an order that respects the dependencies, but in the absence of constraints tries respects the order of the source. +## Conclusion +Bringing hax to a new kind of project revealed the gap needed for it to be usable, but thanks to our active work, we have made great progress towards this goal. Even though there is still much more to do, this has allowed us to get results in these new applications of hax (stay tuned for more details about that!). From 115406a66ef5e9ff79d1e0908997afca046937e2 Mon Sep 17 00:00:00 2001 From: maximebuyse <45398004+maximebuyse@users.noreply.github.com> Date: Thu, 20 Feb 2025 14:34:46 +0100 Subject: [PATCH 2/6] Add avatar Co-authored-by: Franziskus Kiefer --- docs/blog/.authors.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/blog/.authors.yml b/docs/blog/.authors.yml index 8017e41fd..68eb30b03 100644 --- a/docs/blog/.authors.yml +++ b/docs/blog/.authors.yml @@ -9,5 +9,5 @@ authors: avatar: https://cryspen.com/images/lucas_hu1387702743353212719.jpg maxime: name: Maxime Buyse - description: Creator + avatar: https://cryspen.com/images/maxime_hu_8da610acc29fd826.png From 4019fab66c6b03c3f7653707eaf604a02de5a0f9 Mon Sep 17 00:00:00 2001 From: Maxime Buyse Date: Thu, 20 Feb 2025 14:44:22 +0100 Subject: [PATCH 3/6] Various improvements to the blog post. --- docs/blog/posts/hax-for-everyone.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/docs/blog/posts/hax-for-everyone.md b/docs/blog/posts/hax-for-everyone.md index dff104db5..3cb9913c9 100644 --- a/docs/blog/posts/hax-for-everyone.md +++ b/docs/blog/posts/hax-for-everyone.md @@ -6,10 +6,10 @@ date: 2025-02-20 --- # Trying to make hax usable in more contexts -The hax toolchain has been successfully used to formally verify our cryptographic implementations for [ML-KEM](https://cryspen.com/post/ml-kem-verification/),[Bertie](https://cryspen.com/post/hax-pv/) and more. All these projects are internal to Cryspen which means the Rust implementation is developed with formal verification (using hax) in mind, and use a limited number of Rust features. -However, hax is under constant development and the improvements we bring are targeted at making it more usable. With these improvements we want to bring hax to a new kind of projects that are external to Cryspen and don’t have restrictions on the Rust patterns they use. We want hax to be usable in this context with minimal modifications to the code (ideally no modification at all). An example of such a project is the verification of [sandwich](https://github.com/sandbox-quantum/sandwich), a high-level cryptographic library that we are working on together with [SandboxAQ](https://cryspen.com/post/hax-sandbox/). This project revealed the weaknesses of hax in this context which brought us to implement some improvements that will be presented in this blog post. +The hax toolchain has been successfully used to formally verify our cryptographic implementations for [ML-KEM](https://cryspen.com/post/ml-kem-verification/),[Bertie](https://cryspen.com/post/hax-pv/) and more. All these projects are developed with formal verification (using hax) in mind, and use a limited number of Rust features. +However, hax is under constant development and the improvements we bring are targeted at making it more usable. With these improvements we want to bring hax to a new kind of projects that don’t have restrictions on the Rust patterns they use. We want hax to be usable in this context with minimal modifications to the code (ideally no modification at all). An example of such a project is the verification of [sandwich](https://github.com/sandbox-quantum/sandwich), a high-level cryptographic library that we are working on together with [SandboxAQ](https://cryspen.com/post/hax-sandbox/). This project revealed the weaknesses of hax in this context which brought us to implement some improvements that will be presented in this blog post. ## Challenges -The projects developed by Cryspen applying hax from the beginning can limit themselves to the subset of Rust supported by hax. Applying hax to a pre-existing project means some large parts are probably not supported. The challenge is then to identify which features to prioritize for support in hax (and adding support is yet another challenge), and which features have no short-term plan for support (mainly mutable references and raw pointers). For the latter we need to abstract out the code (if it is not relevant for proofs) or rewrite it (when possible, and ideally we should avoid this). +The projects that use hax from the beginning can limit themselves to the subset of Rust supported by hax. Applying hax to a pre-existing project means some large parts are probably not supported. The challenge is then to identify which features to prioritize for support in hax (and adding support is yet another challenge), and which features have no short-term plan for support (mainly mutable references and raw pointers). For the latter we need to abstract out the code (if it is not relevant for proofs) or rewrite it (when possible, and ideally we should avoid this). Having external users encourages us even more to make hax an easily-usable and well-documented tool. ## Frontend improvements The hax frontend is mostly relying on rustc and cargo to extract intermediary representations of a Rust crate. It is supposed to produce a result for any Rust crate (restrictions on the available Rust features come later in the toolchain). However the information given by rustc is sometimes partial or lacks some parts that are needed for our translations. A crucial example of this is trait resolution as we need to know the trait derivation that is used by each call of a trait method. This is a part of the hax frontend that has proven tricky and still had many bugs a few months ago. At that time, launching it on a somehow complicated crate had big chances of resulting in a crash. As part of our effort to improve the usability of hax, many of these bugs have now been fixed which is a great step forward, especially as even for some projects that look simple, we need to extract part of the dependencies which are usually more problematic. @@ -31,7 +31,8 @@ pub fn user_f() -> Result<(), Error> { private::f() } ``` -[Open this code snippet in the playground](https://hax-playground.cryspen.com/#fstar/b7fe08cccd/gist=fcb9cb9854c69ee6e2788648a380ff79) +[Open this code snippet in the hax playground](https://hax-playground.cryspen.com/#fstar/b7fe08cccd/gist=fcb9cb9854c69ee6e2788648a380ff79) + In this example there is a dependency between the top level module and the `private` module. Our solution to break these cycles is simply to put the content of the cyclic modules in a single module (that we call bundle), and then re-exposing the items in their original modules. This solution is not perfect because it changes the architecture of the generated code compared to the original code, and it could be improved by minimizing the content of the bundles (choosing a set of definitions to break the cycle instead of the full content of the modules). But so far it has proven very useful as it removes a big limitation on the Rust we support. ## Opaque items @@ -50,10 +51,10 @@ fn f() -> i32{ 3 } ``` -[Open this code snippet in the playground](https://hax-playground.cryspen.com/#fstar/b7fe08cccd/gist=078ca6da8dad17541533bb5a0724784b) +[Open this code snippet in the hax playground](https://hax-playground.cryspen.com/#fstar/b7fe08cccd/gist=078ca6da8dad17541533bb5a0724784b) The F* code extracted from this example is the following: -``` +```ocaml let f (_: Prims.unit) : i32 = if true then if true then mk_i32 1 else mk_i32 3 From d8c0ffc2cd8b7b53b09d7c2b6117229f7d1b9e8b Mon Sep 17 00:00:00 2001 From: Maxime Buyse Date: Mon, 24 Feb 2025 17:41:39 +0100 Subject: [PATCH 4/6] Blogpost improvements following suggestions. --- docs/blog/posts/hax-for-everyone.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/docs/blog/posts/hax-for-everyone.md b/docs/blog/posts/hax-for-everyone.md index 3cb9913c9..3f597f2dc 100644 --- a/docs/blog/posts/hax-for-everyone.md +++ b/docs/blog/posts/hax-for-everyone.md @@ -6,16 +6,17 @@ date: 2025-02-20 --- # Trying to make hax usable in more contexts -The hax toolchain has been successfully used to formally verify our cryptographic implementations for [ML-KEM](https://cryspen.com/post/ml-kem-verification/),[Bertie](https://cryspen.com/post/hax-pv/) and more. All these projects are developed with formal verification (using hax) in mind, and use a limited number of Rust features. -However, hax is under constant development and the improvements we bring are targeted at making it more usable. With these improvements we want to bring hax to a new kind of projects that don’t have restrictions on the Rust patterns they use. We want hax to be usable in this context with minimal modifications to the code (ideally no modification at all). An example of such a project is the verification of [sandwich](https://github.com/sandbox-quantum/sandwich), a high-level cryptographic library that we are working on together with [SandboxAQ](https://cryspen.com/post/hax-sandbox/). This project revealed the weaknesses of hax in this context which brought us to implement some improvements that will be presented in this blog post. +The hax toolchain has been successfully used to formally verify our cryptographic implementations for [ML-KEM](https://cryspen.com/post/ml-kem-verification/),[Bertie](https://cryspen.com/post/hax-pv/) and more. All these projects are developed with formal verification (using hax) in mind, and use a limited subset of Rust features. +However, hax is under constant development and the improvements we bring are targeted at making it more usable. With these improvements we want to bring hax to a new kind of projects that don’t have restrictions on the Rust patterns they use. We want hax to be usable in this context with minimal modifications to the code (ideally no modification at all). An example of such a project is the verification of [sandwich](https://github.com/sandbox-quantum/sandwich), a high-level cryptographic library built by [SandboxAQ](https://cryspen.com/post/hax-sandbox/). This project revealed the weaknesses of hax in this context which brought us to implement some improvements that will be presented in this blog post. ## Challenges -The projects that use hax from the beginning can limit themselves to the subset of Rust supported by hax. Applying hax to a pre-existing project means some large parts are probably not supported. The challenge is then to identify which features to prioritize for support in hax (and adding support is yet another challenge), and which features have no short-term plan for support (mainly mutable references and raw pointers). For the latter we need to abstract out the code (if it is not relevant for proofs) or rewrite it (when possible, and ideally we should avoid this). +The projects that use hax from the beginning can limit themselves to the subset of Rust supported by hax. Applying hax to a pre-existing project means that it may use various Rust features that are probably not supported yet in hax. The challenge is then to identify which features to prioritize for support in hax (and adding support is yet another challenge), and which features have no short-term plan for support. For the latter we need to abstract out the code (if it is not relevant for proofs) or rewrite it (when possible; ideally we try to avoid this). Having external users encourages us even more to make hax an easily-usable and well-documented tool. ## Frontend improvements -The hax frontend is mostly relying on rustc and cargo to extract intermediary representations of a Rust crate. It is supposed to produce a result for any Rust crate (restrictions on the available Rust features come later in the toolchain). However the information given by rustc is sometimes partial or lacks some parts that are needed for our translations. A crucial example of this is trait resolution as we need to know the trait derivation that is used by each call of a trait method. This is a part of the hax frontend that has proven tricky and still had many bugs a few months ago. At that time, launching it on a somehow complicated crate had big chances of resulting in a crash. As part of our effort to improve the usability of hax, many of these bugs have now been fixed which is a great step forward, especially as even for some projects that look simple, we need to extract part of the dependencies which are usually more problematic. +The hax frontend is mostly relying on rustc and cargo to extract intermediary representations of a Rust crate. It is supposed to produce a result for any Rust crate (restrictions on the available Rust features come later in the toolchain). However the information given by rustc is sometimes partial or lacks some parts that are needed for our translations. A crucial example of this is trait resolution as we need to know the trait derivation that is used by each call of a trait method. This is a part of the hax frontend that has proven tricky and still had many bugs a few months ago. At that time, launching it on a somehow complicated crate had big chances of resulting in a crash. As part of our effort to improve the usability of hax, many of these bugs have now been fixed (in collaboration with our colleagues at Inria). This is a big step forward, since even for a project that looks small and simple, we need to handle all of its dependencies which are usually more problematic. + According to our tests on the top 500 crates (by number of downloads on crates.io), hax frontend succeeds without crashing or timing out on more than 99%. However we are still looking for a better way to measure the coverage of the Rust features, and identifying the situations where we can still improve. ## Recursive Bundles -Rust modules are more of a namespacing system than an actual module system as used in functional languages (like our backends). In particular our backends require the module dependency graph to be acyclic while Rust has no such restriction. It is quite common in Rust to \make use of this and create cyclic dependencies between modules which means it is necessary for us to have a solution for this problem. +Rust code is organized in modules, where modules can be seen as a namespacing system. When translating modules to our backends (F*, Coq, ProVerif) we need to generate the corresponding module-like abstraction in the backend, which typically works quite differently. In particular our backends require the module dependency graph to be acyclic while Rust has no such restriction. It is quite common in Rust to make use of this and create cyclic dependencies between modules which means it is necessary for us to have a solution for this problem. Here is an example (you can open it in the hax playground to check the code hax generates out of it): ```rust @@ -36,7 +37,7 @@ pub fn user_f() -> Result<(), Error> { In this example there is a dependency between the top level module and the `private` module. Our solution to break these cycles is simply to put the content of the cyclic modules in a single module (that we call bundle), and then re-exposing the items in their original modules. This solution is not perfect because it changes the architecture of the generated code compared to the original code, and it could be improved by minimizing the content of the bundles (choosing a set of definitions to break the cycle instead of the full content of the modules). But so far it has proven very useful as it removes a big limitation on the Rust we support. ## Opaque items -Large projects usually contain code that we don’t support yet but we still want to reason about the rest of the project and have an abstract model (axiomatization) for the parts that we don’t support. We need to control which parts we want to fully extract and which parts we extract only as axioms. CLI include flags have been the solution for this but they only allow to choose at the model level, which is inconvenient for large projects. To make this more practical we added another way to specify inside the source with the attribute `hax_lib::opaque` makes an item axiomatized. There is still the problem of complicated `-i` flags which will be solved in the future by having the corresponding information in configuration files. +Large projects usually contain code that we don’t support yet but we still want to reason about the rest of the project and have an abstract model (axiomatization) for the parts that we don’t support. We need to control which parts we want to fully extract and which parts we extract only as opaque items. The command-line options offered by the hax toolchain provide a solution to this, but they only allow to choose at the model level, which is inconvenient for large projects. To make this more practical we added another way to specify inside the source with the attribute `hax_lib::opaque` makes an item axiomatized. There is still the problem of complicated `-i` flags which will be solved in the future by having the corresponding information in configuration files. ## Control flow rewriting without monads including inside loops Translating imperative code to functional backends for verification implies some handling of side effects and transformation of control flow. A classic solution for this is to have a monadic encoding state which results in generated code that can be hard to read (and to reason about). This is the solution that was implemented (with some bugs) in hax but we decided to replace it with a solution without monads. The code we produce is simpler to read, but the main limitation is that there is code duplication which in some cases can lead to an extracted code that is exponentially bigger than the source. From 77c5191db1602ab7e2599a23a5e862915a4be731 Mon Sep 17 00:00:00 2001 From: Franziskus Kiefer Date: Mon, 24 Feb 2025 20:11:32 +0100 Subject: [PATCH 5/6] Update hax-for-everyone.md --- docs/blog/posts/hax-for-everyone.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/blog/posts/hax-for-everyone.md b/docs/blog/posts/hax-for-everyone.md index 3f597f2dc..f0eef26dc 100644 --- a/docs/blog/posts/hax-for-everyone.md +++ b/docs/blog/posts/hax-for-everyone.md @@ -2,7 +2,7 @@ authors: - maxime title: "Hax for everyone" -date: 2025-02-20 +date: 2025-02-25 --- # Trying to make hax usable in more contexts From cd68297179f71868e0befe46088925ff1b123c9b Mon Sep 17 00:00:00 2001 From: Maxime Buyse Date: Tue, 25 Feb 2025 18:00:12 +0100 Subject: [PATCH 6/6] Install rust in this action to get latest version instead of what is installed by github. --- .github/workflows/licenses.yml | 3 +++ 1 file changed, 3 insertions(+) diff --git a/.github/workflows/licenses.yml b/.github/workflows/licenses.yml index e488007eb..f5e4f2efd 100644 --- a/.github/workflows/licenses.yml +++ b/.github/workflows/licenses.yml @@ -18,6 +18,9 @@ jobs: uses: ocaml/setup-ocaml@v3 with: ocaml-compiler: 5 + - uses: actions-rust-lang/setup-rust-toolchain@v1 + with: + toolchain: stable - name: Install cargo-deny run: cargo install cargo-deny - name: Install cargo-deny