SnarkyJS SHA/Keccak #9

Open
wants to merge 13 commits into
base: main
198 changes: 198 additions & 0 deletions 0005-snarkyjs-ecdsa-sha.md
@@ -0,0 +1,198 @@
# Exposing ECDSA and SHA3/Keccak to SnarkyJS

## Summary

Cryptographic primitives such as ECDSA and SHA3 are widely used outside of Mina. For example, Ethereum uses ECDSA over secp256k1 for signatures. In order to "communicate" with the outside world and other blockchains, SnarkyJS (and therefore Mina) needs to support these primitives as well. This RFC describes how we will leverage the custom gates implemented by the crypto team and expose them to SnarkyJS, making them accessible to smart contract developers.

## Motivation

The initial [Ethereum Crypto Primitive PRD](https://www.notion.so/minaprotocol/Ethereum-Primitives-Support-PRD-d89af720e1c94f7b90166709432e7bd5) describes the importance of establishing cryptographic compatibility with Ethereum. The [ECDSA PRD](https://www.notion.so/minaprotocol/ECDSA-ver-gadget-PoC-PRD-9458c38adf204d6b922deb8eed1ac193) and the
[Keccak PRD](https://www.notion.so/minaprotocol/Keccak-gadget-PoC-PRD-59b024bce9d5441c8a00a0fcc9b356ae) then go into detail and describe two of the most important building blocks to achieve Ethereum compatibility in a cryptographic sense.

This RFC describes the steps needed to expose both ECDSA and SHA3/Keccak to SnarkyJS, enabling SnarkyJS developers to use these primitives in their applications and smart contracts.

It is important to mention that this RFC only considers the work required to _expose_ already existing cryptographic primitives (gates and gadgets), which have previously been implemented by the crypto team, from the OCaml bindings layer to SnarkyJS. No additional cryptographic primitives need to be implemented. Changes will only impact [snarkyjs-bindings](https://github.com/o1-labs/snarkyjs-bindings) and [SnarkyJS](https://github.com/o1-labs/snarkyjs) itself.

Once completed, SnarkyJS users will be able to leverage ECDSA and SHA3/Keccak to build applications that integrate with Ethereum or that otherwise require these cryptographic primitives.

## Detailed design

### SHA3/Keccak

The Keccak and SHA3 gadgets have been implemented by the crypto team ([minaprotocol/mina PR#13196](https://github.com/MinaProtocol/mina/pull/13196)), enabling us to leverage the already existing bindings layer to and from OCaml. This design allows us to integrate the new gadgets into SnarkyJS by simply exposing them in `ocaml/lib/snarky_js_bindings_lib.ml`.

For Keccak/SHA3, the implementation exposes two ready-to-go functions.

```ocaml
(** Gadget for NIST SHA-3 function for output lengths 224/256/384/512 *)
val nist_sha3 :
     (module Snarky_backendless.Snark_intf.Run with type field = 'f)
  -> int
  -> 'f Snarky_backendless.Cvar.t list
  -> 'f Snarky_backendless.Cvar.t array

(** Gadget for Keccak hash function for the parameters used in Ethereum *)
val ethereum :
     (module Snarky_backendless.Snark_intf.Run with type field = 'f)
  -> 'f Snarky_backendless.Cvar.t list
  -> 'f Snarky_backendless.Cvar.t array

```

These two functions will be imported into the bindings layer and exposed via a new sub-module as part of the already existing `Snarky` module.

```ocaml

module Sha = struct
  let create message nist_version length =
    let message_array = Array.to_list message in
    if Js.to_bool nist_version then
      Kimchi_gadgets.Keccak.nist_sha3 (module Impl) length message_array
    else Kimchi_gadgets.Keccak.ethereum (module Impl) message_array
end

```

In order to reduce the bindings surface, Keccak and SHA3 will be exposed via a `create` function, which behaves like a factory pattern.
By calling this function inside SnarkyJS, we can define and expose all possible variants of SHA3 (224/256/384/512) and Keccak without needing an individual function in the bindings layer for each variant.

In order to differentiate Poseidon, which works over native Field elements, from SHA3 and Keccak, which work over byte-sized Field elements, it will be beneficial to introduce a new type `UInt8` (a Field element that is exactly a byte) to draw a clean line between the two kinds of hash functions.

In SnarkyJS, these new primitives will be declared as follows:

```ts
// Factory: a single bindings call (`Snarky.sha.create`) serves every variant.
function buildSha(length: 224 | 256 | 384 | 512, nist: boolean) {
  return {
    hash(message: UInt8[]) {
      let payload = [0, ...message.map((f) => f.value)];
      return Snarky.sha.create(payload, nist, length).map(Field);
    },
  };
}

const Sha3_224 = buildSha(224, true);
const Sha3_256 = buildSha(256, true);
const Sha3_384 = buildSha(384, true);
const Sha3_512 = buildSha(512, true);
const Keccak = buildSha(256, false);
```

Member

Can you update this with the most object too?

Member Author

What exactly do you mean by "the most object"?

### Implementation (previously Alternative Approach)

After discussing different API approaches, the consensus was to pursue this approach over the others. The implementation will combine all hash functions, including Poseidon, under a shared namespace `Hash`. Developers will then be able to use these functions by calling `Hash.[hash_name].hash(xs)`. This is not equivalent to the existing `Poseidon` API, which could be deprecated in favour of the new `Hash` namespace.

```ts
const Hash = {
  Poseidon: {
    hash: (xs: UInt8[]) => "digest",
  },

  // ..

  SHA3_224: {
    hash: (xs: UInt8[]) => "digest",
  },

  Keccak: {
    hash: (xs: UInt8[]) => "digest",
  },
};
```

Developers would then be able to simply import the `Hash` namespace into their project and use all available hash functions.

```ts
import { Hash } from "snarkyjs";

Hash.Poseidon.hash(xs);
Hash.Keccak.hash(xs);
// ..

// deprecated in favor of Hash.Poseidon.hash
Poseidon.hash(xs);
```
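
One way the deprecation could look in practice is to keep the old entry point as a thin forwarder, so that both call paths return the same result. The following is only a sketch: the string-returning `hash` is a stand-in for the real bindings call, and `number[]` stands in for the actual input type.

```typescript
// Stub namespace; the real implementation would call into the bindings layer.
const Hash = {
  Poseidon: {
    // Placeholder digest so the sketch is runnable without SnarkyJS.
    hash: (xs: number[]): string => "poseidon:" + xs.join(","),
  },
};

/** @deprecated Use `Hash.Poseidon.hash` instead. */
const Poseidon = {
  // Forward to the new namespace so behaviour cannot drift between the two.
  hash: (xs: number[]): string => Hash.Poseidon.hash(xs),
};
```

The JSDoc `@deprecated` tag is surfaced by TypeScript-aware editors, which gives existing users a visible migration hint without breaking their code.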

### Alternative Approach

Another alternative to the factory pattern above could be to supply the developer with a single function that takes a range of parameters, so that the developer can choose their flavour of SHA3/Keccak on their own.

_Note_: Currently, if `nist` is set to `false`, we only support an output length of 256.

```ts
function SHA3(length: 224 | 256 | 384 | 512 | { nist: 256 }): {
  hash(xs: UInt8[]);
};
```

Developers can then import these functions into their project via

```ts
import { Sha3_224, Sha3_256, Sha3_384, Sha3_512, Keccak } from "snarkyjs";

Sha3_224.hash(xs);
// ..
```
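
The restriction noted above, that the non-NIST (Keccak) variant currently only supports an output length of 256, could also be encoded in the parameter type so that invalid combinations are rejected early. This is a sketch only; `ShaConfig`, `parseShaConfig` and `variantName` are hypothetical names and not part of the proposed API.

```typescript
const NIST_LENGTHS = [224, 256, 384, 512] as const;
type NistLength = (typeof NIST_LENGTHS)[number];

// Discriminated union: Keccak (nist: false) is only valid with length 256.
type ShaConfig =
  | { nist: true; length: NistLength }
  | { nist: false; length: 256 };

function isNistLength(n: number): n is NistLength {
  return (NIST_LENGTHS as readonly number[]).indexOf(n) !== -1;
}

// Runtime guard for untyped callers (e.g. plain JavaScript).
function parseShaConfig(nist: boolean, length: number): ShaConfig {
  if (nist) {
    if (!isNistLength(length)) {
      throw new Error("unsupported SHA3 output length: " + length);
    }
    return { nist: true, length };
  }
  if (length !== 256) {
    throw new Error("Keccak is only supported with output length 256");
  }
  return { nist: false, length: 256 };
}

function variantName(config: ShaConfig): string {
  return (config.nist ? "SHA3-" : "Keccak-") + config.length;
}
```

With this shape, a call such as `parseShaConfig(false, 512)` fails immediately instead of reaching the bindings layer.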

It is important to mention that we have a lot of freedom in the implementation details of the API, as the gates and gadgets are already implemented in OCaml and just need to be exposed and wrapped by a developer-friendly API.

One goal of the API is to design an easy-to-use interface for developers while maintaining consistency with already existing primitives (Poseidon in this case, Schnorr in the case of ECDSA).
The usage of both Keccak/SHA3 and Poseidon is the same: the user passes an array of Field elements into the hash function and the output is returned. One important detail is that Keccak/SHA3 only accepts an array of Field elements that are each no larger than 1 byte. We will also provide additional helper functions that allow conversion between Field element arrays and hexadecimal encoded strings.

```ts
class UInt8 {
  // constrains an array of Field elements to be at most 1 byte per Field
  static fromFields(xs: Field[]): UInt8[];
  // conversion from a hex-string to an array of UInt8 elements
  static fromHex(hex: string): UInt8[];
  // conversion of UInt8 elements to a hex-string
  static toHex(xs: UInt8[]): string;
}
```
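
To illustrate what the hex conversion helpers have to check, here is a minimal out-of-circuit sketch that works on plain numbers instead of `UInt8`. The function names `hexToBytes` and `bytesToHex` are hypothetical; the accepted input format (optional `0x` prefix, even number of digits) is an assumption.

```typescript
/** Parses a hex string (optionally "0x"-prefixed) into byte values 0..255. */
function hexToBytes(hex: string): number[] {
  const digits = hex.slice(0, 2) === "0x" ? hex.slice(2) : hex;
  // An odd number of hex digits cannot describe whole bytes.
  if (digits.length % 2 !== 0) {
    throw new Error("hex string must have an even number of digits");
  }
  if (!/^[0-9a-fA-F]*$/.test(digits)) {
    throw new Error("hex string contains non-hex characters");
  }
  const bytes: number[] = [];
  for (let i = 0; i < digits.length; i += 2) {
    bytes.push(parseInt(digits.slice(i, i + 2), 16));
  }
  return bytes;
}

/** Encodes byte values as a lowercase hex string, two digits per byte. */
function bytesToHex(bytes: number[]): string {
  return bytes
    .map((b) => {
      if (!Number.isInteger(b) || b < 0 || b > 255) {
        throw new Error("value " + b + " does not fit in one byte");
      }
      return (b < 16 ? "0" : "") + b.toString(16);
    })
    .join("");
}
```

Rejecting malformed input at this boundary keeps values that are not whole bytes away from the hash gadget.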

Additionally, we will add a function

```ts
ForeignField.fromBytes(bytes: UInt8[]): ForeignField;
```

to the `ForeignField` implementation that constrains the output of Keccak to a foreign field, which will later be an important constraint for ECDSA.
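
At the value level (outside of a circuit, without any constraints), such a conversion amounts to interpreting the digest bytes as an integer and reducing it modulo the foreign field. The sketch below assumes big-endian byte order, which the RFC does not specify. `bytesToBigInt` and `reduceToField` are hypothetical names, and the secp256k1 base field prime is used only as an example modulus.

```typescript
// Example modulus only: the secp256k1 base field prime, 2^256 - 2^32 - 977.
const SECP256K1_BASE_PRIME =
  (BigInt(1) << BigInt(256)) - (BigInt(1) << BigInt(32)) - BigInt(977);

/** Interprets bytes as a big-endian unsigned integer (assumed byte order). */
function bytesToBigInt(bytes: number[]): bigint {
  let acc = BigInt(0);
  for (const b of bytes) {
    if (!Number.isInteger(b) || b < 0 || b > 255) {
      throw new Error("value " + b + " does not fit in one byte");
    }
    acc = acc * BigInt(256) + BigInt(b);
  }
  return acc;
}

/** Reduces the integer value of `bytes` modulo a caller-supplied modulus. */
function reduceToField(bytes: number[], modulus: bigint): bigint {
  return bytesToBigInt(bytes) % modulus;
}
```

The in-circuit version additionally has to prove this relation with constraints, which this sketch does not attempt.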

Overall, exposing new gadgets and gates follows a strict pattern that is used throughout the SnarkyJS bindings layer. As an example, the [Poseidon](https://github.com/o1-labs/snarkyjs-bindings/blob/main/ocaml/lib/snarky_js_bindings_lib.ml#L386) implementation behaves similarly. From the point of view of SnarkyJS, these gadgets are just another set of function calls to OCaml.

## ECDSA
Member

Can we remove this section if there will be separate RFC?

Member Author

Yes! It will be removed with the other PR, which is stacked on top of this PR #14


**TODO** this will be a separate PR stacked on top of this one

**Evergreen, wide-sweeping Details**

**Ephemeral details**

## Test plan and functional requirements

## SHA3/Keccak
Member

Test plan and functional requirements and SHA3/Keccak sections are of the same level. The latter should probably be a sub-section (###).


In order to test the implementation of SHA3 and Keccak in SnarkyJS, we will follow the testing approach we already apply to other gadgets and gates.
This includes testing the out-of- and in-snark variants using our testing framework, as well as adding a SHA3 and Keccak regression test. The regression tests will also include a set of predetermined digests to make sure that the algorithm doesn't unexpectedly change over time (similar to the tests implemented for the OCaml gadget). We will include a range of edge cases in the tests (e.g. empty input, zero, etc).
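
Edge-case inputs around block boundaries can be derived from the sponge padding rule. The following sketch shows byte-level padding, assuming the standard domain bytes (`0x06` for NIST SHA3, `0x01` for the original Keccak submission) and a rate of `(1600 - 2 * length) / 8` bytes. The function names are hypothetical and this is plain TypeScript, not the in-circuit gadget.

```typescript
/** Rate in bytes for a digest length in bits (capacity is 2 * length bits). */
function rateBytes(length: 224 | 256 | 384 | 512): number {
  return (1600 - 2 * length) / 8;
}

/** Pads a byte message to a multiple of `rate` using multi-rate padding. */
function pad(message: number[], rate: number, nist: boolean): number[] {
  // Padding is always 1..rate bytes, so a full block gains a whole new block.
  const padLength = rate - (message.length % rate);
  const padded = message.slice();
  for (let i = 0; i < padLength; i++) padded.push(0);
  padded[message.length] ^= nist ? 0x06 : 0x01; // domain separation byte
  padded[padded.length - 1] ^= 0x80; // final bit of the padding
  return padded;
}

/** Message lengths (in bytes) that exercise the block boundary. */
function boundaryLengths(rate: number): number[] {
  return [0, 1, rate - 1, rate, rate + 1];
}
```

For a 256-bit digest the rate is 136 bytes, so messages of 135 and 136 bytes hit the two interesting cases: a single padding byte and a padding-only extra block.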

In addition to that, we should provide a dedicated integration test that handles SHA3/Keccak hashing within a smart contract (proving enabled). This will allow us to not only provide developers with an example, but also ensure that SHA3 and Keccak proofs can be generated.
Member

  • Are we concerned about performance testing?
    • For example, hashing large amounts of data or rapidly making hash requests.
  • Are we concerned about collision testing?
  • Are we concerned about security testing?
    • Possible/known vulnerabilities, cryptographic weakness, etc.
  • Are we concerned about usability testing?
    • In a form of gathering the feedback from a small group of developers to improve the design and usability of the API before releasing it.

  • The RFC indicates that hashing with SHA3 and Keccak is more expensive than using Poseidon. Performance metrics, such as time and memory usage, should be gathered and analyzed to provide developers with a clear understanding of the subject.

Member Author

Good question - @querolita to what extent has the original crypto implementation been tested? Do you have any recommendations on what we should test, if at all?

Member

The tests performed on the implementation were:

  • Checking different output lengths and different configurations of the padding (NIST's for SHA3 and submission's for Keccak)
  • Checking different length outputs with different numbers of blocks
  • Edge cases of padding to make sure that a whole new block is created, and messages that already look padded
  • Checking different endianness of inputs and outputs
  • Check that non-byte length inputs cannot be passed
  • Can reuse constraint system for the same circuit structure
  • Cannot reuse constraint system for different output lengths
  • Cannot reuse constraint system for same output length but different padding
  • Cannot reuse constraint system for different endianness

I would create similar tests for SnarkyJS, but probably more high level since you don't have access to low level details such as constraint system creation (that I am aware of). In particular, I would make sure that inputs which are not "correctly formatted hex values" (meaning an odd number of hex digits) are not passed to the function. And of course making sure that all combinations actually produce the expected value (pairs of output lengths and padding types), for some random values. I wouldn't go much deeper than that regarding the snarkyjs level alone.

In that sense, we are not testing against collision, just comparing against official outputs of the hash function on those given inputs, nor performance testing (all we know is 1 block of encryption, which is enough for 1080bits of input data, fits in 15k kimchi rows, thus 4 full blocks can be hashed without chunking), we haven't looked for keccak's potential vulnerabilities, and we have not gathered usability info from users other than ourselves.

Member

@shimkiv does this sound good to you? Is there anything else you would change here?

Member

@shimkiv, Jul 3, 2023

> we will follow the testing approach we already apply to other gadgets and gates

It will be good to add some implementation example links here.

> will also include a set of predetermined digests

Can we please be more specific here? What kind of set? Is it standardised set? If not then how we determine the set to be good enough?
Open question

> similar to the tests implemented for the OCaml gadget

Can we please include links here?

> We will include a range of edge cases in the tests (e.g. empty input, zero, etc).

It will be good if we can be more explicit here and at least highlight the major cases we're aware of and that will be included into the testing plan.

> In that sense, we are not testing against collision, just comparing against official outputs of the hash function on those given inputs, nor performance testing (all we know is 1 block of encryption, which is enough for 1080bits of input data, fits in 15k kimchi rows, thus 4 full blocks can be hashed without chunking), we haven't looked for keccak's potential vulnerabilities, and we have not gathered usability info from users other than ourselves.

If we agree on this then it would be good if we can add listed items into the "out of the scope" testing section with some rationale.


## Drawbacks

Compared to Poseidon, hashing with SHA3 and Keccak is expensive. This should be made clear to developers to avoid inefficient circuits. It is also important to educate developers about when to use SHA3/Keccak and when to use Poseidon, and the API itself should be secure.
It should be mentioned that developers should ideally use Poseidon for everything that does not explicitly require SHA3/Keccak (e.g. a Merkle Tree in SnarkyJS, checksums of Field elements and provable structures `Struct`, etc.) and only use SHA3/Keccak if it is really required (e.g. interacting with Ethereum, verifying Ethereum signatures, etc.).
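
To give developers a rough feeling for the cost, the figures from the review discussion (about 15k Kimchi rows per block, with one block covering up to 1080 bits of input) can be turned into a back-of-the-envelope estimate. Everything in this sketch is approximate: the per-block row count comes from that discussion, the 2^16-row limit is an assumption about the maximum circuit size without chunking, and the function names are hypothetical.

```typescript
const ROWS_PER_BLOCK = 15000; // rough figure from the review discussion
const MAX_ROWS = 65536; // assumed limit (2^16 rows) without chunking
const RATE_BYTES_256 = 136; // rate for a 256-bit digest

/** Number of blocks absorbed, including the block needed for padding. */
function blocksNeeded(messageBytes: number, rate: number = RATE_BYTES_256): number {
  // Padding adds at least one byte, so `rate` bytes already need two blocks.
  return Math.floor(messageBytes / rate) + 1;
}

/** Approximate number of rows for hashing `messageBytes` bytes. */
function estimateRows(messageBytes: number): number {
  return blocksNeeded(messageBytes) * ROWS_PER_BLOCK;
}

/** Whether the estimate stays below the assumed row limit. */
function fitsWithoutChunking(messageBytes: number): boolean {
  return estimateRows(messageBytes) <= MAX_ROWS;
}
```

Under these assumptions a message of 135 bytes (1080 bits) costs one block, and anything beyond four blocks would exceed the assumed limit, which matches the "4 full blocks" remark in the discussion.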

Adding new primitives, especially cryptographic primitives, always includes risks such as the possibility of not constraining the algorithm and input enough to provide the developer with a safe API that is required to build secure applications. However, adding these primitives to SnarkyJS enables developers to explore a new range of important use cases.

## Rationale and alternatives

One alternative is to not expose Keccak and SHA3 to SnarkyJS at all. However, this would essentially render these primitives useless since they were specifically designed to be used by developers with SnarkyJS. By adding these primitives, SnarkyJS will become an even more powerful zero-knowledge SDK that enables developers to explore a wide range of use cases.

## Prior art

Exposing gates and gadgets from the OCaml layer to SnarkyJS is nothing new - the same procedure has been applied to other primitives such as Poseidon, Field, and Elliptic Curve operations.

## Unresolved questions

No unresolved questions.