-
Notifications
You must be signed in to change notification settings - Fork 206
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create payload->blob type system #1120
base: master
Are you sure you want to change the base?
Changes from all commits
9b222e0
bae8843
ebf9979
d66ddf2
f6021ea
422c2a8
a3d184b
856f5fc
1eeec85
a30a4ad
5eab8dd
132e8de
da63536
3016cac
9f98462
58b76c1
916573f
7eed618
c3a6291
b604c82
2ef8e99
13d16c7
22e8134
23b49c4
2add571
e5363e4
7801044
c70fad8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,79 @@ | ||||||
package codecs | ||||||
|
||||||
import ( | ||||||
"fmt" | ||||||
) | ||||||
|
||||||
// Blob is data that is dispersed on eigenDA. | ||||||
// | ||||||
// A Blob is represented under the hood by a coeff polynomial | ||||||
type Blob struct { | ||||||
coeffPolynomial *coeffPoly | ||||||
// blobLength must be a power of 2, and should match the blobLength claimed in the BlobCommitment | ||||||
// This is the blob length IN SYMBOLS, not in bytes | ||||||
blobLength uint32 | ||||||
Comment on lines
+11
to
+14
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. unclear how coeffPolynomial and blobLength are related. Does There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. also can't we just enforce that blobLength = nextPowerOf2(len(coeffPolynomial)) and then we don't need blobLength here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is a corner case here:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will add a description of this corner case as a comment, if you confirm my logic here |
||||||
} | ||||||
|
||||||
// BlobFromBytes initializes a Blob from bytes | ||||||
// | ||||||
// blobLength is the length of the blob IN SYMBOLS | ||||||
func BlobFromBytes(bytes []byte, blobLength uint32) (*Blob, error) { | ||||||
poly, err := coeffPolyFromBytes(bytes) | ||||||
if err != nil { | ||||||
return nil, fmt.Errorf("polynomial from bytes: %w", err) | ||||||
} | ||||||
|
||||||
return BlobFromPolynomial(poly, blobLength) | ||||||
} | ||||||
Comment on lines
+20
to
+27
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. see comment above, do we not want to enforce some relationship between len(bytes) and blobLength? |
||||||
|
||||||
// BlobFromPolynomial initializes a blob from a polynomial | ||||||
// | ||||||
// blobLength is the length of the blob IN SYMBOLS | ||||||
func BlobFromPolynomial(coeffPolynomial *coeffPoly, blobLength uint32) (*Blob, error) { | ||||||
return &Blob{ | ||||||
coeffPolynomial: coeffPolynomial, | ||||||
blobLength: blobLength, | ||||||
}, nil | ||||||
} | ||||||
|
||||||
// GetBytes gets the raw bytes of the Blob | ||||||
func (b *Blob) GetBytes() []byte { | ||||||
return b.coeffPolynomial.getBytes() | ||||||
} | ||||||
Comment on lines
+40
to
+42
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. have a preference for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. or There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I personally tend to understand a nuanced difference between
So, I chose But maybe I imagine this nuance, in which case I'm ok changing the current names. What would your preferred naming be for these two methods? |
||||||
|
||||||
// ToPayload converts the Blob into a Payload | ||||||
// | ||||||
// The payloadStartingForm indicates how payloads are constructed by the dispersing client. Based on the starting form | ||||||
// of the payload, we can determine what operations must be done to the blob in order to reconstruct the original payload | ||||||
func (b *Blob) ToPayload(payloadStartingForm PolynomialForm) (*Payload, error) { | ||||||
Comment on lines
+46
to
+48
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I prefer if get rid of the Wdyt of just There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think |
||||||
var encodedPayload *encodedPayload | ||||||
var err error | ||||||
switch payloadStartingForm { | ||||||
case PolynomialFormCoeff: | ||||||
// the payload started off in coefficient form, so no conversion needs to be done | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. again to me its not about "starting". its more about the invariant that we want to convert a blob to a payload. This requires knowing 2 piece of info:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
encodedPayload, err = b.coeffPolynomial.toEncodedPayload(b.blobLength) | ||||||
if err != nil { | ||||||
return nil, fmt.Errorf("coeff poly to encoded payload: %w", err) | ||||||
} | ||||||
case PolynomialFormEval: | ||||||
// the payload started off in evaluation form, so we first need to convert the blob's coeff poly into an eval poly | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
evalPoly, err := b.coeffPolynomial.toEvalPoly(b.blobLength) | ||||||
if err != nil { | ||||||
return nil, fmt.Errorf("coeff poly to eval poly: %w", err) | ||||||
} | ||||||
|
||||||
encodedPayload, err = evalPoly.toEncodedPayload(b.blobLength) | ||||||
if err != nil { | ||||||
return nil, fmt.Errorf("eval poly to encoded payload: %w", err) | ||||||
} | ||||||
default: | ||||||
return nil, fmt.Errorf("invalid polynomial form") | ||||||
} | ||||||
|
||||||
payload, err := encodedPayload.decode() | ||||||
if err != nil { | ||||||
return nil, fmt.Errorf("decode payload: %w", err) | ||||||
} | ||||||
|
||||||
return payload, nil | ||||||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,22 +4,24 @@ import ( | |
"fmt" | ||
) | ||
|
||
type BlobEncodingVersion byte | ||
type PayloadEncodingVersion uint8 | ||
|
||
const ( | ||
// This minimal blob encoding contains a 32 byte header = [0x00, version byte, uint32 len of data, 0x00, 0x00,...] | ||
// PayloadEncodingVersion0 entails a 32 byte header = [0x00, version byte, big-endian uint32 len of payload, 0x00, 0x00,...] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. can you add comment as to why we add the first 0x00 byte (so that it gets parsed as a bn254 FE). Took me a second to remember this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
// followed by the encoded data [0x00, 31 bytes of data, 0x00, 31 bytes of data,...] | ||
DefaultBlobEncoding BlobEncodingVersion = 0x0 | ||
// | ||
// Each group of 32 bytes starts with a 0x00 byte so that they can be parsed as valid bn254 field elements. | ||
PayloadEncodingVersion0 PayloadEncodingVersion = 0x0 | ||
) | ||
|
||
type BlobCodec interface { | ||
DecodeBlob(encodedData []byte) ([]byte, error) | ||
EncodeBlob(rawData []byte) ([]byte, error) | ||
} | ||
|
||
func BlobEncodingVersionToCodec(version BlobEncodingVersion) (BlobCodec, error) { | ||
func BlobEncodingVersionToCodec(version PayloadEncodingVersion) (BlobCodec, error) { | ||
switch version { | ||
case DefaultBlobEncoding: | ||
case PayloadEncodingVersion0: | ||
return DefaultBlobCodec{}, nil | ||
Comment on lines
+24
to
25
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this looks a bit weird now. Why is payload version 0 mapping to a defaultBlobCodec? Should we use BlobCodecVersion0 instead? Or do we want separate versioning for the payloadEncoding and BlobCodec? Or are these things the same? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I renamed the constant to what I felt made sense for the new code, and but the name doesn't work so well in the old
Yes, they are essentially the same thing. |
||
default: | ||
return nil, fmt.Errorf("unsupported blob encoding version: %x", version) | ||
|
@@ -33,7 +35,7 @@ func GenericDecodeBlob(data []byte) ([]byte, error) { | |
// version byte is stored in [1], because [0] is always 0 to ensure the codecBlobHeader is a valid bn254 element | ||
// see https://github.com/Layr-Labs/eigenda/blob/master/api/clients/codecs/default_blob_codec.go#L21 | ||
// TODO: we should prob be working over a struct with methods such as GetBlobEncodingVersion() to prevent index errors | ||
version := BlobEncodingVersion(data[1]) | ||
version := PayloadEncodingVersion(data[1]) | ||
codec, err := BlobEncodingVersionToCodec(version) | ||
if err != nil { | ||
return nil, err | ||
|
@@ -49,7 +51,7 @@ func GenericDecodeBlob(data []byte) ([]byte, error) { | |
|
||
// CreateCodec creates a new BlobCodec based on the defined polynomial form of payloads, and the desired | ||
// BlobEncodingVersion | ||
func CreateCodec(payloadPolynomialForm PolynomialForm, version BlobEncodingVersion) (BlobCodec, error) { | ||
func CreateCodec(payloadPolynomialForm PolynomialForm, version PayloadEncodingVersion) (BlobCodec, error) { | ||
lowLevelCodec, err := BlobEncodingVersionToCodec(version) | ||
if err != nil { | ||
return nil, fmt.Errorf("create low level codec: %w", err) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
package codecs | ||
|
||
import ( | ||
"bytes" | ||
"testing" | ||
|
||
"github.com/stretchr/testify/require" | ||
) | ||
|
||
// TestBlobConversion checks that internal blob conversion methods produce consistent results | ||
func FuzzBlobConversion(f *testing.F) { | ||
for _, seed := range [][]byte{{}, {0x00}, {0xFF}, {0x00, 0x00}, {0xFF, 0xFF}, bytes.Repeat([]byte{0x55}, 1000)} { | ||
f.Add(seed) | ||
} | ||
|
||
f.Fuzz( | ||
func(t *testing.T, originalData []byte) { | ||
testBlobConversionForForm(t, originalData, PolynomialFormEval) | ||
testBlobConversionForForm(t, originalData, PolynomialFormCoeff) | ||
}) | ||
|
||
} | ||
|
||
func testBlobConversionForForm(t *testing.T, payloadBytes []byte, form PolynomialForm) { | ||
payload := NewPayload(payloadBytes) | ||
|
||
blob, err := payload.ToBlob(form) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The natural interpretation for this reads wrong :( There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Like perhaps defining on Blob a method There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What do you think of this idea: when constructing a payload, declare in the constructor what form it is. The sequence would then read:
I personally really dislike like the pattern of creating a new empty object, just to call a method that creates another instance of the object |
||
require.NoError(t, err) | ||
|
||
blobBytes := blob.GetBytes() | ||
blobFromBytes, err := BlobFromBytes(blobBytes, blob.blobLength) | ||
require.NoError(t, err) | ||
|
||
decodedPayload, err := blobFromBytes.ToPayload(form) | ||
require.NoError(t, err) | ||
|
||
decodedPayloadBytes := decodedPayload.GetBytes() | ||
|
||
require.Equal(t, payloadBytes, decodedPayloadBytes) | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
package codecs | ||
|
||
import ( | ||
"fmt" | ||
|
||
"github.com/Layr-Labs/eigenda/encoding/fft" | ||
"github.com/Layr-Labs/eigenda/encoding/rs" | ||
"github.com/consensys/gnark-crypto/ecc/bn254/fr" | ||
) | ||
|
||
// coeffPoly is a polynomial in coefficient form. | ||
// | ||
// The underlying bytes represent 32 byte field elements, and each field element represents a coefficient | ||
type coeffPoly struct { | ||
fieldElements []fr.Element | ||
} | ||
Comment on lines
+14
to
+16
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This duplicates an existing type (currently in
Also, there is a bunch of serialization logic in that file as well for going back and forth between serialized and deserialized form. Is the plan to circle back and replace these old types in a follow up PR? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there a reason why you put this into a struct instead of doing |
||
|
||
// coeffPolyFromBytes creates a new polynomial from bytes. This function performs the necessary checks to guarantee that the | ||
// bytes are well-formed, and returns a new object if they are | ||
func coeffPolyFromBytes(bytes []byte) (*coeffPoly, error) { | ||
fieldElements, err := rs.ToFrArray(bytes) | ||
if err != nil { | ||
return nil, fmt.Errorf("deserialize field elements: %w", err) | ||
} | ||
|
||
return &coeffPoly{fieldElements: fieldElements}, nil | ||
} | ||
|
||
// coeffPolyFromElements creates a new coeffPoly from field elements. | ||
func coeffPolyFromElements(elements []fr.Element) *coeffPoly { | ||
return &coeffPoly{fieldElements: elements} | ||
} | ||
|
||
// toEvalPoly converts a coeffPoly to an evalPoly, using the FFT operation | ||
// | ||
// blobLength (in SYMBOLS) is required, to be able to choose the correct parameters when performing FFT | ||
func (p *coeffPoly) toEvalPoly(blobLength uint32) (*evalPoly, error) { | ||
// TODO (litt3): this could conceivably be optimized, so that multiple objects share an instance of FFTSettings, | ||
// which has enough roots of unity for general use. If the following construction of FFTSettings ever proves | ||
// to present a computational burden, consider making this change. | ||
fftSettings := fft.FFTSettingsFromBlobLength(blobLength) | ||
|
||
// the FFT method pads to the next power of 2, so we don't need to do that manually | ||
fftedElements, err := fftSettings.FFT(p.fieldElements, false) | ||
if err != nil { | ||
return nil, fmt.Errorf("perform FFT: %w", err) | ||
} | ||
|
||
return evalPolyFromElements(fftedElements), nil | ||
} | ||
|
||
// GetBytes returns the bytes that underlie the polynomial | ||
func (p *coeffPoly) getBytes() []byte { | ||
return rs.FieldElementsToBytes(p.fieldElements) | ||
} | ||
Comment on lines
+53
to
+55
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My personal preference for methods like this where we convert to/from |
||
|
||
// toEncodedPayload converts a coeffPoly into an encoded payload | ||
// | ||
// blobLength is required, to be able to perform length checks on the encoded payload during construction. | ||
// blobLength is in symbols, NOT bytes | ||
func (p *coeffPoly) toEncodedPayload(blobLength uint32) (*encodedPayload, error) { | ||
return encodedPayloadFromElements(p.fieldElements, blobLength) | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
package codecs | ||
|
||
import ( | ||
"bytes" | ||
"testing" | ||
|
||
"github.com/stretchr/testify/require" | ||
) | ||
|
||
// FuzzConversionConsistency checks that data can be encoded and decoded repeatedly, always getting back the original data | ||
func FuzzConversionConsistency(f *testing.F) { | ||
for _, seed := range [][]byte{{}, {0x00}, {0xFF}, {0x00, 0x00}, {0xFF, 0xFF}, bytes.Repeat([]byte{0x55}, 1000)} { | ||
f.Add(seed) | ||
} | ||
|
||
f.Fuzz( | ||
func(t *testing.T, originalData []byte) { | ||
payload := NewPayload(originalData) | ||
|
||
blob1, err := payload.ToBlob(PolynomialFormEval) | ||
require.NoError(t, err) | ||
|
||
blob2, err := payload.ToBlob(PolynomialFormCoeff) | ||
require.NoError(t, err) | ||
|
||
decodedPayload1, err := blob1.ToPayload(PolynomialFormEval) | ||
require.NoError(t, err) | ||
|
||
decodedPayload2, err := blob2.ToPayload(PolynomialFormCoeff) | ||
require.NoError(t, err) | ||
|
||
require.Equal(t, originalData, decodedPayload1.GetBytes()) | ||
require.Equal(t, originalData, decodedPayload2.GetBytes()) | ||
}) | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -22,7 +22,7 @@ func (v DefaultBlobCodec) EncodeBlob(rawData []byte) ([]byte, error) { | |
codecBlobHeader := make([]byte, 32) | ||
// first byte is always 0 to ensure the codecBlobHeader is a valid bn254 element | ||
// encode version byte | ||
codecBlobHeader[1] = byte(DefaultBlobEncoding) | ||
codecBlobHeader[1] = byte(PayloadEncodingVersion0) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. doesn't this function There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
|
||
// encode length as uint32 | ||
binary.BigEndian.PutUint32(codecBlobHeader[2:6], uint32(len(rawData))) // uint32 should be more than enough to store the length (approx 4gb) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider renaming to
blobLengthSymbols
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although is symbol standard here? I feel like our whitepaper is the only place I've ever seen it. Should we just use FE for field element everywhere? BlobLengthFE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally used
blobLength
to try to keep consistency with the struct in the commitment, but I agree, it's not ideal that I have to put in comments everywhere that it's in symbolsI don't have a preference between
BlobLengthFE
orBlobLengthSymbols
. Usage ofSymbols
is pretty pervasive in our codebase, but also I don't think anyone would be confused by theFE
terminology. Do you have a preference?