feat!: multiple blobs per PFB #1154
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1154 +/- ##
==========================================
+ Coverage 48.20% 48.52% +0.31%
==========================================
Files 72 72
Lines 4070 4134 +64
==========================================
+ Hits 1962 2006 +44
- Misses 1936 1951 +15
- Partials 172 177 +5
We should be able to split one or two minor refactors out of this PR.
Overall looks really good to me! Hyped for this feature
func() [][]int { return [][]int{{4}} },
2, 1,
[question] if the square size is 2, the blob size is 4, and the expected starting index is 1, how does the blob fit in the square? I would expect something like:
C = compact share
B = blob share
|C|B|
|B|B|
with one B share leftover
Ahh I see, this threw me off after you mentioned it too. This only takes a single share as its blob size, not the number of shares that are used, so we have a single single-share tx, and therefore
|C|B|
|_|_|
Ohh I see, blobSizes is the # of bytes in the blob. [No change needed] Then I expect this to behave similarly, and indeed it does:
func() [][]int { return [][]int{{4}} },
2, 1,
func() [][]int { return [][]int{{1}} },
2, 1,
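For context, a minimal sketch of the arithmetic being discussed, assuming a placeholder shareCapacity constant (the real per-share capacity lives in the app's constants and differs from this value): a 4-byte blob and a 1-byte blob both round up to a single share, which is why both test cases produce the same |C|B| layout.

```go
package main

import "fmt"

// shareCapacity is a placeholder for the usable bytes in a blob share; the
// real constant lives in the app's consts package and differs from this value.
const shareCapacity = 478

// sharesNeeded is an illustrative ceiling division: any blob of
// 1..shareCapacity bytes occupies exactly one share, so a 4-byte blob and a
// 1-byte blob both produce the |C|B| layout discussed above.
func sharesNeeded(blobSize int) int {
	return (blobSize + shareCapacity - 1) / shareCapacity
}

func main() {
	fmt.Println(sharesNeeded(4)) // 1
	fmt.Println(sharesNeeded(1)) // 1
}
```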
Just some high level comments first
// with this message should use when included in a block. The share_version
// specified must match the share_version used to generate the
// share_commitment in this message.
uint32 share_version = 8;
repeated uint32 share_versions = 8;
Question: would we ever want a message to refer to blobs that have different share_versions? Can't we enforce that all blobs must have the same share_version?
There could maybe be a scenario with multiple share_versions, but we could enforce that if we wanted to. It's probably a good trade-off.
Ok perhaps a better question: What changes would cause us to change the share version?
One use case is to add the sender of the PFB, or other metadata, before the blob in the square. Other use cases would be to allow posters to skip varints or use some other encoding, should that be useful.
I'm not sure if people will ever want to mix those, though. Even if they do, only in that specific case do we get a linear increase in state transitions per different share version, with the benefit of removing a lot of redundant bytes, which sounds like a really good tradeoff.
Yeah, I don't think that constraining the share versions for blobs within a single PFB is going to be a problem for users. Worst case, they just have two PFBs (one for each version) and pay the extra 2 cents in gas.
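A minimal sketch of the constraint floated above (all blobs in one PFB must use the same share version); the function name and errors are hypothetical placeholders, not the app's actual validation code.

```go
package main

import (
	"errors"
	"fmt"
)

// assertUniformShareVersions sketches the constraint discussed above: every
// entry in a PFB's share_versions must be identical or the message is rejected.
func assertUniformShareVersions(shareVersions []uint32) error {
	if len(shareVersions) == 0 {
		return errors.New("at least one share version is required")
	}
	first := shareVersions[0]
	for _, v := range shareVersions[1:] {
		if v != first {
			return fmt.Errorf("mixed share versions in one PFB: %d and %d", first, v)
		}
	}
	return nil
}

func main() {
	fmt.Println(assertUniformShareVersions([]uint32{0, 0, 0})) // <nil>
	fmt.Println(assertUniformShareVersions([]uint32{0, 1}))    // error
}
```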
left an issue in #1223
@@ -26,16 +26,16 @@ message ShareCommitAndSignature {
// MsgPayForBlob pays for the inclusion of a blob in the block.
message MsgPayForBlob {
message MsgPayForBlob {
message MsgPayForBlobs {
?
Definitely!! Started to change this in this PR, but since this PR is already large, decided to spin it off as a follow-up: #1221
@@ -26,16 +26,16 @@ message ShareCommitAndSignature {
// MsgPayForBlob pays for the inclusion of a blob in the block.
message MsgPayForBlob {
string signer = 1;
bytes namespace_id = 2;
uint64 blob_size = 3;
repeated bytes namespace_ids = 2;
Thinking out loud: most cases of submitting multiple blobs would probably all be for the same namespace. Could we add a rule that if there is a single namespace_id yet multiple blob_sizes, the same namespace is used for all blobs?
I actually think there will be a lot of use cases where people post PFBs with many different namespaces, but better yet, we have #1115
The only problem with not encoding the namespace is that it means we have events without namespaces, which could be annoying for indexers. Say I want to query the most popular namespaces: that now becomes super difficult.
Left a comment in #1115 linking to this comment, as it's a good point.
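A hedged sketch of the kind of basic consistency check the repeated fields imply (one namespace and one size per blob); the struct and method names are illustrative stand-ins, not the generated protobuf types.

```go
package main

import (
	"errors"
	"fmt"
)

// msgPayForBlob loosely mirrors the repeated fields above; the real type is
// generated from the protobuf definition and differs in detail.
type msgPayForBlob struct {
	NamespaceIDs [][]byte
	BlobSizes    []uint64
}

// validateBasic is an illustrative consistency check: each blob needs exactly
// one namespace and one size, so the two slices must line up.
func (m msgPayForBlob) validateBasic() error {
	if len(m.NamespaceIDs) == 0 {
		return errors.New("no blobs to pay for")
	}
	if len(m.NamespaceIDs) != len(m.BlobSizes) {
		return fmt.Errorf("got %d namespaces but %d blob sizes",
			len(m.NamespaceIDs), len(m.BlobSizes))
	}
	return nil
}

func main() {
	msg := msgPayForBlob{
		NamespaceIDs: [][]byte{{1, 1, 1, 1, 1, 1, 1, 1}, {2, 2, 2, 2, 2, 2, 2, 2}},
		BlobSizes:    []uint64{4, 1},
	}
	fmt.Println(msg.validateBasic()) // <nil>
}
```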
app/process_proposal.go
Outdated
if len(msgs) != 1 {
	continue
}
If it's not possible for a correct node to generate this in PrepareProposal, then we should outright reject the entire proposal. Same for line 96.
Good point, I agree and don't see why we shouldn't change this now. e9a8014
I tried to test this, but it would take a ton of code to hit this case, as we'd basically have to write a custom share-splitting method just to create an otherwise valid block.
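A minimal, self-contained sketch of the behaviour change agreed on here: a transaction that does not decode to exactly one message now causes the whole proposal to be rejected instead of being skipped. The types and names below are placeholders, not the ABCI response actually returned by ProcessProposal.

```go
package main

import "fmt"

// processResult stands in for the ABCI accept/reject response; the real
// handler returns an abci.ResponseProcessProposal, which is not reproduced here.
type processResult int

const (
	accept processResult = iota
	reject
)

// checkTxs sketches the change discussed above: a transaction that does not
// decode to exactly one message is no longer skipped with `continue`; it now
// rejects the whole proposal, since an honest PrepareProposal can never
// produce such a transaction.
func checkTxs(msgCounts []int) processResult {
	for _, n := range msgCounts {
		if n != 1 {
			return reject // previously: continue
		}
	}
	return accept
}

func main() {
	fmt.Println(checkTxs([]int{1, 1, 1})) // 0 (accept)
	fmt.Println(checkTxs([]int{1, 2}))    // 1 (reject)
}
```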
addressed some feedback here, but I still plan on breaking off one refactor in a meager attempt to reduce the size of the PR.
pkg/inclusion/get_commit.go
Outdated
}
commitments[i] = commitment
}
return merkle.HashFromByteSlices(commitments), nil
What's the advantage of merkelizing all the commitments over plain hashing?
Good question. If we have many blobs being paid for by a PFB, then there might be some scenario where you want to prove that someone paid for a specific blob without downloading the entire blob. Other than that, I'm not really sure; using a hash would probably work fine.
want to prove that someone paid for a specific blob without downloading the entire blob
Won't you still need the blob, or at least the commitment for that blob?
(I'm generally a bit hesitant about merkelizing something without a strong use case. For example, tendermint merkelizes the header, which I always felt was pointless. The header is like 400 bytes. Just download the entire thing.)
Another good point.
It would be good to get @nashqueue's point of view on this as well for inclusion fraud proofs. We can discuss synchronously if necessary, as we should probably make a decision soon. I'm definitely fine with changing this to simply include all of the share commitments in the PFB instead of using a secondary commitment over all of them.
I'm fairly certain that in both of the other options (taking a hash over all of the commitments, or including all commitments in the PFB), we would end up having to download all of the share commitments for a single PFB if we're attempting to prove inclusion of a single one.
If there is ever a scenario where a single PFB pays for 1000 blobs, then I could see the secondary commitment being useful; until then, probably not.
I think you make a tradeoff between fraud-proof size and DA bytes. If you have multiple commitments, you only need to show that one of them is broken. If you have only one, you need all the blobs to compute the merkle hash and show that it is wrong. If there is no limit on PFBs per tx, then the worst-case fraud-proof size for a single-commitment fraud proof would be O(n), with n being the number of shares in a square (this is the case if you have one share-sized blob per tx). So I would say saving all commitments is the better option here.
sounds good, that's what we'll do then
Left an issue to handle after this is merged: #1231
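To make the tradeoff concrete, a small sketch contrasting the two options, reusing the merkle.HashFromByteSlices helper the quoted get_commit.go code already calls; everything else here is illustrative, not the PFB's actual verification path.

```go
package main

import (
	"bytes"
	"fmt"

	"github.com/tendermint/tendermint/crypto/merkle"
)

// Option originally in the PR: one merkelized commitment. To show the root is
// wrong, a fraud proof has to supply every per-blob commitment so the root
// can be recomputed.
func verifyAgainstRoot(root []byte, allCommitments [][]byte) bool {
	return bytes.Equal(root, merkle.HashFromByteSlices(allCommitments))
}

// Option settled on above: a repeated share commitment field. The disputed
// commitment can be checked directly, without the other blobs.
func verifyDirect(claimed, recomputed []byte) bool {
	return bytes.Equal(claimed, recomputed)
}

func main() {
	commitments := [][]byte{{0x01}, {0x02}, {0x03}}
	root := merkle.HashFromByteSlices(commitments)
	fmt.Println(verifyAgainstRoot(root, commitments))       // true
	fmt.Println(verifyDirect(commitments[1], []byte{0x02})) // true
}
```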
…1196)
## Overview
Part of #388, and split apart from #1154. This PR uses a new mechanism to generate the share commit that can create a commitment over multiple blobs. We don't yet support multiple blobs per PFB, so this is currently effectively hashing the normal share commitment.
## Checklist
- [x] New and updated code has appropriate documentation
- [x] New and updated code has new and/or updated testing
- [x] Required CI checks are passing
- [x] Visual proof for any user facing features like CLI or documentation updates
- [x] Linked issues closed with keywords
## Overview
Part of #388, split apart from #1154. This PR refactors enforcement of only allowing a single PFB per sdk.Tx and, per feedback in #1154, **adds a new block validity rule** where all transactions must be decodable!
## Checklist
- [x] New and updated code has appropriate documentation
- [x] New and updated code has new and/or updated testing
- [x] Required CI checks are passing
- [x] Visual proof for any user facing features like CLI or documentation updates
- [x] Linked issues closed with keywords

Co-authored-by: Callum Waters <[email protected]>
Co-authored-by: Rootul P <[email protected]>
Co-authored-by: Rootul P <[email protected]>
## Overview
Bumps core to v1.13.0-tm-v0.34.23; blocking #1154 and #1042.
## Checklist
- [x] New and updated code has appropriate documentation
- [x] New and updated code has new and/or updated testing
- [x] Required CI checks are passing
- [x] Visual proof for any user facing features like CLI or documentation updates
- [x] Linked issues closed with keywords
There are still a lot of follow-ups to this PR, but if there is no more feedback, I'll try to merge this PR Monday.
LGTM w/ the exception of one test name
Co-authored-by: Rootul P <[email protected]>
Overview
This PR implements the ability to add an arbitrary number of blobs to a single PFB. Leaving it as a draft until we merge the blocking PR in core, and probably until we tidy up a bit, rebase, or add a few more unit tests.
In part thanks to us planning ahead earlier this year, there actually aren't that many required changes to get multiple blobs per PFB: mostly just adding tests, using a new mechanism to calculate the share commitments, making a few things a slice instead of a single value, and minor adjustments to make square estimation and square layout work with multiple blobs.
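As a rough illustration of the square-estimation piece mentioned above, a sketch under simplifying assumptions (the names and numbers are illustrative, not the app's actual estimator):

```go
package main

import "fmt"

// estimateSquareSize sketches the multi-blob case under simplifying
// assumptions: sum the shares used by the compact (tx/PFB) shares and every
// blob, then pick the smallest power-of-two side length whose square covers
// that total. The real estimator also has to account for padding between
// namespaces and other layout rules not modelled here.
func estimateSquareSize(compactShares int, blobShares []int) int {
	total := compactShares
	for _, n := range blobShares {
		total += n
	}
	side := 1
	for side*side < total {
		side *= 2
	}
	return side
}

func main() {
	fmt.Println(estimateSquareSize(1, []int{1}))    // 2, the 2x2 example above
	fmt.Println(estimateSquareSize(4, []int{8, 9})) // 8
}
```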
closes #388
Checklist