GIP-0083: Substreams On The Network. #63

Open
wants to merge 6 commits into
base: main
from

Conversation

abourget

@abourget abourget commented Dec 6, 2024

Addition of GIP-0080, to bring Substreams on the network officially, and pave the road to indexing rewards.

@@ -0,0 +1,111 @@
---
GIP: "0080"
Member

GIP 80 and 81 have been taken already, could you change this to 0082? (and include it in the PR title plz)

Contributor

Need to get in quick, 0082 taken too now. :)

Note we are moving away from having the title included in the filename, so just "0083.md" would be fine/better.

Contributor

@RembrandtK RembrandtK Dec 11, 2024

Actually it seems this proposal might have got to 0082 first, and the other 0082 should be bumped.

We should perhaps change the process so that you get a GIP number assigned (with a placeholder merged to main) and then start work without interference.

A practical (imperfect) thing that can be done right now is to include the GIP number in the PR title. Then, if you create a new PR, hopefully you scan the existing titles and can stake your claim. :)


# Abstract

This GIP proposes integrating Substreams, a streaming-first blockchain indexing technology, into The Graph Network. This integration will enable Substreams to be fully recognized on the network and eventually receive indexing rewards.
Member

Receiving indexing rewards may require smart contract changes and will require a bit more work. It might be worth mentioning that to avoid confusion about the scope of this GIP.


## Payments

Payments for Substreams work will be on-chain. The existing `collect` function in the staking contract will be used for payment processing. It was deemed that no TAP was required to have Substreams collect payments and honor the promises of the protocol.
Member

TAP is not required for initial full network support, but it is still strongly desirable for the long term. I think it would be good to make that distinction.
IMO it would still be necessary to have some clarity in the GIP on what the MVP of payments looks like. Who calls collect, and at what times? How is the pricing defined? How do indexer and payer agree on the price? What do indexers need to do?

@p-diogo p-diogo Dec 11, 2024

+1 on Pablo's last questions. Most readers may not understand how the staking contract works, specifically what directly calling collect() allows for and who can do it. The GIP lacks additional context, and references to previous work like the Permissionless Payers GIP would be appreciated.

> How is the pricing defined? How do indexer and payer agree on the price? What do indexers need to do?

My understanding is that pricing information currently lives only in thegraph.market, which I assume comes from the mentioned providers.json file. Given that the .json file will be populated with the endpoints registered on-chain (after some health checks), some tooling is missing for Indexers to transparently advertise their rate (which seems to be tied to $ per byte). Maybe the DataEdge could also be used for this, with a well-defined protocol, and tracked in a subgraph which eventually feeds that providers.json.


Economic security will be enforced through slashing and disputes. Indexers providing incorrect data can have their stake slashed.

- **Attestations:** Substreams endpoints will provide signed attestations (optional) to verify data integrity. The command-line tools (`substreams gui --request-attestations` or `substreams run --request-attestations`) will generate and display these attestations. To accommodate different future attestation methods, and not only an Ethereum key signature, the attestation payload will be prefixed with the protocol used (e.g., `eth:` for Ethereum). Detailed documentation will describe the attestation format and verification process.
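
To illustrate the protocol prefix described in the bullet above, here is a minimal sketch of how a client might split an attestation string into its scheme and payload. The GIP does not define the wire format beyond the `eth:` example, so the `Attestation` type, the `parse_attestation` helper, the `KNOWN_SCHEMES` set, and the sample payload are all hypothetical, and signature verification is deliberately left out.

```python
from dataclasses import dataclass

# Hypothetical set of supported schemes; the GIP only names "eth:" as an example.
KNOWN_SCHEMES = {"eth"}


@dataclass(frozen=True)
class Attestation:
    scheme: str   # protocol prefix, e.g. "eth"
    payload: str  # scheme-specific signature data, treated as opaque here


def parse_attestation(raw: str) -> Attestation:
    """Split '<scheme>:<payload>' and reject malformed or unknown input."""
    # Split on the first colon only, so the payload may itself contain colons.
    scheme, sep, payload = raw.partition(":")
    if not sep or not scheme or not payload:
        raise ValueError(f"malformed attestation: {raw!r}")
    if scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unsupported attestation scheme: {scheme!r}")
    return Attestation(scheme=scheme, payload=payload)
```

Dispatching on the prefix before touching the payload is what would let a later attestation method be added without changing how existing `eth:` attestations are handled.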
Member

Attestations being optional sounds odd. It makes me think a malicious indexer could provide incorrect data all the time, and only serve correct data when an attestation is provided. Can't attestations be included in payloads by default?

Author

Yes, perhaps the server could be configured to always include them. I was hesitant to make it always sign, because of potential performance issues.

Also, if someone is running the tech internally, perhaps it doesn't make sense for them to sign stuff.
So perhaps a server serving on the network could have --always-sign-payloads or something?

I'll add some details

Copy link

It makes sense for those who registered their endpoints through the DataEdge contract to run with that parameter all the time. That info should be made accessible via the /health or /info endpoints mentioned so that at least frontends like thegraph.market could indicate that to consumers (and keep track of those who may at times turn it off).
But I'd prefer to defer to software defaults and have it enabled, as we should optimize for a scenario where Substreams services are always offered through our network (or at least assume most will do it via The Graph).


## Feature Matrix

This GIP does _not_ propose enabling indexing rewards for Substreams just yet. It is the first step towards rolling out everything that is necessary to eventually get there. This is contingent on data determinism assurances and Indexer readiness. Future GIPs will propose expanding indexing rewards.
Member

As mentioned above, enabling indexing rewards would also require figuring out what activity would be rewarded (is it a firehose POI?) and might require smart contract changes too.

Author

yes, so perhaps we detail that in a future GIP

@abourget
Author

OK, I've revised a few things based on the feedback I've received.


## Payments

Payments for Substreams work will be on-chain. The existing `collect` function in the staking contract will be used for payment processing. It was deemed that no TAP was required to have Substreams collect payments and honor the promises of the protocol. However, as TAP brings augmented trust minimization properties, it would be incorporated in future work.
Contributor

Can we expand a bit on how this solution integrates with the current protocol, in particular the payments architecture?

I'm not sure I get the whole picture, some questions that I have after reading through the document:

  • The collect function in the staking contract pulls funds from the caller, is the consumer supposed to make this call?
  • Collect function also requires an allocation id, which implies the existence of a subgraph deployment id. How does that fit with substreams?
  • Another feature of the staking collect method is curation fees, what role does curation play with substreams?
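
To make the three questions above concrete, here is a toy model of only the properties they name: the call pulls funds from the caller, it requires a known allocation id, and a share goes to curation. It is not the staking contract's actual logic. The `ToyStaking` class, the `curation_cut_ppm` value, and the account names are hypothetical, and any other fee handling the real contract performs is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyStaking:
    curation_cut_ppm: int  # hypothetical curation share, in parts per million
    balances: Dict[str, int] = field(default_factory=dict)     # account -> tokens
    allocations: Dict[str, str] = field(default_factory=dict)  # allocation id -> indexer
    indexer_fees: Dict[str, int] = field(default_factory=dict)
    curation_fees: int = 0

    def collect(self, caller: str, tokens: int, allocation_id: str) -> None:
        # Property 2: an allocation id must exist for the call to succeed.
        if allocation_id not in self.allocations:
            raise KeyError(f"unknown allocation: {allocation_id}")
        # Property 1: funds are pulled from whoever makes the call.
        if self.balances.get(caller, 0) < tokens:
            raise ValueError("caller has insufficient balance")
        self.balances[caller] -= tokens
        # Property 3: a share of the collected tokens is set aside for curation.
        curation = tokens * self.curation_cut_ppm // 1_000_000
        self.curation_fees += curation
        indexer = self.allocations[allocation_id]
        self.indexer_fees[indexer] = self.indexer_fees.get(indexer, 0) + tokens - curation
```

In this model, whoever calls `collect` must hold the tokens, which is why the question of whether the consumer is expected to make the call matters for the payment flow.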


## Indexer Selection Algorithm (ISA)

Given that there's no gateway involved with Substreams (it's direct point-to-point between consumer and provider), the existing Indexer Selection Algorithm (ISA) does not apply. A round-robin selection of providers will be used in the front-end (https://thegraph.market) to ensure fair distribution of request loads, while still allowing consumers to choose a provider.
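
The selection behavior described above (round-robin by default, with the consumer free to pick a provider) could look roughly like the following sketch. The `ProviderPicker` class and the provider names are hypothetical, and the GIP does not say how thegraph.market implements the rotation.

```python
from itertools import cycle
from typing import Iterator, Optional, Sequence


class ProviderPicker:
    """Rotate through providers unless the consumer names one explicitly."""

    def __init__(self, providers: Sequence[str]) -> None:
        if not providers:
            raise ValueError("at least one provider is required")
        self._providers = list(providers)
        self._rotation: Iterator[str] = cycle(self._providers)

    def pick(self, preferred: Optional[str] = None) -> str:
        # An explicit consumer choice wins and does not advance the rotation.
        if preferred is not None:
            if preferred not in self._providers:
                raise ValueError(f"unknown provider: {preferred!r}")
            return preferred
        # Otherwise hand out providers in turn, wrapping around at the end.
        return next(self._rotation)


# Usage with placeholder provider names:
picker = ProviderPicker(["provider-a", "provider-b", "provider-c"])
first_four = [picker.pick() for _ in range(4)]
```

Leaving the rotation untouched on an explicit choice is one possible design. It keeps the default distribution even regardless of how often consumers override it.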
Contributor

I think it's worth clarifying what the role of thegraph.market is and how consumers are supposed to integrate with it. So far they have been using thegraph.com/explorer as the entrypoint for their subgraph needs. Since we are changing this, I think it's worth making it explicit that this is a new product.


## Economic Security and Disputes

Economic security will be enforced through slashing and disputes. Indexers providing incorrect data can have their stake slashed.
Contributor

This ties to my previous comment: how is economic security enforced? Do Indexers just need to stake? Do they need to allocate in some form?


## Arbitration

StreamingFast will provide support for arbitration cases involving Substreams, either by hiring an Arbiter or by collaborating with individuals experienced in arbitration. A detailed document outlining the arbitration process for Substreams discrepancies will be created.
Contributor

I think some detail about how arbitration is going to work is needed here. It sounds like it's going to be a completely off-chain process; however, the slashing part needs to occur on-chain through the staking contract. How is that going to be bridged?

@tmigone
Contributor

tmigone commented Dec 16, 2024

Another comment: the GIP makes no reference to Horizon, so I assume it's not a prerequisite. Is that assumption correct?

@p-diogo p-diogo requested a review from fordN December 16, 2024 22:46
@abourget abourget changed the title from "First draft of Substreams On The Network." to "0084 - First draft of Substreams On The Network." Jan 7, 2025
@abourget abourget changed the title from "0084 - First draft of Substreams On The Network." to "GIP-0083: Substreams On The Network." Jan 7, 2025
5 participants