Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Libp2p Gossipsub Peer Gater #6479

Merged
merged 13 commits into from
Sep 23, 2024
Merged

Add Libp2p Gossipsub Peer Gater #6479

merged 13 commits into from
Sep 23, 2024

Conversation

kc1116
Copy link
Contributor

@kc1116 kc1116 commented Sep 18, 2024

This PR adds the libp2p gossipsub peer gater option to the flow gossipsub adapter. The peer gater is used by gossipsub to throttle messages from spamming peers. Each topic can be assigned a topic delivery weight which can prioritize messages of one topic from another. Additionally this PR adds the following:

Mainnet 25 outage testing results

@codecov-commenter
Copy link

codecov-commenter commented Sep 18, 2024

Codecov Report

Attention: Patch coverage is 9.37500% with 29 lines in your changes missing coverage. Please review.

Project coverage is 41.46%. Comparing base (56ac820) to head (cffba51).

Files with missing lines Patch % Lines
network/netconf/flags.go 25.00% 9 Missing ⚠️
network/p2p/config/gossipsub.go 0.00% 9 Missing ⚠️
network/p2p/builder/gossipsub/gossipSubBuilder.go 0.00% 4 Missing and 1 partial ⚠️
network/p2p/node/gossipSubAdapterConfig.go 0.00% 3 Missing ⚠️
network/p2p/mock/pub_sub_adapter_config.go 0.00% 2 Missing ⚠️
insecure/corruptlibp2p/pubsub_adapter_config.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6479      +/-   ##
==========================================
- Coverage   41.48%   41.46%   -0.02%     
==========================================
  Files        2027     2027              
  Lines      144784   144816      +32     
==========================================
- Hits        60060    60051       -9     
- Misses      78513    78554      +41     
  Partials     6211     6211              
Flag Coverage Δ
unittests 41.46% <9.37%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -628,6 +628,12 @@ network-config:
# keep the entire network's size. Otherwise, the local node's view of the network will be incomplete due to cache eviction.
# Recommended size is 10x the number of peers in the network.
cache-size: 10000
peer-gater:
enable: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make this enabled so the full name is gossipsub-peer-gater-enabed. it's more consistent with our other configs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


// TopicDeliveryWeightsOverride topic delivery weights used to override the default topic delivery weight 1.0 of the peer gater.
// Parameters are "numerical values" that are used to compute or build components that compute the score of a peer in GossipSub system.
type TopicDeliveryWeightsOverride struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be difficult to allow arbitrary strings here instead of a fixed list? It might be useful to be able to add overrides for other channels without having to rollout a new image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@AlexHentschel AlexHentschel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR does introduces an effective way to penalize nodes with dropping their messages with probability (1-threshold). It is important keep in mind that threshold is computed and updated for each peer individually. For clarity, lets write $\texttt{threshold}_x$ to denote the value for node $x$. In a nutshell, I would describe $\texttt{threshold}_x$ as scoring value .

Note the computation of

total := st.deliver + pg.params.DuplicateWeight*st.duplicate + pg.params.IgnoreWeight*st.ignore + pg.params.RejectWeight*st.reject
threshold := (1 + st.deliver) / (1 + total)

$$ \texttt{threshold}_x = \frac{1 + \texttt{deliver}_x}{1 + \texttt{deliver}_x + \texttt{duplicate}_x + \texttt{ignore}_x + \texttt{reject}_x}$$

I have represented the penalty factors as $\texttt{duplicate}_x, \texttt{ignore}_x, \texttt{reject}_x \geq 0$. Just hypothetically, in a scenario where the validation queue overruns, throttle will increase, but unless a the score for node $x$ is actively penalized (e.g. by calling DuplicateMessage), the penalty scores will all be zero :
$0 = \texttt{duplicate}_x = \texttt{ignore}_x = \texttt{reject}_x$,
which yields: $\texttt{threshold}_x = 1$

In other words, I think this solution will work very well, if libP2P detects those messages as duplicates. Otherwise, we would still end up with $\texttt{threshold}_x = 1$ for every individual peer $x$ and throttling no-one.

Can we actively confirm this please, that there is an increased rate of DuplicateMessage calls? The peer gater is hooked into libP2P's PubSup by implementing the RawTracer interface.

I am wondering if we could capture the respective evidence with LocalGossipSubRouterMetrics?

"enable the libp2p peer gater")
flags.Float64(BuildFlagName(gossipsubKey, p2pconfig.PeerGaterKey, p2pconfig.TopicDeliveryWeightsKey, channels.ConsensusCommittee.String()),
config.GossipSub.PeerGaterParameters.TopicDeliveryWeightsOverride.ConsensusCommittee,
"topic delivery weights override for the consensus-committee topic")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the goal for the usage string should be to describe what this flag does. I don't really understand the current string, because it talks about technical details, but not what I should be expecting as a result of it.

Just as an example what I mean, you could write something like:

This is a heuristic for reducing load from nodes that ask a lot of questions,
while relaying fewer up-to-date liveness-critical messages. We do that by
assigning liveness-critical messages (such as messages on the consensus-committee
topic) a higher weight, that is added to a peer's score for successfully delivering this
such message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"topic delivery weights override for the consensus-committee topic")
flags.Float64(BuildFlagName(gossipsubKey, p2pconfig.PeerGaterKey, p2pconfig.TopicDeliveryWeightsKey, channels.SyncCommittee.String()),
config.GossipSub.PeerGaterParameters.TopicDeliveryWeightsOverride.SyncCommittee,
"topic delivery weights override for the sync-committee topic")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly here.

@AlexHentschel
Copy link
Member

I have looked a bit further into the matter, and there is a massive spike in iWant messages.

https://www.notion.so/flowfoundation/WIP-Investigation-Notes-AlexH-0487b537cdcf415ab76432dc33f15d33?pvs=4

- set default source decay to 10m
- disable the peer gater by default
@peterargue peterargue added this pull request to the merge queue Sep 23, 2024
@peterargue peterargue removed this pull request from the merge queue due to a manual request Sep 23, 2024
@peterargue peterargue added this pull request to the merge queue Sep 23, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 23, 2024
@peterargue peterargue added this pull request to the merge queue Sep 23, 2024
Merged via the queue into master with commit a573d77 Sep 23, 2024
55 checks passed
@peterargue peterargue deleted the khalil/libp2p-peer-gater branch September 23, 2024 20:47
peterargue added a commit that referenced this pull request Sep 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants