Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(swarm): allow NetworkBehaviour to report remote address #4302

Closed
mxinden opened this issue Aug 7, 2023 · 16 comments · Fixed by #4371
Closed

feat(swarm): allow NetworkBehaviour to report remote address #4302

mxinden opened this issue Aug 7, 2023 · 16 comments · Fixed by #4371
Assignees
Labels
getting-started Issues that can be tackled if you don't know the internals of libp2p very well help wanted

Comments

@mxinden
Copy link
Member

mxinden commented Aug 7, 2023

Description

Allow a NetworkBehaviour to report the external address of a remote peer to other NetworkBehaviours.

Possible approach (though up for discussion):

  • Extend ToSwarm with ToSwarm::NewExternalAddrOfPeer.
  • Have libp2p-identify emit the new event.
  • Extend FromSwarm with FromSwarm::NewExternalAddrOfPeer.
  • Have libp2p-kad consume the new event.

Motivation

  • Allows libp2p-identify to e.g. tell libp2p-kad of an external address of a remote peer and thus allows libp2p-kad to track that address in its Kademlia routing table.

Are you planning to do it yourself in a pull request?

No

@mxinden mxinden added help wanted getting-started Issues that can be tackled if you don't know the internals of libp2p very well labels Aug 7, 2023
@dariusc93
Copy link
Member

I assume this will not just be limited to just libp2p-identify emitting the event but any behaviour that has obtain a peer remote address?

@mxinden
Copy link
Member Author

mxinden commented Aug 9, 2023

Yes, any NetworkBehaviour, not constrained to libp2p-identify.

@thomaseizinger
Copy link
Contributor

I assume this will not just be limited to just libp2p-identify emitting the event but any behaviour that has obtain a peer remote address?

Yes although there aren't that many which can discover "new" addresses. I'd say it is unnecessary to emit an event for an address we just connected to for example.

Behaviours I can think of that can discover new addresses are:

  • identify
  • kademlia
  • mDNS
  • rendezvous

@StemCll
Copy link
Contributor

StemCll commented Aug 12, 2023

Taking a look into this

@mxinden
Copy link
Member Author

mxinden commented Aug 27, 2023

(1) libp2p-request-response consuming FromSwarm::NewExternalAddrOfPeer

Another consumer of the new FromSwarm::NewExternalAddrOfPeer can be libp2p-request-response.

This could feed into its addresses cache:

/// Externally managed addresses via `add_address` and `remove_address`.
addresses: HashMap<PeerId, SmallVec<[Multiaddr; 6]>>,

Though we would need some garbage collection mechanism along the lines of libp2p-identify:

/// The addresses of all peers that we have discovered.
discovered_peers: PeerCache,

(2) Generic Swarm::add_external_addr_of_peer triggering FromSwarm::NewExternalAddrOfPeer

If we go one step further and add Swarm::add_external_addr_of_peer we could deprecate the protocol specific methods in libp2p-request-response and libp2p-kad:

/// Adds a known address for a peer that can be used for
/// dialing attempts by the `Swarm`, i.e. is returned
/// by [`NetworkBehaviour::handle_pending_outbound_connection`].
///
/// Addresses added in this way are only removed by `remove_address`.
pub fn add_address(&mut self, peer: &PeerId, address: Multiaddr) {
self.addresses.entry(*peer).or_default().push(address);
}

/// Adds a known listen address of a peer participating in the DHT to the
/// routing table.
///
/// Explicitly adding addresses of peers serves two purposes:
///
/// 1. In order for a node to join the DHT, it must know about at least
/// one other node of the DHT.
///
/// 2. When a remote peer initiates a connection and that peer is not
/// yet in the routing table, the `Kademlia` behaviour must be
/// informed of an address on which that peer is listening for
/// connections before it can be added to the routing table
/// from where it can subsequently be discovered by all peers
/// in the DHT.
///
/// If the routing table has been updated as a result of this operation,
/// a [`KademliaEvent::RoutingUpdated`] event is emitted.
pub fn add_address(&mut self, peer: &PeerId, address: Multiaddr) -> RoutingUpdate {


@thomaseizinger @StemCll thoughts on (1) and (2)?

@StemCll I suggest not doing (1) or (2) as part of #4371 to keep the pull request's scope small.

@thomaseizinger
Copy link
Contributor

Instead of (1), I thought we wanted to do #4103?

@mxinden
Copy link
Member Author

mxinden commented Sep 1, 2023

In my eyes no protocol should depend on #4103. In other words, all protocols should function without #4103. I don't think we should move away from the current convention where every protocol keeps its own cache of peer information.

In my eyes #4103 is mostly for users to interact with. E.g. retrieving the currently connected nodes, their addresses, ...

@thomaseizinger
Copy link
Contributor

thomaseizinger commented Sep 2, 2023

In my eyes no protocol should depend on #4103. In other words, all protocols should function without #4103. I don't think we should move away from the current convention where every protocol keeps its own cache of peer information.

In my eyes #4103 is mostly for users to interact with. E.g. retrieving the currently connected nodes, their addresses, ...

Agreed in principle.

At the moment, libp2p-request-response has APIs that IMO belong onto #4103, like add_address.

I am not against adding a cache of discovered addresses although that might be better implemented on a Swarm level as an implementation detail without an external API so that not every protocol has to re-implement it.

Once we have #4103, I'd propose to remove the add_address API on libp2p-request-response.

Protocols should be able to perform their primary function in isolation but for me, that excludes keeping an address book. I can use libp2p-request-response just fine if I've made a dial with addresses to the peer already. Thus, I don't think that is conflicting and from that PoV, I don't see why libp2p-request-response should be a consumer of this new event.

@thomaseizinger
Copy link
Contributor

I've commented with a different idea on #4371: Instead of removing the add_address API on libp2p-request-response as a result of #4103, we probably should instead add a function to Swarm that allows users to trigger this event. That is how we've previously solved this. Behaviours like kademlia & request-response can then have an internal state where they respond to the event and (temporarily) store the address.

@thomaseizinger
Copy link
Contributor

I've commented with a different idea on #4371: Instead of removing the add_address API on libp2p-request-response as a result of #4103, we probably should instead add a function to Swarm that allows users to trigger this event. That is how we've previously solved this. Behaviours like kademlia & request-response can then have an internal state where they respond to the event and (temporarily) store the address.

I just realised that this is exactly (2) from above. Sorry @mxinden, I must have not read the above comment properly 😅

Or as they say: Great minds think alike 😇

@thomaseizinger
Copy link
Contributor

thomaseizinger commented Oct 8, 2023

Instead of sharing addresses, should we be thinking about sharing peer records?

@mxinden
Copy link
Member Author

mxinden commented Oct 8, 2023

Given that we don't support it in libp2p-identity yet, would you construct the peer record locally? If so, what would you set on PeerRecord::envelope?

pub struct PeerRecord {
peer_id: PeerId,
seq: u64,
addresses: Vec<Multiaddr>,
/// A signed envelope representing this [`PeerRecord`].
///
/// If this [`PeerRecord`] was constructed from a [`SignedEnvelope`], this is the original instance.
envelope: SignedEnvelope,
}

Long term, yes, I think this mechanism should use peer records. Short term, to not block it on the role out of peer records, I suggest using addresses only.

@thomaseizinger
Copy link
Contributor

You can't construct it locally because it needs to be signed so yeah it would have to come from libp2p-identify but that is a trivial addition compared to this feature.

I thought of it because we will need a mechanism for sharing peer records for anything that is like libp2p/specs#587 and it is kind of unfortunate to design a new API that we already know will be obsoleted.

@thomaseizinger
Copy link
Contributor

For reference, here is the ticket: #4017

@mxinden
Copy link
Member Author

mxinden commented Oct 11, 2023

You can't construct it locally because it needs to be signed so yeah it would have to come from libp2p-identify but that is a trivial addition compared to this feature.

libp2p-identify is not the only protocol that can discover addresses of remote peers. E.g. libp2p-kad can do so as well, e.g. on Kademlia::find_closest. Unfortunately the libp2p Kademlia specification only supports plain addresses today. See https://github.com/libp2p/specs/blob/master/kad-dht/README.md.

and it is kind of unfortunate to design a new API that we already know will be obsoleted.

With the above in mind, I don't think the new API will be obsolete any time soon. I don't see the large libp2p DHT deployments (e.g. IPFS) move to signed peer records any time soon.

@thomaseizinger
Copy link
Contributor

Yeah that is fair! I guess once we have non-exhaustive, we can easily add more variants :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
getting-started Issues that can be tackled if you don't know the internals of libp2p very well help wanted
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants