client not resolving any packets after lots of publish #133
Actually, you don't seem to need all the publishing. For example:

```rust
use pkarr::{Client, PublicKey};

let client = Client::builder()
    .no_relays()
    .cache_size(0) // Disable caching so we actually call the DHT.
    .maximum_ttl(0)
    .build()
    .unwrap();

let key = PublicKey::try_from("de5ommih83t8u9x6p94yfrar367ewgprobf1x8uwdynixq7zoupy").unwrap();

if client.resolve_most_recent(&key).await.is_none() {
    println!("None");
}
```
This mainly tells me that the DHT is just not happy with the rate of requests this script is making from the same IP.
There is also the possibility that the rate of queries is causing an issue inside Pkarr and/or Mainline itself (channels dropping messages or something), and that is what I will try to debug now in a more controlled environment.
This is a possibility we are looking into right now. The goal is to determine what is possible with a republisher that keeps republishing user packets on the homeserver. I am currently switching the churn script to the mainline lib directly to count the actual items returned by individual nodes. This will give me more insight.
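For illustration, such a republisher could be as simple as the loop below (a minimal sketch assuming the pkarr 3.x client API; the function name, key list, and interval are hypothetical, not the actual homeserver code):

```rust
use std::time::Duration;
use pkarr::{Client, PublicKey};

// Hypothetical republisher: fetch each user's most recent packet
// and publish it back to the DHT on a fixed interval.
async fn republish_loop(client: &Client, user_keys: &[PublicKey]) {
    loop {
        for key in user_keys {
            // resolve_most_recent queries the DHT for the newest packet.
            if let Some(packet) = client.resolve_most_recent(key).await {
                // Re-publishing the same signed packet refreshes it on the DHT.
                if let Err(e) = client.publish(&packet, None).await {
                    eprintln!("failed to republish {key}: {e}");
                }
            }
        }
        // The interval is an assumption; a real republisher would tune this.
        tokio::time::sleep(Duration::from_secs(60 * 60)).await;
    }
}
```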
Cool. I'll let you know if I figure out more.
I think the correct methodology is:
Last time I tried this, way before I started Pkarr in Rust, I managed to keep ~150k records alive with a >95% resolution success rate. Anyway, good luck.
I incorporated mainline into the resolve function and I see the same behaviour (https://github.com/pubky/pkarr-churn/tree/mainline-problem). Resolution starts to drop heavily after 30 public keys. And keep in mind, I sleep for a full minute after every resolve in this example, so I have doubts that it is rate limiting. I will keep experimenting.
For anyone interested in stress testing pkarr/mainline and getting a picture of how long it takes to see a packet churn, here is an MRE that works as expected and is easy to read:
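(The MRE itself did not survive in this thread. As a rough reconstruction rather than the original code, the publish-then-resolve shape it describes would look something like the sketch below with the pkarr 3.x API; the record name, value, and count are made up:)

```rust
use pkarr::{Client, Keypair, SignedPacket};

#[tokio::main]
async fn main() {
    let client = Client::builder().build().unwrap();

    // Publish a batch of packets under fresh keypairs.
    let mut keypairs = Vec::new();
    for _ in 0..100 {
        let keypair = Keypair::random();
        let packet = SignedPacket::builder()
            .txt("_churn".try_into().unwrap(), "test".try_into().unwrap(), 30)
            .sign(&keypair)
            .unwrap();
        client.publish(&packet, None).await.unwrap();
        keypairs.push(keypair);
    }

    // Then resolve each key back from the DHT and count the misses.
    let mut misses = 0;
    for keypair in &keypairs {
        if client.resolve_most_recent(&keypair.public_key()).await.is_none() {
            misses += 1;
        }
    }
    println!("{misses}/{} packets churned", keypairs.len());
}
```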
In my experience, publishing 512 records works fine with no issues, and I got a 99.7% resolution success rate for the first 10 minutes... Even then, when a record failed to resolve, it was most likely a rate-limiting issue, because I can manually resolve the failing key just fine. I didn't run this random sampling for longer than 10 minutes, because I know from experience that churning can take hours, and on occasion days, and I am not that invested in finding the exact number.
Doesn't work for me. It fails all the time.
I guess rate limiting? Not sure where, though. Maybe asking the same nodes again and again for routing info?
I suspect that you are either using a VPN (sharing the same IP with others), or you have been making more requests than I do (running pkarr-churn more often), or the mainline is just busier now. Rate limiting is not the only problem, though: UDP in general is flaky; it can be overwhelmed, and packets can be dropped through no fault of your own, or anyone's. I still don't see enough evidence that the implementation of mainline is causing a bug. Maybe more throttling of requests would help, but I would rather leave it to whoever is dealing with this edge case to retry and back off, at least for now. If the issue were blocking actual production needs, as opposed to statistics gathering, which I quite honestly don't consider critical or actionable, I would be acting with more urgency.
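For whoever does hit this, a retry with backoff around resolution could look like the sketch below (assuming the pkarr 3.x API and a tokio runtime; the helper name, attempt count, and delays are arbitrary):

```rust
use std::time::Duration;
use pkarr::{Client, PublicKey, SignedPacket};

// Hypothetical helper: retry resolution with exponential backoff
// to ride out transient UDP loss or rate limiting.
async fn resolve_with_backoff(client: &Client, key: &PublicKey) -> Option<SignedPacket> {
    let mut delay = Duration::from_secs(1);
    for _ in 0..5 {
        if let Some(packet) = client.resolve_most_recent(key).await {
            return Some(packet);
        }
        // Back off before the next attempt: 1s, 2s, 4s, 8s, 16s.
        tokio::time::sleep(delay).await;
        delay *= 2;
    }
    None
}
```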
I just replicated this from my laptop behind a VPN and from my VPS. Resolution right after that is a bit spotty, but I couldn't see an issue while publishing, and as usual, resolving manually always succeeds. I am satisfied that a medium-sized service provider can easily keep the records of thousands of users live on the DHT, at worst by implementing some retries and backoffs, let alone what can be done with relays and republishing from native clients. I don't see a bug or a mission failure here... so far.
Conclusion:
Closing here as it is resolved for us.
We were experimenting with pkarr packet churn rates when we found some weird behaviour. If we publish 100 public keys without using the cache, the pkarr client always returns `None`, even though the packet actually exists. I made an example to reproduce it. `resolve_most_recent()` stops working completely, and `resolve()` works occasionally but with lots of misses.

Version: 3.5.1
FYI: @SHAcollision