Use batched RPC to fetch mempool entries & transactions #979

romanz · 2023-12-20T19:12:11Z

This PR improves initial mempool sync significantly.

Before the change (efa045c):

$ RUST_LOG=electrs::mempool=DEBUG ./server.sh bitcoin
...
[2023-12-20T19:14:25.916Z DEBUG electrs::mempool] loading 13693 mempool transactions
[2023-12-20T19:15:33.444Z DEBUG electrs::mempool] 13693 mempool txs: 13693 added, 0 removed

After the change (ab12ce7):

$ RUST_LOG=electrs::mempool=DEBUG ./server.sh bitcoin
...
[2023-12-20T19:25:50.766Z DEBUG electrs::mempool] loading 13693 mempool transactions
[2023-12-20T19:25:57.931Z DEBUG electrs::mempool] 13693 mempool txs: 13693 added, 0 removed
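
For readers unfamiliar with batching, here is a minimal sketch of the idea: rather than one HTTP round trip per transaction, many JSON-RPC requests are packed into a single array and sent per chunk of txids. It only builds the request bodies and sends nothing. The helper names, the chunk size, and the txids are hypothetical; `getrawtransaction` is assumed to be the call being batched, and this is not the code in the PR.

```python
import json

def build_batch(method, txids, extra_params=()):
    """Return one JSON-RPC batch (a list of request objects), one request per txid."""
    return [
        {"jsonrpc": "2.0", "id": i, "method": method, "params": [txid, *extra_params]}
        for i, txid in enumerate(txids)
    ]

def chunked(items, chunk_size):
    """Yield consecutive slices of at most chunk_size items."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]

# Hypothetical txids, for illustration only.
txids = [f"{n:064x}" for n in range(5)]

# One batch per chunk: 5 txids with a chunk size of 2 need 3 round trips instead of 5.
batches = [build_batch("getrawtransaction", chunk) for chunk in chunked(txids, 2)]

# Each batch is serialized as a JSON array and would form the body of one HTTP POST.
payload = json.dumps(batches[0])
```

The `id` field lets the caller match each response in the returned array to its request, since a server is not required to answer a batch in order.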

conduition

7 seconds? fantastic. I'll try this out with my own configuration in the next few days and let you know how it goes. The code looks good to me.

conduition · 2023-12-23T00:01:05Z

On my setup, where electrs and bitcoind are running on separate machines, this PR takes just under 8 minutes (7m46s) to do the initial mempool sync:

[2023-12-22T18:59:02.121Z DEBUG electrs::mempool] loading 86409 mempool transactions
[2023-12-22T19:06:48.405Z DEBUG electrs::mempool] 86398 mempool txs: 86398 added, 0 removed

Subsequent updates typically take 1 to 10 seconds:

[2023-12-22T19:07:41.434Z DEBUG electrs::mempool] loading 88704 mempool transactions
[2023-12-22T19:07:50.642Z DEBUG electrs::mempool] 88696 mempool txs: 2309 added, 11 removed

This is a HUGE improvement over master, where initial sync takes 29 minutes:

[2023-12-22T20:40:03.788Z DEBUG electrs::mempool] loading 101477 mempool transactions
[2023-12-22T21:09:33.240Z DEBUG electrs::mempool] 96762 mempool txs: 96762 added, 0 removed

And subsequent updates take 10 seconds or longer:

[2023-12-22T22:24:56.417Z DEBUG electrs::mempool] loading 79937 mempool transactions
[2023-12-22T22:25:18.195Z DEBUG electrs::mempool] 79937 mempool txs: 1464 added, 3 removed
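
Because the two initial syncs above ran against mempools of different sizes, comparing wall-clock time alone slightly understates or overstates the gain. A small sketch that turns the log timestamps into elapsed seconds and transactions per second (the function name is hypothetical; the timestamps and counts are copied from the logs above):

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%fZ"

def throughput(start, end, tx_count):
    """Return (elapsed seconds, transactions per second) between two log timestamps."""
    elapsed = (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()
    return elapsed, tx_count / elapsed

# Initial sync with this PR: 86398 transactions added.
pr_elapsed, pr_rate = throughput(
    "2023-12-22T18:59:02.121Z", "2023-12-22T19:06:48.405Z", 86398
)

# Initial sync on master: 96762 transactions added.
master_elapsed, master_rate = throughput(
    "2023-12-22T20:40:03.788Z", "2023-12-22T21:09:33.240Z", 96762
)

print(f"PR:     {pr_elapsed:.0f} s, {pr_rate:.0f} tx/s")
print(f"master: {master_elapsed:.0f} s, {master_rate:.0f} tx/s")
```

On these numbers the PR syncs at roughly 185 transactions per second versus roughly 55 on master. This is a single run on each branch, so treat the ratio as indicative only.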

conduition · 2023-12-23T00:14:18Z

src/mempool.rs

+                entries.len(),
+                txids_chunk.len()
+            );
+            let txs = daemon.get_mempool_transactions(txids_chunk)?;


It might be wise to chunk the raw transaction fetching depending on the size of each TX.

These days, with inscriptions running around, some TXs might be much larger than others, so one batch could end up fetching several dozen megabytes worth of TX data, while another might only fetch a few hundred kilobytes.

At this point in the code we can use the mempool entries to check how big each transaction is, which we can use to pace the TX fetching more evenly. This way we don't end up with some responses which are too large.

The logic might go something like:

all_raw_txs = []
while len(mempool_entries) > 0:
    fetch_bucket = []
    expected_response_size = 0
    # Fill the bucket until the next entry would push it past the limit.
    # Always take at least one entry so an oversized TX cannot stall the loop.
    while len(mempool_entries) > 0:
        entry = mempool_entries[0]
        if fetch_bucket and expected_response_size + entry.size > MAX_RESP_SIZE:
            break
        expected_response_size += entry.size
        fetch_bucket.append(entry.txid)
        mempool_entries.pop(0)
    raw_txs = daemon.fetch_txids_batch(fetch_bucket)
    all_raw_txs.extend(raw_txs)
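
As a self-contained illustration of the same bucketing idea, here is a minimal runnable sketch. It assumes each mempool entry can be reduced to a `(txid, size)` pair; which size figure best predicts the response size is left open. The function name, the limit, and the sample data are hypothetical, and the actual fetch is omitted.

```python
def size_bounded_batches(entries, max_resp_size):
    """Yield lists of txids whose summed sizes stay within max_resp_size.

    entries: iterable of (txid, size) pairs.
    An entry larger than the limit is yielded alone instead of being dropped.
    """
    bucket, bucket_size = [], 0
    for txid, size in entries:
        # Close the current bucket if adding this entry would exceed the limit.
        if bucket and bucket_size + size > max_resp_size:
            yield bucket
            bucket, bucket_size = [], 0
        bucket.append(txid)
        bucket_size += size
    # Emit whatever is left after the last entry.
    if bucket:
        yield bucket

# Hypothetical entries: "d" is larger than the limit on its own.
entries = [("a", 300), ("b", 500), ("c", 400), ("d", 2000), ("e", 100)]
batches = list(size_bounded_batches(entries, max_resp_size=1000))
```

Each yielded list would then be passed to one batched fetch. Keeping entries in their original order, rather than sorting by size, is a simplification here; packing more tightly is possible but changes the fetch order.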

@romanz I'd be happy to give this a go if you'd like. But this PR is already a huge improvement over the status quo, so I'd say it's safe to save this idea for a future PR.

romanz

Thanks, good idea!
Let's implement it as a future PR.

Use batched RPC to fetch mempool entries & transactions

ab12ce7

romanz mentioned this pull request Dec 20, 2023

Async mempool scanning #970

Open

conduition reviewed Dec 21, 2023

romanz mentioned this pull request Dec 22, 2023

Initial mempool sync may take a lot of time #968

Open

romanz self-assigned this Dec 22, 2023

conduition reviewed Dec 23, 2023

conduition approved these changes Dec 23, 2023

romanz merged commit ab12ce7 into master Dec 23, 2023
8 checks passed

romanz deleted the mempool-sync branch December 23, 2023 09:00

romanz added the performance label Dec 23, 2023

conduition mentioned this pull request Dec 23, 2023

batched mempool fetching should be smart about which TXs to fetch #981

Open
