Releases: streamingfast/firehose-ethereum
v2.4.6
v2.4.5
v2.4.4
Substreams fixes
- fix a possible panic() when a request is interrupted during the file loading phase of a squashing operation.
- fix a rare stall that could occur when only some full-KV store caches were deleted while later segments were still present.
- fix stats counters for store operation times
v2.4.3
Substreams
- fix memory leak on substreams execution (by bumping wazero dependency)
- remove the need for substreams-tier1 blocktype auto-detection
- fix missing error handling when writing output data to files. This could result in a tier1 request "hanging" while waiting for a file that tier2 never produced.
- fix handling of dstore errors in the tier1 'execout walker' that caused stalling on S3 or on unexpected storage errors
- increase the number of retries on storage when writing states or execouts (5 -> 10)
- prevent slow squashing when loading each segment from the full KV store (this can happen when a stage contains multiple stores)
v2.4.2
Substreams
- Fix a context leak causing tier1 responses to slow down progressively
v2.4.1
Substreams
- fix thread leak in metering affecting substreams
- revert a substreams scheduler optimisation that caused slow restarts when close to head
- add substreams_tier2_active_requests and substreams_tier2_request_counter prometheus metrics
v2.4.0
Substreams
- Substreams bumped to @v1.5.0: See https://github.com/streamingfast/substreams/releases/tag/v1.5.0 for details.
Chain-agnostic tier2
- A single substreams-tier2 instance can now serve requests for multiple chains or networks. All network-specific parameters are now passed from Tier1 to Tier2 in the internal ProcessRange request.
- This allows you to make better use of your computing resources by pooling all the networks together.
> [!IMPORTANT]
> Since the tier2 services now get the network information from the tier1 request, you must make sure that the file paths and network addresses are the same for both tiers.
> For example, if `--common-merged-blocks-store-url=/data/merged` is set on tier1, make sure the merged blocks are also available from tier2 under the path `/data/merged`.
> The flags `--substreams-state-store-url`, `--substreams-state-store-default-tag`, `--common-merged-blocks-store-url`, `--substreams-rpc-endpoints` and `--substreams-rpc-gas-limit` are now ignored on tier2.
> The flag `--common-first-streamable-block` should be set to `0` to accommodate every chain.
> Non-Ethereum chains can query a firehose-ethereum tier2, but the opposite is not true, since only firehose-ethereum implements the `eth_call` WASM extension.
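As an illustration, here is a minimal sketch of a matching tier1/tier2 setup. The `fireeth start` invocations and the `/data` paths are assumptions; adapt them to your deployment:

```bash
# Tier1 carries the network-specific configuration; it forwards these
# parameters to tier2 inside the internal ProcessRange request.
fireeth start substreams-tier1 \
  --common-merged-blocks-store-url=/data/merged \
  --substreams-state-store-url=/data/states \
  --common-first-streamable-block=0

# Tier2 is now chain-agnostic: the network-specific storage/RPC flags are
# ignored here, but /data/merged and /data/states must resolve to the same
# content as they do on tier1.
fireeth start substreams-tier2 \
  --common-first-streamable-block=0
```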
Performance improvements
- All module outputs are now cached. (Previously, only the last module's output was cached, along with the "store snapshots", to allow parallel processing.)
- Tier2 will now read back mapper outputs (if they exist) to prevent running them again. Additionally, it will not read back the full blocks if its inputs can be satisfied from existing cached mapper outputs.
- Tier2 will skip processing completely if it's processing the last stage and the `output_module` is a mapper that has already been processed (e.g. when multiple requests are indexing the same data at the same time).
- Tier2 will skip processing completely if it's processing a stage where all the stores and outputs have been processed and cached.
- Scheduler modification: a stage now waits for the previous stage to have completed the same segment before running, to take advantage of the cached intermediate layers.
- Improved file listing performance for Google Storage backends by 25%!
> [!TIP]
> Concurrent requests on the same module hashes may benefit from the other requests' work to a certain extent (up to 75%!) -- the very first request does most of the work for the other ones.
> [!TIP]
> More caches will increase disk usage, and there is no automatic removal of old module caches. The operator is responsible for deleting old module caches.
> [!TIP]
> The cached 'partial' files no longer contain the "trace ID" in their filename, preventing accumulation of "unsquashed" partial store files.
> The system will delete files under `{modulehash}/state` named in the format `{blocknumber}-{blocknumber}.{hexadecimal}.partial.zst` when it runs into them.
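For instance, here is a hedged cleanup sketch for these legacy partial files on a local filesystem store; the `/data/substreams-states` path is an assumption, point it at your `--substreams-state-store-url`:

```bash
# Legacy partial files carry a trace-ID suffix, e.g.
#   0012340000-0012350000.0a1b2c3d.partial.zst
# The system deletes them when it runs into them; an operator can also
# remove them proactively:
find /data/substreams-states -type f \
  -regextype posix-extended \
  -regex '.*/state/[0-9]+-[0-9]+\.[0-9a-f]+\.partial\.zst' \
  -delete
```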
Metrics
- Readiness metric for the Substreams tier1 app is now named `substreams_tier1` (it was mistakenly called `firehose` before).
- Added back the readiness metric for the Substreams tier2 app (named `substreams_tier2`).
- Added metric `substreams_tier1_active_worker_requests`, which gives the number of active Substreams worker requests a tier1 app is currently running against tier2 nodes.
- Added metric `substreams_tier1_worker_request_counter`, which gives the total number of Substreams worker requests a tier1 app has made against tier2 nodes.
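A quick sanity check that the renamed and added metrics are exposed, assuming the default `:9102` metrics address (adjust to your `--common-metrics-listen-addr` if you changed it):

```bash
# List the tier1/tier2 readiness and worker metrics from the Prometheus endpoint:
curl -s http://localhost:9102/metrics | grep -E 'substreams_tier1|substreams_tier2'
```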
Flags
- Added `--merger-delete-threads` to customize the number of threads the merger will use to delete files. It's recommended to increase this to 25 or higher when using Ceph as the S3 storage provider (due to performance issues with deletes, the merger might otherwise not be able to delete one-block files fast enough).
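For example, a sketch of raising the delete threads on a Ceph-backed deployment, using the value recommended above:

```bash
# Give the merger more delete workers so one-block files are pruned fast enough:
fireeth start merger --merger-delete-threads=25
```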
v2.3.7
- Fixed `tools check merged-blocks` default range when `-r <range>` is not provided to now be `[0, +∞]` (was previously `[HEAD, +∞]`).
- Fixed `tools check merged-blocks` to be able to run without a block range provided.
- Added API key based authentication to `tools firehose-client` and `tools firehose-single-block-client`; specify the value through the environment variable `FIREHOSE_API_KEY` (you can use the flag `--api-key-env-var` to change the variable's name to something other than `FIREHOSE_API_KEY`). See the sketch after this list.
- Fixed `tools check merged-blocks` examples using block range (the range should be specified as `[<start>]?:[<end>]`).
- Added `--substreams-tier2-max-concurrent-requests` to limit the number of concurrent requests to the tier2 Substreams service.
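A usage sketch for the items above; the endpoint, paths and block ranges are placeholders:

```bash
# API-key authentication for the Firehose clients:
export FIREHOSE_API_KEY=my-secret-key
fireeth tools firehose-client mainnet.eth.streamingfast.io:443 1000000:1000010

# Or read the key from a differently named environment variable:
export MY_FIREHOSE_KEY=my-secret-key
fireeth tools firehose-client --api-key-env-var=MY_FIREHOSE_KEY \
  mainnet.eth.streamingfast.io:443 1000000:1000010

# Check merged blocks; with no -r flag, the full [0, +∞] range is now checked:
fireeth tools check merged-blocks /data/merged
fireeth tools check merged-blocks /data/merged -r "1000000:2000000"
```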
v2.3.6
- BlockFetcher: added support for the WithdrawalsRoot, BlobGasUsed, BlobExcessGas and ParentBeaconRoot fields when fetching blocks from RPC (for example, to get those values for Optimism)
- Substreams: added support for the `substreams-tier2-max-concurrent-requests` flag to limit the number of concurrent requests to tier2
- Added traceID for RPCCalls
v2.3.5
Substreams
> [!WARNING]
> This release deprecates the "RPC Cache (for eth_calls)" feature of Substreams: it has been turned off by default and will not be supported in future releases.
> The RPC cache was a little-known feature that cached all `eth_call` responses by default and loaded them on each request.
> It is being deprecated because it has a negative impact on global performance.
> If you want to cache your `eth_call` responses, you should do it in a specialized proxy instead of having Substreams manage this.
> Until the feature is completely removed, you can keep the previous behavior by setting the `--substreams-rpc-cache-store-url` flag to a non-empty value (its previous default was `{data-dir}/rpc-cache`).
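If you do want to keep the old behavior for now, here is a minimal sketch; the `/data/rpc-cache` path is an assumption mirroring the previous default:

```bash
# Re-enable the deprecated eth_call RPC cache by giving the flag a non-empty value:
fireeth start substreams-tier1 \
  --substreams-rpc-cache-store-url=/data/rpc-cache
```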
- Performance: prevent reprocessing jobs when there is only a mapper in production mode and everything is already cached
- Performance: prevent "UpdateStats" from running too often and stalling other operations when running with a high parallel-jobs count
- Performance: fixed a bug in the scheduler's ramp-up function that sometimes caused it to wait before raising the number of workers
- Added the output module's hash to the "incoming request" log
- Substreams RPC: added the `--substreams-rpc-gas-limit` flag to allow overriding the default of 50M. Arbitrum chains behave better with a value of `0`, which avoids `intrinsic gas too low (supplied gas 50000000)` errors (see the sketch after this list).
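For example, a sketch for an Arbitrum-family chain:

```bash
# Disable the 50M default gas cap on eth_calls to avoid
# "intrinsic gas too low (supplied gas 50000000)" errors on Arbitrum:
fireeth start substreams-tier1 --substreams-rpc-gas-limit=0
```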
Reader node
- The `reader-node-bootstrap-url` flag gained the ability to bootstrap from a `bash` script.

  If the bootstrap URL is of the form `bash:///<path/to/script>?<parameters>`, the bash script at `<path/to/script>` will be executed. The script receives the resolved reader node variables as environment variables of the form `READER_NODE_<VARIABLE_NAME>`. The fully resolved node arguments (from `reader-node-arguments`) are passed as args to the bash script. The accepted query parameters are:

  - `arg=<value>` | Pass as an extra argument to the script, prepended to the list of resolved node arguments
  - `env=<key>%3d<value>` | Pass as an extra environment variable as `<key>=<value>` with the key upper-cased (multiple(s) allowed)
  - `env_<key>=<value>` | Pass as an extra environment variable as `<key>=<value>` with the key upper-cased (multiple(s) allowed)
  - `cwd=<path>` | Change the working directory to `<path>` before running the script
  - `interpreter=<path>` | Use `<path>` as the interpreter to run the script
  - `interpreter_arg=<arg>` | Pass `<arg>` as an argument to the interpreter before the script path (multiple(s) allowed)

  A sketch of such a script is shown below.
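A hedged sketch of a bootstrap script; the `/etc/firehose/bootstrap.sh` path, the `CHAIN` variable, and the `READER_NODE_DATA_DIR` variable name are assumptions based on the description above:

```bash
#!/usr/bin/env bash
# Wired up via, e.g.:
#   --reader-node-bootstrap-url="bash:///etc/firehose/bootstrap.sh?arg=--init&env=CHAIN%3dmainnet"
set -euo pipefail

# Resolved reader node variables arrive as READER_NODE_<VARIABLE_NAME>
# environment variables (the exact variable name here is an assumption):
echo "data dir: ${READER_NODE_DATA_DIR:-<unset>}"

# Extra variables passed via ?env=... arrive upper-cased:
echo "chain: ${CHAIN:-<unset>}"

# The fully resolved reader-node-arguments are passed as positional args,
# with any ?arg=<value> parameters prepended:
echo "node args: $*"
```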
> [!NOTE]
> The `bash:///` script support is currently experimental and might change in upcoming releases; any behavior changes will be clearly documented here.