Build tasks for HTTP retrievals only + various improvements (#7)
* feat: skip deals created before June 2023

Signed-off-by: Miroslav Bajtoš <[email protected]>

* feat: query IPNI with concurrency=5

Signed-off-by: Miroslav Bajtoš <[email protected]>

* cache IPNI responses

Signed-off-by: Miroslav Bajtoš <[email protected]>

* rename "stats.retrievable" to "stats.advertised"

Signed-off-by: Miroslav Bajtoš <[email protected]>

* log unknown protocol codes

Signed-off-by: Miroslav Bajtoš <[email protected]>

* keep HTTP retrievals only

Signed-off-by: Miroslav Bajtoš <[email protected]>

* fix loading of cached IPNI responses

Signed-off-by: Miroslav Bajtoš <[email protected]>

* log progress of build-retrieval-tasks

Signed-off-by: Miroslav Bajtoš <[email protected]>

* document how to disable MaxListenersExceededWarning

Signed-off-by: Miroslav Bajtoš <[email protected]>

* treat graphsync providers at `/tcp/80/http` as trustless HTTP GW

Signed-off-by: Miroslav Bajtoš <[email protected]>

* feat: process StateMarketDeals in Rust

Speed up the initial JSON parsing from ~1h to <3m.

Signed-off-by: Miroslav Bajtoš <[email protected]>

* fix: add missing "const"

Signed-off-by: Miroslav Bajtoš <[email protected]>

* feat: dockerize + serve output via HTTP

Signed-off-by: Miroslav Bajtoš <[email protected]>

* npx standard --fix

Signed-off-by: Miroslav Bajtoš <[email protected]>

* update README

Signed-off-by: Miroslav Bajtoš <[email protected]>

---------

Signed-off-by: Miroslav Bajtoš <[email protected]>
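
The commit message mentions querying IPNI with concurrency=5 and caching IPNI responses. The actual implementation is not shown in this view, so the following is only a minimal sketch of how such a pattern could look in Node.js. The names `mapWithConcurrency` and `withCache` are hypothetical and are not taken from the repository, and the cache here is in-memory only.

```javascript
// Hypothetical helper (not from the repository): run `worker` over `items`
// with at most `concurrency` calls in flight at any time.
async function mapWithConcurrency (items, concurrency, worker) {
  const results = new Array(items.length)
  let next = 0

  // Each lane repeatedly picks the next unprocessed item until none are left.
  async function runLane () {
    while (next < items.length) {
      const index = next++
      results[index] = await worker(items[index], index)
    }
  }

  const laneCount = Math.min(concurrency, items.length)
  await Promise.all(Array.from({ length: laneCount }, () => runLane()))
  return results
}

// Hypothetical cache wrapper (assumption: an in-memory Map is sufficient).
// It stores the promise per key, so repeated lookups reuse the first response.
function withCache (lookup, cache = new Map()) {
  return (key) => {
    if (!cache.has(key)) cache.set(key, lookup(key))
    return cache.get(key)
  }
}
```

With these helpers, a call such as `mapWithConcurrency(cids, 5, withCache(queryIndexer))` would issue at most five lookups at once and skip repeated keys, where `queryIndexer` stands in for whatever function performs the real IPNI request.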
bajtos authored Dec 7, 2023
1 parent 26fbfef commit 1e4b134
Showing 10 changed files with 616 additions and 27 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -134,3 +134,8 @@ generated/ldn-deals.ndjson
generated/retrieval-tasks.ndjson
generated/StateMarketDeals.ndjson
generated/update-spark-db.sql


# Added by cargo

/target
317 changes: 317 additions & 0 deletions Cargo.lock

11 changes: 11 additions & 0 deletions Cargo.toml
@@ -0,0 +1,11 @@
[package]
name = "fil-deal-ingester"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
env_logger = "0.10.1"
json-event-parser = "0.1.1"
log = "0.4.20"
11 changes: 11 additions & 0 deletions Dockerfile
@@ -0,0 +1,11 @@
FROM node:20
USER node
WORKDIR /usr/src/app
COPY package*.json .
COPY generated/ldn-deals.ndjson .
COPY scripts scripts
RUN ls -l
RUN npm ci

ENV SERVE=1
CMD [ "node", "--no-warnings", "scripts/build-retrieval-tasks.js", "ldn-deals.ndjson" ]