feat: atrium-repo #272

Status: Open. 80 commits to be merged into base branch main.

DrChat (Contributor) commented Dec 23, 2024

This builds on str4d's work here: #168.

Big notes:

  • Complete Merkle search tree implementation (still lacking fuzz tests)
  • Complete Repository implementation with support for basic CRUD operations
  • Added a new pair of traits for block storage: AsyncBlockStoreRead and AsyncBlockStoreWrite.
    • In-memory storage backend (primarily intended for testing)
    • CAR storage (supporting both reading and writing)
    • Users can implement other storage sources, like Azure-backed block storage.
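
A rough sketch of the read/write split described above, not atrium-repo's actual API. The trait names `BlockRead` and `BlockWrite`, the `MemoryStore` type, and the `u64` block id are all hypothetical; the methods are synchronous for brevity (the traits in this PR are async), and `DefaultHasher` stands in for a real cryptographic CID.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a CID: a 64-bit hash of the block's bytes.
/// Real CIDs wrap a cryptographic multihash; this is illustration only.
type BlockId = u64;

/// Hypothetical read half of a block store (synchronous for brevity).
trait BlockRead {
    fn read_block(&self, id: BlockId) -> Option<&[u8]>;
}

/// Hypothetical write half of a block store.
trait BlockWrite {
    /// Stores the bytes and returns the id derived from their content.
    fn write_block(&mut self, data: &[u8]) -> BlockId;
}

/// In-memory backend, the kind of store one would use in tests.
#[derive(Default)]
struct MemoryStore {
    blocks: HashMap<BlockId, Vec<u8>>,
}

/// Derives the (non-cryptographic) id for a block from its content.
fn block_id(data: &[u8]) -> BlockId {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

impl BlockRead for MemoryStore {
    fn read_block(&self, id: BlockId) -> Option<&[u8]> {
        self.blocks.get(&id).map(|bytes| bytes.as_slice())
    }
}

impl BlockWrite for MemoryStore {
    fn write_block(&mut self, data: &[u8]) -> BlockId {
        let id = block_id(data);
        // Identical content maps to the same id, so only one copy is kept.
        self.blocks.entry(id).or_insert_with(|| data.to_vec());
        id
    }
}
```

Because ids are derived from content, writing the same bytes twice returns the same id and keeps a single copy. Another backend would implement the same two traits over its own storage.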

Test coverage (cargo llvm-cov -p atrium-repo --html --open):
[image: screenshot of the coverage report]


  • Merkle Search Tree
    • Parse Node
      • Read Node from bytes
      • Verify depth and sort order
      • Limit the number of TreeEntrys per Node to a statistically unlikely maximum length.
      • Consider limiting the overall depth of the repo, or other parameters, to prevent more sophisticated key mining attacks.
    • Locate key within node
    • Locate keys within node with a given prefix
    • Add entry
    • Edit entry
    • Delete entry
  • Repository
    • Load from a CAR file
    • Load from a firehose record
    • Read/write records inside of a repository
  • Storage
    • CAR files
      • Reading CAR files
      • Verify completeness of the repository structure.
      • Robustness to both duplication and de-duplication of blocks.
      • Ignore any unnecessary or unlinked blocks.
    • Commits from firehose records

DrChat marked this pull request as draft on December 23, 2024 at 20:14
DrChat force-pushed the repo branch 6 times, most recently from 3b84377 to f8e988e, on December 26, 2024 at 21:08
DrChat force-pushed the repo branch 2 times, most recently from f736ccc to b230066, on January 1, 2025 at 19:20

loop {
    let node = Node::read_from(&mut bs, node_cid).await?;
    if !seen.insert(node_cid) {
        // This CID was already seen. There is a cycle in the graph.

DrChat (Contributor, Author) commented:

Is it even possible to serialize a graph with a cycle?
The two nodes would have to point at each other, but each node's CID is a hash of its contents, including the CIDs of its children, so it should be impossible to compute either CID if there is a cycle.

DrChat (Contributor, Author) commented:

So I actually don't think this is possible, which means the extra validation logic here is probably optional and very unlikely to be hit in practice.
To err on the side of safety, though, I think it would be best to leave it in place.
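
The `seen.insert` guard from the excerpt can be sketched in isolation as below. This is a simplified illustration, not the code under review: `NodeId`, `walk_chain`, and the `links` table are hypothetical, each node has at most one child, and ids are arbitrary integers rather than content hashes. That last point is why a cycle can be built here at all.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical node identifier (an arbitrary integer, not a content hash).
type NodeId = u64;

/// Follows child links starting at `start` and returns the visited ids in
/// order. Fails on the first id seen twice, or on a link to an unknown node.
fn walk_chain(
    links: &HashMap<NodeId, Option<NodeId>>,
    start: NodeId,
) -> Result<Vec<NodeId>, String> {
    let mut seen = HashSet::new();
    let mut path = Vec::new();
    let mut current = Some(start);

    while let Some(id) = current {
        // `insert` returns false when the id was already present.
        if !seen.insert(id) {
            return Err(format!("cycle detected at node {}", id));
        }
        path.push(id);
        current = match links.get(&id) {
            Some(next) => *next,
            None => return Err(format!("missing node {}", id)),
        };
    }
    Ok(path)
}
```

The guard costs one set insertion per visited node, so keeping it as a defensive check is cheap even if a well-formed content-addressed graph can never trigger it.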

DrChat force-pushed the repo branch 2 times, most recently from 6db3530 to 60adc4a, on January 18, 2025 at 21:58

DrChat (Contributor, Author) commented Jan 18, 2025

I think the PR is in a pretty good state for an MVP.
My time is limited, so I won't be able to iterate quickly on the functionality for creating new commits for a repository. I'm still planning on it, but the work here can be pushed up in the meantime :)

DrChat marked this pull request as ready for review on January 18, 2025 at 22:24

erlend-sh (Contributor) commented Jan 19, 2025

Pinging @str4d since this builds on their prior work.

sugyan self-requested a review on January 19, 2025 at 13:48

sugyan (Owner) left a comment

I have written several comments on areas unrelated to the internal logic. There are also a couple of points from clippy that I hope you can correct as well.

Review threads:
.gitignore (outdated, resolved)
Cargo.toml (outdated, resolved)
atrium-repo/Cargo.toml (outdated, resolved)
Cargo.toml (outdated, resolved)
atrium-repo/README.md (outdated, resolved)

DrChat force-pushed the repo branch 5 times, most recently from 5f55fc5 to cb8e8c0, on February 1, 2025 at 17:52

DrChat (Contributor, Author) commented Feb 8, 2025

Note: It looks like there's a pretty exhaustive MST test suite here that could probably be leveraged to validate this implementation.
https://github.com/DavidBuchanan314/mst-test-suite

DrChat force-pushed the repo branch 2 times, most recently from b3afd8a to 2ee7cba, on February 12, 2025 at 00:59
DrChat force-pushed the repo branch 4 times, most recently from 9277356 to 465724d, on February 15, 2025 at 22:50
DrChat requested a review from sugyan on February 17, 2025 at 22:50

DrChat (Contributor, Author) commented Feb 17, 2025

@sugyan Alright, I think this code is ready to be reviewed and merged :)

sugyan (Owner) left a comment

I commented on minor points regarding Cargo.toml, the README, etc.

There seem to be a few warnings from cargo clippy, and I'd like to see those fixed if possible.

I'd love to have @str4d review the implementation details if possible, but he's busy...?

Review threads:
Cargo.toml (resolved)
Cargo.toml (outdated, resolved)
atrium-repo/Cargo.toml (resolved)
atrium-repo/README.md (outdated, resolved)