Unixfsv1-directories-as-ADLs proposal #105
This needs to highlight right up front that this is strictly about the Go implementation--it's mentioned, but a little offhand and I don't think it'll be entirely obvious to many who are reading this. Also IMO it'd be worth extra-highlighting that one of the highest points of leverage here is the independence of UnixFS and having it all in one, well-defined place with clean boundaries, and not having its logic spilling over in many places. Being able to use it independently and have it work for all of the difficult cases it needs to work (like scalability) out of the box is going to unlock a lot of potential ecosystem & independent tooling.
There's lots of good content here @warpfork and @lidel . Thanks for putting it together.
This isn't your fault, but I think the project proposal template may have gotten in the way of making the pitch more clear. I worry that a non-engineer will get somewhat lost here. I think we need to get *specific* problems/solutions from a customer's point-of-view called out upfront. (I recognize a lot of time has been spent here and we don't want to spend a lot more.)
In reading through this again, I think a way (just an idea) we could crisply get the point across is with a structure like:
# Use Case #1: Customer wants to add a file to a directory
Incrementally adding a file to a directory is a common use case for customers.
## Current Problematic Option #1: Use ipfs object patch
We have repeatedly seen customers reach for "patch" semantics. It unfortunately has these issues:
1. no implicit gc protection - the result can be garbage collected before one can pin it
2. can create untransferable block - there is no support for sharding of big directories. A single block can be created that is too big to be exchanged over the network
Because of these issues we have actually hidden the APIs and directed customers to use...
## Current Problematic Option #2: Use ipfs files cp
While this is our current recommended approach, it too is fraught with issues including:
1. Abstraction level
2. Built on top of the MFS library which few Stewards know how to touch, and those who do view it as radioactive
3. Clunky - have to type `files cp` and `files stat`
...
## How use case is handled by ADLs
// Show what the user-typed code/calls look like when using ADLs
# Use Case #2: ...
... Repeated structure for use case above ...
# Other problems solved
1. Unixfsv1 implementation components are currently scattered among many source code repositories,
and the API boundaries are not especially clear. This makes maintenance and development difficult. This approach would consolidate the implementation to a single repository.
2. Selectors
...
If this jibes/makes sense, I would suggest putting content like this at the top and having the existing content below with a note that the reader is welcome but not expected to read below. I want a PM who commits 10 minutes of time to look at this to get why the engineering team would be recommending this.
As to how we prevent requiring so much write-up in the future, I'm looking to the improvements in #106 to help here.
Thanks again for your time and work here. I'm also game to talk more verbally.
Some very low-level features are made available by IPFS APIs, but are not always safe to use,
because they may do incorrect things in "corner cases" which are not obvious to end users, resulting in a general air of fragility.
Can we provide specific example(s)?
Some high level features are made available by other IPFS APIs, but jump to such a high level
that users with specific needs can be observed to jump back down to the low level APIs
(and then proceed to use them incorrectly, because those APIs do not guide the user to good usage).
Can we provide some example(s)?
Example user stories such as "patch this directory" currently don't work out well, because while we offer "patch" APIs in some places,
they only work on low-level data, or, they work on high level data, but in ways that are not entirely desirable:

- In the low-level way: we encounter problematic scenarios because low-level abstractions do not include enough knowledge
Can we make clear what the low-level way is? I assume it's `ipfs object patch add-link`
which are so large that they exceed the limits set by data transfer mechanisms elsewhere in our stack.
This incoherence results in serious usability problems
(specifically, that users can create data which is then only available from that node, and cannot be transferred).
- In the high-level way: While the "patch" user story can be satisfied (arguably) through the use of the MFS system,
Make clear what this way is? I assume it's `ipfs files cp`
Long story short: this is a proposal to renovate the way we work with unixfsv1 data so that it becomes manageable through the interface of an IPLD "ADL" (an "Advanced Data Layout"), which is an API concept designed to handle needs like sharding.
The aim is that this should produce nicely maintainable code, while also fixing a lot of bugs along the way, providing a couple of nice features immediately (like Selectors working over unixfsv1 pathing, which is extremely neat), and providing a solid ground for new work.