Data discovery #4
Replies: 8 comments 13 replies
-
I think it makes sense to do as you suggest, @lecoqlibre, and go with option 3 for now. Would we require all platforms to implement the TypeIndex on a fixed URL (e.g.
-
Also wondering how this ties in with datafoodconsortium/.github#5 (Platform Discovery)?
-
I agree with @maxime that the entry point on platforms should enable the retrieval of containers for data querying, rather than presenting a graph with the user as the root and their data as predicates in a tree structure. A method more aligned with the Solid standard would be for this entry point to return the profile linked to the user's WebID as a "Solid WebID Profile," with profile data containing links to the containers [https://solid.github.io/webid-profile/]. The containers could be represented using the VoID ontology [https://www.w3.org/TR/void/]. Another solution could be for the server to return the containers without concerning itself with the notion of a user, on an endpoint like _wellKnown/void or equivalent, as is done by semapps servers. These solutions could be complementary.
-
Ok. I agree it makes sense to use a single point of entry for a platform.
Ok... which of these standards are you proposing to use: WebID-OIDC, Solid-OIDC or WebID-TLS? Are these all extensions to OIDC? Would that require a change or clarification to the standard to use WebID + OIDC, rather than just OIDC? I'm concerned that we're introducing another concept (WebID) - we currently reference
Would this still be encapsulated in the OIDC user access layer, to provide data security? Also worth noting the parallel discussion around use of TypeIndexes / data structures - #3 @balessan - would love to hear your thoughts.
-
Introducing WebID and profile

In the following proposal, any DFC user and platform would have their own WebID and profile(s). When dereferenced, a WebID would give information (a profile) which can be used to find data. The WebID could be used with Solid-OIDC for authentication and authorization instead of the email (currently used in the DFC token).

Platform WebID and profile

For instance, the public WebID of a platform (https://ofn.org/card#me) could look like the following and contain links to pieces of data:

# This is the WebID profile document
<> a foaf:PersonalProfileDocument; # (note: there is a proposal to rename it to solid:profileDocument).
foaf:primaryTopic <#me>.
<#me> a foaf:Agent; # this is the WebID profile, containing useful information to discover data.
foaf:name "Open Food Network";
pim:preferencesFile <link/to/the/preferences/file>; # could be used to restrict access to some information (see below).
solid:publicTypeIndex <link/to/the/public/type/index>; # we can use TypeIndex or whatever vocabulary to advertise some index documents.
dfc-b:EnterpriseByNameIndex <link/to/the/index/document>. # or we could list the indexes directly in the WebID using custom predicates linked to specific indexes, like this one.

If the platform wants to restrict access to some information, it could move it to a private or restricted (ACL) preferences file:

<> a pim:ConfigurationFile.
<https://ofn.org/card#me>
solid:publicTypeIndex <link/to/the/public/type/index>; # like before we can use TypeIndex or any vocabulary of our choice.
dfc-b:EnterpriseByNameIndex <link/to/the/index/document>. # or direct predicates.

User's WebID and profile

The following is an example of a user's WebID (https://webid.provider/user/card#me):

<> a foaf:PersonalProfileDocument;
foaf:primaryTopic <#me>.
<#me> a foaf:Person;
foaf:name "Bob";
pim:storage <https://socleo.fr/user/dfc/>, <https://ofn.org/user/dfc/>; # link to the user's data on different platforms.
pim:preferencesFile <link/to/the/preferences/file>. # Used to restrict access to sensitive information (see below).

To restrict access to sensitive information, the user has moved some information to the preferences file, protected by ACL:

<> a pim:ConfigurationFile.
# Here we are extending the profile of the user...
<https://webid.provider/user/card#me>
dfc-b:agent # ...mapping the user to his corresponding dfc-b:Person on the different platforms he uses.
<https://socleo.fr/user/dfc/agent/person/user>,
<https://ofn.org/user/dfc/agent/person/user>.
# We also define a configuration to find the data.
<#config> a dfc-b:Configuration;
solid:privateTypeIndex <link/to/the/private/type/index>; # TypeIndex could be used to list containers.
dfc-b:enterpriseContainer <link/to/the/container>; # or use DFC custom direct predicates like this one.
void:somePredicate <...>; # or we can use void datasets or any other vocabulary of our choice.
dfc-b:EnterpriseByNameIndex
<https://socleo.fr/user/dfc/enterpriseByNameIndex>,
<https://ofn.org/user/dfc/enterpriseByNameIndex>.
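The discovery flow described in this proposal can be sketched in Python. This is only a minimal illustration, not part of the DFC standard: it assumes the profile and preferences documents have already been fetched and parsed into (subject, predicate, object) triples. The `DOCUMENTS` store, the `discover_indexes` helper and the concrete index and preferences URLs are hypothetical.

```python
# Minimal sketch: follow a WebID profile to the index documents it advertises.
# Assumption: documents are already parsed into (subject, predicate, object)
# triples; DOCUMENTS stands in for HTTP dereferencing plus Turtle parsing.

WEBID = "https://ofn.org/card#me"

# Predicates treated as "links to an index" in this sketch (taken from the proposal).
INDEX_PREDICATES = {"solid:publicTypeIndex", "dfc-b:EnterpriseByNameIndex"}

# Hypothetical document store: document URL -> triples found in that document.
# The settings/ and indexes/ URLs are invented for the example.
DOCUMENTS = {
    "https://ofn.org/card": [
        (WEBID, "foaf:name", "Open Food Network"),
        (WEBID, "pim:preferencesFile", "https://ofn.org/settings/prefs"),
        (WEBID, "solid:publicTypeIndex", "https://ofn.org/settings/publicTypeIndex"),
    ],
    # Restricted file: in a real deployment it is only readable by agents
    # allowed by the ACL.
    "https://ofn.org/settings/prefs": [
        (WEBID, "dfc-b:EnterpriseByNameIndex", "https://ofn.org/indexes/enterpriseByName"),
    ],
}


def discover_indexes(webid, can_read_preferences=False):
    """Return the index URLs advertised for `webid`, in discovery order."""
    # The profile document URL is the WebID without its fragment.
    profile_url = webid.split("#", 1)[0]
    triples = list(DOCUMENTS.get(profile_url, []))

    if can_read_preferences:
        # Extend the profile with every preferences file it points to.
        for subject, predicate, obj in list(triples):
            if subject == webid and predicate == "pim:preferencesFile":
                triples.extend(DOCUMENTS.get(obj, []))

    return [obj for subject, predicate, obj in triples
            if subject == webid and predicate in INDEX_PREDICATES]


public_view = discover_indexes(WEBID)
trusted_view = discover_indexes(WEBID, can_read_preferences=True)
print(public_view)
print(trusted_view)
```

An anonymous client only sees the links in the public profile, while a client allowed to read the preferences file also sees the links that were moved there.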
-
Today during our Tech meeting with @RaggedStaff, @RachL, @simonLouvet and @lin-d-hop I presented the WebID proposal, which is linked to a Solid-OIDC proposal (only the Resource server + Authorization server part of it).

Garethe and Simon said that platforms are not ready to use Solid-OIDC, although we did not discuss what it is and what is required to implement it (I started to implement a Proof Of Concept which showed me that it is "easily" doable). Lynne said that we should make the standard work without changing it ("rewriting" was the word, I think). It seems we want a solution right now with as few changes as possible, while considering Solid-OIDC as the way to go in a future version of the standard. Simon and Garethe were saying that Solid-OIDC has not been discussed and agreed before, and that we need a solution that works with the current classic OIDC authentication.

That's true, but I thought it was maybe time to discuss it seriously, because we are going to have to change the standard anyway. As an architect, I prefer to take the time to design things with the long term in mind, because I know the result will be stronger, easier and cheaper in the end. But I also understand real-life limitations and how attractive short/mid-term solutions can be. Even if they can work, we should be careful, especially when building a standard, because some changes will have a huge impact if they are not made at the right time (authentication and data storage are like the foundations of a building).

But let's go back to the meeting. Some opinions were expressed on these subjects:
I'm a little nervous about point 3b. This point is related to the fact that we need a way to know all the other platforms a user is using and where this particular user's data is stored on these platforms. The question is where to save that information.

So if we choose point 3b, we are going to have a WebID for the same user on all the platforms. We are going to have to link them together, using the Indeed, these WebIDs will be copies of each other; they should be identical. So we need these copies to be synchronized. And here we hit the question of synchronization, with all the issues and complexity it brings (network errors, authorizations, integrity, etc.).

Some possible alternatives to replication that could work with the current DFC OIDC:
I would say the best option is 3 given the constraints. The first option seems unfair and the second is not generic and not really secure.

Simon is also proposing to use ActivityPub to synchronize platforms. Using this mechanism, option 3b could also be used. But for my part I don't really like this idea of synchronization. I think keeping the ecosystem synced will be overcomplicated to manage. The Solid project and its philosophy resolve this issue, after almost ten years of research. WebID and Solid-OIDC are well designed, simple and fit the Web requirements while being secure (Demonstrating Proof of Possession is also the result of ten years of research in the cyber-security community).

Here I paste the content I wrote in the meeting notes:

Abstract: If we want multiple applications to exchange data in a re-usable way, we need to let one app know where data is hosted on a second app. Example: tell that this route (URL) points to the Enterprise LDP container, that other route points to the SuppliedProduct LDP container, and so on.

We have a consensus about using a WebID + WebID profile document for platforms. When dereferenced, a WebID would give information (a profile) which can be used to find data. For describing the data that can be found, I think we should in fact have:
Because the main high-level TypeIndex should only advertise the high-level data types (and not all types like we did in option 3 of the discussion). We should instead think in terms of data format. The user should not choose where to store a particular RDF resource in the installation itself. He chooses the installation he wants to work on, but that's all. The application should know the internal organisation, but not the user. While the standard could use a TypeIndex to reference the internal resources of an installation, this is not necessarily useful.

So the WebID of a platform could reference a public TypeIndex listing all the DFC (main) Enterprises the platform is hosting (those who agreed), or preferably all the DFC installations, but then we need a way to link an installation to its main enterprise(s). This enables data discovery: other platforms would be able to know which farms are hosted by the platform by following the links found in the TypeIndex, and then dereference whatever RDF resource(s) they need to display each farm's information on their interface.

So platforms would host the users' information in dedicated containers. For instance, the user "Tom Jones" could have a DFC installation at https://someplatform.org/tom.jones/. This container would host all the RDF resources of the installation, including the (main) Enterprise of Tom Jones. If multiple installations are supported, platforms could host them in https://someplatform.org/tom.jones/installation1 and https://someplatform.org/tom.jones/installation2.

Having a fixed tree structure is like having some kind of file format or package format. Example: Enterprise will always be stored in the /agents/enterprises/ container, supplied products will always be stored in the /defined-products/supplied-products/ container, and so on. The installation would be defined by a new ontology predicate like dfc-b:installation.

Users should have their own WebID too.
This would allow him to make the link between the platforms he is using. The WebID should also be used to log in on platforms. That would allow Access Control at the platform level. This requires platforms to get closer to the Solid protocol by implementing Solid-OIDC (primer), which would fix the security issues we have. Solid-OIDC is well designed and uses the new Demonstrating Proof of Possession (DPoP) mechanism, which has been actively sought for more than a decade by the OAuth ecosystem. The history is shown in this presentation.

While platforms do not seem ready to move users' data to Solid PODs yet, they could act as OAuth Resource Server and Authorisation Server. The Identity server could be delegated to a third party, like DFC maybe? Indeed, the DFC could offer a WebID service! Users would create their WebID on DFC and would be able to log in on all the DFC applications they use. At the same time they would also be able to use Solid applications!

At the platform level, the implementation of a Resource + Authorisation server is doable. The DFC could provide some tutorials and tools to make this even easier and faster. Maxime worked several days on this topic and will show some code at https://github.com/datafoodconsortium/solid-oidc-proposal.
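The fixed tree structure mentioned in these notes lends itself to a very small helper. Here is a minimal sketch in Python, assuming only the two container paths quoted in the notes; the `FIXED_LAYOUT` table, its keys and the `container_url` helper are hypothetical names chosen for the illustration, not something defined by the standard.

```python
# Sketch of the "fixed tree structure" idea: every installation uses the same
# relative container paths, so a client only needs the installation root.

# Relative paths quoted in the meeting notes; the dictionary keys are
# illustrative labels chosen for this sketch.
FIXED_LAYOUT = {
    "Enterprise": "agents/enterprises/",
    "SuppliedProduct": "defined-products/supplied-products/",
}


def container_url(installation_root, resource_type):
    """Build the container URL for `resource_type` under `installation_root`."""
    try:
        relative_path = FIXED_LAYOUT[resource_type]
    except KeyError:
        raise ValueError(f"no fixed container defined for {resource_type!r}")
    # Normalise the root so that it always ends with exactly one slash.
    return installation_root.rstrip("/") + "/" + relative_path


print(container_url("https://someplatform.org/tom.jones/", "Enterprise"))
print(container_url("https://someplatform.org/tom.jones/installation1", "SuppliedProduct"))
```

With such a layout, a client that knows the installation root (for example through something like dfc-b:installation) would not need any further index to locate the containers.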
-
I do not agree: each platform has divergent information about the user. The WebID on a platform is the user as seen by that platform.
-
@RaggedStaff, @lecoqlibre, @simonLouvet, @lin-d-hop and I have met again. On this topic we have agreed that:
-
If we want multiple applications to exchange data in a re-usable way, we need to let one app know where data is hosted on a second app. Example: tell that this route (URL) points to the Enterprise LDP container, that other route points to the SuppliedProduct LDP container, and so on.
I see some options here:
/agents/enterprises/ container, supplied products will always be stored in the /defined-products/supplied-products/ container, and so on. The root container would be defined by a new ontology predicate like dfc-t:rootContainer: "https://example.org/user/".
dfc-t:routeOfEnterprise would be used to express where Enterprise can be hosted, dfc-t:routeOfSuppliedProduct would be used to express where Supplied products can be hosted, and so on.
I think we should consider investigating option 4, while option 3 could be easily set up.
Currently we use a TypeIndex in our Solid specifications.
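For comparison, the predicate-based approach (dfc-t:routeOfEnterprise and friends) can be sketched as a lookup in a platform's advertised configuration. This is only an illustration in Python: the configuration is assumed to be already parsed into a dictionary, the route values are made up, and treating relative routes as hanging off dfc-t:rootContainer is an assumption of this sketch rather than something stated above. `PLATFORM_CONFIG` and `resolve_route` are hypothetical names.

```python
# Sketch: each platform advertises where it hosts each type through
# predicates such as dfc-t:routeOfEnterprise, instead of a fixed layout.

# Hypothetical, already-parsed configuration for one platform. The predicate
# names come from the discussion; the route values are invented for the example.
PLATFORM_CONFIG = {
    "dfc-t:rootContainer": "https://example.org/user/",
    "dfc-t:routeOfEnterprise": "my-enterprises/",
    "dfc-t:routeOfSuppliedProduct": "catalog/products/",
}


def resolve_route(config, type_name):
    """Return the absolute container URL advertised for `type_name`, or None."""
    route = config.get(f"dfc-t:routeOf{type_name}")
    if route is None:
        return None  # the platform does not advertise this type
    if route.startswith(("http://", "https://")):
        return route  # already absolute
    # Assumption of this sketch: relative routes hang off the root container.
    return config["dfc-t:rootContainer"].rstrip("/") + "/" + route.lstrip("/")


print(resolve_route(PLATFORM_CONFIG, "Enterprise"))
print(resolve_route(PLATFORM_CONFIG, "Order"))
```

Compared with a fixed tree, this leaves each platform free to choose its own paths, at the cost of one extra lookup per data type.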