YAML-LD datatypes (and tags for datatypes) #17
This is not the case anymore with JSON-LD 1.1 (example) |
This is another interesting direction to explore that does not seem to create inconsistencies with YAML spec, thanks Vladimir! I suggest using full-URI tags in the examples for clarity, eg:
|
I feel that manually specifying data types for each value is very tedious, and the tag syntax is not very intuitive. My feeling is this: why don't we delegate that task to the context? The machine is smart enough to understand that a value of a |
Can you post an example? Probably we should start collecting examples of "equivalence classes" of yaml files in this repo. |
@ioggstream https://yaml.org/spec/1.2.2/#104-other-schemas allows us to make an XSD YAML schema. @anatoly-scherbakov Of course, if a field ALWAYS uses the same datatype, the context can provide it. But dates in instance data often come in various granularities (same with numbers). So wouldn't it be nice to write this instead of the respective long forms?
|
Of course we can, and that's an important role of JSON-LD contexts: making explicit some implicit constraints/dependencies (e.g. "this field expects this datatype"). However, we also need a way to make this information explicit (e.g. in the expanded form of JSON-LD). In JSON-LD, this is done with a value object. Also, +1 to @VladimirAlexiev's use case above. |
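For readers unfamiliar with the value object form mentioned above, here is a minimal Python sketch of how a scalar and its datatype IRI could be paired explicitly. The helper name `to_value_object` and the sample document are assumptions made for illustration; they are not part of any JSON-LD library.

```python
import json

XSD = "http://www.w3.org/2001/XMLSchema#"


def to_value_object(lexical, datatype_iri=None):
    """Build a JSON-LD value object (hypothetical helper, for illustration)."""
    value_object = {"@value": lexical}
    if datatype_iri is not None:
        # The datatype travels with the value instead of living in the context.
        value_object["@type"] = datatype_iri
    return value_object


doc = {
    "@context": {"@vocab": "http://xmlns.com/foaf/0.1/"},
    "name": to_value_object("Gregg Kellogg", XSD + "string"),
    "date": to_value_object("2022-08-08", XSD + "date"),
}
print(json.dumps(doc, indent=2))
```

A YAML tag on a scalar would carry the same information as the `@type` entry here, only in a more compact notation.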
@VladimirAlexiev @ioggstream that is an interesting point. When using JSON-LD, I always tried to ensure that a particular property always maps to a specific type, but I agree that this application of tags is compelling. 👍 |
This was discussed during today's call: https://json-ld.org/minutes/2022-06-22/. |
This issue was discussed in today's meeting. |
I think this is a great candidate for something an extended profile could do, and something like the In my mind, this isn't a direct replacement for the The toRdf and fromRdf algorithms would need to honor them when generating RDF or turning RDF back into the internal representation, again running with the appropriate processing mode.
Otherwise, this change should be fairly transparent. IMO, this is the primary motivation for an extended profile. |
So what is actually in play here is a profile of YAML itself: the profile for which JSON-LD translations are lossless. So we don't need a profile of YAML-LD; rather, YAML-LD is an extension of a "YAML-JSON-compatible" profile. Such a profile could be implicit, or made explicit if multiple YAML/JSON conversions are defined. Another reason to make it explicit would be to validate whether a given YAML document is compatible with YAML-LD before defining the YAML-LD extended syntax for that YAML schema. |
I guess in my mind, the "YAML-JSON-compatible" profile is analogous to YAML using the JSON schema. This does not depend on explicit tags, but implicitly associates the values with I think something like a "YAML-XSD-compatible" profile might require the use of a tag namespace such as suggested by @VladimirAlexiev: If running in "extended", or "YAML-XSD-compatible" mode, a In my mind, this and alias nodes are the primary things that would be enabled by an extended mode. If a processor sees some other Given this, I think we may be about ready to define the processing modes more completely. |
I'm thinking here about statements about conformance (:myresource dct:conformsTo): how do I know if a YAML resource is "YAML using the JSON schema"? (The same holds true for the identifiers for YAML-LD and JSON-LD.) The general use case is to be able to determine what an API supports in terms of interoperability of data payloads. Can anyone orient me to where this is being defined or discussed? I can see inline directives such as https://yaml.org/spec/1.2.2/#681-yaml-directives. Is identification of the profile out-of-band using resolvable identifiers (i.e. not in syntax-specific directives using syntax-specific keywords and versioning) a factor in defining processing modes? |
I've looked into this some more as part of trying to implement extended support for XSD scalar values in YAML. IMO, the appropriate
This would allow values such as Other YAML tools show similar issues, I think largely due to the fact that the YAML spec only uses the An example file I've been working with to exercise this variation is the following:

%YAML 1.2
%TAG ! http://www.w3.org/2001/XMLSchema#
---
"@context":
  "@vocab": http://xmlns.com/foaf/0.1/
name: !string Gregg Kellogg
homepage: https://greggkellogg.net/
depiction: http://www.gravatar.com/avatar/42f948adff3afaa52249d963117af7c8
date: !date 2022-08-08

(Note: the use of a specific tag name shouldn't be significant. In this case, it's using the primary tag handle, but it could just as well be the secondary tag handle ( If we are to support XSD types, we probably want to white-list allowed datatype URIs to include most XSD types, in addition to See also yaml/yaml-spec#268 (comment). |
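To make the tag-handle point concrete, the following Python sketch expands a tag shorthand against the prefixes declared by %TAG directives and then checks the result against an allow-list. It is only an illustration: `resolve_tag`, `is_allowed`, and the contents of `ALLOWED_DATATYPES` are hypothetical, and a real implementation would take the handles from the YAML parser rather than from a hand-written dictionary.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical allow-list; the thread only says "most XSD types".
ALLOWED_DATATYPES = {
    XSD + name
    for name in ("string", "boolean", "integer", "decimal", "double", "date", "dateTime")
}


def resolve_tag(tag, handles):
    """Expand a tag shorthand (e.g. '!date', '!xsd!date') to a full URI.

    `handles` maps each %TAG handle to its prefix; unknown tags pass through.
    """
    if tag.startswith("!<") and tag.endswith(">"):
        return tag[2:-1]  # verbatim tag, already a full URI
    # Try longer handles first so '!xsd!' is preferred over the primary '!'.
    for handle in sorted(handles, key=len, reverse=True):
        if tag.startswith(handle):
            return handles[handle] + tag[len(handle):]
    return tag


def is_allowed(tag, handles):
    """True if the resolved tag is on the (assumed) datatype allow-list."""
    return resolve_tag(tag, handles) in ALLOWED_DATATYPES


primary = {"!": XSD}     # %TAG ! http://www.w3.org/2001/XMLSchema#
named = {"!xsd!": XSD}   # %TAG !xsd! http://www.w3.org/2001/XMLSchema#
print(resolve_tag("!date", primary))
print(resolve_tag("!xsd!date", named))
print(is_allowed("!date", primary))
```

Both shorthands resolve to the same datatype URI, which is why the choice of handle should not be significant to a processor.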
No, I don't believe it is; however, we could consider using a datatype form such as defined for the i18n namespace:

@prefix i18n: <https://www.w3.org/ns/i18n#> .
[ ex:title "foo"^^i18n:en ] .

Although it's defined to allow a combination of language and base direction, it can be used for just language or base direction. Of course, we would need to define that literal values using an i18n datatype consisting of only language would be translated to language-tagged literals, and vice versa. |
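The translation between i18n datatypes and language-tagged literals described above could be sketched in Python as follows. This assumes the `language_direction` suffix convention of the i18n namespace (for example `en`, `ar-EG_rtl`, or `_rtl`); the function names are hypothetical.

```python
I18N = "https://www.w3.org/ns/i18n#"


def i18n_to_language(datatype_iri):
    """Split an i18n datatype IRI into (language, direction); None if absent."""
    if not datatype_iri.startswith(I18N):
        raise ValueError("not an i18n datatype: " + datatype_iri)
    suffix = datatype_iri[len(I18N):]
    # Language tags use hyphens, so the underscore marks the direction part.
    language, _, direction = suffix.partition("_")
    return (language or None, direction or None)


def language_to_i18n(language=None, direction=None):
    """Inverse mapping: build the i18n datatype IRI from its parts."""
    suffix = language or ""
    if direction:
        suffix += "_" + direction
    return I18N + suffix


print(i18n_to_language(I18N + "en"))
print(language_to_i18n("ar-EG", "rtl"))
```

A literal whose datatype yields only a language (no direction) is the case that would map to an ordinary language-tagged literal.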
|
onlineyamltools.com allows Trying with explicit xsd tag gives the same error:

%YAML 1.2
%TAG !xsd! http://www.w3.org/2001/XMLSchema#
---
name: !xsd!string Gregg Kellogg

This tool can only use the "YAML JSON schema" builtin tags (and supports

%YAML 1.2
%TAG ! tag:yaml.org,2002:
---
name: !str Gregg Kellogg
int: !int 123
bigint: !int 123456789012345678901231 # -> 1.2345678901234569e+23 ouch!
bigint: 123456789012345678901231 # -> 1.2345678901234569e+23 ouch!
float: !float 1.235609853907835079889067406870964870956870967908 # -> 1.235609853907835
date: !timestamp 2022-08-08 # -> 2022-08-08T00:00:00.000Z
|
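The precision loss shown in the comments above comes from routing numbers through 64-bit doubles. A small Python sketch reproduces the effect and shows that keeping the lexical form (as an arbitrary-precision integer or a decimal) avoids it. The helper name `via_double` is made up for this illustration.

```python
from decimal import Decimal


def via_double(lexical):
    """Parse an integer lexical form the lossy way, through a 64-bit double."""
    return int(float(lexical))


big = "12345678901234567890"
print(int(big))          # exact: Python integers are arbitrary precision
print(via_double(big))   # rounded to the nearest representable double

long_decimal = "1.235609853907835079889067406870964870956870967908"
print(float(long_decimal))    # keeps roughly 17 significant digits
print(Decimal(long_decimal))  # keeps every digit of the lexical form
```

For the integer above, the value obtained through a double is 12345678901234567168, the same value reported elsewhere in this issue for the JSON-LD playground.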
My implementation needed to use a lower-level parser that just transforms YAML to the Representation Graph without further interpretation. In Ruby Psych, this is done via Psych.parse_stream. That level shouldn't place constraints on any specific schema. |
Beyond XSD: let's not forget custom datatypes, eg:
|
This was discussed on [2022-09-28](https://json-ld.org/minutes/2022-09-28/#16).
Pierre-Antoine Champin: The devil is in the details, and in the bnodes :-D
Vladimir Alexiev: I think we should use YAML tags in the form that datatypes are used for RDF.
... JSON-LD is more verbose, and the YAML syntax is more concise.
... In many cases the context will relieve you of this need, but there are cases where the graph is heterogeneous
... May be a problem with parsers.
... This also relates to YAML schemas, and how to attach types.
... YAML had a schema including dates, but have backed up.
... My proposal would be that the WG will declare a %TAG |xsd| ...
... But, implementers will need to use a better parser that supports tags.
... This is also important for numbers.
... We had trouble in xxx group, where the number would be misinterpreted.
... Then we need to look at a YAML parsers matrix to determine how widely available it is.
Gregg Kellogg: The current "spec" refers to a basic profile, which doesn't include tags but only basic YAML values
... and an Extended profile that includes XSD datatypes, and tags for URLs (is it absolute, or relative...)
... Gregg has an implementation that uses the YAML parse tree.
... Also in JSON-LD (discussion between Gregg and Antoine at TPAC), there is a movement towards handling more datatypes, and not mangling literals with default treatment of numbers
Vladimir Alexiev: What about URLs?
... In a heterogeneous dataset, the same field could contain either a string or a resource.
... can we have a single tag !id or !uri that would handle absolute, relative and CURIEs?
Gregg Kellogg: We want to explore some more use cases of URLs before deciding
Vladimir Alexiev: Can we decide this issue?
... let's not forget custom datatypes, eg geo:wktLiteral, geo:gmlLiteral, 5-10 more in GeoSPARQL 1.1, and the tentative rdf:JSON and rdf:YAML
Gregg Kellogg: Questions of quoting: is !xsd!integer '123' the same as !xsd!integer 123 and same as 123, or different?
Niklas Lindström: Author: someone!tag-key => as if author was defined in the context with "`@type`": <tag-key>; then if e.g. someone!uri was encountered, *and* uri is defined as an alias of "`@id`", this is short for {"`@id`": "someone"}
... the tag comes before the value, eg !tag-key someone
https://github.com/type -> `@type`
https://github.com/id -> `@id`
Gregg Kellogg: Tags should be declared in %TAG not in context, else we'll go against the grain of YAML
|
@gkellogg -- Several unfenced |
Sorry, must have been unfenced on IRC. I’ll fix them later |
Yeah, I'm sure they were unfenced on IRC. There's no consistent value to fencing there. Weirdly, now that they're single-backtick fenced here, those backticks are showing as part of the text instead of being interpreted as markdown -- so, for instance, we now see (bold added here to help with clarity) {"`@id`": "someone"}, where we'd expect to see {" I suspect this won't be a quick or easy fix, but it should be raised with the folks running the (now several!) IRC/log-to-GitHub bots. |
Well, I handle the irc log to HTML for these minutes, which were inserted here. Perhaps could detect some bare keywords, but you’re right that the result in the comment is wrongly interpreted, but that seems like a GH issue. |
I'd suggest wrapping the larger element including the |
@type in JSON-LD). Eg see "Elaborate on handling of JSON builtin types integer and double" (w3c/json-ld-syntax#387) for the pitfalls of using large integers or decimals
-.inf and .nan, datetimes), and even more complex structures. One could declare "YAML schemas" with additional tags, eg to represent all XSD datatypes
Why might we want more than "string plus @type"?
dc:date below and many other examples)
02022-05-18 to 2022-05-18 if tagged as !xsd!date rather than looking at a parallel @type field.
Let's collect below examples of what we could want.
@gkellogg in ietf-wg-httpapi/mediatypes#8 (comment)
@VladimirAlexiev from #2:
-.inf and .nan).
12345678901234567890.12345 is converted to RDF literal "12345678901234567168"^^xsd:integer (see jsonld playground)
@type, eg
New ones:
"foo"@en in YAML rather than a separate @language field?