SPARQL 1.2 Functions related to initial text direction and language tags #154
Comments
Difference to the earlier #113 draft
|
Makes sense! |
For reference, there's an issue open with some concerns about the current text direction approach. If an alternative approach is taken, this will have an impact here as well. |
@rubensworks - thanks for pointing out that issue. Nothing is final until the publication of the REC 😄 There is no rush to get text into the SPARQL spec for these functions but, at the same time, the WG has made a decision and we can't wait until RDF 1.2 is finalized before doing work. The function list is my view of the natural outcome of the WG decision on initial text direction and the changes in RDF. That includes discussions with the internationalization working group. Bidirectional text is a much larger problem and I don't see that the WG has decided to take up the issue. The only response I recall is along the lines of "use a content-focused literal" (e.g. JSON-LD has non-normative "base direction"). So initial text direction (terminology suggested by i18n IIRC) already exists. Datatypes have been discussed and problems with them identified. A datatype is a class, and the subclass relationship does not work for scripts (a subclass must be usable in a place where the superclass is valid). There is nothing to stop use of compound literals. The WG initial text direction decision does not block that, nor do the proposed SPARQL changes. If the WG takes up w3c/rdf-concepts#79, things may need to change. FWIW I think the lack of a way to give a direction to a non-language tagged string is a bit odd. It would need a new datatype, not a munging of (General discussion about initial text direction in RDF 1.2 and on the RDF Concepts issues list please.) |
The rdf-tests PR w3c/rdf-tests#135 shows that using Therefore
Table in the description updated. N.B. Langtags are compared and matched in a case-insensitive manner but RDF Concepts does not mandate lowercase. Some systems use canonical langtags (e.g. |
The WG looked at w3c/rdf-concepts#79 at TPAC'24 and resolved:
|
Co-authored-by: Thomas Tanon <[email protected]> Co-authored-by: Olaf Hartig <[email protected]>
While implementing base direction support, I realized that (at least) the following string-based functions will also need to have their description updated and/or spec tests amended to cope with directional language strings.
|
The argument compatibility rules which apply to STRSTARTS, STRENDS, CONTAINS, STRBEFORE and string literal return type have been updated. Similarly for LCASE and UCASE. Examples could do with an additional row but what else? And more tests. REPLACE (unrelated): needs adding to the string literal return text (1.1 omission). "string literal" needs a proper anchor. What else? |
The
This probably needs to be extended towards directions as well. Functions such as
(I haven't looked into all descriptions in detail yet; my comment above was mainly to raise the need to look into it in more detail later) |
Sub-issue #180 created. |
SPARQL 1.2 Functions for language string literals
Functions: `hasLANG(literal)`, `hasLANGDIR(literal)`, `LANG(literal)`, `LANGDIR(literal)`, `STRLANGDIR(xsd:string, xsd:string, xsd:string)`.

`LANG(literal)` is part of SPARQL 1.1 and is extended for `rdf:dirLangString`.

Accessors:
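| | `hasLANG` | `hasLANGDIR` | `LANG` | `LANGDIR` |
|---|---|---|---|---|
| `"abc"@en` | true | false | `"en"` | `""` |
| `"abc"@en--ltr` | true | true | `"en"` | `"ltr"` |
| `"abc"@en--LTR` | true | true | `"en"` | `"ltr"` |
| `"abc"` | false | false | `""` | `""` |
| `"abc"^^rdf:dirLangString` | false | false | `""` | `""` |
| `"abc"^^rdf:langString` | false | false | `""` | `""` |
| `"123"^^xsd:integer` | false | false | `""` | `""` |
| `<http://example/xyz>` | | | | |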
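
The accessor behaviour listed above can be modelled with a small sketch. This is a Python illustration only, not part of any specification or implementation: the `Literal` and `IRI` classes and the snake_case function names are hypothetical stand-ins for RDF terms and the proposed SPARQL functions. It assumes, following the notes in this issue, that the accessors read components rather than the datatype, that a missing component yields `""`, that the direction is reported in lowercase, and that a non-literal argument is an error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    """Hypothetical stand-in for an IRI term."""
    value: str


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal and its components."""
    lexical: str
    datatype: str = "xsd:string"
    lang: str = ""       # language tag component, "" if absent
    direction: str = ""  # initial text direction component, "" if absent


def _require_literal(name, term):
    # Assumption taken from the issue text: a non-literal argument is an error.
    if not isinstance(term, Literal):
        raise TypeError(f"{name}: argument must be a literal")
    return term


def lang(term):
    """Model of LANG: the language tag, or "" if there is none."""
    return _require_literal("LANG", term).lang


def langdir(term):
    """Model of LANGDIR: the direction in lowercase, or "" if there is none."""
    return _require_literal("LANGDIR", term).direction.lower()


def has_lang(term):
    """Model of hasLANG, using the stated equivalence LANG(arg) != ""."""
    return lang(term) != ""


def has_langdir(term):
    """Model of hasLANGDIR, using the stated equivalence LANGDIR(arg) != ""."""
    return langdir(term) != ""


# "abc"@en--LTR: the components are read, the datatype is never inspected.
term = Literal("abc", "rdf:dirLangString", lang="en", direction="LTR")
print(lang(term), langdir(term), has_lang(term), has_langdir(term))
```

Because the functions look only at components, a literal such as `"abc"^^rdf:dirLangString` written without a language tag or direction simply reports `""` for both accessors. The sketch raises for a non-literal in `has_lang` and `has_langdir` as well, but only as a consequence of the equivalence it uses; the issue itself does not state that result.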
Constructors:
| Expression | Result |
|---|---|
| `STRLANG("abc", "en")` | `"abc"@en` |
| `STRLANG("abc", "")` | |
| `STRLANG(123, "")` | |
| `STRLANGDIR("abc", "en", "ltr")` | `"abc"@en--ltr` |
| `STRLANGDIR("abc", "en", "LTR")` | |
| `STRLANGDIR("abc", "en", "")` | |
| `STRLANGDIR("abc", "", "ltr")` | |
| `STRLANGDIR(123, "", "ltr")` | |
| `STRLANGDIR(<x:uri>, "en", "ltr")` | |
It is possible to write `"abc"^^rdf:dirLangString` and `"abc"^^rdf:langString` in N-Triples and Turtle.

The functions `hasLANG` and `hasLANGDIR` test whether an RDF term has the language tag or initial text direction component. See RDF Concepts, section "Literals". They don't test by datatype.

`LANG` is in SPARQL 1.1. This determines the choice for `LANGDIR` when passed a non-literal and the result of `LANGDIR(123)`.

The accessors `LANG` and `LANGDIR` return the facet or `""`, following `LANG` in SPARQL 1.1.

The argument must be a literal; otherwise it is an error.

In these cases, `hasLANG`/`hasLANGDIR` is false and the return of `LANG` and `LANGDIR` is `""`. The facet is not present.

It may be possible to write literals with a text direction but no language tag in some other format (note: for RDF/XML we can require "lang=" if "dir=" is present).
Notes
`hasFUNC(arg)` is equivalent to `FUNC(arg) != ""`.

The name `hasLANG`/`hasLANGDIR` is different in style to `isLITERAL` etc. because the `has*` functions test a component, not the RDF term as a whole.

`hasLANG` applies to `rdf:langString` and `rdf:dirLangString`.

Initial text direction is canonicalized to lowercase: cf. langtag being canonicalized in RDF 1.2.

It is not possible to write a literal in Turtle or N-Triples with a text direction but no language tag, nor is it possible to write a literal other than `rdf:dirLangString` and `rdf:langString` with a language tag. These are illegal in RDF Concepts but may occur naturally in other syntaxes as corner cases. The accessors approach works on components and would be well-defined.
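
To illustrate the component view and the lowercase direction described in these notes, here is a minimal Python sketch. The helper `split_langdir` is hypothetical and is not taken from any RDF library; it takes the text that follows `@` in a literal such as `"abc"@en--LTR`. It assumes that `--` separates the language tag from the direction, as in the examples in this issue, and it leaves the language tag as written, since only the direction is stated here to be lowercased.

```python
def split_langdir(tag: str) -> tuple[str, str]:
    """Split 'lang--dir' into (lang, dir); a missing part becomes ""."""
    # partition() returns the whole string as the first item when "--" is absent.
    lang, _sep, direction = tag.partition("--")
    # Assumption: only the direction is lowercased; the langtag is untouched.
    return lang, direction.lower()


for tag in ("en", "en--ltr", "en--LTR", "--ltr"):
    print(repr(tag), split_langdir(tag))
```

The last input, a direction with no language tag, is one of the corner cases mentioned above: the helper does not validate it and just reports the two components, which is the sense in which a component-based accessor stays well-defined.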