Qualifier syntax is ambiguous for IRI identifiers and inconsistent #148

Open
althonos opened this issue Feb 28, 2024 · 0 comments
Labels
1.4 major issues issues that are priority to address in 1.6

Ambiguous qualifiers

The current syntax is ambiguous inside qualifier lists.

Qualifiers are produced by the following rule:

Qualifier ::= Rel-ID '=' QuotedString 

However, the Rel-ID is itself allowed to end with an = sign (when it is produced by the Unprefixed-ID rule), so a greedy parser consumes the separator as part of the identifier and cannot parse the following:

{minCardinality=1}

(fastobo has to special-case this.)
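
To make the failure concrete, here is a minimal sketch of a greedy, non-backtracking scanner. The `is_id_char` predicate is a simplified stand-in for OboChar (an assumption, not the real grammar), the function names are hypothetical, and the input uses a quoted value so that it matches the Qualifier rule:

```python
def is_id_char(c: str) -> bool:
    # Simplified stand-in for OboChar (assumption): anything except
    # whitespace and the qualifier-list delimiters. Note that '=' is allowed.
    return not c.isspace() and c not in '{},"'


def scan_rel_id(text: str, start: int = 0) -> int:
    """Greedily consume identifier characters; return the end offset."""
    end = start
    while end < len(text) and is_id_char(text[end]):
        end += 1
    return end


def parse_qualifier_greedy(text: str) -> tuple[str, str]:
    """Parse `Rel-ID '=' QuotedString` without any backtracking."""
    end = scan_rel_id(text)
    # The '=' separator was already swallowed by the identifier scan,
    # so the next character is the opening quote (or end of input).
    if end >= len(text) or text[end] != "=":
        raise ValueError(f"expected '=' at offset {end}")
    value = text[end + 1:]
    if len(value) < 2 or value[0] != '"' or value[-1] != '"':
        raise ValueError("expected a quoted string after '='")
    return text[:end], value[1:-1]


if __name__ == "__main__":
    try:
        parse_qualifier_greedy('minCardinality="1"')
    except ValueError as err:
        print("greedy parse failed:", err)
```

A parser has to either backtrack or look ahead for the `="` sequence to recover the intended split.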

I would suggest removing the OboChar rule and replacing it with two rules: one producing identifiers (using the syntax of the SPARQL PN_LOCAL and PNAME_LN terminals), and one producing unquoted strings (allowing most characters except { and !, which would need to be escaped to avoid ambiguity with the EOL rule). For identifiers, this gives:

Abbreviated-ID ::= PN_LOCAL
Prefixed-ID ::= PNAME_LN

This makes the syntax for abbreviated and prefixed identifiers similar to that of the OWL Manchester Syntax, and is more restrictive about what an identifier can contain.
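
As an illustration of why this helps, the sketch below uses a rough ASCII-only approximation of the SPARQL PN_LOCAL terminal (the real terminal also admits a wide range of non-ASCII characters, which are left out here). With it, an unescaped = can no longer be part of an abbreviated identifier, so the match stops right before the separator:

```python
import re

# ASCII-only approximation of SPARQL's PN_LOCAL (assumption: the non-ASCII
# ranges of PN_CHARS_BASE and PN_CHARS are deliberately omitted).
PN_CHARS_U = r"[A-Za-z_]"
PN_CHARS = r"[A-Za-z0-9_\-]"
# PLX: a percent-encoded byte, or a backslash-escaped punctuation character.
PLX = r"(?:%[0-9A-Fa-f]{2}|\\[_~.\-!$&'()*+,;=/?#@%])"
PN_LOCAL = re.compile(
    rf"(?:{PN_CHARS_U}|[:0-9]|{PLX})"
    rf"(?:(?:{PN_CHARS}|[.:]|{PLX})*(?:{PN_CHARS}|:|{PLX}))?"
)


def scan_abbreviated_id(text: str) -> str:
    """Return the longest PN_LOCAL-like prefix of `text` ('' if none)."""
    match = PN_LOCAL.match(text)
    return match.group() if match else ""


if __name__ == "__main__":
    # The scan stops before '=', leaving the separator for the Qualifier rule.
    print(scan_abbreviated_id('minCardinality="1"'))
```

An = can still appear in such an identifier, but only in escaped form (`\=`), which keeps the separator unambiguous.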

Even with that change, an IRI can still contain = as a sub-delimiter, so this does not solve the problem for IRI identifiers. For example, from ncit.obo:

def: "A practiced and regimented skill or series of actions." [] {http://purl.obolibrary.org/obo/NCIT_C16847="NCI"}

Here the IRI part is still ambiguous: a greedy parser cannot tell whether the = ends the identifier or belongs to it.

Inconsistency

Qualifier lists are the only place where the equal sign = is used as a separator; in xref lists and property values, the value is separated from the annotation property only by whitespace. Since whitespace characters are not IRI characters, using whitespace in qualifiers as well would also fix the problem above. This would change the syntax from:

Qualifier ::= Rel-ID '=' QuotedString 

to

Qualifier ::= Rel-ID {WhiteSpaceChar} QuotedString 

which yields, for instance:

{http://purl.obolibrary.org/obo/NCIT_C16847 "NCI"}

in the example above, and can be parsed without ambiguity.
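
A minimal sketch of how the proposed form could be parsed, assuming at least one whitespace character between the identifier and the value, and ignoring escape sequences inside the quoted string. The function name and the second IRI in the usage example are hypothetical:

```python
def parse_qualifier_ws(text: str) -> tuple[str, str]:
    """Parse the proposed `Rel-ID {WhiteSpaceChar} QuotedString` form."""
    # Whitespace cannot occur in an IRI, so the identifier is simply
    # everything up to the first run of whitespace.
    parts = text.split(None, 1)
    if len(parts) != 2:
        raise ValueError("expected an identifier followed by a quoted string")
    rel_id, rest = parts
    # Simplification (assumption): escapes inside the string are not handled.
    if len(rest) < 2 or rest[0] != '"' or rest[-1] != '"':
        raise ValueError("expected a quoted string after the identifier")
    return rel_id, rest[1:-1]


if __name__ == "__main__":
    print(parse_qualifier_ws('http://purl.obolibrary.org/obo/NCIT_C16847 "NCI"'))
    # An '=' inside the IRI no longer interferes with finding the value.
    print(parse_qualifier_ws('http://example.org/term?id=1 "x"'))
```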

@althonos althonos added the 1.4 major issues issues that are priority to address in 1.6 label Feb 28, 2024