add Bioregistry-style compact identifiers #117

Open
egonw opened this issue Jun 11, 2023 · 2 comments

egonw commented Jun 11, 2023

@cthoyt, suppose I add Bioregistry-style compact identifiers as literals to the WikiPathways RDF. Is there already an ontology with a predicate for such identifiers?

Let's say we have this:

<https://identifiers.org/ensembl/ENSG00000139163>
        rdf:type            wp:GeneProduct , wp:DataNode ;
        rdfs:label          "ETNK1" ;
        wp:bdbEnsembl       <https://identifiers.org/ensembl/ENSG00000139163> ;
        wp:bdbEntrezGene    <https://identifiers.org/ncbigene/55500> ;
        wp:bdbHgncSymbol    <https://identifiers.org/hgnc.symbol/ETNK1> ;
        wp:bdbUniprot       <https://identifiers.org/uniprot/A0A5K1VW28> , <https://identifiers.org/uniprot/Q86U68> , <https://identifiers.org/uniprot/Q9HBU6> , <https://identifiers.org/uniprot/H0YH69> , <https://identifiers.org/uniprot/H0YFP7> , <https://identifiers.org/uniprot/A0A5F9ZI33> ;
        wp:bdbWikidata      <http://www.wikidata.org/entity/Q18041828> .

If I then add something like x:y as below, what should x:y be? (PS: ignore the prefixes, I didn't check them; you get the point.)

<https://identifiers.org/ensembl/ENSG00000139163>
        x:y       "ensembl:ENSG00000139163", "ncbigene:55500", "hgnc.symbol:ETNK1", "uniprot:A0A5K1VW28", "uniprot:Q9HBU6", "uniprot:H0YH69", "uniprot:H0YFP7", "uniprot:A0A5F9ZI33", "wikidata:Q18041828" ;

I could use wp:bdbBioregistry but maybe you have something better in mind?
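
For illustration, the compact identifiers in the example above could be derived from the existing URIs with a few lines of Python. This is only a sketch: it assumes the path-style identifiers.org URIs used in this thread (`https://identifiers.org/<prefix>/<id>`), and the function name `uri_to_curie` and the fallback table `OTHER_URI_PREFIXES` are made up for this example, not part of any library.

```python
IDENTIFIERS_ORG = "https://identifiers.org/"

# Hypothetical fallback table for URIs that do not live under identifiers.org.
OTHER_URI_PREFIXES = {
    "http://www.wikidata.org/entity/": "wikidata",
}


def uri_to_curie(uri: str) -> str:
    """Return a 'prefix:local_id' string for a supported URI."""
    if uri.startswith(IDENTIFIERS_ORG):
        rest = uri[len(IDENTIFIERS_ORG):]
        # Split on the first slash only: everything after it is the local ID.
        prefix, sep, local_id = rest.partition("/")
        if not sep or not prefix or not local_id:
            raise ValueError(f"cannot split {uri!r} into prefix and identifier")
        return f"{prefix}:{local_id}"
    for uri_prefix, prefix in OTHER_URI_PREFIXES.items():
        if uri.startswith(uri_prefix):
            return f"{prefix}:{uri[len(uri_prefix):]}"
    raise ValueError(f"unsupported URI: {uri!r}")


if __name__ == "__main__":
    for uri in (
        "https://identifiers.org/ensembl/ENSG00000139163",
        "https://identifiers.org/hgnc.symbol/ETNK1",
        "http://www.wikidata.org/entity/Q18041828",
    ):
        print(uri_to_curie(uri))
```

Whether the resulting prefixes match the ones Bioregistry registers would still need to be checked per resource.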

(yes, once we're done with full RDF support in Bioregistry we can add that too)

cc @DeniseSl22 @marvinm2 @ammar257ammar

@DeniseSl22

I believe Bioregistry has something like that in their own database (which I'm currently using for the Kinetic RDF model):

@prefix uniprot:   <https://identifiers.org/uniprot/> . 
@prefix bioregistry: <https://bioregistry.io/oboinowl:> . 

uniprot:P21549 bioregistry:hasDbXref uniprotkb:P21549.

cthoyt commented Jun 12, 2023

There's a lot going on here, let me try to unpack it.

Regarding Egon's original comment, I think what you're trying to say is that for the URI entity https://identifiers.org/ensembl/ENSG00000139163, there are some equivalent entities that have CURIE representations such as "ensembl:ENSG00000139163", "ncbigene:55500", etc.

This is a bit confusing since it combines two logical operations: stating that two entities are equivalent, and stating that an entity has a CURIE representation. Maybe you could instead do something like

https://identifiers.org/ensembl/ENSG00000139163 skos:exactMatch https://bioregistry.io/ncbigene:55500
https://bioregistry.io/ncbigene:55500 <predicate> "ncbigene:55500"

The second thing that's confusing is that the semantics implied by such a predicate would be redundant in connecting https://identifiers.org/ensembl/ENSG00000139163 and "ensembl:ENSG00000139163".

The Bioregistry schema has a lot of predicates for talking about meta stuff. It also links to several other partially overlapping vocabularies like vann, void, sh, and idot (see the turtle). However, it doesn't have predicates for connecting an IRI to a CURIE representation of the IRI like <predicate> did in the example above.
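
As a rough sketch of the two-step pattern described above, the following Python builds the two statements as N-Triples lines: first the `skos:exactMatch` link, then the CURIE literal attached to the matched entity. The predicate IRI `https://example.org/vocab#curie` is a hypothetical placeholder, since no existing predicate for this has been identified here, and the helper name `curie_triples` is likewise invented for this example.

```python
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"
# Hypothetical placeholder: stands in for the unnamed <predicate>.
CURIE_PREDICATE = "https://example.org/vocab#curie"
BIOREGISTRY_BASE = "https://bioregistry.io/"


def curie_triples(subject_uri: str, curie: str) -> list[str]:
    """Return two N-Triples lines: an exact match, then the CURIE literal.

    No literal escaping is done, so this assumes the CURIE contains
    no quotes or backslashes.
    """
    target_uri = BIOREGISTRY_BASE + curie
    return [
        f"<{subject_uri}> <{SKOS_EXACT_MATCH}> <{target_uri}> .",
        f'<{target_uri}> <{CURIE_PREDICATE}> "{curie}" .',
    ]


if __name__ == "__main__":
    lines = curie_triples(
        "https://identifiers.org/ensembl/ENSG00000139163", "ncbigene:55500"
    )
    print("\n".join(lines))
```

Keeping the two statements separate means the equivalence claim and the string representation can each be queried, or dropped, independently.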


@DeniseSl22 I don't think you're using the prefixes correctly in your example

@prefix uniprot:   <https://identifiers.org/uniprot/> . 
@prefix bioregistry: <https://bioregistry.io/oboinowl:> . 

uniprot:P21549 bioregistry:hasDbXref uniprotkb:P21549.

The vocabulary on the second line is oboinowl, so you should call it that:

@prefix uniprot:   <https://identifiers.org/uniprot/> . 
@prefix oboinowl: <https://bioregistry.io/oboinowl:> . 

uniprot:P21549 oboinowl:hasDbXref uniprotkb:P21549.

Second, I'm not sure what the context for uniprotkb is in this example, since it's not defined.
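
To make the undefined-prefix problem concrete, here is a small Python sketch that expands prefixed names against only the two prefixes declared in the snippet above. The function `expand` and the `PREFIXES` dictionary are illustrative names, not an existing API; a real Turtle parser would reject the undeclared prefix in a similar way.

```python
# The two prefixes declared in the snippet; uniprotkb is deliberately absent.
PREFIXES = {
    "uniprot": "https://identifiers.org/uniprot/",
    "oboinowl": "https://bioregistry.io/oboinowl:",
}


def expand(prefixed_name: str, prefixes: dict[str, str]) -> str:
    """Expand 'prefix:local' to a full IRI, failing on undeclared prefixes."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep:
        raise ValueError(f"not a prefixed name: {prefixed_name!r}")
    if prefix not in prefixes:
        raise KeyError(f"undefined prefix: {prefix!r}")
    return prefixes[prefix] + local


if __name__ == "__main__":
    for name in ("uniprot:P21549", "oboinowl:hasDbXref", "uniprotkb:P21549"):
        try:
            print(name, "->", expand(name, PREFIXES))
        except KeyError as err:
            print(name, "-> error:", err)
```

The first two names expand, while `uniprotkb:P21549` fails because nothing says which IRI namespace `uniprotkb` should map to.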

Maybe of interest to you: Bioregistry has a SPARQL service that implements identifier mapping (e.g., so you can avoid materializing redundant definitions of the same entity using multiple URI prefixes). I think it will be easier to finish this discussion over a call sometime this week or next, if you have time.
