missing relationships in AR relations ontology/graph #13016

Closed
ValWood opened this issue Feb 13, 2017 · 9 comments

Comments

@ValWood
Contributor

ValWood commented Feb 13, 2017

My understanding was that directly regulates and its descendants were meant to describe the situation where one gene product's MF directly regulates another, i.e.

[image: sf1]

but

Q. directly regulates does not have the parent direct_input (should it?)

Also, directly regulates and its children are described for use with biological process terms in addition to molecular functions. Presumably this applies only to the situation where x phosphorylates y (as above), in which case you could say

BP protein phosphorylation directly_regulates substrate
(although this would be fully redundant with the MF annotation, and could be inferred from it)

Q. Is "directly regulates" used for BP in any context other than the above?

@ValWood
Contributor Author

ValWood commented Feb 13, 2017

Actually am I even looking at the right thing?

I'm looking at this:
http://www.ebi.ac.uk/QuickGO/AnnotationExtensionRelations.html

But I just forgot about the AE tracker (maybe should have posted there)?

Anyway this ticket
geneontology/annotation_extensions#63

shows very different relations.

@dosumis
Contributor

dosumis commented Feb 13, 2017

Q directly regulates does not have the parent direct_input (should it?)

No, because the regulates relations apply between processes (in the broad sense, so including MF and BP). has_input is a relation between a material entity (e.g. a gene product, a chemical) and a process (broad sense again).

In this case, while biologists may say that gene product X (directly) regulates gene product Y, in GO we record that some specific activity of gene product X (directly) regulates some specific activity of gene product Y.

For LEGO, we have recently agreed to a rule to infer has_input when we record that a gene product enables binding to another gene product. So from this:

GP1 <-enabled_by- [protein binding] -enabled_by-> GP2

... we can infer that the protein binding instance has_input both GP1 and GP2.
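
As a rough illustration of that rule (not the actual LEGO reasoner), the inference can be sketched over plain triples. The tuple layout, the instance name `binding1` and the function name `infer_has_input` are all assumptions made for this example.

```python
# Hypothetical representation: each fact is a (subject, relation, object) tuple,
# and instance_types maps an activity instance to its GO term label.
def infer_has_input(triples, instance_types):
    """Infer has_input for every 'protein binding' instance enabled by a gene product."""
    inferred = set()
    for subj, rel, obj in triples:
        if rel == "enabled_by" and instance_types.get(subj) == "protein binding":
            inferred.add((subj, "has_input", obj))
    return inferred


# GP1 <-enabled_by- [protein binding] -enabled_by-> GP2
triples = {
    ("binding1", "enabled_by", "GP1"),
    ("binding1", "enabled_by", "GP2"),
}
instance_types = {"binding1": "protein binding"}

inferred = infer_has_input(triples, instance_types)
```

Here `inferred` holds one has_input triple per gene product that enables the binding instance.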

Similarly, it might be good to have a compact way to record that a gene product is a substrate for enzymatic modification and that as a result its activity is modified. Consider a kinase cascade.

GP1<-enabled_by-[protein kinase activity]-directly_positively_regulates->[protein kinase activity]-enabled_by->[GP2]

We should be able to infer:

GP1<-enabled_by-[protein kinase activity]-has_substrate*->GP2

  • I'd still like a has_substrate relation. Would like to revisit as part of MF refactoring.
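
A minimal sketch of how the proposed has_substrate inference might look, assuming the same kind of triple representation. The has_substrate relation is only a proposal in this thread, and the set of "direct" regulation relation names plus the helper `infer_has_substrate` are assumptions for illustration.

```python
# Assumed names for the direct regulation relations; not taken from a real API.
DIRECT_REGULATION = {
    "directly_regulates",
    "directly_positively_regulates",
    "directly_negatively_regulates",
}
KINASE = "protein kinase activity"


def infer_has_substrate(triples, instance_types):
    """If a kinase activity directly regulates an activity enabled by GP2,
    infer that the kinase activity has_substrate GP2 (the gene product,
    not the downstream activity itself)."""
    enablers = {}  # activity instance -> gene products that enable it
    for subj, rel, obj in triples:
        if rel == "enabled_by":
            enablers.setdefault(subj, set()).add(obj)

    inferred = set()
    for subj, rel, obj in triples:
        if rel in DIRECT_REGULATION and instance_types.get(subj) == KINASE:
            for gene_product in enablers.get(obj, ()):
                inferred.add((subj, "has_substrate", gene_product))
    return inferred


# GP1 <-enabled_by- [mf1] -directly_positively_regulates-> [mf2] -enabled_by-> GP2
cascade = {
    ("mf1", "enabled_by", "GP1"),
    ("mf1", "directly_positively_regulates", "mf2"),
    ("mf2", "enabled_by", "GP2"),
}
cascade_types = {"mf1": KINASE, "mf2": KINASE}

substrates = infer_has_substrate(cascade, cascade_types)
```

The rule hops from the regulated activity to the gene product that enables it, so the substrate is GP2 rather than the downstream kinase activity.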

CC @thomaspd @cmungall @balhoff

NOTE: In annotation extensions you could record 'has_regulation_target'.

GP1 enables kinase activity directly_positively_regulates 'kinase activity', has_regulation_target** GP2, has_substrate GP2

It would be nice if we could infer the has_substrate bit here too.

(** annotation extension only relation)

@dosumis
Contributor

dosumis commented Feb 13, 2017

Actually am I even looking at the right thing?

Depends what you're trying to do...

Your second comment appears to be about the new qualifiers. Here's a table of relation hierarchies + their domains and ranges.

| ancestor relation | domain | range |
| --- | --- | --- |
| causally upstream of or within | gene product | biological_process |
| causally upstream of | process (MF or BP) | process (MF or BP) |
| has participant | process (MF or BP) | material entity (e.g. chemical, protein, CC, cell) |
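
To make the domain/range idea concrete, here is a small sketch that checks whether a subject and object fit a relation's signature. The domains and ranges are copied from the table; the simplified "kind" hierarchy and the names `KIND_PARENT`, `is_kind` and `fits_signature` are assumptions for this example, not part of any GO tooling.

```python
# Domain and range per ancestor relation, as listed in the table.
SIGNATURES = {
    "causally upstream of or within": ("gene product", "biological_process"),
    "causally upstream of": ("process", "process"),
    "has participant": ("process", "material entity"),
}

# Assumed, highly simplified hierarchy of entity kinds (child -> parent).
KIND_PARENT = {
    "molecular_function": "process",
    "biological_process": "process",
    "gene product": "material entity",
    "chemical": "material entity",
}


def is_kind(kind, expected):
    """Walk up the hierarchy to see whether `kind` is `expected` or a subtype of it."""
    while kind is not None:
        if kind == expected:
            return True
        kind = KIND_PARENT.get(kind)
    return False


def fits_signature(relation, subject_kind, object_kind):
    """True if the subject fits the relation's domain and the object fits its range."""
    domain, range_ = SIGNATURES[relation]
    return is_kind(subject_kind, domain) and is_kind(object_kind, range_)


ok = fits_signature("causally upstream of", "molecular_function", "molecular_function")
bad = fits_signature("causally upstream of", "molecular_function", "gene product")
```

Under these assumptions, a process-to-process relation accepts two molecular functions but rejects a gene product as its object.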

image

image

@dosumis
Contributor

dosumis commented Feb 13, 2017

Also, directly regulates, and children are described for use with biological process terms in addition to molecular functions.

The intent here is, I think, that BP1 directly_regulates BP2 if (and only if):

BP1 regulates BP2
MF1 regulates MF2
MF1 part_of BP1
MF2 part_of BP2

This probably needs some more thought. We certainly don't have the axioms in place to infer this.
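
Purely as a sketch of the four conditions listed above (and with the caveat that they are tentative), the check could be written over triples like this. The function name `bp_directly_regulates` and the triple layout are assumptions; no such axiom exists in the ontology yet.

```python
def bp_directly_regulates(bp1, bp2, triples):
    """Tentative check: BP1 regulates BP2, and some part of BP1 regulates
    some part of BP2. This sketch does not verify that the parts are MFs."""
    if (bp1, "regulates", bp2) not in triples:
        return False
    parts1 = {s for s, r, o in triples if r == "part_of" and o == bp1}
    parts2 = {s for s, r, o in triples if r == "part_of" and o == bp2}
    return any((m1, "regulates", m2) in triples for m1 in parts1 for m2 in parts2)


facts = {
    ("BP1", "regulates", "BP2"),
    ("MF1", "regulates", "MF2"),
    ("MF1", "part_of", "BP1"),
    ("MF2", "part_of", "BP2"),
}

result = bp_directly_regulates("BP1", "BP2", facts)
```

Dropping the MF1 regulates MF2 fact makes the check fail, which is the distinction between plain and direct regulation that the conditions are trying to capture.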

@ValWood
Contributor Author

ValWood commented Feb 13, 2017

OK, I got confused. I am only interested in the scope of relations for annotation extensions at present, not the "new" qualifiers.

I still don't understand your explanation of why, for annotation extensions, "directly regulates" is not an example of "has_substrate". I guess I don't understand how you could use this relationship to extend an annotation with a gene product which is not also a direct substrate of the molecular function that is a descendant of a single step process.

In this context, when we say "directly regulates", don't we mean "directly regulating a molecular function"? If not, what does the "directly" mean?

If not, so that I understand, can you provide an example of the use of this relation where the gene product in the extension is not a "direct_input" (i.e. substrate)?

@dosumis
Contributor

dosumis commented Feb 13, 2017

I guess I don't understand how you could use this relationship to extend an annotation with a gene product which is not also a direct substrate of the molecular function that is a descendant of a single step process.

You are correct. But if you tried to solve this by making has_substrate an ancestor of has_direct_input you would end up inferring that a kinase activity is a substrate when you want to record that the gene product which has the kinase activity is the substrate. This is why I suggested the rule in my first comment. Initially this will only apply in LEGO - but I'd be interested to see if we could find a way to use it for extended annotations too.

@ValWood
Contributor Author

ValWood commented Feb 13, 2017

http://www.ebi.ac.uk/QuickGO/AnnotationExtensionRelations.html

Hmm, we have been using these extensions thus:

wee2 protein tyrosine kinase activity directly_inhibits cdc2
(lots of annotations coming through in our next update)

So it seems that there isn't an extension to specify this (I don't understand the explanations in the docs). We need something which means "directly inhibits the activity of"?

So these relations wouldn't be used in traditional extensions at all? Maybe any extension which is unsuitable for traditional GO annotation should be flagged as such?

Although this seems odd, because the "regulates" extension predates Noctua use I'm sure, and I'm sure lots of people use this in traditional GO annotation?

I'm soooo confused......

@ValWood
Contributor Author

ValWood commented Feb 14, 2017

OK Midori explained the part I don't understand. More later...

@ValWood ValWood closed this as completed Feb 14, 2017
@dosumis
Contributor

dosumis commented Feb 14, 2017

directly_inhibits means directly_inhibits activity of

This is not (or should not be) a legal annotation extension:

wee2 protein tyrosine kinase activity directly_inhibits cdc2

LEGO semantics allows us to capture the following:

In English: wee2 is capable of tyrosine kinase activity that directly inhibits the protein kinase activity of cdc2
(wouldn't it be nice if we could display it like this?!)

In LEGO: [wee2] <-enabled_by*-[protein tyrosine kinase activity]-directly_negatively_regulates->[protein kinase activity]-enabled_by->[cdc2]

Extended GO annotation is not expressive enough for us to capture this completely. We could curate:

  • wee2 (enables*) protein tyrosine kinase activity directly_negatively_regulates 'protein kinase activity', has_regulation_target cdc2, has_substrate cdc2
  • cdc2 (enables) protein kinase activity

Strictly, all this records logically is a bunch of disconnected facts:

  • wee2 has protein tyrosine kinase activity
  • wee2 directly_negatively_regulates the 'protein kinase activity' of some gene product
    • extensions are not expressive enough to explicitly state that it is the kinase activity of cdc2 being regulated
  • wee2 directly_negatively_regulates some activity of cdc2
    • annotation extensions are not expressive enough to specify which activity of wee2 regulates which activity of cdc2
  • cdc2 is a target for the kinase activity of wee2
  • cdc2 has protein kinase activity

A biologist reading this would almost certainly fill in the gaps to make a sentence like the one above. But this is not 100 percent reliable. What if wee2 also regulated some things by binding them? (I'm making this up but it's plausible biology). What if cdc2 had some other activity that wee2 regulated?

* enabled_by/enables are horrible names. I wish we could get rid of them.
