diff --git a/docs/404.html b/docs/404.html new file mode 100644 index 0000000..de2aec7 --- /dev/null +++ b/docs/404.html @@ -0,0 +1,148 @@ + + + +
+ + + + +Obtains the size of a certain number of predefined corpora. The total size of a corpus is important for calculating term frequencies.
-corpus_size(corpus = c("taxon_annotations", "taxa", "gene_annotations", - "genes"))- +
corpus_size(corpus = c("taxon_annotations", "taxa", "gene_annotations", + "genes"))+
the size of the specified corpus as an integer number.
-Corpus sizes are cached per session after they have first been obtained. Thus, if the Phenoscape KB changes, a session needs to be restarted to have those changes be reflected.
-+corpus_size("taxa")#> [1] 752corpus_size("taxon_annotations")#> [1] 805354
corpus_size("taxa") +#> [1] 785corpus_size("taxon_annotations") +#> [1] 809892
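To illustrate why the corpus size matters for term frequencies, here is a minimal sketch in base R. The corpus size is the value printed for "taxa" in the example output above; the annotation count `n_annotated` is a made-up number for illustration, not a KB result, and the information-content step is one common use of a term frequency rather than something this page documents.

```r
# Corpus size as printed for corpus_size("taxa") in the example output.
n_corpus <- 785

# Hypothetical number of taxa annotated with some term (NOT a real KB value).
n_annotated <- 157

# Term frequency: the share of the corpus in which the term occurs.
term_freq <- n_annotated / n_corpus
term_freq  # 0.2

# Assumption: information content is derived as the negative log frequency.
ic <- -log(term_freq)
ic
```

Because the frequency is a ratio against the total, a stale cached corpus size would skew every frequency derived from it, which is why the caching behaviour is worth knowing about.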
Searches the KB for terms matching the given text, and returns the result(s) as a data frame (see Value).
-find_term(query, type = NA, definedBy = quote(!is.na(.)), - matchBy = NA, matchTypes = NA, nomatch = NA, limit = 100, - verbose = FALSE)- +
find_term(query, type = NA, definedBy = quote(!is.na(.)), + matchBy = NA, matchTypes = NA, nomatch = NA, limit = 100, + verbose = FALSE)+
matchTypes | character, the types of matches (exact, partial, broad) to
accept. By default (value NA), no filtering by match type is performed.
-Use |
+Use
---|---|
nomatch | @@ -171,46 +178,42 @@logical; whether to issue a message if no matches found |
A data frame with columns "id", "label", "isDefinedBy", and "matchType" and one row for each term match.
-Matches can be filtered by type of term (class, property, etc), ontology in which the term is defined, and by type of match (exact, partial, etc). The term properties considered for matching can also be controlled.
-Returns the date of the current release of the Phenoscape Knowledgebase (KB) and counts of annotated matrices, taxa, phenotypes, characters, and states.
-get_KBinfo() ++get_KBinfo() # S3 method for KBinfo -print(x, ..., tz = "")- +print(x, ..., tz = "")
A list of class "KBinfo" with summary statistics of annotation counts and other KB metadata (specifically, a timestamp for the current KB build).
-+#> [1] "annotated_taxa" "annotated_characters" "annotated_matrices" -#> [4] "annotated_states" "build_time"kbmeta#> Annotated taxa: 6118 -#> Annotated characters: 12724 -#> Annotated matrices: 217 -#> Annotated states: 26407 -#> Build time: 2019-12-03 15:19:58 EST
- + + + diff --git a/docs/reference/get_data.html b/docs/reference/get_data.html index cbc4472..9341056 100644 --- a/docs/reference/get_data.html +++ b/docs/reference/get_data.html @@ -8,21 +8,29 @@#> [1] "annotated_taxa" "annotated_characters" "annotated_matrices" +#> [4] "annotated_states" "build_time"kbmeta +#> Annotated taxa: 6206 +#> Annotated characters: 13364 +#> Annotated matrices: 230 +#> Annotated states: 26905 +#> Build time: 2020-04-08 16:37:21 EDT
Obtains and parses data from JSON, CSV, and NeXML-returning API endpoints, respectively.
-get_json_data(url, query, verbose = FALSE, ensureNames = NULL) +-get_json_data(url, query, verbose = FALSE, ensureNames = NULL, + forceGET = FALSE) + +get_csv_data(url, query, ..., verbose = FALSE, forceGET = FALSE) -get_csv_data(url, query, ..., verbose = FALSE) +get_nexml_data(url, query, verbose = FALSE, forceGET = FALSE)-get_nexml_data(url, query, verbose = FALSE)
character, which column or list names to ensure are included in the result to be returned. If the result returned by the API endpoint does not include them, they will be added, with NA values. |
+
+ |
forceGET | +logical, whether to force using the HTTP GET method for API +access regardless of request length. The default is FALSE, meaning queries +exceeding a certain size for transmission will automatically use HTTP POST +for accessing the KB API. |
---|---|
... | for |
+read.csv()
For get_json_data
, a data frame or list, depending on the result of
-jsonlite::fromJSON()
.
jsonlite::fromJSON()
.
For get_csv_data
, a data frame.
For get_nexml_data
, a nexml object.
These are package-internal functions.
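The `forceGET` argument described above can be pictured with a small sketch of how a client might choose between GET and POST by request length. Everything here is an assumption for illustration: `choose_http_method` and the `max_chars` threshold of 2048 are hypothetical and are not taken from the package, whose actual cutoff and logic may differ.

```r
# Hypothetical helper; rphenoscape's real threshold and logic may differ.
choose_http_method <- function(query, forceGET = FALSE, max_chars = 2048L) {
  # URL-encode each query value (each is assumed to be a single string or number).
  values <- vapply(
    query,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  # Serialize as key=value pairs joined by "&".
  encoded <- paste0(names(query), "=", values, collapse = "&")
  # Use GET when forced or when the encoded query is short enough.
  if (forceGET || nchar(encoded) <= max_chars) "GET" else "POST"
}

short_query <- list(text = "pelvic fin", limit = 100)
long_query <- list(
  iris = paste(rep("http://purl.obolibrary.org/obo/UBERON_0011618", 100),
               collapse = ",")
)

choose_http_method(short_query)                   # "GET"
choose_http_method(long_query)                    # "POST"
choose_http_method(long_query, forceGET = TRUE)   # "GET"
```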
- - + + + diff --git a/docs/reference/get_phenotypes.html b/docs/reference/get_phenotypes.html index 2e16f4e..bb18fdf 100644 --- a/docs/reference/get_phenotypes.html +++ b/docs/reference/get_phenotypes.html @@ -8,21 +8,29 @@Retrieves "semantic phenotypes", i.e., phenotypes encoded as ontological
expressions. Filtering is possible by anatomical entity (optionally including
entities related by certain properties, see includeRels
), phenotypic
quality, taxonomic group where the phenotypes have been recorded, and study
(a.k.a. publication).
get_phenotypes(entity = NA, quality = NA, taxon = NA, study = NA, - includeRels = c("part of"), .withTaxon = FALSE, verbose = FALSE)- +
get_phenotypes(entity = NA, quality = NA, taxon = NA, study = NA, + includeRels = c("part of"), .withTaxon = FALSE, verbose = FALSE)+
A data frame with columns "id" and "label".
@@ -176,82 +183,77 @@Entity, quality, and taxon can be given as IRI or as name (i.e., term label). In the latter case, names will be resolved to IRIs against anatomy ontologies, PATO, and taxonomy ontologies, respectively. Warnings will be issued if only a partial match can be found. The study must be given as IRI.
-# NOT RUN { -phens1 <- get_phenotypes(entity = "pelvic fin") -head(phens1) +- + + + diff --git a/docs/reference/get_term_label.html b/docs/reference/get_term_label.html index b4c82c5..2d325bc 100644 --- a/docs/reference/get_term_label.html +++ b/docs/reference/get_term_label.html @@ -8,21 +8,29 @@if (FALSE) { +phens1 <- get_phenotypes(entity = "pelvic fin") +head(phens1) # by default, parts are already included -phens2 <- get_phenotypes(entity = "pelvic fin", includeRels = c("part")) -nrow(phens1) == nrow(phens2) -table(phens2$id %in% phens1$id) +phens2 <- get_phenotypes(entity = "pelvic fin", includeRels = c("part")) +nrow(phens1) == nrow(phens2) +table(phens2$id %in% phens1$id) # but historical homologues are not -phens2 <- get_phenotypes(entity = "pelvic fin", includeRels = c("part", "hist")) -table(phens2$id %in% phens1$id) +phens2 <- get_phenotypes(entity = "pelvic fin", includeRels = c("part", "hist")) +table(phens2$id %in% phens1$id) # neither are serially homologous -phens2 <- get_phenotypes(entity = "pelvic fin", includeRels = TRUE) -table(phens2$id %in% phens1$id) +phens2 <- get_phenotypes(entity = "pelvic fin", includeRels = TRUE) +table(phens2$id %in% phens1$id) # filter also by quality -phens2 <- get_phenotypes(entity = "pelvic fin", quality = "shape") -table(phens1$id %in% phens2$id) +phens2 <- get_phenotypes(entity = "pelvic fin", quality = "shape") +table(phens1$id %in% phens2$id) # filter also by quality and taxon -phens2 <- get_phenotypes(entity = "pelvic fin", quality = "shape", taxon = "Siluriformes") -table(phens1$id %in% phens2$id) +phens2 <- get_phenotypes(entity = "pelvic fin", quality = "shape", taxon = "Siluriformes") +table(phens1$id %in% phens2$id) # filter by entity, quality and taxon, and return taxa as well (resulting in # (phenotype, taxon) "tuples") -phens2a <- get_phenotypes(entity = "pelvic fin", quality = "shape", taxon = "Siluriformes", - .withTaxon = TRUE) -head(phens2a) -nrow(phens2a) - nrow(phens2) # lots of redundancy due 
to n:n relationship -nrow(unique(phens2a[,c("id", "label")])) == nrow(phens2) # but same #phenotypes +phens2a <- get_phenotypes(entity = "pelvic fin", quality = "shape", taxon = "Siluriformes", + .withTaxon = TRUE) +head(phens2a) +nrow(phens2a) - nrow(phens2) # lots of redundancy due to n:n relationship +nrow(unique(phens2a[,c("id", "label")])) == nrow(phens2) # but same #phenotypes # can compute and visualize similarity -sm <- jaccard_similarity(terms = phens2$id, .labels = phens2$label, .colnames = "label") -plot(hclust(as.dist(1-sm))) -# } +sm <- jaccard_similarity(terms = phens2$id, .labels = phens2$label, .colnames = "label") +plot(hclust(as.dist(1-sm))) +} +Obtains the labels for a list of terms — get_term_label • rphenoscape + - + - - + + + + + + + - + + - + - - + + + @@ -30,12 +38,12 @@ - + - + @@ -49,9 +57,10 @@ + - ++@@ -73,7 +82,7 @@ - @@ -110,15 +119,13 @@Obtains the labels for a list of terms
--Attempts to obtain the label for each term, identified by IRI, in the input list. Terms for which no label is found in the database will have NA as the label in the result (see Value).
-get_term_label(term_iris, preserveOrder = FALSE, verbose = FALSE)- +get_term_label(term_iris, preserveOrder = FALSE, verbose = FALSE)+Arguments
A data.frame with columns "id" and "label". The "id" column contains
the IRIs. The label will be NA
for term IRIs that are not present in the
KB, or for which the KB cannot produce a label.
nexml_drop_otu
drops OTUs (taxa) from a nexml object.
Currently none of the OTUs to be dropped can be used in a tree that's part
of the nexml
object. If they are, first drop the OTUs from the tree(s)
-(for example, using drop.tip() from package "ape"), then
+(for example, using drop.tip() from package "ape"), then
replace the tree(s). Dropping OTUs will not only drop them from the OTUs
block, but will also drop matrix rows that referenced the OTUs to be dropped.
This may in turn leave some characters unused. Therefore, it is recommended
@@ -140,18 +148,17 @@
is_unused_otu
is a filter function for nexml_drop_otu
for dropping
unused OTUs.
nexml_drop_otu(nexml, filter, at = NA, block = 1, ...) +-nexml_drop_otu(nexml, filter, at = NA, block = 1, ...) + +nexml_drop_char(nexml, filter, at = NA, block = 1, ..., + pruneStates = TRUE, pruneRows = TRUE) -nexml_drop_char(nexml, filter, at = NA, block = 1, ..., - pruneStates = TRUE, pruneRows = TRUE) +is_unused_char(charList, ...) -is_unused_char(charList, ...) +is_unused_otu(otuList, ...)-is_unused_otu(otuList, ...)
The functions for dropping components return a nexml @@ -230,12 +237,12 @@
ignoreTrees = TRUE
, for a node of a tree, and FALSE
otherwise.
-
- + + + diff --git a/docs/reference/obo_prefix.html b/docs/reference/obo_prefix.html index 8f16504..59fabdd 100644 --- a/docs/reference/obo_prefix.html +++ b/docs/reference/obo_prefix.html @@ -8,21 +8,29 @@nex <- RNeXML::nexml_read(system.file("examples", "ontotrace-result.xml", package = "rphenoscape")) ++nexml_drop_char(nex, + filter = function(x) + !pk_is_descendant("paired fin", x, includeRels = "part_of"), + at = "obo:IAO_0000219") %>% + nexml_drop_otu(filter = is_unused_otu) +} +nex <- RNeXML::nexml_read(system.file("examples", "ontotrace-result.xml", package = "rphenoscape")) # drop by label matching -nexml_drop_otu(nex, filter = function(x) grepl(" sp.", x), at = "label")#> A nexml object representing: +nexml_drop_otu(nex, filter = function(x) grepl(" sp.", x), at = "label") +#> A nexml object representing: #> 1 phylogenetic tree block(s), where: #> block 1 contains 0 phylogenetic tree(s) #> 1 character block(s), where: @@ -260,15 +267,15 @@Examp #> NeXML generated by RNeXML using schema version: 0.9 #> Size: 209.6 Kb
#> -#>#>-#> -#> +#> +#>#>#> #>#>#> -#>nexml_drop_char(nex, filter = function(x) grepl("pelvic", x), at = "label") %>% - nexml_drop_otu(filter = is_unused_otu)#> A nexml object representing: +#>nexml_drop_char(nex, filter = function(x) grepl("pelvic", x), at = "label") %>% + nexml_drop_otu(filter = is_unused_otu) +#> A nexml object representing: #> 1 phylogenetic tree block(s), where: #> block 1 contains 0 phylogenetic tree(s) #> 1 character block(s), where: @@ -294,48 +301,47 @@Examp #> #> NeXML generated by RNeXML using schema version: 0.9 #> Size: 205.8 Kb
-# NOT RUN { -nex <- pk_get_ontotrace_xml(taxon = "Ictaluridae", - entity = "fin", variable_only = FALSE) +if (FALSE) { +nex <- pk_get_ontotrace_xml(taxon = "Ictaluridae", + entity = "fin", variable_only = FALSE) # ontotrace results store VTO IRIs in dwc:taxonID annotations: -nexml_drop_otu(nex, - filter = function(x) !pk_is_descendant("Ictalurus", x), - at = "dwc:taxonID") %>% - nexml_drop_char(filter = is_unused_char) +nexml_drop_otu(nex, + filter = function(x) !pk_is_descendant("Ictalurus", x), + at = "dwc:taxonID") %>% + nexml_drop_char(filter = is_unused_char) # anatomy IRIs are in obo:IAO_0000219 ("denotes") annotations: -nexml_drop_char(nex, - filter = function(x) - !pk_is_descendant("paired fin", x, includeRels = "part_of"), - at = "obo:IAO_0000219") %>% - nexml_drop_otu(filter = is_unused_otu) -# }
Extract the OBO ontology prefix from IRIs
-obo_prefix(x)- +
obo_prefix(x)+
a list or vector of IRIs, and/or objects that have an "id" key. |
A character vector of the same length as the input vector or list, with NA in the positions where extracting the OBO ontology prefix failed.
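As a rough picture of the behaviour described above, the following base R sketch extracts the prefix with a regular expression and yields NA where the pattern does not apply. It is not the package's implementation: `sketch_obo_prefix` and its pattern are assumptions for illustration, and the sketch handles only character vectors of IRIs, not objects with an "id" key.

```r
# Hypothetical re-implementation for illustration only.
sketch_obo_prefix <- function(iris) {
  # Assumed IRI shape: OBO PURL base, alphabetic prefix, underscore, digits.
  pattern <- "^http://purl\\.obolibrary\\.org/obo/([A-Za-z]+)_[0-9]+$"
  # Keep the captured prefix where the pattern matches, NA elsewhere.
  ifelse(grepl(pattern, iris), sub(pattern, "\\1", iris), NA_character_)
}

tt <- c("http://purl.obolibrary.org/obo/UBERON_0011618",
        "http://purl.obolibrary.org/obo/NCBITaxon_7955",
        "http://foo")
sketch_obo_prefix(tt)  # "UBERON" "NCBITaxon" NA
```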
-- + + + diff --git a/docs/reference/pa_dep_matrix.html b/docs/reference/pa_dep_matrix.html index 353bd04..e4010f5 100644 --- a/docs/reference/pa_dep_matrix.html +++ b/docs/reference/pa_dep_matrix.html @@ -8,21 +8,29 @@tt <- c("http://purl.obolibrary.org/obo/UBERON_0011618", ++ "http://purl.obolibrary.org/obo/NCBITaxon_7955") +obo_prefix(tt) +tt <- c("http://purl.obolibrary.org/obo/UBERON_0011618", "http://purl.obolibrary.org/obo/PATO_0002279", "http://purl.obolibrary.org/obo/VTO_0071642", "http://purl.obolibrary.org/obo/MP_0030825", - "http://purl.obolibrary.org/obo/NCBITaxon_7955") -obo_prefix(tt)#> [1] "UBERON" "PATO" "VTO" "MP" "NCBITaxon"#> [1] "UBERON" "PATO" "VTO" "MP" "NCBITaxon"
Obtains a presence-absence dependency matrix for the given set of terms. The resulting matrix M will have values 1 and 0, where M[i,j] = 1 iff the presence of term i implies the presence of term j. Note that it follows that @@ -122,12 +130,11 @@
pa_dep_matrix(terms, .names = c("ID", "IRI", "label"), .labels = NULL, - preserveOrder = FALSE, verbose = FALSE)- +
pa_dep_matrix(terms, .names = c("ID", "IRI", "label"), .labels = NULL, + preserveOrder = FALSE, verbose = FALSE)+
A data.frame M with M[i,j] = 1 iff the presence of term i implies the @@ -180,51 +187,49 @@
term.iris
, giving the term IRIs for the rows (and columns).
Note that these extra attributes will be lost upon subsetting the returned
matrix.
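The meaning of M[i,j] can be made concrete with a toy example that needs no KB access. The three-term hierarchy below is hypothetical, chosen only so that the presence of a term implies the presence of each structure above it; the KB derives such dependencies from ontology reasoning, which this sketch does not attempt to reproduce.

```r
# Hypothetical toy hierarchy: each term points to the term whose presence it implies.
parent <- c("pelvic fin ray" = "pelvic fin",
            "pelvic fin"     = "paired fin",
            "paired fin"     = NA)
terms <- names(parent)

# TRUE if walking up from term i reaches term j (a term trivially implies itself).
implies <- function(i, j) {
  cur <- i
  while (!is.na(cur)) {
    if (cur == j) return(TRUE)
    cur <- parent[[cur]]
  }
  FALSE
}

# Rows are i, columns are j: M[i, j] = 1 iff presence of i implies presence of j.
M <- t(sapply(terms, function(i) {
  sapply(terms, function(j) as.integer(implies(i, j)))
}))
M
```

In this toy matrix the entry for row "pelvic fin ray" and column "paired fin" is 1, while the reverse entry is 0, so the matrix is generally not symmetric.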
-
# NOT RUN { -tl <- c("http://purl.obolibrary.org/obo/UBERON_0000981", +- + + + diff --git a/docs/reference/phenotype.html b/docs/reference/phenotype.html index dcf85b9..2ec58df 100644 --- a/docs/reference/phenotype.html +++ b/docs/reference/phenotype.html @@ -8,21 +8,29 @@if (FALSE) { +tl <- c("http://purl.obolibrary.org/obo/UBERON_0000981", "http://purl.obolibrary.org/obo/UBERON_0002103", "http://purl.obolibrary.org/obo/UBERON_0000976", - "http://purl.obolibrary.org/obo/UBERON_0002102") -m <- pa_dep_matrix(tl) -m # term IDs as row and column names -id_prefixes <- attr(m, "prefixes") -id_prefixes # 4x "http://purl.obolibrary.org/obo/" - -m <- pa_dep_matrix(tl, .names = "label") -m # term labels as row and column names -mat_terms <- attr(m, "term.iris") -mat_terms # term IRIs in the same order as rows (and columns) -# } + "http://purl.obolibrary.org/obo/UBERON_0002102") +m <- pa_dep_matrix(tl) +m # term IDs as row and column names +id_prefixes <- attr(m, "prefixes") +id_prefixes # 4x "http://purl.obolibrary.org/obo/" + +m <- pa_dep_matrix(tl, .names = "label") +m # term labels as row and column names +mat_terms <- attr(m, "term.iris") +mat_terms # term IRIs in the same order as rows (and columns) +} +Phenotype Objects — phenotype • rphenoscape + - + - - + + + + + + + - + + - + - - + + + @@ -30,8 +38,8 @@ - + - + @@ -60,9 +68,10 @@ + - ++@@ -84,7 +93,7 @@ - @@ -121,7 +130,6 @@Phenotype Objects
--
as.phenotype
creates an object (or a list of objects) of type "phenotype". The object to be coerced can be a character vector (of IRIs), or a data.frame. In the latter case, there must be a column "id" with the IRIs of phenotypes. @@ -136,25 +144,24 @@Phenotype Objects
chars
extracts the (non-redundant) characters from the phenotype object (or an object coercible to phenotype).-
as.phenotype(x, withTaxa = FALSE, ...) ++as.phenotype(x, withTaxa = FALSE, ...) # S3 method for data.frame -as.phenotype(x, ...) +as.phenotype(x, ...) -is.phenotype(x) +is.phenotype(x) -is_valid_phenotype(x) +is_valid_phenotype(x) -charstates(x) +charstates(x) -chars(x) +chars(x) # S3 method for phenotype -print(x, ...)- +print(x, ...)Arguments
additional parameters where applicable; ignored for printing |
as.phenotype
returns an object of type "phenotype", or a list of such objects
@@ -202,18 +209,19 @@
chars
returns a data.frame with columns "character.id" and "character.label"
(IRI and label of the character), and "study.id" and "study.label" (IRI and
short label for the study to which the character and state belong).
Create and test objects of type "phenotype", and extract properties from them.
-+obj <- as.phenotype("http://foo") +is.phenotype(obj) +is_valid_phenotype(obj) +} + - + + + diff --git a/docs/reference/phenotype_matches.html b/docs/reference/phenotype_matches.html index 07a9955..aad2514 100644 --- a/docs/reference/phenotype_matches.html +++ b/docs/reference/phenotype_matches.html @@ -8,21 +8,29 @@# query for a set of phenotypes (IDs and their labels) -phens <- get_phenotypes(entity = "basihyal bone") -nrow(phens)#> [1] 19#> [1] "phenotype" "list"obj#> Phenotype 'basihyal bone absent' +phens <- get_phenotypes(entity = "basihyal bone") +nrow(phens) +#> [1] 20#> [1] "phenotype" "list"obj +#> Phenotype 'basihyal bone absent' #> Linked to states: #> label character.label study.label #> 1 states 0, 1 and 2 Basihyal Imamura et al. (2005) @@ -230,7 +238,8 @@Examp #> #> No information about taxa exhibiting this phenotype.
# optionally include taxa exhibiting the phenotype -as.phenotype(phens[3,], withTaxa = TRUE)#> Phenotype 'basihyal bone absent' +as.phenotype(phens[3,], withTaxa = TRUE) +#> Phenotype 'basihyal bone absent' #> Linked to states: #> label character.label study.label #> 1 states 0, 1 and 2 Basihyal Imamura et al. (2005) @@ -252,7 +261,8 @@Examp #> 4 http://purl.obolibrary.org/obo/VTO_0052740 Hemitripterus villosus #> 5 http://purl.obolibrary.org/obo/VTO_0052539 Liparis agassizii #> [ reached 'max' / getOption("max.print") -- omitted 6 rows ]
# full list of taxa: -as.phenotype(phens[3,], withTaxa = TRUE)$taxa#> id label +as.phenotype(phens[3,], withTaxa = TRUE)$taxa +#> id label #> 1 http://purl.obolibrary.org/obo/VTO_0035423 Corydoras rabauti #> 2 http://purl.obolibrary.org/obo/VTO_0052870 Enophrys diceraus #> 3 http://purl.obolibrary.org/obo/VTO_0071753 Eumicrotremus asperrimus @@ -265,8 +275,12 @@Examp #> 10 http://purl.obolibrary.org/obo/VTO_0034991 Siluriformes #> 11 http://purl.obolibrary.org/obo/VTO_0036124 Silurus glanis
#> [1] "list"length(objs)#> [1] 19#> [1] TRUEobjs[[3]]#> [1] "list"#> [1] 20#> [1] TRUEobjs[[3]] +#> Phenotype 'basihyal bone absent' #> Linked to states: #> label character.label study.label #> 1 states 0, 1 and 2 Basihyal Imamura et al. (2005) @@ -283,86 +297,76 @@Examp #> #> No information about taxa exhibiting this phenotype.
# extract character states and (non-redundant) characters -charstates(obj)#> id -#> 1 http://purl.org/phenoscape/uuid/aad2b9ab-d755-41c3-b0d4-07828af709bd -#> 2 http://purl.org/phenoscape/uuid/d432b4d8-f351-40bc-8dde-bf14aaa1cbd1 -#> 3 http://purl.org/phenoscape/uuid/302e3faf-9ac3-40c7-936a-047249b3c8cf -#> 4 http://purl.org/phenoscape/uuid/559715cf-1fcf-48e0-bd5f-0f4d3f06fded +charstates(obj) +#> id +#> 1 http://purl.org/phenoscape/uuid/62b5a920-8b49-42be-86f9-baab28a601fc +#> 2 http://purl.org/phenoscape/uuid/eb9a3385-bd7f-46a8-ac8f-ac109e8e0e3c +#> 3 http://purl.org/phenoscape/uuid/78fa240f-904b-4018-89e9-07c1550539fc +#> 4 http://purl.org/phenoscape/uuid/5b7b3d47-890a-4bf7-be90-61417c3d1bf3 #> label #> 1 states 0, 1 and 2 #> 2 absent (unordered) #> 3 Absent #> 4 basihyal absent #> character.id -#> 1 http://purl.org/phenoscape/uuid/1d8251ad-2454-4d7a-8b59-f4a425332003 -#> 2 http://purl.org/phenoscape/uuid/1d8251ad-2454-4d7a-8b59-f4a425332003 -#> 3 http://purl.org/phenoscape/uuid/6a9cc2f1-88c0-419a-8bca-0bb3918e7bb0 -#> 4 http://purl.org/phenoscape/uuid/b54a1da4-4e8f-4495-8c29-6a05d4d7c6bb -#> character.label -#> 1 Basihyal -#> 2 Basihyal -#> 3 Basihyal bone -#> 4 Basihyal -#> study.id -#> 1 http://dx.doi.org/10.1007/s10228-005-0282-6 -#> 2 http://dx.doi.org/10.1007/s10228-005-0282-6 -#> 3 http://specifyassets.nhm.ku.edu/Ichthyology/originals/sp68693518805352821487.att.pdf -#> 4 https://scholar.google.com/scholar?q=Higher-level+Phylogeny+of+Siluriformes%2C+With+a+New+Classification+of+the+Order+%28Teleostei%2C+Ostariophysi%29&btnG=&hl=en&as_sdt=0%2C42 +#> 1 http://purl.org/phenoscape/uuid/44a9193e-4da7-45ba-9eae-6ca060579da2 +#> 2 http://purl.org/phenoscape/uuid/44a9193e-4da7-45ba-9eae-6ca060579da2 +#> 3 http://purl.org/phenoscape/uuid/46cd66db-2454-4304-bf12-9a63aa7421ba +#> 4 http://purl.org/phenoscape/uuid/cedb8568-c69e-45b1-907c-d9a0bc677f6e +#> character.label study.id +#> 1 Basihyal http://dx.doi.org/10.1007/s10228-005-0282-6 +#> 2 Basihyal 
http://dx.doi.org/10.1007/s10228-005-0282-6 +#> 3 Basihyal bone http://dx.doi.org/10.11646/zootaxa.2877.1.1 +#> 4 Basihyal https://ci.nii.ac.jp/naid/10012505149/ #> study.label #> 1 Imamura et al. (2005) #> 2 Imamura et al. (2005) #> 3 Mabee et al. (2011) -#> 4 De Pinna (1993)chars(obj)#> character.id -#> 1 http://purl.org/phenoscape/uuid/1d8251ad-2454-4d7a-8b59-f4a425332003 -#> 3 http://purl.org/phenoscape/uuid/6a9cc2f1-88c0-419a-8bca-0bb3918e7bb0 -#> 4 http://purl.org/phenoscape/uuid/b54a1da4-4e8f-4495-8c29-6a05d4d7c6bb -#> character.label -#> 1 Basihyal -#> 3 Basihyal bone -#> 4 Basihyal -#> study.id -#> 1 http://dx.doi.org/10.1007/s10228-005-0282-6 -#> 3 http://specifyassets.nhm.ku.edu/Ichthyology/originals/sp68693518805352821487.att.pdf -#> 4 https://scholar.google.com/scholar?q=Higher-level+Phylogeny+of+Siluriformes%2C+With+a+New+Classification+of+the+Order+%28Teleostei%2C+Ostariophysi%29&btnG=&hl=en&as_sdt=0%2C42 +#> 4 De Pinna (1993)chars(obj) +#> character.id +#> 1 http://purl.org/phenoscape/uuid/44a9193e-4da7-45ba-9eae-6ca060579da2 +#> 3 http://purl.org/phenoscape/uuid/46cd66db-2454-4304-bf12-9a63aa7421ba +#> 4 http://purl.org/phenoscape/uuid/cedb8568-c69e-45b1-907c-d9a0bc677f6e +#> character.label study.id +#> 1 Basihyal http://dx.doi.org/10.1007/s10228-005-0282-6 +#> 3 Basihyal bone http://dx.doi.org/10.11646/zootaxa.2877.1.1 +#> 4 Basihyal https://ci.nii.ac.jp/naid/10012505149/ #> study.label #> 1 Imamura et al. (2005) #> 3 Mabee et al. (2011) #> 4 De Pinna (1993)-# NOT RUN { +if (FALSE) { # IDs that don't resolve still yield an object, but is not valid -obj <- as.phenotype("http://foo") -is.phenotype(obj) -is_valid_phenotype(obj) -# }
Determines which of one or more phenotype IDs match the given filter.
-phenotype_matches(x, studies)- +
phenotype_matches(x, studies)+
a logical vector of the same length as the vector of phenotypes, with TRUE as element if the corresponding phenotype matches, and FALSE otherwise.
-At present, the only supported query filter is a list of studies (as their IDs). A phenotype matches if it is linked to at least one of the studies.
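The matching rule just described (linked to at least one of the given studies) can be sketched on toy data without querying the KB. The phenotype and study identifiers below are invented placeholders, and the lookup list stands in for whatever the KB would return; this is not how the package itself obtains the links.

```r
# Hypothetical phenotype -> linked study IDs (placeholders, not KB data).
phenotype_studies <- list(
  "phenotype:A" = c("study:1", "study:2"),
  "phenotype:B" = c("study:3"),
  "phenotype:C" = character(0)
)

# The filter: a set of study IDs.
filter_studies <- c("study:2", "study:4")

# A phenotype matches if any of its linked studies is in the filter.
matches <- vapply(phenotype_studies,
                  function(s) any(s %in% filter_studies),
                  logical(1))
unname(matches)  # TRUE FALSE FALSE
```

The result has one element per phenotype, in input order, which mirrors the shape of the logical vector described under Value.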
-+#> [1] 19# which of these are in the same study or studies as the first one? -phenotype_matches(x, pk_get_study_list(phenotype = x$id[1]))#> [1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE -#> [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
- + + + diff --git a/docs/reference/pk_class.html b/docs/reference/pk_class.html index 40e9da4..4b25157 100644 --- a/docs/reference/pk_class.html +++ b/docs/reference/pk_class.html @@ -8,21 +8,29 @@#> [1] 20# which of these are in the same study or studies as the first one? +phenotype_matches(x, pk_get_study_list(phenotype = x$id[1])) +#> [1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE +#> [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Return direct superclasses, direct subclasses, and equivalent classes of a given term
-pk_taxon_class(x, verbose = TRUE) +-pk_taxon_class(x, verbose = TRUE) -pk_anatomical_class(x, verbose = TRUE) +pk_anatomical_class(x, verbose = TRUE) + +pk_phenotype_class(x, verbose = TRUE)-pk_phenotype_class(x, verbose = TRUE)
logical: optional. If TRUE (default), informative messages are printed. |
A list containing data.frame
- - + + + diff --git a/docs/reference/pk_get_iri.html b/docs/reference/pk_get_iri.html index ace6def..1661202 100644 --- a/docs/reference/pk_get_iri.html +++ b/docs/reference/pk_get_iri.html @@ -8,21 +8,29 @@Finds the term matching the query text, and returns its IRI. If the query text is already an IRI, it is returned as is.
-pk_get_iri(text, as, exactOnly = FALSE, nomatch = NA, - verbose = FALSE)- +
pk_get_iri(text, as, exactOnly = FALSE, nomatch = NA, + verbose = FALSE)+
The IRI if a match is found.
-This uses find_term to find matches, and assumes that the term of interest is a class. If there is an exact match, its IRI will be returned. If there isn't, by default partial, and as a last resort broad, matches will also be considered, although this will result in a warning.
- - + + + diff --git a/docs/reference/pk_get_ontotrace_xml.html b/docs/reference/pk_get_ontotrace_xml.html index 9967489..298c768 100644 --- a/docs/reference/pk_get_ontotrace_xml.html +++ b/docs/reference/pk_get_ontotrace_xml.html @@ -8,21 +8,29 @@Queries the Phenoscape KB for a synthetic presence/absence character matrix for the given taxa and anatomical entities, and returns the result as a nexml object (from the RNeXML package).
-pk_get_ontotrace_xml(taxon, entity, relation = "part of", - variable_only = TRUE, strict = TRUE)- +
pk_get_ontotrace_xml(taxon, entity, relation = "part of", + variable_only = TRUE, strict = TRUE)+
RNeXML::nexml object
-The character matrix includes both asserted and logically inferred states. The @@ -167,64 +173,60 @@
relation
for changing
this. By default, only characters that are variable across the resulting taxa
are included; use variable_only
to change this.
-
# NOT RUN { +- + + + diff --git a/docs/reference/pk_get_study.html b/docs/reference/pk_get_study.html index 661ce1b..a0e28e6 100644 --- a/docs/reference/pk_get_study.html +++ b/docs/reference/pk_get_study.html @@ -8,21 +8,29 @@if (FALSE) { # one taxon (including subclasses), one entity (including subclasses and # by default its parts) -nex <- pk_get_ontotrace_xml(taxon = "Ictalurus", entity = "fin") +nex <- pk_get_ontotrace_xml(taxon = "Ictalurus", entity = "fin") # same as above, except do not include parts or other relationships (fin # presence/absence does not vary across Ictalurus, hence need to allow # non-variable characters) -nex <- pk_get_ontotrace_xml(taxon = "Ictalurus", entity = "fin", - relation = NA, variable_only = FALSE) +nex <- pk_get_ontotrace_xml(taxon = "Ictalurus", entity = "fin", + relation = NA, variable_only = FALSE) # instead of parts, include entities in develops_from relationship to the query entity -nex <- pk_get_ontotrace_xml(taxon = "Ictalurus", entity = "paired fin bud", - relation = "develops from", variable_only = FALSE) +nex <- pk_get_ontotrace_xml(taxon = "Ictalurus", entity = "paired fin bud", + relation = "develops from", variable_only = FALSE) # query with multiple taxa, and/or multiple entities: -nex <- pk_get_ontotrace_xml(taxon = c("Ictalurus", "Ameiurus"), - entity = c("pectoral fin", "pelvic fin")) +nex <- pk_get_ontotrace_xml(taxon = c("Ictalurus", "Ameiurus"), + entity = c("pectoral fin", "pelvic fin")) # Use the RNeXML API to obtain the character matrix etc: -m <- RNeXML::get_characters(nex) -dim(m) # number of taxa and characters -rownames(m) # taxon names -colnames(m) # characters (entity names) +m <- RNeXML::get_characters(nex) +dim(m) # number of taxa and characters +rownames(m) # taxon names +colnames(m) # characters (entity names) -# } +} +pk_get_study — pk_get_study • rphenoscape + - + - - + + + + + + + - + + - + - - + + + @@ -30,10 +38,10 @@ - + - + @@ -47,9 +55,10 @@ + - ++@@ -71,7 +80,7 @@ - @@ -108,13 +117,11 
@@pk_get_study
--pk_get_study
-pk_get_study(nexmls)- +pk_get_study(nexmls)+Arguments
a list of nexml objects |
A list of data.frames containing matrices
-# NOT RUN { -slist <- pk_get_study_list(taxon = "Ameiurus", entity = "pelvic splint") -nex_list <- pk_get_study_xml(slist$id) # get the list of NeXML objects for the studies -pk_get_study(nex_list) # retrieve the study matrices -pk_get_study_meta(nex_list) # retrieve the meta data for the studies -# }+
- + + + diff --git a/docs/reference/pk_get_study_list.html b/docs/reference/pk_get_study_list.html index 08689b1..0d84b34 100644 --- a/docs/reference/pk_get_study_list.html +++ b/docs/reference/pk_get_study_list.html @@ -8,21 +8,29 @@if (FALSE) { +slist <- pk_get_study_list(taxon = "Ameiurus", entity = "pelvic splint") +nex_list <- pk_get_study_xml(slist$id) # get the list of NeXML objects for the studies +pk_get_study(nex_list) # retrieve the study matrices +pk_get_study_meta(nex_list) # retrieve the meta data for the studies +} +
Return studies that contain taxa which are members of the optional input taxon, and characters which have phenotype annotations subsumed by the given entity and quality terms.
-pk_get_study_list(taxon = NA, entity = NA, quality = NA, - phenotype = NA, includeRels = NA, relation = "part of")- +
pk_get_study_list(taxon = NA, entity = NA, quality = NA, + phenotype = NA, includeRels = NA, relation = "part of")+
data.frame
-# NOT RUN { +- + + + diff --git a/docs/reference/pk_get_study_meta.html b/docs/reference/pk_get_study_meta.html index 14d90a9..ee36352 100644 --- a/docs/reference/pk_get_study_meta.html +++ b/docs/reference/pk_get_study_meta.html @@ -8,21 +8,29 @@if (FALSE) { # by default, parts are included -slist <- pk_get_study_list(taxon = "Siluridae", entity = "fin") -colnames(slist) -nrow(slist) +slist <- pk_get_study_list(taxon = "Siluridae", entity = "fin") +colnames(slist) +nrow(slist) # can also disable parts -slist <- pk_get_study_list(taxon = "Siluridae", entity = "fin", includeRels = FALSE) -nrow(slist) +slist <- pk_get_study_list(taxon = "Siluridae", entity = "fin", includeRels = FALSE) +nrow(slist) # or filter studies only by entity, including their parts -slist <- pk_get_study_list(entity = "pelvic fin", includeRels = c("part of")) -nrow(slist) +slist <- pk_get_study_list(entity = "pelvic fin", includeRels = c("part of")) +nrow(slist) # or filter studies only by entity, including their parts -slist <- pk_get_study_list(entity = "pelvic fin", includeRels = c("part of")) -nrow(slist) +slist <- pk_get_study_list(entity = "pelvic fin", includeRels = c("part of")) +nrow(slist) # including not only parts but also historical and serial homologs -slist <- pk_get_study_list(entity = "pelvic fin", - includeRels = c("part of", +slist <- pk_get_study_list(entity = "pelvic fin", + includeRels = c("part of", "serially homologous to", - "historical homologous to")) -nrow(slist) + "historical homologous to")) +nrow(slist) # relationship names can be given as prefixes -slist1 <- pk_get_study_list(entity = "pelvic fin", - includeRels = c("part", "serial", "historical")) -nrow(slist1) == nrow(slist) +slist1 <- pk_get_study_list(entity = "pelvic fin", + includeRels = c("part", "serial", "historical")) +nrow(slist1) == nrow(slist) # or apply no filter, obtaining all studies in the KB -slist <- pk_get_study_list() -nrow(slist) -# } +slist <- pk_get_study_list() +nrow(slist) +} 
+pk_get_study_meta — pk_get_study_meta • rphenoscape + - + - - + + + + + + + - + + - + - - + + + @@ -30,10 +38,10 @@ - + - + @@ -47,9 +55,10 @@ + - ++@@ -71,7 +80,7 @@ - @@ -108,13 +117,11 @@pk_get_study_meta
--pk_get_study_meta
-pk_get_study_meta(nexmls)- +pk_get_study_meta(nexmls)+Arguments
a list of NeXML objects |
A list of data.frames containing taxa and characters
-# NOT RUN { -slist <- pk_get_study_list(taxon = "Ameiurus", entity = "pelvic splint") -nex_list <- pk_get_study_xml(slist$id) # get the list of NeXML objects for the studies -pk_get_study(nex_list) # retrieve the study matrices -pk_get_study_meta(nex_list) # retrieve the meta data for the studies -# }+
- + + + diff --git a/docs/reference/pk_get_study_xml.html b/docs/reference/pk_get_study_xml.html index ea407cb..1302284 100644 --- a/docs/reference/pk_get_study_xml.html +++ b/docs/reference/pk_get_study_xml.html @@ -8,21 +8,29 @@if (FALSE) { +slist <- pk_get_study_list(taxon = "Ameiurus", entity = "pelvic splint") +nex_list <- pk_get_study_xml(slist$id) # get the list of NeXML objects for the studies +pk_get_study(nex_list) # retrieve the study matrices +pk_get_study_meta(nex_list) # retrieve the meta data for the studies +} +
pk_get_study_xml
-pk_get_study_xml(study_ids)- +
pk_get_study_xml(study_ids)+
a list of study IDs. |
A list of nexml objects
-# NOT RUN { -slist <- pk_get_study_list(taxon = "Ameiurus", entity = "pelvic splint") -nex_list <- pk_get_study_xml(slist$id) -# }+
- + + + diff --git a/docs/reference/pk_is_descendant.html b/docs/reference/pk_is_descendant.html index 5406fd2..761bc60 100644 --- a/docs/reference/pk_is_descendant.html +++ b/docs/reference/pk_is_descendant.html @@ -8,21 +8,29 @@if (FALSE) { +slist <- pk_get_study_list(taxon = "Ameiurus", entity = "pelvic splint") +nex_list <- pk_get_study_xml(slist$id) +} +
Tests which of the candidate terms are ancestors or descendants of the query term. Note that a term is not considered an ancestor or descendant of itself.
-pk_is_descendant(term, candidates, includeRels = c("none", "part_of")) +-pk_is_descendant(term, candidates, includeRels = c("none", "part_of")) + +pk_is_ancestor(term, candidates, includeRels = c("none", "part_of"))-pk_is_ancestor(term, candidates, includeRels = c("none", "part_of"))
A logical vector indicating which candidate terms are ancestors and descendants, respectively, of the query term.
-Either or both of the query term and the list of candidate terms can be supplied @@ -154,56 +160,52 @@
# NOT RUN { +- + + + diff --git a/docs/reference/pk_is_extinct.html b/docs/reference/pk_is_extinct.html index 4807732..79db436 100644 --- a/docs/reference/pk_is_extinct.html +++ b/docs/reference/pk_is_extinct.html @@ -8,21 +8,29 @@if (FALSE) { # taxa: -pk_is_descendant("Halecostomi", c("Halecostomi", "Icteria", "Sciaenidae")) -pk_is_ancestor("Sciaenidae", c("Halecostomi", "Abeomelomys", "Sciaenidae")) +pk_is_descendant("Halecostomi", c("Halecostomi", "Icteria", "Sciaenidae")) +pk_is_ancestor("Sciaenidae", c("Halecostomi", "Abeomelomys", "Sciaenidae")) # anatomical entities: -pk_is_descendant("paired fin", c("pectoral fin", "pelvic fin", "dorsal fin")) -pk_is_descendant("paired fin", c("pelvic fin", "pelvic fin ray")) -pk_is_descendant("paired fin", c("pelvic fin", "pelvic fin ray"), includeRels = "part_of") +pk_is_descendant("paired fin", c("pectoral fin", "pelvic fin", "dorsal fin")) +pk_is_descendant("paired fin", c("pelvic fin", "pelvic fin ray")) +pk_is_descendant("paired fin", c("pelvic fin", "pelvic fin ray"), includeRels = "part_of") -pk_is_ancestor("pelvic fin", c("paired fin", "hindlimb", "fin")) -pk_is_ancestor("pelvic fin ray", c("paired fin", "fin")) -pk_is_ancestor("pelvic fin ray", c("paired fin", "fin"), includeRels = "part_of") +pk_is_ancestor("pelvic fin", c("paired fin", "hindlimb", "fin")) +pk_is_ancestor("pelvic fin ray", c("paired fin", "fin")) +pk_is_ancestor("pelvic fin ray", c("paired fin", "fin"), includeRels = "part_of") # phenotypic quality -pk_is_ancestor("triangular", c("shape", "color", "amount")) -pk_is_descendant("shape", c("T-shaped", "star shaped", "yellow")) -# } +pk_is_ancestor("triangular", c("shape", "color", "amount")) +pk_is_descendant("shape", c("T-shaped", "star shaped", "yellow")) +} +Determine which taxa are extinct — pk_is_extinct • rphenoscape + - + - - + + + + + + + - + + - + - - + + + @@ -30,10 +38,10 @@ - + - + @@ -47,9 +55,10 @@ + - ++@@ -71,7 +80,7 @@ - @@ -108,13 +117,11 @@Determine which taxa are extinct
--This is simply a convenience function on top of
-pk_taxon_detail()
.pk_is_extinct(taxon, verbose = FALSE)- +pk_is_extinct(taxon, verbose = FALSE)+Arguments
A logical named vector with value TRUE
if the corresponding input
taxon is marked as extinct, and FALSE otherwise. For taxon names that failed
to be looked up, the value will be NA. Names will be the input taxa where
they were given as names, and the label of the respective taxon otherwise.
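A minimal base R sketch of working with a result of this shape, without querying the KB. The vector below is hand-made: the taxon names and values are hypothetical, and only the structure (a named logical vector with NA for failed lookups) follows the description above.

```r
# Hypothetical result in the shape described above; names and values are invented.
extinct <- c("taxon A" = TRUE, "taxon B" = FALSE, "unknown taxon" = NA)

# Names of taxa marked as extinct; which() skips NA entries.
extinct_names <- names(extinct)[which(extinct)]

# Names of taxa whose lookup failed.
failed_names <- names(extinct)[is.na(extinct)]

# Treat failed lookups as "not known to be extinct".
known_extinct <- !is.na(extinct) & extinct

print(extinct_names)
print(failed_names)
print(known_extinct)
```

Handling NA explicitly matters here because indexing with a logical vector that contains NA would otherwise yield NA entries.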
Generate matrix of inferred presence/absence associations for anatomical structures subsumed by the provided entity class expression, for any taxa within the provided taxon class expression.
-pk_get_ontotrace(nex) +-pk_get_ontotrace(nex) + +pk_get_ontotrace_meta(nex)-pk_get_ontotrace_meta(nex)
a nexml object |
data.frame: The OntoTrace matrix.
-# NOT RUN { -nex0 <- pk_get_ontotrace_xml(taxon = "Ictalurus", entity = "fin") - -nex <- pk_get_ontotrace_xml(taxon = c("Ictalurus", "Ameiurus"), entity = "fin spine") -pk_get_ontotrace(nex) -pk_get_ontotrace_meta(nex) -# }+
- + + + diff --git a/docs/reference/pk_terms.html b/docs/reference/pk_terms.html index 92120a6..17a2604 100644 --- a/docs/reference/pk_terms.html +++ b/docs/reference/pk_terms.html @@ -8,21 +8,29 @@if (FALSE) { +nex0 <- pk_get_ontotrace_xml(taxon = "Ictalurus", entity = "fin") + +nex <- pk_get_ontotrace_xml(taxon = c("Ictalurus", "Ameiurus"), entity = "fin spine") +pk_get_ontotrace(nex) +pk_get_ontotrace_meta(nex) +} +
Retrieve details about a taxon, an anatomical structure, a gene, or a phenotypic quality.
-pk_taxon_detail(term, verbose = FALSE) +-pk_taxon_detail(term, verbose = FALSE) -pk_anatomical_detail(term, verbose = FALSE) +pk_anatomical_detail(term, verbose = FALSE) -pk_phenotype_detail(term, verbose = FALSE) +pk_phenotype_detail(term, verbose = FALSE) + +pk_gene_detail(term, taxon = NA, verbose = FALSE)-pk_gene_detail(term, taxon = NA, verbose = FALSE)
A data.frame, with at least columns "id" and "label".
@@ -156,18 +163,20 @@For pk_gene_detail
, the additional columns are "taxon.id" and "taxon.label"
for the corresponding NCBI Taxonomy ID and name, and "matchType" ('exact'
or 'partial').
pk_taxon_detail("Coralliozetus")#> id label extinct +- + + + diff --git a/docs/reference/pkb_args_to_query.html b/docs/reference/pkb_args_to_query.html index 7f8baf9..d72c707 100644 --- a/docs/reference/pkb_args_to_query.html +++ b/docs/reference/pkb_args_to_query.html @@ -8,21 +8,29 @@pk_taxon_detail("Coralliozetus") +#> id label extinct #> 1 http://purl.obolibrary.org/obo/VTO_0042955 Coralliozetus FALSE #> rank.id rank.label common_name -#> 1 http://purl.obolibrary.org/obo/TAXRANK_0000005 genus <NA>pk_anatomical_detail("basihyal bone")#> label -#> 1 basihyal bone +#> 1 http://purl.obolibrary.org/obo/TAXRANK_0000005 genus <NA>pk_anatomical_detail("basihyal bone") +#> label isDefinedBy +#> 1 basihyal bone http://purl.obolibrary.org/obo/uberon.owl #> definition #> 1 Replacement bone that is median and is the anterior-most bone of the ventral hyoid arch. #> id -#> 1 http://purl.obolibrary.org/obo/UBERON_0011618pk_gene_detail("socs5")#> id label matchType +#> 1 http://purl.obolibrary.org/obo/UBERON_0011618pk_gene_detail("socs5") +#> id label matchType #> 1 http://xenbase.org/XB-GENEPAGE-479592 socs5 exact #> 2 http://www.informatics.jax.org/marker/MGI:2385459 Socs5 exact #> 3 http://zfin.org/ZDB-GENE-061013-408 socs5a partial @@ -179,32 +188,30 @@Examp #> 4 http://purl.obolibrary.org/obo/NCBITaxon_7955 Danio rerio
Creates a list of named query parameters — pkb_args_to_query • rphenoscape + - + - - + + + + + + + - + + - + - - + + + @@ -30,13 +38,13 @@ - + - + @@ -50,9 +58,10 @@ + - ++@@ -74,7 +83,7 @@ - @@ -111,16 +120,14 @@Creates a list of named query parameters
--Several Phenoscape KB API endpoints use a form-like parameter list for filtering (limiting), or, in the case of relationships, expanding, the query result. This function aids in preparing the query string for these endpoints. It is internal to the package.
-pkb_args_to_query(..., includeRels = FALSE, verbose = FALSE)- +pkb_args_to_query(..., includeRels = FALSE, verbose = FALSE)+Arguments
A list of named query parameters suitable for several form-like query endpoints in the Phenoscape KB API.
- - + + + diff --git a/docs/reference/profile_similarity.html b/docs/reference/profile_similarity.html index 80a3e35..3b9e6d4 100644 --- a/docs/reference/profile_similarity.html +++ b/docs/reference/profile_similarity.html @@ -8,21 +8,29 @@Calculates the semantic similarity between profiles (groups) of terms.
bestPairs
aggregates pairwise scores by "best pairs" between two profiles.
That is, for profiles P1 and P2 with terms T1[i] (i = 1,...n) and T2[j]
@@ -132,15 +140,14 @@
profile_similarity(pairwise, subsumer_mat, ..., f, reduce = NA) +-profile_similarity(pairwise, subsumer_mat, ..., f, reduce = NA) + +bestPairs(X, best = max, aggregate = mean) -bestPairs(X, best = max, aggregate = mean) +reduce.ignoringDiag(X, aggregate = mean)-reduce.ignoringDiag(X, aggregate = mean)
f | -a factor (or an object coercible to factor) defining the group + | a factor (or an object coercible to factor) defining the group (profile) membership of the terms (= columns) in the subsumer matrix. Columns and rows of the resulting profile similarity matrix will take their names from the levels of the factor. The factor may have 2 or more levels. |
@@ -185,17 +192,17 @@
---|---|---|
best | -the function for determining the best score. The default is |
+ the function for determining the best score. The default is |
aggregate | the function for aggregating scores (for |
+NA values before calculating the aggregate. The default is
A profile refers to a group of terms, and profile similarity to the similarity @@ -209,7 +216,7 @@
In pairwise mode, if the reduce
function is asymmetric (as will typically be
@@ -217,85 +224,88 @@
(X + t(X)) / 2
, if X is
the profile similarity matrix.
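The symmetrization mentioned above can be sketched in base R as follows. The matrix is a made-up asymmetric example (profile names "a", "b", "c" and all values are hypothetical), not output from `profile_similarity()`.

```r
# Toy asymmetric profile similarity matrix; values are invented for illustration.
X <- matrix(c(1.00, 0.40, 0.86,
              0.38, 1.00, 0.34,
              0.86, 0.36, 1.00),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Average each entry with its mirror image across the diagonal.
X_sym <- (X + t(X)) / 2

print(isSymmetric(X_sym))
print(X_sym)
```

Averaging with the transpose leaves the diagonal and any already-symmetric pairs unchanged, and replaces each asymmetric pair by its mean.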
-
- + + + diff --git a/docs/reference/similarity.html b/docs/reference/similarity.html index 732b850..6ddaa4b 100644 --- a/docs/reference/similarity.html +++ b/docs/reference/similarity.html @@ -8,21 +8,29 @@tt <- sapply(c("pelvic fin", "pectoral fin", - "forelimb", "hindlimb", "dorsal fin", "caudal fin"), - pk_get_iri, as = "anatomy") ++profile_similarity(jaccard_similarity, subs.mat, f = pairedFinLimb, + reduce = mean) +tt <- sapply(c("pelvic fin", "pectoral fin", + "forelimb", "hindlimb", "dorsal fin", "caudal fin"), + pk_get_iri, as = "anatomy") # define groups (profiles) as factors: -pairedUnpaired <- c(rep("paired", times = 4), rep("unpaired", times = 2)) -finsLimbs <- c("fins", "fins", "limbs", "limbs", "fins", "fins") -pairedFinLimb <- interaction(as.factor(pairedUnpaired), as.factor(finsLimbs)) +pairedUnpaired <- c(rep("paired", times = 4), rep("unpaired", times = 2)) +finsLimbs <- c("fins", "fins", "limbs", "limbs", "fins", "fins") +pairedFinLimb <- interaction(as.factor(pairedUnpaired), as.factor(finsLimbs)) # compute subsumer matrix -subs.mat <- subsumer_matrix(tt, .colnames = "label", .labels = names(tt), - preserveOrder = TRUE) +subs.mat <- subsumer_matrix(tt, .colnames = "label", .labels = names(tt), + preserveOrder = TRUE) # group-wise profile similarity: -profile_similarity(jaccard_similarity, subs.mat, f = pairedUnpaired)#> paired unpaired -#> paired 1.0000000 0.3163265 -#> unpaired 0.3163265 1.0000000profile_similarity(jaccard_similarity, subs.mat, f = finsLimbs)#> fins limbs -#> fins 1.0000000 0.7653061 -#> limbs 0.7653061 1.0000000profile_similarity(jaccard_similarity, subs.mat, f = pairedFinLimb)#> paired.fins unpaired.fins paired.limbs -#> paired.fins 1.0000000 0.3483146 0.8152174 -#> unpaired.fins 0.3483146 1.0000000 0.3152174 -#> paired.limbs 0.8152174 0.3152174 1.0000000+profile_similarity(jaccard_similarity, subs.mat, f = pairedUnpaired) +#> paired unpaired +#> paired 1.0000000 0.3366337 +#> unpaired 0.3366337 
1.0000000profile_similarity(jaccard_similarity, subs.mat, f = finsLimbs) +#> fins limbs +#> fins 1.0000000 0.7722772 +#> limbs 0.7722772 1.0000000profile_similarity(jaccard_similarity, subs.mat, f = pairedFinLimb) +#> paired.fins unpaired.fins paired.limbs +#> paired.fins 1.0000000 0.3695652 0.8210526 +#> unpaired.fins 0.3695652 1.0000000 0.3368421 +#> paired.limbs 0.8210526 0.3368421 1.0000000# pairwise, using mean (average pairwise score); result is symmetric -profile_similarity(jaccard_similarity, subs.mat, f = pairedFinLimb, - reduce = mean)#> paired.fins unpaired.fins paired.limbs -#> paired.fins 0.9397590 0.3742470 0.8251232 -#> unpaired.fins 0.3742470 0.9189189 0.3413199 -#> paired.limbs 0.8251232 0.3413199 0.9285714# the same, but excluding self-similarity of terms within groups -profile_similarity(jaccard_similarity, subs.mat, f = pairedFinLimb, - reduce = reduce.ignoringDiag)#> paired.fins unpaired.fins paired.limbs -#> paired.fins 0.8795181 0.3742470 0.8251232 -#> unpaired.fins 0.3742470 0.8378378 0.3413199 -#> paired.limbs 0.8251232 0.3413199 0.8571429# pairwise, using max; result is symmetric -profile_similarity(jaccard_similarity, subs.mat, f = pairedFinLimb, - reduce = max)#> paired.fins unpaired.fins paired.limbs -#> paired.fins 1.0000000 0.3750000 0.8571429 -#> unpaired.fins 0.3750000 1.0000000 0.3414634 -#> paired.limbs 0.8571429 0.3414634 1.0000000# pairwise, using average of best pairs; result is _not_ symmetric -sm <- profile_similarity(jaccard_similarity, subs.mat, f = pairedFinLimb, - reduce = bestPairs) -sm#> paired.fins unpaired.fins paired.limbs -#> paired.fins 1.0000000 0.3750000 0.8571429 -#> unpaired.fins 0.3742470 1.0000000 0.3413199 -#> paired.limbs 0.8571429 0.3414634 1.0000000#> paired.fins unpaired.fins paired.limbs -#> paired.fins 1.0000000 0.3746235 0.8571429 -#> unpaired.fins 0.3746235 1.0000000 0.3413917 -#> paired.limbs 0.8571429 0.3413917 1.0000000#> paired.fins unpaired.fins paired.limbs +#> paired.fins 0.9418605 0.3964696 
0.8310345 +#> unpaired.fins 0.3964696 0.9250000 0.3641711 +#> paired.limbs 0.8310345 0.3641711 0.9310345# the same, but excluding self-similarity of terms within groups +profile_similarity(jaccard_similarity, subs.mat, f = pairedFinLimb, + reduce = reduce.ignoringDiag) +#> paired.fins unpaired.fins paired.limbs +#> paired.fins 0.8837209 0.3964696 0.8310345 +#> unpaired.fins 0.3964696 0.8500000 0.3641711 +#> paired.limbs 0.8310345 0.3641711 0.8620690# pairwise, using max; result is symmetric +profile_similarity(jaccard_similarity, subs.mat, f = pairedFinLimb, + reduce = max) +#> paired.fins unpaired.fins paired.limbs +#> paired.fins 1.0000000 0.3975904 0.8620690 +#> unpaired.fins 0.3975904 1.0000000 0.3647059 +#> paired.limbs 0.8620690 0.3647059 1.0000000# pairwise, using average of best pairs; result is _not_ symmetric +sm <- profile_similarity(jaccard_similarity, subs.mat, f = pairedFinLimb, + reduce = bestPairs) +sm +#> paired.fins unpaired.fins paired.limbs +#> paired.fins 1.0000000 0.3975904 0.8620690 +#> unpaired.fins 0.3964696 1.0000000 0.3641711 +#> paired.limbs 0.8620690 0.3647059 1.0000000#> paired.fins unpaired.fins paired.limbs +#> paired.fins 1.000000 0.3970300 0.8620690 +#> unpaired.fins 0.397030 1.0000000 0.3644385 +#> paired.limbs 0.862069 0.3644385 1.0000000
The Tanimoto similarity ST is computed according to the definition for bit vectors (see Jaccard index at Wikipedia). For weights \(W_i \in \{0, 1\}\) it is the same as the @@ -150,18 +158,17 @@
The Resnik similarity between two terms is the information content (IC) of their most informative common ancestor (MICA), which is the common subsumer with the greatest information content.
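The definition above can be illustrated with a small base R sketch that does not call the package. The subsumer matrix, the class names, and the frequencies are all hypothetical, and IC is taken as the negative base-10 logarithm of the frequency, in line with the `base = 10` default shown in the usage.

```r
# Hypothetical subsumer matrix: rows are subsuming classes, columns are terms.
M <- matrix(c(1, 1, 1,
              1, 1, 0,
              1, 0, 0,
              0, 1, 0,
              0, 0, 1),
            ncol = 3, byrow = TRUE,
            dimnames = list(
              c("fin", "paired fin", "pectoral fin", "pelvic fin", "dorsal fin"),
              c("pectoral fin", "pelvic fin", "dorsal fin")))

# Invented frequencies of each subsuming class in some corpus.
freqs <- c(0.1, 0.01, 0.001, 0.001, 0.001)
names(freqs) <- rownames(M)

# Information content: rarer classes are more informative.
ic <- -log10(freqs)

# Resnik similarity of two terms: the largest IC among their common subsumers.
resnik <- function(a, b) {
  common <- M[, a] == 1 & M[, b] == 1
  max(ic[common])
}

print(resnik("pectoral fin", "pelvic fin"))  # MICA is "paired fin"
print(resnik("pectoral fin", "dorsal fin"))  # MICA is "fin"
```

Because the score is an IC value rather than a ratio, it is not bounded by 1; dividing by a maximum IC, as the examples on this page do, is one way to normalize it.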
-tanimoto_similarity(subsumer_mat = NA, terms = NULL, ...) +-tanimoto_similarity(subsumer_mat = NA, terms = NULL, ...) + +jaccard_similarity(subsumer_mat = NA, terms = NULL, ...) -jaccard_similarity(subsumer_mat = NA, terms = NULL, ...) +cosine_similarity(subsumer_mat = NA, terms = NULL, ...) -cosine_similarity(subsumer_mat = NA, terms = NULL, ...) +resnik_similarity(subsumer_mat = NA, terms = NULL, ..., + wt = term_freqs, wt_args = list(), base = 10)-resnik_similarity(subsumer_mat = NA, terms = NULL, ..., - wt = term_freqs, wt_args = list(), base = 10)
wt_args | @@ -205,67 +214,63 @@
---|
A matrix with M[i,j] = similarity of terms i and j.
-Philip Resnik (1995). "Using information content to evaluate semantic similarity in a taxonomy". Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI'95). 1: 448–453
-# NOT RUN { -sm <- jaccard_similarity(terms = c("pelvic fin", "pectoral fin", +- + + + diff --git a/docs/reference/subsumer_matrix.html b/docs/reference/subsumer_matrix.html index 7f22317..dac5820 100644 --- a/docs/reference/subsumer_matrix.html +++ b/docs/reference/subsumer_matrix.html @@ -8,21 +8,29 @@if (FALSE) { +sm <- jaccard_similarity(terms = c("pelvic fin", "pectoral fin", "forelimb", "hindlimb", - "dorsal fin", "caudal fin"), - .colnames = "label") -sm + "dorsal fin", "caudal fin"), + .colnames = "label") +sm # e.g., turn into distance matrix, cluster, and plot -plot(hclust(as.dist(1-sm))) -# }# NOT RUN { -phens <- get_phenotypes("basihyal bone", taxon = "Cyprinidae") -sm.ic <- resnik_similarity(terms = phens$id, - .colnames = "label", .labels = phens$label, - wt_args = list(as = "phenotype", - corpus = "taxa")) -maxIC <- -log10(1 / corpus_size("taxa")) +plot(hclust(as.dist(1-sm))) +} +if (FALSE) { +phens <- get_phenotypes("basihyal bone", taxon = "Cyprinidae") +sm.ic <- resnik_similarity(terms = phens$id, + .colnames = "label", .labels = phens$label, + wt_args = list(as = "phenotype", + corpus = "taxa")) +maxIC <- -log10(1 / corpus_size("taxa")) # normalize by max IC, turn into distance matrix, cluster, and plot -plot(hclust(as.dist(1-sm.ic/maxIC))) -# } +plot(hclust(as.dist(1-sm.ic/maxIC))) +} +Obtains a subsumer matrix — subsumer_matrix • rphenoscape + - + - - + + + + + + + - + + - + - - + + + @@ -30,12 +38,12 @@ - + - + @@ -49,9 +57,10 @@ + - ++@@ -73,7 +82,7 @@ - @@ -110,16 +119,14 @@Obtains a subsumer matrix
--A subsumer matrix M for terms \(j \in \{1, \dots, n\}\) has value \(M_{i,j}=1\) iff class i (which can be an anonymous class expression) subsumes term j, and zero otherwise. Therefore, it will have n columns, one for each term.
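As an illustration of this definition, the following base R sketch builds a small subsumer matrix by hand instead of querying the KB. The class names, term names, and 0/1 values are made up, and counting each term as its own subsumer is an assumption of the sketch. It also shows one way column overlap could be turned into a Jaccard index.

```r
# Hypothetical toy subsumer matrix: rows are subsuming classes, columns are terms.
# M[i, j] is 1 iff class i subsumes term j. Values are invented, not from the KB.
M <- matrix(c(1, 1, 1,   # "fin" subsumes all three terms
              1, 1, 0,   # "paired fin" subsumes pectoral fin and pelvic fin
              1, 0, 0,   # "pectoral fin" (assumed to subsume itself)
              0, 1, 0,   # "pelvic fin"
              0, 0, 1),  # "dorsal fin"
            ncol = 3, byrow = TRUE,
            dimnames = list(
              c("fin", "paired fin", "pectoral fin", "pelvic fin", "dorsal fin"),
              c("pectoral fin", "pelvic fin", "dorsal fin")))

# Number of subsumers per term (column sums).
n_subsumers <- colSums(M)

# shared[i, j] = number of classes that subsume both term i and term j.
shared <- crossprod(M)

# Jaccard index on the columns: shared subsumers over the union of subsumers.
jaccard <- shared / (outer(n_subsumers, n_subsumers, "+") - shared)
print(jaccard)
```

In this toy case the two paired fins share two of four distinct subsumers, so they score higher with each other than either does with the dorsal fin.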
-subsumer_matrix(terms, .colnames = c("ID", "IRI", "label"), - .labels = NULL, preserveOrder = FALSE, verbose = FALSE)- +subsumer_matrix(terms, .colnames = c("ID", "IRI", "label"), + .labels = NULL, preserveOrder = FALSE, verbose = FALSE)+Arguments
A data.frame representing the subsumer matrix
@@ -171,7 +178,6 @@term.iris
, giving the term IRIs for the rows (and columns).
Note that these extra attributes will be lost upon subsetting the returned
matrix.
-
In this implementation, for each row i @@ -181,53 +187,49 @@
# NOT RUN { -tl <- c("http://purl.obolibrary.org/obo/UBERON_0000981", +- + + + diff --git a/docs/reference/taxon_ontology_iris.html b/docs/reference/taxon_ontology_iris.html index ab03cd3..e1bbc85 100644 --- a/docs/reference/taxon_ontology_iris.html +++ b/docs/reference/taxon_ontology_iris.html @@ -8,21 +8,29 @@if (FALSE) { +tl <- c("http://purl.obolibrary.org/obo/UBERON_0000981", "http://purl.obolibrary.org/obo/UBERON_0002103", "http://purl.obolibrary.org/obo/UBERON_0000976", - "http://purl.obolibrary.org/obo/UBERON_0002102") -m <- subsumer_matrix(tl) -m # term IDs as column names -id_prefixes <- attr(m, "prefixes") -id_prefixes # 4x "http://purl.obolibrary.org/obo/" - -m <- subsumer_matrix(tl, .colnames = "label") -m # term labels as column names -mat_terms <- attr(m, "term.iris") -mat_terms # term IRIs in the same order as columns -# } + "http://purl.obolibrary.org/obo/UBERON_0002102") +m <- subsumer_matrix(tl) +m # term IDs as column names +id_prefixes <- attr(m, "prefixes") +id_prefixes # 4x "http://purl.obolibrary.org/obo/" + +m <- subsumer_matrix(tl, .colnames = "label") +m # term labels as column names +mat_terms <- attr(m, "term.iris") +mat_terms # term IRIs in the same order as columns +} +
--Obtains the IRIs of taxon ontologies in the Phenoscape KB.
-taxon_ontology_iris()
- +taxon_ontology_iris()+ +Value
A character vector
- - + + + diff --git a/docs/reference/term_category.html b/docs/reference/term_category.html index ea1ce66..a09d20b 100644 --- a/docs/reference/term_category.html +++ b/docs/reference/term_category.html @@ -8,21 +8,29 @@Determine the general category of terms — term_category • rphenoscape + - + - - + + + + + + + - + + - + - - + + + @@ -30,13 +38,13 @@ - + - + @@ -50,9 +58,10 @@ + - ++@@ -74,7 +83,7 @@ - @@ -111,16 +120,14 @@Determine the general category of terms
--Terms in the Phenoscape KB fall into different general categories: entity, quality, phenotype (which typically are entity-quality compositions), and taxon. The category is sometimes needed to plug a term IRI into the right parameter for a function or API call.
-term_category(x)- +term_category(x)+Arguments
A character vector with the term categories ("entity", "quality", "phenotype", or "taxon") of the terms in the input list.
-The implementation will first try to infer the category from the object type @@ -143,43 +149,40 @@
- + + + diff --git a/docs/reference/term_freqs.html b/docs/reference/term_freqs.html index 7078433..541f944 100644 --- a/docs/reference/term_freqs.html +++ b/docs/reference/term_freqs.html @@ -8,21 +8,29 @@term_category(c("http://purl.obolibrary.org/obo/UBERON_0011618", ++ "http://purl.obolibrary.org/obo/MP_0030825")) +term_category(c("http://purl.obolibrary.org/obo/UBERON_0011618", "http://purl.obolibrary.org/obo/PATO_0002279", "http://purl.obolibrary.org/obo/VTO_0071642", - "http://purl.obolibrary.org/obo/MP_0030825"))#> [1] "entity" "quality" "taxon" "phenotype"#> [1] "phenotype" "phenotype" "phenotype"#> [1] "entity" "quality" "taxon" "phenotype"#> [1] "phenotype" "phenotype" "phenotype"
Determines the frequencies for the given input list of terms, based on the selected corpus.
-term_freqs(x, as = c("auto", "entity", "quality", "phenotype"), - corpus = c("taxon_annotations", "taxa", "gene_annotations", "genes"), - decodeIRI = TRUE, ...)- +
term_freqs(x, as = c("auto", "entity", "quality", "phenotype"), + corpus = c("taxon_annotations", "taxa", "gene_annotations", "genes"), + decodeIRI = TRUE, ...)+
a vector of frequencies as floating point numbers (between zero and 1.0), of the same length (and ordering) as the input list of terms.
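A rough sketch of how such frequencies relate to the corpus size, in base R and without calling the KB. The corpus size is the value printed for `corpus_size("taxa")` elsewhere in this documentation; the per-term counts are hypothetical, chosen only because they happen to reproduce values of the magnitude printed in the examples. How the package actually obtains the counts is not shown here.

```r
# Corpus size as printed for corpus_size("taxa") in this documentation.
corpus_n <- 785

# Hypothetical per-term counts (an assumption, not values returned by the KB).
counts <- c(2, 0, 1, 4, 3)

# Frequency of each term: its count relative to the corpus size.
freqs <- counts / corpus_n

print(signif(freqs, 7))
```

A term with a count of zero gets a frequency of zero, which would make its information content infinite, so such terms may need separate handling downstream.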
-Depending on the corpus selected, the frequencies are queried directly @@ -183,7 +189,6 @@
decodeIRI
), but this only works to a limited extent.
-
Term categories being accurate is vital for obtaining correct counts and @@ -198,52 +203,48 @@
as
. If all terms are of the same category and the category is
known beforehand, it saves time and prevents potential errors to supply this
category using as
.
-
terms <- c("pectoral fin", "pelvic fin", "dorsal fin", "paired fin") -IRIs <- sapply(terms, pk_get_iri, as = "anatomy") -term_freqs(IRIs)#> [1] 0.001797967 0.002695709 0.004368265 0.004585561-phens <- get_phenotypes(entity = "basihyal bone") -term_freqs(phens$id, as = "phenotype", corpus = "taxon_annotations")#> [1] 3.973408e-05 4.966760e-06 1.365859e-05 1.129938e-04 6.208450e-06 -#> [6] 2.607549e-05 6.332619e-05 2.222625e-04 1.365859e-05 1.328608e-04 -#> [11] 4.966760e-06 6.518872e-04 8.940168e-05 2.619966e-04 5.090929e-05 -#> [16] 1.862535e-05 1.192022e-04 2.235042e-05 1.030603e-04term_freqs(phens$id, as = "phenotype", corpus = "taxa")#> [1] 0.002659574 NA 0.002659574 0.002659574 0.001329787 0.001329787 -#> [7] 0.002659574 0.001329787 0.002659574 0.005319149 0.002659574 0.005319149 -#> [13] 0.001329787 0.002659574 0.002659574 NA 0.003989362 0.001329787 -#> [19] 0.001329787+- + + + diff --git a/docs/reference/term_iri.html b/docs/reference/term_iri.html index 5bccdb2..7451ff5 100644 --- a/docs/reference/term_iri.html +++ b/docs/reference/term_iri.html @@ -8,21 +8,29 @@terms <- c("pectoral fin", "pelvic fin", "dorsal fin", "paired fin") +IRIs <- sapply(terms, pk_get_iri, as = "anatomy") +term_freqs(IRIs) +#> [1] 0.001791597 0.002191650 0.004333911 0.004073383+phens <- get_phenotypes(entity = "basihyal bone") +term_freqs(phens$id, as = "phenotype", corpus = "taxon_annotations") +#> [1] 3.951144e-05 4.938930e-06 1.358206e-05 1.123607e-04 6.173663e-06 +#> [6] 2.592938e-05 6.297136e-05 2.210171e-04 2.222519e-04 1.358206e-05 +#> [11] 1.358206e-04 4.938930e-06 6.494693e-04 8.890074e-05 2.617633e-04 +#> [16] 5.185877e-05 1.852099e-05 1.185343e-04 2.222519e-05 1.024828e-04term_freqs(phens$id, as = "phenotype", corpus = "taxa") +#> [1] 0.002547771 0.000000000 0.002547771 0.002547771 0.001273885 0.001273885 +#> [7] 0.002547771 0.001273885 0.001273885 0.002547771 0.005095541 0.002547771 +#> [13] 0.005095541 0.001273885 0.002547771 0.002547771 0.000000000 0.003821656 +#> [19] 
0.001273885 0.001273885Obtain IRI(s) for canonical terms and properties — partOf_iri • rphenoscape + - + - - + + + + + + + - + + - + - - + + + @@ -30,14 +38,14 @@ - + - + @@ -51,9 +59,10 @@ + - ++@@ -75,7 +84,7 @@ - @@ -112,22 +121,20 @@Obtain IRI(s) for canonical terms and properties
--
partOf_iri
returns the IRI of the canonical "part_of" relationship in the database.
hasPart_iri
returns the IRI of the canonical "has_part" relationship in the database. Provides cached access to IRIs of terms and properties that are frequently used.
-partOf_iri() +-partOf_iri() + +hasPart_iri() -hasPart_iri() +term_iri(label, type = "owl:Class", preferOntologies = NULL, + firstOnly = TRUE)-term_iri(label, type = "owl:Class", preferOntologies = NULL, - firstOnly = TRUE)Arguments
Character, the IRI(s) for the requested term or property.
-Requested IRIs are not hard-coded. Instead, they are dynamically retrieved from @@ -171,40 +177,38 @@
For the frequently needed properties "part of", "has part" etc, one should use
the predefined functions (partOf_iri()
, hasPart_iri()
), so that the correct
matches are cached and used internally by the package.
#> [1] "http://purl.obolibrary.org/obo/BFO_0000050"#> [1] "http://purl.obolibrary.org/obo/RO_0002202"term_iri("anatomical structure", firstOnly = FALSE)#> [1] "http://purl.obolibrary.org/obo/UBERON_0000061" +- + + +#> [1] "http://purl.obolibrary.org/obo/BFO_0000050"#> [1] "http://purl.obolibrary.org/obo/RO_0002202"term_iri("anatomical structure", firstOnly = FALSE) +#> [1] "http://purl.obolibrary.org/obo/UBERON_0000061" #> [2] "http://purl.obolibrary.org/obo/CARO_0000003"