Gh-266: Added docs on matched vertex and corner case (#431)
* Added docs on matched vertex and corner case

* PR feedback

* Link ToSet

---------

Co-authored-by: GCHQDeveloper314 <[email protected]>
t92549 and GCHQDeveloper314 authored Nov 10, 2023
1 parent 96b44a6 commit 44777ca
Showing 2 changed files with 132 additions and 15 deletions.
31 changes: 16 additions & 15 deletions docs/reference/glossary.md
@@ -5,18 +5,19 @@ hide:

# Glossary

| Term | Description |
| :---------- | :----------------------------------- |
| Node | A node is an entity within a graph |
| Edge | An edge is a connection between two nodes |
| Properties | A property is a key/value pair that stores data on both edges and entities |
| Python      | A programming language that is used to build applications. Gaffer uses Python to interact with the API |
| Java        | An object-oriented programming language used to build software. Gaffer is primarily built in Java |
| Database | A database is a collection of organised structured information or data typically stored in a computer system |
| API | Application Programming Interface. An API is for one or more services / systems to communicate with each other |
| JSON        | JavaScript Object Notation is a text-based format for representing structured data based on JavaScript object syntax |
| Element | The word is used to describe the combination of both an edge and entity |
| Entity | An entity represents a point in a graph e.g. a person |
| Stores      | A Gaffer store represents the backing database responsible for storing or facilitating access to a graph |
| Operations | An operation is an instruction / function that you send to the API to manipulate and query a graph |
| Vertex | A vertex refers to the field in an entity that describes its type |
| Term | Description |
| :---------------- | :----------------------------------- |
| Entity | An entity represents a point in a graph |
| Edge | An edge is a connection between two entities |
| Vertex | In Gaffer, a vertex is the id of an entity |
| Node | A node is what Gaffer calls an entity |
| Properties | A property is a key/value pair that stores data on both edges and entities |
| Element | The word is used to describe edges or entities |
| Stores            | A Gaffer store represents the backing database responsible for storing or facilitating access to a graph |
| Operations | An operation is an instruction / function that you send to the API to manipulate and query a graph |
| Matched vertex | `matchedVertex` is a field added to Edges which are returned by Gaffer queries, stating whether your seeds matched the source or destination |
| Python            | A programming language that is used to build applications. Gaffer uses Python to interact with the API |
| Java              | An object-oriented programming language used to build software. Gaffer is primarily built in Java |
| Database | A database is a collection of organised structured information or data typically stored in a computer system |
| API | Application Programming Interface. An API is for one or more services / systems to communicate with each other |
| JSON              | JavaScript Object Notation is a text-based format for representing structured data based on JavaScript object syntax |
116 changes: 116 additions & 0 deletions docs/reference/operations-guide/get.md
@@ -1240,6 +1240,122 @@ Gets elements related to provided seeds. [Javadoc](https://gchq.github.io/Gaffer
} ]
```

??? warning "Example fetching *duplicate* edges in corner cases"

Get entities and edges using entity id 3 and the edge from 2 to 3 as seeds.

This is a corner case in which the Accumulo store returns the same edge twice.
It happens when you provide an `EntitySeed` which matches the destination of an edge, and an `EdgeSeed` which matches that same edge.
Because of how the `matchedVertex` filtering works within Gaffer's Accumulo iterator code, the duplicate is not removed.
This may seem like a niche corner case, but it becomes more likely when you query Gaffer with a large number of seeds.
Note that, although the `EntitySeed` matches some of the edges, the results do not include a `matchedVertex` field.

If you want to remove duplicate results, you can put a [ToSet](./core.md#toset) operation next in your operation chain.
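
As a rough sketch of such a chain, the Python below builds the JSON for a `GetElements` followed by a `ToSet` using plain dictionaries rather than a Gaffer client. The `OperationChain` wrapper with an `operations` list, the bare `ToSet` entry, and the helper name `build_deduplicating_chain` are assumptions made for illustration, so check them against your Gaffer version before relying on them.

```python
import json


def build_deduplicating_chain():
    """Hypothetical helper (not part of gafferpy): returns the chain as a dict."""
    get_elements = {
        "class": "GetElements",
        "input": [
            {"class": "EntitySeed", "vertex": 3},
            {
                "class": "EdgeSeed",
                "source": 2,
                "destination": 3,
                "matchedVertex": "SOURCE",
                "directedType": "EITHER",
            },
        ],
    }
    # Assumption: ToSet needs no fields beyond its class name.
    to_set = {"class": "ToSet"}
    # Assumption: a chain is an OperationChain holding an ordered "operations" list.
    return {"class": "OperationChain", "operations": [get_elements, to_set]}


chain = build_deduplicating_chain()
print(json.dumps(chain, indent=2))
```

The printed JSON is only a request body; nothing here contacts a Gaffer instance.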

=== "Java"

``` java
final GetElements operation = new GetElements.Builder()
        .input(new EntitySeed(3), new EdgeSeed(2, 3, DirectedType.EITHER))
        .build();
```

=== "JSON"

``` json
{
  "class" : "GetElements",
  "input" : [ {
    "class" : "EntitySeed",
    "vertex" : 3
  }, {
    "class" : "EdgeSeed",
    "source" : 2,
    "destination" : 3,
    "matchedVertex" : "SOURCE",
    "directedType" : "EITHER"
  } ]
}
```

=== "Python"

``` python
g.GetElements(
    input=[
        g.EntitySeed(
            vertex=3
        ),
        g.EdgeSeed(
            source=2,
            destination=3,
            directed_type="EITHER",
            matched_vertex="SOURCE"
        )
    ]
)
```

Results:

=== "Java"

``` java
Entity[vertex=2,group=entity,properties=Properties[count=<java.lang.Integer>1]]
Edge[source=2,destination=3,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>2]]
Entity[vertex=3,group=entity,properties=Properties[count=<java.lang.Integer>2]]
Edge[source=3,destination=4,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>4]]
// This Edge is duplicated:
Edge[source=2,destination=3,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>2]]
```

=== "JSON"

``` json
[ {
  "class" : "uk.gov.gchq.gaffer.data.element.Entity",
  "group" : "entity",
  "vertex" : 2,
  "properties" : {
    "count" : 1
  }
}, {
  "class" : "uk.gov.gchq.gaffer.data.element.Edge",
  "group" : "edge",
  "source" : 2,
  "destination" : 3,
  "directed" : true,
  "properties" : {
    "count" : 2
  }
}, {
  "class" : "uk.gov.gchq.gaffer.data.element.Entity",
  "group" : "entity",
  "vertex" : 3,
  "properties" : {
    "count" : 2
  }
}, {
  "class" : "uk.gov.gchq.gaffer.data.element.Edge",
  "group" : "edge",
  "source" : 3,
  "destination" : 4,
  "directed" : true,
  "properties" : {
    "count" : 4
  }
}, {
  "class" : "uk.gov.gchq.gaffer.data.element.Edge",
  "group" : "edge",
  "source" : 2,
  "destination" : 3,
  "directed" : true,
  "properties" : {
    "count" : 2
  }
} ]
```
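
To make the duplication in these results concrete, here is a small client-side sketch in Python that collapses exact duplicates in the JSON results. It only illustrates the effect of deduplicating; it is not how Gaffer's `ToSet` is implemented, and the `deduplicate` helper is a hypothetical name introduced for this example.

```python
import json


def deduplicate(elements):
    """Drop exact duplicate elements, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for element in elements:
        # Serialise with sorted keys so equal dicts produce the same key.
        key = json.dumps(element, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(element)
    return unique


edge_2_3 = {
    "class": "uk.gov.gchq.gaffer.data.element.Edge",
    "group": "edge",
    "source": 2,
    "destination": 3,
    "directed": True,
    "properties": {"count": 2},
}
results = [
    {"class": "uk.gov.gchq.gaffer.data.element.Entity", "group": "entity",
     "vertex": 2, "properties": {"count": 1}},
    edge_2_3,
    {"class": "uk.gov.gchq.gaffer.data.element.Entity", "group": "entity",
     "vertex": 3, "properties": {"count": 2}},
    {"class": "uk.gov.gchq.gaffer.data.element.Edge", "group": "edge",
     "source": 3, "destination": 4, "directed": True,
     "properties": {"count": 4}},
    dict(edge_2_3),  # the duplicated edge from the results above
]

unique = deduplicate(results)
print(len(results), len(unique))  # prints "5 4"
```

Deduplicating on the server with `ToSet` avoids transferring the duplicate at all, so a client-side pass like this is better suited to checking results than to replacing the operation.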

## GetAdjacentIds

Performs a single hop down related edges. [Javadoc](https://gchq.github.io/Gaffer/uk/gov/gchq/gaffer/operation/impl/get/GetAdjacentIds.html)