Gh-438: Overall doc improvements (#440)
* fixing terminology and improving consistency

* Fixed casing and grammar

---------

Co-authored-by: GCHQDeveloper314 <[email protected]>
cn337131 and GCHQDeveloper314 authored Dec 1, 2023
1 parent e0347ec commit bb8fb8e
Showing 17 changed files with 72 additions and 75 deletions.
2 changes: 1 addition & 1 deletion docs/administration-guide/aggregation/overview.md
@@ -25,7 +25,7 @@ met:
There are a few different use cases for applying ingest aggregation but it is
largely driven by the data you have and the analysis you wish to perform. As an
example, say you were expecting multiple connections of the same edge between
-two nodes but each instance of the edge may have differing values on its
+two entities but each instance of the edge may have differing values on its
properties, this could be a place to apply aggregation to sum the values etc.

Please see the [ingest aggregation example](ingest-example.md) for some common
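
For context on the change above: in Gaffer, "sum the values" behaviour is configured on the property's type via an aggregate function. A minimal sketch of such a type entry, assuming an illustrative type name; `Sum` is a standard Koryphe binary operator:

```python
# Sketch of a types.json entry whose values are summed when matching edges
# are aggregated at ingest. The type name "property.integer" is illustrative.
summed_count_type = {
    "property.integer": {
        "description": "An integer property that is summed on aggregation",
        "class": "java.lang.Integer",
        "aggregateFunction": {
            "class": "uk.gov.gchq.koryphe.impl.binaryoperator.Sum"
        },
    }
}
```
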
@@ -1,7 +1,7 @@
# Gaffer Images

As demonstrated in the [quickstart](../quickstart.md) it is very simple to start
-up a basic in memory gaffer graph using the available Open Container Initiative
+up a basic in memory Gaffer graph using the available Open Container Initiative
(OCI) images.

For large scale graphs with persistent storage you will want to use a different
4 changes: 2 additions & 2 deletions docs/administration-guide/gaffer-deployment/quickstart.md
@@ -12,7 +12,7 @@ docker pull gchq/gaffer-rest:2.0.0
docker run -p 8080:8080 gchq/gaffer-rest:2.0.0
```

-The Swagger rest API should be available at
+The Swagger REST API should be available at
[http://127.0.0.1:8080/rest](http://127.0.0.1:8080/rest) to try out.

Be aware that as the image uses the map store backend by default, all graph
@@ -45,7 +45,7 @@ are more widely used than others, the main types you might want to use are:
[Apache Accumulo](https://accumulo.apache.org/).
- **Map Store** - In memory JVM store, useful for quick prototyping.
- **Proxy Store** - This provides a way to hook into an existing Gaffer store,
-  when used all operations are delegated to the chosen Gaffer Rest API.
+  when used all operations are delegated to the chosen Gaffer REST API.
- **Federated Store** - Similar to a proxy store however, this will forward all
requests to a collection of sub graphs but merge the responses so they
appear as one graph.
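
To make the quickstart hunks above concrete, here is a minimal sketch of calling the REST API from Python once the container is running. The `requests` library and the `/rest/graph/operations/execute` endpoint path are assumptions based on the Swagger UI exposed at `/rest`:

```python
import requests

# Fetch every element held in the quickstart container's map store.
# The endpoint path is an assumption based on the Swagger UI at /rest.
response = requests.post(
    "http://127.0.0.1:8080/rest/graph/operations/execute",
    json={"class": "uk.gov.gchq.gaffer.operation.impl.get.GetAllElements"},
)
response.raise_for_status()
print(response.json())  # all elements currently in the (in memory) graph
```
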
8 changes: 4 additions & 4 deletions docs/development-guide/example-deployment/project-setup.md
@@ -2,7 +2,7 @@

This guide will run through the start up and deployment of a basic Gaffer instance. It will cover
how to write a basic Gaffer Schema from scratch along with using the pre-made containers to run the
-Gaffer rest API and Accumulo based data store.
+Gaffer REST API and Accumulo based data store.

!!! warning
Please be aware that the example is only intended to demonstrate the core Gaffer concepts it is
@@ -12,7 +12,7 @@ Gaffer rest API and Accumulo based data store.
## The Example Graph

For this basic example we will attempt to recreate the graph in the following diagram consisting of
-two nodes (vertexes) with one directed edge between them.
+two entities with one directed edge between them.

```mermaid
graph LR
@@ -104,7 +104,7 @@ that suites a stand alone deployment consisting of the following file structure:
2. Any data files, e.g. CSV, to be made available to the Gaffer container.
3. The main graph config file to set various properties of the overall graph.
4. This file holds the schema outlining the elements in the graph, e.g. the
-   nodes (aka entities) and edges.
+   entities and edges.
5. This file defines the different data types in the graph and how they are
serialised to Java classes.
6. Config file for additional Gaffer operations and set the class to handle
@@ -189,7 +189,7 @@ gaffer.store.operation.declarations=/gaffer/store/operationsDeclarations.json
### Operations Declarations

The operation declarations file is a way of enabling additional operations in Gaffer. By default
-there are some built in operations already available (the rest API has a get all operations request
+there are some built in operations already available (the REST API has a get all operations request
to see a list), but its likely you might want to enable others or add your own custom ones. As the
example will load its data from a local CSV file we can activate a couple of additional operations
using the following file.
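
The declarations file itself is collapsed in this diff. As a hedged illustration of its shape: each entry binds an operation class to a handler class. The class paths below follow Gaffer's naming conventions but are assumptions, not copied from the collapsed file:

```python
import json

# Sketch of an operationsDeclarations.json enabling an extra operation that
# reads a local file. Operation and handler class paths are assumptions
# based on Gaffer's naming conventions.
operations_declarations = {
    "operations": [
        {
            "operation": "uk.gov.gchq.gaffer.operation.impl.export.localfile.ImportFromLocalFile",
            "handler": {
                "class": "uk.gov.gchq.gaffer.store.operation.handler.export.localfile.ImportFromLocalFileHandler"
            },
        }
    ]
}

print(json.dumps(operations_declarations, indent=2))
```
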
26 changes: 13 additions & 13 deletions docs/development-guide/example-deployment/using-the-api.md
@@ -14,7 +14,7 @@ use to load data or query.
Gaffer supports various methods of loading data and depending on your use case you can even bypass
it all together to load directly into Accumulo.

-This example will focus on using the rest API to add the graph elements. In production this method
+This example will focus on using the REST API to add the graph elements. In production this method
would not be recommended for large volumes of data. However, it is fine for smaller data sets and
generally can be done in a few stages outlined in the following diagram.

@@ -33,11 +33,11 @@ is a standard `AddElements` operation which takes raw elements JSON as input and
graph.

!!! info
-    This is where the schema is used here to validate the elements are correct and conform before
-    adding.
+    This is where the schema is used to validate the elements are correct and conform before
+    adding them to a graph.

Using the example again we will demonstrate how we could write an operation chain to load the data
-from the neo4j formatted CSV file.
+from the Neo4j formatted CSV file.

```json
{
@@ -66,24 +66,24 @@ via the `operationsDeclarations.json`), which
streams the data from the CSV file into the next `GenerateElements` operation.

For the generator we have selected the built in `Neo4jCsvElementGenerator` class, this is already
-set up to be able to parse a correctly formatted neo4j exported CSV into Gaffer elements via the
+set up to be able to parse a correctly formatted Neo4j exported CSV into Gaffer elements via the
schema. If you are curious as to what the output of each operation is you can try run a subset of
this chain to see how the data changes on each one, the output should be returned back to you in the
server response section of the Swagger API.

## Querying Data

-Once data is loaded in the graph its now possible to start querying the data to gain insight and
+Once data is loaded into the graph it is now possible to start querying the data to gain insight and
perform analytics. Querying in Gaffer can get fairly complex but generally simple queries are made
up of two parts; a `Get` Operation and a `View`.

-Starting with the `Get` operation, say we want to get all nodes and edges based on their ID. To do
-this we can use the `GetElements` operation and set the `Seed` to the entity (e.g. node) or edge
+Starting with the `Get` operation, say we want to get all entities and edges based on their ID. To do
+this we can use the `GetElements` operation and set the `Seed` to the vertex or edge
where we want to start the search. To demonstrate this on the example graph we can attempt to get
-all entities and edges associated with the `Person` node with ID `v1`.
+all entities and edges associated with the `Person` entity with ID `v1`.

-The result from this query should return the node associated with the `v1` id along with any edges
-on this node, which in this case is just one
+The result from this query should return the entity associated with the `v1` id along with any edges
+on this vertex, which in this case is just one.

=== "Input Query"
```json
@@ -148,9 +148,9 @@ manipulate the results. In general a `View` has the following possible use cases
or excluded.

Taking the example from the previous section we will demonstrate general filtering on a query. As
-before, the query returns the node `v1` and any edges associated with it. We will now filter it to
+before, the query returns the vertex `v1` and any edges associated with it. We will now filter it to
include only edges where the weight is over a certain value. In this scenario it is analogous to
-asking, *"get all the `Created` edges on node `v1` that have a `weight` greater than 0.3"*.
+asking, *"get all the `Created` edges on vertex `v1` that have a `weight` greater than 0.3"*.

=== "Filter Query"

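
The documented input queries are collapsed in this diff. As a sketch of the seeded, filtered query described above, with class paths that follow Gaffer and Koryphe conventions but should be treated as assumptions:

```python
import requests

# "Get all the Created edges on vertex v1 that have a weight greater
# than 0.3" expressed as a GetElements operation with a View filter.
filtered_query = {
    "class": "uk.gov.gchq.gaffer.operation.impl.get.GetElements",
    "input": [
        {"class": "uk.gov.gchq.gaffer.operation.data.EntitySeed", "vertex": "v1"}
    ],
    "view": {
        "edges": {
            "Created": {
                "preAggregationFilters": [
                    {
                        "selection": ["weight"],
                        "predicate": {
                            "class": "uk.gov.gchq.koryphe.impl.predicate.IsMoreThan",
                            "value": {"java.lang.Float": 0.3},
                        },
                    }
                ]
            }
        }
    },
}

response = requests.post(
    "http://127.0.0.1:8080/rest/graph/operations/execute", json=filtered_query
)
response.raise_for_status()
print(response.json())  # expected: the single qualifying Created edge
```
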
28 changes: 14 additions & 14 deletions docs/development-guide/example-deployment/writing-the-schema.md
@@ -1,7 +1,7 @@
# Writing the Schema

In Gaffer JSON based schemas need to be written upfront to model and understand how to load and
-treat the data in the graph. These schemas define all aspects of the nodes and edges in the graph,
+treat the data in the graph. These schemas define all aspects of the entities and edges in the graph,
and can even be used to automatically do basic analysis or aggregation on queries and ingested data.

For reference, this guide will use the same CSV data set from the [project setup](./project-setup.md#the-example-graph) page.
@@ -23,7 +23,7 @@ For reference, this guide will use the same CSV data set from the [project setup

## Elements Schema

-In Gaffer an element refers to any object in the graph, i.e. your nodes (vertexes) and edges. To set
+In Gaffer, an element refers to any object in the graph, i.e. your entities and edges. To set
up a graph we need to tell Gaffer what objects are in the graph and the properties they have. The
standard way to do this is a JSON config file in the schema directory. The filename can just be
called something like `elements.json`, the name is not special as all files under the `schema`
@@ -33,7 +33,7 @@ using an appropriate name.
As covered in the [Getting Started Schema page](../../user-guide/schema.md), to write a schema you can see that there are some
required fields, but largely a schema is highly specific to your input data.

-Starting with the `entities` from the example, we can see there will be two distinct types of nodes
+Starting with the `entities` from the example, we can see there will be two distinct types of entity
in the graph; one representing a `Person` and another for `Software`. These can be added into the
schema to give something like the following:

@@ -59,7 +59,7 @@ schema to give something like the following:
From the basic schema you can see that we have added two entity types for the graph. For now, each
`entity` just contains a short description and a type associated to the `vertex` key. The type here
is just a placeholder, but it has been named appropriately as it's assumed that we will just use the
-string representation of the node's id (this will be defined in the `types.json` later in the
+string representation of the entities id (this will be defined in the `types.json` later in the
guide).

Expanding on the basic schema we will now add the `edges` to the graph. As the example graph is
@@ -92,14 +92,14 @@ As discussed in the [user schema guide](../../user-guide/schema.md), edges have
the `source` and `destination` fields, these must match the types associated with the vertex field
in the relevant entities. From the example, we can see that the source of a `Created` edge is a
`Person` so we will use the placeholder type we set as the `vertex` field which is
-`id.person.string`. Similarly the destination is a `Software` node so we will use its placeholder of
+`id.person.string`. Similarly the destination is a `Software` vertex so we will use its placeholder of
`id.software.string`.

We must also set whether an edge is directed or not, in this case it is as only a person can create
software not the other way around. To set this we will use the `true` type, but note that this is a
placeholder and must still be defined in the types.json.

-Continuing with the example, the nodes and edges also have some properties associated with each such
+Continuing with the example, the entities and edges also have some properties associated with each such
as name, age etc. These can also be added to the schema using a properties map to result in the
extended schema below.

@@ -152,10 +152,10 @@ schema there are some placeholder types added as the values for many of the keys
similarly to if you have ever programmed in a strongly typed language, they are essentially the
wrapper for the value to encapsulate it.

-Now starting with the types for the nodes/vertexes, we used two placeholder types, one for the
+Now starting with the types for the entities, we used two placeholder types, one for the
`Person` entity and one for the `Software` entity. From the example CSV you can see there is a `_id`
-column that uses a string identifier that is used for the ID of the node (this will also be used by
-the `edge` to identify the source and destination). We will define a type for each node ID using the
+column that uses a string identifier that is used for the ID of the entity (this will also be used by
+the `edge` to identify the source and destination). We will define a type for each entity ID using the
standard java `String` class to encapsulate it, this leads to a basic `type.json` like the
following.

@@ -175,17 +175,17 @@ following.
```

The next set of types that need defining are, the ones used for the properties that are attached to
-the nodes/entities. Again we need to take a look back at what our input data looks like, in the CSV
+the entities. Again we need to take a look back at what our input data looks like, in the CSV
file we can see there are three different types that are used for the properties which are analogous
to a `String`, an `Integer` and a `Float`.

!!! tip
-    Of course technically, all of these properties could be encapsulated in a string but, assigning
-    a relevant type allows some additional type specific features when doing things like grouping
-    and aggregation as it would in traditional programming.
+    Of course technically, all of these properties could be encapsulated in a string but assigning
+    a relevant type allows some additional type specific features often used in grouping
+    and aggregation.

If we make a type for each of the possible properties using the standard Java classes we end up with
-the following.
+the following:

```json
{
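
The full types file is collapsed in this diff. As a sketch of what a complete `types.json` for the schema above might contain; the type names are the placeholders used in the text, while the descriptions and the `IsTrue` validator are assumptions based on common Gaffer conventions:

```python
# Sketch of a types.json covering the entity IDs, the property types and
# the "true" placeholder used for the directed flag. Descriptions and the
# IsTrue validator are assumptions.
types_schema = {
    "types": {
        "id.person.string": {
            "description": "A string ID for a Person entity",
            "class": "java.lang.String",
        },
        "id.software.string": {
            "description": "A string ID for a Software entity",
            "class": "java.lang.String",
        },
        "property.string": {"class": "java.lang.String"},
        "property.integer": {"class": "java.lang.Integer"},
        "property.float": {"class": "java.lang.Float"},
        "true": {
            "description": "A boolean that must always be true",
            "class": "java.lang.Boolean",
            "validateFunctions": [
                {"class": "uk.gov.gchq.koryphe.impl.predicate.IsTrue"}
            ],
        },
    }
}
```
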
2 changes: 1 addition & 1 deletion docs/development-guide/introduction.md
@@ -15,7 +15,7 @@ Organization](https://github.com/orgs/gchq/repositories).
The core Java [Gaffer repo](https://github.com/gchq/Gaffer) contains the main Gaffer product.
If you are completely new to Gaffer you can try out our [Road Traffic Demo](https://github.com/gchq/Gaffer/blob/master/example/road-traffic/README.md) or look at our example [deployment guide](../development-guide/example-deployment/project-setup.md).

-The [gafferpy repo](https://github.com/gchq/gafferpy) contains a python shell that can execute operations.
+The [gafferpy repo](https://github.com/gchq/gafferpy) contains a Python shell that can execute operations.

The [gaffer-docker repo](https://github.com/gchq/gaffer-docker) contains the code needed to run Gaffer using Docker or Kubernetes.
More information about running a containerised instance of Gaffer can be found in our [adminstration guide](../administration-guide/introduction.md).
2 changes: 1 addition & 1 deletion docs/development-guide/rest-api-sketches.md
@@ -28,7 +28,7 @@ object using the `ObjectMapper` module which uses the relevant deserialiser (

## Creating cardinality values over JSON

-When adding or updating a cardinality object over the rest api, you specify the vertex values to add to the sketch.
+When adding or updating a cardinality object over the REST API, you specify the vertex values to add to the sketch.
This is done by either using the `offers` field with `HyperLogLogPlus`, or the `values` field with `HllSketch`.
The HyperLogLog object is then instantiated and updated with
the values. The object can then be serialised and stored in the datastore.
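
To illustrate the `values` field mentioned above, a sketch of a cardinality entity expressed as an operation payload. Only the `HllSketch` class and the `values` field come from the text; the group and property names are hypothetical:

```python
# Sketch of an AddElements payload carrying a cardinality Entity whose
# HllSketch property is initialised from the "values" field. The group
# ("cardinality") and property name ("approxVertexCount") are hypothetical.
add_cardinality = {
    "class": "uk.gov.gchq.gaffer.operation.impl.add.AddElements",
    "input": [
        {
            "class": "uk.gov.gchq.gaffer.data.element.Entity",
            "group": "cardinality",
            "vertex": "A",
            "properties": {
                "approxVertexCount": {
                    "org.apache.datasketches.hll.HllSketch": {
                        "values": ["B", "C"]  # vertex values offered to the sketch
                    }
                }
            },
        }
    ],
}
```
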
2 changes: 1 addition & 1 deletion docs/reference/glossary.md
@@ -16,7 +16,7 @@ hide:
| Stores | A Gaffer store represents the backing database responsbile for storing or facilitating access to a graph |
| Operations | An operation is an instruction / function that you send to the API to manipulate and query a graph |
| Matched vertex | `matchedVertex` is a field added to Edges which are returned by Gaffer queries, stating whether your seeds matched the source or destination |
-| Python | A programming language that is used to build applications. Gaffer uses python to interact with the API |
+| Python | A programming language that is used to build applications. Gaffer uses Python to interact with the API |
| Java | A object oriented programming language used to build software. Gaffer is primarily built in Java |
| Database | A database is a collection of organised structured information or data typically stored in a computer system |
| API | Application Programming Interface. An API is for one or more services / systems to communicate with each other |
2 changes: 1 addition & 1 deletion docs/reference/operations-guide/accumulo.md
@@ -280,7 +280,7 @@ This operation has been introduced as a replacement to the `GetElementsBetweenSe

!!! warning "Currently Unavailable"

-    The python API for this operation is currently unavailable [see this issue](https://github.com/gchq/gafferpy/issues/14).
+    The Python API for this operation is currently unavailable [see this issue](https://github.com/gchq/gafferpy/issues/14).

Results:

8 changes: 4 additions & 4 deletions docs/user-guide/apis/java-api.md
@@ -1,19 +1,19 @@
# Using the Java API

As Gaffer is written in Java there is native support to allow use of all its
-public classes. Using Gaffer via the Java interface does differ from the rest
+public classes. Using Gaffer via the Java interface does differ from the REST
API and `gafferpy` but is fully featured with extensive
[Javadocs](https://gchq.github.io/Gaffer/overview-summary.html). However, you
-will of course need to be familiar with writing and running Java code in order
+will need to be familiar with writing and running Java code in order
to utilise this form of the API.

## Querying a Graph

-Using Java to query a graph unlike the other APIs requires a reference to a
+Using Java to query a graph, unlike the other APIs, requires a reference to a
`Graph` object that essentially represents a graph.

With the other APIs you would connect directly to a running instance via the
-rest interface; however, to do this with Java you would need to configure a
+REST interface; however, to do this with Java you would need to configure a
`Graph` object with a [proxy store](../../administration-guide/gaffer-stores/proxy-store.md).

!!! example ""