Feature implementation

Implementation details

##DynamoDBClient

###Overview: DynamoDBClient (DDBC) provides methods to upload large items to AWS DynamoDB. The current implementation supports items of any size (theoretically unbounded) by splitting each item into chunks of at most 400 KB.

A typical flow of events when using DDBC is as follows:

1. A call is made to upload a record to DynamoDB.
2. DDBC, which is given the file size and the content as a String, iteratively breaks the data into byte arrays to send to AWS. No array is ever larger than 400 KB, the documented maximum item size in DynamoDB.
3. Each byte array holds part of a String representation of the data (e.g. one produced with the jackson-* libraries).
4. Two keys are in play: a partition key and a sort key. The chunks are stored in a separate chunk table whose composite primary key consists of both. The partition key is the token, and the sort key indicates the position of the chunk's bytes in the original, non-chunked item.
5. On finish, a status is returned to the caller that includes the total number of chunks uploaded to DynamoDB.
6. One last upload is made to the main metaprot-task table, which effectively acts as a manifest record. This record contains the same token used in the composite key of the chunks, along with details such as the timestamp, the number of chunks, and the original filename.
7. When data is to be retrieved, DDBC exposes a method to retrieve the String content (fetching as many chunks as needed). The return value should exactly match the input from step 1. The content is kept as a String so that any arbitrary data type can be uploaded and retrieved, as long as the caller can reasonably marshal and unmarshal it to and from some String representation.
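
The chunk-and-reassemble round trip described above can be illustrated with a minimal sketch. This is not the actual DDBC code: the class name `ChunkingSketch` and its methods are hypothetical, UTF-8 is an assumed encoding, and no AWS calls are made. The list index stands in for the sort key. Note that a real chunk would likely need to be somewhat smaller than 400 KB, since key attributes also count toward DynamoDB's item size.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not the real DDBC class.
final class ChunkingSketch {

    // DynamoDB's documented maximum item size, used here as the chunk ceiling.
    static final int MAX_CHUNK_BYTES = 400 * 1024;

    // Splits the UTF-8 bytes of content into arrays of at most maxChunkBytes.
    // The index of each array in the returned list plays the role of the sort key.
    static List<byte[]> chunk(String content, int maxChunkBytes) {
        if (maxChunkBytes <= 0) {
            throw new IllegalArgumentException("maxChunkBytes must be positive");
        }
        byte[] all = content.getBytes(StandardCharsets.UTF_8);
        List<byte[]> chunks = new ArrayList<>();
        int start = 0;
        while (start < all.length) {
            int len = Math.min(maxChunkBytes, all.length - start);
            chunks.add(Arrays.copyOfRange(all, start, start + len));
            start += len;
        }
        return chunks;
    }

    // Concatenates every chunk in order before decoding, so a multi-byte
    // character that was split across two chunks is still decoded correctly.
    static String reassemble(List<byte[]> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] c : chunks) {
            out.write(c, 0, c.length);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "example payload ".repeat(100_000);
        List<byte[]> chunks = chunk(original, MAX_CHUNK_BYTES);
        System.out.println("chunks: " + chunks.size());
        System.out.println("round trip ok: " + reassemble(chunks).equals(original));
    }
}
```

Decoding only after all bytes are joined is what makes it safe to cut the byte stream at arbitrary offsets; decoding each chunk on its own could corrupt characters that straddle a chunk boundary.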

##Metabolite Analysis (MA)

###Overview: MA uses both REST and web controllers. REST controllers exist to begin analysis, whereas the web controllers exist to display HTML results of the analysis.

Here is a breakdown of the expected interaction with the MA feature, as well as high-level details of the business logic:

1. On the front end, the user fills out a form that contains important threshold values, among others.
2. On form submit, the input file is uploaded to S3, client-side, via the AWS JavaScript SDK.
3. Once the input file is uploaded, a REST call is made to HTTP POST /analyze/metabolites/<token>, where <token> is a UUID retrieved from HTTP GET /analyze/token. This starts the analysis.
4. The server runs the appropriate R commands via Rserve, which writes a number of output files to a predefined directory.
5. The server reads these files and stores the results in the database.
6. The REST call from step 3 returns some HTML to display to the user: either an error message or a success message with a link to the results page.
7. The user navigates to the results page, and the web controller queries the database for the computed results.
8. The results are passed back to the front end, using Thymeleaf as the template engine.
9. The JavaScript modules and libraries (D3.js) then initialize and bind the necessary events. Notable JS classes: DataSegregator.js, which segregates the output from the server into significance groups, and SVGPlot.js, which handles all plotting and interactive events for the plot(s).
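
In the application, the browser makes the two REST calls in the flow above. For illustration (for example, when writing an integration test), the same two requests could be built in Java roughly as follows. This is a sketch under assumptions: the class `AnalysisRequestSketch`, the base URL, and the empty POST body are all hypothetical, since the request payload is not described here. Only the two paths come from the text above.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.UUID;

// Hypothetical helper for illustration only; it builds requests but does not send them.
final class AnalysisRequestSketch {

    // HTTP GET /analyze/token: asks the server for a fresh UUID token.
    static HttpRequest tokenRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/analyze/token"))
                .GET()
                .build();
    }

    // HTTP POST /analyze/<feature>/<token>: starts the analysis for that token.
    // The body is left empty because the real payload is an unknown here.
    static HttpRequest startAnalysisRequest(String baseUrl, String feature, String token) {
        UUID.fromString(token); // throws IllegalArgumentException if token is not a UUID
        return HttpRequest.newBuilder(URI.create(baseUrl + "/analyze/" + feature + "/" + token))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        String base = "http://localhost:8080"; // assumed base URL
        String token = UUID.randomUUID().toString(); // stands in for the server-issued token
        System.out.println(tokenRequest(base).uri());
        System.out.println(startAnalysisRequest(base, "metabolites", token).uri());
    }
}
```

Passing `"temporal-pattern-recognition"` as the feature segment yields the corresponding request for the other analysis type.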

##Temporal Pattern Recognition Analysis

###Overview: Temporal Pattern Recognition Analysis also uses both REST and web controllers. REST controllers exist to begin analysis, whereas the web controllers exist to display HTML results of the analysis.

Here is a breakdown of the expected interaction with the Temporal Pattern Recognition feature, as well as high-level details of the business logic:

1. On the front end, the user fills out a form that contains important cluster values, among others.
2. On form submit, the input file is uploaded to S3, client-side, via the AWS JavaScript SDK.
3. Once the input file is uploaded, a REST call is made to HTTP POST /analyze/temporal-pattern-recognition/<token>, where <token> is a UUID retrieved from HTTP GET /analyze/token. This starts the analysis.
4. The server runs the appropriate R commands via Rserve, which writes a number of output files to a predefined directory.
5. The server reads these files and stores the results in the database.
6. The REST call from step 3 returns some HTML to display to the user: either an error message or a success message with a link to the results page.
7. The user navigates to the results page, and the web controller queries the database for the computed results.
8. The results are passed back to the front end, using Thymeleaf as the template engine.
9. The user is also given the option of changing the input parameters and recomputing the clusters.
10. The JavaScript modules and libraries (D3.js) then initialize and bind the necessary events. Notable JS class: PatternRecogPlot.js, which handles all plotting and interactive events for the plot(s).
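
Both analysis features share the step in which the server reads the files that Rserve wrote to a predefined directory. A minimal sketch of that read step is shown below. The class `ResultFilesSketch`, the flat directory layout, and the choice to read each file as UTF-8 text are assumptions; the actual file formats and the database write are not covered here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; not the server's real code.
final class ResultFilesSketch {

    // Reads every regular file directly inside outputDir (subdirectories are
    // skipped) and returns a map of file name to file contents, sorted by name.
    static Map<String, String> readAll(Path outputDir) throws IOException {
        List<Path> files;
        try (Stream<Path> listing = Files.list(outputDir)) {
            files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        Map<String, String> results = new TreeMap<>();
        for (Path file : files) {
            results.put(file.getFileName().toString(), Files.readString(file));
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for the predefined output directory.
        Path dir = Files.createTempDirectory("analysis-output");
        Files.writeString(dir.resolve("clusters.csv"), "id,cluster\n1,2\n");
        readAll(dir).forEach((name, body) -> System.out.println(name + ": " + body.length() + " chars"));
    }
}
```

Closing the directory stream with try-with-resources matters here, because `Files.list` holds an open directory handle until the stream is closed.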
