This application has two primary features:
- Provides a REST API service that reads data from parquet files and returns it in JSON 'record' format.
- Consumes the REST API service and displays the JSON-formatted data in the browser.
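
As a rough illustration of the JSON 'record' format mentioned above, the sketch below turns column-oriented data into one JSON object per row. It uses only the standard library; the helper name `columns_to_records` and the sample columns are hypothetical and are not taken from this repo, which may build its responses differently.

```python
import json


def columns_to_records(columns):
    """Convert a column-oriented mapping into a list of row dicts."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    names = list(columns)
    # zip(*...) walks the columns in parallel, yielding one tuple per row.
    return [dict(zip(names, row)) for row in zip(*columns.values())]


# Hypothetical data standing in for columns read from a parquet file.
table = {
    "timestamp": ["2021-01-01T00:00:00", "2021-01-01T01:00:00"],
    "temperature": [20.5, 21.0],
}

records = columns_to_records(table)
payload = json.dumps(records)  # the string a REST endpoint could return
print(payload)
```

Each element of the resulting list is one row, keyed by column name, which is the shape a browser client can iterate over directly.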
Other features of the app include:
- Asset and column models with a one-to-many relation (one asset, many columns)
- A page that lists all possible parquet queries
A function in 'read_csv_and_store_parquet.py' uses PySpark to convert CSV data into wide-formatted parquet files partitioned by asset, year, and month.
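
The long-to-wide conversion and the asset/year/month partitioning can be pictured with the plain-Python sketch below. It is not the PySpark function from this repo: the long-format schema (asset, timestamp, column, value), the helper names, and the sample rows are all assumptions made for illustration. The directory layout mirrors the `key=value` naming that partitioned parquet writers commonly produce.

```python
from collections import defaultdict
from datetime import datetime


def long_to_wide(rows):
    """Pivot (asset, timestamp, column, value) rows into one dict per (asset, timestamp)."""
    wide = defaultdict(dict)
    for asset, timestamp, column, value in rows:
        wide[(asset, timestamp)][column] = value
    return [
        {"asset": asset, "timestamp": timestamp, **values}
        for (asset, timestamp), values in sorted(wide.items())
    ]


def partition_path(row):
    """Build a key=value directory path from the row's asset, year, and month."""
    ts = datetime.fromisoformat(row["timestamp"])
    return f"asset={row['asset']}/year={ts.year}/month={ts.month}"


# Hypothetical long-formatted input: one measurement per row.
long_rows = [
    ("pump_1", "2021-01-15T00:00:00", "pressure", 1.2),
    ("pump_1", "2021-01-15T00:00:00", "flow", 3.4),
    ("pump_1", "2021-02-01T00:00:00", "pressure", 1.3),
]

wide_rows = long_to_wide(long_rows)
paths = [partition_path(row) for row in wide_rows]
for row, path in zip(wide_rows, paths):
    print(path, row)
```

In the wide format each measurement name becomes its own column, so rows that share an asset and timestamp collapse into a single record before being written to their partition.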
Django Form, Model, and View tests have been developed and provide 100% coverage.
A Google Colaboratory notebook that explains how to process long-formatted CSV files into wide-formatted, partitioned parquet files is also included in this repo.
Developed by Stephen Utlak.
This software is available under the MIT license.