Assignment-3-Deployment.md


Note

This project was built on Ubuntu 18.04 running on the Windows Subsystem for Linux (WSL), kernel 4.19.75-microsoft-standard, and tested on Ubuntu 18.04 (Bionic Beaver), kernel 5.2.0.

Requirements

  • OS: Linux (Ubuntu 18.04)
  • Docker
  • Flink (v1.9.1)
  • Kafka (v2.12)
  • Elasticsearch (v5.6.0)
  • Kibana (v5.6.0)
  • Redis (Latest Docker image)
  • Python (v3.*)

Note: All the commands below should be run from the root directory of the repository. Some of the bash scripts require root access, so please provide the root credentials if asked. Also note that starting the Flink slave nodes requires adding them to ~/.ssh/known_hosts, so you have to explicitly type yes when prompted.

Install Maven

  sudo apt install maven    

Install OpenJDK8

  sudo apt install openjdk-8-jdk

Install Docker

To install Docker, please follow the instructions here

Install Python

To install Python, please follow the instructions here

Install pip3

 sudo apt install python3-pip
 pip3 install -r code/customer-code/requirements.txt
 

Deployment

The following script deploys the whole pipeline and downloads the test data.

 code/deployment-scripts/deploy-all

Running the pipeline

Upload the schema of the final sink, Elasticsearch (mysimpbdp-coredms)

 code/customer-code/coredms-schema-upload
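
Under the hood, uploading the schema amounts to creating an Elasticsearch index with a mapping. The Python sketch below only illustrates that idea, assuming a local Elasticsearch 5.6 on port 9200 and a hypothetical index name, mapping type, and fields; the actual schema is whatever code/customer-code/coredms-schema-upload defines.

  import requests

  ES_URL = "http://localhost:9200"

  # Hypothetical index name, mapping type, and fields; the real schema lives
  # in code/customer-code/coredms-schema-upload.
  index_body = {
      "mappings": {
          "rides": {  # Elasticsearch 5.x mappings are grouped under a type
              "properties": {
                  "location": {"type": "geo_point"},
                  "pickup_time": {"type": "date"},
                  "total_amount": {"type": "float"},
              }
          }
      }
  }

  # Create the index only if it does not exist yet.
  if requests.head(ES_URL + "/taxi_rides").status_code == 404:
      requests.put(ES_URL + "/taxi_rides", json=index_body).raise_for_status()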

Transform location IDs into (lat, lon) pairs and populate Redis

 python3 code/customer-code/customer_transformer.py
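
Conceptually, this step builds a lookup table from location IDs to coordinates and caches it in Redis so the stream job can enrich incoming records. Below is a minimal sketch of that pattern, assuming a hypothetical CSV with location_id, lat, lon columns, a local Redis instance, and the redis-py client; the real input file and key layout are those used by customer_transformer.py.

  import csv
  import redis

  # Hypothetical lookup file; customer_transformer.py defines the real source.
  LOOKUP_CSV = "data/location_lookup.csv"

  r = redis.Redis(host="localhost", port=6379, decode_responses=True)

  with open(LOOKUP_CSV, newline="") as f:
      for row in csv.DictReader(f):
          # One hash per location id, holding its coordinates.
          r.hset("location:" + row["location_id"],
                 mapping={"lat": row["lat"], "lon": row["lon"]})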

Running Customerstreamapp

 code/customer-code/run-customerstreamapp <parallelism degree>

Running the Customer Real-Time View

 python3 code/customer-code/customer_realtime-view.py --n <number of locations>
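
The real-time view reports the busiest locations from the serving layer. The sketch below shows one way to do that with a terms aggregation against Elasticsearch, reusing the hypothetical taxi_rides index and location_id field from the earlier example; customer_realtime-view.py may query the stores differently, so treat this only as an illustration of what the --n parameter controls.

  import argparse
  import requests

  parser = argparse.ArgumentParser()
  parser.add_argument("--n", type=int, default=10, help="number of locations to show")
  args = parser.parse_args()

  # Top-N bucket aggregation over a hypothetical location_id field.
  query = {
      "size": 0,
      "aggs": {"top_locations": {"terms": {"field": "location_id", "size": args.n}}},
  }
  resp = requests.post("http://localhost:9200/taxi_rides/_search", json=query)
  resp.raise_for_status()
  for bucket in resp.json()["aggregations"]["top_locations"]["buckets"]:
      print(bucket["key"], bucket["doc_count"])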

Streaming customer data to Kafka

 python3 code/customer-code/customer_producer.py --rows <number of rows>
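
The producer replays rows of customer data into a Kafka topic at the ingestion end of the pipeline. Here is a minimal sketch using kafka-python, assuming a hypothetical input CSV, topic name, and broker address; customer_producer.py defines the real ones and is the script to run.

  import argparse
  import csv
  import json

  from kafka import KafkaProducer

  parser = argparse.ArgumentParser()
  parser.add_argument("--rows", type=int, default=1000, help="number of rows to stream")
  args = parser.parse_args()

  producer = KafkaProducer(
      bootstrap_servers="localhost:9092",
      value_serializer=lambda v: json.dumps(v).encode("utf-8"),
  )

  # Hypothetical data file and topic; customer_producer.py defines the real ones.
  with open("data/customer_data.csv", newline="") as f:
      for i, row in enumerate(csv.DictReader(f)):
          if i >= args.rows:
              break
          producer.send("customer-topic", value=row)

  producer.flush()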

For cleanup (removing everything)

 code/deployment-scripts/cleanup