- Install Docker. If running on OS X, Docker 1.13.1 is compatible with the demo. Newer versions might work, but some conflicts with Kubernetes have been found.
- Install OpenShift Origin or Minishift. If running on OS X, OpenShift Origin 3.7 is known to work with the Docker version above. The instructions below assume you're using OpenShift Origin.
- Install Maven 3.
- Install Kubetail.
- Install Jupyter via Anaconda.
- Adjust hostPath in datagrid/datagrid.yml to point to the correct folder.
- Make sure Docker is running.
- Docker settings: 5 CPUs, 8 GB of memory.
- Start OpenShift cluster:
oc cluster down && oc cluster up
- Deploy data grid:
cd datagrid
./deploy.sh
You can follow progress of deployment of Infinispan server pods via:
kubetail -l cluster=datagrid
- Deploy analytics component:
cd ../analytics
./deploy.sh
- Open Chrome and verify all pods are running.
- Start Jupyter, open live-demo.ipynb and verify that the URL returns 0 results:
cd analytics/analytics-jupyter
~/anaconda/bin/jupyter notebook
- Clear Jupyter output by clicking Cell / All Output / Clear.
- Go to Jupyter notebook, open live-demo.ipynb and verify that the URL returns 0 entries.
- Implement the delay ratio task in the delays.java.stream.task.DelayRatioTask class:
// Number of entries per departure hour.
Map<Integer, Long> totalPerHour = cache.values().stream()
    .collect(
        () -> Collectors.groupingBy(
            e -> getHourOfDay(e.departureTs),
            Collectors.counting()
        ));

// Number of delayed entries (delayMin > 0) per departure hour.
Map<Integer, Long> delayedPerHour = cache.values().stream()
    .filter(e -> e.delayMin > 0)
    .collect(
        () -> Collectors.groupingBy(
            e -> getHourOfDay(e.departureTs),
            Collectors.counting()
        ));
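The two grouping steps above can be tried outside the data grid with plain java.util.stream. The following is a minimal sketch under stated assumptions, not the demo's actual task: the Stop class, the field types, the UTC-based getHourOfDay and the ratioPerHour helper are hypothetical stand-ins. The collector is passed directly rather than through a supplier lambda, because an ordinary Java stream is used here instead of the cache stream.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class DelayRatioSketch {

    // Hypothetical stand-in for the cached value type. Only the two fields
    // used by the task are modelled, and their types are assumptions.
    static final class Stop {
        final long departureTs; // assumed: epoch milliseconds
        final int delayMin;     // delay in minutes, 0 when on time

        Stop(long departureTs, int delayMin) {
            this.departureTs = departureTs;
            this.delayMin = delayMin;
        }
    }

    // Assumed behaviour of getHourOfDay: hour of day (0-23) in UTC.
    static int getHourOfDay(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC).getHour();
    }

    // Count all entries per departure hour.
    static Map<Integer, Long> totalPerHour(List<Stop> stops) {
        return stops.stream()
            .collect(Collectors.groupingBy(
                e -> getHourOfDay(e.departureTs),
                Collectors.counting()));
    }

    // Count only delayed entries (delayMin > 0) per departure hour.
    static Map<Integer, Long> delayedPerHour(List<Stop> stops) {
        return stops.stream()
            .filter(e -> e.delayMin > 0)
            .collect(Collectors.groupingBy(
                e -> getHourOfDay(e.departureTs),
                Collectors.counting()));
    }

    // Hypothetical helper: delayed / total for every hour that has entries.
    // Hours without delayed entries get a ratio of 0.0.
    static Map<Integer, Double> ratioPerHour(Map<Integer, Long> total,
                                             Map<Integer, Long> delayed) {
        Map<Integer, Double> ratios = new TreeMap<>();
        total.forEach((hour, count) ->
            ratios.put(hour, delayed.getOrDefault(hour, 0L) / count.doubleValue()));
        return ratios;
    }

    public static void main(String[] args) {
        long hourMs = 3_600_000L;
        // Made-up sample data: two departures at 02:xx UTC (one delayed),
        // one on-time departure at 09:00 UTC.
        List<Stop> stops = Arrays.asList(
            new Stop(2 * hourMs, 5),
            new Stop(2 * hourMs + 60_000L, 0),
            new Stop(9 * hourMs, 0));

        Map<Integer, Long> total = totalPerHour(stops);
        Map<Integer, Long> delayed = delayedPerHour(stops);
        System.out.println(ratioPerHour(total, delayed)); // prints {2=0.5, 9=0.0}
    }
}
```

Hours that never see a delay are missing from the delayed map, which is why the sketch falls back to 0 with getOrDefault instead of calling get directly.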
- Recompile and redeploy server task:
cd analytics
mvn clean package -pl analytics-server
yes | cp analytics-server/target/analytics-server-1.0-SNAPSHOT.jar ../datagrid/target/analytics-server.jar
- Go to Jupyter notebook and run each cell of live-demo.ipynb again. The value for analytics.size should be 48. The hour with the biggest ratio of delayed trains should be 2am.
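The last check amounts to picking the hour with the largest delayed/total ratio. A minimal sketch of that selection in plain Java, assuming the ratios are already available as a map from hour of day to ratio. The class name, the method name and the sample values are hypothetical and are not the demo's data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class PeakDelayHourSketch {

    // Returns the hour whose ratio is largest, or an empty Optional
    // when the map has no entries.
    static Optional<Integer> hourWithHighestRatio(Map<Integer, Double> ratioPerHour) {
        return ratioPerHour.entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        // Made-up ratios for three hours of the day.
        Map<Integer, Double> ratios = new HashMap<>();
        ratios.put(1, 0.10);
        ratios.put(2, 0.45);
        ratios.put(17, 0.30);

        System.out.println(hourWithHighestRatio(ratios).orElse(-1)); // prints 2
    }
}
```

Returning an Optional keeps the empty-cache case explicit, which corresponds to the state of the notebook before any data has been loaded.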