First Time

  • Install Docker. On OSX, Docker 1.13.1 is known to be compatible with the demo. Newer versions might work, but some conflicts with Kubernetes have been found.

  • Install OpenShift Origin or Minishift. On OSX, OpenShift Origin 3.7 is known to work with the Docker version above. The instructions below assume you're using OpenShift Origin.

  • Install Maven 3.

  • Install Kubetail.

  • Install Jupyter via Anaconda.

Pre talk

  • Adjust hostPath in datagrid/datagrid.yml to point to the correct folder.

  • Make sure Docker is running.

  • Docker settings: 5 CPUs, 8 GB of memory

  • Start OpenShift cluster:

oc cluster down && oc cluster up
  • Deploy data grid:
cd datagrid
./deploy.sh

You can follow the deployment progress of the Infinispan server pods via:

kubetail -l cluster=datagrid
  • Deploy analytics component:
cd ../analytics
./deploy.sh
  • Start Jupyter notebook:
cd analytics/analytics-jupyter
~/anaconda/bin/jupyter notebook
  • Clear Jupyter output by clicking: Cell / All Output / Clear

Analytics Demo

  • Go to the Jupyter notebook, open live-demo.ipynb and verify that the URL returns 0 entries.

  • Implement the delay ratio task in the delays.java.stream.task.DelayRatioTask class:

Map<Integer, Long> totalPerHour = cache.values().stream()
      .collect(
            () -> Collectors.groupingBy(
                  e -> getHourOfDay(e.departureTs),
                  Collectors.counting()
            ));

Map<Integer, Long> delayedPerHour = cache.values().stream()
      .filter(e -> e.delayMin > 0)
      .collect(
            () -> Collectors.groupingBy(
                  e -> getHourOfDay(e.departureTs),
                  Collectors.counting()
            ));
  • Recompile and redeploy server task:
cd analytics
mvn clean package -pl analytics-server
yes | cp analytics-server/target/analytics-server-1.0-SNAPSHOT.jar ../datagrid/target/analytics-server.jar
  • Go to the Jupyter notebook and re-run each cell of live-demo.ipynb. The value of analytics.size should be 48. The hour with the highest ratio of delayed trains should be 2am.
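
As a rough illustration of what the delay ratio task computes, the sketch below groups entries by departure hour, counts total and delayed departures, and divides one by the other. It uses plain JDK streams over an in-memory list rather than the Infinispan cache stream used in the demo. The `Stop` class, the epoch-millisecond timestamps and the UTC-based `getHourOfDay` helper are assumptions made for this example, not the demo's actual types.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class DelayRatioSketch {

    // Hypothetical stand-in for the cached entries: only the two fields the task reads.
    static final class Stop {
        final long departureTs; // assumed to be epoch milliseconds
        final int delayMin;     // delay in minutes, 0 when on time

        Stop(long departureTs, int delayMin) {
            this.departureTs = departureTs;
            this.delayMin = delayMin;
        }
    }

    // Assumed helper: hour of day (0-23) in UTC.
    static int getHourOfDay(long ts) {
        return Instant.ofEpochMilli(ts).atZone(ZoneOffset.UTC).getHour();
    }

    // Ratio of delayed departures to all departures, keyed by hour of day.
    static Map<Integer, Double> delayRatioPerHour(List<Stop> stops) {
        Map<Integer, Long> totalPerHour = stops.stream()
            .collect(Collectors.groupingBy(
                e -> getHourOfDay(e.departureTs),
                Collectors.counting()));

        Map<Integer, Long> delayedPerHour = stops.stream()
            .filter(e -> e.delayMin > 0)
            .collect(Collectors.groupingBy(
                e -> getHourOfDay(e.departureTs),
                Collectors.counting()));

        // Hours with no delayed departures get a ratio of 0.0.
        Map<Integer, Double> ratio = new TreeMap<>();
        totalPerHour.forEach((hour, total) ->
            ratio.put(hour, delayedPerHour.getOrDefault(hour, 0L) / (double) total));
        return ratio;
    }

    // Hour with the highest delay ratio.
    static int worstHour(Map<Integer, Double> ratioPerHour) {
        return ratioPerHour.entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .orElseThrow(() -> new IllegalArgumentException("no data"))
            .getKey();
    }

    public static void main(String[] args) {
        long hour = 3_600_000L;
        List<Stop> stops = Arrays.asList(
            new Stop(2 * hour, 5),          // 02:00, delayed
            new Stop(2 * hour + 60_000, 0), // 02:01, on time
            new Stop(9 * hour, 0),          // 09:00, on time
            new Stop(9 * hour + 60_000, 0)  // 09:01, on time
        );
        Map<Integer, Double> ratio = delayRatioPerHour(stops);
        System.out.println(ratio);            // prints {2=0.5, 9=0.0}
        System.out.println(worstHour(ratio)); // prints 2
    }
}
```

In the demo itself the collectors are passed as a supplier lambda, as shown in the task snippet above, because the stream runs against the data grid. The arithmetic on the resulting maps is the same.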