This is a starter app for the Apache Spark Python template.
To run the application, execute the following steps:
- Set up a Spark cluster as described at http://github.com/big-data-europe/docker-spark by running:
git clone https://github.com/big-data-europe/docker-spark.git
cd docker-spark
docker-compose up -d
- Build the Docker image:
bash build.sh python-example examples/python
- Run the Docker container:
docker run --rm --network dockerspark_default --name pyspark-example bde2020/spark-python-example:3.3.0-hadoop3.3
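
The run step can also be scripted. The following is a minimal sketch, not part of the template: the helper names `build_run_command` and `run_example` are hypothetical, and the image, container name, and network are copied from the command above. The network is kept as a parameter because Docker Compose derives it from the project directory name, so it may differ on your machine; `docker network ls` shows the actual name.

```python
import shlex
import subprocess

# Values copied from the "Run the Docker container" step above.
IMAGE = "bde2020/spark-python-example:3.3.0-hadoop3.3"
CONTAINER_NAME = "pyspark-example"
DEFAULT_NETWORK = "dockerspark_default"


def build_run_command(network=DEFAULT_NETWORK, name=CONTAINER_NAME, image=IMAGE):
    """Return the `docker run` argument list for the example container."""
    return ["docker", "run", "--rm", "--network", network, "--name", name, image]


def run_example(network=DEFAULT_NETWORK):
    """Print the command, run it, and return Docker's exit code.

    Hypothetical helper: requires Docker on the PATH and a running cluster.
    """
    command = build_run_command(network=network)
    print("Running:", shlex.join(command))
    return subprocess.run(command).returncode
```

Calling `run_example()` starts the container on the default network; pass a different name, for example `run_example("my_network")`, if `docker network ls` reports one.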