Create a file called Dockerfile (the name matters: docker build looks for a file with this exact name by default)
Run vi Dockerfile, enter the content below, then save and exit
FROM openjdk:8-alpine
RUN apk --update add wget tar bash
RUN tar -xzf spark-3.0.0-preview-bin-hadoop2.7.tgz && \
    mv spark-3.0.0-preview-bin-hadoop2.7 /spark
COPY start-master.sh /start-master.sh
COPY start-worker.sh /start-worker.sh
Next, create two shell scripts, start-master.sh and start-worker.sh, to start up the Spark cluster with master and worker nodes
vi start-master.sh, enter below, save and exit
#!/bin/sh
/spark/bin/spark-class org.apache.spark.deploy.master.Master \
Make the script executable: chmod +x start-master.sh
vi start-worker.sh, enter below, save and exit
#!/bin/sh
/spark/bin/spark-class org.apache.spark.deploy.worker.Worker \
Make the script executable: chmod +x start-worker.sh
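Before building, it can help to confirm that all three files are in place and that both scripts are executable, since the COPY lines in the Dockerfile fail if a file is missing. The following is only a minimal sketch: the function name check_build_context is hypothetical and not part of Docker or Spark.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker or Spark): verify that a directory
# holds the Dockerfile and both start scripts, and that the scripts are executable.
check_build_context() {
    dir="${1:-.}"   # directory to check, defaults to the current one
    rc=0
    # All three files must exist for the build to succeed.
    for f in Dockerfile start-master.sh start-worker.sh; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            rc=1
        fi
    done
    # The two scripts must also carry the executable bit (set with chmod +x).
    for f in start-master.sh start-worker.sh; do
        if [ -f "$dir/$f" ] && [ ! -x "$dir/$f" ]; then
            echo "not executable: $f"
            rc=1
        fi
    done
    return $rc
}
```

Called as check_build_context . from inside docker_dir, it prints nothing and returns 0 when everything is ready, and otherwise lists what still needs fixing.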
Then build the docker image to be used in our class.
This assumes you are inside the docker_dir directory; if not, cd into it, because it contains the required Dockerfile
Run the command below to build the Spark cluster Docker image
docker build -t spark_lab/spark:latest .
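Once the build finishes, one way to confirm that the image is available locally is to ask Docker to inspect it. This is a small sketch rather than a required step; the function name image_exists is hypothetical, while docker image inspect is a standard Docker CLI command that exits with a non-zero status when the image is not found.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) only if the named image
# exists in the local Docker image store.
image_exists() {
    docker image inspect "$1" >/dev/null 2>&1
}

# Example usage, with the tag from the build step:
#   if image_exists spark_lab/spark:latest; then
#       echo "image found"
#   else
#       echo "image not found, re-run docker build"
#   fi
```

If the check fails, the build output usually shows which Dockerfile step went wrong.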