As we introduced ourselves to Spark in the previous post (check here), let's now look at the Spark architecture to get a deeper understanding of how Spark works. Simply put, Spark is a distributed computing platform: it lets us run our programs on a cluster of machines.

Now, how do we execute programs (jobs) on a Spark cluster? There are 2 ways to do this:

1. Interactive client (a Spark shell such as the Scala shell, PySpark shell, or notebooks) - best suited for exploration and experimentation.
2. Submit operation (submitting jobs via APIs) - full-fledged programs and projects that need to run in production submit jobs using the submit utility provided by Spark, as sketched below.

Spark runs the submitted jobs using a master-slave architecture. Every Spark application has one master process and multiple slave processes. In Spark, the master is called the Driver, and the slaves are called the Executor processes. The Driver is responsi...
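
To make the second option concrete, here is a minimal sketch of a standalone Spark application (not from this post; the application name and input path are placeholders). The main method runs inside the Driver, while the transformations on the data are carried out in parallel by the Executors.

```scala
import org.apache.spark.sql.SparkSession

object WordCountSketch {
  def main(args: Array[String]): Unit = {
    // This code runs in the Driver process: it builds the execution plan
    // and coordinates the Executors that do the actual work.
    val spark = SparkSession.builder()
      .appName("word-count-sketch") // hypothetical application name
      .getOrCreate()

    // Each transformation below is executed in parallel on the Executors.
    val counts = spark.read
      .textFile("hdfs:///tmp/input.txt") // placeholder input path
      .rdd
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // collect() brings the results back to the Driver.
    counts.collect().foreach(println)

    spark.stop()
  }
}
```

Such a program would typically be packaged into a jar and handed to the cluster with the spark-submit script, which is the submit utility mentioned above.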