How to Install Apache Spark Standalone on CentOS 7: See How Easy It Is in 5 Minutes!


In this tutorial I will show you how to install Apache Spark Standalone on CentOS 7.


Apache Spark

Apache Spark is an open-source distributed computing system designed for large-scale data processing. It is a fast and flexible engine that can process structured, semi-structured, and unstructured data.

Spark provides a range of tools and libraries for data processing, including support for SQL, machine learning, and stream processing. It also offers an interactive programming model that makes it easier to develop and deploy distributed applications.

Spark is designed to be highly scalable. It can run on a single machine or on a cluster of hundreds of machines, which makes it a good fit for big data processing tasks.

Overall, Apache Spark is a powerful and versatile platform that is well suited to a wide range of data processing tasks and applications.


CentOS

CentOS (Community Enterprise Operating System) is a free and open-source Linux distribution based on Red Hat Enterprise Linux (RHEL). It is designed to be stable, reliable, and secure.

CentOS 7 is the seventh major release of the CentOS operating system. It was released in 2014 and is based on RHEL 7. Its improvements over earlier releases include XFS as the default file system and support for Docker.

CentOS 7 is a popular choice for servers and other enterprise-level systems because of its stability and security features. It is also widely used in the hosting industry and for building web servers and other Internet-facing systems.

To use CentOS 7, you need to install it on a computer or server. You can download the CentOS 7 installation media (such as an ISO image) from the CentOS website and create a bootable USB drive or DVD. Once the installation media is prepared, boot your computer from it and follow the prompts to install CentOS 7.

Install Spark on CentOS 7

To install Spark on CentOS 7, follow these steps:


Step #1: Install Java

First, install Java. Spark runs on the Java Virtual Machine, so Java must be installed on your machine before you install Apache Spark on CentOS 7. The commands below install OpenJDK 8 and print the installed version:

[root@sparkCentOs pawel]# sudo yum install java-1.8.0-openjdk
[root@sparkCentOs pawel]# java -version
openjdk version "1.8.0_161"
OpenJDK Runtime Environment (build 1.8.0_161-b14)
OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)

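If java -version reports that the command is not found, Java is not installed yet. You can install it by running the following command:

yum install java-1.8.0-openjdk

This package contains only the Java runtime, which is enough to run Spark. The compiler (javac) is provided by the separate java-1.8.0-openjdk-devel package.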
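If you want to check the Java version from a script rather than by eye, the version string can be pulled out of the java -version output. The following is only a sketch: java_version_of is a hypothetical helper name introduced here, and it assumes the output contains a line of the form version "..." as in the transcript above. Note that java -version writes to stderr, hence the 2>&1.

```shell
# Extract the quoted version string from the text printed by `java -version`.
java_version_of() {
    # $1: full output of `java -version`
    printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Example with the first line of the output shown above
java_version_of 'openjdk version "1.8.0_161"'   # prints 1.8.0_161

# Check the Java that is actually on PATH, if any
if command -v java >/dev/null 2>&1; then
    java_version_of "$(java -version 2>&1)"
else
    echo "java not found on PATH"
fi
```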

Step #2: Install Scala

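In the second step, install Scala. Download the scala-2.11.8.tgz archive, unpack it, move it to /usr/lib, and add its bin directory to your PATH: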

[root@sparkCentOs pawel]# wget
[root@sparkCentOs pawel]# tar xvf scala-2.11.8.tgz
[root@sparkCentOs pawel]# sudo mv scala-2.11.8 /usr/lib
[root@sparkCentOs pawel]# sudo ln -s /usr/lib/scala-2.11.8 /usr/lib/scala
[root@sparkCentOs pawel]# export PATH=$PATH:/usr/lib/scala/bin
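The export on the last line only affects the current shell session, so the scala command will no longer be on PATH after you log out. One way to make the change permanent, sketched here under the assumption that your login shell reads ~/.bash_profile, is to append the line to that file exactly once. The persist_path_entry helper is a hypothetical name introduced for this example, and the demo writes to a temporary file rather than to your real profile.

```shell
# Append "export PATH=$PATH:<dir>" to a profile file unless that line is already there.
persist_path_entry() {
    profile_file="$1"   # e.g. ~/.bash_profile
    dir="$2"            # e.g. /usr/lib/scala/bin
    line="export PATH=\$PATH:$dir"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$profile_file" 2>/dev/null || printf '%s\n' "$line" >> "$profile_file"
}

# Demonstrate on a temporary file
demo_profile="$(mktemp)"
persist_path_entry "$demo_profile" /usr/lib/scala/bin
persist_path_entry "$demo_profile" /usr/lib/scala/bin   # second call adds nothing
cat "$demo_profile"
```

To apply it for real, you would call persist_path_entry ~/.bash_profile /usr/lib/scala/bin and then open a new login shell.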

Step #3: Install Apache Spark

Now download the Apache Spark 2.3.1 archive (spark-2.3.1-bin-hadoop2.7.tgz) from the official website and install it on your machine.

# Download Spark
[root@sparkCentOs pawel]# wget
[root@sparkCentOs pawel]# tar xf spark-2.3.1-bin-hadoop2.7.tgz
[root@sparkCentOs pawel]# mkdir /usr/local/spark
[root@sparkCentOs pawel]# cp -r spark-2.3.1-bin-hadoop2.7/* /usr/local/spark
# Store the environment variables in ~/.bash_profile, then reload it
[root@sparkCentOs pawel]# echo 'export SPARK_EXAMPLES_JAR=/usr/local/spark/examples/jars/spark-examples_2.11-2.3.1.jar' >> ~/.bash_profile
[root@sparkCentOs pawel]# echo 'export PATH=$PATH:$HOME/bin:/usr/local/spark/bin' >> ~/.bash_profile
[root@sparkCentOs pawel]# source ~/.bash_profile
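Before moving on, it can help to confirm that the files landed where the next step expects them. The following is a minimal sketch and not part of the original setup: check_spark_layout is a hypothetical helper that only looks for the two launcher scripts under the bin directory and, if both are there, asks spark-submit for its version.

```shell
# Return 0 when the directory looks like an unpacked Spark distribution.
check_spark_layout() {
    spark_dir="$1"
    for tool in spark-shell spark-submit; do
        if [ ! -x "$spark_dir/bin/$tool" ]; then
            echo "missing: $spark_dir/bin/$tool"
            return 1
        fi
    done
    echo "ok: $spark_dir"
}

# /usr/local/spark is the target directory used in the commands above
if check_spark_layout /usr/local/spark; then
    /usr/local/spark/bin/spark-submit --version
fi
```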

Step #4: Run Spark Shell

Run the Spark shell and verify that Spark is working correctly.

[root@sparkCentOs pawel]# spark-shell
2018-08-20 19:57:30 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://sparkCentOs:4040
Spark context available as 'sc' (master = local[*], app id = local-1534795057680).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.1
      /_/
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_161)
Type in expressions to have them evaluated.
Type :help for more information.

This starts the Spark shell, an interactive environment for running Spark commands. You can use it to try out Spark code and explore data interactively.
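Note that the startup messages report master = local[*], which means this shell runs Spark inside a single local JVM rather than against a standalone cluster. If you want to try standalone mode on the same machine, the Spark distribution ships scripts for it in its sbin directory. The sketch below makes several assumptions: Spark lives in /usr/local/spark as in Step #3, the master binds to the name returned by hostname -f, and master_url is a hypothetical helper added only for illustration. In Spark 2.3.1 the worker script is called start-slave.sh; newer releases rename it.

```shell
SPARK_HOME="${SPARK_HOME:-/usr/local/spark}"

# Build a standalone master URL; 7077 is the default master port.
master_url() {
    printf 'spark://%s:%s\n' "$1" "${2:-7077}"
}

if [ -x "$SPARK_HOME/sbin/start-master.sh" ]; then
    url="$(master_url "$(hostname -f)")"
    "$SPARK_HOME/sbin/start-master.sh"          # master web UI listens on port 8080 by default
    "$SPARK_HOME/sbin/start-slave.sh" "$url"    # start one worker on the same machine
    echo "Now run: spark-shell --master $url"
else
    echo "No Spark installation found under $SPARK_HOME"
fi
```

If the worker does not register, the master's web UI shows the exact spark:// URL it is listening on, and that value should be used instead of the guessed one.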

Let’s type some code 🙂

scala> val data = spark.sparkContext.parallelize(
    Seq("I like Spark","Spark is awesome",
    "My first Spark job is working now and is counting these words"))
data: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD[0] at parallelize at <console>:23
scala> val wordCounts = data.flatMap(row => row.split(" ")).
        map(word => (word, 1)).reduceByKey(_ + _)
wordCounts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[3] at reduceByKey at <console>:25
scala> wordCounts.foreach(println)
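The last command prints one (word, count) pair per line. As an independent cross-check that uses plain shell tools rather than Spark, the same three sentences can be counted with tr, sort, and uniq. This is only a sketch for comparing results, and word_counts is a name made up for it. The lines come out in sort order rather than in the order Spark prints them, but the pairs should agree: for example, Spark and is each occur 3 times.

```shell
# Count words read from stdin and print them as (word,count) pairs.
word_counts() {
    # Put one word per line, group identical words, then count each group
    tr -s ' ' '\n' | sort | uniq -c | awk '{print "(" $2 "," $1 ")"}'
}

printf '%s\n' \
    "I like Spark" \
    "Spark is awesome" \
    "My first Spark job is working now and is counting these words" \
  | word_counts
```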


That’s all about how to install Apache Spark Standalone on CentOS 7. Enjoy! For more information on installing and using Apache Spark, consult the Spark documentation on the Spark website.
