[SOLVED] Configuration of Apache Spark Scala And IntelliJ IDEA – Short And Easy 5 Steps!


Let’s start with the configuration of Apache Spark, Scala, and IntelliJ IDEA! I assume that you have already installed IntelliJ IDEA (otherwise, please go to the official IntelliJ website and download the Community Edition).

This tutorial can also be useful when you simply want to set up Scala in IntelliJ.

Introduction

IntelliJ IDEA

IntelliJ" IDEA is an integrated development environment (IDE) for Java", Kotlin", and other programming" languages. It is developed by JetBrains and is widely used by developers for building applications and other software projects.

Some of the key features of IntelliJ IDEA" include:

  • Code completion: IntelliJ" IDEA provides intelligent code completion based on context, which helps you write code faster and with fewer mistakes.
  • Refactoring: IntelliJ" IDEA has a range of refactoring tools that allow you to safely change the structure of your code, such as renaming variables, extracting methods, and inlining variables.
  • Debugging: IntelliJ" IDEA provides a powerful debugger that allows you to step through your code, set breakpoints, and evaluate expressions.
  • Testing: IntelliJ" IDEA includes tools for running and debugging tests, including JUnit, TestNG, and Cucumber.
  • Version control: IntelliJ IDEA integrates with popular version control systems such as Git", GitHub", and SVN", and provides tools for committing, merging, and comparing code changes.
  • Integration: IntelliJ" IDEA integrates with a wide range of tools and frameworks, including build tools like Maven" and Gradle", application servers like Tomcat and Glassfish, and databases like MySQL" and PostgreSQL.

IntelliJ" IDEA is available in two editions: the Community Edition, which is free and open-source, and the Ultimate Edition, which is a paid version with additional features and tools.

Configuration of Apache Spark Scala and IntelliJ – 5 steps!

Step-By-Step Instructions

To configure Apache Spark" and Scala" in IntelliJ" IDEA, follow these steps:

  1. Download and install the latest version of IntelliJ IDEA from the JetBrains website.
  2. Install the Scala plugin in IntelliJ IDEA: go to File > Settings > Plugins, search for the Scala plugin, and click Install.
  3. Download the latest version of Apache Spark from the Apache Spark website.
  4. Extract the downloaded Spark archive to a directory of your choice.
  5. Open IntelliJ IDEA and create a new project. Select Scala as the project type and sbt as the build tool.
  6. In the Project Structure window, go to Libraries and click the + button to add a new library.
  7. Select Java as the library type and browse to the directory where you extracted Spark. Select the jars directory and click OK.
  8. In the Project Structure window, go to Modules and click the + button to add a new module.
  9. Select Scala as the module type and browse to the directory where you extracted Spark. Select the examples/src/main/scala directory and click OK.
  10. Click Apply to save the changes and close the Project Structure window.
  11. You can now create and run Scala applications that use Spark in IntelliJ IDEA; a quick smoke test is sketched right after this list.
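Before moving on, you can verify the setup with a minimal application that only starts a local Spark session and prints its version. This is just a quick sketch for verification (the object name is my own choice; it is not part of the original instructions):

import org.apache.spark.sql.SparkSession

object SparkSmokeTest extends App {
  // Start a local Spark session and confirm which Spark version is on the classpath.
  val spark = SparkSession.builder
    .master("local[*]")
    .appName("Smoke test")
    .getOrCreate()

  println(s"Spark version: ${spark.version}")
  spark.stop()
}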

Note: If you want to use Spark with other programming languages, such as Python or R, you will need to install the corresponding language plugins and set up the project dependencies accordingly. You may also need to configure the Spark and Hadoop environment variables (such as SPARK_HOME and HADOOP_HOME) and add them to the IntelliJ IDEA environment.
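If you prefer to keep such settings in code rather than in environment variables, many of them can also be passed programmatically when the session is created. A minimal sketch, assuming illustrative configuration keys and values (they are placeholders, not settings required by this tutorial):

import org.apache.spark.sql.SparkSession

object ConfiguredSession extends App {
  val spark = SparkSession.builder
    .master("local[*]")
    .appName("Configured session")
    // Any "spark.hadoop.*" key is forwarded to the underlying Hadoop configuration.
    .config("spark.hadoop.fs.defaultFS", "file:///")
    // Ordinary Spark settings can be set the same way.
    .config("spark.sql.shuffle.partitions", "4")
    .getOrCreate()

  println(spark.conf.get("spark.sql.shuffle.partitions"))
  spark.stop()
}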

Step-By-Step Instructions With Images

1. Install SBT And Scala Plugins In IntelliJ

In IntelliJ" please go to Plugins–>Browse repositories and install SBT" and Scala" plugins. After that please restart your IntelliJ".


2. Create Scala Project

Let’s create a new Scala project. Click “Create New Project” and select “SBT”.


In the next window, set the project name and choose the correct Scala version. For Spark 2.3.1, Scala must be a 2.11.x minor version; I selected 2.11.8.


3. Add Apache Spark Libraries

Spark Scala IntelliJ SBT Setup

In the build.sbt file, add the Spark libraries and make sure that the new libraries are downloaded. You can select the auto-import option so that any libraries you add in the future will be downloaded automatically.

name := "FirstSparkScalaProject"
version := "0.1"
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.3.1",
  "org.apache.spark" %% "spark-sql" % "2.3.1"
)
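A side note on dependency scope: the configuration above bundles Spark into the application, which is convenient for running locally from IntelliJ. If you later package a jar and submit it to an existing cluster with spark-submit, a common convention is to mark the Spark dependencies as “provided” so they are not shipped twice. A sketch (optional, not required for this tutorial):

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.3.1" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.3.1" % "provided"
)

Keep in mind that with “provided” scope the application will not run directly from IntelliJ unless you enable the run-configuration option to include provided dependencies.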

4. Create Spark Application

Now we are ready to create a Spark application. Let’s create a new Scala object and name it “AnalyzerWords”, as in the code below.

package com.bigdataetl

import org.apache.spark.sql.SparkSession

object AnalyzerWords extends App {

  // Create a local Spark session that uses all available CPU cores.
  val spark = SparkSession.builder
    .master("local[*]")
    .appName("Word count")
    .getOrCreate()

  // A small in-memory dataset of sentences.
  val data = spark.sparkContext.parallelize(
    Seq("I like Spark", "Spark is awesome", "My first Spark job is working now and is counting these words"))

  // Split each sentence into words and count the occurrences of each word.
  val wordCounts = data
    .flatMap(row => row.split(" "))
    .map(word => (word, 1))
    .reduceByKey(_ + _)

  wordCounts.foreach(println)

  spark.stop()
}
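Since build.sbt also pulls in spark-sql, the same word count can be expressed with the DataFrame API. The sketch below is an optional alternative, not part of the original tutorial (the object name is my own):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{explode, split}

object AnalyzerWordsDF extends App {

  val spark = SparkSession.builder
    .master("local[*]")
    .appName("Word count (DataFrame)")
    .getOrCreate()
  import spark.implicits._

  // Turn the sentences into a single-column DataFrame.
  val lines = Seq(
    "I like Spark",
    "Spark is awesome",
    "My first Spark job is working now and is counting these words").toDF("line")

  // Split each line into words (one word per row) and count occurrences.
  val wordCounts = lines
    .select(explode(split($"line", " ")).as("word"))
    .groupBy("word")
    .count()

  wordCounts.show()
  spark.stop()
}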

5. Run The Application

Spark will run locally on your computer. Run the application, for example by right-clicking the AnalyzerWords object and selecting Run. You should see results like these:

(is,3)
(Spark,3)
(like,1)
(first,1)
(awesome,1)
(job,1)
(now,1)
(I,1)
(words,1)
(working,1)
(and,1)
(counting,1)
(these,1)
(My,1)

Summary

I hope this helped you, and that you now know how to set up Spark and Scala in IntelliJ.


