In this short post I will show you how to rename the file or files that Apache Spark writes to HDFS, or simply rename or delete any file.
In this post I will show you how to run a shell command programmatically in Scala and how you can use it in Apache Spark.
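As a rough illustration of the idea, here is a minimal sketch that uses the `scala.sys.process` package from the Scala standard library. The object name `ShellCommandSketch`, the helper `runAndCapture` and the `echo` commands are only placeholders chosen for this example, and the same calls are assumed to run on the driver when used inside a Spark application.

```scala
import scala.sys.process._

object ShellCommandSketch {
  // Hypothetical helper: runs the command and returns its trimmed standard output.
  // `!!` throws an exception if the command exits with a non-zero code.
  def runAndCapture(command: Seq[String]): String =
    command.!!.trim

  def main(args: Array[String]): Unit = {
    // Capture the output of a command as a String.
    println(runAndCapture(Seq("echo", "hello")))

    // `!` runs the command and returns only its exit code.
    val exitCode: Int = Seq("echo", "done").!
    println(s"exit code: $exitCode")
  }
}
```

Passing the command as a `Seq[String]` rather than a single string avoids surprises with arguments that contain spaces.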
If you want to save a DataFrame as a file on HDFS, you may find that it is saved as many files. This is the expected behavior: Spark works in parallel, and each partition is written as its own part file.
We will use the FileSystem and Path classes from the org.apache.hadoop.fs package to achieve it.
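A minimal sketch of how FileSystem and Path can be combined to rename an output file is shown below. It assumes the Hadoop client library is on the classpath; the object name `RenameSparkOutput`, the helper `renameFirstPartFile` and the `/tmp/output` paths are hypothetical and exist only for this example.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}

object RenameSparkOutput {
  // Hypothetical helper: renames the first "part-*" file found in `dir` to `target`.
  // Returns true only if a part file was found and the rename succeeded.
  def renameFirstPartFile(fs: FileSystem, dir: Path, target: Path): Boolean = {
    // globStatus may return null, so wrap it before use.
    val partFiles: Array[FileStatus] =
      Option(fs.globStatus(new Path(dir, "part-*"))).getOrElse(Array.empty[FileStatus])
    partFiles.headOption.exists(status => fs.rename(status.getPath, target))
  }

  def main(args: Array[String]): Unit = {
    // Inside a Spark job, spark.sparkContext.hadoopConfiguration is the usual source
    // of the configuration; a fresh Configuration is an assumption made for this sketch.
    val fs = FileSystem.get(new Configuration())

    val outputDir = new Path("/tmp/output")         // hypothetical output directory
    val target = new Path("/tmp/output/result.csv") // hypothetical final name

    println(s"renamed: ${renameFirstPartFile(fs, outputDir, target)}")

    // Deleting works in a similar way; the second argument enables recursive deletion.
    // fs.delete(new Path("/tmp/output/_SUCCESS"), false)
  }
}
```

The sketch renames only one part file, so it fits best when the DataFrame was written from a single partition.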
Today I will show you how you can use the Machine Learning (ML) libraries that are available in Spark under the name Spark MLlib.
In this tutorial I will show you how you can easily install Apache Spark on CentOS.
A short tip on how to check if a table exists in Hive using Spark.
Like in the title :)
In this tutorial, I will walk you through examples of reading data using the DataFrame API in Spark.
In this tutorial I will show you how to create a Scala project in IntelliJ IDEA and run a Spark job locally.