Spark: check if table exists in Hive using Apache Spark or PySpark – two simple snippets of code!


Spark check if table exists in Hive

[ PySpark check if table exists in Hive or Scala ] When you look for a Hive table, provide the table name in lowercase, because spark.sqlContext.tableNames returns the array of table names only in lowercase.

Information about tables in Hive is stored in the Hive Metastore.

Spark 2.0 or higher

import org.apache.spark.sql.SparkSession

// Create SparkSession object with Hive support enabled
val spark = SparkSession
  .builder()
  .appName("Check table")
  .enableHiveSupport()
  .getOrCreate()

// Select the database where you will search for the table - lowercase
spark.sqlContext.sql("use bigdata_etl")
spark.sqlContext.tableNames.contains("schemas")
res4: Boolean = true

// With uppercase
spark.sqlContext.tableNames.contains("Schemas")
res4: Boolean = false

Spark 1.6 to 2.0

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Get HiveContext from SparkContext
val sparkConf = new SparkConf().setAppName("Check table")
val sc = new SparkContext(sparkConf)
val hiveContext = new HiveContext(sc)
hiveContext.sql("use bigdata_etl")
hiveContext.tableNames.contains("schemas")
res4: Boolean = true

// With uppercase
hiveContext.tableNames.contains("Schemas")
res4: Boolean = false

If the table exists, you will get "true", otherwise "false".

PySpark Check if table exists

The above examples are written in Scala, but they use the pure Spark API, so the PySpark code looks essentially the same. In this easy way you can check if a table exists in PySpark.

If you enjoyed this post, please add a comment below and share it on Facebook, Twitter, LinkedIn or another social media site.
Thanks in advance!


Amayak Urumyan

Side note: the spark.sqlContext.tableNames array contains lowercase names only. Make sure the name of the table is all lowercase; otherwise, the result will be false.

spark.sqlContext.tableNames.contains("schemas") – true
spark.sqlContext.tableNames.contains("Schemas") – false