Hello,
We are trying to set up Spark as the execution engine for exposing the data 
stored in our lake. We have the Hive Metastore running along with the Spark 
Thrift Server, and we use Superset as the UI.

We save all tables as external tables in the Hive Metastore, with the storage 
being on cloud object storage.

We see that right now, when users run a query in Superset SQL Lab, it scans 
the whole table. What we want is to limit the data scanned by setting 
something like hive.mapred.mode=strict in Spark, so that users get an 
exception if they don't filter on a partition column.

We tried setting spark.hadoop.hive.mapred.mode=strict in spark-defaults.conf 
on the Thrift Server, but it still scans the whole table. We also tried 
setting hive.mapred.mode=strict in hive-defaults.conf for the Metastore 
container.
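For reference, this is roughly what we added to spark-defaults.conf on the 
Thrift Server (a sketch of our config; the spark.hadoop.* prefix is how we 
understand Hive properties are passed through to the Hadoop/Hive configuration):

    # spark-defaults.conf on the Spark Thrift Server
    # Intent: forward hive.mapred.mode=strict to the Hive configuration
    # so queries without a partition filter fail.
    spark.hadoop.hive.mapred.mode    strict

Neither this nor the Metastore-side setting changed the scan behavior.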

We use Spark 3.2 with Hive Metastore version 3.1.2.

Is there a Spark setting that makes this work?


TIA
Saurabh
