Hello,

We are trying to set up Spark as the execution engine for exposing our data stored in a data lake. We have a Hive Metastore running alongside the Spark Thrift Server, and we are using Superset as the UI.
We save all tables as external tables in the Hive Metastore, with storage on cloud object storage. Right now, when users run a query in Superset SQL Lab, it scans the whole table. What we want is to limit the data scanned by setting something like hive.mapred.mode=strict in Spark, so that users get an exception if they don't specify a partition column in their filter.

We tried setting spark.hadoop.hive.mapred.mode=strict in spark-defaults.conf on the Thrift Server, but queries still scan the whole table. We also tried setting hive.mapred.mode=strict in hive-defaults.conf for the metastore container.

We use Spark 3.2 with Hive Metastore version 3.1.2. Is there a way to make this happen via Spark settings?

TIA
Saurabh
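For reference, this is a sketch of the configuration we tried (file names and placement match our deployment; the property itself is the Hive-side strict-mode setting, which historically rejects queries on partitioned tables that lack a partition predicate):

```
# spark-defaults.conf on the Spark Thrift Server container
# (spark.hadoop.* prefixed properties are passed through to the Hadoop/Hive conf)
spark.hadoop.hive.mapred.mode    strict

# hive-defaults.conf on the Hive Metastore container
hive.mapred.mode=strict
```

The intended effect is that a query like `SELECT * FROM sales` would fail with an error, while `SELECT * FROM sales WHERE dt = '2023-01-01'` (where `dt` is the partition column) would be allowed and would prune to a single partition. With the settings above, Spark still performs a full table scan in both cases.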