There are many ways of interacting with the Hive DW from Spark. You can either use Spark's native Hive API, or you can use a JDBC connection (to a local or remote Spark).
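For illustration, here is a minimal sketch of both paths in Scala. It assumes hive-site.xml and the Hive JDBC driver are on the classpath; the host, port, table, and user names are placeholders, not details from this thread.

```scala
import java.sql.DriverManager
import org.apache.spark.sql.SparkSession

object HiveAccessSketch {
  def main(args: Array[String]): Unit = {
    // Path 1: native Hive integration. Spark talks to the Hive metastore
    // directly, and the query is planned and run by this application's own
    // driver and executors. Requires hive-site.xml on the classpath.
    val spark = SparkSession.builder()
      .appName("hive-native-sketch")
      .enableHiveSupport()
      .getOrCreate()
    spark.sql("SELECT COUNT(*) FROM some_db.some_table").show()

    // Path 2: JDBC. The query is shipped to an already-running endpoint
    // (HiveServer2 or a Spark Thrift Server), so planning happens in that
    // server's driver and this client stays thin. Assumes hive-jdbc is on
    // the classpath; host/port/user are placeholders.
    val conn = DriverManager.getConnection(
      "jdbc:hive2://thrift-server-host:10000/default", "user", "")
    val stmt = conn.createStatement()
    val rs = stmt.executeQuery("SELECT COUNT(*) FROM some_db.some_table")
    while (rs.next()) println(rs.getLong(1))
    conn.close()

    spark.stop()
  }
}
```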
What does "the driver" refer to in this context? The bottom line is that, with concurrent queries, you will have to go through Hive.
In the high-concurrency scenario, the query performance of Spark SQL is limited by the NameNode and the Hive Metastore. There are some caches in the code, but their effect is limited. Is there a practical and effective way to solve the problem of the driver being the time-consuming part of concurrent queries?
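For reference, the caches alluded to above are exposed as standard Spark SQL settings around partition pruning and file-listing metadata. The sketch below only shows where they are configured; the cache size used is an assumption for illustration, not a recommendation from this thread.

```scala
import org.apache.spark.sql.SparkSession

// Sketch: enable metastore-side partition pruning and the in-memory cache
// for file-source partition metadata, which reduce (but do not eliminate)
// pressure on the Hive Metastore and the NameNode during planning.
val spark = SparkSession.builder()
  .appName("partition-cache-sketch")
  .enableHiveSupport()
  // Push partition predicates down to the metastore call.
  .config("spark.sql.hive.metastorePartitionPruning", "true")
  // Let Spark manage file-source partitions so file listings can be cached.
  .config("spark.sql.hive.manageFilesourcePartitions", "true")
  // Size (in bytes) of the shared cache for partition file metadata;
  // 512 MiB here is an arbitrary example value.
  .config("spark.sql.hive.filesourcePartitionFileCacheSize",
          (512L * 1024 * 1024).toString)
  .getOrCreate()
```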