Xuefu Zhang created HIVE-7593:
---------------------------------

             Summary: Instantiate SparkClient per user session
                 Key: HIVE-7593
                 URL: https://issues.apache.org/jira/browse/HIVE-7593
             Project: Hive
          Issue Type: Sub-task
          Components: Spark
            Reporter: Xuefu Zhang


SparkContext is the main class through which Hive talks to the Spark cluster. 
SparkClient encapsulates a SparkContext instance. Currently, all user sessions 
in HiveServer2 share a single SparkClient instance. While this is good enough 
for a POC, and even for our first two milestones, it is not desirable in a 
multi-tenant environment and gives Hive users the least flexibility. Here is 
what we propose:

1. Have one SparkClient instance per user session. The SparkClient instance is 
created when the user executes the first query in the session, and it is 
destroyed when the user session ends.

2. The SparkClient is instantiated based on the Spark configurations that are 
available to the user, including those defined at the global level and those 
overridden by the user (through the set command, for instance).

3. Ideally, when the user changes any Spark configuration during the session, 
the old SparkClient instance should be destroyed and a new one created based on 
the new configurations. This may turn out to be a little hard, and thus it's 
a "nice-to-have". If it is not implemented, we need to document that subsequent 
configuration changes will not take effect in the current session.
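
A rough sketch of how items 1-3 might fit together is below. It is only an 
illustration, not the proposed implementation: SessionSparkClientRegistry, 
FakeSparkClient, clientFor and closeSession are hypothetical names, and 
FakeSparkClient is a stand-in that records its configuration instead of 
wrapping a real SparkContext.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-session registry; none of these names come from Hive. */
final class SessionSparkClientRegistry {

    /** Stand-in for SparkClient: only remembers the conf it was built from. */
    static final class FakeSparkClient {
        final Map<String, String> conf;
        boolean closed = false;

        FakeSparkClient(Map<String, String> conf) {
            this.conf = new HashMap<>(conf);
        }

        // Real code would stop the wrapped SparkContext here.
        void close() {
            closed = true;
        }
    }

    private final Map<String, String> globalConf;
    private final Map<String, FakeSparkClient> clients = new HashMap<>();

    SessionSparkClientRegistry(Map<String, String> globalConf) {
        this.globalConf = new HashMap<>(globalConf);
    }

    /** Item 2: start from global settings, then apply the user's overrides. */
    private Map<String, String> effectiveConf(Map<String, String> overrides) {
        Map<String, String> merged = new HashMap<>(globalConf);
        merged.putAll(overrides);
        return merged;
    }

    /**
     * Item 1: create the client lazily, on the session's first query.
     * Item 3: if the effective conf changed, destroy the old client and
     * build a new one.
     */
    synchronized FakeSparkClient clientFor(String sessionId,
                                           Map<String, String> overrides) {
        Map<String, String> wanted = effectiveConf(overrides);
        FakeSparkClient current = clients.get(sessionId);
        if (current != null && current.conf.equals(wanted)) {
            return current; // same session, same conf: reuse
        }
        if (current != null) {
            current.close(); // conf changed: tear down the old client
        }
        FakeSparkClient fresh = new FakeSparkClient(wanted);
        clients.put(sessionId, fresh);
        return fresh;
    }

    /** Item 1: destroy the client when the user session ends. */
    synchronized void closeSession(String sessionId) {
        FakeSparkClient client = clients.remove(sessionId);
        if (client != null) {
            client.close();
        }
    }
}
```

In this sketch, the query path would call clientFor(sessionId, overrides) 
before each query and the session-close hook would call closeSession. If item 3 
is not implemented, the conf comparison would simply be dropped and the first 
client kept for the life of the session.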

Please note that there is a thread-safety issue on the Spark side: multiple 
SparkContext instances cannot coexist in the same JVM (SPARK-2243). We need to 
work with the Spark community to get this addressed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
