[ https://issues.apache.org/jira/browse/HIVE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147344#comment-14147344 ]
Xuefu Zhang commented on HIVE-7382:
-----------------------------------

Hi [~lirui], yes, we'd like to use Spark local-cluster to back a mini cluster when running tests, because it's closer to a real cluster and easy to start. I know it's intended for Spark's internal use, but for testing we should be okay, especially since it's easy to switch to local mode if we have to. Such a mini cluster more closely resembles an MR mini cluster. It also makes it easy for us to control the number of workers, executors per node, memory, and so on. Thus, I think this is a nice thing to have. Thanks for researching this area. When I did the POC, local-cluster actually worked, of course after resolving a few library conflicts. We might have similar problems with the current code base.

> Create a MiniSparkCluster and set up a testing framework [Spark Branch]
> -----------------------------------------------------------------------
>
>                 Key: HIVE-7382
>                 URL: https://issues.apache.org/jira/browse/HIVE-7382
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Rui Li
>              Labels: Spark-M1
>
> To automatically test Hive functionality over the Spark execution engine, we need
> to create a test framework that can execute Hive queries with Spark as the
> backend. For that, we should create a MiniSparkCluster, similar to what we have for
> other execution engines.
> Spark has a way to create a local cluster with a few processes on the local
> machine, where each process is a worker node. It's fairly close to a real Spark
> cluster. Our mini cluster can be based on that.
> For more info, please refer to the design doc on the wiki.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
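
[Editor's note] For context on the mechanism discussed above: Spark's local-cluster master URL encodes exactly the knobs the comment mentions (number of workers, cores per worker, memory per worker). A minimal sketch of what such a test configuration could look like follows; the specific values are illustrative and not taken from this issue:

```properties
# Illustrative spark-defaults.conf fragment (values are examples only).
# Master URL format: local-cluster[numWorkers, coresPerWorker, memoryPerWorkerMB]
# This starts 2 worker processes on the local machine, each with 1 core and 1024 MB,
# which is closer to a real cluster than plain "local[*]" while still needing no setup.
spark.master    local-cluster[2,1,1024]
```

Switching the test harness back to single-process execution, as the comment notes, would only require changing this value to a `local[N]` master URL.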