[ https://issues.apache.org/jira/browse/HIVE-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224124#comment-14224124 ]
Chengxiang Li commented on HIVE-8836:
-------------------------------------

Hi [~szehon] and [~brocknoland], I'm not 100% sure the spark assembly jar is published to a public maven repository, but I found a spark assembly [here|http://mvnrepository.com/artifact/org.apache.spark/spark-assembly_2.10/1.1.0]; maybe [~vanzin] knows more about this. There is no org.apache.spark:spark-assembly_2.10:jar:1.2.0-SNAPSHOT in any public maven repository yet, as it is still a SNAPSHOT, but we can publish it to http://ec2-50-18-79-139.us-west-1.compute.amazonaws.com/data as we did for spark core. For my local test, I built spark and published it to my local maven repository.

{quote}
Also another question: we were trying to set spark.home, which looks for bin/spark-submit, which then pulls in scripts like compute-classpath.sh, load-spark-env.sh, spark-class, and finally spark-assembly itself. I see you are using another way (spark.test.home, spark.testing); how does that avoid looking for these artifacts to start the spark process?
{quote}

First, bin/spark-submit is optional for the Remote Spark Context. Second, local-cluster spark only needs compute-classpath.sh to launch executors; that script adds spark-related jars to the classpath, and the Hive unit tests should only need spark-assembly. spark.test.home and spark.testing are used to point the spark home at a dummy spark installation; see org.apache.spark.deploy.worker.Worker, line 101, for why. I created a dummy spark installation with an empty compute-classpath.sh, since compute-classpath.sh is required, and added the spark assembly to the spark executor classpath through spark.executor.extraClassPath.

> Enable automatic tests with remote spark client. [Spark Branch]
> ---------------------------------------------------------------
>
>                 Key: HIVE-8836
>                 URL: https://issues.apache.org/jira/browse/HIVE-8836
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Rui Li
>              Labels: Spark-M3
>         Attachments: HIVE-8836-brock-1.patch, HIVE-8836-brock-2.patch, HIVE-8836-brock-3.patch, HIVE-8836.1-spark.patch, HIVE-8836.2-spark.patch
>
> In a real production environment, the remote spark client will mostly be what submits spark jobs for Hive, so we should enable automatic tests with the remote spark client to make sure Hive features work with it.
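For readers reproducing the setup described in the comment above, here is a minimal sketch. All paths, the local-cluster sizing, and the maven profile are assumptions for illustration, not values taken from the HIVE-8836 patches.

Building spark and installing it into the local maven repository (the hadoop profile is an assumption and depends on the target version):

{code}
cd spark && mvn -Phadoop-2.4 -DskipTests clean install
{code}

Creating the dummy spark installation: compute-classpath.sh must exist, because the Worker invokes it when launching executors, but it can contribute nothing to the classpath:

{code}
# /tmp/dummy-spark is a hypothetical location for the dummy installation.
mkdir -p /tmp/dummy-spark/bin
printf '#!/usr/bin/env bash\n' > /tmp/dummy-spark/bin/compute-classpath.sh
chmod +x /tmp/dummy-spark/bin/compute-classpath.sh
{code}

The test configuration then points the worker at the dummy home and carries the real assembly on the executor classpath. The property names are the ones named in the comment; the values are placeholders:

{code}
spark.master=local-cluster[2,2,1024]
spark.testing=true
spark.test.home=/tmp/dummy-spark
spark.executor.extraClassPath=/path/to/spark-assembly_2.10-1.2.0-SNAPSHOT.jar
{code}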