Hi,
We have discussed this a few times in this forum. Kindly refer to the topic "Answers to recent questions on Hive on Spark". I tried Spark 1.3, 1.3.1, 1.4 and 1.5.2, and none of them works with Hive on Spark as the execution engine. I ran this one today and it failed in the same familiar place:

In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Spark Job = 51360aca-bbb8-4648-86a4-26d9cc1d6a85
Status: SENT
Failed to execute spark task, with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

If anyone has successfully made Hive use Spark as its execution engine, please let me know. As far as I know, no one has.

Regards,

Mich Talebzadeh

Sybase ASE 15 Gold Medal Award 2008
A Winning Strategy: Running the most Critical Financial Data on ASE 15
http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf
Author of the book "A Practitioner's Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7, and co-author of "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4.
Publications due shortly:
Complex Event Processing in Heterogeneous Environments, ISBN: 978-0-9563693-3-8
Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly

http://talebzadehmich.wordpress.com

NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only; if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Technology Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus-free; neither Peridale Ltd, its subsidiaries, nor their employees accept any responsibility.

From: Jimmy Xiang [mailto:jxi...@cloudera.com]
Sent: 30 November 2015 17:28
To: user@hive.apache.org
Subject: Re: Hive version with Spark

Hi Sofia,

For Hive 1.2.1, you should not use Spark 1.5; there are some incompatible interface changes in Spark 1.5. Have you tried Hive 1.2.1 with Spark 1.3.1?

As Udit pointed out, you can follow the instructions at https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started to build the Spark assembly, which should not contain any Hive/Hadoop-related classes.

Thanks,
Jimmy

On Sun, Nov 29, 2015 at 4:03 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:

Sofia,

What specific problem did you encounter when trying a spark.master other than local?

Thanks,
Xuefu

On Sat, Nov 28, 2015 at 1:14 AM, Sofia Panagiotidi <sofia.panagiot...@taiger.com> wrote:

Hi Mich,

I never managed to run Hive on Spark with a Spark master other than local, so I am afraid I don't have an answer here. But do try some things.

Firstly, run Hive as

hive --hiveconf hive.root.logger=DEBUG,console

so that you can see what the exact error is.

I am afraid I cannot be of much help, as I think I reached the same point (where it would work only when setting spark.master=local) before abandoning it.

Cheers
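(A minimal sketch of that debugging approach. The log file path is an assumption based on Hive's defaults, where hive.log.dir falls back to ${java.io.tmpdir}/${user.name}; adjust it for your installation:)

# Run Hive with DEBUG logging sent to the console
hive --hiveconf hive.root.logger=DEBUG,console

# Or keep the console quiet and watch the log file instead, filtering for
# the Hive-on-Spark client messages around the failing RPC handshake
tail -f /tmp/$USER/hive.log | grep -i -E 'SparkClient|RemoteDriver|RPC'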
On 27 Nov 2015, at 01:59, Mich Talebzadeh <m...@peridale.co.uk> wrote:

Hi Sofia,

There is no hadoop-2.6 profile. I believe you should use hadoop-2.4, as shown below:

mvn -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests clean package

Also, if you are building it for the Hive on Spark engine, you should not include the Hadoop jar files in your build. For example, I tried to build Spark 1.3 from source code (I read that this version works OK with Hive, having tried Spark 1.5.2 unsuccessfully). The following command created the tar file spark-1.3.0-bin-hadoop2-without-hive.tar.gz:

./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"

Now I have other issues making Hive use the Spark execution engine (it requires Hive 1.1 or above). In Hive I do:

set spark.home=/usr/lib/spark;
set hive.execution.engine=spark;
set spark.master=spark://127.0.0.1:7077;
set spark.eventLog.enabled=true;
set spark.eventLog.dir=/usr/lib/spark/logs;
set spark.executor.memory=512m;
set spark.serializer=org.apache.spark.serializer.KryoSerializer;
use asehadoop;
select count(1) from t;

I get the following:

OK
Time taken: 0.753 seconds
Query ID = hduser_20151127003523_e9863e84-9a81-4351-939c-36b3bef36478
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

HTH,

Mich

From: Sofia [mailto:sofia.panagiot...@taiger.com]
Sent: 18 November 2015 16:50
To: user@hive.apache.org
Subject: Hive version with Spark

Hello,

After various failed attempts to use my Hive (1.2.1) with my Spark (Spark 1.4.1 built for Hadoop 2.2.0), I decided to try building Spark with Hive again. I would like to know the latest Hive version that can be used to build Spark at this point.

When downloading the Spark 1.5 source and trying:

mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-1.2.1 -Phive-thriftserver -DskipTests clean package

I get:

The requested profile "hive-1.2.1" could not be activated because it does not exist.

Thank you
Sofia
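(For reference, a sketch of the same build using only the profiles Spark's own build documentation lists; there is no versioned "hive-1.2.1" profile, which is why Maven rejects it. Note that -Phive and -Phive-thriftserver bundle Hive classes into the assembly, which suits Spark SQL but not Hive on Spark; for the latter, build without them, as in the make-distribution.sh example above:)

mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests clean package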