This had escaped my attention. Perhaps it would be a good idea to add it to the Hive on Spark wiki page?
On 30 Nov 2015, at 18:42, Mich Talebzadeh <m...@peridale.co.uk> wrote:

> Hi,
>
> We have discussed this a few times in this forum.
>
> Kindly refer to the topic “Answers to recent questions on Hive on Spark”.
>
> I tried Spark 1.3, 1.3.1, 1.4 and 1.5.2, and none of them work with Hive on
> Spark as the execution engine.
>
> I ran this one today and it failed in the same familiar place:
>
>   In order to change the average load for a reducer (in bytes):
>     set hive.exec.reducers.bytes.per.reducer=<number>
>   In order to limit the maximum number of reducers:
>     set hive.exec.reducers.max=<number>
>   In order to set a constant number of reducers:
>     set mapreduce.job.reduces=<number>
>   Starting Spark Job = 51360aca-bbb8-4648-86a4-26d9cc1d6a85
>   Status: SENT
>   Failed to execute spark task, with exception
>   'java.lang.IllegalStateException(RPC channel is closed.)'
>   FAILED: Execution Error, return code 1 from
>   org.apache.hadoop.hive.ql.exec.spark.SparkTask
>
> If anyone has managed to make Hive use Spark successfully as an execution
> engine, please let me know. As far as I know, nobody has.
>
> Regards,
>
> Mich Talebzadeh
> http://talebzadehmich.wordpress.com
>
>
> From: Jimmy Xiang [mailto:jxi...@cloudera.com]
> Sent: 30 November 2015 17:28
> To: user@hive.apache.org
> Subject: Re: Hive version with Spark
>
> Hi Sofia,
>
> For Hive 1.2.1, you should not use Spark 1.5. There are some incompatible
> interface changes in Spark 1.5.
>
> Have you tried Hive 1.2.1 with Spark 1.3.1? As Udit pointed out, you can
> follow the instructions at
>
> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>
> to build the Spark assembly, which should not contain any Hive/Hadoop-related
> classes.
>
> Thanks,
> Jimmy
>
>
> On Sun, Nov 29, 2015 at 4:03 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:
>
> Sofia,
>
> What specific problem did you encounter when trying a spark.master other
> than local?
>
> Thanks,
> Xuefu
>
>
> On Sat, Nov 28, 2015 at 1:14 AM, Sofia Panagiotidi
> <sofia.panagiot...@taiger.com> wrote:
>
> Hi Mich,
>
> I never managed to run Hive on Spark with a Spark master other than local,
> so I am afraid I don’t have an answer here. But do try some things. Firstly,
> run hive as
>
>   hive --hiveconf hive.root.logger=DEBUG,console
>
> so that you are able to see what the exact error is.
>
> I am afraid I cannot be of much help, as I think I reached the same point
> (where it would work only when setting spark.master=local) before
> abandoning.
>
> Cheers
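A quick way to act on Sofia's suggestion and keep the evidence is to run the failing query non-interactively with the root logger at DEBUG and capture everything to a file. A minimal sketch, reusing the asehadoop database and table t from Mich's session below; the log path and the engine setting on the command line are illustrative:

  # Re-run the failing query with Hive's root logger at DEBUG on the console,
  # capturing stdout and stderr so the full stack trace behind
  # "RPC channel is closed" can be inspected afterwards
  hive --hiveconf hive.root.logger=DEBUG,console \
       --hiveconf hive.execution.engine=spark \
       -e "use asehadoop; select count(1) from t;" 2>&1 \
       | tee /tmp/hive_on_spark_debug.log

At DEBUG level the remote driver and RPC setup are logged, which is usually where a Hive/Spark version mismatch first shows up.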
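The wiki page Jimmy links to also covers how to make the Hive-free Spark assembly visible to Hive once it is built; if I read it correctly, for Hive releases before 2.0.0 one option is to link the assembly jar into Hive's lib directory. A sketch with purely illustrative paths and jar name:

  # Make the Hive-free Spark assembly visible to Hive (pre-2.0.0 releases),
  # as described on the Hive on Spark "Getting Started" wiki; adjust the
  # paths and assembly version to match your build
  ln -s /usr/local/spark/lib/spark-assembly-1.3.1-hadoop2.4.0.jar \
        /usr/local/hive/lib/

Setting spark.home (as Mich does below) or the SPARK_HOME environment variable is the alternative the wiki mentions.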
> On 27 Nov 2015, at 01:59, Mich Talebzadeh <m...@peridale.co.uk> wrote:
>
> Hi Sofia,
>
> There is no Hadoop-2.6 profile. I believe you should use Hadoop-2.4, as
> shown below:
>
>   mvn -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests clean package
>
> Also, if you are building it for the Hive on Spark engine, you should not
> include the Hadoop jar files in your build.
>
> For example, I tried to build Spark 1.3 from source code (I read that this
> version works OK with Hive, having tried Spark 1.5.2 unsuccessfully).
>
> The following command created the tar file
> spark-1.3.0-bin-hadoop2-without-hive.tar.gz:
>
>   ./make-distribution.sh --name "hadoop2-without-hive" --tgz \
>     "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"
>
> Now I have other issues making Hive use Spark as the execution engine
> (which requires Hive 1.1 or above).
>
> In hive I do:
>
>   set spark.home=/usr/lib/spark;
>   set hive.execution.engine=spark;
>   set spark.master=spark://127.0.0.1:7077;
>   set spark.eventLog.enabled=true;
>   set spark.eventLog.dir=/usr/lib/spark/logs;
>   set spark.executor.memory=512m;
>   set spark.serializer=org.apache.spark.serializer.KryoSerializer;
>   use asehadoop;
>   select count(1) from t;
>
> I get the following:
>
>   OK
>   Time taken: 0.753 seconds
>   Query ID = hduser_20151127003523_e9863e84-9a81-4351-939c-36b3bef36478
>   Total jobs = 1
>   Launching Job 1 out of 1
>   In order to change the average load for a reducer (in bytes):
>     set hive.exec.reducers.bytes.per.reducer=<number>
>   In order to limit the maximum number of reducers:
>     set hive.exec.reducers.max=<number>
>   In order to set a constant number of reducers:
>     set mapreduce.job.reduces=<number>
>   Failed to execute spark task, with exception
>   'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark
>   client.)'
>   FAILED: Execution Error, return code 1 from
>   org.apache.hadoop.hive.ql.exec.spark.SparkTask
>
> HTH,
>
> Mich
>
>
> From: Sofia [mailto:sofia.panagiot...@taiger.com]
> Sent: 18 November 2015 16:50
> To: user@hive.apache.org
> Subject: Hive version with Spark
>
> Hello,
>
> After various failed attempts to use my Hive (1.2.1) with my Spark (Spark
> 1.4.1 built for Hadoop 2.2.0), I decided to try to build Spark again with
> Hive.
> I would like to know the latest Hive version that can be used to build
> Spark at this point.
>
> When downloading the Spark 1.5 source and trying:
>
>   mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-1.2.1 -Phive-thriftserver -DskipTests clean package
>
> I get:
>
>   The requested profile "hive-1.2.1" could not be activated because it
>   does not exist.
>
> Thank you,
> Sofia
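For what it's worth, the closing error just means Spark's POM defines no profile named hive-1.2.1: in the Spark 1.5 source tree Hive support is enabled with the plain -Phive profile (plus -Phive-thriftserver), and the Hive version it bundles is fixed by the POM. Mich's note about hadoop-2.4 applies to the Spark 1.3 sources; Spark 1.4 and later do ship a hadoop-2.6 profile. A sketch of how one might verify and retry, run from the Spark 1.5 source directory:

  # List the profiles the Spark POM actually defines; expect a plain "hive"
  # profile, not "hive-1.2.1"
  mvn help:all-profiles | grep -i hive

  # Retry the build with the profiles that do exist (untested sketch;
  # everything else is as in Sofia's original command)
  mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 \
      -Phive -Phive-thriftserver -DskipTests clean package

Note that this produces a Spark build that does contain Hive classes, which is what Spark SQL needs but the opposite of the Hive-free assembly required for Hive on Spark above.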