Hi,
We have discussed this a few times in this forum. Kindly refer to the topic "Answers to recent questions on Hive on Spark". I tried Spark 1.3, 1.3.1, 1.4 and 1.5.2, and none of them works with Hive on Spark as the execution engine. I ran this one today and it failed in the same familiar place:

In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Spark Job = 51360aca-bbb8-4648-86a4-26d9cc1d6a85
Status: SENT
Failed to execute spark task, with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

If anyone has successfully made Hive use Spark as its execution engine, please let me know. As far as I know, no one has.

Regards,

Mich Talebzadeh

Sybase ASE 15 Gold Medal Award 2008
A Winning Strategy: Running the most Critical Financial Data on ASE 15
http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf
Author of the book "A Practitioner's Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7, and co-author of "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4.
Publications due shortly:
Complex Event Processing in Heterogeneous Environments, ISBN: 978-0-9563693-3-8
Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly

http://talebzadehmich.wordpress.com

NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only; if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Technology Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus-free; neither Peridale Ltd, its subsidiaries, nor their employees accept any responsibility.

From: Jimmy Xiang [mailto:jxi...@cloudera.com]
Sent: 30 November 2015 17:28
To: user@hive.apache.org
Subject: Re: Hive version with Spark

Hi Sofia,

For Hive 1.2.1, you should not use Spark 1.5; there are some incompatible interface changes in Spark 1.5. Have you tried Hive 1.2.1 with Spark 1.3.1?

As Udit pointed out, you can follow the instructions at https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started to build the Spark assembly, which should not contain any Hive/Hadoop-related classes.

Thanks,
Jimmy

On Sun, Nov 29, 2015 at 4:03 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:

Sofia,

What specific problem did you encounter when trying a spark.master other than local?

Thanks,
Xuefu

On Sat, Nov 28, 2015 at 1:14 AM, Sofia Panagiotidi <sofia.panagiot...@taiger.com> wrote:

Hi Mich,

I never managed to run Hive on Spark with a Spark master other than local, so I am afraid I don't have an answer here. But do try some things.

Firstly, run Hive as

hive --hiveconf hive.root.logger=DEBUG,console

so that you can see what the exact error is.

I am afraid I cannot be of much help, as I think I reached the same point (where it would work only when setting spark.master=local) before abandoning it.

Cheers
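(A minimal sketch of that debugging approach. The log file path is an assumption based on Hive's defaults, where hive.log.dir falls back to ${java.io.tmpdir}/${user.name}; adjust it for your installation:)

# Run Hive with DEBUG logging sent to the console
hive --hiveconf hive.root.logger=DEBUG,console

# Or keep the console quiet and watch the log file instead, filtering for
# the Hive-on-Spark client messages around the failing RPC handshake
tail -f /tmp/$USER/hive.log | grep -i -E 'SparkClient|RemoteDriver|RPC'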
On 27 Nov 2015, at 01:59, Mich Talebzadeh <m...@peridale.co.uk> wrote:

Hi Sofia,

There is no hadoop-2.6 profile. I believe you should use hadoop-2.4, as shown below:

mvn -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests clean package

Also, if you are building it for the Hive on Spark engine, you should not include the Hadoop jar files in your build. For example, I tried to build Spark 1.3 from source code (I read that this version works OK with Hive, having tried Spark 1.5.2 unsuccessfully). The following command created the tar file spark-1.3.0-bin-hadoop2-without-hive.tar.gz:

./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"

Now I have other issues making Hive use the Spark execution engine (it requires Hive 1.1 or above). In Hive I do:

set spark.home=/usr/lib/spark;
set hive.execution.engine=spark;
set spark.master=spark://127.0.0.1:7077;
set spark.eventLog.enabled=true;
set spark.eventLog.dir=/usr/lib/spark/logs;
set spark.executor.memory=512m;
set spark.serializer=org.apache.spark.serializer.KryoSerializer;
use asehadoop;
select count(1) from t;

I get the following:

OK
Time taken: 0.753 seconds
Query ID = hduser_20151127003523_e9863e84-9a81-4351-939c-36b3bef36478
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

HTH,

Mich

From: Sofia [mailto:sofia.panagiot...@taiger.com]
Sent: 18 November 2015 16:50
To: user@hive.apache.org
Subject: Hive version with Spark

Hello,

After various failed attempts to use my Hive (1.2.1) with my Spark (Spark 1.4.1 built for Hadoop 2.2.0), I decided to try building Spark with Hive again. I would like to know the latest Hive version that can be used to build Spark at this point.

When downloading the Spark 1.5 source and trying:

mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-1.2.1 -Phive-thriftserver -DskipTests clean package

I get:

The requested profile "hive-1.2.1" could not be activated because it does not exist.

Thank you
Sofia
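(For reference, a sketch of the same build using only the profiles Spark's own build documentation lists; there is no versioned "hive-1.2.1" profile, which is why Maven rejects it. Note that -Phive and -Phive-thriftserver bundle Hive classes into the assembly, which suits Spark SQL but not Hive on Spark; for the latter, build without them, as in the make-distribution.sh example above:)

mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests clean package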