Hi Sofia,

For Hive 1.2.1, you should not use Spark 1.5. There are some incompatible
interface changes in Spark 1.5.
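
As an aside, the build error at the bottom of this thread ("The requested
profile "hive-1.2.1" could not be activated") comes up because, as far as I
know, Spark's Maven build only defines a generic -Phive profile; the Hive
version it bundles is fixed per Spark release, so there is no -Phive-1.2.1.
If you did want Spark's own Hive support (which is not what you want for
Hive on Spark), a sketch of a command that should at least parse against
the Spark 1.5 source tree is:

mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive \
  -Phive-thriftserver -DskipTests clean package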

Have you tried Hive 1.2.1 with Spark 1.3.1? As Udit pointed out, you can
follow the instructions at

https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started

to build the Spark assembly, which should not contain any Hive- or
Hadoop-related classes.
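
For reference, the build step on that page looks roughly like the command
Mich posted below (a sketch, run from the Spark 1.3.1 source tree; the
hadoop-provided profile is what keeps the Hadoop classes out of the
assembly):

./make-distribution.sh --name "hadoop2-without-hive" --tgz \
  "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"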

Thanks,
Jimmy


On Sun, Nov 29, 2015 at 4:03 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:

> Sofia,
>
> What specific problem did you encounter when trying spark.master other
> than local?
>
> Thanks,
> Xuefu
>
> On Sat, Nov 28, 2015 at 1:14 AM, Sofia Panagiotidi <
> sofia.panagiot...@taiger.com> wrote:
>
>> Hi Mich,
>>
>>
>> I never managed to run Hive on Spark with a Spark master other than local,
>> so I am afraid I don’t have an answer here.
>> But do try a few things. First, run Hive as
>>
>> hive --hiveconf hive.root.logger=DEBUG,console
>>
>>
>> so that you are able to see what the exact error is.
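>>
>> It can also help to tail the Hive log for the full stack trace. A sketch,
>> assuming the default log location (hive.log.dir may point somewhere else
>> in your setup):
>>
>> tail -f /tmp/$USER/hive.log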
>>
>> I am afraid I cannot be much help, as I think I reached the same point
>> (where it would work only when setting spark.master=local) before giving
>> up.
>>
>> Cheers
>>
>>
>>
>> On 27 Nov 2015, at 01:59, Mich Talebzadeh <m...@peridale.co.uk> wrote:
>>
>> Hi Sofia,
>>
>>
>> There is no hadoop-2.6 profile. I believe you should use the hadoop-2.4
>> profile, as shown below:
>>
>>
>> mvn -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests clean package
>>
>> Also, if you are building it for the Hive on Spark engine, you should not
>> include the Hadoop jar files in your build.
>>
>> For example, I tried to build Spark 1.3 from source (I read that this
>> version works OK with Hive, having tried Spark 1.5.2 unsuccessfully).
>>
>> The following command created the tar file:
>>
>> ./make-distribution.sh --name "hadoop2-without-hive" --tgz \
>>   "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"
>>
>> spark-1.3.0-bin-hadoop2-without-hive.tar.gz
>>
>>
>> Now I have other issues getting Hive to use the Spark execution engine
>> (which requires Hive 1.1 or above).
>>
>> In Hive I run:
>>
>> set spark.home=/usr/lib/spark;
>> set hive.execution.engine=spark;
>> set spark.master=spark://127.0.0.1:7077;
>> set spark.eventLog.enabled=true;
>> set spark.eventLog.dir=/usr/lib/spark/logs;
>> set spark.executor.memory=512m;
>> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
>> use asehadoop;
>> select count(1) from t;
>>
>> I get the following:
>>
>> OK
>> Time taken: 0.753 seconds
>> Query ID = hduser_20151127003523_e9863e84-9a81-4351-939c-36b3bef36478
>> Total jobs = 1
>> Launching Job 1 out of 1
>> In order to change the average load for a reducer (in bytes):
>>   set hive.exec.reducers.bytes.per.reducer=<number>
>> In order to limit the maximum number of reducers:
>>   set hive.exec.reducers.max=<number>
>> In order to set a constant number of reducers:
>>   set mapreduce.job.reduces=<number>
>> Failed to execute spark task, with exception
>> 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark
>> client.)'
>> FAILED: Execution Error, return code 1 from
>> org.apache.hadoop.hive.ql.exec.spark.SparkTask
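>>
>> Two things I still need to rule out: whether anything is actually
>> listening on spark://127.0.0.1:7077, and whether Hive can see the Spark
>> assembly jar. A rough sketch of both checks (the assembly jar name is
>> from my 1.3.0 build and will differ in yours):
>>
>> # start a standalone master plus workers (conf/slaves defaults to
>> # localhost)
>> /usr/lib/spark/sbin/start-all.sh
>>
>> # link the assembly into $HIVE_HOME/lib, as the Hive on Spark wiki
>> # suggests, instead of relying on spark.home
>> ln -s /usr/lib/spark/lib/spark-assembly-1.3.0-hadoop2.4.0.jar \
>>   /usr/lib/hive/lib/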
>>
>> HTH,
>>
>> Mich
>>
>>
>> *From:* Sofia <sofia.panagiot...@taiger.com>
>> *Sent:* 18 November 2015 16:50
>> *To:* user@hive.apache.org
>> *Subject:* Hive version with Spark
>>
>> Hello
>>
>> After various failed attempts to use my Hive (1.2.1) with my Spark (1.4.1
>> built for Hadoop 2.2.0), I decided to try building Spark with Hive again.
>> I would like to know the latest Hive version that can be used to build
>> Spark at this point.
>>
>> When downloading the Spark 1.5 source and trying:
>>
>> *mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-1.2.1
>> -Phive-thriftserver  -DskipTests clean package*
>>
>> I get:
>>
>> *The requested profile "hive-1.2.1" could not be activated because it
>> does not exist.*
>>
>> Thank you
>> Sofia
>>
>>
>>
>
