Re: Running Hive on Spark

Rajesh Balamohan Mon, 11 Mar 2019 19:21:25 -0700

I'm not sure why you are using the SparkThriftServer. HiveServer2 out of the
box should be good enough for this.


Is there any specific reason for moving from Tez to Spark as the execution
engine?

~Rajesh.B

On Mon, Mar 11, 2019 at 9:45 PM Daniel Mateus Pires <dmate...@gmail.com>
wrote:

> Hi there,
>
> I would like to run Hive using Spark as the execution engine and I'm
> pretty confused by the setup.
>
> For reference I'm using AWS EMR.
>
> First, I'm confused about the difference between running Hive with Spark as
> its execution engine (sending queries to Hive through HiveServer2 over
> Thrift) and using the SparkThriftServer, which I thought was built on top of
> HiveServer2. Could I read more about the differences somewhere?
>
> I followed these docs:
> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
> and after changing the execution engine from the EMR default (Tez) to
> Spark, I can see the difference on the HiveServer2 UI at port 10002, where
> the steps now show "spark" as the execution engine.
>
> However, I've set up the following config so that the Spark History Server
> displays queries coming through JDBC. I can see queries sent to the
> SparkThriftServer (port 10001), but not those sent to HiveServer2 with
> Spark as the execution engine (port 10000):
>
> set spark.eventLog.enabled=true;
> set spark.master=localhost:18080;
> set spark.eventLog.dir=hdfs:///var/log/spark/apps;
> set spark.executor.memory=512m;
> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
>
> Thanks!
>
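
A note on the settings quoted above: 18080 is the default port of the Spark
History Server UI, not a cluster manager address, so
spark.master=localhost:18080 is probably not what was intended. Below is a
minimal sketch of the same session-level settings, assuming Spark runs on YARN
(the usual arrangement on EMR). The spark.master value is an assumption and
does not come from the thread; the other values are copied from the quoted
config.

```sql
-- Sketch only: run in the Hive session before submitting the query.
set hive.execution.engine=spark;
-- Assumption: Spark is scheduled by YARN rather than a standalone master.
set spark.master=yarn;
set spark.eventLog.enabled=true;
set spark.eventLog.dir=hdfs:///var/log/spark/apps;
set spark.executor.memory=512m;
set spark.serializer=org.apache.spark.serializer.KryoSerializer;
```

For the History Server to list these applications, its
spark.history.fs.logDirectory has to point at the same directory as
spark.eventLog.dir.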
