Hi Rajesh,

I'm trying to further my understanding of the various interactions and
set-ups for Hive + Spark.
My understanding so far is that running queries against the SparkThriftServer
uses the Spark SQL engine, whereas HiveServer2 with Hive on Spark uses Hive
primitives and only uses Spark for the actual computations.

I get your question about "why would I do that?", but my goal right now is to
understand "what does it mean if I do that?". To make the distinction
concrete, I've put a sketch of the HiveServer2-side set-up below the quoted
thread.

Best regards,
Daniel

On Tue, 12 Mar 2019, 02:21 Rajesh Balamohan, <rbalamo...@apache.org> wrote:

> Not sure why you are using SparkThriftServer. OOTB HiveServer2 would be
> good enough for this.
>
> Is there any specific reason for moving from tez to spark as the execution
> engine?
>
> ~Rajesh.B
>
> On Mon, Mar 11, 2019 at 9:45 PM Daniel Mateus Pires <dmate...@gmail.com>
> wrote:
>
>> Hi there,
>>
>> I would like to run Hive using Spark as the execution engine, and I'm
>> pretty confused by the set-up.
>>
>> For reference, I'm using AWS EMR.
>>
>> First, I'm confused about the difference between running Hive with Spark
>> as its execution engine and sending queries to it through HiveServer2
>> (Thrift), versus using the SparkThriftServer (I thought it was built on
>> top of HiveServer2)? Could I read more about the differences somewhere?
>>
>> I followed these docs:
>> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>> and after changing the execution engine from the EMR default (tez) to
>> spark, I can see the difference in the HiveServer2 UI at port 10002,
>> where the steps now show "spark" as the execution engine.
>>
>> However, I've set up the following config to get the Spark History Server
>> to display queries coming through JDBC, and I can see queries sent to the
>> SparkThriftServer (port 10001) but not those sent to HiveServer2 with
>> Spark as the execution engine (port 10000):
>>
>> set spark.eventLog.enabled=true;
>> set spark.master=localhost:18080;
>> set spark.eventLog.dir=hdfs:///var/log/spark/apps;
>> set spark.executor.memory=512m;
>> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
>>
>> Thanks!
>>
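P.S. Here is the sketch I mentioned: a minimal beeline session against
HiveServer2 (port 10000) with Spark as the execution engine. The property
names come from the Hive on Spark wiki page linked in my first mail; the
values (spark.master=yarn, the 512m executor memory) and my_table are only
placeholders for an EMR-style YARN cluster, not a config I have verified
end to end.

  -- Switch this session from the default engine to Spark
  -- (hive.execution.engine is the property from the wiki page).
  set hive.execution.engine=spark;

  -- spark.master should point at the cluster manager; on EMR that is YARN.
  -- (18080 is the Spark History Server UI port, not a master address.)
  set spark.master=yarn;
  set spark.executor.memory=512m;

  -- Event logging so the resulting Spark application shows up in the
  -- Spark History Server.
  set spark.eventLog.enabled=true;
  set spark.eventLog.dir=hdfs:///var/log/spark/apps;

  -- From here on, queries are compiled by Hive but executed as Spark jobs;
  -- my_table is just a stand-in for a real table.
  select count(*) from my_table;

The SparkThriftServer path (port 10001 here) is reached over the same
HiveServer2/JDBC protocol (e.g. jdbc:hive2://host:10001), but there the
query is planned and executed by Spark SQL itself rather than by Hive.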