This is great news, sir. It shows that perseverance pays off in the end. Could you let us know when the write-up is ready so that I can set it up as well, please? I know a bit about the advantages of running Hive on the Spark engine. However, my general question is: when should one use Hive on Spark as opposed to Hive on the MapReduce engine? Thanks again.

On Monday, 7 December 2015, 15:50, Mich Talebzadeh <m...@peridale.co.uk> wrote:

For those interested

From: Mich Talebzadeh [mailto:m...@peridale.co.uk]
Sent: 06 December 2015 20:33
To: user@hive.apache.org
Subject: Managed to make Hive run on Spark engine

Thanks all, especially to Xuefu, for contributions.

Finally it works, which means don't give up until it works :)

hduser@rhes564::/usr/lib/hive/lib> hive
Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-1.2.1.jar!/hive-log4j.properties
hive> set spark.home= /usr/lib/spark-1.3.1-bin-hadoop2.6;
hive> set hive.execution.engine=spark;
hive> set spark.master=spark://50.140.197.217:7077;
hive> set spark.eventLog.enabled=true;
hive> set spark.eventLog.dir= /usr/lib/spark-1.3.1-bin-hadoop2.6/logs;
hive> set spark.executor.memory=512m;
hive> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
hive> set hive.spark.client.server.connect.timeout=220000ms;
hive> set spark.io.compression.codec=org.apache.spark.io.LZFCompressionCodec;
hive> use asehadoop;
OK
Time taken: 0.638 seconds
hive> select count(1) from t;
Query ID = hduser_20151206200528_4b85889f-e4ca-41d2-9bd2-1082104be42b
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Spark Job = c8fee86c-0286-4276-aaa1-2a5eb4e4958a

Query Hive on Spark job[0] stages:
0
1

Status: Running (Hive on Spark job[0])
Job Progress Format
CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount [StageCost]
2015-12-06 20:05:36,299 Stage-0_0: 0(+1)/1 Stage-1_0: 0/1
2015-12-06 20:05:39,344 Stage-0_0: 1/1 Finished Stage-1_0: 0(+1)/1
2015-12-06 20:05:40,350 Stage-0_0: 1/1 Finished Stage-1_0: 1/1 Finished
Status: Finished successfully in 8.10 seconds
OK

The versions used for this project:

OS version: Linux version 2.6.18-92.el5xen (brewbuil...@ls20-bc2-13.build.redhat.com) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-41)) #1 SMP Tue Apr 29 13:31:30 EDT 2008
Hadoop 2.6.0
Hive 1.2.1
spark-1.3.1-bin-hadoop2.6 (downloaded as the prebuilt spark-1.3.1-bin-hadoop2.6.gz, used for
starting the Spark standalone cluster)

The jar file used in $HIVE_HOME/lib to link Hive to Spark was spark-assembly-1.3.1-hadoop2.4.0.jar (built from the source, downloaded as the zipped file spark-1.3.1.gz, with the command line

make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"

)

It is pretty picky about parameters, CLASSPATH, IP addresses or hostnames etc. to make it work.

I will create a full guide on how to build Hive and make it run with Spark as its engine (as opposed to MR).

HTH

Mich Talebzadeh
http://talebzadehmich.wordpress.com/
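
Since the session above types the same `set` commands by hand each time, one option is to keep them in a Hive init file. The following is only a sketch, not part of the original setup: the function name `render_init_script` and the file names in the usage note are hypothetical, and the values are simply copied from the session above (with the stray whitespace after `=` dropped), so they are specific to that cluster.

```python
# Settings copied from the Hive session quoted above.
# They are specific to that cluster (paths, master URL, memory size).
SETTINGS = {
    "spark.home": "/usr/lib/spark-1.3.1-bin-hadoop2.6",
    "hive.execution.engine": "spark",
    "spark.master": "spark://50.140.197.217:7077",
    "spark.eventLog.enabled": "true",
    "spark.eventLog.dir": "/usr/lib/spark-1.3.1-bin-hadoop2.6/logs",
    "spark.executor.memory": "512m",
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "hive.spark.client.server.connect.timeout": "220000ms",
    "spark.io.compression.codec": "org.apache.spark.io.LZFCompressionCodec",
}


def render_init_script(settings):
    """Return init-file text with one `set key=value;` statement per line.

    Hypothetical helper for illustration; it only formats text and does
    not talk to Hive or Spark.
    """
    return "".join(f"set {key}={value};\n" for key, value in settings.items())


if __name__ == "__main__":
    # Print the statements so they can be redirected into a file.
    print(render_init_script(SETTINGS), end="")
```

Assuming the script is saved as make_init.py (a made-up name), `python make_init.py > hive-on-spark.hql` writes the file, and `hive -i hive-on-spark.hql` starts the CLI with those settings applied before the first prompt.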