Re: hive on spark - why is it so hard?

2017-10-02 Thread Jörn Franke
You should try with Tez+LLAP. Additionally, you will need to compare different configurations. Any generic comparison is meaningless: you should use the queries, data, and file formats that your users will actually use later. > On 2. Oct 2017, at 03:06, Stephen Sprague wrote: > > so... i made some pro
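Trying the suggested engine swap is a one-line session setting; a minimal sketch (these are the stock engine names, while LLAP itself needs additional HiveServer2 Interactive setup not shown here):

```sql
-- Sketch: run the same workload under each engine and compare timings.
set hive.execution.engine=tez;    -- alternatives: mr, spark
```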

Re: hive on spark - why is it so hard?

2017-10-01 Thread Stephen Sprague
So... I made some progress after much copying of jar files around (as alluded to by Gopal previously on this thread), following the instructions here: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started. Doing this as instructed will still leave off about a dozen or s

Re: hive on spark - why is it so hard?

2017-09-27 Thread Stephen Sprague
ok.. getting further. seems now i have to deploy hive to all nodes in the cluster - don't think i had to do that before but not a big deal to do it now. for me: HIVE_HOME=/usr/lib/apache-hive-2.3.0-bin/ SPARK_HOME=/usr/lib/spark-2.2.0-bin-hadoop2.6 on all three nodes now. i started spar

Re: hive on spark - why is it so hard?

2017-09-27 Thread Stephen Sprague
thanks. I haven't had a chance to dig into this again today but i do appreciate the pointer. I'll keep you posted. On Wed, Sep 27, 2017 at 10:14 AM, Sahil Takiar wrote: > You can try increasing the value of hive.spark.client.connect.timeout. > Would also suggest taking a look at the HoS Remote

Re: hive on spark - why is it so hard?

2017-09-27 Thread Sahil Takiar
You can try increasing the value of hive.spark.client.connect.timeout. Would also suggest taking a look at the HoS Remote Driver logs. The driver gets launched in a YARN container (assuming you are running Spark in yarn-client mode), so you just have to find the logs for that container. --Sahil O
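The two steps Sahil describes can be sketched as below; the timeout value and the application id are placeholders for illustration, not values from this thread:

```shell
# 1) Inside the hive CLI, raise the Hive-on-Spark client connect timeout
#    (value is illustrative):
echo "set hive.spark.client.connect.timeout=30000ms;"

# 2) Pull the YARN container logs for the remote driver once you have its
#    application id from hive.log (id below is a placeholder):
APP_ID="application_1506000000000_0001"
echo yarn logs -applicationId "$APP_ID"   # drop 'echo' to actually run it
```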

Re: hive on spark - why is it so hard?

2017-09-26 Thread Stephen Sprague
i _seem_ to be getting closer. Maybe its just wishful thinking. Here's where i'm at now. 2017-09-26T21:10:38,892 INFO [stderr-redir-1] client.SparkClientImpl: 17/09/26 21:10:38 INFO rest.RestSubmissionClient: Server responded with CreateSubmissionResponse: 2017-09-26T21:10:38,892 INFO [stderr

Re: hive on spark - why is it so hard?

2017-09-26 Thread Stephen Sprague
oh. i missed Gopal's reply. oy... that sounds foreboding. I'll keep you posted on my progress. On Tue, Sep 26, 2017 at 4:40 PM, Gopal Vijayaraghavan wrote: > Hi, > > > org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a > spark session: org.apache.hadoop.hive.ql.metadata.HiveExc

Re: hive on spark - why is it so hard?

2017-09-26 Thread Stephen Sprague
well this is the spark-submit line from above: 2017-09-26T14:04:45,678 INFO [4cb82b6d-9568-4518-8e00-f0cf7ac58cd3 main] client.SparkClientImpl: Running client driver with argv: /usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit, and that's pretty clearly v2.2. I do have other versions of

Re: hive on spark - why is it so hard?

2017-09-26 Thread Gopal Vijayaraghavan
Hi, > org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a spark > session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create > spark client. I get inexplicable errors with Hive-on-Spark unless I do a three step build. Build Hive first, use that version to build

Re: hive on spark - why is it so hard?

2017-09-26 Thread Sahil Takiar
Are you sure you are using Spark 2.2.0? Based on the stack trace, it looks like your call to spark-submit is using an older version of Spark (looks like some early 1.x version). Do you have SPARK_HOME set locally? Do you have older versions of Spark installed locally? --Sahil On Tue, Sep 26, 2017
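A quick way to check both of Sahil's questions at once; this sketch only inspects the local environment:

```shell
# Quick sanity check for the "wrong Spark version" symptom:
# what will a bare `spark-submit` actually resolve to on this box?
echo "SPARK_HOME=${SPARK_HOME:-<unset>}"
command -v spark-submit || echo "spark-submit not on PATH"
```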

Re: hive on spark - why is it so hard?

2017-09-26 Thread Stephen Sprague
thanks Sahil. here it is. Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/scheduler/SparkListenerInterface at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:344) at org.apache.spark.deploy.SparkSubmit$.launch(Spark

Re: hive on spark - why is it so hard?

2017-09-26 Thread Sahil Takiar
Hey Stephen, Can you send the full stack trace for the NoClassDefFoundError? For Hive 2.3.0, we only support Spark 2.0.0. Hive may work with more recent versions of Spark, but we only test with Spark 2.0.0. --Sahil On Tue, Sep 26, 2017 at 2:35 PM, Stephen Sprague wrote: > * i've installed hive

Re: Hive on Spark

2017-08-22 Thread Vihang Karajgaonkar
Xuefu is planning to give a talk on Hive-on-Spark @ Uber at the user meetup this week. We can check if he can share the presentation on this list for folks who can't attend the meetup. https://www.meetup.com/Hive-User-Group-Meeting/events/242210487/ On Mon, Aug 21, 2017 at 11:44 PM, peter zhang wrote:

Re: hive on spark - version question

2017-03-17 Thread Stephen Sprague
yeah but... is the glass half-full or half-empty? sure this might suck but keep your head high, bro! Lots of it (hive) does work. :) On Fri, Mar 17, 2017 at 2:25 PM, hernan saab wrote: > Stephan, > > Thanks for the response. > > The one thing that I don't appreciate from those who promote and

Re: hive on spark - version question

2017-03-17 Thread hernan saab
Stephan, Thanks for the response. The one thing that I don't appreciate from those who promote and DOCUMENT spark on hive is that, seemingly, there is absolutely no evidence seen that says that hive on spark WORKS. As a matter of fact, after a lot of pain, I noticed it is not supported by just a

Re: hive on spark - version question

2017-03-17 Thread Stephen Sprague
thanks for the comments and for sure all relevant. And yeah I feel the pain just like the next guy but that's the part of the opensource "life style" you subscribe to when using it. The upside payoff has gotta be worth the downside risk - or else forget about it right? Here in the Hive world in my

Re: hive on spark - version question

2017-03-17 Thread Edward Capriolo
On Fri, Mar 17, 2017 at 2:56 PM, hernan saab wrote: > I have been in a similar world of pain. Basically, I tried to use an > external Hive to have user access controls with a spark engine. > At the end, I realized that it was a better idea to use apache tez instead > of a spark engine for my part

Re: hive on spark - version question

2017-03-17 Thread hernan saab
I have been in a similar world of pain. Basically, I tried to use an external Hive to have user access controls with a Spark engine. At the end, I realized that it was a better idea to use Apache Tez instead of a Spark engine for my particular case. But the journey is what I want to share with yo

Re: hive on spark - version question

2017-03-17 Thread Stephen Sprague
:( gettin' no love on this one. any SME's know if Spark 2.1.0 will work with Hive 2.1.0 ? That JavaSparkListener class looks like a deal breaker to me, alas. thanks in advance. Cheers, Stephen. On Mon, Mar 13, 2017 at 10:32 PM, Stephen Sprague wrote: > hi guys, > wondering where we stand w

RE: Hive on Spark not working

2016-11-29 Thread Joaquin Alzola
Being unable to integrate Hive with Spark separately, I just started the Thrift server directly on Spark. Now it is working as expected. From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com] Sent: 29 November 2016 11:12 To: user Subject: Re: Hive on Spark not working Hive on Spark engine

RE: Hive on Spark not working

2016-11-29 Thread Joaquin Alzola
(preview contains only the quoted header and signature of the reply below)

Re: Hive on Spark not working

2016-11-29 Thread Mich Talebzadeh
Hive on Spark engine only works with Spark 1.3.1. Dr Mich Talebzadeh, LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw, http://talebzadehmich.wordpress.com

Re: Hive on Spark not working

2016-11-28 Thread Furcy Pin
ClassNotFoundException generally means that jars are missing from your class path. You probably need to link the spark jar to $HIVE_HOME/lib https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-ConfiguringHive On Tue, Nov 29, 2016 at 2:03 AM
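The linking step Furcy points at can be sketched like this. To keep the example self-contained it uses throwaway directories; on a real install, SPARK_HOME and HIVE_HOME point at the actual trees, and the jar name depends on your Spark and Scala versions:

```shell
# Demo of the jar linking the wiki describes, using stand-in directories
# so the sketch runs anywhere. Jar name is illustrative.
SPARK_HOME="$(mktemp -d)"; HIVE_HOME="$(mktemp -d)"
mkdir -p "$SPARK_HOME/jars" "$HIVE_HOME/lib"
touch "$SPARK_HOME/jars/spark-core_2.11-2.0.0.jar"    # stand-in for the real jar
ln -sf "$SPARK_HOME/jars/spark-core_2.11-2.0.0.jar" "$HIVE_HOME/lib/"
ls "$HIVE_HOME/lib"    # prints: spark-core_2.11-2.0.0.jar
```

On a real cluster you would run only the `ln -sf` line, against the genuine directories.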

Re: Hive on Spark - Mesos

2016-09-15 Thread Mich Talebzadeh
Sorry, on YARN only, but I gather it should work with Mesos. I don't think that comes into it. The issue is the compatibility of the Spark assembly library with Hive. HTH Dr Mich Talebzadeh

Re: Hive on Spark - Mesos

2016-09-15 Thread John Omernik
Did you run it on Mesos? Your presentation doesn't mention Mesos at all... John On Thu, Sep 15, 2016 at 4:20 PM, Mich Talebzadeh wrote: > Yes you can. Hive on Spark meaning Hive using Spark as its execution > engine works fine. The version that I managed to make it work is any Hive > version>

Re: Hive On Spark - ORC Table - Hive Streaming Mutation API

2016-09-14 Thread Benjamin Schaff
Hi, Thanks for the answer. I am running on a custom build of spark 1.6.2 meaning the one given in the hive documentation so without hive jars. I set it up in hive-env.sh. I created the istari table like in the documentation and I run INSERT on it then a GROUP BY. Everything went on spark standal

Re: Hive On Spark - ORC Table - Hive Streaming Mutation API

2016-09-14 Thread Mich Talebzadeh
Hi, You are using Hive 2. What is the Spark version that runs as Hive execution engine? I cannot see spark.home in your hive-site.xml so I cannot figure it out. BTW you are using Spark standalone as the mode. I tend to use yarn-client. Now back to the above issue. Do other queries work OK with

Re: hive on spark job not start enough executors

2016-09-09 Thread 明浩 冯
ble from the parquet. Thanks, Minghao Feng From: Mich Talebzadeh Sent: Friday, September 9, 2016 4:49:55 PM To: user Subject: Re: hive on spark job not start enough executors when you start hive on spark do you set any parameters for the submitted job (or read them f

Re: hive on spark job not start enough executors

2016-09-09 Thread Mich Talebzadeh
when you start hive on spark do you set any parameters for the submitted job (or read them from init file)? set spark.master=yarn; set spark.deploy.mode=client; set spark.executor.memory=3g; set spark.driver.memory=3g; set spark.executor.instances=2; set spark.ui.port=; Dr Mich Talebzadeh
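Collected from the message, the typical session settings look like this (memory sizes and instance counts are whatever fits your cluster; the truncated spark.ui.port value is left out):

```sql
set hive.execution.engine=spark;
set spark.master=yarn;
set spark.deploy.mode=client;
set spark.executor.memory=3g;
set spark.driver.memory=3g;
set spark.executor.instances=2;
```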

Re: Hive on spark

2016-08-01 Thread Mich Talebzadeh
(preview contains only the email disclaimer and quoted text from earlier in the thread)

Re: Hive on spark

2016-07-31 Thread Chandrakanth Akkinepalli
(preview contains only quoted headers and text from earlier in the thread)

Re: Hive on spark

2016-07-28 Thread Mudit Kumar
Thanks Guys for the help! Thanks, Mudit From: Mich Talebzadeh Reply-To: Date: Thursday, July 28, 2016 at 9:43 AM To: user Subject: Re: Hive on spark Hi, I made a presentation in London on 20th July on this subject:. In that I explained how to make Spark work as an execution engine for

Re: Hive on spark

2016-07-27 Thread Mich Talebzadeh
(preview contains only the email disclaimer and quoted text from earlier in the thread)

Re: Hive on spark

2016-07-27 Thread karthi keyan
(preview contains only quoted headers and text from earlier in the thread)

Re: Hive on spark

2016-07-27 Thread Mudit Kumar
Yes Mich, exactly. Thanks, Mudit From: Mich Talebzadeh Reply-To: Date: Thursday, July 28, 2016 at 1:08 AM To: user Subject: Re: Hive on spark You mean you want to run Hive using Spark as the execution engine which uses Yarn by default? Something like below hive> select max(id) f

Re: Hive on spark

2016-07-27 Thread Mich Talebzadeh
You mean you want to run Hive using Spark as the execution engine which uses Yarn by default? Something like below hive> select max(id) from oraclehadoop.dummy_parquet; Starting Spark Job = 8218859d-1d7c-419c-adc7-4de175c3ca6d Query Hive on Spark job[1] stages: 2 3 Status: Running (Hive on Spark

Re: Hive on Spark engine

2016-03-26 Thread Mich Talebzadeh
Thanks Ted. More interested in general availability of Hive 2 on the Spark 1.6 engine, as opposed to vendor-specific custom builds. Dr Mich Talebzadeh LinkedIn https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Re: Hive on Spark engine

2016-03-26 Thread Ted Yu
According to: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_HDP_RelNotes/bk_HDP_RelNotes-20151221.pdf Spark 1.5.2 comes out of box. Suggest moving questions on HDP to Hortonworks forum. Cheers On Sat, Mar 26, 2016 at 3:32 PM, Mich Talebzadeh wrote: > Thanks Jorn. > > Just to be

Re: Hive on Spark engine

2016-03-26 Thread Mich Talebzadeh
Thanks Jorn. Just to be clear they get Hive working with Spark 1.6 out of the box (binary download)? The usual work-around is to build your own package and get the Hadoop-assembly jar file copied over to $HIVE_HOME/lib. Cheers Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/

Re: Hive on Spark engine

2016-03-26 Thread Jörn Franke
If you check the newest Hortonworks distribution then you see that it generally works. Maybe you can borrow some of their packages. Alternatively it should be also available in other distributions. > On 26 Mar 2016, at 22:47, Mich Talebzadeh wrote: > > Hi, > > I am running Hive 2 and now Spar

Re: Hive on Spark performance

2016-03-14 Thread sjayatheertha
Thanks for your response. We were evaluating Spark and were curious to know how it is used today and the lowest latency it can provide. > On Mar 14, 2016, at 8:37 AM, Mich Talebzadeh > wrote: > > Hi Wlodeck, > > Let us look at this. > > In Oracle I have two tables channels and sales. This c

Re: Hive on Spark performance

2016-03-14 Thread Mich Talebzadeh
Hi Wlodeck, Let us look at this. In Oracle I have two tables channels and sales. This code works in Oracle 1 select c.channel_id, sum(c.channel_id * (select count(1) from sales s WHERE c.channel_id = s.channel_id)) As R 2 from channels c 3* group by c.channel_id s...@mydb.mich.LOCAL> / C
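Re-wrapped from the SQL*Plus listing in the message, the Oracle correlated-subquery example being discussed is:

```sql
select c.channel_id,
       sum(c.channel_id * (select count(1)
                           from sales s
                           where c.channel_id = s.channel_id)) as R
from channels c
group by c.channel_id;
```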

Re: Hive on Spark performance

2016-03-14 Thread ws
Hive 1.2.1.2.3.4.0-3485, Spark 1.5.2, Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production ### SELECT f.description, f.item_number, sum(f.df_a * (select count(1) from e.mv_A_h_a where hb_h_name = r.h_id)) as df_a FROM e.eng_fac_atl_sc_bf_qty f, wv_ATL_2_qty_df_rates r where f.

Re: Hive on Spark performance

2016-03-13 Thread Mich Talebzadeh
Depending on the version of Hive on Spark engine. As far as I am aware the latest version of Hive that I am using (Hive 2) has improvements compared to the previous versions of Hive (0.14,1.2.1) on Spark engine. As of today I have managed to use Hive 2.0 on Spark version 1.3.1. So it is not the l

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Elliot West
(preview contains only a quoted query-result table)

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Mich Talebzadeh
nt: 04 February 2016 17:41 To: user@hive.apache.org Subject: Re: Hive on Spark Engine versus Spark using Hive metastore Hive is not the correct tool for every problem. Use the tool that makes the most sense for your problem and your experience. Many people like hive because it is genera

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Koert Kuipers
The reality is that once you start factoring in the numerous tuning parameters of the systems and jobs there probably isn't a clear answer. For some queries, the Catalyst optimizer may do a better job... is it going to do a better job wi

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Edward Capriolo
(preview contains only quoted text from earlier in the thread)

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Koert Kuipers
(preview contains only the quoted header and text of an earlier message)

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Edward Capriolo
(preview contains only a quoted signature block)

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Stephen Sprague
(preview contains only a quoted signature block)

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Mich Talebzadeh
(preview contains only a quoted email disclaimer and header)

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Mich Talebzadeh
From: Xuefu Zhang [mailto:xzh...@cloudera.com] Sent: 03 February 2016 12:47 To: user@hive.apache.org Subject: Re: Hive on Spark Engine versus Spark using Hive metastore In YARN or standalone mode, you can set spark.executor.cores to utilize all cor

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Koert Kuipers
(preview contains only a quoted query-result table)

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Edward Capriolo
(preview contains only a quoted query-result table)

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Xuefu Zhang
(preview contains only a quoted email disclaimer and header)

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Mich Talebzadeh
Hi Jeff, I only have a two-node cluster. Is there any way one can simulate additional parallel runs in such an environment, thus having more than two maps? Thanks, Dr Mich Taleb

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Mich Talebzadeh
From: Xuefu Zhang [mailto:xzh...@cloudera.com] Sent: 03 February 2016 02:39 To: user@hive.apache.org Subject: Re: Hive on Spark Engine versus Spark using Hive metastore Yes, regardless of what Spark mode you're running in, from the Spark AM web UI,

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Jörn Franke
(preview contains only a quoted query-result table)

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Ryan Harris
For some queries, the Catalyst optimizer may do a better job... is it going to do a better job with ORC-based data? Less likely, IMO. From: Koert Kuipers [mailto:ko...@tresata.com] Sent: Tuesday, February 02, 2016 9:50 PM To: user@hive.apache.org Subject: Re: Hive on Spark Engine versus Spark u

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Koert Kuipers
(preview contains only a quoted query-result table)

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Koert Kuipers
(quoted results) 3 rows selected (80.718 seconds). Three runs returning the same rows in 80 seconds.

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Koert Kuipers
(preview contains only a quoted email disclaimer and header)

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Xuefu Zhang
(preview contains only a quoted email disclaimer)

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Edward Capriolo
(quoted query) FROM sales s, times t, channels c WHERE s.time_id = t.time_id AND s.channel_id = c.channel_id GROUP BY t.calendar_month_desc, c.channel_desc

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Mich Talebzadeh
From: Koert Kuipers [mailto:ko...@tresata.com] Sent: 03 February 2016 00:09 To: user@hive.apache.org Subject: Re: Hive on Spark Engine versus Spark using Hive metastore uuuhm, with Spark using the Hive metastore you actually have a real programming environm

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Mich Talebzadeh
Sent: 03 February 2016 00:09 To: user@hive.apache.org Subject: Re: Hive on Spark Engine versus Spark using Hive metastore uuuhm, with Spark using the Hive metastore you actually have a real programming environment and you can write real functions, versus just being boxed into some version of sq

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Koert Kuipers
(preview contains only a quoted signature block)

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Xuefu Zhang
(preview contains only a quoted signature block)

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Mich Talebzadeh
(preview contains only a quoted email disclaimer and header)

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Philip Lee
From my experience, Spark SQL has its own optimizer to support Hive queries and the metastore. After Spark 1.5.2, its optimizer is named Catalyst. On Feb 3, 2016 at 12:12 AM, "Xuefu Zhang" wrote: > I think the diff is not only about which does optimization but more on > feature parity. Hive on Spark offers a

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Xuefu Zhang
I think the diff is not only about which does optimization but more on feature parity. Hive on Spark offers all functional features that Hive offers and these features play out faster. However, Spark SQL is far from offering this parity as far as I know. On Tue, Feb 2, 2016 at 2:38 PM, Mich Talebz

Re: Hive on Spark task running time is too long

2016-01-11 Thread Xuefu Zhang
You should check the executor log to find out why it failed; there may be more explanation there. --Xuefu On Sun, Jan 10, 2016 at 11:21 PM, Jone Zhang wrote: > *I have submited a application many times.* > *Most of applications running correctly.See attach 1.* > *But one of the them breaks as expecte

RE: hive on spark

2015-12-18 Thread Mich Talebzadeh
Hi, Your statement “I read that this is due to something not being compiled against the correct hadoop version. my main question what is the binary/jar/file that can cause this?” I believe this is the file in $HIVE_HOME/lib called spark-assembly-1.3.1-hadoop2.4.0.jar which you need to b

Re: Hive on Spark throw java.lang.NullPointerException

2015-12-18 Thread Xuefu Zhang
Could you create a JIRA with repro case? Thanks, Xuefu On Thu, Dec 17, 2015 at 9:21 PM, Jone Zhang wrote: > *My query is * > set hive.execution.engine=spark; > select > > t3.pcid,channel,version,ip,hour,app_id,app_name,app_apk,app_version,app_type,dwl_tool,dwl_status,err_type,dwl_store,dwl_maxs

Re: Hive on Spark - Error: Child process exited before connecting back

2015-12-17 Thread Xuefu Zhang
(preview contains only a quoted signature block)

Re: Hive on Spark - Error: Child process exited before connecting back

2015-12-17 Thread Ophir Etzion
(preview contains only a quoted email disclaimer)

Re: Hive on Spark - Error: Child process exited before connecting back

2015-12-15 Thread Xuefu Zhang
(preview contains only a quoted email disclaimer and header)

Re: Hive on Spark - Error: Child process exited before connecting back

2015-12-15 Thread Xuefu Zhang
Ophir, can you provide your hive.log here? Also, have you checked your Spark application log? When this happens, it usually means that Hive is not able to launch a Spark application. In the case of Spark on YARN, this application is the application master. If Hive fails to launch it, or the applicat

RE: Hive on Spark - Error: Child process exited before connecting back

2015-12-15 Thread Mich Talebzadeh
(preview contains only a quoted email disclaimer and header)

Re: Hive on Spark - Error: Child process exited before connecting back

2015-12-15 Thread Ophir Etzion
Hi, the versions are spark 1.3.0 and hive 1.1.0 as part of cloudera 5.4.3. I find it weird that it would work only on the version you mentioned as there is documentation (not good documentation but still..) on how to do it with cloudera that packages different versions. Thanks for the answer tho

RE: Hive on Spark - Error: Child process exited before connecting back

2015-12-15 Thread Mich Talebzadeh
Hi, The only version that I have managed to run Hive using Spark engine is Spark 1.3.1 on Hive 1.2.1 Can you confirm the version of Spark you are running? FYI, Spark 1.5.2 will not work with Hive. HTH Mich Talebzadeh Sybase ASE 15 Gold Medal Award 2008 A Winning Strategy:

Re: Hive on Spark application will be submited more times when the queue resources is not enough.

2015-12-09 Thread Xuefu Zhang
Hi Jone, thanks for reporting the problem. When you say there is not enough resource, do you mean that you cannot launch YARN application masters? I feel that we should error out right away if the application cannot be submitted. Any attempt at resubmission seems problematic. I'm not sure if there i

Re: Hive on Spark application will be submited more times when the queue resources is not enough.

2015-12-09 Thread Jone Zhang
It seems that the number of submissions depends on the number of stages in the query. This query includes three stages. If queue resources are still not enough after submitting three applications, the Hive client will close: "Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException

Re: Hive on Spark application will be submited more times when the queue resources is not enough.

2015-12-09 Thread Jone Zhang
> But in some cases all of the applications will fail, caused by SparkContext not initializing after waiting for 15 ms. > See attachment (hive.spark.client.server.connect.timeout is set to 5 min). The error log is different from the original mail. Container: container_1448873753366_11

Re: Hive on Spark application will be submited more times when the queue resources is not enough.

2015-12-09 Thread Jone Zhang
Hive version is 1.2.1, Spark version is 1.4.1, Hadoop version is 2.5.1. The application_1448873753366_121062 will succeed in the above mail. But in some cases all of the applications will fail, caused by SparkContext not initializing after waiting for 15 ms. See attachment (hive.spark.clie

Re: Hive on spark table caching

2015-12-02 Thread Xuefu Zhang
Depending on the query, Hive on Spark does implicitly cache datasets (not necessarily the input tables) for performance benefits. Such queries include multi-insert, self-join, self-union, etc. However, no caching happens across queries at this time, which may be improved in the future. Thanks, Xue
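For illustration, a multi-insert of the kind Xuefu mentions, where Hive on Spark can serve both inserts from one scan of the source rather than re-reading it per insert (table and column names are made up for the example):

```sql
from web_logs
insert overwrite table hits_by_day  select log_date, count(1) group by log_date
insert overwrite table hits_by_user select user_id,  count(1) group by user_id;
```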

RE: Hive on spark table caching

2015-12-02 Thread Mich Talebzadeh
(preview contains only quoted text from Udit Mehta's message below)

Re: Hive on spark table caching

2015-12-02 Thread Udit Mehta
I'm using Spark 1.3 with Hive 1.2.1. I don't mind using a version of Spark higher than that, but I read somewhere that 1.3 is the version of Spark currently supported by Hive. Can I use Spark 1.4 or 1.5 with Hive 1.2.1? On Wed, Dec 2, 2015 at 3:19 PM, Mich Talebzadeh wrote: > Hi, > Which vers

RE: Hive on spark table caching

2015-12-02 Thread Mich Talebzadeh
Hi, Which version of spark are you using please? Mich Talebzadeh Sybase ASE 15 Gold Medal Award 2008 A Winning Strategy: Running the most Critical Financial Data on ASE 15 http://login.sybase.com/file

Re: Hive on Spark - Hadoop 2 - Installation - Ubuntu

2015-11-26 Thread Dasun Hegoda
(preview contains only a quoted fragment of Spark event-log configuration)

Re: Hive on Spark - Hadoop 2 - Installation - Ubuntu

2015-11-23 Thread Dasun Hegoda
(preview contains only a quoted signature block)

Re: Hive on Spark - Hadoop 2 - Installation - Ubuntu

2015-11-23 Thread Dasun Hegoda
(preview contains only a quoted signature block)

Re: Hive on Spark - Hadoop 2 - Installation - Ubuntu

2015-11-23 Thread Dasun Hegoda
(preview contains only a quoted email disclaimer)

RE: Hive on Spark - Hadoop 2 - Installation - Ubuntu

2015-11-23 Thread Mich Talebzadeh
(preview contains only a quoted email disclaimer and header)

Re: Hive on Spark - Hadoop 2 - Installation - Ubuntu

2015-11-23 Thread Dasun Hegoda
(preview contains only a quoted signature block)

RE: Hive on Spark - Hadoop 2 - Installation - Ubuntu

2015-11-23 Thread Mich Talebzadeh
(preview contains only a quoted email disclaimer and header)

Re: Hive on Spark - Hadoop 2 - Installation - Ubuntu

2015-11-22 Thread Dasun Hegoda
(preview contains only a quoted link, signature, and header from the reply below)

Re: Hive on Spark - Hadoop 2 - Installation - Ubuntu

2015-11-21 Thread Dasun Hegoda
(preview contains only a quoted link, signature, and header from the reply below)

Re: Hive on Spark - Hadoop 2 - Installation - Ubuntu

2015-11-20 Thread Sai Gopalakrishnan
(preview contains only a quoted product link, a sign-off by Sai, and the quoted header of the reply below)
