Re: Hive on Spark engine

2016-03-26 Thread Mich Talebzadeh
Thanks Ted. I am more interested in the general availability of Hive 2 on the Spark 1.6 engine, as opposed to vendor-specific custom builds.

Re: Hive on Spark engine

2016-03-26 Thread Ted Yu
According to the HDP 2.3.4 release notes (https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_HDP_RelNotes/bk_HDP_RelNotes-20151221.pdf), Spark 1.5.2 comes out of the box. I suggest moving questions about HDP to the Hortonworks forum. Cheers

Re: Hive on Spark engine

2016-03-26 Thread Mich Talebzadeh
Thanks Jorn. Just to be clear, do they get Hive working with Spark 1.6 out of the box (binary download)? The usual workaround is to build your own package and copy the Hadoop-assembly jar file over to $HIVE_HOME/lib. Cheers

Re: Hive on Spark engine

2016-03-26 Thread Jörn Franke
If you check the newest Hortonworks distribution, you will see that it generally works. Maybe you can borrow some of their packages. Alternatively, it should also be available in other distributions.

Hive on Spark engine

2016-03-26 Thread Mich Talebzadeh
Hi, I am running Hive 2 and now Spark 1.6.1, but I still do not see any sign that Hive can utilise a Spark engine higher than 1.3.1. My understanding was that there was a mismatch in the Hadoop assembly jar files that prevents Hive from running on Spark using the binary downloads. I just tried H

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Elliot West
| 999 | 188 | abQyrlxKzPTJliMqDpsfDTJUQzdNdfofUQhrKqXvRKwulZAoJe | 10 | xx |

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Mich Talebzadeh
Sent: 04 February 2016 17:41 To: user@hive.apache.org Subject: Re: Hive on Spark Engine versus Spark using Hive metastore Hive is not the correct tool for every problem. Use the tool that makes the most sense for your problem and your experience. Many people like hive because it is genera

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Koert Kuipers
The reality is that once you start factoring in the numerous tuning parameters of the systems and jobs there probably isn't a clear answer. For some queries, the Catalyst optimizer may do a better job...is it going to do a better job wi

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Edward Capriolo
any given hive job). The reality is that once you start factoring in the numerous tuning parameters of the systems and jobs there probably isn't a clear answer. For some queries, the Catalyst optimizer may do a better job...is it going

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Koert Kuipers
From: Koert Kuipers [mailto:ko...@tresata.com] Sent: Tuesday, February 02, 2016 9:50 PM To: user@hive.apache.org Subject: Re: Hive on Spark Engine versus Spark using Hive metastore yeah but have you ever seen someone write a real analytical pro

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Edward Capriolo

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Stephen Sprague

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Mich Talebzadeh
From: Mich Talebzadeh [mailto:m...@peridale.co.uk] Sent: 03 February 2016 16:21 To: user@

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Mich Talebzadeh
From: Xuefu Zhang [mailto:xzh...@cloudera.com] Sent: 03 February 2016 12:47 To: user@hive.apache.org Subject: Re: Hive on Spark Engine versus Spark using Hive metastore In YARN or standalone mode, you can set spark.executor.cores to utilize all cor
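As a hedged illustration of the setting mentioned in this reply: with Hive on Spark, Spark properties can be set from within the Hive session before the first query launches the Spark application. The values below are placeholders rather than recommendations, and they assume each worker node has at least 4 cores and enough memory to spare for an executor.

```sql
-- Assumed values for illustration only; size them to your own nodes.
set hive.execution.engine=spark;
set spark.executor.cores=4;     -- cores each executor may use
set spark.executor.memory=4g;   -- memory per executor
```

The same properties can also be placed in the Hive or Spark configuration files so that they apply to every session.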

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Koert Kuipers
| dummy.id | dummy.clustered | dummy.scattered | dummy.randomised |

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Edward Capriolo
c | dummy.padding |

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Xuefu Zhang
From: Xuefu Zhang [mailto:xzh...@cloudera.com] Sent: 03 February 2016 02:39 To: user@hive.apache.org Subject: Re: Hive on Spark Eng

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Mich Talebzadeh
Hi Jeff, I only have a two-node cluster. Is there any way to simulate additional parallel runs in such an environment, and thus have more than two maps? Thanks

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-03 Thread Mich Talebzadeh
From: Xuefu Zhang [mailto:xzh...@cloudera.com] Sent: 03 February 2016 02:39 To: user@hive.apache.org Subject: Re: Hive on Spark Engine versus Spark using Hive metastore Yes, regardless of what Spark mode you're running in, from the Spark AM web UI,

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Jörn Franke
GA | 5 | >>>>> xx | >>>>> >>>>> | 10| 99 | 999 | 188 | >>>>> abQyrlxKzPTJliMqDpsfDTJUQzdNdfofUQhrKqXvRKwulZAoJe | 100000 | >>>>>

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Ryan Harris
For some queries, the Catalyst optimizer may do a better job... is it going to do a better job with ORC-based data? Less likely, IMO. From: Koert Kuipers [mailto:ko...@tresata.com] Sent: Tuesday, February 02, 2016 9:50 PM To: user@hive.apache.org Subject: Re: Hive on Spark Engine versus Spark u

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Koert Kuipers
| 10| 99 | 999 | 188 | abQyrlxKzPTJliMqDpsfDTJUQzdNdfofUQhrKqXvRKwulZAoJe | 10 | xx |

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Koert Kuipers
s not recognize CHAR fields which is a pain. spark-sql> CREATE TEMPORARY TABLE tmp AS > SELECT t.calendar_month_desc, c.channel_desc, SUM(s.amount_sold) AS TotalSales

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Koert Kuipers
From: Koert Kuipers [mailto:ko...@tresata.com] Sent: 03 F

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Xuefu Zhang

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Edward Capriolo
> FROM sales s, times t, channels c > WHERE s.time_id = t.time_id > AND s.channel_id = c.channel_id > GROUP BY t.calendar_month_desc, c.channel_desc
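For readability, the query quoted in fragments across this thread listing can be laid out as below. This is a sketch that assumes the SELECT list and the FROM/WHERE/GROUP BY fragments belong to the same statement; the comma-style joins are rewritten as explicit inner joins, which is equivalent here, and nothing beyond the quoted column and table names is added.

```sql
-- Reassembled from the quoted fragments; assumes they form one statement.
CREATE TEMPORARY TABLE tmp AS
SELECT t.calendar_month_desc,
       c.channel_desc,
       SUM(s.amount_sold) AS TotalSales
FROM sales s
JOIN times t    ON s.time_id = t.time_id
JOIN channels c ON s.channel_id = c.channel_id
GROUP BY t.calendar_month_desc, c.channel_desc;
```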

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Mich Talebzadeh
From: Koert Kuipers [mailto:ko...@tresata.com] Sent: 03 February 2016 00:09 To: user@hive.apache.org Subject: Re: Hive on Spark Engine versus Spark using Hive metastore uuuhm with spark using Hive metastore you actually have a real programming environm

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Mich Talebzadeh
Sent: 03 February 2016 00:09 To: user@hive.apache.org Subject: Re: Hive on Spark Engine versus Spark using Hive metastore uuuhm with spark using Hive metastore you actually have a real programming environment and you can write real functions, versus just being boxed into some version of sq

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Koert Kuipers

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Xuefu Zhang

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Mich Talebzadeh
From: Xuefu Zhang [mailto:xzh...@cloudera.com] Sent: 02 February 2016 23:12 To: user@hive.apache.org Subject: Re: Hive on Spark Eng

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Philip Lee
ive on Spark offers all functional features that Hive offers and these features play out faster. However, Spark SQL is far from offering this parity as far as I know.

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Xuefu Zhang
Talebzadeh wrote: > Hi, My understanding is that with Hive on Spark engine, one gets the Hive optimizer and Spark query engine. With spark using Hive metastore, Spark does both the optimization and query engine. The only value add is that one can ac

Hive on Spark Engine versus Spark using Hive metastore

2016-02-02 Thread Mich Talebzadeh
Hi, My understanding is that with the Hive on Spark engine, one gets the Hive optimizer and the Spark query engine. With Spark using the Hive metastore, Spark does both the optimization and the query execution. The only value-add is that one can access the underlying Hive tables from spark-sql etc. Is
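For readers comparing the two setups described in this question, here is a minimal sketch of how each is typically selected. It assumes a Hive installation with Spark support available and a Spark installation that can see the same hive-site.xml; the table name `dummy` is borrowed from the query output quoted elsewhere in this thread and is used purely for illustration.

```sql
-- Hive on Spark: run inside a Hive session (hive CLI or beeline).
-- Hive parses and optimizes the query; Spark executes the resulting plan.
set hive.execution.engine=spark;
SELECT COUNT(*) FROM dummy;

-- Spark using the Hive metastore: run inside the spark-sql shell instead.
-- Spark reads the table definition from the metastore, then plans and
-- executes the query itself; no Hive engine setting is involved.
SELECT COUNT(*) FROM dummy;
```

The statement is identical in both cases; what differs is which system plans it and which one runs it.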