Hi Edward,

Could you please clarify what you mean in your last paragraph? You found
Pig Latin  a week framework in terms of MapReduce?

Thanks again for the response.
Mahsa


On Sat, Mar 17, 2012 at 12:04 PM, Edward Capriolo <edlinuxg...@gmail.com>wrote:

> in general hive does not offer features it can not do well. Cross joins on
> any data set where one table is not very small do not scale in map reduce.
> So there is not a big win for offering syntax for it.
>
> Not talking about pig but one very common unnamed map reduce framework
> offers Many features that do not paralize into map reduce. I find this
> framework a total 'tease'.
>
> On Saturday, March 17, 2012, buddhika chamith <chamibuddh...@gmail.com>
> wrote:
> > Hi,
> >
> > I think matt's solution is the way to go for now. If you need some basic
> understanding on how reduce and map side joins work see [1] whether if it
> helps you.
> >
> > Regards
> > Buddhika
> >
> > [1] http://chamibuddhika.wordpress.com/2012/02/26/joins-with-map-reduce/
> >
> > On Sat, Mar 17, 2012 at 6:41 AM, Alan Gates <ga...@hortonworks.com>
> wrote:
> >>
> >> There are algorithms for doing general theta-joins in parallel.  Search
> Google on "theta joins parallel database" and you will find some
> interesting references.  I am not aware of any tools that implement these
> yet.  You can also do it via a cross join followed by a filter, but again
> you need special algorithms to do a cross in MapReduce, which Hive doesn't
> implement yet.  See
> http://ofps.oreilly.com/titles/9781449302641/advanced_pig_latin.html(search 
> for the section on Cross) for a discussion of how to do cross in
> MapReduce.
> >>
> >> Alan.
> >>
> >> On Mar 13, 2012, at 10:13 AM, Tucker, Matt wrote:
> >>
> >> > For theta joins, you’ll have to convert the query to an equi-join,
> and then filter for non-equality in the WHERE clause.  Depending upon the
> size of each table, you might consider looking at map-side joins, which
> will allow for doing non-equality filters during a join before it’s passed
> to the reducers.
> >> >
> >> > Matt Tucker
> >> >
> >> > From: mahsa mofidpoor [mailto:mofidp...@gmail.com]
> >> > Sent: Tuesday, March 13, 2012 1:02 PM
> >> > To: user@hive.apache.org
> >> > Subject: Re: non-equality joins
> >> >
> >> >
> >> > Hi Keith,
> >> >
> >> > Do you know exactly how an algorithm should be in order to fit in the
> MapReduce framework? Could you refer me to some references?
> >> >
> >> > Thanks and Regards,
> >> > Mahsa
> >> >
> >> >
> >> >
> >> > On Tue, Mar 13, 2012 at 12:49 PM, Keith Wiley <kwi...@keithwiley.com>
> wrote:
> >> > https://cwiki.apache.org/Hive/languagemanual-joins.html
> >> >
> >> > "Hive does not support join conditions that are not equality
> conditions as it is very difficult to express such conditions as a
> map/reduce job."
> >> >
> >> > I admit, that isn't a very detailed answer, but it gives some
> indication of the reason for the discrepancy between Hive and other
> databases.  Hive fundamentally operates on Hadoop, namely on MapReduce (we
> all know this, I'm just reiterating the train of thought).  The problem is
> that certain algorithms are exceedingly difficult to wedge into the
> MapReduce framework.
> >> >
> >> > That is as detailed as my personal insight can get.  I've done a lot
> of MapReduce programming in Hadoop but I'm not a database expert and I
> don't really understand the steps involved in various kinds of table-joins,
> so I don't understand the particular ways in which certain database
> operations do or do not fit into MapReduce...but presumably nonequality
> joins (whatever those are :-D ) are particularly difficult to MapReduceify.
> >> >
> >> > Cheers!
> >> >
> >> > On Mar 13, 2012, at 09:17 , mahsa mofidpoor wrote:
> >> >
> >> > > Hello,
> >> > >
> >> > > Is there a reason behind not implementing non-equality joins in
> Hive? In other words, is there any usage for theta-join, if implemented?
> >> > >
> >> > > Thank you in advance for your response,
> >> > > Mahsa
> >> >
> >> >
> >> >
> ________________________________________________________________________________
> >> > Keith Wiley     kwi...@keithwiley.com     keithwiley.com
> music.keithwiley.com
> >> >
> >> > "It's a fine line between meticulous and obsessive-compulsive and a
> slippery
> >> > rope between obsessive-compulsive and debilitatingly slow."
> >> >                                           --  Keith Wiley
> >> >
> ________________________________________________________________________________
> >> >
> >> >
> >>
> >
> >
>

Reply via email to