Hi Edward,

Could you please clarify what you mean in your last paragraph? Did you find Pig Latin a weak framework in terms of MapReduce?
Thanks again for the response.
Mahsa

On Sat, Mar 17, 2012 at 12:04 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> In general, Hive does not offer features it cannot do well. Cross joins on
> any data set where one table is not very small do not scale in MapReduce.
> So there is not a big win in offering syntax for it.
>
> Not talking about Pig, but one very common unnamed MapReduce framework
> offers many features that do not parallelize into MapReduce. I find this
> framework a total 'tease'.
>
> On Saturday, March 17, 2012, buddhika chamith <chamibuddh...@gmail.com> wrote:
> > Hi,
> >
> > I think Matt's solution is the way to go for now. If you need some basic
> > understanding of how reduce-side and map-side joins work, see whether [1]
> > helps you.
> >
> > Regards
> > Buddhika
> >
> > [1] http://chamibuddhika.wordpress.com/2012/02/26/joins-with-map-reduce/
> >
> > On Sat, Mar 17, 2012 at 6:41 AM, Alan Gates <ga...@hortonworks.com> wrote:
> >>
> >> There are algorithms for doing general theta-joins in parallel. Search
> >> Google on "theta joins parallel database" and you will find some
> >> interesting references. I am not aware of any tools that implement these
> >> yet. You can also do it via a cross join followed by a filter, but again
> >> you need special algorithms to do a cross in MapReduce, which Hive doesn't
> >> implement yet. See
> >> http://ofps.oreilly.com/titles/9781449302641/advanced_pig_latin.html
> >> (search for the section on Cross) for a discussion of how to do cross in
> >> MapReduce.
> >>
> >> Alan.
> >>
> >> On Mar 13, 2012, at 10:13 AM, Tucker, Matt wrote:
> >>
> >> > For theta joins, you’ll have to convert the query to an equi-join,
> >> > and then filter for non-equality in the WHERE clause. Depending upon the
> >> > size of each table, you might consider looking at map-side joins, which
> >> > will allow for doing non-equality filters during a join before it’s passed
> >> > to the reducers.
> >> >
> >> > Matt Tucker
> >> >
> >> > From: mahsa mofidpoor [mailto:mofidp...@gmail.com]
> >> > Sent: Tuesday, March 13, 2012 1:02 PM
> >> > To: user@hive.apache.org
> >> > Subject: Re: non-equality joins
> >> >
> >> > Hi Keith,
> >> >
> >> > Do you know exactly how an algorithm should be in order to fit in the
> >> > MapReduce framework? Could you refer me to some references?
> >> >
> >> > Thanks and Regards,
> >> > Mahsa
> >> >
> >> > On Tue, Mar 13, 2012 at 12:49 PM, Keith Wiley <kwi...@keithwiley.com> wrote:
> >> > https://cwiki.apache.org/Hive/languagemanual-joins.html
> >> >
> >> > "Hive does not support join conditions that are not equality
> >> > conditions as it is very difficult to express such conditions as a
> >> > map/reduce job."
> >> >
> >> > I admit, that isn't a very detailed answer, but it gives some
> >> > indication of the reason for the discrepancy between Hive and other
> >> > databases. Hive fundamentally operates on Hadoop, namely on MapReduce (we
> >> > all know this, I'm just reiterating the train of thought). The problem is
> >> > that certain algorithms are exceedingly difficult to wedge into the
> >> > MapReduce framework.
> >> >
> >> > That is as detailed as my personal insight can get. I've done a lot
> >> > of MapReduce programming in Hadoop but I'm not a database expert and I
> >> > don't really understand the steps involved in various kinds of table-joins,
> >> > so I don't understand the particular ways in which certain database
> >> > operations do or do not fit into MapReduce...but presumably nonequality
> >> > joins (whatever those are :-D ) are particularly difficult to MapReduceify.
> >> >
> >> > Cheers!
> >> >
> >> > On Mar 13, 2012, at 09:17, mahsa mofidpoor wrote:
> >> >
> >> > > Hello,
> >> > >
> >> > > Is there a reason behind not implementing non-equality joins in
> >> > > Hive? In other words, is there any usage for theta-join, if implemented?
> >> > >
> >> > > Thank you in advance for your response,
> >> > > Mahsa
> >> >
> >> > ________________________________________________________________________________
> >> > Keith Wiley kwi...@keithwiley.com keithwiley.com music.keithwiley.com
> >> >
> >> > "It's a fine line between meticulous and obsessive-compulsive and a slippery
> >> > rope between obsessive-compulsive and debilitatingly slow."
> >> > -- Keith Wiley
> >> > ________________________________________________________________________________
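
For anyone reading this thread later: the two workarounds mentioned above (Matt's equi-join followed by a non-equality filter, and Alan's cross join followed by a filter) can be sketched in plain Python. This is only an illustration of the idea, not how Hive or Pig implement joins. The tables `orders` and `visits`, their `cust` and `day` columns, and both function names are made up for the example.

```python
from collections import defaultdict
from itertools import product

# Hypothetical sample tables (not from the thread).
orders = [{"cust": 1, "day": 3}, {"cust": 1, "day": 9}, {"cust": 2, "day": 5}]
visits = [{"cust": 1, "day": 5}, {"cust": 2, "day": 4}]


def equi_join_then_filter(left, right, key, predicate):
    """Group both inputs by the join key (roughly what a reduce-side join
    does per reducer), pair rows that share a key, then apply the
    non-equality predicate as a filter."""
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[row[key]][0].append(row)
    for row in right:
        groups[row[key]][1].append(row)
    result = []
    for k in sorted(groups):
        left_rows, right_rows = groups[k]
        for l, r in product(left_rows, right_rows):
            if predicate(l, r):
                result.append((l, r))
    return result


def cross_join_then_filter(left, right, predicate):
    """General theta-join: build every pair, then filter. The number of
    pairs is len(left) * len(right), which is why this does not scale
    unless one side is small."""
    return [(l, r) for l, r in product(left, right) if predicate(l, r)]


def earlier(l, r):
    # The non-equality condition: the order happened before the visit.
    return l["day"] < r["day"]


equi_pairs = equi_join_then_filter(orders, visits, "cust", earlier)
cross_pairs = cross_join_then_filter(orders, visits, earlier)
```

The first function only works when the query also has an equality condition to group on (here `cust`). A pure theta-join has no such key, so it falls back to the second form, which is where the scaling problem Edward describes comes from.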