Chaining, described in chapter 8 of my book, provides this to a limited degree.
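To illustrate the pattern being asked about, here is a sketch in plain Python (not Hadoop API code; the table names and query are made up) of two map-reduce stages chained so that stage 1's output feeds stage 2 directly in memory, instead of being written to HDFS and read back:

```python
# Illustrative sketch: chaining two map-reduce stages in memory.
# map_reduce() runs one stage: map, shuffle (group by key), reduce.
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Run one map-reduce stage over an in-memory list of records."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):   # map phase
            groups[key].append(value)       # shuffle: group values by key
    results = []
    for key, values in groups.items():      # reduce phase
        results.extend(reducer(key, values))
    return results

# Stage 1: a join-like grouping (hypothetical "orders"/"users" tables).
def map1(record):
    table, key, value = record
    yield (key, (table, value))

def reduce1(key, values):
    orders = [v for t, v in values if t == "orders"]
    users = [v for t, v in values if t == "users"]
    for o in orders:
        for u in users:
            yield (key, (o, u))             # emit joined pairs

# Stage 2: count joined rows per user over stage 1's output.
def map2(record):
    key, (order, user) = record
    yield (user, 1)

def reduce2(key, values):
    yield (key, sum(values))

records = [
    ("orders", 1, "o-100"), ("orders", 1, "o-101"),
    ("users",  1, "alice"), ("orders", 2, "o-200"),
    ("users",  2, "bob"),
]

stage1 = map_reduce(records, map1, reduce1)   # "job 1"
stage2 = map_reduce(stage1, map2, reduce2)    # "job 2", fed in memory
print(sorted(stage2))                         # [('alice', 2), ('bob', 1)]
```

On a real cluster the intermediate data normally does go through the distributed filesystem (the driver points job 2's input path at job 1's output path); tools like Cascading hide that plumbing behind a single flow description.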
Cascading, http://www.cascading.org/, also supports complex flows. I do not know how Cascading works under the covers.

On Thu, Apr 16, 2009 at 8:23 AM, Shevek <[email protected]> wrote:
> On Tue, 2009-04-14 at 07:59 -0500, Pankil Doshi wrote:
> > Hey,
> >
> > I am trying complex queries on Hadoop, and I require more than one
> > job to get the final result. The results of job one capture a few
> > joins of the query, and I want to pass those results as input to the
> > second job and process them again to get the final results. The
> > queries are such that I can't do all of the joins and filtering in
> > job 1, so I require two jobs.
> >
> > Right now I write the results of job 1 to HDFS and read them back for
> > job 2, but that takes unnecessary I/O time. So I was looking for a
> > way to store the results of job 1 in memory and use them as input for
> > job 2.
>
> Hi,
>
> I am a programming language and compiler designer. We have a workflow
> engine which is capable of taking a description of a complex workflow
> and analysing it as a multi-stage map-reduce system to generate an
> optimal resource allocation. I'm hunting around for people who have
> problems like this, since I'm considering whether to port the whole
> thing to Hadoop as a high-level language.
>
> Do you, or any other users, have descriptions of workflows more complex
> than "one map, maybe one reduce" which you would like to be able to
> express easily?
>
> S.

--
Alpha chapters of my book on Hadoop are available at
http://www.apress.com/book/view/9781430219422
