On Tue, Dec 27, 2011 at 2:31 PM, Kevin Burton <burtona...@gmail.com> wrote:
> I'm pleased to announce Peregrine 0.5.0 - a new map reduce framework
> optimized for iterative and pipelined map reduce jobs.
>
> http://peregrine_mapreduce.bitbucket.org/
>
> This originally started off with some internal work at Spinn3r to build
> a fast and efficient PageRank implementation. We realized that what we
> wanted was a MR runtime optimized for this type of work, which differs
> radically from the traditional Hadoop design.
>
> Peregrine implements a partitioned distributed filesystem where
> key/value pairs are routed to defined partitions. This enables work to
> be joined against previous iterations or different units of work by the
> same key on the same local system.
>
> Peregrine is optimized for ETL jobs where the primary data storage
> system is an external database such as Cassandra, HBase, MySQL, etc.
> Jobs are then run as Extract, Transform and Load stages, with
> intermediate data being stored in the Peregrine FS.
>
> We enable features such as Map/Reduce/Merge as well as some additional
> functionality like ExtractMap and ReduceLoad (in ETL parlance).
>
> A key innovation here is a partitioning layout algorithm that can
> support fast many-to-many recovery similar to HDFS but still support
> partitioned operation with deterministic key placement.

Thanks for your contribution. Is there more detailed information on this point?

> We've also tried to optimize for single-instance performance and use
> modern IO primitives as much as possible. This includes NOT shying away
> from operating-system-specific features such as mlock, fadvise,
> fallocate, etc.
>
> There is still a bit more work I want to do before I am ready to
> benchmark it against Hadoop. Instead of implementing a synthetic
> benchmark, we wanted to get a production-ready version first, which
> would allow people to port existing applications and see what the
> before / after performance numbers looked like in the real world.
>
> For more information please see:
>
> http://peregrine_mapreduce.bitbucket.org/
>
> As well as our design documentation:
>
> http://peregrine_mapreduce.bitbucket.org/design/
>
> --
>
> Founder/CEO Spinn3r.com <http://spinn3r.com/>
> Location: San Francisco, CA
> Skype: burtonator
> Skype-in: (415) 871-0687
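
To make the question about deterministic key placement concrete, here is a minimal sketch of the general idea as described in the announcement: if every dataset routes a key to its partition with the same stable function, a join by key never has to leave the partition. This is not Peregrine's layout algorithm, and it does not show the recovery side at all. The names (partition_for, route, local_join), the CRC32 hash, and the fixed partition count are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumption: a fixed partition count, chosen for the sketch


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition id with a stable hash.

    CRC32 is used instead of Python's built-in hash() because hash() is
    salted per process for strings, so it would not be deterministic
    across runs or machines.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(pairs, num_partitions: int = NUM_PARTITIONS):
    """Group (key, value) pairs into per-partition dictionaries."""
    partitions = defaultdict(dict)
    for key, value in pairs:
        partitions[partition_for(key, num_partitions)][key] = value
    return partitions


def local_join(left, right):
    """Join two routed datasets one partition at a time.

    Because both sides were routed with the same function, a key present
    in both datasets is always found in the same partition id.
    """
    joined = {}
    for pid, left_part in left.items():
        right_part = right.get(pid, {})
        for key, left_value in left_part.items():
            if key in right_part:
                joined[key] = (left_value, right_part[key])
    return joined


# Hypothetical data: ranks from a previous iteration and the link graph.
ranks = route([("a.com", 0.25), ("b.com", 0.5), ("c.com", 0.25)])
links = route([
    ("a.com", ["b.com"]),
    ("b.com", ["a.com", "c.com"]),
    ("c.com", ["b.com"]),
])
joined = local_join(ranks, links)
```

In a real system each partition id would map to one or more hosts, and the interesting part (the part asked about above) is how that mapping allows many-to-many recovery when a host is lost; the sketch leaves that open.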