Hi All,
We would like to announce the first open source release of the Twister framework for iterative MapReduce computations.

The MapReduce programming model has simplified the implementation of many data parallel applications. The simplicity of the programming model and the quality of services provided by many implementations of MapReduce have attracted a lot of enthusiasm among parallel computing communities. From our years of experience in applying the MapReduce programming model to various scientific applications, we identified a set of extensions to the programming model and improvements to its architecture that will expand the applicability of MapReduce to more classes of applications. Twister is a lightweight MapReduce runtime we have developed by incorporating these enhancements.

We have published several scientific papers [1-5] explaining the key concepts and comparing Twister with other MapReduce implementations such as Hadoop and DryadLINQ. Today we would like to announce its first release.

Key features of Twister are:

- Distinction between static and variable data
- Configurable long running (cacheable) map/reduce tasks
- Pub/sub messaging based communication/data transfers
- Combine phase to collect all reduce outputs
- Efficient support for iterative MapReduce computations
- Data access via local disks
- Lightweight (5600 lines of code)
- Tools to manage data

We would like to share with you the design decisions and ideas we have incorporated into Twister, and we would be very grateful if you could share your thoughts about them with us.

For more details please visit www.iterativemapreduce.org and let us know your thoughts and experience using Twister.

SALSA HPC Team
<http://salsaweb.indiana.edu/salsa/>

Thank you,
Jaliya Ekanayake
Phone: Work +1 812-855-2990, Cell +1 812-606-0561
Web: www.cs.indiana.edu/~jekanaya

[1] Jaliya Ekanayake (Advisor: Geoffrey Fox), Architecture and Performance of Runtime Environments for Data Intensive Scalable Computing, Doctoral Showcase, SuperComputing2009.
    <http://grids.ucs.indiana.edu/ptliupages/publications/SC09-abstract-jaliya-ekanayake.pdf>

[2] Jaliya Ekanayake, Atilla Soner Balkir, Thilina Gunarathne, Geoffrey Fox, Christophe Poulain, Nelson Araujo, Roger Barga, DryadLINQ for Scientific Analyses, Fifth IEEE International Conference on e-Science (eScience2009), Oxford, UK.
    <http://grids.ucs.indiana.edu/ptliupages/publications/eScience09-camera-ready-submission.pdf>

[3] Jaliya Ekanayake, Xiaohong Qiu, Thilina Gunarathne, Scott Beason, Geoffrey Fox, High Performance Parallel Computing with Clouds and Cloud Technologies, Technical Report, August 25, 2009; to appear as a book chapter.
    <http://grids.ucs.indiana.edu/ptliupages/publications/cloud_handbook_final-with-diagrams.pdf>

[4] Geoffrey Fox, Seung-Hee Bae, Jaliya Ekanayake, Xiaohong Qiu, and Huapeng Yuan, Parallel Data Mining from Multicore to Cloudy Grids, High Performance Computing and Grids workshop, 2008. An extended version of this paper goes to a book chapter.
    <http://grids.ucs.indiana.edu/ptliupages/publications/CetraroWriteupJune11-09.pdf>

[5] Jaliya Ekanayake, Shrideep Pallickara, Geoffrey Fox, MapReduce for Data Intensive Scientific Analyses, Fourth IEEE International Conference on eScience, 2008, pp. 277-284.
    <http://grids.ucs.indiana.edu/ptliupages/publications/ekanayake-MapReduce.pdf>
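
P.S. For readers new to iterative MapReduce, the sketch below may help make a few of the key features concrete: the distinction between static and variable data, long running (cacheable) map tasks, and a combine phase that collects all reduce outputs. It is plain Python and does not use the Twister API. Every name in it (CachedMapTask, reduce_fn, combine, run_kmeans) is hypothetical and chosen only for illustration, and the example computation (1-D k-means clustering) is an assumption, picked because it is a common iterative workload.

```python
# Illustrative sketch of the iterative MapReduce pattern.
# NOT the Twister API: all names here are hypothetical.
from collections import defaultdict


class CachedMapTask:
    """A long running map task that keeps its static data across iterations."""

    def __init__(self, points):
        # Static data: loaded once and reused in every iteration.
        self.points = points

    def map(self, centroids):
        # Variable data: the current centroids, re-sent each iteration.
        pairs = []
        for p in self.points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            pairs.append((nearest, (p, 1)))
        return pairs


def reduce_fn(key, values):
    # Average all points assigned to one centroid.
    total = sum(v[0] for v in values)
    count = sum(v[1] for v in values)
    return key, total / count


def combine(reduce_outputs, old_centroids):
    # Collect every reduce output into one value for the driver.
    # Centroids that received no points keep their old position.
    new_centroids = list(old_centroids)
    for index, centroid in reduce_outputs:
        new_centroids[index] = centroid
    return new_centroids


def run_kmeans(partitions, centroids, max_iters=20, tol=1e-9):
    # Map tasks are created once, outside the iteration loop.
    tasks = [CachedMapTask(p) for p in partitions]
    for _ in range(max_iters):
        grouped = defaultdict(list)
        for task in tasks:
            for key, value in task.map(centroids):
                grouped[key].append(value)
        reduced = [reduce_fn(k, vs) for k, vs in grouped.items()]
        new_centroids = combine(reduced, centroids)
        shift = max(abs(a - b) for a, b in zip(new_centroids, centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids


if __name__ == "__main__":
    data = [[1.0, 2.0, 9.0], [3.0, 10.0, 11.0]]
    print(run_kmeans(data, [0.0, 5.0]))
```

In this sketch only the small centroid list moves between iterations, while the data partitions stay inside the map tasks. A real runtime would run the tasks on separate nodes and move the intermediate pairs over a messaging layer rather than a local dictionary.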