Hi, I'd like to announce the release of a new open source project, Sailfish.
http://code.google.com/p/sailfish/

Sailfish tries to improve Hadoop performance, particularly for large jobs that process terabytes of data and run for hours. In building Sailfish, we modify how map output is handled and transported from the map tasks to the reduce tasks. The project pages provide more information.

We are looking for collaborators who can help get some of these ideas into Apache Hadoop. A possible step forward would be to make the "shuffle" phase of Hadoop pluggable. If you are interested in working with us, please get in touch with me.

Sriram