Sriram et al.,

Do you intend this to be a joint project with the Hadoop community or a technology competitor?

Regrettably, KFS is not a "drop in replacement" for HDFS. Hypothetically: I have several petabytes of data in an existing HDFS deployment, which is the norm, and a continuous MapReduce workflow. How do you propose I, practically, migrate to something like Sailfish without a major capital expenditure and/or downtime and/or data loss?

However, can the Sailfish I-files implementation be plugged in as an alternate Shuffle implementation in MRv2 (see MAPREDUCE-3060 and MAPREDUCE-4049), with the necessary additional plumbing for dynamic adjustment of the reduce task population? And could the workbuilder be part of an alternate MapReduce Application Manager? The I-file concept could possibly be implemented here in a fairly self-contained way. One could even colocate/embed a KFS filesystem with such an alternate shuffle, much as MR task temporary space is usually colocated with HDFS storage.

Does this seem reasonable in any way?

Best regards,

   - Andy

>> From: Sriram Rao <srirams...@gmail.com>
>> To: common-dev@hadoop.apache.org
>> Sent: Tuesday, May 8, 2012 10:32 AM
>> Subject: Project announcement: Sailfish (also, looking for colloborators)
>>
>> Hi,
>>
>> I'd like to announce the release of a new open source project, Sailfish.
>>
>> http://code.google.com/p/sailfish/
>>
>> Sailfish tries to improve Hadoop-performance, particularly for large-jobs
>> which process TB's of data and run for hours. In building Sailfish, we
>> modify how map-output is handled and transported from map->reduce.
>>
>> The project pages provide more information about the project.
>>
>> We are looking for collaborators who can help get some of the ideas into
>> Apache Hadoop. A possible step forward could be to make "shuffle" phase of
>> Hadoop pluggable.
>>
>> If you are interested in working with us, please get in touch with me.
>>
>> Sriram

--
Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)