On Feb 20, 2013, at 5:12 PM, Aaron T. Myers wrote:

> On Wed, Feb 20, 2013 at 4:29 PM, Chris Douglas <cdoug...@apache.org> wrote:
>
>> Given that HDFS-347 is a strictly better approach, once committed,
>> there will be ample motivation to add support for other OSes and
>> remove HDFS-2246 entirely. Nobody is confused about this. There's
>> ample precedent for retaining obscure, clumsy features as a temporary
>> stop-gap (e.g., service plugins, opaque blobs of bytes in Tasks,
>> configurable combiner semantics). What's the virtue of insisting on
>> removing this? Unless there was a lot of follow-on work, HDFS-2246
>> doesn't look like a lot of code...
>
Chris's comment on keeping the removal of HDFS-2246 independent of HDFS-347 makes sense in this specific case and also in the general case where new optimizations are added.

One objection is the complexity of the fallback code:

> Though it's not a ton of code, I think that having to support a more
> complex fallback path (i.e. try the HDFS-347 method, then fall back to
> trying the HDFS-2246 method, then fall back to doing normal TCP reads to
> the local DN) will make the code quite a bit hairier for little added
> benefit.

Aaron, what if you didn't fall back in the fashion you suggest, but instead the code uses either HDFS-2246 or HDFS-347, whichever option is set, and HDFS-347 when both are set? Then the fallback code will be of the same complexity as with HDFS-347 alone. Clearly one could add a platform-specific optimization down the road; one could think of HDFS-2246 as being a Windows optimization.

Another objection is who will maintain the HDFS-2246 code: Suresh has volunteered to maintain the code.

What are the remaining objections to leaving both styles of optimization in the code base?

sanjay