> Good point Steve. This touches on the larger issue of whether it makes sense to host FS clients for other file systems in Hadoop itself. I agree with what I think you're getting at, which is: if we can handle the testing and integration via external dependencies, it would probably be better to have the Hadoop client code live and ship as part of the other projects, since it's more likely to be maintained there. Perhaps start a DISCUSS thread on common-dev, since this pertains to other file systems aside from QFS?

Seems reasonable -I'll let you start it.
We had this problem with Ant; I'm sure the Maven team hit it too: at first, having lots of libraries that bind to external apps makes sense, because nobody else will do them for you. As your application becomes more successful, those obscure tasks become a liability: nobody ever regression-tests them, most people don't even have a setup to run them by hand -and you fear support issues related to them, as they will be non-reproducible, let alone fixable.

Looking at the Ant task list, <netrexxc> and <wljspc> spring to mind -the latter has only ever been tested on WinNT4 and Solaris 5.x, meaning nobody has actually run it since 2001, when Windows XP hit the market. http://ant.apache.org/manual/Tasks/wljspc.html

The fact that there are no open JIRAs related to KFS is probably a metric of its use -again, an argument for pushing the work out to the KFS team -though they will need to work with Bigtop to ensure that an RPM can install the KFS support into /usr/lib/hadoop/lib