Hi Steve,

My other comments inline:
ZooKeeper would be nice too, as you could bring up a very small cluster.
+1, I will tackle that too ;)
-There are a lot of calls to System.exit() in Hadoop when it isn't happy; you need a security manager to catch them and turn them into exceptions. And no, the code doesn't expect exceptions everywhere.
I will check if we can trap this. Maybe a modification in the core code could do that.
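To make the security-manager idea concrete, here is a minimal sketch of trapping System.exit() and turning it into an exception. The names ExitTrap, ExitTrappedException and runTrapped are made up for illustration; nothing here is taken from the Hadoop code base. It assumes a JVM that still allows installing a security manager: the API is deprecated in recent JDKs, from JDK 18 it needs -Djava.security.manager=allow, and from JDK 24 System.setSecurityManager() always throws UnsupportedOperationException.

```java
import java.security.Permission;

/** Hypothetical exception that carries the exit status a service asked for. */
class ExitTrappedException extends SecurityException {
    final int status;

    ExitTrappedException(int status) {
        super("System.exit(" + status + ") intercepted");
        this.status = status;
    }
}

/** Hypothetical security manager: vetoes JVM exit, permits everything else. */
class ExitTrap extends SecurityManager {
    @Override
    public void checkExit(int status) {
        // Runtime.exit() consults this hook before halting the JVM.
        throw new ExitTrappedException(status);
    }

    @Override
    public void checkPermission(Permission perm) {
        // Allow all other operations, including restoring the old manager.
    }

    @Override
    public void checkPermission(Permission perm, Object context) {
        // Same as above for the context-aware variant.
    }
}

class ExitTrapDemo {
    /** Runs the service and returns the trapped exit status, or 0 if it returned normally. */
    static int runTrapped(Runnable service) {
        SecurityManager previous = System.getSecurityManager();
        System.setSecurityManager(new ExitTrap());
        try {
            service.run();
            return 0;
        } catch (ExitTrappedException e) {
            return e.status;
        } finally {
            System.setSecurityManager(previous);
        }
    }
}
```

With this in place, ExitTrapDemo.runTrapped(() -> System.exit(2)) would hand the status back to the caller instead of killing the container. The exception still unwinds through code that may not expect it, so the cleanup concern above remains.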
-There are a lot of assumptions that every service (namenode, datanode, etc.) is running in its own VM, with its own singletons. They will all need their own classloaders, which implies separate OSGi bundles for each public service.
We can imagine a kind of "fork" in the OSGi container. On the other hand, singletons are per classloader, so we can handle that.
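As a rough illustration of "singletons are per classloader", the sketch below loads the same class through two loaders and shows that each copy has its own static state. Counter is a stand-in for a service with a static singleton, and IsolatingLoader is a hypothetical helper written for this example; an OSGi framework would do the equivalent with one bundle classloader per bundle. It assumes the compiled class file is reachable as a resource from the parent loader.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical loader that defines one named class itself instead of delegating. */
class IsolatingLoader extends ClassLoader {
    private final String isolated;

    IsolatingLoader(String isolated, ClassLoader parent) {
        super(parent);
        this.isolated = isolated;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(isolated)) {
            return super.loadClass(name, resolve); // normal parent-first delegation
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                String path = name.replace('.', '/') + ".class";
                try (InputStream in = getParent().getResourceAsStream(path)) {
                    if (in == null) throw new ClassNotFoundException(name);
                    byte[] bytes = in.readAllBytes();
                    // A fresh definition means fresh static fields for this loader.
                    c = defineClass(name, bytes, 0, bytes.length);
                } catch (IOException e) {
                    throw new ClassNotFoundException(name, e);
                }
            }
            if (resolve) resolveClass(c);
            return c;
        }
    }
}

class IsolatedSingletons {
    /** Stand-in for a service that keeps its state in a static field. */
    public static class Counter {
        private static int value = 0;

        public static int increment() {
            return ++value;
        }
    }

    /** Calls Counter.increment() on the copy of Counter visible through the given loader. */
    static int incrementIn(ClassLoader loader) {
        try {
            Class<?> c = loader.loadClass(Counter.class.getName());
            return (Integer) c.getMethod("increment").invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader app = IsolatedSingletons.class.getClassLoader();
        String name = Counter.class.getName();
        ClassLoader a = new IsolatingLoader(name, app);
        ClassLoader b = new IsolatingLoader(name, app);
        System.out.println("a: " + incrementIn(a)); // a: 1
        System.out.println("a: " + incrementIn(a)); // a: 2
        System.out.println("b: " + incrementIn(b)); // b: 1, b has its own copy
    }
}
```

This only isolates Java statics. Anything process-wide, such as system properties, the working directory or loaded native libraries, is still shared between the services.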
YARN is even more interesting, as it works by deploying the application master (such as the MR engine) on request, picking a suitable node and executing the entry point with a classpath (somehow) set up. If you are going to work with trunk you will need to address this. The simplest tactic is "don't try to run YARN-based services under OSGi, just the YARN Resource Manager and Node Managers themselves". A more advanced option, "support OSGi-based YARN services specially", would also be good if it could start both Application Masters and their container applications (Task Trackers &c), and aid the execution of things like actual tasks within the OSGi container (for speed).

If you are looking at production use of this stuff, you'll need to worry about loading of the native libraries too. Otherwise this becomes more restricted to experimental small-machine setups.
Thanks for these comments! I will take care of that in the following patches ;)
Thanks again,
Regards
JB
--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com