Ninad Raut wrote:
OSGi provides navigability to your components and creates a life cycle for
each of those components, viz. install, start, stop, undeploy, etc.
This is the reason why we are thinking of creating components using OSGi.
The problem we are facing is that our components use MapReduce and HDFS, and
the OSGi container cannot detect the Hadoop MapReduce engine or HDFS.

I have searched the net, and it looks like people are working on, or have
succeeded in, running Hadoop in an OSGi container...

Ninad


1. I am working on a simple lifecycle for the services (start/stop/ping) which is not OSGi — OSGi worries a lot about classloading and versioning; check out HADOOP-3628 for this.

2. You can run it under OSGi systems, such as the OSGi branch of SmartFrog (http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/branches/core-branch-osgi/), or under non-OSGi tools. Either way, those tools are left dealing with classloading and the like.

3. Any container is going to have to deal with the fact that bits of all the services call System.exit(); the usual fix is running under a security manager, trapping the call, and raising an exception instead.
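A minimal sketch of that trick, in plain Java (my own illustration, not Hadoop or SmartFrog code): override checkExit() so a hosted service's System.exit() raises an exception the container can catch. Note that SecurityManager is deprecated in recent JDKs.

```java
// Sketch: trap System.exit() from hosted code by supplying a
// SecurityManager whose checkExit() throws instead of letting the JVM die.
public class ExitTrap {

    /** Thrown in place of JVM shutdown when hosted code calls System.exit(). */
    static class ExitTrappedException extends SecurityException {
        final int status;
        ExitTrappedException(int status) {
            super("System.exit(" + status + ") trapped");
            this.status = status;
        }
    }

    /** A manager that vetoes exit but permits everything else. */
    static SecurityManager exitTrappingManager() {
        return new SecurityManager() {
            @Override
            public void checkExit(int status) {
                throw new ExitTrappedException(status);
            }
            @Override
            public void checkPermission(java.security.Permission perm) {
                // Permit everything else; a real container would be stricter.
            }
        };
    }

    public static void main(String[] args) {
        SecurityManager sm = exitTrappingManager();
        // A container would install this via System.setSecurityManager(sm)
        // (deprecated and restricted on newer JDKs) before starting services.
        try {
            sm.checkExit(42); // what the JVM would invoke on System.exit(42)
        } catch (ExitTrappedException e) {
            System.out.println("trapped: " + e.getMessage());
        }
    }
}
```

As point 4 below notes, this only works while your manager doesn't have to coexist with Hadoop's own security-policy expectations.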

4. Any container then has to deal with the fact that, from 0.20 onwards, Hadoop does things with security policy that are incompatible with normal Java security managers: whatever security manager you use for trapping system exits can't extend the default one.

5. Any container also has to deal with the fact that every service (namenode, job tracker, etc.) makes a lot of assumptions about singletons — that it has exclusive use of filesystem objects retrieved through FileSystem.get(), and the like. While OSGi can handle that with its classloading work, it's still fairly complex.
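To see why the singleton cache is a problem, here is a toy model (my own sketch, not Hadoop's actual FileSystem code) of the semantics: callers asking for the same URI share one cached instance, so two services co-hosted in one JVM can break each other through it.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a per-JVM filesystem cache with FileSystem.get()-like
// semantics: same URI => same shared instance across all callers.
public class SharedFsDemo {

    /** Stand-in for a client-side filesystem handle. */
    static class Fs {
        private boolean closed = false;
        void close() { closed = true; }
        boolean isClosed() { return closed; }
    }

    private static final Map<String, Fs> CACHE = new HashMap<>();

    /** Returns the cached instance for this URI, creating it on first use. */
    static synchronized Fs get(String uri) {
        return CACHE.computeIfAbsent(uri, u -> new Fs());
    }

    public static void main(String[] args) {
        Fs a = get("hdfs://cluster/"); // "namenode" service in this JVM
        Fs b = get("hdfs://cluster/"); // "job tracker" service, same JVM
        System.out.println(a == b);    // true: one shared instance
        a.close();                     // service A shuts down "its" handle...
        System.out.println(b.isClosed()); // true: ...and breaks service B
    }
}
```

Per-bundle classloaders can give each service its own copy of the cache, which is exactly the OSGi classloading work referred to above.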

6. There are also lots of JVM memory/thread management issues; see the various Hadoop bugs.

If you look at the slides on what I've been up to, you can see that it can be done:
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/doc/dynamic_hadoop_clusters.ppt

However,
* You really need to run every service in its own process, for memory and reliability alone
* It's pretty leading edge
* You will have to invest the time and effort to get it working

If you want to do the work, start with what I've been doing and bring it up under the OSGi container of your choice. You can come and play with our tooling; I'm cutting a release today of this week's Hadoop trunk merged with my branch. It is, of course, experimental, as even the trunk is a bit up-and-down on feature stability.

-steve
