Hi,
I am exploring ways to run HDFS in an environment that supports 
virtualization. For this particular application, I would at least like the map 
tasks to run in virtual containers (perhaps even lightweight containers such 
as LXC), but without duplicating the storage per container. That is, one or 
more map tasks would share the same container, as opposed to running a 
virtualized datanode.
One option seems to be to make hadoop.tmp.dir a virtual disk, use guestfs or 
something similar for management, and have the respective map tasks share this disk.
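To make the idea concrete, here is a rough sketch, assuming the virtual disk 
image is mounted on the host with guestmount (from libguestfs); the image 
path, partition, and mount point below are placeholders, not a tested setup:

```xml
<!-- core-site.xml: point hadoop.tmp.dir at a mount of the shared virtual disk.
     /mnt/hadoop-shared is a hypothetical mount point, e.g. created with:
       guestmount -a /var/lib/images/hadoop-tmp.img -m /dev/sda1 /mnt/hadoop-shared -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/mnt/hadoop-shared/hadoop-${user.name}</value>
</property>
```

Map tasks launched in separate LXC containers could then bind-mount 
/mnt/hadoop-shared into each container, so they all see the same scratch 
space without duplicating it per container.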
I'm eliding many details, but I just wanted some initial feedback, or pointers 
to similar efforts. I'm aware that Mesos (and perhaps Hadoop NG?) has support for LXC.
Charles
