In general, Hadoop is unsuitable for the application you're suggesting. Systems like Fuse HDFS do exist, though they're not widely used. I don't know of anyone trying to connect Hadoop with Apache httpd.
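That said, if you do end up keeping the images in HDFS, the more common route than a FUSE mount is to read them out through the Java FileSystem API, e.g. streaming a file from a servlet that sits behind your web server. Here's a minimal sketch of that idea; the namenode URI (hdfs://namenode:9000) and the HdfsImageReader class name are placeholders for illustration, not something from your setup:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.io.OutputStream;

public class HdfsImageReader {
    // Streams one file out of HDFS to an arbitrary OutputStream
    // (e.g. a servlet response), avoiding the FUSE layer entirely.
    public static void copyToStream(String pathInHdfs, OutputStream out)
            throws IOException {
        Configuration conf = new Configuration();
        // Placeholder namenode address; substitute your own fs.default.name.
        conf.set("fs.default.name", "hdfs://namenode:9000");
        FileSystem fs = FileSystem.get(conf);
        FSDataInputStream in = fs.open(new Path(pathInHdfs));
        try {
            byte[] buf = new byte[64 * 1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } finally {
            in.close();
        }
    }
}

Whether that beats glusterfs's Apache/Lighttpd modules for your case is a separate question; the point is only that you don't need FUSE to pull bytes out of HDFS.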
When you say that you have huge images, how big is "huge"? It might make sense if these images are 1 GB or larger. But in general, "huge" on Hadoop means tens of GBs up to TBs. If you have a large number of moderately-sized files, you'll find that HDFS performs very poorly for your needs. It sounds like glusterfs is designed more with your use case in mind.

- Aaron

On Thu, Mar 26, 2009 at 4:06 PM, phil cryer <[email protected]> wrote:
> This is somewhat of a noob question I know, but after learning about
> Hadoop, testing it in a small cluster and running Map Reduce jobs on
> it, I'm still not sure if Hadoop is the right distributed file system
> to serve web requests. In other words, can, or is it right to, serve
> images and data from HDFS using something like FUSE to mount a
> filesystem where Apache could serve images from it? We have huge
> images, thus the need for a distributed file system, and they go in,
> get stored with lots of metadata, and are redundant with Hadoop/HDFS -
> but is it the right way to serve web content?
>
> I looked at glusterfs before, they had an Apache and Lighttpd module
> which made it simple, does HDFS have something like this, do people
> just use a FUSE option as I described, or is this not a good use of
> Hadoop?
>
> Thanks
>
> P
