Hi, we use HDFS in a way similar to yours.

We use it as a distributed file system to store static web resources.
The request path is Nginx -> Squid -> Jetty -> HDFS, and Squid caches the most frequently accessed files. I cannot yet confirm that this kind of usage is feasible, because we do not have many users so far. Could you give me some suggestions?

Also, the NameNode is a single node. Will the Hadoop development team improve this?

For example, will we be able to add more than one NameNode to a cluster?
Thanks.



Amit Handa wrote:
Hi,

We are evaluating the use of standalone HDFS for one of our projects.
The file system would be used to store audio, video, image and text
files for various types of batch processing applications hosted across
multiple machines and multiple platforms.

I would like some feedback on the best HDFS-based options
(fuse-dfs, HBase or others) that are available, given the requirements
below:

1.      The data types to be stored are video, audio, images,
XML and text files.
2.      These files need to be created, accessed and deleted from Linux
and Windows machines.
3.      The data to be stored is transient: we store all of it
for a configurable amount of time (say 2 days) for
processing across multiple machines, and then delete it once
processing is complete.
4.      The data needs to be available as close as possible to the
processing machines (Linux or Windows) to reduce network I/O.
5.      The number of files that need to be stored per day is on the order
of millions. The number of folders that need to be created for storing
images for a single video will also be on the order of millions.
6.      The number of files that need to be deleted per day will be on the
order of millions, as we would be cleaning up the files for which
processing has been completed.
7.      The file size for audio/video files can range from a few KB to a few GB.
8.      The file permissions needed would at most restrict
some hosts to read-only rather than read-write access. This is good
to have, not a must-have requirement.
9.      The setup can have 200-600 machines (a mix of Windows (30%) and
Linux (70%)), each having 250-500 GB hard disk drives.
10.     The file system should be mountable from Linux and Windows
machines (by mapping a network drive).
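
As a rough sizing sketch of requirements 3, 5 and 9 above, the figures can
be combined as follows. The replication factor of 3 (the HDFS default), the
lower bound of one million files per day, and the idea that every machine
contributes its whole disk to the cluster are assumptions made for
illustration only; they are not stated in the list.

```python
# Back-of-the-envelope sizing from requirements 3, 5 and 9.
# Assumptions (not from the list): REPLICATION, FILES_PER_DAY, and that
# every machine contributes its full disk to the cluster.

REPLICATION = 3            # assumed: HDFS default replication factor
FILES_PER_DAY = 1_000_000  # assumed lower bound for "order of millions"
RETENTION_DAYS = 2         # "say 2 days" in requirement 3


def raw_capacity_gb(machines, disk_gb):
    """Total disk across the cluster, before replication."""
    return machines * disk_gb


def usable_capacity_gb(machines, disk_gb, replication=REPLICATION):
    """Capacity left once each block is stored `replication` times."""
    return raw_capacity_gb(machines, disk_gb) / replication


def live_files(files_per_day=FILES_PER_DAY, retention_days=RETENTION_DAYS):
    """Files present at steady state if each file is kept `retention_days`."""
    return files_per_day * retention_days


if __name__ == "__main__":
    # Smallest and largest setups from requirement 9.
    for machines, disk_gb in [(200, 250), (600, 500)]:
        print(machines, "machines x", disk_gb, "GB:",
              raw_capacity_gb(machines, disk_gb), "GB raw,",
              round(usable_capacity_gb(machines, disk_gb)), "GB usable")
    print("files held at steady state:", live_files())
```

Under these assumptions the raw capacity ranges from 50,000 GB to
300,000 GB, and at least two million files are held at any time.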

Please let me know if you need more details.

Thanks in advance,
Amit
