Hi All,
I am considering using HDFS for an online, interactive application that could store many small files, i.e. 10-100 million files with an estimated average file size of 50-100 KB (perhaps smaller). All of the documentation I have seen suggests that a block size of 64-128 MB works best for Hadoop/HDFS and that it is best suited to batch-oriented applications.

Does anyone have experience using it for files of this size in an online application environment? Is it worth pursuing HDFS for this type of application?

Thanks,
Peter