What I did for one of our projects was similar. We used a super column to store file and directory metadata, and another row (keyed by UUID) to store the directory contents (files and subdirectories). We used UUIDs instead of paths because there will be renames and moves. Small files are stored in Cassandra.
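
A minimal sketch of that layout in Python, with plain dicts standing in for the two kinds of rows. Everything named here (metadata, children, create, rename, move, resolve) is hypothetical and only for illustration; it is not the actual schema and not a Cassandra client API. The point it shows is that a rename or move only touches the parent's listing and the node's own metadata, while every UUID key stays the same.

```python
import uuid

# Hypothetical in-memory stand-ins for the two row types (assumed names).
metadata = {}   # node UUID -> {"name": ..., "type": ..., "parent": ...}
children = {}   # directory UUID -> {child name: child UUID}


def create(name, kind, parent=None):
    """Create a file or directory and register it in its parent's listing."""
    node_id = uuid.uuid4()
    metadata[node_id] = {"name": name, "type": kind, "parent": parent}
    if kind == "dir":
        children[node_id] = {}
    if parent is not None:
        children[parent][name] = node_id
    return node_id


def rename(node_id, new_name):
    """Rename a node: only the parent's listing and the node's name change."""
    meta = metadata[node_id]
    parent = meta["parent"]
    if parent is not None:
        del children[parent][meta["name"]]
        children[parent][new_name] = node_id
    meta["name"] = new_name


def move(node_id, new_parent):
    """Move a node under another directory without touching its descendants."""
    meta = metadata[node_id]
    old_parent = meta["parent"]
    if old_parent is not None:
        del children[old_parent][meta["name"]]
    children[new_parent][meta["name"]] = node_id
    meta["parent"] = new_parent


def resolve(root, path):
    """Walk a slash-separated path from the root, one listing lookup per level."""
    node = root
    for part in path.strip("/").split("/"):
        if part:
            node = children[node][part]
    return node
```

With path-based keys, renaming a directory would mean rewriting the key of everything beneath it; here the subtree is untouched. The cost is that resolving a path takes one lookup per path component.
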
We used an internally developed filesystem to store the big files, those larger than x bytes. Locking is done with ZooKeeper and queuing with ZeroMQ.

Regards,
</VJ>

On Wed, Apr 14, 2010 at 9:39 PM, Tatu Saloranta <tsalora...@gmail.com> wrote:
> On Wed, Apr 14, 2010 at 7:26 PM, Avinash Lakshman
> <avinash.laksh...@gmail.com> wrote:
> > OPP is not required here. You would be better off using a Random partitioner
> > because you want to get a random distribution of the metadata.
>
> Not for splitting, but for actual file system hierarchy it would. How
> else would you traverse hierarchy? (list sub-directories, files)
>
> As to splitting files, yes, can be done, but I personally think that
> would be asking for trouble because of lack of atomicity for operations.
> Exception being if only operations ever would be append.
>
> -+ Tatu +-