On Wed, Apr 14, 2010 at 7:26 PM, Avinash Lakshman <avinash.laksh...@gmail.com> wrote:
> OPP is not required here. You would be better off using a Random partitioner
> because you want to get a random distribution of the metadata.

Not for splitting, but for the actual file system hierarchy it would be. How else would you traverse the hierarchy (list sub-directories and files)?

As for splitting files: yes, it can be done, but I personally think that would be asking for trouble because operations lack atomicity. The exception would be if the only operations were ever appends.

-+ Tatu +-
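
To illustrate the hierarchy point, here is a minimal sketch. It assumes full paths are used as row keys and that keys are kept in sorted order, which is what an order-preserving partitioner gives you. It does not use any Cassandra API: a sorted Python list stands in for the key space, and `list_children` and the sample paths are made up for the example.

```python
import bisect


def list_children(sorted_keys, directory):
    """List the immediate children of `directory` with a prefix range scan.

    `sorted_keys` is a hypothetical stand-in for row keys stored in key
    order; every key under a directory then sits in one contiguous range.
    """
    prefix = directory.rstrip("/") + "/"
    # Jump straight to the first key that could live under this directory.
    start = bisect.bisect_left(sorted_keys, prefix)
    children = []
    seen = set()
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # past the contiguous range for this directory
        # Keep only the first path component below the directory.
        name = key[len(prefix):].split("/", 1)[0]
        if name not in seen:
            seen.add(name)
            children.append(name)
    return children


keys = sorted([
    "/home/tatu/notes.txt",
    "/home/tatu/src/main.py",
    "/home/tatu/src/util.py",
    "/home/avinash/todo.txt",
    "/var/log/system.log",
])

print(list_children(keys, "/home/tatu"))  # ['notes.txt', 'src']
```

If keys are placed by hash instead, as I understand a random partitioner does, the entries under one directory are scattered and there is no contiguous range to scan, so listing a directory would need an explicitly maintained index of children.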