OPP is not required here. You would be better off using a Random partitioner because you want to get a random distribution of the metadata.
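A minimal sketch of why hashed placement spreads things out, assuming Python and hypothetical helper names (`split_into_blocks`, `block_key`, `node_for_key`, `place_file`) that are not part of Cassandra or CassFS. RandomPartitioner derives a token from an MD5 hash of the row key; the modulo step below is a simplified stand-in for the token ring, and the "path#index" key scheme is only an assumption for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Cut a file's contents into fixed-size blocks; the last one may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_key(path: str, index: int) -> str:
    """Hypothetical row key for one block: file path plus block index."""
    return f"{path}#{index}"


def node_for_key(key: str, node_count: int) -> int:
    """Hash the key with MD5 and map the token to a node.

    Taking the token modulo node_count is a simplification of a token ring.
    """
    token = int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")
    return token % node_count


def place_file(path: str, data: bytes, node_count: int,
               block_size: int = BLOCK_SIZE) -> dict:
    """Return {block key: (node index, block bytes)} for every block of the file."""
    placement = {}
    for index, block in enumerate(split_into_blocks(data, block_size)):
        key = block_key(path, index)
        placement[key] = (node_for_key(key, node_count), block)
    return placement


if __name__ == "__main__":
    for key, (node, block) in place_file("/docs/a.txt", b"abcdefghij", 5).items():
        print(key, "-> node", node, block)
```

Because the node is chosen from a hash of the key rather than from the key's sort order, neighbouring paths and consecutive blocks do not cluster on one machine, which is the point of preferring a random distribution for the metadata.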
Avinash

On Wed, Apr 14, 2010 at 7:25 PM, Avinash Lakshman <avinash.laksh...@gmail.com> wrote:

> Exactly. You can split a file into blocks of any size and you can actually
> distribute the metadata across a large set of machines. You wouldn't have
> the issue of having small files in this approach. The issue may be
> eventual consistency - not sure that is a paradigm that would be acceptable
> for a file system. But that is a discussion for another time/day.
>
> Avinash
>
> On Wed, Apr 14, 2010 at 7:15 PM, Ken Sandney <bluefl...@gmail.com> wrote:
>
>> Large files can be split into small blocks, and the block size can be
>> tuned. It may increase the complexity of writing such a file system, but
>> it can be general purpose (not only for relatively small files).
>>
>> On Thu, Apr 15, 2010 at 10:08 AM, Tatu Saloranta <tsalora...@gmail.com> wrote:
>>
>>> On Wed, Apr 14, 2010 at 6:42 PM, Zhuguo Shi <bluefl...@gmail.com> wrote:
>>> > Hi,
>>> > Cassandra has a good distributed model: decentralized, auto-partition,
>>> > auto-recovery. I am evaluating writing a file system over Cassandra
>>> > (like CassFS: http://github.com/jdarcy/CassFS ), but I don't know if
>>> > Cassandra is good for such a use case?
>>>
>>> It sort of depends on what you are looking for. For use cases where
>>> something like S3 is good, yes, except with one difference:
>>> Cassandra is more geared towards lots of small files, whereas S3 is
>>> more geared towards a moderate number of files (possibly large).
>>>
>>> So I think it can definitely be a good use case, and I may use
>>> Cassandra for this myself in future. Having range queries allows
>>> implementing directory/path structures (list keys using path as
>>> prefix). And you can split storage such that metadata could live in
>>> an OPP partition, raw data in RP.
>>>
>>> -+ Tatu +-