OPP is not required here. You would be better off using a Random partitioner
because you want to get a random distribution of the metadata.
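
To illustrate the point about random distribution, here is a minimal sketch, assuming MD5-style hashing of keys onto a token ring. The node count, the ring tokens, the string split points and the key names are all made up for illustration; this is not Cassandra's actual code. It compares how path-like metadata keys spread across nodes under hashed placement versus order-preserving placement.

```python
import hashlib
from bisect import bisect_left
from collections import Counter

NODE_COUNT = 4          # hypothetical cluster size
TOKEN_SPACE = 2 ** 128  # an MD5 digest is 128 bits

def md5_token(key: str) -> int:
    """Hash a key to an integer token, in the spirit of a random partitioner."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")

def owner(token, ring):
    """Return the index of the first node whose ring token is >= token."""
    return bisect_left(ring, token) % len(ring)

# Evenly spaced integer tokens for hashed placement (assumed layout).
HASH_RING = [(i + 1) * TOKEN_SPACE // NODE_COUNT - 1 for i in range(NODE_COUNT)]
# Made-up string split points for order-preserving placement.
ORDERED_RING = ["g", "n", "t", "\uffff"]

def placement(keys, hashed: bool) -> Counter:
    """Count how many keys each node would own."""
    if hashed:
        return Counter(owner(md5_token(k), HASH_RING) for k in keys)
    return Counter(owner(k, ORDERED_RING) for k in keys)

keys = [f"/home/alice/file{i:04d}" for i in range(1000)]
print("order-preserving:", dict(placement(keys, hashed=False)))
print("hashed:", dict(placement(keys, hashed=True)))
```

With these made-up split points, every key that shares the "/home/..." prefix sorts onto the same node under order-preserving placement, while the hashed tokens scatter the same keys over all four nodes.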

Avinash

On Wed, Apr 14, 2010 at 7:25 PM, Avinash Lakshman <avinash.laksh...@gmail.com> wrote:

> Exactly. You can split a file into blocks of any size and you can actually
> distribute the metadata across a large set of machines. You wouldn't have
> the issue of having small files in this approach. The issue may be the
> eventual consistency - not sure that is a paradigm that would be acceptable
> for a file system. But that is a discussion for another time/day.
>
> Avinash
>
> On Wed, Apr 14, 2010 at 7:15 PM, Ken Sandney <bluefl...@gmail.com> wrote:
>
>> Large files can be split into small blocks, and the block size can be
>> tuned. It may increase the complexity of writing such a file system, but it
>> can then be general purpose (not only for relatively small files).
>>
>>
>> On Thu, Apr 15, 2010 at 10:08 AM, Tatu Saloranta <tsalora...@gmail.com> wrote:
>>
>>> On Wed, Apr 14, 2010 at 6:42 PM, Zhuguo Shi <bluefl...@gmail.com> wrote:
>>> > Hi,
>>> > Cassandra has a good distributed model: decentralized, auto-partitioning,
>>> > auto-recovery. I am evaluating writing a file system on top of Cassandra
>>> > (like CassFS: http://github.com/jdarcy/CassFS ), but I don't know whether
>>> > Cassandra is a good fit for such a use case.
>>>
>>> It sort of depends on what you are looking for. For the kind of use case
>>> where something like S3 is a good fit, yes, with one difference:
>>> Cassandra is geared more towards lots of small files, whereas S3 is
>>> geared more towards a moderate number of (possibly large) files.
>>>
>>> So I think it can definitely be a good use case, and I may use
>>> Cassandra for this myself in the future. Having range queries allows
>>> implementing directory/path structures (list keys using the path as a
>>> prefix). And you can split storage so that metadata lives under the
>>> order-preserving partitioner (OPP) and raw data under the random
>>> partitioner (RP).
>>>
>>> -+ Tatu +-
>>>
>>
>>
>
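
The two ideas discussed in the quoted thread, splitting a file into blocks of a tunable size and listing a directory by using the path as a key prefix, can be sketched roughly as follows. This is only an in-memory illustration under stated assumptions: the "path#index" block key format, the tiny block size and the helper names are hypothetical, and a sorted Python list stands in for an ordered key range scan.

```python
from bisect import bisect_left

BLOCK_SIZE = 4  # tiny on purpose; the block size is meant to be tunable

def split_blocks(path: str, data: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Return {block_key: chunk}. The 'path#index' key format is hypothetical."""
    return {
        f"{path}#{offset // block_size:08d}": data[offset:offset + block_size]
        for offset in range(0, len(data), block_size)
    }

def list_prefix(sorted_keys: list, prefix: str) -> list:
    """Range scan over sorted keys: collect every key that starts with prefix."""
    start = bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so nothing further can match
        matches.append(key)
    return matches

blocks = split_blocks("/docs/a.txt", b"hello world")
# Zero-padded indexes keep the block keys in file order when sorted.
restored = b"".join(blocks[key] for key in sorted(blocks))

metadata_keys = sorted(["/docs/a.txt", "/docs/b.txt", "/etc/hosts", "/doc"])
print(list_prefix(metadata_keys, "/docs/"))
```

The prefix scan only works when keys are stored in sorted order, which is the reason for wanting order-preserving placement for metadata. The block keys are only ever looked up individually, so they do not need it.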
