Exactly. You can split a file into blocks of any size and actually
distribute the metadata across a large set of machines. You wouldn't have
the small-files issue with this approach. The issue may be eventual
consistency - not sure that is a paradigm that would be acceptable
for a file system. But that is a discussion for another time/day.
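A minimal sketch of the block-splitting scheme described above (the block size, key scheme, and hashing here are illustrative assumptions, not anything Cassandra prescribes): each file is cut into fixed-size blocks, and each block gets its own key so a partitioner can spread blocks, and their metadata, across machines.

```python
# Hypothetical sketch: split a file into fixed-size blocks, each
# stored under its own key. BLOCK_SIZE and the key format are
# assumptions for illustration only.
import hashlib

BLOCK_SIZE = 64 * 1024  # tunable block size (assumed 64 KiB here)

def split_into_blocks(path, data):
    """Yield (key, block) pairs for a file's contents."""
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        # Key each block by file path + block index; hashing the key
        # lets a random partitioner distribute blocks across nodes.
        key = f"{path}:{i // BLOCK_SIZE}"
        yield hashlib.md5(key.encode()).hexdigest(), block

blocks = list(split_into_blocks("/tmp/example", b"x" * 150_000))
# 150,000 bytes at 64 KiB per block -> 3 blocks
```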

Avinash

On Wed, Apr 14, 2010 at 7:15 PM, Ken Sandney <bluefl...@gmail.com> wrote:

> Large files can be split into small blocks, and the block size can be
> tuned. It may increase the complexity of writing such a file system, but it
> could serve general-purpose use (not only relatively small files)
>
>
> On Thu, Apr 15, 2010 at 10:08 AM, Tatu Saloranta <tsalora...@gmail.com> wrote:
>
>> On Wed, Apr 14, 2010 at 6:42 PM, Zhuguo Shi <bluefl...@gmail.com> wrote:
>> > Hi,
>> > Cassandra has a good distributed model: decentralized, auto-partition,
>> > auto-recovery. I am evaluating about writing a file system over
>> Cassandra
>> > (like CassFS: http://github.com/jdarcy/CassFS ), but I don't know if
>> > Cassandra is good at such use case?
>>
>> It sort of depends on what you are looking for. For use cases where
>> something like S3 is good, yes, with one difference:
>> Cassandra is more geared towards lots of small files, whereas S3 is
>> more geared towards a moderate number of (possibly large) files.
>>
>> So I think it can definitely be a good use case, and I may use
>> Cassandra for this myself in the future. Having range queries allows
>> implementing directory/path structures (list keys using the path as a
>> prefix). And you can split storage such that metadata lives under the
>> order-preserving partitioner (OPP) and raw data under the random
>> partitioner (RP).
>>
>> -+ Tatu +-
>>
>
>
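A minimal sketch of the directory-listing idea Tatu describes, using a plain sorted key list as a stand-in for an order-preserving partitioner (this is not the Cassandra API; the keys and helper are invented for illustration): with order-preserving keys, listing a directory reduces to a range scan over keys sharing the path prefix.

```python
# Illustrative sketch (not Cassandra code): a prefix range scan over
# an ordered key space, which is what an order-preserving partitioner
# makes possible for directory listings.
import bisect

def list_dir(sorted_keys, prefix):
    """Return all keys starting with `prefix` from an ordered key list."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    # "\xff" sorts after any printable path character, closing the range.
    hi = bisect.bisect_left(sorted_keys, prefix + "\xff")
    return sorted_keys[lo:hi]

keys = sorted(["/a/1", "/a/2", "/a/b/3", "/c/4"])
print(list_dir(keys, "/a/"))  # -> ['/a/1', '/a/2', '/a/b/3']
```

With a random partitioner the key space is not ordered, so this kind of range scan is not available, which is why metadata under OPP and raw blocks under RP is a natural split.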
