On Fri, Apr 16, 2010 at 4:08 AM, Mark Robson <mar...@gmail.com> wrote:
> On 15 April 2010 02:42, Zhuguo Shi <bluefl...@gmail.com> wrote:
>>
>> Hi,
>> Cassandra has a good distributed model: decentralized, auto-partition,
>> auto-recovery. I am evaluating about writing a file system over Cassandra
>> (like CassFS: http://github.com/jdarcy/CassFS ), but I don't know if
>> Cassandra is good at such use case?
>
> I have considered this too.
> I think a FUSE-based filesystem could be made to work over Cassandra;
> initially it could be limited to storing small files (<500M for example) so
> that we could put the entire file contents in one row.
> However a lot of operations are difficult to do no matter how you design it,
> especially renames (e.g. what happens if two nodes rename different files to
> the same name).
> Also the filesystem would not have POSIX conformity, however, would probably
> be able to produce some behaviour which was useful to most applications in
> most cases (think of straightforward document management, uploaded image
> storage, quarantine storage etc).
> Eventual consistency would mean that things which are conventionally atomic
> in POSIX, would not be (e.g. rename) and the user (application) would need
> to tolerate this.
> Depending on how you constructed it, it could be easy to "lose" files which
> continued to be stored, but no longer appears in the filesystem (broken
> link) which then could not be efficiently garbage collected - the typical
> case would be where a file was not completely created (a client node failed)
> or where two files were renamed to the same name (one would be lost, but
> might not get marked as deleted in Cassandra). This would cause a resource
> leak.
> If you can work around these problems, it would be an attractive option for
> many types of application.
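To make the rename problem concrete, here is a rough sketch in Python. It is a toy in-memory model and not the Cassandra API: I am assuming a hypothetical layout with one map from path to blob id and another from blob id to contents, and that conflicting writes are settled by "highest timestamp wins". The names LwwMap, rename and orphaned_blobs are made up for illustration.

```python
# Toy in-memory model (NOT the Cassandra API). Assumption: every write
# carries a timestamp and the write with the highest timestamp wins.

class LwwMap:
    """Key -> (timestamp, value); newest timestamp wins. None marks a delete."""

    def __init__(self):
        self.cells = {}

    def write(self, key, value, ts):
        current = self.cells.get(key)
        if current is None or ts > current[0]:
            self.cells[key] = (ts, value)

    def read(self, key):
        cell = self.cells.get(key)
        return cell[1] if cell else None

    def live_items(self):
        # Entries whose latest write was not a delete.
        return {k: v for k, (_, v) in self.cells.items() if v is not None}


def rename(names, src, dst, ts):
    """Non-atomic rename: read the source entry, write dst, then delete src."""
    blob_id = names.read(src)
    names.write(dst, blob_id, ts)
    names.write(src, None, ts)


def orphaned_blobs(names, blobs):
    """Full scan for blobs that no directory entry points at any more."""
    referenced = set(names.live_items().values())
    return sorted(set(blobs) - referenced)


# Hypothetical layout: path -> blob id, and blob id -> file contents.
blobs = {"blob-a": b"contents of a", "blob-b": b"contents of b"}
names = LwwMap()
names.write("/a.txt", "blob-a", ts=1)
names.write("/b.txt", "blob-b", ts=1)

# Two clients, unaware of each other, rename different files to /c.txt.
rename(names, "/a.txt", "/c.txt", ts=2)
rename(names, "/b.txt", "/c.txt", ts=3)

visible = names.live_items()           # only the later rename survives
leaked = orphaned_blobs(names, blobs)  # still stored, but unreachable
print(visible)
print(leaked)
```

In this model the second rename silently overwrites the first, so blob-a stays stored with nothing pointing at it, and the only way to find it is to scan everything. That is the resource leak described above.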
I think that even without solving these, it could be an attractive option, the same way Amazon's S3 is an attractive option. Operations beyond PUT are not atomic, and access is by full read/replace; and yet it is enough for many use cases, because access is fast, maintenance is very cheap (i.e. you are not the admin), and, well, you use it for cases where regular file system properties are not needed.

So perhaps a better way to phrase the question would be whether you could build a file-system-like thing on Cassandra, which could be used in lieu of a traditional file system for some tasks.

-+ Tatu +-