hi,

i'm quite new to riak and only know it from the docs available online.
to be honest, i did not search for a key/value store, but for a reliable (HA) distributed, replicated filesystem that allows dynamic growth.

all the filesystems i've dealt with are either immature, abandoned, or limited in features like dynamic scaling and snapshotting, or they fail in out-of-disk-space scenarios (as they don't give you high availability and data protection at the same time).

somehow i stumbled upon this project and liked its features, despite it not being a filesystem at all. i can live with its flat structure if it brings me all the other features i need.

so i'm now at the point where reading the online docs without any hands-on experience leaves some questions unanswered. since i'm used to storing all data in a filesystem, our application's storage interface would need a complete rewrite to interface with riak and provide the same services as before. therefore i'd like to ask you to share your knowledge and experience.

1) are snapshots provided?
i guess they aren't, but i'm more interested in whether i can use the vector clocks for that. i only need one snapshot plus the live data to provide a consistent old view of the data for our staging instance.

2) how does riak deal with different storage capacities on the different nodes? is it a problem if some nodes provide less space than others? is data distributed uniformly across all nodes, or is each node's capacity taken into account?

3) we've got quite large files for a database to store. is that a problem? which storage backend would you propose? currently we see the following size distribution, but i expect more in the 512MB to 4GB range in the future:
         <   1KB: 64053
     1KB -   1MB: 873795
     1MB -   2MB: 4776
     2MB -   4MB: 3131
     4MB -   8MB: 3136
     8MB -  16MB: 2842
    16MB -  32MB: 3136
    32MB -  64MB: 4032
    64MB - 128MB: 3118
   128MB - 256MB: 3361
   256MB - 512MB: 3221
   512MB -   1GB: 1423
     1GB -   2GB: 75

4) is range access possible to read parts of a file^W value, or do i need to stream the whole value through? the latter would not perform well on huge values.
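
in case range reads aren't supported, here's the kind of workaround i have in mind (a minimal python sketch against a generic key/value get/put interface -- the key scheme is made up, this is NOT the actual riak client api): split each large value into fixed-size chunks keyed by index, so a byte range only touches the chunks it overlaps.

```python
# sketch: store a large blob as fixed-size chunks so byte ranges
# can be read without streaming the whole value.
# assumes a plain dict-like key/value store -- NOT the real riak API;
# the "<name>:<index>" and "<name>:size" keys are a made-up convention.

CHUNK_SIZE = 1024 * 1024  # 1MB chunks

def put_blob(store, name, data):
    """split data into chunks keyed by '<name>:<index>'."""
    for i in range(0, len(data), CHUNK_SIZE):
        store[f"{name}:{i // CHUNK_SIZE}"] = data[i:i + CHUNK_SIZE]
    store[f"{name}:size"] = len(data)

def read_range(store, name, offset, length):
    """fetch only the chunks overlapping [offset, offset+length)."""
    end = min(offset + length, store[f"{name}:size"])
    first, last = offset // CHUNK_SIZE, (end - 1) // CHUNK_SIZE
    blob = b"".join(store[f"{name}:{i}"] for i in range(first, last + 1))
    start = offset - first * CHUNK_SIZE
    return blob[start:start + (end - offset)]
```

a 4GB read of a 10-byte range would then fetch a single 1MB chunk instead of the whole value.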

5) to reduce the impact of a disk failure on the storage backend, i'd like each disk of a server to be assigned to its own riak node. i guess healing the failed node after replacement is faster than raid recovery, and less data is at risk. is it possible to reflect the hardware hierarchy in some way to influence replica placement? CephFS offers this to make sure replicas are held on different hardware or even in different locations. e.g. a STORAGE is in a SERVER, which is in a RACK, which is in a DATACENTER. replicas of a value in a STORAGE should never be placed inside the same SERVER (or RACK, or DATACENTER).
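
the CephFS-style constraint i mean is easy to state in code (a toy python sketch of the rule only -- nothing to do with riak's actual ring or preference lists): given each node's location path, a replica set is valid at a chosen level if no two replicas share the same ancestor at that level.

```python
# toy sketch of a CephFS-style placement constraint (not riak code):
# each node has a location path; replicas are valid at a given level
# if no two of them share the same ancestor at that level.

LEVELS = ("datacenter", "rack", "server", "storage")

def placement_ok(locations, level):
    """locations: list of dicts like {'datacenter': 'dc1', 'rack': 'r2', ...}.
    returns True if all replicas differ at `level`."""
    prefix = LEVELS[:LEVELS.index(level) + 1]
    seen = {tuple(loc[l] for l in prefix) for loc in locations}
    return len(seen) == len(locations)
```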

6) what happens if fewer than R or W nodes respond? does that mean "not found" or "not available"? even if the data exists on a currently offline node.
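
to make the distinction i'm after concrete, here's my current understanding as a toy python model (my assumptions, not riak internals): a read with fewer than R replies should fail as "unavailable" rather than "not found", because the data may still exist on a node that is merely offline.

```python
# toy model of a quorum read: N replicas, R required replies.
# fewer than R replies => "unavailable", NOT "not found" --
# the data may well exist on a node that is merely offline.

def quorum_read(replica_replies, r):
    """replica_replies: list of values, or None for a node that
    is down / did not reply in time."""
    replies = [v for v in replica_replies if v is not None]
    if len(replies) < r:
        raise TimeoutError("unavailable: quorum not met")
    # with enough replies the newest value would be resolved via
    # vector clocks; this sketch just returns the first reply.
    return replies[0]
```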

7) can client applications connect to some random node?
   should they simply retry the next one in the list upon failure?
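
what i'd naively implement on the client side is something like this (a hedged python sketch; the `connect` callable is a placeholder, not a real riak client function):

```python
import random

# naive client-side failover sketch: pick a random node, and on
# connection failure simply try the remaining nodes in turn.
# `connect` is a made-up placeholder callable, not a riak API.

def connect_any(nodes, connect):
    candidates = list(nodes)
    random.shuffle(candidates)       # spread load across the cluster
    last_error = None
    for node in candidates:
        try:
            return connect(node)
        except ConnectionError as e:
            last_error = e           # node down -- try the next one
    raise last_error or ConnectionError("no nodes reachable")
```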

8) on a read, is the returned data compared/verified against all replicas to ensure consistency, or just its metadata (if R>1)?

9) is data integrity in the storage backend secured through checksums?
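
if the backend doesn't checksum, this is the kind of client-side integrity check i'd bolt on (a minimal python sketch; the ":sha256" key suffix is a made-up convention, not anything riak defines):

```python
import hashlib

# client-side integrity sketch: store a sha256 digest next to each
# value and verify it on every read.

def put_checked(store, key, value):
    store[key] = value
    store[key + ":sha256"] = hashlib.sha256(value).hexdigest()

def get_checked(store, key):
    value = store[key]
    if hashlib.sha256(value).hexdigest() != store[key + ":sha256"]:
        raise IOError("checksum mismatch for " + repr(key))
    return value
```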

these are the questions puzzling me at the moment.
if you know some filesystem that matches my feature list, please don't hesitate to mention it, even if that's off-topic ;-)

cheers
  pille

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com