It's worth mentioning that there are already two FUSE drivers written to work with Riak [1]. They haven't been touched for a while, but at least one was used heavily in production [2], and they might be a good place to start for your use case.
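For the curious, the bridge itself is conceptually small. Below is a minimal read-only sketch in Python of what such a FUSE-to-Riak driver might look like, assuming the fusepy library and Riak's HTTP interface on localhost:8098; the bucket name and key layout are hypothetical, and the riakfuse project above is a far more complete reference.

    # Minimal read-only FUSE-to-Riak sketch (hypothetical; see riakfuse
    # above for a real implementation). Assumes fusepy and Riak's HTTP API.
    import errno
    import stat
    import sys

    import requests
    from fuse import FUSE, FuseOSError, Operations

    RIAK = "http://127.0.0.1:8098/riak/fs"  # the "fs" bucket is hypothetical

    class RiakFS(Operations):
        def getattr(self, path, fh=None):
            if path == "/":
                return {"st_mode": stat.S_IFDIR | 0o755, "st_nlink": 2}
            # HEAD fetches object metadata without transferring the body.
            resp = requests.head(RIAK + path)
            if resp.status_code != 200:
                raise FuseOSError(errno.ENOENT)
            return {"st_mode": stat.S_IFREG | 0o444, "st_nlink": 1,
                    "st_size": int(resp.headers.get("Content-Length", 0))}

        def read(self, path, size, offset, fh):
            # Plain Riak KV has no partial reads, so fetch the whole value
            # and slice it client-side -- one reason Luwak exists (see below).
            resp = requests.get(RIAK + path)
            if resp.status_code != 200:
                raise FuseOSError(errno.EIO)
            return resp.content[offset:offset + size]

    if __name__ == "__main__":
        FUSE(RiakFS(), sys.argv[1], foreground=True, ro=True)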
Mark

1 - http://wiki.basho.com/Community-Developed-Libraries-and-Projects.html#Other-Tools-and-Projects (towards the bottom of the list)
2 - https://github.com/crucially/riakfuse

On Mon, Sep 26, 2011 at 9:23 AM, Jonathan Langevin <jlange...@loomlearning.com> wrote:

> If you were to continue to pursue the use of Riak for a distributed FS, and
> if you have any resources to toss at development, it may be possible to
> build a FUSE driver that acts as a Riak client. FUSE = filesystem in
> userspace, and it can run on most any Linux/BSD variant (including Mac
> OS X).
>
> More info: http://en.wikipedia.org/wiki/Filesystem_in_Userspace
>
> There is also a list of FUSE drivers at the above URL, several of which
> mention "distributed" in the description. One of those may suffice for you
> (if you've not already reviewed them). Otherwise, you could possibly use
> their FUSE drivers as a basis for your own custom FUSE Riak driver.
>
> Jonathan Langevin
> Systems Administrator
> Loom Inc.
> Wilmington, NC: (910) 241-0433 - jlange...@loomlearning.com -
> www.loomlearning.com - Skype: intel352
>
>
> On Sun, Sep 25, 2011 at 4:29 PM, Jeremiah Peschka
> <jeremiah.pesc...@gmail.com> wrote:
>
>> Responses inline
>> ---
>> Jeremiah Peschka - Founder, Brent Ozar PLF, LLC
>> Microsoft SQL Server MVP
>>
>> On Sep 25, 2011, at 5:30 AM, pille wrote:
>>
>> > hi,
>> >
>> > i'm quite new to riak and only know it from the docs available online.
>> > to be honest, i did not search for a key/value store, but for a reliable
>> > (HA) distributed, replicated filesystem that allows dynamic growth.
>>
>> To be honest, what you're looking for is a SAN. EMC's Isilon line, Dell's
>> EqualLogic, and HP's LeftHand devices all meet your needs very well. They
>> don't require a lot of administrative knowledge, they're easy to set up
>> and maintain, and they are very easy to expand. SANs provide the features
>> and functionality that you're looking for and won't require any additional
>> development or maintenance. Yes, they cost money, but they do just sorta
>> work straight out of the box.
>>
>> That being said, I answered the rest of these questions as if you weren't
>> willing to just throw a bucket of money and SAN gear at your problem.
>>
>> > all these filesystems i've dealt with are either immature or abandoned,
>> > are limited in features like dynamic scaling or snapshotting, or fail in
>> > out-of-diskspace scenarios (as they don't give you high availability and
>> > data protection at the same time).
>> >
>> > somehow i stumbled upon this project and liked its features, despite it
>> > not being a filesystem at all. i can live with its flat structure if
>> > it'll bring me all the other features i need.
>> >
>> > so i'm now at the point where reading the online docs without any
>> > hands-on experience leaves some questions unanswered.
>> > since i'm used to storing all data in a filesystem, our application's
>> > storage interface would need a complete rewrite to interface with riak
>> > and provide the same services as before. therefore i'd like to ask you
>> > to share your knowledge and experience.
>> >
>> > 1) are snapshots provided?
>> > i guess they aren't, but i'm more interested in whether i can use the
>> > vector clocks for that.
>> > i only need one snapshot plus live data to provide a consistent old
>> > view of the data for our staging instance.
>>
>> Snapshots are not provided. You could probably cook something up yourself,
>> but there's no snapshotting involved that I know of. Vector clocks are
>> used for determining object lineage and conflict resolution.
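To make the lineage point concrete: Riak's HTTP interface hands back an object's causal context in the X-Riak-Vclock header, and you return it on the next update so Riak can tell a descendant from a conflict. A minimal sketch, assuming a local node on port 8098 (the bucket and key names are hypothetical):

    # Fetch an object and reuse its vector clock on the next write.
    # Bucket/key names are hypothetical; node assumed on localhost:8098.
    import requests

    url = "http://127.0.0.1:8098/riak/staging/some-object"

    resp = requests.get(url)
    vclock = resp.headers.get("X-Riak-Vclock")  # the object's causal context

    # Passing the vclock back tells Riak which version this write descends
    # from, so it can detect conflicts instead of silently overwriting.
    # Note this is lineage tracking, not a point-in-time snapshot.
    requests.put(url, data=b"updated value",
                 headers={"Content-Type": "application/octet-stream",
                          "X-Riak-Vclock": vclock})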
>> > 2) how does riak deal with different storage capacities on the different
>> > nodes? is it a problem if some nodes provide less space than others? is
>> > data distributed uniformly across all nodes, or is their capacity taken
>> > into account?
>>
>> AFAIK, data is distributed evenly across a number of virtual nodes (64 by
>> default). Those virtual nodes are then distributed evenly across your
>> physical nodes. I don't know of a way to change this, but I've been very
>> wrong before.
>>
>> > 3) we've got quite huge files for a database to store. is that a
>> > problem? what storage backend do you propose?
>> > currently we see the following distribution, but i expect more in the
>> > range from 512MB to 4GB to come in future:
>> > < 1KB: 64053
>> > 1KB - 1MB: 873795
>> > 1MB - 2MB: 4776
>> > 2MB - 4MB: 3131
>> > 4MB - 8MB: 3136
>> > 8MB - 16MB: 2842
>> > 16MB - 32MB: 3136
>> > 32MB - 64MB: 4032
>> > 64MB - 128MB: 3118
>> > 128MB - 256MB: 3361
>> > 256MB - 512MB: 3221
>> > 512MB - 1GB: 1423
>> > 1GB - 2GB: 75
>>
>> Riak KV's maximum acceptable object size is about 64MB, but performance
>> would probably start degrading before that. Luwak is an application built
>> on top of Riak that probably meets your needs a lot better than plain old
>> Riak KV: http://wiki.basho.com/Luwak.html
>>
>> > 4) is range access possible, to read parts of a file^W value, or do i
>> > need to stream the whole file through? this would not perform well on
>> > huge values.
>>
>> With Luwak it's possible to get a portion of the object using the optional
>> Range parameter: http://wiki.basho.com/HTTP-Fetch-Luwak-Object.html
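To make the Range option concrete: a partial read is just a standard HTTP Range request against the Luwak endpoint. A minimal sketch, assuming a local node on port 8098 (the key name is hypothetical; the wiki page above documents the actual parameters):

    # Read the first 64KB of a Luwak-stored file instead of streaming
    # the whole value. Node address and key name are hypothetical.
    import requests

    resp = requests.get("http://127.0.0.1:8098/luwak/big-video.bin",
                        headers={"Range": "bytes=0-65535"})

    # A server honoring Range replies 206 Partial Content.
    print(resp.status_code, len(resp.content))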
>> > 5) to reduce the impact of a disk failure on the storage backend, i'd
>> > like each disk of a server to be assigned to its own riak node. i guess
>> > healing the failed node after replacement is faster than raid recovery,
>> > and less data is at risk.
>> > is it possible to reflect the hardware hierarchy in some way to
>> > influence the placement of replicas? CephFS offers this to make sure
>> > replicas are held on different hardware or even in different locations.
>> > e.g. a STORAGE is in a SERVER, which is in a RACK, which is in a
>> > DATACENTER. replicas of a file in a STORAGE should never be placed
>> > inside the same SERVER (or RACK, or DATACENTER).
>>
>> You can purchase Riak EDS, which has multi-site replication. Otherwise,
>> Riak is just going to throw data onto N nodes in your cluster, and it will
>> be up to you to make sure those nodes are in different racks.
>>
>> > 6) what happens if less than R or W nodes report data? does it mean not
>> > found, or not available? even if the data is on a currently offline node.
>>
>> If less than R nodes respond, your read will fail. The R value means
>> "this many nodes have to respond with data for it to be considered a
>> successful read." Anything less than R would, thusly, mean there was a
>> failure.
>>
>> If less than W nodes are able to write data, a hinted handoff will occur.
>>
>> > 7) can the client applications connect to some random node?
>> > should they simply retry the next one in the list upon failure?
>>
>> Client applications should connect to a random node, yes. Even better, you
>> should put a load balancing proxy server in front of your Riak cluster so
>> developers don't have to worry about writing their own load balancing code.
>>
>> I'd retry on failure, but that's up to you. ;)
>>
>> > 8) is the data reported back on a read compared/verified with all
>> > replicas to ensure consistency, or just its metadata (if R > 1)?
>>
>> Yes, R nodes have to respond with *the same* copy of the data before a
>> read is successful. You can quickly do this by comparing vector clocks and
>> other assorted metadata.
>>
>> > 9) is data integrity in the storage backend secured through checksums?
>>
>> I think it depends on the storage backend implementation. Doing a quick
>> grep through the source code turns up the word "checksum" a lot, though.
>>
>> > these are the questions puzzling me at the moment.
>> > if you know of some filesystem that matches my feature list, please
>> > don't hesitate to answer off-topic ;-)
>>
>> Other options include HDFS and MogileFS (http://danga.com/mogilefs/).
>> Last.fm uses MogileFS.
>>
>> > cheers
>> > pille
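For what the random-node-plus-retry approach might look like in client code, here is a minimal sketch against Riak's HTTP interface; the node list, bucket, and r value are hypothetical, and a load balancing proxy in front of the cluster (as Jeremiah suggests) makes this unnecessary:

    # Sketch: pick a random node, retry the remaining ones on failure.
    # Node addresses, bucket, and the r quorum value are hypothetical.
    import random
    import requests

    NODES = ["10.0.0.1:8098", "10.0.0.2:8098", "10.0.0.3:8098"]

    def fetch(bucket, key, r=2, timeout=2.0):
        """Try every node once, starting from a random one."""
        nodes = NODES[:]
        random.shuffle(nodes)
        last_error = None
        for node in nodes:
            try:
                # r asks Riak to hear from that many replicas before answering.
                resp = requests.get(
                    "http://%s/riak/%s/%s" % (node, bucket, key),
                    params={"r": r}, timeout=timeout)
                if resp.status_code == 200:
                    return resp.content
                last_error = RuntimeError(
                    "HTTP %d from %s" % (resp.status_code, node))
            except requests.ConnectionError as exc:
                last_error = exc  # node down: fall through to the next one
        raise last_error

    # value = fetch("fs", "some-object")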
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com