That FUSE driver appears to be a bit more complete than the Ruby version I linked earlier, nice find Mark.
Another possible option is to simply use a solution such as Amazon's EBS <http://aws.amazon.com/ebs/> + S3. You would use S3 for snapshot backups to ensure data persistence.

Jonathan Langevin
I.T. Manager, Loom Inc.
Wilmington, NC: (910) 241-0433 - jlange...@loomlearning.com - www.loomlearning.com - Skype: intel352

On Mon, Sep 26, 2011 at 12:39 PM, Mark Phillips <m...@basho.com> wrote:

> It's worth mentioning that there are already two FUSE drivers written to work with Riak [1]. They haven't been touched for a while, but at least one was used heavily in production [2], and they might be a good place to start for your use case.
>
> Mark
>
> 1 - http://wiki.basho.com/Community-Developed-Libraries-and-Projects.html#Other-Tools-and-Projects (towards the bottom of the list)
> 2 - https://github.com/crucially/riakfuse

On Mon, Sep 26, 2011 at 9:23 AM, Jonathan Langevin <jlange...@loomlearning.com> wrote:

> If you were to continue to pursue the use of Riak for a distributed FS, and if you have any resources to toss at development, it may be possible to build a FUSE driver that acts as a Riak client (a rough sketch of the idea appears at the end of this thread). FUSE = filesystem in userspace; it can run on most any Linux/BSD variant (including Mac OS X).
>
> More info: http://en.wikipedia.org/wiki/Filesystem_in_Userspace
>
> There is also a list of FUSE drivers at the above URL, several of which mention "distributed" in the description. One of those may suffice for you (if you've not already reviewed them). Otherwise, you could possibly use those FUSE drivers as a basis for your own custom FUSE Riak driver.
>
> Jonathan Langevin
> Systems Administrator, Loom Inc.
> Wilmington, NC: (910) 241-0433 - jlange...@loomlearning.com - www.loomlearning.com - Skype: intel352

On Sun, Sep 25, 2011 at 4:29 PM, Jeremiah Peschka <jeremiah.pesc...@gmail.com> wrote:

> Responses inline.
> ---
> Jeremiah Peschka - Founder, Brent Ozar PLF, LLC
> Microsoft SQL Server MVP
>
> On Sep 25, 2011, at 5:30 AM, pille wrote:
>
> > Hi,
> >
> > I'm quite new to Riak and only know it from the docs available online. To be honest, I did not search for a key/value store, but for a reliable (HA) distributed, replicated filesystem that allows dynamic growth.
>
> To be honest, what you're looking for is a SAN. EMC's Isilon line, Dell's Equallogic, and HP's Lefthand devices all meet your needs very well. They don't require a lot of administrative knowledge, they're easy to set up and maintain, and they are very easy to expand. SANs provide the features and functionality that you're looking for and won't require any additional development or maintenance. Yes, they cost money, but they do just sorta work straight out of the box.
>
> That being said, I answered the rest of these questions as if you weren't willing to just throw a bucket of money and SAN gear at your problem.
>
> > All the filesystems I've dealt with are either immature, abandoned, or limited in features like dynamic scaling and snapshotting, or they fail in out-of-diskspace scenarios (as they don't give you high availability and data protection at the same time).
> >
> > Somehow I stumbled upon this project and liked its features, despite it not being a filesystem at all. I can live with its flat structure if it'll bring me all the other features I need.
> >
> > So I'm now at the point where, after reading the online docs without any hands-on experience, some questions remain unanswered.
> > Since I'm used to storing all data in a filesystem, our application's storage interface would need a complete rewrite to interface with Riak and provide the same services as before. Therefore I'd like to ask you to share your knowledge and experience.
> >
> > 1) Are snapshots provided?
> > I guess they aren't, but I'm more interested in whether I can use the vector clocks for that.
> > I only need one snapshot plus the live data to provide a consistent old view of the data for our staging instance.
>
> Snapshots are not provided. You could probably cook something up yourself, but there's no snapshotting involved that I know of. Vector clocks are used for determining object lineage and conflict resolution.
>
> > 2) How does Riak deal with different storage capacities on the different nodes? Is it a problem if some nodes provide less space than others? Is data distributed uniformly across all nodes, or is their capacity taken into account?
>
> AFAIK, data is distributed evenly across a number of virtual nodes (64 by default). Those virtual nodes are then distributed evenly across your physical nodes. I don't know of a way to change this, but I've been very wrong before.
>
> > 3) We've got quite huge files for a database to store. Is that a problem? What storage backend do you propose?
> > Currently we see the following distribution, but I expect more in the range from 512MB to 4GB to come in the future:
> > < 1KB: 64053
> > 1KB - 1MB: 873795
> > 1MB - 2MB: 4776
> > 2MB - 4MB: 3131
> > 4MB - 8MB: 3136
> > 8MB - 16MB: 2842
> > 16MB - 32MB: 3136
> > 32MB - 64MB: 4032
> > 64MB - 128MB: 3118
> > 128MB - 256MB: 3361
> > 256MB - 512MB: 3221
> > 512MB - 1GB: 1423
> > 1GB - 2GB: 75
>
> Riak KV's maximum workable object size is about 64MB, but performance would probably start degrading before that. Luwak is an application built on top of Riak that probably meets your needs a lot better than plain old Riak KV: http://wiki.basho.com/Luwak.html
>
> > 4) Is range access possible to read parts of a file^W value, or do I need to stream the whole file through? That would not perform well on huge values.
>
> With Luwak it's possible to get a portion of the object using the optional Range parameter: http://wiki.basho.com/HTTP-Fetch-Luwak-Object.html (see the short sketch below).
>
> > 5) To reduce the impact of a disk failure on the storage backend, I'd like each disk of a server to be assigned to its own Riak node. I guess healing the failed node after replacement is faster than RAID recovery, and less data is at risk.
> > Is it possible to reflect the hardware hierarchy in some way to influence the placement of replicas? CephFS offers this to make sure replicas are held on different hardware or even in different locations.
> > E.g. a STORAGE is in a SERVER, which is in a RACK, which is in a DATACENTER. Replicas of a file in a STORAGE should never be placed inside the same SERVER (or RACK, or DATACENTER).
>
> You can purchase Riak EDS, which has multi-site replication. Otherwise, Riak is just going to throw data onto N nodes in your cluster and it will be up to you to make sure those nodes are in different racks.
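As a rough illustration of the Luwak interface mentioned in the answer to question 4, here is a minimal Python sketch (not production code) of storing a large file and then fetching only a byte range over HTTP. It assumes Luwak is enabled and reachable at /luwak/<key> on the default HTTP port (8098); the host, key, and file names are invented for the example.

    # Minimal sketch: store a large file in Luwak and read back only a byte
    # range over Riak's HTTP interface. Host, key, and file names are invented.
    import requests

    RIAK = "http://127.0.0.1:8098"  # any node in the cluster

    def store_file(key, path, chunk_size=1024 * 1024):
        # Stream the file from disk so a multi-GB upload never sits in memory.
        with open(path, "rb") as f:
            resp = requests.put(
                f"{RIAK}/luwak/{key}",
                data=iter(lambda: f.read(chunk_size), b""),
                headers={"Content-Type": "application/octet-stream"},
            )
        resp.raise_for_status()

    def read_range(key, start, end):
        # Ask for bytes start..end (inclusive) with a standard HTTP Range
        # header instead of streaming the whole value through the client.
        resp = requests.get(
            f"{RIAK}/luwak/{key}",
            headers={"Range": f"bytes={start}-{end}"},
        )
        resp.raise_for_status()
        return resp.content

    if __name__ == "__main__":
        store_file("backup.tar", "/tmp/backup.tar")
        print(len(read_range("backup.tar", 0, 511)), "bytes fetched")

Streaming the upload keeps client memory flat even for the 512MB to 4GB values mentioned above, and the Range request returns only the requested bytes.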
> > 6) What happens if fewer than R or W nodes report data? Does it mean "not found" or "not available", even if the data is on a currently offline node?
>
> If fewer than R nodes respond, your read will fail. The R value means "this many nodes have to respond with data for it to be considered a successful read." Anything less than R would thus mean there was a failure.
>
> If fewer than W nodes are able to write data, a hinted handoff will occur.
>
> > 7) Can the client applications connect to some random node?
> > Should they simply retry the next one in the list upon failure?
>
> Client applications should connect to a random node, yes. Even better, you should put a load balancing proxy server in front of your Riak cluster so developers don't have to worry about writing their own load balancing code (see the retry sketch at the end of this thread).
>
> I'd retry on failure, but that's up to you. ;)
>
> > 8) Is the data reported back on a read compared/verified against all replicas to ensure consistency, or just its metadata (if R > 1)?
>
> Yes, R nodes have to respond with *the same* copy of the data before a read is successful. You can quickly do this by comparing vector clocks and other assorted metadata.
>
> > 9) Is data integrity in the storage backend secured through checksums?
>
> I think it depends on the storage backend implementation. Doing a quick grep through the source code turns up the word "checksum" a lot, though.
>
> > These are the questions puzzling me at the moment.
> > If you know some filesystem that matches my feature list, please don't hesitate to answer off-topic ;-)
>
> Other options include HDFS and MogileFS (http://danga.com/mogilefs/). Last.fm uses MogileFS.
>
> > Cheers,
> > pille
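To make the "connect to a random node and retry" advice from question 7 (and the R/W values from question 6) concrete, here is a small Python sketch against Riak's HTTP interface. The node addresses, bucket, and key are invented, and in practice a load balancer such as haproxy in front of the cluster is the simpler route.

    # Sketch of random node selection with retry; the r and w query parameters
    # override the bucket's default read/write quorum for a single request.
    # Node addresses and bucket/key names are invented for illustration.
    import random
    import requests

    NODES = ["http://riak1:8098", "http://riak2:8098", "http://riak3:8098"]

    def fetch(bucket, key, r=2, attempts=3):
        last_error = None
        for node in random.sample(NODES, min(attempts, len(NODES))):
            try:
                resp = requests.get(f"{node}/riak/{bucket}/{key}",
                                    params={"r": r}, timeout=5)
                if resp.status_code == 200:
                    return resp.content
                if resp.status_code == 404:
                    return None  # not found, as far as r replicas could tell
                last_error = RuntimeError(f"{node} returned {resp.status_code}")
            except requests.RequestException as exc:
                last_error = exc  # node down or unreachable, try another one
        raise last_error

    def store(bucket, key, value, w=2):
        # w replicas must acknowledge the write before Riak reports success.
        node = random.choice(NODES)
        resp = requests.put(f"{node}/riak/{bucket}/{key}",
                            data=value, params={"w": w},
                            headers={"Content-Type": "application/octet-stream"})
        resp.raise_for_status()

The same retry loop applies to writes; a proxy in front of the cluster simply moves that logic out of every client.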
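Finally, a very rough, read-only sketch of the FUSE-driver-as-Riak-client idea from the top of the thread, using the third-party fusepy bindings and Riak's HTTP interface. It only maps the keys of a single bucket to files; the bucket name and mount point are invented, listing keys this way is expensive on a real cluster, and the existing drivers linked above (e.g. riakfuse) go much further.

    # Read-only FUSE sketch: expose the keys of one Riak bucket as files.
    # Requires the third-party fusepy package (pip install fusepy); bucket
    # name and mount point are invented for illustration.
    import errno
    import stat
    import sys

    import requests
    from fuse import FUSE, FuseOSError, Operations

    RIAK = "http://127.0.0.1:8098"
    BUCKET = "fs"

    class RiakFS(Operations):
        def _get(self, path):
            return requests.get(f"{RIAK}/riak/{BUCKET}/{path.lstrip('/')}")

        def getattr(self, path, fh=None):
            if path == "/":
                return {"st_mode": stat.S_IFDIR | 0o755, "st_nlink": 2}
            resp = self._get(path)
            if resp.status_code != 200:
                raise FuseOSError(errno.ENOENT)
            return {"st_mode": stat.S_IFREG | 0o444, "st_nlink": 1,
                    "st_size": len(resp.content)}

        def readdir(self, path, fh):
            # ?keys=true lists every key in the bucket -- fine for a sketch,
            # expensive on a big production cluster.
            resp = requests.get(f"{RIAK}/riak/{BUCKET}",
                                params={"keys": "true", "props": "false"})
            return [".", ".."] + resp.json().get("keys", [])

        def read(self, path, size, offset, fh):
            resp = self._get(path)
            if resp.status_code != 200:
                raise FuseOSError(errno.ENOENT)
            return resp.content[offset:offset + size]

    if __name__ == "__main__":
        # Usage: python riakfs.py /mnt/riak
        FUSE(RiakFS(), sys.argv[1], foreground=True, ro=True)

Either of the existing drivers linked above is a better starting point than this; the sketch is only meant to show how little glue sits between FUSE callbacks and Riak's HTTP API.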