Listing large key spaces, and bucket Links header

2010-09-06 Thread Gavin Carr
Greetings all,

I'm a riak newbie, trying out version 0.12.1. One of the use-cases
we're interested in is using riak as a backend for brackup[1], an
open source backup tool. 

brackup supports pluggable targets/backends, including filesystems, 
ftp, sftp, Amazon S3, etc. I've written a first-pass riak target 
that I'm testing, which works nicely for small backups. I'm now
looking to scale that up, and had a couple of questions.


1. With one exception, brackup only needs per-key lookups and writes. 
The exception is garbage collection, where I need to walk 
the entire set of keys to figure out which chunks are orphaned and 
can therefore be deleted.

So I'm wondering whether there is an upper limit on the number of keys
where "listing keys is expensive" turns into "listing keys is insane"?
I'm looking at millions of keys/chunks for large backups, I guess.

I guess splitting chunks over multiple buckets and performing
multiple queries might help. Is there a recommended upper limit
for keys per bucket on bitcask for sane list-keys performance?


2. There seems to be a standard Link header coming back on my 
bucket key queries that is huge - twice the size of the response
body with my 45b keys. So for 50k keys the response is about 1MB,
and the Link header is about 2MB! I'm wondering if there's any
way of turning this off, given I'm not doing any Link walking?
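
To make question 1 concrete, here is a rough Python sketch of the garbage-collection pass combined with the "split chunks over multiple buckets" idea. It is only an illustration under stated assumptions: `bucket_for`, `find_orphans`, the `brackup-chunks-NN` bucket names, and the `list_keys` callback are all hypothetical and are not part of brackup or of any riak client API. The callback stands in for whatever key-listing request the riak target would actually make.

```python
import hashlib


def bucket_for(chunk_key, n_buckets=16, prefix="brackup-chunks"):
    """Map a chunk key to one of n_buckets bucket names.

    The naming scheme is an assumption made for this sketch.
    """
    digest = hashlib.sha1(chunk_key.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % n_buckets
    return f"{prefix}-{shard:02d}"


def find_orphans(list_keys, buckets, referenced_keys):
    """Return the stored chunk keys that no backup references.

    list_keys is a caller-supplied function (bucket name -> iterable of
    keys); it is a placeholder for the real key-listing call.
    """
    referenced = set(referenced_keys)
    orphans = set()
    for bucket in buckets:
        # One listing per bucket, so no single response carries every key.
        orphans.update(k for k in list_keys(bucket) if k not in referenced)
    return orphans


if __name__ == "__main__":
    # In-memory stand-in for the cluster: bucket name -> list of keys.
    store = {}
    all_keys = [f"chunk-{i}" for i in range(10)]
    for key in all_keys:
        store.setdefault(bucket_for(key, n_buckets=4), []).append(key)

    still_referenced = all_keys[:8]  # the last two chunks are orphaned
    buckets = sorted(store)
    print(sorted(find_orphans(lambda b: store[b], buckets, still_referenced)))
```

Hashing the key to pick a bucket keeps per-key lookups cheap, since the bucket can be recomputed from the key alone. Whether the smaller per-bucket listings actually make list keys cheaper on bitcask is exactly the open question above.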


Thanks,
Gavin


[1] http://code.google.com/p/brackup/


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Riak support added to brackup

2010-11-02 Thread Gavin Carr
Hi all,

Thought some of you might be interested to know that the recently-released
version 1.10 of brackup [1][2] has added support for using riak as a backup 
datastore. This allows a pretty scalable and performant large-backup storage
cluster to be built using commodity hardware, instead of using expensive SAN 
storage alternatives.

Brackup is a modern net-based backup system that supports deduplication, 
intelligent chunking, and gpg-based encryption, with support for storage on 
everything from local disk and remote ftp and sftp servers to Amazon S3, 
RackSpace CloudFiles, and now riak.

Cheers,
Gavin

[1] http://code.google.com/p/brackup/
[2] http://search.cpan.org/~bradfitz/Brackup/

