Re: Storing large collections.

2012-03-05 Thread Jeremiah Peschka
LevelDB will compress on disk via Google's Snappy compression routines. I think that's the only Riak backend that does compression. --- Jeremiah Peschka - Managing Director, Brent Ozar PLF, LLC Microsoft SQL Server MVP On Mar 5, 2012, at 8:47 PM, Eric Siegel wrote: > My next plan of

Re: Storing large collections.

2012-03-05 Thread Eric Siegel
> My next plan of attack is to store a collection of items under a given key, > approximately 1 million keys each with 6000 values. > > This sounds cumbersome. > Yes, it is true that I will have to deal with a whole bunch of sibling resolution and merging, but on the plus side, doing range qu

Re: Storing large collections.

2012-03-05 Thread Jeremiah Peschka
On Mar 5, 2012, at 7:09 PM, Eric Siegel wrote: > Originally, I had planned to map each of my items to their own key. > This was foolish as I estimate that I'll have around 6 billion keys, and this > simply won't fit into memory. This is only an issue if you're using bitcask (which is the defau

Storing large collections.

2012-03-05 Thread Eric Siegel
Originally, I had planned to map each of my items to their own key. This was foolish as I estimate that I'll have around 6 billion keys, and this simply won't fit into memory. My next plan of attack is to store a collection of items under a given key: approximately 1 million keys, each with 6000 values.
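As a rough sanity check on the numbers in this post, here is a small Python sketch. The item and per-key counts come from the post; `BYTES_PER_KEY_IN_MEMORY` is a made-up placeholder (an assumption, not a measured figure for any backend), included only to show how an in-memory key index scales with the number of keys.

```python
# Back-of-the-envelope sizing for the two layouts described in the post.
TOTAL_ITEMS = 6_000_000_000     # from the post: roughly 6 billion items
ITEMS_PER_KEY = 6_000           # from the post: roughly 6000 values per key
BYTES_PER_KEY_IN_MEMORY = 100   # ASSUMPTION: placeholder, not a measured value


def keys_needed(total_items: int, items_per_key: int) -> int:
    """Number of keys required when items are grouped (ceiling division)."""
    return -(-total_items // items_per_key)


def key_index_bytes(num_keys: int, bytes_per_key: int = BYTES_PER_KEY_IN_MEMORY) -> int:
    """Memory used by an index that holds every key in RAM, under the assumed cost."""
    return num_keys * bytes_per_key


if __name__ == "__main__":
    grouped = keys_needed(TOTAL_ITEMS, ITEMS_PER_KEY)
    print("keys, one item per key:", TOTAL_ITEMS)
    print("keys, grouped:         ", grouped)
    print("index bytes, ungrouped:", key_index_bytes(TOTAL_ITEMS))
    print("index bytes, grouped:  ", key_index_bytes(grouped))
```

Whatever the real per-key cost turns out to be, grouping 6000 items under one key cuts the key count, and so the size of any in-memory key index, by a factor of 6000.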

Re: Storing large collections in Riak (or any distributed store)

2011-02-09 Thread Alexander Sicular
Not surprising, re. Flickr. Don't be too clever when disk is cheap and only getting cheaper. Remember, we're in NoSQL land, where denormalization is ... the norm. @siculars on twitter http://siculars.posterous.com Sent from my iPhone On Feb 9, 2011, at 13:12, Jeremiah Peschka wrote: Incid

Re: Storing large collections in Riak (or any distributed store)

2011-02-09 Thread Jeremiah Peschka
Incidentally, this is also how Flickr handles writes - when you upload a photo it gets written to wherever your other photos go. When someone tags it or adds it to a group, it gets copied into that group. Unless, of course, it's all changed since the last time I looked for information about how

Re: Storing large collections in Riak (or any distributed store)

2011-02-09 Thread Alexander Sicular
The only way this is functional is if you implement a uniformly random hash function so that you know which key any given address will hash to. Separately, churn will eat you up if you constantly need to take addys out of your keys. Also, as mentioned elsewhere, Riak links won't work at these numbe

Re: Storing large collections in Riak (or any distributed store)

2011-02-09 Thread Scott Lystig Fritchie
Nathan Sobo wrote: ns> Is a key-value store actually inappropriate for this problem? No. One way to do it is to use a single KV key to store multiple addresses worth of info. Pick a relatively big number, 50K subscribers/key, though it may vary. Use a key naming scheme so that you can pre-cal
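The snippet above is cut off, so the following is only one possible reading of the idea, not the scheme the author went on to describe. It is a minimal Python sketch that hashes each address to one of a fixed number of bucket keys, so the key for any address can be computed without a lookup. The names `bucket_key` and `buckets_needed`, the `list_id:index` key format, the choice of SHA-1, and the lowercasing of addresses are all assumptions made for illustration.

```python
import hashlib

SUBSCRIBERS_PER_KEY = 50_000  # from the post: "50K subscribers/key, though it may vary"


def buckets_needed(list_size: int, per_key: int = SUBSCRIBERS_PER_KEY) -> int:
    """How many bucket keys a list of this size needs (at least one)."""
    return max(1, -(-list_size // per_key))


def bucket_key(list_id: str, email: str, num_buckets: int) -> str:
    """Compute the bucket key that holds this address (hypothetical naming scheme)."""
    # ASSUMPTION: addresses are compared case-insensitively and without padding.
    normalized = email.strip().lower()
    digest = hashlib.sha1(normalized.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce to a bucket index.
    index = int.from_bytes(digest[:8], "big") % num_buckets
    return f"{list_id}:{index:05d}"


if __name__ == "__main__":
    n = buckets_needed(3_000_000)
    print(n, bucket_key("list42", "someone@example.com", n))
```

Because the bucket index depends on `num_buckets`, changing the bucket count later moves most addresses to different keys, so the count would need to be fixed per list or paired with a migration step.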

Re: Storing large collections in Riak (or any distributed store)

2011-02-09 Thread Jeremiah Peschka
So, if I understand this correctly, you want to send out an email to a bunch of users on a list. Each of these users can also have an arbitrary number of attributes. In order to send the email, you'll need to retrieve both the email address AND the user's attributes. Listing keys is a slow operati

Storing large collections in Riak (or any distributed store)

2011-02-08 Thread Nathan Sobo
I've never used a data store like Riak, but I'm working with a client who wants to store a large number of large mailing lists. Each list is potentially a few million entries long, with each list entry consisting of an email address plus arbitrary key-value pairs. When a customer wants to send out