On Sat, Jun 7, 2014 at 10:41 AM, Colin Clark <co...@clark.ws> wrote:

> It's an anti-pattern and there are better ways to do this.

Entirely possible :)
It would be nice to have a document with a bunch of common Cassandra design patterns. I've been trying to track down a pattern for this, and a lot of it is pieced together from different places and individual blog posts, so one has to reverse engineer it.

> I have implemented the paging algorithm you've described using wide rows
> and bucketing. This approach is a more efficient utilization of
> Cassandra's built in wholesome goodness.

So.. I assume the general pattern is to create buckets. You create something like 2^16 buckets, and the bucket is your partition key. Then you place a timestamp next to the bucket in the primary key. So essentially:

primary key( bucket, timestamp )

.. so to read from this bucket you essentially execute:

select * from foo where bucket = 100 and timestamp > 12345790 limit 10000;

> Also, I wouldn't let any number of clients (huge) connect directly the
> cluster to do this-put some type of app server in between to handle the
> comm's and fan out. You'll get better utilization of resources and less
> overhead in addition to flexibility of which data center you're utilizing
> to serve requests.

This is interesting… since the partition is the bucket, you could make some poor decisions based on the number of buckets.

For example, if you use 2^64 buckets, the number of items in each bucket is going to be rather small. So you're going to have tons of queries each fetching 0-1 rows (if you have a small amount of data).

But if you use very FEW buckets.. say 5, but you have a cluster of 1000 nodes, then you will have 5 of these buckets on 5 nodes, and the rest of the nodes without any data.

Hm..
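To put rough numbers on the bucket-count trade-off above, here is a minimal Python sketch. The helper names, the use of MD5 to pick a bucket, and the `payload` column are assumptions for illustration only and are not from this thread; the node bound also ignores replication.

```python
import hashlib

NUM_BUCKETS = 2 ** 16  # bucket count floated in the thread

# Table layout implied by "primary key( bucket, timestamp )".
# The payload column is a hypothetical placeholder.
CREATE_TABLE = """
CREATE TABLE foo (
    bucket int,
    timestamp bigint,
    payload text,
    PRIMARY KEY (bucket, timestamp)
);
"""


def bucket_for(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map an item key to a stable bucket id (hypothetical helper)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def read_query(bucket: int, after_ts: int, limit: int) -> str:
    """Build the paging query from the thread for one bucket."""
    return (
        f"select * from foo where bucket = {int(bucket)} "
        f"and timestamp > {int(after_ts)} limit {int(limit)};"
    )


def max_nodes_holding_data(num_buckets: int, num_nodes: int) -> int:
    """Upper bound on nodes owning at least one bucket (no replication)."""
    return min(num_buckets, num_nodes)


def expected_rows_per_bucket(total_rows: int, num_buckets: int) -> float:
    """Average rows per bucket if items spread evenly across buckets."""
    return total_rows / num_buckets


if __name__ == "__main__":
    print(read_query(100, 12345790, 10000))
    # Few buckets: most of a 1000-node cluster holds no data.
    print(max_nodes_holding_data(5, 1000))
    # Many buckets, little data: most reads come back empty.
    print(expected_rows_per_bucket(1_000_000, 2 ** 64))
    print(expected_rows_per_bucket(1_000_000, NUM_BUCKETS))
```

With these assumptions, 5 buckets can touch at most 5 of 1000 nodes, while 2^64 buckets over a million rows leaves nearly every bucket empty; 2^16 sits in between at roughly 15 rows per bucket.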
The byte ordered partitioner solves this problem because I can just pick a fixed number of buckets, and then this is the primary key prefix, and the data in a bucket can be split up across machines based on any arbitrary split, even in the middle of a 'bucket' …

--
Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
Skype: *burtonator*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>
War is peace. Freedom is slavery. Ignorance is strength. Corporations are people.