"add an additional integer column to the partition key (making it a composite partition key if it isn't already). When inserting, randomly pick a value between, say, 0 and 10 to use for this column"

--> With so few bucket values (only 10), there is no guarantee that the partitions will be distributed evenly across the nodes. But it's better than nothing.
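
To make the random-bucket idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: an in-memory dict stands in for the table, and the names `NUM_BUCKETS`, `insert_update` and `read_minute` are hypothetical, not part of any driver or schema from this thread. It writes each row to a random sub-partition of the minute and reads by merging all of them.

```python
import heapq
import random
from collections import defaultdict

NUM_BUCKETS = 10  # assumed bucket count, matching "read all 10 partitions"

# In-memory stand-in for a table whose partition key is (minute_bucket, bucket).
# Each partition holds (modified_at, item_id) rows kept in sorted order.
table = defaultdict(list)


def insert_update(minute_bucket, modified_at, item_id, rng=random):
    """Write one row into a randomly chosen sub-partition of the minute."""
    bucket = rng.randrange(NUM_BUCKETS)  # integer in 0..NUM_BUCKETS-1
    partition = table[(minute_bucket, bucket)]
    partition.append((modified_at, item_id))
    partition.sort()  # emulate clustering order within the partition


def read_minute(minute_bucket):
    """Read every sub-partition of the minute and merge by modification time."""
    partitions = [table.get((minute_bucket, b), []) for b in range(NUM_BUCKETS)]
    return list(heapq.merge(*partitions))
```

The read side pays for the spread: one logical query becomes `NUM_BUCKETS` partition reads plus a merge, so the bucket count is a trade-off between write distribution and read fan-out.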
"Alternatively, instead of using a random number, you could hash the other key components and use the lowest bits for the value. This has the advantage of being deterministic"

--> Does it work with vnodes, where the token range is split into 256 ranges and shuffled across all nodes?

On Tue, Aug 12, 2014 at 7:39 PM, Tyler Hobbs <ty...@datastax.com> wrote:

> On Mon, Aug 11, 2014 at 4:17 PM, Ian Rose <ianr...@fullstory.com> wrote:
>
>> "You better off create a manuel reverse-index to track modification
>> date, something like this" --> I had considered an approach like this but
>> my concern is that for any given minute *all* of the updates will be
>> handled by a single node, right? For example, if the minute_bucket is 2739
>> then for that one minute, every single item update will flow to the node at
>> HASH(2739). Assuming I am thinking about that right, that seemed like a
>> potential scaling bottleneck, which scared me off that approach.
>
> If you're concerned about bottlenecking on one node (or set of replicas)
> during the minute, add an additional integer column to the partition key
> (making it a composite partition key if it isn't already). When inserting,
> randomly pick a value between, say, 0 and 10 to use for this column. When
> reading, read all 10 partitions and merge them. (Alternatively, instead of
> using a random number, you could hash the other key components and use the
> lowest bits for the value. This has the advantage of being deterministic.)
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
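
For the deterministic variant quoted above, a rough Python sketch might look like this. The names `NUM_BUCKETS` and `bucket_for` are hypothetical, SHA-256 is just one convenient stable hash (the thread does not name one), and the sketch assumes the hashed components include something that varies per row, such as an item id. Hashing only minute_bucket would send the whole minute to a single bucket again.

```python
import hashlib

NUM_BUCKETS = 8  # assumed; a power of two so the low bits can be masked off


def bucket_for(*key_components):
    """Derive a stable bucket from the other key components."""
    raw = "\x00".join(str(c) for c in key_components).encode("utf-8")
    digest = hashlib.sha256(raw).digest()
    # Take the last byte of the digest and keep only its lowest bits.
    return digest[-1] & (NUM_BUCKETS - 1)
```

Because the same components always map to the same bucket, a writer can recompute the bucket for a given item instead of remembering a random choice, e.g. `bucket_for("item-42")` returns the same value on every call and in every process. Python's built-in `hash()` is avoided here because it is randomized per process for strings.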