On Fri, Jan 13, 2017 at 8:14 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:
> I've thought about this for years and have never arrived at a
> particularly great implementation. Your idea may be OK if the sets are
> very small and if the values don't change very often. But in a system
> where the values of the keys in the set change frequently (lots of
> tombstones) or the sets are large, I think you're going to experience
> quite a bit of pain.
>
> On Fri, Jan 13, 2017 at 2:14 PM Mike Torra <mto...@demandware.com> wrote:
>
> We currently use Redis to store sorted sets that we increment many, many
> times more than we read. For example, only about 5% of these sets are
> ever read. We are getting to the point where Redis is becoming difficult
> to scale (currently at >20 nodes).
>
> We've started using Cassandra for other things, and now we are
> experimenting to see if having a similar 'sorted set' data structure is
> feasible in Cassandra. My approach so far is:
>
> 1. Use a counter CF to store the values I want to sort by
> 2. Periodically read in all key/values in the counter CF and sort in
>    the client application (~every five minutes or so)
> 3. Write back to a different CF with the ordered keys I care about
>
> Does this seem crazy? Is there a simpler way to do this in Cassandra?
>

Redis is the other side of the coin. It is fast:

https://groups.google.com/forum/#!topic/redis-db/4TAItKMyUEE
http://stackoverflow.com/questions/6076342/is-there-a-practical-limit-to-the-number-of-elements-in-a-sorted-set-in-redis

But 320MB of memory for 2,000,000 email addresses is hard to scale. If you
are only maintaining a single list, great, but if you have millions of
lists this memory/cost profile is not ideal.
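
For reference, the three-step approach from the quoted message (counter CF, periodic client-side sort, write-back to a second CF) could be sketched roughly as below. This is only an illustration under stated assumptions: plain Python dicts stand in for the two column families, nothing here talks to Cassandra, and the names `counter_cf`, `ranked_cf`, `increment`, and `rebuild_ranking` are hypothetical.

```python
import heapq
from collections import defaultdict

# Hypothetical in-memory stand-ins for the two column families.
# counter_cf: set name -> member -> running count (the "counter CF")
# ranked_cf:  set name -> list of (member, count), highest count first
counter_cf = defaultdict(lambda: defaultdict(int))
ranked_cf = {}


def _rank_key(item):
    """Order by count descending, then by member name to break ties."""
    member, count = item
    return (-count, member)


def increment(set_name, member, delta=1):
    """Step 1: bump a counter. This is the frequent, write-heavy path."""
    counter_cf[set_name][member] += delta


def rebuild_ranking(set_name, top_n=None):
    """Steps 2 and 3: read every counter for one set, sort in the client,
    and write the ordered result to the second structure."""
    items = list(counter_cf.get(set_name, {}).items())
    if top_n is None:
        ordered = sorted(items, key=_rank_key)
    else:
        # Avoid a full sort when only the top few entries are needed.
        ordered = heapq.nsmallest(top_n, items, key=_rank_key)
    ranked_cf[set_name] = ordered
    return ordered
```

A periodic job (every five minutes or so, per the quoted message) would call `rebuild_ranking` for each set, and reads would be served from `ranked_cf`. In a real cluster, step 3 overwrites or deletes previously written rows on every pass, which is presumably where the tombstone concern in the quoted reply comes in.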