Mike mentioned "increment" in his initial post. That made me think of a case with increments and fetching a top list by a counter, like https://redis.io/commands/zincrby and https://redis.io/commands/zrangebyscore
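To make that case concrete, here is a minimal Python sketch of the two operations involved: incrementing a member's counter and fetching a top list. It is only an in-memory model, not the redis or Cassandra API. The class name `CounterSet` and its methods `incr` and `top` are hypothetical names chosen for illustration.

```python
import heapq
from collections import defaultdict


class CounterSet:
    """Hypothetical in-memory model of one set: member -> counter."""

    def __init__(self):
        self._scores = defaultdict(int)

    def incr(self, member, amount=1):
        # Write path: bump the counter; no ordering is maintained here.
        self._scores[member] += amount
        return self._scores[member]

    def top(self, n):
        # Read path: order by counter on demand; cost grows with set size.
        return heapq.nlargest(n, self._scores.items(), key=lambda kv: kv[1])


scores = CounterSet()
for member, amount in [("alice", 3), ("bob", 5), ("alice", 4), ("carol", 1)]:
    scores.incr(member, amount)

top_two = scores.top(2)
print(top_two)
```

In this model the increments are cheap and the sorting happens only when a top list is requested, which fits a workload that writes far more often than it reads.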
1. Cassandra is absolutely not made to sort by a counter (or by a non-counter numeric incrementing value), but it is made to store counters. In this case a partition could be seen as a set.
2. I thought of Cassandra for persistence and, depending on app requirements like real-time needs and set size, still using redis as a read cache.

2017-01-14 18:45 GMT+01:00 Jonathan Haddad <j...@jonhaddad.com>:

> Sorted sets don't have a requirement of incrementing / decrementing.
> They're commonly used for things like leaderboards where the values are
> arbitrary.
>
> In Redis they are implemented with 2 data structures for efficient lookups
> of either key or value. No getting around that as far as I know.
>
> In Cassandra they would require using the score as a clustering column in
> order to select top N scores (and paginate). That means a tombstone
> whenever the value for a key in the set changes. In sets with high rates of
> change that means a lot of tombstones and thus terrible performance.
>
> On Sat, Jan 14, 2017 at 9:40 AM DuyHai Doan <doanduy...@gmail.com> wrote:
>
>> Sorting on an "incremented" numeric value has always been a nightmare to
>> be done properly in C*
>>
>> Either use Counter type but then no sorting is possible since counter
>> cannot be used as type for clustering column (which allows sort)
>>
>> Or use simple numeric type on clustering column but then to increment the
>> value *concurrently* and *safely* it's prohibitive (SELECT to fetch current
>> value + UPDATE ... IF value = <old_value>) + retry
>>
>> On Sat, Jan 14, 2017 at 8:54 AM, Benjamin Roth <benjamin.r...@jaumo.com>
>> wrote:
>>
>> If your proposed solution is crazy depends on your needs :)
>> It sounds like you can live with not-realtime data. So it is ok to cache
>> it. Why preproduce the results if you only need 5% of them? Why not use
>> redis as a cache with expiring sorted sets that are filled on demand from
>> cassandra partitions with counters?
>> So redis has much less to do and can scale much better. And you are not
>> limited on keeping all data in ram as cache data is volatile and can be
>> evicted on demand.
>> If this is effective also depends on the size of your sets. CS won't be
>> able to sort them by score for you, so you will have to load the complete
>> set to redis for caching and / or do sorting in your app on demand. This
>> certainly won't work out well with sets with millions of entries.
>>
>> 2017-01-13 23:14 GMT+01:00 Mike Torra <mto...@demandware.com>:
>>
>> We currently use redis to store sorted sets that we increment many, many
>> times more than we read. For example, only about 5% of these sets are ever
>> read. We are getting to the point where redis is becoming difficult to
>> scale (currently at >20 nodes).
>>
>> We've started using cassandra for other things, and now we are
>> experimenting to see if having a similar 'sorted set' data structure is
>> feasible in cassandra. My approach so far is:
>>
>> 1. Use a counter CF to store the values I want to sort by
>> 2. Periodically read in all key/values in the counter CF and sort in
>> the client application (~every five minutes or so)
>> 3. Write back to a different CF with the ordered keys I care about
>>
>> Does this seem crazy? Is there a simpler way to do this in cassandra?

--
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer