This is a bit difficult. Depending on your access patterns and data volume, I'd be inclined to keep a separate table with a (count, foreign_key) clustering key, then do a client-side join to read the data back in the order you're looking for. That at least makes the heavily updated table much cheaper to update, at the cost of slower reads. And related values that haven't changed don't need to be deleted and re-inserted each time this one value changes.
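For concreteness, a rough sketch of the kind of schema I have in mind (table and column names are placeholders, not from your setup):

```sql
-- Authoritative counts live in a counter table, keyed by item id.
-- (Counter columns must live in their own table in Cassandra.)
CREATE TABLE item_likes (
    item_id uuid PRIMARY KEY,
    like_count counter
);

-- Separate sort table: clustered by (count, id), so a range scan over one
-- partition returns item ids in count order. Readers then fetch item
-- details by id from the main item table (the client-side join).
CREATE TABLE items_by_count (
    bucket text,                 -- partition key; a candidate for rotation
    like_count bigint,
    item_id uuid,
    PRIMARY KEY (bucket, like_count, item_id)
) WITH CLUSTERING ORDER BY (like_count DESC, item_id ASC);
```

Updating an item's count then means deleting its old (like_count, item_id) entry and inserting the new one, but only in this narrow table rather than the full item row.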
But like you said, this is a read-then-write operation; over time you'll accumulate a lot of tombstones, and your data's accuracy may suffer.

I would also recommend rotating your partition keys: have a background process that trues up your object-by-count table into a new partition key on some schedule you determine. Live updates write to partition keys *n* and *n*+1, your truing-up process trues up *n*+1, and then your read process switches to reading from *n*+1. When all readers are done with *n*, you can delete that whole row, and because nobody is reading from it any longer, it doesn't matter how many tombstones it accumulated. I suggest using a timestamp for the partition key so it's easy to reason about, and you can rotate it on whatever schedule makes sense for you. If there's heavy write contention, your data will always be off by a little (due to race conditions between the truing-up process and the live writers), but it will correct itself over time.

On Sat, Dec 27, 2014 at 10:15 AM, ziju feng <pkdog...@gmail.com> wrote:

> I need to sort data on a frequently updated column, such as the like count
> of an item. The common way of getting data sorted in Cassandra is to have
> the column to be sorted on as a clustering key. However, whenever such a
> column is updated, we need to delete the row with the old value and insert
> the new one, which not only can generate a lot of tombstones, but also
> requires a read-before-write if we don't know the original value (such as
> when using a counter table to maintain the count and propagating it to the
> table that needs to sort on the count).
>
> I was wondering what the best practice is for such a use case? I'm
> currently using DSE search to handle it but I would like to see a
> Cassandra-only solution.
>
> Thanks.
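To make the rotation concrete, here's a hedged CQL sketch of the double-write/switch/delete cycle. Daily buckets and all names here are assumptions; pick whatever granularity matches your rotation schedule.

```sql
-- Live update: move one item from its old count to its new count,
-- in both the current bucket n ('2014-12-27') and the next bucket n+1
-- ('2014-12-28'), so n+1 is warm before readers switch to it.
BEGIN BATCH
  DELETE FROM items_by_count
   WHERE bucket = '2014-12-27' AND like_count = 41
     AND item_id = 123e4567-e89b-12d3-a456-426655440000;
  INSERT INTO items_by_count (bucket, like_count, item_id)
  VALUES ('2014-12-27', 42, 123e4567-e89b-12d3-a456-426655440000);
  DELETE FROM items_by_count
   WHERE bucket = '2014-12-28' AND like_count = 41
     AND item_id = 123e4567-e89b-12d3-a456-426655440000;
  INSERT INTO items_by_count (bucket, like_count, item_id)
  VALUES ('2014-12-28', 42, 123e4567-e89b-12d3-a456-426655440000);
APPLY BATCH;

-- Readers do a single ordered range scan over the current bucket.
SELECT like_count, item_id
  FROM items_by_count
 WHERE bucket = '2014-12-27'
 LIMIT 100;

-- Once all readers have moved to '2014-12-28', drop the old partition
-- wholesale; since nobody reads it, its tombstones cost nothing.
DELETE FROM items_by_count WHERE bucket = '2014-12-27';
```

The truing-up process would rebuild bucket *n*+1 from the counter table before the readers flip over, which is what corrects any drift from races between it and the live writers.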