Thanks Tyler. I was unaware of counters. The use case for column counts is really operational: it lets a sysadmin run ad hoc checks on columns to see whether something has gone wrong in software outside of Cassandra.
I think running a cassandra-cli command such as count, which makes Cassandra fall over, is not ideal, unless we can say that for X columns Cassandra needs at least Y memory allocation for stability.

Cheers
Dave

On Sun, Dec 12, 2010 at 6:39 PM, Tyler Hobbs <ty...@riptano.com> wrote:
> Cassandra has to deserialize all of the columns in the row for get_count().
> So from Cassandra's perspective, it's almost as much work as getting the
> entire row, it just doesn't have to send everything back over the network.
>
> If you're frequently counting 8 million columns (or really, anything
> significant), you need to use counters instead. If this is a rare
> occurrence, you can do the count in multiple chunks by using a starting and
> ending column in the SlicePredicate for each chunk, but this requires some
> rough knowledge about the distribution of the column names in the row.
>
> - Tyler
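
For reference, the chunked count described in the quoted message could be sketched roughly as below. This is only an illustration under assumptions: `ROW`, `count_range` and `chunked_count` are hypothetical names, and the row is simulated as an in-memory sorted list instead of calling a real Cassandra client. It assumes the start and finish of a slice range are both inclusive, so a column that sits exactly on a chunk boundary is counted twice and has to be subtracted once.

```python
from bisect import bisect_left, bisect_right

# Hypothetical stand-in for one wide row: a sorted list of column names.
# A real client would issue a ranged get_count call here instead.
ROW = ["col%07d" % i for i in range(100000)]

def count_range(start, finish):
    """Count columns with start <= name <= finish (both ends inclusive).

    An empty string means "unbounded" on that side.
    """
    lo = 0 if start == "" else bisect_left(ROW, start)
    hi = len(ROW) if finish == "" else bisect_right(ROW, finish)
    return max(hi - lo, 0)

def chunked_count(boundaries):
    """Sum per-chunk counts, where boundaries are column names splitting the row.

    Choosing useful boundaries needs rough knowledge of how the column
    names are distributed, so that no single chunk is too large.
    """
    cuts = sorted(set(boundaries))
    edges = [""] + cuts + [""]
    total = 0
    for start, finish in zip(edges, edges[1:]):
        total += count_range(start, finish)
    # Each boundary that exists as a column was counted in two chunks.
    for cut in cuts:
        total -= count_range(cut, cut)
    return total

if __name__ == "__main__":
    print(chunked_count(["col0025000", "col0050000", "col0075000"]))
```

Each call to `count_range` only has to touch one slice of the row, which is the point of the approach: no single request needs to deserialize all of the columns at once.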