Re: OutOfMemory on count on cassandra 0.6.8 for large number of columns

Dave Martin Sun, 12 Dec 2010 00:07:58 -0800

Thanks Tyler. I was unaware of counters.

The use case for column counts is really from an operational perspective:
to allow a sysadmin to do ad hoc checks on columns to see if something
has gone wrong in software outside of Cassandra.


I think it is not ideal that running a cassandra-cli command such as
count can make Cassandra fall over, unless we can say that for X columns
Cassandra needs at least Y memory allocated to remain stable.

Cheers

Dave


On Sun, Dec 12, 2010 at 6:39 PM, Tyler Hobbs <ty...@riptano.com> wrote:
> Cassandra has to deserialize all of the columns in the row for get_count().
> So from Cassandra's perspective, it's almost as much work as getting the
> entire row, it just doesn't have to send everything back over the network.
>
> If you're frequently counting 8 million columns (or really, anything
> significant), you need to use counters instead.  If this is a rare
> occurrence, you can do the count in multiple chunks by using a starting and
> ending column in the SlicePredicate for each chunk, but this requires some
> rough knowledge about the distribution of the column names in the row.
>
> - Tyler
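
A minimal sketch of the "count in multiple chunks" idea from the quoted
reply, written as paging so that only one chunk is held in memory at a
time. It is an illustration only: fetch_slice is a hypothetical stand-in
for a slice query with an inclusive start column and a limit, not a real
Cassandra client call, and the row is simulated as a sorted list of
unique column names. Unlike fixed start/end chunks, paging from the last
column seen does not need prior knowledge of how the names are
distributed.

```python
from bisect import bisect_left


def fetch_slice(row, start, limit):
    """Hypothetical stand-in for a slice query (not a real client API).

    Returns up to `limit` column names >= `start`, in sorted order.
    `row` is a sorted list of unique column names simulating one row.
    """
    i = bisect_left(row, start)
    return row[i:i + limit]


def count_columns_paged(row, page_size=1000):
    """Count the columns in `row` one page at a time.

    Each page after the first starts at the last column name already
    seen. Because the start is inclusive, that column is returned again
    and has to be dropped so it is not counted twice.
    """
    if page_size < 2:
        raise ValueError("page_size must be at least 2")

    total = 0
    start = ""          # an empty start means "from the beginning of the row"
    first_page = True
    while True:
        page = fetch_slice(row, start, page_size)
        if not first_page:
            page = page[1:]  # drop the column counted on the previous page
        total += len(page)

        # A short page means the end of the row has been reached.
        full = page_size if first_page else page_size - 1
        if len(page) < full:
            return total

        start = page[-1]
        first_page = False


if __name__ == "__main__":
    simulated_row = sorted("col%07d" % i for i in range(10000))
    print(count_columns_paged(simulated_row, page_size=1000))
```

The trade-off is more round trips in exchange for a bounded amount of
data per request; the page size would need tuning against the heap
actually available.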
