Re: Secondary Index on table with a lot of data crashes Cassandra

Ondřej Černoš Thu, 25 Apr 2013 01:28:17 -0700

Hi,

If you are able to reproduce the issue, file a ticket at
https://issues.apache.org/jira/browse/CASSANDRA. In my experience,
developers respond quickly to issues that are clearly bugs.


regards,

ondrej cernos


On Thu, Apr 25, 2013 at 10:03 AM, Tamar Rosen <ta...@correlor.com> wrote:

> Hi,
>
> We have a case of a reproducible crash, probably due to out of memory,
> but I don't understand why.
>
> The installation is currently single node.
>
> We have a column family with approx 50000 rows.
>
> In CQL, the CF definition is:
>
> CREATE TABLE users (
>   user_name text PRIMARY KEY,
>   big_json text,
>   status int);
>
> Each big_json can have 500K or more of data.
>
>
> There is also a secondary index on the status column.
>
> Status can have various values, but over 90% of all rows have status = 2.
>
>
> Calling:
>
> Select user_name from users limit 80000;
>
> is pretty fast.
>
>
> Calling:
>
> Select user_name from users where status = 1;
>
> is slower, even though much less data is returned.
>
>
> Calling:
>
> Select user_name from users where status = 2;
>
> always crashes.
>
>
> What are we doing wrong? Can it be that Cassandra is actually trying to read
> all the CF data rather than just the keys? (Actually, it doesn't need to go
> to the users CF at all: all the data it needs is in the index CF.)
>
> Also, in the code I am doing the same thing using an Astyanax index query
> with pagination, and the behavior is the same.
>
>
> Please help me:
>
> 1. solve the immediate issue
>
> 2. understand whether something in this use case indicates that we are not
> using Cassandra the way it is meant to be used.
>
>
> Thanks,
>
>
> Tamar Rosen
>
> Correlor.com
>
>
>
>
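
For scale, here is a rough estimate built only from the figures in the quoted
message. It assumes exactly 50,000 rows, 500 KB per big_json value (the stated
lower bound), and exactly 90% of rows with status = 2. Whether the indexed
query really reads the full rows is the poster's hypothesis and is not
confirmed anywhere in this thread; the sketch only shows how much data that
would be if it were true.

```python
# Back-of-envelope estimate from the numbers quoted in the thread.
# All three constants are assumptions taken from approximate statements
# ("approx 50000 rows", "500K or more", "over 90%"), not measurements.
TOTAL_ROWS = 50_000
BIG_JSON_BYTES = 500 * 1000   # lower bound per row
STATUS_2_PERCENT = 90         # lower bound on the share of rows with status = 2


def matching_rows(total_rows: int, percent: int) -> int:
    """Number of rows matched by the predicate, using integer arithmetic."""
    return total_rows * percent // 100


def payload_bytes(rows: int, bytes_per_row: int) -> int:
    """Bytes touched if every matched row is read in full."""
    return rows * bytes_per_row


rows_status_2 = matching_rows(TOTAL_ROWS, STATUS_2_PERCENT)
full_row_bytes = payload_bytes(rows_status_2, BIG_JSON_BYTES)

print(f"rows with status = 2: {rows_status_2}")
print(f"data if full rows are read: {full_row_bytes / 1e9:.1f} GB")
```

Under these assumptions the status = 2 predicate matches about 45,000 rows,
which amounts to roughly 22.5 GB if each row's big_json is read along with
the key. That is far more than a single node's heap would normally hold,
which is consistent with the out-of-memory suspicion in the quoted message.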
