Re: Why Secondary indexes is so slowly by my test?

aaron morton Wed, 12 Dec 2012 23:08:17 -0800

The IndexClause for the get_indexed_slices takes a start key. You can page the 
results from your secondary index query by making multiple calls with a sane 
count and including a start key.
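
A minimal sketch of that paging loop in Python. It is not tied to any 
particular client library: fetch_page(start_key, count) is a hypothetical 
helper standing in for one get_indexed_slices call with an IndexClause, and 
make_fake_index is an in-memory stand-in so the loop can be run without a 
cluster. It assumes the start key is inclusive, so every page after the first 
repeats the last row of the previous page and that row is skipped.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Row = Tuple[str, Dict[str, str]]          # (row key, columns)
FetchPage = Callable[[str, int], List[Row]]


def page_index_query(fetch_page: FetchPage, page_size: int = 100) -> Iterator[Row]:
    """Yield all matching rows using repeated, bounded index queries."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would only ever
        # return the row we already have.
        raise ValueError("page_size must be at least 2")
    start_key = ""        # empty start key: begin at the first match
    first_page = True
    while True:
        rows = fetch_page(start_key, page_size)
        # Assumption: start_key is inclusive, so drop the repeated row.
        if not first_page and rows and rows[0][0] == start_key:
            rows = rows[1:]
        if not rows:
            return
        for row in rows:
            yield row
        start_key = rows[-1][0]   # resume from the last key seen
        first_page = False


def make_fake_index(keys: List[str]) -> FetchPage:
    """Hypothetical in-memory backend. It returns rows in sorted key
    order; a real cluster returns them in partitioner order."""
    ordered = sorted(keys)

    def fetch_page(start_key: str, count: int) -> List[Row]:
        matches = [k for k in ordered if k >= start_key]
        return [(k, {"tk": "X"}) for k in matches[:count]]

    return fetch_page


if __name__ == "__main__":
    fake = make_fake_index(["row%04d" % i for i in range(250)])
    total = sum(1 for _ in page_index_query(fake, page_size=100))
    print(total)
```

Each call asks for a bounded number of rows, so no single request has to 
build the whole result set.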


Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 13/12/2012, at 6:34 PM, Chengying Fang <cyf...@ngnsoft.com> wrote:

> You are right, Dean. The slowness is due to the large result set returned by 
> the query, not the index itself. In my tests, if the result has fewer than 
> 5000 rows, the query is very quick. But how should I limit the result? A row 
> limit seems like a good choice, but with a limit I may miss some of the rows 
> I want, because rows are not returned in the order the query needs.
> For example: CF User{I1,C1} with an index on I1. Query conditions: I1=foo, 
> order by C1. If I1=foo matches 10000 rows and I limit to 100, I can't get the 
> correct result ordered by C1. Also, we cannot always set a row range that 
> fulfills the query conditions when querying. Maybe I should redesign the CF 
> model to fix this.
>  
> ------------------ Original ------------------
> From:  "Hiller, Dean"<dean.hil...@nrel.gov>;
> Date:  Wed, Dec 12, 2012 10:51 PM
> To:  "user@cassandra.apache.org"<user@cassandra.apache.org>;
> Subject:  Re: Why Secondary indexes is so slowly by my test?
>  
> You could always try PlayOrm's query capability on top of Cassandra ;). It 
> works for us.
> 
> Dean
> 
> From: Chengying Fang <cyf...@ngnsoft.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Tuesday, December 11, 2012 8:22 PM
> To: user <user@cassandra.apache.org>
> Subject: Re: Why Secondary indexes is so slowly by my test?
> 
> Thanks, Low. We use composite columns as a substitute for queries with a 
> single inequality and several equality conditions. We will also give up on 
> Cassandra because of its weak query capability and instability. Many times 
> we have found our data mixed up in our cluster with no clear cause. For 
> example, with only two rows in one CF, row1-columnname1-columnvalue1 and 
> row2-columnname2-columnvalue2, the data sometimes becomes 
> row1-columnname1-columnvalue2 and row2-columnname2-columnvalue1. Notice the 
> wrong column values.
> 
> 
> ------------------ Original ------------------
> From:  "Richard Low"<r...@acunu.com>;
> Date:  Tue, Dec 11, 2012 07:44 PM
> To:  "user"<user@cassandra.apache.org>;
> Subject:  Re: Why Secondary indexes is so slowly by my test?
> 
> Hi,
> 
> Secondary index lookups are more complicated than normal queries so will be 
> slower. Items have to first be queried in the index, then retrieved from 
> their actual location. Also, inserting into indexed CFs will be slower (but 
> will get substantially faster in 1.2 due to CASSANDRA-2897).
> 
> If you need to retrieve large amounts of data with your query, you would be 
> better off changing your data model to not use secondary indexes.
> 
> Richard.
> 
> 
> On 7 December 2012 03:08, Chengying Fang <cyf...@ngnsoft.com> wrote:
> Hi guys,
> 
> I found secondary indexes too slow in my production deployment (Amazon large 
> instance) with Cassandra, so I ran the test again as described here. The 
> result is the same as in production. What's wrong with Cassandra, or with me?
> My test setup:
> A newly installed ubuntu-12.04 LTS, apache-cassandra-1.1.6, default 
> configuration, just one keyspace (test) and one CF (TestIndex):
> 
> CREATE COLUMN FAMILY TestIndex
>   WITH comparator = UTF8Type
>   AND key_validation_class = UTF8Type
>   AND default_validation_class = UTF8Type
>   AND column_metadata = [
>     {column_name: tk, validation_class: UTF8Type, index_type: KEYS},
>     {column_name: from, validation_class: UTF8Type},
>     {column_name: to, validation_class: UTF8Type},
>     {column_name: tm, validation_class: UTF8Type}
>   ];
> 
> 'tk' has just three values: 'A' (1000 rows), 'B' (1000 rows), and 'X' 
> (increased during the test).
> The test queries, run from cql:
> 1. Without index: select count(*) from TestIndex limit 1000000;
> 2. With index: select count(*) from TestIndex where tk='X' limit 1000000;
> With 60000 'X' rows inserted, the times are 1s and 12s.
> With 130000 'X' rows, the times are 2.3s and 33s.
> With 250000 'X' rows, the times are 3.8s and 53s.
> 
> Given this, what will the result be when 'X' reaches a billion rows? Can 
> secondary indexes be used in production? I hope the mistake is in my test. 
> Can anyone give some tips about it?
> Thanks in advance.
> fancy
> 
> 
> 
> --
> Richard Low
> Acunu | http://www.acunu.com | @acunu
>
