Re: Secondary Index

Sam Tunnicliffe Mon, 26 Jun 2017 01:50:06 -0700

The second query will be much more efficient as the partition key
restriction is applied before the one using the indexed column. Only the
replicas for that partition will be involved in query 2, whereas query 1
will be distributed to enough replicas to cover the whole token ring. Also
(& this doesn't make so much difference to the above queries), the indexes
are sorted by primary key of the rows they refer to. So the more of the
primary key you specify in the query, the more targeted the index lookup
becomes.


On 25 June 2017 at 16:18, techpyaasa . <techpya...@gmail.com> wrote:

> Thanks for the reply.
>
> I just have one more doubt, please do clarify this.
>
> Will be there any performance difference between these 2 queries for the
> above table.
>
> 1. select * from ks1.cf1 where status=1;
> 2. select * from ks1.cf1 where id1=123456 and status=1;
>
> where id1 is partition key and status is indexed column as I said above.
>
> Could you please tell me the performance difference btwn above 2 queries.
>
> Thanks in advance,
>
> Techpyaasaa
>
> On Tue, Jun 20, 2017 at 9:03 PM, ZAIDI, ASAD A <az1...@att.com> wrote:
>
>> Hey there –
>>
>>
>>
>> Like other suggested before adding more index , look for opportunity to
>> de-normalize your data model OR create composite keys for your primary
>> index – if that works for you.
>>
>> Secondary index are there so you can leverage them they come with cost.
>> They’re difficult to manage , as you repair data  your secondary index will
>> NOT be automatically repaired so you’ll need to maintain them
>>
>> On each cluster node. Depending on size of your cluster that could be a
>> significant effort. Be prepared to rebuild your new index (nodetool
>> rebuild_index) as often as you change the data . performance will
>> eventually get a hit cause index rebuilding is expensive operation on CPU
>> ..
>>
>>
>>
>> See please http://docs.datastax.com/en/cql/3.1/cql/ddl/ddl_when_use_ind
>> ex_c.html
>>
>>
>>
>>
>>
>>
>>
>> *From:* techpyaasa . [mailto:techpya...@gmail.com]
>> *Sent:* Tuesday, June 20, 2017 2:30 AM
>> *To:* ZAIDI, ASAD A <az1...@att.com>
>> *Cc:* user@cassandra.apache.org
>> *Subject:* Re: Secondary Index
>>
>>
>>
>> Hi ZAIDI,
>>
>> Thanks for reply.
>> Sorry I didn't get your line
>> "You can get away the potential situation by leveraging composite key, if
>> that is possible for you?"
>>
>> How can I get through it??
>>
>> Like I have a table as below
>>
>> CREATE TABLE ks1.cf1 (id1 bigint, id2 bigint, resp text, status int,
>> PRIMARY KEY (id1, id2)
>>
>> ) WITH CLUSTERING ORDER BY (id2 ASC)
>>
>>
>> 'status' will have values of 0/1/2/3/4 (4 possible values) , insertions
>> to table(partition) will happen based on id2 i.e values(id1,id2,resp,status)
>>
>> I want to have a filtering/criteria applied on 'status' column too like
>> select * from ks1.cf1 where id1=123 and status=0;
>>
>> How can I achieve this w/o secondary index (on 'status' column )??
>>
>>
>>
>> On Tue, Jun 20, 2017 at 12:09 AM, ZAIDI, ASAD A <az1...@att.com> wrote:
>>
>> If you’re only creating index so that your query work, think again!
>> You’ll be storing secondary index on each node , queries involving index
>> could create issues (slowness!!) down the road the when index on multiple
>> node Is involved and  not maintained!  Tables involving a lot of
>> inserts/delete could easily ruin index performance.
>>
>>
>>
>> You can get away the potential situation by leveraging composite key, if
>> that is possible for you?
>>
>>
>>
>>
>>
>> *From:* techpyaasa . [mailto:techpya...@gmail.com]
>> *Sent:* Monday, June 19, 2017 1:01 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Secondary Index
>>
>>
>>
>> Hi,
>>
>> I want to create Index on already existing table which has more than 3
>> GB/node.
>> We are using c*-2.1.17 with 2 DCs , each DC with 3 groups and each group
>> has 7 nodes.(Total 42 nodes in cluster)
>>
>> So is it ok to create Index on this table now or will it have any problem?
>> If its ok , how much time it would take for this process?
>>
>>
>> Thanks in advance,
>> TechPyaasa
>>
>>
>>
>
>

Re: Secondary Index

Reply via email to