Hello, I am using Cassandra 1.1.1 and CQL3.
Could you tell me what is the best strategy for retrieving a row using a condition on a row key (operator =) and also filter on a 2nd column? For example, I create a table named "testwhere" with a row key on column "mykey" and 2 other columns "col1" and "col2". I would like to retrieve the row with the key 'key1' only if col1 = 'abcd' I send the request SELECT mykey, col1 from testwhere where mykey = 'key1' and col1 = 'abcd'; As you can see, the 1st condition in the WHERE clause is based on the row key. However the request doesn't work if no secondary index is created on the column used in the 2nd condition of the WHERE clause. It works only if a secondary indexed is created on this 2nd column (see below). Does that mean that the secondary index is used in the read path instead of the row key, even if there is a condition on the row key in the WHERE clause ? Here is an example : jal@jal-VirtualBox:~/cassandra/apache-cassandra-1.1.1/bin$ ./cqlsh -3 Connected to Test Cluster at localhost:9160. [cqlsh 2.2.0 | Cassandra 1.1.1 | CQL spec 3.0.0 | Thrift protocol 19.32.0] Use HELP for help. cqlsh> use test1; cqlsh:test1> CREATE TABLE testwhere (mykey varchar PRIMARY KEY, ... col1 varchar, ... col2 varchar); cqlsh:test1> INSERT INTO testwhere (mykey, col1, col2) VALUES ('key1', 'abcd', 'efgh'); cqlsh:test1> SELECT mykey, col1 from testwhere where mykey = 'key1'; mykey | col1 -------+------ key1 | abcd cqlsh:test1> SELECT mykey, col1 from testwhere where mykey = 'key1' and col1 = 'abcd'; Bad Request: No indexed columns present in by-columns clause with Equal operator cqlsh:test1> CREATE INDEX col1_idx ON testwhere (col1); cqlsh:test1> SELECT mykey, col1 from testwhere where mykey = 'key1' and col1 = 'abcd'; mykey | col1 -------+------ key1 | abcd cqlsh:test1> My understanding is : The 1st SELECT is working because there is only the row key in the WHERE clause The 2nd SELECT is not working because the row key is in the WHERE clause, but there is no index on col1 The 3rd SELECT (which is the same as the 2nd SELECT) is working because the row key is in the WHERE clause, and a secondary index is created on col1 For this use case, what are the recommendations of the Cassandra community ? - do we need to create a secondary index for each column we want to filter ? - do we need to post-process (filter) the result of the query in our application ? - or is there another solution ? Thanks. Jean-Armel