On Tue, Nov 18, 2014 at 1:57 PM, Jacob Rhoden <jacob.rho...@me.com> wrote:
>
> If there are two nodes and RF=2, a simple data lookup on a very small
> table will only need to talk to one node if you put in a (fake/pointless)
> partition key, but two nodes if you don't. The impact is that if you want
> 1ms query time on a 50 row table you must use a partition key, otherwise
> you get 250ms query time.
>

A query that's restricted to a single partition key is fundamentally
different from an unrestricted table scan (Cassandra doesn't know that you
only have one partition). So seeing a performance difference between the
two is not unexpected.

However, there were some semi-recent changes that improved the likelihood
that neighboring range scans will be merged if one node is a replica for
both ranges: CASSANDRA-6465 and CASSANDRA-7535. In your case, all of the
ranges could potentially be merged to form a single range scan. If you're
seeing this behavior on Cassandra 2.0.10+, it *may* be a bug.

>
> >> Why is this a bug? It seems that this behaviour of needing a response to
> >> both nodes only exists if you don’t query with a clustering key, or a key
> >> when RF=2. However you can change this behaviour, by, for example, changing
> >> the table from “primary key (uuid)” to “primary key ((a), uuid)” where the
> >> value of a always equals “a” ( so you can query 'where a=“a”’), at which
> >> point, cassandra decides it only needs results from one node.
> >
> > Can you clarify what you mean? It sounds like you're saying "if I specify
> > a partition key, it only needs to query one node", which is also expected
> > behavior (assuming a consistency level of ONE).
>
> It's my understanding from the documentation that consistency level one
> (which I am using) is about write consistency, not read consistency.
> Actually if I change it in cqlsh, cqlsh refuses to run the query.
>

The consistency level matters for both reads and writes.

-- 
Tyler Hobbs
DataStax <http://datastax.com/>
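
To illustrate the range-merging point above, here is a toy Python model of the idea. It is not Cassandra's actual code: the ring layout, the SimpleStrategy-like replica placement, and the function names replicas_for_ranges and merge_ranges are all assumptions made for this sketch. It shows why, with two nodes and RF=2, every adjacent token range shares a replica, so an unrestricted scan could in principle collapse into a single sub-request.

```python
from typing import List, Set, Tuple


def replicas_for_ranges(ring: List[Tuple[int, str]], rf: int) -> List[Set[str]]:
    """Return the replica set of each token range, in ring order.

    Toy placement (an assumption, loosely SimpleStrategy-like): the node
    owning the range plus the next distinct nodes clockwise, up to rf nodes.
    """
    result = []
    n = len(ring)
    for i in range(n):
        replicas: List[str] = []
        j = i
        # Walk clockwise at most once around the ring.
        while len(replicas) < rf and j < i + n:
            node = ring[j % n][1]
            if node not in replicas:
                replicas.append(node)
            j += 1
        result.append(set(replicas))
    return result


def merge_ranges(range_replicas: List[Set[str]]) -> List[Set[str]]:
    """Greedily merge neighboring ranges that still share a replica.

    Each returned set holds the nodes able to serve one merged range scan.
    """
    merged: List[Set[str]] = []
    current = None
    for replicas in range_replicas:
        if current is not None and current & replicas:
            # A common replica exists, so the neighbor joins this scan.
            current = current & replicas
        else:
            if current is not None:
                merged.append(current)
            current = set(replicas)
    if current is not None:
        merged.append(current)
    return merged


# Hypothetical two-node ring with two token ranges per node.
ring = [(0, "A"), (10, "B"), (20, "A"), (30, "B")]
print(len(merge_ranges(replicas_for_ranges(ring, rf=2))))
print(len(merge_ranges(replicas_for_ranges(ring, rf=1))))
```

In this model the RF=2 case merges into one scan, while RF=1 leaves four separate scans because neighboring ranges have no replica in common. Whether a real cluster merges the ranges this way depends on the Cassandra version and the tickets mentioned above.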