On Tue, Nov 18, 2014 at 1:57 PM, Jacob Rhoden <jacob.rho...@me.com> wrote:

>
>
> If there are two nodes and RF=2, a simple data lookup on a very small
> table will only need to talk to one node if you put in a (fake/pointless)
> partition key, but two nodes if you don't. The impact is that if you want
> 1ms query time on a 50 row table you must use a partition key, otherwise
> you get 250ms query time.
>

A query that's restricted to a single partition key is fundamentally
different from an unrestricted table scan (Cassandra doesn't know that you
only have one partition).  So seeing a performance difference between the
two is not unexpected.

However, there were some semi-recent changes that improved the likelihood
that neighboring range scans will be merged if one node is a replica for
both ranges: CASSANDRA-6465 and CASSANDRA-7535.  In your case, all of the
ranges could potentially be merged to form a single range scan.  If you're
seeing this behavior on Cassandra 2.0.10+, it *may* be a bug.
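
To illustrate the merging idea, here is a toy Python model. It is only a sketch under my own assumptions, not Cassandra's actual code: the function name merge_ranges, the integer tokens, and the node names are all hypothetical. Adjacent token ranges are merged whenever they still share at least one replica, so with two nodes and RF=2 the whole ring collapses into one scan.

```python
def merge_ranges(ranges):
    """Merge contiguous token ranges that share at least one replica.

    `ranges` is a list of (start, end, replicas) tuples in ring order.
    This is a simplified illustration, not Cassandra's implementation.
    """
    merged = []
    for start, end, replicas in ranges:
        replicas = set(replicas)
        if merged:
            prev_start, prev_end, prev_replicas = merged[-1]
            common = prev_replicas & replicas
            # Merge only if the ranges touch and some node can serve both.
            if prev_end == start and common:
                merged[-1] = (prev_start, end, common)
                continue
        merged.append((start, end, replicas))
    return merged


# Two nodes, RF=2: every range is replicated on both nodes.
two_node_ring = [
    (0, 50, {"node1", "node2"}),
    (50, 100, {"node1", "node2"}),
]

# Three nodes, RF=1: no two neighboring ranges share a replica.
three_node_ring = [
    (0, 33, {"a"}),
    (33, 66, {"b"}),
    (66, 100, {"c"}),
]

print(len(merge_ranges(two_node_ring)))    # one merged range
print(len(merge_ranges(three_node_ring)))  # ranges stay separate
```

In the two-node case a single replica can answer the merged scan, which is why an unrestricted query on such a cluster could in principle be served by one node.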


>
>
> >> Why is this a bug? It seems that this behaviour of needing a response to
> >> both nodes only exists if you don’t query with a clustering key, or a key
> >> when RF=2. However you can change this behaviour, by, for example, changing
> >> the table from “primary key (uuid)” to “primary key ((a), uuid)” where the
> >> value of a always equals “a” ( so you can query 'where a=“a”’), at which
> >> point, cassandra decides it only needs results from one node.
> >
> > Can you clarify what you mean?  It sounds like you're saying "if I specify
> > a partition key, it only needs to query one node", which is also expected
> > behavior (assuming a consistency level of ONE).
>
> It's my understanding from the documentation that consistency level one
> (which I am using) is about write consistency, not read consistency.
> Actually if I change it in cqlsh, cqlsh refuses to run the query.
>

The consistency level matters for both reads and writes: in each case it
sets how many replicas must respond before the operation is considered
successful.
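
As a rough illustration, the number of replicas that must respond can be worked out from the consistency level and the replication factor. The Python helper below is a hypothetical sketch (replicas_required is not a driver or Cassandra API) and covers only a few of the common levels; the same count applies whether the operation is a read or a write.

```python
def replicas_required(consistency_level, replication_factor):
    """Return how many replicas must respond for a read or a write.

    Hypothetical helper for illustration; only a few levels are covered.
    """
    level = consistency_level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError("unsupported consistency level: " + consistency_level)


# With RF=2, as in this thread:
for level in ("ONE", "QUORUM", "ALL"):
    print(level, replicas_required(level, 2))
```

With RF=2, only ONE lets a single replica satisfy the request; QUORUM and ALL both need two responses.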


-- 
Tyler Hobbs
DataStax <http://datastax.com/>
