Aaron, thanks for the super-rapid response. That clarifies a lot for me, but I think I am still wondering about one point embedded below.
________________________________
> From: aa...@thelastpickle.com
> Subject: Re: is the select result grouped by the value of the partition key?
> Date: Thu, 12 Sep 2013 14:19:06 +1200
> To: user@cassandra.apache.org
>
> > GROUP BY "feature",
> I would not think of it like that, this is about physical order of rows.
>
> > since it seems really important yet does not seem to be mentioned in the
> > CQL reference documentation.
> It's baked in, this is how the data is organised on the row.

Yes, I see, and I absolutely get the relevance of where columns are stored on disk to, say, doing INSERTs. But what I am wondering about is that, in the context of a SELECT, we seem to be relying on the Cassandra client API preserving that on-disk order while returning rows. My high-level understanding of how Cassandra handles a SELECT is (excuse incorrect terminology):

1. client connects to some node N
2. node N acts as a kind of coordinator and fires off the Thrift or binary-protocol messages to all other nodes to fetch rows off the memtables and/or disks
3. coordinator merges, truncates, etc. the sets from the nodes and returns one answer set to the client.

It is step 3 which has me wondering - does it explicitly preserve the on-disk order? In fact - does it simply keep each individual node's answer set separate? Is that how it works?

> http://www.datastax.com/dev/blog/thrift-to-cql3
> We often say the PRIMARY KEY is the PARTITION KEY and the GROUPING COLUMNS
> http://www.datastax.com/documentation/cql/3.0/webhelp/index.html#cql/cql_reference/create_table_r.html
>
> See also http://thelastpickle.com/blog/2013/01/11/primary-keys-in-cql.html
>
> > Is it something we can bet the farm and farmer's family on?
> Sure.
>
> > The kinds of scenarios where I am wondering if it's possible for
> > partition-key groups to get intermingled are :
> All instances of the table entity with the same value(s) for the
> PARTITION KEY portion of the PRIMARY KEY existing in the same storage
> engine row.
>
> > . what if the node containing primary copy of a row is down
> There is no primary copy of a row.
>
> > . what if there is a heavy stream of UPDATE activity from applications which
> >   connect to all nodes, causing different nodes to have different
> >   versions of replicas of same row?
> That's fine with me.
> It's only an issue when the data is read, and at that point the
> Consistency Level determines what we do.
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On 12/09/2013, at 7:43 AM, John Lumby <johnlu...@hotmail.com> wrote:
>
> > I would like to make quite sure about this implicit GROUP BY "feature",
> > since it seems really important yet does not seem to be mentioned in the
> > CQL reference documentation.
> >
> > Aaron, you said "yes" -- is that "yes, always, in all scenarios no matter what"
> > or "yes usually"? Is it something we can bet the farm and farmer's family on?
> >
> > The kinds of scenarios where I am wondering if it's possible for
> > partition-key groups to get intermingled are :
> >
> > . what if the node containing primary copy of a row is down and
> >   cassandra fetches this row from a replica on a different node
> >   (e.g. with CONSISTENCY ONE)
> > . what if there is a heavy stream of UPDATE activity from applications which
> >   connect to all nodes, causing different nodes to have different
> >   versions of replicas of same row?
> >
> > Can you point me to some place in the cassandra source code where this
> > grouping is ensured?
> >
> > Many thanks,
> > John Lumby
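
The three-step read path asked about in this message (client, coordinator, merge) can be sketched as a toy model. This is only an illustration under stated assumptions, not Cassandra's actual code: `Row`, `coordinator_merge` and `is_grouped` are hypothetical names, the integer `token` stands in for the partitioner's hash of the partition key, and each replica is assumed to return its rows already sorted by token and then by clustering value. Under those assumptions, merging on that same sort key keeps all rows of one partition contiguous, even when replicas hold different versions of a row.

```python
from heapq import merge
from itertools import groupby
from typing import NamedTuple


class Row(NamedTuple):
    token: int       # assumed stand-in for the hash of the partition key
    partition: str   # partition key value
    cluster: int     # clustering column value
    value: str
    ts: int          # write timestamp, used to pick the newest version


def sort_key(row):
    # Order assumed for every replica's result set.
    return (row.token, row.partition, row.cluster)


def coordinator_merge(replica_results):
    """Merge per-replica result sets (each sorted by sort_key) into one,
    keeping the newest version of any row returned by several replicas."""
    merged = merge(*replica_results, key=sort_key)
    return [max(versions, key=lambda r: r.ts)
            for _, versions in groupby(merged, key=sort_key)]


def is_grouped(rows):
    """True if each partition key occupies exactly one contiguous run."""
    runs = [k for k, _ in groupby(rows, key=lambda r: r.partition)]
    return len(runs) == len(set(runs))


# Two replicas with diverging versions of the same rows (made-up data).
replica_a = [Row(10, "alice", 1, "a1-old", 1),
             Row(10, "alice", 2, "a2", 1),
             Row(42, "bob", 1, "b1", 1)]
replica_b = [Row(10, "alice", 1, "a1-new", 2),
             Row(42, "bob", 1, "b1", 1),
             Row(42, "bob", 2, "b2", 1)]

result = coordinator_merge([replica_a, replica_b])
for row in result:
    print(row.partition, row.cluster, row.value)
print("grouped by partition key:", is_grouped(result))
```

In this model the grouping follows from the shared sort order rather than from keeping each node's answer set separate: the merge interleaves replicas row by row, but never places a row from one partition between two rows of another.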