I think what you want is a "clustering column".  When you model your data, you 
specify "partition columns", which are synonymous with the old Thrift-style 
"keys", and clustering columns.  When creating your PRIMARY KEY, you specify the 
partition column first; each subsequent column in the primary key is a 
clustering column. The clustering columns determine the order in which the data 
in that partition is stored on disk. 

For instance, if I were storing time series events for URLs, I might do 
something like this:

PRIMARY KEY(url, event_time)

This means that all events for a given URL will be stored contiguously, sorted 
by event_time, on the same node.
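
To make that concrete, here is a minimal sketch of what the full table 
definition might look like. The table name "events" matches the query in this 
message; the "payload" column and the column types are my assumptions for 
illustration, so adjust them to your own schema:

```
CREATE TABLE events (
    url        text,       -- partition column: decides which node holds the rows
    event_time timestamp,  -- clustering column: sort order inside the partition
    payload    text,       -- hypothetical data column, for illustration only
    PRIMARY KEY (url, event_time)
) WITH CLUSTERING ORDER BY (event_time ASC);
```

ASC is already the default clustering order; it is spelled out here only to 
show where you would put DESC if you wanted the newest events first.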

This allows the following type of query:

SELECT * FROM events WHERE url = 'http://devdazed.com' AND event_time > 
'2014-01-01' AND event_time < '2014-01-07';
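
For the paging you asked about, one approach is to remember the event_time of 
the last row you read and ask for the rows after it. A rough sketch, assuming a 
page size of 100 and that the last row of the previous page had the event_time 
shown below (both values are made up):

```
-- Next page: same partition, resume after the last event_time already seen.
SELECT * FROM events
 WHERE url = 'http://devdazed.com'
   AND event_time > '2014-01-03 12:00:00'
 LIMIT 100;
```

Because the restriction is on the partition column plus a range over the 
clustering column, each page should be served from a single partition as a 
contiguous slice.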

Make sense?



On May 30, 2014 at 1:10:51 AM, Kevin Burton (bur...@spinn3r.com) wrote:

I'm trying to grok this but I can't figure it out in CQL world.

I'd like to efficiently page through a table via primary key.

This way I only involve one node at a time and the reads on disk are 
contiguous.  

I would have assumed it was a combination of > pk and order by but that doesn't 
seem to work.

--
Founder/CEO Spinn3r.com
Location: San Francisco, CA
Skype: burtonator
blog: http://burtonator.wordpress.com
… or check out my Google+ profile

War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
people.
