Think of them as: PRIMARY KEY (partition_key[, range_key])
where the partition_key can be compounded as: (partition_key0 [, partition_key1, ...]) and the optional range_key can be compounded as: range_key0 [, range_key1 ...] If you do this: PRIMARY KEY (key1, key2) - then key1 is the partition_key and key2 is the range_key and queries will work that hash to key1 (the partition) using = or IN and specify a range on key2. But if you do this: PRIMARY key ((key1, key2)) then (key1, key2) is the compound partition key - there is no range key - and you can specify = on key1 and = or IN on key2 (but not a range). Anyway that's what I remember! Hope it helps. ml On Thu, Mar 13, 2014 at 11:27 AM, David Savage <davemssav...@gmail.com>wrote: > Great that works, thx! I probably would have never found that... > > It now makes me wonder in general when to use PRIMARY KEY (key1, key2) or > PRIMARY KEY ((key1, key2)), any examples would be welcome if you have the > time. > > Kind regards, > > Dave > > > On Thu, Mar 13, 2014 at 2:56 PM, Laing, Michael <michael.la...@nytimes.com > > wrote: > >> Create your table like this and it will work: >> >> CREATE TABLE test.documents (group text,id bigint,data >> map<text,text>,PRIMARY KEY ((group, id))); >> >> The extra parens catenate 'group' and 'id' into the partition key - IN >> will work on the last component of a partition key. >> >> ml >> >> >> On Thu, Mar 13, 2014 at 10:40 AM, David Savage <davemssav...@gmail.com>wrote: >> >>> Nope, upgraded to 2.0.5 and still get the same problem, I actually >>> simplified the problem a little in my first post, there's a composite >>> primary key involved as I need to partition ids into groups >>> >>> So the full CQL statements are: >>> >>> CREATE KEYSPACE test WITH replication = {'class':'SimpleStrategy', >>> 'replication_factor':3}; >>> >>> >>> CREATE TABLE test.documents (group text,id bigint,data >>> map<text,text>,PRIMARY KEY (group, id)); >>> >>> >>> INSERT INTO test.documents(id,group,data) VALUES >>> (0,'test',{'count':'0'}); >>> >>> INSERT INTO test.documents(id,group,data) VALUES >>> (1,'test',{'count':'1'}); >>> >>> INSERT INTO test.documents(id,group,data) VALUES >>> (2,'test',{'count':'2'}); >>> >>> >>> SELECT id,data FROM test.documents WHERE group='test' AND id IN (0,1,2); >>> >>> >>> Thanks for your help. >>> >>> >>> Kind regards, >>> >>> >>> /Dave >>> >>> >>> On Thu, Mar 13, 2014 at 2:00 PM, David Savage <davemssav...@gmail.com>wrote: >>> >>>> Hmmm that maybe the problem, I'm currently testing with 2.0.2 which got >>>> dragged in by the cassandra unit library I'm using for testing [1] I will >>>> try to fix my build dependencies and retry, thx. >>>> >>>> /Dave >>>> >>>> [1] https://github.com/jsevellec/cassandra-unit >>>> >>>> >>>> On Thu, Mar 13, 2014 at 1:56 PM, Laing, Michael < >>>> michael.la...@nytimes.com> wrote: >>>> >>>>> I have no problem doing this w 2.0.5 - what version of C* are you >>>>> using? Or maybe I don't understand your data model... attach 'creates' if >>>>> you don't mind. >>>>> >>>>> ml >>>>> >>>>> >>>>> On Thu, Mar 13, 2014 at 9:24 AM, David Savage >>>>> <davemssav...@gmail.com>wrote: >>>>> >>>>>> Hi Peter, >>>>>> >>>>>> Thanks for the help, unfortunately I'm not sure that's the problem, >>>>>> the id is the primary key on the documents table and the timestamp >>>>>> is the primary key on the eventlog table >>>>>> >>>>>> Kind regards, >>>>>> >>>>>> >>>>>> Dave >>>>>> >>>>>> On Thursday, 13 March 2014, Peter Lin <wool...@gmail.com> wrote: >>>>>> >>>>>>> >>>>>>> it's not clear to me if your "id" column is the KEY or just a >>>>>>> regular column with secondary index. >>>>>>> >>>>>>> queries that have IN on non primary key columns isn't supported yet. >>>>>>> not sure if that answers your question. >>>>>>> >>>>>>> >>>>>>> On Thu, Mar 13, 2014 at 7:12 AM, David Savage < >>>>>>> davemssav...@gmail.com> wrote: >>>>>>> >>>>>>>> Hi there, >>>>>>>> >>>>>>>> I'm experimenting using cassandra and have run across an error >>>>>>>> message which I need a little more information on. >>>>>>>> >>>>>>>> The use case I'm experimenting with is a series of document updates >>>>>>>> (documents being an arbitrary map of key value pairs), I would like to >>>>>>>> find >>>>>>>> the latest document updates after a specified time period. I don't >>>>>>>> want to >>>>>>>> store many copies of the documents (one per update) as the updates are >>>>>>>> often only to single keys in the map so that would involve a lot of >>>>>>>> duplicated data. >>>>>>>> >>>>>>>> The solution I've found that seems to fit best in terms of >>>>>>>> performance is to have two tables. >>>>>>>> >>>>>>>> One that has an event log of timeuuid -> docid and a second that >>>>>>>> stores the documents themselves stored by docid -> map<string, >>>>>>>> string>. I >>>>>>>> then run two queries, one to select ids that have changed after a >>>>>>>> certain >>>>>>>> time: >>>>>>>> >>>>>>>> SELECT id FROM eventlog WHERE timestamp>=minTimeuuid($minimumTime) >>>>>>>> >>>>>>>> and then a second to select the actual documents themselves >>>>>>>> >>>>>>>> SELECT id, data FROM documents WHERE id IN (0, 1, 2, 3, 4, 5, 6, 7…) >>>>>>>> >>>>>>>> However this then explodes on query with the error message: >>>>>>>> >>>>>>>> "Cannot restrict PRIMARY KEY part id by IN relation as a collection >>>>>>>> is selected by the query" >>>>>>>> >>>>>>>> Detective work lead me to these lines in >>>>>>>> org.apache.cassandra.cql3.statementsSelectStatement: >>>>>>>> >>>>>>>> // We only support IN for the last name and for >>>>>>>> compact storage so far >>>>>>>> // TODO: #3885 allows us to extend to non >>>>>>>> compact as well, but that remains to be done >>>>>>>> if (i != stmt.columnRestrictions.length - 1) >>>>>>>> throw new >>>>>>>> InvalidRequestException(String.format("PRIMARY KEY part %s cannot be >>>>>>>> restricted by IN relation", cname)); >>>>>>>> else if (stmt.selectACollection()) >>>>>>>> throw new >>>>>>>> InvalidRequestException(String.format("Cannot restrict PRIMARY KEY >>>>>>>> part %s >>>>>>>> by IN relation as a collection is selected by the query", cname)); >>>>>>>> >>>>>>>> It seems like #3885 will allow support for the first IF block >>>>>>>> above, but I don't think it will allow the second, am I correct? >>>>>>>> >>>>>>>> Any pointers on how I can work around this would be greatly >>>>>>>> appreciated. >>>>>>>> >>>>>>>> Kind regards, >>>>>>>> >>>>>>>> Dave >>>>>>>> >>>>>>> >>>>>>> >>>>> >>>> >>> >> >