Thanks Ondřej! Definitely much easier. N.B., this is a new feature in 2.0.x; it will not work in 1.2.x.

cqlsh:scratch> SELECT * FROM foo WHERE partitionkey = 1 and (ck1, ck2) > (1,2) limit 2;
Bad Request: line 1:45 no viable alternative at input '('
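For the Java-driver side of this, here is a rough sketch of paging with the (ck1, ck2) tuple relation Ondřej describes below. It assumes a 2.0.x cluster, the foo table from the thread, and an illustrative contact point and keyspace name; a production version would bind values through a prepared statement rather than concatenating them. Note that ">" here means "later in clustering order" (ck1 ASC, ck2 DESC), which is why the pages below come out the way they do.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import java.util.List;

public class TuplePager {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("scratch");   // keyspace name is an assumption
        final int pageSize = 2;

        Integer lastCk1 = null, lastCk2 = null;         // page state: clustering key of the last row seen
        while (true) {
            String cql = "SELECT * FROM foo WHERE partitionkey = 1"
                    + (lastCk1 == null ? "" : " AND (ck1, ck2) > (" + lastCk1 + ", " + lastCk2 + ")")
                    + " LIMIT " + pageSize;
            List<Row> rows = session.execute(cql).all();
            for (Row row : rows) {
                // process the row, then remember where we are
                lastCk1 = row.getInt("ck1");
                lastCk2 = row.getInt("ck2");
            }
            if (rows.size() < pageSize) break;          // short or empty page: nothing left
        }
        cluster.close();
    }
}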
On Thu, Feb 12, 2015 at 8:44 AM, Ondřej Nešpor <ondrej.nes...@gmail.com> wrote:

> There is a much easier way to do that (and I suppose the Java driver does
> it this way):
>
> page 1:
>
> SELECT * FROM foo WHERE partitionkey = 1 limit 2;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   1 |   3 |    3 |    3
>             1 |   1 |   2 |    2 |    2
>
> page 2:
>
> SELECT * FROM foo WHERE partitionkey = 1 and (ck1, ck2) > (1,2) limit 2;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   1 |   1 |    1 |    1
>             1 |   2 |   3 |    6 |    6
>
> page 3:
>
> SELECT * FROM foo WHERE partitionkey = 1 and (ck1, ck2) > (2,3) limit 2;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   2 |   2 |    5 |    5
>             1 |   2 |   1 |    4 |    4
>
> Basically you pass ck1 and ck2 values from the last row of the previous
> result and tell C* you want results that are greater (next results in case
> the first clustering key is ASC).
>
> Andrew
>
> Dne 12.2.2015 v 15:50 Eric Stevens napsal(a):
>
> Your page state then needs to track the last ck1 and last ck2 you saw.
> Pages 2+ will end up needing to be up to two queries if the first query
> doesn't fill the page size.
>
> CREATE TABLE foo (
>     partitionkey int,
>     ck1 int,
>     ck2 int,
>     col1 int,
>     col2 int,
>     PRIMARY KEY ((partitionkey), ck1, ck2)
> ) WITH CLUSTERING ORDER BY (ck1 asc, ck2 desc);
>
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,1,1,1,1);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,1,2,2,2);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,1,3,3,3);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,2,1,4,4);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,2,2,5,5);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,2,3,6,6);
>
> If you're pulling the whole of partition 1 and your page size is 2, your
> first page looks like:
>
> *PAGE 1*
>
> SELECT * FROM foo WHERE partitionkey = 1 LIMIT 2;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   1 |   3 |    3 |    3
>             1 |   1 |   2 |    2 |    2
>
> You got enough rows to satisfy the page; your page state is taken from
> the last row: (ck1 = 1, ck2 = 2).
>
> *PAGE 2*
>
> Notice that you have a page state, and add some limiting clauses to the
> statement:
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 = 1 AND ck2 < 2 LIMIT 2;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   1 |   1 |    1 |    1
>
> Oops, we didn't get enough rows to satisfy the page limit, so we need to
> continue on; we just need one more:
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 > 1 LIMIT 1;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   2 |   3 |    6 |    6
>
> We have enough to satisfy page 2 now; our new page state: (ck1 = 2, ck2 = 3).
>
> *PAGE 3*
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 = 2 AND ck2 < 3 LIMIT 2;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   2 |   2 |    5 |    5
>             1 |   2 |   1 |    4 |    4
>
> Great, we satisfied this page with only one query; page state: (ck1 = 2, ck2 = 1).
>
> *PAGE 4*
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 = 2 AND ck2 < 1 LIMIT 2;
>
> (0 rows)
>
> Oops, our initial query was on the boundary of ck1, but this looks like
> any other time that the initial query returns < pageSize rows; we just move
> on to the next page:
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 > 2 LIMIT 2;
>
> (0 rows)
>
> Aha, we've exhausted ck1 as well, so there are no more pages; page 3
> actually pulled the last possible value, page 4 is empty, and we're all
> done. Generally speaking, you know you're done when your first clustering
> key is the only non-equality operator in the statement and you got no rows
> back.
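To make the bookkeeping above concrete, here is a rough Java-driver sketch of that same two-query paging. The contact point, keyspace name, and plain string queries are illustrative assumptions (a real implementation would use prepared statements); the page-state handling follows Eric's steps: equality on ck1 with ck2 < last, then ck1 > last if the page is still short, and stop once a page comes back empty.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import java.util.ArrayList;
import java.util.List;

public class ManualPager {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("scratch");    // keyspace name is an assumption
        final int pageSize = 2;

        Integer lastCk1 = null, lastCk2 = null;          // page state from the last row of the previous page
        while (true) {
            List<Row> page = new ArrayList<Row>();
            if (lastCk1 == null) {
                // Page 1: no page state yet.
                page.addAll(session.execute(
                        "SELECT * FROM foo WHERE partitionkey = 1 LIMIT " + pageSize).all());
            } else {
                // Query 1: stay inside the last ck1; ck2 is DESC, so "next" means ck2 < last.
                page.addAll(session.execute(
                        "SELECT * FROM foo WHERE partitionkey = 1 AND ck1 = " + lastCk1
                        + " AND ck2 < " + lastCk2 + " LIMIT " + pageSize).all());
                if (page.size() < pageSize) {
                    // Query 2: the page is still short, so move on to the next ck1.
                    page.addAll(session.execute(
                            "SELECT * FROM foo WHERE partitionkey = 1 AND ck1 > " + lastCk1
                            + " LIMIT " + (pageSize - page.size())).all());
                }
            }
            if (page.isEmpty()) break;                   // both queries came back empty: no more pages
            for (Row row : page) {
                // process the row ...
            }
            Row last = page.get(page.size() - 1);
            lastCk1 = last.getInt("ck1");                // new page state
            lastCk2 = last.getInt("ck2");
        }
        cluster.close();
    }
}

The same loop generalizes to more clustering columns, but every extra column adds another possible follow-up query per page, which is why the tuple form shown above is so much more convenient on 2.0.x.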
> On Wed, Feb 11, 2015 at 10:55 AM, Ajay <ajay.ga...@gmail.com> wrote:
>
>> Basically I am trying different queries with your approach.
>>
>> One such query is like
>>
>> Select * from mycf where <condition on partition key> order by ck1 asc,
>> ck2 desc, where ck1 and ck2 are clustering keys in that order.
>>
>> Here, how do we achieve pagination support?
>>
>> Thanks
>> Ajay
>>
>> On Feb 11, 2015 11:16 PM, "Ajay" <ajay.ga...@gmail.com> wrote:
>>
>>> Hi Eric,
>>>
>>> Thanks for your reply.
>>>
>>> I am using Cassandra 2.0.11, and in that I cannot append a condition like
>>> last clustering key column > value of the last row in the previous batch.
>>> It fails with "Preceding column is either not restricted or by a non-EQ
>>> relation." It means I need to specify an equal condition for all preceding
>>> clustering key columns. With this I cannot get the pagination correct.
>>>
>>> Thanks
>>> Ajay
>>>
>>> > I can't believe that everyone read & process all rows at once
>>> (without pagination).
>>>
>>> Probably not too many people try to read all rows in a table as a
>>> single rolling operation with a standard client driver. But those who do
>>> would use token() to keep track of where they are and be able to resume
>>> with that as well.
>>>
>>> But it sounds like you're talking about paginating a subset of data -
>>> larger than you want to process as a unit, but prefiltered by some other
>>> criteria which prevents you from being able to rely on token(). For this
>>> there is no general purpose solution, but it typically involves you
>>> maintaining your own paging state, typically keeping track of the last
>>> partitioning and clustering key seen, and using that to construct your next
>>> query.
>>>
>>> For example, we have client queries which can span several
>>> partitioning keys. We make sure that the list of partition keys generated
>>> by a given client query, List(Pq), is deterministic; then our paging
>>> state is the index offset of the final Pq in the response, plus the
>>> value of the final clustering column. A query coming in with a paging
>>> state attached to it starts the next set of queries from the provided Pq
>>> offset, where clusteringKey > the provided value.
>>>
>>> So if you can just track partition key offset (if spanning multiple
>>> partitions) and clustering key offset, you can construct your next query
>>> from those instead.
>>>
>>> On Tue, Feb 10, 2015 at 6:58 PM, Ajay <ajay.ga...@gmail.com> wrote:
>>>
>>>> Thanks Alex.
>>>>
>>>> But is there any workaround possible? I can't believe that everyone
>>>> read & process all rows at once (without pagination).
>>>>
>>>> Thanks
>>>> Ajay
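As a side note on the token() approach Eric mentions above for walking a whole table (rather than a pre-filtered subset), a rough Java-driver sketch could look like the following. The contact point, keyspace name, and page size are assumptions, and the caveat in the comments matters when partitions hold many rows.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import java.util.List;

public class TokenWalk {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("scratch");    // keyspace name is an assumption
        final int pageSize = 100;

        Long lastToken = null;                           // resume point: last token(partitionkey) seen
        while (true) {
            String cql = "SELECT token(partitionkey) AS t, partitionkey, ck1, ck2, col1, col2 FROM foo"
                    + (lastToken == null ? "" : " WHERE token(partitionkey) > " + lastToken)
                    + " LIMIT " + pageSize;
            List<Row> rows = session.execute(cql).all();
            for (Row row : rows) {
                lastToken = row.getLong("t");            // Murmur3 tokens come back as bigint
                // process the row ...
            }
            // Caveat: token(partitionkey) > lastToken resumes at the *next* partition, so if a
            // page boundary fell inside a partition, its remaining rows should first be drained
            // with the clustering-key queries shown earlier in the thread before advancing.
            if (rows.size() < pageSize) break;           // short or empty page: table exhausted
        }
        cluster.close();
    }
}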
>>>> On Feb 10, 2015 11:46 PM, "Alex Popescu" <al...@datastax.com> wrote:
>>>>
>>>>> On Tue, Feb 10, 2015 at 4:59 AM, Ajay <ajay.ga...@gmail.com> wrote:
>>>>>
>>>>>> 1) The Java driver implicitly supports pagination in the ResultSet
>>>>>> (using Iterator), which can be controlled through FetchSize. But it is
>>>>>> limited in a way that we cannot skip or go previous. The FetchState is
>>>>>> not exposed.
>>>>>
>>>>> Cassandra doesn't support skipping, so this is not really a limitation
>>>>> of the driver.
>>>>>
>>>>> --
>>>>>
>>>>> [:>-a)
>>>>>
>>>>> Alex Popescu
>>>>> Sen. Product Manager @ DataStax
>>>>> @al3xandru
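On the driver point above: with the 2.x Java driver you normally don't page by hand at all. You set a fetch size and let iteration pull pages behind the scenes, and in recent enough driver versions (roughly 2.0.10 / 2.1.1 onward, if memory serves) the paging state is exposed so a later, stateless request can resume where the previous one stopped. It is forward-only, which matches Alex's point that skipping isn't supported. A rough sketch, with an assumed contact point and keyspace:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PagingState;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class DriverPaging {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("scratch");    // keyspace name is an assumption

        Statement stmt = new SimpleStatement("SELECT * FROM foo WHERE partitionkey = 1");
        stmt.setFetchSize(2);                            // rows fetched from the server per page
        ResultSet rs = session.execute(stmt);

        // Consume only the rows of the current page, without triggering the next fetch.
        int available = rs.getAvailableWithoutFetching();
        for (int i = 0; i < available; i++) {
            Row row = rs.one();
            // render the row ...
        }

        // Hand the position to a later request (e.g. the next HTTP call):
        PagingState state = rs.getExecutionInfo().getPagingState();  // null if there are no more rows
        if (state != null) {
            Statement next = new SimpleStatement("SELECT * FROM foo WHERE partitionkey = 1");
            next.setFetchSize(2);
            next.setPagingState(state);                  // must be the same statement; forward-only
            session.execute(next);
        }

        cluster.close();
    }
}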