Hello,

As was mentioned earlier, the Java driver doesn't actually perform pagination.
Instead, it uses the Cassandra native protocol to set the page size of the
result set (https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v2.spec#L699-L730).
When Cassandra sends a result back to the Java driver, it includes an opaque
binary token. This token represents the paging state. To fetch the next page,
the driver re-executes the same statement with the original page size and the
paging state attached. If there is another page available, Cassandra responds
with a new paging state that can be used to fetch it.

You could also try reporting this issue on the Cassandra user mailing list.

> On Feb 12, 2015, at 8:35 AM, Eric Stevens <migh...@gmail.com> wrote:
>
> I don't know what the shape of the page state data is deep inside the
> Java driver. I've actually tried to dig into that in the past and
> understand it, to see if I could reproduce it as a general-purpose,
> any-query kind of thing. I gave up before I fully understood it, but I
> think it's actually a handle to in-memory state maintained by the
> coordinator, which is only maintained for the lifetime of the statement
> (i.e. it's not stateless paging). That would make it a bad candidate for
> stateless paging scenarios such as REST requests, where a typical setup
> would load balance across HTTP hosts, never mind across coordinators.
>
> It shouldn't be too much work to abstract this basic idea for manual
> paging into a general-purpose class that takes
> List[ClusteringKeyDef[T, O <: Ordering]] and can produce a
> connection-agnostic PageState from a ResultSet or Row, or accepts a
> PageState to produce a WHERE CQL fragment.
>
> Also, RE: possibly needing multiple queries to satisfy a page - yes,
> that's unfortunate. Since you're on 2.0.11, see Ondřej's answer to
> avoid it.
>
> On Thu, Feb 12, 2015 at 8:13 AM, Ajay <ajay.ga...@gmail.com> wrote:
>
> Thanks Eric.
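For reference, the DataStax Java driver (2.0.10+/2.1.2+) does expose that opaque token, so the driver-managed paging state described above can be captured and reattached explicitly. A minimal sketch, assuming an already-connected Session and an illustrative table name (this is not runnable standalone, since it needs a live cluster):

```java
import com.datastax.driver.core.PagingState;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

// Assumes an existing, connected Session named "session"; "foo" is illustrative.
Statement stmt = new SimpleStatement("SELECT * FROM foo WHERE partitionkey = 1");
stmt.setFetchSize(100);                       // page size, sent via the native protocol

ResultSet rs = session.execute(stmt);
// The opaque token Cassandra returned with this page:
PagingState paging = rs.getExecutionInfo().getPagingState();
String saved = paging.toString();             // serializable, e.g. for an HTTP response

// Later: re-execute the same statement, resuming from the saved state.
stmt.setPagingState(PagingState.fromString(saved));
ResultSet nextPage = session.execute(stmt);
```

Note that, per the discussion above, the state is only valid when reattached to the same statement with the same page size.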
> I figured out the same but didn't get time to put it on the mail. Thanks.
>
> But it is highly tied to how data is stored internally in Cassandra:
> basically, how partition keys are used to distribute data (less likely to
> change, and we don't depend directly on the partitioning algorithm) and
> how clustering keys are used to sort the data within a partition
> (multi-level sorting, hence the restrictions on the ORDER BY clause). The
> latter, I think, could change down the line in Cassandra 3.x or 4.x in
> favor of some better storage or retrieval scheme.
>
> That said, I am hesitant to implement this client-side pagination logic
> because: a) pages 2+ might need more than one query to Cassandra; b) the
> implementation is tied to Cassandra's internal storage details, which can
> change (though not often); c) in our case, we are building REST APIs that
> will be deployed on Tomcat clusters, so whatever we cache to support
> pagination needs to be cached in a distributed way for failover support.
>
> Pagination support is best done on the server side, like ROWNUM in SQL,
> or better, in the Java driver, which hides the internal details and can
> be optimized better since the server sends the paging state to the
> driver.
>
> Thanks
> Ajay
>
> On Feb 12, 2015 8:22 PM, "Eric Stevens" <migh...@gmail.com> wrote:
>
> Your page state then needs to track the last ck1 and last ck2 you saw.
> Pages 2+ will end up needing up to two queries each, if the first query
> doesn't fill the page size.
>
> CREATE TABLE foo (
>   partitionkey int,
>   ck1 int,
>   ck2 int,
>   col1 int,
>   col2 int,
>   PRIMARY KEY ((partitionkey), ck1, ck2)
> ) WITH CLUSTERING ORDER BY (ck1 ASC, ck2 DESC);
>
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,1,1,1,1);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,1,2,2,2);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,1,3,3,3);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,2,1,4,4);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,2,2,5,5);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,2,3,6,6);
>
> If you're pulling the whole of partition 1 and your page size is 2, your
> first page looks like:
>
> PAGE 1
>
> SELECT * FROM foo WHERE partitionkey = 1 LIMIT 2;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   1 |   3 |    3 |    3
>             1 |   1 |   2 |    2 |    2
>
> You got enough rows to satisfy the page. Your page state, taken from the
> last row, is (ck1 = 1, ck2 = 2).
>
> PAGE 2
>
> Notice that you have a page state, and add some limiting clauses to the
> statement:
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 = 1 AND ck2 < 2 LIMIT 2;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   1 |   1 |    1 |    1
>
> Oops, we didn't get enough rows to satisfy the page limit, so we need to
> continue on; we just need one more:
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 > 1 LIMIT 1;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   2 |   3 |    6 |    6
>
> We have enough to satisfy page 2 now; our new page state is
> (ck1 = 2, ck2 = 3).
>
> PAGE 3
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 = 2 AND ck2 < 3 LIMIT 2;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   2 |   2 |    5 |    5
>             1 |   2 |   1 |    4 |    4
>
> Great, we satisfied this page with only one query; page state:
> (ck1 = 2, ck2 = 1).
>
> PAGE 4
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 = 2 AND ck2 < 1 LIMIT 2;
>
> (0 rows)
>
> Oops, our initial query was on the boundary of ck1, but this looks like
> any other time the initial query returns < pageSize rows; we just move
> on to the next page:
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 > 2 LIMIT 2;
>
> (0 rows)
>
> Aha, we've exhausted ck1 as well, so there are no more pages. Page 3
> actually pulled the last possible value; page 4 is empty, and we're all
> done. Generally speaking, you know you're done when your first
> clustering key is the only non-equality operator in the statement and
> you got no rows back.
>
> On Wed, Feb 11, 2015 at 10:55 AM, Ajay <ajay.ga...@gmail.com> wrote:
>
> Basically I am trying different queries with your approach.
>
> One such query is like:
>
> SELECT * FROM mycf WHERE <condition on partition key>
> ORDER BY ck1 ASC, ck2 DESC;
>
> where ck1 and ck2 are clustering keys, in that order.
>
> Here how do we achieve pagination support?
>
> Thanks
> Ajay
>
> On Feb 11, 2015 11:16 PM, "Ajay" <ajay.ga...@gmail.com> wrote:
>
> Hi Eric,
>
> Thanks for your reply.
>
> I am using Cassandra 2.0.11, and there I cannot append a condition like
> "last clustering key column > value of the last row in the previous
> batch". It fails with "Preceding column is either not restricted or by a
> non-EQ relation", meaning I need to specify an equality condition for
> all preceding clustering key columns. With this I cannot get the
> pagination right.
>
> Thanks
> Ajay
>
> > I can't believe that everyone reads & processes all rows at once
> > (without pagination).
>
> Probably not too many people try to read all rows in a table as a single
> rolling operation with a standard client driver. But those who do would
> use token() to keep track of where they are, and would be able to resume
> with that as well.
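That token()-based bookkeeping can be sketched as a tiny helper. This is illustrative only (the class and method names are my own, and it assumes a single-column partition key); it just builds the CQL for each step of a resumable full-table scan:

```java
public class TokenScan {
    // Builds the CQL for the next chunk of a full-table scan.
    // lastToken is null for the first chunk; afterwards it is the
    // token(pk) value of the last row seen, which the client tracks
    // (e.g. by also selecting token(pk) in its real column list).
    public static String nextChunkCql(String table, String pk, Long lastToken, int limit) {
        if (lastToken == null) {
            return String.format("SELECT * FROM %s LIMIT %d", table, limit);
        }
        return String.format("SELECT * FROM %s WHERE token(%s) > %d LIMIT %d",
                table, pk, lastToken, limit);
    }

    public static void main(String[] args) {
        // First chunk, then a resume after some saved token value:
        System.out.println(nextChunkCql("foo", "partitionkey", null, 100));
        System.out.println(nextChunkCql("foo", "partitionkey", 42L, 100));
    }
}
```

Because token order is the order in which Cassandra stores partitions, a `token(pk) > lastToken` restriction is valid even where ordinary clustering restrictions are not, which is what makes this usable as a stateless resume point.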
>
> But it sounds like you're talking about paginating a subset of data -
> larger than you want to process as a unit, but prefiltered by some other
> criteria which prevents you from being able to rely on token(). For this
> there is no general-purpose solution, but it typically involves
> maintaining your own paging state: keeping track of the last
> partitioning and clustering keys seen, and using those to construct your
> next query.
>
> For example, we have client queries which can span several partitioning
> keys. We make sure that the list of partition keys generated by a given
> client query, List(Pq), is deterministic; then our paging state is the
> index offset of the final Pq in the response, plus the value of the
> final clustering column. A query coming in with a paging state attached
> starts the next set of queries from the provided Pq offset, where
> clusteringKey > the provided value.
>
> So if you can just track the partition key offset (if spanning multiple
> partitions) and the clustering key offset, you can construct your next
> query from those instead.
>
> On Tue, Feb 10, 2015 at 6:58 PM, Ajay <ajay.ga...@gmail.com> wrote:
>
> Thanks Alex.
>
> But is there any workaround possible? I can't believe that everyone
> reads & processes all rows at once (without pagination).
>
> Thanks
> Ajay
>
> On Feb 10, 2015 11:46 PM, "Alex Popescu" <al...@datastax.com> wrote:
>
> On Tue, Feb 10, 2015 at 4:59 AM, Ajay <ajay.ga...@gmail.com> wrote:
>
> 1) The Java driver implicitly supports pagination in the ResultSet
> (using Iterator), which can be controlled through FetchSize. But it is
> limited in that we cannot skip pages or go back to a previous one. The
> FetchState is not exposed.
>
> Cassandra doesn't support skipping, so this is not really a limitation
> of the driver.
>
> --
>
> [:>-a)
>
> Alex Popescu
> Sen. Product Manager @ DataStax
> @al3xandru
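Finally, to make the manual paging scheme from Eric's worked example above concrete, here is a self-contained sketch (data and clustering values taken from his foo table; the helper names are my own) that simulates the up-to-two-queries-per-page logic against an in-memory copy of the partition:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ManualPaging {
    // Rows of partitionkey = 1, already in clustering order
    // (ck1 ASC, ck2 DESC), mirroring Eric's example data.
    static final List<int[]> ROWS = Arrays.asList(
            new int[]{1, 3}, new int[]{1, 2}, new int[]{1, 1},
            new int[]{2, 3}, new int[]{2, 2}, new int[]{2, 1});

    // Simulates: SELECT ... WHERE ck1 = :ck1 AND ck2 < :ck2 LIMIT :limit
    // (ck2 < last works because ck2 is in DESC clustering order).
    static List<int[]> sameCk1(int ck1, int ck2, int limit) {
        return ROWS.stream().filter(r -> r[0] == ck1 && r[1] < ck2)
                .limit(limit).collect(Collectors.toList());
    }

    // Simulates: SELECT ... WHERE ck1 > :ck1 LIMIT :limit
    static List<int[]> nextCk1(int ck1, int limit) {
        return ROWS.stream().filter(r -> r[0] > ck1)
                .limit(limit).collect(Collectors.toList());
    }

    // Fetches one page given the last row of the previous page (null = page 1).
    public static List<int[]> page(int[] state, int pageSize) {
        if (state == null)
            return ROWS.stream().limit(pageSize).collect(Collectors.toList());
        // Query 1: finish the current ck1 group.
        List<int[]> page = new ArrayList<>(sameCk1(state[0], state[1], pageSize));
        // Query 2 (only if needed): move on to the next ck1 group.
        if (page.size() < pageSize)
            page.addAll(nextCk1(state[0], pageSize - page.size()));
        return page;
    }

    public static void main(String[] args) {
        int[] state = null;
        int n = 0;
        while (true) {
            List<int[]> p = page(state, 2);
            if (p.isEmpty()) break;           // empty page: we're done
            n++;
            state = p.get(p.size() - 1);      // page state = last row seen
            System.out.println("page " + n + ": " + p.stream()
                    .map(Arrays::toString).collect(Collectors.joining(" ")));
        }
        // Prints pages matching the walkthrough: (1,3)(1,2) / (1,1)(2,3) / (2,2)(2,1).
    }
}
```

Note how page 2 needs both simulated queries while page 3 needs only one, exactly as in the walkthrough, and how the empty page 4 terminates the loop.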