Hello,

As was mentioned earlier, the Java driver doesn't actually perform pagination.
Instead, it uses the Cassandra native protocol to set the page size of the
result set (https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v2.spec#L699-L730).
When Cassandra sends a result back to the Java driver, it includes an opaque
binary token. This token represents the paging state. To fetch the next page,
the driver re-executes the same statement with the original page size and the
paging state attached. If there is another page available, Cassandra responds
with a new paging state that can be used to fetch it.

You could also try reporting this issue on the Cassandra user mailing list.

> On Feb 12, 2015, at 8:35 AM, Eric Stevens <migh...@gmail.com> wrote:
>
> I don't know what the shape of the page state data is deep inside the
> Java driver. I've actually tried to dig into that in the past and
> understand it, to see if I could reproduce it as a general-purpose,
> any-query kind of thing. I gave up before I fully understood it, but I
> think it's actually a handle to in-memory state maintained by the
> coordinator, which is only maintained for the lifetime of the statement
> (i.e. it's not stateless paging). That would make it a bad candidate for
> stateless paging scenarios such as REST requests, where a typical setup
> would load balance across HTTP hosts, never mind across coordinators.
>
> It shouldn't be too much work to abstract this basic idea for manual
> paging into a general-purpose class that takes
> List[ClusteringKeyDef[T, O <: Ordering]] and can produce a
> connection-agnostic PageState from a ResultSet or Row, or accepts a
> PageState to produce a WHERE CQL fragment.
>
> Also, RE: possibly needing multiple queries to satisfy a page - yes,
> that's unfortunate. Since you're on 2.0.11, see Ondřej's answer to
> avoid it.
>
> On Thu, Feb 12, 2015 at 8:13 AM, Ajay <ajay.ga...@gmail.com> wrote:
>
> Thanks Eric.
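For reference, the DataStax Java driver (2.0.10+/2.1.2+) does expose that opaque token, so the driver-managed paging state described above can be captured and reattached explicitly. A minimal sketch, assuming an already-connected Session and an illustrative table name (this is not runnable standalone, since it needs a live cluster):

```java
import com.datastax.driver.core.PagingState;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

// Assumes an existing, connected Session named "session"; "foo" is illustrative.
Statement stmt = new SimpleStatement("SELECT * FROM foo WHERE partitionkey = 1");
stmt.setFetchSize(100);                       // page size, sent via the native protocol

ResultSet rs = session.execute(stmt);
// The opaque token Cassandra returned with this page:
PagingState paging = rs.getExecutionInfo().getPagingState();
String saved = paging.toString();             // serializable, e.g. for an HTTP response

// Later: re-execute the same statement, resuming from the saved state.
stmt.setPagingState(PagingState.fromString(saved));
ResultSet nextPage = session.execute(stmt);
```

Note that, per the discussion above, the state is only valid when reattached to the same statement with the same page size.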
> I figured out the same but didn't get time to put it on the mail. Thanks.
>
> But it is highly tied to how data is stored internally in Cassandra:
> basically, how partition keys are used to distribute data (less likely to
> change, and we don't depend directly on the partitioning algorithm) and
> how clustering keys are used to sort the data within a partition
> (multi-level sorting, hence the restrictions on the ORDER BY clause). The
> latter, I think, could change down the line in Cassandra 3.x or 4.x in
> favor of some better storage or retrieval scheme.
>
> That said, I am hesitant to implement this client-side pagination logic
> because: a) pages 2+ might need more than one query to Cassandra; b) the
> implementation is tied to Cassandra's internal storage details, which can
> change (though not often); c) in our case, we are building REST APIs that
> will be deployed on Tomcat clusters, so whatever we cache to support
> pagination needs to be cached in a distributed way for failover support.
>
> Pagination support is best done on the server side, like ROWNUM in SQL,
> or better, in the Java driver, which hides the internal details and can
> be optimized better since the server sends the paging state to the
> driver.
>
> Thanks
> Ajay
>
> On Feb 12, 2015 8:22 PM, "Eric Stevens" <migh...@gmail.com> wrote:
>
> Your page state then needs to track the last ck1 and last ck2 you saw.
> Pages 2+ will end up needing up to two queries each, if the first query
> doesn't fill the page size.
>
> CREATE TABLE foo (
>   partitionkey int,
>   ck1 int,
>   ck2 int,
>   col1 int,
>   col2 int,
>   PRIMARY KEY ((partitionkey), ck1, ck2)
> ) WITH CLUSTERING ORDER BY (ck1 ASC, ck2 DESC);
>
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,1,1,1,1);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,1,2,2,2);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,1,3,3,3);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,2,1,4,4);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,2,2,5,5);
> INSERT INTO foo (partitionkey, ck1, ck2, col1, col2) VALUES (1,2,3,6,6);
>
> If you're pulling the whole of partition 1 and your page size is 2, your
> first page looks like:
>
> PAGE 1
>
> SELECT * FROM foo WHERE partitionkey = 1 LIMIT 2;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   1 |   3 |    3 |    3
>             1 |   1 |   2 |    2 |    2
>
> You got enough rows to satisfy the page. Your page state, taken from the
> last row, is (ck1 = 1, ck2 = 2).
>
> PAGE 2
>
> Notice that you have a page state, and add some limiting clauses to the
> statement:
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 = 1 AND ck2 < 2 LIMIT 2;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   1 |   1 |    1 |    1
>
> Oops, we didn't get enough rows to satisfy the page limit, so we need to
> continue on; we just need one more:
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 > 1 LIMIT 1;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   2 |   3 |    6 |    6
>
> We have enough to satisfy page 2 now; our new page state is
> (ck1 = 2, ck2 = 3).
>
> PAGE 3
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 = 2 AND ck2 < 3 LIMIT 2;
>
>  partitionkey | ck1 | ck2 | col1 | col2
> --------------+-----+-----+------+------
>             1 |   2 |   2 |    5 |    5
>             1 |   2 |   1 |    4 |    4
>
> Great, we satisfied this page with only one query; page state:
> (ck1 = 2, ck2 = 1).
>
> PAGE 4
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 = 2 AND ck2 < 1 LIMIT 2;
>
> (0 rows)
>
> Oops, our initial query was on the boundary of ck1, but this looks like
> any other time the initial query returns < pageSize rows; we just move
> on to the next page:
>
> SELECT * FROM foo WHERE partitionkey = 1 AND ck1 > 2 LIMIT 2;
>
> (0 rows)
>
> Aha, we've exhausted ck1 as well, so there are no more pages. Page 3
> actually pulled the last possible value; page 4 is empty, and we're all
> done. Generally speaking, you know you're done when your first
> clustering key is the only non-equality operator in the statement and
> you got no rows back.
>
> On Wed, Feb 11, 2015 at 10:55 AM, Ajay <ajay.ga...@gmail.com> wrote:
>
> Basically I am trying different queries with your approach.
>
> One such query is like:
>
> SELECT * FROM mycf WHERE <condition on partition key>
> ORDER BY ck1 ASC, ck2 DESC;
>
> where ck1 and ck2 are clustering keys, in that order.
>
> Here how do we achieve pagination support?
>
> Thanks
> Ajay
>
> On Feb 11, 2015 11:16 PM, "Ajay" <ajay.ga...@gmail.com> wrote:
>
> Hi Eric,
>
> Thanks for your reply.
>
> I am using Cassandra 2.0.11, and there I cannot append a condition like
> "last clustering key column > value of the last row in the previous
> batch". It fails with "Preceding column is either not restricted or by a
> non-EQ relation", meaning I need to specify an equality condition for
> all preceding clustering key columns. With this I cannot get the
> pagination right.
>
> Thanks
> Ajay
>
> > I can't believe that everyone reads & processes all rows at once
> > (without pagination).
>
> Probably not too many people try to read all rows in a table as a single
> rolling operation with a standard client driver. But those who do would
> use token() to keep track of where they are, and would be able to resume
> with that as well.
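That token()-based bookkeeping can be sketched as a tiny helper. This is illustrative only (the class and method names are my own, and it assumes a single-column partition key); it just builds the CQL for each step of a resumable full-table scan:

```java
public class TokenScan {
    // Builds the CQL for the next chunk of a full-table scan.
    // lastToken is null for the first chunk; afterwards it is the
    // token(pk) value of the last row seen, which the client tracks
    // (e.g. by also selecting token(pk) in its real column list).
    public static String nextChunkCql(String table, String pk, Long lastToken, int limit) {
        if (lastToken == null) {
            return String.format("SELECT * FROM %s LIMIT %d", table, limit);
        }
        return String.format("SELECT * FROM %s WHERE token(%s) > %d LIMIT %d",
                table, pk, lastToken, limit);
    }

    public static void main(String[] args) {
        // First chunk, then a resume after some saved token value:
        System.out.println(nextChunkCql("foo", "partitionkey", null, 100));
        System.out.println(nextChunkCql("foo", "partitionkey", 42L, 100));
    }
}
```

Because token order is the order in which Cassandra stores partitions, a `token(pk) > lastToken` restriction is valid even where ordinary clustering restrictions are not, which is what makes this usable as a stateless resume point.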
>
> But it sounds like you're talking about paginating a subset of data -
> larger than you want to process as a unit, but prefiltered by some other
> criteria which prevents you from being able to rely on token(). For this
> there is no general-purpose solution, but it typically involves
> maintaining your own paging state: keeping track of the last
> partitioning and clustering keys seen, and using those to construct your
> next query.
>
> For example, we have client queries which can span several partitioning
> keys. We make sure that the list of partition keys generated by a given
> client query, List(Pq), is deterministic; then our paging state is the
> index offset of the final Pq in the response, plus the value of the
> final clustering column. A query coming in with a paging state attached
> starts the next set of queries from the provided Pq offset, where
> clusteringKey > the provided value.
>
> So if you can just track the partition key offset (if spanning multiple
> partitions) and the clustering key offset, you can construct your next
> query from those instead.
>
> On Tue, Feb 10, 2015 at 6:58 PM, Ajay <ajay.ga...@gmail.com> wrote:
>
> Thanks Alex.
>
> But is there any workaround possible? I can't believe that everyone
> reads & processes all rows at once (without pagination).
>
> Thanks
> Ajay
>
> On Feb 10, 2015 11:46 PM, "Alex Popescu" <al...@datastax.com> wrote:
>
> On Tue, Feb 10, 2015 at 4:59 AM, Ajay <ajay.ga...@gmail.com> wrote:
>
> 1) The Java driver implicitly supports pagination in the ResultSet
> (using Iterator), which can be controlled through FetchSize. But it is
> limited in that we cannot skip pages or go back to a previous one. The
> FetchState is not exposed.
>
> Cassandra doesn't support skipping, so this is not really a limitation
> of the driver.
>
> --
>
> [:>-a)
>
> Alex Popescu
> Sen. Product Manager @ DataStax
> @al3xandru
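Finally, to make the manual paging scheme from Eric's worked example above concrete, here is a self-contained sketch (data and clustering values taken from his foo table; the helper names are my own) that simulates the up-to-two-queries-per-page logic against an in-memory copy of the partition:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ManualPaging {
    // Rows of partitionkey = 1, already in clustering order
    // (ck1 ASC, ck2 DESC), mirroring Eric's example data.
    static final List<int[]> ROWS = Arrays.asList(
            new int[]{1, 3}, new int[]{1, 2}, new int[]{1, 1},
            new int[]{2, 3}, new int[]{2, 2}, new int[]{2, 1});

    // Simulates: SELECT ... WHERE ck1 = :ck1 AND ck2 < :ck2 LIMIT :limit
    // (ck2 < last works because ck2 is in DESC clustering order).
    static List<int[]> sameCk1(int ck1, int ck2, int limit) {
        return ROWS.stream().filter(r -> r[0] == ck1 && r[1] < ck2)
                .limit(limit).collect(Collectors.toList());
    }

    // Simulates: SELECT ... WHERE ck1 > :ck1 LIMIT :limit
    static List<int[]> nextCk1(int ck1, int limit) {
        return ROWS.stream().filter(r -> r[0] > ck1)
                .limit(limit).collect(Collectors.toList());
    }

    // Fetches one page given the last row of the previous page (null = page 1).
    public static List<int[]> page(int[] state, int pageSize) {
        if (state == null)
            return ROWS.stream().limit(pageSize).collect(Collectors.toList());
        // Query 1: finish the current ck1 group.
        List<int[]> page = new ArrayList<>(sameCk1(state[0], state[1], pageSize));
        // Query 2 (only if needed): move on to the next ck1 group.
        if (page.size() < pageSize)
            page.addAll(nextCk1(state[0], pageSize - page.size()));
        return page;
    }

    public static void main(String[] args) {
        int[] state = null;
        int n = 0;
        while (true) {
            List<int[]> p = page(state, 2);
            if (p.isEmpty()) break;           // empty page: we're done
            n++;
            state = p.get(p.size() - 1);      // page state = last row seen
            System.out.println("page " + n + ": " + p.stream()
                    .map(Arrays::toString).collect(Collectors.joining(" ")));
        }
        // Prints pages matching the walkthrough: (1,3)(1,2) / (1,1)(2,3) / (2,2)(2,1).
    }
}
```

Note how page 2 needs both simulated queries while page 3 needs only one, exactly as in the walkthrough, and how the empty page 4 terminates the loop.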