Re: skip + limit support in GetSlice

Mike Peters Sun, 05 Sep 2010 12:57:55 -0700

Hi Michal,

Did you read the PDF Stu sent over, start to finish? There are several different approaches described there.


With Cassandra, what we found works best for pagination:

* Keep a separate 'total_records' count and increment/decrement it on every insert/delete.
* When getting slices, pass 'last seen' as the 'from' and keep the 'to' empty. Pass the number of records you want to show per page in the 'count'.
* Avoid letting the user skip to page X; use Next/Prev/First/Last only (the same way GMail does it).
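The points above can be sketched roughly as follows. This is only an illustration, not client code from this thread: `Row`, `get_slice` and `next_page` are hypothetical names, and the row is an in-memory stand-in for a single Cassandra row whose columns are kept sorted by name. It assumes the slice start is inclusive, so the 'last seen' column is fetched again and dropped.

```python
from bisect import bisect_left


class Row:
    """Hypothetical in-memory stand-in for one row with columns sorted by name."""

    def __init__(self):
        self._names = []        # column names, kept sorted
        self._values = {}       # column name -> value
        self.total_records = 0  # separate counter, maintained on insert/delete

    def insert(self, name, value):
        if name not in self._values:
            self._names.insert(bisect_left(self._names, name), name)
            self.total_records += 1
        self._values[name] = value

    def delete(self, name):
        if name in self._values:
            del self._values[name]
            self._names.pop(bisect_left(self._names, name))
            self.total_records -= 1

    def get_slice(self, start="", count=100):
        """Return up to `count` (name, value) pairs with name >= start.

        The 'to' end is left open. An inclusive start is an assumption here.
        """
        i = bisect_left(self._names, start) if start else 0
        return [(n, self._values[n]) for n in self._names[i:i + count]]


def next_page(row, last_seen, page_size):
    """Fetch the page that follows `last_seen` (None means the first page)."""
    if last_seen is None:
        return row.get_slice("", page_size)
    # Start is inclusive, so ask for one extra column and drop `last_seen`.
    cols = row.get_slice(last_seen, page_size + 1)
    if cols and cols[0][0] == last_seen:
        cols = cols[1:]
    return cols[:page_size]
```

A caller keeps only the name of the last column shown and passes it back in for "Next"; `total_records` is there for display ("N results") and for working out whether a next page exists, without counting columns on the server.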



Michal Augustýn wrote:

I know that "Prev/Next" is a good solution for web applications. But when I want to access data from another application or when I want to access pages randomly...

I don't know the internal structure of memtables etc., so I don't know if columns in a row are indexable. If not, then I just want to move my workaround to the server (to avoid huge network traffic)...


2010/9/5 Stu Hood <stu.h...@rackspace.com>

    Cassandra supports the recommended approach from:
    http://www.percona.com/ppc2009/PPC2009_mysql_pagination.pdf

    For large numbers of items, skip + limit is extremely inefficient.

    -----Original Message-----
    From: "Michal Augustýn" <augustyn.mic...@gmail.com>
    Sent: Sunday, September 5, 2010 5:39am
    To: user@cassandra.apache.org
    Subject: skip + limit support in GetSlice

    Hello,

    this is probably a feature request. Simply put, I would like to have
    support for standard pagination (skip + limit) in the GetSlice Thrift
    method. Is this feature on the road map?

    Right now, I have to perform a GetSlice call that starts at "" with
    "limit" set to the "skip" value. Then I read the last column name
    returned and perform the final GetSlice call, using that last column
    name as "start" and setting "limit" to the "limit" value.

    This workaround is not very efficient when I need to skip a lot of
    columns (so "skip" is high), because a lot of data must then be
    transferred over the network. So I think that support for skip in
    GetSlice would be very useful (to avoid high network traffic).
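To make the cost of the workaround concrete, here is a minimal sketch of the two-call approach described above. It is not real Thrift client code: `CountingRow` and `skip_limit` are hypothetical names, and the row is an in-memory stand-in that simply counts how many columns it hands back. It assumes the slice start is inclusive, so the second call asks for `limit + 1` columns and drops the first one, which differs slightly from the description above.

```python
from bisect import bisect_left


class CountingRow:
    """Hypothetical stand-in for a row; counts columns returned to the client."""

    def __init__(self, names):
        self.names = sorted(names)
        self.transferred = 0  # columns sent over the simulated network

    def get_slice(self, start, limit):
        """Return up to `limit` column names >= start (inclusive start assumed)."""
        i = bisect_left(self.names, start) if start else 0
        result = self.names[i:i + limit]
        self.transferred += len(result)
        return result


def skip_limit(row, skip, limit):
    """Emulate skip + limit on the client side with two slice calls."""
    if skip == 0:
        return row.get_slice("", limit)
    # First call: fetch `skip` columns only to learn the last column name.
    skipped = row.get_slice("", skip)
    if len(skipped) < skip:
        return []  # the row has fewer than `skip` columns
    # Second call: start at the last skipped column, then drop it.
    page = row.get_slice(skipped[-1], limit + 1)
    return page[1:]


names = ["c%04d" % i for i in range(1000)]
row = CountingRow(names)
page = skip_limit(row, 900, 10)
print(len(page), row.transferred)  # 10 columns wanted, 911 transferred
```

In this sketch, reading 10 columns at offset 900 moves 911 columns to the client, which is the network overhead the message is describing.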

    The implementation could be very straightforward (same as the
    workaround), or maybe it could be more efficient: I think that the
    whole row (so all columns) must fit into memory, so if we have all
    columns in memory...

    Thank you!

    Augi
