It's actually setStartKey that's the important method call (in combination with setRowCount). So I should have been clearer.

The following code behaves as expected: it returns the expected data in the expected order. I believe that using IndexedSlicesQuery's setStartKey makes the paging efficient, since it avoids re-pulling the entire data set from Cassandra for each page. Correct?


        void demoPaging() {
                String lastKey = processPage("don", "");   // get first batch, starting with "" (smallest key)
                lastKey = processPage("don", lastKey);     // get second batch, starting with previous last key
                lastKey = processPage("don", lastKey);     // get third batch, starting with previous last key
                //....
        }

        // return last key processed, null when no records left
        String processPage(String username, String startKey) {
                String lastKey = null;
                IndexedSlicesQuery<String, String, String> indexedSlicesQuery =
                        HFactory.createIndexedSlicesQuery(keyspace, stringSerializer, stringSerializer, stringSerializer);
                indexedSlicesQuery.addEqualsExpression("user", username);
                indexedSlicesQuery.setColumnNames("source", "ip");
                indexedSlicesQuery.setColumnFamily(ourColumnFamilyName);
                indexedSlicesQuery.setStartKey(startKey); // <---- resume from the last key of the previous page
                indexedSlicesQuery.setRowCount(batchSize);
                QueryResult<OrderedRows<String, String, String>> result = indexedSlicesQuery.execute();
                OrderedRows<String, String, String> rows = result.get();
                for (Row<String, String, String> row : rows) {
                        if (row == null) { continue; }
                        String key = row.getKey();
                        // The start key is returned again as the first row of each later page,
                        // so skip it and count only rows not seen before.
                        if (!startKey.equals(key)) {
                                totalCount++;
                                lastKey = key;
                        }
                }
                return lastKey;
        }
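
The paging pattern itself can be checked without a cluster. Below is a minimal sketch that swaps Cassandra for an in-memory TreeMap. KeyPagingSketch, fetchPage and collectAll are hypothetical names, not part of Hector. fetchPage only imitates a query with an inclusive start key and a row limit, and the sketch assumes rows come back in sorted key order, which a real cluster does not guarantee (it depends on the partitioner).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class KeyPagingSketch {

    // Hypothetical stand-in for a query using setStartKey + setRowCount:
    // returns at most rowCount keys, starting at startKey (inclusive).
    static List<String> fetchPage(NavigableMap<String, String> store, String startKey, int rowCount) {
        List<String> page = new ArrayList<>();
        for (String key : store.tailMap(startKey, true).keySet()) {
            if (page.size() == rowCount) {
                break;
            }
            page.add(key);
        }
        return page;
    }

    // Walks the store page by page and returns every key exactly once.
    // Assumes batchSize >= 2 and non-empty keys: each page after the first
    // begins with the previous page's last key, which is skipped.
    static List<String> collectAll(NavigableMap<String, String> store, int batchSize) {
        List<String> seen = new ArrayList<>();
        String startKey = ""; // "" sorts before every non-empty key
        while (startKey != null) {
            String lastKey = null;
            for (String key : fetchPage(store, startKey, batchSize)) {
                if (key.equals(startKey)) {
                    continue; // overlap row, already processed
                }
                seen.add(key);
                lastKey = key;
            }
            startKey = lastKey; // null once a page yields nothing new
        }
        return seen;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> store = new TreeMap<>();
        for (int i = 0; i < 10; i++) {
            store.put(String.format("row%02d", i), "v" + i);
        }
        List<String> keys = collectAll(store, 4);
        System.out.println(keys.size() + " rows, last = " + keys.get(keys.size() - 1));
    }
}
```

Each call to fetchPage touches at most batchSize entries, so the cost per page does not grow with the total number of matching rows. Because of the one-row overlap, every page after the first yields batchSize - 1 new rows.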






On 10/13/2011 09:15 AM, Patricio Echagüe wrote:
Hi Don. No, it will not retrieve all the results each time. IndexedSlicesQuery reads only the number of rows specified by setRowCount and goes back to the DB to fetch the next page when needed.

setRowCount simply calls indexClause.setCount(rowCount);

On Mon, Oct 10, 2011 at 3:52 PM, Don Smith <dsm...@likewise.com> wrote:

    Hector's IndexedSlicesQuery has a setRowCount method that you can
    use to page through the results, as described in
    https://github.com/rantav/hector/wiki/User-Guide .

        rangeSlicesQuery.setRowCount(1001);
         .....
        rangeSlicesQuery.setKeys(lastRow.getKey(),  "");

    Is it efficient?  Specifically, suppose my query returns 100,000
    results and I page through batches of 1000 at a time (making 100
    executes of the query). Will it internally retrieve all the
    results each time (but pass only the desired set of 1000 or so to
    me)? Or will it optimize queries to avoid the duplication? I
    presume the latter. :)

    Can IndexedSlicesQuery's setStartKey method be used for the same
    effect?

      Thanks,  Don


