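No, there is no other API for this, and I personally don't have a problem with 
the duplicate if you think about what is actually going on under the covers: 
the start key is inclusive, so each batch after the first simply begins with 
the key you already have, and you can skip it.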
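
To make the batching concrete, here is a minimal sketch in Python. It is not 
Cassandra client code: fetch_page is a hypothetical stand-in for a 
get_range_slices-style call with an inclusive start key, and a sorted 
in-memory list plays the role of the column family. The only point is the 
loop that drops the repeated first key of every batch after the first.

```python
from bisect import bisect_left


def fetch_page(sorted_keys, start_key, count):
    """Hypothetical stand-in for a range query with an INCLUSIVE start key.

    Returns up to `count` keys that are >= start_key, in order.
    """
    i = bisect_left(sorted_keys, start_key)
    return sorted_keys[i:i + count]


def iter_all_keys(sorted_keys, page_size=3):
    """Yield every key exactly once, paging with an inclusive start key."""
    if page_size < 2:
        # With page_size 1 every batch would contain only the repeated key.
        raise ValueError("page_size must be at least 2 to make progress")
    start = ""          # empty string: begin at the first key
    first_page = True
    while True:
        page = fetch_page(sorted_keys, start, page_size)
        # After the first batch, the first key repeats the previous
        # batch's last key, so skip it.
        if not first_page and page and page[0] == start:
            fresh = page[1:]
        else:
            fresh = page
        yield from fresh
        if len(page) < page_size:
            return      # a short batch means the end of the range
        start = page[-1]
        first_page = False


keys = ["a", "b", "c", "d", "e", "f", "g"]
all_keys = list(iter_all_keys(keys, page_size=3))
```

Each batch costs you one wasted key, so with a reasonable batch size the 
overhead of the duplicate is negligible.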

Note, however, that this is an expensive operation that takes a while to run. 
If there are parallel updates to the indexes while you are performing a full 
keyscan (rowscan), you can miss keys that are inserted earlier in the index 
than the position you are currently processing.
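
A toy simulation of that race, again in Python and again with made-up names 
(scan_with_concurrent_insert and late_key are purely illustrative, and the 
"store" is just a sorted list, not a real cluster): a key that sorts before 
the scan's current position is inserted after the first batch has been read, 
and the scan never sees it.

```python
from bisect import bisect_left, insort


def scan_with_concurrent_insert(keys, page_size, late_key):
    """Page through `keys`; after the first batch, insert `late_key`.

    Returns (keys seen by the scan, final contents of the store).
    """
    store = sorted(keys)
    seen = []
    start = ""              # empty string: begin at the first key
    inserted = False
    while True:
        i = bisect_left(store, start)        # inclusive start key
        page = store[i:i + page_size]
        # Skip the first key when it repeats the last key already seen.
        if page and seen and page[0] == seen[-1]:
            fresh = page[1:]
        else:
            fresh = page
        seen.extend(fresh)
        if len(page) < page_size:
            return seen, store
        start = page[-1]
        if not inserted:
            # Simulated parallel write that lands behind the scan position.
            insort(store, late_key)
            inserted = True


seen, store = scan_with_concurrent_insert(["b", "d", "f", "h"], 2, "a")
missed = sorted(set(store) - set(seen))
```

Here the scan is already past "a" when it is written, so "a" ends up in the 
store but not in the scan results.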

A further concern is that the keys (and indexes) are spread around a cluster. 
Unless R=N you will be hitting the network during this type of scan.

Lastly, be careful about how you specify the SlicePredicate. A keyscan can 
easily turn into a "dump the entire datastore" operation if the predicate 
pulls back every column of every row.

On Jun 10, 2010, at 10:03 AM, Dop Sun wrote:

> Hi,
>  
> As documented in http://wiki.apache.org/cassandra/API, the start and end 
> keys of the key range for get_range_slices are both inclusive.
>  
> As discussed in this thread: 
> http://groups.google.com/group/jassandra-user/browse_thread/thread/c2e56453cde067d3,
> there is a case where a user wants to discover all keys (a huge number) in a 
> column family.
>  
> What I have in mind is doing it in batches: use the empty string as both 
> start and finish for the first query, then use the last key returned as the 
> start of the second query.
>  
> My question is: with this method, the last key returned by the first query 
> will be returned again as the first key of the second query, which is a 
> duplicate. Is there any other API to discover keys without duplicates in 
> the current implementation?
>  
> Thanks,
> Regards,
> Dop
