No ... and I personally don't have a problem with this if you think about what is actually going on under the covers.
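
For what it's worth, dropping the repeated key on the client side is only a few lines. Here is a rough sketch of the batching you describe. Note that `fetch` is a hypothetical callable standing in for a `get_range_slices` call with inclusive start and finish keys, and `make_fake_fetch` is just a toy in-memory substitute so the example runs on its own; neither is the real Thrift API.

```python
from bisect import bisect_left


def make_fake_fetch(keys):
    """Toy stand-in for a range query whose start and finish are both inclusive.

    Hypothetical helper for illustration only; it keeps keys in sorted order
    in memory and does not talk to Cassandra.
    """
    ordered = sorted(keys)

    def fetch(start, finish, count):
        # An empty start means "from the beginning", an empty finish "to the end".
        lo = bisect_left(ordered, start) if start else 0
        page = ordered[lo:lo + count]
        if finish:
            page = [k for k in page if k <= finish]
        return page

    return fetch


def iter_all_keys(fetch, page_size=100):
    """Yield each key once, using the last key of a page as the next start."""
    if page_size < 2:
        # With one key per page, every page would be just the repeated key.
        raise ValueError("page_size must be at least 2 to make progress")
    start = ""
    last = None
    while True:
        page = fetch(start, "", page_size)
        full = len(page) == page_size
        # The start bound is inclusive, so a follow-up page normally begins
        # with the key that ended the previous page. Drop it if it is there.
        if page and page[0] == last:
            page = page[1:]
        for key in page:
            yield key
        if not full or not page:
            return
        last = start = page[-1]


keys = ["k%02d" % i for i in range(10)]
result = list(iter_all_keys(make_fake_fetch(keys), page_size=4))
```

The sketch compares the first key of each page with the last key already seen instead of always discarding the first entry, so it still behaves if that key was deleted between two calls.
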
Note, however, that this is an expensive operation. If there are parallel updates to the indexes while you are performing a full keyscan (rowscan), you can miss keys because they were inserted earlier in the index than the position you are currently processing.

A further concern is that the keys (and indexes) are spread around the cluster. Unless R=N you will be hitting the network during this type of scan.

Lastly, be careful about how you specify the SlicePredicate. A keyscan can easily turn into a "dump the entire datastore" if you aren't careful.

On Jun 10, 2010, at 10:03 AM, Dop Sun wrote:

> Hi,
>
> As documented in http://wiki.apache.org/cassandra/API, the start and
> finish keys of the key range for get_range_slices are both inclusive.
>
> As discussed in this thread:
> http://groups.google.com/group/jassandra-user/browse_thread/thread/c2e56453cde067d3,
> there is a case where a user wants to discover all keys (a huge number)
> in a column family.
>
> What I have in mind is doing it in batches: use the empty string as both
> start and finish for the first query, then use the last key returned as
> the start for the second query.
>
> My question is: with this method, the last key returned by the first
> query will be returned again by the second query as its first key. And
> it’s a duplicate. Is there any other API to discover keys without
> duplicates in the current implementation?
>
> Thanks,
> Regards,
> Dop