It is better to fetch a sensible amount per request. Moving a few MB at a time is fine (see thrift_framed_transport_size_in_mb in cassandra.yaml).
Long-running queries can reduce overall query throughput, and they churn memory on both the server and the client. Run some tests on your data and see how long it takes to iterate over all the columns using different slice sizes. More is not always better.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 8/03/2012, at 11:56 AM, Kevin wrote:

> When dealing with large SliceRanges, is it better to read all the results into
> memory (by setting “count” to the largest value possible), or is it better to
> divide the query into smaller SliceRange queries? Large in this case being on
> the order of millions of rows.
>
> There’s a footnote concerning SliceRanges on the main Apache Cassandra
> project site that reads:
>
> “…Thrift will materialize the whole result into memory before returning it to
> the client, so be aware that you may be better served by iterating through
> slices by passing the last value of one call in as the start of the next
> instead of increasing count arbitrarily large.”
>
> …but it doesn’t delve into the reasons why going about things that way is
> better.
>
> Can someone shed some light on this? And would the same logic apply to large
> KeyRanges?
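The iteration pattern the footnote describes can be sketched in Python. This is a minimal illustration, not real client code: `fetch_slice` is a hypothetical stand-in for a Thrift get_slice call and just reads from an in-memory dict of columns, but the paging logic (inclusive start, passing the last name seen back in as the next start) mirrors how you would drive the real API.

```python
# Sketch of slice iteration: fetch one bounded slice at a time, passing
# the last column name of each page in as the start of the next.
# `fetch_slice` is a hypothetical stand-in for a Thrift get_slice call;
# here it reads from an in-memory dict of columns.

def fetch_slice(row, start, count):
    """Return up to `count` (name, value) pairs with name >= start,
    in column-name order (Thrift slice starts are inclusive)."""
    names = sorted(n for n in row if n >= start)
    return [(n, row[n]) for n in names[:count]]

def iterate_columns(row, page_size=100):
    """Yield every column in the row, one bounded slice at a time."""
    start = ""          # empty start means "beginning of the row"
    first_page = True
    while True:
        # After the first page, fetch one extra column, because the
        # inclusive start re-reads the last column we already yielded.
        count = page_size if first_page else page_size + 1
        page = fetch_slice(row, start, count)
        if not first_page:
            page = page[1:]   # drop the re-read start column
        if not page:
            break
        for name, value in page:
            yield name, value
        start = page[-1][0]   # resume from the last column seen
        first_page = False
```

With this shape, client memory is bounded by `page_size` no matter how wide the row is, which is the point of the footnote. The same idea applies to large KeyRanges: pass the last key of one page as the start key of the next instead of requesting millions of rows in one call.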