Thanks for the pointer on internal paging, Tyler; I missed that one. It does
raise some questions, though:

1. Is it possible to tune the page size, or is it hard-coded internally?
2. Is read repair performed on EACH page, or is it done on the whole set of
requested rows once they have all been fetched?

Question 2 is relevant in scenarios where the user reads at CL QUORUM (or
higher) and some replicas are out of sync. Even when aggregating over a single
partition, if that partition is wide and spans many fetch pages, the time the
coordinator spends on read repair and reconciliation across the QUORUM
replicas adds up, and the query may time out very quickly.
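
To make the concern concrete, here is a rough back-of-the-envelope sketch. All
the numbers are made up for illustration, not measured, and it assumes that
every internal page pays a similar reconcile cost and that a single time
budget covers the whole aggregation; I don't know whether either holds, which
is really what question 2 is asking.

```python
import math

def estimated_aggregation_ms(partition_rows, page_size, per_page_ms):
    """Rough total time if every internal page pays the same reconcile cost."""
    pages = math.ceil(partition_rows / page_size)
    return pages, pages * per_page_ms

# Hypothetical numbers, for illustration only.
pages, total_ms = estimated_aggregation_ms(
    partition_rows=1_000_000,  # a wide partition
    page_size=5_000,           # assumed internal page size
    per_page_ms=50,            # assumed read + reconcile cost per page
)
timeout_ms = 5_000  # assumed overall time budget for the query
print(pages, total_ms, total_ms > timeout_ms)
```

Under these assumptions the per-page cost is small, but multiplied by the
number of pages it overshoots the budget, which is why the page size and where
read repair happens both matter.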



On Fri, Dec 18, 2015 at 5:26 PM, Tyler Hobbs <ty...@datastax.com> wrote:

>
> On Fri, Dec 18, 2015 at 9:17 AM, DuyHai Doan <doanduy...@gmail.com> wrote:
>
>> Cassandra will perform a full table scan and fetch all the data in memory
>> to apply the aggregate function.
>
>
> Just to clarify for others on the list: when executing aggregation
> functions, Cassandra *will* use paging internally, so at most one page
> worth of data will be held in memory at a time.  However, if your
> aggregation function retains a large amount of data, this may contribute to
> heap pressure.
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>
