> On Tue, Oct 27, 2020 at 3:27 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
> >
> > On Fri, Oct 23, 2020 at 11:58 AM bu...@sohu.com <bu...@sohu.com> wrote:
> > >
> > > > Interesting idea. So IIUC, whenever a worker is scanning a tuple it
> > > > will directly put it into the respective batch (shared tuple store),
> > > > based on the hash of the grouping column, and once all the workers are
> > > > done preparing the batches, each worker will pick those batches one
> > > > by one, perform the sort and finish the aggregation. I think there is
> > > > scope for improvement: instead of directly putting the tuple into the
> > > > batch, what if the worker does the partial aggregation and then
> > > > places the partially aggregated rows in the shared tuple store based
> > > > on the hash value, and then the workers can pick the batches one by
> > > > one. By doing it this way, we can avoid doing large sorts. And this
> > > > approach can also be used with the hash aggregate, I mean the
> > > > partially aggregated data from the hash aggregate can be put into the
> > > > respective batch.
> > >
> > > Good idea. Batch sort is suitable for large aggregate result rows;
> > > with a large aggregate result, partial aggregation may run out of
> > > memory, and all aggregate functions must support partial aggregation
> > > (with batch sort this is unnecessary).
> > >
> > > Actually I wrote a batch hash store for hash aggregate (for PG11) along
> > > the lines of this idea, but it does not write partial aggregations to
> > > the shared tuple store; it writes the original tuple and hash value
> > > to the shared tuple store. However, it does not support parallel
> > > grouping sets. I am trying to write parallel hash aggregate support
> > > using a batch shared tuple store for PG14, and it needs to support
> > > parallel grouping sets hash aggregate.
> >
> > I was trying to look into this patch to understand the logic in more
> > detail.  Actually, there are no comments at all, so it's really hard to
> > understand what the code is trying to do.
> >
> > I was reading the function below, which is the main entry point for
> > the batch sort.
> >
> > +static TupleTableSlot *ExecBatchSortPrepare(PlanState *pstate)
> > +{
> > ...
> > + for (;;)
> > + {
> > ...
> > + tuplesort_puttupleslot(state->batches[hash%node->numBatches], slot);
> > + }
> > +
> > + for (i=node->numBatches;i>0;)
> > + tuplesort_performsort(state->batches[--i]);
> > +build_already_done_:
> > + if (parallel)
> > + {
> > + for (i=node->numBatches;i>0;)
> > + {
> > + --i;
> > + if (state->batches[i])
> > + {
> > + tuplesort_end(state->batches[i]);
> > + state->batches[i] = NULL;
> > + }
> > + }
> >
> > I did not understand this part: once each worker has performed
> > its local batch-wise sort, why are we clearing the batches?  I mean,
> > individual workers have their own batches, so eventually they are
> > supposed to get merged.  Can you explain this part?  It would also be
> > better if you could add comments.
>
> I think I got this.  IIUC, each worker is initializing the shared
> sort and performing the batch-wise sorting, and we will wait on a
> barrier so that all the workers can finish their sorting.  Once
> that is done the workers will coordinate, pick the batches one by one,
> and perform the final merge for each batch.
Yes, that's right. Each worker opens the shared sort as a "worker" (nodeBatchSort.c:134); after all workers have finished performing their sorts, each one picks a batch and opens it as the "leader" (nodeBatchSort.c:54).
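
Roughly, the flow looks like the sketch below. This is only a simplified
illustration of the worker/leader phases, not the actual patch code: the
tuplesort and Barrier calls are the standard PostgreSQL APIs, but
BatchSortSketch, BatchSortState (and its fields), ComputeGroupingHash(),
ClaimNextBatch(), OpenBatchAsLeader() and EmitSortedTuple() are made-up
placeholder names used only for illustration.

/* Simplified sketch; placeholder names, not the patch's identifiers. */
static void
BatchSortSketch(BatchSortState *node, PlanState *outerPlan)
{
    TupleTableSlot *slot;
    uint32      hash;
    int         i;
    int         batchno;

    /*
     * Worker phase: scan our share of the input and route each tuple, by
     * hash of the grouping columns, to a per-batch shared tuplesort that
     * was opened with a SortCoordinate whose isWorker field is true.
     */
    for (;;)
    {
        slot = ExecProcNode(outerPlan);
        if (TupIsNull(slot))
            break;
        hash = ComputeGroupingHash(node, slot);
        tuplesort_puttupleslot(node->batches[hash % node->numBatches], slot);
    }

    /* Finish this worker's sorted run for every batch ... */
    for (i = 0; i < node->numBatches; i++)
        tuplesort_performsort(node->batches[i]);

    /*
     * ... and close the worker-side state.  The sorted runs stay behind in
     * the shared fileset, so the worker's Tuplesortstate is no longer
     * needed -- this is the tuplesort_end() loop asked about above.
     */
    for (i = 0; i < node->numBatches; i++)
    {
        tuplesort_end(node->batches[i]);
        node->batches[i] = NULL;
    }

    /* Wait until every participant has written its runs for all batches. */
    BarrierArriveAndWait(node->barrier, WAIT_EVENT_PARALLEL_FINISH);

    /*
     * Leader phase: participants claim batches one by one; a claimed batch
     * is reopened with a SortCoordinate whose isWorker field is false (the
     * "leader" side of the parallel tuplesort API), which merges the runs
     * written by all workers for that batch.
     */
    while ((batchno = ClaimNextBatch(node)) >= 0)
    {
        Tuplesortstate *sort = OpenBatchAsLeader(node, batchno);

        tuplesort_performsort(sort);    /* merge the workers' runs */
        while (tuplesort_gettupleslot(sort, true, false, node->outslot, NULL))
            EmitSortedTuple(node, node->outslot);
        tuplesort_end(sort);
    }
}

The barrier matters because a batch can only be opened as "leader" and
merged once every worker has finished writing its sorted runs for that
batch.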