songwdfu opened a new pull request, #16336:
URL: https://github.com/apache/pinot/pull/16336

   Implemented a partitioned group-by combine for the non-order-by, non-trim case.
   The technique comes from DuckDB's [parallel grouped aggregate](https://duckdb.org/2022/03/07/aggregate-hashtable.html).
    
   The algorithm has two phases.
   In the first phase, per-segment results are radix-partitioned.
   In the second phase, each worker thread picks up a partition and merges that
   partition's results from all segments into a single hash table. The resulting
   hash tables, which are still radix-partitioned, are then logically stitched
   together. No further merging is needed because a given key can only appear in
   one partition.
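
The two phases can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `PartitionedCombineSketch`, its methods, the use of `String` keys with `Long` sums, and the choice of 2 radix bits are all assumptions made for the example. For brevity, phase 1 runs sequentially here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names are Pinot classes.
class PartitionedCombineSketch {
  static final int RADIX_BITS = 2;                    // assumed value
  static final int NUM_PARTITIONS = 1 << RADIX_BITS;  // 4 partitions

  // Pick a partition from the low bits of the (mixed) key hash.
  static int partitionOf(String key) {
    int h = key.hashCode();
    h ^= (h >>> 16);
    return h & (NUM_PARTITIONS - 1);
  }

  // Phase 1: split one segment's (group key -> partial sum) result by partition.
  static List<Map<String, Long>> partition(Map<String, Long> segmentResult) {
    List<Map<String, Long>> parts = new ArrayList<>();
    for (int i = 0; i < NUM_PARTITIONS; i++) {
      parts.add(new HashMap<>());
    }
    for (Map.Entry<String, Long> e : segmentResult.entrySet()) {
      parts.get(partitionOf(e.getKey())).put(e.getKey(), e.getValue());
    }
    return parts;
  }

  // Phase 2: one task per partition merges that partition across all segments
  // into a task-private map, so no two tasks ever write to the same table.
  static List<Map<String, Long>> combine(List<Map<String, Long>> segmentResults, int numThreads) {
    List<List<Map<String, Long>>> partitioned = new ArrayList<>();
    for (Map<String, Long> segment : segmentResults) {
      partitioned.add(partition(segment));
    }
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<Map<String, Long>>> futures = new ArrayList<>();
      for (int p = 0; p < NUM_PARTITIONS; p++) {
        final int partitionId = p;
        Callable<Map<String, Long>> task = () -> {
          Map<String, Long> merged = new HashMap<>();
          for (List<Map<String, Long>> segmentParts : partitioned) {
            segmentParts.get(partitionId).forEach((k, v) -> merged.merge(k, v, Long::sum));
          }
          return merged;
        };
        futures.add(pool.submit(task));
      }
      // "Stitching" is just keeping the per-partition tables side by side.
      List<Map<String, Long>> result = new ArrayList<>();
      for (Future<Map<String, Long>> f : futures) {
        result.add(f.get());
      }
      return result;
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

The returned list holds one table per partition. Since a key always maps to the same partition, the tables are disjoint and a consumer can iterate over them in turn without any further merge.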
    
   This enables full inter-thread parallelism by eliminating contention between
   worker threads, in contrast to the previous approach, where every thread
   writes into the same shared indexedTable. In principle, the same approach is
   applicable to the broker as well, since the combine output is still
   radix-partitioned.
   
   To be tested. Will consider the order-by / trim case and add detailed
   optimizations later.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

