comphead commented on issue #14238:
URL: https://github.com/apache/datafusion/issues/14238#issuecomment-2607735745

   The direction proposed by @berkaysynnada is worth discussing. Join 
semantics do not guarantee the output batch size in records. A batch can be 
much smaller, or even empty, because of filtering, and it can be much larger 
because of join explosions.
   
   The idea is to discuss how we can make the output batches after joins 
more uniform and closer to the configured `batch_size`. 
   
   One option is to use `BatchSplitter` or `BatchCoalesce` plan nodes 
after the join.
   Another is to align the batches inside the join itself, either by 
providing the coalescer/splitter there or by having a custom implementation.
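
   To make the coalesce/split idea concrete, here is a minimal sketch in 
plain Python rather than DataFusion code. It assumes a batch can be modeled 
as a list of rows; the `rebatch` function and the sample data are 
hypothetical and only illustrate the buffering logic, not any existing 
DataFusion or Arrow API.

```python
from typing import Any, Iterable, Iterator, List

Batch = List[Any]  # assumption: a batch is modeled as a plain list of rows


def rebatch(batches: Iterable[Batch], batch_size: int) -> Iterator[Batch]:
    """Yield batches of exactly `batch_size` rows, except possibly the last.

    Hypothetical helper: small or empty inputs are coalesced in a buffer,
    and oversized inputs are split into `batch_size` chunks.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    buffer: Batch = []
    for batch in batches:
        # Coalesce: undersized or empty batches accumulate in the buffer.
        buffer.extend(batch)
        # Split: emit full batches for as long as the buffer allows.
        while len(buffer) >= batch_size:
            yield buffer[:batch_size]
            buffer = buffer[batch_size:]

    # Flush whatever is left once the input is exhausted.
    if buffer:
        yield buffer


# Simulated join output: a small batch, an empty one, a single row,
# and one "exploded" batch of ten rows.
join_output = [[1, 2], [], [3], list(range(4, 14))]
sizes = [len(b) for b in rebatch(join_output, batch_size=4)]
```

   With these inputs `sizes` is `[4, 4, 4, 1]`: every batch except the last 
has exactly `batch_size` rows. The same buffering could sit in a separate 
plan node after the join or inside the join operator; the sketch does not 
favor either placement.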


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

