berkaysynnada commented on PR #14411:
URL: https://github.com/apache/datafusion/pull/14411#issuecomment-2646294361

   > The point is: I think there should be no memory-bloat issue in
   > TPCH/clickbench queries caused by `RepartitionExec`. I am just wondering
   > whether you have a bad query that can reproduce the memory issue reported
   > in #14287, and whether this PR will fix it or is only performance focused?
   
   We are using `VecDeque`-based channels. As far as I can see, there is no 
size check there: 
https://github.com/apache/datafusion/blob/9c12919786be0cfce5c4817101a378669ba002e2/datafusion/physical-plan/src/repartition/distributor_channels.rs#L209
   
   What this means is that if the producers have a higher throughput than the
consumers, these `VecDeque`s keep growing and eventually exceed the available
memory. I haven't experienced such a crash yet, but I am confident I can
reproduce one by writing some experimental streams. I cannot rank them by
importance, but there are two goals with this PR:
   
   1) Avoid these potential OOMs.
   2) Distribute the load more evenly than round-robin does (even though
round-robin is itself intended to be an even load distributor).
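
   To make the concern concrete, here is a minimal, self-contained sketch. It
is not the DataFusion channel code; `simulate_round_robin` and
`try_push_bounded` are hypothetical names used only for illustration, and the
tick-based model is an assumption. It shows how round-robin pushes into
unchecked `VecDeque`s let the queue of a slower consumer grow with the input
size, and what a size check on push could look like.

```rust
use std::collections::VecDeque;

/// Hypothetical model, not DataFusion code. For each of `ticks` steps the
/// producer pushes `produced_per_tick` items round-robin across the queues,
/// then consumer `i` pops at most `drain_per_tick[i]` items from queue `i`.
/// Returns the final length of every queue.
fn simulate_round_robin(
    ticks: usize,
    produced_per_tick: usize,
    drain_per_tick: &[usize],
) -> Vec<usize> {
    let n = drain_per_tick.len();
    let mut queues: Vec<VecDeque<u64>> = vec![VecDeque::new(); n];
    let mut next = 0usize; // index of the queue that receives the next item
    let mut seq = 0u64; // stand-in for a record batch

    for _ in 0..ticks {
        // Producer side: no size check before pushing.
        for _ in 0..produced_per_tick {
            queues[next].push_back(seq);
            seq += 1;
            next = (next + 1) % n;
        }
        // Consumer side: each consumer drains at its own rate.
        for (queue, &rate) in queues.iter_mut().zip(drain_per_tick) {
            for _ in 0..rate {
                if queue.pop_front().is_none() {
                    break;
                }
            }
        }
    }
    queues.iter().map(|q| q.len()).collect()
}

/// Hypothetical bounded push: hands the item back instead of growing the
/// queue once `capacity` is reached, so the caller has to wait or reroute.
fn try_push_bounded(queue: &mut VecDeque<u64>, capacity: usize, item: u64) -> Result<(), u64> {
    if queue.len() >= capacity {
        Err(item)
    } else {
        queue.push_back(item);
        Ok(())
    }
}

fn main() {
    // Consumer 0 keeps up with its share (2 items per tick); consumer 1
    // drains only 1 per tick, so its backlog grows by one item every tick.
    let lens = simulate_round_robin(1000, 4, &[2, 1]);
    println!("queue lengths after 1000 ticks: {:?}", lens);

    // With a size check, the third push into a capacity-2 queue is refused.
    let mut bounded = VecDeque::new();
    for item in 0..3 {
        println!("push {} -> {:?}", item, try_push_bounded(&mut bounded, 2, item));
    }
}
```

   In this model the backlog of the slower consumer is proportional to the
number of ticks, so memory use is limited only by the input size. The bounded
push is just one possible form of backpressure, shown for contrast.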


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

