berkaysynnada commented on PR #14411: URL: https://github.com/apache/datafusion/pull/14411#issuecomment-2646294361
> The point is: I think there should be no memory-bloat issue in TPCH/clickbench queries caused by `RepartitionExec`, just wondering do you have any bad query can reproduce the memory issue reported by #14287, and will this PR fix it or it's only performance focused?

We are using `VecDeque`-based channels. As far as I can see, there is no size check there: https://github.com/apache/datafusion/blob/9c12919786be0cfce5c4817101a378669ba002e2/datafusion/physical-plan/src/repartition/distributor_channels.rs#L209

What this means is that if the producers have a higher throughput than the consumers, these `VecDeque`s will eventually outgrow the available memory. I haven't experienced such a crash yet, but I can surely reproduce it by writing some experimental streams.

I cannot express an order of importance, but there are two goals with this PR:
1) Avoid these OOM possibilities.
2) Distribute the load more evenly than round-robin, which is itself intended to be used as an even load distributor.
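To make the unbounded-growth concern concrete, here is a minimal sketch. It is not DataFusion's `distributor_channels` code: `Buffer`, `try_push`, and `peak_len` are hypothetical names, and the tick-based loop is an assumed stand-in for real producer and consumer streams. It only shows that a `VecDeque` with no size check keeps growing when the producer outpaces the consumer, while a capacity check caps it.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a channel buffer (not DataFusion's actual type).
struct Buffer {
    queue: VecDeque<u64>,
    /// `None` means no size check at all; `Some(n)` caps the queue at `n` items.
    capacity: Option<usize>,
}

impl Buffer {
    fn new(capacity: Option<usize>) -> Self {
        Self { queue: VecDeque::new(), capacity }
    }

    /// Pushes an item unless the buffer is bounded and already full.
    /// Returns `false` when the item was rejected (the producer has to wait).
    fn try_push(&mut self, item: u64) -> bool {
        if let Some(cap) = self.capacity {
            if self.queue.len() >= cap {
                return false;
            }
        }
        self.queue.push_back(item);
        true
    }

    fn pop(&mut self) -> Option<u64> {
        self.queue.pop_front()
    }
}

/// Runs `ticks` rounds in which the producer offers `produce` items and the
/// consumer then drains up to `consume` items. Returns the peak queue length.
fn peak_len(capacity: Option<usize>, ticks: usize, produce: usize, consume: usize) -> usize {
    let mut buf = Buffer::new(capacity);
    let mut peak = 0;
    let mut next = 0u64;
    for _ in 0..ticks {
        for _ in 0..produce {
            // A rejected push models backpressure: the item is simply retried later.
            if buf.try_push(next) {
                next += 1;
            }
        }
        peak = peak.max(buf.queue.len());
        for _ in 0..consume {
            let _ = buf.pop();
        }
    }
    peak
}

fn main() {
    // Producer is 4x faster than the consumer in both runs.
    println!("unbounded peak: {}", peak_len(None, 1000, 4, 1));
    println!("bounded peak:   {}", peak_len(Some(16), 1000, 4, 1));
}
```

In the unbounded run the peak length grows linearly with the number of ticks, whereas the bounded run never exceeds its capacity. In a real operator each queued item would be a record batch, so the same growth translates directly into memory.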