alamb commented on PR #17563: URL: https://github.com/apache/datafusion/pull/17563#issuecomment-3307346726
> We can do that but it would feel like a lipstick. I really hope #17624 is addressed. @berkaysynnada knows how to fix this properly without half-means like cutoff. Let's not invest time in a solution that's going to be superseded soonish.

I agree having a cutoff is a (very) non-ideal solution, and I also hope we can fix #17624 asap.

The reason I don't like the idea of just turning off the optimization for everyone is how this change looks from a user's perspective:

1. I am currently running my queries that have 3 window functions using DataFusion 49.0.0 / DataFusion 50.0.0, and everything is great!
2. When I upgrade to DataFusion 50.0.1, my queries get much slower (because now there is a bunch more sorting happening).
3. When I ask why *my* queries got slower, I get told "so people who have 30 window functions don't have problems".

I would very much feel like this is a pretty major regression for me.

The reason I proposed the cutoff is to reduce the number of users who are affected:

1. Users who are running today under the cutoff don't experience a regression and still get the same performance.
2. There probably aren't many people using 20 window functions, given they would hit exponential planning time anyway.

I understand that whatever cutoff value we pick may still result in some people hitting a regression, but I think by picking a reasonable cutoff we'll keep most of them happy.
