Dandandan commented on PR #16408: URL: https://github.com/apache/datafusion/pull/16408#issuecomment-2972888776
> Or you can just push to the main PR, I gave you write access to our fork :)
>
> My one question is: how does this optimization play with filter pushdown? If a child plan accepted the filter as Exact should we then _not_ re-filter? A related question which you've alluded to before: if no child plan accepted the filter at all should we avoid updating it?

I think avoiding running the filter twice for the "exact" cases is optimal. In practice, I am not sure it would add much overhead in this case: converting to rows, comparing against and updating the heap, and running the compaction logic will be the most expensive parts by far. I think it will be hard to show it being much slower anywhere.

On your second question: if we always use the filter for topk anyway, I guess updating it isn't really wasteful. Theoretically we should update it just before running the topk instead of after (to avoid updating it after the last iteration without ever using it). But here too I think it will be hard to show any benefit.
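To make the ordering point concrete, here is a minimal sketch of one reading of it, not DataFusion's actual implementation. It assumes an ascending top-k over plain `i64` values. The `TopK` struct, its methods, and the `heap_comparisons` counter are all hypothetical names for illustration. The threshold is derived from the heap just before each batch is processed, so no update is computed after the final batch, and rows that cannot enter the heap are skipped before any heap work.

```rust
use std::collections::BinaryHeap;

/// Hypothetical top-k that keeps the `k` smallest values seen so far
/// (think `ORDER BY v ASC LIMIT k`). Not the DataFusion `TopK` type.
struct TopK {
    k: usize,
    /// Max-heap: the root is the worst value currently kept.
    heap: BinaryHeap<i64>,
    /// Number of rows that reached the heap logic (for illustration only).
    heap_comparisons: usize,
}

impl TopK {
    fn new(k: usize) -> Self {
        TopK { k, heap: BinaryHeap::new(), heap_comparisons: 0 }
    }

    /// Current filter threshold: once the heap is full, only rows strictly
    /// below the worst kept value can still change the result.
    fn threshold(&self) -> Option<i64> {
        if self.heap.len() == self.k {
            self.heap.peek().copied()
        } else {
            None
        }
    }

    fn insert_batch(&mut self, batch: &[i64]) {
        // Derive the filter right before it is used, instead of after the
        // previous batch; nothing is computed after the last batch.
        let threshold = self.threshold();
        for &v in batch.iter().filter(|&&v| threshold.map_or(true, |t| v < t)) {
            self.heap_comparisons += 1;
            if self.heap.len() < self.k {
                self.heap.push(v);
            } else if let Some(mut worst) = self.heap.peek_mut() {
                // Replace the worst kept value; the heap re-sifts on drop.
                if v < *worst {
                    *worst = v;
                }
            }
        }
    }

    fn into_sorted(self) -> Vec<i64> {
        self.heap.into_sorted_vec()
    }
}
```

With `k = 2`, feeding `[5, 1, 9, 3]` and then `[4, 2, 8, 7]` lets only one row of the second batch reach the heap. In this toy the filter is trivially cheap; in the real operator the row conversion and heap maintenance dominate, which is why the gain from either ordering is expected to be hard to measure.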