Re: [I] Dynamic pruning filters from TopK state [datafusion]

2025-03-29 Thread via GitHub
adriangb commented on issue #15037: URL: https://github.com/apache/datafusion/issues/15037#issuecomment-2763348634 wrt waiting for filter pushdown to be enabled by default, I think we're just making our lives harder by coupling them, especially since we can already test them together under

Re: [I] Dynamic pruning filters from TopK state [datafusion]

2025-03-28 Thread via GitHub
alamb commented on issue #15037: URL: https://github.com/apache/datafusion/issues/15037#issuecomment-2762326990 @adriangb and I had a discussion about https://github.com/apache/datafusion/pull/15301 here are some notes: ## Usecases: - TopK dynamic filter pushdown - Prune f

Re: [I] Dynamic pruning filters from TopK state [datafusion]

2025-03-25 Thread via GitHub
adriangb commented on issue #15037: URL: https://github.com/apache/datafusion/issues/15037#issuecomment-2752376477 We already have Statistics on PartitionedFile so we could potentially use Dynamic filters to prune based on those before opening the file -- This is an automated message from
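
A minimal sketch of the idea in the comment above, assuming a query shaped like `ORDER BY col ASC LIMIT k`. It does not use DataFusion's actual `Statistics` or `PartitionedFile` types: `FileStats`, `TopKThreshold`, `file_may_match`, and `prune_files` are hypothetical names introduced only for illustration. The point it shows is that once the TopK heap is full, its current worst value becomes a threshold, and a file whose minimum is not below that threshold can be skipped before it is opened.

```rust
/// Hypothetical stand-in for per-file statistics on the sort column.
/// (Assumption: only the minimum is needed for an ascending TopK.)
#[derive(Debug, Clone, PartialEq)]
struct FileStats {
    path: String,
    min: i64,
}

/// Hypothetical dynamic filter derived from TopK state for
/// `ORDER BY col ASC LIMIT k`.
#[derive(Debug, Clone, Copy)]
struct TopKThreshold {
    /// `None` until the heap holds k rows; afterwards the largest value
    /// currently kept in the heap.
    threshold: Option<i64>,
}

impl TopKThreshold {
    /// Returns true if the file could still contain a row that enters the heap.
    fn file_may_match(&self, stats: &FileStats) -> bool {
        match self.threshold {
            // Heap not full yet: every file may contribute rows.
            None => true,
            // Only rows strictly below the threshold can displace a heap entry,
            // so a file whose minimum is >= threshold cannot contribute.
            Some(t) => stats.min < t,
        }
    }
}

/// Keeps only the files that may still contribute rows to the TopK result.
fn prune_files(files: &[FileStats], filter: TopKThreshold) -> Vec<&FileStats> {
    files.iter().filter(|f| filter.file_may_match(f)).collect()
}
```

Because the threshold only tightens as the scan proceeds, re-checking the remaining files against the latest threshold just before opening each one would prune more than a single check at plan time.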

Re: [I] Dynamic pruning filters from TopK state [datafusion]

2025-03-20 Thread via GitHub
alamb commented on issue #15037: URL: https://github.com/apache/datafusion/issues/15037#issuecomment-2740932019 Thanks @adriangb -- I will try and review it asap (hopefully tomorrow afternoon or tomorrow)

Re: [I] Dynamic pruning filters from TopK state [datafusion]

2025-03-20 Thread via GitHub
adriangb commented on issue #15037: URL: https://github.com/apache/datafusion/issues/15037#issuecomment-2740808192 @alamb I implemented something like that in #15301

Re: [I] Dynamic pruning filters from TopK state [datafusion]

2025-03-18 Thread via GitHub
alamb commented on issue #15037: URL: https://github.com/apache/datafusion/issues/15037#issuecomment-2734575745 > Does anyone have a handle on how we might implement this? I was thinking we’d need to add a method to exec operators called `apply_filter` but that basically sends down the addi

Re: [I] Dynamic pruning filters from TopK state [datafusion]

2025-03-17 Thread via GitHub
adriangb commented on issue #15037: URL: https://github.com/apache/datafusion/issues/15037#issuecomment-2729172917 Does anyone have a handle on how we might implement this? I was thinking we’d need to add a method to exec operators called `apply_filter` but that basically sends down the add

Re: [I] Dynamic pruning filters from TopK state [datafusion]

2025-03-12 Thread via GitHub
alamb commented on issue #15037: URL: https://github.com/apache/datafusion/issues/15037#issuecomment-2717447487 BTW I am pretty sure DuckDB is using this technique and why they are so much faster on ClickBench Q23: - https://github.com/apache/datafusion/issues/15177

Re: [I] Dynamic pruning filters from TopK state [datafusion]

2025-03-05 Thread via GitHub
alamb commented on issue #15037: URL: https://github.com/apache/datafusion/issues/15037#issuecomment-2702227244 > Nice to know I'm not totally off on the idea 😄 Not at all!

Re: [I] Dynamic pruning filters from TopK state [datafusion]

2025-03-05 Thread via GitHub
adriangb commented on issue #15037: URL: https://github.com/apache/datafusion/issues/15037#issuecomment-2702224598 Nice to know I'm not totally off on the idea 😄

Re: [I] Dynamic pruning filters from TopK state [datafusion]

2025-03-05 Thread via GitHub
alamb commented on issue #15037: URL: https://github.com/apache/datafusion/issues/15037#issuecomment-2702211348 > @alamb mentioned this sounds similar to [Dynamic Filters](https://docs.starburst.io/latest/admin/dynamic-filtering.html), I assume this must be a known technique (or my analysis