alamb commented on PR #16711:
URL: https://github.com/apache/datafusion/pull/16711#issuecomment-3052692116

   My analysis of these results is very consistent with my last attempt at 
caching filter results.
   
   The biggest slowdowns are in Q30 and Q31:
   ```
   │ QQuery 30    │   758.48 ms │          1197.40 ms │  1.58x slower │
   │ QQuery 31    │   780.68 ms │          1172.13 ms │  1.50x slower │
   ```
   
   I am fairly sure this is due to the overhead of RowSelection (these queries 
produce many small selections). I started analyzing them here: 
https://github.com/apache/datafusion/pull/16562#issuecomment-3009778287
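
To illustrate why many small selections can be costly, here is a minimal, self-contained sketch. It does not use the real parquet `RowSelection` API: the `Run` type and the `to_runs`, `runs_bytes`, and `bitmap_bytes` helpers are hypothetical stand-ins that assume a run-length layout of alternating skip/select runs. It compares that layout with a packed one-bit-per-row bitmap, and it only measures representation size, not the actual decode overhead seen in Q30 and Q31.

```rust
/// Hypothetical stand-in for a run-length selector: a run of `row_count`
/// consecutive rows that is either skipped or selected.
/// (Illustrative only; not the parquet crate's type.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Run {
    pub row_count: usize,
    pub skip: bool,
}

/// Run-length encode a per-row filter result (`true` = row passes the filter).
pub fn to_runs(mask: &[bool]) -> Vec<Run> {
    let mut runs: Vec<Run> = Vec::new();
    for &selected in mask {
        let skip = !selected;
        // Extend the current run when the row has the same skip/select state.
        if let Some(last) = runs.last_mut() {
            if last.skip == skip {
                last.row_count += 1;
                continue;
            }
        }
        // Otherwise start a new run of length one.
        runs.push(Run { row_count: 1, skip });
    }
    runs
}

/// Size in bytes of the run-length form (one `Run` per run).
pub fn runs_bytes(runs: &[Run]) -> usize {
    runs.len() * std::mem::size_of::<Run>()
}

/// Size in bytes of a packed bitmap with one bit per row.
pub fn bitmap_bytes(num_rows: usize) -> usize {
    (num_rows + 7) / 8
}
```

With a clustered filter result (for example, the first half of a batch passes) `to_runs` yields only a couple of runs, so the run-length form is tiny. With a scattered result (for example, every other row passes) it yields one run per row, and the run-length form becomes far larger than the bitmap. That is the kind of pattern where a different representation might help.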
   
   So the TL;DR is that I think the caching approach is good, but to avoid some 
queries getting slower we will also need to improve the RowSelection 
representation. I will try to think about this and whip up a POC, hopefully 
over the next few days.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

