alamb commented on issue #13983:
URL: https://github.com/apache/datafusion/issues/13983#issuecomment-2613378310

   > [@alamb](https://github.com/alamb) Excited to see further optimization of `late materialization`; I think it is a really important feature! I tried to use it in `HoraeDB` last year, ran into the same problem mentioned in [#6921](https://github.com/apache/datafusion/pull/6921), and found it frustrating...
   > 
   > I will profile again with `datafusion.execution.parquet.pushdown_filters = true;` set, and see what optimizations we can do in `datafusion`.
   
   Thanks @Rachelint 
   
   For this case I believe the core change needs to happen in the Parquet 
reader. The background as I understand it is described here
   - https://github.com/apache/arrow-rs/issues/5523
    
   @XiangpengHao has a prototype in the following PR 
   - https://github.com/apache/arrow-rs/pull/6921
   
   A good next step would be to measure how much faster DataFusion is with that PR -- the previous measurements had a few other optimizations mixed in.
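
For readers new to the term, here is a rough sketch of what "late materialization" means in this discussion. It is a toy model over in-memory lists, not DataFusion or arrow-rs code; the function names and the column layout are hypothetical and chosen only for illustration. The eager path decodes every projected column and then filters, while the late path evaluates the predicate column first and materializes only the selected rows of the other columns.

```python
from typing import Callable, Dict, List


def eager_scan(
    columns: Dict[str, List[int]],
    predicate_col: str,
    predicate: Callable[[int], bool],
    projection: List[str],
) -> Dict[str, List[int]]:
    """Decode every projected column in full, then apply the filter."""
    # Hypothetical stand-in for decoding: copy each projected column entirely.
    decoded = {name: list(columns[name]) for name in projection}
    mask = [predicate(v) for v in columns[predicate_col]]
    return {
        name: [v for v, keep in zip(values, mask) if keep]
        for name, values in decoded.items()
    }


def late_materialized_scan(
    columns: Dict[str, List[int]],
    predicate_col: str,
    predicate: Callable[[int], bool],
    projection: List[str],
) -> Dict[str, List[int]]:
    """Evaluate the predicate first, then touch only the selected rows."""
    # Step 1: read only the predicate column and record the matching row indices.
    selected = [i for i, v in enumerate(columns[predicate_col]) if predicate(v)]
    # Step 2: materialize the projected columns for those rows only.
    return {name: [columns[name][i] for i in selected] for name in projection}


if __name__ == "__main__":
    table = {"id": [1, 2, 3, 4], "score": [10, 55, 70, 20]}
    result = late_materialized_scan(table, "score", lambda s: s > 50, ["id", "score"])
    print(result)
```

Both functions return the same rows; the difference is how much data is decoded along the way, which is why a benefit depends on how selective the filter is and on the cost of evaluating the predicate itself.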


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

