comphead commented on issue #14510: URL: https://github.com/apache/datafusion/issues/14510#issuecomment-2689243363
Hi @PokIsemaine, I'm planning to experiment with this.

One thing you mentioned that is absolutely correct is to use `MemoryReservation`, the helper for memory allocated through the pool; its usage can be read in the metrics using an `EXPLAIN ANALYZE query` statement. The challenge is that our memory pool coverage needs to be improved so that many more transformations are covered.

Another dirty trick that might work is to track the max process memory for each transformation using https://crates.io/crates/sysinfo. This approach shows the memory usage of the whole process, not of a specific node of the physical plan, but with a max memory fingerprint it would be easy to find trends and heavyweight operations.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
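
A rough sketch of the "max memory fingerprint" idea from the comment, kept dependency-free so it runs as-is. It does not use `sysinfo` or any DataFusion API: as a stand-in, it wraps the standard library's `System` allocator in a counting global allocator and records the high-water mark while a closure runs. The names `CountingAlloc` and `measure_peak` are hypothetical, and the closure is assumed to stand for one transformation. Like the process-level approach, the number covers the whole process (here, only heap allocations made through Rust's global allocator), not a single physical plan node.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Bytes currently allocated through the global allocator (process-wide).
static CURRENT: AtomicUsize = AtomicUsize::new(0);
/// High-water mark of `CURRENT` since the last call to `measure_peak`.
static PEAK: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical wrapper around the system allocator that counts bytes.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            // Update the live byte count, then raise the peak if needed.
            let now = CURRENT.fetch_add(layout.size(), Ordering::Relaxed) + layout.size();
            PEAK.fetch_max(now, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        CURRENT.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns its result together with the peak number of bytes
/// allocated above the level that was live when `f` started.
/// Allocations made by other threads during the call are counted too.
fn measure_peak<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let baseline = CURRENT.load(Ordering::Relaxed);
    PEAK.store(baseline, Ordering::Relaxed);
    let out = f();
    let peak = PEAK.load(Ordering::Relaxed);
    (out, peak.saturating_sub(baseline))
}
```

Wrapping each transformation in something like `measure_peak(|| run_step())` (where `run_step` is whatever executes that step) and logging the second tuple element would give one peak figure per step, which is enough to spot trends and heavyweight operations even though it cannot attribute memory to individual plan nodes.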