SauronShepherd commented on PR #49724:
URL: https://github.com/apache/spark/pull/49724#issuecomment-2754869306
> I think generating the EXPLAIN string once per query is OK

It's OK ... as long as it doesn't affect performance, but I'm afraid it does. I wrote an [article](https://medium.com/towards-data-engineering/apache-spark-wtf-i-love-it-when-a-plan-comes-together-part-iii-296a44d323dd) about that.

> The real issue is the AQE plan change event being too frequent and each event generates the EXPLAIN string once.

The way I see it, that's not the real issue here. The problem is not that lots of explains are performed. Yes, there are quite a few, but the real problem is that each explain becomes more and more costly as the plan gets bigger and bigger (because of all the extra steps that AQE needs and includes).

I think the proposed new "off" explain mode is the right approach to fix this, because in most cases, and for most people, all those explains are never seen or analyzed. Do you know of any other framework that generates such a huge amount of verbose internal debugging messages just in case the developer wants to look at them? I don't.

> This sounds like a separated issue. Can we open a new PR for it?

Totally agree with you. My mistake. For me it was all related, because adding the new explain mode fixed the OOM but didn't fix the performance problem. But you're right, this other change can perfectly well go in a new PR.

Btw, I was thinking about (optionally) giving the developer the chance to set the tableName in the cache/persist methods. What do you think? Should I include that in this new PR or split it into two different PRs?

Thanks for your thoughts on this issue.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org