SauronShepherd commented on PR #49724:
URL: https://github.com/apache/spark/pull/49724#issuecomment-2754869306

   > I think generating the EXPLAIN string once per query is OK
   
   It's OK ... as long as it doesn't affect the performance, but I'm afraid it 
does. I wrote an 
[article](https://medium.com/towards-data-engineering/apache-spark-wtf-i-love-it-when-a-plan-comes-together-part-iii-296a44d323dd)
 about that.
   
   > The real issue is the AQE plan change event being too frequent and each 
event generates the EXPLAIN string once.
   
   The way I see it, that's not the real issue here. The problem is not that 
lots of explains are performed. Yes, there are quite a few, but the real 
problem is that each explain becomes more and more costly as the plan gets 
bigger and bigger (because of all the extra steps that AQE needs and includes).
   I think the proposed new "off" explain mode is the right approach to fix 
this, because in most cases, and for most people, all those explains are never 
seen or analyzed. 
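   
   To make the cost argument above concrete, here is a rough toy model (plain 
Python, not Spark code). The `render_plan` helper, the node counts and the 
event counts are all made up for illustration; it only assumes that the full 
plan string is rebuilt on every plan change event while the plan keeps growing.

```python
# Toy model, not Spark code: every name and number here is illustrative.

def render_plan(num_nodes: int) -> str:
    """Stand-in for building an EXPLAIN string: one line per plan node."""
    return "\n".join(f"+- Node {i}" for i in range(num_nodes))


def total_chars_rendered(num_events: int, nodes_added_per_event: int) -> int:
    """Render the whole plan on every plan change event and sum the output size."""
    total = 0
    for event in range(1, num_events + 1):
        # Assumption: the plan grows by a fixed number of nodes at each event.
        plan_size = event * nodes_added_per_event
        total += len(render_plan(plan_size))
    return total


small = total_chars_rendered(num_events=10, nodes_added_per_event=5)
large = total_chars_rendered(num_events=100, nodes_added_per_event=5)

# Ten times more events costs far more than ten times the work,
# because every individual render is also bigger.
print(small, large, large / small)
```

   In this model the total work grows roughly with the square of the number of 
events, so reducing the number of events helps, but each remaining render still 
gets more expensive as the plan grows.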
   
   Do you know of any other framework that generates such a huge amount of 
verbose internal debugging messages just in case the developer wants to look 
at them? I don't.
   
   > This sounds like a separated issue. Can we open a new PR for it?
   
   Totally agree with you. My mistake. For me it was all related, because 
adding the new explain mode fixed the OOM but didn't fix the performance 
problem. But you're right, this other change can perfectly well go in a new PR.
   Btw, I was thinking about (optionally) giving the developer the chance to 
set the tableName in the cache/persist methods. What do you think? Should I 
include that in this new PR or split it into two different PRs?
   
   Thanks for your thoughts on this issue.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

