Where exactly are you getting this information from? As far as I can tell, spark.sql.cbo.enabled has defaulted to false since it was introduced 7 years ago <https://github.com/apache/spark/commit/ae83c211257c508989c703d54f2aeec8b2b5f14d#diff-9ed2b0b7829b91eafb43e040a15247c90384e42fea1046864199fbad77527bb5R649>. It has never been enabled by default.
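A quick way to check what a given session actually reports is to read the keys back from its runtime configuration. The following is a minimal sketch, not an authoritative test: the helper name `optimizer_settings`, the `KEYS` tuple, and the `local[1]` master are assumptions made for illustration. It relies only on the single-argument `spark.conf.get(key)`, which returns the explicitly set value or the built-in default of a registered key.

```python
# Keys discussed in this thread. The last one is the name used in the quoted
# message; whether Spark knows it at all is exactly what is being checked.
KEYS = (
    "spark.sql.cbo.enabled",
    "spark.sql.adaptive.enabled",
    "spark.sql.cbo.strategy",
)


def optimizer_settings(spark, keys=KEYS):
    """Return {key: effective value, or None if the lookup fails}."""
    settings = {}
    for key in keys:
        try:
            # Single-argument get(): the explicitly set value or, for a
            # registered key, its built-in default.
            settings[key] = spark.conf.get(key)
        except Exception:
            # A key that is neither set nor registered raises; the exact
            # exception class differs between PySpark versions.
            settings[key] = None
    return settings


def main():
    # Imported here so the helper above can be used without starting Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    for key, value in optimizer_settings(spark).items():
        shown = value if value is not None else "<not set, no default>"
        print(f"{key} = {shown}")
    spark.stop()
```

Calling `main()` on a session with no overrides should show the effective defaults for that Spark version. A `None` result only says the key is neither set nor registered with a default in that session. The two-argument form `spark.conf.get(key, default)` is deliberately avoided, because it would hand back the supplied fallback for an unset key and hide the built-in default.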
And I cannot see mention of spark.sql.cbo.strategy anywhere at all in the code base. So again, where is this information coming from? Please link directly to your source.

> On Dec 11, 2023, at 5:45 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
> You are right. By default CBO is not enabled. Whilst the CBO was the default optimizer in earlier versions of Spark, it has been replaced by the AQE in recent releases.
>
> spark.sql.cbo.strategy
>
> As I understand, The spark.sql.cbo.strategy configuration property specifies the optimizer strategy used by Spark SQL to generate query execution plans. There are two main optimizer strategies available:
>
> CBO (Cost-Based Optimization): The default optimizer strategy, which analyzes the query plan and estimates the execution costs associated with each operation. It uses statistics to guide its decisions, selecting the plan with the lowest estimated cost.
>
> CBO-Like (Cost-Based Optimization-Like): A simplified optimizer strategy that mimics some of the CBO's logic, but without the ability to estimate costs. This strategy is faster than CBO for simple queries, but may not produce the most efficient plan for complex queries.
>
> The spark.sql.cbo.strategy property can be set to either CBO or CBO-Like. The default value is AUTO, which means that Spark will automatically choose the most appropriate strategy based on the complexity of the query and availability of statistic
>
> Mich Talebzadeh,
> Distinguished Technologist, Solutions Architect & Engineer
> London
> United Kingdom
>
> view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
>
> On Mon, 11 Dec 2023 at 17:11, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>>
>>> On Dec 11, 2023, at 6:40 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>
>>> By default, the CBO is enabled in Spark.
>>
>> Note that this is not correct. AQE is enabled <https://github.com/apache/spark/blob/8235f1d56bf232bb713fe24ff6f2ffdaf49d2fcc/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L664-L669> by default, but CBO isn’t <https://github.com/apache/spark/blob/8235f1d56bf232bb713fe24ff6f2ffdaf49d2fcc/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L2694-L2699>.