Where exactly are you getting this information from?

As far as I can tell, spark.sql.cbo.enabled has defaulted to false since it was
introduced 7 years ago
<https://github.com/apache/spark/commit/ae83c211257c508989c703d54f2aeec8b2b5f14d#diff-9ed2b0b7829b91eafb43e040a15247c90384e42fea1046864199fbad77527bb5R649>.
It has never been enabled by default.

And I cannot find any mention of spark.sql.cbo.strategy anywhere in the code
base.
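
For what it's worth, here is a rough sketch of how anyone can check this on
their own session. It assumes spark is an active PySpark SparkSession;
lookup_conf and report are helper names I made up for illustration, not Spark
APIs. As far as I know, spark.conf.get called without a default returns the
explicitly set value or, for a key Spark has registered, its built-in default,
and raises for a key that is neither set nor registered.

```python
def lookup_conf(spark, key):
    """Return the effective value of a Spark SQL config key, or None if the lookup fails.

    `spark` is assumed to be an active pyspark.sql.SparkSession.
    """
    try:
        # No default argument: we want whatever Spark itself reports for the key.
        return spark.conf.get(key)
    except Exception:
        # The exact exception type varies by Spark version, so catch broadly
        # and treat any failure as "no value available for this key".
        return None


def report(spark, keys):
    """Map each key to its effective value (None when the lookup fails)."""
    return {key: lookup_conf(spark, key) for key in keys}


# Usage, e.g. in a pyspark shell where `spark` already exists:
#   report(spark, [
#       "spark.sql.cbo.enabled",
#       "spark.sql.adaptive.enabled",
#       "spark.sql.cbo.strategy",
#   ])
```

On a stock session I would expect spark.sql.cbo.enabled to come back as false
and spark.sql.cbo.strategy to come back as None, but please verify this against
your own Spark version rather than taking my word for it.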

So again, where is this information coming from? Please link directly to your 
source.



> On Dec 11, 2023, at 5:45 PM, Mich Talebzadeh <mich.talebza...@gmail.com> 
> wrote:
> 
> You are right. By default CBO is not enabled. Whilst the CBO was the default 
> optimizer in earlier versions of Spark, it has been replaced by the AQE in 
> recent releases.
> 
> spark.sql.cbo.strategy
> 
> As I understand it, the spark.sql.cbo.strategy configuration property specifies
> the optimizer strategy used by Spark SQL to generate query execution plans. 
> There are two main optimizer strategies available:
> CBO (Cost-Based Optimization): The default optimizer strategy, which analyzes 
> the query plan and estimates the execution costs associated with each 
> operation. It uses statistics to guide its decisions, selecting the plan with 
> the lowest estimated cost.
> 
> CBO-Like (Cost-Based Optimization-Like): A simplified optimizer strategy that 
> mimics some of the CBO's logic, but without the ability to estimate costs. 
> This strategy is faster than CBO for simple queries, but may not produce the 
> most efficient plan for complex queries.
> 
> The spark.sql.cbo.strategy property can be set to either CBO or CBO-Like. The 
> default value is AUTO, which means that Spark will automatically choose the 
> most appropriate strategy based on the complexity of the query and 
> availability of statistics.
> 
> 
> 
> Mich Talebzadeh,
> Distinguished Technologist, Solutions Architect & Engineer
> London
> United Kingdom
> 
> 
> 
> On Mon, 11 Dec 2023 at 17:11, Nicholas Chammas <nicholas.cham...@gmail.com>
> wrote:
>> 
>>> On Dec 11, 2023, at 6:40 AM, Mich Talebzadeh <mich.talebza...@gmail.com>
>>> wrote:
>>> 
>>> By default, the CBO is enabled in Spark.
>> 
>> Note that this is not correct. AQE is enabled 
>> <https://github.com/apache/spark/blob/8235f1d56bf232bb713fe24ff6f2ffdaf49d2fcc/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L664-L669>
>>  by default, but CBO isn’t 
>> <https://github.com/apache/spark/blob/8235f1d56bf232bb713fe24ff6f2ffdaf49d2fcc/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L2694-L2699>.
