AQE can dynamically reduce the number of shuffle partitions at runtime, which cuts the overhead of scheduling many small tasks. (Pruning partitions based on filter conditions is dynamic partition pruning, which is a separate optimization.) AQE re-optimizes the logical plan after every stage, i.e. at shuffle boundaries, using the statistics collected from the completed stage. The re-optimized plan will coalesce small shuffle partitions into larger ones.
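To make the coalescing step concrete, here is a minimal sketch in plain Python that greedily merges adjacent small shuffle partitions until a target size is reached. It is only an illustration of the idea, not Spark's actual implementation: the helper name coalesce_partitions and the partition sizes are made up for the example, and the 64 MB target is an assumption chosen to mirror the default of spark.sql.adaptive.advisoryPartitionSizeInBytes.

```python
from typing import List


def coalesce_partitions(sizes_mb: List[int], target_mb: int = 64) -> List[List[int]]:
    """Group adjacent partition indices so that each group stays at or under
    target_mb where possible. Hypothetical helper, not a Spark API."""
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(sizes_mb):
        # Close the current group when adding this partition would exceed the target.
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Made-up post-shuffle partition sizes in MB.
    sizes = [5, 10, 3, 70, 20, 30, 1]
    for group in coalesce_partitions(sizes):
        print(group, sum(sizes[i] for i in group), "MB")
```

Only neighbouring partitions are merged, and a partition that is already larger than the target is left on its own. In Spark itself this behaviour is switched on with spark.sql.adaptive.enabled and spark.sql.adaptive.coalescePartitions.enabled.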
Regards,
Guru

On Wed, Nov 13, 2024 at 1:56 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
> Yep, AQE is a useful optimization technique that dynamically adjusts the
> query execution plan based on runtime statistics. It is designed to improve
> query performance.
> You are correct that AQE is primarily triggered at shuffle boundaries. These
> are points in the query plan where data is shuffled between stages, such as
> after a join, aggregation, or window function.
>
> Operations like map, filter, and flatMap that can be executed in a single
> stage usually do not trigger AQE. While repartition does shuffle data, an
> explicitly requested partition count, as in repartition(partitionsNumber),
> is generally kept as specified rather than adjusted by AQE. However, AQE
> might still optimize the subsequent stages if they involve shuffles.
>
> HTH
>
> Mich Talebzadeh,
> Architect | Data Engineer | Data Science | Financial Crime
> PhD Imperial College London
> London, United Kingdom
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> On Tue, 12 Nov 2024 at 15:23, Perfect Stranger <paulpaul1...@gmail.com> wrote:
>>
>> I thought that AQE is triggered after every kind of shuffle operation. But
>> it seems that it isn't. Is there a list of operations that trigger and don't
>> trigger AQE? For example, I noticed that repartition(partitionsNumber) does
>> not trigger AQE.