Yes, AQE (Adaptive Query Execution) is a useful optimization technique that adjusts the query execution plan at runtime, based on statistics collected while the query runs. It is designed to improve query performance. You are correct that AQE primarily takes effect at shuffle boundaries. These are the points in the query plan where data is exchanged between stages, for example the shuffles introduced by a join, aggregation, or window function. Narrow operations like map, filter, and flatMap, which run within a single stage, give AQE nothing to re-optimize. repartition(n) does shuffle data, but because you have asked for an explicit number of partitions, AQE generally respects that number instead of coalescing it, which is probably why it looks as though AQE was not triggered. AQE may still optimize subsequent stages if they involve shuffles.
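If you want to check this on your own setup, here is a minimal PySpark sketch that compares the final plan of an aggregation shuffle with that of repartition(n). The config keys are standard Spark 3.x settings. The data, the sizes, and the names AQE_CONF and show_plans are made up for illustration, and the plan output will differ between Spark versions, so treat it as a starting point and not as a definitive test.

```python
# Sketch only: compare the plans Spark produces for an aggregation shuffle
# and for repartition(n) with AQE enabled. Data and sizes are illustrative.

AQE_CONF = {
    "spark.sql.adaptive.enabled": "true",
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    "spark.sql.shuffle.partitions": "200",
}


def show_plans(rows=100_000, explicit_partitions=50):
    """Run both queries, then print their executed plans for comparison."""
    # Imported lazily so the module loads even where pyspark is not installed.
    from pyspark.sql import SparkSession, functions as F

    builder = SparkSession.builder.master("local[2]").appName("aqe-demo")
    for key, value in AQE_CONF.items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()
    try:
        df = spark.range(rows).withColumn("k", F.col("id") % 10)

        # Shuffle introduced by the aggregation: a candidate for AQE.
        agg = df.groupBy("k").count()
        agg.collect()   # run the query so explain() shows the final plan
        agg.explain()

        # Shuffle requested with an explicit partition count.
        fixed = df.repartition(explicit_partitions)
        fixed.collect()
        fixed.explain()
    finally:
        spark.stop()
```

Calling show_plans() prints both plans. In each one, look at the Exchange node and at any shuffle-read node sitting above it (named AQEShuffleRead in recent releases, CustomShuffleReader in older ones, if I remember correctly) to see whether the partition count was adjusted.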
HTH

Mich Talebzadeh,
Architect | Data Engineer | Data Science | Financial Crime
PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College London <https://en.wikipedia.org/wiki/Imperial_College_London>
London, United Kingdom

view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* The information provided is correct to the best of my knowledge but of course cannot be guaranteed. It is essential to note that, as with any advice, "one test result is worth one-thousand expert opinions" (Wernher von Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>).

On Tue, 12 Nov 2024 at 15:23, Perfect Stranger <paulpaul1...@gmail.com> wrote:

> I thought that AQE is triggered after every kind of shuffle operation. But
> it seems that it isn't. Is there a list of operations that trigger and
> don't trigger AQE? For example I noticed that repartition(partitionsNumber)
> does not trigger AQE.