wirybeaver commented on PR #22947:
URL: https://github.com/apache/datafusion/pull/22947#issuecomment-4713568295

   Calling out an important point from the discussion in #22946: this PR does 
not just add a spill path to the existing implementation. It also changes the 
`WindowAggExec` execution model.
   
   Current upstream behavior is:
   
   ```text
   buffer all input -> concat all input -> compute all partitions -> emit once
   ```
   
   This PR changes it to:
   
   ```text
   buffer one active partition -> spill it if needed -> compute completed partition -> emit partition output
   ```
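
To make the contrast between the two flows concrete, here is a toy Python sketch. It is not DataFusion code: the function names are hypothetical, `row_number()` stands in for the window function, and it assumes the input arrives already sorted by the partition key.

```python
from itertools import groupby
from operator import itemgetter


def window_all_at_once(rows):
    """Toy version of the upstream flow: buffer everything, then emit once."""
    buffered = list(rows)  # memory grows with the full child input
    out = []
    for _, part in groupby(buffered, key=itemgetter(0)):
        out.extend((key, value, n) for n, (key, value) in enumerate(part, start=1))
    return [out]  # a single output batch


def window_per_partition(rows):
    """Toy version of the new flow: hold one active partition, emit it when complete."""
    active_key, active = None, []
    for key, value in rows:
        if active and key != active_key:
            # Partition boundary: the active partition is complete, so emit it.
            yield [(active_key, v, n) for n, v in enumerate(active, start=1)]
            active = []
        active_key = key
        active.append(value)
    if active:
        # Emit the last partition once the input is exhausted.
        yield [(active_key, v, n) for n, v in enumerate(active, start=1)]


# Assumption: rows are (partition_key, value) pairs sorted by partition_key.
rows = [("a", 10), ("a", 20), ("b", 5), ("c", 7), ("c", 8), ("c", 9)]
once = window_all_at_once(rows)
streamed = list(window_per_partition(rows))
```

Both functions produce the same rows; the difference is that the second one never holds more than one partition's values at a time and yields one output batch per partition.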
   
   That distinction matters because the current memory problem is broader than 
"a large window partition may OOM": today, memory usage can scale with the 
full child input even when every partition is small. With this PR, memory usage 
is bounded by the active partition being buffered and the completed partition 
being computed, with spill used when the active partition cannot grow its 
reservation.
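
As a rough illustration of "spill when the active partition cannot grow its reservation", here is a minimal Python sketch. The `ActivePartition` class, its row-count `limit`, and the pickle-based spill file are all assumptions made for the example; they do not reflect the PR's actual reservation or spill APIs. For simplicity `finish` reads the spilled rows back in one go, which a real operator would presumably avoid.

```python
import pickle
import tempfile


class ActivePartition:
    """Hypothetical buffer: keeps at most `limit` rows in memory, spills the rest."""

    def __init__(self, limit):
        self.limit = limit       # stand-in for a memory reservation, counted in rows
        self.in_memory = []
        self.spill_file = None
        self.peak = 0            # largest number of rows held in memory at once

    def push(self, row):
        if len(self.in_memory) >= self.limit:
            # The reservation cannot grow: move the buffered rows to a spill file.
            if self.spill_file is None:
                self.spill_file = tempfile.TemporaryFile()
            pickle.dump(self.in_memory, self.spill_file)
            self.in_memory = []
        self.in_memory.append(row)
        self.peak = max(self.peak, len(self.in_memory))

    def finish(self):
        """Return every row of the completed partition in arrival order, then reset."""
        rows = []
        if self.spill_file is not None:
            self.spill_file.seek(0)
            while True:
                try:
                    rows.extend(pickle.load(self.spill_file))
                except EOFError:
                    break  # all spilled chunks have been read back
            self.spill_file.close()
            self.spill_file = None
        rows.extend(self.in_memory)
        self.in_memory = []
        return rows


buf = ActivePartition(limit=2)
for value in range(5):
    buf.push(value)
peak_rows = buf.peak
restored = buf.finish()
```

In this toy, the in-memory buffer never exceeds `limit` rows while the partition is being collected, regardless of how large the partition or the overall input is.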
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

