gyfora opened a new pull request, #505:
URL: https://github.com/apache/flink-kubernetes-operator/pull/505
## What is the purpose of the change
Make busyTime aggregation configurable and use MAX instead of AVG by default.
The previous AVG aggregation was prone to overestimating the true
processing/output rates whenever the data was skewed, because averaging
busyTime across subtasks hides the busiest one. The autoscaler could
therefore assume that operators were within their target capacity when in
fact they were already backpressuring the whole pipeline.
Changing the default aggregation to MAX addresses this problem.
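
To illustrate the effect described above, here is a minimal sketch, not the operator's actual code. It assumes a simplified model in which the estimated true rate is the observed rate divided by the busy fraction. The class name, the `Aggregator` enum, both method names and the sample numbers are all hypothetical.

```java
import java.util.Arrays;

public class BusyTimeAggregationSketch {

    // Hypothetical aggregation modes, named after the AVG/MAX options above.
    enum Aggregator { AVG, MAX }

    // Collapse per-subtask busy time (ms per second, 0..1000) into one value.
    static double aggregate(double[] busyTimeMsPerSecond, Aggregator aggregator) {
        switch (aggregator) {
            case AVG:
                return Arrays.stream(busyTimeMsPerSecond).average().orElse(0.0);
            case MAX:
                return Arrays.stream(busyTimeMsPerSecond).max().orElse(0.0);
            default:
                throw new IllegalArgumentException("Unknown aggregator: " + aggregator);
        }
    }

    // Simplified assumption: estimated true rate = observed rate / busy fraction.
    static double estimateTrueRate(double observedRatePerSecond, double busyTimeMsPerSecond) {
        return observedRatePerSecond / (busyTimeMsPerSecond / 1000.0);
    }

    public static void main(String[] args) {
        // Made-up skewed job: one hot subtask, three mostly idle ones.
        double[] busy = {950.0, 200.0, 200.0, 250.0};
        double observedRate = 1000.0; // records per second, made up

        for (Aggregator aggregator : Aggregator.values()) {
            double aggregated = aggregate(busy, aggregator);
            System.out.printf(
                    "%s: busyTime=%.1f ms/s, estimated true rate=%.1f rec/s%n",
                    aggregator, aggregated, estimateTrueRate(observedRate, aggregated));
        }
    }
}
```

With these made-up numbers, AVG yields a busy time of 400 ms/s and an estimated true rate of 2500 rec/s, which suggests ample headroom. MAX yields 950 ms/s and roughly 1053 rec/s, which reflects that the hot subtask is already close to saturation.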
## Verifying this change
Modified existing tests to verify the new aggregation scheme and the
configurable setting. Verified the change manually on a skewed job.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., are there any changes to the `CustomResourceDescriptors`: no
- Core observer or reconciler logic that is regularly executed: no
## Documentation
- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented? docs