Dennis-Mircea opened a new pull request, #1149:
URL: https://github.com/apache/flink-kubernetes-operator/pull/1149
> **Merge / alignment note**: This change should be merged together with,
and kept aligned with, #1088 ([FLINK-39938] FLIP-586: Composable Parallelism
Alignment Modes for Flink Autoscaler), which also reworks the autoscaler
context model.
## What is the purpose of the change
The autoscaler pipeline (metric collection, evaluation, scaling execution)
passed its per-cycle data through long, growing parameter lists, and each stage
reached into the previous stage's outputs in ad-hoc ways. The two plugin SPIs
(FLIP-514 custom evaluator and FLIP-575 scaling executor) each defined their
own context type that re-declared data already present on
`JobAutoScalerContext` (configuration, evaluated metrics, job topology), so the
same information existed in three shapes.
This change gives the autoscaler a single canonical context that is threaded
through the whole cycle and enriched as it advances:
- A per-cycle `ScalingCycleState` carried by `JobAutoScalerContext` holds
the working data of one scaling cycle.
- Each pipeline stage depends only on the context handed to it.
- Both plugin contexts become thin extensions of `JobAutoScalerContext`
rather than parallel types.
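
To make the idea of a context that is "threaded through the whole cycle and enriched as it advances" concrete, here is a minimal, self-contained sketch. It is not the code in this PR: `SketchCycleState`, `SketchContext`, `SketchPipeline`, the metric names, and the `target.utilization` key are all hypothetical stand-ins, and the scaling rule is invented purely for illustration.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the per-cycle working state (not the real ScalingCycleState). */
class SketchCycleState {
    Instant cycleStart;
    final Map<String, Double> collectedMetrics = new HashMap<>();
    final Map<String, Double> evaluatedMetrics = new HashMap<>();
}

/** Hypothetical stand-in for the canonical context that carries the cycle state. */
class SketchContext {
    private final Map<String, String> configuration;
    private SketchCycleState cycleState; // created lazily on first access

    SketchContext(Map<String, String> configuration) {
        this.configuration = configuration;
    }

    Map<String, String> getConfiguration() {
        return configuration;
    }

    SketchCycleState getCycleState() {
        if (cycleState == null) {
            cycleState = new SketchCycleState();
        }
        return cycleState;
    }

    /** Convenience accessor derived from the cycle state. */
    Map<String, Double> getEvaluatedMetrics() {
        return getCycleState().evaluatedMetrics;
    }
}

/** Three stages; each one reads and writes only through the context it is handed. */
class SketchPipeline {
    static void collect(SketchContext ctx) {
        // A real collector would query the job; a fixed value keeps the sketch runnable.
        ctx.getCycleState().collectedMetrics.put("observedRate", 100.0);
    }

    static void evaluate(SketchContext ctx) {
        double observed = ctx.getCycleState().collectedMetrics.get("observedRate");
        double utilization =
                Double.parseDouble(
                        ctx.getConfiguration().getOrDefault("target.utilization", "0.5"));
        ctx.getCycleState().evaluatedMetrics.put("targetCapacity", observed / utilization);
    }

    static boolean execute(SketchContext ctx) {
        // Invented rule: "scale" when the target capacity exceeds an arbitrary threshold.
        return ctx.getEvaluatedMetrics().get("targetCapacity") > 150.0;
    }

    /** Cycle set-up followed by the three stage calls. */
    static boolean runCycle(SketchContext ctx) {
        ctx.getCycleState().cycleStart = Instant.now();
        collect(ctx);
        evaluate(ctx);
        return execute(ctx);
    }
}
```

Because every stage takes only the context, adding a new piece of per-cycle data in this sketch means adding a field to the cycle state rather than widening each stage's parameter list.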
## Brief change log
- Added `JobAutoScalerContext.ScalingCycleState`, the mutable per-cycle
working state of the pipeline (cycle start instant, autoscaler metrics sink,
collected metrics, scaling tracking and history, restart time, evaluated
metrics). It is lazily created per context, has package-private setters, and is
excluded from `toString`.
- Reworked `JobAutoScalerContext` to carry the cycle state and expose
convenience accessors derived from it (`getJobTopology`, `getEvaluatedMetrics`,
`getMetricsHistory`, `getRestartTime`). Added a protected copy constructor used
by plugin contexts that shares the cycle state and replaces the configuration
with the effective per-plugin one.
- Reworked the pipeline into three stages that each rely solely on the
context: `ScalingMetricCollector.collect(ctx)`,
`ScalingMetricEvaluator.evaluate(ctx)`, and `ScalingExecutor.execute(ctx)`.
`runScalingLogic` in `JobAutoScalerImpl` becomes the cycle set-up plus those
three calls.
- Moved the metric state store onto `ScalingMetricCollector` as a field, and
the retained `lastEvaluatedMetrics` onto `ScalingMetricEvaluator` (now generic
and stateful), so the scaling metric gauges keep reporting the latest values
across cycles.
- Collapsed the evaluator entry points to a single `evaluate(Context)` (the
production path that records results and registers gauges), backed by an
internal `computeEvaluatedMetrics` worker that runs the custom evaluator plugin
only when a context is present.
- Made `ScalingExecutorPlugin.Context` and
`ScalingMetricsEvaluatorPlugin.Context` extend `JobAutoScalerContext`. They
share the canonical `ScalingCycleState`, expose the effective per-plugin
configuration through the inherited `getConfiguration()`, and add only the data
that is genuinely plugin-specific (the in-progress evaluated vertex metrics and
backlog flag for the evaluator, nothing extra for the executor).
- Dropped the vestigial `CTX` type parameter from `ScalingExecutorPlugin`,
leaving `ScalingExecutorPlugin<KEY>`.
- Removed the redundant `delayedScaleDown` parameter that was threaded
alongside the context.
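
The copy-constructor pattern described above (share the cycle state, swap in the effective per-plugin configuration, add only plugin-specific data) can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `SharedCycleStateSketch`, `CanonicalContextSketch`, `EvaluatorPluginContextSketch`, and the `processingBacklog` field name are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the shared per-cycle state. */
class SharedCycleStateSketch {
    final Map<String, Double> evaluatedMetrics = new HashMap<>();
}

/** Hypothetical stand-in for the canonical context. */
class CanonicalContextSketch {
    private final Map<String, String> configuration;
    private SharedCycleStateSketch cycleState; // created lazily

    CanonicalContextSketch(Map<String, String> configuration) {
        this.configuration = configuration;
    }

    /** Copy constructor: shares the cycle state by reference, replaces the configuration. */
    protected CanonicalContextSketch(
            CanonicalContextSketch other, Map<String, String> effectiveConfiguration) {
        this.configuration = effectiveConfiguration;
        this.cycleState = other.getCycleState();
    }

    Map<String, String> getConfiguration() {
        return configuration;
    }

    SharedCycleStateSketch getCycleState() {
        if (cycleState == null) {
            cycleState = new SharedCycleStateSketch();
        }
        return cycleState;
    }
}

/** Hypothetical plugin context: a thin extension that adds only plugin-specific data. */
class EvaluatorPluginContextSketch extends CanonicalContextSketch {
    private final boolean processingBacklog;

    EvaluatorPluginContextSketch(
            CanonicalContextSketch base,
            Map<String, String> pluginConfiguration,
            boolean processingBacklog) {
        super(base, pluginConfiguration);
        this.processingBacklog = processingBacklog;
    }

    boolean isProcessingBacklog() {
        return processingBacklog;
    }
}
```

In this sketch a plugin that writes to `getCycleState()` is writing to the same object the pipeline reads, while `getConfiguration()` on the plugin context returns the plugin's own configuration and leaves the base context's configuration untouched.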
## Verifying this change
This is an internal refactor and does not change autoscaler behavior.
- The full flink-autoscaler suite passes (245 tests), including the migrated
tests for the collect / evaluate / execute stages.
- Added
`ScalingMetricEvaluatorTest.testEvaluatorPluginContextExtendsCanonicalContext`
and extended the executor's plugin-context test to assert the new guarantees:
the plugin context is a `JobAutoScalerContext`, shares the canonical
`ScalingCycleState` and metric group by identity, and returns the effective
per-plugin configuration from `getConfiguration()`.
- Downstream modules (flink-autoscaler-standalone,
flink-kubernetes-operator) compile against the updated signatures.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., are there any changes to the `CustomResourceDescriptors`:
no
- Core observer or reconciler logic that is regularly executed: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable.