[PR] [FLINK-34906] Only scale when all tasks are running [flink-kubernetes-operator]

via GitHub Thu, 21 Mar 2024 02:59:48 -0700


1996fanrui opened a new pull request, #801:
URL: https://github.com/apache/flink-kubernetes-operator/pull/801

## What is the purpose of the change

Currently, the autoscaler scales a job whenever its JobStatus is RUNNING.
However, the JobStatus becomes RUNNING as soon as the job starts scheduling,
so it does not mean that all tasks are running. The gap is especially visible
when resources are insufficient or when the job is recovering from a large
state.

When tasks are not ready, the autoscaler throws an exception and generates an
AutoscalerError event. There is also no need to scale a job while some of its
tasks are not ready.

## Brief change log

- [FLINK-34906] Only scale when all tasks are running
- Solution: only scale a job when all of its tasks are running (some tasks
may already be finished).

We can know how many tasks are running from `JobDetailsInfo`:

![image](https://github.com/apache/flink-kubernetes-operator/assets/38427477/b440ac9d-eddc-49b7-b534-b6755fa9e181)
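
To make the check concrete, here is a minimal, self-contained sketch of the idea, not the code in this PR. It assumes the per-state task counts have already been extracted into a plain map; the `TaskReadiness` class, the `TaskState` enum, and the `allTasksRunning` method are hypothetical names for illustration and do not use the real `JobDetailsInfo` API. Requiring at least one RUNNING task is also an assumption of this sketch.

```java
import java.util.EnumMap;
import java.util.Map;

public class TaskReadiness {

    /** Hypothetical stand-in for the per-task execution states reported for a job. */
    public enum TaskState {
        CREATED, SCHEDULED, DEPLOYING, INITIALIZING, RUNNING, FINISHED, CANCELING, CANCELED, FAILED
    }

    /**
     * Returns true only if every counted task is RUNNING or FINISHED.
     * Assumption of this sketch: at least one task must be RUNNING,
     * since a job with no running tasks has nothing to scale.
     */
    public static boolean allTasksRunning(Map<TaskState, Integer> tasksPerState) {
        int running = 0;
        for (Map.Entry<TaskState, Integer> entry : tasksPerState.entrySet()) {
            int count = entry.getValue();
            if (entry.getKey() == TaskState.RUNNING) {
                running += count;
            } else if (entry.getKey() != TaskState.FINISHED && count > 0) {
                // Some task is still being scheduled, deployed, or has failed.
                return false;
            }
        }
        return running > 0;
    }

    public static void main(String[] args) {
        Map<TaskState, Integer> counts = new EnumMap<>(TaskState.class);
        counts.put(TaskState.RUNNING, 3);
        counts.put(TaskState.DEPLOYING, 1);
        System.out.println("scale allowed: " + allTasksRunning(counts));

        // Once the last task is deployed, the job becomes eligible for scaling.
        counts.put(TaskState.DEPLOYING, 0);
        counts.put(TaskState.RUNNING, 4);
        System.out.println("scale allowed: " + allTasksRunning(counts));
    }
}
```

In this sketch the autoscaler would skip the scaling decision whenever `allTasksRunning` returns false, instead of proceeding and raising an error.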

## Verifying this change

Manual testing is done; unit tests are still being written.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., are there any changes to the
`CustomResourceDescriptors`: no
- Core observer or reconciler logic that is regularly executed: no

## Documentation

- Does this pull request introduce a new feature? no

