1996fanrui commented on PR #801:
URL:
https://github.com/apache/flink-kubernetes-operator/pull/801#issuecomment-2017164952
Thanks @mxm for the review and discussion!
> > This issue only affects the standalone autoscaler as the kubernetes
operator has this logic already in place for setting the RUNNING state. Can we
somehow deduplicate this logic?
>
> Is that really the case? AFAIK we only check for a RUNNING job state.
`AbstractFlinkService#getEffectiveStatus` adjusts `JobStatus.RUNNING` to
`JobStatus.CREATED` when not all tasks are running; thanks @gyfora for helping
find it. I didn't extract it into a common class because @gyfora mentioned that
`autoscaler` may be moved to a separate repo, so it's better to copy the related
logic to the `autoscaler standalone` module.
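To make the adjustment concrete, here is a minimal sketch of the idea, not the actual operator code. The `JobState` and `TaskState` enums and the `effectiveStatus` method are hypothetical, simplified stand-ins; treating an empty task list as "not all running" is also an assumption of this sketch.

```java
import java.util.List;

public class EffectiveStatusSketch {
    // Hypothetical, simplified stand-ins for Flink's job and task states.
    public enum JobState { CREATED, RUNNING, FAILED, FINISHED }
    public enum TaskState { SCHEDULED, DEPLOYING, INITIALIZING, RUNNING, FINISHED }

    // Downgrades a reported RUNNING job to CREATED until every task is RUNNING.
    // Any other reported job state is returned unchanged.
    public static JobState effectiveStatus(JobState reported, List<TaskState> tasks) {
        if (reported != JobState.RUNNING) {
            return reported;
        }
        // Assumption: a job with no known tasks is treated as not yet running.
        boolean allRunning =
                !tasks.isEmpty() && tasks.stream().allMatch(t -> t == TaskState.RUNNING);
        return allRunning ? JobState.RUNNING : JobState.CREATED;
    }

    public static void main(String[] args) {
        // One task is still deploying, so the effective status is CREATED.
        System.out.println(effectiveStatus(
                JobState.RUNNING, List.of(TaskState.RUNNING, TaskState.DEPLOYING)));
        // All tasks are running, so the effective status stays RUNNING.
        System.out.println(effectiveStatus(
                JobState.RUNNING, List.of(TaskState.RUNNING, TaskState.RUNNING)));
    }
}
```

With a check like this in place, a caller that only reacts to `RUNNING` skips metric collection while some tasks are still starting.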
> This looks related to #699 which took a different approach by ignoring
certain exceptions during the stabilization phase and effectively postponing
metric collection.
The adjustment logic was introduced before #699. Some metrics may not be ready
even when all tasks are running (I guess some metrics are only generated after
the tasks start running), and that is exactly what #699 solved.
Why do we need to adjust the JobStatus?
- If some tasks are not running, the autoscaler doesn't need to call the metric collection logic.
- If users set `job.autoscaler.stabilization.interval` to a small value, it's easy to hit a metric-not-found exception.
- As I understand it, `job.autoscaler.stabilization.interval` is meant to filter out unstable metrics right after all tasks start running.
- For example, the job starts at `09:00:00`, all tasks start running at `09:03:00`, and `job.autoscaler.stabilization.interval` = 1 min.
- We want the stabilization period to be `09:03:00` to `09:04:00` instead of `09:00:00` to `09:01:00`, right?
- All tasks have only been running since `09:03:00`, so the metrics may not be stable from `09:03:00` to `09:04:00`.
- Of course, this issue might need FLINK-34907 as well.
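
The timing example above can be sketched as follows. This is only an illustration under stated assumptions: `isStabilizing` is a hypothetical helper, not an autoscaler API, and `LocalTime` is used only to mirror the clock times in the example (real code would more likely use `Instant`).

```java
import java.time.Duration;
import java.time.LocalTime;

public class StabilizationWindowSketch {
    // Returns true while `now` falls inside [windowStart, windowStart + interval).
    // Hypothetical helper; LocalTime wraps at midnight, which is ignored here.
    public static boolean isStabilizing(LocalTime windowStart, Duration interval, LocalTime now) {
        LocalTime windowEnd = windowStart.plus(interval);
        return !now.isBefore(windowStart) && now.isBefore(windowEnd);
    }

    public static void main(String[] args) {
        LocalTime jobStart = LocalTime.of(9, 0, 0);
        LocalTime allTasksRunning = LocalTime.of(9, 3, 0);
        Duration interval = Duration.ofMinutes(1);
        LocalTime now = LocalTime.of(9, 3, 30);

        // Window anchored at job start: it ended at 09:01:00, so metrics
        // collected at 09:03:30 would not be filtered out.
        System.out.println(isStabilizing(jobStart, interval, now));

        // Window anchored at the time all tasks started running:
        // 09:03:30 is still inside 09:03:00 to 09:04:00.
        System.out.println(isStabilizing(allTasksRunning, interval, now));
    }
}
```

Anchoring the window at the moment all tasks are running is what the adjusted job status makes possible.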
Please correct me if anything is wrong, thanks a lot.