[
https://issues.apache.org/jira/browse/FLINK-34655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17826015#comment-17826015
]
Maximilian Michels commented on FLINK-34655:
--------------------------------------------
Thanks for raising awareness of Flink version compatibility, [~fanrui]!
Although we've been using Flink Autoscaling with 1.16, it is true that only
Flink 1.17 supports it out of the box.
{quote}In the short term, we only use the autoscaler to give suggestion instead
of scaling directly. After our users think the parallelism calculation is
reliable, they will have stronger motivation to upgrade the flink version.
{quote}
I understand the idea behind providing suggestions. However, it is difficult to
assess the quality of Autoscaling decisions without applying them
automatically. The reason is that suggestions become stale very quickly if the
load pattern is not completely static. Even for static load patterns, if the
user doesn't redeploy within a matter of minutes, the suggestions might already
be stale again when the number of pending records has increased too much. In
any case, production load patterns are rarely static, which means that
autoscaling will inevitably trigger multiple times a day, but that is where its
real power is unleashed. It would be great to hear about any concerns your
users have about turning on automatic scaling. We've been operating it in
production for about a year now.
Back to the issue here, should we think about a patch release for 1.15 / 1.16
to add support for overriding vertex parallelism?
> Autoscaler doesn't work for flink 1.15
> --------------------------------------
>
> Key: FLINK-34655
> URL: https://issues.apache.org/jira/browse/FLINK-34655
> Project: Flink
> Issue Type: Bug
> Components: Autoscaler
> Reporter: Rui Fan
> Assignee: Rui Fan
> Priority: Major
> Labels: pull-request-available
> Fix For: kubernetes-operator-1.8.0
>
>
> flink-kubernetes-operator is committed to supporting the latest 4 flink minor
> versions, and autoscaler is a part of flink-kubernetes-operator. Currently,
> the latest 4 flink minor versions are 1.15, 1.16, 1.17 and 1.18.
> But autoscaler doesn't work for flink 1.15.
> h2. Root cause:
> * FLINK-28310 added some properties in IOMetricsInfo in flink-1.16
> * IOMetricsInfo is a part of JobDetailsInfo
> * JobDetailsInfo is necessary for autoscaler [1]
> * flink's RestClient doesn't allow any property to be missing when
> deserializing the JSON
> That means that the RestClient after 1.15 cannot fetch JobDetailsInfo for
> 1.15 jobs.
> h2. How to fix it properly?
> - [FLINK-34655] Copy IOMetricsInfo to flink-autoscaler-standalone module
> - Remove the copy once 1.15 is no longer supported
> [1]
> https://github.com/apache/flink-kubernetes-operator/blob/ede1a610b3375d31a2e82287eec67ace70c4c8df/flink-autoscaler/src/main/java/org/apache/flink/autoscaler/ScalingMetricCollector.java#L109
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-401%3A+REST+API+JSON+response+deserialization+unknown+field+tolerance
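
The root cause listed in the issue description (a client that rejects a response lacking properties it expects) can be illustrated with a minimal, self-contained Java sketch. It does not use Flink's RestClient or any JSON library. The class `StrictDeserializationSketch`, the `IoMetrics` type, and the property names `read-bytes` and `new-metric` are hypothetical stand-ins, and a plain `Map` plays the role of the parsed JSON object.

```java
import java.util.List;
import java.util.Map;

public class StrictDeserializationSketch {

    /** Hypothetical response type; a stand-in, not Flink's IOMetricsInfo. */
    public static final class IoMetrics {
        public final long readBytes;
        public final long newMetric;

        IoMetrics(long readBytes, long newMetric) {
            this.readBytes = readBytes;
            this.newMetric = newMetric;
        }
    }

    // Properties the newer client expects; "new-metric" is a made-up name
    // standing in for a property added in a later server version.
    private static final List<String> EXPECTED = List.of("read-bytes", "new-metric");

    /** Strict parsing: fails if any expected property is absent. */
    public static IoMetrics parseStrict(Map<String, Long> json) {
        for (String key : EXPECTED) {
            if (!json.containsKey(key)) {
                throw new IllegalArgumentException("Missing property: " + key);
            }
        }
        return new IoMetrics(json.get("read-bytes"), json.get("new-metric"));
    }

    /** Tolerant parsing: an absent property falls back to a sentinel default. */
    public static IoMetrics parseTolerant(Map<String, Long> json) {
        return new IoMetrics(
                json.getOrDefault("read-bytes", -1L),
                json.getOrDefault("new-metric", -1L));
    }

    public static void main(String[] args) {
        // Payload as an older server might send it: "new-metric" is absent.
        Map<String, Long> oldPayload = Map.of("read-bytes", 1024L);

        try {
            parseStrict(oldPayload);
        } catch (IllegalArgumentException e) {
            System.out.println("strict: " + e.getMessage());
        }
        System.out.println("tolerant: newMetric=" + parseTolerant(oldPayload).newMetric);
    }
}
```

With the older payload, the strict path throws before any metrics can be read, while the tolerant path returns an object whose missing value is marked by the sentinel. The sketch only shows the failure mode and makes no claim about how the actual fix is implemented.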