[ 
https://issues.apache.org/jira/browse/FLINK-36018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17872769#comment-17872769
 ] 

Rui Fan commented on FLINK-36018:
---------------------------------

Hey [~gyfora] 
{quote}It makes sense to introduce a new config, maybe something like 
`job.autoscaler.scale-down.minimum-interval`, which may have a more 
straightforward meaning than the word lazy.
{quote}
Lazy-period means that we don't trigger the first scale down right away; we 
wait for a while. The scale down is triggered only if the job still needs to 
be scaled down after the lazy period.

Grace-period or minimum-interval means that we trigger the first scale down 
immediately. If the job needs to be scaled down again within the grace period 
after the first scale down, those later scale downs are triggered lazily.

As I understand it, lazy-period could reduce the restart frequency more than 
grace-period (minimum-interval) in some cases:
 * Traffic (recommended parallelism) continues to decrease.
 * After the traffic decreases, it recovers within a short period of time 
(neither the scale down nor the following scale up is needed).
 * Some critical jobs are never restarted if lazy-period is set to 24 hours.
 ** Some critical jobs do not want to be restarted (if the actual parallelism 
is the same as the peak recommended parallelism).
 ** In this case, all scale down requests are canceled within 24 hours if 
lazy-period is set to 24 hours.
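
To make the difference concrete, here is a rough single-vertex sketch that counts restarts under both policies. It is only my reading of the two semantics, not autoscaler code: the class and method names are made up, time is modeled as minutes between evaluations, and scale ups are assumed to fire immediately in both cases.

```java
/** Hypothetical comparison of the two scale-down policies; not Flink autoscaler code. */
class ScaleDownPolicyComparison {

    /** First scale down fires immediately; later ones wait until minIntervalMin has passed. */
    static int restartsWithMinimumInterval(
            int parallelism, int[] recommended, int stepMin, int minIntervalMin) {
        int restarts = 0;
        Integer lastScaleDown = null; // minute of the last scale down, null if none yet
        for (int i = 0; i < recommended.length; i++) {
            int now = i * stepMin;
            int rec = recommended[i];
            if (rec > parallelism) {
                // Scale up is always immediate.
                parallelism = rec;
                restarts++;
            } else if (rec < parallelism
                    && (lastScaleDown == null || now - lastScaleDown >= minIntervalMin)) {
                parallelism = rec;
                restarts++;
                lastScaleDown = now;
            }
        }
        return restarts;
    }

    /** Scale down fires only after the need has persisted for longer than lazyPeriodMin. */
    static int restartsWithLazyPeriod(
            int parallelism, int[] recommended, int stepMin, int lazyPeriodMin) {
        int restarts = 0;
        Integer pendingSince = null; // minute when the scale-down need was first seen
        for (int i = 0; i < recommended.length; i++) {
            int now = i * stepMin;
            int rec = recommended[i];
            if (rec >= parallelism) {
                // The need for a scale down disappeared: cancel the pending request.
                pendingSince = null;
                if (rec > parallelism) {
                    parallelism = rec;
                    restarts++;
                }
            } else {
                if (pendingSince == null) {
                    pendingSince = now;
                }
                if (now - pendingSince > lazyPeriodMin) {
                    parallelism = rec;
                    restarts++;
                    pendingSince = null;
                }
            }
        }
        return restarts;
    }
}
```

For a short dip such as recommendations {60, 100, 100, 100} evaluated every 15 minutes with a 30 minute setting and a starting parallelism of 100, this model restarts twice with minimum-interval (down, then up again) and not at all with lazy-period.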

If the lazy-period name isn't easy to understand, how about these names?
 * job.autoscaler.scale-down.trigger-delay
 * job.autoscaler.scale-down.delay-trigger-period
 * job.autoscaler.scale-down.lazy-trigger-period
 * job.autoscaler.scale-down.delay-period
 * job.autoscaler.scale-down.lazy-period

> Support lazy scale down to avoid frequent rescaling
> ---------------------------------------------------
>
>                 Key: FLINK-36018
>                 URL: https://issues.apache.org/jira/browse/FLINK-36018
>             Project: Flink
>          Issue Type: Improvement
>          Components: Autoscaler
>            Reporter: Rui Fan
>            Assignee: Rui Fan
>            Priority: Major
>
> {*}{color:#de350b}Core idea{color}{*}: Make scaling up sensitive to prevent 
> lags, and make scaling down insensitive to reduce restart frequency.
> h1. Background & Motivation
> We enabled autoscaling for a few Flink production jobs. It works with the 
> Adaptive Scheduler and the Rescale API.
> Scaling results:
>  * The recommended parallelism meets expectations most of the time
>  * When the source traffic increases, the autoscaler scales up the job in 
> time to prevent lags.
>  * When the source traffic decreases, the autoscaler scales down the job in 
> time to save resources.
>  * {color:#de350b}*Pain point:*{color} Each job rescales more than 20 times a 
> day (job.autoscaler.metrics.window=15 min by default).
> As we all know, the job is unavailable for a while during a restart, for 
> several reasons:
>  * Cancel job
>  * Request resources( 
> [FLIP-472|https://cwiki.apache.org/confluence/display/FLINK/FLIP-472%3A+Aligning+timeout+logic+in+the+AdaptiveScheduler%27s+WaitingForResources+and+Executing+states]
>  is optimizing it)
>  * Initialize task
>  * Restore state
>  * Catch up lag during restart
>  * etc
> *{color:#de350b}Expectations:{color}*
>  * Scaling up in time to prevent lags.
>  * Lazy scaling down to reduce downtime and ensure resources can be released 
> later.
> h1. Solution:
> Introduce job.autoscaler.scale-down.lazy-period, the default value could be 
> 30 min.
> Detailed strategies:
>  * Record the start time of the first scale-down event for each vertex 
> separately. For example:
>  ** vertex1: 2024-08-09 01:35:02
>  ** vertex2: 2024-08-09 01:38:02
>  * Scaling down is triggered in the following cases:
>  ** Any vertex needs to scale up
>  *** The job restart cannot be avoided, so trigger the scale down for other 
> vertices as well if needed
>  *** After the scale down, clean up the scale-down start times.
>  ** The scale-down lazy period has elapsed for any vertex
>  *** current time - min(start time for each vertex) > scale-down.lazy-period
>  *** This means that there was no scaling up during the scale-down lazy 
> period
> Note1: If the recommended parallelism >= current parallelism, the start time 
> of scale-down will be cleaned up for the current vertex.
> Note2: The recommended parallelism still comes from the latest 15-minute 
> metrics. For example:
>  * The current parallelism of vertex1 is 100, and the traffic decreases at 
> night.
>  * 2024-08-09 01:00:00, the recommended parallelism is 60.
>  ** The start time of scale down is 2024-08-09 01:00:00.
>  * 2024-08-09 01:15:00, the recommended parallelism is 50.
>  ** Still within the range of scale down lazy period.
>  ** Don't update the start time of scale down.
>  * 2024-08-09 01:31:00, the recommended parallelism is 40.
>  ** Outside of scale-down.lazy-period, so trigger the rescale and use 40 as 
> the recommended parallelism.
>  ** The job.autoscaler.metrics.window is 15 min, so the metrics from 
> 2024-08-09 01:16:00 to 2024-08-09 01:31:00 are used.
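
The detailed strategies in the quoted description could be sketched roughly as follows. This is only an illustration under assumptions: `LazyScaleDownSketch` and `shouldRescale` are hypothetical names, not existing autoscaler classes, and the sketch only decides whether to rescale now (the target parallelism would still be the latest recommendation).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the lazy scale-down decision; not the Flink autoscaler API. */
class LazyScaleDownSketch {
    private final Duration lazyPeriod;
    // Start time of the first pending scale-down event, recorded per vertex.
    private final Map<String, Instant> scaleDownStart = new HashMap<>();

    LazyScaleDownSketch(Duration lazyPeriod) {
        this.lazyPeriod = lazyPeriod;
    }

    /** Returns true if the job should be rescaled to the recommended parallelism now. */
    boolean shouldRescale(
            Instant now, Map<String, Integer> current, Map<String, Integer> recommended) {
        boolean anyScaleUp = false;
        for (Map.Entry<String, Integer> entry : current.entrySet()) {
            String vertex = entry.getKey();
            int cur = entry.getValue();
            int rec = recommended.getOrDefault(vertex, cur);
            if (rec > cur) {
                anyScaleUp = true;
            }
            if (rec >= cur) {
                // Note1: no scale down needed any more, so forget the start time.
                scaleDownStart.remove(vertex);
            } else {
                // Keep the earliest start time; later recommendations do not reset it.
                scaleDownStart.putIfAbsent(vertex, now);
            }
        }
        // current time - min(start time for each vertex) > scale-down.lazy-period
        boolean lazyPeriodElapsed = scaleDownStart.values().stream()
                .min(Instant::compareTo)
                .map(start -> Duration.between(start, now).compareTo(lazyPeriod) > 0)
                .orElse(false);
        boolean trigger = anyScaleUp || lazyPeriodElapsed;
        if (trigger) {
            // The restart applies all pending scale downs, so clean up the start times.
            scaleDownStart.clear();
        }
        return trigger;
    }
}
```

With a 30 minute lazy period and vertex1 at parallelism 100, calling `shouldRescale` at 01:00 (recommended 60) and 01:15 (recommended 50) returns false, and the call at 01:31 (recommended 40) returns true, which matches the example timeline in the description.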



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
