1996fanrui commented on PR #25218: URL: https://github.com/apache/flink/pull/25218#issuecomment-2421167238
Hey @XComp @ztison, sorry, I'd like to discuss this PR with you again. Could we fix the issue for `DefaultSlotAssigner` and Application Mode first? I prefer to fix it first for several reasons:

- @XComp's first [concern](https://github.com/apache/flink/pull/25218#discussion_r1771187666) is that this fix conflicts with `execution.state-recovery.from-local`, so it's better handled in a FLIP.
  - That's why this PR only changes the code of `DefaultSlotAssigner` and doesn't change any code of `StateLocalitySlotAssigner`.
- @ztison's [concern](https://github.com/apache/flink/pull/25218#issuecomment-2401913141) is that this fix conflicts with spreading the workload across as many workers as possible.
  - As we discussed before, this concern only exists for session mode. That's why I'm wondering whether we could fix it for Application Mode first.
- The third reason is the most important one: the issue this PR tries to fix is more like a bug than an optimization for Application Mode with `execution.state-recovery.from-local` disabled.
  - The symptom of this bug is that TM resources cannot be released after scaling down (a small illustration is at the end of this comment).
  - I believe Flink users adopt the Adaptive Scheduler mainly to scale up and scale down quickly and efficiently.
  - Many users ask questions like: why aren't resources saved after scaling down?
  - This bug has been reported in 3 JIRAs: FLINK-33977, FLINK-35594 and FLINK-35903.
  - The main reason I want to discuss this with you again is that one Flink user [reported this bug](https://apache-flink.slack.com/archives/C03G7LJTS2G/p1729167222445569) again in the Slack troubleshooting channel, and the reporter cc'd me in the [next thread](https://apache-flink.slack.com/archives/C03G7LJTS2G/p1729167719506889) because I'm an active contributor to the autoscaler. (I guess he doesn't know that this bug or symptom is related to the Adaptive Scheduler, not to the autoscaler.)
  - It's worth mentioning that, as far as I know, @RocMarshal (the developer of this PR) didn't file a JIRA himself because he noticed the issue had already been reported in several JIRAs.
  - This means at least 5 users (from what I have observed, from 5 different companies) have hit this issue in their production jobs. I'm happy to see more and more companies trying out the Adaptive Scheduler.
- The fourth reason: 1.20 is the LTS version of the 1.x series.
  - If we consider it a bug, we can fix it in 1.20.x and 2.0.x together.
  - If we consider it an improvement or feature rather than a bug and address it in a FLIP, the issue cannot be fixed in the 1.x series.
    - That's why [FLIP-461](https://cwiki.apache.org/confluence/display/FLINK/FLIP-461%3A+Synchronize+rescaling+with+checkpoint+creation+to+minimize+reprocessing+for+the+AdaptiveScheduler) and [FLIP-472](https://cwiki.apache.org/confluence/display/FLINK/FLIP-472%3A+Aligning+timeout+logic+in+the+AdaptiveScheduler%27s+WaitingForResources+and+Executing+states) cannot be backported to the 1.x series.
    - Actually, I think both [FLIP-461](https://cwiki.apache.org/confluence/display/FLINK/FLIP-461%3A+Synchronize+rescaling+with+checkpoint+creation+to+minimize+reprocessing+for+the+AdaptiveScheduler) and [FLIP-472](https://cwiki.apache.org/confluence/display/FLINK/FLIP-472%3A+Aligning+timeout+logic+in+the+AdaptiveScheduler%27s+WaitingForResources+and+Executing+states) are great improvements for the Adaptive Scheduler. Thank you for the great work. ❤️
  - I believe most users (companies) are not able to maintain an internal Flink version, and they use the official Flink releases.
    If this bug is not fixed in 1.x, it may be difficult for the Adaptive Scheduler to be adopted by a large number of users on 1.x.
  - Of course, my team maintains an internal Flink version, so we can easily fix it in our own production environment. My initiative is mainly about enabling most Flink users to have a better Adaptive Scheduler experience.

Sorry to bother you again. This is definitely my last try; if you think it is unreasonable, I can accept that and handle it in a subsequent FLIP. Thank you very much. ❤️
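P.S. To make the scale-down symptom a bit more concrete, here is a tiny standalone sketch. It is purely illustrative: it is not the code from this PR and not Flink's `SlotAssigner` API, and names like `pickSlots` and `freeSlotsByTm` are made up. The idea is simply that if the scheduler fills slots on as few TaskManagers as possible, the remaining TMs end up fully idle and can be released after a scale-down:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: pick the requested number of slots from as few
// TaskManagers as possible, so TMs that end up with no assigned slots can be released.
public class FewestTaskManagersExample {

    static List<String> pickSlots(Map<String, List<String>> freeSlotsByTm, int required) {
        List<String> picked = new ArrayList<>();
        // Visit TMs with the most free slots first, so whole TMs are drained before
        // spilling over to the next one.
        freeSlotsByTm.entrySet().stream()
                .sorted(Comparator.comparingInt(
                        (Map.Entry<String, List<String>> e) -> e.getValue().size()).reversed())
                .forEach(entry -> {
                    for (String slot : entry.getValue()) {
                        if (picked.size() < required) {
                            picked.add(slot);
                        }
                    }
                });
        return picked;
    }

    public static void main(String[] args) {
        // Two TMs with 4 free slots each; after scaling down, the job only needs 4 slots.
        Map<String, List<String>> freeSlotsByTm = new LinkedHashMap<>();
        freeSlotsByTm.put("tm-1", List.of("tm-1-slot-0", "tm-1-slot-1", "tm-1-slot-2", "tm-1-slot-3"));
        freeSlotsByTm.put("tm-2", List.of("tm-2-slot-0", "tm-2-slot-1", "tm-2-slot-2", "tm-2-slot-3"));

        // All 4 slots come from tm-1, leaving tm-2 completely idle and releasable.
        // Spreading the 4 slots across both TMs instead would keep both TMs alive.
        System.out.println(pickSlots(freeSlotsByTm, 4));
    }
}
```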