Re: [DISCUSS] FLIP-495: Support AdaptiveScheduler record and query the rescale history

Matthias Pohl Wed, 18 Dec 2024 00:51:06 -0800

Hi Yuepeng,
Sorry for not finding the time to respond earlier. I went over FLIP-495 [1]
and the previous FLIP-487 discussion [2]. Thanks for putting it all
together in a FLIP. That makes it easier to discuss the next iteration.
Here are a few comments I have:


Rescale ID section
- How is the resourceRequirementsEpochID generated?
- Why is the Resource ID section "hidden" (at least, I don't have access)
in GoogleDocs and not added to the FLIP?
- Can you add more details of where this ID is coming from?

Slots section
- I was surprised that we have a name for the SlotSharingGroup. Then I
realized that we have 3 different implementations of the SlotSharingGroup
in Flink. The one that's used in the scheduler doesn't have the name
preserved but works with SlotSharingGroupIds. We might need to consider
that. If we pass down the name (which might make sense for the UI), we
still want to expose the SlotSharingGroupId as well, I guess.
- What about exposing the ResourceProfile of the SlotSharingGroup here as
well?

Rescale Event/Rescale status sections
- I'm not sure about the AdaptiveScheduler state to Rescale event state
mapping that's included in the FLIP right now: Triggering rescaling only
happens in the Executing state of the AdaptiveScheduler right now. Waiting
for resources also happens while the job is running (i.e. Executing state).
The AdaptiveScheduler will immediately transition from Executing to
CreatingExecutionGraph state in case of rescaling (WaitingForResources is
omitted). This was introduced in FLIP-472 [3].

- I'm wondering whether we can rely on the StateTransitionManager here
(which was also introduced with FLIP-472 [3]). That instance is coupled
with the Executing state (aside from WaitingForResources where it serves a
different purpose) and holds the information about the rescale trigger
event (and subsequent ignored rescale trigger events) and when the
rescaling was actually initiated. There might not be a need to work

- I also want to point out that we have four different notions of resource
configurations:
  - Desired resources: The ideal resource configuration that we want to
achieve for a job if enough Task slots are available (essentially the upper
bound of the job's parallelism)
  - Sufficient resources: A minimum resource configuration that the job can
run on (the lower bound of the job's parallelism)
  - Current resources: The resource configuration the job runs on before
rescaling
  - Follow-up resources
The first two are the resource configurations the rescale decision is based
on. The last two are the actual applied resource configurations. Keep in
mind that the latter two are not necessarily matching the resource
configurations that were considered when deciding on the rescaling.
Especially the case where the desired resources were met when rescaling was
triggered but where task slots are lost while rescaling can have a
surprising outcome. We might want to have this reflected in the rescale
event.
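
To make the four notions concrete, here is a rough sketch of how a rescale
event could carry them side by side. All names below are hypothetical (they
are neither in Flink nor in the FLIP), parallelism per vertex stands in for a
full resource configuration, and "follow-up" is assumed to mean the
configuration the job runs on after rescaling:

```java
import java.util.Map;

// Hypothetical types for illustration only; not part of Flink or FLIP-495.
record ParallelismPerVertex(Map<String, Integer> byVertexId) {
    ParallelismPerVertex {
        // Defensive immutable copy so that a recorded event cannot change later.
        byVertexId = Map.copyOf(byVertexId);
    }
}

record RescaleResources(
        ParallelismPerVertex desired,      // upper bound the decision was based on
        ParallelismPerVertex sufficient,   // lower bound the decision was based on
        ParallelismPerVertex current,      // applied before rescaling
        ParallelismPerVertex followUp) {   // applied after rescaling (assumed meaning)

    /** True if the configuration applied after rescaling is not the desired one. */
    boolean followUpDeviatesFromDesired() {
        return !followUp.equals(desired);
    }
}
```

With something like this, the case where slots are lost during rescaling
would show up as an event whose followUp differs from desired even though the
desired resources were available when the rescale was triggered.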

How/Where to store rescale events section
- It makes sense to have the rescale event history be stored in the
AdaptiveScheduler (analogously to what is done for the exception history).
But can you elaborate a bit more on the different approaches (in-memory, on
disk, DFS)? Each of them has a different outcome (in-memory: the history is
gone as soon as the job reaches a globally-terminal state; on disk: rescale
history survives the job termination; DFS: rescale history survives a JM
failover). I feel like the on-disk approach (analogously to the exception
history) makes the most sense here. WDYT?
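
For the part that lives in the AdaptiveScheduler, I imagine something along
the lines of the sketch below: a bounded in-memory buffer plus a snapshot
that could be handed over for archiving when the job terminates. The class
name and the size limit are assumptions of mine, not existing Flink code, and
the persistence step itself is deliberately left out:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical class for illustration only; not part of Flink or FLIP-495.
final class BoundedRescaleHistory<E> {
    private final int maxSize;
    private final Deque<E> events = new ArrayDeque<>();

    BoundedRescaleHistory(int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        this.maxSize = maxSize;
    }

    /** Appends an event and evicts the oldest one once the limit is reached. */
    void add(E event) {
        if (events.size() == maxSize) {
            events.removeFirst();
        }
        events.addLast(event);
    }

    /** Immutable copy, oldest first, e.g. for archiving on job termination. */
    List<E> snapshot() {
        return List.copyOf(events);
    }
}
```

Whether the snapshot ends up on local disk or on a DFS would then only
affect the archiving side, not how the scheduler collects the events.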

Best,
Matthias

[1]
https://cwiki.apache.org/confluence/display/FLINK/%5BWIP%5D+FLIP-495%3A+Support+AdaptiveScheduler+record+and+query+the+rescale+history
[2] https://lists.apache.org/thread/f4md4btkf006mxcxf66bng1kfz0rsn8c
[3]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-472%3A+Aligning+timeout+logic+in+the+AdaptiveScheduler%27s+WaitingForResources+and+Executing+states

On Tue, 17 Dec 2024, 16:21 Yuepeng Pan, <panyuep...@apache.org> wrote:

> Hi community,
>
> We discussed several aspects of FLIP-487[1] 'Show history of rescales in
> Web UI for AdaptiveScheduler'
> and received a lot of valuable feedback. Based on the suggestions from the
> email thread[2],
> we plan to split the original proposal for FLIP-487[1].
>
> The current email thread and the FLIP-495[3] wiki will be used to discuss
> 'Support AdaptiveScheduler in recording and querying the rescale history',
> while FLIP-487[1] will primarily focus on display-related design content.
>
> Looking forward to any feedback and opinions on FLIP-495[3].
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/%5BWIP%5D+FLIP-487%3A+Show+history+of+rescales+in+Web+UI+for+AdaptiveScheduler
>
> [2] https://lists.apache.org/thread/f4md4btkf006mxcxf66bng1kfz0rsn8c
>
> [3]
> https://cwiki.apache.org/confluence/display/FLINK/%5BWIP%5D+FLIP-495%3A+Support+AdaptiveScheduler+record+and+query+the+rescale+history
>
> Thank you very much.
>
> Best regards,
> Yuepeng Pan
