Thanks David for driving this. This is very valuable work, especially for cloud-native environments.

>> How about adding some more information such as vertex type
>> (SOURCE/MAP/JOIN etc.) in the response of `get jobs
>> resource-requirements`? For users, only vertex-id may be difficult to
>> understand.

+1 for this suggestion. Including the job vertex's name in the response body is more user-friendly.

I saw this sentence in the FLIP: "Setting the upper bound to -1 will reset the value to the default setting." What is the default value here (which configuration is it based on), or is it simply infinite?

Best regards,

Weijie

Shammon FY <zjur...@gmail.com> wrote on Mon, Feb 6, 2023 at 18:06:

> Hi David
>
> Thanks for initiating this discussion. I think declaring job resource
> requirements by REST API is very valuable. I just left some comments as
> follows:
>
> 1) How about adding some more information such as vertex type
> (SOURCE/MAP/JOIN etc.) in the response of `get jobs
> resource-requirements`? For users, only vertex-id may be difficult to
> understand.
>
> 2) For SQL jobs, we always use a unified parallelism for most vertices. Can
> we provide them with a more convenient setting method instead of setting
> each one?
>
> Best,
> Shammon
>
> On Fri, Feb 3, 2023 at 8:18 PM Matthias Pohl <matthias.p...@aiven.io.invalid>
> wrote:
>
> > Thanks David for creating this FLIP. It sounds promising and useful to
> > have. Here are some thoughts from my side (some of them might be rather a
> > follow-up and not necessarily part of this FLIP):
> > - I'm wondering whether it makes sense to add some kind of resource ID to
> > the REST API. This would give Flink a tool to verify the PATCH request of
> > the external system in a compare-and-set kind of manner. AFAIU, the process
> > requires the external system to retrieve the resource requirements first
> > (to retrieve the vertex IDs). A resource ID <ABC> would be sent along as a
> > unique identifier for the provided setup. It's essentially the version ID
> > of the currently deployed resource requirement configuration.
> > Flink doesn't
> > know whether the external system would use the provided information in some
> > way to derive a new set of resource requirements for this job. The
> > subsequent PATCH request with updated resource requirements would include
> > the previously retrieved resource ID <ABC>. The PATCH call would fail if
> > there was a concurrent PATCH call in between indicating to the external
> > system that the resource requirements were concurrently updated.
> > - How often do we allow resource requirements to be changed? That question
> > might make my previous comment on the resource ID obsolete because we could
> > just make any PATCH call fail if there was a resource requirement update
> > within a certain time frame before the request. But such a time period is
> > something we might want to make configurable then, I guess.
> > - Versioning the JobGraph in the JobGraphStore rather than overwriting it
> > might be an idea. This would enable us to provide resource requirement
> > changes in the UI or through the REST API. It is related to a problem
> > around keeping track of the exception history within the AdaptiveScheduler
> > and also having to consider multiple versions of a JobGraph. But for that
> > one, we use the ExecutionGraphInfoStore right now.
> > - Updating the JobGraph in the JobGraphStore makes sense. I'm just
> > wondering whether we bundle two things together that are actually separate:
> > The business logic and the execution configuration (the resource
> > requirements). I'm aware that this is not a flaw of the current FLIP but
> > rather something that was not necessary to address in the past because the
> > JobGraph was kind of static. I don't remember whether that was already
> > discussed while working on the AdaptiveScheduler for FLIP-160 [1]. Maybe,
> > I'm missing some functionality here that requires us to have everything in
> > one place.
> > But it feels like updating the entire JobGraph for what could
> > actually be a "config change" is not reasonable. ...also considering the
> > amount of data that can be stored in a ConfigMap/ZooKeeper node if
> > versioning the resource requirement change as proposed in my previous item
> > is an option for us.
> > - Updating the JobGraphStore means adding more requests to the HA backend
> > API. There were some concerns shared in the discussion thread [2] for
> > FLIP-270 [3] on pressuring the k8s API server in the past with too many
> > calls. Even though it's more likely to be caused by checkpointing, I still
> > wanted to bring it up. We're working on a standardized performance test to
> > prepare going forward with FLIP-270 [3] right now.
> >
> > Best,
> > Matthias
> >
> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-160%3A+Adaptive+Scheduler
> > [2] https://lists.apache.org/thread/bm6rmxxk6fbrqfsgz71gvso58950d4mj
> > [3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-270%3A+Repeatable+Cleanup+of+Checkpoints
> >
> > On Fri, Feb 3, 2023 at 10:31 AM ConradJam <jam.gz...@gmail.com> wrote:
> >
> > > Hi David:
> > >
> > > Thank you for driving this FLIP, which helps reduce Flink shutdown time.
> > >
> > > For this FLIP, I would like to share a few ideas:
> > >
> > > - When the number of "slots" is insufficient, can we stop users from
> > > rescaling or throw something to tell the user "less available slots to
> > > upgrade, please check your available slots"? Or we could have a request
> > > switch (true/false) to allow this behavior.
> > >
> > > - When the user upgrades job-vertex-parallelism, I want to have an
> > > interface to query the current update parallel execution status, so that
> > > the user or program can understand the current status.
> > > - I want to have an interface to query the current update parallelism
> > > execution status.
> > > This also helps with management similar to the *[1] Flink K8S
> > > Operator*.
> > >
> > > {
> > >   status: Failed
> > >   reason: "less available slots to upgrade, please check your available slots"
> > > }
> > >
> > > - *Pending*: the job has joined the upgrade queue; it will be updated
> > > later
> > > - *Rescaling*: the job is rescaling now; wait for it to finish
> > > - *Finished*: the update is done
> > > - *Failed*: something went wrong, so this job is not available to upgrade
> > >
> > > I want to add the above content to the FLIP, what do you think?
> > >
> > > 1. https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/
> > >
> > > David Morávek <d...@apache.org> wrote on Fri, Feb 3, 2023 at 16:42:
> > >
> > > > Hi everyone,
> > > >
> > > > This FLIP [1] introduces a new REST API for declaring resource requirements
> > > > for the Adaptive Scheduler. There seems to be a clear need for this API
> > > > based on the discussion on the "Reworking the Rescale API" [2] thread.
> > > >
> > > > Before we get started, this work is heavily based on the prototype [3]
> > > > created by Till Rohrmann, and the FLIP is being published with his consent.
> > > > Big shoutout to him!
> > > >
> > > > Last but not least, thanks to Chesnay and Roman for the initial reviews and
> > > > discussions.
> > > >
> > > > The best start would be watching a short demo [4] that I've recorded, which
> > > > illustrates newly added capabilities (rescaling the running job, handing
> > > > back resources to the RM, and session cluster support).
> > > >
> > > > The intuition behind the FLIP is being able to define resource requirements
> > > > ("resource boundaries") externally that the AdaptiveScheduler can navigate
> > > > within. This is a building block for higher-level efforts such as an
> > > > external Autoscaler.
> > > > The natural extension of this work would be to allow
> > > > specifying per-vertex ResourceProfiles.
> > > >
> > > > Looking forward to your thoughts; any feedback is appreciated!
> > > >
> > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-291%3A+Externalized+Declarative+Resource+Management
> > > > [2] https://lists.apache.org/thread/2f7dgr88xtbmsohtr0f6wmsvw8sw04f5
> > > > [3] https://github.com/tillrohrmann/flink/tree/autoscaling
> > > > [4] https://drive.google.com/file/d/1Vp8W-7Zk_iKXPTAiBT-eLPmCMd_I57Ty/view
> > > >
> > > > Best,
> > > > D.
> > >
> > > --
> > > Best
> > >
> > > ConradJam
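
For readers following the thread, the compare-and-set idea from Matthias's message (a resource ID acting as a version of the deployed requirements, with a PATCH failing if the ID is stale) could look roughly like the sketch below. This is only an in-memory illustration under stated assumptions: `RequirementsStore`, `ConflictError`, and the `lowerBound`/`upperBound` field names are hypothetical and are not taken from Flink or from the FLIP.

```python
import uuid


class ConflictError(Exception):
    """Raised when a PATCH carries a stale resource ID (hypothetical)."""


class RequirementsStore:
    """Hypothetical in-memory model of versioned resource requirements."""

    def __init__(self, requirements):
        self._requirements = dict(requirements)
        # The resource ID identifies the currently deployed configuration.
        self._resource_id = uuid.uuid4().hex

    def get(self):
        # Return the current resource ID together with a copy of the requirements.
        return self._resource_id, dict(self._requirements)

    def patch(self, expected_resource_id, new_requirements):
        # Compare-and-set: reject the update if the requirements changed
        # since the caller last retrieved them.
        if expected_resource_id != self._resource_id:
            raise ConflictError("resource requirements were updated concurrently")
        self._requirements = dict(new_requirements)
        self._resource_id = uuid.uuid4().hex
        return self._resource_id


store = RequirementsStore({"vertex-1": {"lowerBound": 1, "upperBound": 4}})
rid, current = store.get()

# First update uses the freshly retrieved ID and is accepted.
store.patch(rid, {"vertex-1": {"lowerBound": 1, "upperBound": 8}})

# Second update reuses the now-stale ID and is rejected.
try:
    store.patch(rid, {"vertex-1": {"lowerBound": 1, "upperBound": 2}})
except ConflictError as err:
    print("rejected:", err)
```

In a REST setting the same check is commonly carried by a version token that the client sends back with its update; whether the FLIP adopts anything like this is exactly the open question raised in the thread.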