Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-09-03 Thread Zhu Zhu
The new edits look good to me. Looking forward to the vote. Thanks, Zhu On Fri, Sep 4, 2020 at 9:49 AM Xintong Song wrote: > Thanks Till, the changes look good to me. Looking forward to the vote. > > Thank you~ > > Xintong Song > > > > On Fri, Sep 4, 2020 at 12:31 AM Till Rohrmann > wrote: > > > Thanks for t

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-09-03 Thread Xintong Song
Thanks Till, the changes look good to me. Looking forward to the vote. Thank you~ Xintong Song On Fri, Sep 4, 2020 at 12:31 AM Till Rohrmann wrote: > Thanks for the feedback Xintong and Zhu Zhu. I've added a bit more details > for the intended interface extensions, potential follow ups (remo

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-09-03 Thread Till Rohrmann
Thanks for the feedback Xintong and Zhu Zhu. I've added a bit more details for the intended interface extensions, potential follow ups (removing the AllocationIDs) and the question about whether to reuse or return a slot if the profiles don't fully match. If nobody objects, then I would start a vo

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-08-31 Thread Zhu Zhu
Thanks for the clarification @Till Rohrmann >> # Implications for the scheduling Agreed that it turned out to be different execution strategies for batch jobs. We can have a simple one first and improve it later. Thanks, Zhu On Mon, Aug 31, 2020 at 3:05 PM Xintong Song wrote: > Thanks for the clarification,

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-08-31 Thread Xintong Song
Thanks for the clarification, @Till. - For FLIP-56, sounds good to me. I think there should be no problem before removing AllocationID. And even after replacing AllocationID, it should only require limited effort to make FLIP-56 work with SlotID. I was just trying to understand when the effort wil

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-08-28 Thread Till Rohrmann
Thanks for creating this FLIP @Chesnay and the good input @Xintong and @Zhu Zhu. Let me try to add some comments concerning your questions: # FLIP-56 I think there is nothing fundamentally contradicting FLIP-56 in the FLIP for declarative resource management. As Chesnay said, we have to keep the

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-08-28 Thread Zhu Zhu
Thanks for the explanation @Chesnay Schepler . Yes, for batch jobs it can be safe to schedule downstream vertices if there are enough slots in the pool, even if these slots are still in use at that moment. And the job can still progress even if the vertices stick to the original parallelism. Loo

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-08-28 Thread Chesnay Schepler
Maybe :) Imagine a case where the producer and consumer have the same ResourceProfile, or at least one where the consumer requirements are less than the producer ones. In this case, the scheduler can happily schedule consumers, because it knows it will get enough slots. If the profiles are d
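
The condition described in this message (the consumer needs no more than the producer, so the consumer can take over the producer's slot) can be sketched roughly as follows. This is only an illustration: `Profile` and `fits_within` are hypothetical names, not Flink's ResourceProfile API, and the two resource dimensions are an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical, simplified stand-in for a resource profile."""
    cpu_cores: float
    memory_mb: int


def fits_within(consumer: Profile, producer: Profile) -> bool:
    """Return True if the consumer needs no more than the producer in every dimension."""
    return (consumer.cpu_cores <= producer.cpu_cores
            and consumer.memory_mb <= producer.memory_mb)


producer = Profile(cpu_cores=2.0, memory_mb=4096)
same = Profile(cpu_cores=2.0, memory_mb=4096)
smaller = Profile(cpu_cores=1.0, memory_mb=2048)
larger = Profile(cpu_cores=4.0, memory_mb=4096)
```

Under these assumptions, `same` and `smaller` could be scheduled into a slot freed by `producer`, while `larger` could not be guaranteed one.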

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-08-27 Thread Zhu Zhu
Thanks for the response! >> The scheduler doesn't have to wait for one stage to finish Does it mean we will declare resources and decide the parallelism for a stage which is partially schedulable, i.e. when input data are ready just for part of the execution vertices? >> This will get more compli

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-08-27 Thread Chesnay Schepler
The scheduler doesn't have to wait for one stage to finish. It is still aware that the upstream execution vertex has finished, and can request/use slots accordingly to schedule the consumer. This will get more complicated once we allow the scheduler to change the parallelism while the job is r

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-08-27 Thread Zhu Zhu
Thanks Chesnay & Till for proposing this improvement. It's of good value to allow jobs to make the best use of available resources adaptively. Not to mention it further supports reactive mode. So big +1 for it. I have a minor concern about possible regression in certain cases due to the proposed JobVert

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-08-26 Thread Xintong Song
Thanks for the quick response. *Job prioritization, Allocation IDs, Minimum resource requirements, SlotManager Implementation Plan:* Sounds good to me. *FLIP-56* Good point about the trade-off. I believe maximum resource utilization and quick deployment are desired in different scenarios. E.g., a

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-08-26 Thread Chesnay Schepler
Thank you Xintong for your questions! Job prioritization Yes, the job which declares its initial requirements first is prioritized. This is very much for simplicity; for example this avoids the nasty case where all jobs get some resources, but none get enough to actually run the job.
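
A minimal sketch of the prioritization described in this message, assuming jobs are served strictly in the order they declared their requirements. The function name `assign_slots` and the data layout are hypothetical and do not reflect the actual SlotManager implementation.

```python
def assign_slots(declarations, free_slots):
    """Serve jobs in declaration order.

    declarations: list of (job_id, required_slots), earliest declaration first.
    Returns a dict mapping job_id to the number of slots granted.
    """
    assignment = {}
    for job_id, required in declarations:
        # Earlier jobs are satisfied fully before later jobs get anything.
        granted = min(required, free_slots)
        assignment[job_id] = granted
        free_slots -= granted
    return assignment


result = assign_slots([("job-a", 4), ("job-b", 4)], free_slots=6)
```

With 6 free slots and two jobs that each need 4, `job-a` receives all 4 and can run, while `job-b` receives the remaining 2. An even 3/3 split would be the case the message warns about: every job gets some resources, but none gets enough.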

Re: [DISCUSS] FLIP-138: Declarative Resource management

2020-08-26 Thread Xintong Song
Thanks for preparing the FLIP and driving this discussion, @Chesnay & @Till. I really like the idea. I see a great value in the proposed declarative resource management, in terms of flexibility, usability and efficiency. I have a few comments and questions regarding the FLIP design. In general, t

[DISCUSS] FLIP-138: Declarative Resource management

2020-08-25 Thread Chesnay Schepler
Hello, in FLIP-138 we want to rework the way the JobMaster acquires slots, such that required resources are declared before a job is scheduled and the job execution is adjusted according to the provided resources (e.g., reducing parallelism), instead of asking for a fixed number of resources d
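
As a rough illustration of the idea in this announcement (declare requirements up front, then adapt execution to what is actually provided), here is a minimal sketch. `ResourceRequirement` and `adjust_parallelism` are hypothetical names, not Flink APIs, and the min/max bounds are an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceRequirement:
    """Hypothetical declaration: the job can run with anything in [min_slots, max_slots]."""
    min_slots: int
    max_slots: int


def adjust_parallelism(requirement: ResourceRequirement,
                       provided_slots: int) -> Optional[int]:
    """Pick a parallelism that fits the provided slots, or None if the minimum is not met."""
    if provided_slots < requirement.min_slots:
        return None  # not enough resources to run the job at all
    # Never exceed what the job declared as its maximum.
    return min(provided_slots, requirement.max_slots)


requirement = ResourceRequirement(min_slots=2, max_slots=8)
```

Under these assumptions, a job that declared 2 to 8 slots would run with parallelism 5 if 5 slots are provided, be capped at 8 if 12 are provided, and not start if only 1 is provided.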