[GitHub] [flink] xintongsong edited a comment on pull request #13464: [FLINK-19307][coordination] Add ResourceTracker

GitBox Thu, 24 Sep 2020 19:02:08 -0700


xintongsong edited a comment on pull request #13464:
URL: https://github.com/apache/flink/pull/13464#issuecomment-698301073



   I can see how the mapping simplifies things. My concern is whether this 
simplification hurts not only optimality but also correctness. I'm not entirely 
sure about this, so I'll try to explain my concern with an example.
   
   * 2 requirement profiles: A & B
   * 2 slot profiles: X & Y
   * A can only be fulfilled by X
   * B can be fulfilled by X or Y
   * Resource-requirement mapping status
     * A: 1 -> X: 1
     * B: 2 -> X: 1, Y: 1
     * excess -> Y: 1
   
   Now a slot of profile X is lost. Since neither A nor B has more resources 
than required, either of them might be deducted.
   
   If A is deducted, the excess Y cannot be used, and we would need to request 
a new resource for A.
     * A: 1 -> X: 0
     * B: 2 -> X: 1, Y: 1
     * excess -> Y: 1
   
   If B is deducted, then the excess Y can be used, and we do not need to 
allocate new resources.
     * A: 1 -> X: 1
     * B: 2 -> Y: 2
     * excess -> none
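
To make the two outcomes concrete, here is a minimal Python sketch of the example. It is only an illustration of the scenario, not the ResourceTracker code: `CAN_FULFILL` and `lose_slot` are hypothetical names, and the "backfill from excess" step is my assumption about how an excess slot would be reused.

```python
from collections import Counter

# Hypothetical compatibility table from the example:
# A can only be fulfilled by X, B can be fulfilled by X or Y.
CAN_FULFILL = {"A": {"X"}, "B": {"X", "Y"}}


def lose_slot(mapping, excess, lost_profile, deduct_from):
    """Deduct one lost slot from a requirement, then try to backfill it.

    Returns (mapping, excess, missing), where missing is the number of
    slots that would still have to be requested for `deduct_from`.
    """
    # Work on copies so the caller's state is left untouched.
    mapping = {req: Counter(slots) for req, slots in mapping.items()}
    excess = Counter(excess)

    if mapping[deduct_from][lost_profile] == 0:
        raise ValueError("requirement holds no slot of the lost profile")
    mapping[deduct_from][lost_profile] -= 1

    # Assumed policy: reuse the first compatible excess slot, if any.
    for profile in sorted(CAN_FULFILL[deduct_from]):
        if excess[profile] > 0:
            excess[profile] -= 1
            mapping[deduct_from][profile] += 1
            return mapping, +excess, 0  # unary plus drops zero counts

    return mapping, +excess, 1


initial = {"A": {"X": 1}, "B": {"X": 1, "Y": 1}}
initial_excess = {"Y": 1}

# Deducting A: the excess Y is unusable, one new slot is needed.
after_a = lose_slot(initial, initial_excess, "X", "A")
# Deducting B: the excess Y covers the loss, nothing new is needed.
after_b = lose_slot(initial, initial_excess, "X", "B")
```

With the same lost slot, the choice of which requirement to deduct decides whether the tracker believes a new slot is needed, which is the ambiguity described above.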
   
   Assume all tasks are in the running state. If a slot assigned to requirement A 
is lost and RM deducts B, then RM will not assign a new slot to the job, and JM 
cannot deploy the tasks from the lost slot to the excess slot Y. Either the tasks 
cannot recover, or JM will have to stop some tasks in a slot X and move them 
to the excess Y. On the other hand, if a slot assigned to requirement B is lost 
and RM deducts A, then JM will have no problem recovering the failed tasks in 
slot Y, but RM still allocates and assigns a new slot to the job. Even if the 
job returns the unneeded slot, RM may keep trying to allocate a new slot for the 
job, because it sees that the acquired resources for this job do not match 
the required resources.


