[GitHub] flink pull request #2257: [FLINK-4152] Allow re-registration of TMs at resou...

tillrohrmann Tue, 19 Jul 2016 04:59:39 -0700

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2257#discussion_r71323898
  
    --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala ---
    @@ -405,36 +374,13 @@ class JobManager(
     
           currentResourceManager match {
             case Some(rm) =>
    -          val future = (rm ? decorateMessage(new RegisterResource(taskManager, msg)))(timeout)
    -          future.onComplete {
    -            case scala.util.Success(response) =>
    -              // the resource manager is available and answered
    -              self ! response
    -            case scala.util.Failure(t) =>
    -              t match {
    -                case _: TimeoutException =>
    -                  log.info("Attempt to register resource at ResourceManager timed out. Retrying")
    -                case _ =>
    -                  log.warn("Failure while asking ResourceManager for RegisterResource. Retrying", t)
    -              }
    -              // slow or unreachable resource manager, register anyway and let the rm reconnect
    -              self ! decorateMessage(new RegisterResourceSuccessful(taskManager, msg))
    -              self ! decorateMessage(new ReconnectResourceManager(rm))
    -          }(context.dispatcher)
    -
    +          log.info(s"Register task manager $resourceId at the resource manager.")
    +          rm ! decorateMessage(new RegisterResource(msg))
    --- End diff --
    
    If I'm not mistaken, there is hardly any difference between a registered worker and a container that is still being launched. In the current implementation it should therefore not matter much whether a container is in state "being launched" or "launched", so it makes little difference whether this message arrives or not.
    
    Given that the `JobManager` does not yet use the RM to allocate new resources, it might actually be a good idea to treat the RM, for now, simply as a tool that notifies the JM about TM failures. Everything else can be added once we actually need it.
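    
    To illustrate the idea, here is a minimal, Akka-free sketch of an RM that only tracks registered TMs and reports failures back to the JM. It is not the actual Flink code: `FailureNotifyingRM`, `TaskManagerLost`, `SketchJobManager` and the simplified `RegisterResource(resourceId)` are hypothetical names, and plain method calls stand in for `rm ! ...`.

```scala
import scala.collection.mutable

// Hypothetical message types for this sketch; not Flink's real messages.
sealed trait Message
final case class RegisterResource(resourceId: String) extends Message
final case class TaskManagerLost(resourceId: String, reason: String) extends Message

// Stand-in for the RM: it remembers registered TMs and reports failures to the JM.
final class FailureNotifyingRM(notifyJobManager: Message => Unit) {
  private val registered = mutable.Set.empty[String]

  // Fire-and-forget registration: idempotent, and no reply is sent to the JM.
  def tell(msg: Message): Unit = msg match {
    case RegisterResource(id) => registered += id
    case _                    => () // other messages are ignored in this sketch
  }

  // Called when the cluster framework reports a lost container.
  // Only TMs that were registered are reported to the JM.
  def containerFailed(id: String, reason: String): Unit =
    if (registered.remove(id)) notifyJobManager(TaskManagerLost(id, reason))

  def isRegistered(id: String): Boolean = registered.contains(id)
}

// Stand-in for the JM: registers TMs locally without waiting for the RM.
final class SketchJobManager {
  val taskManagers: mutable.Set[String] = mutable.Set.empty[String]
  val rm: FailureNotifyingRM = new FailureNotifyingRM(msg => handle(msg))

  def handle(msg: Message): Unit = msg match {
    case TaskManagerLost(id, _) => taskManagers -= id
    case _                      => ()
  }

  def registerTaskManager(id: String): Unit = {
    taskManagers += id            // the JM accepts the TM right away
    rm.tell(RegisterResource(id)) // analogous to the one-way `rm ! ...` in the diff
  }
}
```

    In this model a re-registration of the same TM is harmless, and the only information flowing back from the RM to the JM is the failure notification.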



