kylemeow opened a new pull request #13706:
URL: https://github.com/apache/flink/pull/13706


   ## What is the purpose of the change
   
   This pull request moves the retrieval of the TaskManager's host name and 
FQDN host name in _TaskManagerLocation_ from the constructor to the 
corresponding getter methods (i.e. lazy initialization). This greatly reduces 
the chance of blocking the Akka dispatcher while a TaskManager registers with 
the JobManager.
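
   The lazy initialization described above can be sketched roughly as follows. 
This is a minimal illustration, not the actual _TaskManagerLocation_ code: the 
class name `LazyHostLocation`, the injected `resolver` function, and the use of 
double-checked locking are all assumptions made for the example.

```java
import java.net.InetAddress;
import java.util.function.Function;

/** Hypothetical stand-in for TaskManagerLocation; not the actual Flink class. */
final class LazyHostLocation {
    private final InetAddress address;
    // Injected so the lookup can be replaced in tests; an assumption of this sketch.
    private final Function<InetAddress, String> resolver;
    // Stays null until the getter is called for the first time.
    private volatile String fqdnHostName;

    LazyHostLocation(InetAddress address, Function<InetAddress, String> resolver) {
        // The constructor only stores its arguments and performs no DNS lookup,
        // so constructing the object on a dispatcher thread cannot block on DNS.
        this.address = address;
        this.resolver = resolver;
    }

    /** Resolves the FQDN on first use and caches it for later calls. */
    String getFQDNHostname() {
        String result = fqdnHostName;
        if (result == null) {
            synchronized (this) {
                result = fqdnHostName;
                if (result == null) {
                    // Potentially slow reverse lookup, paid only by the first caller.
                    result = resolver.apply(address);
                    fqdnHostName = result;
                }
            }
        }
        return result;
    }
}
```

   Passing `InetAddress::getCanonicalHostName` as the resolver would give the 
real reverse lookup; the cost is then paid by whichever thread first calls the 
getter rather than by the thread that constructs the object.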
   
   In addition, a new configuration parameter, 
`jobmanager.retrieve-taskmanager-hostname`, is introduced to turn off the 
aforementioned reverse DNS lookups during registration. This is especially 
useful in Kubernetes environments, where a pod's IP address may not be 
reverse-resolvable unless the pod is exposed through a Service.
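
   As a rough sketch of what such a switch does, the snippet below chooses 
between a reverse lookup and the plain IP address based on a boolean flag. The 
class and method names are hypothetical, and falling back to the textual IP 
address when the lookup is disabled is an assumption of this example rather 
than something stated in this description.

```java
import java.net.InetAddress;

/** Hypothetical helper; the names are illustrative, not Flink's API. */
final class HostNameResolution {

    /**
     * Returns the name to report for a TaskManager at the given address.
     *
     * @param retrieveHostName stands in for the value of
     *     jobmanager.retrieve-taskmanager-hostname
     */
    static String hostNameFor(InetAddress address, boolean retrieveHostName) {
        if (!retrieveHostName) {
            // Lookup disabled: use the textual IP address, which needs no DNS query.
            return address.getHostAddress();
        }
        // Lookup enabled: reverse DNS query that can block if the resolver is slow.
        return address.getCanonicalHostName();
    }
}
```

   With the flag set to false, a pod that has no reverse DNS entry is 
identified by its IP address without waiting for a lookup to time out.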
   
   ## Brief change log
   
     - *Lazy initialization of hostname-related fields in TaskManagerLocation 
class*
     - *Allow canonical hostname lookups to be skipped during TaskManager 
registration with JobManager*
   
   ## Verifying this change
   
   This change adds tests and can be verified as follows:
   
     - *Added tests for the parameter that turns off reverse hostname lookups 
during JobMaster registration*
     - *Manually verified the change by running a per-job Kubernetes cluster 
with 1 JobManager and 50 TaskManagers (one slot per TaskManager). With lazy 
initialization in place and reverse DNS lookup disabled via the new 
configuration parameter, the cluster's init time (from job submission to the 
job being fully in RUNNING status) dropped from ~10 minutes to less than 
1 minute.*
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies: no
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: yes, JobManager 
(JobMaster)
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? yes
     - If yes, how is the feature documented? JavaDocs
   

