[ https://issues.apache.org/jira/browse/FLINK-11632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941373#comment-16941373 ]
Tim commented on FLINK-11632:
-----------------------------

Which version of Flink is this fix available in? Thanks.

> Make TaskManager automatic bind address picking more explicit (by default) and more configurable
> ------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-11632
>                 URL: https://issues.apache.org/jira/browse/FLINK-11632
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination, Runtime / Network
>            Reporter: Alex
>            Assignee: Alex
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, there is an optional {{taskmanager.host}} configuration option in
> {{flink-conf.yaml}} that allows users of Flink to "statically" pre-define
> the bind address for the TaskManager to listen on (note: it's also
> possible to override this option by passing the corresponding command line
> option to Flink).
>
> When the option is not set, the TaskManager tries to
> [heuristically pick up a bind address|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskManagerRunner.java#L421-L442].
> The resulting address (hostname) is used to advertise the different service
> endpoints (running in the TM) to the JobManager. It is also resolved to an
> {{[InetAddress|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskManagerRunner.java#L359]}}
> that is later used as the binding address for the TMs' inner node communication.
>
> This proposal is to minimize the usage of heuristics (by default) by
> introducing a new configuration option (for example,
> {{taskmanager.host.bind-policy}}) with possible values:
> * {{"hostname"}} - default, use the TM host's name ({{== InetAddress.getLocalHost().getHostName()}});
> * {{"ip"}} - use the TM host's IP address ({{== InetAddress.getLocalHost().getHostAddress()}});
> * {{"auto-detect-hostname"}} - use the heuristics-based detection mechanism.
>
> *Note:* the configuration key and values could be named better and are open
> for proposals.
>
> *Note 2:* in the future, the configuration option _may_ need to be extended
> to allow choosing a specific network interface, or a preference of IPv6 vs
> IPv4.
>
> h3. Rationale
>
> [The heuristics mechanism|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/net/ConnectionUtils.java#L364-L475]
> tries to establish a probe connection to {{jobmanager.rpc.address}} from
> different network interface addresses.
> In parallel setups (when the JM and multiple TMs start simultaneously),
> the result depends on timing and on the assigned network IP addresses, and
> may end up with "non-uniform" address bindings of TMs (some may be "lucky"
> to pick up a non-default network interface, some would fall back to
> {{InetAddress.getLocalHost().getHostName()}}). In the end, it is less
> obvious and transparent which binding address a TM picks up.
>
> In practice, it's possible that in the majority of cases (in well set up
> environments) the heuristics mechanism returns a result that matches
> {{InetAddress.getLocalHost()}}. The proposal is to stick with this simpler
> and more explicit binding (by default), avoiding the non-determinism of the
> heuristics.
>
> The old mechanism is kept available in case it is useful in some setups,
> but it would require an explicit configuration setting.
>
> Additionally, this proposal extends the "auto configuration" option by
> allowing users to choose the host's IP address (instead of the hostname).
> This may be convenient in situations where the TMs' machines are not
> necessarily reachable via DNS (for example, in a Kubernetes setup).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
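
For readers following the quoted proposal, here is a minimal Java sketch of how the three proposed policy values could map to an advertised host. It is an illustration only, not Flink's implementation: the class `BindPolicySketch`, the enum `BindPolicy`, and the method `resolveHost` are hypothetical names, the policy strings are the ones proposed in the issue text (the key and values in a released Flink version may differ), and the heuristic probe is passed in as a placeholder `Supplier` because the real mechanism lives in Flink's `ConnectionUtils` and is not reproduced here.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;
import java.util.function.Supplier;

/** Hypothetical sketch of the proposed bind policy; not Flink's actual code. */
public class BindPolicySketch {

    /** Policy values as proposed in the issue; the enum itself is an assumption. */
    public enum BindPolicy {
        HOSTNAME("hostname"),
        IP("ip"),
        AUTO_DETECT_HOSTNAME("auto-detect-hostname");

        private final String configValue;

        BindPolicy(String configValue) {
            this.configValue = configValue;
        }

        /** Parses a config value; a missing or blank value maps to the proposed default. */
        public static BindPolicy fromString(String value) {
            if (value == null || value.trim().isEmpty()) {
                return HOSTNAME;
            }
            String normalized = value.trim().toLowerCase(Locale.ROOT);
            for (BindPolicy policy : values()) {
                if (policy.configValue.equals(normalized)) {
                    return policy;
                }
            }
            throw new IllegalArgumentException("Unknown bind policy: " + value);
        }
    }

    /**
     * Returns the host to advertise. An explicitly configured host (the
     * taskmanager.host case) always wins; otherwise the policy decides.
     * The heuristic argument stands in for the probe-based detection.
     */
    public static String resolveHost(
            String configuredHost, BindPolicy policy, Supplier<String> heuristic)
            throws UnknownHostException {
        if (configuredHost != null && !configuredHost.trim().isEmpty()) {
            return configuredHost.trim();
        }
        switch (policy) {
            case HOSTNAME:
                return InetAddress.getLocalHost().getHostName();
            case IP:
                return InetAddress.getLocalHost().getHostAddress();
            case AUTO_DETECT_HOSTNAME:
                return heuristic.get();
            default:
                throw new IllegalStateException("Unhandled policy: " + policy);
        }
    }

    public static void main(String[] args) {
        // Placeholder for the heuristic; a real probe would contact the JobManager.
        Supplier<String> fakeHeuristic = () -> "host-from-probe";
        for (BindPolicy policy : BindPolicy.values()) {
            try {
                System.out.println(policy + " -> " + resolveHost(null, policy, fakeHeuristic));
            } catch (UnknownHostException e) {
                System.out.println(policy + " -> local host could not be resolved");
            }
        }
    }
}
```

The sketch keeps the heuristic behind an explicit opt-in, which mirrors the rationale in the proposal: the two `InetAddress.getLocalHost()` based policies do not depend on startup timing, while the probe-based result can vary between TaskManagers.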