[ https://issues.apache.org/jira/browse/FLINK-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17237809#comment-17237809 ]
Yang Wang commented on FLINK-20249:
-----------------------------------

[~jiang7chengzitc] [~trohrmann] Thanks for the discussion.

First, I want to share why I added the internal service in the first place.
* When the JobManager fails over, the previous TaskManagers can reconnect to the JobManager via the internal service. This lets us reuse the existing TaskManagers instead of allocating new ones. However, in application mode, whether they are actually reused also depends on which happens first: "JobMaster requests slot from ResourceManager" or "TaskManager re-registers to ResourceManager". Maybe [~xintongsong] could give us more suggestions here.
* For a session cluster, it is really useful, especially when the TaskManager idle timeout is set to a fairly long value (3600s).

Moreover, I do not think the internal service consumes significant K8s resources. We are using a [headless service|https://kubernetes.io/docs/concepts/services-networking/service/#headless-services] now, which means it does not need a ClusterIP and puts no load on kube-proxy. Also, a K8s node going down will not make the internal service disappear. And guarding against users unexpectedly deleting the service is not a responsibility we have to cover here.

Note: The reason the TaskManagers can still heartbeat to the ResourceManager/JobManager successfully after you delete the internal service is that it is a headless service. Its DNS name resolves directly to the JobManager pod IP, so connections that are already established go straight to the pod and no longer depend on the Service object.
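
To make the headless-service point concrete, here is a minimal sketch of what such a Service manifest looks like, built as a plain Python dict. The only part taken from the Kubernetes docs is that a headless service sets {{spec.clusterIP}} to the string "None". The function names, the labels, the port names, and the port numbers (6123 and 6124) are assumptions for illustration and are not necessarily what Flink generates.

```python
import json


def headless_service_manifest(cluster_id, namespace="default",
                              rpc_port=6123, blob_port=6124):
    """Build a headless Service manifest for a JobManager pod.

    Hypothetical helper: labels, port names, and default ports are
    illustrative assumptions, not Flink's actual generated spec.
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": cluster_id, "namespace": namespace},
        "spec": {
            # "None" (a string) is what makes the service headless:
            # no ClusterIP is allocated and kube-proxy does not handle it.
            "clusterIP": "None",
            # Assumed labels selecting the JobManager pod.
            "selector": {"app": cluster_id, "component": "jobmanager"},
            "ports": [
                {"name": "jobmanager-rpc", "port": rpc_port},
                {"name": "blobserver", "port": blob_port},
            ],
        },
    }


def is_headless(manifest):
    """Return True if the Service manifest declares no ClusterIP."""
    return manifest.get("spec", {}).get("clusterIP") == "None"


if __name__ == "__main__":
    print(json.dumps(headless_service_manifest("my-flink-cluster"), indent=2))
```

Because no virtual IP sits in between, a DNS lookup of such a service returns the pod IP itself, which is consistent with the heartbeat behaviour described in the note above.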
> Rethink the necessity of the k8s internal Service even in non-HA mode
> ---------------------------------------------------------------------
>
>                 Key: FLINK-20249
>                 URL: https://issues.apache.org/jira/browse/FLINK-20249
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / Kubernetes
>    Affects Versions: 1.11.0
>            Reporter: Ruguo Yu
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.12.0
>
>         Attachments: k8s internal service - in english.pdf, k8s internal
> service - v2.pdf, k8s internal service.pdf
>
>
> In non-HA mode, k8s will create an internal service that directs the
> communication from the TaskManager Pods to the JobManager Pod, so that TM
> Pods can re-register to the new JM Pod once a JM Pod failover occurs.
> However, I recently ran an experiment and found that, once the JM Pod fails
> over (note: the new JM podIP has changed), k8s first creates new TM pods and
> then destroys the old TM pods after a period of time. The job is then
> rescheduled by the JM on the new TM pods, which means the new TMs have
> registered to the JM.
> During this process the internal service is active all the time, but I think
> it is not necessary to keep this internal service. In other words, we can
> remove the internal service and use the JM podIP for communication between
> the TM pods and the JM pod. In this case, it would be consistent with HA mode.
> Finally, the related experiments are in the attachment (k8s internal
> service.pdf).

--
This message was sent by Atlassian Jira
(v8.3.4#803005)