[ https://issues.apache.org/jira/browse/FLINK-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ruguo Yu updated FLINK-20249:
-----------------------------
    Attachment: image-2020-11-28-21-45-17-563.png

> Rethink the necessity of the k8s internal Service even in non-HA mode
> ---------------------------------------------------------------------
>
>                 Key: FLINK-20249
>                 URL: https://issues.apache.org/jira/browse/FLINK-20249
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / Kubernetes
>    Affects Versions: 1.11.0
>            Reporter: Ruguo Yu
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.12.0
>
>         Attachments: k8s internal service - in english.pdf, k8s internal
> service - v2.pdf, k8s internal service.pdf
>
>
> In non-HA mode, an internal k8s Service is created to route communication
> from the TaskManager (TM) Pods to the JobManager (JM) Pod, so that TM Pods
> can re-register with the new JM Pod once a JM Pod failover occurs.
> However, I recently ran an experiment and found that, once a JM Pod failover
> occurs (note: the new JM Pod's IP has changed), k8s first creates new TM Pods
> and then destroys the old TM Pods after a period of time. The JM then
> reschedules the job on the new TM Pods, which means the new TMs have
> registered with the JM.
> The internal Service stays active throughout this process, but I do not think
> it is necessary to keep it. In other words, we can remove the internal Service
> and have the TM Pods use the JM Pod IP to communicate with the JM Pod, which
> would be consistent with HA mode.
> Finally, the related experiments are in the attachment (k8s internal
> service.pdf).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)