[ https://issues.apache.org/jira/browse/FLINK-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17237809#comment-17237809 ]

Yang Wang commented on FLINK-20249:
-----------------------------------

[~jiang7chengzitc] [~trohrmann] Thanks for the discussion.

First, I want to share some thoughts on why I added the internal service in the 
first place.
 * When the JobManager fails over, the previous TaskManagers can reconnect to 
the new JobManager via the internal service. This lets us reuse the existing 
TaskManagers instead of allocating new ones (see the config sketch after this 
list). However, for application mode, it also depends on which is faster, 
"JobMaster requests slot from ResourceManager" or "TaskManager re-registers to 
ResourceManager". Maybe [~xintongsong] could give us more suggestions here.
 * For a session cluster, it is really useful, especially when the TaskManager 
idle timeout is set fairly long (e.g. 3600s).
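
To make the first point concrete, here is a rough sketch (not Flink's actual 
code) of the difference the internal service makes for the address the 
TaskManagers target; the cluster name "my-flink-cluster" is hypothetical:

{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.JobManagerOptions;

public class JobManagerAddressSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // With the internal service, TaskManagers target a stable DNS name that
        // keeps resolving to the current JobManager pod across failovers.
        conf.setString(JobManagerOptions.ADDRESS, "my-flink-cluster.default");

        // Without the service, this would have to be the JM pod IP, which
        // changes whenever the JobManager pod is recreated.
        System.out.println(conf.getString(JobManagerOptions.ADDRESS));
    }
}
{code}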

 

Moreover, I do not think the internal service occupies heavy K8s resources, 
because we are now using a [headless 
service|https://kubernetes.io/docs/concepts/services-networking/service/#headless-services], 
which does not need a ClusterIP and puts no burden on kube-proxy.
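
For reference, this is roughly what the headless internal service looks like 
when built with the fabric8 client that the Kubernetes integration uses; a 
minimal sketch, assuming an illustrative cluster-id and the default 
jobmanager.rpc.port, not Flink's exact code:

{code:java}
import io.fabric8.kubernetes.api.model.Service;
import io.fabric8.kubernetes.api.model.ServiceBuilder;
import java.util.Collections;

public class HeadlessServiceSketch {
    public static void main(String[] args) {
        // clusterIP "None" marks the service headless: no virtual IP is
        // allocated, and kube-proxy programs no forwarding rules for it.
        Service internalService = new ServiceBuilder()
                .withNewMetadata()
                    .withName("my-flink-cluster")  // hypothetical cluster-id
                .endMetadata()
                .withNewSpec()
                    .withClusterIP("None")
                    .withSelector(Collections.singletonMap("component", "jobmanager"))
                    .addNewPort()
                        .withName("jobmanager-rpc")
                        .withPort(6123)  // Flink's default jobmanager.rpc.port
                    .endPort()
                .endSpec()
                .build();

        System.out.println(internalService.getSpec().getClusterIP());  // "None"
    }
}
{code}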

Also, a K8s node going down will not make the internal service disappear. And a 
user deleting it unexpectedly is not a case we have to cover here.

 

Note: The reason the TaskManagers can still heartbeat to the 
ResourceManager/JobManager successfully after you delete the internal service 
is the headless service: its DNS name resolves directly to the JobManager pod 
IP, so connections that were already established keep working even after the 
Service object is gone.
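
As a small illustration of that last point (service and namespace names are 
made up), resolving a headless service name returns the pod IP directly, so a 
TaskManager that has already resolved it keeps a working address even if the 
Service object is deleted afterwards:

{code:java}
import java.net.InetAddress;

public class HeadlessDnsSketch {
    public static void main(String[] args) throws Exception {
        // For a headless service, cluster DNS answers with the backing pod
        // IP(s) directly; there is no ClusterIP and no kube-proxy translation.
        InetAddress jm = InetAddress.getByName("my-flink-cluster.default.svc.cluster.local");

        // Prints the JobManager pod IP itself.
        System.out.println(jm.getHostAddress());
    }
}
{code}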

> Rethink the necessity of the k8s internal Service even in non-HA mode
> ---------------------------------------------------------------------
>
>                 Key: FLINK-20249
>                 URL: https://issues.apache.org/jira/browse/FLINK-20249
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / Kubernetes
>    Affects Versions: 1.11.0
>            Reporter: Ruguo Yu
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.12.0
>
>         Attachments: k8s internal service - in english.pdf, k8s internal 
> service - v2.pdf, k8s internal service.pdf
>
>
> In non-HA mode, Flink creates an internal K8s service that directs the 
> communication from the TaskManager pods to the JobManager pod, so TM pods can 
> re-register with the new JM pod once a JM pod failover occurs.
> However, I recently ran an experiment and found that once the JM pod fails 
> over (note: the new JM pod IP has changed), k8s first creates new TM pods and 
> then destroys the old TM pods after a period of time; the job is then 
> rescheduled by the JM on the new TM pods, which means the new TMs have 
> registered with the JM.
> During this process the internal service is active the whole time, but I do 
> not think it is necessary to keep it. In other words, we can weed out the 
> internal service and have the TM pods use the JM pod IP to communicate with 
> the JM pod; in that case it would be consistent with HA mode.
> Finally, the related experiments are in the attachment (k8s internal 
> service.pdf).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
