[
https://issues.apache.org/jira/browse/FLINK-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236654#comment-17236654
]
Till Rohrmann commented on FLINK-20249:
---------------------------------------
Thanks for the experiments. Would it be possible to publish the document in
English? I cannot really read it.
For the second point, I would be interested in better understanding what is
causing the service to disappear. Can this be a K8s problem?
> Rethink the necessity of the k8s internal Service even in non-HA mode
> ---------------------------------------------------------------------
>
> Key: FLINK-20249
> URL: https://issues.apache.org/jira/browse/FLINK-20249
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / Kubernetes
> Affects Versions: 1.11.0
> Reporter: jiang7chengzitc
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.12.0
>
> Attachments: k8s internal service - v2.pdf, k8s internal service.pdf
>
>
> In non-HA mode, k8s creates an internal service that directs the
> communication from the TaskManager pods to the JobManager pod, so that TM
> pods can re-register with the new JM pod once a JM pod failover occurs.
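> For illustration, the internal service used for the TM -> JM communication
> looks roughly like the following sketch (the name, labels, and headless setup
> shown here are assumptions for this example, not the exact manifest Flink
> generates; 6123/6124 are Flink's default RPC and blob server ports):
>
>   apiVersion: v1
>   kind: Service
>   metadata:
>     name: my-flink-cluster        # hypothetical cluster-id used as service name
>   spec:
>     type: ClusterIP
>     clusterIP: None               # headless: DNS resolves directly to the JM pod
>     selector:
>       app: my-flink-cluster       # assumed labels selecting the JobManager pod
>       component: jobmanager
>     ports:
>       - name: jobmanager-rpc
>         port: 6123                # default jobmanager.rpc.port
>       - name: blobserver
>         port: 6124                # default blob.server.port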
> However, I recently ran an experiment and found that after a JM pod failover
> (note: the new JM pod IP has changed), k8s first creates new TM pods and then
> destroys the old TM pods after a period of time. The job is then rescheduled
> by the JM onto the new TM pods, which means the new TMs have registered with
> the JM.
> During this whole process the internal service stays active, but I think
> keeping it is not necessary. In other words, we can remove the internal
> service and let the TM pods communicate with the JM pod via the JM pod IP,
> as sketched below; this would also be consistent with HA mode.
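> As a rough sketch of the pod-IP-based alternative (an illustration under my
> own assumptions, not Flink's actual HA implementation), the JM container could
> expose its own pod IP through the Downward API and advertise it as the RPC
> address, which the TMs would then use instead of the service name:
>
>   env:
>     - name: POD_IP                            # Downward API: the JM pod's own IP
>       valueFrom:
>         fieldRef:
>           fieldPath: status.podIP
>   args:
>     - "-Djobmanager.rpc.address=$(POD_IP)"    # $(POD_IP) is expanded by Kubernetes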
> Finally, the related experiments are in the attachment (k8s internal service.pdf).
--
This message was sent by Atlassian Jira
(v8.3.4#803005)