[ https://issues.apache.org/jira/browse/FLINK-24947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17446370#comment-17446370 ]
liuzhuo commented on FLINK-24947:
---------------------------------

As you said, using HostNetWork mode requires some changes to how ports are generated.

In the k8s environment, we expose ports through two Services: the InternalService (initial port 6123, used by the TaskManager to access the JobManager) and the ExternalService (initial port 8080, used by the client to access the JobManager). Both Services are currently created by the client. In HostNetWork mode the ports must be generated randomly, so the creation of these two Services has to be postponed until after the JobManager has started, because the values of the two ports are only known once the JobManager has started successfully. Note that both Services need to be created or updated every time the JobManager starts.

>1. How the TaskManager could find the leader JobManager address after the
>JobManager failover without HA?

In non-HA mode, after the JobManager fails over and restarts, it updates the InternalService with its new port (the replacement for 6123), so the TaskManager can still obtain the correct JobManager address through the InternalService.

>2. How the Flink client could find the leader JobManager address without HA?

This is indeed harder, because the client has no way to know the JobManager's real REST port (the replacement for 8080) at submission time. Maybe the client can wait for a certain period of time until the ExternalService reports the actual port.

In fact, we solve this internally with an Ingress. When the client builds the Deployment, we add an IngressDecorator so that the JobManager can be accessed through this Ingress. (The Ingress is not reachable right after it is created, because the corresponding ExternalService does not exist yet; it takes effect once the JobManager has created the ExternalService.) This is a better approach, but it requires an IngressController. I am not sure whether this is suitable for everyone.
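
As a rough illustration of the "wait on the client" idea in point 2, here is a minimal Python sketch. It is not Flink code: `wait_for_rest_port` and `lookup_port` are hypothetical names, and `lookup_port` stands in for whatever call would read the ExternalService port from the Kubernetes API, returning None while the Service has not been created yet. The timeout and interval values are assumptions as well.

```python
import time
from typing import Callable, Optional


def wait_for_rest_port(
    lookup_port: Callable[[], Optional[int]],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
) -> int:
    """Poll lookup_port until it yields a port or timeout_s elapses.

    lookup_port is a hypothetical callback: it should return the REST port
    published on the ExternalService, or None if the JobManager has not
    created/updated the Service yet.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        port = lookup_port()
        if port is not None:
            return port
        if time.monotonic() >= deadline:
            raise TimeoutError("ExternalService port was not published in time")
        # Back off briefly before asking again.
        time.sleep(interval_s)
```

With a stub lookup that returns None twice and then a port number, the function returns that port after two short sleeps; with a lookup that never returns a port, it raises TimeoutError once the deadline has passed. The same loop would also cover a JobManager restart, since the client would simply read whichever port the Service carries at that moment.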
>3. Should we update the internal/external K8s service when the JobManager has
>allocated a dynamic port

Yes. As described above, the JobManager needs to update the InternalService and the ExternalService every time it starts.

I just realized that we are using session mode internally. For application mode, I may need to investigate further.

> Flink on k8s support HostNetWork model
> --------------------------------------
>
>                 Key: FLINK-24947
>                 URL: https://issues.apache.org/jira/browse/FLINK-24947
>             Project: Flink
>          Issue Type: New Feature
>          Components: Deployment / Kubernetes
>            Reporter: liuzhuo
>            Priority: Minor
>
> For the use of flink on k8s, for performance considerations, it is important
> to choose a CNI plug-in. Usually we have two environments: Managed and
> UnManaged.
>   Managed: Cloud vendors usually provide very efficient CNI plug-ins, we
> don’t need to care about network performance issues
>   UnManaged: On self-built K8s clusters, CNI plug-ins are usually optional,
> such as Flannel and Calico, but such software network cards usually lose
> some performance or require some additional network strategies.
> For an unmanaged environment, if we also want to achieve the best network
> performance, should we support the *HostNetWork* model?
> Use the host network to achieve the best performance

--
This message was sent by Atlassian Jira
(v8.20.1#820001)