mbalassi commented on code in PR #558: URL: https://github.com/apache/flink-kubernetes-operator/pull/558#discussion_r1155274830
########## flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/AbstractFlinkService.java: ########## @@ -627,14 +637,42 @@ public Map<String, String> getClusterInfo(Configuration conf) throws Exception { .toSeconds(), TimeUnit.SECONDS); - runtimeVersion.put( + clusterInfo.put( DashboardConfiguration.FIELD_NAME_FLINK_VERSION, dashboardConfiguration.getFlinkVersion()); - runtimeVersion.put( + clusterInfo.put( DashboardConfiguration.FIELD_NAME_FLINK_REVISION, dashboardConfiguration.getFlinkRevision()); } - return runtimeVersion; + + // JobManager resource usage can be deduced from the CR + var jmParameters = + new KubernetesJobManagerParameters( + conf, new KubernetesClusterClientFactory().getClusterSpecification(conf)); + var jmTotalCpu = + jmParameters.getJobManagerCPU() + * jmParameters.getJobManagerCPULimitFactor() + * jmParameters.getReplicas(); + var jmTotalMemory = + Math.round( + jmParameters.getJobManagerMemoryMB() + * Math.pow(1024, 2) + * jmParameters.getJobManagerMemoryLimitFactor() + * jmParameters.getReplicas()); + + // TaskManager resource usage is best gathered from the REST API to get current replicas Review Comment: Thanks @mateczagany, this approach looks good. If you have the bandwidth would you mind pushing your suggestions to this PR branch so that the commit can be attributed to you? 😏 I have invited you as a collaborator to my fork, you might need to accept that. I would ask the following if you have the time: 1. Get resource configuration from the config as you suggested uniformly for JMs and TMs 2. Get JM replicas from config, TM replicas from the REST API (we are trying to be careful with the TM replicas because we foresee that we might be changing things dynamically there via the autoscaler soon) 3. Add a test to `FlinkDeploymentMetricsTest` that verifies that given that the `status.clusterInfo` is properly filled out we fill out the metrics properly. Currently we do not have meaningful test for creating the clusterInfo and since we are relying on the application's REST API I do not see an easy way of testing it properly, so I would accept this change without that (but it might merit a separate JIRA). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org