Hi all,

Has anyone tried to manage production Flink applications through JMX remote monitoring and management [1]?
We have been experimenting with enabling JMXRMI on Flink by default in production and would like to share some of our thoughts:

*Is there any straightforward way to dynamically allocate JMXRMI remote ports?*
- Using a static JMXRMI port is unrealistic in a production environment. However, to find a dynamically allocated remote port we currently have to go through the logging system so that the port number gets printed in the log files, which is very inconvenient.
- I think it would be very handy if the JMXRMI remote information were shown on the JobManager/TaskManager UI, or exposed via the REST API. (I am thinking of something similar to [2].)

*Is there any performance overhead in enabling JMX for a Flink application?*
- We haven't seen any significant performance impact in our experiments. However, the experiments were not comprehensive and the observations are inconclusive.
- Would it be a good idea to add a benchmark to the regression tests [3] to measure the overhead?

It would be highly appreciated if anyone could share their experience or suggest how we can improve the JMX remote integration with Flink.

Thanks,
Rong

[1] https://docs.oracle.com/javase/8/docs/technotes/guides/management/agent.html
[2] https://samza.apache.org/learn/documentation/0.14/jobs/web-ui-rest-api.html
[3] http://codespeed.dak8s.net:8000/
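
P.S. To make the first question concrete, here is a minimal sketch of what I mean by dynamic allocation. It uses only plain JDK APIs and is not anything Flink provides today. The class name DynamicJmxPort and its helper methods are hypothetical, and the "localhost" host is an assumption for illustration. The idea is to ask the OS for a free port, start the RMI registry and the JMX connector on that same port, and then report the resulting address, which is the piece of information that would be nice to see in the UI or REST API.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.net.ServerSocket;
import java.rmi.registry.LocateRegistry;
import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper, not part of Flink: starts a JMX/RMI connector on an OS-chosen port.
public class DynamicJmxPort {

    // Ask the OS for a currently free TCP port.
    // Note the small race: the port could be taken again before we bind to it.
    static int findFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    // Build a service URL in which the RMI registry and the connector share one port.
    static JMXServiceURL serviceUrl(String host, int port) throws IOException {
        return new JMXServiceURL(
            "service:jmx:rmi://" + host + ":" + port
                + "/jndi/rmi://" + host + ":" + port + "/jmxrmi");
    }

    // Create the RMI registry and start a connector for the platform MBean server.
    static JMXConnectorServer start(String host, int port) throws IOException {
        LocateRegistry.createRegistry(port);
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        JMXConnectorServer server =
            JMXConnectorServerFactory.newJMXConnectorServer(serviceUrl(host, port), null, mbs);
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        int port = findFreePort();
        JMXConnectorServer server = start("localhost", port);
        // This address is what we currently have to dig out of the log files.
        System.out.println("JMX listening at " + server.getAddress());
        server.stop();
    }
}
```

This sketch leaves out authentication and SSL, which a production setup would need, and a real integration would probably retry on a bind failure rather than rely on a single free-port probe.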