Hi everyone,

Linkis is a distributed microservices system that includes multiple 
long-running instances as well as dynamic / ad-hoc ones (the engine conn).
It would be better to monitor each JVM service individually, rather than only 
the fixed host servers.


The current pain point is how to let the monitoring system connect to each 
instance and retrieve its metrics in a standard way.
Prometheus can help us solve this problem through its service discovery (SD) 
feature.


The registry center currently used in Linkis is Eureka, which Prometheus 
supports as one of its available SD configurations; it retrieves scrape 
targets through the Eureka REST API. 
Prometheus periodically checks the REST endpoint and creates a target for 
every app instance.
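
As a rough illustration of the Prometheus side, a scrape job using Eureka SD 
could look like the sketch below. The Eureka address, the job name, the 
metrics path and the "application" label are placeholders I made up for the 
example, not existing Linkis settings.

```yaml
# prometheus.yml (sketch only; host, port and job name are placeholders)
scrape_configs:
  - job_name: linkis
    # Assumed metrics path; adjust to whatever endpoint Linkis ends up exposing.
    metrics_path: /actuator/prometheus
    eureka_sd_configs:
      # Base URL of the Eureka server that the Linkis services register with.
      - server: http://eureka.example.com:20303/eureka
        # How often Prometheus re-reads the registry (30s is the default).
        refresh_interval: 30s
    relabel_configs:
      # Copy the Eureka application name into a regular label so that
      # dashboards and alerts can group instances by service.
      - source_labels: [__meta_eureka_app_name]
        target_label: application
```

With a short refresh interval, newly started engine conn instances should 
show up as targets soon after they register, and disappear again once they 
deregister.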


Based on this, we can have Linkis publish its scrape target information in 
the Eureka metadata, and open a metrics endpoint on each instance.
Once the instances can be monitored in Prometheus, we can set up the alert 
channels in Prometheus Alertmanager and the dashboards in Grafana.
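
On the instance side, a minimal sketch could look like the following. It 
assumes the services expose metrics through Spring Boot Actuator with the 
Micrometer Prometheus registry on the classpath, which is my assumption and 
not something decided yet; the metadata key "prometheus.path" is also just 
an illustrative name.

```yaml
# application.yml of a Linkis service (sketch under the assumptions above)
eureka:
  instance:
    metadata-map:
      # Hypothetical metadata key advertising where the metrics are served.
      prometheus.path: /actuator/prometheus

management:
  endpoints:
    web:
      exposure:
        # Expose the Prometheus scrape endpoint over HTTP.
        include: prometheus
```

Prometheus makes Eureka metadata entries available as 
__meta_eureka_app_instance_metadata_<metadataname> labels, so a relabel rule 
can copy such a value into __metrics_path__ if the path differs between 
services.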


The overall monitoring process can be designed in the following way:




The feature and the corresponding use cases are described in [Feature] 
Monitor Linkis based on Prometheus #1656


Any suggestions or ideas to make it better are welcome.


Thanks!
Sun Shun