Hi,
I'm building a monitoring system for Apache Spark and want to set up
default alerts (threshold or anomaly) on the 2-3 key metrics that most
Spark users typically want to alert on, but I don't yet have
production-grade experience with Spark myself.

Importantly, the alert rules have to be generally useful, so they can't
be on metrics whose values vary wildly with the size of the deployment.
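
For example, one deployment-size-agnostic rule I'm considering is the
per-executor failed-task ratio, pulled from the driver's REST API
(available since Spark 1.4). A rough sketch in Python; the 10%
threshold and the driver address are purely illustrative:

    import requests

    DRIVER_UI = "http://localhost:4040"  # placeholder: your driver's UI
    THRESHOLD = 0.10                     # illustrative, not a recommendation

    def check_failed_task_ratio():
        # List the applications known to this driver.
        apps = requests.get(f"{DRIVER_UI}/api/v1/applications").json()
        for app in apps:
            url = f"{DRIVER_UI}/api/v1/applications/{app['id']}/executors"
            for ex in requests.get(url).json():
                total = ex["totalTasks"]
                if total == 0:
                    continue
                ratio = ex["failedTasks"] / total
                if ratio > THRESHOLD:
                    print(f"ALERT: executor {ex['id']} in {app['id']}: "
                          f"{ratio:.0%} of tasks failed")

    if __name__ == "__main__":
        check_failed_task_ratio()

A ratio like this stays meaningful whether the cluster has 4 executors
or 400, which is the property I'm after.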

In other words, which metrics would be the most significant indicators
that something has gone wrong with your Spark:
 - master
 - worker
 - driver
 - executor
 - streaming
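
For context, those roughly correspond to the "instances" in Spark's
built-in metrics system, which is how I'm planning to collect the
numbers in the first place. A minimal conf/metrics.properties sketch;
the Graphite sink is just an example, and host/port are placeholders:

    # Report all instances' metrics to a Graphite sink (example only).
    *.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
    *.sink.graphite.host=graphite.example.com
    *.sink.graphite.port=2003
    *.sink.graphite.period=10
    *.sink.graphite.unit=seconds

    # Also export JVM metrics (heap, GC) from each instance.
    master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
    worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
    driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
    executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource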


I thought this list would be the best place to find experienced Spark
users, for whom answering this question should be trivial.

Thanks very much,
Mark Scott
