We are using Monit to kick off Spark Streaming jobs, and it seems to work fine.

On Monday, September 28, 2015, Chen Song <chen.song...@gmail.com> wrote:
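For readers wondering what "running under Monit" looks like in practice, a minimal sketch follows. It assumes a wrapper script that launches `spark-submit` in client (non-cluster) mode and writes the driver PID to a pidfile; all paths and script names here are hypothetical, not from this thread:

```
# /etc/monit.d/spark-streaming-job  (hypothetical location)
# Assumes start-streaming-job.sh runs spark-submit with a client-mode
# driver and writes its PID to the pidfile below.
check process my-streaming-job with pidfile /var/run/my-streaming-job.pid
  start program = "/opt/jobs/start-streaming-job.sh"
  stop program  = "/opt/jobs/stop-streaming-job.sh"
  if 5 restarts within 5 cycles then alert
```

With `set alert you@example.com` configured in the main monitrc, Monit will both restart the crashed driver and email about the restarts, which covers the notification part of the original question.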
> I am also interested specifically in monitoring and alerting on Spark
> streaming jobs. It would be helpful to get some general guidelines or
> advice on this from people who have implemented anything here.
>
> On Fri, Sep 18, 2015 at 2:35 AM, Krzysztof Zarzycki <k.zarzy...@gmail.com> wrote:
>
>> Hi there Spark Community,
>> I would like to ask you for advice: I'm running Spark Streaming jobs in
>> production. Sometimes these jobs fail, and I would like to get an email
>> notification about it. Do you know how I can set up Spark to notify me
>> by email if my job fails? Or do I have to use an external monitoring
>> tool?
>> I'm thinking of the following options:
>> 1. As I'm running those jobs on YARN, monitor the YARN jobs somehow. I
>> looked for this as well but couldn't find any YARN feature to do it.
>> 2. Run the Spark Streaming job under a scheduler like Oozie, Azkaban,
>> or Luigi. Those are built for batch jobs rather than streaming, but
>> they could work. Has anyone tried that?
>> 3. Run the job driver under the "monit" tool, catch the failure, and
>> send an email about it. Currently I'm deploying in yarn-cluster mode,
>> and I would have to give that up to run under monit...
>> 4. Set up a monitoring tool (like Graphite, Ganglia, or Prometheus) and
>> use Spark metrics, then implement alerting there. Can I get information
>> about failed jobs from Spark metrics?
>> 5. As in 4, but implement my own custom job metrics and monitor those.
>>
>> What's your opinion on these options? How do you solve this problem?
>> Anything Spark-specific?
>> I'll be grateful for any advice on this subject.
>> Thanks!
>> Krzysiek
>
> --
> Chen Song
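Option 1 above (monitoring YARN jobs directly) can be scripted against the ResourceManager's REST API, which exposes application state at `/ws/v1/cluster/apps`. A rough sketch, where the ResourceManager host/port and the alerting hook are assumptions you would adapt to your cluster:

```python
import json
import urllib.request

# Hypothetical ResourceManager address -- adjust for your cluster.
RM_URL = "http://resourcemanager:8088"

def failed_app_names(apps_json):
    """Return names of apps whose finalStatus is FAILED or KILLED,
    given an already-parsed /ws/v1/cluster/apps response body."""
    apps = (apps_json.get("apps") or {}).get("app") or []
    return [a["name"] for a in apps
            if a.get("finalStatus") in ("FAILED", "KILLED")]

def check_cluster():
    # The RM REST API supports filtering, e.g. ?states=FINISHED;
    # a failed streaming driver shows up with finalStatus=FAILED.
    with urllib.request.urlopen(
            RM_URL + "/ws/v1/cluster/apps?states=FINISHED") as resp:
        body = json.load(resp)
    return failed_app_names(body)

if __name__ == "__main__":
    # Offline demonstration with a sample response body:
    sample = {"apps": {"app": [
        {"name": "stream-a", "finalStatus": "SUCCEEDED"},
        {"name": "stream-b", "finalStatus": "FAILED"},
    ]}}
    print(failed_app_names(sample))
```

Run `check_cluster()` from cron (or from Monit itself) and send mail when the returned list is non-empty; that gives alerting without giving up yarn-cluster deploy mode.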