Hello David,

that's exactly what I want to know. Building a script to check this is not the problem.

My first check ran "doveadm replicator status", looked for "Waiting 'failed' requests", and reported a failure if the value was > 0. But with this in my monitoring I get a lot of alarms that are cleared again at the next poll.

For example: OpenNMS polls this nrpe check, which looks at the value described above. If there are one or more "Waiting 'failed' requests", it raises an alarm. 5 min later (at the next poll from OpenNMS) the "Waiting 'failed' requests" are back to 0, because dovecot has fixed the failed users by itself. So I have a lot of alarms that were cleared 5-10 min after they came into the monitoring, without anyone doing anything.

I'm searching for a way to get out of the system the users for which dovecot could not resolve a failure by itself, because that is what I want to alert on, so that I can take a look and fix it.
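One possible way to suppress the self-healing alarms is to let the check remember when the counter first became non-zero and only report a failure once it has stayed non-zero for longer than a grace period. The following is a minimal sketch, not a tested plugin: the state file path, the 600 s grace period, the function names, and the exact layout of the "Waiting 'failed' requests" line are assumptions. It also only tracks the global counter, so a counter that stays above 0 across polls could in principle be caused by different users each time.

```python
import json
import re
import subprocess
import time
from pathlib import Path

# Nagios/NRPE-style exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3

# Assumed line layout: "Waiting 'failed' requests   <number>" (optional colon).
FAILED_RE = re.compile(r"Waiting 'failed' requests\s*:?\s*(\d+)")


def parse_failed_count(status_output):
    """Extract the "Waiting 'failed' requests" counter from the status text."""
    match = FAILED_RE.search(status_output)
    if match is None:
        raise ValueError("no \"Waiting 'failed' requests\" line found")
    return int(match.group(1))


def evaluate(failed_count, failing_since, now, grace_seconds=600):
    """Return (exit_code, message, new_failing_since).

    failing_since is the timestamp of the first poll that saw a non-zero
    counter, or None if the previous poll was clean.
    """
    if failed_count == 0:
        return OK, "OK: no waiting failed requests", None
    if failing_since is None:
        failing_since = now
    age = now - failing_since
    if age >= grace_seconds:
        msg = f"CRITICAL: {failed_count} failed request(s) for {int(age)}s"
        return CRITICAL, msg, failing_since
    msg = f"OK: {failed_count} failed request(s) for {int(age)}s, within grace period"
    return OK, msg, failing_since


def run_check(state_path=Path("/var/tmp/check_dovecot_replication.json")):
    """Run doveadm, apply the grace period, persist state (hypothetical path)."""
    try:
        out = subprocess.run(
            ["doveadm", "replicator", "status"],
            capture_output=True, text=True, check=True,
        ).stdout
        count = parse_failed_count(out)
    except (OSError, subprocess.CalledProcessError, ValueError) as exc:
        return UNKNOWN, f"UNKNOWN: {exc}"
    try:
        since = json.loads(state_path.read_text()).get("failing_since")
    except (OSError, ValueError, AttributeError):
        since = None  # no usable state yet: treat the previous poll as clean
    code, msg, since = evaluate(count, since, time.time())
    state_path.write_text(json.dumps({"failing_since": since}))
    return code, msg
```

An nrpe wrapper would call run_check(), print the message, and exit with the returned code. With a 5 min poll interval and a 600 s grace period, the first two polls that see a non-zero counter stay OK and the third one turns CRITICAL; because poll timing jitters, a slightly smaller grace period may be more practical.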
Regards,
Oliver

-----Original Message-----
From: dovecot [mailto:dovecot-boun...@dovecot.org] On Behalf Of David Morsberger
Sent: Thursday, 18 February 2021 23:17
To: MK
Cc: dovecot@dovecot.org
Subject: Re: Monitoring Dovecot Replication

Oliver,

What’s your observable event that indicates replication has failed or is behind? Log message? Different file checksums?

David

> On Feb 18, 2021, at 10:54 AM, MK <dovecot...@mk.de> wrote:
>
> Hello Andrea,
>
> thanks for sharing your script with the community.
>
> But I think your script does not solve my problem. I have already tried
> monitoring failed replication with the output of "doveadm replicator status".
> In my opinion there is nothing in this output, or in any other status output
> I found, that shows me the users that have been failing for a longer time and
> where the replication process does not resolve the failure by itself.
> I'm searching for something that raises an alarm if dovecot could not fix a
> replication by itself after 10 min. In my experience most replication
> failures were fixed by dovecot automatically in under 10 min, because dovecot
> starts another try every 5 min.
> Or do you have some logic outside this script, maybe in Check_MK, that knows
> when a user has been out of replication for more than 10 min, or something
> like that? Until now I don't understand how this works for you as a way of
> monitoring the replication.
>
> To help you understand my side better: we are using OpenNMS to monitor our
> servers, and in this case I would use an nrpe check on the cluster to monitor
> this. OpenNMS polls this check every 5 min, and if it gives a fail result I
> have an alarm. Maybe this helps a little bit to understand my problem.
>
> Regards,
> Oliver
>
> -----Original Message-----
> From: dovecot [mailto:dovecot-boun...@dovecot.org] On Behalf Of Andrea Gabellini
> Sent: Monday, 15 February 2021 11:04
> To: Steven Varco; dovecot@dovecot.org
> Subject: Re: Monitoring Dovecot Replication
>
> Hello,
>
> here is my script. I'm not a professional programmer... ;-)
>
> Andrea
>
> On 12/02/21 17:53, Steven Varco wrote:
>> Hi Andrea
>>
>> It would be great if you could post that here, as I (and possibly others)
>> would also be interested. :)
>>
>> thanks,
>> Steven
>>
>
> --
> __________________________
> hAS ANYONE SEEN MY cAPSLOCK KEY?
> __________________________
>
> TIM San Marino S.p.A.
> Andrea Gabellini
> Engineering R&D
> TIM San Marino S.p.A. - https://www.telecomitalia.sm
> Via Ventotto Luglio, 212 - Piano -2
> 47893 - Borgo Maggiore - Republic of San Marino
> Tel: (+378) 0549 886237
> Fax: (+378) 0549 886188