Very interesting, this monit thing... Is it possible to see how your script
starts Kannel? My init.d script does not create a PID file in /var/run for
smsbox, sqlbox, or any other box. How do you manage that?

Thank you!

Alejandro Ramírez




----- Original Message -----
From: Jonathan Houser <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [email protected]
Sent: Fri,  2 Sep 2005 07:44:47 -0600
Subject: Re: Kannel Suicide


> 
>         Reuben,
> 
> > My problem is that when I start kannel and send an sms via the HTTP
> > service on my server everything is ok and I leave bearerbox and smsbox
> > running. But when I check back on them, I always find something wrong.
> > Sometimes smsbox kills itself and the HTTP service is no longer
> > available. Sometimes both processes kill themselves. Sometimes
> > bearerbox loses its connection with the SMSC and I have to restart
> > bearerbox for it to try to log in again. Is there a way to avoid
> > this? If the connection is lost with the SMSC, can bearerbox
> > reconnect to the SMSC automatically? If the processes are terminated
> > unexpectedly, can I create a bash script that detects if they are
> > running and, if not, launches them? I was going to killall bearerbox
> > and killall smsbox every few hours and relaunch them, but that is
> > not very wise because Kannel might still be down for hours until the
> > next killall and relaunch commands are executed. I can't afford one
> > second of downtime. And if Kannel was performing OK, it would have been
> > killed and relaunched for nothing. Can anyone please help keep Kannel
> > from committing suicide? Or if it does, can we resurrect it immediately?
> 
>      Been there.  Done that.  Two things:  First of all, it would be
> good to run 'grep -rF "PANIC" /your/kannel/log/dir/*' to find what's
> causing them to crash -- it may be something that's been fixed or could
> be fixed.  Secondly, use a monitoring daemon.  I built some init scripts
> for Kannel (one for bearerbox, one for smsbox -- then since I'm using
> Gentoo I made smsbox depend on {and thus automatically start} bearerbox,
> and bearerbox will also stop smsbox before it stops itself accordingly).
> I personally use monit.  I have it watch the PID file of smsbox, and
> have it call "/etc/init.d/bearerbox stop" to stop the service,
> "/etc/init.d/smsbox start" to start it.  I did have to do some extra
> work with scripts for monit because monit will call only the start
> script if the process is not running, which won't work because the
> damaged bearerbox really needs to be stopped first too.  Anyway, back to
> the first statement.  I have had what *was* the latest Kannel break WAP
> with some handsets, so I instead had to backport and add new patches to
> my working version to keep it from PANIC'ing.  For whatever reason, if
> the slightest little thing goes wrong and triggers an assert(), Kannel
> freaks out and exits.  For me it was empty PDUs from WAP and then more
> recently empty Lists.  Neither of those actually causes catastrophic,
> irrecoverable damage to Kannel if I wrap them up in quick "if ( x ==
> NULL )" checks, so they obviously weren't that bad.  You're probably
> seeing the same sort of non-fatal "oh crap, let's panic and exit" checks
> biting you.  As stated, they may have already been fixed, or may be
> really easy to wrap up in a quick check to keep them from taking down
> your Kannel as well.
> 
> Jon
> 
> 

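For anyone who wants to try the stop-then-start idea Jon describes without
monit, here is a rough sketch in plain sh. It is not Jon's actual script.
The PID file path is hypothetical (it assumes something already writes a PID
file for smsbox), and the function names is_running, restart_kannel and
check_and_restart are made up for illustration. The two init.d commands are
the ones quoted above.

```shell
#!/bin/sh
# Sketch only: the PID file path below is an assumption, adjust it to your setup.
PIDFILE="${PIDFILE:-/var/run/kannel/smsbox.pid}"
# Commands taken from the message above; override them via the environment.
STOP_CMD="${STOP_CMD:-/etc/init.d/bearerbox stop}"
START_CMD="${START_CMD:-/etc/init.d/smsbox start}"

# Succeeds only if the PID file is readable and names a live process.
# Note: kill -0 also fails when we lack permission to signal the process,
# so run this as root or as the user that owns the Kannel processes.
is_running() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1") || return 1
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a plain number
    esac
    kill -0 "$pid" 2>/dev/null
}

# Stop first (ignoring a failed stop), then start, so that a damaged
# bearerbox is cleared out before smsbox pulls it back up.
restart_kannel() {
    $STOP_CMD || true
    $START_CMD
}

# Restart only when the watched process is gone.
check_and_restart() {
    is_running "$PIDFILE" || restart_kannel
}

# Act only when asked to, so the file can also be sourced safely.
if [ "${1:-}" = "--run" ]; then
    check_and_restart
fi
```

Saved under some name of your choosing and run with --run from cron every
minute, this gives a crude watchdog. A monitoring daemon such as monit
reacts faster and handles more failure cases, so treat this as a fallback.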
