Re: [Clamav-users] Re: Freshclam daemon dies during update process

Rolf E. Sonneveld Thu, 08 Feb 2007 00:39:55 -0800

Dear Ian,

some time ago you wrote, in answer to one of my questions:

On 02/01/07 21:51, Rolf E. Sonneveld wrote:
Ian Abbott wrote:
On 02/01/07 12:12, Rolf E. Sonneveld wrote:
According to the monitoring system, the freshclam process disappeared between 14:29 and 14:34. Running ClamAV on Solaris 9. Any idea why after a 'connection refused' or 'connection timed out' the freshclam process dies?
It would be nice if there was an option to run freshclam as a "foreground daemon" so you could monitor its exit status, but there isn't. My guess is that it's receiving a signal whose current action is set to kill the process.
The signal handling for SIGALRM and SIGUSR1 in freshclam.c's main() function is a bit buggy. It sets the following actions in the main loop:
        sigaction(SIGALRM, &sigact, &oldact);
        sigaction(SIGUSR1, &sigact, &oldact);

then later on:

        sigaction(SIGALRM, &oldact, NULL);
        sigaction(SIGUSR1, &oldact, NULL);
There are two problems here. The two signals shouldn't really be using the same variable 'oldact', even though the default action for both signals is the same. The other problem is that the program spends some of its time with the SIGALRM and SIGUSR1 signals set to the default action, which is to terminate the process. In fact, the more I look at the main loop of the freshclam daemon, the worse it gets! It may catch SIGHUP and set the 'terminate' variable at the wrong time, causing the main loop to exit prematurely, or it may fail to catch 'SIGALRM' or 'SIGUSR1' some of the time, causing the process to terminate with that signal.
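
To make the two problems above concrete, here is a minimal sketch of one way to avoid them. It is not the code from freshclam.c: the names wake_up, on_wakeup, old_alrm, old_usr1, install_wakeup_handlers, restore_wakeup_handlers and take_wakeup are all hypothetical. It assumes the handler can be installed once before the main loop and left in place until shutdown, with one saved action per signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag for this sketch; not a variable from freshclam.c. */
static volatile sig_atomic_t wake_up = 0;

/* Async-signal-safe handler: it only records that a wake-up arrived. */
static void on_wakeup(int sig)
{
    (void)sig;
    wake_up = 1;
}

/* One saved action per signal, instead of a shared 'oldact'. */
static struct sigaction old_alrm;
static struct sigaction old_usr1;

/* Install the handler once, before the main loop starts. */
int install_wakeup_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_wakeup;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGALRM, &sa, &old_alrm) != 0)
        return -1;
    if (sigaction(SIGUSR1, &sa, &old_usr1) != 0)
        return -1;
    return 0;
}

/* Restore the previous actions only when the daemon shuts down. */
int restore_wakeup_handlers(void)
{
    if (sigaction(SIGALRM, &old_alrm, NULL) != 0)
        return -1;
    if (sigaction(SIGUSR1, &old_usr1, NULL) != 0)
        return -1;
    return 0;
}

/*
 * Return 1 if a wake-up was recorded since the last call, and clear it.
 * A real daemon would block the two signals around this read-and-clear
 * so that a wake-up arriving in between is not lost.
 */
int take_wakeup(void)
{
    int pending = wake_up;

    wake_up = 0;
    return pending;
}
```

With the handler installed for the whole lifetime of the process, a SIGALRM or SIGUSR1 that arrives during the network code only sets the flag instead of terminating the process.
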
Thanks, Ian. This sounds interesting. If I understand you correctly, this can be related to the problem we see, with the disappearing freshclam daemon process? I'm not a programmer so I'm afraid I can't contribute code here; also, I'm not familiar with the way ClamAV changes/fixes are done. Is anyone in charge of the freshclam code?
It might be the problem, especially if you are sending a signal (SIGHUP) to the freshclam process from a log rotation script. If this occurs almost immediately after an internally generated SIGALRM, it could cause the main loop to terminate early, though that is extremely unlikely as the time window is very small. A far more likely cause is that the process is woken up by the SIGHUP and then the internally generated SIGALRM occurs later, killing the process. The program uses the default SIGALRM handler while it is doing all the network stuff, for example, so if the process is woken by an external SIGHUP, spends a lot of time doing network stuff, and receives the internally generated SIGALRM at this time, the process will be killed.
I'll mention my theory on the devel list, anyway.
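
The failure mode described above (a SIGALRM arriving while the default action is in place) can be reproduced in isolation. The following is only an illustrative sketch, not freshclam code, and the function name signal_that_killed_child is hypothetical. A child process leaves SIGALRM at its default action, arms a one-second alarm and waits. The parent then reads the wait status, which is also how a supervising process could see that a daemon was killed by a signal and left no core file.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that keeps SIGALRM at its default action, arms a
 * one-second alarm and then waits. Returns the number of the signal
 * that killed the child, 0 if it exited normally, or -1 on error.
 */
int signal_that_killed_child(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        sigset_t set;

        /* Default action for SIGALRM: terminate the process. */
        signal(SIGALRM, SIG_DFL);

        /* Make sure SIGALRM is not blocked by an inherited mask. */
        sigemptyset(&set);
        sigaddset(&set, SIGALRM);
        sigprocmask(SIG_UNBLOCK, &set, NULL);

        alarm(1);
        for (;;)
            pause();    /* stands in for the slow network code */
    }

    /* Parent: wait for the child, retrying if interrupted. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }

    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return 0;
}
```

On a POSIX system the returned value is expected to equal SIGALRM, since termination by that signal is its default action.
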

Did you get any response on this issue on the development list? The problem still occurs now and then (occasionally, once every two or three weeks, without a pattern). Today I came into the office and found freshclam had died again. Logfile:


--------------------------------------
Received signal: wake up
ClamAV update process started at Thu Feb  8 04:03:52 2007
WARNING: Your ClamAV installation is OUTDATED!
WARNING: Local version: 0.88.6 Recommended version: 0.88.7
DON'T PANIC! Read http://www.clamav.net/faq.html

main.cvd is up to date (version: 42, sigs: 83951, f-level: 10, builder:tkojm)
daily.cvd is up to date (version: 2533, sigs: 5388, f-level: 9, builder:sven)

--------------------------------------
Received signal: wake up
ClamAV update process started at Thu Feb  8 04:33:52 2007
WARNING: Your ClamAV installation is OUTDATED!
WARNING: Local version: 0.88.6 Recommended version: 0.88.7
DON'T PANIC! Read http://www.clamav.net/faq.html

main.cvd is up to date (version: 42, sigs: 83951, f-level: 10, builder:tkojm)

nonblock_connect: connect timing out (30 secs)
nonblock_connect: connect timing out (30 secs)
nonblock_connect: connect timing out (30 secs)
nonblock_connect: connect timing out (30 secs)
nonblock_connect: connect timing out (30 secs)
nonblock_connect: connect timing out (30 secs)
nonblock_connect: connect timing out (30 secs)
connect_error: getsockopt(SO_ERROR): fd=0 error=145: Connection timed out

No core file found. Unfortunately, enabling Debug does not show timestamps.
Running:

-bash-3.00$ /opt/ClamAV/sbin/clamd -V
ClamAV 0.88.6/2534/Thu Feb  8 04:28:17 2007

The ClamAV mirror defined is:

bash-3.00# grep -i db /opt/ClamAV/etc/freshclam.conf
DatabaseMirror db.DE.clamav.net

We have seen the same problem when using db.NL.clamav.net. Looking at the availability figures for Germany (http://www.clamav.net/mirrors.html#de) it seems there has only been one server with a temp. failure tonight (which matches roughly the time the problem occurred).


What does the freshclam daemon do:

a) do one DNS lookup (find multiple A records), and after the first host fails, take the second host and so on, or

b) perform a DNS lookup after each failed connection?

In case a) I can't understand why freshclam would fail seven times, except when there has been a network problem for this host (there wasn't). In case b) it is possible that the system gets the same IP address each time (depending on the DNS client library and the way the results are sorted).
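
For reference, behaviour a) can be sketched as follows. This is a minimal illustration under stated assumptions, not freshclam's actual connection code: the function name connect_first_available is hypothetical, and it uses a plain blocking connect() rather than the non-blocking connect with a 30-second timeout that the log suggests. It performs one lookup with getaddrinfo() and then walks the returned address list until one connection succeeds.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Resolve 'host' once, then try each returned address in order.
 * Returns a connected socket descriptor, or -1 if the lookup failed
 * or no address accepted the connection.
 */
int connect_first_available(const char *host, const char *port)
{
    struct addrinfo hints;
    struct addrinfo *results;
    struct addrinfo *ai;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */

    /* Single DNS lookup; the result is a linked list of addresses. */
    if (getaddrinfo(host, port, &hints, &results) != 0)
        return -1;

    for (ai = results; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;        /* connected: stop trying */
        close(fd);        /* this address failed: move to the next */
        fd = -1;
    }

    freeaddrinfo(results);
    return fd;
}
```

With this approach, seven consecutive timeouts would require seven distinct addresses to fail, or the same address to appear repeatedly in the list. With behaviour b), the resolver could hand back the same first address on every retry.
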

FYI, the system on which ClamAV is running is a Solaris 10 system. I hope there will be a fix for this in the next release.


Regards,
/rolf
