On Mar 5, 2004, at 15:26, Doug Hardie wrote:



On Mar 5, 2004, at 02:41, Trog wrote:


On Fri, 2004-03-05 at 01:15, Doug Hardie wrote:


I just uncommented the thread timeout the last time I restarted clamd
a couple minutes ago so I don't know what effect that will have.

ThreadTimeout isn't used in the current CVS version.


Here is some more information: After running with the timeout set to
500, clamd no longer dies. It chugs along for quite a while (about 10
minutes) at full CPU usage and then returns to normal. I don't see
anything different in the load between the periods. However, a ktrace
of clamd shows a significant difference. Normally clamd shows nothing
much when idle, and it shows the messages being received (read) when
processing a message. However, when it's running at full CPU
utilization, ktrace shows thousands of sequences like:


   8313 clamd    PSIG  SIGPROF caught handler=0x28116228 mask=0x0 code=0x0
   8313 clamd    CALL  gettimeofday(0x2815fe4c,0)
   8313 clamd    RET   gettimeofday 0
   8313 clamd    CALL  sigprocmask(0x3,0x2815fed8,0)
   8313 clamd    RET   sigprocmask 0
   8313 clamd    CALL  sigaltstack(0x2817c000,0)
   8313 clamd    RET   sigaltstack 0
   8313 clamd    CALL  poll(0x806f000,0x1,0)
   8313 clamd    RET   poll 0
   8313 clamd    CALL  sigreturn(0x808ac64)
   8313 clamd    RET   sigreturn JUSTRETURN

Then one message is processed, and it goes back to a few thousand
more of those sequences.

This looks entirely broken. Your trace indicates that the last argument
to poll (the timeout) is zero. The code looked like this:


count = poll(poll_data, 1, CL_DEFAULT_SCANTIMEOUT*1000);

i.e. the timeout *can't* be zero unless you changed the value of
CL_DEFAULT_SCANTIMEOUT or your system is fundamentally broken.

Unless, that is, your system is using poll to spin somewhere.
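
To make the timeout point concrete, here is a minimal sketch. It is not
clamd code: idle_poll_ms is a hypothetical helper that polls the read end
of an idle pipe and measures how long poll() takes. With a timeout of 0
the call should return at once, which is what a spinning loop looks like
in a trace; with a positive timeout it should sleep for roughly that many
milliseconds.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not from clamd): poll the read end of a fresh,
 * idle pipe with the given timeout and return the elapsed wall-clock
 * time in milliseconds, or -1 on error. */
long idle_poll_ms(int timeout_ms)
{
    int fds[2];
    struct pollfd pfd;
    struct timeval start, end;
    int count;

    /* A negative timeout would block forever on an idle pipe. */
    assert(timeout_ms >= 0);
    if (pipe(fds) != 0)
        return -1;

    pfd.fd = fds[0];
    pfd.events = POLLIN;
    pfd.revents = 0;

    gettimeofday(&start, NULL);
    count = poll(&pfd, 1, timeout_ms);   /* third argument is in milliseconds */
    gettimeofday(&end, NULL);

    close(fds[0]);
    close(fds[1]);
    if (count != 0)
        return -1;   /* error, or the idle pipe unexpectedly became ready */

    return (end.tv_sec - start.tv_sec) * 1000L
         + (end.tv_usec - start.tv_usec) / 1000L;
}
```

Calling idle_poll_ms(0) in a loop burns CPU the same way the trace shows,
whereas idle_poll_ms(CL_DEFAULT_SCANTIMEOUT*1000) would sleep between
wakeups.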

-trog

That was my thought also. I don't know why it's zero. When clamd is only using about 2% of the CPU, the timeout value is on the order of 5 to 10 seconds. However, something is very unusual here: the line of code above is not in the version I am using. I am using the snapshot from the morning of 4 Mar.

After a review of clamd/session.c and the developers forum archives, I know what the cause of my problem is, but not necessarily why. The version that works (clamd / ClamAV version 'devel-20040209', clamav-milter version '0.66m') does not use either poll or select. At least neither is called directly. All of the later versions use select, and they fail when calling poll. So I suspect that on my system select is calling poll. However, the time field is getting set to zero when the source code clearly indicates that it should be non-zero. The time field is reset to a constant after each select call. Recompiling with no optimization does not change the outcome, so it's not likely to be an overlay either. I am guessing that having quite a number of threads active may be too much for select, which may be getting them confused. However, that's a wild guess. I have no idea how to check that out.
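
For reference, the pattern I am describing looks roughly like the sketch below. This is not the code from clamd/session.c: read_byte_with_timeout and its parameters are hypothetical names for illustration. The point is only that the timeval is rebuilt before every select call, since some systems modify it in place, so the source never asks for a zero timeout.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not from clamd): wait for one byte on fd, giving up
 * after max_rounds timeouts of timeout_usec microseconds each.
 * Returns 1 if a byte was read into *out, 0 on timeout or EOF, -1 on error. */
int read_byte_with_timeout(int fd, long timeout_usec, int max_rounds, char *out)
{
    int round = 0;

    assert(out != NULL && fd >= 0 && fd < FD_SETSIZE);
    while (round < max_rounds) {
        fd_set rfds;
        struct timeval tv;
        ssize_t got;
        int n;

        /* Rebuild both the descriptor set and the timeout on every pass;
         * select() may overwrite either one. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_usec / 1000000L;
        tv.tv_usec = timeout_usec % 1000000L;

        n = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry this round */
            return -1;
        }
        if (n > 0) {
            got = read(fd, out, 1);
            if (got < 0)
                return -1;
            return got == 1 ? 1 : 0;
        }
        round++;                 /* n == 0: this round timed out */
    }
    return 0;
}
```

If a trace of code like this still shows poll being called with a timeout of 0, the zero is presumably coming from somewhere below the application, which is what I suspect but cannot confirm.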


Granted, I am only working with one OS type/version, but it appears to me that neither the poll nor the select is required. The accept seems to handle the situation fine by itself.
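
What I have in mind is something like the following sketch, which simply blocks in accept with no poll or select in front of it. Both function names are hypothetical and this is not how clamd is actually structured; listen_loopback exists only so the example can be tried on its own. Whether this is adequate on other systems, or when a scan timeout is needed, is a separate question.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listen on 127.0.0.1 at a kernel-chosen port.
 * Returns the listening descriptor, or -1 on error, and stores the port
 * (in network byte order) in *port_out. */
int listen_loopback(in_port_t *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd;

    assert(port_out != NULL);
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                       /* let the kernel pick a port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = addr.sin_port;
    return fd;
}

/* Block in accept() until a client connects. No poll() or select() is
 * used; the call is retried only if a signal interrupts it. */
int accept_blocking(int listen_fd)
{
    int fd;

    do {
        fd = accept(listen_fd, NULL, NULL);
    } while (fd < 0 && errno == EINTR);
    return fd;
}
```

The EINTR retry is there because a process that receives frequent signals, as in the SIGPROF trace above, can have a blocking accept interrupted.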



_______________________________________________
Clamav-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/clamav-users