2010/4/8 Török Edwin <edwinto...@gmail.com>:
> On 2010-04-08 23:21, Royce Williams wrote:
>>
>> 2010/4/8 Török Edwin<edwinto...@gmail.com>:
>>>>
>>>> I do have strace. I am having trouble recreating the problem. I first
>>>> uploaded the file to the server, but it scanned clean. I then
>>>> configured MIMEDefang to leave its files behind after processing, sent
>>>> myself a PDF that triggered the error, and then manually scanned the
>>>> left-behind file. This scan succeeded.
>>>>
>>>> I am not sure what failure mode would make this succeed with clamscan
>>>> on the command line but fail when called by MIMEDefang.
>>>
>>> Does MIMEDefang use clamdscan or clamscan?
>>
>> We are using clamd. MIMEDefang can use both, but prefers clamd if it
>> is available.
>
> Then try to reproduce the problem by using clamdscan, not clamscan.

Apologies for making you have to point out the obvious. :-)

>>>> I will also drop back to 0.95.2 to see if the problem was the OS
>>>> upgrade or the clamav upgrade.
>>
>> My MX servers are experiencing the same problem, but I just noticed
>> that they have a higher size threshold - my test 2.5M PDF gets through
>> OK, but a 7M PDF triggers "can't map file into memory" error.
>>
>> I am starting to suspect a resource exhaustion issue. I just restarted
>> clamd on one of the inbound servers and the problem remains, but the
>> threshold increased. Now, 10M files are getting through but a 25M
>> file caused the error.
>>
>> On the upgrade/downgrade test server, not only did the issue go away
>> for 2M files when I downgraded, it remained gone when I upgraded back.
>> This is consistent with the exhaustion theory.
>>
>> The servers that were exhibiting the problem vigorously today are the
>> ones that I upgraded first - days before the other servers. From the
>> logs, the problem has been getting steadily more frequent over time.
>>
>> In another branch of this thread, Chuck Swiger suggested that
>> munmap()s might not be happening, which may be consistent with this
>> theory (in my limited view as a sysadmin; I am not a developer).
>
> Even if clamd had some leaks, once it quits the kernel reclaims all memory
> used by it (unless the kernel is horribly broken).
>
>>
>> Edwin, should I proceed with the source modification you suggested
>> earlier, or something else?
>
> If you can reproduce the problem with clamdscan, then run clamd (not
> clamdscan!) under strace and record stderr output.

I took a production server out of the farm without restarting clamd,
and can reproduce the problem. However, the strace output appears to
contain nothing useful.
Syntax was:

strace -f -p 61805 >strace.out 2>&1

and the uniq'd output is:

[r...@glenn /var/tmp]# cat strace.out | sort -u
<unfinished ...>
) = 0x1c6
NULL) = 189
Process 61805 attached - interrupt to quit
Process 61805 detached
SYS_11() = 0
SYS_11(syscall: missing exit
chmod(0xbdf010c0, 0700) = 0
clock_gettime(0, 0xbeb20f34) = 0
close(11) = 0
close(12) = 0
close(3) = 0
fcntl(12, F_SETFD, FD_CLOEXEC) = 0
fstat(11, {...}) = 0
fstat(12, {...}) = 0
getpid() = 61805 (ppid 0)
gettimeofday({...}, NULL) = 0
mkdir(0xbdf010c0, 0700) = 0
open(0xbdf010c0, O_RDONLY|O_NONBLOCK) = 12
read(11, ""..., 43562) = 0
read(11, 0xbdf1c17c, 131072) = 131072
read(11, 0xbdf1c17c, 131072) = 87510
read(11, 0xbeb1e833, 1024) = 1024
rmdir(0xbdf010c0) = 0
sendto(10, 0xbeb1fbfe, 74, 0, NULL, 0) = 74
sendto(3, 0xbeb20ae0, 41, 0, NULL, 0) = 41
shutdown(3, 2 /* send and receive */) = 0
stat(0xb83085c0, {...}) = 0
stat(0xbdf010c0, {...}) = 0
syscall: missing entry
syscall_3090294560(0x8, 0x1) = 0
syscall_397(0xc, 0xbeb1e6c4) = 0
syscall_454(0xb8322720) = 0x1c6
syscall_454(0xb8322720, 0x8) = 0x1c6
syscall_454(0xb8322720, 0x8, 0x1) = 0x1c6
syscall_454(0xb8322720, 0x8, 0x1, 0xb8322700) = 0x1c6
syscall_454(0xb8322720, 0x8, 0x1, 0xb8322700, 0xbeb20f2c <unfinished ...>
syscall_454(0xb8322720, 0x8, 0x1, 0xb8322700, 0xbeb20f2c) = 0
syscall_454(0xb8322720, 0x8, 0x1, 0xb8322700, 0xbeb20f2c) = 0x1c6
syscall_454(0xb8322720, 0x8, 0x1, 0xb8322700, 0xbeb20f2c, 0x3) = 0x1c6
syscall_454(0xb8322720, 0x8, 0x1, 0xb8322700, 0xbeb20f2c, 0x3, 0) = 0x1c6
syscall_454(0xb8322720, 0x8, 0x1, 0xb8322700, 0xbeb20f2c, 0x3syscall: missing exit
syscall_454(0xb8322720, 0x8, 0x1, 0xb8322700syscall: missing exit
syscall_454(0xb8322720, 0x8, 0x1syscall: missing exit
syscall_454(0xb8322720, 0x8syscall: missing exit
syscall_454(0xb8322720syscall: missing exit
syscall_477(0, 0x9d55d6, 0x1, 0x2, 0xb, 0, 0) = -1 (errno 12)
syscall_478(0xb, 0, 0, 0) = 0
syscall_480(0xb, 0, 0) = 0
unlink(0xb8322540) = 0
write(1, 0xb8389000, 41) = 41
write(2, 0xbeb1e50c, 40) = 40
write(4, 0xb8321000, 69) = 69

I started this bug report, attaching both the clamdscan strace (which
you said that you don't need, but at least it has the error in it) and
the clamd strace (which looks to my untrained eye to have no smoking
gun).

https://wwws.clamav.net/bugzilla/show_bug.cgi?id=1941

I also attached a kdump of ktrace -d -p [clamd-pid]. Because a large
PDF (10M) was needed to trigger the error, the kdump is about 24M, so
attached as a URL.

Royce
_______________________________________________
Help us build a comprehensive ClamAV guide: visit http://wiki.clamav.net
http://www.clamav.net/support/ml
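
Editorial sketch, not part of the original message: one way to sift a
trace like the one above is to keep only the calls that returned -1 and
translate the errno number into its symbolic name. The helper name
failed_calls and the regular expression are illustrative assumptions
based on the line format shown in this post, not on any strace or
ClamAV interface. Applied to the output above, the only match would be
syscall_477 with errno 12, which Python reports as ENOMEM. Reading
syscall 477 as mmap is an assumption taken from FreeBSD's syscall
table; the thread itself does not confirm it. Under that reading, the
second argument 0x9d55d6 would be a length of roughly 10 MB, in line
with the size of the test PDF.

```python
import errno
import re

# Matches result lines of the form "name(args) = -1 (errno N)",
# which is how failed calls appear in the trace quoted in this post.
FAILED_CALL = re.compile(r"^(\w+)\(.*\)\s+=\s+-1\s+\(errno (\d+)\)")


def failed_calls(lines):
    """Return (call name, errno number, symbolic errno) for each failed call."""
    results = []
    for line in lines:
        match = FAILED_CALL.match(line.strip())
        if match:
            number = int(match.group(2))
            # errno.errorcode maps numbers to names for the platform
            # running this script; unknown numbers are labeled as such.
            name = errno.errorcode.get(number, "unknown")
            results.append((match.group(1), number, name))
    return results


if __name__ == "__main__":
    # Three lines copied from the trace in this post.
    sample = [
        "close(11) = 0",
        "syscall_477(0, 0x9d55d6, 0x1, 0x2, 0xb, 0, 0) = -1 (errno 12)",
        "syscall_478(0xb, 0, 0, 0) = 0",
    ]
    for call, number, name in failed_calls(sample):
        print(call, number, name)
```

Errno numbers are platform specific, so the symbolic name is only
reliable when the script runs on the same kind of system that produced
the trace; errno 12 happens to be ENOMEM on both Linux and FreeBSD.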