Hi,

Today, all of a sudden, five of our email servers (more precisely, MX processing servers) started returning qmail-scanner errors (the dreaded "qq temporary error" message). After some digging, we found that the cause was that clamav (0.99.2) had died, with errors like:

/var/spool/qscan/tmp/mx3151697687779817876/1516976877.17928-1.mx3: Can't open file or directory ERROR
/var/spool/qscan/tmp/mx3151697687779817876/image001.jpg: Can't open file or directory ERROR
/var/spool/qscan/tmp/mx3151697687779817876/image002.jpg: Can't open file or directory ERROR
/var/spool/qscan/tmp/mx3151697687779817876/image003.jpg: Can't open file or directory ERROR
ERROR: accept() failed:
ERROR: accept() failed:
ERROR: accept() failed:
ERROR: accept() failed:
ERROR: accept() failed:
ERROR: accept() failed:
ERROR: accept() failed:

and more errors that appear to be filesystem/resource related.

There are no disk space problems, no permission problems and no inode problems. There is plenty of free RAM, and CPU usage and load averages are very low. Upgrading to the recent 0.99.3 didn't help.

Finally, strace showed what is happening:

[pid 10797] open("/tmp/clamav-43d08860615a4d14dae3046aee3e5e98.tmp/clamav-781e98988f119c6433f2328d7224825c.tmp", O_RDWR|O_CREAT|O_EXCL|O_TRUNC, 0700) = -1 EMFILE (Too many open files)
[pid 10797] write(2, "LibClamAV Warning: fileblobScan, fullname == NULL\n", 50) = 50
[pid 10797] write(2, "LibClamAV Error: fileblobDestroy: textportion not saved: report to http://bugs.clamav.net\n", 90) = 90

which is weird, because the system is nowhere near loaded enough (in terms of workload) for clamav to use up the default 1024 descriptors. So, looking at /proc/PID/fd ...

lrwx------ 1 root root 64 Jan 26 19:20 68 -> /tmp/clamav-2b88cf1b1e55ab0d8cb045ab908e3273.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 680 -> /tmp/clamav-7564115561d870008cbd6783ea304e96.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 681 -> /tmp/clamav-a3be54920dff8420d86dfffcf6df41ea.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 682 -> /tmp/clamav-270dd0e754c54dacab1bcd28a90c38e3.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 683 -> /tmp/clamav-7405a8caced08e2020809f2621cd16a2.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 684 -> /tmp/clamav-b41105e2aeb1da3cef054e6938f5e26a.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 685 -> /tmp/clamav-d8735b7980637a5866dd7a2ee274a272.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 686 -> /tmp/clamav-9395651f00ea6530a58bdb8480d93223.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 687 -> /tmp/clamav-01c6f26331d41a92e5454203d7ee3229.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 688 -> /tmp/clamav-d6d3b8d9da8982ae0f724b2920d37c47.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 689 -> /tmp/clamav-b5328deea4127b4c5f07a9d0a6f095c0.tmp (deleted)

Those servers have been running with the same configuration for years, and this has never happened until now. Of course we could raise the nofile limit with ulimit, but that would just postpone the daemon's death.

Any help would be greatly appreciated.

Thanks,
Rubén.

_______________________________________________
clamav-users mailing list
clamav-users@lists.clamav.net
http://lists.clamav.net/cgi-bin/mailman/listinfo/clamav-users

Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml
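
The /proc/PID/fd check described in the message can be scripted to tally how many descriptors point at deleted ClamAV temp files. What follows is only a minimal sketch in Python, assuming Linux's /proc layout and the /tmp/clamav-<32 hex characters>.tmp naming visible in the listing. The names classify_targets and read_fd_targets are made up for illustration and are not part of ClamAV.

```python
import os
import re
from collections import Counter

# Assumption: leaked entries look like the ones in the listing, i.e.
# "/tmp/clamav-<32 hex chars>.tmp (deleted)".
CLAMAV_TMP_DELETED = re.compile(r"^/tmp/clamav-[0-9a-f]{32}\.tmp \(deleted\)$")


def classify_targets(targets):
    """Count fd link targets by kind (hypothetical helper)."""
    counts = Counter()
    for target in targets:
        if CLAMAV_TMP_DELETED.match(target):
            counts["clamav_tmp_deleted"] += 1
        elif target.endswith(" (deleted)"):
            counts["other_deleted"] += 1
        else:
            counts["other"] += 1
    return counts


def read_fd_targets(pid):
    """Return the symlink targets found under /proc/<pid>/fd (Linux only)."""
    fd_dir = "/proc/{}/fd".format(pid)
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


# Example usage (run with enough privileges to read the daemon's fd directory):
#   counts = classify_targets(read_fd_targets(clamd_pid))
#   print(dict(counts), "total:", sum(counts.values()))
```

Run twice a few minutes apart against the clamd PID: a clamav_tmp_deleted count that keeps growing while the mail volume stays flat would point to a descriptor leak rather than real load, which is why raising the limit alone would only delay the failure.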