Hi, 

Today, all of a sudden, 5 of our email servers (more precisely, MX
processing servers) started to throw qmail-scanner errors (the
feared "qq temporary error" message). After some digging, we found that
the cause was that clamav (0.99.2) had died, with errors like:

/var/spool/qscan/tmp/mx3151697687779817876/1516976877.17928-1.mx3: Can't open file or directory ERROR
/var/spool/qscan/tmp/mx3151697687779817876/image001.jpg: Can't open file or directory ERROR
/var/spool/qscan/tmp/mx3151697687779817876/image002.jpg: Can't open file or directory ERROR
/var/spool/qscan/tmp/mx3151697687779817876/image003.jpg: Can't open file or directory ERROR

ERROR: accept() failed:
ERROR: accept() failed:
ERROR: accept() failed:
ERROR: accept() failed:
ERROR: accept() failed:
ERROR: accept() failed:
ERROR: accept() failed: 

along with other errors that look filesystem- or resource-related. There
are no disk space, permission, or inode problems. Plenty of RAM is free,
and CPU usage and load averages are very low. Upgrading to the recent
0.99.3 didn't help.

Finally, strace revealed what was happening:

[pid 10797] open("/tmp/clamav-43d08860615a4d14dae3046aee3e5e98.tmp/clamav-781e98988f119c6433f2328d7224825c.tmp", O_RDWR|O_CREAT|O_EXCL|O_TRUNC, 0700) = -1 EMFILE (Too many open files)

[pid 10797] write(2, "LibClamAV Warning: fileblobScan, fullname == NULL\n", 50) = 50
[pid 10797] write(2, "LibClamAV Error: fileblobDestroy: textportion not saved: report to http://bugs.clamav.net\n", 90) = 90

which is weird, because the workload on this system is nowhere near
heavy enough to make clamav use up the default 1024 descriptors. So,
looking at /proc/PID/fd ...

lrwx------ 1 root root 64 Jan 26 19:20 68 -> /tmp/clamav-2b88cf1b1e55ab0d8cb045ab908e3273.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 680 -> /tmp/clamav-7564115561d870008cbd6783ea304e96.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 681 -> /tmp/clamav-a3be54920dff8420d86dfffcf6df41ea.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 682 -> /tmp/clamav-270dd0e754c54dacab1bcd28a90c38e3.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 683 -> /tmp/clamav-7405a8caced08e2020809f2621cd16a2.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 684 -> /tmp/clamav-b41105e2aeb1da3cef054e6938f5e26a.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 685 -> /tmp/clamav-d8735b7980637a5866dd7a2ee274a272.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 686 -> /tmp/clamav-9395651f00ea6530a58bdb8480d93223.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 687 -> /tmp/clamav-01c6f26331d41a92e5454203d7ee3229.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 688 -> /tmp/clamav-d6d3b8d9da8982ae0f724b2920d37c47.tmp (deleted)
lrwx------ 1 root root 64 Jan 26 19:20 689 -> /tmp/clamav-b5328deea4127b4c5f07a9d0a6f095c0.tmp (deleted)
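
For anyone who wants to check their own servers for the same pattern,
the listing above can be summarized with a rough sketch like the one
below. It assumes Linux's /proc/PID/fd layout and enough privileges to
read the daemon's entries. The function name summarize_fds and the
proc_root parameter are made up for illustration and are not part of
ClamAV.

```python
import os
from collections import Counter


def summarize_fds(pid, proc_root="/proc"):
    """Count a process's open descriptors and how many point at deleted files.

    proc_root is a hypothetical parameter, only there so the function
    can be pointed at a test directory instead of the real /proc.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        counts["total"] += 1
        # The kernel appends " (deleted)" to the link target of an
        # unlinked file that is still held open.
        if target.endswith(" (deleted)"):
            counts["deleted"] += 1
            if os.path.basename(target).startswith("clamav-"):
                counts["deleted_clamav_tmp"] += 1
    return counts
```

Running dict(summarize_fds(PID)) against the clamd PID every few minutes
should show whether the count of deleted clamav temp files keeps growing
toward the descriptor limit.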

Those servers have been working with the same configuration for years,
without this happening until now. 

Of course we could raise the nofile limit with ulimit, but that
would just postpone the daemon's death.
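
To see how much headroom the daemon actually has before touching ulimit,
the soft limit of the running process can be read from
/proc/PID/limits. This is only a sketch, assuming the usual Linux format
of that file (a "Max open files" row with the soft limit followed by the
hard limit). The names soft_nofile_limit and read_soft_nofile_limit are
hypothetical helpers.

```python
def soft_nofile_limit(limits_text):
    """Return the soft 'Max open files' limit from /proc/PID/limits text.

    Returns None if the row is missing or the limit is 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Fields: Max, open, files, <soft>, <hard>, <units>
            fields = line.split()
            soft = fields[3]
            return None if soft == "unlimited" else int(soft)
    return None


def read_soft_nofile_limit(pid):
    """Read and parse /proc/<pid>/limits for a running process (Linux only)."""
    with open("/proc/%s/limits" % pid) as handle:
        return soft_nofile_limit(handle.read())
```

Comparing this value with the number of entries in /proc/PID/fd shows
how close the process is to EMFILE. If the descriptors are being leaked,
a higher limit only delays the failure.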

Any help would be greatly appreciated. 

Thanks, 

Rubén.
_______________________________________________
clamav-users mailing list
clamav-users@lists.clamav.net
http://lists.clamav.net/cgi-bin/mailman/listinfo/clamav-users