Does clamd use multi-threading for the various "engines" within a single scan, or only to handle multiple requests from different sources?
On Tue, 9 Apr 2019 21:29:43 +0000
"Micah Snyder (micasnyd) via clamav-users" <clamav-users@lists.clamav.net> wrote:

> Maarten,
>
> Your test results are pretty great. I really like your breakdown of
> the signatures by category. I will caution that scan times will vary
> quite heavily depending on what you’re scanning, based on Target type
> (https://www.clamav.net/documents/clamav-file-types).
>
> In addition, it’s important to distinguish between load and scan
> times. The time reported by clamscan is both load + scan. If you
> just want scan time, you will want to load the database with clamd
> and then test the scan time with clamdscan.
>
> Regarding load time vs scan time, all of the signatures must be
> loaded, but depending on the target type of the file being scanned,
> not all of the signatures will be matched against the file. That is,
> daily_Win.ldb might take the longest to load due to the number of
> signatures or complexity of the signatures, but when scanning a PDF,
> they probably won’t impact scan time, as Win signatures are probably
> mostly target type 1 (PE file).
>
> I’ve spent a bit of time today investigating what I believe is
> responsible for slow load and scan times for the Phishtank sigs. I
> had a hunch, based on a conversation we saw a while back in the
> mailing list, that the identical beginning for URL-based signatures
> results in an unbalanced and inefficient tree for matching. That is,
> some 3000 signatures each began with either:
>
>   1. href="http:// (687265663d22687474703a2f2f)
>   2. HYPERLINK "http:// (48595045524c494e4b2022687474703a2f2f)
>   3. S/URI/URI(http:// (532f5552492f55524928687474703a2f2f)
>
> Looking at a few of the Phish.Phishing signatures, these appear to
> have the same issue (href="http:// prefix). In testing with a scan of
> a PDF document, I was able to reduce the scan time from 31.987 sec
> down to 2.632 sec simply by changing the start of the Phishtank
> signatures for the following:
>
>   1. href="http://
>      * from: 687265663d22687474703a2f2f
>      * to:   687265663d2268747470{3-4}
>   2. HYPERLINK "http://
>      * from: 48595045524c494e4b2022687474703a2f2f
>      * to:   48595045524c494e4b202268747470{3-4}
>   3. S/URI/URI(http://
>      * from: 532f5552492f55524928687474703a2f2f
>      * to:   532f5552492f5552492868747470{3-4}
>
> This should get the same detection with a faster load and scan time,
> and will accommodate https for better coverage. To turn lemonade
> into really good lemonade, we may be able to take the above
> optimization and apply it to the Phish.Phishing signatures identified
> by Maarten to reduce scan times further to levels below those before
> the addition of the Phishtank signatures.
>
> As noted by Maarten as well, the Phish.Phishing sigs are Target type
> 0, whereas we’d split the Phishtank.Phishing signatures up by target
> type to reduce scan times of files where the signatures won’t apply.
> It should also speed things up quite a bit for other file types to
> split those up by Target types.
>
> Further research into scan time optimization is definitely welcome
> and appreciated.
>
> Regards,
> Micah
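
For anyone who wants to try the prefix change quoted above on their own copies of the signatures, here is a rough Python sketch. The three hex prefixes and their replacements are copied from the message; the function names (rewrite_prefix, decode_literal) are made up for illustration, and the sketch assumes it is handed only the hex body of a signature (lower-case hex), not a complete .ldb line. It is not how the official signatures are generated.

```python
# Hex prefixes quoted in the message, mapped to the proposed replacements.
# {3-4} is intended to cover either "://" (3 bytes) or "s://" (4 bytes).
PREFIX_REWRITES = {
    "687265663d22687474703a2f2f": "687265663d2268747470{3-4}",
    "48595045524c494e4b2022687474703a2f2f": "48595045524c494e4b202268747470{3-4}",
    "532f5552492f55524928687474703a2f2f": "532f5552492f5552492868747470{3-4}",
}


def rewrite_prefix(hex_body: str) -> str:
    """Swap a known URL prefix for its wildcard form; otherwise return as-is."""
    for old, new in PREFIX_REWRITES.items():
        if hex_body.startswith(old):
            return new + hex_body[len(old):]
    return hex_body


def decode_literal(hex_prefix: str) -> str:
    """Decode a purely hexadecimal prefix to ASCII, handy for sanity checks."""
    return bytes.fromhex(hex_prefix).decode("ascii")


if __name__ == "__main__":
    for old in PREFIX_REWRITES:
        print(decode_literal(old), "->", rewrite_prefix(old))
```

Only the leading literal is changed; whatever follows the prefix in the signature body is passed through untouched, and bodies that start with anything else are returned unmodified.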