Does clamd use multi-threading for the various "engines" within a
single scan, or only to handle multiple requests from different sources?


On Tue, 9 Apr 2019 21:29:43 +0000
"Micah Snyder (micasnyd) via clamav-users"
<clamav-users@lists.clamav.net> wrote:

> Maarten,
> 
> Your test results are pretty great.  I really like your breakdown of
> the signatures by category.  I will caution that scan times will vary
> quite heavily depending on what you’re scanning, based on Target type
> (https://www.clamav.net/documents/clamav-file-types).
> 
> In addition, it’s important to distinguish between load and scan
> times.  The time reported by clamscan covers both load and scan.  If
> you just want scan time, you will want to load the database with
> clamd and then test the scan time with clamdscan.
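
A rough way to take the two measurements separately, sketched in Python. This assumes clamscan and clamdscan are on PATH and that clamd is already running with the database loaded; the helper names time_command and compare are made up for illustration and are not part of ClamAV.

```python
import subprocess
import time


def time_command(argv):
    """Run argv and return the wall-clock seconds until it exits."""
    start = time.perf_counter()
    # No check=True: clamscan exits with 1 when it finds something,
    # which is not a failure for timing purposes.
    subprocess.run(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def compare(path):
    """Return (clamscan seconds, clamdscan seconds) for one file.

    clamscan loads the database on every run, so its figure is
    load + scan; clamdscan hands the file to the running clamd,
    so its figure is close to scan time alone.
    """
    load_plus_scan = time_command(["clamscan", path])
    scan_only = time_command(["clamdscan", path])
    return load_plus_scan, scan_only
```

Calling compare("sample.pdf") (the file name is a placeholder) returns both figures; the difference between them is a rough estimate of the load time.
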
> 
> Regarding load time vs. scan time, all of the signatures must be
> loaded, but depending on the target type of the file being scanned,
> not all of the signatures will be matched against the file.  That is,
> daily_Win.ldb might take the longest to load due to the number of
> signatures or complexity of the signatures but when scanning a PDF,
> they probably won’t impact scan time, as Win signatures are probably
> mostly target type 1 (PE file).
> 
> I’ve spent a bit of time today investigating what I believe is
> responsible for slow load and scan times for the Phishtank sigs.  I
> had a hunch, based on a conversation we saw a while back on the
> mailing list, that the identical beginning of the URL-based
> signatures results in an unbalanced and inefficient tree for
> matching. That is, some 3000 signatures each began with one of:
> 
> 
>   1.  href="http:// (687265663d22687474703a2f2f)
>   2.  HYPERLINK "http:// (48595045524c494e4b2022687474703a2f2f)
>   3.  S/URI/URI(http:// (532f5552492f55524928687474703a2f2f)
> 
> Looking at a few of the Phish.Phishing signatures, these appear to
> have the same issue (href="http:// prefix).  In testing with a scan
> of a PDF document, I was able to reduce the scan time from 31.987 sec
> down to 2.632 sec simply by changing the start of the Phishtank
> signatures as follows:
> 
> 
>   1.  href="http://
>      *   from: 687265663d22687474703a2f2f
>      *   to: 687265663d2268747470{3-4}
>   2.  HYPERLINK "http://
>      *   from: 48595045524c494e4b2022687474703a2f2f
>      *   to: 48595045524c494e4b202268747470{3-4}
>   3.  S/URI/URI(http://
>      *   from: 532f5552492f55524928687474703a2f2f
>      *   to: 532f5552492f5552492868747470{3-4}
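
The rewrite listed above can be sketched in Python as a check on the hex. This is only an illustration: PREFIXES copies the three hex strings from the message, while decode_prefix and relax_signature are hypothetical helpers, not ClamAV tooling. It assumes the fix is purely textual, swapping the trailing "://" (3a2f2f) of a leading prefix for the {3-4} wildcard so that both "://" and "s://" fit.

```python
# Original hex prefixes quoted in the message, keyed by their ASCII form.
PREFIXES = {
    'href="http://': "687265663d22687474703a2f2f",
    'HYPERLINK "http://': "48595045524c494e4b2022687474703a2f2f",
    "S/URI/URI(http://": "532f5552492f55524928687474703a2f2f",
}

SCHEME_TAIL = "3a2f2f"  # hex for "://"
WILDCARD = "{3-4}"      # 3 to 4 arbitrary bytes: "://" or "s://"


def decode_prefix(hex_prefix):
    """Turn a hex signature fragment back into readable ASCII."""
    return bytes.fromhex(hex_prefix).decode("ascii")


def relax_signature(hex_sig):
    """Replace a leading '...http://' prefix with '...http{3-4}'.

    Signatures that do not start with one of the known prefixes
    are returned unchanged.
    """
    for prefix in PREFIXES.values():
        if hex_sig.startswith(prefix):
            stem = prefix[: -len(SCHEME_TAIL)]
            return stem + WILDCARD + hex_sig[len(prefix):]
    return hex_sig
```

Decoding each entry with decode_prefix confirms the labels in the list, and relax_signature applied to the three "from" strings yields the three "to" strings.
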
> 
> This should get the same detection with a faster load and scan time,
> and will also match https for better coverage.  To turn lemonade
> into really good lemonade, we may be able to take the above
> optimization and apply it to the Phish.Phishing signatures identified
> by Maarten to reduce scan times further to levels below those before
> the addition of the Phishtank signatures.
> 
> As noted by Maarten as well, the Phish.Phishing sigs are Target type
> 0, whereas we’d split the Phishtank.Phishing signatures up by target
> type to reduce scan times of files where the signatures won’t apply.
> Splitting the Phish.Phishing sigs up by Target type as well should
> speed things up quite a bit for other file types.
> 
> Further research into scan time optimization is definitely welcome
> and appreciated.
> 
> Regards,
> Micah

_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml
