On Sun, 2010-05-23 at 17:43 +0300, Török Edwin wrote:
> > > If a file is determined to be clean, its MD5 is added to an in-memory 
> > > cache.
> > > When scanning a new file, its MD5 is computed and looked up in the
> > > cache. If found, it is considered clean.
> > > On DB reload the entire cache is cleared.
> > 
> > But, isn't that typically done multiple times a day?
> > 
> > So what exactly is the use-case for this, other than doing full system
> > scans more frequently than signature updates?
> 
> Even when doing full system scans you still have a cache of the last N
> minutes (where N depends on how often you reload the DB).
> This helps with:
>  - duplicate files, or files both in archived and unarchived state
>  - since we cache at the extracted files level, even if only part of an
> archive/container is redundant, we have that cached
>  - mails containing the same attachment, which was already determined to
> be clean

Ah, now I see. :)  Thanks for explaining, Török.
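
For anyone following along, here is a minimal sketch of the cache as described
above. It is written in Python for brevity and is not ClamAV's actual code: the
names CleanCache, scan and full_scan are hypothetical, and full_scan stands in
for whatever the real signature matching does.

```python
import hashlib


class CleanCache:
    """Hypothetical in-memory cache of MD5 digests of data found clean."""

    def __init__(self):
        self._clean = set()

    def is_known_clean(self, data: bytes) -> bool:
        # Look up the MD5 of the data in the cache.
        return hashlib.md5(data).digest() in self._clean

    def mark_clean(self, data: bytes) -> None:
        # Remember the MD5 of data that was scanned and found clean.
        self._clean.add(hashlib.md5(data).digest())

    def on_db_reload(self) -> None:
        # New signatures may flag data that used to be clean,
        # so the entire cache is cleared.
        self._clean.clear()


def scan(data: bytes, cache: CleanCache, full_scan) -> bool:
    """Return True if data is clean; full_scan is an assumed callback."""
    if cache.is_known_clean(data):
        return True  # cache hit: skip the expensive scan
    clean = full_scan(data)
    if clean:
        cache.mark_clean(data)  # only clean results are cached
    return clean
```

Calling scan() twice on the same clean bytes runs full_scan once; after
on_db_reload() the next call runs it again.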

>  - archive bombs: instead of trying to scan 2^N files until the
> recursion depth/maxfilesize limit is reached, it only needs to scan N
> files (N is recursion depth) for a typical archive bomb that expands to
> 2 more archives at each depth.
>  - ensure that the bytecode won't accidentally need 2^N time to run: if
> it happens to extract a file that matches the logical signature of the
> same bytecode again, which would trigger further extraction and so on
> 
> The latter is the reason why the feature was added, however some initial
> tests have shown improved performance for nearly any kind of scan
> (system, mails, home, etc.)
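
The archive bomb point can be illustrated with a toy model. This is only a
sketch under stated assumptions, in Python and not ClamAV code: Blob, make_bomb
and count_scans are hypothetical names, each archive holds two identical copies
of the next level, and a blob is cached as clean once all its members have been
scanned.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Blob:
    data: bytes                                   # stand-in for the file's bytes
    members: list = field(default_factory=list)   # what extraction would yield


def make_bomb(depth: int) -> Blob:
    """Build a toy bomb: each archive contains two copies of the next level."""
    node = Blob(b"payload")
    for level in range(depth):
        node = Blob(b"archive-%d" % level, [node, node])
    return node


def count_scans(root: Blob, use_cache: bool) -> int:
    """Count how many blobs are actually scanned, with or without the cache."""
    seen = set()
    scans = 0

    def visit(blob: Blob) -> None:
        nonlocal scans
        key = hashlib.md5(blob.data).digest()
        if use_cache and key in seen:
            return              # already determined clean: skip it entirely
        scans += 1
        for member in blob.members:
            visit(member)
        seen.add(key)           # cached only after all members were scanned

    visit(root)
    return scans


if __name__ == "__main__":
    print(count_scans(make_bomb(10), use_cache=False))  # 2047
    print(count_scans(make_bomb(10), use_cache=True))   # 11
```

In this model a depth of 10 means 2^11 - 1 = 2047 scans without the cache and
10 archives plus the payload, so 11 scans, with it. That is the exponential
versus linear behaviour described in the quoted text.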

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

_______________________________________________
Help us build a comprehensive ClamAV guide: visit http://wiki.clamav.net
http://www.clamav.net/support/ml
