Paul Kosinski wrote:
> Hi,
>
> I noticed the following anomaly when scanning a tar.gz file compared
> to scanning the result of untarring it. Scanning the tar.gz file
> results in less "data read" than scanning the files which it expands
> to (as one would expect), but the "data scanned" amount is *much* more
> for the tar.gz file than for the resultant files in the directory
> tree.
>
> Does this indicate some problem with the way clamav handles
> compressed files, or is it some peculiarity of this tar.gz file?
>    
I don't think this is a problem with clamav's handling of compressed 
files. I think it is a feature.

The following is just a bunch of assumptions, though:

When clamav scans a tar.gz, it first scans the raw tar.gz data and
tries to match that against virus patterns. Then it scans the
gunzipped tar and tries to match some hashes of that data against
virus defs. Finally, it scans the individual files in the tar,
expanding and scanning any nested archives it finds there. The result
is that more data is scanned than when you scan the extracted files
directly.

This is a feature for two reasons:

1) A signature that matches the part of a tar archive that represents
   a file can catch a virus before clamav has to expand the viral
   file and scan it. This improves clam's efficiency as, IIRC, clam
   stops scanning once it encounters a virus match.

2) The gzip or tar stream may be specially crafted to exploit bugs in
   particular versions of gzip, GNU tar, or proprietary
   implementations of those programs. Clamav should detect this too,
   not just viruses stored in tars or in gzip-compressed files.
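
To make the arithmetic concrete, here is a toy model in Python of the
layering I am assuming. It is not clamav's code and says nothing about
how clamav really counts "data scanned"; the function names
(build_tar_gz, layered_scan_size) and the sample files are made up for
illustration. It only shows that if every layer of a tar.gz is
examined once, the bytes examined exceed both the compressed size and
the size of the extracted files.

```python
import gzip
import io
import tarfile


def build_tar_gz(files):
    """Return the bytes of a .tar.gz holding the given {name: data} files."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def layered_scan_size(archive):
    """Toy model (an assumption): bytes examined if each layer is scanned once."""
    total = len(archive)                  # layer 1: the raw .tar.gz bytes
    tar_bytes = gzip.decompress(archive)
    total += len(tar_bytes)               # layer 2: the decompressed tar stream
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        for member in tar:
            if member.isfile():
                total += member.size      # layer 3: each file inside the tar
    return total


# Hypothetical sample data: two highly compressible files.
files = {"a.txt": b"A" * 10000, "b.txt": b"B" * 20000}
archive = build_tar_gz(files)

data_read = len(archive)                      # what is read from disk
scanned_archive = layered_scan_size(archive)  # all layers of the tar.gz
scanned_tree = sum(len(d) for d in files.values())  # extracted files only

print("data read (tar.gz):      ", data_read)
print("data scanned (tar.gz):   ", scanned_archive)
print("data scanned (directory):", scanned_tree)
```

With compressible input, the first number comes out much smaller than
the third and the second comes out larger than the third, which is the
same shape as the anomaly reported above.
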
> 09:51:08 u...@host:~/src/openssl>  clamscan -ri openssl-0.9.8k/
>    
Is your username really ``user'' and hostname really ``host''?

-- 
binki

_______________________________________________
Help us build a comprehensive ClamAV guide: visit http://wiki.clamav.net
http://www.clamav.net/support/ml