Re: [clamav-users] Scanning very large files in chunks

G.W. Haywood Thu, 11 Aug 2016 10:15:47 -0700

Hello once again,

On Thu, 11 Aug 2016, sapientdust+cla...@gmail.com wrote:

> I scan a 4.5 GB file in multiple instream calls, by scanning the first
> 3 GB in one call, and then making a second instream call that provides
> the first N MB followed by the last 2 GB of the file.
>
> Would clamav be expected to work similarly in the two cases in terms
> of identifying a virus, assuming the virus is the same in the two
> scenarios and it's in ClamAV's database? Or are there technical
> reasons why ClamAV wouldn't detect the virus in the second scenario
> but would in the first, even though the virus bytes are identical?


There's a possibility of failing to find it in the second scenario.
It's anybody's guess what the probability will be; my guess would be
that the probability of that failure is small compared with the
relatively large probability of not finding it at all in either case.

> This is a question for clamav developers or those who understand the
> codebase sufficiently to know the impact of scanning a partial file.


I don't think so.  Just think about it a bit:

Much of ClamAV's operation is looking for pattern matches.
Suppose you scan a 4.5GB file in two chunks.
Suppose half this mysterious 'huge file virus' is in the first chunk.
Presumably the other half is in the second chunk.
What happens if the pattern is designed to match the entire virus?
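
To make the point above concrete, here is a minimal sketch in Python.
It is a toy and not how ClamAV's engine works: the signature, the
function names and the chunk size are all made up for illustration,
and the chunks are assumed not to overlap.  It only shows that a byte
pattern lying across a chunk boundary is found when the whole file is
scanned but missed when each chunk is scanned on its own.

```python
# Hypothetical illustration only: a toy byte-pattern scanner, not ClamAV.
SIGNATURE = b"EVIL-PAYLOAD"  # made-up 12-byte signature


def scan(data: bytes) -> bool:
    """Return True if the made-up signature occurs anywhere in data."""
    return SIGNATURE in data


def scan_in_chunks(data: bytes, chunk_size: int) -> bool:
    """Scan each chunk independently, with no overlap between chunks."""
    return any(
        scan(data[start:start + chunk_size])
        for start in range(0, len(data), chunk_size)
    )


# A 200-byte 'file' whose signature occupies bytes 94..105, so it
# straddles the boundary between two 100-byte chunks.
payload = b"A" * 94 + SIGNATURE + b"B" * 94

whole_file_hit = scan(payload)              # True: pattern seen in full
chunked_hit = scan_in_chunks(payload, 100)  # False: each chunk holds only half
```

With these values the whole-file scan reports a match and the chunked
scan does not, although the bytes of the 'virus' are identical in both
cases.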

> Should I have asked this question on the developer list?


No.  You're a user; the developers' list is for working on ClamAV.

--

73,
Ged.
_______________________________________________
Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml
