Hello,

On Wed, Aug 10, 2016 at 10:11 AM, G.W. Haywood <cla...@jubileegroup.co.uk> wrote:
> Hello again,
>
> In August 2016, sapientdust+cla...@gmail.com wrote:
>
>> The specifics are not important to my question
>
> That's not what you said earlier. To be specific, you said:
>
>> >> In my case, the consequence factor is very large

Those two statements are perfectly consistent. The consequences are
significant enough that I have to scan all files, but why the
consequences are large, or what the specific consequences are, don't
matter for my technical question.

>> >> Does anybody have any feedback on the proposed solution to scanning
>> >> large files in chunks?
>> >
>> > Stop worrying about it, it's a waste of time and effort. The
>> > probability that you will actually find what you're looking for
>> > is very small.
>>
>> What are the technical reasons that the probability is very small
>> (compared to the probability of finding a virus if the file is small
>> enough to be scanned in one instream call)?
>
> I didn't say anything about comparisons. You asked for feedback, I
> gave you some, and I said you wouldn't like it. You're not going to
> like it any better if you modify the question, because my feedback is
> going to be the same. I've been using ClamAV for more than a decade
> so I have a reasonable idea what it can achieve and what it can't.

I didn't say that you mentioned comparisons. I was making clear that
I'm not asking for 100% reliability and I'm not asking whether the
multi-scan idea is perfect in some general sense, but only whether it's
significantly worse than scanning a smaller file that doesn't need to
be broken into multiple pieces.

I'm interested in knowing if there are technical reasons why the
following two scenarios would work very differently:

scenario 1: I scan a 2.5 GB file in one instream call

scenario 2: I scan a 4.5 GB file in multiple instream calls, by
scanning the first 3 GB in one call, and then making a second instream
call that provides the first N MB followed by the last 2 GB of the
file.

Would ClamAV be expected to work similarly in the two cases in terms of
identifying a virus, assuming the virus is the same in the two
scenarios and it's in ClamAV's database?
Or are there technical reasons why ClamAV wouldn't detect the virus in
the second scenario but would in the first, even though the virus bytes
are identical?

This is a question for ClamAV developers or those who understand the
codebase sufficiently to know the impact of scanning a partial file.
Should I have asked this question on the developer list? I asked here
because it looked like the developer list gets very little use, and I
thought developers would probably be on this list too.
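
For concreteness, the chunking in scenario 2 could be planned roughly as in the Python sketch below. It only computes which byte ranges go into each scan call; it does not talk to clamd and says nothing about whether detection would succeed. The function name `plan_scans` and the parameters `max_stream`, `head_len`, and `overlap` are hypothetical names made up for illustration. `max_stream` stands for whatever per-call size limit applies, `head_len` for the "first N MB" that is resent, and `overlap` models the fact that the numbers in scenario 2 (first 3 GB, then the last 2 GB of a 4.5 GB file) cover the middle 0.5 GB twice.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (offset, length), both in bytes


def plan_scans(file_size: int, max_stream: int,
               head_len: int = 0, overlap: int = 0) -> List[List[Segment]]:
    """Return one list of (offset, length) segments per scan call.

    All names here are illustrative assumptions, not part of ClamAV.
    """
    if file_size <= max_stream:
        # Scenario 1: the whole file fits in a single call.
        return [[(0, file_size)]]

    body_len = max_stream - head_len  # room left after the resent head
    if head_len < 0 or overlap < 0 or body_len <= overlap:
        raise ValueError("head_len and overlap must leave room to make progress")

    # First call: the first max_stream bytes of the file.
    calls: List[List[Segment]] = [[(0, max_stream)]]
    covered = max_stream  # bytes [0, covered) have been sent at least once

    # Later calls: the file head again, then the next window of the file,
    # starting `overlap` bytes before the end of the previous window.
    while covered < file_size:
        start = covered - overlap
        length = min(body_len, file_size - start)
        segments: List[Segment] = [(0, head_len)] if head_len else []
        segments.append((start, length))
        calls.append(segments)
        covered = start + length

    return calls
```

With sizes scaled down so that one unit stands for 0.1 GB, `plan_scans(45, 30, head_len=2, overlap=5)` yields two calls: the first 30 units, then the first 2 units followed by the last 20 units, which mirrors the shape of scenario 2.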