On 12/25/2011 07:48 AM, John-Charles D. Sokolow wrote:
> I am experimenting with a python script which uses 
> http://xael.org/norman/python/pyclamd/ to scan blocks of data.
> Here is my scenario, I read one block, ( 4096 bytes in my case ) from a 
> socket. I call pyclamd.scan_stream( block ), which I assume is in turn 
> calling either INSTREAM, or STREAM, ( I don't know since
> the docs for pyclamd don't specify which actual clamd call occurs when 
> calling scan_stream. ) I then check the return code from clamd. If it returns 
> None (NULL) I know that the block is safe and I pass
> it along, otherwise I throw an exception and close the connection. My 
> question is this since I'm breaking the stream up into blocks and scanning 
> each block separately am I running the risk of a virus
> sneaking by the edge of the blocks and not matching a pattern. For example 
> take the block 'Hello Vir' and the block 'us World' assume that the sub 
> string 'Virus' is the actual virus, since neither
> 'Vir' ( the last 3 bytes of the first block ) nor 'us'( the first two bytes 
> of the second block ) are 'Virus' it would seem that clamd would miss "Virus" 
> and not return a match, letting the virus
> essentially sneak through the sides as it were. Is this true? If so, is there 
> a work around? Or do I need to save the complete stream to disk then call 
> clamd.scan_file("/tmp/tfile.bin") before
> re-transmitting the file?
Clamd needs the entire file; without it you won't get the results you are 
expecting.
Scanning 4k blocks one at a time is not a good idea: a pattern that straddles 
two blocks, as in your example, will not be matched.
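
To make the boundary problem concrete, here is a toy sketch in Python. It
assumes plain substring matching against the made-up pattern 'Virus' from
your example; real clamd signatures are more involved than this, and the
function names are hypothetical, but the effect at block boundaries is the
same.

```python
# Toy illustration only: real clamd matching is far more complex than
# a substring search. SIGNATURE is the made-up pattern from the question.
SIGNATURE = b"Virus"


def blocks_match(blocks, signature=SIGNATURE):
    """Check each block on its own, as per-block scanning does."""
    return any(signature in block for block in blocks)


def stream_matches(blocks, signature=SIGNATURE):
    """Check the reassembled stream, as scanning the whole file does."""
    return signature in b"".join(blocks)


blocks = [b"Hello Vir", b"us World"]
print(blocks_match(blocks))    # False: the pattern is split across blocks
print(stream_matches(blocks))  # True: the whole stream contains it
```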

This appears to be a limitation of the Python wrapper you are using; the clamd 
protocol itself doesn't require you to send all your data at once.
You can send the STREAM/INSTREAM command, and then stream your data as you 
receive it.

You don't necessarily have to save the file to disk prior to scanning, though; 
you can just stream
all your blocks using INSTREAM (which will create the tempfile on clamd's end).
The format for INSTREAM on the socket is:
 1. send the INSTREAM command: zINSTREAM\0, or nINSTREAM\n
 2. send <length> (big endian, 4 bytes)
 3. send the chunk of data corresponding to the above length
 4. repeat at 2 as long as you have more blocks to send
 5. send a 0-length block to mark end of stream
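
The steps above can be sketched in Python roughly as follows. This is a
minimal sketch, not what pyclamd does internally: the function names are
hypothetical, and the host and port (127.0.0.1:3310) assume clamd is
listening on a local TCP socket, so adjust them to your clamd.conf. It also
assumes the reply to a z-prefixed command is null-terminated.

```python
import socket
import struct


def instream_frames(blocks):
    """Yield the byte strings that make up one INSTREAM session."""
    yield b"zINSTREAM\0"  # step 1: null-terminated form of the command
    for block in blocks:
        if block:  # skip empty blocks: a 0-length chunk would end the stream
            # steps 2-3: 4-byte big-endian length, then the data itself
            yield struct.pack("!I", len(block)) + block
    yield struct.pack("!I", 0)  # step 5: 0-length block marks end of stream


def scan_blocks(blocks, host="127.0.0.1", port=3310):
    """Stream blocks to clamd and return its raw reply as text.

    host and port are assumptions; use whatever your clamd listens on.
    """
    with socket.create_connection((host, port)) as sock:
        for frame in instream_frames(blocks):
            sock.sendall(frame)
        reply = b""
        while not reply.endswith(b"\0"):
            data = sock.recv(4096)
            if not data:  # clamd closed the connection
                break
            reply += data
    return reply.rstrip(b"\0").decode("utf-8", "replace")
```

Since blocks can be any iterable, you can pass a generator that yields each
4096-byte block as it arrives from your socket, so nothing has to be
buffered or written to disk on your side. Note that you only get the verdict
once the whole stream has been sent.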

And STREAM is similar to FTP: you get a port back, to which you can send the 
entire data.

Best regards,
--Edwin
