>> Author: Tomasz Kojm
>> Date: 2005-07-21 04:17 -700

>> Could you please point me to a paper on bloom filters on which you have 
>> based your implementation?

Sure, I'd be happy to. This is the paper that we based our
implementation on (and it references another paper):

http://www.cs.wisc.edu/~cao/papers/summary-cache/node8.html

There are multiple variations of bloom filters, the one we chose is a
simple bit array of (256 K * 8) bits. The filter is at first
initialized to all zeros. Then, for every virus signature we load, we
take the first 7 bytes, and hash it with four very fast hash
functions. The corresponding four bits in the bloom filter are set to
1s.

Our intuition is that if the filter is small enough to fit in the CPU
cache, we should be able to avoid memory accesses that cost around 200
CPU cycles each. This paper applies Bloom filters to Snort IDS at the
hardware level:

http://citeseer.ist.psu.edu/dharmapurikar03deep.html

In my opinion, bloom filters apply nicer to bigger signature sets
(Snort -at the software level- divides its signatures into sets of
50-100 signatures each based on port number/protocol).

Thanks,

Ozgun.
_______________________________________________
http://lurker.clamav.net/list/clamav-devel.html

Reply via email to