>> Author: Tomasz Kojm >> Date: 2005-07-21 04:17 -700 >> Could you please point me to a paper on bloom filters on which you have >> based your implementation?
Sure, I'd be happy to. This is the paper that we based our implementation on (and it references another paper): http://www.cs.wisc.edu/~cao/papers/summary-cache/node8.html There are multiple variations of bloom filters, the one we chose is a simple bit array of (256 K * 8) bits. The filter is at first initialized to all zeros. Then, for every virus signature we load, we take the first 7 bytes, and hash it with four very fast hash functions. The corresponding four bits in the bloom filter are set to 1s. Our intuition is that if the filter is small enough to fit in the CPU cache, we should be able to avoid memory accesses that cost around 200 CPU cycles each. This paper applies Bloom filters to Snort IDS at the hardware level: http://citeseer.ist.psu.edu/dharmapurikar03deep.html In my opinion, bloom filters apply nicer to bigger signature sets (Snort -at the software level- divides its signatures into sets of 50-100 signatures each based on port number/protocol). Thanks, Ozgun. _______________________________________________ http://lurker.clamav.net/list/clamav-devel.html
