At 11:05 PM +0200 6/30/09, Tomasz Kojm wrote:
On Tue, 30 Jun 2009 11:26:25 -0700
"Bill Landry" <b...@inetmsg.com> wrote:
So if I were to include a signature in my 3rd party database, and then a
few days later ClamAV adds the same signature to the official signature
database, that is not your problem, but rather my problem? Seems like if
you (ClamAV) are providing the means for including 3rd party databases,
then wouldn't you agree that it really is ClamAV's responsibility to make
sure that duplicate signatures do not get loaded and used?
Hi Bill,
taking care of duplicates in the engine doesn't make sense (see below).
Without the centralized system for signature maintenance that we offered to
3rd parties, it's not possible to avoid duplicates. Having said that, even if
there were a few thousand duplicated sigs, it shouldn't cause any significant
slowdown of the engine.
> We had an idea to allow 3rd party signature
> creators to use our mechanisms for signature maintenance ([1], easy
> checking for FPs, dups, name collisions) and also our network
> infrastructure and freshclam to make everything more smooth but
> unfortunately this idea didn't get much interest.
Hmmm, first I've heard of this. Why was there a lack of interest?
Well, I don't know why. AFAIK, only Securiteinfo was interested in using
that solution. And in my opinion it would only have advantages: all the
mechanisms we have developed over the last 7 years, including the mirror
infrastructure, could be used to maintain and distribute the 3rd party
sigs, making all processes much more efficient!
> It would be inefficient (and could be even unsafe in some cases) to do
> such things in the engine.
Why is that? If ClamAV sorts all signatures when reloading, and ignores
duplicate signatures, why would that be dangerous in the engine?
Because detecting duplicated signatures is not that easy and must be
done with great care so that we don't incorrectly skip some unique sigs!
E.g. the following logical sigs are all duplicates:
Sig1;Target:0;0&1&(2|3);dead;beef;feed;face
Sig2;Target:0;0&((1&2)|(1&3));dead;beef;feed;face
Sig3;Target:0;0&1&(2|3);dead;beef;face;feed
Sig4;Target:0;(0|1)&2&3;feed;face;dead;beef
but this one is not (and still is very similar):
Sig5;Target:0;(0|1)&2&3;feed;dead;face;beef
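To make the point concrete, here is a minimal sketch of how such an equivalence check could look outside the engine. It is not how ClamAV works internally. It assumes the simplified `Name;Target;Expression;Subsig...` layout used in the examples above, supports only `&`, `|` and parentheses (no count modifiers), and assumes `&` binds tighter than `|` where parentheses are missing. The function name `canonical_key` is made up for this illustration.

```python
import re
from itertools import product


def canonical_key(line):
    """Return a key that is equal for logically equivalent signatures.

    Hypothetical helper: brute-forces the truth table of the expression
    over the distinct subsignature patterns.
    """
    name, target, expr, *subsigs = line.strip().split(";")
    if not re.fullmatch(r"[0-9&|()]+", expr):
        raise ValueError(f"unsupported expression in {name}: {expr!r}")
    subsigs = [s.lower() for s in subsigs]
    patterns = sorted(set(subsigs))
    # Rewrite "0&(1|2)" as "v[0]&(v[1]|v[2])". The check above guarantees
    # that only digits, &, | and parentheses ever reach eval().
    py_expr = re.sub(r"\d+", lambda m: f"v[{m.group()}]", expr)
    satisfying = set()
    for bits in product((False, True), repeat=len(patterns)):
        present = dict(zip(patterns, bits))   # pattern -> found in file?
        v = [present[s] for s in subsigs]     # subsig index -> found?
        if eval(py_expr, {"__builtins__": {}}, {"v": v}):
            satisfying.add(bits)
    return target, tuple(patterns), frozenset(satisfying)


if __name__ == "__main__":
    sigs = [
        "Sig1;Target:0;0&1&(2|3);dead;beef;feed;face",
        "Sig2;Target:0;0&((1&2)|(1&3));dead;beef;feed;face",
        "Sig3;Target:0;0&1&(2|3);dead;beef;face;feed",
        "Sig4;Target:0;(0|1)&2&3;feed;face;dead;beef",
        "Sig5;Target:0;(0|1)&2&3;feed;dead;face;beef",
    ]
    keys = [canonical_key(s) for s in sigs]
    for sig, key in zip(sigs, keys):
        print(sig.split(";")[0], "duplicate of Sig1:", key == keys[0])
```

With the five example signatures, Sig1 to Sig4 share one key and Sig5 gets a different one. Note that the truth table grows as 2^n in the number of distinct subsignatures, so this brute-force comparison is only practical for small signatures, which illustrates why such a check is a poor fit for the engine's load path.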
Even for some very simple hex signatures there may be cases where
it's not easy to detect dups, e.g. dead{3}beef is in practice a duplicate
of dead??????beef, but since the engine handles these signatures
differently, things get complicated again. So in the engine we could
only implement some very limited checks, and then sooner or later
someone would open a bug report that this "feature" doesn't work
nicely for some sigs... (take the issue with local.ign for example)
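A limited check of the kind described above might look like the following sketch. It only covers the case from the example: a fixed-length skip `{n}` is expanded to `n` single-byte wildcards `??` so that both spellings compare equal as strings. Everything else (`*`, ranges such as `{n-m}`, alternations) is rejected as unsupported, which is exactly the sort of gap that would generate bug reports. The function name `normalize_hex_sig` is hypothetical, and the sketch says nothing about how the engine matches the two forms internally.

```python
import re

# Either a fixed skip "{n}" or one byte written as two hex digits / "?".
_TOKEN = re.compile(r"\{(\d+)\}|([0-9a-fA-F?]{2})")


def normalize_hex_sig(sig):
    """Expand fixed skips so that dead{3}beef becomes dead??????beef."""
    out = []
    pos = 0
    while pos < len(sig):
        m = _TOKEN.match(sig, pos)
        if m is None:
            raise ValueError(
                f"unsupported construct at offset {pos} in {sig!r}")
        if m.group(1) is not None:
            # "{n}" skips n bytes, i.e. n two-character wildcards.
            out.extend(["??"] * int(m.group(1)))
        else:
            out.append(m.group(2).lower())
        pos = m.end()
    return "".join(out)


if __name__ == "__main__":
    a = normalize_hex_sig("dead{3}beef")
    b = normalize_hex_sig("dead??????beef")
    print(a, b, a == b)
```

Two signatures would then be flagged as candidate duplicates when their normalized strings are equal; anything that raises `ValueError` would still need manual review.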
The centralized system for signature development eliminates the
problem because one can easily see that a sample is already detected
(such samples automatically get "closed"). It could also provide some
detection of duplicates, which could later be handled manually. It's
working really well for us, which is why we made that offer to 3rd party
signature developers. Hopefully, we will close bug #781 some day...
Tomasz,
I like having a central DB. In fact I think the central DB should be
queryable (e.g. submit signatures and get feedback if they are already
superseded by other detections).
Along similar lines, I suggested to Luca a while ago that it would be good
if you maintained a DB of MD5 signatures of files that you have
processed. I have submitted over 1600 unique malware files since 23
Mar and I am pretty sure that 99% are real malware because they show
up in my honeypot. Unfortunately, I have 1054 outstanding that I
have in my winnow_malware.hdb sig file that still do not have an
"official" signature.
As far as an MD5 DB goes, I would like it to include the following statuses:
in queue, verified benign, and in work. This would allow me to know
that you have a sample and to know when something is benign. I know you must
have something like this internally, if for no other reason than to cull dups
and for checkout or signature creation, so adding some exposure of the
DB shouldn't be an issue.
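As a rough illustration of the request above, the lookup could be as small as the following sketch. It is purely hypothetical: no such ClamAV service or API is implied, the class and function names are invented here, and the only statuses modelled are the three listed above, with `None` standing for a sample the DB has never seen.

```python
import hashlib
from enum import Enum
from typing import Dict, Optional


class SampleStatus(Enum):
    IN_QUEUE = "in queue"
    VERIFIED_BENIGN = "verified benign"
    IN_WORK = "in work"


class SampleStatusDB:
    """Hypothetical in-memory stand-in for a queryable MD5 status DB."""

    def __init__(self) -> None:
        self._status: Dict[str, SampleStatus] = {}

    def record(self, md5_hex: str, status: SampleStatus) -> None:
        # Store digests lowercased so lookups are case-insensitive.
        self._status[md5_hex.lower()] = status

    def lookup(self, md5_hex: str) -> Optional[SampleStatus]:
        # None means the sample was never submitted (or is not tracked).
        return self._status.get(md5_hex.lower())


def md5_of(data: bytes) -> str:
    """Hex MD5 digest of a sample's contents."""
    return hashlib.md5(data).hexdigest()


if __name__ == "__main__":
    db = SampleStatusDB()
    digest = md5_of(b"contents of a submitted sample")
    db.record(digest, SampleStatus.IN_QUEUE)
    print(db.lookup(digest))
    print(db.lookup(md5_of(b"never submitted")))
```

A submitter could run such a lookup over the hashes in a local .hdb file to see which entries are still waiting and which have been judged benign.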
Unfortunately nothing has come from this....
Tom