On 04/27/2010 08:57 PM, Kris Deugau wrote:
> Török Edwin wrote:
>> On 04/22/2010 05:26 PM, Kris Deugau wrote:
>>> I've had reports of several FPs due to PhishingScanURLs recently - is
>>> there any way it can be made less aggressive rather than just turning it
>>> off outright?
>>
>> You could remove domains from daily.pdb
>
> I don't seem to have this as a separate file fresh from freshclam -
> currently we have no infrastructure in place to meddle with Clamav
> signature files before putting them into production.
>
>> whitelist all mails that contain
>> certain domains.
>
> Problem is, there's no way to tell in advance what kind of perfectly
> legitimate but strange email customers will send or receive. :(

I'm open to suggestions.

>
>>> The messages triggering it so far have been both outgoing and incoming
>>> mail from our customers: forwarded copies of legitimate Amazon.ca mail
>>> and eBay replies on the outgoing side; a newsletter linking to a bank
>>> website for a contest of some kind on the incoming side.
>>
>> The problem is that amazon/ebay is a very likely target for phishing, so
>> if you remove these domains entirely you will miss some phishing.
>
> *nod* And SpamAssassin won't catch everything either - but similar
> "link doesn't match text" rules in SpamAssassin have found to have a
> high enough FP rate that they're not worth many points in a
> sum-of-many-rules system like SA. (I'm still not sure why some of the
> forwarded eBay messages hit; the customer never forwarded a copy where
> I could examine it specifically. Curiously enough, changing the
> *subject* line seems to have let a few of these get through...

Are you sure it was a Heuristics.Phishing.*, or Phishing.Heuristics.*
detection? It doesn't look at the subject line at all.

>
>> Here is an example daily.wdb entry (it is a regular expression for the 2
>> sides of the link):
>> X:.+\.amazon\.(at|ca|co\.uk|co\.jp|com|de|fr)([/?].*)?:(.+\.)?amazon\.com([/?].*)?:17-
>>
>
> What does the "17-" at the end indicate?

It indicates that the signature should only be loaded on ClamAV with
functionality level >=17 (0.91+). Older versions crashed when loading that sig.
This is probably redundant now that we don't support those versions anymore.

> I added another whitelist
> entry for the bank contest FP that seems to have worked:
>
> X:links\.scene\.ca:scotiabank.com:16-
>
> but I ended up just sticking numbers in that last entry until it worked.
>
>> And here is another one:
>> M:chase.com:jpmchase.com
>>
>> You can see all the current .wdb entries by downloading daily.cvd, and
>> running sigtool --unpack-current daily.
>>
>> The format of .wdb is documented in docs/phishsigs_howto.pdf
>>
>> You can start by adding just the domain names to the .wdb, i.e.:
>> M:amazon.ca:OTHERDOMAIN
>>
>> where OTHERDOMAIN is the displayed domain name (the part between
>> <a></a>), assuming amazon.ca is the domain in the href.
>
> Thanks! That last format is probably easiest for now.
>
> Unfortunately I'm limited to reacting to FPs; I can't predict all the
> newsletters that customers receive, and I can't predict which ones will
> suddenly trigger this test due to a one-off special contest.

You mentioned that you can't get the samples due to privacy concerns.
What if I wrote a script (in Python or Perl, or something) that takes an
email and outputs the .wdb rules?
It would chop off the query and path part of the URLs, and the output would
be human readable, so the customer can see exactly what they're sending to you.

Now I can't expect your users to have ClamAV installed, right?
Maybe it's possible to write something in Python/Perl only for the wdb
generation, but before doing that: would your users have Python/Perl
installed in the first place?

>
> I'm also hoping I can come up with a patch for our mail delivery handler
> (custom Perl script) to treat Phishing.Heuristics hits a little
> differently than any others (ie, pass it on to SA as well, using the
> Clam hit as a SA rule in some manner). I've done this before (on a much
> smaller system) so it should be possible.

Keep in mind that with 0.96 it is called Heuristics.Phishing.*.
This was done to have uniform naming for all engine detections (they
all begin with Heuristics.* now).

Best regards,
--Edwin
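
To make the script idea above more concrete, here is a minimal sketch in Python 3 using only the standard library. It is not part of ClamAV and has not been checked against ClamAV's own link extraction; the names AnchorCollector, host_of and wdb_rules are hypothetical. It assumes the M:hrefdomain:displayeddomain layout described above, looks only at text/html parts, and emits a rule only when the link text itself looks like a host name that differs from the href host. Path and query are dropped on both sides, so the output contains host names only. The emitted hosts may still need trimming by hand (for example dropping a leading "www.").

```python
import re
from email import message_from_string
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Something that looks like a host name at the start of the link text,
# optionally preceded by a scheme such as "http://".
HOST_RE = re.compile(
    r"^(?:[a-z][a-z0-9+.-]*://)?([a-z0-9-]+(?:\.[a-z0-9-]+)+)", re.I
)


class AnchorCollector(HTMLParser):
    """Collect (href, displayed text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.pairs.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(text):
    """Return the lowercased host in text (path and query dropped), or None."""
    m = HOST_RE.match(text.strip())
    return m.group(1).lower() if m else None


def wdb_rules(raw_email):
    """Return sorted M: lines for links whose href and displayed hosts differ."""
    rules = set()
    msg = message_from_string(raw_email)
    for part in msg.walk():
        if part.get_content_type() != "text/html":
            continue
        payload = part.get_payload(decode=True)
        html = payload.decode(part.get_content_charset() or "utf-8", "replace")
        parser = AnchorCollector()
        parser.feed(html)
        for href, shown in parser.pairs:
            real = urlsplit(href).hostname  # host only, no path or query
            displayed = host_of(shown)
            if real and displayed and real != displayed:
                rules.add(f"M:{real}:{displayed}")
    return sorted(rules)
```

A customer could run wdb_rules() on the raw text of a message that was flagged and send back only the printed M: lines, which show nothing beyond the two host names of each mismatching link.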