On Wed, 14 Mar 2012, Alex wrote:

I actually created a bunch of those already, and would appreciate if
someone would check my work:

uri         LOC_WP              m{https?://.[^/]+/(wp-content|modules/mod_wdbanners|wp-admin|wp-includes|cruise/wp-content|includes/|web/wp-content|google_recommends|mt-static)/}
describe    LOC_WP              Contains wordpress uri
score       LOC_WP              0.5

meta        LOC_WP_SHORT        (LOC_WP && LOC_SHORT)
describe    LOC_WP_SHORT        Contains wp-content and short body
score       LOC_WP_SHORT        0.6

meta        LOC_WP_SUBJ         (LOC_WP && MISSING_SUBJECT)
describe    LOC_WP_SUBJ         Contains wordpress uri and missing subject
score       LOC_WP_SUBJ         1.2

LOC_SHORT is a meta for rawbody lt 200 and contains a URI.

These appear to hit quite a bit of ham, or perhaps false negatives; I'm
not sure. WordPress URLs are obviously pretty frequent, but I don't think
0.5 is enough to push ham into spam.

That will FP, as almost any legitimate page reference from a WP site will
have "/wp-content/" in it. I was referring to "/wp-content/plugins/" as
being suspicious. But your idea of using it in metas with other
spammy characteristics is good.
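
A rough, untested sketch of that narrower idea, reusing your LOC_SHORT
meta. The names __LOC_WP_PLUGINS and LOC_WP_PLUGINS_SHORT and the score
are placeholders I made up for illustration, not rules from any shipped
ruleset:

# Hypothetical sub-rule: match only the plugins path, not every
# "/wp-content/" URI. The "__" prefix keeps it unscored on its own.
uri         __LOC_WP_PLUGINS      m{/wp-content/plugins/}i

# Score it only in combination with another spammy trait.
meta        LOC_WP_PLUGINS_SHORT  (__LOC_WP_PLUGINS && LOC_SHORT)
describe    LOC_WP_PLUGINS_SHORT  WordPress plugins uri and short body
score       LOC_WP_PLUGINS_SHORT  0.6

Because the sub-rule carries no score by itself, a legitimate link into a
plugins directory costs nothing unless the message is also short.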

One clue: "X-Originating-IP: [41.189.207.189]"
Check the various RBL hits on that address. ;)

Are there existing plugins for this?

Is there a way to check a range to see if it's part of a known
blacklisted botnet?

The "cbl.abuseat.org" RBL explicitly lists infected/botnet machines
(and it does list that IP address). So mail that contains a CBL-listed
IP address anywhere in its headers is suspect.
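
As a minimal sketch, assuming the stock DNSEval plugin is loaded and
network tests are enabled, a CBL lookup could be wired into a meta along
these lines. The set name 'loccbl', the rule names and the score are
hypothetical; only the zone name comes from the above. As far as I know,
check_rbl without a "-lastexternal" suffix on the set name looks at all
untrusted relays rather than just the last one, so please verify that
against your version before relying on it:

ifplugin Mail::SpamAssassin::Plugin::DNSEval
# Hypothetical sub-rule: DNSBL lookup of relay addresses against CBL.
header      __LOC_RCVD_IN_CBL   eval:check_rbl('loccbl', 'cbl.abuseat.org.')
tflags      __LOC_RCVD_IN_CBL   net

# Combine the listing with the WordPress uri rule.
meta        LOC_WP_CBL          (LOC_WP && __LOC_RCVD_IN_CBL)
describe    LOC_WP_CBL          Contains wordpress uri and CBL-listed relay
score       LOC_WP_CBL          1.5
endif

Running "spamassassin --lint" after adding local rules like these will
catch syntax mistakes before they reach production.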


--
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
