Steve Freegard wrote:
Hi All,
On 17/09/10 14:11, Steve Freegard wrote:
Hi All,
Recently I've been getting a bit of filter-bleed from a bunch of spams
injected via Hotmail/Yahoo that contain shortened URLs e.g. bit.ly/foo
that upon closer inspection would have been rejected with a high score
if the real URL had been used.
It annoyed me enough that I wrote a plug-in that decodes the shortened
URL: it sends an HTTP HEAD request, extracts the Location header returned
by the shortening service and adds that URL to the list of extracted URIs
for other plug-ins to find (such as URIDNSBL).
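
A minimal Python sketch of the lookup described above. It is not the plugin's own Perl code, and the function name head_location and its timeout parameter are made up for illustration. It relies on the fact that Python's http.client does not follow redirects by itself, so the Location header of the 3xx response stays visible to the caller.

```python
import http.client
from urllib.parse import urlsplit


def head_location(url, timeout=5.0):
    """Send a single HEAD request and return (status, Location or None).

    Hypothetical helper for illustration only. http.client never follows
    redirects on its own, so a 301/302 and its Location header are
    returned to the caller unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        # Rebuild the request target from path and query string.
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

Calling head_location("http://bit.ly/foo") would return the status code and, for a redirect, the target URL that can then be handed to the other URI checks.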
On the messages I tested it with, it raised the scores from <5 to >10
thanks to URIDNSBL hits, which is just what I wanted.
Hopefully it will be useful to others; you can grab it from:
http://www.fsl.com/support/DecodeShortURLs.pm
http://www.fsl.com/support/DecodeShortURLs.cf
I've just put up a new version at the above URLs (v0.3) which adds the
following new features:
- Now follows 'chained' short URLs (e.g. shortURL -> shortURL -> real).
  When chained URLs are detected, the rule 'SHORT_URL_CHAINED' is fired.
  If a chained loop is detected, the rule 'SHORT_URL_LOOP' is fired.
  If more than 10 chained URLs are found, 'SHORT_URL_MAXCHAIN' is fired
  and no further redirections are checked.
- If the shortener returns 404 (i.e. not found) for the short URL, then
  'SHORT_URL_404' is fired.
- Prevents amavis from dying on eval block tests by adding
  "local $SIG{'__DIE__'}" to each block.
- Added an option to allow logging to syslog (mail.info).
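
The chain handling listed above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the plugin's Perl source: the function follow_chain, the head callable (any function that performs one HEAD request and returns a status and Location value) and the exact point at which the limit of 10 is counted are all assumptions. Only the rule names and the limit come from the description above and from the plugin's rule set.

```python
from urllib.parse import urljoin, urlsplit

MAX_CHAIN = 10  # limit taken from the description; how it is counted is assumed


def follow_chain(url, shorteners, head):
    """Follow short-URL redirects and return (final_url, set_of_rule_names).

    shorteners: set of lowercase hostnames treated as URL shorteners.
    head:       callable taking a URL and returning (status, location_or_None).
    Both parameters are hypothetical names used only for this sketch.
    """
    rules = set()
    seen = set()
    hops = 0
    current = url
    # Keep going only while the current URL still points at a shortener.
    while (urlsplit(current).hostname or "") in shorteners:
        rules.add("HAS_SHORT_URL")
        if current in seen:
            rules.add("SHORT_URL_LOOP")      # we have been here before
            break
        if hops >= MAX_CHAIN:
            rules.add("SHORT_URL_MAXCHAIN")  # stop checking redirections
            break
        seen.add(current)
        status, location = head(current)
        if status == 404:
            rules.add("SHORT_URL_404")
            break
        if status not in (301, 302, 303, 307, 308) or not location:
            break                            # not a redirect: nothing to decode
        hops += 1
        current = urljoin(current, location)  # also resolves relative targets
        if (urlsplit(current).hostname or "") in shorteners:
            rules.add("SHORT_URL_CHAINED")   # short URL led to a short URL
    return current, rules


# Usage with a canned lookup table instead of real HTTP requests:
table = {
    "http://bit.ly/a": (301, "http://goo.gl/b"),
    "http://goo.gl/b": (301, "http://example.com/real"),
}
final, rules = follow_chain(
    "http://bit.ly/a", {"bit.ly", "goo.gl"},
    lambda u: table.get(u, (404, None)),
)
# final == "http://example.com/real"
# rules == {"HAS_SHORT_URL", "SHORT_URL_CHAINED"}
```

In this sketch the chain, loop and max-chain flags are set only inside the branch that detects them, so a short URL that returns 404 on the first request yields just HAS_SHORT_URL and SHORT_URL_404.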
Kind regards,
Steve.
I've been testing version 0.5 of this plugin. I'm running SpamAssassin
v3.2.5 on CentOS v5.5 32-bit, Perl v5.8.8. My tests use a single test
message in which I swap out the URLs it contains.
Using URLs like these:
http://goo.gl/foo
http://bit.ly/foo
http://2chap.it/foo
I consistently hit on these rules:
HAS_SHORT_URL
SHORT_URL_404
SHORT_URL_CHAINED
SHORT_URL_LOOP
SHORT_URL_MAXCHAIN
I can understand hitting on HAS_SHORT_URL and SHORT_URL_404, but why is
-every- test hitting SHORT_URL_CHAINED, SHORT_URL_LOOP, SHORT_URL_MAXCHAIN?
Brent Gardner