On Wed, 2003-12-10 at 20:58, Joel Newkirk wrote:

> On Mon, 2003-12-08 at 01:06, Joel Newkirk wrote:

> > Now I'm working on a console command to offer the same functionality
> > (only needing to read the rules, not write) using the same dbm.  I've
> > used precisely the same subroutine as in the webmin version, but
> > whenever I reach:
> > dbmopen (%PLRULES, "/var/szs/rules.dbm", undef) or die $!;
> > I die, with "No such file or directory".

> > What am I doing wrong??

I found my problem - apparently webmin was doing "use GDBM_File;" for
me, which is why it worked in the webmin module.  Since I wasn't
including either that or the webmin "web-lib.pl", dbmopen was defaulting
to a different DB format, hence the "No such file or directory" error.
Once I include "use GDBM_File;" (lowercase "use") I can successfully read
the DBM written by the webmin module.  Doh!  Maybe my second week with
Perl will be smoother... ;^)

Many thanks to all who offered advice.  Once I get the suite working as 
desired I'll be going through again and optimizing, and during that
phase I plan to change over to using a tie instead, but for now I've got
the functionality I need - remaining work is webmin UI cleanup and
adding further analysis capabilities, and auto-unblock as counterpart
to the autoblock.pl/autoblock.cgi functionality.
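
For anyone finding this in the archives, here is roughly what I expect
the tie version to look like.  Treat it as a minimal sketch, not the
code I'm actually running: load_rules is a name I made up for
illustration, and it assumes GDBM_File is built into your Perl.  Naming
the DBM class explicitly means the reader no longer depends on whatever
format dbmopen happens to default to.

```perl
use strict;
use warnings;
use GDBM_File;    # provides GDBM_READER, GDBM_WRCREAT, ...

# Hypothetical helper: open a GDBM file read-only and return a
# reference to a plain-hash copy of its contents.
sub load_rules {
    my ($path) = @_;

    # GDBM_READER opens the file read-only; the mode argument only
    # matters when a file is being created.
    tie my %db, 'GDBM_File', $path, GDBM_READER(), 0640
        or die "Cannot open $path as a GDBM file: $!";

    my %rules = %db;    # copy so the tie can be released right away
    untie %db;
    return \%rules;
}
```

Usage would be along the lines of
my $rules = load_rules("/var/szs/rules.dbm");
with the console command then iterating over %$rules.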

Our spam-harried clients thank you as well...  :^)  Now I can get the
autoblock running under cron, instead of only being available manually
from webmin.

FWIW, here's the bigger picture:

Using the ULOG target in iptables, and ulog-acctd, I log a flag (SMTP,
POP3, or FILT), a timestamp, and a source IP for every connection to our
mailserver cluster.  (The standard iptables LOG target gives us far more
information than we need, and as a result takes about 5x longer to
process and 10x more disk space.)  Periodic analysis brings up the top
(N) source IPs with greater than (M) SMTP connections, which we process
as follows:
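
The "top (N) with greater than (M)" selection can be sketched like this.
The log line layout here ("FLAG timestamp source-IP", whitespace
separated) is an assumption for illustration only, and top_smtp_sources
is a made-up name; the real ulog-acctd format string may differ.

```perl
use strict;
use warnings;

# Assumed (hypothetical) log line layout: "<FLAG> <timestamp> <source-ip>"
# Returns up to $n [ip, count] pairs, busiest first, for source IPs
# with more than $m SMTP connections.
sub top_smtp_sources {
    my ($lines, $n, $m) = @_;
    my %count;

    for my $line (@$lines) {
        my ($flag, $ts, $ip) = split ' ', $line;
        next unless defined $ip && $flag eq 'SMTP';
        $count{$ip}++;
    }

    # Keep only IPs above the threshold, sorted by count (ties by IP).
    my @busy = sort { $count{$b} <=> $count{$a} || $a cmp $b }
               grep { $count{$_} > $m } keys %count;

    splice @busy, $n if @busy > $n;    # trim to the top N
    return map { [ $_, $count{$_} ] } @busy;
}
```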

If the IP is already blocked, do nothing.  If the IP is one of our own,
do nothing.  If the IP is the source of authenticated POP3 connections,
do nothing.  If the IP is in /var/szs/whitelist, do nothing.  Otherwise:
Perform a reverse lookup of the IP.  If the reverse record is empty, or
contains all four octets of the IP (decimal or hex), then block it with
iptables from entering the cluster at all.  Otherwise we compare the
reverse record against /var/szs/rules.dbm's regex rules, and if we find
a match we block it.  Those rules are crafted to identify end-user IPs by
patterns in their naming, e.g. '$3.$4.{1,20}rr.com' with IP octets being
subbed for $1,$2,$3,$4 before regex evaluation.
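
Here is a rough sketch of the reverse-record checks described above.
The sub names are hypothetical, and the "hex" test is my simplified
reading (the whole address as eight hex digits), so take it as an
illustration under those assumptions.  The skip checks (already
blocked, our own IPs, POP3 sources, whitelist) are left out.

```perl
use strict;
use warnings;

# True if the PTR name embeds the address: either the whole address as
# eight hex digits, or all four octets as standalone decimal numbers.
sub ptr_embeds_ip {
    my ($ip, $ptr) = @_;
    my @oct = split /\./, $ip;

    my $hex = sprintf '%02x%02x%02x%02x', @oct;
    return 1 if index(lc $ptr, $hex) >= 0;

    for my $o (@oct) {
        # Octet must not be part of a longer digit run (24 vs. 124).
        return 0 unless $ptr =~ /(?<!\d)0*$o(?!\d)/;
    }
    return 1;
}

# Substitute the IP's octets for $1..$4 in the rule, then evaluate the
# result as a regex against the PTR name.
sub rule_matches {
    my ($rule, $ip, $ptr) = @_;
    my @oct = split /\./, $ip;
    (my $re = $rule) =~ s/\$([1-4])/$oct[$1 - 1]/g;
    return $ptr =~ /$re/i ? 1 : 0;
}

# Returns a short reason string if the IP should be blocked, or
# nothing (undef in scalar context) if it should be left alone.
sub block_reason {
    my ($ip, $ptr, $rules) = @_;
    return 'no reverse record' unless defined $ptr && length $ptr;
    return 'reverse record embeds IP' if ptr_embeds_ip($ip, $ptr);
    for my $rule (@$rules) {
        return "rule $rule" if rule_matches($rule, $ip, $ptr);
    }
    return;
}
```

The $ptr argument is whatever the reverse lookup returned, for example
from gethostbyaddr(inet_aton($ip), AF_INET) with the Socket module.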

The point is this - I'm opposed on principle to using mail-abuse.org's
DUL (dial-up IP pool list) and blocking huge blocks of dynamic IPs, 
since I consider it not unreasonable that someone is (like me) running 
a mailserver at home on a dynamic IP.  However, with the hundreds of
thousands of spam messages hitting the company mailservers each day,
largely from broadband-connected end-user machines infected with
SoBig or similar spam-relay infections, we needed a way to weed them
out.  In the last 24 hours, we've received just over 2 million SMTP
connections, roughly 85% of which are incoming spam.  (Logging all of
those with the native iptables LOG target is hopeless, with logfiles
topping 1 GB daily, while ULOG with the custom format is about 100 MB.)

This set of scripts has allowed us to cut our incoming spam by about 80%
while cutting the resource usage on the servers drastically, instead of
boosting it several times by implementing extensive content filtering
on the servers themselves.  The reason for the tremendous cut in resource
requirements is that not only does this crap never reach the servers, but
the servers also don't get bogged down repeatedly trying to send "unknown
recipient" bounces to nonexistent sources.  We reached one point where we
had over 100,000 bounce messages clogging the outbound message queue...
And that's with periodic manual flushes.

All of this processing takes place on the Director node of an LVS cluster
running qmail on multiple nodes, handling email for about 50 domains.  (We
are an ISP.)  Apart from this processing, the only things the Director node
deals with are routing SMTP/POP3/HTTP(S) to the most-available node, and
presenting a caching nameserver for use by the clustered servers, so I 
have resources to spare up front, while resources on the mailserver nodes 
themselves are in much higher demand.  And the reverse lookups performed
by the autoblock scripts come at very low cost, since we are performing
them anyway on behalf of the mailserver nodes, and caching the lookups.

Again, thanks for advice and hints.

j

Joel Newkirk
perl at newkirk.us
firewalldude @ dsslink.net

-- 
"Not all those who wander are lost."  - JRR Tolkien

