On Wed, 2003-12-10 at 20:58, Joel Newkirk wrote:
> On Mon, 2003-12-08 at 01:06, Joel Newkirk wrote:
> > Now I'm working on a console command to offer the same functionality
> > (only needing to read the rules, not write) using the same dbm. I've
> > used precisely the same subroutine as in the webmin version, but
> > whenever I reach:
> >   dbmopen (%PLRULES, "/var/szs/rules.dbm", undef) or die $!;
> > I die, with "No such file or directory".
> > What am I doing wrong??

I found my problem - apparently webmin was doing "use GDBM_File;" for
me, which is why it worked in the webmin module. Since my console
script included neither that nor the webmin "web-lib.pl", dbmopen was
defaulting to a different DB format, hence the "No such file or
directory" error. Once I include "use GDBM_File;" I can successfully
read the DBM written by the webmin module. Doh! Maybe my second week
with Perl will be smoother... ;^)

Many thanks to all who offered advice. Once I get the suite working as
desired I'll be going through again and optimizing, and during that
phase I plan to change over to using a tie instead, but for now I've
got the functionality I need. Remaining work is webmin UI cleanup,
adding further analysis capabilities, and an auto-unblock as
counterpart to the autoblock.pl/autoblock.cgi functionality. Our
spam-harried clients thank you as well... :^) Now I can get the
autoblock running under cron, instead of only being available manually
from webmin.

FWIW, here's the bigger picture:

Using the ULOG target in iptables, and ulog-acctd, I log a flag (SMTP,
POP3, or FILT), a timestamp, and a source IP for every connection to
our mailserver cluster. (The standard iptables LOG target gives us far
more information than we need, and as a result takes about 5x longer to
process and 10x more disk space.) Periodic analysis brings up the top
(N) source IPs with more than (M) SMTP connections, which we process as
follows:

If the IP is already blocked, do nothing.
If the IP is one of our own, do nothing.
If the IP is the source of authenticated POP3 connections, do nothing.
If the IP is in /var/szs/whitelist, do nothing.
Otherwise:
Perform a reverse lookup of the IP.
If the reverse record is empty, or contains all four octets of the IP
(decimal or hex), then block it with iptables from entering the cluster
at all.
Then we compare the reverse record against /var/szs/rules.dbm's regex
rules, and if we find a match we block it.

Those rules are crafted to identify end-user IPs by patterns in their
naming, e.g. '$3.$4.{1,20}rr.com', with the IP octets being subbed for
$1,$2,$3,$4 before regex evaluation.

The point is this - I'm opposed on principle to using mail-abuse.org's
DUL (dial-up IP pool list) and blocking huge blocks of dynamic IPs,
since I consider it not unreasonable that someone is (like me) running
a mailserver at home on a dynamic IP. However, with the hundreds of
thousands of spam messages hitting the company mailservers each day,
largely from broadband-connected end-user machines infected with SoBig
or similar spam-relay infections, we needed a way to weed them out. In
the last 24 hours, we've received just over 2 million SMTP connections,
roughly 85% of which are incoming spam. (Logging all of those with the
native iptables LOG target is hopeless, with logfiles topping 1 GB
daily, while ULOG with the custom format is about 100 MB.)

This set of scripts has allowed us to cut our incoming spam by about
80% while cutting the resource usage on the servers drastically,
instead of boosting it several times over by implementing extensive
content filtering on the servers themselves. The reason for the
tremendous cut in resource requirements is that not only does this crap
never reach the servers, but the servers also don't get bogged down
repeatedly trying to send "unknown recipient" bounces to nonexistent
sources. We reached one point where we had over 100,000 bounce messages
clogging the outbound message queue... And that's with periodic manual
flushes.
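
For anyone who wants to experiment with the reverse-record test described above, here is a minimal Perl sketch of it. It is not the actual autoblock.pl code. The sub names (expand_rule, verdict), the verdict strings, the sample IP and the sample hostnames are all made up for illustration, and the rules are passed in as a plain list rather than read from /var/szs/rules.dbm. It also assumes "contains all four octets" means either all four in decimal or all four as two-digit hex, checked with a naive substring test.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: substitute the IP's octets for $1..$4 in a rule
# string before the rule is used as a regex.
sub expand_rule {
    my ($rule, @octets) = @_;
    (my $expanded = $rule) =~ s/\$([1-4])/$octets[$1 - 1]/ge;
    return $expanded;
}

# Hypothetical helper: decide what to do with one source IP, given its
# reverse (PTR) name and a list of rule strings.
sub verdict {
    my ($ip, $ptr, @rules) = @_;
    my @octets = split /\./, $ip;

    # An empty reverse record is blocked outright.
    return 'block: no reverse record' if !defined $ptr || $ptr eq '';

    my $name = lc $ptr;
    my @hex  = map { sprintf '%02x', $_ } @octets;

    # Naive substring test (an assumption): all four octets present in
    # decimal, or all four as two-digit hex. Short octets such as "1"
    # will match far too easily; a real script would want stricter
    # matching.
    my $all_dec = !grep { index($name, $_) < 0 } @octets;
    my $all_hex = !grep { index($name, $_) < 0 } @hex;
    return 'block: IP embedded in name' if $all_dec || $all_hex;

    # Otherwise try each rule with the octets substituted in.
    for my $rule (@rules) {
        my $re = expand_rule($rule, @octets);
        return "block: matched rule $rule" if $name =~ /$re/i;
    }
    return 'pass';
}

# Sample data only; none of these names come from real logs.
my @rules = ('$3.$4.{1,20}rr.com');
for my $case (
    [ '24.93.67.12', '' ],
    [ '24.93.67.12', 'cpe-24-93-67-12.nyc.rr.com' ],
    [ '24.93.67.12', 'dhcp-67-12.nyc.rr.com' ],
    [ '24.93.67.12', 'mail.example.org' ],
) {
    my ($ip, $ptr) = @$case;
    printf "%-15s %-30s %s\n", $ip, $ptr, verdict($ip, $ptr, @rules);
}
```

Note that the dots in a rule like '$3.$4.{1,20}rr.com' are regex wildcards, so the pattern matches whether the octets in the hostname are separated by dots, dashes or anything else. The earlier checks (already blocked, our own IP, authenticated POP3, whitelist) and the actual iptables call are left out of the sketch.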
All of this processing takes place on the Director node of an LVS
cluster running qmail on multiple nodes, handling email for about 50
domains. (We are an ISP.) Apart from this processing, the only things
the Director node deals with are routing SMTP/POP3/HTTP(S) to the
most-available node, and presenting a caching nameserver for use by the
clustered servers, so I have resources to spare up front, while
resources on the mailserver nodes themselves are in much higher demand.
And the reverse lookups performed by the autoblock scripts come at very
low cost, since we are performing them anyway on behalf of the
mailserver nodes, and caching the lookups.

Again, thanks for advice and hints.

j

Joel Newkirk
perl at newkirk.us
firewalldude @ dsslink.net

--
"Not all those who wander are lost." - JRR Tolkien