Victor Duchovni put forth on 11/18/2010 12:52 PM:

> This filter is too fragile IMHO. My best advice is to find filters that
> detect spam.

I think you've missed some of my previous posts regarding my spam
filtering setup.  I kill about 99.5+% of it at SMTP time, without
resorting to body filters or any external content filters such as
SpamAssassin.  I'm stalking the last half percent, Victor.  Probably a
fruitless effort.  Not long ago someone stated on another list that the
goal in fighting spam isn't necessarily to kill it all, but to simply
make the problem manageable.  I'm probably crossing that line with this
effort.

>> But the fact that we don't receive legit mail composed in
>> non-English languages does make it spam.
> 
> RFC 2047 is not about non-English. It enables encoding of non-ASCII
> characters, these may crop up in English text from time to time.
> It is naïve to assume otherwise :-)

Yes, I fully understand this possibility.  My analysis and actions are
based on the mail flow seen here.  And I'm not looking to filter all mail
with this, only the suspect stuff I've identified.
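
To illustrate the RFC 2047 point quoted above, here is a minimal sketch,
assuming Python's standard email.header module.  It shows an otherwise
English subject with one accented character going out as an encoded-word.
The subject text is a made-up example, not taken from real mail.

```python
from email.header import Header, decode_header, make_header

# Hypothetical English subject containing a single non-ASCII character.
subject = "Café menu for Friday"

# Encode the subject the way a mail client would before sending (RFC 2047).
encoded = Header(subject, charset="utf-8").encode()

# Decode it again, as a receiving client would for display.
decoded = str(make_header(decode_header(encoded)))

print("On the wire:", encoded)
print("Displayed:  ", decoded)
```

The wire form begins with "=?utf-8?" even though the text is English, which
is the case Victor is describing.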

>> Does my motivation here make more sense now?
> 
> Not really, the approach is still too fragile IMHO.

Ok.  Here are the basic anti-spam snippets from my main.cf.  As
I said, this combo kills about 99.5+% of the spam:

header_checks = pcre:/etc/postfix/header_checks,
        regexp:/etc/postfix/phish419.regexp, tcp:[127.0.0.1]:2526

smtpd_recipient_restrictions =
        permit_mynetworks
        reject_unauth_destination
        check_sender_access hash:/etc/postfix/auto-whtlst
        check_client_access hash:/etc/postfix/blacklist
        check_sender_access hash:/etc/postfix/blacklist
        check_recipient_access hash:/etc/postfix/whitelist
        check_sender_access hash:/etc/postfix/whitelist
        check_client_access hash:/etc/postfix/whitelist
        check_client_access proxy:${cidr}/dnswl

        reject_unknown_reverse_client_hostname
        reject_non_fqdn_sender
        reject_non_fqdn_helo_hostname
        reject_invalid_helo_hostname
        reject_unknown_helo_hostname
        reject_unlisted_recipient

        check_client_access proxy:pcre:/etc/postfix/fqrdns.pcre
        check_client_access proxy:pcre:/etc/postfix/ptr-tld.pcre
        check_client_access proxy:${cidr}/countries
        check_client_access proxy:${cidr}/spammer
        check_client_access proxy:${cidr}/misc-spam-srcs

        reject_rbl_client zen.spamhaus.org
        reject_rbl_client psbl.surriel.com
        reject_rhsbl_client dbl.spamhaus.org
        reject_rhsbl_sender dbl.spamhaus.org
        reject_rhsbl_helo dbl.spamhaus.org
        check_policy_service inet:127.0.0.1:60000


Ok, here's the approach for sorting the spam with encoded Subject
headers.  At this point it is merely a test case on a single mailbox.
Do you still feel this is too fragile (assuming the PCRE is correct to
match what I'm targeting)?

/etc/postfix/header_checks
/^Subject: =\?.{6,12}\?/            PREPEND X-Encoded-Subject: true
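
A quick way to see what the expression above does and does not tag is to
run it against a few sample headers.  This is only a sketch: it assumes
Python's re module behaves like PCRE for this particular pattern, it uses
re.IGNORECASE because Postfix pcre tables match case-insensitively by
default, and the sample headers and the is_tagged helper are made up for
illustration.

```python
import re

# Same expression as the header_checks rule above.
ENCODED_SUBJECT = re.compile(r"^Subject: =\?.{6,12}\?", re.IGNORECASE)

# Hypothetical sample headers, not taken from real mail.
samples = [
    "Subject: =?utf-8?q?Caf=C3=A9_menu_for_Friday?=",  # English, one accent
    "Subject: =?koi8-r?B?8NLJ18XU?=",                  # Cyrillic, base64
    "Subject: Re: =?utf-8?B?Q2Fmw6k=?=",               # encoded word after "Re: "
    "Subject: Plain ASCII subject",
]


def is_tagged(header: str) -> bool:
    """Return True if the rule would PREPEND X-Encoded-Subject for this header."""
    return ENCODED_SUBJECT.search(header) is not None


for header in samples:
    print(is_tagged(header), header)
```

With these samples the first two headers are tagged and the last two are
not.  Note that "." also matches "?", so a short charset name such as utf-8
still satisfies the {6,12} range, and that a subject whose encoded word
follows "Re: " is not tagged at all.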

/home/user/.dovecot.sieve
require "fileinto";

if false {}

elsif header :contains "List-Id" "list.spammers.dontlike.us" {
        fileinto "1-SDLU";
        stop;
}
elsif header :contains "List-Id" "linux-ide.vger.kernel.org" {
        fileinto "1-Linux-IDE";
        stop;
}
elsif header :contains "List-Id" "linux-raid.vger.kernel.org" {
        fileinto "1-Linux-RAID";
        stop;
}
elsif header :contains "List-Id" "linux-scsi.vger.kernel.org" {
        fileinto "1-Linux-SCSI";
        stop;
}
elsif header :contains "List-Id" "XFS" {
        fileinto "1-XFS";
        stop;
}
elsif header :contains "List-Post" "postfix-users@postfix.org" {
        fileinto "1-Postfix-Users";
        stop;
}
elsif header :contains "List-Id" "users.lists.roundcube.net" {
        fileinto "1-Roundcube";
        stop;
}
elsif header :contains "List-Id" "dovecot.dovecot.org" {
        fileinto "1-Dovecot";
        stop;
}
elsif address :contains "from" "*" {
        fileinto "*";
        stop;
}
elsif address :contains "from" "*...@gmail.com" {
        fileinto "*";
        stop;
}
elsif address :contains "from" "*...@gmail.com" {
        fileinto "*";
        stop;
}
elsif address :contains "from" "*" {
        fileinto "*";
        stop;
}
elsif address :contains "from" "*" {
        fileinto "*";
        stop;
}
elsif address :contains "from" "*" {
        fileinto "*";
        stop;
}
elsif address :contains "to" "postmas...@hardwarefreak.com" {
        fileinto "Postmaster";
        stop;
}
elsif header :contains "Received" "for <postmas...@hardwarefreak.com>" {
        fileinto "Postmaster";
        stop;
}
elsif header :contains "List-Id" "debian-user.lists.debian.org" {
        fileinto "1-Debian-Users";
        stop;
}
elsif header :contains "List-Id" "spam-l.spam-l.com" {
        fileinto "1-Spam-l";
        stop;
}
elsif address :contains "To" "*" {
        fileinto "SpamTrap";
        stop;
}
elsif address :contains "Cc" "*" {
        fileinto "SpamTrap";
        stop;
}
elsif header :contains "Received" "for <*>" {
        fileinto "SpamTrap";
        stop;
}
elsif address :contains "To" "*" {
        fileinto "SpamTrap";
        stop;
}
elsif address :contains "Cc" "*" {
        fileinto "SpamTrap";
        stop;
}
elsif header :contains "Received" "for <*>" {
        fileinto "SpamTrap";
        stop;
}
elsif address :contains "To" "*" {
        fileinto "SpamTrap";
        stop;
}
elsif address :contains "Cc" "*" {
        fileinto "SpamTrap";
        stop;
}
elsif header :contains "Received" "for <*>" {
        fileinto "SpamTrap";
        stop;
}
elsif header :contains "List-Id" "samba.lists.samba.org" {
        fileinto "1-Samba";
        stop;
}
elsif header :contains "X-Encoded-Subject" "true" {
        fileinto "1-1encoded_subject";
        stop;
}
else {
        fileinto "INBOX";
}

As you can see, this is pretty darn selective.

-- 
Stan
