On Wed, 2014-01-08 at 10:45 +0300, Christopher Culver wrote:
> For the last year or so, I’ve been deluged with Spanish-language spam
> with a very predictable format: the sender name begins with "Lic. " and
> the sender address is at an .info domain.
>
> EXAMPLE SENDERS:
>
> Lic. Mayra Miranda
> Lic. Toledano
> Lic. Carmen Quintanar
> Lic. Lizárraga Mena
> Lic. Lizárraga Mena
> Lic. Mildreth Palma
>
> EXAMPLE DOMAINS:
>
> cont...@superecursos.info
> acev...@asistenciaejecutiva.info
> n...@controltecnicas.info
> prestacio...@hoteles2013.info
> eficiencialogisticamx.info
> fideicomi...@controlinterno.info
>
> While some of the .info domains are reused from spam message to spam
> message, allowing me to blacklist them, occasionally new domains
> appear. Even with feeding thousands of these into the Bayesian database,
> they still get only a spamassassin score of 3.0 out of 5.0 on my
> system. Therefore, I believe a new rule is called for.
>
> Is this type of spam common enough that a new rule can be pushed out to
> all spamassassin users with sa-update raising the score on messages with
> Spanish-language text, with sender names beginning with the substring
> "Lic. " and coming from an .info domain?
>

I for one have never seen this type of spam, but that doesn't mean much:
a lot of sites get types of spam I never see and vice versa. However,
that's why SA lets you write local rules. In this case something like:

    describe SPAMES  Spanish info spam
    header   SPAMES  From =~ /Lic\..*\.info/
    score    SPAMES  5

should kill any mail from anybody using Lic as a title and posting from
an .info domain.

Disclaimer: The regex has been tested with "grep -P" but I haven't
tested the rule in SA, so it must be syntax checked and tested against
both spam and ham before putting it live.

I prefer to use meta-rules that match two or more body text phrases to
detect spam: for example, a rule that matches one or more sales come-ons
and one or more product names or types, such as "lowest price for new"
and "Comfi-sleep pillow".

My suggested rule will catch anybody using Lic as a title and posting
from .info domains, so you'd better be *very* sure that you never get
legitimate mail from anybody like this before using it as shown. It
would be better to use it as one of the subrules in a meta-rule that
won't fire unless it also matches some combination of Spanish body text
phrases that do not appear in your ham stream.

Martin
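
The meta-rule approach described above could be sketched roughly as
follows. This is untested and only an illustration: the rule names
(__LIC_INFO_FROM, __ES_PHRASE_A, __ES_PHRASE_B, SPAMES_META) and both
Spanish phrases are hypothetical placeholders, not taken from the actual
spam, and need replacing with phrases that show up in your spam but not
in your ham.

```
# Sketch only: rule names and Spanish phrases below are hypothetical.
# Sub-rules whose names start with "__" carry no score of their own.
header    __LIC_INFO_FROM  From =~ /Lic\..*\.info/

# Placeholder body phrases; substitute ones seen in your own spam stream.
body      __ES_PHRASE_A    /cupo limitado/i
body      __ES_PHRASE_B    /informes e inscripciones/i

# Fire only when the sender pattern AND at least one phrase match.
meta      SPAMES_META      __LIC_INFO_FROM && (__ES_PHRASE_A || __ES_PHRASE_B)
describe  SPAMES_META      Lic. sender at .info plus Spanish spam phrases
score     SPAMES_META      5
```

After adding something like this to your local rules file, running
"spamassassin --lint" should report any syntax errors. The rule then
still needs testing against saved spam and ham before the score is
trusted.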