On Fri, 25 Nov 2005, Gerard Robin wrote:

> Can someone give me some advices to recognize such subject if it's 
> possible.

Yes, it's possible. The solution already exists: it's called 
SpamAssassin. If you're trying to hand-roll your own approach to 
this problem, you're reinventing an already impressively round wheel.

http://spamassassin.apache.org/
http://search.cpan.org/dist/Mail-SpamAssassin/lib/Mail/SpamAssassin.pm 

Now if SpamAssassin does an inadequate job of filtering based on garbage 
subject lines, there's a very easy mechanism for adding rules to it so 
that you can make it do a better job. My strong hunch though is that, 
out of the box, it will already come with dozens of rules that can be 
applied to identifying spam just based on characteristics of the subject 
header alone, as well as hundreds of rules for matching based on other 
properties, including other headers and the message body itself. 
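
For what it's worth, a local rule is only a few lines. Here is a rough, 
untested sketch of what one might look like in local.cf (or in your 
user_prefs file). The rule name, the pattern, and the score are all 
made up for illustration; you would tune them against the junk subjects 
you actually receive:

```
# Hypothetical local rule: the name, regex, and score are illustrative only.
# Flags subjects that contain a word made of seven or more consonants in a row.
header   LOCAL_GIBBERISH_SUBJ  Subject =~ /\b[bcdfghjklmnpqrstvwxz]{7,}\b/i
describe LOCAL_GIBBERISH_SUBJ  Subject contains a long run of consonants
score    LOCAL_GIBBERISH_SUBJ  1.5
```

Running "spamassassin --lint" afterwards should tell you whether the 
new rule parses cleanly. A modest score like this only nudges a message 
toward the spam threshold rather than condemning it outright, which 
seems the safer choice for a pattern this crude.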

Plus, just to make things even better, SpamAssassin can do Bayesian 
statistical analysis of the messages that *you* consider to be spam or 
not-spam, so that even messages that slip through the canned rules can 
still be identified just based on their similarity to spam you've 
received in the past.

Outdoing all this would take considerable effort and a lot of time, and 
you have, I am sure, better ways to spend it. Install SpamAssassin and 
stop managing your junk mail by hand -- you'll be glad you did :-)



-- 
Chris Devers
