On Tue, 3 May 2005, gavin mc auley wrote:

> I am new to perl and regular expressions. I am trying to write a 
> simple spam filter.

There's your first mistake! :-)

This is a slippery slope. 

Your simple filter isn't going to be as effective as you want, so you'll 
want to tweak it. The second draft will be a little better, but maybe 
it'll trap some of your legit mail, so you have to write a third version 
and maybe it will be less effective than the first two.

Repeat this cycle for a while and you'll end up with a big, convoluted 
system that, if you're lucky, will work almost as well as SpamAssassin. 

That, or you can just install SpamAssassin now, and start tweaking it to 
do what you want if you find that it isn't doing everything you need. 
This is a much easier approach to follow, not least because you can 
actually look through the SpamAssassin source code, figure out how the 
different rules it uses work, then adjust them or add more of your own. 

In the long run, I think this way will be much easier to manage.

But anyway, you had a specific question...

> I want to extract all characters after the @ character. i.e. if the 
> email is [EMAIL PROTECTED], I want to extract hotmail.com.

   my $address = '[EMAIL PROTECTED]';
   my ( $domain ) = $address =~ m/@(.*)/;

But there are all kinds of edge cases to account for. The above should 
work if the address is valid and isn't surrounded by any other text, but 
if you hit any wrinkles -- and you will -- then you're going to start 
needing something a lot more nuanced before long. 
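
To give a rough idea of what "more nuanced" might look like, here is a 
minimal sketch that pulls the domain out of a string that may have other 
text around the address, such as a display name or angle brackets. The 
helper name extract_domain and the example.com / example.org sample 
addresses are made up for illustration, and the pattern is a deliberate 
simplification, not a full address parser.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lowercased domain of the first
# address-like token in $text, or undef if there is none.
sub extract_domain {
    my ($text) = @_;
    return unless defined $text;

    # Domain = dot-separated labels of letters, digits and hyphens.
    # This deliberately ignores quoted local parts, comments, and
    # IP-literal domains, which a real parser would have to handle.
    my ($domain) = $text =~ /\@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)/;
    return defined $domain ? lc $domain : undef;
}

# Made-up sample inputs: bare address, display name with angle
# brackets, trailing comment, and a string with no address at all.
for my $header (
    'someone@example.com',
    'Some One <someone@Example.COM>',
    '"Some One" <someone@mail.example.org> (work)',
    'no address here',
) {
    my $domain = extract_domain($header);
    printf "%-50s => %s\n", $header, defined $domain ? $domain : '(none)';
}
```

Even this only covers the common cases. If you need anything stricter, 
a proper address parser from CPAN is probably a safer bet than growing 
the regex by hand.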

Really, SpamAssassin will save you a lot of headaches here.



-- 
Chris Devers      [EMAIL PROTECTED]
http://devers.homeip.net:8080/blog/

np: 'War Pigs'
     by The Dresden Dolls
     from 'Live at the Paradise Rock Club - 07.09.2004'
