> while ( m{ ( (?: [a-zA-Z0-9:./-]+ @ )?
> [a-zA-Z0-9][a-zA-Z0-9.;-]+\.$tld )
> (?! \.?\w ) }gxo ) {
> my $host = lc $1;
> # Deal with inserted-semicolon munging, e.g. 'http://foo;.com'
> if ( my @split = $host =~ /(.*?);(.*)/ ) {
>
> This seems to be a big improvement at least on the 3 million lines of
> random traffic I tested with, and it's a smaller patch:
[snip]
Well, it may have been an improvement on my own data, but a colleague
pointed out the following case:
    check out spamsite.com;it's awesome!
And this didn't de