Re: Regex question

M. Lewis Thu, 04 Jan 2007 00:42:15 -0800

Igor Sutton wrote:

2007/1/4, M. Lewis <[EMAIL PROTECTED]>:



I'm trying to parse the domain name out of some URLs. In the example
data, my regex works fine on the first two URLs, but clips off the first
two characters of the domain on the third example. My regex probably
could be much better.

#!/usr/bin/perl

use strict;
use warnings;

my $regex = qr'http://\w+?\.?\w+?\.?(\w+\.com)';

while(<DATA>){
   if (/$regex/o){print "$1 \t $_"}
}

__DATA__
http://www.asldkjlkwerj.com/
http://w71r2xk22q1affwp1ewpjeee.alaskjhhawe.com/?
http://qwlkjekwl.com/?IJESRKUFZedFRCVFJYQV4cUFtY

Thanks for any pointers,
Mike



Why don't you use the URI module for that?

<code>

use strict;
use warnings;
use URI;

while (<DATA>) {
   chomp;                    # drop the trailing newline before parsing
   my $uri = URI->new($_);
   print $uri->host, "\n";   # full host name, e.g. www.asldkjlkwerj.com
}
</code>

HTH.

Thanks Igor. I now see why the first two characters of the third example are clipped.
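For anyone reading this in the archive, here is a small sketch that reproduces the clipping. Each `\w+?` must consume at least one character even when the `\.?` after it matches nothing, so on a URL with no subdomain the two lazy groups eat the first two letters of the domain. The names `$fixed` and `domain_of` are mine, not from the thread, and the alternative pattern is only one possible rewrite that still assumes a `.com` domain, as the original did.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pattern from the original question.
my $orig = qr'http://\w+?\.?\w+?\.?(\w+\.com)';

# One possible alternative (an assumption, not from the thread):
# zero or more whole "label." groups, then the .com domain itself.
my $fixed = qr{http://(?:\w+\.)*(\w+\.com)};

# Hypothetical helper: return the captured domain, or undef on no match.
sub domain_of {
    my ($re, $url) = @_;
    return $url =~ $re ? $1 : undef;
}

my @urls = qw(
    http://www.asldkjlkwerj.com/
    http://w71r2xk22q1affwp1ewpjeee.alaskjhhawe.com/?
    http://qwlkjekwl.com/?IJESRKUFZedFRCVFJYQV4cUFtY
);

for my $url (@urls) {
    my $was = domain_of($orig,  $url);
    my $now = domain_of($fixed, $url);
    print "original: ", (defined $was ? $was : '(no match)'), "\n";
    print "adjusted: ", (defined $now ? $now : '(no match)'), "\n\n";
}
```

On the third URL the original pattern captures `lkjekwl.com`, because `q` and `w` are taken by the two `\w+?` groups, while the adjusted pattern captures `qwlkjekwl.com`.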

I'm pretty sure the only reason I haven't used the URI module is I didn't know it existed. :-)
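One thing worth noting: `host` returns the whole host name (`www.asldkjlkwerj.com`), not just the domain part the original regex was after. A rough sketch of trimming it down follows. `base_domain` is a hypothetical helper of my own, and keeping the last two labels is a naive assumption that suits plain `.com` names but gets suffixes like `.co.uk` wrong.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI;

# Hypothetical helper: keep only the last two dot-separated labels of the host.
# Naive assumption: fine for plain .com names, wrong for suffixes like .co.uk.
sub base_domain {
    my ($url) = @_;
    my $uri = URI->new($url);
    return unless $uri->can('host');      # e.g. mailto: URIs have no host
    my $host = $uri->host;
    return unless defined $host && length $host;
    my @labels = split /\./, $host;
    return @labels > 2 ? join('.', @labels[-2, -1]) : $host;
}

my @urls = (
    'http://www.asldkjlkwerj.com/',
    'http://w71r2xk22q1affwp1ewpjeee.alaskjhhawe.com/?',
    'http://qwlkjekwl.com/?IJESRKUFZedFRCVFJYQV4cUFtY',
);

for my $url (@urls) {
    my $domain = base_domain($url);
    my $shown  = defined $domain ? $domain : '(no host)';
    print "$shown\t$url\n";
}
```

This needs the URI distribution from CPAN installed, the same as the snippet above.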


Thanks for the pointer Igor!
Mike

--

 IBM: Incredibly Big Monster
  03:35:01 up 21 days, 26 min,  0 users,  load average: 0.14, 0.22, 0.25

 Linux Registered User #241685  http://counter.li.org

