Igor Sutton wrote:
2007/1/4, M. Lewis <[EMAIL PROTECTED]>:
I'm trying to parse the domain name out of some URLs. In the example
data, my regex works fine on the first two URLs, but clips off the first
two characters of the domain on the third example. My regex probably
could be much better.
#!/usr/bin/perl
use strict;
use warnings;
my $regex = qr'http://\w+?\.?\w+?\.?(\w+\.com)';
while(<DATA>){
if (/$regex/o){print "$1 \t $_"}
}
__DATA__
http://www.asldkjlkwerj.com/
http://w71r2xk22q1affwp1ewpjeee.alaskjhhawe.com/?
http://qwlkjekwl.com/?IJESRKUFZedFRCVFJYQV4cUFtY
Thanks for any pointers,
Mike
Why don't you use the URI module for that?
<code>
use strict;
use warnings;
use URI;
# Print the host part of each URL read from the __DATA__ section.
while (my $line = <DATA>) {
    chomp $line;
    my $uri = URI->new($line);
    print $uri->host, "\n";
}
</code>
HTH.
I tried this, Igor, but it doesn't do exactly what I'm looking for: host
returns the full host name (www.asldkjlkwerj.com for the first URL). I
checked the docs and don't think the module will do exactly what I want,
though I need to do some more reading. What I want to extract is just the
domain name, à la:
asldkjlkwerj.com
alaskjhhawe.com
qwlkjekwl.com
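
For reference, here is a minimal sketch of one way to get that output. It
is not from the thread: the helper name last_two_labels is made up for
illustration, and it assumes the wanted domain is always the last two
dot-separated labels of the host, which holds for the .com examples above
but not for suffixes such as .co.uk. It captures the whole host first and
then trims it, instead of trying to skip optional prefixes with \w+?,
which is the part of the original regex that eats the first two characters
of a bare domain.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): return the last two
# dot-separated labels of the host in an http:// URL, or nothing
# (undef in scalar context) if the URL does not match.
# Assumption: the domain is always exactly two labels, e.g. foo.com.
sub last_two_labels {
    my ($url) = @_;
    # Capture everything between "http://" and the first "/", ":", "?",
    # "#" or whitespace character.
    my ($host) = $url =~ m{^http://([^/:?#\s]+)}i or return;
    my @labels = split /\./, $host;
    return if @labels < 2;
    return join '.', @labels[-2, -1];
}

for my $url (qw(
    http://www.asldkjlkwerj.com/
    http://w71r2xk22q1affwp1ewpjeee.alaskjhhawe.com/?
    http://qwlkjekwl.com/?IJESRKUFZedFRCVFJYQV4cUFtY
)) {
    my $domain = last_two_labels($url);
    print defined $domain ? "$domain\t$url\n" : "no match\t$url\n";
}
```

The same split-and-take-the-last-two step could presumably be applied to
the string returned by $uri->host if you would rather let URI do the
parsing.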