On Monday, Jul 14, 2003, at 09:37 US/Pacific, Mike Blezien wrote:
Hello,
We have a fairly simple redirect script where a URL is entered, and even though there are directions not to enter the "http://www", sometimes we get it, or "http://". What is the simplest method to extract just the domain name if "http://www.somedomain_name.com" or "http://somedomain_name.com" is entered, so that we end up with just "somedomain_name.com"?
The simplest of re's I can think of is
    #------------------------
    #
    sub simple_get_domain {
        my ($string) = @_;
        my $some_dom;
        # try the longer prefix first; note the escaped dot after "www"
        if ( $string =~ m!^http://www\.(.*)! ) {
            $some_dom = $1;
        } elsif ( $string =~ m!^http://(.*)! ) {
            $some_dom = $1;
        }
        return $some_dom;    # undef if neither prefix matched
    } # end of simple_get_domain
but this is merely a string parser - and is not really gonna make sure that the $some_dom returned is kosher...
The squirrelly parts, of course, are things like:
http://foo.bar.com:12345/uri_path_stuff_here or
http://127.0.0.1/
I start that part with something like
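    sub parse_url {
        my ($me, $url) = @_;
        # schema, then "://", then everything up to the first "/", then the rest
        my ($schema, $host_port, $uri) = ($url =~ m!^([^:]+)://([^/]+)(.*)$!);
        return ($schema, $host_port, $uri);
    } # end of parse_url

    sub get_host_port {
        my ($me, $host_port) = @_;
        # split on a single ":" if there is one, else the port is empty
        my ($host, $port) = ($host_port =~ m/^([^:]+):([^:]+)$/) ? ($1, $2) : ($host_port, '');
        return ($host, $port);
    } # end of get_host_port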
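To see how those two steps might chain together, here is a minimal sketch written as a plain function rather than a method (so no $me argument). The name split_url is made up for illustration, and it assumes the port, when present, is all digits.

```perl
use strict;
use warnings;

# Hypothetical helper: split a URL into scheme, host, port and path.
# Returns an empty list when the string does not look like scheme://host...
sub split_url {
    my ($url) = @_;
    my ($scheme, $host_port, $path) = $url =~ m!^([^:/]+)://([^/]+)(.*)$!
        or return;
    # Assumption: a trailing ":digits" is the port; otherwise the port is ''.
    my ($host, $port) = $host_port =~ m/^([^:]+):(\d+)$/
        ? ($1, $2)
        : ($host_port, '');
    return ($scheme, $host, $port, $path);
}

my ($scheme, $host, $port, $path) =
    split_url('http://foo.bar.com:12345/uri_path_stuff_here');
print "$scheme | $host | $port | $path\n";
```

That prints "http | foo.bar.com | 12345 | /uri_path_stuff_here". A string with no "scheme://" in front, which is what the form is supposed to receive most of the time, gets an empty list back, so the caller would have to treat that case as "already a bare host".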
And then the validation part gets into either working backwards from the TLD (top level domain) or trying to figure out which part of the string is the host part. Try to remember that
nas.nasa.gov
IS the domain name, not a host and domain component....
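One cautious way around that, sketched below, is to not guess at host versus domain at all: leave dotted-quad addresses alone and strip only a literal leading "www." label. The name bare_domain is hypothetical, and this is not real validation. It does not check the TLD or whether the name resolves.

```perl
use strict;
use warnings;

# Hypothetical helper: given a host string, drop a leading "www." label,
# or return the host untouched if it looks like a dotted-quad address.
sub bare_domain {
    my ($host) = @_;
    $host = lc $host;
    # Four dot-separated runs of 1-3 digits: treat as an IPv4 address.
    return $host if $host =~ m/^\d{1,3}(?:\.\d{1,3}){3}$/;
    # Strip "www." only when at least two labels remain afterwards.
    $host =~ s/^www\.(?=[^.]+\.[^.]+)//;
    return $host;
}

print bare_domain($_), "\n"
    for qw(www.somedomain_name.com nas.nasa.gov 127.0.0.1 www.com);
```

With this, nas.nasa.gov and 127.0.0.1 come back unchanged, www.somedomain_name.com loses its "www.", and www.com is left alone because stripping it would leave a bare TLD.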
ciao drieux