On Monday, Jul 14, 2003, at 09:37 US/Pacific, Mike Blezien wrote:


Hello,

We have a fairly simple redirect script where a URL is entered, and even though there are directions not to enter the "http://www", sometimes we get it, or "http://"... What is the simplest method to extract just the domain name if "http://www.somedomain_name.com" or "http://somedomain_name.com" is entered, so that we are left with just "somedomain_name.com"?

The simplest of re's I can think of is


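        #------------------------
        # Strip a leading "http://www." or "http://" from $string.
        # Returns undef when $string does not start with "http://".
        sub simple_get_domain
        {
                my ($string) = @_;
                my $some_dom;

                # anchor at the start and escape the dot after "www"
                if ( $string =~ m!^http://www\.(.*)! ) {
                        $some_dom = $1;
                } elsif ( $string =~ m!^http://(.*)! ) {
                        $some_dom = $1;
                }
                return $some_dom;

        } # end of simple_get_domain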

but this is merely a string parser -
and is not really gonna make sure that the
$some_dom returned is kosher...
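
For what it's worth, the two branches can also be folded into a pair of substitutions. This is only a sketch: the name strip_scheme_and_www is made up for illustration, and accepting "https://" and mixed case is my assumption, not something the question asked for. Unlike simple_get_domain, it hands back the input unchanged when the user followed the directions and left the prefix off.

```perl
use strict;
use warnings;

# Hypothetical helper: strip a leading scheme and a leading "www."
# from whatever the user typed. Still only string parsing.
sub strip_scheme_and_www
{
    my ($string) = @_;

    $string =~ s!^\s+!!;           # drop leading whitespace
    $string =~ s!^https?://!!i;    # drop "http://" or "https://" if present
    $string =~ s!^www\.!!i;        # drop a leading "www." (escaped dot)

    return $string;
}

# All three forms from the question come out the same.
for my $input ('http://www.somedomain_name.com',
               'http://somedomain_name.com',
               'somedomain_name.com') {
    print strip_scheme_and_www($input), "\n";
}
```

It is still merely a string parser, so anything after the host (a port, a path) stays glued on.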

The squirrelly part, of course, is things like:

        http://foo.bar.com:12345/uri_path_stuff_here
or

        http://127.0.0.1/

I start that part with something like

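        sub parse_url
        {
                my ($me,$url) = @_;

                # scheme, "://", host[:port] up to the first "/", then the rest
                my($scheme, $host_port, $uri ) = ($url =~ m!^([^:]+)://([^/]+)(.*)$!);

                return ($scheme, $host_port, $uri);

        } # end of parse_url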

        sub get_host_port
        {
                my ($me,$host_port) = @_;

                # split "host:port"; the port is empty when none was given
                my ($host,$port) = ($host_port =~ m/^([^:]+):([^:]+)$/) ?
                        ($1,$2) : ($host_port, '');

                return ($host, $port);
        }
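
To see how the two pieces fit together on the squirrelly examples, here is a rough sketch using plain-function versions of the same regexes. The names split_url and split_host_port are invented for this example, dropping the $me argument is my simplification, and requiring the port to be all digits is my assumption.

```perl
use strict;
use warnings;

# Hypothetical plain-function variant of parse_url (no $me).
sub split_url
{
    my ($url) = @_;
    # scheme, "://", host[:port] up to the first "/", then the rest
    my ($scheme, $host_port, $uri) = ($url =~ m!^([^:/]+)://([^/]+)(.*)$!);
    return ($scheme, $host_port, $uri);
}

# Hypothetical plain-function variant of get_host_port.
# Assumption: a port, when present, is all digits.
sub split_host_port
{
    my ($host_port) = @_;
    return ($host_port =~ m/^([^:]+):(\d+)$/) ? ($1, $2) : ($host_port, '');
}

for my $url ('http://foo.bar.com:12345/uri_path_stuff_here',
             'http://127.0.0.1/') {
    my ($scheme, $host_port, $uri) = split_url($url);
    my ($host, $port) = split_host_port($host_port);
    print "scheme=$scheme host=$host port=$port uri=$uri\n";
}
```

Note that split_url returns undefs when the string has no "://" in it, so real input would want a check before the second call.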

And then the validation part gets into either working
backwards from the TLD (top-level domain) or
trying to figure out which part of the string is
the host part. Try to remember that

nas.nasa.gov

IS the domain name, not a host and domain component....
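
A minimal sketch of the "work backwards from the TLD" idea, under loud assumptions: classify_host is a made-up name, the check is purely syntactic (no DNS lookup, no list of real TLDs), and it only tells a dotted-quad IP apart from something shaped like a name. It makes no attempt to say which labels are "host" and which are "domain", since the string alone cannot tell you that.

```perl
use strict;
use warnings;

# Hypothetical classifier: returns 'ip', 'name', or 'invalid'.
sub classify_host
{
    my ($host) = @_;

    # Dotted quad: four numeric parts, each in 0..255.
    if ( $host =~ m/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/ ) {
        my @too_big = grep { $_ > 255 } $1, $2, $3, $4;
        return @too_big ? 'invalid' : 'ip';
    }

    # Work backwards from the TLD: need at least two labels,
    # and the last one must be alphabetic.
    my @labels = split /\./, $host, -1;
    return 'invalid' if @labels < 2 || $labels[-1] !~ m/^[a-z]{2,}$/i;

    # Each label: letters, digits, hyphens; no hyphen at either end.
    for my $label (@labels) {
        return 'invalid'
            unless $label =~ m/^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/i;
    }
    return 'name';
}

print "$_ => ", classify_host($_), "\n"
    for qw(nas.nasa.gov foo.bar.com 127.0.0.1 999.1.1.1);
```

One consequence of the strict label rule: a placeholder like somedomain_name.com comes back 'invalid', because underscores are not legal in hostnames.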


ciao drieux
