Try using URI to figure out the absolute URL.

    use URI;

    # the base is the *current absolute page*
    my $base_url = 'http://foo.com/documents/help.html';

    print URI->new_abs('doc1.html', $base_url), "\n";
    print URI->new_abs('./doc2.html', $base_url), "\n";
    print URI->new_abs('../documents/doc3.html', $base_url), "\n";
    print URI->new_abs('http://somewhere.com/', $base_url), "\n";

<<< SCRIPT OUTPUT >>>

    http://foo.com/documents/doc1.html
    http://foo.com/documents/doc2.html
    http://foo.com/documents/doc3.html
    http://somewhere.com/

Note that a URL that is already absolute passes through unchanged, as the last line of output shows.

So your regex *might* look like this (untested)...

    my $base = 'http://foo.com/documents/help.html';
    $html_code =~ s/href="(.*?)"/'href="' . URI->new_abs($1, $base) . '"'/seg;

Rob

-----Original Message-----
From: Dan Muey [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, June 04, 2003 3:26 PM
To: [EMAIL PROTECTED]
Subject: Modify links

I'm trying to work out a regex that will do this:

Take an entire page's html:

    my $html_code; # all lines in this one variable

And make any hrefs that are relative absolute by prepending $url to them:

    $url = "http://myclonesite.com";

make
    <a href="./documents/help.html">
into
    <a href="http://myclonesite.com/documents/help.html">

    $html_code =~ s/href\=\"\.?\/?(.*)\"/href\=\"$url\/"/ig;

The problem with this is that it prepends $url to absolute URLs also.

I need to say: put $url in front of relative URLs (make ./foo, /foo, or foo into $url/foo; ../foo would have to be treated differently, ignored for now) in href if href does not start with https?://

Any ideas?

TIA

Dan

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
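
As a follow-up to the substitution suggested in the reply, here is a slightly more defensive sketch of the same idea. It is not from the thread: the helper name absolutize_hrefs, the base URL, and the sample HTML are all made up for illustration. It assumes the URI module is installed and that every href value is quoted; it accepts single or double quotes and any letter case for the attribute name.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper (not from the thread): make every quoted href in
# $html absolute with respect to $base, the URL of the page it came from.
sub absolutize_hrefs {
    my ($html, $base) = @_;

    # $1 = the quote character, $2 = the URL; \1 requires the same closing quote.
    # /e evaluates the replacement as Perl code, /i ignores case, /s lets . match newlines.
    # Side effect: the attribute name is rewritten as lowercase "href".
    $html =~ s{\bhref\s*=\s*(["'])(.*?)\1}
              {'href=' . $1 . URI->new_abs($2, $base) . $1}gise;

    return $html;
}

my $base = 'http://myclonesite.com/index.html';    # assumed page URL
my $html = <<'HTML';
<a href="./documents/help.html">help</a>
<A HREF='/about.html'>about</A>
<a href="http://somewhere.com/">elsewhere</a>
HTML

print absolutize_hrefs($html, $base);
```

Because URI->new_abs returns an already absolute URL unchanged, no separate "does not start with https?://" test is needed. A regex can still trip over unquoted attributes or href-like text inside comments and scripts, so for arbitrary pages an HTML parser is likely the safer route.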