On Jan 9, 3:22 pm, Stef Mientki <[EMAIL PROTECTED]> wrote:
> hello,
>
> I'm trying to convert the links in HTML pages to absolute links.
> These pages can be either web pages or files on the local hard disk (WinXP).
> I've been struggling with this for a while, and this code works a little:
>
>     i = line.find ( 'href=' )
>     if i < 0 :
>         i = line.find ( ' src=' )
>     if i >= 0 :
>         ii = line.find ( '"', i+6 )
>         file = line [ i+6 : ii ]
>         #print urlparse.urljoin ( p, file )
>         if file.find ( 'http:' ) < 0 :
>             abspath = os.path.normpath ( os.path.join ( p, file ) )
>             line = line.replace ( file, abspath )
>         print line
>
> But it only covers files on the local disk and just one link per line,
> so I guess it's a lot of trouble to catch all cases.
> Isn't there a convenient function for this (preferably OS independent)?
> I googled for it, but couldn't find one.
>
> thanks,
> Stef Mientki

I googled a bit too. The Perl forums talk about using a regular
expression. You can probably take that and translate it into the
Python equivalent:

http://forums.devshed.com/perl-programming-6/how-to-parse-relatives-links-to-absolute-links-8173.html
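
For what it's worth, here is a rough sketch of what a Python translation of
the regular-expression idea could look like. It is not taken from the Perl
thread; the names LINK_RE and absolutize are made up for illustration, and it
assumes attribute values are always quoted. It uses urljoin, which lives in
urllib.parse on Python 3 (in Python 2 it is urlparse.urljoin, as in the
commented-out line of your code). Unlike the find() approach, it handles
several links per line.

```python
import re
from urllib.parse import urljoin  # Python 2: from urlparse import urljoin

# Hypothetical helper pattern: matches href="..." or src="..." with either
# quote style. This is a simplification; a real HTML parser is more robust.
LINK_RE = re.compile(r'''\b(href|src)\s*=\s*(["'])(.*?)\2''', re.IGNORECASE)


def absolutize(html, base):
    """Return html with every quoted href/src value resolved against base."""
    def fix(match):
        attr, quote, url = match.groups()
        # urljoin leaves already-absolute URLs untouched
        return '%s=%s%s%s' % (attr, quote, urljoin(base, url), quote)
    return LINK_RE.sub(fix, html)


if __name__ == '__main__':
    page = '<a href="a.html"><img src="../img/x.png">'
    print(absolutize(page, 'http://example.com/docs/index.html'))
```

For pages on the local disk, the same function should work if base is a
file:// URL for the page rather than an OS path, which also keeps the code
OS independent.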

I also found this, which appears to be an old c.l.py thread:

http://www.dbforums.com/archive/index.php/t-320359.html

You might have more luck if you google for "relative to absolute
links". I would also take a look at how Django or CherryPy build
their URLs.

Mike
-- 
http://mail.python.org/mailman/listinfo/python-list
