On Jan 9, 3:22 pm, Stef Mientki <[EMAIL PROTECTED]> wrote:
> hello,
>
> I'm trying to convert the links in html pages to absolute links,
> these pages can either be webpages or files on local harddisk (winXP).
> Now I've been struggling for a while, and this code works a little:
>
>     i = line.find ( 'href=' )
>     if i < 0 :
>       i = line.find ( ' src=' )
>     if i >= 0 :
>       ii = line.find ( '"', i+6 )
>       file = line [ i+6 : ii ]
>       #print urlparse.urljoin ( p, file )
>       if file.find ( 'http:' ) < 0 :
>         abspath = os.path.normpath ( os.path.join ( p, file ) )
>         line = line.replace ( file, abspath )
>       print line
>
> but it only covers files on local disk and just 1 link per line,
> so I guess it's a lot of trouble to catch all cases.
> Isn't there a convenient function for it (OS independent preferably)?
> Googled for it, but can't find it.
>
> thanks,
> Stef Mientki

I googled a bit too. The Perl forums talk about using a regular
expression. You can probably take that and translate it into the
Python equivalent:

http://forums.devshed.com/perl-programming-6/how-to-parse-relatives-links-to-absolute-links-8173.html

I also found this, which appears to be an old c.l.py thread:

http://www.dbforums.com/archive/index.php/t-320359.html

You might have more luck if you google for "relative to absolute
links". I would also take a look at how django or cherrypy creates
their URLs.

Mike
--
http://mail.python.org/mailman/listinfo/python-list
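
P.S. For what it's worth, here is a rough sketch of what the regular-expression route might look like in Python, combined with the urljoin call that is commented out in the quoted code. It is written for Python 3, where urlparse.urljoin lives at urllib.parse.urljoin. The names LINK_RE and absolutize_links and the example.com URLs are made up for illustration, and the pattern only handles quoted href/src values, so treat it as a starting point rather than a complete solution:

```python
import re
from urllib.parse import urljoin  # Python 2 spelling: urlparse.urljoin

# Hypothetical pattern: a quoted href=... or src=... attribute value.
# Group 1 is the attribute prefix, group 2 the quote character, group 3 the link.
LINK_RE = re.compile(r'''(\b(?:href|src)\s*=\s*)(["'])(.*?)\2''', re.IGNORECASE)


def absolutize_links(html, base_url):
    """Return html with every quoted href/src value joined against base_url."""
    def fix(match):
        prefix, quote, url = match.groups()
        # urljoin leaves links that are already absolute untouched.
        return prefix + quote + urljoin(base_url, url) + quote

    # sub() rewrites every match, so several links on one line are all handled.
    return LINK_RE.sub(fix, html)


page = ('<a href="docs/a.html">A</a> <img src="../img/b.png"> '
        '<a href="http://example.org/">C</a>')
print(absolutize_links(page, "http://example.com/site/index.html"))
```

Because the substitution runs over the whole string, it is not limited to one link per line. A regular expression can still be fooled by unquoted attributes, comments, or script blocks, so a real HTML parser is probably the safer choice if the pages are messy. For files on local disk, passing a file: URL as the base should in principle work the same way, but I have not tried that on winXP.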