"Steven D'Aprano" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]

> I could just do a string replace, but is there a "right" way to escape
> and unescape URLs?

The right way is to parse your HTML with an HTML parser. URLs are not
exempt from the normal HTML escaping rules, although there are an awful lot
of pages that get this wrong.

You didn't post any code, so it's hard to tell but maybe something like
ElementTree or lxml would be a better tool than the ones you are currently 
using. 


--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to