Steven D'Aprano <[EMAIL PROTECTED]> wrote:

> I didn't say it urlretrieve was escaping the URL. I actually think the
> URLs are pre-escaped when I scrape them from a HTML file. I have
> searched for, but been unable to find, standard library functions that
> escapes or unescapes URLs. Are there any such functions?
>

Whenever you put a URL into an HTML file you need to escape it, so
naturally you will also need to unescape it when it is retrieved from the
file. However, whatever you use to parse the HTML ought to be unescaping
text and attributes as part of the parsing process, so you shouldn't need
a separate function for this.
e.g.

>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup('''<a href="http://www.example.com/parrot.php?x=1&amp;y=2">link</a>''')
>>> soup.contents[0]['href']
u'http://www.example.com/parrot.php?x=1&y=2'
>>>

Even Python's builtin HTMLParser class will do this for you.

What parser are you using?

-- 
Duncan Booth http://kupuguy.blogspot.com
-- 
http://mail.python.org/mailman/listinfo/python-list
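
For reference, here is a rough sketch of the same idea using only the standard library, assuming a current Python 3 (where HTMLParser lives in html.parser and html.unescape is available). The class name HrefCollector is made up for illustration; the example URL is the one from the session above. It is not meant as the only way to do this, just a demonstration that the parser hands back attribute values with the HTML escaping already undone.

```python
from html import unescape
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect href values from <a> tags (hypothetical helper, for illustration)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser has already
        # replaced character references such as &amp; in each value.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


page = '<a href="http://www.example.com/parrot.php?x=1&amp;y=2">link</a>'

collector = HrefCollector()
collector.feed(page)
collector.close()
print(collector.hrefs[0])

# If the URL was pulled out some other way (say, with a regex) and is still
# HTML-escaped, html.unescape undoes the escaping directly.
raw = "http://www.example.com/parrot.php?x=1&amp;y=2"
print(unescape(raw))
```

Note that this only reverses HTML escaping (&amp; and friends). Percent-encoding in the URL itself is a separate layer, handled by urllib.parse.quote and urllib.parse.unquote.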