Re: URLs and ampersands

2008-08-06 Thread Duncan Booth
Matthew Woodcraft <[EMAIL PROTECTED]> wrote: > Gabriel Genellina wrote: >> Steven D'Aprano wrote: > >>> I have searched for, but been unable to find, standard library >>> functions that escape or unescape URLs. Are there any such >>> functions? > >> Yes: cgi.escape/unescape, and xml.sax.saxuti

Re: URLs and ampersands

2008-08-05 Thread Paul Rubin
Steven D'Aprano <[EMAIL PROTECTED]> writes: > I could just do a string replace, but is there a "right" way to escape > and unescape URLs? I've looked through the standard lib, but I can't find > anything helpful. xml.sax.saxutils.unescape() -- http://mail.python.org/mailman/listinfo/python-list
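
A minimal sketch of the suggestion above, assuming Python 3 and the standard-library module xml.sax.saxutils. The URL is the example one from the thread, written the way it would appear inside HTML source; the variable names are only illustrative.

```python
from xml.sax.saxutils import unescape

# URL as it appears inside HTML source, with the ampersand entity-escaped.
escaped_url = "http://www.example.com/parrot.php?x=1&amp;y=2"

# unescape() turns &amp;, &lt; and &gt; back into &, < and >.
plain_url = unescape(escaped_url)
print(plain_url)
```

Note that unescape() only handles those three entities unless extra ones are passed in through its optional entities argument.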

Re: URLs and ampersands

2008-08-05 Thread Steven D'Aprano
On Tue, 05 Aug 2008 12:07:39 +, Duncan Booth wrote: > Whenever you put a URL into an HTML file you need to escape it, so > naturally you will also need to unescape it when it is retrieved from > the file. However, whatever you use to parse the HTML ought to be > unescaping text and attributes

Re: URLs and ampersands

2008-08-05 Thread Matthew Woodcraft
Gabriel Genellina wrote: > Steven D'Aprano wrote: >> I have searched for, but been unable to find, standard library >> functions that escape or unescape URLs. Are there any such >> functions? > Yes: cgi.escape/unescape, and xml.sax.saxutils.escape/unescape. I don't see a cgi.unescape in the st
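
A rough sketch of the escape/unescape pairs discussed above, assuming Python 3. There, cgi.escape is no longer available (it was removed in Python 3.8), so this sketch swaps in html.escape and html.unescape alongside the xml.sax.saxutils pair; the variable names are hypothetical.

```python
import html
from xml.sax.saxutils import escape, unescape

url = "http://www.example.com/parrot.php?x=1&y=2"

# xml.sax.saxutils offers a symmetric pair for &, < and >.
sax_escaped = escape(url)
sax_round_trip = unescape(sax_escaped)

# html.escape / html.unescape are the Python 3 counterparts in the html module;
# html.unescape also understands named and numeric character references.
html_escaped = html.escape(url)
html_round_trip = html.unescape(html_escaped)

print(sax_escaped)
print(html_round_trip)
```

Either pair should round-trip a URL like this one; which to prefer depends on whether the input is XML or HTML.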

Re: URLs and ampersands

2008-08-05 Thread Matthew Woodcraft
Steven D'Aprano wrote: > I'm using urllib.urlretrieve() to download HTML pages, and I've hit a > snag with URLs containing ampersands: > > http://www.example.com/parrot.php?x=1&y=2 > > Somewhere in the process, URLs like the above are escaped to: > > http://www.example.com/parrot.php?x=1&amp;y=2 > > w

Re: URLs and ampersands

2008-08-05 Thread Gabriel Genellina
On Tue, 05 Aug 2008 06:59:20 -0300, Steven D'Aprano <[EMAIL PROTECTED]> wrote: > On Mon, 04 Aug 2008 23:16:46 -0300, Gabriel Genellina wrote: > >> On Mon, 04 Aug 2008 20:43:45 -0300, Steven D'Aprano >> <[EMAIL PROTECTED]> wrote: >> >>> I'm using urllib.urlretrieve() to download HTML pages,

Re: URLs and ampersands

2008-08-05 Thread Duncan Booth
Steven D'Aprano <[EMAIL PROTECTED]> wrote: > I didn't say urlretrieve was escaping the URL. I actually think the > URLs are pre-escaped when I scrape them from an HTML file. I have > searched for, but been unable to find, standard library functions that > escape or unescape URLs. Are there any

Re: URLs and ampersands

2008-08-05 Thread Richard Brodie
"Steven D'Aprano" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > I could just do a string replace, but is there a "right" way to escape > and unescape URLs? The right way is to parse your HTML with an HTML parser. URLs are not exempt from the normal HTML escaping rules, although

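One way the advice above (parse the HTML with an HTML parser) might look, as a sketch assuming Python 3 and the standard-library html.parser module. The LinkCollector class and the sample page string are hypothetical and only for illustration; the parser hands attribute values to handle_starttag() with character references already unescaped.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper that gathers href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; values arrive unescaped.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


# Sample HTML with the ampersand escaped, as it should be in a real page.
page = '<p><a href="http://www.example.com/parrot.php?x=1&amp;y=2">parrot</a></p>'

collector = LinkCollector()
collector.feed(page)
collector.close()
print(collector.links)
```

The collected URL can then be passed to the download call directly, with no manual string replacement.
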
Re: URLs and ampersands

2008-08-05 Thread Wojtek Walczak
On 05 Aug 2008 09:59:20 GMT, Steven D'Aprano wrote: > I didn't say urlretrieve was escaping the URL. I actually think the > URLs are pre-escaped when I scrape them from an HTML file. I have searched > for, but been unable to find, standard library functions that escape or > unescape

Re: URLs and ampersands

2008-08-05 Thread Steven D'Aprano
On Mon, 04 Aug 2008 23:16:46 -0300, Gabriel Genellina wrote: > On Mon, 04 Aug 2008 20:43:45 -0300, Steven D'Aprano > <[EMAIL PROTECTED]> wrote: > >> I'm using urllib.urlretrieve() to download HTML pages, and I've hit a >> snag with URLs containing ampersands: >> >> http://www.example.com/parro

Re: URLs and ampersands

2008-08-04 Thread Gabriel Genellina
On Mon, 04 Aug 2008 20:43:45 -0300, Steven D'Aprano <[EMAIL PROTECTED]> wrote: I'm using urllib.urlretrieve() to download HTML pages, and I've hit a snag with URLs containing ampersands: http://www.example.com/parrot.php?x=1&y=2 Somewhere in the process, URLs like the above are escaped to

URLs and ampersands

2008-08-04 Thread Steven D'Aprano
I'm using urllib.urlretrieve() to download HTML pages, and I've hit a snag with URLs containing ampersands: http://www.example.com/parrot.php?x=1&y=2 Somewhere in the process, URLs like the above are escaped to: http://www.example.com/parrot.php?x=1&amp;y=2 which naturally fails to exist. I could