"Kevin Grittner" <kevin.gritt...@wicourts.gov> writes:
> Tom Lane <t...@sss.pgh.pa.us> wrote:
>> ie the critical point seems to be that url_path is willing to soak
>> up a string containing "<" and ">", so the span tags don't get
>> recognized as separate lexemes.  While that's "obviously" the
>> wrong thing in this particular example, I'm not sure if it's the
>> wrong thing in general.  Can anyone comment on the frequency of
>> usage of those two symbols in URLs?

> http://www.ietf.org/rfc/rfc2396.txt section 2.4.3 "delims" expressly
> forbids their use in URIs.

> In spite of the above prohibition, I notice that firefox and wget
> both seem to *try* to use such characters if they're included.

Hmm, thanks for the reference, but I'm not sure this is specifying
quite what we want to get at.  In particular I note that it excludes
'%' on the grounds that that ought to be escaped, so I guess this is
specifying the characters allowed in an underlying URI, *not* the
textual representation of a URI.

Still, it seems like this is a sufficient defense against any
complaints we might get for not treating "<" or ">" as part of a URL.

I wonder whether we ought to reject any of the other characters listed
here too.  Right now, the InURLPath state seems to eat everything until
a space, quote, or double quote mark.  We could easily make it stop at
"<" or ">" too, but what else?

			regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
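
For illustration only, here is a rough Python sketch of the change being
discussed.  It is not the parser's actual code: scan_url_path,
CURRENT_STOPS, PROPOSED_STOPS and the sample string are all made-up names,
and the scanner is a simplified stand-in for the InURLPath state that just
consumes characters until it hits one from a stop set.

```python
# Hypothetical model of the InURLPath behaviour described above.
# This is NOT PostgreSQL source; every name here is invented for the example.

# Stop characters as described in the message: space, quote, double quote.
CURRENT_STOPS = frozenset(" '\"")

# Proposed change: additionally stop at "<" and ">".
PROPOSED_STOPS = CURRENT_STOPS | frozenset("<>")


def scan_url_path(text, start, stops):
    """Return the index one past the last character consumed as URL path."""
    end = start
    while end < len(text) and text[end] not in stops:
        end += 1
    return end


# Assumed sample input: a URL path immediately followed by a closing tag.
sample = "/path/page.html</span> more text"

# With the current stop set the tag is swallowed into the path.
print(sample[:scan_url_path(sample, 0, CURRENT_STOPS)])

# With "<" and ">" added, the path ends where the tag begins.
print(sample[:scan_url_path(sample, 0, PROPOSED_STOPS)])
```

Keeping the stop characters in a set means that trying out any of the other
characters from RFC 2396 section 2.4.3 is a one-line change to
PROPOSED_STOPS, which is the open question in the message above.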