Hi Nicolas, Hi Nick,

At Wed, 24 Jul 2013 13:09:05 +0200,
Nicolas Goaziou wrote:
>
> Hello,
>
> Nick Dokos <ndo...@gmail.com> writes:
>
> > Maybe the thing to do is to delete '=' from org-link-escape-chars and
> > see what problems arise.
>
> AFAICT, `url-encode-url' is subtler than that. It encodes characters
> whenever they are really forbidden, which is not the case of
> `org-link-escape'. Hence my initial question: do we need to reinvent the
> wheel?
>
> > But I did find that '%' was originally in org-link-escape-chars and
> > David Maus hardcoded it (commit 139cc1d4), so that it is *always*
> > escaped.
>
> I Cc David Maus in case he has time to enlighten us about his choice.
>

IIRC org-link-escape is not used to create URLs but to escape
characters in a link that would otherwise conflict with Org mode
syntax (e.g. square brackets). Org applies percent escaping to a link
before it is stored in the buffer and applies unescaping when it reads
the link back.

The percent sign is hardcoded because, if org-link-escape/unescape is
used in this way, we must make sure that the identity of a link is
preserved. If we did *not* escape the percent sign, an original link
that already contains percent-encoded characters would be read back
wrongly, i.e. with those characters decoded. This broke links.

E.g. consider a redirector link to the target URL
`http://target.example.org?id=33&format=html':

,----
| http://redirect.example.org?url=http%3A%2F%2Ftarget.example.org%3Fid%3D33%26format%3Dhtml
`----

If we don't escape the percent sign but apply unescaping when, say,
the user opens the link, we would get:

,----
| http://redirect.example.org?url=http://target.example.org?id=33&format=html
`----

And voila: the `format' parameter is turned into a query parameter of
redirect.example.org, not target.example.org.

RFC 3986 has the following to say about escaping:

,----
| Because the percent ("%") character serves as the indicator for
| percent-encoded octets, it must be percent-encoded as "%25" for that
| octet to be used as data within a URI. Implementations must not
| percent-encode or decode the same string more than once, as decoding
| an already decoded string might lead to misinterpreting a percent
| data octet as the beginning of a percent-encoding, or vice versa in
| the case of percent-encoding an already percent-encoded string.
`----

There is, of course, the nasty problem that we don't know whether a
link in a buffer went through org-link-escape or not. E.g.
if you paste

,----
| [[http://redirect.example.org?url=http%3A%2F%2Ftarget.example.org%3Fid%3D33%26format%3Dhtml]]
`----

into the buffer you'll get a broken link, because org-link-open
assumes the link was escaped by Org.

The bottom line: Org creates links programmatically (org-store-link)
and needs a mechanism to protect conflicting characters. It chose
percent-escaping, and in order to preserve the identity of a link Org
has to escape the escape character itself.

Hope that helps!

Best,
  -- David
--
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmj...@jabber.org
Email..... dm...@ictsoc.de
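
The round-trip argument in this message can be checked with a small
sketch. It is written in Python rather than Emacs Lisp, and the two
helpers `escape_for_storage' and `unescape_on_read' are hypothetical
stand-ins for org-link-escape and org-link-unescape, not Org's actual
code. They only assume what the message describes: escape on store,
one decoding pass on read.

```python
from urllib.parse import quote, unquote

TARGET = "http://target.example.org?id=33&format=html"
# The redirector link from the message: TARGET is percent-encoded once
# so that it can travel inside the `url' query parameter.
ORIGINAL = "http://redirect.example.org?url=" + quote(TARGET, safe="")


def escape_for_storage(link, escape_percent=True):
    """Hypothetical stand-in for org-link-escape (not Org's real code).

    Percent-escapes square brackets, and '%' itself unless told not to.
    """
    to_escape = "[]%" if escape_percent else "[]"
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in to_escape else ch for ch in link
    )


def unescape_on_read(link):
    """Hypothetical stand-in for org-link-unescape: one decoding pass."""
    return unquote(link)


# With '%' escaped, storing and reading back preserves the link.
round_trip = unescape_on_read(escape_for_storage(ORIGINAL))

# Without it, the inner encoding is decoded too and the link changes.
broken = unescape_on_read(escape_for_storage(ORIGINAL, escape_percent=False))

print(round_trip == ORIGINAL)
print(broken)
```

The pasted-link case behaves like `broken': a link that never went
through the escaping step is still decoded once on read, so its inner
percent-encoding is lost in the same way.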