David Maus <dm...@ictsoc.de> writes:
> Sébastien Vauban wrote:
>>Hello,
>
>>With current git pull, and such an Org file (in UTF-8 encoding):
>
>> ...
>
>>I get the following error when trying to export it via PDFLaTeX:
>
> The problem is that the 'É' character is not in Org's default list
> for link escapes, but `string-match' matches for the lower case
> character. Adding more chars to `org-link-escape-chars' would solve
> the problem, but this seems to be a broader issue:
>
> Regular links (URIs) are restricted to a special set of ASCII
> characters and non-ASCII chars are hex-encoded. Currently Org escapes
> links to Org mode headlines using the table mentioned above. But Org
> files and hence Org headlines might be Unicode, containing multibyte
> characters that cannot be hex-escaped in the normal fashion.
>
> Maybe something like this would be a solution:
>
>  - Org only escapes square brackets when escaping a link to an Org
>    mode headline
>  - `org-link-escape' uses a shotgun-approach: Every char that is not
>    allowed according to the specs (Cf. RFC3986) is percent encoded if
>    the link sequence does not contain multibyte chars; If the sequence
>    does contain multibyte chars, `org-link-escape' produces an IRI
>    (Cf. RFC3987).

Is there a reason for this distinction between multibyte and unibyte?
If not, I favour the "shotgun-approach". It's bullet-proof.

The JavaScript function `encodeURIComponent()' encodes the German
umlaut `ü' as `%C3%BC', i.e. as its UTF-8 byte sequence, regardless of
the source's encoding. That's why I wrote the two functions
`org-protocol-unhex-string' and `org-protocol-unhex-compound' (see
org-protocol.el).

I'll have to take a look at that RFC you mentioned :)

Best wishes

  Sebastian
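
For illustration only, here is a minimal JavaScript sketch of the
"shotgun-approach" discussed above, assuming UTF-8 percent-encoding as
done by `encodeURIComponent()'. The names `escapeLinkShotgun' and
`unescapeLink' are hypothetical and are not part of Org or
org-protocol.el. The sketch shows that multibyte characters such as
`É' or `ü' can be escaped byte-wise without a separate code path.

```javascript
// Hypothetical helpers for illustration; not Org's actual implementation.

// Percent-encode everything outside the RFC 3986 "unreserved" set
// (A-Z a-z 0-9 - . _ ~). Non-ASCII characters are encoded as the
// percent-escaped bytes of their UTF-8 representation.
function escapeLinkShotgun(text) {
  // encodeURIComponent() leaves ! ' ( ) * untouched, so escape those too.
  return encodeURIComponent(text).replace(/[!'()*]/g, (c) =>
    "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

// Reverse the escaping: percent-escaped UTF-8 bytes back to characters.
function unescapeLink(escaped) {
  return decodeURIComponent(escaped);
}

console.log(escapeLinkShotgun("ü"));        // %C3%BC
console.log(escapeLinkShotgun("É"));        // %C3%89
console.log(escapeLinkShotgun("[Émile]"));  // %5B%C3%89mile%5D
console.log(unescapeLink("%C3%89mile"));    // Émile
```

Because every escaped byte is plain ASCII, the result round-trips
through `unescapeLink' no matter which encoding the surrounding file
uses, which is the property that makes this approach attractive.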