Embedding a literal "\u" in a unicode raw string.
Hi,

while writing some LaTeX preprocessing code, I stumbled into this problem (I have a -*- coding: utf-8 -*- line, obviously):

    s = ur"añado $\uparrow$"

This gave an error because the \u escape is interpreted in raw unicode strings, too. So I found that the only way to solve this is to write:

    s = unicode(r"añado $\uparrow$", "utf-8")

or

    s = ur"añado $\u005cuparrow$"

The second one is too ugly to live, while the first is at least acceptable; but looking around the Python 3.0 docs, I saw that the first one will fail, too.

Am I doing something wrong here, or is there another solution for this?

Romano

--
http://mail.python.org/mailman/listinfo/python-list
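A small sketch of the rule described above, written for Python 3 so that it runs on a current interpreter. The ur"..." prefix does not parse there, so the example instead pushes the bytes through the raw_unicode_escape codec, which as far as I know applies the same rule as a Python 2 raw unicode literal: only \uXXXX and \UXXXXXXXX are processed, every other backslash is left alone. The helper name decode_like_ur is made up for illustration, and only the ASCII LaTeX fragment is used because that codec treats other bytes as Latin-1.

```python
# Hypothetical helper: decode bytes with the raw_unicode_escape codec, which
# processes \uXXXX and \UXXXXXXXX escapes and leaves other backslashes alone.
def decode_like_ur(source_bytes):
    return source_bytes.decode("raw_unicode_escape")

# "\u" followed by non-hex characters is a malformed escape, so decoding fails.
try:
    decode_like_ur(br"$\uparrow$")
    failed = False
except UnicodeDecodeError:
    failed = True

# Spelling the backslash itself as \u005c gets through.
escaped = decode_like_ur(br"$\u005cuparrow$")

# A plain (non-unicode) raw literal keeps the backslash untouched.
plain = r"$\uparrow$"
```

Both workarounds end up with the same text: one backslash followed by "uparrow".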
Re: Embedding a literal "\u" in a unicode raw string.
Thinker writes:
> > s = ur"añado $\uparrow$"
> >
> > Which gave an error because the \u escape is interpreted in raw
> > unicode strings, too. So I found that the only way to solve this is
> > to write:
> >
> > s = unicode(r"añado $\uparrow$", "utf-8")
> >
> > or
> >
> > s = ur"añado $\u005cuparrow$"
> >
> The backslash '\' is a meta-char that escapes the string. You can
> escape the char as following string
> u"\\u"
> insert another '\' before it.

(Answering this and the other off-thread answer by Diez.)

Well, I have simplified too much. The problem is that, when writing LaTeX snippets, a lot of backslashes are involved. So the un-raw string is difficult to read because of all those doubled \\, and the raw string is just handy. Moreover, that way I can copy and paste LaTeX code between ur""" """ marks.

Searching more, I even found a thread on python-dev where Guido himself seemed convinced that this "\u" interpretation in raw strings is at least a bit disappointing:

http://mail.python.org/pipermail/python-dev/2007-May/073042.html

but I have seen later that it will still be there in 3.0. That means that all my unicode(r"\uparrow", "utf-8") will break... sigh.

Thanks anyway,
Romano
Re: Embedding a literal "\u" in a unicode raw string.
> unicode(r"\uparrow", "utf-8") will break... sigh.

Moreover, I checked with 2to3.py, and it says (similar case):

-ok_preamble = unicode(r"""
+ok_preamble = str(r"""
 \usepackage[utf8]{inputenc}
 \begin{document}
 Añadidos:
 """, "utf-8")

which AFAIK will give an error for the \u in \usepackage. Hmmm... should I dare ask on the developer list? :-)
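For reference, a sketch of how the converted snippet behaves on Python 3 as eventually released, which may differ from the draft documentation being read in this thread. There, raw literals do not process \u at all (What's New in Python 3.0 describes r'\u20ac' as a string of six characters), so the raw literal itself is accepted. What raises is the str(..., "utf-8") wrapper that 2to3 produced, because its argument is already text. The name ok_preamble follows the post; wrapper_ok is just an illustrative flag.

```python
# Python 3: \u in a raw literal is just a backslash followed by "u".
ok_preamble = r"""
\usepackage[utf8]{inputenc}
\begin{document}
Añadidos:
"""

# The 2to3 output wraps the literal in str(..., "utf-8"), which tries to
# decode something that is already a str and raises TypeError.
try:
    str(ok_preamble, "utf-8")
    wrapper_ok = True
except TypeError:
    wrapper_ok = False
```

Under that assumption, dropping the wrapper and keeping the bare r"""...""" literal would be enough after conversion.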
Re: Embedding a literal "\u" in a unicode raw string.
On Feb 25, 6:03 pm, "OKB (not okblacke)" <[EMAIL PROTECTED]> wrote:
> I too encountered this problem, in the same situation (making
> strings that contain LaTeX commands). One possibility is to separate
> out just the bit that has the \u, and use string juxtaposition to attach
> it to the others:
>
> s = ur"añado " u"$\\uparrow$"
>
> It's not ideal, but I think it's easier to read than your solution
> #2.

Yes, I think I will do something like that, although... I really do not understand why \x5c is not interpreted in a raw string while \u005c is interpreted in a unicode raw string. It is, well, not elegant. Raw should be raw...

Thanks anyway
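The juxtaposition trick from the quoted message, sketched in Python 3 spelling so it runs as is (under Python 2 the prefixes would be ur and u, as quoted). Adjacent string literals are joined at compile time and each one may carry its own prefix, so only the fragment containing \u needs a doubled backslash. The last line illustrates the other half of the complaint: in a raw literal \x5c is not an escape and stays as four characters.

```python
# Raw part first, then a non-raw part for the fragment containing "\u".
s = r"añado " "$\\uparrow$"

# In a raw literal \x5c is not an escape: it stays as four characters.
raw_hex = r"\x5c"
```

The price is that the LaTeX can no longer be pasted in one piece; every command starting with \u has to be split out by hand.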