Embedding a literal "\u" in a unicode raw string.

2008-02-25 Thread Romano Giannetti
Hi, 

while writing some LaTeX preprocessing code, I stumbled into this problem: (I
have a -*- coding: utf-8 -*- line, obviously) 

s = ur"añado $\uparrow$" 

Which gave an error because the \u escape is interpreted in raw unicode strings,
too. So I found that the only way to solve this is to write: 

s = unicode(r"añado $\uparrow$", "utf-8")

or 

s = ur"añado $\u005cuparrow$"

The second one is too ugly to live, while the first is at least acceptable; but
looking around the Python 3.0 doc, I saw that the first one will fail, too. 
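
For reference, the second workaround relies on \u005c being the code point of the backslash. Below is a minimal sketch in Python 3 syntax. As an assumption, a plain literal stands in for the Python 2 ur"..." form, which Python 3 does not accept; the variable names are only illustrative.

```python
# Python 3 sketch: a plain (non-raw) literal stands in for the Python 2 ur"..." form.
# \u005c is the code point of the backslash, so the escape yields a literal "\".
via_escape = "añado $\u005cuparrow$"

# The same text written with a doubled backslash instead of the escape.
via_doubling = "añado $\\uparrow$"

# Both spellings produce the text: añado $\uparrow$
same_text = via_escape == via_doubling
```

Either spelling gives the intended LaTeX, which is why the escape works even though it is unpleasant to read.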

Am I doing something wrong here, or is there another solution for this? 

Romano 



-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Embedding a literal "\u" in a unicode raw string.

2008-02-25 Thread Romano Giannetti
Thinker writes:

> 
> 
> > s = ur"añado $\uparrow$"
> >
> > Which gave an error because the \u escape is interpreted in raw
> > unicode strings, too. So I found that the only way to solve this is
> >  to write:
> >
> > s = unicode(r"añado $\uparrow$", "utf-8")
> >
> > or
> >
> > s = ur"añado $\u005cuparrow$"
> >
> >
> The backslash '\' is a meta-character that starts an escape sequence.  You
> can escape it as in the following string:
> u"\\u"
> that is, insert another '\' before it.
> 

(Answering this and the other, off-thread, answer by Diez)

Well, I have simplified too much. The problem is that writing LaTeX snippets
involves a lot of backslashes. The non-raw string is difficult to read because
of all those doubled \\, and the raw string is just handy. Moreover, that way I
can copy and paste LaTeX code between ur"""   """ marks.
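
To make the readability point concrete, here is a small sketch comparing the two spellings of a LaTeX fragment. The fragment and the variable names are made up for illustration, and it deliberately contains no \u so that it behaves the same with or without the raw-string quirk discussed here.

```python
# Raw literal: every backslash is kept, so LaTeX can be pasted unchanged.
raw_snippet = r"\begin{equation} \alpha + \beta \end{equation}"

# Non-raw literal: each backslash must be doubled to get the same text.
doubled_snippet = "\\begin{equation} \\alpha + \\beta \\end{equation}"

# Forgetting to double is silent: "\b" is a backspace and "\a" a bell character.
mangled_snippet = "\begin{equation} \alpha"
```

The last line shows the other reason raw strings are attractive for LaTeX: an undoubled backslash in a normal literal does not raise an error, it just produces the wrong characters.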

Searching more, I even found a thread in python-dev where Guido himself seemed
convinced that this "\u" interpretation in raw strings is at least a bit
disappointing:

http://mail.python.org/pipermail/python-dev/2007-May/073042.html

but I have seen later that it will still be there in 3.0. That means that all my
unicode(r"\uparrow", "utf-8") calls will break... sigh.

Thanks anyway,

Romano 



Re: Embedding a literal "\u" in a unicode raw string.

2008-02-25 Thread romano . giannetti

> unicode(r"\uparrow", "utf-8") will break... sigh.
>

Moreover, I checked with 2to3.py, and it says (similar case):

-ok_preamble = unicode(r"""
+ok_preamble = str(r"""
 \usepackage[utf8]{inputenc}
 \begin{document}
 Añadidos:
 """, "utf-8")

which AFAIK will give an error for the \u in \usepackage. Hmmm...
should I dare ask on the developer list? :-)
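
For what it is worth, here is a sketch of how the converted call can be checked on a released Python 3 interpreter. This is an assumption, since the thread predates the 3.0 release. The What's New notes for 3.0 state that backslashes in raw literals are taken literally, so the \u in \usepackage is not the part that fails there; the str(..., "utf-8") call is, because str() refuses to decode something that is already text. The names preamble and decode_error are illustrative only.

```python
# Python 3 sketch: the raw literal itself is accepted, \u included.
preamble = r"""
\usepackage[utf8]{inputenc}
\begin{document}
Añadidos:
"""

# What 2to3 produced was str(<raw literal>, "utf-8"). Passing an encoding
# together with a str argument raises TypeError, so capture it here.
try:
    str(preamble, "utf-8")
except TypeError as exc:
    decode_error = exc
else:
    decode_error = None
```

Under that assumption, dropping the str(..., "utf-8") wrapper and keeping the bare raw literal would be the Python 3 spelling.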



Re: Embedding a literal "\u" in a unicode raw string.

2008-02-25 Thread romano . giannetti
On Feb 25, 6:03 pm, "OKB (not okblacke)"
<[EMAIL PROTECTED]> wrote:
>
> I too encountered this problem, in the same situation (making
> strings that contain LaTeX commands).  One possibility is to separate
> out just the bit that has the \u, and use string juxtaposition to attach
> it to the others:
>
> s = ur"añado " u"$\\uparrow$"
>
> It's not ideal, but I think it's easier to read than your solution
> #2.
>

Yes, I think I will do something like that, although... I really do
not understand why \x5c is not interpreted in a raw string but \u005c
is interpreted in a unicode raw string. It is, well, not elegant. Raw
should be raw...
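
A short sketch of both points, written in Python 3 syntax (an assumption: the u and ur prefixes from the quoted suggestion are dropped, since Python 3 strings are already Unicode and the ur prefix is not accepted there). Variable names are illustrative.

```python
# String juxtaposition: a raw piece followed by a normal piece whose
# backslash is doubled. Adjacent literals are joined at compile time.
s = r"añado " "$\\uparrow$"

# \x5c inside a raw literal stays as four separate characters...
raw_hex = r"\x5c"

# ...while in a normal literal it is the single backslash character.
cooked_hex = "\x5c"
```

Here s holds the text añado $\uparrow$, raw_hex has length 4, and cooked_hex is a single backslash.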

Thanks anyway
