On 10/11/13 4:16 AM, Stephen Tucker wrote:
I am using IDLE, Python 2.7.2 on Windows 7, 64-bit.

I have four questions:

1. Why is it that
     print unicode_object
displays non-ASCII characters in the unicode object correctly, whereas
     print (unicode_object, another_unicode_object)
displays non-ASCII characters in the unicode objects as escape sequences (as repr() does)?

2. Given that this is actually /deliberately/ the case (which I, at the moment, am finding difficult to accept), what is the neatest (that is, the most Pythonic) way to get non-ASCII characters in unicode objects in tuples displayed correctly?

3. A similar thing happens when I write such objects and tuples to a file opened by
     codecs.open ( ..., "utf-8")
I have also found that, even though I use write to send the text to the file, unicode objects not in tuples get their non-ASCII characters sent to the file correctly, whereas, unicode objects in tuples get their characters sent to the file as escape sequences. Why is this the case?

4. As for question 1 above, I ask here also: What is the neatest way to get round this?

Stephen Tucker.


Although Python 3 is better than Python 2 at Unicode, as the others have said, the most important point is one that you hit upon yourself.

When you print an object x, you are actually printing str(x). The str() of a tuple is an opening paren, followed by the repr()s of its elements, separated by commas, then a closing paren. Tuples and lists use the repr() of their elements when producing either their own str() or their own repr().
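To see the effect without any Unicode involved, here is a small sketch. The Demo class is made up purely for illustration; it just returns different text from __str__ and __repr__ so you can tell which one a container calls. It should behave the same on Python 2 and Python 3.

```python
class Demo(object):
    """Hypothetical class whose str() and repr() are easy to tell apart."""

    def __str__(self):
        return "str-form"

    def __repr__(self):
        return "repr-form"


d = Demo()

single = str(d)         # str() of the object itself uses __str__
in_tuple = str((d, d))  # str() of a tuple uses repr() of each element
in_list = str([d])      # lists do the same

print(single)
print(in_tuple)
print(in_list)
```

The bare object prints as str-form, but inside the tuple and the list it shows up as repr-form.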

Python 3 does better at this because repr() in Python 3 will gladly include non-ASCII characters in its output, while Python 2 will only include ASCII characters, and so must resort to escape sequences. (BTW: if you like the ASCII-only idea from Python 2, Python 3 has the ascii() function and the %a string formatting directive for that very purpose.)
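A quick Python 3 illustration of the difference (this one will not run on Python 2, which has no ascii() builtin; the sample string is just an example of mine):

```python
# Python 3 only: compare repr() with the ASCII-only alternatives.
s = "caf\u00e9"  # 'cafe' with an acute accent on the e

as_repr = repr(s)    # keeps the non-ASCII character as-is
as_ascii = ascii(s)  # escapes it, much as Python 2's repr() would
as_percent_a = "%a" % s  # %a formats using ascii()

print(as_repr)
print(as_ascii)
print(as_percent_a)
```

Whether the first line displays correctly still depends on your console's encoding, but the last two lines are plain ASCII everywhere.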

The two string representation alternatives str() and repr() can be confusing. Think of it as: str() is for customers, repr() is for developers, or: str() is for humans, repr() is for geeks. The reason tuples use the repr() of their elements is that the parens+commas representation of a tuple is geeky to begin with, so it uses repr() of its elements, even for str(tuple).

The way to avoid repr() for the elements is to format the tuple yourself.
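One possible way to do that is sketched below. format_tuple is a name I made up, not anything in the standard library, and the exact layout (parens, comma plus space) is just a guess at what you want. It is written with u"" literals and u"%s" so that it should work on Python 2.7 as well as Python 3.3+.

```python
# -*- coding: utf-8 -*-

def format_tuple(items):
    """Hypothetical helper: join the text form of each item, avoiding repr()."""
    # u"%s" % (item,) yields the item's text form as unicode on Python 2
    # (and as str on Python 3), so non-ASCII characters are not escaped.
    parts = [u"%s" % (item,) for item in items]
    return u"(" + u", ".join(parts) + u")"


names = (u"caf\u00e9", u"na\u00efve")
formatted = format_tuple(names)
```

You can then print formatted, or pass it to the write() method of a file opened with codecs.open(..., "utf-8"), and the characters should come out unescaped, since what you are printing or writing is now a single unicode string rather than a tuple.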

--Ned.
