[issue20686] Confusing statement about unicode strings in tutorial introduction

Daniel U. Thibault Thu, 20 Mar 2014 13:01:17 -0700

Daniel U. Thibault added the comment:

>>> mystring="äöü"
>>> myustring=u"äöü"


>>> mystring
'\xc3\xa4\xc3\xb6\xc3\xbc'
>>> myustring
u'\xe4\xf6\xfc'

>>> str(mystring)
'\xc3\xa4\xc3\xb6\xc3\xbc'
>>> str(myustring)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

>>> f = open('workfile', 'w')
>>> f.write(mystring)
>>> f.close()
>>> f = open('workufile', 'w')
>>> f.write(myustring)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
>>> f.close()

workfile contains C3 A4 C3 B6 C3 BC
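
For reference, those six bytes are what the UTF-8 encoding of the three characters looks like. The sketch below assumes the terminal was sending UTF-8 (the transcript does not say so). It decodes the bytes and rebuilds the hex dump. The names `raw`, `text` and `hex_dump` are illustrative only.

```python
# The bytes found in workfile, written out as a byte string.
raw = b"\xc3\xa4\xc3\xb6\xc3\xbc"

# Decoding as UTF-8 (assumed terminal encoding) yields three code points.
text = raw.decode("utf-8")

# bytearray iterates as integers on both Python 2 and Python 3.
hex_dump = " ".join("%02X" % b for b in bytearray(raw))
```

So the byte string holds six bytes, while the decoded Unicode string holds three characters.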

So the Unicode string (myustring) does indeed get converted to ASCII when 
written to a file, but not when just printed.

It seems really strange that non-Unicode strings (mystring) should actually be 
more flexible than Unicode strings...
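
One possible way around the implicit ASCII conversion is to choose the encoding explicitly, either by calling encode() on the string or by opening the file through io.open with an encoding argument. This is only a sketch, not a statement of what the tutorial should recommend. The temporary directory and the choice of UTF-8 are assumptions made for illustration.

```python
import io
import os
import tempfile

# Same three characters as u"äöü", spelled with escapes so that the
# source file encoding does not matter.
myustring = u"\xe4\xf6\xfc"

# Option 1: encode explicitly instead of relying on the ASCII default.
encoded = myustring.encode("utf-8")

# Hypothetical location for the file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "workufile")

# Option 2: io.open in text mode accepts Unicode strings and encodes
# them with the given encoding on write.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(myustring)

# Read the file back in binary mode to inspect the bytes on disk.
with io.open(path, "rb") as f:
    raw = f.read()
```

Written this way, workufile ends up holding the same bytes as workfile (C3 A4 C3 B6 C3 BC).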

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20686>
_______________________________________