Marc-Andre Lemburg <m...@egenix.com> added the comment:

Marc-Andre Lemburg wrote:
>
> Marc-Andre Lemburg <m...@egenix.com> added the comment:
>
> STINNER Victor wrote:
>>
>> STINNER Victor <victor.stin...@haypocalc.com> added the comment:
>>
>> I created a TAR archive with the 7-zip archiver of files with diacritics in
>> their name (e.g. "é" and "à"). Then I opened the archive with WinRAR: the
>> file names were not displayed correctly :-/
>>
>> 7-zip encodes "à" (U+00e0) as 0x85 (1 byte), and "é" (U+00e9) as 0x82 (1
>> byte). I don't know this encoding.
>
> That's an old DOS code page used in Europe: CP850
>
> http://en.wikipedia.org/wiki/Code_page_850
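
The byte values quoted above can be checked against Python's codecs. This is only a minimal sketch in Python 3 syntax, so it is not something the 2.x interpreters in this thread would run as written, and `encode_name` is a hypothetical helper made up for illustration, not anything from tarfile or 7-zip:

```python
# Sketch only: compare how "àé" is encoded in CP850 and CP1252.
# encode_name is a hypothetical helper, not part of any library.

def encode_name(name, codec):
    """Return the bytes of a file name in the given codec."""
    return name.encode(codec)

name = "\u00e0\u00e9"  # "àé"

cp850_bytes = encode_name(name, "cp850")
cp1252_bytes = encode_name(name, "cp1252")

print(cp850_bytes)   # b'\x85\x82'
print(cp1252_bytes)  # b'\xe0\xe9'

# Reading the CP850 bytes as CP1252 gives unrelated punctuation characters,
# which is one way a file name can end up displayed incorrectly.
misread = cp850_bytes.decode("cp1252")
print(ascii(misread))  # '\u2026\u201a'
```

Decoding the same two CP850 bytes as Latin-1 instead gives U+0085 and U+0082, so a consumer that assumes the wrong code page will show something other than "àé" either way.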
Looks like cmd.exe on WinXP still uses CP850. At least on my German WinXP
it does for Python 2.3 and older. Starting with Python 2.4, the behavior
changed to use CP1252 instead:

D:\Python26>python
Python 2.6 (r26:66721, Oct  2 2008, 11:35:03) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> u'àé'
u'\xe0\xe9'

D:\Python25>python
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> u'áé'
u'\xe1\xe9'

D:\Python24>python
Python 2.4 (#60, Nov 30 2004, 11:49:19) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> u'àé'
u'\xe0\xe9'

D:\Python23>python
Python 2.3.4 (#53, May 25 2004, 21:17:02) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> u'àé'
u'\x85\x82'
>>>

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8784>
_______________________________________