Bugs item #1634774, was opened at 2007-01-13 18:30
Message generated for change (Comment added) made by dobrokot
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1634774&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Ivan Dobrokotov (dobrokot)
Assigned to: Nobody/Anonymous (nobody)
Summary: locale 1251 does not convert to upper case properly

Initial Comment:
<pre>
# -*- coding: 1251 -*-

import locale

locale.setlocale(locale.LC_ALL, ".1251") #locale name may be Windows specific?

#-----------------------------------------------

print chr(184), chr(168)

assert chr(255).upper() == chr(223) #OK
assert chr(184).upper() == chr(168) #fail

#-----------------------------------------------

assert 'q'.upper() == 'Q' #OK
assert 'ж'.upper() == 'Ж' #OK
assert 'Ж'.upper() == 'Ж' #OK
assert 'я'.upper() == 'Я' #OK
assert u'ё'.upper() == u'Ё' #OK (locale independent)
assert 'ё'.upper() == 'Ё' #fail
</pre>

I suspect an incorrect implementation of uppercasing, something like

<pre>
if ('a' <= c && c <= 'я')
    return c+'Я'-'я'
</pre>

The symbol 'ё' (184 in cp1251) is not in the range 'a'-'я'.

----------------------------------------------------------------------

>Comment By: Ivan Dobrokotov (dobrokot)
Date: 2007-01-18 22:18

Message:
Logged In: YES
user_id=1538986
Originator: YES

Well, in C:

----------------------------
#include <locale.h>
#include <stdio.h>
#include <assert.h>
#include <ctype.h>   /* toupper, _toupper */

int main()
{
    int i = 184;
    char *old = setlocale(LC_CTYPE, ".1251");
    assert(old);

    printf("%d -> %d\n", i, _toupper(i));
    printf("%d -> %d\n", i, toupper(i));
}
----------------------------

C output:
184 -> 152
184 -> 168

So _toupper and toupper are different functions. MSDN does not mention any
difference between them, except that 'toupper' is "ANSI compatible" :(

File Added: toupper.zip

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2007-01-18 21:08

Message:
Logged In: YES
user_id=21627
Originator: NO

You can see the implementation of .upper in

http://svn.python.org/projects/python/tags/r25/Objects/stringobject.c

(function string_upper)

Off-hand, I cannot see anything wrong in that code. It definitely does
*not* use c+'Я'-'я'.

----------------------------------------------------------------------

Comment By: Ivan Dobrokotov (dobrokot)
Date: 2007-01-13 22:08

Message:
Logged In: YES
user_id=1538986
Originator: YES

I forgot to mention the Python version used:
http://www.python.org/ftp/python/2.5/python-2.5.msi

----------------------------------------------------------------------

Comment By: Ivan Dobrokotov (dobrokot)
Date: 2007-01-13 18:51

Message:
Logged In: YES
user_id=1538986
Originator: YES

Sorry, I meant
toupper((int)(unsigned char)'ё')
not just toupper('ё').

----------------------------------------------------------------------

Comment By: Ivan Dobrokotov (dobrokot)
Date: 2007-01-13 18:49

Message:
Logged In: YES
user_id=1538986
Originator: YES

The C CRT library function toupper('ё') works properly if I first call
setlocale(LC_ALL, ".1251").

----------------------------------------------------------------------
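
The byte values quoted in this thread can be checked against the cp1251 code table. The sketch below is not part of the original report. It uses Python 3 and the standard cp1251 codec instead of the locale-dependent str.upper of Python 2.5, and the helper names upper_cp1251_byte and naive_upper are hypothetical. It also assumes, as a guess consistent with the reported "184 -> 152" output, that _toupper shifts the value by 32 without consulting the locale.

```python
# Illustrative sketch (Python 3), not code from the report.
# Both function names are hypothetical helpers introduced for this example.

def upper_cp1251_byte(code: int) -> int:
    """Uppercase one cp1251 byte value by round-tripping through Unicode."""
    char = bytes([code]).decode("cp1251")
    return char.upper().encode("cp1251")[0]


def naive_upper(code: int) -> int:
    """Assumed locale-unaware behaviour: subtract 'a' - 'A' (32) blindly."""
    return code - (ord("a") - ord("A"))


if __name__ == "__main__":
    for code in (255, 184):
        print(code, "->", upper_cp1251_byte(code), "(code table),",
              naive_upper(code), "(naive shift)")
```

Under these assumptions, 255 maps to 223 either way, because 'я' and 'Я' sit exactly 32 positions apart in cp1251. For 184 the code table gives 168, while the naive shift gives 152, so a plain offset only works for letters inside the contiguous alphabet range and misses 'ё'.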