Bugs item #1699853, was opened at 2007-04-13 12:26
Message generated for change (Comment added) made by ber
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1699853&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Python Library
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Bernhard Reiter (ber)
Assigned to: Nobody/Anonymous (nobody)
Summary: locale.getlocale() output fails as setlocale() input

Initial Comment:
This problem report about the locale module consists of three closely
related parts (this is why I have decided to put it in one report).
a) the example in the docs is wrong / misleading
b) under some locale settings Python has a defect
c) a test case for the locale module, showing b) but useful as a general
   start for a test module.

Details:
a)
Section example:
The line
>>> loc = locale.getlocale(locale.LC_ALL) # get current locale
contradicts that getlocale should not be called with LC_ALL,
as stated in the description of getlocale.
Suggestion is to change the example to be more useful,
as getting the locale as the first action is not really useful.
It should be "C" anyway, which will lead to (None, None),
so the value is already known. It would make more sense to
first set the default locale to the user preferences:

import locale
locale.setlocale(locale.LC_ALL,'')
loc = locale.getlocale(locale.LC_NUMERIC)
locale.setlocale(locale.LC_NUMERIC,"C")
# convert a string here
locale.setlocale(locale.LC_NUMERIC, loc)

_but_ this does not work, see problem b).
What does work is:

import locale
locale.setlocale(locale.LC_ALL,'')
loc = locale.setlocale(locale.LC_NUMERIC)
locale.setlocale(locale.LC_NUMERIC,"C")
# convert a string here
locale.setlocale(locale.LC_NUMERIC, loc)

Note that all_loc = locale.setlocale(locale.LC_ALL) might contain
several categories (see attached test_locale.py where I needed to decode
this).
'LC_CTYPE=de_DE.UTF-8;LC_NUMERIC=en_GB.utf8;LC_TIME=de_DE.UTF-8;LC_COLLATE=de_DE.UTF-8;LC_MONETARY=de_DE.UTF-8;LC_MESSAGES=de_DE.UTF-8;LC_PAPER=de_DE.UTF-8;LC_NAME=de_DE.UTF-8;LC_ADDRESS=de_DE.UTF-8;LC_TELEPHONE=de_DE.UTF-8;LC_MEASUREMENT=de_DE.UTF-8;LC_IDENTIFICATION=de_DE.UTF-8'

b)
The output of getlocale cannot always be used as input to setlocale.
Works with
* python2.5 and python2.4 on Debian GNU/Linux Etch ppc, de_DE.utf8.

I had failures with
* python2.3, python2.4, python2.5 on Debian GNU/Linux Sarge ppc, [EMAIL PROTECTED]
* Windows XP SP2 python-2.4.4.msi German, see:

>>> import locale
>>> result = locale.setlocale(locale.LC_NUMERIC,"")
>>> print result
German_Germany.1252
>>> got = locale.getlocale(locale.LC_NUMERIC)
>>> print got
('de_DE', '1252')
>>> # works
... locale.setlocale(locale.LC_NUMERIC, result)
'German_Germany.1252'
>>> # fails
... locale.setlocale(locale.LC_NUMERIC, got)
Traceback (most recent call last):
  File "<stdin>", line 2, in ?
  File "C:\Python24\lib\locale.py", line 381, in setlocale
    return _setlocale(category, locale)
locale.Error: unsupported locale setting
>>>

----------------------------------------------------------------------

>Comment By: Bernhard Reiter (ber)
Date: 2007-04-19 12:23

Message:
Logged In: YES
user_id=113859
Originator: YES

Feel free to drop them an email, this is a good idea.
Maybe "svn blame" or history inspection produces more names
of people who actually wrote the code and the documentation.

----------------------------------------------------------------------

Comment By: Istvan Szegedi (iszegedi)
Date: 2007-04-19 12:18

Message:
Logged In: YES
user_id=1772412
Originator: NO

Hi Bernhard,

I absolutely agree with you and I cannot really judge my correction,
either. It was just a quick and dirty solution to see if it would fix the
problem.
In fact, there are other ways to do it as well, such as modifying
the encoding_alias table so that it does not translate the utf-8 string
into utf (and thus does not produce an invalid locale setting for
_setlocale).

In the locale.py file I found two names mentioned:

Author: Marc-Andre Lemburg ([EMAIL PROTECTED])
and Fredrik Lundh ([EMAIL PROTECTED]) as a modifier,

so it might be a good idea to drop them a mail and ask for their comments.
Do you want to do it or shall I? If you are willing to do it, please keep
me in the loop.

----------------------------------------------------------------------

Comment By: Bernhard Reiter (ber)
Date: 2007-04-19 10:55

Message:
Logged In: YES
user_id=113859
Originator: YES

Istvan,

thanks for looking into this and adding information.
I do not feel competent to judge what the solution would be,
as I do not know the design goals of getlocale().
Given the documentation, the function call would only make sense
if setlocale(getlocale(LC_XYZ)) worked in all cases,
especially after the locale has been set to the user preferences with
setlocale(LC_ALL,"").
There is no simple test case that can make sure this is the case.

The workaround with the current code, for me, is to use setlocale(LC_XYZ)
only to ask for the currently set locale and then decipher the string
if the categories have different settings.
This workaround can be seen in my proposed test_case.py.

I believe the next steps could be to get a full overview and check design
and implementation, add some test cases so that more is covered,
and then fix the implementation.
We could try to find out who invented getlocale and ask.

----------------------------------------------------------------------

Comment By: Istvan Szegedi (iszegedi)
Date: 2007-04-19 10:24

Message:
Logged In: YES
user_id=1772412
Originator: NO

I could reproduce the problem on Fedora Core 5 with Python 2.4.3.

After tracing down the issue, I found the following:

The problem is in locale.py.
There is a function called normalize defined in the locale.py file.
This function is invoked by the setlocale function if the incoming locale
argument is not a string. (In your example this condition is true because
the locale.getlocale function returns a tuple, so the got variable is a
tuple.) The normalize function uses an encoding_alias table, which ends up
translating the full locale into an incorrect version. What happens in my
environment is that an incoming value of en_us.utf-8 is converted to
en_us.utf, and that is the return value of the normalize function. Then the
low-level _setlocale function invoked in setlocale throws an exception when
it receives the en_us.utf argument, because it is an unsupported locale
setting.

This is the original code snippet in locale.py where the value is converted
in the wrong way in the normalize function:

    # Second try: langname (without encoding)
    code = locale_alias.get(langname, None)
    if code is not None:
        if '.' in code:
            langname, defenc = code.split('.')
        else:
            langname = code
            defenc = ''
        if encoding:
            encoding = encoding_alias.get(encoding, encoding)
        else:
            encoding = defenc
        if encoding:
            return langname + '.' + encoding
        else:
            return langname

    else:
        return localename

To get it fixed, I modified the code in locale.py as follows:

    # Second try: langname (without encoding)
    code = locale_alias.get(langname, None)
    if code is not None:
        if '.' in code:
            langname, defenc = code.split('.')
        else:
            langname = code
            defenc = ''
#        if encoding:
#            encoding = encoding_alias.get(encoding, encoding)
#        else:
#            encoding = defenc
        if encoding is None:
            encoding = defenc
        if encoding:
            return langname + '.' + encoding
        else:
            return langname

    else:
        return localename

So the effect of the encoding_alias table is skipped.
Then your test_locale.py returns OK.
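
To make the analysis above easier to follow, here is a minimal sketch of the
"Second try" branch in isolation. It is only a model under stated
assumptions: LOCALE_ALIAS and ENCODING_ALIAS are tiny hypothetical tables
made up for this illustration (the real tables in locale.py are much larger
and may differ), and the names second_try and use_encoding_alias do not
exist in the locale module. The sketch never calls setlocale; it only shows
how an encoding alias lookup can turn utf-8 into utf. Note that the
use_encoding_alias=False path matches the modification above in spirit, not
line by line: unlike the patch, it still falls back to the default encoding
when no encoding was given.

```python
# Hypothetical, cut-down alias tables for illustration only.
LOCALE_ALIAS = {"en_us": "en_US.ISO8859-1"}
# Assumed entry that mirrors the utf-8 -> utf translation described above.
ENCODING_ALIAS = {"utf-8": "utf"}


def second_try(localename, use_encoding_alias):
    """Model of the 'Second try' branch: look up langname without encoding."""
    fullname = localename.lower()
    if "." in fullname:
        langname, encoding = fullname.split(".", 1)
    else:
        langname, encoding = fullname, ""

    code = LOCALE_ALIAS.get(langname)
    if code is None:
        # Unknown language name: hand the input back unchanged.
        return localename

    if "." in code:
        langname, defenc = code.split(".", 1)
    else:
        langname, defenc = code, ""

    if encoding:
        if use_encoding_alias:
            # Original behaviour: the encoding is rewritten via the alias table.
            encoding = ENCODING_ALIAS.get(encoding, encoding)
    else:
        # No encoding given: fall back to the default from the locale alias.
        encoding = defenc

    if encoding:
        return langname + "." + encoding
    return langname


print(second_try("en_us.utf-8", use_encoding_alias=True))   # en_US.utf
print(second_try("en_us.utf-8", use_encoding_alias=False))  # en_US.utf-8
print(second_try("en_us", use_encoding_alias=True))         # en_US.ISO8859-1
```

With the alias lookup enabled, the encoding part comes back as utf, which is
the kind of value that _setlocale was reported to reject; with the lookup
skipped, the encoding passes through unchanged. The exact spelling and casing
of the real result depend on the actual alias tables.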

----------------------------------------------------------------------

Comment By: Istvan Szegedi (iszegedi)
Date: 2007-04-18 12:05

Message:
Logged In: YES
user_id=1772412
Originator: NO

I could reproduce the problem on Fedora Core 5 with Python 2.4.3.

After tracing down the issue, I found the following:

The problem is in locale.py. There is a function called normalize defined
in the locale.py file. This function is invoked by the setlocale function
if the incoming locale argument is not a string. (In your example this
condition is true because the locale.getlocale function returns a tuple, so
the got variable is a tuple.) The normalize function uses an encoding_alias
table, which ends up translating the full locale into an incorrect version.
What happens in my environment is that an incoming value of en_us.utf-8 is
converted to en_us.utf, and that is the return value of the normalize
function. Then the low-level _setlocale function invoked in setlocale throws
an exception when it receives the en_us.utf argument, because it is an
unsupported locale setting.

This is the original code snippet in locale.py where the value is converted
in the wrong way in the normalize function:

    # Second try: langname (without encoding)
    code = locale_alias.get(langname, None)
    if code is not None:
        if '.' in code:
            langname, defenc = code.split('.')
        else:
            langname = code
            defenc = ''
        if encoding:
            encoding = encoding_alias.get(encoding, encoding)
        else:
            encoding = defenc
        if encoding:
            return langname + '.' + encoding
        else:
            return langname

    else:
        return localename

To get it fixed, I modified the code in locale.py as follows:

    # Second try: langname (without encoding)
    code = locale_alias.get(langname, None)
    if code is not None:
        if '.' in code:
            langname, defenc = code.split('.')
        else:
            langname = code
            defenc = ''
#        if encoding:
#            encoding = encoding_alias.get(encoding, encoding)
#        else:
#            encoding = defenc
        if encoding is None:
            encoding = defenc
        if encoding:
            return langname + '.' + encoding
        else:
            return langname

    else:
        return localename

So the effect of the encoding_alias table is skipped.
Then your test_locale.py returns OK.

----------------------------------------------------------------------

_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com