Bugs item #1290505, was opened at 2005-09-13 15:50
Message generated for change (Comment added) made by bcannon
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1290505&group_id=5470

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.5
Status: Pending
Resolution: None
Priority: 5
Private: No
Submitted By: Adam Monsen (meonkeys)
Assigned to: Brett Cannon (bcannon)
Summary: strptime(): can't switch locales more than once

Initial Comment:
After calling strptime() once, it appears that subsequent
efforts to modify the locale settings (so date strings in
different locales can be parsed) throw a ValueError. I'm
pasting everything here since spacing is irrelevant:

import locale, time
print locale.getdefaultlocale()         # ('en_US', 'utf')
print locale.getlocale(locale.LC_TIME)  # (None, None)
# save old locale
old_loc = locale.getlocale(locale.LC_TIME)
locale.setlocale(locale.LC_TIME, 'nl_NL')
print locale.getlocale(locale.LC_TIME)  # ('nl_NL', 'ISO8859-1')
# parse local date
date = '10 augustus 2005 om 17:26'
format = '%d %B %Y om %H:%M'
dateTuple = time.strptime(date, format)
# switch back to previous locale
locale.setlocale(locale.LC_TIME, old_loc)
print locale.getlocale(locale.LC_TIME)  # (None, None)
date = '10 August 2005 at 17:26'
format = '%d %B %Y at %H:%M'
dateTuple = time.strptime(date, format)

The output I get from this script is:

('en_US', 'utf')
(None, None)
('nl_NL', 'ISO8859-1')
(None, None)
Traceback (most recent call last):
  File "switching.py", line 17, in ?
    dateTuple = time.strptime(date, format)
  File "/usr/lib/python2.4/_strptime.py", line 292, in strptime
    raise ValueError("time data did not match format: data=%s fmt=%s" %
ValueError: time data did not match format: data=10 August 2005 at 17:26 fmt=%d %B %Y at %H:%M

One workaround I found is to manually bust the regular
expression cache in _strptime:

import _strptime
_strptime._cache_lock.acquire()
_strptime._TimeRE_cache = _strptime.TimeRE()
_strptime._regex_cache = {}
_strptime._cache_lock.release()

If I do all that, I can change the LC_TIME part of the
locale as many times as I choose.

If this isn't a bug, this should at least be in the
documentation for the locale module and/or strptime().

----------------------------------------------------------------------

>Comment By: Brett Cannon (bcannon)
Date: 2007-03-29 13:03

Message:
Logged In: YES
user_id=357491
Originator: NO

The test was checking that the TimeRE instance is recreated when the
locale changes. You do have a valid point about the 'if' check; I should
have put the setlocale call in a try/except block and just returned if
an exception was raised.

As for the %d usage of strptime, that is just to force a call into
strptime and thus trigger the new instance of TimeRE. That is why the
test checks the id of the objects; I don't really care about strptime
directly failing.

Did the test not fail properly even when you removed the 'if' but left
everything else alone?

----------------------------------------------------------------------

Comment By: Javier Sanz (kovan)
Date: 2007-03-29 09:53

Message:
Logged In: YES
user_id=1426755
Originator: NO

I've been looking at the test case, and I noticed that it isn't actually
checking anything, because locale.getlocale(locale.LC_TIME) is returning
(None, None), which is ok and just means that the default locale (which
is the C locale, not the system locale) is being used.
After removing that 'if' I also replaced de_DE with es_ES to fit my
system, and strptime('10', '%d') with strptime('Fri', '%a') and
strptime('vie', '%a'), because '10' is '10' in all -occidental-
languages, and the test would not fail when the wrong locale is being
used.
Once I made these changes to the test case, it successfully failed when
using the non-patched _strptime.py, AND ran ok when using the patched
version.

This is the test case I ended up using:

    def test_TimeRE_recreation(self):
        # The TimeRE instance should be recreated upon changing the locale.
        locale_info = locale.getlocale(locale.LC_TIME)
        locale.setlocale(locale.LC_TIME, ('en_US', 'UTF8'))
        try:
            _strptime.strptime('Fri', '%a')
            first_time_re_id = id(_strptime._TimeRE_cache)
            locale.setlocale(locale.LC_TIME, ('es_ES', 'UTF8'))
            _strptime.strptime('vie', '%a')
            second_time_re_id = id(_strptime._TimeRE_cache)
            self.failIfEqual(first_time_re_id, second_time_re_id)
        finally:
            locale.setlocale(locale.LC_TIME, locale_info)

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2007-03-28 19:07

Message:
Logged In: YES
user_id=357491
Originator: NO

I have uploaded a patch for test_strptime that adds a test to make sure
that the TimeRE instance is recreated if the locale changes (I went with
en_US and de_DE, but these could easily be other locales if there are
ones that are more common).

Let me know if the test runs fine and works. Even better is if it fails
without the fix.
File Added: strptime_timere_test.diff

----------------------------------------------------------------------

Comment By: Javier Sanz (kovan)
Date: 2007-03-28 16:44

Message:
Logged In: YES
user_id=1426755
Originator: NO

I'll be glad to help in whatever I can.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2007-03-28 16:40

Message:
Logged In: YES
user_id=357491
Originator: NO

The power of procrastination in the morning.
=) I am going to try to come up with a test case for this. I might ask,
kovan, if you can run the test case to make sure it works.

----------------------------------------------------------------------

Comment By: Javier Sanz (kovan)
Date: 2007-03-28 15:55

Message:
Logged In: YES
user_id=1426755
Originator: NO

I applied the patch, and it works now :).
Thanks bcannon for the quick responses.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2007-03-28 11:39

Message:
Logged In: YES
user_id=357491
Originator: NO

kovan, can you please apply the patch I have uploaded to your copy of
_strptime and let me know if that fixes it? I am on OS X and switching
locales doesn't work for me, so I don't have an easy way to test this.
File Added: strptime_cache.diff

----------------------------------------------------------------------

Comment By: Javier Sanz (kovan)
Date: 2007-03-28 00:06

Message:
Logged In: YES
user_id=1426755
Originator: NO

This is the code:

# imports assumed here; they were not shown in the original post
import locale
from datetime import datetime
from time import strptime

def parseTime(strTime, format="%a %b %d %H:%M:%S"):
    # example: Mon Aug 7 21:08:52
    locale.setlocale(locale.LC_TIME, ('en_US', 'UTF8'))
    format = "%Y " + format
    strTime = str(datetime.now().year) + " " + strTime
    # workaround: reset the private _strptime caches under their lock
    import _strptime
    _strptime._cache_lock.acquire()
    _strptime._TimeRE_cache = _strptime.TimeRE()
    _strptime._regex_cache = {}
    _strptime._cache_lock.release()
    tuple = strptime(strTime, format)
    return datetime(*tuple[0:6])

If I remove the code to clear the cache and add a
"print format_regex.pattern" statement to _strptime.py after
"format_regex = time_re.compile(format)", I get

(?P<Y>\d\d\d\d)\s*(?P<a>mié|sáb|lun|mar|jue|vie|dom)\s*(?P<b>ene|feb|mar|abr|may|jun|jul|ago|sep|oct|nov|dic)\s*(?P<d>3[0-1]|[1-2]\d|0[1-9]|[1-9]| [1-9])\s*(?P<H>2[0-3]|[0-1]\d|\d):(?P<M>[0-5]\d|\d):(?P<S>6[0-1]|[0-5]\d|\d)

which is in my system's locale (es), and it should be in English.
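
A minimal sketch, not taken from the thread, of how a pattern built from
the wrong locale's names can come about: a regex cache keyed only by the
format string keeps serving a pattern that was compiled under an earlier
locale. Everything named here (the MONTHS tables, build_pattern, the two
cache classes, parses) is a hypothetical stand-in written for Python 3;
the real _strptime module takes its names from the C locale and is
structured differently.

```python
import re

# Hypothetical month tables standing in for real locale data.
MONTHS = {
    "en_US": ["January", "February", "March", "April", "May", "June",
              "July", "August", "September", "October", "November",
              "December"],
    "nl_NL": ["januari", "februari", "maart", "april", "mei", "juni",
              "juli", "augustus", "september", "oktober", "november",
              "december"],
}


def build_pattern(fmt, loc):
    """Compile a regex for a space-separated subset of strptime directives."""
    # Longest names first so no name is shadowed by a shorter alternative.
    months = "|".join(sorted(MONTHS[loc], key=len, reverse=True))
    directives = {
        "%d": r"(?P<d>\d{1,2})",
        "%B": "(?P<B>" + months + ")",
        "%Y": r"(?P<Y>\d{4})",
    }
    parts = [directives.get(tok, re.escape(tok)) for tok in fmt.split()]
    return re.compile(r"\s+".join(parts), re.IGNORECASE)


class FormatKeyedCache:
    """Cache keyed by format only; it never notices a locale change."""

    def __init__(self):
        self._regex = {}

    def compile(self, fmt, loc):
        if fmt not in self._regex:
            self._regex[fmt] = build_pattern(fmt, loc)
        return self._regex[fmt]


class LocaleAwareCache(FormatKeyedCache):
    """Same cache, emptied whenever the locale differs from the last call."""

    def __init__(self):
        super().__init__()
        self._loc = None

    def compile(self, fmt, loc):
        if loc != self._loc:
            self._regex.clear()
            self._loc = loc
        return super().compile(fmt, loc)


def parses(cache, text, fmt, loc):
    """Return True if text fully matches fmt under loc, using the cache."""
    return cache.compile(fmt, loc).fullmatch(text) is not None


if __name__ == "__main__":
    fmt = "%d %B %Y"
    for cache in (FormatKeyedCache(), LocaleAwareCache()):
        dutch = parses(cache, "10 augustus 2005", fmt, "nl_NL")
        english = parses(cache, "10 August 2005", fmt, "en_US")
        print(type(cache).__name__, dutch, english)
```

With the format-keyed cache, the English date is rejected after a Dutch
parse because the Dutch pattern is reused; the locale-aware cache accepts
both. This only models the symptom and should not be read as the actual
patch.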
----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2007-03-27 19:35

Message:
Logged In: YES
user_id=357491
Originator: NO

Can you show some code that recreates the problem?

----------------------------------------------------------------------

Comment By: Javier Sanz (kovan)
Date: 2007-03-27 18:06

Message:
Logged In: YES
user_id=1426755
Originator: NO

I think I'm having this issue with Python 2.5, as I can only make
strptime take into account locale.setlocale() calls if I clear
strptime's internal regexp cache between the calls to setlocale() and
strptime().

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2005-09-14 19:42

Message:
Logged In: YES
user_id=357491

OK, the problem was that the cache for the locale
information in terms of dates and time was being invalidated
and recreated, but the regex cache was not being touched. It
has now been fixed in rev. 1.41 for 2.5 and in rev. 1.38.2.3
for 2.4.

Thanks for reporting this, Adam.

----------------------------------------------------------------------

Comment By: Adam Monsen (meonkeys)
Date: 2005-09-13 15:57

Message:
Logged In: YES
user_id=259388

I think there were some long lines in my code. Attaching
test case.

----------------------------------------------------------------------