On Jan 12, 10:51 pm, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> Robert Kern wrote:
> >> However it appears from your bug ticket that you have a much narrower
> >> problem (case-shifting a small known list of English words like VOID)
> >> and can work around it by writing your own locale-independent casing
> >> functions. Do you still need to find out whether Python unicode
> >> casings are locale-dependent?
>
> > I would still like to know. There are other places where .lower() is used in
> > numpy, not to mention the rest of my code.
>
> "lower" uses the informative case mappings provided by the Unicode
> character database; see
>
> http://www.unicode.org/Public/4.1.0/ucd/UCD.html

of which the relevant part is

"""
Case Mappings

There are a number of complications to case mappings that occur once
the repertoire of characters is expanded beyond ASCII. For more
information, see Chapter 3 in Unicode 4.0.

For compatibility with existing parsers, UnicodeData.txt only contains
case mappings for characters where they are one-to-one mappings; it
also omits information about context-sensitive case mappings.
Information about these special cases can be found in a separate data
file, SpecialCasing.txt.
"""

It seems that Python doesn't use the SpecialCasing.txt file. Effects
include:

(a) one-to-many mappings don't happen, e.g. LATIN SMALL LETTER SHARP S:
    u'\xdf'.upper() produces u'\xdf' instead of u'SS'
(b) language-sensitive mappings (e.g. dotted/dotless I/i for Turkish
    (and Azeri)) don't happen
(c) context-sensitive mappings don't happen, e.g. the lower case of
    GREEK CAPITAL LETTER SIGMA depends on whether it is the last letter
    in a word.

> afaik, changing the locale has no influence whatsoever on Python's
> Unicode subsystem.
>
> </F>
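
For the narrower problem mentioned above (case-shifting a small known list of English words like VOID), a locale-independent helper could look roughly like the following. This is only a minimal sketch: the names ascii_lower and ascii_upper are made up for illustration and are not numpy or standard-library functions, and the assumption is that only the letters A-Z and a-z ever need to change case.

```python
# Hypothetical helpers (illustrative names, not part of numpy or the stdlib).
# They map only the 26 ASCII letters, so the result cannot depend on the
# locale or on which Unicode case-mapping tables the interpreter uses.
_ASCII_UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
_ASCII_LOWER = "abcdefghijklmnopqrstuvwxyz"
_TO_LOWER = dict(zip(_ASCII_UPPER, _ASCII_LOWER))
_TO_UPPER = dict(zip(_ASCII_LOWER, _ASCII_UPPER))


def ascii_lower(text):
    """Lower-case A-Z only; every other character is returned unchanged."""
    return "".join(_TO_LOWER.get(ch, ch) for ch in text)


def ascii_upper(text):
    """Upper-case a-z only; every other character is returned unchanged."""
    return "".join(_TO_UPPER.get(ch, ch) for ch in text)


print(ascii_lower("VOID"))  # void
```

Because non-ASCII characters pass through untouched, none of the special cases (a) to (c) can arise with these helpers; the price is that they are of no use for text that really does need Unicode-aware casing.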