On Wednesday, 19 December 2012 15:52:23 UTC+1, Christian Heimes wrote:
> On 19.12.2012 15:23, wxjmfa...@gmail.com wrote:
> > But, this is not the problem.
> > I was surprised to discover this:
> >
> > >>> 'Straße'.upper()
> > 'STRASSE'
> >
> > I really, really do not know what I should think about that.
> > (It is a complex subject.) And the real question is why?
>
> It's correct. LATIN SMALL LETTER SHARP S doesn't have an upper case
> form. However the unicode database specifies an upper case mapping from
> ß to SS. http://codepoints.net/U+00DF
>
> Christian
-----

Yes, it is correct (or can be considered correct). I do not wish to
discuss the typographical question of "Das Grosse Eszett"; the web is
full of pages on the subject. However, I have never managed to find an
"official position" from Unicode. The best information I have found
seems to converge on this: U+1E9E is now the "supported" uppercase form
of U+00DF (see DIN).

What bothers me more is the implementation. The Unicode documentation
says roughly this: if something cannot be honoured, there is no harm,
but do not implement a workaround. In that case, I am not sure Python
is doing the best thing.

Even if it is "wrong", this can be considered programmatically correct
or logically acceptable (Py3.2)

>>> 'Straße'.upper().lower().capitalize() == 'Straße'
True

while this will *always* be problematic (Py3.3)

>>> 'Straße'.upper().lower().capitalize() == 'Straße'
False

jmf
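For anyone who wants to reproduce the round-trip issue, here is a minimal sketch, assuming Python 3.3 or later. The helper names `round_trips` and `upper_keep_eszett` are hypothetical and exist only for this illustration; they are not part of Python. Mapping ß to U+1E9E before uppercasing is one possible alternative, not something Python or Unicode prescribes.

```python
import unicodedata

SMALL_SHARP_S = "\u00df"    # ß
CAPITAL_SHARP_S = "\u1e9e"  # ẞ


def round_trips(word):
    """Return True if upper -> lower -> capitalize gives the word back."""
    return word.upper().lower().capitalize() == word


def upper_keep_eszett(word):
    """Hypothetical helper: map ß to U+1E9E first, then uppercase the rest."""
    return word.replace(SMALL_SHARP_S, CAPITAL_SHARP_S).upper()


# Names of the two code points discussed in this thread.
print(unicodedata.name(SMALL_SHARP_S))    # LATIN SMALL LETTER SHARP S
print(unicodedata.name(CAPITAL_SHARP_S))  # LATIN CAPITAL LETTER SHARP S

# Py3.3+ expands ß to SS, so the original spelling cannot be recovered.
print("Straße".upper())       # STRASSE
print(round_trips("Straße"))  # False

# U+1E9E lowercases back to ß, so this variant does round-trip.
print(upper_keep_eszett("Straße").lower().capitalize() == "Straße")  # True
```

The second variant round-trips only because U+1E9E has a one-to-one lowercase mapping to U+00DF, whereas the expansion to "SS" loses the information that a ß was ever there.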