On Sat, 12 Jan 2008 13:51:18 -0800, John Machin wrote:

> On Jan 12, 11:26 pm, Torsten Bronger <[EMAIL PROTECTED]>
> wrote:
>> Hallöchen!
>>
>> Fredrik Lundh writes:
>> > Robert Kern wrote:
>>
>> >>> However it appears from your bug ticket that you have a much
>> >>> narrower problem (case-shifting a small known list of English words
>> >>> like VOID) and can work around it by writing your own
>> >>> locale-independent casing functions. Do you still need to find out
>> >>> whether Python unicode casings are locale-dependent?
>>
>> >> I would still like to know. There are other places where .lower() is
>> >> used in numpy, not to mention the rest of my code.
>>
>> > "lower" uses the informative case mappings provided by the Unicode
>> > character database; see
>>
>> > http://www.unicode.org/Public/4.1.0/ucd/UCD.html
>>
>> > afaik, changing the locale has no influence whatsoever on Python's
>> > Unicode subsystem.
>>
>> Slightly off-topic because it's not part of the Unicode subsystem, but
>> I was once irritated that the non-breaking space (codepoint xa0 I
>> think) was included into string.whitespace. I cannot reproduce it on
>> my current system anymore, but I was pretty sure it occurred with a
>> fr_FR.UTF-8 locale. Is this possible? And who is to blame, or must my
>> program cope with such things?
>
> The NO-BREAK SPACE is treated as whitespace in the Python unicode
> subsystem. As for str objects, the default "C" locale doesn't know it
> exists; otherwise AFAIK if the character set for the locale has it, it
> will be treated as whitespace.
>
> You were irritated because non-break SPACE was included in
> string.whiteSPACE? Surely not! It seems eminently logical to me.

To me it seems the point of a non-breaking space is to have something
that's printed as whitespace but not treated as such.

> Perhaps
> you were irritated because str.split() ignored the "no-break"? If like
> me you had been faced with removing trailing spaces from text columns in
> databases, you surely would have been delighted that str.rstrip()
> removed the trailing-padding-for-nicer-layout no-break spaces that the
> users had copy/pasted from some clown's website :-)
>
> What was the *real* cause of your irritation?

If you want to use str.split() to split words, you will foil the user who
wants to not break at a certain point.

Your use of rstrip() is a lot more specialized, if you ask me.

Carl Banks
-- 
http://mail.python.org/mailman/listinfo/python-list
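
For illustration only, here is a minimal sketch of the two behaviours argued about above. It assumes Python 3, where str is always Unicode, so the locale question from the thread does not arise. The sample text and the helper split_keep_nbsp are hypothetical names invented for this example, not anything from the thread or the standard library.

```python
import re

NBSP = "\u00a0"  # NO-BREAK SPACE, the "xa0" codepoint mentioned in the thread

# Hypothetical sample: one no-break space inside a phrase, one as trailing padding.
text = "price:" + NBSP + "10 EUR  per unit" + NBSP

# With no argument, str.split() breaks on every character for which
# str.isspace() is true, and that includes NO-BREAK SPACE.
default_split = text.split()

# Explicit ASCII whitespace class; \s is avoided on purpose because for
# str patterns it also matches NO-BREAK SPACE.
_ASCII_WS = re.compile(r"[ \t\n\r\f\v]+")

def split_keep_nbsp(s):
    """Split on ASCII whitespace only, so no-break spaces hold words together."""
    return [word for word in _ASCII_WS.split(s) if word]

kept_together = split_keep_nbsp(text)

# With no argument, str.rstrip() also removes trailing no-break spaces,
# which is the database clean-up case described in the quoted message.
stripped = text.rstrip()
```

Here default_split separates "price:" from "10", while split_keep_nbsp keeps them as one token; which of the two is right depends on whether the no-break space is meant as layout padding or as a real "do not break here" instruction.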