Fredrik Lundh wrote:
> Serge Orlov wrote:
>
>>>>>> re.compile(ur'\w+', re.U).findall(u'\xb5\xba\xe4\u0430')
>>>>>> [u'\xb5\xba\xe4\u0430']
>>
>> I can't find the strict definition of isalpha, but I believe average
>> C program shouldn't care about the current locale alphabet, so
>> isalpha is a union of all supported characters in all alphabets
>
> nope. isalpha() depends on the locale, as does all other ctype
> functions (this also applies to wctype, on some platforms).

I mean "all supported characters in all alphabets [in the current
locale]". For example, in ru_RU.koi8-r isalpha should return true for
characters in the English and Russian alphabets; in ru_RU.koi8-u, for
characters in the English, Russian and Ukrainian alphabets; in
ru_RU.utf-8, for all alphabetic characters in Unicode that the
implementation supports. IMHO iswalpha in the POSIX locale can return
true for all alphabetic characters in Unicode instead of being limited
to the English alphabet.

  Serge.
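
P.S. To make the koi8-r versus koi8-u point concrete, here is a rough sketch in Python 3. It does not call the C library's isalpha at all. It assumes that str.isalpha() on the decoded character is a fair stand-in for what a locale-aware isalpha might be expected to report for each byte in that encoding. The helper name alpha_bytes is made up for this illustration.

```python
# Sketch only: str.isalpha() on the decoded character stands in for what a
# locale-aware C isalpha() might report for each byte (an assumption, not a
# measurement of any real C library).

def alpha_bytes(encoding):
    """Return the set of byte values (0-255) that decode to an alphabetic character."""
    result = set()
    for value in range(256):
        try:
            char = bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            continue  # byte is unassigned in this encoding
        if char.isalpha():
            result.add(value)
    return result

koi8_r = alpha_bytes("koi8_r")
koi8_u = alpha_bytes("koi8_u")

# Bytes that count as letters under KOI8-U but not under KOI8-R
extra = sorted(koi8_u - koi8_r)
print([bytes([b]).decode("koi8_u") for b in extra])
```

Under that assumption, the English and Russian letters fall in both sets, and the bytes left in the difference are the ones KOI8-U reassigns to Ukrainian letters.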