Denis S. Otkidach wrote:
> On all platforms, \w matches all Unicode letters when used with the
> re.UNICODE flag, but this doesn't work on SuSE 9.2:
>
> Python 2.3.4 (#1, Dec 17 2004, 19:56:48)
> [GCC 3.3.4 (pre 3.3.5 20040809)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import re
> >>> re.compile(ur'\w+', re.U).match(u'\xe4')
> >>>
>
> BTW, unicodedata correctly recognizes this character as a lowercase letter:
> >>> import unicodedata
> >>> unicodedata.category(u'\xe4')
> 'Ll'
>
> I've looked through all the SuSE patches applied, but found nothing
> related. What is the reason for the broken behavior?  Incorrect
> configure options?

To summarize the discussion: either it's a bug in glibc, or there is an
option to select a modern POSIX locale. The POSIX locale consists of
characters from the portable character set, and Unicode is certainly
portable.
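
For anyone who wants to check their own build, here is a minimal sketch
of the test from the original report. The helper name
check_unicode_word_match is made up for illustration, and the pattern is
written as u'\\w+' rather than ur'\w+' only so that the same script also
runs on newer interpreters.

```python
import re
import unicodedata


def check_unicode_word_match(ch=u'\xe4'):
    """Return (category, matched) for a single character.

    Hypothetical helper: 'category' is what unicodedata reports for ch,
    and 'matched' says whether \\w with re.UNICODE accepts it.
    """
    category = unicodedata.category(ch)
    matched = re.compile(u'\\w+', re.UNICODE).match(ch) is not None
    return category, matched


category, matched = check_unicode_word_match()
print("category=%s matched=%s" % (category, matched))
```

If unicodedata reports 'Ll' but matched comes out False, the build shows
the same symptom as described above for SuSE 9.2.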

  Serge.

-- 
http://mail.python.org/mailman/listinfo/python-list
