Matthew Barnett added the comment:

When matching case-insensitively, the re module handles a range by converting 
its two endpoints to lowercase and then checking whether the lowercase form of 
the character in the text is in that range.

For example, [A-Z] is converted to the range [\x61-\x7A], and the lowercase 
form of 'Q' ('\x51') is 'q' ('\x71'), which is in the range.

In your example, [\u0400-\u0527] is converted to the range [\u0450-\u0527], but 
the lowercase form of 'А' ('\u0410') is 'а' ('\u0430'), which isn't in the 
range.
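
The check described above can be modelled in a few lines of Python. This is 
only a sketch of the behaviour as described, not the actual code in the re 
module: the function name in_range_ignorecase is made up for illustration, and 
str.lower() stands in for whatever lowercasing the regex engine uses internally.

```python
def in_range_ignorecase(lo, hi, ch):
    """Hypothetical model of the described check (not the re module's code).

    Lowercase both endpoints of the range, lowercase the character from the
    text, then do a plain code point comparison. Assumes each argument is a
    single character whose lowercase form is also a single character.
    """
    return ord(lo.lower()) <= ord(ch.lower()) <= ord(hi.lower())


# ASCII: [A-Z] becomes [a-z] (\x61-\x7A); 'Q' lowercases to 'q' (\x71).
ascii_hit = in_range_ignorecase('A', 'Z', 'Q')

# Cyrillic: [\u0400-\u0527] becomes [\u0450-\u0527], because \u0400
# lowercases to \u0450 while \u0527 is already lowercase. The character
# '\u0410' lowercases to '\u0430', which lies below the new start.
cyrillic_hit = in_range_ignorecase('\u0400', '\u0527', '\u0410')

print(ascii_hit, cyrillic_hit)
```

Under this model the ASCII case matches and the Cyrillic case does not, 
because lowercasing moves the start of the range past the lowercase form of 
the character being tested.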

This is the same as issue #3511, but a worse failure.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17381>
_______________________________________