On 2012-12-08 23:27, Hans Mulder wrote:
On 8/12/12 23:19:40, rh wrote:
I reduced the expression too. Now I wonder why re.DEBUG doesn't unroll
category_word. Some other re flag?
The category word consists of the '_' character and the
characters for which .isalnum() returns True.
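A quick way to check that rule, as a minimal sketch rather than anything taken from the re source: compare what r'\w' matches against "underscore or .isalnum()" for a range of code points. The names is_word_char, mismatches and the limit parameter are made up for this illustration, and the check assumes Python 3 str patterns with no re.ASCII flag.

```python
import re

WORD = re.compile(r'\w')

def is_word_char(ch):
    # The rule as stated above: underscore, or anything str.isalnum() accepts.
    return ch == '_' or ch.isalnum()

def mismatches(limit=0x3000):
    # Code points below `limit` where the regex and the rule disagree.
    return [i for i in range(limit)
            if bool(WORD.match(chr(i))) != is_word_char(chr(i))]

print(mismatches())
```

If the rule holds on your interpreter, the list of disagreements should come back empty. Raising limit to sys.maxunicode + 1 covers every code point, at the cost of a slower loop.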
On my system there are 102158 characters matching '\w':
That would be because you're using Python 3, where strings are Unicode.
>>> import re, sys
>>> sum(1 for i in range(sys.maxunicode+1)
...         if re.match(r'\w', chr(i)))
102158
You wouldn't want to see the complete list.
The number of such codepoints depends on which version of Unicode is
being supported (Unicode is evolving all the time).
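To see which Unicode version a given interpreter was built against, and the matching count that goes with it, something like the following sketch may help. It assumes Python 3. The helper name count_word_chars is hypothetical, while unicodedata.unidata_version is the standard library attribute that reports the version of the Unicode database in use.

```python
import re
import sys
import unicodedata

def count_word_chars(pattern=r'\w'):
    # Count the code points whose single character matches `pattern`.
    word = re.compile(pattern)
    return sum(1 for i in range(sys.maxunicode + 1) if word.match(chr(i)))

print("Unicode database version:", unicodedata.unidata_version)
print("Characters matching \\w:", count_word_chars())
```

Running this under two different Python releases would be expected to give two different counts, since each release ships its own Unicode database.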
--
http://mail.python.org/mailman/listinfo/python-list