Matthew Barnett <pyt...@mrabarnett.plus.com> added the comment:

I use Python 3, where len("\U00010337") == 2 on a narrow build.
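For readers unfamiliar with the narrow-build behaviour, here is a minimal sketch of why the length comes out as 2: a narrow build stores strings as UTF-16 code units, so any code point above U+FFFF is split into a surrogate pair. The helper names `to_surrogate_pair` and `utf16_length` are hypothetical, introduced only for illustration. Note that CPython 3.3 and later (PEP 393) no longer distinguish narrow and wide builds, so the sketch computes the UTF-16 view explicitly rather than relying on len().

```python
def to_surrogate_pair(code_point):
    """Split a supplementary-plane code point into its UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low


def utf16_length(text):
    """Count UTF-16 code units, which is what len() reported on a narrow build."""
    return len(text.encode("utf-16-le")) // 2


high, low = to_surrogate_pair(0x10337)
print(hex(high), hex(low))            # 0xd800 0xdf37
print(utf16_length("\U00010337"))     # 2
print(utf16_length("a\U00010337bc"))  # 5
```

A regex engine that matches one code unit at a time sees two separate units, 0xD800 and 0xDF37, rather than the single character U+10337.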
Yes, wide Unicode on a narrow build is a problem:

>>> regex.findall("\\U00010337", "a\U00010337bc")
[]
>>> regex.findall("(?i)\\U00010337", "a\U00010337bc")
[]

I'm not sure how (or whether!) to handle surrogate pairs. It _would_ make things more complicated.

I suppose the moral is that if you want to use wide Unicode then you really should use a wide build.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue2636>
_______________________________________