Chris Angelico <ros...@gmail.com>:

> what you're more likely to want is "match the letter á", and you don't
> care whether it's represented as U+0061 U+0301 or as U+00E1. That's
> where Unicode normalization comes in.

Yes. Also, not every letter has a precomposed form, so NFC cannot
always reduce it to a single code point and is not a way out. For example,

    re.match("^[q̈]$", "q̈")

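returns None regardless of normalization: the bracket expression matches
exactly one code point (either q or U+0308), while the string contains both.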
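
Here is a minimal sketch of the difference, assuming Python 3 and only the
standard re and unicodedata modules. The variable names are made up for
illustration, and the non-capturing group at the end is just one possible
workaround, not the only one.

```python
import re
import unicodedata

# "a" + COMBINING ACUTE ACCENT: NFC composes this into the single
# code point U+00E1.
a_decomposed = "a\u0301"
a_nfc = unicodedata.normalize("NFC", a_decomposed)

# "q" + COMBINING DIAERESIS: Unicode has no precomposed form, so NFC
# leaves it as two code points.
q_decomposed = "q\u0308"
q_nfc = unicodedata.normalize("NFC", q_decomposed)

# A character class matches one code point at a time, so it cannot
# match the two-code-point string as a whole.
class_match = re.match("^[" + q_nfc + "]$", q_nfc)

# A non-capturing group matches the code point sequence instead.
group_match = re.match("^(?:" + q_nfc + ")$", q_nfc)

print(len(a_nfc), len(q_nfc))
print(class_match, group_match is not None)
```

With the group, the pattern treats q̈ as a sequence rather than as a set of
alternatives, which is why it matches where the character class does not.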


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list
