Chris Angelico <ros...@gmail.com>:

> On Wed, Jul 19, 2017 at 4:31 AM, Marko Rauhamaa <ma...@pacujo.net> wrote:
>> Chris Angelico <ros...@gmail.com>:
>>
>>> On Wed, Jul 19, 2017 at 3:01 AM, Marko Rauhamaa <ma...@pacujo.net> wrote:
>>>> Yes. Also, not every letter can be normalized to a single codepoint so
>>>> NFC is not a way out. For example,
>>>>
>>>>    re.match("^[q̈]$", "q̈")
>>>>
>>>> returns None regardless of normalization.
> [...]
>
> What I *think* you're asking for is for square brackets in a regex to
> count combining characters with their preceding base character.

Yes. My example tries to match a single character against a single
character.

> That would make a lot of sense, and would actually be a reasonable
> feature to request. (Probably as an option, in case there's a backward
> compatibility issue.)

There's the flag re.IGNORECASE. In the same vein, it might be useful to
have re.IGNOREDIACRITICS, which would match

    re.match("^[abc]$", "ä", re.IGNOREDIACRITICS)

regardless of normalization.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list
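
For anyone who wants to try this at home, here is a rough sketch using only the standard library. It first reproduces the q-with-diaeresis case, then approximates what the suggested flag might do by stripping combining marks from the subject string before matching. To be clear, re.IGNOREDIACRITICS does not exist in the re module; strip_diacritics() and match_ignoring_diacritics() are names made up for this illustration, and this is only one possible reading of the proposal.

```python
import re
import unicodedata

# 'q' followed by U+0308 COMBINING DIAERESIS; Unicode has no precomposed form.
Q_DIAERESIS = "q\u0308"


def strip_diacritics(text):
    """Hypothetical helper: decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def match_ignoring_diacritics(pattern, text, flags=0):
    """Hypothetical stand-in for the proposed flag.

    Only the subject string is stripped; the pattern is used as given.
    """
    return re.match(pattern, strip_diacritics(text), flags)


# NFC cannot compose the pair, so the string stays two code points long.
nfc = unicodedata.normalize("NFC", Q_DIAERESIS)
print(len(nfc))

# The class below is really [q\u0308]: it matches exactly one code point,
# so it can never consume the whole two-code-point string.
print(re.match("^[" + Q_DIAERESIS + "]$", nfc))

# With the marks stripped, both spellings of a-umlaut match [abc].
print(match_ignoring_diacritics("^[abc]$", "\u00e4"))   # precomposed
print(match_ignoring_diacritics("^[abc]$", "a\u0308"))  # decomposed
```

This only handles letters that decompose into a base character plus combining marks. Letters such as "ø" have no canonical decomposition and pass through unchanged, and stripping every combining mark is probably too aggressive for scripts where the marks carry more than an accent.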