On Tue, 8 Oct 2019 15:25:34 +0100 Richard Wordingham via Unicode <unicode@unicode.org> wrote:
> An example UTS#18 gives for matching a literal cluster can be
> simplified to, in its notation:
>
> [c \q{ch}]
>
> This is interpreted as 'match against "ch" if possible, otherwise
> against "c"'. Thus the strings "ca" and "cha" would both match the
> expression
>
> [c \q{ch}]a
>
> while "chh" but not "ch" would match against
>
> [c \q{ch}]h
>
> Or have I got this wrong?

After comparing this with the Perl behaviour of /(?:ch|c)/ and
/(?:ch|c)h/, I've come to the conclusion that I've got the
interpretation wrong.  The former may match "ch" or "c", so the latter
matches "ch" as well as "chh".  I conclude that the only funny meaning
of \q is to indicate a preference for the sequence of two characters -
if the engine yields all matches, it has no meaning.  This greatly
simplifies matters.

Richard.
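
The comparison above can be reproduced with a short sketch. It uses Python's re module rather than Perl, on the assumption that the two engines treat this ordered alternation the same way (both backtrack), and it uses (?:ch|c) as a stand-in for [c \q{ch}]. The helper names are made up for illustration.

```python
import re

# Stand-in for the UTS#18 set [c \q{ch}]: an ordered alternation that
# tries the two-character sequence "ch" before the single character "c".
CLUSTER = r"(?:ch|c)"


def matches_whole(pattern: str, text: str) -> bool:
    """Return True if the pattern can match the entire text."""
    return re.fullmatch(pattern, text) is not None


def first_match(pattern: str, text: str) -> str:
    """Return the first match found at the start of the text, or ""."""
    m = re.match(pattern, text)
    return m.group(0) if m else ""


# [c \q{ch}]a : both "ca" and "cha" match.
ca_matches = matches_whole(CLUSTER + "a", "ca")
cha_matches = matches_whole(CLUSTER + "a", "cha")

# [c \q{ch}]h : "chh" matches, and so does "ch", because the engine
# backtracks from "ch" to "c" once the trailing "h" has nothing to match.
chh_matches = matches_whole(CLUSTER + "h", "chh")
ch_matches = matches_whole(CLUSTER + "h", "ch")

# The order of the alternatives only decides which match is found first.
preferred = first_match(CLUSTER, "ch")           # "ch" is tried first
reversed_order = first_match(r"(?:c|ch)", "ch")  # "c" is tried first

if __name__ == "__main__":
    print(ca_matches, cha_matches, chh_matches, ch_matches)
    print(repr(preferred), repr(reversed_order))
```

All four whole-string tests succeed, and only the first-match tests distinguish (?:ch|c) from (?:c|ch), which is consistent with reading \q as a preference rather than a restriction.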