We should let an external collator handle all these fancy features.
People can always normalize/canonicalize/do-whatever-they-want
and send the resulting text/binary to the regex. All the features we
argue about here can easily be done by a customized collator.
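
To illustrate the split, here is a minimal sketch in Python (not Perl),
assuming that NFC normalization plus case folding stands in for the
collator. The helper names normalize and contains_name are made up for
this example and are not part of any proposed API:

```python
import re
import unicodedata

def normalize(text):
    # Stand-in for the external collator (an assumption for this sketch):
    # canonical composition (NFC) followed by case folding.
    return unicodedata.normalize("NFC", text).casefold()

def contains_name(name, text):
    # The regex engine only ever sees already-normalized input, so it
    # needs no knowledge of Unicode equivalence or locale rules.
    pattern = re.compile(re.escape(normalize(name)))
    return pattern.search(normalize(text)) is not None

# "e" + combining acute accent vs. precomposed capital E-acute
decomposed_name = "Cafe\u0301"
composed_text = "I went to the CAF\u00c9"
found = contains_name(decomposed_name, composed_text)
```

The regex itself stays a plain code-point matcher; swapping in a
different normalize step changes the matching rules without touching it.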

Do NOT expect the Perl regex to be a linguist that can understand
every language in the world and be able to match my name in
English and Chinese :-) (Of course, that would be a useful
feature for me.)

Please note regex is O(n) at best; adding an external collator
will make it O(2n), which is still linear. Putting fancy Unicode
features into the regex will not make it any faster.

My recommendation is to keep regex locale-independent and
have some API for handling locale-specific features, though
I am not sure what the best way to do this is.

Hong
