On Thursday 14 June 2001 12:01 pm, Dan Sugalski wrote:
> Fancy character classes are probably enough to handle the various casing
> issues and their analogs. They're probably not enough to handle things
> like the Arabic tatweel, or proper word breaks in most Asian languages.
> Heck, unless I'm missing something, they're insufficient for something as
> simple as \d.
>
> I'm not advocating forcing dictionaries into the regex engine, nor even
> shipping them with the core.
That's not to say that some Locale::* module couldn't include one, or
reference a third-party one.
> As I see it, locales specify:
>
> * Collating order
> * Comparison/equality specification
> * Unicode codepoint interpretation
What do you mean by that?
> * Regex character classes
> * Regex character identification
> * Regex zero-width assertion rules
> * 'casing' rules
>
> It'd be nice to specify them all separately and inherit the ones you don't
> need to change from some parent locale.
Or have these individual bits and pieces be addressable through the regexen,
and have locales *defined* via that.
module Locale::Hawaiian;
use re 'class (\w => [aeiouâêîôûhklmnpw`])';
...
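
To make that a little more concrete, here is a minimal sketch in today's
Perl 5 of classes being inherited from a parent locale. Everything in it is
an assumption for illustration: the %LOCALE registry, the 'Base' locale, and
the class_chars/class_re helpers are hypothetical names, not part of any
existing module or of the syntax above.

```perl
use strict;
use warnings;
use utf8;    # the Hawaiian class below contains non-ASCII literals

# Hypothetical registry: each locale names a parent and overrides
# only the character classes it needs to change.
my %LOCALE = (
    'Base' => {
        parent  => undef,
        classes => { w => 'a-zA-Z0-9_', d => '0-9' },
    },
    'Hawaiian' => {
        parent  => 'Base',
        classes => { w => 'aeiouâêîôûhklmnpw`' },
    },
);

# Walk up the parent chain until some locale defines the class.
sub class_chars {
    my ($locale, $class) = @_;
    while (defined $locale) {
        my $def = $LOCALE{$locale} or die "unknown locale '$locale'";
        return $def->{classes}{$class} if exists $def->{classes}{$class};
        $locale = $def->{parent};
    }
    die "no locale defines class '$class'";
}

# Compile the resolved class into an ordinary bracketed class.
sub class_re {
    my ($locale, $class) = @_;
    my $chars = class_chars($locale, $class);
    return qr/[$chars]/;
}

my $word  = class_re('Hawaiian', 'w');    # overridden by Hawaiian
my $digit = class_re('Hawaiian', 'd');    # inherited from Base

print "aloha is a Hawaiian word\n" if 'aloha' =~ /\A$word+\z/;
print "42 is still all digits\n"   if '42'    =~ /\A$digit+\z/;
```

The only design point is the lookup order: a locale answers for the classes
it defines and defers to its parent for everything else.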
On a side note (and this *will* sound stupid, but there is a reason I'm
asking): why is there no logical opposite to '.', that is, a character
which never matches any character? (Besides, of course, that it's
utterly useless from a classic regex perspective.)
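
For reference, existing Perl 5 regex syntax can already spell such a thing,
just not as a single metacharacter. A minimal sketch, where the variable
names are mine and purely illustrative:

```perl
use strict;
use warnings;

# (?!) is a negative lookahead with an empty body. The empty pattern
# matches everywhere, so its negation fails at every position.
my $never = qr/(?!)/;

# A character-class spelling: no character is both whitespace and
# non-whitespace, so this class is empty.
my $never_class = qr/[^\s\S]/;

# Neither pattern matches any input, including the empty string.
my $matched = 0;
for my $text ('', 'a', "any text\n") {
    $matched++ if $text =~ $never;
    $matched++ if $text =~ $never_class;
}
print "matches found: $matched\n";

# Used as an alternative, it contributes nothing to the match.
my $branch = qr/foo|$never/;
print "foo still matches\n" if 'foo' =~ $branch;
```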
--
Bryan C. Warnock
[EMAIL PROTECTED]