For reference, here's how Perl 5.8 will define \p{IsFoo} character
classes:

    # 005F: SPACING UNDERSCORE
    ['IsWord',   '$cat =~ /^[LMN]/ or $code eq "005F"', ''],
    ['IsAlnum',  '$cat =~ /^[LMN]/',    ''],
    ['IsAlpha',  '$cat =~ /^[LM]/',     ''],
    # 0009: HORIZONTAL TABULATION
    # 000A: LINE FEED
    # 000B: VERTICAL TABULATION
    # 000C: FORM FEED
    # 000D: CARRIAGE RETURN
    # 0020: SPACE
    ['IsSpace',  '$cat  =~ /^Z/ ||
                  $code =~ /^(0009|000A|000B|000C|000D)$/',     ''],
    ['IsSpacePerl',
                 '$cat  =~ /^Z/ ||
                  $code =~ /^(0009|000A|000C|000D)$/',          ''],
    ['IsBlank',  '$code =~ /^(0020|0009)$/ ||
                  $cat  =~ /^Z[^lp]$/', ''],
    ['IsDigit',  '$cat =~ /^Nd$/',      ''],
    ['IsUpper',  '$cat =~ /^L[ut]$/',   ''],
    ['IsLower',  '$cat =~ /^Ll$/',      ''],
    ['IsASCII',  '$code le "007f"',     ''],
    ['IsCntrl',  '$cat =~ /^C/',        ''],
    ['IsGraph',  '$cat =~ /^([LMNPS]|Co)/',     ''],
    ['IsPrint',  '$cat =~ /^([LMNPS]|Co|Zs)/',  ''],
    ['IsPunct',  '$cat =~ /^P/',        ''],
    # 003[0-9]: DIGIT ZERO..NINE, 00[46][1-6]: A..F, a..f
    ['IsXDigit', '$code =~ /^00(3[0-9]|[46][1-6])$/',   ''],

(lib/unicode/mktables.PL)
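
To make the table a bit more concrete, here is a minimal sketch of how such
condition strings could be applied. It assumes that each string is evaluated
with $code holding the four-digit hex code point and $cat holding the general
category from the Unicode data file; that is my reading of the entries, not
necessarily how mktables.PL does it. The helper classes_for() and the %test
hash are hypothetical names made up for this illustration, only four of the
rules are copied, and the third field of each entry is left out.

```perl
use strict;
use warnings;

# A few rules copied from the table above: [name, condition string].
my @rules = (
    ['IsWord',   '$cat =~ /^[LMN]/ or $code eq "005F"'],
    ['IsDigit',  '$cat =~ /^Nd$/'],
    ['IsUpper',  '$cat =~ /^L[ut]$/'],
    ['IsXDigit', '$code =~ /^00(3[0-9]|[46][1-6])$/'],
);

# Assumption: each condition is compiled once into a closure that
# receives ($code, $cat). The real script may evaluate it differently.
my %test;
for my $rule (@rules) {
    my ($name, $cond) = @$rule;
    $test{$name} = eval "sub { my (\$code, \$cat) = \@_; ($cond) ? 1 : 0 }"
        or die "cannot compile $name: $@";
}

# Hypothetical helper: names of all sample classes matching one character,
# in alphabetical order.
sub classes_for {
    my ($code, $cat) = @_;
    my @names = grep { $test{$_}->($code, $cat) } sort keys %test;
    return @names;
}

# 0041 is LATIN CAPITAL LETTER A, general category Lu.
print join(" ", classes_for("0041", "Lu")), "\n";   # IsUpper IsWord IsXDigit
```

With the same sample rules, ("005F", "Pc") comes out as IsWord only, which is
the special case the first comment in the table is about.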

This code was originally devised by Larry (based on some Unicode 2.0
material?), and some time ago slightly modified by me based on a
little thread on the [EMAIL PROTECTED] list (someone was defining
character classes for his or her language implementation).

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
