Steve Heaven <[EMAIL PROTECTED]> writes:
> Does the regular expression parser have anything equivalent to Perl's \b
> word boundary metacharacter?

src/backend/regex/re_format.7 contains the whole scoop (for some reason
this page doesn't seem to get installed with the rest of the
documentation).  In particular:

        There are two special cases of bracket expressions:
        the bracket expressions `[[:<:]]' and `[[:>:]]' match the null
        string at the beginning and end of a word respectively.
        A word is defined as a sequence of word characters
        which is neither preceded nor followed by word characters.
        A word character is an alnum character (as defined by ctype(3))
        or an underscore.  This is an extension, compatible with but not
        specified by POSIX 1003.2, and should be used with caution in
        software intended to be portable to other systems.
        
        ...
        
        BUGS
        
        The syntax for word boundaries is incredibly ugly.

POSIX bracket expressions are pretty ugly anyway, and this is no worse
than the rest.  However, if you prefer Perl or Tcl, I'd recommend that
you just *use* Perl or Tcl ;-).  plperl and pltcl make great
implementation languages for text-mashing functions...
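
For anyone who wants to see what the two bracket expressions amount to, here
is a minimal sketch in Python (not the backend's regex code) that rewrites
`[[:<:]]' and `[[:>:]]' into lookaround assertions. The names translate() and
matches_whole_word() are made up for illustration, and the sketch assumes
ASCII-only word characters, whereas the man page defers to ctype(3) and
therefore to the locale. In a query the same pattern would presumably be
given to the ~ operator.

```python
import re

# Assumption: a word character is ASCII alnum or underscore, following the
# man page wording; with re.ASCII, \w matches exactly [A-Za-z0-9_].
WORD_START = r"(?<!\w)(?=\w)"   # stands in for [[:<:]]
WORD_END = r"(?<=\w)(?!\w)"     # stands in for [[:>:]]


def translate(pattern: str) -> str:
    """Swap the two special bracket expressions for lookaround equivalents."""
    return pattern.replace("[[:<:]]", WORD_START).replace("[[:>:]]", WORD_END)


def matches_whole_word(pattern: str, text: str) -> bool:
    """Return True if the translated pattern matches anywhere in text."""
    return re.search(translate(pattern), text, flags=re.ASCII) is not None


if __name__ == "__main__":
    for sample in ("the cat sat", "concatenate", "cat_1", "cat."):
        print(sample, matches_whole_word("[[:<:]]cat[[:>:]]", sample))
```

Note that "cat_1" does not match, because the underscore counts as a word
character and so there is no word end after "cat".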

                        regards, tom lane
