Hi guys,

2012/11/6 Philip Olson <phi...@roshambo.org>

>
> On Nov 5, 2012, at 8:55 AM, Rasmus Lerdorf wrote:
>
> > On 11/05/2012 08:41 AM, Jean-Sébastien Hedde wrote:
> >> On Mon, 05 Nov 2012 08:04:06 -0800, Rasmus Lerdorf <ras...@lerdorf.com>
> >> wrote:
> >>>
> >>> I think the documentation is wrong on that. In Unicode mode [[:alnum:]]
> >>> actually becomes \p{Xan} which should match Unicode chars as well, but
> >>> only if PCRE was compiled with Unicode support. So I suspect you don't
> >>> actually have a Unicode-capable PCRE build in some cases there.
> >>>
> >>> -Rasmus
> >>
> >> I will report the bug to the package maintainers (remi, debian too...).
> >>
> >> Is there any way for us to avoid those "wrong" builds?
> >
> > I don't see how.
>
>
> Hi geeks,
>
> Does anyone have a suggestion on how the documentation should be
> updated? The quote is from here:
>
>   http://php.net/manual/en/regexp.reference.character-classes.php
>
> With the quote being:
>
>   "In UTF-8 mode, characters with values greater than 128 do
>    not match any of the POSIX character classes."
>
> A few simple/related facts:
>
>   - PCRE_UCP exists as of PCRE 8.10
>   - Gustavo mentioned the related PHP change on Oct 3, 2010 (not sure
>     what PHP version, and googling for "87a237342" turns up empty,
>     and I miss SVN version numbers)
>
> Anyway, how should this be documented?
>
> Regards,
> Philip
>
>
I added PCRE_UCP in PHP 5.3.4 as a fix for bug #52971. [1]

For the documentation, just say something like:

"In Unicode mode, Unicode properties are used instead to classify the
characters of some of the POSIX classes."

More information extracted from PCRE documentation [2]:

---------8<-------------------------------

[:alnum:] becomes \p{Xan}
[:alpha:] becomes \p{L}
[:blank:] becomes \h
[:digit:] becomes \p{Nd}
[:lower:] becomes \p{Ll}
[:space:] becomes \p{Xps}
[:upper:] becomes \p{Lu}
[:word:] becomes \p{Xwd}

Negated versions, such as [:^alpha:] use \P instead of \p.
The other POSIX classes are unchanged, and match only
characters with code points less than 128.

---------------------------------------------
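
To make the substitution concrete, here is a rough Python sketch that rewrites the POSIX class names in a pattern string using the table quoted above. It only manipulates text; it is not how PCRE implements this internally, and the names UCP_EQUIVALENTS and ucp_equivalent are made up for the illustration. Using \H as the negated form of \h is my own extension of the "\P instead of \p" rule.

```python
import re

# Replacement table copied from the PCRE documentation excerpt above.
UCP_EQUIVALENTS = {
    "alnum": r"\p{Xan}",
    "alpha": r"\p{L}",
    "blank": r"\h",
    "digit": r"\p{Nd}",
    "lower": r"\p{Ll}",
    "space": r"\p{Xps}",
    "upper": r"\p{Lu}",
    "word": r"\p{Xwd}",
}

# Matches [:name:] and the negated form [:^name:].
POSIX_CLASS = re.compile(r"\[:(\^?)([a-z]+):\]")


def ucp_equivalent(pattern: str) -> str:
    """Hypothetical helper: show what a pattern's POSIX classes become with UCP."""

    def substitute(match: re.Match) -> str:
        negated = match.group(1) == "^"
        replacement = UCP_EQUIVALENTS.get(match.group(2))
        if replacement is None:
            # Classes outside the table (e.g. punct) are left unchanged.
            return match.group(0)
        if negated:
            # \p{...} becomes \P{...}; \h becomes \H (assumption, see lead-in).
            return replacement[0] + replacement[1].upper() + replacement[2:]
        return replacement

    return POSIX_CLASS.sub(substitute, pattern)


print(ucp_equivalent("[[:alnum:]]"))   # [\p{Xan}]
print(ucp_equivalent("[[:^alpha:]]"))  # [\P{L}]
print(ucp_equivalent("[[:punct:]]"))   # [[:punct:]]
```

The last call shows a class that is not in the table: it passes through untouched, which corresponds to the "other POSIX classes are unchanged" sentence in the excerpt.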


[1] - http://svn.php.net/viewvc/?view=revision&revision=303963
[2] - http://pcre.org/man.txt


-- 
Regards,
Felipe Pena
