Hi
This proposal tries to address
(1) j.u.regex does not meet Unicode regex's Simple Word Boundaries [1]
requirement, as Tom pointed out in his email on the i18n-dev list [2].
Basically we have 3 problems here.
a. The j.u.regex word boundary constructs \b and \B use Unicode
\p{letter} + \p{digit
The flag this request proposes to add is
UNICODE_CHARSET
not the "UNICODE_UNICODE" in my last email.
My apologies for the typo.
Any suggestions for a better name? It was UNICODE_CHARACTERCLASS, but then it
became UNICODE_CHARSET, considering the existing UNICODE_CASE flag.
-Sherman
On 4/23/2011 1:00 AM, Xuemi
The changes sound good. The flag UNICODE_CHARSET will be misleading, since
all of Java uses the Unicode Charset (= encoding). How about:
UNICODE_SPEC
or something that gives that flavor.
Mark
*— Il meglio è l’inimico del bene —*
On Sat, Apr 23, 2011 at 01:12, Xueming Shen wrote:
> The flag
Mark Davis ☕ wrote
on Sat, 23 Apr 2011 09:09:55 PDT:
> The changes sound good.
They sure do, don't they? I'm quite happy about this. I think it is more
important to get this in the queue than that it (necessarily) be done for
JDK7. That said, having a good tr18 RL1 story for JDK7's Unico
Sherman,
The comparison to Perl 5 in the Java Pattern class documentation needs
to be corrected. However, I would not recommend as long a laundry list
of missing features from either side as the following email might imply.
I'm just trying to be complete, but in doing so, it produces a list that
Forwarding...forgot to include the list.
-------- Original Message --------
Subject: Re: Codereview Request: 7039066 j.u.rgex does not match TR#18
RL1.4 Simple Word Boundaries and RL1.2 Properties
Date: Sat, 23 Apr 2011 17:53:42 -0700
From: Xueming Shen
To: Tom Christiansen
Mark,