Thanks Alan! webrev has been updated accordingly.
-Sherman
On 4/27/2011 8:51 AM, Alan Bateman wrote:
Xueming Shen wrote:
:
UNICODE_CHARACTER_CLASS is clear and straightforward. I am OK with it.
The webrev, ccc and api docs have been updated accordingly.
Yes, I still need a reviewer for the implementation changes. Tom has
helped review
the doc (and the definition of those properties).
I've g
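For readers following the naming discussion, here is a minimal sketch of what the flag does under the name settled on above, UNICODE_CHARACTER_CLASS, as it is exposed on java.util.regex.Pattern in Java 7 and later. The class name and the isWord helper are hypothetical and only for illustration; this is not code from the webrev.

```java
import java.util.regex.Pattern;

public class UnicodeCharacterClassDemo {
    // Hypothetical helper: does the whole input consist of \w characters
    // when the pattern is compiled with the given flags?
    static boolean isWord(String input, int flags) {
        return Pattern.compile("\\w+", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String greek = "\u03b1\u03b2\u03b3"; // three Greek lowercase letters

        // Without the flag, \w is the ASCII class [a-zA-Z_0-9].
        System.out.println(isWord(greek, 0));                                // false

        // With the flag, \w follows the Unicode definition of a word character.
        System.out.println(isWord(greek, Pattern.UNICODE_CHARACTER_CLASS)); // true

        // The same flag can be enabled inline with the embedded expression (?U).
        System.out.println(Pattern.matches("(?U)\\w+", greek));              // true
    }
}
```

The flag is opt-in, so existing patterns keep their ASCII behavior unless the caller asks for the Unicode versions of the predefined classes.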
Codereview Request: 7039066 j.u.rgex does not match
TR#18 RL1.4 Simple Word Boundaries and RL1.2 Properties
Date: Sat, 23 Apr 2011 17:53:42 -0700
From: Xueming Shen
<mailto:xueming.s...@oracle.com>
To: Tom Christiansen <mailto:tchr...@perl
On 04-26-2011 2:20 AM, Alan Bateman wrote:
Xueming Shen wrote:
Thanks Mark!
Let's go with UNICODE_PROPERTY, if there is no objection.
I went through the updates to the javadoc and the approach looks good
and nicely done. A minor comment is that the compile(String,int) method
repeats the list of flags that are allowed so that should be
Thanks Tom!
The j.u.regex package does not have its own direct access to PropList for now,
so it has to use the properties from the j.l.Character
class. I will have to move those CharacterDataNN classes from the
java.lang package (package private) to sun.lang
or somewhere that both j.l.Character and j.u.regex
On 4/23/2011 6:50 PM, Xueming Shen wrote:
>
> Forwarding...forgot to include the list.
>
> Original Message Subject: Re: Codereview Request:
> 7039066 j.u.rgex does not match TR#18 RL1.4 Simple Word Boundaries and RL1.2
> Properties Date: Sat, 23 Apr 2011 17:53:42 -0
Xueming, the docs look good.
On the name of the flag, I have no strong feelings one way or the other.
Perhaps between UNICODE_PROPERTIES and UNICODE_CLASSES, I would prefer
the second one. The first makes me think of the regular properties like
\p{Script=Greek} from RL1.2, not the compat proper
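The distinction drawn here, between regular properties such as a script and the compatibility classes the new flag affects, can be sketched as follows. This assumes Java 7 or later, where both the script syntax and Pattern.UNICODE_CHARACTER_CLASS are available; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class PropertyKindsDemo {
    // A regular Unicode property from RL1.2: the script of each character.
    static boolean isGreekScript(String input) {
        return Pattern.matches("\\p{script=Greek}+", input);
    }

    // A POSIX-style class whose meaning depends on the flag under discussion.
    static boolean isAlpha(String input, int flags) {
        return Pattern.compile("\\p{Alpha}+", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String greek = "\u03b1\u03b2\u03b3"; // three Greek lowercase letters

        // Script properties work with or without the flag.
        System.out.println(isGreekScript(greek));  // true
        System.out.println(isGreekScript("abc"));  // false

        // \p{Alpha} is ASCII-only by default and Unicode-aware with the flag.
        System.out.println(isAlpha(greek, 0));                                // false
        System.out.println(isAlpha(greek, Pattern.UNICODE_CHARACTER_CLASS)); // true
    }
}
```

Only the second pair of calls changes with the flag, which is the reason a name suggesting "all Unicode properties" could mislead.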
Simple Word Boundaries and RL1.2 Properties
Date: Sat, 23 Apr 2011 17:53:42 -0700
From: Xueming Shen
To: Tom Christiansen
Mark, Tom,
I agree with Mark that UNICODE_SPEC is a better name than
UNICODE_CHARSET. We will have to deal with
the "compatibility" issue Tom mentio
Forwarding...forgot to include the list.
Original Message
Subject: Re: Codereview Request: 7039066 j.u.rgex does not match TR#18
RL1.4 Simple Word Boundaries and RL1.2 Properties
Date: Sat, 23 Apr 2011 17:53:42 -0700
From: Xueming Shen
To: Tom Christiansen
Mark
Mark Davis ☕ wrote
on Sat, 23 Apr 2011 09:09:55 PDT:
> The changes sound good.
They sure do, don't they? I'm quite happy about this. I think it is more
important to get this in the queue than that it (necessarily) be done for
JDK7. That said, having a good tr18 RL1 story for JDK7's Unico
The changes sound good. The flag UNICODE_CHARSET will be misleading, since
all of Java uses the Unicode Charset (= encoding). How about:
UNICODE_SPEC
or something that gives that flavor.
Mark
*— Il meglio è l’inimico del bene —*
On Sat, Apr 23, 2011 at 01:12, Xueming Shen wrote:
> The flag
The flag this request proposed to add is
UNICODE_CHARSET
not the "UNICODE_UNICODE" in last email.
My apology for the typo.
Any suggestion for a better name? It was UNICODE_CHARACTERCLASS, but then it
became UNICODE_CHARSET, considering the unicode_case.
-Sherman
On 4/23/2011 1:00 AM, Xuemi
Hi
This proposal tries to address
(1) j.u.regex does not meet Unicode regex's Simple Word Boundaries [1]
requirement as Tom pointed
out in his email on i18n-dev list [2]. Basically we have 3 problems here.
a. j.u.regex word boundary constructs \b and \B use Unicode
\p{letter} + \p{digit
Sherman wrote:
> Regarding RL1.4.(1), the U+200C and U+2000 are obviously a bug: the
> Java regex implementation was not updated to sync with the tr#18
> update. It appears these two don't "exist" in RL1.4/v9, and neither
> does RL1.2a, the compatibility properties.
> The words for 1.4(1)
Sherman wrote:
> Thanks for the detailed and excellent "reality check". While I'm still
> going through all the details it appears that the fact the current
> Java Unicode property data does not include the properties defined in
> PropList.txt (current implementation reads the property data only f
them in details,
file corresponding
bug/rfe into our database and then follow up from there.
-Sherman
On 1-23-2011 11:44 AM, Tom Christiansen wrote:
Java does not meet this requirement. Specifically, it
does not offer a mechanism for stipulation #1 cited below:
RL1.4 Simple Word Boundaries
To meet this requirement, an implementation shall extend the
word boundary mechanism so that:
(1) The class of <word_character> includes all the
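As a rough illustration of the word boundary requirement quoted above, the sketch below tokenizes text with \b and \w under Pattern.UNICODE_CHARACTER_CLASS, the flag that came out of this thread (Java 7 or later). The class name, the words helper, and the sample text are assumptions made for the example. Default behavior without the flag is deliberately not shown, since it is the point under dispute here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Hypothetical helper: collect every run of word characters that is
    // delimited by word boundaries, using the Unicode definitions.
    static List<String> words(String text) {
        Matcher m = Pattern
                .compile("\\b\\w+\\b", Pattern.UNICODE_CHARACTER_CLASS)
                .matcher(text);
        List<String> out = new ArrayList<String>();
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Two words containing precomposed non-ASCII letters (U+00EF, U+00E9).
        List<String> found = words("na\u00efve caf\u00e9");
        System.out.println(found.size()); // 2
    }
}
```

With the flag set, the accented letters count as word characters, so each word comes back as a single token rather than being split at the non-ASCII letter.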