Re: Now what?

2011-01-26 Thread Tom Christiansen
Sherman wrote: > The CR# so far I have are > 7014645: Support Perl style Unicode hex notation \x{...} > 7014633: Support loose matching for both abbreviated and longer names of > Unicode property > 7014640: Add meta character for line ending '\R' > It might take a couple days(?) for these CR# to

Re: Now what?

2011-01-26 Thread Xueming Shen
On 1.27.2011 3:09, Tom Christiansen wrote: 7006289: java.util.regex yields nonsense by breaking the connection between \b and \w Category java:classes_util State 1-Dispatched, bug Priority: 4-Low Submit Date 12-DEC-2010 7006291: Java claims to support Unicode p

Re: RL1.1 Hex Notation (part 2 of 3)

2011-01-26 Thread Tom Christiansen
Mark wrote: > The Unicode Standard distinguishes between Unicode Strings (16-bit) and > UTF-16. In the former, which is often the form used in programming > languages, a singleton value of 0xD800..0xDFFF is allowed, and is treated > as if it were a reserved code point. Ahah! "Unicode Strings (16

Re: RL1.1 Hex Notation

2011-01-26 Thread Mark Davis ☕
> I guess you are asking for something like? I'm not asking for that. What I'm saying is that as far as I can tell, there is no way in Java to meet the terms of RL1.1, because there is not a way to use hex numbers in any syntax for values above to indicate literals. That is, if you supply "ab

Re: RL1.7 Code Points

2011-01-26 Thread Tom Christiansen
On Monday, 24 January 2011 at 14:39:59 +0900, Masayoshi Okutsu wrote >>> Are you talking about unpaired surrogates or something else? >> Yes, I am talking about unpaired surrogates. > I believe each code unit of UTF-16 gets converted to its code point. So, > an unpaired surrogate gets conver

Possible error in tr18?

2011-01-26 Thread Tom Christiansen
Under the RL2.2 link of tr18, there appears to be an error: C2. An implementation claiming conformance to Level 2 of this specification shall satisfy C1, and meet the requirements described in the following sections: RL2.1 Canonical Equivalents RL2.2 Extended G

Re: UTS18 clarifications

2011-01-26 Thread Tom Christiansen
Mark wrote: > We are coming up to a quarterly Unicode Technical Committee meeting > (starting Feb 7), so there is the opportunity to make requests / proposals > about UTS18. In particular, if there are areas of the spec that are unclear > or features that people would like to see added or changed,

Re: RL1.1 Hex Notation

2011-01-26 Thread Xueming Shen
On 01/26/2011 11:50 AM, Mark Davis ☕ wrote: > I guess you are asking for something like? I'm not asking for that. What I'm saying is that as far as I can tell, there is no way in Java to meet the terms of RL1.1, because there is not a way to use hex numbers in any syntax for values above

Re: RL1.1 Hex Notation

2011-01-26 Thread Mark Davis ☕
Ok, now I understand. With that change, the situation is much better. It doesn't fully satisfy RL1.1, because you can't use hex codepoint numbers -- you have to use the fairly ugly workaround of String hexPattern = codePoint <= 0x ? String.format("\\u%04x", codePoint) : String.format("
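
For readers following the excerpt above, which is cut off mid-expression, here is a minimal sketch of the kind of workaround being described: building a `\u` escape for a code point by hand. The class and method names are hypothetical, and the 0xFFFF threshold and the surrogate-pair branch are assumptions filled in for illustration, not a reconstruction of what the original message contained.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class HexPatternSketch {

    // Returns a regex escape for a single code point.
    // Assumption: BMP code points get one \\uXXXX escape; supplementary
    // code points get two escapes, one per UTF-16 surrogate.
    static String hexPattern(int codePoint) {
        if (codePoint <= 0xFFFF) {
            return String.format("\\u%04x", codePoint);
        }
        // Character.toChars yields the high and low surrogate for a
        // supplementary code point.
        char[] units = Character.toChars(codePoint);
        return String.format("\\u%04x\\u%04x", (int) units[0], (int) units[1]);
    }

    public static void main(String[] args) {
        int codePoint = 0x1D11E; // MUSICAL SYMBOL G CLEF
        String target = new String(Character.toChars(codePoint));
        String pattern = hexPattern(codePoint);
        System.out.println(pattern);
        System.out.println(Pattern.matches(pattern, target));
    }
}
```

The point of the sketch is only that the caller has to know about UTF-16 surrogates to write the pattern, which is the ugliness being complained about; a `\x{...}` style notation would take the code point number directly.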