Sherman wrote:

> The Unicode/java version of lowercase, uppercase, whitespace
> and letter character classes are provided via \p{javaXYZ},
I'm afraid that is *not* true; please see part 2.

> and the \p{Lower/Upper/Alpha/Space} are specified/implemented
> for POSIX version, which is clearly documented in the API
> document. I would not use "worst" for this. I don't think the
> "conformance" requests the implementation to use exactly the
> name specified in standard.

> The following classes/properties are actually
> supported/implemented, while only the \p{javaLowerCase},
> \p{javaUpperCase}, \p{javaWhitespace} and \p{javaMirrored} are
> explicitly documented in Pattern API, the rest are covered by
> notes as "Categories that behave like the java.lang.Character
> boolean ismethodname methods are available through the same
> \p{prop} syntax..."

> \p{javaLowerCase}
> \p{javaUpperCase}
> \p{javaTitleCase}
> \p{javaDigit}
> \p{javaDefined}
> \p{javaLetter}
> \p{javaLetterOrDigit}
> \p{javaJavaIdentifierStart}
> \p{javaJavaIdentifierPart}
> \p{javaUnicodeIdentifierStart}
> \p{javaUnicodeIdentifierPart}
> \p{javaIdentifierIgnorable}
> \p{javaSpaceChar}
> \p{javaWhitespace}
> \p{javaISOControl}
> \p{javaMirrored}

Last I checked there was also a \p{javaJavaIdentifierPart}, which is
pretty silly. I think.

> It appears the "noncharacter_cp" and "default_ignorable_cp" are
> missing from the list, will take a look later, but I guess
> these 2 are really not that "significant".

They are two of the eleven properties which must be supported to meet
RL1.2 compliance, and therefore Level 1 compliance. Having access to
the real Unicode properties is more important than having these java
versions, which don't work right. See part 2, please.

--tom
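
P.S. For anyone following along, here is a minimal sketch of the distinction being argued over. It assumes a stock java.util.regex.Pattern with no flags set. The class name PropertyClassDemo and the matches helper are made up for illustration. It only shows that the POSIX-style \p{Lower} is ASCII-only, while \p{javaLowerCase} defers to java.lang.Character and \p{Ll} is the Unicode general category. It does not settle whether the java versions agree with the real Unicode properties, which is the point in dispute.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
class PropertyClassDemo {

    // True when the entire string s matches the given regex.
    static boolean matches(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9"; // LATIN SMALL LETTER E WITH ACUTE

        // POSIX class, US-ASCII only when no flags are set.
        System.out.println(matches("\\p{Lower}", eAcute));         // false

        // Behaves like java.lang.Character.isLowerCase.
        System.out.println(matches("\\p{javaLowerCase}", eAcute)); // true

        // Unicode general category Ll (Lowercase_Letter).
        System.out.println(matches("\\p{Ll}", eAcute));            // true
    }
}
```

The same pattern of comparison should work for the Upper, Alpha and Space classes against their java counterparts, with a non-ASCII test character chosen to suit each one.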