On Sun, 3 Sep 2023 12:33:18 GMT, Claes Redestad <redes...@openjdk.org> wrote:

> The two odd codepoints I was curious about are `0xaa` and `0xba`, both of
> which are lower-case according to `Character.isLowerCase(..)` but do not
> actually have an uppercase. The Unicode data categorize these two as `Lo`,
> Letter, other, so I'm a little confused how they got tagged as lowercase.
>
> `Character.toUpperCaseEx` is specified as adhering to the definition of the
> Unicode data (unlike some legacy Java character definition that might differ
> subtly) so perhaps it's reasonable to specify this newly invented
> `isLowerCaseEx` as strictly adhering to the Unicode data, in which case I
> think `0xaa` and `0xba` should not be considered lower case. I am not a
> domain expert and would like someone like @naotoj to weigh in here. But
> either way we should think about how to specify this kind of method to keep
> it precise. Even if it's only internal code.
>
> I suggested `hasUpperCase` (or maybe `hasUpperCaseEx`) as a way out of this
> particular conundrum, since it makes perfect sense to define a method named
> like that to be equivalent to `return cp !=
> CharacterDataLatin1.instance.toUpperCaseEx(cp);`

I have renamed `isLowerCaseEx` to `hasNotUpperCaseEx`. Is this OK?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14751#issuecomment-1704360024
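
For anyone who wants to reproduce the mismatch discussed in the quoted message, here is a minimal sketch using only public API. The class name `Latin1CaseProbe` and its `hasUpperCase` helper are hypothetical, and the helper relies on the public `Character.toUpperCase(int)` as a stand-in, since `CharacterDataLatin1.toUpperCaseEx` is internal and may behave differently for code points with special mappings.

```java
public class Latin1CaseProbe {

    // Hypothetical stand-in for the helper suggested in the thread.
    // Assumption: the public Character.toUpperCase(int) is used here instead of
    // the internal CharacterDataLatin1.instance.toUpperCaseEx(cp), so results
    // may differ for code points that have special upper-case mappings.
    static boolean hasUpperCase(int cp) {
        return cp != Character.toUpperCase(cp);
    }

    public static void main(String[] args) {
        // 0xaa and 0xba are the two code points from the discussion;
        // 'a' is included as an ordinary lower-case letter for comparison.
        int[] samples = {0xaa, 0xba, 'a'};
        for (int cp : samples) {
            System.out.printf("U+%04X isLowerCase=%b hasUpperCase=%b%n",
                    cp, Character.isLowerCase(cp), hasUpperCase(cp));
        }
    }
}
```

Under this sketch the two code points report as lower case while having no distinct upper-case form, which is the gap that a name based on "has an upper case" is meant to sidestep.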