Just a quick followup. This bug, with equalsIgnoreCase working only within the BMP, went undetected all the way up through Unicode 3.1. That's when the Deseret script was introduced, which is a case-changing script outside the BMP. That was more than 10 years ago now. Obviously no one is screaming about it, but we never know what will happen in the future, and there is no reason for Java to misbehave on any cased code points that are someday added outside the BMP. Best to future-proof it.
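To make the complaint concrete, here is a rough sketch of what a code-point-wise comparison could look like. The class and method names (CodePointCompare, equalsIgnoreCaseByCodePoint) are made up for illustration and are not part of the JDK; only Character.toUpperCase(int), Character.toLowerCase(int), Character.charCount and String.codePointAt are real library calls. It still uses simple one-to-one case mappings, so it fixes the BMP-only problem and nothing more.

```java
public final class CodePointCompare {
    private CodePointCompare() {}

    /**
     * Hypothetical helper: compares two strings case-insensitively,
     * stepping by code point instead of by char.
     */
    public static boolean equalsIgnoreCaseByCodePoint(String a, String b) {
        if (a == b) return true;
        if (a == null || b == null) return false;
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            // Advance by one or two chars, depending on the code point.
            i += Character.charCount(ca);
            j += Character.charCount(cb);
            if (ca == cb) continue;
            // Simple (one-to-one) case mappings on full code points.
            int ua = Character.toUpperCase(ca);
            int ub = Character.toUpperCase(cb);
            if (ua == ub) continue;
            if (Character.toLowerCase(ua) == Character.toLowerCase(ub)) continue;
            return false;
        }
        // Equal only if both strings were consumed completely.
        return i == a.length() && j == b.length();
    }

    public static void main(String[] args) {
        String upper = new String(Character.toChars(0x10400)); // DESERET CAPITAL LETTER LONG I
        String lower = new String(Character.toChars(0x10428)); // DESERET SMALL LETTER LONG I
        System.out.println(equalsIgnoreCaseByCodePoint(upper, lower));
        // What the built-in method prints here depends on the JDK in use.
        System.out.println(upper.equalsIgnoreCase(lower));
    }
}
```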
Apparently there was never any organized code inspection of the core Java libraries to find everything that processes Strings char by char and make it process them by code point instead, unless it really and truly made no difference, which in this case it does. That surprises me. This would also have been caught by an extensive test suite that tried all code points for various things.
Even processing strings by code point doesn't give the best results. I'd rather like to see a way to disregard lengths and compare the two strings' full casefolds instead. However, I recognize that that has performance impacts at the very least and perhaps compatibility ones as well, so arguably a new and different method might be a more appropriate solution if that route were deemed sufficiently desirable.
The problem is that this is unreasonably hard to implement on one's own without a method that produces a string's casefold. Because of this, I believe Java needs a String method that returns the full casefold of that string, and perhaps, for performance reasons, also a Character method that takes a code point and returns only its simple casefold.
I don't know how locales enter into that, either. There is room in the casefolding rules for locale-specific behavior like Turkic, since Turkic text gets a different (full) casefold in that locale.
--tom
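To illustrate the point about lengths: absent a real casefold method, about the closest one can get with the existing API is to uppercase and then lowercase both strings and compare the results. The sketch below does that. The names ApproxCaseFold, approxFold and equalsFolded are invented for this example, and the upper-then-lower trick is only an approximation of Unicode full case folding, not a substitute for it. Locale.ROOT is an assumption too; it deliberately leaves out any Turkic tailoring.

```java
import java.util.Locale;

public final class ApproxCaseFold {
    private ApproxCaseFold() {}

    /**
     * Hypothetical stand-in for a full casefold: uppercase, then lowercase.
     * String.toUpperCase applies full (one-to-many) mappings, so the
     * result may be longer than the input.
     */
    public static String approxFold(String s) {
        return s.toUpperCase(Locale.ROOT).toLowerCase(Locale.ROOT);
    }

    /** Compares the approximate folds, so differing lengths are no obstacle. */
    public static boolean equalsFolded(String a, String b) {
        return approxFold(a).equals(approxFold(b));
    }

    public static void main(String[] args) {
        String a = "Stra\u00DFe"; // contains U+00DF LATIN SMALL LETTER SHARP S
        String b = "STRASSE";
        System.out.println(a.equalsIgnoreCase(b)); // lengths differ: 6 vs 7
        System.out.println(equalsFolded(a, b));    // both fold to "strasse"
    }
}
```

Every comparison here allocates two or more temporary strings, which is exactly the sort of performance cost that argues for a dedicated method rather than a change to equalsIgnoreCase itself.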