This PR continues the efforts from #12632 to speed up case-insensitive string 
matching.

We now tackle case-insensitive comparison of mixed-coder strings, implemented 
in `StringLatin1.regionMatchesCI_UTF16`.

Key insights:

- If the UTF16 code point is also in the latin1 range, we can leverage the 
improvements from #12632 directly by calling `CharacterDataLatin1.equalsIgnoreCase`.
- There are exactly 7 non-latin1 Unicode code points which case-fold into the 
latin1 range. We can special-case the comparison of these code points by adding 
the method `CharacterDataLatin1.latin1CaseFold`.
- To avoid checking `a == b` twice, this check is lifted out of 
`CharacterDataLatin1.equalsIgnoreCase`, and its two callers are updated to check 
that `a != b` before calling the method.
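
As a rough illustration of the hoisted equality check and the cross-range matches described above, here is a minimal sketch. It is not the code in this PR: the class `MixedCoderCompareSketch` and the helper `matchesIgnoreCase` are hypothetical names, and the slow path simply uses the per-character rule documented for `String.regionMatches` instead of `CharacterDataLatin1.equalsIgnoreCase` or `latin1CaseFold`.

```java
public class MixedCoderCompareSketch {

    // Reference semantics: the per-character rule documented for
    // String.regionMatches(ignoreCase = true, ...).
    static boolean referenceEqualsIgnoreCase(char a, char b) {
        return Character.toLowerCase(Character.toUpperCase(a))
            == Character.toLowerCase(Character.toUpperCase(b));
    }

    // Hypothetical helper (an assumption for illustration, not a JDK method):
    // compares one latin1-coded byte against one UTF16-coded char.
    static boolean matchesIgnoreCase(byte latin1, char utf16) {
        char a = (char) (latin1 & 0xff);   // widen without sign extension
        if (a == utf16) {
            return true;                   // equality is checked once, up front
        }
        // Slow path: only reached when a != utf16.
        return referenceEqualsIgnoreCase(a, utf16);
    }

    public static void main(String[] args) {
        // KELVIN SIGN (U+212A) is outside latin1 but matches 'k'.
        System.out.println(matchesIgnoreCase((byte) 'k', '\u212A')); // true
        // Dotless and dotted Turkic i's against uppercase ASCII 'I'.
        System.out.println(matchesIgnoreCase((byte) 'I', '\u0131')); // true
        System.out.println(matchesIgnoreCase((byte) 'I', '\u0130')); // true
        // A plain mismatch.
        System.out.println(matchesIgnoreCase((byte) 'a', '\u0394')); // false
    }
}
```

Putting the equality test first means identical code points, the most common matching case, never reach the case-mapping calls.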
 
For completeness, the RegionMatches test is updated to also compare Turkic 
dotted/dotless 'i's against the uppercase ASCII 'I', not just the lowercase 
one. This is not strictly related to the purpose of this PR, but it did help 
catch a regression introduced in an earlier iteration of the PR.

To guard against regressions caused by future changes to the set of Unicode 
code points folding into latin1, a new test is added to `EqualsIgnoreCase` 
which identifies all such code points and verifies they are compared correctly.

Performance is tested for matching and mismatching cases of selected code point 
pairs picked from the ASCII letter, ASCII number, latin1 letter and non-latin1 
Unicode letter ranges. Results are in the first comment below.

-------------

Commit messages:
 - Inline local variable
 - latin1CaseFold was moved to CharacterDataLatin1
 - Move latin1CaseFold to CharacterDataLatin1
 - Improve latin1CaseFold javadocs
 - Simplify comments
 - Prefer fast matching by comparing for equality before checking latin1 range
 - Improve Javadocs of latin1CaseFold
 - Be consistent in comments
 - CharacterData.latin1LowerCase was renamed to latin1CaseFold
 - Hoist equality check out of CharacterDataLatin1.equalsIgnoreCase
 - ... and 13 more: https://git.openjdk.org/jdk/compare/f2b03f9a...92755920

Changes: https://git.openjdk.org/jdk/pull/12637/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12637&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8302872
  Stats: 169 lines in 5 files changed: 155 ins; 2 del; 12 mod
  Patch: https://git.openjdk.org/jdk/pull/12637.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12637/head:pull/12637

PR: https://git.openjdk.org/jdk/pull/12637
