On Thu, 31 Jul 2025 22:04:56 GMT, Naoto Sato <na...@openjdk.org> wrote:
>> Enabling lenient minus sign matching when parsing numbers. In some locales, >> e.g. Finnish, the default minus sign is the Unicode "Minus Sign" (U+2212), >> which is not the "Hyphen Minus" (U+002D) that users type in from keyboard. >> Thus the parsing of user input numbers may fail. This change utilizes CLDR's >> `parseLenient` element for minus signs and loosely matches them with the >> hyphen-minus so that user input numbers can parse. As this is a behavioral >> change, a corresponding CSR has been drafted. > > Naoto Sato has updated the pull request incrementally with one additional > commit since the last revision: > > Address review comments src/java.base/share/classes/java/text/DecimalFormat.java line 3544: > 3542: var t = text.substring(position, Math.min(tlen, position + > alen)) > 3543: .replaceAll(lmsp, "-"); > 3544: return t.regionMatches(0, a, 0, alen); I presume the original solution was regex for all cases, and you made the single char case for the fast path. If you still have the perf benchmark handy, I wonder how a char by char approach would compare instead of regex. It would also simplify the code since it would be combining the slow and fast path. Although the perf is variable based on the affix string length vs lms length (11). int i = 0; for (; position + i < Math.min(tlen, position + alen); i++) { int tIndex = lms.indexOf(text.charAt(position + i)); int aIndex = lms.indexOf(affix.charAt(i)); // Non LMS. Match direct if (tIndex < 0 && aIndex < 0) { if (text.charAt(position + i) != affix.charAt(i)) { return false; } } else { // By here, at least one LMS. Ensure both LMS. if (tIndex < 0 || aIndex < 0) { return false; } } } // Return true if entire affix was matched return i == alen; test/jdk/java/text/Format/NumberFormat/LenientMinusSignTest.java line 120: > 118: public void testLenientPrefix(String sign) throws ParseException > { > 119: var df = new DecimalFormat(PREFIX, DFS); > 120: df.setStrict(false); No need to set strict as false since `df` is lenient by default. If it was done for clarity, I think the method name already adequately indicates it is a lenient style test. Applies to CNF test as well. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2246593124 PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2246491998