On Tue, 13 Jun 2023 17:56:57 GMT, Naoto Sato <na...@openjdk.org> wrote:
>> This is stemming from the PR: https://github.com/openjdk/jdk/pull/14211 >> where aggressive GC can cause NPE in `BaseLocale$Key` class. I refactored >> the in-house cache with WeakHashMap, and removed the Key class as it is no >> longer needed (thus the original NPE will no longer be possible). Also with >> the new JMH test case, it gains some performance improvement: >> >> (w/o fix) >> >> Benchmark Mode Cnt Score Error Units >> LocaleCache.testForLanguageTag avgt 20 5781.275 ± 569.580 ns/op >> LocaleCache.testLocaleOf avgt 20 62564.079 ± 406.697 ns/op >> >> (w/ fix) >> Benchmark Mode Cnt Score Error Units >> LocaleCache.testForLanguageTag avgt 20 4801.175 ± 371.830 ns/op >> LocaleCache.testLocaleOf avgt 20 60394.652 ± 352.471 ns/op > > Naoto Sato has updated the pull request incrementally with one additional > commit since the last revision: > > Addressing review comments src/java.base/share/classes/sun/util/locale/BaseLocale.java line 168: > 166: // BaseLocale as the key. The returned "normalized" instance > 167: // can subsequently be used by the Locale instance which > 168: // guarantees the locale components are properly cased/interned. Is it truly important that the String values are interned, or is the desire simply that all refer to the same "canonical" String instance? my thought is that managing the canonical String instances could avoid WeakReferences and synchronized map access and just return a new BaseLocale each time. public static BaseLocale getInstance(String language, String script, String region, String variant) { if (script == null) { script = ""; } else { script = getCanonicalString(script, LocaleUtils::toTitleString); } if (region == null) { region = ""; } else { region = getCanonicalString(region, LocaleUtils::toUpperString); } if (language == null) { language = ""; } else { language = getCanonicalString(language, LocaleUtils::toLowerString); } if (variant == null) { variant = ""; } else { variant = getCanonicalString(variant, Function.identity()); } ... return new BaseLocale(language, script, region, variant); } private static final ConcurrentMap<String, String> CANONICALIZED_STRINGS = new ConcurrentHashMap<>(); static { // prime the old iso codes CANONICALIZED_STRINGS.put("he", "he"); CANONICALIZED_STRINGS.put("iw", "iw"); CANONICALIZED_STRINGS.put("id", "id"); CANONICALIZED_STRINGS.put("in", "in"); CANONICALIZED_STRINGS.put("ji", "ji"); CANONICALIZED_STRINGS.put("yi", "yi"); } private static String getCanonicalString(String value, Function<String, String> conversion) { String existing = CANONICALIZED_STRINGS.get(value); if (existing != null) { return existing; } String converted = conversion.apply(value); return CANONICALIZED_STRINGS.merge(converted, converted, (k,v) -> v); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14404#discussion_r1232205527