# HashMap.putAll() optimizations: Eliminating Megamorphic Call Site Bottlenecks
## Summary This PR addresses performance bottlenecks in `HashMap.putMapEntries()` by implementing direct optimizations for specific input types: `j.u.HashMap` and `j.u.Collections$UnmodifiableMap`. The optimizations target `HashMap(Map)` constructor and `putAll()` operations based on the real-world megamorphic behavior identified in [JDK-8368292](https://bugs.openjdk.org/browse/JDK-8368292), delivering significant performance improvements when multiple `Map` subtypes are used. ## Problem Context ### Megamorphic Call Site Overhead in Map Iteration `HashMap.putMapEntries()` currently uses a generic approach that suffers from megamorphic call site overhead when applications perform bulk creation or population of HashMaps from various source map types: 1. `m.entrySet()` becomes megamorphic across different map implementations 2. `entrySet().iterator()` creates different iterator types 3. `entry.getKey()` and `entry.getValue()` calls vary by map type 4. Individual `putVal()` calls for each entry When the source is `Collections$UnmodifiableMap`, the problem is compounded by megamorphic wrappers around the already-megamorphic iteration methods. In cases where the unwrapped map is also a HashMap, both the wrapper overhead and the iteration overhead can be eliminated with a single optimization. ## Optimized Methods ### HashMap - **`putMapEntries(Map<? extends K, ? extends V> m, boolean evict)`**: Added fast paths for UnmodifiableMap unwrapping and HashMap-to-HashMap copying - **`putMapEntries(HashMap<? extends K, ? extends V> src, boolean evict)`**: copies HashMap-to-HashMap via direct Node processing. Avoids polymorphic issues and eliminates redundant calls to HashMap.hash(). ## Implementation Details ### HashMap-to-HashMap Fast Path Directly iterates over `src.table` to eliminate entrySet() allocation and polymorphic iterator calls, using the internal Node structure for maximum efficiency. ### UnmodifiableMap Unwrapping Detects UnmodifiableMap instances and accesses the underlying map directly via the `m` field, eliminating wrapper-induced megamorphic call sites. UnmodifiableMap visibility changed from `private` to package-private to enable this direct access. ## Performance Impact | Source Type | Size | Poisoned | Baseline | Optimized | Improvement | |-------------|------|----------|----------|-----------|-------------| | HashMap | 0 | true | 9.429 ns/op | 8.875 ns/op | **6% faster** | | HashMap | 5 | true | 181.645 ns/op | 80.012 ns/op | **56% faster** | | HashMap | 25 | true | 796.415 ns/op | 283.643 ns/op | **64% faster** | | TreeMap | 0 | true | 9.356 ns/op | 9.155 ns/op | **-** | | TreeMap | 5 | true | 169.233 ns/op | 169.566 ns/op | **-** | | TreeMap | 25 | true | 792.667 ns/op | 791.195 ns/op | **-** | | ConcurrentHashMap | 0 | true | 9.275 ns/op | 9.156 ns/op | **-** | | ConcurrentHashMap | 5 | true | 229.291 ns/op | 227.382 ns/op | **-** | | ConcurrentHashMap | 25 | true | 1071.336 ns/op | 1105.111 ns/op | **-** | | UnmodifiableMap(HashMap) | 0 | true | 13.563 ns/op | 9.253 ns/op | **32% faster** | | UnmodifiableMap(HashMap) | 5 | true | 318.184 ns/op | 76.264 ns/op | **76% faster** | | UnmodifiableMap(HashMap) | 25 | true | 1510.489 ns/op | 284.001 ns/op | **81% faster** | | UnmodifiableMap(TreeMap) | 5 | true | 323.412 ns/op | 186.990 ns/op | **42% faster** | | UnmodifiableMap(TreeMap) | 25 | true | 1529.959 ns/op | 809.125 ns/op | **47% faster** | This shows substantial improvements (32-81%) for optimized paths while maintaining no significant regression for non-optimized paths. ### Call Site Poisoning Impact | Source Type | Size | Unpoisoned | Poisoned | Degradation | |-------------|------|------------|----------|-------------| | HashMap | 0 | 4.949 ns/op | 9.429 ns/op | 91% slower | | HashMap | 5 | 86.796 ns/op | 181.645 ns/op | 109% slower | | HashMap | 25 | 322.196 ns/op | 796.415 ns/op | 147% slower | | TreeMap | 0 | 5.043 ns/op | 9.356 ns/op | 86% slower | | TreeMap | 5 | 102.329 ns/op | 169.233 ns/op | 65% slower | | TreeMap | 25 | 433.668 ns/op | 792.667 ns/op | 83% slower | | ConcurrentHashMap | 0 | 5.054 ns/op | 9.275 ns/op | 84% slower | | ConcurrentHashMap | 5 | 135.616 ns/op | 229.291 ns/op | 69% slower | | ConcurrentHashMap | 25 | 591.134 ns/op | 1071.336 ns/op | 81% slower | | UnmodifiableMap(HashMap) | 0 | 5.319 ns/op | 13.563 ns/op | 155% slower | | UnmodifiableMap(HashMap) | 5 | 146.471 ns/op | 318.184 ns/op | 117% slower | | UnmodifiableMap(HashMap) | 25 | 393.053 ns/op | 1510.489 ns/op | 284% slower | | UnmodifiableMap(TreeMap) | 0 | 5.331 ns/op | 13.539 ns/op | 154% slower | | UnmodifiableMap(TreeMap) | 5 | 78.215 ns/op | 323.412 ns/op | 314% slower | | UnmodifiableMap(TreeMap) | 25 | 332.189 ns/op | 1529.959 ns/op | 361% slower | ## FAQ **Q: Why use exact class check instead of instanceof for HashMap?** A: This prevents the fast path from becoming megamorphic if HashMap subclasses are common, ensuring the optimization remains effective. Also, subclasses might override behavior in ways that make direct table access incorrect. **Q: Is it safe to make UnmodifiableMap package-private?** A: Yes. The class remains non-public and is only accessible within java.util. The change enables internal optimizations without exposing implementation details to external code. **Q: Are these optimizations safe?** A: Yes. HashMap-to-HashMap copying maintains identical behavior, and UnmodifiableMap unwrapping accesses the exact same data through a more direct path. **Q: What about other Map types?** A: Profiling data shows these are the biggest opportunities. Others exist but are lower-priority. **Q: How do these optimizations interact with existing code?** A: These are internal implementation optimizations that maintain full API compatibility. No external code changes are required. ## Testing Comprehensive unit tests added to MOAT.java covering: - HashMap.putAll() with various source map types - UnmodifiableMap handling in putAll operations - Correctness verification for all optimization paths Tier-1 tests passed. ------------- Commit messages: - Bug fix & unit test - fixing whitespace - Optimizing HashMap.putAll() and .<init> for HashMap and C$UM Changes: https://git.openjdk.org/jdk/pull/28243/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28243&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371656 Stats: 235 lines in 4 files changed: 230 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/28243.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28243/head:pull/28243 PR: https://git.openjdk.org/jdk/pull/28243
