# HashMap.putAll() optimizations: Eliminating Megamorphic Call Site Bottlenecks

## Summary

This PR addresses performance bottlenecks in `HashMap.putMapEntries()` by 
implementing direct optimizations for specific input types: `j.u.HashMap` and 
`j.u.Collections$UnmodifiableMap`. The optimizations target `HashMap(Map)` 
constructor and `putAll()` operations based on the real-world megamorphic 
behavior identified in 
[JDK-8368292](https://bugs.openjdk.org/browse/JDK-8368292), delivering 
significant performance improvements when multiple `Map` subtypes are used.

## Problem Context

### Megamorphic Call Site Overhead in Map Iteration
`HashMap.putMapEntries()` currently uses a generic approach that suffers from 
megamorphic call site overhead when applications perform bulk creation or 
population of HashMaps from various source map types:

1. `m.entrySet()` becomes megamorphic across different map implementations
2. `entrySet().iterator()` creates different iterator types
3. `entry.getKey()` and `entry.getValue()` calls vary by map type
4. Individual `putVal()` calls for each entry

When the source is `Collections$UnmodifiableMap`, the problem is compounded by 
megamorphic wrappers around the already-megamorphic iteration methods. In cases 
where the unwrapped map is also a HashMap, both the wrapper overhead and the 
iteration overhead can be eliminated with a single optimization.

## Optimized Methods

### HashMap
- **`putMapEntries(Map<? extends K, ? extends V> m, boolean evict)`**: Added 
fast paths for UnmodifiableMap unwrapping and HashMap-to-HashMap copying
- **`putMapEntries(HashMap<? extends K, ? extends V> src, boolean evict)`**: 
copies HashMap-to-HashMap via direct Node processing.  Avoids polymorphic 
issues and eliminates redundant calls to HashMap.hash().

## Implementation Details

### HashMap-to-HashMap Fast Path
Directly iterates over `src.table` to eliminate entrySet() allocation and 
polymorphic iterator calls, using the internal Node structure for maximum 
efficiency.

### UnmodifiableMap Unwrapping
Detects UnmodifiableMap instances and accesses the underlying map directly via 
the `m` field, eliminating wrapper-induced megamorphic call sites. 
UnmodifiableMap visibility changed from `private` to package-private to enable 
this direct access.

## Performance Impact

| Source Type | Size | Poisoned | Baseline | Optimized | Improvement |
|-------------|------|----------|----------|-----------|-------------|
| HashMap | 0 | true | 9.429 ns/op | 8.875 ns/op | **6% faster** |
| HashMap | 5 | true | 181.645 ns/op | 80.012 ns/op | **56% faster** |
| HashMap | 25 | true | 796.415 ns/op | 283.643 ns/op | **64% faster** |
| TreeMap | 0 | true | 9.356 ns/op | 9.155 ns/op | **-** |
| TreeMap | 5 | true | 169.233 ns/op | 169.566 ns/op | **-** |
| TreeMap | 25 | true | 792.667 ns/op | 791.195 ns/op | **-** |
| ConcurrentHashMap | 0 | true | 9.275 ns/op | 9.156 ns/op | **-** |
| ConcurrentHashMap | 5 | true | 229.291 ns/op | 227.382 ns/op | **-** |
| ConcurrentHashMap | 25 | true | 1071.336 ns/op | 1105.111 ns/op | **-** |
| UnmodifiableMap(HashMap) | 0 | true | 13.563 ns/op | 9.253 ns/op | **32% 
faster** |
| UnmodifiableMap(HashMap) | 5 | true | 318.184 ns/op | 76.264 ns/op | **76% 
faster** |
| UnmodifiableMap(HashMap) | 25 | true | 1510.489 ns/op | 284.001 ns/op | **81% 
faster** |
| UnmodifiableMap(TreeMap) | 5 | true | 323.412 ns/op | 186.990 ns/op | **42% 
faster** |
| UnmodifiableMap(TreeMap) | 25 | true | 1529.959 ns/op | 809.125 ns/op | **47% 
faster** |

This shows substantial improvements (32-81%) for optimized paths while 
maintaining no significant regression for non-optimized paths.

### Call Site Poisoning Impact

| Source Type | Size | Unpoisoned | Poisoned | Degradation |
|-------------|------|------------|----------|-------------|
| HashMap | 0 | 4.949 ns/op | 9.429 ns/op | 91% slower |
| HashMap | 5 | 86.796 ns/op | 181.645 ns/op | 109% slower |
| HashMap | 25 | 322.196 ns/op | 796.415 ns/op | 147% slower |
| TreeMap | 0 | 5.043 ns/op | 9.356 ns/op | 86% slower |
| TreeMap | 5 | 102.329 ns/op | 169.233 ns/op | 65% slower |
| TreeMap | 25 | 433.668 ns/op | 792.667 ns/op | 83% slower |
| ConcurrentHashMap | 0 | 5.054 ns/op | 9.275 ns/op | 84% slower |
| ConcurrentHashMap | 5 | 135.616 ns/op | 229.291 ns/op | 69% slower |
| ConcurrentHashMap | 25 | 591.134 ns/op | 1071.336 ns/op | 81% slower |
| UnmodifiableMap(HashMap) | 0 | 5.319 ns/op | 13.563 ns/op | 155% slower |
| UnmodifiableMap(HashMap) | 5 | 146.471 ns/op | 318.184 ns/op | 117% slower |
| UnmodifiableMap(HashMap) | 25 | 393.053 ns/op | 1510.489 ns/op | 284% slower |
| UnmodifiableMap(TreeMap) | 0 | 5.331 ns/op | 13.539 ns/op | 154% slower |
| UnmodifiableMap(TreeMap) | 5 | 78.215 ns/op | 323.412 ns/op | 314% slower |
| UnmodifiableMap(TreeMap) | 25 | 332.189 ns/op | 1529.959 ns/op | 361% slower |

## FAQ

**Q: Why use exact class check instead of instanceof for HashMap?**
A: This prevents the fast path from becoming megamorphic if HashMap subclasses 
are common, ensuring the optimization remains effective. Also, subclasses might 
override behavior in ways that make direct table access incorrect.

**Q: Is it safe to make UnmodifiableMap package-private?**
A: Yes. The class remains non-public and is only accessible within java.util. 
The change enables internal optimizations without exposing implementation 
details to external code.

**Q: Are these optimizations safe?**
A: Yes. HashMap-to-HashMap copying maintains identical behavior, and 
UnmodifiableMap unwrapping accesses the exact same data through a more direct 
path.

**Q: What about other Map types?**
A: Profiling data shows these are the biggest opportunities. Others exist but 
are lower-priority.

**Q: How do these optimizations interact with existing code?**
A: These are internal implementation optimizations that maintain full API 
compatibility. No external code changes are required.

## Testing

Comprehensive unit tests added to MOAT.java covering:
- HashMap.putAll() with various source map types
- UnmodifiableMap handling in putAll operations  
- Correctness verification for all optimization paths

Tier-1 tests passed.

-------------

Commit messages:
 - Bug fix & unit test
 - fixing whitespace
 - Optimizing HashMap.putAll() and .<init> for HashMap and C$UM

Changes: https://git.openjdk.org/jdk/pull/28243/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28243&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8371656
  Stats: 235 lines in 4 files changed: 230 ins; 0 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/28243.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28243/head:pull/28243

PR: https://git.openjdk.org/jdk/pull/28243

Reply via email to