On Fri, 31 May 2024 09:01:38 GMT, MaxXing <d...@openjdk.org> wrote: > This patch changes the algorithm of `Node::dominates` to make the result more > precise, and allows the iterators of `ConcurrentHashMap` to be scalar > replaced. > > The previous algorithm will return a conservative result when encountering a > dead control flow, and only try the first two input paths of a multi-input > Region node, which may prevent the scalar replacement in some cases. > > For example, with G1 GC enabled, C2 generates GC barriers for > `ConcurrentHashMap` iteration operations at some early phases, and then > eliminates them in a later IGVN, but `LoadNode` is also idealized in the same > IGVN. This causes `LoadNode::Ideal` to see some dead barrier control flows, > and refuse to split some instance field loads through Phi due to the > conservative result of `Node::dominates`, and thus the scalar replacement can > not be applied to iterators in the later macro elimination phase. > > This patch allows `Node::dominates` to try other paths of the last > multi-input Region node when the first path is dead, and makes > `ConcurrentHashMap` iteration ~30% faster: > > > Benchmark (nkeys) Mode Cnt Score > Error Units > Maps.testConcurrentHashMapIterators 10000 avgt 15 414099.085 ± > 33230.945 ns/op # baseline > Maps.testConcurrentHashMapIterators 10000 avgt 15 315490.281 ± > 3037.056 ns/op # patch > > > Testing: tier1-4.
Impressive results! I haven't looked at the change yet but here are a few questions / requests: - Could you add a screenshot of the IR of the case you are describing? - Wouldn't it help to add the LoadNode back to the IGVN worklist and wait for the dead path to be removed? - Could you add an [IR framework](https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/lib/ir_framework/README.md) test that verifies that the optimization works as expected? Thanks, Tobias ------------- PR Review: https://git.openjdk.org/jdk/pull/19496#pullrequestreview-2098076379