On Fri, 12 May 2023 21:09:01 GMT, Cesar Soares Lucas <cslu...@openjdk.org> wrote:
>> Can I please get reviews for this PR?
>>
>> The most common and frequent use of NonEscaping Phis merging object
>> allocations is for debugging information. The two graphs below show numbers
>> for Renaissance and DaCapo benchmarks - similar results are obtained for all
>> other applications that I tested.
>>
>> With what frequency does each IR node type occur as an allocation merge
>> user? I.e., if the same node type uses a Phi N times the counter is
>> incremented by N:
>>
>> What are the most common users of allocation merges? I.e., if the same node
>> type uses a Phi N times the counter is incremented by 1:
>>
>> This PR adds support for scalar replacing allocations participating in merges
>> used as debug information OR as a base for field loads. I plan to create
>> subsequent PRs to enable scalar replacement of merges used by other node
>> types (CmpP is next on the list).
>>
>> The approach I used for _rematerialization_ is pretty straightforward. It
>> consists basically of the following. 1) New IR node (suggested by V.
>> Kozlov), named SafePointScalarMergeNode, to represent a set of
>> SafePointScalarObjectNode; 2) Each scalar replaceable input participating in
>> a merge will get a SafePointScalarObjectNode as if it weren't part of a
>> merge. 3) Add a new class to support the rematerialization of SR objects
>> that are part of a merge; 4) Patch HotSpot to be able to serialize and
>> deserialize debug information related to allocation merges; 5) Patch C2 to
>> generate unique types for SR objects participating in some allocation merges.
>>
>> The approach I used for _enabling the scalar replacement of some of the
>> inputs of the allocation merge_ is also pretty straightforward: call
>> `MemNode::split_through_phi` to, well, split AddP->Load* through the merge
>> which will render the Phi useless.
>>
>> I tested this with JTREG tests tier 1-4 (Windows, Linux, and Mac) and didn't
>> see any regression. I also experimented with several applications and didn't
>> see any failure. I also ran tests with "-ea -esa -Xbatch -Xcomp
>> -XX:+UnlockExperimentalVMOptions -XX:-TieredCompilation -server
>> -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions
>> -XX:+StressLCM -XX:+StressGCM -XX:+StressCCP" and didn't observe any related
>> failures.
>
> Cesar Soares Lucas has updated the pull request incrementally with one
> additional commit since the last revision:
>
>   Address PR review 5: refactor on rematerialization & add tests.

Very nice, Cesar. I like how the code is shaping up now.

I verified that the new test cases do trigger the SR+NSR scenario.

How do you test that deoptimization works as expected?

Diagnostic output is still hard to read. On one hand, it's too verbose when it comes to the PcDesc/ScopeDesc sections ("pc-bytecode offsets" and "scopes") in nmethod output (enabled either w/ `-XX:+PrintAssembly` or `-XX:CompileCommand=print,...`). On the other hand, it lacks some important details, like `selector` and `merge_ptr` location information, which is essential to make sense of debug information at a safepoint in the code.

FTR the `_skip_rematerialization` flag is unused now.

Speaking of the `_only_merge_candidate` flag, I find it easier to reason about the code when the property being tracked is whether the `ObjectValue` is referenced from the corresponding JVM state or not. (Maybe call it `is_root()`?) So, `ScopeDesc::objects_to_rematerialize()` would skip everything not referenced from JVM state, but then unconditionally accept anything returned by `ObjectMergeValue::select()`, which then doesn't need to adjust the flag before returning the selected object.

Also, it's safer to track the flag status for every `ObjectValue`, even for `ObjectMergeValue`. Are you sure there's no way to end up with nested `ObjectMergeValue`s in the presence of iterative EA?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/12897#issuecomment-1553966589
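
For illustration only, the `is_root()` suggestion in the comment above could be sketched roughly as below. This is a simplified stand-alone model, not HotSpot code: `ObjValue`, `ObjMergeValue`, the `selector` index convention (-1 meaning "no scalar-replaced candidate is live") and the free function `objects_to_rematerialize` are all hypothetical stand-ins chosen for the example. It only shows the intended control flow: skip values that are not referenced from the JVM state, and accept whatever a merge selects without touching any flag.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an object described in debug info.
struct ObjValue {
    int id = 0;
    bool is_root = false;  // true if referenced directly from the JVM state
    virtual ~ObjValue() = default;
    // A plain object simply selects itself.
    virtual ObjValue* select() { return this; }
};

// Hypothetical stand-in for a merge of several scalar-replaced objects.
struct ObjMergeValue : ObjValue {
    std::vector<ObjValue*> candidates;  // scalar-replaced inputs of the merge
    int selector = -1;                  // assumed: index of the live candidate, -1 if none

    ObjValue* select() override {
        if (selector < 0) return nullptr;  // nothing to rematerialize
        assert(static_cast<std::size_t>(selector) < candidates.size());
        // Recursing means a nested merge would also resolve to a plain object.
        return candidates[static_cast<std::size_t>(selector)]->select();
    }
};

// Collect the objects to rematerialize: only roots are considered, and
// whatever a root selects is accepted without adjusting any flag.
std::vector<ObjValue*> objects_to_rematerialize(const std::vector<ObjValue*>& pool) {
    std::vector<ObjValue*> result;
    for (ObjValue* v : pool) {
        if (!v->is_root) continue;  // merge-only candidates are skipped here
        ObjValue* chosen = v->select();
        if (chosen == nullptr) continue;
        // Avoid listing an object twice if it is both a root and a selected candidate.
        if (std::find(result.begin(), result.end(), chosen) == result.end()) {
            result.push_back(chosen);
        }
    }
    return result;
}
```

In this model a candidate that is only reachable through a merge never needs its flag flipped: it stays a non-root and reaches the result solely through `select()`.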