On Mon, 1 May 2023 20:20:51 GMT, Cesar Soares Lucas <cslu...@openjdk.org> wrote:

>> Can I please get reviews for this PR? 
>> 
>> The most common and frequent use of NonEscaping Phis merging object 
>> allocations is for debugging information. The two graphs below show numbers 
>> for Renaissance and DaCapo benchmarks - similar results are obtained for all 
>> other applications that I tested.
>> 
>> With what frequency does each IR node type occur as an allocation merge 
>> user? I.e., if the same node type uses a Phi N times the counter is 
>> incremented by N:
>> 
>> ![image](https://user-images.githubusercontent.com/2249648/222280517-4dcf5871-2564-4207-b49e-22aee47fa49d.png)
>> 
>> What are the most common users of allocation merges? I.e., if the same node 
>> type uses a Phi N times the counter is incremented by 1:
>> 
>> ![image](https://user-images.githubusercontent.com/2249648/222280608-ca742a4e-1622-4e69-a778-e4db6805ea02.png)
>> 
>> This PR adds support for scalar replacing allocations participating in merges 
>> used as debug information OR as a base for field loads. I plan to create 
>> subsequent PRs to enable scalar replacement of merges used by other node 
>> types (CmpP is next on the list).
>> 
>> The approach I used for _rematerialization_ is pretty straightforward. It 
>> consists basically of the following. 1) New IR node (suggested by V. 
>> Kozlov), named SafePointScalarMergeNode, to represent a set of 
>> SafePointScalarObjectNode; 2) Each scalar replaceable input participating in 
>> a merge will get a SafePointScalarObjectNode as if it weren't part of a 
>> merge; 3) Add a new class to support the rematerialization of SR objects 
>> that are part of a merge; 4) Patch HotSpot to be able to serialize and 
>> deserialize debug information related to allocation merges; 5) Patch C2 to 
>> generate unique types for SR objects participating in some allocation merges.
>> 
>> The approach I used for _enabling the scalar replacement of some of the 
>> inputs of the allocation merge_ is also pretty straightforward: call 
>> `MemNode::split_through_phi` to, well, split AddP->Load* through the merge 
>> which will render the Phi useless.
>> 
>> I tested this with JTREG tests tier 1-4 (Windows, Linux, and Mac) and didn't 
>> see any regressions. I also experimented with several applications and didn't see 
>> any failure. I also ran tests with "-ea -esa -Xbatch -Xcomp 
>> -XX:+UnlockExperimentalVMOptions -XX:-TieredCompilation -server 
>> -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions 
>> -XX:+StressLCM -XX:+StressGCM -XX:+StressCCP" and didn't observe any related 
>> failures.
>
> Cesar Soares Lucas has updated the pull request with a new target base due to 
> a merge or a rebase. The pull request now contains 12 commits:
> 
>  - Merge remote-tracking branch 'origin/master' into 
> rematerialization-of-merges
>  - Address part of PR review 4 & fix a bug setting only_candidate
>  - Catching up with master
>    
>    Merge remote-tracking branch 'origin/master' into 
> rematerialization-of-merges
>  - Fix tests. Remember previous reducible Phis.
>  - Address PR review 3. Some comments and be able to abort compilation.
>  - Merge with Master
>  - Addressing PR review 2: refactor & reuse MacroExpand::scalar_replacement 
> method.
>  - Address PR feeedback 1: make ObjectMergeValue subclass of ObjectValue & 
> create new IR class to represent scalarized merges.
>  - Add support for SR'ing some inputs of merges used for field loads
>  - Fix some typos and do some small refactorings.
>  - ... and 2 more: https://git.openjdk.org/jdk/compare/561ec9c5...542c5ef1

The new pass over deserialized debug info would adapt `ScopeDesc::objects()` 
(initialized by `decode_object_values(obj_decode_offset)` and accessed through 
`chunk->at(0)->scope()->objects()`) and produce 2 lists:
  * a new list of objects that enumerates all scalarized instances which need 
to be rematerialized;
  * the complete set of objects referenced in the current scope (the purpose 
`chunk->at(0)->scope()->objects()` serves now).

It should be performed before `rematerialize_objects`.

By preprocessing I mean all the conditional checks performed before attempting 
to reallocate an `ObjectValue`. By the end of the new pass, it should be enough 
to just iterate over the new list of scalarized instances in 
`Deoptimization::realloc_objects`. And once `Deoptimization::realloc_objects` 
and `Deoptimization::reassign_fields` are done, the debug info should be ready 
to go.
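
To make the shape of the proposed pass concrete, here is a minimal, self-contained sketch. It is not HotSpot code: `ObjectDesc`, `PreprocessedObjects`, `preprocess`, and the `selector` / `only_merge_candidate` fields are hypothetical stand-ins invented for illustration, and the rule used to decide what needs rematerialization is an assumption rather than what the PR implements.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for an entry of ScopeDesc::objects().
// It does NOT mirror the real ObjectValue / ObjectMergeValue classes.
struct ObjectDesc {
    int id = 0;
    bool only_merge_candidate = false;   // assumed: referenced only as an input of a merge
    bool is_merge = false;               // assumed: entry describes an allocation merge
    int selector = -1;                   // assumed: index of the live candidate, -1 if none
    std::vector<ObjectDesc*> candidates; // inputs of the merge (empty for plain objects)
};

// The two lists the pass would produce.
struct PreprocessedObjects {
    std::vector<ObjectDesc*> to_rematerialize; // scalarized instances to reallocate
    std::vector<ObjectDesc*> all_objects;      // every object referenced in the scope
};

// Run once over the decoded objects, before any reallocation is attempted,
// so that the later steps need no conditional checks of their own.
PreprocessedObjects preprocess(const std::vector<ObjectDesc*>& decoded) {
    PreprocessedObjects out;
    auto add_unique = [](std::vector<ObjectDesc*>& v, ObjectDesc* o) {
        if (std::find(v.begin(), v.end(), o) == v.end()) {
            v.push_back(o);
        }
    };
    for (ObjectDesc* obj : decoded) {
        out.all_objects.push_back(obj);
        if (obj->is_merge) {
            // A merge contributes only the candidate that is live at this point.
            if (obj->selector >= 0) {
                assert(static_cast<std::size_t>(obj->selector) < obj->candidates.size());
                add_unique(out.to_rematerialize, obj->candidates[obj->selector]);
            }
        } else if (!obj->only_merge_candidate) {
            // Plain scalarized object: always needs to be rematerialized.
            add_unique(out.to_rematerialize, obj);
        }
    }
    return out;
}
```

With something along these lines, the reallocation step would only walk `to_rematerialize`, while `all_objects` would keep serving the consumers that currently read the scope's object list.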

-------------

PR Comment: https://git.openjdk.org/jdk/pull/12897#issuecomment-1539210279
