On Thu, 24 Oct 2024 01:26:10 GMT, Jasmine Karthikeyan 
<jkarthike...@openjdk.org> wrote:

>> Hi all,
>> This patch adds a new pass to consolidate lowering of complex 
>> backend-specific code patterns, such as `MacroLogicV` and the optimization 
>> proposed by #21244. Moving these optimizations to backend code can simplify 
>> shared code, while also making it easier to develop more in-depth 
>> optimizations. The linked bug has an example of a new optimization this 
>> could enable. The new phase does GVN to de-duplicate nodes and calls nodes' 
>> `Value()` method, but it does not call `Identity()` or `Ideal()` to avoid 
>> undoing any changes done during lowering. It also reuses the IGVN worklist 
>> to avoid needing to re-create the notification mechanism.
>> 
>> In this PR only the skeleton code for the pass is added, moving 
>> `MacroLogicV` to this system will be done separately in a future patch. Tier 
>> 1 tests pass on my linux x64 machine. Feedback on this patch would be 
>> greatly appreciated!
>
> Jasmine Karthikeyan has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Address some changes from code review

I'd like to reiterate that the primary place in C2 designated for 
platform-specific IR transformations is during matching phase. That's where the 
vast majority of cases is covered. Everything else are special cases which are 
handled in different places (mostly due to historic reasons and differing 
requirements). 

> It would not be possible without a stretch, consider my example regarding 
> ExtractINode above, ...

Matcher exposes a number of constants and queries which are used in shared code 
to sense platform capabilities (akin to `VM_Version`).

> What do you think about keeping the node declaration in shared code but 
> putting the lowering transformations in the backend-specific source files? 

I definitely prefer shared over platform-specific Ideal node declarations.

> We can then use prefixes to denote a node being available on a specific 
> backend only.

Ideal-to-Mach IR lowering is already partial as `Matcher::match_rule_support` 
usages illustrate.

> That's why it is intended to be executed only after general igvn.

Keep in mind that final graph reshaping also resides in shared code. 

> Macro expansion would be too early, as we still do platform-independent igvn 
> there, while final graph reshaping and custom matching logic would be too 
> late, as we have destroyed the node hash table already.

The nice thing about macro expansion is that it is performed consistently. 
Every macro node is unconditionally expanded and subsequent passes don't have 
to care about them. (The same applies to matching: Ideal nodes are consistently 
turned into platform-specific Mach nodes, except a very limited number of 
well-known cases.) 

I'd say that if we want the lowering pass being discussed to be truly scalable, 
it's better to follow the same pattern. I have some doubts that 
platform-specific ad-hoc IR tweaks scale will scale well.  

BTW it's not clear to me now what particular benefits IGVN brings. `DivMod` 
transformation doesn't use IGVN and after examining `MacroLogicV` code it can 
be rewritten to avoid IGVN as well. I believe that `ExtractI` case you 
mentioned can be implemented without relying on IGVN. In that case, lowering 
transformations can be moved to a later phase.

> I don't think this is a concern, enumerating all live nodes once without 
> doing anything is not expensive.

I do have some concerns about unconditionally performing something which is 
useless most of the time (or even completely redundant on some platforms).

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21599#issuecomment-2445501092

Reply via email to