matthias-springer wrote:

> The only very high-level concern I have is the use and need of 
> `InsertionPoint::after`. As I understand it, this is a not-yet-implemented 
> optimization for the future: reusing materializations by calculating a better 
> insertion point, so that the materialization may dominate more operations 
> which might need exactly that materialization after pattern application as 
> well.

Yes, that's correct. But this optimization is already implemented and it is 
quite deeply ingrained into the dialect conversion driver.

SSA values are not replaced directly, but in a delayed fashion. The replacement 
value is stored in the `ConversionValueMapping` (a kind of `IRMapping`), which 
holds not only user-specified replacement values but also materializations. The 
data structure resembles a linked list: for each original value, we store a 
list of replacement values, namely the specified replacement value plus 
materializations to different types. In the "finalize" phase, the driver walks 
that list to find a replacement/materialization value with the correct type. We 
cannot store more than one replacement/materialization value of the same type; 
the driver would just pick the first one that matches.
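
To make the lookup concrete, here is a minimal, hypothetical sketch of that 
structure (not the actual `ConversionValueMapping` code; `ReplacementChain` and 
`lookupWithType` are made-up names for illustration):

```cpp
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/SmallVector.h"
#include "mlir/IR/Value.h"

// Hypothetical sketch: for each original value, keep the chain of replacement
// values (the user-specified replacement plus any materializations to other
// types).
struct ReplacementChain {
  llvm::SmallVector<mlir::Value, 2> replacements;

  // The driver picks the first entry with the requested type. Storing two
  // entries of the same type would be ambiguous, because only the first match
  // is ever used.
  mlir::Value lookupWithType(mlir::Type type) const {
    for (mlir::Value v : replacements)
      if (v.getType() == type)
        return v;
    return mlir::Value(); // No replacement with that type exists (yet).
  }
};

using HypotheticalConversionValueMapping =
    llvm::DenseMap<mlir::Value, ReplacementChain>;
```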

Note: Computing an insertion point for a 1:1 materialization (or a 1:N block 
signature conversion) is much easier because there is only one SSA value (or 
one block). That's why computing the insertion point was trivial until now.

> Is the current complexity of calculating the insertion point (and potentially 
> calculating the post-dom tree, an O(n log n) operation) worth the gain of the 
> optimization? I am thinking that the better insertion-point logic should 
> probably be postponed until we can measure its effectiveness (and avoid the 
> risk of a premature optimization), and that for now we should have something 
> simpler and working that does not worsen the status quo.

This PR also makes the `DominanceInfo` argument to `InsertPoint::after` 
optional in the case of a single SSA value, and no `DominanceInfo` is passed in 
that case. So for the most frequent case, 1:1 replacements, we do not compute a 
dominator tree at all and do no extra work.

In the case of N:1 materializations, the implementation uses `DominanceInfo`, 
which we create during `ConversionPatternRewriter::replaceOpWithMultiple`. 
Unfortunately, it is not safe to reuse the same `DominanceInfo` object across 
pattern applications because a pattern could have made IR changes that 
invalidate the dominator tree.

However, I believe `DominanceInfo` is quite "cheap" to use. The dominator tree 
is built lazily and on a per-region basis. E.g., creating a new `DominanceInfo` 
and querying dominance for two ops in the same region will build the dominator 
tree only for that region. For two ops from different regions, the 
implementation first finds their ancestors in a common region and then computes 
the dominator tree only for that region.
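
As a rough sketch of that usage pattern (a minimal example assuming the public 
`mlir::DominanceInfo` API; `someOpA`/`someOpB` are placeholders):

```cpp
#include "mlir/IR/Dominance.h"
#include "mlir/IR/Operation.h"

// A freshly constructed DominanceInfo builds dominator trees lazily, so this
// query only materializes the tree for the region that contains the (ancestors
// of the) two operations.
static bool dominatesSketch(mlir::Operation *someOpA,
                            mlir::Operation *someOpB) {
  mlir::DominanceInfo domInfo;                // Nothing is computed yet.
  return domInfo.dominates(someOpA, someOpB); // Tree built on demand.
}
```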

Long term, the `ConversionValueMapping` is going to disappear with the One-Shot 
Dialect Conversion. As part of that, I'm also thinking of removing the 
"materialization caching" mechanism and just building duplicate 
materializations (in the form of `unrealized_conversion_cast` ops, i.e., 
without calling the callback). These can then be CSE'd before calling the 
materialization callback. The CSE-ing will require `DominanceInfo`, but the 
same `DominanceInfo` object can be reused for the entire dialect conversion 
because at that point we are done with pattern applications.
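
For illustration only, a naive de-duplication step along those lines could look 
as follows; `cseCastsSketch` is a hypothetical helper (quadratic in the number 
of casts) and not part of any planned implementation:

```cpp
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"
#include "mlir/IR/BuiltinOps.h"
#include "mlir/IR/Dominance.h"

// Illustrative only: fold unrealized_conversion_cast ops that have the same
// operands and result types into a single cast that dominates the others.
// One DominanceInfo serves the whole walk because no patterns run anymore.
static void cseCastsSketch(mlir::Operation *root) {
  mlir::DominanceInfo domInfo;
  llvm::SmallVector<mlir::UnrealizedConversionCastOp> seen, toErase;
  root->walk([&](mlir::UnrealizedConversionCastOp castOp) {
    for (mlir::UnrealizedConversionCastOp prev : seen) {
      if (llvm::equal(prev.getOperands(), castOp.getOperands()) &&
          llvm::equal(prev.getResultTypes(), castOp.getResultTypes()) &&
          domInfo.dominates(prev.getOperation(), castOp.getOperation())) {
        // Redirect uses to the dominating duplicate; erase later to keep the
        // walk simple.
        castOp->replaceAllUsesWith(prev->getResults());
        toErase.push_back(castOp);
        return;
      }
    }
    seen.push_back(castOp);
  });
  for (mlir::UnrealizedConversionCastOp castOp : toErase)
    castOp->erase();
}
```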




https://github.com/llvm/llvm-project/pull/115816