andykaylor wrote:

@rjmccall I understand your point, and I think you're raising a good question. 
Let's walk through an example that illustrates why we currently want FMF on 
phis and selects and see if we can agree on an alternative way to handle it. In 
a comment on https://github.com/llvm/llvm-project/issues/51601, I started with 
this:

```
double floatingAbs(double x) {
  return (x < 0) ? -x : x;
}
```
We want to optimize that to `llvm.fabs(x)`, but we can only do that if we don't 
care about the sign of zero. I walked through the steps of how that happens 
[here](https://github.com/llvm/llvm-project/issues/51601#issuecomment-981047527),
 but let me jump to the optimized IR just before the replacement happens 
because that's sufficient for the discussion about FMF on select instructions. 
The optimizer reduces the IR to this:

```
define dso_local double @floatingAbs(double %0) {
  %2 = fcmp fast olt double %0, 0.000000e+00
  %3 = fneg fast double %0
  %4 = select fast i1 %2, double %3, double %0
  ret double %4
}
```
We need the `nsz` flag to turn this into `llvm.fabs(x)`. The `nsz` flag is 
present on the `fcmp` instruction, but that doesn't matter because `0.0` 
compares as equal to `-0.0` with or without the flag. We also have `nsz` on the 
`fneg` instruction, but again that doesn't matter because we don't hit the 
`fneg` instruction for `0.0` or `-0.0`. We want this sequence to produce the 
absolute value of `x` but the only way we can say that it does is if we don't 
care about the sign of zero on the `select` instruction.

If this code is called with `x == -0.0`, the `select` instruction will return 
`-0.0`, whereas `llvm.fabs(x)` would return `0.0`. We can only make the 
replacement if we know we are allowed to ignore the sign of zero for the whole 
pattern. That leaves us two choices: 
either we depend on a function attribute saying that we can ignore the sign of 
zero for the entire function, or we must have the `nsz` flag set on all 
instructions involved in the pattern.
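
To make the signed-zero difference concrete, here is a minimal sketch that compares the original pattern against `std::fabs`. It assumes the code is compiled without any fast-math options, so the compiler preserves the sign of zero. The helper `signDiffersFromFabs` is a hypothetical name used only for this illustration.

```cpp
#include <cassert>
#include <cmath>

// The original pattern from the example above.
double floatingAbs(double x) {
  return (x < 0) ? -x : x;
}

// Hypothetical helper: true when the pattern and std::fabs disagree on the
// sign bit of the result for the given input.
bool signDiffersFromFabs(double x) {
  return std::signbit(floatingAbs(x)) != std::signbit(std::fabs(x));
}
```

For `x == -0.0` the comparison `x < 0` is false, so the pattern returns `x` unchanged with its sign bit set, while `std::fabs` clears the sign bit. For every other non-NaN input the two agree, which is why `nsz` is the only flag the replacement depends on.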

As I said before, relying on the function attribute gives us correct results, 
but because functions can have mixed fast-math states (through either inlining 
or pragmas), we may lose an optimization here. To me, it seems that having 
fast-math flags on the `select` instruction is the easiest way to do this without 
potentially losing optimizations, and that indirectly implies that we'd like to 
have FMF on phis and loads. Currently, we're in a mixed state in that regard 
where we accept the loss of optimization if a load is involved but preserve 
optimization through phis and selects.

However, as I think about this it occurs to me that there is another 
possibility. The argument above only applies to the `nsz` flag. The `nnan` and 
`ninf` flags can be deduced, and the rewrite flags don't have any meaning for 
phis, loads, and selects. Perhaps we shouldn't be looking at the `select` 
instruction but instead should be looking for the `nsz` flag on the uses of the 
select instruction. In the trivial case I cited above, the result of the select 
is being returned, so looking at the function attribute is correct. If the 
value selected were being used in the function, we would look at the uses. If 
all of the uses have the `nsz` flag set, we can ignore the sign of zero for the 
select. This would complicate the handling a 
bit, but it certainly seems to make more sense in terms of the semantics of the 
`nsz` flag.
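
As a rough illustration of the use-based check, here is a toy model. It is not the LLVM API: `ToyInst`, its fields, and `usesAllowIgnoringSignOfZero` are hypothetical names invented for this sketch, and `fnNoSignedZeros` stands in for whatever function-level attribute would be consulted when the value is returned.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for an IR instruction (not the LLVM API).
struct ToyInst {
  bool isFPOp = false;    // instruction can carry fast-math flags
  bool hasNSZ = false;    // nsz flag is set on this instruction
  bool isReturn = false;  // instruction returns the value from the function
  std::vector<const ToyInst *> users;  // instructions using this result
};

// Returns true if every use of `value` lets us ignore the sign of zero.
// A returned value falls back to the function-level setting; any other use
// must be a floating-point operation with nsz set. A value with no users
// trivially passes.
bool usesAllowIgnoringSignOfZero(const ToyInst &value, bool fnNoSignedZeros) {
  for (const ToyInst *user : value.users) {
    if (user->isReturn) {
      if (!fnNoSignedZeros)
        return false;
    } else if (!(user->isFPOp && user->hasNSZ)) {
      return false;
    }
  }
  return true;
}
```

In this model the `floatingAbs` example has a single use, the return, so the answer depends only on the function-level setting. A select feeding an `fadd nsz` would pass regardless of the function-level setting.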

Tagging @jcranmer-intel who has been working on clarifying FMF semantics.

https://github.com/llvm/llvm-project/pull/105912
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
