https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118342

            Bug ID: 118342
           Summary: `a == 0 ? 32 : __builtin_ctz(a)` for Intel and AMD
                    cores could be implemented even without BMI1
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

This came up while reading https://github.com/llvm/llvm-project/issues/122004.

```
int f(int a)
{
  if (a == 0) return 32;
  return __builtin_ctz (a);
}
```

Could be implemented as just:
```
  mov eax, 32
  rep bsf eax, edi
  ret
```

This is based on this:
```
Intel has changed its documentation of its bsf instruction implementation to
match AMD's. Intel's bsf is now documented to leave the destination register
unchanged if the input is zero.
```
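To make the argument concrete, here is a rough C sketch of why preloading the destination with 32 would be enough. `model_bsf` and `f_bsf` are hypothetical helpers written for this illustration only; `model_bsf` is a software model of the documented behaviour quoted above, not the real instruction. The `rep` prefix is not modelled; as far as I know, on BMI1 cores `rep bsf` decodes as `tzcnt`, which already yields 32 for a zero input.

```c
/* Hypothetical software model of the documented BSF behaviour:
   if the source is zero, the destination is left unchanged;
   otherwise it receives the index of the lowest set bit.  */
static unsigned model_bsf (unsigned dest, unsigned src)
{
  if (src == 0)
    return dest;
  unsigned idx = 0;
  while ((src & 1u) == 0)
    {
      src >>= 1;
      idx++;
    }
  return idx;
}

/* The function from the report.  */
static int f (int a)
{
  if (a == 0) return 32;
  return __builtin_ctz (a);
}

/* Models `mov eax, 32; bsf eax, edi` (assumption: BSF behaves as
   modelled above on the target core).  */
static int f_bsf (int a)
{
  return (int) model_bsf (32u, (unsigned) a);
}
```

Under that assumption `f_bsf` agrees with `f` for every input, including zero, so the compare and branch in `f` would be redundant.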

So we might need input from VIA/Zhaoxin on whether their cores also leave the
destination unchanged for a zero input.
