https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119596

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>Benching based on the Linux kernel and the Sapphire Rapids CPU:


With -mtune=sapphirerapids, GCC produces:
```
_Z4zeroP3foo:
.LFB0:
        .cfi_startproc
        mov     QWORD PTR [rdi], 0
        mov     QWORD PTR [rdi+8], 0
        mov     QWORD PTR [rdi+16], 0
        mov     QWORD PTR [rdi+24], 0
        mov     QWORD PTR [rdi+32], 0
        mov     BYTE PTR [rdi+40], 0
        ret

```
Which is what you want.
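
For reference, a minimal sketch of a testcase that would have this shape. The original source is not quoted in this comment, so the layout of `foo` here is an assumption: 41 bytes with 1-byte alignment, inferred from the five 8-byte stores plus one 1-byte store above. The `all_zero` helper is hypothetical and only there to check the result.

```cpp
#include <cstddef>

// Assumed layout (not taken from the bug report): 41 bytes, alignment 1,
// which matches 5 x QWORD stores + 1 x BYTE store in the listing above.
struct foo {
    char data[41];
};
static_assert(sizeof(foo) == 41, "assumed size of foo");

// Mangles to _Z4zeroP3foo under the Itanium C++ ABI.
void zero(foo *p) {
    *p = foo{};  // assign a value-initialized temporary: all bytes become 0
}

// Hypothetical helper: returns true if every byte of f.data is zero.
bool all_zero(const foo &f) {
    for (std::size_t i = 0; i < sizeof f.data; ++i)
        if (f.data[i] != 0)
            return false;
    return true;
}
```

Something like `g++ -O2 -mtune=sapphirerapids -masm=intel -S` should show how the stores are expanded; the `-O2` level is a guess, and the exact output may differ between GCC versions.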

Again I will mention this:
Plus for generic tuning you need to benchmark on more than just one processor
(at least a few Intel and AMD processors).
