On Thu, Jul 30, 2015 at 6:40 PM, Matt Turner <matts...@gmail.com> wrote:
> I'd like to tell gcc that it's okay to inline functions (such as
> rintf(), to get the SSE4.1 roundss instruction) at particular call
> sites without compiling the entire source file or calling function
> with different CFLAGS.
> I attempted this by making inline wrapper functions annotated with
> __attribute__((optimize(...))), but it appears that the annotation does
> not apply to inline functions? Take for example, ex.c:
> #include <math.h>
> static inline float __attribute__((optimize("-fno-trapping-math")))
> rintf_wrapper_inline(float x)
> {
>    return rintf(x);
> }
> float
> rintf_wrapper_inline_call(float x)
> {
>    return rintf_wrapper_inline(x);
> }
> float __attribute__((optimize("-fno-trapping-math")))
> rintf_wrapper(float x)
> {
>    return rintf(x);
> }
> % gcc -O2 -msse4.1 -c ex.c
> % objdump -d ex.o
> ex.o:     file format elf64-x86-64
> Disassembly of section .text:
> 0000000000000000 <rintf_wrapper_inline_call>:
>    0: e9 00 00 00 00       jmpq   5 <rintf_wrapper_inline_call+0x5>
>    5: 66 66 2e 0f 1f 84 00 data32 nopw %cs:0x0(%rax,%rax,1)
>    c: 00 00 00 00
> 0000000000000010 <rintf_wrapper>:
>   10: 66 0f 3a 0a c0 04     roundss $0x4,%xmm0,%xmm0
>   16: c3                   retq
> whereas I expected that rintf_wrapper_inline_call would be the same as
> rintf_wrapper.
> I've read that per-function optimization is broken [1]. Is this still
> the case? Is there a way to accomplish what I want?

Not in this way.  Once the wrapper is inlined, its body is compiled with
the caller's options, so the no-trapping-math flag is gone.

The only way is to use SSE intrinsics directly here or have the optimized
variant not inlined.
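
A minimal sketch of the intrinsics route, assuming GCC or Clang on x86.
The helper name rintf_sse41 is hypothetical and not from the original
mail.  The target attribute is an assumption added only so the snippet
builds without -msse4.1; with -msse4.1 on the command line, as in the
quoted build, it should be unnecessary.

```c
#include <smmintrin.h>  /* SSE4.1 intrinsics: _mm_round_ss */

/* Hypothetical helper: rounds x to an integral value with roundss,
 * independent of the caller's -ftrapping-math setting. */
static inline float __attribute__((target("sse4.1")))
rintf_sse41(float x)
{
    /* Put x in the low lane of an SSE register (upper lanes zeroed). */
    __m128 v = _mm_set_ss(x);

    /* _MM_FROUND_CUR_DIRECTION (0x4) rounds using the current MXCSR
     * rounding mode, the same immediate as the roundss $0x4 in the
     * quoted disassembly. */
    v = _mm_round_ss(v, v, _MM_FROUND_CUR_DIRECTION);

    /* Extract the low lane as a scalar float. */
    return _mm_cvtss_f32(v);
}
```

Because the instruction is requested explicitly, the result does not
depend on whether an optimize attribute survives inlining.  The code
only runs on CPUs that support SSE4.1.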


> [1] https://gcc.gnu.org/ml/gcc/2012-07/msg00201.html
