On Thu, Apr 11, 2013 at 1:05 PM, Sriraman Tallam <tmsri...@google.com> wrote: > On Thu, Apr 11, 2013 at 12:43 PM, Xinliang David Li <davi...@google.com> > wrote: >> What is the compile time impact for turning it on? Code not including >> the intrinsic headers should not be affected too much. If the impact >> is small, why not turning on this option by default -- which seems to >> be the behavior of ICC. > > I will get back with data on this. > >> >> With this option, all functions without the appropriate target options >> will be allowed to reference not supported builtins, without warnings >> or errors. Is it possible to delay the check till the builtin >> expansion time? > > Right now, an error is generated if a function accesses an unsupported > builtin. Since the intrinsic functions are marked inline and call some > target builtin, this will always be caught.
To be clear, same example without the target attribute and with -mgenerate-builtins results in an error: #include <smmintrin.h> __m128i foo(__m128i *V) { return _mm_stream_load_si128(V); } $ g++ test.cc -mgenerate-builtins smmintrin.h:582:59: error: ‘__builtin_ia32_movntdqa’ needs isa option -m32 -msse4.1 return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X); > > Sri > >> >> thanks, >> >> David >> >> On Thu, Apr 11, 2013 at 12:05 PM, Sriraman Tallam <tmsri...@google.com> >> wrote: >>> Hi, >>> >>> *mmintrin headers does not work with function specific opts. >>> >>> Example 1: >>> >>> >>> #include <smmintrin.h> >>> >>> __attribute__((target("sse4.1"))) >>> __m128i foo(__m128i *V) >>> { >>> return _mm_stream_load_si128(V); >>> } >>> >>> >>> $ g++ test.cc >>> smmintrin.h:31:3: error: #error "SSE4.1 instruction set not enabled" >>> # error "SSE4.1 instruction set not enabled" >>> >>> This error happens even though foo is marked "sse4.1" >>> >>> There are multiple issues at play here. One, the headers are guarded >>> by macros that are switched on only when the target specific options, >>> like -msse4.1 in this case, are present in the command line. Also, the >>> target specific builtins, like __builtin_ia32_movntdqa called by >>> _mm_stream_load_si128, are exposed only in the presence of the >>> appropriate target ISA option. >>> >>> >>> I have attached a patch that fixes this. I have added an option >>> "-mgenerate-builtins" that will do two things. It will define a macro >>> "__ALL_ISA__" which will expose the *intrin.h functions. It will also >>> expose all the target specific builtins. -mgenerate-builtins will not >>> affect code generation. >>> >>> This feature will greatly benefit the function multiversioning usability >>> too. >>> >>> Comments? >>> >>> Thanks >>> Sri