On Tue, May 14, 2013 at 3:04 AM, Jakub Jelinek <ja...@redhat.com> wrote: > On Tue, May 14, 2013 at 10:39:13AM +0200, Jakub Jelinek wrote: >> When trying with -O2 -mno-avx: >> #ifndef __AVX__ >> #pragma GCC push_options >> #pragma GCC target("avx") >> #define __DISABLE_AVX__ >> #endif >> typedef float __v8sf __attribute__ ((__vector_size__ (32))); >> typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__)); >> extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, >> __artificial__)) >> _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) >> __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); } >> #ifdef __DISABLE_AVX__ >> #pragma GCC pop_options >> #undef __DISABLE_AVX__ >> #endif >> __m256 a, b, c; >> void __attribute__((target ("avx"))) >> foo (void) >> { >> a = _mm256_and_ps (b, c); >> } >> we get bogus errors and ICE: >> tty2.c: In function '_mm256_and_ps': >> tty2.c:9:1: note: The ABI for passing parameters with 32-byte alignment has >> changed in GCC 4.6 >> tty2.c: In function 'foo': >> tty2.c:9:82: error: '__builtin_ia32_andps256' needs isa option -m32 >> tty2.c:9:82: internal compiler error: in emit_move_insn, at expr.c:3486 >> 0x77a3d2 emit_move_insn(rtx_def*, rtx_def*) >> ../../gcc/expr.c:3485 >> (I have added "1 ||" instead of your generate_builtins into i386.c >> (def_builtin)), that just shows that target attribute/pragma support still >> has very severe issues that need to be fixed, instead of papered around. >> >> Note, we ICE on: >> #pragma GCC target ("mavx") >> That should be fixed too. > > Ok, I had a brief look at the above two issues. > > The first testcase has the problem that the ix86_previous_fndecl cache > gets out of date. When set_cfun is called on _mm256_and_ps (with the > implicit avx attribute), then ix86_previous_fndecl is set to _mm256_and_ps, > TARGET_AVX is set to true, target reinited. Then set_cfun is called > with NULL, we don't do anything. Later on #pragma GCC pop_options appears, > sets !TARGET_AVX (as that is the new target_option_current_node). > Next foo is being parsed, avx attribute is noticed, the same target node > is used for it, but when set_cfun is called for foo, ix86_previous_fndecl's > target node is the same as foo's and so we don't do cl_target_restore_option > at all, so !TARGET_AVX remains, while it should be set. That is the reason > for the bogus inform etc. Fixed by resetting the ix86_previous_fndecl cache > on any #pragma GCC target below. The #pragma GCC target ("mavx") is also > fixed below. The patch also includes the "1 ||" to enable building all > builtins. We still ICE with: > #0 fancy_abort (file=0x11d8fad "../../gcc/expr.c", line=316, > function=0x11dada3 "convert_move") at ../../gcc/diagnostic.c:1180 > #1 0x0000000000771c39 in convert_move (to=0x7ffff1b2df00, > from=0x7ffff1b314e0, unsignedp=0) at ../../gcc/expr.c:316 > #2 0x000000000078009f in store_expr (exp=0x7ffff19ab390, > target=0x7ffff1b2df00, call_param_p=0, nontemporal=false) at > ../../gcc/expr.c:5300 > #3 0x000000000077eba1 in expand_assignment (to=0x7ffff1b35090, > from=0x7ffff19ab390, nontemporal=false) at ../../gcc/expr.c:5025 > on the first testcase. We don't ICE say on: > #ifndef __AVX__ > #pragma GCC push_options > #pragma GCC target("avx") > #define __DISABLE_AVX__ > #endif > typedef float __v8sf __attribute__ ((__vector_size__ (32))); > typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__)); > extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, > __artificial__)) > _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) > __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); } > #ifdef __DISABLE_AVX__ > #pragma GCC pop_options > #undef __DISABLE_AVX__ > #endif > __m256 a[10], b[10], c[10]; > void __attribute__((target ("avx"))) > foo (void) > { > a[0] = _mm256_and_ps (b[0], c[0]); > } > The problem is that in the first testcase, the VAR_DECL c (guess also b and > a) have TYPE_MODE (TREE_TYPE (c)) == V8SFmode (this is dynamic, for vector > types TYPE_MODE is a function call), but DECL_MODE (c) is BLKmode > (it has been laid out while -mno-avx has been the current) and also > DECL_RTL which is a mem:BLK. Guess expr.c would need to special case > TREE_STATIC or DECL_EXTERNAL VAR_DECLs with vector type, if they have > DECL_MODE BLKmode, but TYPE_MODE some vector type, just adjust the MEM > to the desired mode? > > --- gcc/config/i386/i386-c.c.jj 2013-01-15 17:20:37.000000000 +0100 > +++ gcc/config/i386/i386-c.c 2013-05-14 11:46:50.773806894 +0200 > @@ -369,20 +369,23 @@ ix86_pragma_target_parse (tree args, tre > > if (! args) > { > - cur_tree = ((pop_target) > - ? pop_target > - : target_option_default_node); > + cur_tree = (pop_target ? pop_target : target_option_default_node); > cl_target_option_restore (&global_options, > TREE_TARGET_OPTION (cur_tree)); > } > else > { > cur_tree = ix86_valid_target_attribute_tree (args); > - if (!cur_tree) > - return false; > + if (!cur_tree || cur_tree == error_mark_node) > + { > + cl_target_option_restore (&global_options, > + TREE_TARGET_OPTION (prev_tree)); > + return false; > + } > } > > target_option_current_node = cur_tree; > + ix86_reset_previous_fndecl (); > > /* Figure out the previous/current isa, arch, tune and the differences. */ > prev_opt = TREE_TARGET_OPTION (prev_tree); > --- gcc/config/i386/i386-protos.h.jj 2013-04-03 08:28:50.000000000 +0200 > +++ gcc/config/i386/i386-protos.h 2013-05-14 11:41:35.389638299 +0200 > @@ -40,6 +40,8 @@ extern void ix86_output_addr_diff_elt (F > extern enum calling_abi ix86_cfun_abi (void); > extern enum calling_abi ix86_function_type_abi (const_tree); > > +extern void ix86_reset_previous_fndecl (void); > + > #ifdef RTX_CODE > extern int standard_80387_constant_p (rtx); > extern const char *standard_80387_constant_opcode (rtx); > --- gcc/config/i386/i386.c.jj 2013-05-14 08:23:31.000000000 +0200 > +++ gcc/config/i386/i386.c 2013-05-14 11:40:54.996867411 +0200 > @@ -4559,6 +4559,13 @@ ix86_can_inline_p (tree caller, tree cal > /* Remember the last target of ix86_set_current_function. */ > static GTY(()) tree ix86_previous_fndecl; > > +/* Invalidate ix86_previous_fndecl cache. */ > +void > +ix86_reset_previous_fndecl (void) > +{ > + ix86_previous_fndecl = NULL_TREE; > +} > + > /* Establish appropriate back-end context for processing the function > FNDECL. The argument might be NULL to indicate processing at top > level, outside of any function scope. */ > @@ -26829,7 +26836,7 @@ def_builtin (HOST_WIDE_INT mask, const c > ix86_builtins_isa[(int) code].isa = mask; > > mask &= ~OPTION_MASK_ISA_64BIT; > - if (mask == 0 > + if (/* HACK */ 1 || mask == 0 > || (mask & ix86_isa_flags) != 0 > || (lang_hooks.builtin_function > == lang_hooks.builtin_function_ext_scope))
This does not seem necessary now once the #pragma GCC target is used in the header. The target-specific builtins seem to be generated when the target is specified using pragmas. I will consolidate your patches and send one shortly. Thanks for clearing this, Sri > > > Jakub