Register constrained variable issue
If one writes a bit of code like this:

    int foo( void )
    {
        register int x asm( "Rn" );

        asm volatile ( "INSN_A %0 \n\t" : "=r" (x) );
        bar();
        asm volatile ( "INSN_B %0,%0 \n\t" : "=r" (x) : "0" (x) );
        return x;
    }

and Rn is a register not saved over function calls, then gcc does not save it but allows it to get clobbered by the call to bar(). For example, if the processor is ARM and Rn is r2 (on ARM r0-r3 and r12 can get clobbered by a function call), then the following code is generated; if you don't know ARM assembly, the comments tell you what's going on:

    foo:
        stmfd sp!, {r3, lr}   // Save the return address
        INSN_A r2             // The insn generating r2's content
        bl bar                // Call bar(); it may destroy r2
        INSN_B r2, r2         // *** Here a possibly clobbered r2 is used!
        mov r0, r2            // Copy r2 to the return value register
        ldmfd sp!, {r3, lr}   // Restore the return address
        bx lr                 // Return

Note that you don't need a real function call in your code; it is enough if you do something which forces gcc to call a function in libgcc.a. On some ARM targets a long long shift or an integer division or even just a switch {} statement is enough to trigger a call to the support library.

Which basically means that one *must not* allocate a register not saved by function calls, because then it can get clobbered at any time. It is not an ARM specific issue, either; other targets behave the same. The compiler version is 4.5.3.

The info page on specifying registers for variables does not say that the register one chooses must be a register saved across calls. On the other hand, it does say that register content might be destroyed when the compiler knows that the data is not live any more; a statement which has a vibe suggesting that the register content is preserved as long as the data is live.

For global register variables the info page does actually warn about library routines possibly clobbering registers and says that one should use a saved and restored register. However, global and function local variables are very different animals; global regs are reserved and not tracked by the data flow analysis while local var regs are part of the data flow analysis, as stated by the info page.

So I don't know if it is a bug (i.e. the compiler is supposed to protect local reg vars) or just misleading/omitted information in the info page?

Thanks,

Zoltan
**Help I love GCC
Please see the whole E-mail

Please send a GCC for Windows. Language: Chinese or English.

I'm a Chinese student, now I'm studying C++. I want a GCC (for Windows, Chinese), but my English isn't very good, and I can't find GCC. So, please send me a GCC, for tomorrow of the world.

Write (send) to me soon. E-MAIL ADDRESS: 870523...@qq.com

-- Read Me!
Fix gcc.dg/builtins-67.c on Solaris 8/9
The test fails with a link error, as 'round' and 'rint' are only C99. Fixed thusly, tested on SPARC/Solaris 8, applied on the mainline as obvious.

2011-10-13 Eric Botcazou

    * gcc.dg/builtins-67.c: Guard iround and irint with HAVE_C99_RUNTIME.

-- 
Eric Botcazou

Index: gcc.dg/builtins-67.c
===
--- gcc.dg/builtins-67.c (revision 179844)
+++ gcc.dg/builtins-67.c (working copy)
@@ -58,14 +58,14 @@ long long llceilf (float a) { return (lo
 long long llceill (long double a) { return (long long) ceill (a); }
 #endif
 
-int iround (double a) { return (int) round (a); }
 #ifdef HAVE_C99_RUNTIME
+int iround (double a) { return (int) round (a); }
 int iroundf (float a) { return (int) roundf (a); }
 int iroundl (long double a) { return (int) roundl (a); }
 #endif
 
-int irint (double a) { return (int) rint (a); }
 #ifdef HAVE_C99_RUNTIME
+int irint (double a) { return (int) rint (a); }
 int irintf (float a) { return (int) rintf (a); }
 int irintl (long double a) { return (int) rintl (a); }
 #endif
Re: Register constrained variable issue
On 10/13/2011 12:26 AM, Zoltán Kócsi wrote:
> So I don't know if it is a bug (i.e. the compiler is supposed to protect local
> reg vars) or just misleading/omitted information in the info page?

It's the documentation that could perhaps be improved. Local register variables are not protected from calls.

r~
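Since local register variables are not protected from calls, one way to stay safe is to bind the variable to the register only around each asm statement and to carry the value across the call in an ordinary variable. The following is a minimal sketch of that idea, not code from the thread: the choice of ecx, the two instruction templates standing in for INSN_A and INSN_B, the IN_CLOBBERED_REG macro, and the body of bar() are all assumptions made so the example can be compiled on x86, with a plain-C fallback elsewhere.

```c
/* Sketch only: the register name, the instruction templates and bar()
   are illustrative stand-ins, not taken from the original mail.  */
#if defined(__x86_64__) || defined(__i386__)
# define IN_CLOBBERED_REG  __asm__ ("ecx")   /* ecx is not preserved across calls */
# define INSN_A(v)  __asm__ __volatile__ ("movl $21, %0" : "=r" (v))
# define INSN_B(v)  __asm__ __volatile__ ("addl %0, %0" : "=r" (v) : "0" (v))
#else
/* Plain-C fallback so the sketch still compiles on other targets.  */
# define IN_CLOBBERED_REG
# define INSN_A(v)  ((v) = 21)
# define INSN_B(v)  ((v) += (v))
#endif

static volatile int bar_calls;

/* Stand-in for bar(): a real, non-inlined call that is allowed to
   overwrite any call-clobbered register.  */
__attribute__ ((noinline)) void bar (void)
{
  bar_calls++;
}

int foo (void)
{
  int saved;   /* ordinary variable: the compiler keeps it live across the call */

  {
    register int x IN_CLOBBERED_REG;
    INSN_A (x);
    saved = x;          /* copy the value out before the call */
  }

  bar ();

  {
    register int x IN_CLOBBERED_REG;
    x = saved;          /* copy the value back in after the call */
    INSN_B (x);
    saved = x;
  }

  return saved;
}
```

The other option mentioned in the thread is to pick a register that the ABI preserves across calls; the sketch above avoids relying on that choice.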
Vector alignment tracking
Hi

I would like to share some plans about improving the situation with vector alignment tracking. First of all, I would like to start with a well-known bug: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50716.

There are several aspects of the problem:
1) We would like to avoid the quiet segmentation fault.
2) We would like to warn a user about the potential problems considering assignment of vectors with different alignment.
3) We would like to replace obvious aligned vector assignments with aligned move, and unaligned with unaligned.

All these aspects are interconnected and in order to find the problem, we have to improve the alignment tracking facilities.

1) Currently in C we cannot provide information that an array is aligned to a certain number. The problem is hidden in the fact, that a pointer can represent an array or an address of an object. And it turns out that the current aligned attribute doesn't help here. My proposal is to introduce an attribute called array_aligned (I am very flexible on the name) which can be applied only to pointers and which would show that the pointer of this type represents an array, where the first element is aligned to the given number.

2) After we have the new attribute, we can have a pass which would check all the pointer arithmetic expressions, and in case of vectors, mark the assignments with __builtin_assume_aligned.

3) In a separate pass we need to mark the alignments of the function return types, in order to propagate this information through the flow-graph.

4) In case of LTO, it becomes possible to track all the pointer dereferences, and depending on the parameters warn, or change aligned assignment to unaligned and vice-versa.

As a very first draft of (1) I include the patch that introduces the array_aligned attribute. The attribute sets is_array_flag in the type, and uses the alignment number to store the alignment of the array. In this implementation, we lose information about the alignment of the pointer itself, but I don't know if we need it in this particular situation. Alternatively we can keep array_alignment in a separate field; which one is better I am not sure.

Thanks,
Artem.

Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c (revision 179906)
+++ gcc/c-family/c-common.c (working copy)
@@ -341,6 +341,7 @@ static tree handle_destructor_attribute
 static tree handle_mode_attribute (tree *, tree, tree, int, bool *);
 static tree handle_section_attribute (tree *, tree, tree, int, bool *);
 static tree handle_aligned_attribute (tree *, tree, tree, int, bool *);
+static tree handle_aligned_array_attribute (tree *, tree, tree, int, bool *);
 static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_alias_ifunc_attribute (bool, tree *, tree, tree, bool *);
 static tree handle_ifunc_attribute (tree *, tree, tree, int, bool *);
@@ -643,6 +644,8 @@ const struct attribute_spec c_common_att
                               handle_section_attribute, false },
   { "aligned",0, 1, false, false, false,
                               handle_aligned_attribute, false },
+  { "aligned_array", 0, 1, false, false, false,
+                              handle_aligned_array_attribute, false },
   { "weak", 0, 0, true, false, false,
                               handle_weak_attribute, false },
   { "ifunc", 1, 1, true, false, false,
@@ -6682,6 +6685,26 @@ handle_section_attribute (tree *node, tr
     }
 
   return NULL_TREE;
+}
+
+/* Handle "aligned_array" attribute. */
+static tree
+handle_aligned_array_attribute (tree *node, tree ARG_UNUSED (name), tree args,
+                                int flags, bool *no_add_attrs)
+{
+  if (!TYPE_P (*node) || !POINTER_TYPE_P (*node))
+    {
+      error ("array_alignment attribute must be applied to a pointer-type");
+      *no_add_attrs = true;
+    }
+  else
+    {
+      tree ret = handle_aligned_attribute (node, name, args, flags, no_add_attrs);
+      TYPE_IS_ARRAY (*node) = true;
+      return ret;
+    }
+
+  return NULL_TREE;
 }
 
 /* Handle a "aligned" attribute; arguments as in
Index: gcc/tree.h
===
--- gcc/tree.h (revision 179906)
+++ gcc/tree.h (working copy)
@@ -2149,6 +2149,7 @@ struct GTY(()) tree_block {
 #define TYPE_NEXT_VARIANT(NODE) (TYPE_CHECK (NODE)->type_common.next_variant)
 #define TYPE_MAIN_VARIANT(NODE) (TYPE_CHECK (NODE)->type_common.main_variant)
 #define TYPE_CONTEXT(NODE) (TYPE_CHECK (NODE)->type_common.context)
+#define TYPE_IS_ARRAY(NODE) (TYPE_CHECK (NODE)->type_common.is_array_flag)
 
 /* Vector types need to check target flags to determine type. */
 extern enum machine_mode vector_type_mode (const_tree);
@@ -2411,6 +2412,7 @@ struct GTY(()) tree_type_common {
   unsigned lang_flag_5 : 1;
   unsigned lang_flag_6 : 1;
+  unsigned is_array_flag: 1;
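For readers unfamiliar with the builtin named in step (2): __builtin_assume_aligned is an existing GCC builtin that lets code promise an alignment for a pointer by hand, which is the information the proposed attribute would carry in the type instead. A minimal sketch follows; the function name sum_aligned16 and the buffer buf are made up for illustration and are not part of the patch.

```c
#include <stddef.h>

/* Hypothetical helper: sums n floats while promising the compiler that
   the buffer starts on a 16-byte boundary.  Passing a pointer that does
   not honour the promise is undefined behaviour.  */
float sum_aligned16 (const float *p, size_t n)
{
  const float *q = __builtin_assume_aligned (p, 16);
  float s = 0.0f;

  for (size_t i = 0; i < n; i++)
    s += q[i];
  return s;
}

/* The declaration itself carries the alignment, so the promise holds.  */
static float buf[8] __attribute__ ((aligned (16))) =
  { 1, 2, 3, 4, 5, 6, 7, 8 };
```

Calling sum_aligned16 (buf, 8) is fine here; calling it with buf + 1 would break the promise, which is exactly the kind of mismatch the tracking described above is meant to catch.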
Re: Vector alignment tracking
Artem Shinkarov writes:
>
> 1) Currently in C we cannot provide information that an array is
> aligned to a certain number. The problem is hidden in the fact, that

Have you considered doing it the other way round: when an optimization needs something to be aligned, make the declaration aligned?

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only
Re: Vector alignment tracking
On Thu, Oct 13, 2011 at 4:54 PM, Andi Kleen wrote:
> Artem Shinkarov writes:
>>
>> 1) Currently in C we cannot provide information that an array is
>> aligned to a certain number. The problem is hidden in the fact, that
>
> Have you considered doing it the other way round: when an optimization
> needs something to be aligned, make the declaration aligned?
>
> -Andi

Andi, I can't realistically imagine how it could work. The problem is that for an arbitrary arr[x], I have no idea whether it should be aligned or not. What if

    arr = ptr + 5;
    v = *(vec *) arr;

I can make arr aligned, because it would be better for performance, but obviously, the pointer expression breaks this alignment. But the code is valid, because an unaligned move is still possible.

So I think that checking is a more conservative approach.

Or am I missing something?

Thanks,
Artem.

> --
> a...@linux.intel.com -- Speaking for myself only
>
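The ptr + 5 case can be made concrete with a small sketch. It assumes a 16-byte vector of four ints for vec and a 16-byte-aligned backing array; storage, is_aligned16 and load_unaligned are names invented for this illustration. The memcpy-based load is one portable way to express an access that makes no alignment promise; it is not what the thread's code does.

```c
#include <stdint.h>
#include <string.h>

/* Assumed definition of the "vec" type from the mail: 4 x int, 16 bytes.  */
typedef int vec __attribute__ ((vector_size (16)));

/* The declaration is aligned, but pointer arithmetic can still produce
   addresses that are not.  */
static int storage[16] __attribute__ ((aligned (16))) =
  { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };

/* Returns nonzero if p sits on a 16-byte boundary.  */
int is_aligned16 (const void *p)
{
  return ((uintptr_t) p % 16) == 0;
}

/* Loads a vec from an address with no alignment promise; the compiler
   is free to turn the memcpy into an unaligned vector move.  */
vec load_unaligned (const int *arr)
{
  vec v;
  memcpy (&v, arr, sizeof v);
  return v;
}
```

With 4-byte ints, storage + 5 is 20 bytes past an aligned base, so it cannot be 16-byte aligned even though the declaration is; load_unaligned (storage + 5) is still valid and yields the elements 5 to 8.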
Re: Vector alignment tracking
> Or am I missing something?

I often see the x86 vectorizer with -mtune=generic generate a lot of complicated code just to adjust for potential misalignment.

My thought was just if the alias oracle knows what the original declaration is, and it's available for changes (e.g. LTO), it would likely be better to just add an __attribute__((aligned())) there.

In the general case it's probably harder, you would need some cost model to decide when it's worth it.

Your approach of course would still be needed for cases where this isn't possible. But it sounded like the infrastructure you're building could in principle do both.

-Andi
Re: Vector alignment tracking
On Thu, Oct 13, 2011 at 06:57:47PM +0200, Andi Kleen wrote:
> > Or am I missing something?
>
> I often see the x86 vectorizer with -mtune=generic generate a lot of
> complicated code just to adjust for potential misalignment.
>
> My thought was just if the alias oracle knows what the original
> declaration is, and it's available for changes (e.g. LTO), it would
> likely be better to just add an __attribute__((aligned()))
> there.
>
> In the general case it's probably harder, you would need some
> cost model to decide when it's worth it.

GCC already does that on certain targets, see increase_alignment in tree-vectorizer.c. Plus, various backends attempt to align larger arrays more than they have to be aligned.

Jakub
Re: VIS2 pattern review
From: Richard Henderson
Date: Wed, 12 Oct 2011 17:49:19 -0700

> There's a code sample 7-1 that illustrates a 16x16 multiply:
>
>     fmul8sux16 %f0, %f1, %f2
>     fmul8ulx16 %f0, %f1, %f3
>     fpadd16    %f2, %f3, %f4

Be wary of code examples that don't even assemble (even numbered float registers are required here).

fmul8sux16 basically does, for each element:

    src1 = (rs1 >> 8) & 0xff;
    src2 = rs2 & 0x;

    product = src1 * src2;

    scaled = (product & 0x0000) >> 8;
    if (product & 0x80)
        scaled++;

    rd = scaled & 0x;

fmul8ulx16 does the same except the assignment to src1 is:

    src1 = rs1 & 0xff;

Therefore, I think this "16 x 16 multiply" operation isn't the kind you think it is, and it's therefore not appropriate to use this in the compiler for vector multiplies.

Just for shits and grins I tried it and the slp-7 testcase, as expected, fails. The main multiply loop in that test case is compiled to:

        sethi   %hi(.LLC6), %i3
        sethi   %hi(in2), %g1
        ldd     [%i3+%lo(.LLC6)], %f22
        sethi   %hi(.LLC7), %i4
        sethi   %hi(.LLC8), %i2
        sethi   %hi(.LLC9), %i3
        add     %fp, -256, %g2
        ldd     [%i4+%lo(.LLC7)], %f20
        or      %g1, %lo(in2), %g1
        ldd     [%i2+%lo(.LLC8)], %f18
        mov     %fp, %i5
        ldd     [%i3+%lo(.LLC9)], %f16
        mov     %g1, %g4
        mov     %g2, %g3
    .LL10:
        ldd     [%g4+8], %f14
        ldd     [%g4+16], %f12
        fmul8sux16      %f14, %f22, %f26
        ldd     [%g4+24], %f10
        fmul8ulx16      %f14, %f22, %f24
        ldd     [%g4], %f8
        fmul8sux16      %f12, %f20, %f34
        fmul8ulx16      %f12, %f20, %f32
        fmul8sux16      %f10, %f18, %f30
        fpadd16 %f26, %f24, %f14
        fmul8ulx16      %f10, %f18, %f28
        fmul8sux16      %f8, %f16, %f26
        fmul8ulx16      %f8, %f16, %f24
        fpadd16 %f34, %f32, %f12
        std     %f14, [%g3+8]
        fpadd16 %f30, %f28, %f10
        std     %f12, [%g3+16]
        fpadd16 %f26, %f24, %f8
        std     %f10, [%g3+24]
        std     %f8, [%g3]
        add     %g3, 32, %g3
        cmp     %g3, %i5
        bne,pt  %icc, .LL10
         add    %g4, 32, %g4

and it simply gives the wrong results. The entire out2[] array is all zeros.
Re: VIS2 pattern review
From: David Miller
Date: Thu, 13 Oct 2011 14:26:36 -0400 (EDT)

> product = src1 * src2;
>
> scaled = (product & 0x0000) >> 8;
> if (product & 0x80)
>     scaled++;

In fact, all of the partitioned multiply instructions scale the result by 8 bits with rounding towards positive infinity.

Therefore, we have to use an unspec for all of them.
Re: VIS2 pattern review
On 10/13/2011 11:26 AM, David Miller wrote:
> Therefore, I think this "16 x 16 multiply" operation isn't the kind
> you think it is, and it's therefore not appropriate to use this in the
> compiler for vector multiplies.

Ah, I see the magic word in the docs now: "fixed point". I.e. class MODE_ACCUM not class MODE_INT.

I guess that's a totally different kind of support that could be added.

r~
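The difference between the two kinds of multiply can be shown with a deliberately simplified C model. It is only a sketch under a stated assumption: a fixed-point multiply is modeled as "keep the rounded upper 16 bits of the 32-bit product", which is not claimed to match the exact per-instruction scaling of fmul8sux16 and fmul8ulx16 described earlier in the thread. The function names are invented for illustration.

```c
#include <stdint.h>

/* What an integer vector multiply must compute per element:
   the low 16 bits of the product.  */
int16_t mul_int16 (int16_t a, int16_t b)
{
  /* Conversion back to int16_t wraps on GCC (implementation-defined).  */
  return (int16_t) (uint16_t) ((int32_t) a * b);
}

/* Simplified fixed-point model (an assumption, not the VIS definition):
   keep the upper 16 bits of the 32-bit product, rounded.  Relies on
   arithmetic right shift of negative values, as GCC provides.  */
int16_t mul_fixed16 (int16_t a, int16_t b)
{
  int32_t product = (int32_t) a * b;
  return (int16_t) ((product + 0x8000) >> 16);
}
```

Under this model small operands such as 3 and 5 give 15 from mul_int16 but 0 from mul_fixed16, which would be consistent with an all-zero result array if the test inputs are small integers.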
Re: VIS2 pattern review
From: Richard Henderson
Date: Wed, 12 Oct 2011 17:49:19 -0700

> The comment for fpmerge_vis is not correct.
> I believe that the operation is representable with
>
>   (vec_select:V8QI
>     (vec_concat:V8QI
>       (match_operand:V4QI 1 ...)
>       (match_operand:V4QI 2 ...)
>     (parallel [
>         0 4 1 5 2 6 3 7
>     ]))
>
> which can be used as the basis for both of the
>
>   vec_interleave_lowv8qi
>   vec_interleave_highv8qi
>
> named patterns.

Agreed.

> AFAICS, this needs an unspec, like fmul8x16al.
> Similarly for fmul8sux16_vis, fmuld8sux16_vis,

Yes, as we found all the partitioned multiplies need to be unspecs.

>> (define_code_iterator vis3_addsub_ss [ss_plus ss_minus])
>> (define_code_attr vis3_addsub_ss_insn
>>   [(ss_plus "fpadds") (ss_minus "fpsubs")])
>>
>> (define_insn "_vis"
>>   [(set (match_operand:VASS 0 "register_operand" "=")
>>         (vis3_addsub_ss:VASS (match_operand:VASS 1 "register_operand" "")
>>                              (match_operand:VASS 2 "register_operand" "")))]
>>   "TARGET_VIS3"
>>   "\t%1, %2, %0")
>
> These should be exposed as "ssadd3" "sssub3".

Agreed.

I'm currently regstrapping the patch at the end of this mail and will commit it to trunk if no regressions pop up.

I'll look into the rest of your feedback. But I want to look into a more fundamental issue with VIS support before moving much further.

I worked for several evenings on adding support for the VIS3 instructions that move directly between float and integer regs. I tried really hard to get the compiler to do something sensible but it's next to impossible for two reasons:

1) We don't represent single entry vectors using vector modes, we just use SImode, DImode etc.

   I think this is a huge mistake, because the compiler now thinks it can do "SImode stuff" in the float regs. Other backends are able to segregate vector vs. non-vector operations by using the single entry vector modes.

2) In addition to that, because of how we number the registers for allocation on sparc for leaf functions, the compiler starts trying to reload SImode and DImode values into the floating point registers before trying to use the non-leaf integer regs.

   Because the leaf allocation order is "leaf integer regs", "float regs", "non-leaf integer regs".

Even if I jacked up the register move cost for the cases where the VIS3 instructions applied, it still did these kinds of reloads.

This also gets reload into trouble because it believes that if it can move an address value (say Pmode == SImode) from one place to another, then a plus on the same operands can be performed (with perhaps minor reloading). But that doesn't work when this "move" is "move float reg to int reg" and therefore the operands are "%f12" and "%g3". It tries to do things like "(plus:SI (reg:SI %f12) (reg:SI %g3))"

All of these troubles would be eliminated if we used vector modes for all the VIS operations instead of using SImode and DImode for the single entry vector cases.

Unfortunately, that would involve some ABI changes for the VIS builtins. I'm trending towards considering just changing things anyways since the VIS intrinsics were next to unusable beforehand.

I've scoured the net for examples of people actually using the GCC intrinsics before all of my recent changes, and they all fall into two categories: 1) they use inline assembler because the VIS intrinsics don't work and 2) they try to use the intrinsics but the code is disabled because it "doesn't work".

Fix the RTL of some sparc VIS patterns.

    * config/sparc/sparc.md (UNSPEC_FPMERGE): Delete.
    (UNSPEC_MUL16AU, UNSPEC_MUL8, UNSPEC_MUL8SU, UNSPEC_MULDSU): New
    unspecs.
    (fpmerge_vis): Remove inaccurate comment, represent using vec_select
    of a vec_concat.
    (vec_interleave_lowv8qi, vec_interleave_highv8qi): New insns.
    (fmul8x16_vis, fmul8x16au_vis, fmul8sux16_vis, fmuld8sux16_vis):
    Reimplement as unspecs and remove inaccurate comments.
    (vis3_shift_patname): New code attr.
    (_vis): Rename to "v3".
    (vis3_addsub_ss_patname): New code attr.
    (_vis): Rename to "".

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 017594f..ae36634 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,18 @@
 2011-10-12 David S. Miller
+    * config/sparc/sparc.md (UNSPEC_FPMERGE): Delete.
+    (UNSPEC_MUL16AU, UNSPEC_MUL8, UNSPEC_MUL8SU, UNSPEC_MULDSU): New
+    unspecs.
+    (fpmerge_vis): Remove inaccurate comment, represent using vec_select
+    of a vec_concat.
+    (vec_interleave_lowv8qi, vec_interleave_highv8qi): New insns.
+    (fmul8x16_vis, fmul8x16au_vis, fmul8sux16_vis, fmuld8sux16_vis):
+    Reimplement as unspecs and remove inaccurate comments.
+    (vis3_shift_patname): New code attr.
+    (_vis): Rename to "v3".
+    (vis3_addsub_ss_patname): New code attr.
+    (_vis): Rename to "".
+
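The fpmerge representation agreed on in this message, a vec_select over a vec_concat with the selector 0 4 1 5 2 6 3 7, can be modeled in plain C as below. This is only an illustration of what the selector means; fpmerge_model is a made-up name, and element 0 is simply the first array entry here, with no claim about how elements map to bytes in a SPARC register.

```c
#include <stdint.h>

/* Model of (vec_select (vec_concat a b) (parallel [0 4 1 5 2 6 3 7])):
   concatenate two 4-element vectors, then pick elements by index.  */
void fpmerge_model (const uint8_t a[4], const uint8_t b[4], uint8_t out[8])
{
  static const int sel[8] = { 0, 4, 1, 5, 2, 6, 3, 7 };
  uint8_t concat[8];

  /* vec_concat: a occupies elements 0..3, b occupies elements 4..7.  */
  for (int i = 0; i < 4; i++)
    {
      concat[i] = a[i];
      concat[i + 4] = b[i];
    }

  /* vec_select: output element i is concat[sel[i]].  */
  for (int i = 0; i < 8; i++)
    out[i] = concat[sel[i]];
}
```

The result alternates elements of a and b, which is why the same shape can serve as the basis for the interleave patterns named in the review.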
Re: VIS2 pattern review
On 10/13/2011 12:55 PM, David Miller wrote:
> -(define_insn "_vis"
> +(define_insn ""

Missing a "3" on the end. Otherwise these look ok.

> Unfortunately, that would involve some ABI changes for the VIS
> builtins. I'm trending towards considering just changing things
> anyways since the VIS intrinsics were next to unusable beforehand.

Why? You can do just about anything you like inside the builtin expander, including frobbing the modes around.

E.g. for x86, we can't create anything TImode at the user level, and thus can't expose either TImode or V2TImode operands into the builtins. So we have the builtins use V2DI or V4DImode and do some gen_lowpart frobbing while expanding the builtins to rtl.

I don't see why you couldn't accept DImode from the builtin and transform it to V1DImode in the rtl. Etc.

r~
Re: VIS2 pattern review
From: Richard Henderson
Date: Thu, 13 Oct 2011 13:06:19 -0700

> On 10/13/2011 12:55 PM, David Miller wrote:
>> -(define_insn "_vis"
>> +(define_insn ""
>
> Missing a "3" on the end. Otherwise these look ok.

Thanks for finding that.

>> Unfortunately, that would involve some ABI changes for the VIS
>> builtins. I'm trending towards considering just changing things
>> anyways since the VIS intrinsics were next to unusable beforehand.
>
> Why? You can do just about anything you like inside the builtin
> expander, including frobbing the modes around.

Hmmm, ok, I'll look into approaching the change that way.

Thanks again Richard.
Re: VIS2 pattern review
> Unfortunately, that would involve some ABI changes for the VIS
> builtins. I'm trending towards considering just changing things
> anyways since the VIS intrinsics were next to unusable beforehand.

Could you elaborate? The calling conventions for vectors (like for the other classes) shouldn't depend on the mode but only on the type.

-- 
Eric Botcazou
gcc-4.5-20111013 is now available
Snapshot gcc-4.5-20111013 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20111013/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.5 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_5-branch revision 179947

You'll find:

  gcc-4.5-20111013.tar.bz2    Complete GCC

    MD5=06bf0ac5a15811b6c1a3997d2d0db585
    SHA1=791b37fa42cb824f58e5015da972505cbe683639

Diffs from 4.5-20111006 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.5 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: VIS2 pattern review
From: Eric Botcazou
Date: Fri, 14 Oct 2011 00:41:42 +0200

>> Unfortunately, that would involve some ABI changes for the VIS
>> builtins. I'm trending towards considering just changing things
>> anyways since the VIS intrinsics were next to unusable beforehand.
>
> Could you elaborate? The calling conventions for vectors (like for the other
> classes) shouldn't depend on the mode but only on the type.

Right and as Richard said I can munge the modes during expansion of existing builtins when needed.
Re: Question about default_elf_asm_named_section function
"Iyer, Balaji V" writes:

> This email is in reference to the "default_elf_asm_named_section"
> function in the varasm.c file.
>
> This function is defined like this:
>
> void
> default_elf_asm_named_section (const char *name, unsigned int flags,
>                                tree decl ATTRIBUTE_UNUSED)
>
>
> But, inside the function, there is this if-statement:
>
>
>   if (HAVE_COMDAT_GROUP && (flags & SECTION_LINKONCE))
>     {
>       if (TREE_CODE (decl) == IDENTIFIER_NODE)
>         fprintf (asm_out_file, ",%s,comdat", IDENTIFIER_POINTER (decl));
>       else
>         fprintf (asm_out_file, ",%s,comdat",
>                  IDENTIFIER_POINTER (DECL_COMDAT_GROUP (decl)));
>     }
>
>
> The decl is set with "ATTRIBUTE_UNUSED" but the if-statement is using "decl."
> Should we remove the attribute unused tag near the "tree decl" or is the
> if-statement a deadcode that should never be ?

ATTRIBUTE_UNUSED does not mean "this parameter is never used." It means "this parameter may not be used." The difference is due to #ifdefs--if a parameter is only used in code that is sometimes #ifdef'ed out, then the parameter should be marked as ATTRIBUTE_UNUSED.

In this case the parameter is always used, so we might as well remove the ATTRIBUTE_UNUSED.

Ian
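The "may not be used" meaning can be seen in a small standalone sketch. The macro definition below mirrors how GCC defines ATTRIBUTE_UNUSED for GNU C as far as I recall; HAVE_FANCY_OUTPUT and describe() are hypothetical names invented for this example and do not exist in GCC.

```c
#include <stdio.h>

/* Assumed to match the usual GNU C definition of the macro.  */
#define ATTRIBUTE_UNUSED __attribute__ ((__unused__))

/* HAVE_FANCY_OUTPUT is a made-up configuration macro.  When it is not
   defined, 'tag' is never referenced, and the attribute keeps
   -Wunused-parameter quiet; when it is defined, 'tag' is used and the
   attribute is simply harmless.  */
int describe (int value, const char *tag ATTRIBUTE_UNUSED)
{
#ifdef HAVE_FANCY_OUTPUT
  return printf ("%s: %d\n", tag, value);
#else
  return printf ("%d\n", value);
#endif
}
```

A parameter that is referenced in every configuration, like decl in the function quoted above, gains nothing from the attribute, which is why removing it there is reasonable.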
Re: **Help I love GCC
"花儿对我笑" <870523...@qq.com> writes:

> Please see the whole E-mail
> Please send a GCC for Windows. Language: Chinese or English.
> I'm a Chinese student, now I'm studying C++. I want a GCC (for Windows,
> Chinese), but my English isn't very good, and I can't find GCC. So, please
> send me a GCC, for tomorrow of the world.
> Write (send) to me soon. E-MAIL ADDRESS: 870523...@qq.com
> --

This message should have been sent to gcc-h...@gcc.gnu.org, not gcc@gcc.gnu.org. Please send any followups to gcc-help. Thanks.

For gcc for Windows see http://cygwin.com or http://mingw.org .

Ian
Re: **Help I love GCC
2011/10/14 Ian Lance Taylor :
> "花儿对我笑" <870523...@qq.com> writes:
>
>> Please see the whole E-mail
>> Please send a GCC for Windows. Language: Chinese or English.
>> I'm a Chinese student, now I'm studying C++. I want a GCC (for Windows,
>> Chinese), but my English isn't very good, and I can't find GCC. So, please
>> send me a GCC, for tomorrow of the world.
>> Write (send) to me soon. E-MAIL ADDRESS: 870523...@qq.com
>> --
>
> This message should have been sent to gcc-h...@gcc.gnu.org, not
> gcc@gcc.gnu.org. Please send any followups to gcc-help. Thanks.
>
> For gcc for Windows see http://cygwin.com or http://mingw.org .
>
> Ian
>

And if you want an IDE, google mingw+eclipse. If you need anything more, let me know.

Liu
Re: VIS2 pattern review
> Right and as Richard said I can munge the modes during expansion of
> existing builtins when needed.

OK, but you precisely shouldn't need to do it since the type is fixed.

-- 
Eric Botcazou