Re: A visualization of GCC's passes, as a subway map
On 07/12/2011 06:07 PM, David Malcolm wrote:
> On this build of GCC (standard Fedora 15 gcc package of 4.6.0), the
> relevant part of cfgexpand.c looks like this:
>
> struct rtl_opt_pass pass_expand =
> {
>  {
>   RTL_PASS,
>   "expand",                             /* name */
>
> [...snip...]
>
>   PROP_ssa | PROP_gimple_leh | PROP_cfg
>     | PROP_gimple_lcx,                  /* properties_required */
>   PROP_rtl,                             /* properties_provided */
>   PROP_ssa | PROP_trees,                /* properties_destroyed */
>
> [...snip...]
>
> }
>
> and gcc/tree-pass.h has:
>   #define PROP_trees \
>     (PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh | PROP_gimple_lomp)
>
> and that matches up with both the diagram, and the entry for "expand" in
> the table below [1].
>
> So it seems that the diagram is correctly accessing the
> "properties_destroyed" data for the "expand" pass; does PROP_gimple_lcx
> need to be added somewhere? (or should the diagram be taught to
> special-case some things, perhaps?)

Yes, PROP_gimple_lcx needs to be added to PROP_trees. I cannot approve the patch, unfortunately.

Also, several passes are likely lacking PROP_crited in their properties_destroyed. At least all those that can be followed by TODO_cleanup_cfg:

* pass_split_functions
* pass_call_cdcen
* pass_build_cfg
* pass_cleanup_eh
* pass_if_conversion
* pass_ipa_inline
* pass_early_inline
* pass_fixup_cfg
* pass_cse_sincos
* pass_predcom
* pass_lim
* pass_loop_prefetch
* pass_vectorize
* pass_iv_canon
* pass_tree_unswitch
* pass_vrp
* pass_sra_early
* pass_sra
* pass_early_ipa_sra
* pass_ccp
* pass_fold_builtins
* pass_copy_prop
* pass_dce
* pass_dce_loop
* pass_cd_dce
* pass_dominator
* pass_phi_only_cprop
* pass_forwprop
* pass_tree_ifcombine
* pass_scev_cprop
* pass_parallelize_loops
* pass_ch
* pass_cselim
* pass_pre
* pass_fre
* pass_tail_recursion
* pass_tail_calls

Paolo
Pta_flags enum overflow in i386.c
Hi All!

As you can see, the pta_flags enum in i386.c is almost full, so there is a risk of overflow in the near future. A comment in the source code advises to "widen struct pta flags", which is currently defined as unsigned. But that does not look optimal.

What would be the most appropriate solution to this problem?

Thanks in advance,
Igor
RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)
Hello all,

I seek a tree attribute which tells that a "pointer" (in the C/middle-end sense) does not alias with any other variable in the translation unit (i.e. like "restrict"), but on the other hand, it should prevent code movements and value assumptions across (impure) function calls - as it is done for non-restrict pointers.

The primary use is Fortran's coarrays. Those variables exist on all processes ("images") and can be accessed remotely using one-sided communication semantics. As coarrays are also used in hot loops, I would like to avoid using a non-restricted pointer. A similar issue exists for variables with the ASYNCHRONOUS attribute.

Middle-end question: How to handle this best with regards to the middle end?

C/C++ question: As one can also use asynchronous I/O, asynchronous communication via libraries such as MPI, or single-sided communication via POSIX threads - or with C++0x's std::thread - in C/C++: How do you handle it? Just by avoiding "restrict"? Or do you have a solution which can also be applied for Fortran? I'm sure that a "restrict + hope & pray" solution won't work reliably and thus is not used ;-)

Fortran question: Do my requirements make sense? That is: No code movements for any variable which is a coarray or has the asynchronous attribute in the scoping unit. Plus, no assumption of the value after any call to any impure function? Can something be relaxed or has anything to be tightened?

ASYNCHRONOUS is defined in the Fortran standard (e.g. 2008, Section 5.3.4) and extended to explicitly allow for asynchronous user functions in Technical Report 29113. The latter functionality will be used in the Message Passing Interface (MPI) specification 3.0. Like VOLATILE, the asynchronous attribute might be restricted to a block (in C: { ... }). Coarrays are defined in the Fortran 2008 standard. (For semantics of interest, see especially Section 8.5 and, in particular, Subsection 8.5.2.)

The Fortran 2008 standard is available at ftp://ftp.nag.co.uk/sc22wg5/N1801-N1850/N1830.pdf and the PDTR 29113 at ftp://ftp.nag.co.uk/sc22wg5/N1851-N1900/N1866.pdf

Example 1: Asynchronous I/O; in this example using built-in functions, but asynchronous MPI communication would be another example

   integer, ASYNCHRONOUS :: a
   ...
   READ(unit_number,ID=idvar, asynchronous='yes') a
   ...
   WAIT(ID=idvar)
   ... = a

Here, "= a" may not be moved before WAIT.

Example 2: Coarray with sync. The SYNC is not directly called, but via a wrapper function to increase the fun factor.

   subroutine sub(coarray)
     integer :: coarray[*]
     coarray = 5
     call SYNC_calling_proc()
     ! coarray is modified remotely
     call SYNC_calling_proc()
     if (coarray /= 5) ...
   end subroutine sub

Here, the "if" may not be removed as the image could have been changed remotely.

Example 3: Allow other optimizations

   subroutine sub(coarray1, coarray2)
     integer :: coarray1[*], coarray2[*]
     coarray1 = 5
     coarray2 = 7
     if (coarray1 /= 5)

Here, the "if" can be removed as "coarray1" cannot alias with any other variable in "sub" as it is not TARGET - and, in particular, it cannot alias with "coarray2" as neither of them is a pointer.

Tobias
Re: A visualization of GCC's passes, as a subway map
On Wed, Jul 13, 2011 at 11:49 AM, Paolo Bonzini wrote:
> On 07/12/2011 06:07 PM, David Malcolm wrote:
>>
>> On this build of GCC (standard Fedora 15 gcc package of 4.6.0), the
>> relevant part of cfgexpand.c looks like this:
>>
>> struct rtl_opt_pass pass_expand =
>> {
>>  {
>>   RTL_PASS,
>>   "expand",                             /* name */
>>
>> [...snip...]
>>
>>   PROP_ssa | PROP_gimple_leh | PROP_cfg
>>     | PROP_gimple_lcx,                  /* properties_required */
>>   PROP_rtl,                             /* properties_provided */
>>   PROP_ssa | PROP_trees,                /* properties_destroyed */
>>
>> [...snip...]
>>
>> }
>>
>> and gcc/tree-pass.h has:
>>   #define PROP_trees \
>>     (PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh |
>> PROP_gimple_lomp)
>>
>> and that matches up with both the diagram, and the entry for "expand" in
>> the table below [1].
>>
>> So it seems that the diagram is correctly accessing the
>> "properties_destroyed" data for the "expand" pass; does PROP_gimple_lcx
>> need to be added somewhere? (or should the diagram be taught to
>> special-case some things, perhaps?)
>
> Yes, PROP_gimple_lcx needs to be added to PROP_trees. I cannot approve the
> patch, unfortunately.

Hm, why? Complex operations are lowered after a complex lowering pass has executed. They are still lowered on RTL, so I don't see why we need to destroy them technically.

> Also, several passes are likely lacking PROP_crited in their
> properties_destroyed. At least all those that can be followed by
> TODO_cleanup_cfg:

Yeah, well - most PROPerties are for informational purposes only right now - things like critical edge splitting should possibly be automatically managed by the pass manager via properties (likewise dominator info, for which we don't have a property right now).

Of course we'd like to have a verifier for each property.

Richard.

> * pass_split_functions
> * pass_call_cdcen
> * pass_build_cfg
> * pass_cleanup_eh
> * pass_if_conversion
> * pass_ipa_inline
> * pass_early_inline
> * pass_fixup_cfg
> * pass_cse_sincos
> * pass_predcom
> * pass_lim
> * pass_loop_prefetch
> * pass_vectorize
> * pass_iv_canon
> * pass_tree_unswitch
> * pass_vrp
> * pass_sra_early
> * pass_sra
> * pass_early_ipa_sra
> * pass_ccp
> * pass_fold_builtins
> * pass_copy_prop
> * pass_dce
> * pass_dce_loop
> * pass_cd_dce
> * pass_dominator
> * pass_phi_only_cprop
> * pass_forwprop
> * pass_tree_ifcombine
> * pass_scev_cprop
> * pass_parallelize_loops
> * pass_ch
> * pass_cselim
> * pass_pre
> * pass_fre
> * pass_tail_recursion
> * pass_tail_calls
>
> Paolo
Re: RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)
On Wed, Jul 13, 2011 at 12:30 PM, Tobias Burnus wrote:
> Hello all,
>
> I seek a tree attribute which tells that a "pointer" (in the C/middle-end
> sense) does not alias with any other variable in the translation unit (i.e.
> like "restrict"), but on the other hand, it should prevent code movements
> and value assumptions across (impure) function calls - as it is done for
> non-restrict pointers.
>
> The primary use is Fortran's coarrays. Those variables exist on all
> processes ("images") and can be accessed remotely using one-sided
> communication semantics. As coarrays are also used in hot loops, I would
> like to avoid using a non-restricted pointer. A similar issue exists for
> variables with the ASYNCHRONOUS attribute.
>
>
> Middle-end question: How to handle this best with regards to the middle end?
>
> C/C++ question: As one can also use asynchronous I/O, asynchronous
> communication via libraries such as MPI, or single-sided communication via
> POSIX threads - or with C++0x's std::thread - in C/C++: How do you
> handle it? Just by avoiding "restrict"? Or do you have a solution which can
> also be applied for Fortran? I'm sure that a "restrict + hope & pray"
> solution won't work reliably and thus is not used ;-)
>
> Fortran question: Do my requirements make sense? That is: No code movements
> for any variable which is a coarray or has the asynchronous attribute in the
> scoping unit. Plus, no assumption of the value after any call to any impure
> function? Can something be relaxed or has anything to be tightened?
>
>
> ASYNCHRONOUS is defined in the Fortran standard (e.g. 2008, Section 5.3.4)
> and extended to explicitly allow for asynchronous user functions in
> Technical Report 29113. The latter functionality will be used in the Message
> Passing Interface (MPI) specification 3.0. Like VOLATILE, the asynchronous
> attribute might be restricted to a block (in C: { ... }). Coarrays are
> defined in the Fortran 2008 standard. (For semantics of interest, see
> especially Section 8.5 and, in particular, Subsection 8.5.2.)
>
> The Fortran 2008 standard is available at
> ftp://ftp.nag.co.uk/sc22wg5/N1801-N1850/N1830.pdf and the PDTR 29113 at
> ftp://ftp.nag.co.uk/sc22wg5/N1851-N1900/N1866.pdf
>
>
> Example 1: Asynchronous I/O; in this example using built-in functions, but
> asynchronous MPI communication would be another example
>    integer, ASYNCHRONOUS :: a
>    ...
>    READ(unit_number,ID=idvar, asynchronous='yes') a
>    ...
>    WAIT(ID=idvar)
>    ... = a
>
> Here, "= a" may not be moved before WAIT.
>
> Example 2: Coarray with sync. The SYNC is not directly called, but via a
> wrapper function to increase the fun factor.
>    subroutine sub(coarray)
>      integer :: coarray[*]
>      coarray = 5
>      call SYNC_calling_proc()
>      ! coarray is modified remotely
>      call SYNC_calling_proc()
>      if (coarray /= 5) ...
>    end subroutine sub
> Here, the "if" may not be removed as the image could have been changed
> remotely.
>
> Example 3: Allow other optimizations
>    subroutine sub(coarray1, coarray2)
>      integer :: coarray1[*], coarray2[*]
>      coarray1 = 5
>      coarray2 = 7
>      if (coarray1 /= 5)
> Here, the "if" can be removed as "coarray1" cannot alias with any other
> variable in "sub" as it is not TARGET - and, in particular, it cannot alias
> with "coarray2" as neither of them is a pointer.

From the last two examples it looks like a regular restrict qualified pointer would work. At least I don't see how it would not.

Richard.

> Tobias
Re: Google Summer of Code 2011 Doc Camp 17 October - 21 October
On 12 July 2011 18:29, Diego Novillo wrote:
> On 11-07-12 12:52 , Philip Herron wrote:
>
>> Would Gcc internals documentation count or is it more for a whole
>> project documentation work? I probably missed the thing about this in
>> London since i had to leave on the Sunday morning.
>>
>> I am kind of interested but i am unsure what kind of documentation
>> would be appropriate i've spent the last few days working on some
>> internals documentation on and off so its kind of fresh in my mind.
>
> Any kind of documentation is fine. Internals, user documentation, etc.
>
>
> Diego.
>

I am quite interested in applying for this, but I am not quite sure what my proposal should look like. Should I just discuss my interest in front-end and middle-end stuff, the current lack of documentation, etc.?

As for the question "Who else would you like to recommend to attend the book sprint?", I can only think of you and Ian off the top of my head, but I would also suggest Andi, who has helped me a lot, especially with documentation.

--Phil
Re: IRA: matches insn even though !reload_in_progress
Michael Meissner wrote:
> On Mon, Jul 11, 2011 at 12:38:34PM +0200, Georg-Johann Lay wrote:
>> How do I write a pre-reload combine + pre-reload split correctly?
>> I'd like to avoid clobber reg.
>>
>> Thanks much for any hint.
>
> The move patterns are always kind of funny, particularly during register
> allocation.
>
> Lets see given your pattern is:
>
> (define_insn_and_split "*mulsqihi3.const"
>   [(set (match_operand:HI 0 "register_operand" "=&r")
>         (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "a"))
>                  (match_operand:HI 2 "u8_operand" "n")))]
>   "AVR_HAVE_MUL
>    && !reload_completed
>    && !reload_in_progress"
>   { gcc_unreachable(); }
>   "&& 1"
>   [(set (match_dup 3)
>         (match_dup 2))
>    ; *mulsu
>    (set (match_dup 0)
>         (mult:HI (sign_extend:HI (match_dup 1))
>                  (zero_extend:HI (match_dup 3))))]
>   {
>     operands[3] = gen_reg_rtx (QImode);
>   })
>
> I would probably rewrite it as:
>
> (define_insn_and_split "*mulsqihi3.const"
>   [(set (match_operand:HI 0 "register_operand" "=&r")
>         (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "a"))
>                  (match_operand:HI 2 "u8_operand" "n")))]
>   "AVR_HAVE_MUL
>    && !reload_completed
>    && !reload_in_progress"
>   { gcc_unreachable(); }
>   "&& 1"
>   [(set (match_dup 3)
>         (unspec:QI [(match_dup 2)] WRAPPER))
>    ; *mulsu
>    (set (match_dup 0)
>         (mult:HI (sign_extend:HI (match_dup 1))
>                  (zero_extend:HI (match_dup 3))))]
>   {
>     operands[3] = gen_reg_rtx (QImode);
>   })
>
> (define_insn "*wrapper"
>   [(set (match_operand:QI 0 "register_operand" "=&r")
>         (unspec:QI [(match_operand:QI 1 "u8_operand" "n")] WRAPPER))]
>   "AVR_HAVE_MUL"
>   "...")
>
> That way you are using the unspec to make the move not look like a generic
> move.

All the trouble arises because there is no straightforward way to write the right insn condition, doesn't it?

Working around like that will work but it is obfuscating the code, IMHO.

Is there a specific reason for early-clobber in *wrapper?

As *wrapper is not a proper move, could this produce move-move-sequences? These would have to be fixed in peep2 or so.

> The other way to do it, would be to split it to another pattern that combines
> the move and the HI multiply, which you then split after reload. Something
> like:
>
> (define_insn_and_split "*mulsqihi3_const"
>   [(set (match_operand:HI 0 "register_operand" "=&r")
>         (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "a"))
>                  (match_operand:HI 2 "u8_operand" "n")))]
>   "AVR_HAVE_MUL
>    && !reload_completed
>    && !reload_in_progress"
>   { gcc_unreachable(); }
>   "&& 1"
>   [(parallel [(set (match_dup 3)
>                    (match_dup 2))
>               ; *mulsu
>               (set (match_dup 0)
>                    (mult:HI (sign_extend:HI (match_dup 1))
>                             (zero_extend:HI (match_dup 3))))])]
>   {
>     operands[3] = gen_reg_rtx (QImode);
>   })
>
> (define_insn_and_split "*mulsqihi3_const2"
>   [(set (match_operand:QI 0 "register_operand" "r")
>         (match_operand:QI 1 "u8_operand" "n"))
>    (set (match_operand:HI 2 "register_operand" "r")
>         (mult:HI (sign_extend:HI (match_operand:QI 3 "register_operand" "a"))
>                  (zero_extend:HI (match_dup 0))))]
>   "AVR_HAV_MUL"
>   "#"
>   "&& reload_completed"
>   [(set (match_dup 0)
>         (match_dup 1))
>    (set (match_dup 2)
>         (mult:HI (sign_extend:HI (match_dup 3))
>                  (zero_extend:HI (match_dup 0))))]
>   {})

The latest patch

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00898.html

works around the insn condition shortcoming by writing a gate function.

This is the missing part, and if gcc learns something like !ira_in_progress or !split1_completed in the future, the cleanup will be minimal and straightforward. The code is obvious and without obfuscation:

(define_insn_and_split "*mulsqihi3.sconst"
  [(set (match_operand:HI 0 "register_operand" "=r")
        (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "d"))
                 (match_operand:HI 2 "s8_operand" "n")))]
  "AVR_HAVE_MUL
   && avr_gate_split1()"
  { gcc_unreachable(); }
  "&& 1"
  [(set (match_dup 3)
        (match_dup 2))
   ; mulqihi3
   (set (match_dup 0)
        (mult:HI (sign_extend:HI (match_dup 1))
                 (sign_extend:HI (match_dup 3))))]
  {
    operands[2] = GEN_INT (trunc_int_for_mode (INTVAL (operands[2]), QImode));
    operands[3] = gen_reg_rtx (QImode);
  })

/* FIXME: We compose some insns by means of insn combine
   and split them in split1. We don't want IRA/reload
   to combine them to the original insns again because
   that avoid some CSE optimizations if constants are
   involved. If IRA/reload combines, the recombined ins
Re: Google Summer of Code 2011 Doc Camp 17 October - 21 October
On Wed, Jul 13, 2011 at 07:09, Philip Herron wrote:

> I am quite interested in applying for this but not quite sure what my
> proposal should be like. Should i just discuss my interest in
> front-end and middle-end stuff and the lack of documentation currently
> etc.

Given that you are volunteering to produce documentation, I would say that you should propose something that interests you. I would particularly want to see more beginner and internals documentation (which would be appropriate for the Quick Start guides described in the call for proposals).

> Plus the question "Who else would you like to recommend to attend the
> book sprint?"

Anyone who is interested in writing documentation, of course.


Diego.
Re: RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)
On 07/13/2011 12:57 PM, Richard Guenther wrote:
> On Wed, Jul 13, 2011 at 12:30 PM, Tobias Burnus wrote:
>> Example 2: Coarray with sync. The SYNC is not directly called, but via a
>> wrapper function to increase the fun factor.
>>    subroutine sub(coarray)
>>      integer :: coarray[*]
>>      coarray = 5
>>      call SYNC_calling_proc()
>>      ! coarray is modified remotely
>>      call SYNC_calling_proc()
>>      if (coarray /= 5) ...
>>    end subroutine sub
>> Here, the "if" may not be removed as the image could have been changed
>> remotely.
>
> From the last two examples it looks like a regular restrict qualified
> pointer would work. At least I don't see how it would not.

Would it? How does the compiler know that between "call SYNC_calling_proc()" the value of "coarray" could change? Hmm, seemingly, that's indeed the case, looking at the optimized dump of the example above:

sub (integer(kind=4) * restrict coarray)
{
  integer(kind=4) D.1560;
  *coarray_1(D) = 5;
  sync_calling_proc ();
  sync_calling_proc ();
  D.1560_2 = *coarray_1(D);
  if (D.1560_2 != 5)

Well, then I have a different question: How can one tell the middle end to optimize the "if (...)" away in the following case? Having an "integer(kind=4) & restrict non_aliasing_var" does not seem to be sufficient to do so:

   subroutine sub(non_aliasing_var)
     interface
       subroutine some_function()
       end subroutine some_function
     end interface

     integer :: non_aliasing_var
     non_aliasing_var = 5
     call some_function()
     if (non_aliasing_var /= 5) call foobar_()
   end subroutine sub

That's an optimization which other compilers do - such as NAG or PathScale/Open64/sunf95.

Tobias
[pph] Merged trunk->pph
This brings in the cp_binding_level change I made recently on trunk. Tested on x86_64. Diego.
Re: A visualization of GCC's passes, as a subway map
On 07/13/2011 12:54 PM, Richard Guenther wrote:
>> Yes, PROP_gimple_lcx needs to be added to PROP_trees. I cannot approve the
>> patch, unfortunately.
>
> Hm, why? complex operations are lowered after a complex lowering pass
> has executed. they are still lowered on RTL, so I don't see why we need
> to destroy them technically.

Because it's PROP_*gimple*_lcx. :)

Paolo
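
For readers following the PROP_trees discussion, here is a minimal sketch in C of how a properties_destroyed mask interacts with the set of properties that currently hold. It assumes that the set is updated as (current | provided) & ~destroyed after a pass has run. The bit values, the names PROP_trees_old and PROP_trees_new, and the helper props_after_pass are made up for illustration; they are not GCC's definitions.

```c
#include <assert.h>

/* Illustrative bit values only (assumption); the real definitions
   live in gcc/tree-pass.h.  */
#define PROP_gimple_any   (1u << 0)
#define PROP_gimple_lcf   (1u << 1)
#define PROP_gimple_leh   (1u << 2)
#define PROP_cfg          (1u << 3)
#define PROP_ssa          (1u << 5)
#define PROP_rtl          (1u << 7)
#define PROP_gimple_lomp  (1u << 8)
#define PROP_gimple_lcx   (1u << 10)

/* PROP_trees as quoted from tree-pass.h in this thread.  */
#define PROP_trees_old \
  (PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh | PROP_gimple_lomp)

/* PROP_trees with the addition proposed in this thread.  */
#define PROP_trees_new (PROP_trees_old | PROP_gimple_lcx)

/* Hypothetical helper: the properties that hold after a pass has run,
   assuming the update rule (current | provided) & ~destroyed.  */
static unsigned int
props_after_pass (unsigned int current, unsigned int required,
                  unsigned int provided, unsigned int destroyed)
{
  /* A pass may only run when everything it requires is available.  */
  assert ((current & required) == required);
  return (current | provided) & ~destroyed;
}
```

With the old PROP_trees, PROP_gimple_lcx would still be reported as holding after "expand"; with the proposed definition it is cleared together with the other GIMPLE properties.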
Re: Pta_flags enum overflow in i386.c
Igor Zamyatin writes:

> As you may see pta_flags enum in i386.c is almost full. So there is a
> risk of overflow in quite near future. Comment in source code advises
> "widen struct pta flags" which is now defined as unsigned. But it
> looks not optimal.
>
> What will be the most proper solution for this problem?

Why is widening pta_flags "not optimal"?

It's hard for me to believe that we still care about bootstrapping an i386-*-* compiler with a compiler which doesn't support any 64-bit type. So I don't see any problem with setting need_64bit_hwint=yes in config.gcc for i386-*-*, changing pta_flags to be unsigned HOST_WIDE_INT, and letting pta_flags go up to (unsigned HOST_WIDE_INT) 1 << 63.

If anybody doesn't like that idea, we can simply add a flags2 field and a pta_flags2 enum with PTA2_xxx constants.

Ian
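
The widening suggested above might look roughly like the following C sketch. It uses uint64_t as a stand-in for unsigned HOST_WIDE_INT (an assumption: that HOST_WIDE_INT is 64 bits wide on the host). The type pta_flags_t, the helpers pta_bit and pta_has, and the PTA_EXAMPLE_* names are hypothetical and do not reproduce the real PTA_* list in i386.c.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for GCC's unsigned HOST_WIDE_INT, assuming it is 64 bits.  */
typedef uint64_t pta_flags_t;

/* Hypothetical helper: build the flag for bit position BIT (0..63).
   The cast happens before the shift, so bits 32..63 are well defined.  */
static pta_flags_t
pta_bit (unsigned int bit)
{
  assert (bit < 64);
  return (pta_flags_t) 1 << bit;
}

/* Illustrative flag names, not the real PTA_* constants.  */
#define PTA_EXAMPLE_LOW    (pta_bit (0))
#define PTA_EXAMPLE_BIT32  (pta_bit (32))  /* does not fit in 32 bits */
#define PTA_EXAMPLE_TOP    (pta_bit (63))

/* Return nonzero if FLAG is set in FLAGS.  */
static int
pta_has (pta_flags_t flags, pta_flags_t flag)
{
  return (flags & flag) != 0;
}
```

The point of the cast in pta_bit is the same as in "(unsigned HOST_WIDE_INT) 1 << 63": shifting a plain int constant by 32 or more would not be valid.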
Re: RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)
Tobias Burnus writes:

> Would it? How does the compiler know that between "call
> SYNC_calling_proc()" the value of "coarray" could change? Hmm,
> seemingly, that's indeed the case, looking at the optimized dump of
> the example above:

The C99 restrict qualifier doesn't mean that some random function can change the memory to which the pointer points; it means that assignments through pointer 1 can't change the memory to which pointer 2 points. That is, restrict is all about whether one pointer can affect another; it doesn't say anything about functions, and in general a call to a function can change any memory pointed to by any pointer.

> Well, then I have a different question: How can one tell the middle
> end to optimize the "if (...)" away in the following case? Seemingly
> having an "integer(kind=4) & restrict non_aliasing_var" does not seem
> to be sufficient to do so:
>
>    subroutine sub(non_aliasing_var)
>      interface
>        subroutine some_function()
>        end subroutine some_function
>      end interface
>
>      integer :: non_aliasing_var
>      non_aliasing_var = 5
>      call some_function()
>      if (non_aliasing_var /= 5) call foobar_()
>    end subroutine sub
>
> That's an optimization, which other compiles do - such as NAG or
> PathScale/Open64/sunf95.

From a C perspective, the trick here is to know that the address "non_aliasing_var" does not escape the current function, and that therefore it cannot be changed by a function call. gcc already knows that local variables whose address is not taken do not escape the current function.

I don't know how to express the above code in C; is there something in there which makes the compiler think that the code is taking the address of non_aliasing_var? If not, this should already work. If so, what is it? I.e., what does this code look like in C?

Ian
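
The escape argument above can be illustrated with a small C sketch. All names in it (escaped, publish, some_function, local_no_escape, local_escaped) are hypothetical and chosen only for this illustration; it shows the language-level behaviour, not what any particular GCC version actually optimizes.

```c
#include <assert.h>

/* Hypothetical place where an address can be published.  */
static int *escaped;

/* Make the address P visible to other functions.  */
static void
publish (int *p)
{
  assert (p != 0);
  escaped = p;
}

/* An opaque-looking call: writes through the published address, if any.  */
static void
some_function (void)
{
  if (escaped)
    *escaped = 42;
}

/* The address of V is never taken, so no call can modify V.
   A compiler is free to fold the return value to 5.  */
static int
local_no_escape (void)
{
  int v = 5;
  some_function ();
  return v;
}

/* The address of V escapes via publish, so the call may modify V
   and the value has to be reloaded after it.  */
static int
local_escaped (void)
{
  int v = 5;
  int result;
  publish (&v);
  some_function ();
  result = v;
  escaped = 0;  /* do not leave a dangling pointer behind */
  return result;
}
```

A by-reference dummy argument, as in the Fortran example, is closer to the second function from the compiler's point of view, because the caller already holds the address.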
IPA and LTO
Hello, I have written a simple ipa pass and would like to make use of Link time optimisation. My pass requires access to the function bodies and ideally I would like the driver function to be called once at link time and have access to functions in all of the files as if they were one compilation unit. The documentation would indicate that this is possible, but ad hoc instrumentation of some of the other simple ipa passes seems to suggest different behaviour. My question is whether LTO can be used in this way, to have a simple ipa pass called once at link time with access to the function bodies, and if so how is this achieved? cgraph_function_body_availability seems to only be half the story. I am using GCC 4.6.0 with the gold linker plugin (binutils 2.21). Andrew
Re: RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)
On 07/13/2011 03:27 PM, Ian Lance Taylor wrote:
> The C99 restrict qualifier doesn't mean that some random function can
> change the memory to which the pointer points; it means that assignments
> through pointer 1 can't change the memory to which pointer 2 points.
> That is, restrict is all about whether one pointer can affect another;
> it doesn't say anything about functions, and in general a call to a
> function can change any memory pointed to by any pointer.

That was actually my impression - thus, I wanted to have a different flag to tag asynchronous/coarray variables, which do not alias but might change until a synchronization point via single-sided communication or until a wait with asynchronous I/O/communication. As one does not know where a synchronization/waiting point is, all code movements and variable value assumptions (of such tagged variables) should be prohibited across impure function calls.

By contrast, a normal Fortran variable without the POINTER or TARGET attribute does not alias - and may not be changed asynchronously.

The latter is what I thought "restrict" (more precisely: TYPE_QUAL_RESTRICT) does, but seemingly it currently also does the former.

> From a C perspective, the trick here is to know that the address
> "non_aliasing_var" does not escape the current function, and that
> therefore it can not be changed by a function call. gcc already knows
> that local variables whose address is not taken do not escape the
> current function.
>
> I don't know how to express the above code in C; is there something in
> there which makes the compiler think that the code is taking the address
> of non_aliasing_var? If not, this should already work. If so, what is
> it? I.e., what does this code look like in C?

I am not sure whether there is a 100% equivalence, but it should match:

void some_function(void);

void
sub (int *restrict non_aliasing_var)
{
  *non_aliasing_var = 5;
  some_function ();
  if (*non_aliasing_var != 5)
    foobar_();
}

Also in this case, the "if" block is not optimized away with -O3.

Tobias

PS: See also just-filed PR middle-end/49733.
Re: RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)
On 07/13/2011 03:46 PM, Tobias Burnus wrote:
> On 07/13/2011 03:27 PM, Ian Lance Taylor wrote:
>> [...] it doesn't say anything about functions, and in general a call to a
>> function can change any memory pointed to by any pointer.

I misread the paragraph - in particular the last sentence. In Fortran that's not the case. Fortran's alias rules say that a dummy argument may only be modified through the dummy argument, i.e. for

   subroutine foo(a, b)  ! "a" and "b" are passed by reference
     integer :: a, b
     a = 5
     b = 6
     call bar()

the value of "a" is neither modified by "b = 6" nor by "call bar()". Exception: If "a" is a target (i.e. some pointer may point to it) or "a" is a POINTER.

Thus, in my test case, the function call may not change the value - and, thus, the "if" block can be optimized away. See the quote of the Fortran standard at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49733#c0

Seemingly, in C only the first case, Fortran's "b = 6" (C: "*b = 6"), would be guaranteed not to affect "a" if "a" (and "b") are "restrict", while the function call can change the value.

In that sense, I do not seem to need a new flag for asynchronous/coarrays - which are handled by TYPE_QUAL_RESTRICT, but I need a new flag for normal (noncoarray, nonasynchronous) variables, which are passed by value or are allocatable - and where a function call won't affect the value.

Tobias
unicode in gcc 4.6.1 output
Hi all, As part of a testsuite script I am parsing GCC's output and I noticed that format specifier %qs quotes the string by surrounding it with unicode characters. I can't find where this %qs is defined so that I can try and override it to quote with '%s' or `%s'. Anything but unicode. Any suggestions? Cheers, -- PMatos
Re: unicode in gcc 4.6.1 output
On 13 July 2011 15:18, Paulo J. Matos wrote:
> Hi all,
>
> As part of a testsuite script I am parsing GCC's output and I noticed that
> format specifier %qs quotes the string by surrounding it with unicode
> characters. I can't find where this %qs is defined so that I can try and
> override it to quote with '%s' or `%s'. Anything but unicode.
>
> Any suggestions?

Set LANG=C in your environment when running gcc.
Re: RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)
Tobias Burnus writes:

> In that sense, I do not seem to need a new flag for
> asynchronous/coarrays - which are handled by TYPE_QUAL_RESTRICT, but I
> need a new flag for normal (noncoarray, nonasynchronous) variables,
> which are passed by value or are allocatable - and where a function
> call won't affect the value.

Yes, sounds like it. At first glance I don't think it should be a TYPE_QUAL, I think it should be a flag on the DECL.

Ian
Re: unicode in gcc 4.6.1 output
On Wed, 13 Jul 2011 15:55:58 +0100 Jonathan Wakely wrote:

> On 13 July 2011 15:18, Paulo J. Matos wrote:
> > Hi all,
> >
> > As part of a testsuite script I am parsing GCC's output and I noticed that
> > format specifier %qs quotes the string by surrounding it with unicode
> > characters. I can't find where this %qs is defined so that I can try and
> > override it to quote with '%s' or `%s'. Anything but unicode.
> >
> > Any suggestions?
>
> set LANG=C in your environment when running gcc

Also, the %q format is probably handled inside gcc/diagnostic.c & gcc/pretty-print.c

cheers
--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***
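
The locale dependence discussed in this thread can be sketched in C as follows. This is only an illustration under assumptions, not GCC's code: the struct quote_pair and the function choose_quotes are hypothetical, and the rule "UTF-8 codeset selects U+2018/U+2019, anything else selects ASCII ` and '" is a simplification of what a real implementation would do.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical pair of opening and closing quote strings.  */
struct quote_pair
{
  const char *open;
  const char *close;
};

/* Hypothetical helper: pick quote characters for a given codeset name.
   Assumption: only a UTF-8 codeset gets the Unicode quotes.  */
static struct quote_pair
choose_quotes (const char *codeset)
{
  struct quote_pair q = { "`", "'" };  /* ASCII fallback */

  assert (codeset != 0);
  if (strcmp (codeset, "UTF-8") == 0 || strcmp (codeset, "utf8") == 0)
    {
      q.open = "\xe2\x80\x98";   /* U+2018 LEFT SINGLE QUOTATION MARK */
      q.close = "\xe2\x80\x99";  /* U+2019 RIGHT SINGLE QUOTATION MARK */
    }
  return q;
}
```

In a real program the codeset name would typically come from nl_langinfo (CODESET) after a call to setlocale (LC_ALL, ""), which is why running with LANG=C yields the ASCII quotes.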
Re: unicode in gcc 4.6.1 output
"Paulo J. Matos" writes:

> As part of a testsuite script I am parsing GCC's output and I noticed
> that format specifier %qs quotes the string by surrounding it with
> unicode characters. I can't find where this %qs is defined so that I
> can try and override it to quote with '%s' or `%s'. Anything but
> unicode.

%qs is implemented by pp_base_format in pretty-print.c. Note that %q can be used with any format specifier, not just s. %q is implemented using the open_quote and close_quote variables, which are initialized by gcc_init_libintl in intl.c.

If you are just interested in changing the quote characters that gcc prints when you run it, check your LANG environment variable. In normal use you will only see U+2018 and U+2019 if you are using a LANG which specifies utf8.

Ian
Re: IPA and LTO
On Wed, Jul 13, 2011 at 10:22, AJM-2 wrote:

> My question is whether LTO can be used in this way, to have a simple ipa
> pass called once at link time with access to the function bodies, and if so
> how is this achieved? cgraph_function_body_availability seems to only be
> half the story.

Yes, it can. You seem to be describing what GCC calls a "simple IPA pass". These are passes that cannot run in partitioned LTO mode, as they require the function bodies to operate. Look at passes like pass_ipa_function_and_variable_visibility for an example of a simple IPA pass.


Diego.
Re: IPA and LTO
What you say is in line with my understanding, however when I instrument the execute function of ipa-function-and-variable-visibility (local_function_and_variable_visibility()) I note that:

gcc -flto a.c b.c

causes the pass to be called twice (presumably once per file).

If I split the compilation into two stages, then in the link stage

gcc -flto a.o b.o

the pass is never called.

Conversely, the gate of IPA-Points-to does seem to be called three times at link time (presumably once for each file and then once for all together).

I cannot discover the cause of the different behaviours here.

Diego Novillo-3 wrote:
>
> On Wed, Jul 13, 2011 at 10:22, AJM-2 wrote:
>
>> My question is whether LTO can be used in this way, to have a simple ipa
>> pass called once at link time with access to the function bodies, and if
>> so
>> how is this achieved? cgraph_function_body_availability seems to only be
>> half the story.
>
> Yes, it can. You seem to be describing what GCC calls "simple IPA
> pass". These are passes that cannot run in partitioned LTO mode, as
> they require the function bodies to operate. Look for passes like
> pass_ipa_function_and_variable_visibility for an example of a simple
> IPA pass.
>
>
> Diego.
>
Re: IPA and LTO
Hello,

If local_function_and_variable_visibility were not a simple IPA pass, it would have been called once per function (as a GIMPLE pass is) rather than once per file. I think it is normal for this pass to run twice, because it runs before any link operation. However, I don't know exactly how and when ld is called, or which passes run after that.

Pierre Vittet

On 13/07/2011 17:54, AJM-2 wrote:

> What you say is in line with my understanding, however when I instrument the
> execute function of ipa-function-and-variable-visibility
> (local_function_and_variable_visibility()) I note that:
>
> gcc -flto a.c b.c
> causes the pass to be called twice (presumably once per file).
>
> If I split the compilation into two stages, then in the link stage
> gcc -flto a.o b.o
> the pass is never called.
>
> Conversely, the gate of IPA-Points-to does seem to be called three times at
> link time (presumably once for each file and then once for all together). I
> cannot discover the cause of the different behaviours here.
Re: C++ mangling, function name to mangled name (or tree)
Hello,

Sorry to answer so late (I didn't see your mail in my mailbox, and I was preparing for the RMLL/Libre Software Meeting). Your solution looks like a nice one; I am going to try it and will post the results of my experiment. I was not aware of that hook.

Thanks!

Pierre Vittet

> Hello,
>
> Have you considered doing it the other way around? I mean, why don't you
> hook on the PLUGIN_PRE_GENERICIZE event to catch all function bodies, and
> then compare the argument the user gave you to current_function_name()
> (which returns the full prototype of the current function, i.e. the full
> name of malloc is "void* malloc(size_t)")? Then you can store the
> FUNCTION_DECL tree if there's a match and use it for later processing.
> That's how I proceed in my plugins.
>
> Romain Geissler
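The matching step Romain describes can be sketched outside GCC as plain C. Everything here is a stand-in: `struct wanted_function` and `record_if_wanted` are hypothetical names, the `printable_name` argument plays the role of the string returned by current_function_name(), and the opaque `decl` pointer stands for the FUNCTION_DECL tree. A real plugin would call something like this from its PLUGIN_PRE_GENERICIZE callback.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical plugin state: the full prototype the user asked for and an
   opaque handle to the matching declaration (a FUNCTION_DECL tree in a
   real plugin).  decl stays NULL until a match is seen. */
struct wanted_function {
    const char *prototype;
    const void *decl;
};

/* Intended to be called once per function body.  printable_name stands in
   for the full prototype string of the current function.  Returns 1 when
   the declaration was recorded, 0 otherwise. */
static int record_if_wanted(struct wanted_function *w,
                            const char *printable_name,
                            const void *decl)
{
    if (w->decl != NULL)
        return 0;                       /* already found, keep the first */
    if (strcmp(w->prototype, printable_name) != 0)
        return 0;                       /* not the function we want */
    w->decl = decl;                     /* remember it for later passes */
    return 1;
}
```

Because the comparison is an exact string match, the user has to supply the prototype in the same spelling the front end prints, for example "void* malloc(size_t)" rather than just "malloc".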
Re: IRA: matches insn even though !reload_in_progress
On Wed, Jul 13, 2011 at 01:42:29PM +0200, Georg-Johann Lay wrote:

> All the trouble arises because there is no straight forward way to
> write the right insn condition, doesn't it?
>
> Working around like that will work but it is obfuscating the code, IMHO.

Given I don't know the AVR port, I don't know what the right condition is. I was just guessing from the patterns you provided.

> Is there a specific reason for early-clobber in *wrapper?

You could instead put (clobber (match_scratch)) in the insn, and not split it until after reload.

> As *wrapper is not a proper move, could this produce move-move-sequences?
> These would have to be fixed in peep2 or so.

In theory the register allocator will eliminate the normal move, and just use the wrapper.

> This is the missing part, and if gcc learns something like
> !ira_in_progress or !split1_completed in the future, the cleanup will
> be minimal and straight forward. The code is obvious and without
> obfuscation:

I tend to think a few more _in_progress and _completed flags would be helpful, particularly ira_in_progress, or making reload_in_progress be set by ira. You are probably the first person to run into this. The question is how many do we want and need?

Note, a few years ago, this type of splitting was not possible. You had to have all of the match_scratch'es that you would need allocated in the RTL generation pass. Being able to allocate new pseudos before the split1 pass certainly makes things easier to do, but there are always things that could be done better.

> bool
> avr_gate_split1 (void)
> {
>   if (current_pass->static_pass_number
>       < pass_match_asm_constraints.pass.static_pass_number)
>     return true;
>
>   return false;
> }
>
> I choose .asmcons because it runs between IRA and split1,
> and because I observed that pass numbers are fuzzy;
> presumably because sub-passes like df etc.

I'm not a big fan of this. I think it would be better to just add ira_in_progress and a few others as needed.
-- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Re: IPA and LTO
On Wed, Jul 13, 2011 at 5:54 PM, AJM-2 wrote:

> What you say is in line with my understanding, however when I instrument the
> execute function of ipa-function-and-variable-visibility
> (local_function_and_variable_visibility()) I note that:
>
> gcc -flto a.c b.c
> causes the pass to be called twice (presumably once per file).
>
> If I split the compilation into two stages, then in the link stage
> gcc -flto a.o b.o
> the pass is never called.
>
> Conversely, the gate of IPA-Points-to does seem to be called three times at
> link time (presumably once for each file and then once for all together). I
> cannot discover the cause of the different behaviours here.

It depends on where in the pass pipeline you put your IPA pass. A simple IPA pass that should run at ltrans time (either seeing each partition for the partitioned program or the whole program if you use one partition) needs to be put alongside IPA PTA (that's the only simple IPA pass executed at link LTO time right now).

Richard.
Re: IPA and LTO
Putting my "simple IPA pass" adjacent to IPA-PTA does cause it to be called as expected. However, for each node in the call graph (with cgraph_function_body_availability returning AVAIL_AVAILABLE), gimple_has_body_p is always false.

The call graph data seems to be available, but the documentation indicates that access to the gimple is also possible, using the standard accessors. Is there some extra step that must be taken to access gimple under LTO?

Richard Guenther-2 wrote:

> It depends on where in the pass pipeline you put your IPA pass. A simple
> IPA pass that should run at ltrans time (either seeing each partition for
> the partitioned program or the whole program if you use one partition)
> needs to be put alongside IPA PTA (that's the only simple IPA pass
> executed at link LTO time right now).
>
> Richard.
Re: IPA and LTO
On Wed, Jul 13, 2011 at 10:09 PM, AJM-2 wrote:

> Putting my "simple IPA pass" adjacent to IPA-PTA does cause it to be called
> as expected. However for each node in the call graph (with
> cgraph_function_body_availability returning AVAIL_AVAILABLE),
> gimple_has_body_p is always false.
>
> The call graph data seems to be available, but the documentation indicates
> that access to the gimple is also possible, using the standard accessors.
> Is there some extra step that must be taken to access gimple under LTO?

The body should be available. Make sure to use a recent SVN trunk though.

Richard.
cachecc1 query
Hi,

Has anyone used cachecc1 (http://cachecc1.sourceforge.net/) to cache gcc bootstraps in recent years? The project looks rather stale (2004). I would love to accelerate bootstraps of gcc rebuilds to test snapshots more frequently. Is there any interest in getting this to work? (I'm particularly interested in a darwin port, but would benefit from having it work on any modern platform.)

Fang

--
David Fang
http://www.csl.cornell.edu/~fang/