Re: A visualization of GCC's passes, as a subway map

2011-07-13 Thread Paolo Bonzini

On 07/12/2011 06:07 PM, David Malcolm wrote:

On this build of GCC (standard Fedora 15 gcc package of 4.6.0), the
relevant part of cfgexpand.c looks like this:

struct rtl_opt_pass pass_expand =
{
  {
   RTL_PASS,
   "expand",  /* name */

[...snip...]

   PROP_ssa | PROP_gimple_leh | PROP_cfg
 | PROP_gimple_lcx, /* properties_required */
   PROP_rtl, /* properties_provided */
   PROP_ssa | PROP_trees,   /* properties_destroyed */

[...snip...]

}

and gcc/tree-pass.h has:
   #define PROP_trees \
 (PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh |  PROP_gimple_lomp)

and that matches up with both the diagram, and the entry for "expand" in
the table below [1].

So it seems that the diagram is correctly accessing the
"properties_destroyed" data for the "expand" pass; does PROP_gimple_lcx
need to be added somewhere?  (Or should the diagram be taught to
special-case some things, perhaps?)


Yes, PROP_gimple_lcx needs to be added to PROP_trees.  I cannot approve 
the patch, unfortunately.
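
To make the point concrete, here is a minimal sketch of the bookkeeping being discussed. The bit values are illustrative assumptions (the real ones are in gcc/tree-pass.h and may differ between releases), and props_after_pass is a hypothetical helper that mimics the usual "(current | provided) & ~destroyed" update. With PROP_trees as quoted above, PROP_gimple_lcx survives "expand":

```c
#include <assert.h>

/* Illustrative bit values only; see gcc/tree-pass.h for the real ones.  */
#define PROP_gimple_any   (1u << 0)
#define PROP_gimple_lcf   (1u << 1)
#define PROP_gimple_leh   (1u << 2)
#define PROP_cfg          (1u << 3)
#define PROP_ssa          (1u << 5)
#define PROP_rtl          (1u << 7)
#define PROP_gimple_lomp  (1u << 8)
#define PROP_gimple_lcx   (1u << 10)

/* PROP_trees as quoted in the mail: PROP_gimple_lcx is not part of it.  */
#define PROP_trees \
  (PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh | PROP_gimple_lomp)

/* Hypothetical helper: the property set once a pass has run.  */
static unsigned int
props_after_pass (unsigned int current, unsigned int provided,
                  unsigned int destroyed)
{
  /* A pass should not provide and destroy the same property.  */
  assert ((provided & destroyed) == 0);
  return (current | provided) & ~destroyed;
}

/* Property set after "expand", starting from what the pass requires.  */
static unsigned int
props_after_expand (void)
{
  unsigned int before
    = PROP_ssa | PROP_gimple_leh | PROP_cfg | PROP_gimple_lcx;
  return props_after_pass (before, PROP_rtl, PROP_ssa | PROP_trees);
}
```

Under these assumptions the result still has PROP_gimple_lcx set, although no GIMPLE is left after expansion; adding PROP_gimple_lcx to PROP_trees would clear it.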


Also, several passes are likely lacking PROP_crited in their 
properties_destroyed.  At least all those that can be followed by 
TODO_cleanup_cfg:


* pass_split_functions
* pass_call_cdce
* pass_build_cfg
* pass_cleanup_eh
* pass_if_conversion
* pass_ipa_inline
* pass_early_inline
* pass_fixup_cfg
* pass_cse_sincos
* pass_predcom
* pass_lim
* pass_loop_prefetch
* pass_vectorize
* pass_iv_canon
* pass_tree_unswitch
* pass_vrp
* pass_sra_early
* pass_sra
* pass_early_ipa_sra
* pass_ccp
* pass_fold_builtins
* pass_copy_prop
* pass_dce
* pass_dce_loop
* pass_cd_dce
* pass_dominator
* pass_phi_only_cprop
* pass_forwprop
* pass_tree_ifcombine
* pass_scev_cprop
* pass_parallelize_loops
* pass_ch
* pass_cselim
* pass_pre
* pass_fre
* pass_tail_recursion
* pass_tail_calls

Paolo


Pta_flags enum overflow in i386.c

2011-07-13 Thread Igor Zamyatin
Hi All!

As you may see, the pta_flags enum in i386.c is almost full, so there is a
risk of overflow in the near future. A comment in the source code advises
"widen struct pta flags", which is currently defined as unsigned, but that
does not look optimal.

What would be the proper solution to this problem?


Thanks in advance,
Igor


RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)

2011-07-13 Thread Tobias Burnus

Hello all,

I seek a tree attribute which tells that a "pointer" (in the 
C/middle-end sense) does not alias with any other variable in the 
translation unit (i.e. like "restrict"), but which, on the other hand, 
prevents code movement and value assumptions across (impure) 
function calls - as is done for non-restrict pointers.


The primary use is Fortran's coarrays. Those variables exist on all 
processes ("images") and can be accessed remotely using one-sided 
communication semantics. As coarrays are also used in hot loops, I would 
like to avoid using a non-restricted pointer. A similar issue exists for 
variables with the ASYNCHRONOUS attribute.



Middle-end question: How to handle this best with regards to the middle end?

C/C++ question: As one can also use asynchronous I/O in C/C++, 
asynchronous communication via libraries such as MPI, or single-sided 
communication via POSIX threads - or with C++0x's std::thread: How do you 
handle it? Just by avoiding "restrict"? Or do you have a solution, which 
can also be applied for Fortran? I'm sure that a "restrict + hope & 
pray" solution won't work reliably and thus is not used ;-)


Fortran question: Do my requirements make sense? That is: No code 
movements for any variable which is a coarray or has the asynchronous 
attribute in the scoping unit. Plus, no assumption of the value after 
any call to any impure function? Can something be relaxed or has 
anything to be tightened?



ASYNCHRONOUS is defined in the Fortran standard (e.g. 2008, Section 
5.3.4) and extended to explicitly allow for asynchronous user functions 
in Technical Report 29113. The latter functionality will be used in the 
Message Passing Interface (MPI) specification 3.0. Like VOLATILE, the 
asynchronous attribute might be restricted to a block (in C: { ... }). 
Coarrays are defined in the Fortran 2008 standard. (For semantics of 
interest, see especially Section 8.5 and, in particular, Subsection 8.5.2.)


The Fortran 2008 standard is available at 
ftp://ftp.nag.co.uk/sc22wg5/N1801-N1850/N1830.pdf and the PDTR 29113 at 
ftp://ftp.nag.co.uk/sc22wg5/N1851-N1900/N1866.pdf



Example 1: Asynchronous I/O; in this example using built-in functions, 
but asynchronous MPI communication would be another example

  integer, ASYNCHRONOUS :: a
  ...
  READ(unit_number, ID=idvar, asynchronous='yes') a
  ...
  WAIT(ID=idvar)
  ... = a

Here, "= a" may not be moved before WAIT.

Example 2: Coarray with sync. The SYNC is not directly called, but via a 
wrapper function to increase the fun factor.

subroutine sub(coarray)
  integer :: coarray[*]
  coarray = 5
  call SYNC_calling_proc()
  ! coarray is modified remotely
  call SYNC_calling_proc()
  if (coarray /= 5) ...
end subroutine sub
Here, the "if" may not be removed as the image could have been changed 
remotely.


Example 3: Allow other optimizations
subroutine sub(coarray1, coarray2)
  integer :: coarray1[*], coarray2[*]
  coarray1 = 5
  coarray2 = 7
  if (coarray1 /= 5)
Here, the "if" can be removed as "coarray1" cannot alias with any other 
variable in "sub" as it is not TARGET - and, in particular, it cannot 
alias with "coarray2" as neither of them is a pointer.


Tobias


Re: A visualization of GCC's passes, as a subway map

2011-07-13 Thread Richard Guenther
On Wed, Jul 13, 2011 at 11:49 AM, Paolo Bonzini  wrote:
> On 07/12/2011 06:07 PM, David Malcolm wrote:
>>
>> On this build of GCC (standard Fedora 15 gcc package of 4.6.0), the
>> relevant part of cfgexpand.c looks like this:
>>
>> struct rtl_opt_pass pass_expand =
>> {
>>  {
>>   RTL_PASS,
>>   "expand",                            /* name */
>>
>> [...snip...]
>>
>>   PROP_ssa | PROP_gimple_leh | PROP_cfg
>>     | PROP_gimple_lcx,                 /* properties_required */
>>   PROP_rtl,                             /* properties_provided */
>>   PROP_ssa | PROP_trees,               /* properties_destroyed */
>>
>> [...snip...]
>>
>> }
>>
>> and gcc/tree-pass.h has:
>>   #define PROP_trees \
>>     (PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh |
>>  PROP_gimple_lomp)
>>
>> and that matches up with both the diagram, and the entry for "expand" in
>> the table below [1].
>>
>> So it seems that the diagram is correctly accessing the
>> "properties_destroyed" data for the "expand" pass; does PROP_gimple_lcx
>> need to be added somewhere?  (Or should the diagram be taught to
>> special-case some things, perhaps?)
>
> Yes, PROP_gimple_lcx needs to be added to PROP_trees.  I cannot approve the
> patch, unfortunately.

Hm, why?  complex operations are lowered after a complex lowering pass
has executed.  they are still lowered on RTL, so I don't see why we need
to destroy them technically.

> Also, several passes are likely lacking PROP_crited in their
> properties_destroyed.  At least all those that can be followed by
> TODO_cleanup_cfg:

Yeah, well - most PROPerties are for informational purposes only right
now - things like critical edge splitting should possibly be automatically
managed by the pass manager via properties (likewise dominator info
for which we don't have a property right now).  Of course we'd like to
have a verifier for each property.

Richard.

> * pass_split_functions
> * pass_call_cdce
> * pass_build_cfg
> * pass_cleanup_eh
> * pass_if_conversion
> * pass_ipa_inline
> * pass_early_inline
> * pass_fixup_cfg
> * pass_cse_sincos
> * pass_predcom
> * pass_lim
> * pass_loop_prefetch
> * pass_vectorize
> * pass_iv_canon
> * pass_tree_unswitch
> * pass_vrp
> * pass_sra_early
> * pass_sra
> * pass_early_ipa_sra
> * pass_ccp
> * pass_fold_builtins
> * pass_copy_prop
> * pass_dce
> * pass_dce_loop
> * pass_cd_dce
> * pass_dominator
> * pass_phi_only_cprop
> * pass_forwprop
> * pass_tree_ifcombine
> * pass_scev_cprop
> * pass_parallelize_loops
> * pass_ch
> * pass_cselim
> * pass_pre
> * pass_fre
> * pass_tail_recursion
> * pass_tail_calls
>
> Paolo
>


Re: RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)

2011-07-13 Thread Richard Guenther
On Wed, Jul 13, 2011 at 12:30 PM, Tobias Burnus  wrote:
> Hello all,
>
> I seek a tree attribute which tells that a "pointer" (in the C/middle-end
> sense) does not alias with any other variable in the translation unit (i.e.
> like "restrict"), but which, on the other hand, prevents code movement
> and value assumptions across (impure) function calls - as is done for
> non-restrict pointers.
>
> The primary use is Fortran's coarrays. Those variables exist on all
> processes ("images") and can be accessed remotely using one-sided
> communication semantics. As coarrays are also used in hot loops, I would
> like to avoid using a non-restricted pointer. A similar issue exists for
> variables with the ASYNCHRONOUS attribute.
>
>
> Middle-end question: How to handle this best with regards to the middle end?
>
> C/C++ question: As one can also use asynchronous I/O in C/C++,
> asynchronous communication via libraries such as MPI, or single-sided
> communication via POSIX threads - or with C++0x's std::thread: How do you
> handle it? Just by avoiding "restrict"? Or do you have a solution, which can
> also be applied for Fortran? I'm sure that a "restrict + hope & pray"
> solution won't work reliably and thus is not used ;-)
>
> Fortran question: Do my requirements make sense? That is: No code movements
> for any variable which is a coarray or has the asynchronous attribute in the
> scoping unit. Plus, no assumption of the value after any call to any impure
> function? Can something be relaxed or has anything to be tightened?
>
>
> ASYNCHRONOUS is defined in the Fortran standard (e.g. 2008, Section 5.3.4)
> and extended to explicitly allow for asynchronous user functions in
> Technical Report 29113. The latter functionality will be used in the Message
> Passing Interface (MPI) specification 3.0. Like VOLATILE, the asynchronous
> attribute might be restricted to a block (in C: { ... }). Coarrays are
> defined in the Fortran 2008 standard. (For semantics of interest, see
> especially Section 8.5 and, in particular, Subsection 8.5.2.)
>
> The Fortran 2008 standard is available at
> ftp://ftp.nag.co.uk/sc22wg5/N1801-N1850/N1830.pdf and the PDTR 29113 at
> ftp://ftp.nag.co.uk/sc22wg5/N1851-N1900/N1866.pdf
>
>
> Example 1: Asynchronous I/O; in this example using built-in functions, but
> asynchronous MPI communication would be another example
>  integer, ASYNCHRONOUS :: a
>  ...
>  READ(unit_number,ID=idvar, asynchronous='yes') a
>    ...
>  WAIT(ID=idvar)
>  ... = a
>
> Here, "= a" may not be moved before WAIT.
>
> Example 2: Coarray with sync. The SYNC is not directly called, but via a
> wrapper function to increase the fun factor.
>    subroutine sub(coarray)
>      integer :: coarray[*]
>      coarray = 5
>      call SYNC_calling_proc()
>      ! coarray is modified remotely
>      call SYNC_calling_proc()
>      if (coarray /= 5) ...
>    end subroutine sub
> Here, the "if" may not be removed as the image could have been changed
> remotely.
>
> Example 3: Allow other optimizations
>    subroutine sub(coarray1, coarray2)
>      integer :: coarray1[*], coarray2[*]
>      coarray1 = 5
>      coarray2 = 7
>      if (coarray1 /= 5)
> Here, the "if" can be removed as "coarray1" cannot alias with any other
> variable in "sub" as it is not TARGET - and, in particular, it cannot alias
> with "coarray2" as neither of them is a pointer.

From the last two examples it looks like a regular restrict qualified pointer
would work.  At least I don't see how it would not.

Richard.

> Tobias
>


Re: Google Summer of Code 2011 Doc Camp 17 October - 21 October

2011-07-13 Thread Philip Herron
On 12 July 2011 18:29, Diego Novillo  wrote:
> On 11-07-12 12:52 , Philip Herron wrote:
>
>> Would GCC internals documentation count, or is it more for whole-project
>> documentation work? I probably missed the discussion about this in
>> London since I had to leave on the Sunday morning.
>>
>> I am kind of interested, but I am unsure what kind of documentation
>> would be appropriate. I've spent the last few days working on some
>> internals documentation on and off, so it's kind of fresh in my mind.
>
> Any kind of documentation is fine.  Internals, user documentation, etc.
>
>
> Diego.
>

I am quite interested in applying for this, but I am not quite sure what my
proposal should look like. Should I just discuss my interest in
front-end and middle-end work and the current lack of documentation,
etc.?

Plus there is the question "Who else would you like to recommend to attend
the book sprint?"

I can only think of you and Ian off the top of my head, but I would
suggest Andi, who has helped me a lot, especially with documentation.

--Phil


Re: IRA: matches insn even though !reload_in_progress

2011-07-13 Thread Georg-Johann Lay
Michael Meissner wrote:
> On Mon, Jul 11, 2011 at 12:38:34PM +0200, Georg-Johann Lay wrote:
>> How do I write a pre-reload combine + pre-reload split correctly?
>> I'd like to avoid clobber reg.
>>
>> Thanks much for any hint.
> 
> The move patterns are always kind of funny, particularly during register
> allocation.
> 
> Lets see given your pattern is:
> 
> (define_insn_and_split "*mulsqihi3.const"
>   [(set (match_operand:HI 0 "register_operand" "=&r")
>         (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "a"))
>                  (match_operand:HI 2 "u8_operand" "n")))]
>   "AVR_HAVE_MUL
>    && !reload_completed
>    && !reload_in_progress"
>   { gcc_unreachable(); }
>   "&& 1"
>   [(set (match_dup 3)
>         (match_dup 2))
>    ; *mulsu
>    (set (match_dup 0)
>         (mult:HI (sign_extend:HI (match_dup 1))
>                  (zero_extend:HI (match_dup 3))))]
>   {
>     operands[3] = gen_reg_rtx (QImode);
>   })
> 
> I would probably rewrite it as:
> 
> (define_insn_and_split "*mulsqihi3.const"
>   [(set (match_operand:HI 0 "register_operand" "=&r")
>         (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "a"))
>                  (match_operand:HI 2 "u8_operand" "n")))]
>   "AVR_HAVE_MUL
>    && !reload_completed
>    && !reload_in_progress"
>   { gcc_unreachable(); }
>   "&& 1"
>   [(set (match_dup 3)
>         (unspec:QI [(match_dup 2)] WRAPPER))
>    ; *mulsu
>    (set (match_dup 0)
>         (mult:HI (sign_extend:HI (match_dup 1))
>                  (zero_extend:HI (match_dup 3))))]
>   {
>     operands[3] = gen_reg_rtx (QImode);
>   })
> 
> (define_insn "*wrapper"
>   [(set (match_operand:QI 0 "register_operand" "=&r")
>         (unspec:QI [(match_operand:QI 1 "u8_operand" "n")] WRAPPER))]
>   "AVR_HAVE_MUL"
>   "...")
> 
> That way you are using the unspec to make the move not look like a generic
> move.

All the trouble arises because there is no straightforward way to
write the right insn condition, doesn't it?

Working around it like that will work, but it obfuscates the code, IMHO.

Is there a specific reason for the early-clobber in *wrapper?

As *wrapper is not a proper move, could this produce move-move-sequences?
These would have to be fixed in peep2 or so.

> The other way to do it, would be to split it to another pattern that combines
> the move and the HI multiply, which you then split after reload.  Something
> like:
> 
> (define_insn_and_split "*mulsqihi3_const"
>   [(set (match_operand:HI 0 "register_operand" "=&r")
>         (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "a"))
>                  (match_operand:HI 2 "u8_operand" "n")))]
>   "AVR_HAVE_MUL
>    && !reload_completed
>    && !reload_in_progress"
>   { gcc_unreachable(); }
>   "&& 1"
>   [(parallel [(set (match_dup 3)
>                    (match_dup 2))
>               ; *mulsu
>               (set (match_dup 0)
>                    (mult:HI (sign_extend:HI (match_dup 1))
>                             (zero_extend:HI (match_dup 3))))])]
>   {
>     operands[3] = gen_reg_rtx (QImode);
>   })
> 
> (define_insn_and_split "*mulsqihi3_const2"
>   [(set (match_operand:QI 0 "register_operand" "r")
>         (match_operand:QI 1 "u8_operand" "n"))
>    (set (match_operand:HI 2 "register_operand" "r")
>         (mult:HI (sign_extend:HI (match_operand:QI 3 "register_operand" "a"))
>                  (zero_extend:HI (match_dup 0))))]
>   "AVR_HAVE_MUL"
>   "#"
>   "&& reload_completed"
>   [(set (match_dup 0)
>         (match_dup 1))
>    (set (match_dup 2)
>         (mult:HI (sign_extend:HI (match_dup 3))
>                  (zero_extend:HI (match_dup 0))))]
>   {})

The latest patch
   http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00898.html
works around the insn condition shortcoming by writing a gate
function.

This is the missing part, and if gcc learns something like
!ira_in_progress or !split1_completed in the future, the cleanup will
be minimal and straightforward.  The code is obvious and without
obfuscation:


(define_insn_and_split "*mulsqihi3.sconst"
  [(set (match_operand:HI 0 "register_operand" "=r")
        (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "d"))
                 (match_operand:HI 2 "s8_operand" "n")))]
  "AVR_HAVE_MUL
   && avr_gate_split1()"
  { gcc_unreachable(); }
  "&& 1"
  [(set (match_dup 3)
        (match_dup 2))
   ; mulqihi3
   (set (match_dup 0)
        (mult:HI (sign_extend:HI (match_dup 1))
                 (sign_extend:HI (match_dup 3))))]
  {
    operands[2] = GEN_INT (trunc_int_for_mode (INTVAL (operands[2]),
                                               QImode));
    operands[3] = gen_reg_rtx (QImode);
  })


/* FIXME:  We compose some insns by means of insn combine
  and split them in split1.  We don't want IRA/reload
  to combine them to the original insns again because
  that avoid some CSE optimizations if constants are
  involved.  If IRA/reload combines, the recombined
  ins

Re: Google Summer of Code 2011 Doc Camp 17 October - 21 October

2011-07-13 Thread Diego Novillo
On Wed, Jul 13, 2011 at 07:09, Philip Herron  wrote:

> I am quite interested in applying for this, but I am not quite sure what my
> proposal should look like. Should I just discuss my interest in
> front-end and middle-end work and the current lack of documentation,
> etc.?

Given that you are volunteering to produce documentation, I would say
that you should propose something that interests you.  I would
particularly like to see more beginner and internals documentation
(which would be appropriate for the Quick Start guides described in
the call for proposals).

> Plus the question "Who else would you like to recommend to attend the
> book sprint?"

Anyone who is interested in writing documentation, of course.


Diego.


Re: RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)

2011-07-13 Thread Tobias Burnus

On 07/13/2011 12:57 PM, Richard Guenther wrote:
> On Wed, Jul 13, 2011 at 12:30 PM, Tobias Burnus  wrote:
>> Example 2: Coarray with sync. The SYNC is not directly called, but via a
>> wrapper function to increase the fun factor.
>>    subroutine sub(coarray)
>>      integer :: coarray[*]
>>      coarray = 5
>>      call SYNC_calling_proc()
>>      ! coarray is modified remotely
>>      call SYNC_calling_proc()
>>      if (coarray /= 5) ...
>>    end subroutine sub
>> Here, the "if" may not be removed as the image could have been changed
>> remotely.
>
> From the last two examples it looks like a regular restrict qualified pointer
> would work.  At least I don't see how it would not.


Would it? How does the compiler know that between "call 
SYNC_calling_proc()" the value of "coarray" could change? Hmm, 
seemingly, that's indeed the case, looking at the optimized dump of the 
example above:


sub (integer(kind=4) * restrict coarray)
{
  integer(kind=4) D.1560;
  *coarray_1(D) = 5;
  sync_calling_proc ();
  sync_calling_proc ();
  D.1560_2 = *coarray_1(D);
  if (D.1560_2 != 5)


Well, then I have a different question: How can one tell the middle end 
to optimize the "if (...)" away in the following case? Seemingly having 
an "integer(kind=4) & restrict non_aliasing_var" does not seem to be 
sufficient to do so:


   subroutine sub(non_aliasing_var)
     interface
       subroutine some_function()
       end subroutine some_function
     end interface

     integer :: non_aliasing_var
     non_aliasing_var = 5
     call some_function()
     if (non_aliasing_var /= 5) call foobar_()
   end subroutine sub

That's an optimization which other compilers do - such as NAG or 
PathScale/Open64/sunf95.


Tobias


[pph] Merged trunk->pph

2011-07-13 Thread Diego Novillo
This brings in the cp_binding_level change I made recently on
trunk.

Tested on x86_64.


Diego.


Re: A visualization of GCC's passes, as a subway map

2011-07-13 Thread Paolo Bonzini

On 07/13/2011 12:54 PM, Richard Guenther wrote:
>> Yes, PROP_gimple_lcx needs to be added to PROP_trees.  I cannot approve the
>> patch, unfortunately.
>
> Hm, why?  complex operations are lowered after a complex lowering pass
> has executed.  they are still lowered on RTL, so I don't see why we need
> to destroy them technically.


Because it's PROP_*gimple*_lcx. :)

Paolo


Re: Pta_flags enum overflow in i386.c

2011-07-13 Thread Ian Lance Taylor
Igor Zamyatin  writes:

> As you may see, the pta_flags enum in i386.c is almost full, so there is a
> risk of overflow in the near future. A comment in the source code advises
> "widen struct pta flags", which is currently defined as unsigned, but that
> does not look optimal.
>
> What would be the proper solution to this problem?

Why is widening pta_flags "not optimal?"

It's hard for me to believe that we still care about bootstrapping an
i386-*-* compiler with a compiler which doesn't support any 64-bit type.
So I don't see any problem with setting need_64bit_hwint=yes in
config.gcc for i386-*-*, changing pta_flags to be unsigned
HOST_WIDE_INT, and letting pta_flags go up to (unsigned HOST_WIDE_INT) 1
<< 63.

If anybody doesn't like that idea, we can simply add a flags2 field and
a pta_flags2 enum with PTA2_xxx constants.
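
As a rough illustration of the first option, the sketch below keeps the flags in a 64-bit unsigned type instead of an enum. It uses uint64_t as a stand-in for unsigned HOST_WIDE_INT, and the PTA_EXAMPLE_* names are hypothetical placeholders, not the real PTA_* constants from i386.c:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for unsigned HOST_WIDE_INT on a host with a 64-bit type.  */
typedef uint64_t pta_flags_t;

/* Hypothetical flag names, used only to show the bit positions.  */
#define PTA_EXAMPLE_LOW    ((pta_flags_t) 1 << 0)
#define PTA_EXAMPLE_BIT31  ((pta_flags_t) 1 << 31)  /* last bit of a 32-bit unsigned */
#define PTA_EXAMPLE_BIT32  ((pta_flags_t) 1 << 32)  /* needs the wider type */
#define PTA_EXAMPLE_TOP    ((pta_flags_t) 1 << 63)  /* highest usable bit */

/* Return nonzero if the single-bit flag BIT is set in FLAGS.  */
static int
pta_has (pta_flags_t flags, pta_flags_t bit)
{
  /* BIT must have exactly one bit set.  */
  assert (bit != 0 && (bit & (bit - 1)) == 0);
  return (flags & bit) != 0;
}
```

The cast before the shift matters: writing 1 << 32 with a plain int constant would overflow before the value is widened.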

Ian


Re: RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)

2011-07-13 Thread Ian Lance Taylor
Tobias Burnus  writes:

> Would it? How does the compiler know that between "call
> SYNC_calling_proc()" the value of "coarray" could change? Hmm,
> seemingly, that's indeed the case, looking at the optimized dump of
> the example above:

The C99 restrict qualifier doesn't mean that some random function can
change the memory to which the pointer points; it means that assignments
through pointer 1 can't change the memory to which pointer 2 points.
That is, restrict is all about whether one pointer can affect another;
it doesn't say anything about functions, and in general a call to a
function can change any memory pointed to by any pointer.
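
A small sketch of the "pointer 1 versus pointer 2" guarantee, with a hypothetical function name chosen for illustration. Since both parameters are restrict-qualified, the compiler may assume that the stores through dst leave *src unchanged and load *src only once:

```c
#include <assert.h>

/* Add *src to *dst twice.  The restrict qualifiers promise that dst and
   src do not refer to the same object within this function.  */
static void
add_twice (int *restrict dst, const int *restrict src)
{
  assert (dst != src);   /* the caller must honour the restrict contract */
  *dst += *src;
  *dst += *src;          /* *src may be reused from the first load */
}
```

Nothing in this contract, as described above, covers what an unrelated function call might do to the pointed-to memory.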


> Well, then I have a different question: How can one tell the middle
> end to optimize the "if (...)" away in the following case? Seemingly
> having an "integer(kind=4) & restrict non_aliasing_var" does not seem
> to be sufficient to do so:
>
>    subroutine sub(non_aliasing_var)
>      interface
>        subroutine some_function()
>        end subroutine some_function
>      end interface
>
>      integer :: non_aliasing_var
>      non_aliasing_var = 5
>      call some_function()
>      if (non_aliasing_var /= 5) call foobar_()
>    end subroutine sub
>
> That's an optimization which other compilers do - such as NAG or
> PathScale/Open64/sunf95.

From a C perspective, the trick here is to know that the address
"non_aliasing_var" does not escape the current function, and that
therefore it can not be changed by a function call.  gcc already knows
that local variables whose address is not taken do not escape the
current function.  I don't know how to express the above code in C; is
there something in there which makes the compiler think that the code is
taking the address of non_aliasing_var?  If not, this should already
work.  If so, what is it?  I.e., what does this code look like in C?
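
To illustrate the escape argument, here is a minimal sketch with hypothetical function names. some_function is only a stand-in for an opaque external call; in the first case the variable's address is never taken, in the second the object is reachable by the caller and therefore, in C, potentially by the callee too:

```c
#include <assert.h>

static int call_count;

/* Stand-in for an opaque external function: it can touch globals but
   has no way to name a caller's non-escaping locals.  */
static void
some_function (void)
{
  call_count++;
}

/* 'local' never has its address taken, so it does not escape and the
   comparison after the call can be folded to true.  */
static int
local_survives_call (void)
{
  int local = 5;
  some_function ();
  return local == 5;
}

/* The object behind 'p' is visible outside this function, so without
   further knowledge about the callee, *p has to be reloaded.  */
static int
pointee_survives_call (int *p)
{
  assert (p != 0);
  *p = 5;
  some_function ();
  return *p == 5;
}
```

The second function is the C shape of a by-reference dummy argument, which is where the C and Fortran rules may part ways.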

Ian


IPA and LTO

2011-07-13 Thread AJM-2

Hello,

I have written a simple ipa pass and would like to make use of Link time
optimisation.  My pass requires access to the function bodies and ideally I
would like the driver function to be called once at link time and have
access to functions in all of the files as if they were one compilation
unit.  The documentation would indicate that this is possible, but ad hoc
instrumentation of some of the other simple ipa passes seems to suggest
different behaviour.

My question is whether LTO can be used in this way, to have a simple ipa
pass called once at link time with access to the function bodies, and if so
how is this achieved?  cgraph_function_body_availability seems to only be
half the story.

I am using GCC 4.6.0 with the gold linker plugin (binutils 2.21).

Andrew



Re: RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)

2011-07-13 Thread Tobias Burnus

On 07/13/2011 03:27 PM, Ian Lance Taylor wrote:
> The C99 restrict qualifier doesn't mean that some random function can
> change the memory to which the pointer points; it means that assignments
> through pointer 1 can't change the memory to which pointer 2 points.
> That is, restrict is all about whether one pointer can affect another;
> it doesn't say anything about functions, and in general a call to a
> function can change any memory pointed to by any pointer.


That was actually my impression - thus, I wanted to have a different 
flag to tag asynchronous/coarray variables, which do not alias but might 
change until a synchronization point via single-sided communication or 
until a wait with asynchronous I/O/communication. As one does not know 
where a synchronization/waiting point is, all code movements and 
variable value assumptions (of such tagged variables) should be 
prohibited across impure function calls.


By contrast, a normal Fortran variable without the POINTER or TARGET 
attribute does not alias - and may not be changed asynchronously.


The latter is what I thought "restrict" (more precisely: 
TYPE_QUAL_RESTRICT) does, but seemingly it currently also does the former.



> From a C perspective, the trick here is to know that the address
> "non_aliasing_var" does not escape the current function, and that
> therefore it can not be changed by a function call.  gcc already knows
> that local variables whose address is not taken do not escape the
> current function.  I don't know how to express the above code in C; is
> there something in there which makes the compiler think that the code is
> taking the address of non_aliasing_var?  If not, this should already
> work.  If so, what is it?  I.e., what does this code look like in C?


I am not sure whether there is a 100% equivalence, but it should match:

void some_function(void);

void
sub (int *restrict non_aliasing_var)
{
  *non_aliasing_var = 5;
  some_function ();
  if (*non_aliasing_var != 5)
    foobar_();
}

Also in this case, the "if" block is not optimized away with -O3.

Tobias

PS: See also the just-filed PR middle-end/49733.


Re: RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)

2011-07-13 Thread Tobias Burnus

On 07/13/2011 03:46 PM, Tobias Burnus wrote:
> On 07/13/2011 03:27 PM, Ian Lance Taylor wrote:
>> [...]
>> it doesn't say anything about functions, and in general a call to a
>> function can change any memory pointed to by any pointer.


I misread the paragraph - in particular the last sentence. In Fortran 
that's not the case. Fortran's aliasing rules say that a dummy argument may 
only be modified through the dummy argument, i.e. for


   subroutine foo(a, b)  ! "a" and "b" are passed by reference
     integer :: a, b
     a = 5
     b = 6
     call bar()

the value of "a" is modified neither by "b = 6" nor by "call bar()". 
Exception: If "a" is a TARGET (i.e. some pointer may point to it) or "a" 
is a POINTER.


Thus, in my test case, the function call may not change the value - 
and, thus, the "if" block can be optimized away.


See quote of the Fortran standard at
  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49733#c0

Seemingly, in C only the first case, Fortran's "b = 6" (C: "*b = 6"), 
would be guaranteed not to affect "a" if "a" (and "b") are "restrict", 
while the function call can change the value.



In that sense, I do not seem to need a new flag for 
asynchronous/coarrays - which are handled by TYPE_QUAL_RESTRICT, but I 
need a new flag for normal (noncoarray, nonasynchronous) variables, which 
are passed by value or are allocatable - and where a function call won't 
affect the value.


Tobias


unicode in gcc 4.6.1 output

2011-07-13 Thread Paulo J. Matos

Hi all,

As part of a testsuite script I am parsing GCC's output and I noticed 
that format specifier %qs quotes the string by surrounding it with 
unicode characters. I can't find where this %qs is defined so that I can 
try and override it to quote with '%s' or `%s'. Anything but unicode.


Any suggestions?

Cheers,

--
PMatos



Re: unicode in gcc 4.6.1 output

2011-07-13 Thread Jonathan Wakely
On 13 July 2011 15:18, Paulo J. Matos wrote:
> Hi all,
>
> As part of a testsuite script I am parsing GCC's output and I noticed that
> format specifier %qs quotes the string by surrounding it with unicode
> characters. I can't find where this %qs is defined so that I can try and
> override it to quote with '%s' or `%s'. Anything but unicode.
>
> Any suggestions?

set LANG=C in your environment when running gcc


Re: RFH: Impose code-movement restrictions and value assumption (for ASYNCHRONOUS/Coarrays)

2011-07-13 Thread Ian Lance Taylor
Tobias Burnus  writes:

> In that sense, I do not seem to need a new flag for
> asynchronous/coarrays - which are handled by TYPE_QUAL_RESTRICT, but I
> need a new flag for normal (noncoarray, nonasynchronous) variables,
> which are passed by value or are allocatable - and where a function
> call won't affect the value.

Yes, sounds like it.  At first glance I don't think it should be a
TYPE_QUAL, I think it should be a flag on the DECL.

Ian


Re: unicode in gcc 4.6.1 output

2011-07-13 Thread Basile Starynkevitch
On Wed, 13 Jul 2011 15:55:58 +0100
Jonathan Wakely  wrote:

> On 13 July 2011 15:18, Paulo J. Matos wrote:
> > Hi all,
> >
> > As part of a testsuite script I am parsing GCC's output and I noticed that
> > format specifier %qs quotes the string by surrounding it with unicode
> > characters. I can't find where this %qs is defined so that I can try and
> > override it to quote with '%s' or `%s'. Anything but unicode.
> >
> > Any suggestions?
> 
> set LANG=C in your environment when running gcc

Also, the %q format is probably handled inside gcc/diagnostic.c & 
gcc/pretty-print.c

cheers


-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***


Re: unicode in gcc 4.6.1 output

2011-07-13 Thread Ian Lance Taylor
"Paulo J. Matos"  writes:

> As part of a testsuite script I am parsing GCC's output and I noticed
> that format specifier %qs quotes the string by surrounding it with
> unicode characters. I can't find where this %qs is defined so that I
> can try and override it to quote with '%s' or `%s'. Anything but
> unicode.

%qs is implemented by pp_base_format in pretty-print.c.  Note that %q
can be used with any format specifier, not just s.

%q is implemented using the open_quote and close_quote variables, which
are initialized by gcc_init_libintl in intl.c.

If you are just interested in changing the quote characters that gcc
prints when you run it, check your LANG environment variable.  In normal
use you will only see U+2018 and U+2019 if you are using a LANG which
specifies utf8.

Ian
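
The behaviour described above can be modelled with a small self-contained sketch. It is a simplified stand-in, not the code in intl.c or pretty-print.c: `choose_quotes` and `format_quoted` are hypothetical names, the codeset name is passed in as a plain string (a real program would typically obtain it from `nl_langinfo(CODESET)`), and the ASCII fallback pair is an assumption chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <strings.h>

struct quote_pair {
    const char *open;
    const char *close;
};

/* Pick quote marks from the name of the locale's character encoding.
   Only a UTF-8 codeset yields U+2018/U+2019; everything else falls back
   to ASCII.  The fallback pair is an assumption made for this sketch.  */
struct quote_pair choose_quotes(const char *codeset)
{
    struct quote_pair q = { "`", "'" };

    assert(codeset != NULL);
    if (strcasecmp(codeset, "utf-8") == 0 || strcasecmp(codeset, "utf8") == 0) {
        q.open = "\xe2\x80\x98";    /* U+2018 encoded as UTF-8 */
        q.close = "\xe2\x80\x99";   /* U+2019 encoded as UTF-8 */
    }
    return q;
}

/* Model of what a %qs directive produces: the string S wrapped in the
   quote pair selected for CODESET.  Returns snprintf's result.  */
int format_quoted(char *buf, size_t size, const char *codeset, const char *s)
{
    struct quote_pair q = choose_quotes(codeset);

    return snprintf(buf, size, "%s%s%s", q.open, s, q.close);
}
```

The point of the model is that the quote characters are decided once, from the locale, rather than by the format directive itself, which is why changing the environment is enough to change the output.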


Re: IPA and LTO

2011-07-13 Thread Diego Novillo
On Wed, Jul 13, 2011 at 10:22, AJM-2  wrote:

> My question is whether LTO can be used in this way, to have a simple ipa
> pass called once at link time with access to the function bodies, and if so
> how is this achieved?  cgraph_function_body_availability seems to only be
> half the story.

Yes, it can.  You seem to be describing what GCC calls "simple IPA
pass".  These are passes that cannot run in partitioned LTO mode, as
they require the function bodies to operate.  Look for passes like
pass_ipa_function_and_variable_visibility for an example of a simple
IPA pass.


Diego.


Re: IPA and LTO

2011-07-13 Thread AJM-2

What you say is in line with my understanding; however, when I instrument the
execute function of ipa-function-and-variable-visibility
(local_function_and_variable_visibility()) I note that:

gcc -flto a.c b.c
causes the pass to be called twice (presumably once per file).

If I split the compilation into two stages, then in the link stage
gcc -flto a.o b.o
the pass is never called.

Conversely, the gate of IPA-Points-to does seem to be called three times at
link time (presumably once for each file and then once for all together).  I
cannot discover the cause of the different behaviours here.


-- 
View this message in context: 
http://old.nabble.com/IPA-and-LTO-tp32052838p32054720.html
Sent from the gcc - Dev mailing list archive at Nabble.com.



Re: IPA and LTO

2011-07-13 Thread Pierre Vittet

Hello,

If local_function_and_variable_visibility were not a simple IPA pass, it 
would not have been called once per file but once per function (as 
happens with a GIMPLE pass).
I think it is normal that this pass runs twice, because it runs 
before any link operation.


However, I don't know exactly how and when ld is called and which passes 
run after this.


Pierre Vittet



Re: C++ mangling, function name to mangled name (or tree)

2011-07-13 Thread Pierre Vittet

Hello,

Sorry to answer so late (I didn't see your mail in my mailbox, and I 
was preparing for the RMLL/Libre Software Meeting).


Your solution looks like a nice one. I am going to try it and I will 
post the result of my experiment. I was not aware of that hook.


Thanks!

Pierre Vittet

Hello,

Have you considered doing it the other way around? I mean, why don't you hook on the 
PLUGIN_PRE_GENERICIZE event to catch all function bodies, and then compare the argument 
the user gave you to current_function_name() (which returns the full prototype of 
the current function, i.e. the full name of malloc is "void* malloc(size_t)")? Then 
you can store the FUNCTION_DECL tree if there's a match and use it for later processing. 
That's how I proceed for my plugins.

Romain Geissler

   




Re: IRA: matches insn even though !reload_in_progress

2011-07-13 Thread Michael Meissner
On Wed, Jul 13, 2011 at 01:42:29PM +0200, Georg-Johann Lay wrote:
> All the trouble arises because there is no straightforward way to
> write the right insn condition, doesn't it?
> 
> Working around like that will work but it is obfuscating the code, IMHO.

Given I don't know the AVR port, I don't know what the right condition is.  I
was just guessing from the patterns you provided.

> Is there a specific reason for early-clobber in *wrapper?

You could instead put (clobber (match_scratch)) in the insn, and not split it
until after reload.

> As *wrapper is not a proper move, could this produce move-move-sequences?
> These would have to be fixed in peep2 or so.

In theory the register allocator will eliminate the normal move, and just use
the wrapper.

> This is the missing part, and if gcc learns something like
> !ira_in_progress or !split1_completed in the future, the cleanup will
> be minimal and straightforward.  The code is obvious and without
> obfuscation:

I tend to think a few more _in_progress and _completed flags would be
helpful, particularly ira_in_progress, or having IRA set reload_in_progress.
You are probably the first person to run into this.  The question is how many
do we want and need?

Note, a few years ago, this type of splitting was not possible.  You had to
have all of the match_scratch'es that you would need allocated in the RTL
generation pass.  Being able to allocate new pseudos before the split1 pass
certainly makes things easier to do, but there are always things that could be
done better.

> bool
> avr_gate_split1 (void)
> {
>   if (current_pass->static_pass_number
>       < pass_match_asm_constraints.pass.static_pass_number)
>     return true;
> 
>   return false;
> }
> 
> 
> I chose .asmcons because it runs between IRA and split1,
> and because I observed that pass numbers are fuzzy,
> presumably because of sub-passes like df etc.

I'm not a big fan of this.  I think it would be better to just add
ira_in_progress and a few others as needed.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899


Re: IPA and LTO

2011-07-13 Thread Richard Guenther
On Wed, Jul 13, 2011 at 5:54 PM, AJM-2  wrote:
>
> What you say is in line with my understanding, however when I instrument the
> execute function of ipa-function-and-variable-visibility
> (local_function_and_variable_visibility()) I note that:
>
> gcc -flto a.c b.c
> causes the pass to be called twice (presumably once per file).
>
> If I split the compilation into two stages, then in the link stage
> gcc -flto a.o b.o
> the pass is never called.
>
> Conversely, the gate of IPA-Points-to does seem to be called three times at
> link time (presumably once for each file and then once for all together).  I
> cannot discover the cause of the different behaviours here.

It depends on where in the pass pipeline you put your IPA pass.  A simple
IPA pass that should run at ltrans time (either seeing each partition for
the partitioned program or the whole program if you use one partition)
needs to be put alongside IPA PTA (that's the only simple IPA pass executed
at link LTO time right now).

Richard.



Re: IPA and LTO

2011-07-13 Thread AJM-2

Putting my "simple IPA pass" adjacent to IPA-PTA does cause it to be called
as expected.  However, for each node in the call graph (with
cgraph_function_body_availability returning AVAIL_AVAILABLE),
gimple_has_body_p is always false.

The call graph data seems to be available, but the documentation indicates
that access to the gimple is also possible, using the standard accessors. 
Is there some extra step that must be taken to access gimple under LTO?



-- 
View this message in context: 
http://old.nabble.com/IPA-and-LTO-tp32052838p32056682.html
Sent from the gcc - Dev mailing list archive at Nabble.com.



Re: IPA and LTO

2011-07-13 Thread Richard Guenther
On Wed, Jul 13, 2011 at 10:09 PM, AJM-2  wrote:
>
> Putting my "simple IPA pass" adjacent to IPA-PTA does cause it to be called
> as expected.  However for each node in the call graph (with
> cgraph_function_body_availability returning AVAIL_AVAILABLE),
> gimple_has_body_p is always false.
>
> The call graph data seems to be available, but the documentation indicates
> that access to the gimple is also possible, using the standard accessors.
> Is there some extra step that must be taken to access gimple under LTO?

The body should be available.  Make sure to use a recent SVN trunk though.

Richard.



cachecc1 query

2011-07-13 Thread David Fang

Hi,
	Has anyone used cachecc1 (http://cachecc1.sourceforge.net/) to 
cache gcc bootstraps in recent years?  The project looks rather stale 
(2004).  I would love to speed up gcc bootstraps and rebuilds so that I 
can test snapshots more frequently.  Is there any interest in getting this 
to work?  (I'm particularly interested in a darwin port, but would benefit 
from having it work on any modern platform.)


Fang

--
David Fang
http://www.csl.cornell.edu/~fang/