Re: Ada subtypes and base types

2006-03-21 Thread Laurent GUERBY
On Mon, 2006-03-20 at 19:47 -0700, Jeffrey A Law wrote:
> On Sat, 2006-03-18 at 10:24 +0100, Laurent GUERBY wrote:
> > On Fri, 2006-03-17 at 12:51 -0700, Jeffrey A Law wrote:
> > > I'm not suggesting the FEs deduce more types and track ranges;
> > > that would be rather absurd.  What I'm saying is that exposing
> > > these types outside the FE is most likely costing you both on
> > > the compile-time side and on the run-time side.
> > 
> > About optimization, in most languages with array bounds and
> > range checks, the FE will build a structure with bounds
> > and code like the following for a typical array loop (sorry
> > for poor C):
> > 
> > struct {
> >int low,high; /* no alias */
> >double *data;
> > } X;
> > 
> > int first_i=X.low+2; 
> > int last_i=X.high-3;
> > int i;
> > 
> > if (first_i<=last_i) {
> >for(i=first_i;i<=last_i;i++) {
> >   if (i<X.low || i>X.high) raise_bound_error(); /* CHECK */
> >   do_something(array_with_bounds[i]);
> >}
> > }
> > 
> > The interesting thing performance-wise would be to be able to remove the
> > CHECK in the BE. 
> > 
> > Is there some optimization pass able to do that?
> > 
> > And when "+2" and "-3" are replaced by "+x" and "-y" and we
> > know through typing that x>=0 and y>=0?
> Not sure, mostly because of the structure accesses and my lack
> of knowledge about the symbolic range support.  I know we have
> symbolic range support, but haven't looked to see how good it
> really is.  What we're doing now is certainly better than what
> DOM did for symbolic ranges (nothing).

A casual read of tree-vrp.c showed that symbolic_range_p is mostly
used to do nothing; did I miss something? Maybe it's in another file.

> Note that this is closely related to some of the bounds checking
> elimination we want to support for Java one day IIRC.

Pascal and Ada will benefit (maybe Fortran too, I don't know enough to
say).

> Note also that if i, first_i and/or last_i are globals, then the
> odds of these kind of tests being deleted go down considerably
> as they're likely call-clobbered by the call to do_something.

Very likely to be all locals in typical Ada code.
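
For concreteness, the loop from the quoted example can be written as a small self-contained C function. The descriptor and names here are illustrative only, not GCC internals; the in-loop CHECK is exactly the test a range-propagation pass could hope to prove redundant whenever the offsets are known non-negative.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Illustrative model of the FE-generated loop above: a fat array
   descriptor with inclusive bounds, and an in-loop CHECK that is
   provably redundant when the offsets n1 and n2 are non-negative.  */
struct bounded_array {
    int low, high;              /* inclusive bounds */
    double *data;               /* data[0] corresponds to index low */
};

static int checks_executed;     /* counts CHECKs that actually ran */

static void raise_bound_error(void)
{
    fprintf(stderr, "bound check failed\n");
    exit(1);
}

double sum_interior(struct bounded_array *x, int n1, int n2)
{
    int first_i = x->low + n1;
    int last_i = x->high - n2;
    double sum = 0.0;

    if (first_i <= last_i) {
        for (int i = first_i; i <= last_i; i++) {
            /* CHECK: redundant when n1 >= 0 and n2 >= 0.  */
            if (i < x->low || i > x->high)
                raise_bound_error();
            checks_executed++;
            sum += x->data[i - x->low];
        }
    }
    return sum;
}
```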

Laurent



Re: alias time explosion

2006-03-21 Thread Richard Guenther
On 3/21/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> On Mon, 2006-03-20 at 18:55 -0500, Andrew Pinski wrote:
> > On Mar 20, 2006, at 5:18 PM, Andrew MacLeod wrote:
> >
> > > I'm not sure when this happened, but I noticed on the weekend that
> > > there
> > > has been an explosion in the time spent during the alias analysis
> > > phase.
> > > using cplusplus-grammer.ii, it used to compile on my machine in about 55
> > > seconds, and it's now up to about 150 seconds.
> > >
> > > A quick gprof indicated about 60% of compile time is being spent in
> > > bitmap_bit_p, called from compute_may_aliases.
> > >
> > > someone made it WAY slow :-)
> >
> > Could it be that 2 more passes of may_alias was added?
>
> I don't think so. I would expect that to double or triple the time spent
> in alias analysis, not the entire compile time.  This is a 200% increase
> in compile time... going from 50 seconds to 150 is pretty significant.
> And since it's almost all in bitmap_bit_p, it sounds algorithmic to me...
>
> btw, this was on a 3.0 Ghz P4 running linux with a checkout from
> mainline this morning built with no checking...
>
> Doing a quick check back, on 01/23 shows similar time (71% of compiler
> time spent in alias analysis, 97 seconds out of 135). The previous
> compiler to that which I have laying around is 10/30/05, and it shows a
> much more sensible 6.32 seconds in alias analysis.
>
> It looks like sometime between 10/30 and 01/23 alias analysis got out of
> hand. Odd it hasn't been noted before.

Can you do a comparison to 4.1.0 and file a PR with the testcase please?

Thanks,
Richard.


Re: Ada subtypes and base types

2006-03-21 Thread Duncan Sands
> > I think that it is easy for the back end to make good use of
> > TYPE_MIN_VALUE/TYPE_MAX_VALUE. Namely, consider the assignment
> > 
> > x := y + z * w;
> > 
> > where variables y, z and w have values in the interval [0,7] and
> > x has values in [0,1000]. Pascal converts the above to the
> > following C like code:
> > 
> > int tmp = (int) y + (int) z * (int) w;
> > x = (tmp < 0 || tmp > 1000)? (Range_Check_Error (), 0) : tmp;
> >  
> > I expect VRP to deduce that tmp will have values in [0..56] and
> > eliminate the range check. Also, it should be clear that in the
> > assignment above arithmetic can be done using any convenient
> > precision.
> VRP can certainly do this -- I added the ability to see through
> more typecasts a while back.  In general, I've tried to improve
> the ability of our optimizers to eliminate or at least "see through"
> some typecasts.  However, that capability is often very limited
> and the ability to see through type casts is not pervasive in
> the optimization pipeline.
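
The interval [0,56] claimed in the quoted example is easy to confirm by brute force over the stated operand ranges; this only checks the arithmetic claim, it is not a model of VRP itself.

```c
#include <assert.h>

/* Brute-force confirmation: for y, z, w each in [0,7],
   y + z*w stays within [0,56].  */
int max_y_plus_z_times_w(void)
{
    int max = 0;
    for (int y = 0; y <= 7; y++)
        for (int z = 0; z <= 7; z++)
            for (int w = 0; w <= 7; w++) {
                int tmp = y + z * w;
                if (tmp > max)
                    max = tmp;
            }
    return max;
}
```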

Hi Jeff, on the subject of seeing through typecasts, I was playing around
with VRP and noticed that the following "if" statement is not eliminated:

int u (unsigned char c) {
int i = c;

if (i < 0 || i > 255)
return -1; /* never taken */
else
return 0;
}

Is it supposed to be?

Thanks,

Duncan.


Re: Aliasing sets on Arrays Types

2006-03-21 Thread Richard Guenther
On 3/21/06, Andrew Pinski <[EMAIL PROTECTED]> wrote:
> Take the following C code:
> typedef long atype[];
> typedef long atype1[];
>
> int NumSift (atype *a, atype1 *a1)
> {
>(*a)[0] = 0;
>(*a1)[0] = 1;
>return (*a)[0];
> }
> Shouldn't the aliasing set for the type atype be the same as atype1?

I'm not entirely sure, but the C99 standard at least does not suggest
otherwise; instead it says (6.7.7/3) "A typedef declaration does not introduce
a new type, only a synonym for the type so specified."  It continues
with an example:

After the declarations
typedef struct s1 { int x; } t1, *tp1;
typedef struct s2 { int x; } t2, *tp2;
type t1 and the type pointed to by tp1 are compatible. Type t1 is also
compatible with type struct s1, but not compatible with the types
struct s2, t2, the type pointed to by tp2, or int.

where it explicitly says something about structures, but nothing about array types.
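
As a runnable illustration (assuming a conforming C compiler), the quoted NumSift example behaves as if atype and atype1 name the same type: when both arguments name the same array, the store of 1 through *a1 must be visible through *a.

```c
#include <assert.h>

/* The quoted NumSift example made runnable.  If atype and atype1
   are the same type per 6.7.7/3, the compiler may not assume *a
   and *a1 are distinct objects.  */
typedef long atype[];
typedef long atype1[];

int NumSift(atype *a, atype1 *a1)
{
    (*a)[0] = 0;
    (*a1)[0] = 1;
    return (*a)[0];
}
```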

Richard.


Re: Aliasing sets on Arrays Types

2006-03-21 Thread Andrew Haley
Richard Guenther writes:
 > On 3/21/06, Andrew Pinski <[EMAIL PROTECTED]> wrote:
 > > Take the following C code:
 > > typedef long atype[];
 > > typedef long atype1[];
 > >
 > > int NumSift (atype *a, atype1 *a1)
 > > {
 > >(*a)[0] = 0;
 > >(*a1)[0] = 1;
 > >return (*a)[0];
 > > }
 > > Shouldn't the aliasing set for the type atype be the same as atype1?
 > 
 > Im not entirely sure, but the C99 standard does at least not suggest
 > otherwise, instead it says (6.7.7/3) "A typedef declaration does not 
 > introduce
 > a new type, only a synonym for the type so specified."

atype and atype1 are compatible because of 6.7.5.2, Array declarators:

6   For two array types to be compatible, both shall have compatible
element types, and if both size specifiers are present, and are
integer constant expressions, then both size specifiers shall have
the same constant value. If the two array types are used in a
context which requires them to be compatible, it is undefined
behavior if the two size specifiers evaluate to unequal values.

I assume that compatible types should be in the same alias set.

Andrew.



Re: FUNCTION_OK_FOR_SIBCALL vs INITIAL_FRAME_POINTER_OFFSET

2006-03-21 Thread Richard Henderson
On Mon, Mar 20, 2006 at 04:03:08PM -, Dave Korn wrote:
>  Do you happen to know off the top of your head when get_frame_size()
> becomes valid?

You don't get a good first-pass estimate until after all rtl 
generation has been done.  Which is later than you need.

Another possibility is to allocate an extra word high in the
stack frame for temporary storage in large frames w/ sibcalls
to functions using all arguments.  Then you'd deallocate in 
several steps:

scratchslot = argument
argument = stack frame size - small
sp += argument
argument = scratchslot
sp += small

This scheme has the advantage that you don't need to commit
to it until the last minute.  Don't forget that you can also
handle "small" large values in a small number of steps.  E.g.
for 65k:

sp += 32764
sp += 32764
sp += 1032

Which may well be faster than going through memory as above.
Of course, you wouldn't want to use this scheme for a 2mb frame.
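
The arithmetic of the 65k example generalizes. Below is a sketch in C (illustrative only, not actual RTL generation) of splitting a large stack adjustment into add-immediate steps of at most 32764, the limit used above.

```c
#include <assert.h>

/* Split a large stack adjustment into add-immediate steps no
   larger than MAX_IMM, as in the 65k example.  */
#define MAX_IMM 32764L

/* Writes the step sizes into steps[]; returns the number of steps.  */
int split_adjustment(long total, long steps[])
{
    int n = 0;
    while (total > MAX_IMM) {
        steps[n++] = MAX_IMM;
        total -= MAX_IMM;
    }
    if (total > 0)
        steps[n++] = total;
    return n;
}
```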


r~


Re: Leaf functions and noreturn calls

2006-03-21 Thread Richard Henderson
On Mon, Mar 20, 2006 at 04:19:52PM -, Dave Korn wrote:
> However, I might still want to make it an option for cases where debugging
> isn't going to be important; it seems to me that the generated code should
> still be valid.

At which point you should tail call to abort and be done with it.
So the correct change is to the tail call code, which currently
filters out noreturn functions.


r~


Re: FUNCTION_OK_FOR_SIBCALL vs INITIAL_FRAME_POINTER_OFFSET

2006-03-21 Thread Paul Brook
On Tuesday 21 March 2006 14:57, Richard Henderson wrote:
> On Mon, Mar 20, 2006 at 04:03:08PM -, Dave Korn wrote:
> >  Do you happen to know off the top of your head when get_frame_size()
> > becomes valid?
>
> You don't get a good first-pass estimate until after all rtl
> generation has been done.  Which is later than you need.
>
> Another possibility is to allocate an extra word high in the
> stack frame for temporary storage in large frames w/ sibcalls
> to functions using all arguments.  Then you'd deallocate in
> several steps:
>
>   scratchslot = argument
>   argument = stack frame size - small
>   sp += argument
>   argument = scratchslot
>   sp += small

This assumes you have a frame pointer or sp+large_offset addressing mode for 
accessing scratchslot. In which case you could either use fp as scratch 
storage or probably have an add sp, large_offset instruction.

Paul


RE: FUNCTION_OK_FOR_SIBCALL vs INITIAL_FRAME_POINTER_OFFSET

2006-03-21 Thread Dave Korn
On 21 March 2006 15:12, Paul Brook wrote:

> On Tuesday 21 March 2006 14:57, Richard Henderson wrote:
>> On Mon, Mar 20, 2006 at 04:03:08PM -, Dave Korn wrote:
>>>  Do you happen to know off the top of your head when get_frame_size()
>>> becomes valid?
>> 
>> You don't get a good first-pass estimate until after all rtl
>> generation has been done.  Which is later than you need.
>> 
>> Another possibility is to allocate an extra word high in the
>> stack frame for temporary storage in large frames w/ sibcalls
>> to functions using all arguments.  Then you'd deallocate in several steps:
>> 
>>  scratchslot = argument
>>  argument = stack frame size - small
>>  sp += argument
>>  argument = scratchslot
>>  sp += small
> 
> This assumes you have a frame pointer or sp+large_offset addressing mode for
> accessing scratchslot. In which case you could either use fp as scratch
> storage or probably have an add sp, large_offset instruction.
> 
> Paul


  :)  OK, I'll allocate a successive series of scratchslots every 32k all the
way up the stack and bounce the (continuously-reducing) frame size argument
from one to the next until we get within range of the top!



cheers,
  DaveK
-- 
Can't think of a witty .sigline today



RE: FUNCTION_OK_FOR_SIBCALL vs INITIAL_FRAME_POINTER_OFFSET

2006-03-21 Thread Dave Korn
On 21 March 2006 14:58, Richard Henderson wrote:

> On Mon, Mar 20, 2006 at 04:03:08PM -, Dave Korn wrote:
>>  Do you happen to know off the top of your head when get_frame_size()
>> becomes valid?
> 
> You don't get a good first-pass estimate until after all rtl
> generation has been done.  Which is later than you need.

  Absolutely.  Right.
 
> Another possibility is to allocate an extra word high in the
> stack frame for temporary storage in large frames w/ sibcalls
> to functions using all arguments.  Then you'd deallocate in
> several steps:
> 
>   scratchslot = argument
>   argument = stack frame size - small
>   sp += argument
>   argument = scratchslot
>   sp += small

  Yeh, I'd thought about that.  I reckon it's probably a better solution than
aborting.  I'll just make sure and handle all kinds of sibcalls.  Like I said,
my original question was inspired by wanting to make a quick first pass at it
that would just opt out from doing the hard cases, and it seems that it's
easier to take care of them than to try and detect them in order to forbid
sibcalls in those large-framed functions.

> This scheme has the advantage that you don't need to commit
> to it until the last minute.  Don't forget that you can also
> handle "small" large values in a small number of steps.  E.g.
> for 65k:
> 
>   sp += 32764
>   sp += 32764
>   sp += 1032
> 
> Which may well be faster than going through memory as above.
> Of course, you wouldn't want to use this scheme for a 2mb frame.

  On my small embedded target, if I ever have a 2mb frame, I'm gonna be in a
whole lot more trouble than just taking a few dozen insns to unwind it!  I'm
only making sure that >32k stack frames work /at all/ as a contingency!

cheers,
  Dave '32k should be enough for anybody' K
-- 
Can't think of a witty .sigline today



Re: FUNCTION_OK_FOR_SIBCALL vs INITIAL_FRAME_POINTER_OFFSET

2006-03-21 Thread Paul Brook
> >>scratchslot = argument
> >>argument = stack frame size - small
> >>sp += argument
> >>argument = scratchslot
> >>sp += small
> >
> > This assumes you have a frame pointer or sp+large_offset addressing mode
> > for accessing scratchslot. In which case you could either use fp as
> > scratch storage or probably have an add sp, large_offset instruction.
> >
> > Paul
> >
> :)  OK, I'll allocate a successive series of scratchslots every 32k all the
> way up the stack and bounce the (continuously-reducing) frame size argument
> from one to the next until we get within range of the top!

Or use 2 scratch regs:

sp[0] = tmp1;
tmp1 = #(framesize - small);
sp[tmp1] = tmp2;
tmp2 = tmp1;
tmp1 = sp[0];
sp += tmp2;
tmp2 = sp[0];
sp += #small;

Paul


Re: Ada subtypes and base types

2006-03-21 Thread Jeffrey A Law
On Tue, 2006-03-21 at 10:14 +0100, Duncan Sands wrote:

> Hi Jeff, on the subject of seeing through typecasts, I was playing around
> with VRP and noticed that the following "if" statement is not eliminated:
> 
> int u (unsigned char c) {
> int i = c;
> 
> if (i < 0 || i > 255)
> return -1; /* never taken */
> else
> return 0;
> }
> 
> Is it supposed to be?
Depends...

The "problem" is that the parameter is marked VR_VARYING (as it
should be).  Unfortunately, the VR_VARYING source operand prevents
us from extracting a range for the result of the typecast.

Sigh.  We've got this "interesting" problem in VRP, namely that 
we can sometimes extract ranges for operations on VR_VARYING
objects, but generally we punt.  To compensate we often create
a range, say 0..MAXINT for an unsigned object -- that's effectively
a VARYING range, but the rest of VRP doesn't pessimize so much
when presented with 0..MAXINT vs VARYING.

If we weren't so pessimistic with VR_VARYING or we always created 
ranges, even if they cover TYPE_MIN .. TYPE_MAX then this wouldn't
be a problem.
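
A toy model of the representational point (simplified; tree-vrp.c's real structures and API differ, and these names are invented): if a VR_VARYING source of a widening cast is treated as the explicit range [TYPE_MIN, TYPE_MAX], the cast can still produce a usable range instead of punting.

```c
#include <assert.h>
#include <limits.h>

/* Invented model types, not tree-vrp.c's real data structures.  */
enum vr_kind { VR_VARYING, VR_RANGE };

struct value_range {
    enum vr_kind kind;
    long min, max;              /* meaningful only for VR_RANGE */
};

/* Model a widening cast from unsigned char to int: even a VARYING
   source has the natural range [0, UCHAR_MAX] once widened, so we
   can always return a VR_RANGE.  */
struct value_range cast_uchar_to_int(struct value_range src)
{
    struct value_range r;
    r.kind = VR_RANGE;
    if (src.kind == VR_RANGE) {
        r.min = src.min;
        r.max = src.max;
    } else {
        r.min = 0;
        r.max = UCHAR_MAX;
    }
    return r;
}
```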

I'll see what I can cobble together...

Jeff




Re: Ada subtypes and base types

2006-03-21 Thread Andrew Pinski


On Mar 21, 2006, at 11:15 AM, Jeffrey A Law wrote:


> On Tue, 2006-03-21 at 10:14 +0100, Duncan Sands wrote:
> 
> > Hi Jeff, on the subject of seeing through typecasts, I was playing around
> > with VRP and noticed that the following "if" statement is not eliminated:
> > 
> > int u (unsigned char c) {
> > int i = c;
> > 
> > if (i < 0 || i > 255)
> > return -1; /* never taken */
> > else
> > return 0;
> > }
> > 
> > Is it supposed to be?
> 
> Depends...
> 
> The "problem" is the parameter is marked a VR_VARYING (as it
> should be).  Unfortunately, the VR_VARYING source operand prevents
> us from extracting a range for the result of the typecast.



This is also PR 23745.

-- Pinski



Re: Ada subtypes and base types

2006-03-21 Thread Diego Novillo
On 03/21/06 03:25, Laurent GUERBY wrote:

> A casual read of tree-vrp.c showed that symbolic_range_p is mostly
> used to do nothing, did I miss something? May be it's in another file.
> 
That's correct.  We mostly punt on symbolic ranges because they are
fairly expensive to track.  We do try to use symbolic range information
in some places like 'if (a_3 > b_10)'.  If b_10's range is [a_3, +INF]
then we know the conditional is false.

Do you have anything specific in mind that you would like to track?  Any
kind of heavy symbolic processing will likely be very expensive.  And
VRP already is a hog.
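
The 'if (a_3 > b_10)' fold above can be sanity-checked by brute force: whenever b is constrained to [a, +INF], the comparison a > b is necessarily false. A small illustrative check:

```c
#include <assert.h>

/* If b's range is [a, +INF], i.e. b >= a is known, then "a > b"
   can never be true.  Brute force over a small window of values.  */
int gt_always_false_given_b_ge_a(int lo, int hi)
{
    for (int a = lo; a <= hi; a++)
        for (int b = a; b <= hi; b++)   /* enforce b in [a, hi] */
            if (a > b)
                return 0;               /* counterexample found */
    return 1;
}
```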


RE: Leaf functions and noreturn calls

2006-03-21 Thread Dave Korn
On 21 March 2006 14:59, Richard Henderson wrote:

> On Mon, Mar 20, 2006 at 04:19:52PM -, Dave Korn wrote:
>> However, I might still want to make it an option for cases where debugging
>> isn't going to be important; it seems to me that the generated code should
>> still be valid.
> 
> At which point you should tail call to abort and be done with it.
> So the correct change is to the tail call code, which currently
> filters out noreturn functions.

  Ok, seems reasonable.  I see code in calls.c that checks for the noreturn
flag and clears try_tail_call if so, so I'll need to fix that.  I'm wondering
if that'd be enough on its own though, because I don't see why this chunk of
code from sibcall.c...

  if (no_sibcalls_this_function
  /* ??? Overly conservative.  */
  || frame_offset
  /* Any function that calls setjmp might have longjmp called from
 any called function.  ??? We really should represent this
 properly in the CFG so that this needn't be special cased.
*/
  || current_function_calls_setjmp
  /* Can't if more than one successor or single successor is not
 exit block.  These two tests prevent tail call optimization
 in the presense of active exception handlers.  */
  || call_block->succ == NULL
  || call_block->succ->succ_next != NULL
  || (call_block->succ->dest != EXIT_BLOCK_PTR
  && call_block->succ->dest != alternate_exit)
  /* If this call doesn't end the block, there are operations at
 the end of the block which we must execute after returning.
*/
  || ! call_ends_block_p (insn, call_block->end))
sibcall = 0, tailrecursion = 0;

.. wouldn't check the succ block of the noreturn call and forbid the sibcall
if it wasn't at the end of the enclosing function.

  I may also have to improve my call/call_value patterns.  At the moment they
just test SIBLING_CALL_P and emit a non-linking jump if so, but that means
they still have a clobber of LR attached, which is no longer accurate and
is likely to force my functions to keep stack frames unnecessarily.

  Looking at the ARM md-file as an example, it seems that I can use expanders
to match them.

  And looking at calls.c, it seems there are perhaps some undocumented (in
3.3.3) named patterns that provide a sib- equivalent for each of the normal
call_x patterns?  Ow, that's a little tricky!



cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Ada subtypes and base types

2006-03-21 Thread Duncan Sands
On Tuesday 21 March 2006 17:15, Jeffrey A Law wrote:
> On Tue, 2006-03-21 at 10:14 +0100, Duncan Sands wrote:
> 
> > Hi Jeff, on the subject of seeing through typecasts, I was playing around
> > with VRP and noticed that the following "if" statement is not eliminated:
> > 
> > int u (unsigned char c) {
> > int i = c;
> > 
> > if (i < 0 || i > 255)
> > return -1; /* never taken */
> > else
> > return 0;
> > }
> > 
> > Is it supposed to be?
> Depends...
> 
> The "problem" is the parameter is marked a VR_VARYING (as it
> should be).

Should it be?  I was surprised to see that all ranges are initialised
to VR_VARYING in the vrp pass, since many types have natural ranges
associated with them, for example [0, 255] for the above unsigned char;
starting off with this natural range is, well, natural, and surely
simplifies a bunch of code by making things more uniform, e.g. no need
to special-case unsigned values, type conversions, ...

> Unfortunately, the VR_VARYING source operand prevents 
> us from extracting a range for the result of the typecast.
>
> Sigh.  We've got this "interesting" problem in VRP, namely that 
> we can sometimes extract ranges for operations on VR_VARYING
> objects, but generally we punt.  To compensate we often create
> a range, say 0..MAXINT for an unsigned object -- that's effectively
> a VARYING range, but the rest of VRP doesn't pessimize so much
> when presented with 0..MAXINT vs VARYING.
> 
> If we weren't so pessimistic with VR_VARYING or we always created 
> ranges, even if they cover TYPE_MIN .. TYPE_MAX then this wouldn't
> be a problem.
> 
> I'll see what I can cobble together...

That would be great.

All the best,

Duncan.


Re: Ada subtypes and base types

2006-03-21 Thread Jeffrey A Law
On Tue, 2006-03-21 at 17:41 +0100, Duncan Sands wrote:

> Should it be?  I was surprised to see that all ranges are initialised
> to VR_VARYING in the vrp pass, since many types have natural ranges
> associated with them, for example [0, 255] for the above unsigned char;
> starting off with this natural range is, well, natural, and surely
> simplifies a bunch of code, by making things more uniform, eg: no need
> to special case unsigned values, type conversions, ...
Depends on who you talk to -- we certainly have a problem in that we
have two ways to represent the same thing and the optimizer treats
the two representations differently.

In a world where we could handle VR_VARYING without pessimizing so
much, then I don't think either representation would be clearly
better than the other.

Jeff



Re: Ada subtypes and base types

2006-03-21 Thread Duncan Sands
On Tuesday 21 March 2006 18:01, Jeffrey A Law wrote:
> On Tue, 2006-03-21 at 17:41 +0100, Duncan Sands wrote:
> 
> > Should it be?  I was surprised to see that all ranges are initialised
> > to VR_VARYING in the vrp pass, since many types have natural ranges
> > associated with them, for example [0, 255] for the above unsigned char;
> > starting off with this natural range is, well, natural, and surely
> > simplifies a bunch of code, by making things more uniform, eg: no need
> > to special case unsigned values, type conversions, ...
> Depends on who you talk to -- we certainly have a problem in that we
> have two ways to represent the same thing and the optimizer treats
> the two representations differently.
> 
> In a world where we could handle VR_VARYING without pessimizing so
> much, then I don't think either representation would be clearly
> better than the other.

Is memory use a problem here?  VR_VARYING has the advantage of using
a bit less memory.  On the other hand, I guess you could introduce the
convention that VR_RANGE with null min/max means to use TYPE_MIN/
TYPE_MAX, or something along those lines.

Ciao,

D.


Re: Ada subtypes and base types

2006-03-21 Thread Tom Tromey
> "Jeff" == Jeffrey A Law <[EMAIL PROTECTED]> writes:

Jeff> Note that this is closely related to some of the bounds checking
Jeff> elimination we want to support for Java one day IIRC.

My understanding from the PR (21855) is that this is stuck on
aliasing, not VRP -- the VRP parts supposedly work.

Tom


Re: Ada subtypes and base types

2006-03-21 Thread Diego Novillo
On 03/21/06 12:14, Tom Tromey wrote:
>> "Jeff" == Jeffrey A Law <[EMAIL PROTECTED]> writes:
> 
> Jeff> Note that this is closely related to some of the bounds checking
> Jeff> elimination we want to support for Java one day IIRC.
> 
> My understanding from the PR (21855) is that this is stuck on
> aliasing, not VRP -- the VRP parts supposedly work.
> 
Correct.


Re: Ada subtypes and base types

2006-03-21 Thread Jeffrey A Law
On Tue, 2006-03-21 at 18:13 +0100, Duncan Sands wrote:

> Is memory use a problem here?  VR_VARYING has the advantage of using
> a bit less memory.  On the other hand, I guess you could introduce the
> convention that VR_RANGE with null min/max means to use TYPE_MIN/
> TYPE_MAX, or something along those lines.
Well, more than anything, I just don't think we really thought
about the issue of what the best representation would be in this
kind of case.

If I was to lean any particular direction I would tend to lean
towards exposing the range and removing VR_VARYING.  The net
result would be good consistency in the transformations within
VRP.  ie, we would not need to special case stuff like

VR_VARYING | ~[0, 0]  -> ~[0, 0]

Similarly for typecasts and a variety of other stuff.

jeff


> 
> Ciao,
> 
> D.



Re: FSF Policy re. inclusion of source code from other projects in GCC

2006-03-21 Thread Tom Tromey
> "Mark" == Mark Mitchell <[EMAIL PROTECTED]> writes:

Mark> I would prefer it go on savannah, which is more clearly unaffiliated
Mark> with any particular commercial entity.

Ok, I submitted a request there.

Tom


sibcall, sibcall_pop, sibcall_value and sibcall_pop_value named patterns not documented

2006-03-21 Thread Dave Korn


  At least as far as I have been able to find, there's no mention of these
anywhere in any version of the internals manual.

  I think they should at least be mentioned and the similarities/differences
to ordinary call/call_value/call_pop/call_value_pop explained, because they
actually turn out to be vital, but the dependency isn't immediately obvious:
if you implement epilogue and sibcall_epilogue (as schedulable RTL generated
by expanders, rather than as insn patterns with code-generating functions
which just opencode the epilogue at assembly-generation time), and you haven't
implemented sibcall/sibcall_value patterns, the compiler falls back to the
ordinary call/call_value pattern and then you get that abort in
propagate_one_insn when it finds itself trying to delete an epilogue insn.  I
initially thought it would be sufficient to just test for SIBLING_CALL_P
(insn) in the code generation for the call/call_XXX patterns, but it turns out
to be subtly not so.

  This may only apply on architectures that have a link register, for all I
know.  It seems at first glance to have been because of using an ordinary call
pattern (which clobbers LR) to match a sibcall (which does not, in fact, use
LR at all); it was the reload of the LR from the stack prior to the sibcall
jump that got deleted, but then again it was only a simple testcase from the
testsuite and there weren't any other items on the stack frame in that
instance, so I can't be sure.

  Since I have only deduced the existence of these patterns from stumbling
across HAVE_sibcall etc. in calls.c, I won't immediately volunteer to write
the doco patch.  Shall I file a bugzilla, or is there some documentation about
the sibcall patterns somewhere that I missed?


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Results for 4.2.0 20060320 (experimental) testsuite on powerpc-apple-darwin8.5.0 (-m64 results)

2006-03-21 Thread Bradley Lucier

64-bit powerpc-darwin results can be found here:

http://www.math.purdue.edu/~lucier/gcc/test-results/4_2-2006-03-20.gz

The mail-report-with-warnings.log file is again too large to be
accepted by the gcc-testresults mailing list, after quite a few weeks
when it was only about 125K long.


I'm curious about whether any of the changes recently proposed to  
clean up the x86-darwin port can be applied to the 64-bit PowerPC  
darwin compiler; I'm getting the feeling that gcc on ppc64 darwin may  
become something of an orphan.


Brad


Re: Ada subtypes and base types

2006-03-21 Thread Laurent GUERBY
On Tue, 2006-03-21 at 11:29 -0500, Diego Novillo wrote:
> On 03/21/06 03:25, Laurent GUERBY wrote:
> 
> > A casual read of tree-vrp.c showed that symbolic_range_p is mostly
> > used to do nothing, did I miss something? May be it's in another file.
> > 
> That's correct.  We mostly punt on symbolic ranges because they are
> fairly expensive to track.  We do try to use symbolic range information
> in some places like 'if (a_3 > b_10)'.  If b_10's range is [a_3, +INF]
> then we know the conditional is false.
> 
> Do you have anything specific in mind that you would like to track?  Any
> kind of heavy symbolic processing will likely be very expensive.  And
> VRP already is a hog.

The typical Ada check we'd want to remove is below. I suspect Java
and Pascal are similar (Java has only one variable bound IIRC,
the first is always zero).

procedure T is

   -- Natural means [0 .. 2**31-1]
   procedure P (X : in out String; N1, N2 : in Natural) is
  First : constant Natural := X'First;
  Last  : constant Natural := X'Last;
   begin
  for I in First + N1 .. Last - N2 loop
 X (I) := ' '; -- CHECK that I>=X'First and I<=X'Last
  end loop;
   end P;

   S : String := "123456";

begin
   P (S, 2, 1);
   if S /= "12   6" then
  raise Program_Error;
   end if;
end T;

The Ada front-end expanded code looks like (gcc -c -gnatdg t.adb):

   procedure t__p (x : in out string; n1 : in natural; n2 : in natural) is
  subtype t__p__S1b is string (x'first(1) .. x'last(1));
  first : constant natural := {x'first};
  last : constant natural := {x'last};
  L_1 : label
   begin
  S2b : integer;
  S2b := first + n1;
  S3b : integer;
  S3b := last - n2;
  L_1 : for i in S2b .. S3b loop
  [constraint_error when not (integer(i) in x'first .. x'last) "index check failed"]
 x (i) := ' ';
  end loop L_1;
  return;
   end t__p;


So we know:

- First, Last, N1, N2 are locally constant
- N1 >= 0 
- N2 >= 0
- First + N1 didn't overflow
- Last - N2 didn't overflow

And within the loop body:

- I >= First + N1
- I <= Last - N2
- First + N1 <= Last - N2

We'd need to track enough to prove that the following are always true

- I >= First 
- I <= Last

So the FE generated check can be removed.

This looks not far from what you describe, but I'm not a specialist
in those issues.
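
The implication behind removing the check can be verified exhaustively over small ranges; this validates the logic only, not any compiler pass: if N1 >= 0, N2 >= 0 and I lies in [First+N1, Last-N2] (with neither bound computation overflowing), then First <= I <= Last.

```c
#include <assert.h>

/* Exhaustive check over small ranges of the facts listed above:
   given n1 >= 0, n2 >= 0 and first + n1 <= i <= last - n2, we
   always have first <= i <= last, so the FE-generated index
   check can never fire.  */
int check_never_fires(int lo, int hi)
{
    for (int first = lo; first <= hi; first++)
        for (int last = lo; last <= hi; last++)
            for (int n1 = 0; n1 <= hi - lo; n1++)
                for (int n2 = 0; n2 <= hi - lo; n2++)
                    for (int i = first + n1; i <= last - n2; i++)
                        if (i < first || i > last)
                            return 0;   /* the check would fire */
    return 1;
}
```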

Laurent



Re: Ada subtypes and base types

2006-03-21 Thread Jeffrey A Law
On Tue, 2006-03-21 at 10:14 +0100, Duncan Sands wrote:

> Hi Jeff, on the subject of seeing through typecasts, I was playing around
> with VRP and noticed that the following "if" statement is not eliminated:
> 
> int u (unsigned char c) {
> int i = c;
> 
> if (i < 0 || i > 255)
> return -1; /* never taken */
> else
> return 0;
> }
> 
> Is it supposed to be?
Fixed thusly.  Bootstrapped and regression tested on i686-pc-linux-gnu.


* tree-vrp.c (extract_range_from_unary_expr): Derive ranges for
type conversions of a VR_VARYING source to a wider type.

* gcc.dg/tree-ssa/vrp28.c: New test.

Index: tree-vrp.c
===
*** tree-vrp.c  (revision 112250)
--- tree-vrp.c  (working copy)
*** extract_range_from_unary_expr (value_ran
*** 1641,1654 
return;
  }
  
!   /* Refuse to operate on varying and symbolic ranges.  Also, if the
!  operand is neither a pointer nor an integral type, set the
!  resulting range to VARYING.  TODO, in some cases we may be able
!  to derive anti-ranges (like nonzero values).  */
!   if (vr0.type == VR_VARYING
!   || (!INTEGRAL_TYPE_P (TREE_TYPE (op0))
! && !POINTER_TYPE_P (TREE_TYPE (op0)))
!   || symbolic_range_p (&vr0))
  {
set_value_range_to_varying (vr);
return;
--- 1641,1652 
return;
  }
  
!   /* Refuse to operate on symbolic ranges, or if neither operand is
!  a pointer or integral type.  */
!   if ((!INTEGRAL_TYPE_P (TREE_TYPE (op0))
!&& !POINTER_TYPE_P (TREE_TYPE (op0)))
!   || (vr0.type != VR_VARYING
! && symbolic_range_p (&vr0)))
  {
set_value_range_to_varying (vr);
return;
*** extract_range_from_unary_expr (value_ran
*** 1681,1700 
 or equal to the new max, then we can safely use the newly
 computed range for EXPR.  This allows us to compute
 accurate ranges through many casts.  */
!   if (vr0.type == VR_RANGE)
!   {
! tree new_min, new_max;
  
! /* Convert VR0's min/max to OUTER_TYPE.  */
! new_min = fold_convert (outer_type, vr0.min);
! new_max = fold_convert (outer_type, vr0.max);
  
  /* Verify the new min/max values are gimple values and
!that they compare equal to VR0's min/max values.  */
  if (is_gimple_val (new_min)
  && is_gimple_val (new_max)
! && tree_int_cst_equal (new_min, vr0.min)
! && tree_int_cst_equal (new_max, vr0.max)
  && compare_values (new_min, new_max) <= 0
  && compare_values (new_min, new_max) >= -1)
{
--- 1679,1714 
 or equal to the new max, then we can safely use the newly
 computed range for EXPR.  This allows us to compute
 accurate ranges through many casts.  */
!   if (vr0.type == VR_RANGE
! || (vr0.type == VR_VARYING
! && TYPE_PRECISION (outer_type) > TYPE_PRECISION (inner_type)))
!   {
! tree new_min, new_max, orig_min, orig_max;
! 
! /* Convert the input operand min/max to OUTER_TYPE.   If
!the input has no range information, then use the min/max
!for the input's type.  */
! if (vr0.type == VR_RANGE)
!   {
! orig_min = vr0.min;
! orig_max = vr0.max;
!   }
! else
!   {
! orig_min = TYPE_MIN_VALUE (inner_type);
! orig_max = TYPE_MAX_VALUE (inner_type);
!   }
  
! new_min = fold_convert (outer_type, orig_min);
! new_max = fold_convert (outer_type, orig_max);
  
  /* Verify the new min/max values are gimple values and
!that they compare equal to the original input's
!min/max values.  */
  if (is_gimple_val (new_min)
  && is_gimple_val (new_max)
! && tree_int_cst_equal (new_min, orig_min)
! && tree_int_cst_equal (new_max, orig_max)
  && compare_values (new_min, new_max) <= 0
  && compare_values (new_min, new_max) >= -1)
{
*** extract_range_from_unary_expr (value_ran
*** 1717,1722 
--- 1731,1746 
}
  }
  
+   /* Conversion of a VR_VARYING value to a wider type can result
+  in a usable range.  So wait until after we've handled conversions
+  before dropping the result to VR_VARYING if we had a source
+  operand that is VR_VARYING.  */
+   if (vr0.type == VR_VARYING)
+ {
+   set_value_range_to_varying (vr);
+   return;
+ }
+ 
/* Apply the operation to each end of the range and see what we end
   up with.  */
if (code == NEGATE_EXPR
Index: testsuite/gcc.dg/tree-ssa/vrp28.c
===
*** testsuite/gcc.dg/tree-ssa/vrp28.c   (revision 0)
--- testsuite/gcc.dg/tree-ss

Re: alias time explosion

2006-03-21 Thread Andrew MacLeod
On Tue, 2006-03-21 at 10:10 +0100, Richard Guenther wrote:
> On 3/21/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> > On Mon, 2006-03-20 at 18:55 -0500, Andrew Pinski wrote:
> > > On Mar 20, 2006, at 5:18 PM, Andrew MacLeod wrote:
> > 
> > It looks like sometime between 10/30 and 01/23 alias analysis got out of
> > hand. Odd it hasn't been noted before.
> 
> Can you do a comparison to 4.1.0 and file a PR with the testcase please?

I will do so in a day or so when I get a chance. Until then:


I seem to have narrowed it down to this patch:

http://gcc.gnu.org/ml/gcc-patches/2006-01/msg00908.html



Dan, this appears to *not* be compile-time neutral:

Timings on this patch show that it is not faster or slower than
what we
do now (even with the removal of the call clobbering patch).  This is
true even on fortran tests i had that clobber a lot of stuff.


running cpgram.ii shows a regression:

before patch:

 tree alias analysis   :   2.49 ( 7%) usr   0.25 ( 5%) sys   6.13 ( 5%) wall
4971 kB ( 1%) ggc
 TOTAL :  36.90 4.72   130.34 
467341 kB

after patch:

 tree alias analysis   :  59.00 (63%) usr   0.40 ( 7%) sys  70.43 (36%) wall
4957 kB ( 1%) ggc
 TOTAL :  94.13 5.43   193.85 
468339 kB


on an i386 Linux machine bootstrapped with checking disabled.


Andrew



Re: alias time explosion

2006-03-21 Thread Daniel Berlin
On Tue, 2006-03-21 at 17:30 -0500, Andrew MacLeod wrote:
> On Tue, 2006-03-21 at 10:10 +0100, Richard Guenther wrote:
> > On 3/21/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> > > On Mon, 2006-03-20 at 18:55 -0500, Andrew Pinski wrote:
> > > > On Mar 20, 2006, at 5:18 PM, Andrew MacLeod wrote:
> > > 
> > > It looks like sometime between 10/30 and 01/23 alias analysis got out of
> > > hand. Odd it hasn't been noted before.
> > 
> > Can you do a comparison to 4.1.0 and file a PR with the testcase please?
> 
> I will do so in a day or so when I get a chance. until then:
> 
> 
> I seem to  have narrowed it down to this patch:
> 
> http://gcc.gnu.org/ml/gcc-patches/2006-01/msg00908.html
> 

That's quite a while ago :).

> 
> 
> Dan, this appear to *not* be compile time neutral:
> 
> Timings on this patch show that it is not faster or slower than
> what we
> do now (even with the removal of the call clobbering patch).  This is
> true even on fortran tests i had that clobber a lot of stuff.

> 
> 
> running cpgram.ii shows a regression:
> 
> before patch:
> 
>  tree alias analysis   :   2.49 ( 7%) usr   0.25 ( 5%) sys   6.13 ( 5%) wall  
>   4971 kB ( 1%) ggc
>  TOTAL :  36.90 4.72   130.34 
> 467341 kB
> 
> after patch:
> 
>  tree alias analysis   :  59.00 (63%) usr   0.40 ( 7%) sys  70.43 (36%) wall  
>   4957 kB ( 1%) ggc
>  TOTAL :  94.13 5.43   193.85 
> 468339 kB
> 

> on a 386 linux machine bootstrapped with checking disabled.

Can you send me cpgram.ii, so I can look into it?

I will note the patch is pretty much required for correctness.  We were
getting seriously wrong answers before in some cases.





Re: alias time explosion

2006-03-21 Thread Richard Guenther
On 3/21/06, Daniel Berlin <[EMAIL PROTECTED]> wrote:
> On Tue, 2006-03-21 at 17:30 -0500, Andrew MacLeod wrote:
> >
> > I seem to  have narrowed it down to this patch:
> >
> > http://gcc.gnu.org/ml/gcc-patches/2006-01/msg00908.html
> >
>
> That's quite a while ago :).
>
> >
> >
> > Dan, this appear to *not* be compile time neutral:
> >
> > Timings on this patch show that it is not faster or slower than
> > what we
> > do now (even with the removal of the call clobbering patch).  This 
> > is
> > true even on fortran tests i had that clobber a lot of stuff.
>
> >
> >
> > running cpgram.ii shows a regression:
> >
> > before patch:
> >
> >  tree alias analysis   :   2.49 ( 7%) usr   0.25 ( 5%) sys   6.13 ( 5%) 
> > wall4971 kB ( 1%) ggc
> >  TOTAL :  36.90 4.72   130.34   
> >   467341 kB
> >
> > after patch:
> >
> >  tree alias analysis   :  59.00 (63%) usr   0.40 ( 7%) sys  70.43 (36%) 
> > wall4957 kB ( 1%) ggc
> >  TOTAL :  94.13 5.43   193.85   
> >   468339 kB
> >
>
> > on a 386 linux machine bootstrapped with checking disabled.
>
> Can you send me cpgram.ii, so i can look into it?
>
> i will note the patch is pretty much required for correctness.  We were
> getting seriously wrong answers before in some cases.

Maybe someone can have a look at the attribute((pointer_no_escape))
patch I posted a while ago.  With some IPA machinery we could possibly
trim down the clobber lists quite a bit.

Richard.


Re: alias time explosion

2006-03-21 Thread Daniel Berlin

> Maybe someone can have a look at the attribute((pointer_no_escape))
> patch I posted a while ago.  With some IPA machinery we could possibly
> trim down the clobber lists quite a bit.
> 

Well, let me confirm first that he is right.  This requires a cpgram.ii
that compiles (none of the attachments to the testcase do).
This is rather easy; I will test the improved-aliasing-branch from right
before the merge.

Second, it could be one of the silly tests that was previously removed
(i.e. the is_global_var check in is_call_clobbered, even though we have
globals that are *not* clobbered).  Those would now be bitmap tests.

If the bitmap is too expensive for random testing, we might have to move
it into a flag.




> Richard.
> 



compile error on 4.1.0 vs. 3.4.2

2006-03-21 Thread Greg Buchholz
I don't know which version is correct, but the program below
compiles fine and works with g++-3.4.2, but fails with the following
error message when compiled with g++-4.1.0...

example.cpp: In function 'typename compose<Coll, Out>::type fmap(typename 
compose<Coll, In>::type, Out (*)(In)) [with Coll = TemList<list, NullType>, In = int, Out = int]':
example.cpp:59:   instantiated from here
example.cpp:37: error: no matching function for call to 'fmap(const int&, int 
(*&)(int))'

Thanks,

Greg Buchholz



#include <iostream>
#include <list>
#include <ostream>

using namespace std;

class NullType {};

template <template <class> class H, class T> struct TemList
{
typedef T Tail;
};

template <class T, class Base> struct compose;

template <template <class> class Tem, class Base> 
struct compose<TemList<Tem, NullType>, Base >
{
typedef Tem<Base> type;
};

template <template <class> class Tem, class Tail, class Base> 
struct compose<TemList<Tem, Tail>, Base >
{
typedef Tem<typename compose<Tail, Base>::type> type;
};

template <class Coll, class In, class Out> 
typename compose<Coll, Out>::type 
fmap(typename compose<Coll, In>::type l, Out (*f)(In))
{
typename compose<Coll, Out>::type temp;

for(typename compose<Coll, In>::type::const_iterator iter = l.begin();
iter != l.end(); ++iter)
{
temp.push_back(fmap(*iter,f));
}
return temp; 
}

template <class In, class Out> 
Out fmap(In i, Out (*f)(In))
{
return f(i);
}

int inc(int x){ return x+1; }

template <class T> std::ostream& operator<<(std::ostream&, const std::list<T>&);

int main(int argc, char* argv[])
{
int tmp[] = {1,2,3};
list<int> simple_list(tmp,tmp+3);

cout << "fmap on single int :" << fmap(5,inc) << endl;
cout << "inc of simple list: " << simple_list << " => " ;
cout << fmap<TemList<list, NullType>, int, int>(simple_list,inc) << endl;   
return 0;
}

template <class T> std::ostream& operator<<(std::ostream& o, const std::list<T>& 
l)
{
o << "[";
for(typename std::list<T>::const_iterator i = l.begin(); i != --l.end(); 
++i)
o << *i << ",";
return o << *(--l.end()) << "]";
}



insns for register-move between general and floating

2006-03-21 Thread Greg McGary
I'm working on a port that has instructions to move bits between
64-bit floating-point and 64-bit general-purpose regs.  I say "bits"
because there's no conversion between float and int: the bit pattern
is unaltered.  Therefore, it's possible to use scratch FPRs for
spilling GPRs & vice-versa, and float<->int conversions need not go
through memory.

Among all the knobs to turn regarding register classes, reload
classes, and modes+constraints on movM, floatMN2, fixMN2 patterns,
I need some advice on how to do this properly.

Thanks!
Greg


Re: alias time explosion

2006-03-21 Thread Andrew MacLeod
On Tue, 2006-03-21 at 17:42 -0500, Daniel Berlin wrote:
> On Tue, 2006-03-21 at 17:30 -0500, Andrew MacLeod wrote:
> > On T
> > I seem to  have narrowed it down to this patch:
> > 
> > http://gcc.gnu.org/ml/gcc-patches/2006-01/msg00908.html
> > 
> 
> That's quite a while ago :).
> 

It was.  I am surprised no one took note of it before now; I was surprised
I had to go back that far to find it.

> 
> i will note the patch is pretty much required for correctness.  We were
> getting seriously wrong answers before in some cases.
> 

Yeah, but wrong as in pessimistic, or wrong as in incorrect code?  If it's
the latter, there must be a better way to do it.

Hopefully it's just some check-ordering issue that's easily resolved.  It
seems like an insane amount of time.

Andrew



Re: alias time explosion

2006-03-21 Thread Daniel Berlin
On Tue, 2006-03-21 at 22:16 -0500, Andrew MacLeod wrote:
> On Tue, 2006-03-21 at 17:42 -0500, Daniel Berlin wrote:
> > On Tue, 2006-03-21 at 17:30 -0500, Andrew MacLeod wrote:
> > > On T
> > > I seem to  have narrowed it down to this patch:
> > > 
> > > http://gcc.gnu.org/ml/gcc-patches/2006-01/msg00908.html
> > > 
> > 
> > That's quite a while ago :).
> > 
> 
> It was.  I am surprised no one took note of it before now; I was surprised
> I had to go back that far to find it.

This is probably a pathological testcase for something.

> > 
> > i will note the patch is pretty much required for correctness.  We were
> > getting seriously wrong answers before in some cases.
> > 
> 
> Yeah, but wrong as in pessimistic, or wrong as in incorrect code?  If it's
> the latter, there must be a better way to do it.

Well, both, actually.

> 
> Hopefully it's just some check-ordering issue that's easily resolved.  It
> seems like an insane amount of time.

Looking at the testcase, it's probably the removal of is_global_var from
is_call_clobbered, or something close by that used to not check the bitmap
and now does.   The old shortcut would actually generate wrong code in
some cases where the real call-clobbered bitmap and what is_call_clobbered
returned got out of sync.

I will check if this is the case, and if so, it is trivial to fix.


--Dan



Re: alias time explosion

2006-03-21 Thread Daniel Berlin
On Tue, 2006-03-21 at 22:13 -0500, Andrew MacLeod wrote:
> On Tue, 2006-03-21 at 18:00 -0500, Daniel Berlin wrote:
> > > Maybe someone can have a look at the attribute((pointer_no_escape))
> > > patch I posted a while ago.  With some IPA machinery we could possibly
> > > trim down the clobber lists quite a bit.
> > > 
> > 
> > Well, let me confirm first that he is right.  This requires a cpgram.ii
> > that compiles (none of the attachments to the testcase do)
> > This is rather easy, i will test the improved-aliasing-branch from right
> > before the merge.
> > 
> 
> I used the attached one with -fpermissive


Thanks, I'm looking into it now.




Re: Results for 4.2.0 20060320 (experimental) testsuite on powerpc-apple-darwin8.5.0 (-m64 results)

2006-03-21 Thread Shantonu Sen


On Mar 21, 2006, at 12:34 PM, Bradley Lucier wrote:

I'm curious about whether any of the changes recently proposed to  
clean up the x86-darwin port can be applied to the 64-bit PowerPC  
darwin compiler;


Like what? I haven't really seen many cleanups that were x86/darwin-specific.


I'm getting the feeling that gcc on ppc64 darwin may become  
something of an orphan.


Not sure what you mean here, but these ppc64 results are no worse
than the ppc(32) results for Darwin. Did you look at the failures?


For example, here's your results:
=== g++ Summary ===

# of expected passes            10807
# of unexpected failures        1287
# of expected failures          67
# of unresolved testcases       14
# of unsupported tests          126

=== libstdc++ Summary ===

# of expected passes            355
# of unexpected failures        1576
# of expected failures          7
# of unsupported tests          323

And here's my results, on 10.4.3 with CVS odcctools (which has  
cctools-590.36/ld64-26.0.81)

=== g++ Summary ===

# of expected passes            9814
# of unexpected failures        1406
# of expected failures          65
# of unresolved testcases       77
# of unsupported tests          124

=== libstdc++ Summary ===

# of expected passes            354
# of unexpected failures        1589
# of expected failures          7
# of unsupported tests          323


This should give you confidence that ppc32-darwin is as bad as,  
if not worse than, ppc64-darwin. Many of the C++ failures are of the  
form:

/opt/odcctools/bin/ld: Undefined symbols:
__Unwind_GetIPInfo
collect2: ld returned 1 exit status

A fix for this was posted by Eric C. (), but never committed. Someone will  
have to decide Real Soon Now what the expected binary compatibility  
of GCC trunk vs. Mac OS X's libgcc should/will be. Andrew has filed  
this as 


What is the prior art for other OS vendors that ship a shared libgcc?

Shantonu