Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-13 Thread Gary Funck
On 10/13/11 06:15:31, Laurynas Biveinis wrote:
> [...] In your case (correct me if I misunderstood something)
> you have one hash table, marking of which will mark more objects
> which are required for the correct marking of the second hash table.
> GC might be simply walking the second one first.

Yes, I think that this accurately summarizes the situation, and the result.

Any suggestions on how to fix this?  It seems that one fix might
be to use a non garbage-collected hash table for the hash map.

- Gary
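
For illustration, the ordering problem described above can be sketched in C.  This is a toy model and not GCC's ggc code: struct obj, walk_strong, walk_cache and surviving_entries are hypothetical names, and the cache-style table is assumed to keep an entry only if its key is already marked at the moment the table is walked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy heap object with a mark bit; 'ref' is one outgoing pointer. */
struct obj {
  bool marked;
  struct obj *ref;
};

/* Mark an object and everything reachable through 'ref'. */
static void mark(struct obj *o)
{
  while (o != NULL && !o->marked) {
    o->marked = true;
    o = o->ref;
  }
}

/* A "strong" table: walking it marks every value it holds. */
static void walk_strong(struct obj **values, size_t n)
{
  for (size_t i = 0; i < n; i++)
    mark(values[i]);
}

/* A cache-style table: an entry survives only if its key is already
   marked when the table is walked; otherwise the slot is cleared.
   Returns the number of surviving entries. */
static size_t walk_cache(struct obj **keys, size_t n)
{
  assert(keys != NULL);
  size_t kept = 0;
  for (size_t i = 0; i < n; i++) {
    if (keys[i] != NULL && keys[i]->marked)
      kept++;
    else
      keys[i] = NULL;
  }
  return kept;
}

/* Run one collection over a fresh pair of tables, in either order,
   and report how many cache entries survive. */
size_t surviving_entries(bool strong_first)
{
  struct obj key = { false, NULL };
  struct obj holder = { false, &key };   /* marking holder marks key */
  struct obj *strong[] = { &holder };
  struct obj *cache[] = { &key };

  if (strong_first) {
    walk_strong(strong, 1);
    return walk_cache(cache, 1);
  }
  size_t kept = walk_cache(cache, 1);
  walk_strong(strong, 1);
  return kept;
}
```

In this model the entry survives when the strong table is walked first and is dropped when the cache is walked first, even though the key is reachable in both cases.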


Re: RFC: Add ADD_RESTRICT tree code

2011-10-13 Thread Jakub Jelinek
On Thu, Oct 13, 2011 at 01:38:44AM +0200, Michael Matz wrote:
> IMO reading the standard to allow an access to be 
> based "on s.p _as well as_ t->p" and that this should result in any 
> sensible behaviour regarding restrict is interpreting too much into it.  

No.  Because s.p and t->p designates (if the wrapper function returns
the address it was passed as the first argument, otherwise may designate)
the same object.  And the based on P relation is basing on objects,
not expressions - "expression E based on object P" in the standard.

> Let's do away with the fields, trying to capture the core of the 
> disagreement.  What you seem to be saying is that this code is 
> well-defined and shouldn't return 1:
> 
> int foo (int * _a, int * _b)
> {
>   int * restrict a = _a;
>   int * restrict b = _b;
>   int * restrict *pa = wrap (&a);   
>   *pa = _b; // 1
>   *a = 0;
>   **pa = 1;
>   return *a;
> }

This is valid.  *pa and a expressions designate (if wrap returns the passed
in pointer) the same object, pa itself is not restrict, thus it is fine
if both of those expressions are used to access the object a.
The store to the restrict object (// 1) is fine, the standard has only
restrictions when you assign to the restrict pointer object a value based
on another restrict pointer (then the inner block resp. return from block
rules apply), in the above case _b is either not based on any restrict
pointer, or could be based on a restrict pointer associated with some outer
block (caller).  Both *a and **pa lvalues have (or may have) address based
on the same restricted pointer object (a).

Of course if you change the above to
int * restrict * restrict pa = wrap (&a);
the testcase would be invalid, because then accesses to a would be done
through both expression based on the restricted pointer pa and through a
directly in the same block.  So, you can disambiguate based on
int *restrict*pa or field restrict, but only if you can first disambiguate
that *pa and a is not actually the same object.  That disambiguation can
be through restrict pa or some other means (PTA/IPA-PTA usual job).

> I think that would go straight against the intent of restrict.  I'd read 
> the standard as making the above trick undefined.

Where exactly?

> > Because, if you change t->p (or s.p) at some point in between t->p = q; 
> > and s.p[0]; (i.e. prior to the access) to point to a copy of the array, 
> > both s.p and t->p change.
> 
> Yes, but the question is, if the very modification of t->p was valid to 
> start with.  In my example above insn 1 is a funny way to write "a = _b", 
> i.e. reassigning the already set restrict pointer a to the one that also 
> is already in b.  Simplifying the above then leads to:
> 
> int foo (int * _a, int * _b)
> {
>   int * restrict a = _a;
>   int * restrict b = _b;
>   a = _b;
>   *a = 0;
>   *b = 1;
>   return *a;
> }
> 
> which I think is undefined because of the fourth clause (multiple 
> modifying accesses to the same underlying object X need to go through one 
> particular restrict chain).

Yes, this one is undefined, *_b object is modified
and accessed here through lvalues based on different restrict pointer
objects (a and b).  Note that in the earlier testcase, although
you have int * restrict b = _b; there, nothing is accessed through lvalue
based on b, unlike here.

> Seen from another perspective your reading would introduce an 
> inconsistency with composition.  Let's assume we have this function 
> available:
> 
> int tail (int * restrict a, int * restrict b) {
>   *a = 0;
>   *b = 1;
>   return *a;
> }
> 
> Clearly we can optimize this into { *a=0;*b=1;return 0; } without 
> looking at the context.  Now write the testcase or my example above in 

Sure.

> terms of that function:
> 
> int goo (int *p, int *q)
> {
>   struct S s, *t;
>   s.a = 1;
>   s.p = p;   // 1
>   t = wrap(&s);  // 2 t=&s in effect, but GCC doesn't see this
>   t->p = q;  // 3
>   return tail (s.p, t->p);
> }
> 
> Now we get the same behaviour of returning a zero.  Something must be 
> undefined here, and it's not in tail itself.  It's either the call of 
> tail, the implicit modification of s.p with writes to t->p or the 
> existence of two separate restrict pointers of the same value.  I think 
> the production of two separate equal-value restrict pointers via 
> indirect modification is the undefinedness, and _if_ the standard can be 
> read in a way that this is supposed to be valid then it needs to be 
> clarified to not allow that anymore.

This is undefined, the undefined behavior happens when running the tail.
The same object X (*q) is modified and accessed through lvalue based on
restricted pointer object a as well as through lvalue based on restricted
pointer b.
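
As a small illustration of the point about tail, the sketch below contrasts a defined call with an unrestricted variant.  tail is taken from the quoted message; tail_may_alias and the two call_* helpers are hypothetical names added here.  The undefined call, tail (q, q), is deliberately not executed.

```c
#include <assert.h>

/* With restrict, the compiler may assume *a and *b are distinct objects
   and fold the final load to the constant 0. */
int tail(int *restrict a, int *restrict b)
{
  *a = 0;
  *b = 1;
  return *a;
}

/* Without restrict the load must be redone, because b may alias a. */
int tail_may_alias(int *a, int *b)
{
  *a = 0;
  *b = 1;
  return *a;
}

/* Defined call: the two arguments designate different objects. */
int call_with_distinct_objects(void)
{
  int x = 5, y = 7;
  int r = tail(&x, &y);
  assert(x == 0 && y == 1);
  return r;
}

/* Also defined, because nothing here is restrict-qualified. */
int call_unrestricted_with_same_object(void)
{
  int x = 5;
  return tail_may_alias(&x, &x);
}
```

The first helper returns 0 and the second returns 1; passing the same object twice to tail itself would be the undefined case discussed above.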

> I believe the standard should say something to the effect of disallowing 
> modifying restrict pointers after they are initialized/assigned to once.

The standard doesn't say anything like that, 

[PATCH] Fix PR50698

2011-10-13 Thread Richard Guenther

This fixes PR50698, a failure to disambiguate &MEM[&mem + 10] from
&MEM[&mem] in data-reference analysis.  Fixed by also looking
at offsets for non-subsetted references.
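
The idea of disambiguating two references to the same base by their constant offsets can be sketched as follows.  This is a toy model only: struct mem_ref and provably_disjoint are hypothetical names, not GCC's data-reference structures, and the offsets and sizes are illustrative byte counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a memory reference: a base object plus a constant byte
   offset and an access size. */
struct mem_ref {
  const void *base;
  ptrdiff_t offset;
  size_t size;
};

/* Two references to the same base cannot overlap when their
   [offset, offset + size) intervals are disjoint.  References with
   different bases are left as "may overlap" in this sketch. */
bool provably_disjoint(struct mem_ref a, struct mem_ref b)
{
  assert(a.size > 0 && b.size > 0);
  if (a.base != b.base)
    return false;
  return a.offset + (ptrdiff_t) a.size <= b.offset
         || b.offset + (ptrdiff_t) b.size <= a.offset;
}

static float mem[4096];

/* Same base, offsets 0 and 10, four bytes each. */
bool example_disjoint(void)
{
  struct mem_ref r0 = { mem, 0, 4 };
  struct mem_ref r10 = { mem, 10, 4 };
  return provably_disjoint(r0, r10);
}

/* The same location referenced twice must not be reported as disjoint. */
bool example_same(void)
{
  struct mem_ref r = { mem, 0, 4 };
  return provably_disjoint(r, r);
}
```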

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2011-10-13  Richard Guenther  

PR tree-optimization/50698
* tree-data-ref.c (split_constant_offset_1): Also process
offsets of &MEM.

* g++.dg/vect/pr50698.cc: New testcase.

Index: gcc/tree-data-ref.c
===
*** gcc/tree-data-ref.c (revision 179856)
--- gcc/tree-data-ref.c (working copy)
*** split_constant_offset_1 (tree type, tree
*** 589,597 
int punsignedp, pvolatilep;
  
op0 = TREE_OPERAND (op0, 0);
-   if (!handled_component_p (op0))
- return false;
- 
base = get_inner_reference (op0, &pbitsize, &pbitpos, &poffset,
&pmode, &punsignedp, &pvolatilep, false);
  
--- 589,594 
Index: gcc/testsuite/g++.dg/vect/pr50698.cc
===
*** gcc/testsuite/g++.dg/vect/pr50698.cc(revision 0)
--- gcc/testsuite/g++.dg/vect/pr50698.cc(revision 0)
***
*** 0 
--- 1,27 
+ // { dg-do compile }
+ // { dg-require-effective-target vect_float }
+ 
+ float mem[4096];
+ const int N=1024;
+ 
+ struct XYZ {
+ float * mem;
+ int n;
+ float * x() { return mem;}
+ float * y() { return x()+n;}
+ float * z() { return y()+n;}
+ };
+ 
+ inline
+ void sum(float * x, float * y, float * z, int n) {
+ for (int i=0;i!=n; ++i)
+   x[i]=y[i]+z[i];
+ }
+ 
+ void sumS() {
+ XYZ xyz; xyz.mem=mem; xyz.n=N;
+ sum(xyz.x(),xyz.y(),xyz.z(),xyz.n);
+ }
+ 
+ // { dg-final { scan-tree-dump-not "run-time aliasing" "vect" } }
+ // { dg-final { cleanup-tree-dump "vect" } }


Re: New warning for expanded vector operations

2011-10-13 Thread Mike Stump
On Oct 12, 2011, at 2:37 PM, Artem Shinkarov wrote:
> This patch fixed PR50704.
> 
> gcc/testsuite:
>* gcc.target/i386/warn-vect-op-3.c: Exclude ia32 target.
>* gcc.target/i386/warn-vect-op-1.c: Ditto.
>* gcc.target/i386/warn-vect-op-2.c: Ditto.
> 
> Ok for trunk?

Ok.  Is this x32 clean?  :-)  If not, HJ will offer an even better spelling.


Re: [PATCH] Fix number of arguments in call to alloca_with_align

2011-10-13 Thread Richard Guenther
On Wed, Oct 12, 2011 at 9:24 PM, Tom de Vries  wrote:
> Richard,
>
> This patch fixes a trivial problem in gimplify_parameters, introduced by the
> patch that introduced BUILT_IN_ALLOCA_WITH_ALIGN.
> BUILT_IN_ALLOCA_WITH_ALIGN has 2 parameters, so the number of arguments in the
> corresponding build_call_expr should be 2, not 1.
>
> Bootstrapped and reg-tested (including Ada) on x86_64.
>
> OK for trunk?

Yes.  Qualifies as obvious anyway ;)

Thanks,
Richard.


> Thanks,
> - Tom
>
>
> 2011-10-12  Tom de Vries  
>
>        * function.c (gimplify_parameters): Set number of arguments of call to
>        BUILT_IN_ALLOCA_WITH_ALIGN to 2.
>
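
For reference, the user-visible GCC builtin takes the same two arguments, a size and an alignment.  The sketch below assumes a GCC-compatible compiler; stack_buffer_is_aligned is a hypothetical helper and has nothing to do with the internal build_call_expr call fixed by the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Allocate 'n' bytes on the stack with 32-byte alignment and report
   whether the returned address honours it.  The second argument of
   __builtin_alloca_with_align is the alignment in bits and must be a
   constant. */
int stack_buffer_is_aligned(size_t n)
{
  assert(n > 0);
  unsigned char *p = __builtin_alloca_with_align(n, 256);
  memset(p, 0, n);               /* touch the memory so it is really used */
  return ((uintptr_t) p % 32) == 0;
}
```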


Re: RFC: Add ADD_RESTRICT tree code

2011-10-13 Thread Richard Guenther
On Thu, Oct 13, 2011 at 10:41 AM, Jakub Jelinek  wrote:
> On Thu, Oct 13, 2011 at 01:38:44AM +0200, Michael Matz wrote:
>> IMO reading the standard to allow an access to be
>> based "on s.p _as well as_ t->p" and that this should result in any
>> sensible behaviour regarding restrict is interpreting too much into it.
>
> No.  Because s.p and t->p designates (if the wrapper function returns
> the address it was passed as the first argument, otherwise may designate)
> the same object.  And the based on P relation is basing on objects,
> not expressions - "expression E based on object P" in the standard.
>
>> Let's do away with the fields, trying to capture the core of the
>> disagreement.  What you seem to be saying is that this code is
>> well-defined and shouldn't return 1:
>>
>> int foo (int * _a, int * _b)
>> {
>>   int * restrict a = _a;
>>   int * restrict b = _b;
>>   int * restrict *pa = wrap (&a);
>>   *pa = _b;         // 1
>>   *a = 0;
>>   **pa = 1;
>>   return *a;
>> }
>
> This is valid.  *pa and a expressions designate (if wrap returns the passed
> in pointer) the same object, pa itself is not restrict, thus it is fine
> if both of those expressions are used to access the object a.
> The store to the restrict object (// 1) is fine, the standard has only
> restrictions when you assign to the restrict pointer object a value based
> on another restrict pointer (then the inner block resp. return from block
> rules apply), in the above case _b is either not based on any restrict
> pointer, or could be based on a restrict pointer associated with some outer
> block (caller).  Both *a and **pa lvalues have (or may have) address based
> on the same restricted pointer object (a).
>
> Of course if you change the above to
> int * restrict * restrict pa = wrap (&a);
> the testcase would be invalid, because then accesses to a would be done
> through both expression based on the restricted pointer pa and through a
> directly in the same block.  So, you can disambiguate based on
> int *restrict*pa or field restrict, but only if you can first disambiguate
> that *pa and a is not actually the same object.  That disambiguation can
> be through restrict pa or some other means (PTA/IPA-PTA usual job).
>
>> I think that would go straight against the intent of restrict.  I'd read
>> the standard as making the above trick undefined.
>
> Where exactly?
>
>> > Because, if you change t->p (or s.p) at some point in between t->p = q;
>> > and s.p[0]; (i.e. prior to the access) to point to a copy of the array,
>> > both s.p and t->p change.
>>
>> Yes, but the question is, if the very modification of t->p was valid to
>> start with.  In my example above insn 1 is a funny way to write "a = _b",
>> i.e. reassigning the already set restrict pointer a to the one that also
>> is already in b.  Simplifying the above then leads to:
>>
>> int foo (int * _a, int * _b)
>> {
>>   int * restrict a = _a;
>>   int * restrict b = _b;
>>   a = _b;
>>   *a = 0;
>>   *b = 1;
>>   return *a;
>> }
>>
>> which I think is undefined because of the fourth clause (multiple
>> modifying accesses to the same underlying object X need to go through one
>> particular restrict chain).
>
> Yes, this one is undefined, *_b object is modified
> and accessed here through lvalues based on different restrict pointer
> objects (a and b).  Note that in the earlier testcase, although
> you have int * restrict b = _b; there, nothing is accessed through lvalue
> based on b, unlike here.
>
>> Seen from another perspective your reading would introduce an
>> inconsistency with composition.  Let's assume we have this function
>> available:
>>
>> int tail (int * restrict a, int * restrict b) {
>>   *a = 0;
>>   *b = 1;
>>   return *a;
>> }
>>
>> Clearly we can optimize this into { *a=0;*b=1;return 0; } without
>> looking at the context.  Now write the testcase or my example above in
>
> Sure.
>
>> terms of that function:
>>
>> int goo (int *p, int *q)
>> {
>>   struct S s, *t;
>>   s.a = 1;
>>   s.p = p;       // 1
>>   t = wrap(&s);  // 2 t=&s in effect, but GCC doesn't see this
>>   t->p = q;      // 3
>>   return tail (s.p, t->p);
>> }
>>
>> Now we get the same behaviour of returning a zero.  Something must be
>> undefined here, and it's not in tail itself.  It's either the call of
>> tail, the implicit modification of s.p with writes to t->p or the
>> existence of two separate restrict pointers of the same value.  I think
>> the production of two separate equal-value restrict pointers via
>> indirect modification is the undefinedness, and _if_ the standard can be
>> read in a way that this is supposed to be valid then it needs to be
>> clarified to not allow that anymore.
>
> This is undefined, the undefined behavior happens when running the tail.
> The same object X (*q) is modified and accessed through lvalue based on
> restricted pointer object a as well as through lvalue based on restricted
> pointer b.
>
>> I believe the standard should say something to the effec

Re: New warning for expanded vector operations

2011-10-13 Thread Richard Guenther
On Thu, Oct 13, 2011 at 10:59 AM, Mike Stump  wrote:
> On Oct 12, 2011, at 2:37 PM, Artem Shinkarov wrote:
>> This patch fixed PR50704.
>>
>> gcc/testsuite:
>>        * gcc.target/i386/warn-vect-op-3.c: Exclude ia32 target.
>>        * gcc.target/i386/warn-vect-op-1.c: Ditto.
>>        * gcc.target/i386/warn-vect-op-2.c: Ditto.
>>
>> Ok for trunk?
>
> Ok.  Is this x32 clean?  :-)  If not, HJ will offer an even better spelling.

I suppose you instead want sth like

{ dg-require-effective-target lp64 }

?


Re: RFC: Add ADD_RESTRICT tree code

2011-10-13 Thread Jakub Jelinek
On Thu, Oct 13, 2011 at 11:21:58AM +0200, Richard Guenther wrote:
> I suggested that for a final patch we only add ADD_RESTRICT in the
> gimplifier for restrict qualified parameters, to make the inlining case
> work again.  ADD_RESTRICTs for casts to restrict qualified pointers
> I would add at parsing time, exactly when a literal cast to a
> restrict qualified pointer from a non-restrict qualified pointer happens
> in the source.  Everything else sounds too fragile IMHO.

I'd sum up my previous mail as noting that restricted pointers are objects,
so restrict is not property of expressions.  So e.g. I don't think
we should add ADD_RESTRICT (or, at least, not an ADD_RESTRICT with different
tag) on every assignment to a restrict pointer object.
E.g. the restrict tag for cases we don't handle yet currently
(GLOBAL_RESTRICT/PARM_RESTRICT) could be based on a DECL_UID (for fields
on FIELD_DECL uid, I think we are not duplicating the fields anywhere,
for VAR_DECLs based on DECL_ABSTRACT_ORIGIN's DECL_UID?) - by based mean
map those uids using some hash table into the uids of the artificial
restrict vars or something similar.
We probably can't use restrict when we have t->p where p is restrict field
or int *restrict *q with *q directly, it would be PTA's job to find that
out.  Of course if t above is restrict too, it could be handled, as well as
int *restrict *restrict q with *q.

Jakub


Re: New warning for expanded vector operations

2011-10-13 Thread Artem Shinkarov
On Thu, Oct 13, 2011 at 10:23 AM, Richard Guenther
 wrote:
> On Thu, Oct 13, 2011 at 10:59 AM, Mike Stump  wrote:
>> On Oct 12, 2011, at 2:37 PM, Artem Shinkarov wrote:
>>> This patch fixed PR50704.
>>>
>>> gcc/testsuite:
>>>        * gcc.target/i386/warn-vect-op-3.c: Exclude ia32 target.
>>>        * gcc.target/i386/warn-vect-op-1.c: Ditto.
>>>        * gcc.target/i386/warn-vect-op-2.c: Ditto.
>>>
>>> Ok for trunk?
>>
>> Ok.  Is this x32 clean?  :-)  If not, HJ will offer an even better spelling.
>
> I suppose you instead want sth like
>
> { dg-require-effective-target lp64 }
>
> ?
>

See our discussion with HJ here:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50704
/* { dg-do compile { target { ! { ia32 } } } } */ was his idea.  Since
x32 sets UNITS_PER_WORD to 8, these tests should work fine.

Artem.
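
The practical difference between the two selectors shows up on x32.  The sketch below is a simplified model, assuming that "ia32" means 32-bit long and pointers on the 32-bit instruction set and that "lp64" means 64-bit long and pointers; struct target and the helper names are hypothetical and do not reproduce the DejaGnu implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified description of an x86 configuration (hypothetical model). */
struct target {
  bool x86_64_isa;    /* 64-bit instruction set in use */
  int long_bits;
  int pointer_bits;
};

/* Model of "lp64": long and pointers are both 64 bits wide. */
static bool is_lp64(struct target t)
{
  return t.long_bits == 64 && t.pointer_bits == 64;
}

/* Model of "ia32": 32-bit long and pointers on the 32-bit instruction set. */
static bool is_ia32(struct target t)
{
  return !t.x86_64_isa && t.long_bits == 32 && t.pointer_bits == 32;
}

/* Would a test guarded by { target { ! { ia32 } } } run? */
bool runs_with_not_ia32(struct target t)
{
  assert(t.pointer_bits == 32 || t.pointer_bits == 64);
  return !is_ia32(t);
}

/* Would a test guarded by dg-require-effective-target lp64 run? */
bool runs_with_lp64(struct target t)
{
  assert(t.pointer_bits == 32 || t.pointer_bits == 64);
  return is_lp64(t);
}

const struct target TARGET_IA32   = { false, 32, 32 };
const struct target TARGET_X32    = { true,  32, 32 };
const struct target TARGET_X86_64 = { true,  64, 64 };
```

Under these assumptions both guards skip ia32 and both run on x86_64, but only the "! ia32" form keeps the tests enabled on x32.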


Re: [PATCH] Mark static const strings as read-only.

2011-10-13 Thread Tom de Vries
On 10/12/2011 10:13 AM, Tom de Vries wrote:
> On 10/10/2011 05:50 PM, Eric Botcazou wrote:
>>> So, the patch for build_constant_desc does not have the desired effect.
>>
>> OK, too bad that we need to play this back-and-forth game with MEMs.  So the 
>> original patch is OK (with TREE_READONLY (base) on the next line to mimic 
>> what 
>> is done just above and without the gcc/ prefix in the ChangeLog).  If you 
>> have 
>> some available cycles, you can test and install the build_constant_desc 
>> change 
>> in the same commit, otherwise I'll do it myself.
>>
> 
> I'll include the build_constant_desc change in a bootstrap/reg-test on x86_64.
> 
> Thanks,
> - Tom

No problems found in bootstrap/reg-test on x86_64. Committed.

Thanks,
- Tom

2011-10-13  Tom de Vries  

* emit-rtl.c (set_mem_attributes_minus_bitpos): Set MEM_READONLY_P
for static const strings.
* varasm.c (build_constant_desc): Generate the memory location of the
constant using gen_const_mem.

* gcc.dg/memcpy-4.c: New test.
Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c (revision 179773)
+++ gcc/emit-rtl.c (working copy)
@@ -1696,6 +1696,12 @@ set_mem_attributes_minus_bitpos (rtx ref
 	  && !TREE_THIS_VOLATILE (base))
 	MEM_READONLY_P (ref) = 1;
 
+  /* Mark static const strings readonly as well.  */
+  if (base && TREE_CODE (base) == STRING_CST
+	  && TREE_READONLY (base)
+	  && TREE_STATIC (base))
+	MEM_READONLY_P (ref) = 1;
+
   /* If this expression uses it's parent's alias set, mark it such
 	 that we won't change it.  */
   if (component_uses_parent_alias_set (t))
Index: gcc/varasm.c
===
--- gcc/varasm.c (revision 179773)
+++ gcc/varasm.c (working copy)
@@ -3119,7 +3119,7 @@ build_constant_desc (tree exp)
   SET_SYMBOL_REF_DECL (symbol, decl);
   TREE_CONSTANT_POOL_ADDRESS_P (symbol) = 1;
 
-  rtl = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (exp)), symbol);
+  rtl = gen_const_mem (TYPE_MODE (TREE_TYPE (exp)), symbol);
   set_mem_attributes (rtl, exp, 1);
   set_mem_alias_set (rtl, 0);
   set_mem_alias_set (rtl, const_alias_set);
Index: gcc/testsuite/gcc.dg/memcpy-4.c
===
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/memcpy-4.c (revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+
+void
+f1 (char *p)
+{
+  __builtin_memcpy (p, "123", 3);
+}
+
+/* { dg-final { scan-rtl-dump-times "mem/s/u" 3 "expand" { target mips*-*-* } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */


Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini

On 10/13/2011 01:04 AM, Richard Kenner wrote:
> I still don't like the patch, but I'm no longer as familiar with the code
> as I used to be so can't suggest a replacement.  Let's see what others
> think about it.

Same here, I don't like it but I hardly see any alternative.  The only 
possibility could be to prevent calling expand_compound_operation 
completely for addresses.  Richard, what do you think?  Don't worry, 
combine hasn't changed much since your days. :)


Paolo


[Ada] Correct error handling in Initialize

2011-10-13 Thread Arnaud Charlet
This change fixes the error handling circuitry in the initialization routine
for suspension objects so that Storage_Error is propagated as intended if
the allocation of the underlying OS entities fails.

No test (requires system resource allocation failure).

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Thomas Quinot  

* s-taprop-posix.adb (Initialize): Always raise Storage_Error
if we fail to initialize CV attributes or CV.

Index: s-taprop-posix.adb
===
--- s-taprop-posix.adb  (revision 179894)
+++ s-taprop-posix.adb  (working copy)
@@ -1089,9 +1089,7 @@
  Result := pthread_mutex_destroy (S.L'Access);
  pragma Assert (Result = 0);
 
- if Result = ENOMEM then
-raise Storage_Error;
- end if;
+ raise Storage_Error;
   end if;
 
   Result := pthread_cond_init (S.CV'Access, Cond_Attr'Access);
@@ -1101,11 +1099,10 @@
  Result := pthread_mutex_destroy (S.L'Access);
  pragma Assert (Result = 0);
 
- if Result = ENOMEM then
-Result := pthread_condattr_destroy (Cond_Attr'Access);
-pragma Assert (Result = 0);
-raise Storage_Error;
- end if;
+ Result := pthread_condattr_destroy (Cond_Attr'Access);
+ pragma Assert (Result = 0);
+
+ raise Storage_Error;
   end if;
 
   Result := pthread_condattr_destroy (Cond_Attr'Access);


[Ada] Fix runtime assertion failure in timed selective wait

2011-10-13 Thread Arnaud Charlet
This change fixes an improper use of Defer_Abort where
Defer_Abort_Nestable is meant, which would cause a failed assertion
if a timed selective accept statement occurs when there already is
a pending call to the accepted entry.
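
The distinction can be sketched with a toy nesting counter in C.  This is only a model, assuming (as the ChangeLog entry suggests) that the non-nestable routine expects abort not to be deferred yet; struct task and the function names are hypothetical stand-ins for the runtime's Ada code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy task record: only the abort-deferral nesting level is modelled. */
struct task {
  int deferral_level;
};

/* Non-nestable variant: only correct when abort is not yet deferred.
   Returns false where a runtime built with assertions would fail. */
bool defer_abort(struct task *t)
{
  assert(t != NULL);
  if (t->deferral_level != 0)
    return false;
  t->deferral_level = 1;
  return true;
}

/* Nestable variant: safe at any nesting level. */
void defer_abort_nestable(struct task *t)
{
  assert(t != NULL);
  t->deferral_level++;
}

/* Scenario from the fix: abort is already deferred when the accept
   alternative is selected. */
bool nested_defer_succeeds(bool use_nestable)
{
  struct task self = { 0 };
  defer_abort_nestable(&self);     /* outer deferral already in effect */
  if (use_nestable) {
    defer_abort_nestable(&self);
    return self.deferral_level == 2;
  }
  return defer_abort(&self);
}
```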

The following program must compile and execute quietly:

with Ada.Exceptions;
with Ada.Text_IO; use Ada.Text_IO;

procedure Call_Then_Accept is
   task Caller is
  entry Start;
   end Caller;

   task Callee is
  entry Start;
  entry With_Body;
   end Callee;

   task body Caller is
   begin
  accept Start do
 null;
  end Start;
  Callee.With_Body;
   end Caller;

   task body Callee is
  Called : Boolean := False;
   begin
  accept Start do
 null;
  end Start;
  select
 delay 10.0;
  or
 accept With_Body do
Called := True;
 end With_Body;
  end select;
   exception
  when E : others =>
 Put_Line ("Callee: got " & Ada.Exceptions.Exception_Information (E));
   end Callee;

begin
   Caller.Start;
   delay 0.1;
   Callee.Start;
end Call_Then_Accept;

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Thomas Quinot  

* s-tasren.adb (Timed_Selective_Wait, case
Accept_Alternative_Selected): Use Defer_Abort_Nestable, since
we know abortion is already deferred.

Index: s-tasren.adb
===
--- s-tasren.adb(revision 179894)
+++ s-tasren.adb(working copy)
@@ -1502,7 +1502,7 @@
 --  Null_Body. Defer abort until it gets into the accept body.
 
 Uninterpreted_Data := Self_Id.Common.Call.Uninterpreted_Data;
-Initialization.Defer_Abort (Self_Id);
+Initialization.Defer_Abort_Nestable (Self_Id);
 STPO.Unlock (Self_Id);
 
  when Accept_Alternative_Completed =>


[Ada] Factoring duplicated code

2011-10-13 Thread Arnaud Charlet
This change factors a chunk of code that was duplicated between
Par.Ch2.P_Identifier and Par.Ch3.P_Defining_Identifier.

No behaviour change, no test.

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Thomas Quinot  

* par-ch2.adb, par.adb, par-util.adb, par-ch3.adb
(Check_Future_Identifier): New subprogram,
factors duplicated code from Par.Ch2.P_Identifier and
Par.Ch3.P_Defining_Identifier.
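
The decision logic of the factored-out check can be summarised by the following C sketch.  It is an illustrative model, not a translation of the parser: future_keyword_version and enum ada_version are hypothetical names, tokens are assumed to be lower-case strings, and the Warn_On_Ada_2005_Compatibility and Warn_On_Ada_2012_Compatibility switches are assumed to be enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum ada_version { ADA_83, ADA_95, ADA_2005, ADA_2012 };

/* Returns the later language version in which 'token' becomes a
   reserved word and a warning is due, or 'current' if no warning
   applies. */
enum ada_version future_keyword_version(enum ada_version current,
                                        const char *token,
                                        bool follows_pragma)
{
  assert(token != NULL);

  /* Ada 2005 reserved words, reported only in Ada 95 mode. */
  if (current == ADA_95) {
    if (strcmp(token, "overriding") == 0
        || strcmp(token, "synchronized") == 0
        || (strcmp(token, "interface") == 0 && !follows_pragma))
      return ADA_2005;
  }

  /* Ada 2012 reserved word, reported in Ada 95 and Ada 2005 modes. */
  if (current == ADA_95 || current == ADA_2005) {
    if (strcmp(token, "some") == 0)
      return ADA_2012;
  }

  /* Ada 83 mode, or no match: nothing to report. */
  return current;
}
```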

Index: par-ch2.adb
===
--- par-ch2.adb (revision 179894)
+++ par-ch2.adb (working copy)
@@ -62,34 +62,7 @@
   --  Code duplication, see Par_Ch3.P_Defining_Identifier???
 
   if Token = Tok_Identifier then
-
- --  Shouldn't the warnings below be emitted when in Ada 83 mode???
-
- --  Ada 2005 (AI-284): If compiling in Ada 95 mode, we warn that
- --  INTERFACE, OVERRIDING, and SYNCHRONIZED are new reserved words.
-
- if Ada_Version = Ada_95
-   and then Warn_On_Ada_2005_Compatibility
- then
-if Token_Name = Name_Overriding
-  or else Token_Name = Name_Synchronized
-  or else (Token_Name = Name_Interface
-and then Prev_Token /= Tok_Pragma)
-then
-   Error_Msg_N ("& is a reserved word in Ada 2005?", Token_Node);
-end if;
- end if;
-
- --  Similarly, warn about Ada 2012 reserved words
-
- if Ada_Version in Ada_95 .. Ada_2005
-   and then Warn_On_Ada_2012_Compatibility
- then
-if Token_Name = Name_Some then
-   Error_Msg_N ("& is a reserved word in Ada 2012?", Token_Node);
-end if;
- end if;
-
+ Check_Future_Keyword;
  Ident_Node := Token_Node;
  Scan; -- past Identifier
  return Ident_Node;
Index: par.adb
===
--- par.adb (revision 179894)
+++ par.adb (working copy)
@@ -1156,6 +1156,11 @@
   --  mode. The caller has typically checked that the current token,
   --  an identifier, matches one of the 95 keywords.
 
+  procedure Check_Future_Keyword;
+  --  Emit a warning if the current token is a valid identifier in the
+  --  language version in use, but is a reserved word in a later language
+  --  version (unless the language version in use is Ada 83).
+
   procedure Check_Simple_Expression (E : Node_Id);
   --  Given an expression E, that has just been scanned, so that Expr_Form
   --  is still set, outputs an error if E is a non-simple expression. E is
Index: par-util.adb
===
--- par-util.adb(revision 179894)
+++ par-util.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2010, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2011, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -169,6 +169,43 @@
end Check_Bad_Layout;
 
--
+   -- Check_Future_Keyword --
+   --
+
+   procedure Check_Future_Keyword is
+   begin
+  --  Ada 2005 (AI-284): Compiling in Ada95 mode we warn that INTERFACE,
+  --  OVERRIDING, and SYNCHRONIZED are new reserved words.
+
+  if Ada_Version = Ada_95
+and then Warn_On_Ada_2005_Compatibility
+  then
+ if Token_Name = Name_Overriding
+   or else Token_Name = Name_Synchronized
+   or else (Token_Name = Name_Interface
+ and then Prev_Token /= Tok_Pragma)
+ then
+Error_Msg_N ("& is a reserved word in Ada 2005?", Token_Node);
+ end if;
+  end if;
+
+  --  Similarly, warn about Ada 2012 reserved words
+
+  if Ada_Version in Ada_95 .. Ada_2005
+and then Warn_On_Ada_2012_Compatibility
+  then
+ if Token_Name = Name_Some then
+Error_Msg_N ("& is a reserved word in Ada 2012?", Token_Node);
+ end if;
+  end if;
+
+  --  Note: we deliberately do not emit these warnings when operating in
+  --  Ada 83 mode because in that case we assume the user is building
+  --  legacy code anyway.
+
+   end Check_Future_Keyword;
+
+   --
-- Check_Misspelling_Of --
--
 
Index: par-ch3.adb
===
--- par-ch3.adb (revision 179894)
+++ par-c

[Ada] Checks on intrinsic operators

2011-10-13 Thread Arnaud Charlet
An operator can be declared Import (Intrinsic) only if the current view of the
operand type(s) is a numeric type. With this patch the compiler properly
rejects the pragma if the operand type is private or incomplete.

Compiling mysystem.ads must yield:

   mysystem.ads:3:13: intrinsic operator can only apply to numeric types
   mysystem.ads:7:13: intrinsic operator can only apply to numeric types
   mysystem.ads:7:18: invalid use of incomplete type "Self"

---
package Mysystem is
   type A is private;
   function "<"  (Left, Right : A) return Boolean;
   pragma Import (Intrinsic, "<");

   type Self;
   function "+" (X, Y : Self) return Boolean;
   pragma Import (Intrinsic, "+");
   type Self is tagged null record;
private
   type A is mod 2 ** 32;
end Mysystem;

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Ed Schonberg  

* sem_intr.adb (Check_Intrinsic_Operator): Check that type
is fully defined before checking that it is a numeric type.

Index: sem_intr.adb
===
--- sem_intr.adb(revision 179894)
+++ sem_intr.adb(working copy)
@@ -317,7 +317,11 @@
  return;
   end if;
 
-  if not Is_Numeric_Type (Underlying_Type (T1)) then
+  --  The type must be fully defined and numeric.
+
+  if No (Underlying_Type (T1))
+or else not Is_Numeric_Type (Underlying_Type (T1))
+  then
  Errint ("intrinsic operator can only apply to numeric types", E, N);
   end if;
end Check_Intrinsic_Operator;


[Ada] Box associations in record aggregates

2011-10-13 Thread Arnaud Charlet
If a component association for component X has a box, then X is covered in the
aggregate even if there is no default value for X in the type declaration, and
X has to be default-initialized. If the aggregate also has an others clause, X
is not covered by it.

The following must compile quietly in gnat05 mode:

procedure P is
   type R is record
  X : Integer;
  Y : Boolean;
   end record;
   Z : R;
begin
   Z := (X => <>, others => True);
end P;

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Ed Schonberg  

* sem_aggr.adb (Resolve_Record_Aggregate): If a component
association for component X has a box, then X is covered in the
aggregate even if there is not default value for X in the type
declaration, and X has to be default-initialized.

Index: sem_aggr.adb
===
--- sem_aggr.adb(revision 179894)
+++ sem_aggr.adb(working copy)
@@ -3121,6 +3121,13 @@
 
 Expr := New_Copy_Tree (Expression (Parent (Compon)));
 
+--  Component may have no default, in which case the
+--  expression is empty and the component is default-
+--  initialized, but an association for the component
+--  exists, and it is not covered by an others clause.
+
+return Expr;
+
  else
 if Present (Next (Selector_Name)) then
Expr := New_Copy_Tree (Expression (Assoc));


[Ada] Referenced enumeration literals in attributes.

2011-10-13 Thread Arnaud Charlet
When an enumeration type appears in an attribute reference, all literals of
the type are marked as referenced. This must only be done if the attribute
reference appears in the current source. Else the information on references
may differ between a normal compilation and one that performs inlining.

No simple test available.

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Ed Schonberg  

* sem_attr.adb (Check_Enum_Image, Analyze_Attribute case
'Value): Mark literals as referenced only if reference is in
current source unit.

Index: sem_attr.adb
===
--- sem_attr.adb(revision 179894)
+++ sem_attr.adb(working copy)
@@ -264,6 +264,10 @@
   --  If the prefix type is an enumeration type, set all its literals
   --  as referenced, since the image function could possibly end up
   --  referencing any of the literals indirectly. Same for Enum_Val.
+  --  Set the flag only if the reference is in the main code unit. Same
+  --  restriction when resolving 'Value; otherwise an improperly set
+  --  reference when analyzing an inlined body will lose a proper warning
+  --  on a useless with_clause.
 
   procedure Check_Fixed_Point_Type;
   --  Verify that prefix of attribute N is a fixed type
@@ -1226,7 +1230,9 @@
   procedure Check_Enum_Image is
  Lit : Entity_Id;
   begin
- if Is_Enumeration_Type (P_Base_Type) then
+ if Is_Enumeration_Type (P_Base_Type)
+   and then In_Extended_Main_Code_Unit (N)
+ then
 Lit := First_Literal (P_Base_Type);
 while Present (Lit) loop
Set_Referenced (Lit);
@@ -5031,7 +5037,9 @@
 
  --  Case of enumeration type
 
- if Is_Enumeration_Type (P_Type) then
+ if Is_Enumeration_Type (P_Type)
+   and then In_Extended_Main_Code_Unit (N)
+ then
 Check_Restriction (No_Enumeration_Maps, N);
 
 --  Mark all enumeration literals as referenced, since the use of


[Ada] Unchecked union types can be limited

2011-10-13 Thread Arnaud Charlet
This patch removes an improper check on the type to which the pragma
Unchecked_Union applies. Such a type can be limited.
The following must compile quietly:

package UU is
type Val is (One, Two);

type T (X : Val := One) is limited record
   case  X is
  when One => A : Long_Long_Integer;
  when Two => B : Boolean;
   end case;
end record;
pragma Unchecked_Union (T);
end UU;

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Ed Schonberg  

* sem_prag.adb (Analyze_Pragma, case Unchecked_Union): an
unchecked union type can be limited.

Index: sem_prag.adb
===
--- sem_prag.adb(revision 179894)
+++ sem_prag.adb(working copy)
@@ -13762,12 +13762,6 @@
Error_Msg_N ("Unchecked_Union must not be tagged", Typ);
return;
 
-elsif Is_Limited_Type (Typ) then
-   Error_Msg_N
- ("Unchecked_Union must not be limited record type", Typ);
-   Explain_Limited_Type (Typ, Typ);
-   return;
-
 else
if not Has_Discriminants (Typ) then
   Error_Msg_N


[Ada] Conditional and case expressions are legal return values

2011-10-13 Thread Arnaud Charlet
In Ada 2005, several constructs of a limited type can be built in place, and
as such can be returned by function calls. In Ada 2012, the new expression forms
conditional expressions and case expressions can also be built in place if each
of their dependent expressions can be built in place.
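
For illustration only (not part of the patch): the rule above is recursive, and a minimal C sketch of it follows. The node kinds, the struct layout and the name can_build_in_place are all hypothetical; the real check is OK_For_Limited_Init_In_05 in sem_ch3.adb and covers more cases than shown here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified expression kinds. */
enum kind { K_AGGREGATE, K_FUNCTION_CALL, K_IF_EXPR, K_CASE_EXPR, K_OTHER };

struct expr {
    enum kind kind;
    const struct expr *const *deps; /* dependent expressions, if any */
    size_t ndeps;
};

/* A conditional or case expression qualifies only if every one of its
   dependent expressions qualifies; the check recurses into them. */
bool can_build_in_place(const struct expr *e)
{
    switch (e->kind) {
    case K_AGGREGATE:
    case K_FUNCTION_CALL:
        return true;
    case K_IF_EXPR:
    case K_CASE_EXPR:
        for (size_t i = 0; i < e->ndeps; i++) {
            if (!can_build_in_place(e->deps[i])) {
                return false;
            }
        }
        return true;
    default:
        return false;
    }
}
```

A single dependent expression that does not qualify makes the whole conditional or case expression fail the check, however deeply it is nested.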

The following must compile and execute quietly:

---
procedure Lim is
   type T is limited record
  Value  : Integer;
   end record;

   function Build (X : integer) return T is
   begin
  return (if X > 10 then (Value => X / 2) else  (Value => 0));
   end;
   THing : T := Build (12);

   function Build2 (X : Integer) return T is
   begin
 return (case X is
   when 1..10  =>  (value => 2 * X),
   when others => (Value => 0));
   end;

  Thing2 : T := Build2 (9);
begin
   if THing.Value /= 6 then
  raise Program_Error;
   end if;

   if THing2.Value /= 18 then
  raise Program_Error;
   end if;
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Ed Schonberg  

* sem_ch3.adb (OK_For_Limited_Init_In_05): Conditional and case
expressions are legal limited return values if each one of their
dependent expressions are legal.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 179898)
+++ sem_ch3.adb (working copy)
@@ -16889,6 +16889,38 @@
  when N_Attribute_Reference =>
 return Attribute_Name (Original_Node (Exp)) = Name_Input;
 
+ --  For a conditional expression, all dependent expressions must be
+ --  legal constructs.
+
+ when N_Conditional_Expression =>
+declare
+   Then_Expr : constant Node_Id :=
+ Next
+   (First (Expressions (Original_Node (Exp;
+   Else_Expr : constant Node_Id := Next (Then_Expr);
+
+begin
+   return OK_For_Limited_Init_In_05 (Typ, Then_Expr)
+ and then OK_For_Limited_Init_In_05 (Typ, Else_Expr);
+end;
+
+ when N_Case_Expression =>
+declare
+   Alt : Node_Id;
+
+begin
+   Alt := First (Alternatives (Original_Node (Exp)));
+   while Present (Alt) loop
+  if not OK_For_Limited_Init_In_05 (Typ, Expression (Alt)) then
+ return False;
+  end if;
+
+  Next (Alt);
+   end loop;
+
+   return True;
+end;
+
  when others =>
 return False;
   end case;


[Ada] Qualified expressions and Code statements in Ada 2012

2011-10-13 Thread Arnaud Charlet
In Ada 2012 a qualified expression is a valid name, and for example a function
call that is disambiguated by means of a qualification can appear in the place
of a constant object. On the other hand, a qualified expression that appears as
a statement denotes a machine code insertion. With the new rule, a qualified
expression by itself is parsed as a parameterless procedure call, and must be
rewritten and analyzed as a code statement.

The following must compile quietly:

 gcc -c -gnat12 -gnatws code_statement.adb

---
WITH MACHINE_CODE;-- N/A => ERROR.
USE MACHINE_CODE;
PROCEDURE code_statement IS

 PROCEDURE CODE IS
 BEGIN
  Asm_Insn'(Asm ("nop"));
 END;

BEGIN
 CODE;
END code_statement;

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Ed Schonberg  

* sem_ch6.adb (Analyze_Procedure_Call_Statement): In Ada 2012 mode,
if the prefix of the call is a qualified expression, rewrite as
a code statement.
* sem_ch13.adb (Analyze_Code_Statement): In Ada 2012 mode, the
code statement is legal if it is a rewriting of a procedure call.

Index: sem_ch6.adb
===
--- sem_ch6.adb (revision 179894)
+++ sem_ch6.adb (working copy)
@@ -1340,6 +1340,15 @@
  Analyze (P);
  Analyze_Call_And_Resolve;
 
+  --  In Ada 2012. a qualified expression is a name, but it cannot be a
+  --  procedure name, so the construct can only be a qualified expression.
+
+  elsif Nkind (P) = N_Qualified_Expression
+and then Ada_Version >= Ada_2012
+  then
+ Rewrite (N, Make_Code_Statement (Loc, Expression => P));
+ Analyze (N);
+
   --  Anything else is an error
 
   else
Index: sem_ch13.adb
===
--- sem_ch13.adb(revision 179894)
+++ sem_ch13.adb(working copy)
@@ -3364,11 +3364,19 @@
 
  --  No statements other than code statements, pragmas, and labels.
  --  Again we allow certain internally generated statements.
+ --  In Ada 2012, qualified expressions are names, and the code
+ --  statement is initially parsed as a procedure call.
 
  Stmt := First (Statements (HSS));
  while Present (Stmt) loop
 StmtO := Original_Node (Stmt);
-if Comes_From_Source (StmtO)
+
+if Ada_Version >= Ada_2012
+  and then Nkind (StmtO) = N_Procedure_Call_Statement
+then
+   null;
+
+elsif Comes_From_Source (StmtO)
   and then not Nkind_In (StmtO, N_Pragma,
 N_Label,
 N_Code_Statement)


[Ada] Box associations for components without defaults in aggregates

2011-10-13 Thread Arnaud Charlet
Box associations are used to initialize aggregate components through the default
value of the corresponding components, or through calls to initialization
procedures. In general aggregates with such initializations cannot be built
statically. With this patch the following must compile quietly:

   gcc -c -gnat05 p.adb

---
procedure P is
   type A1 is array (1 .. 2, 1 .. 2) of Integer;
   A : A1;
begin
   A := ((1 => 2, others => <>), (others => 0));
end P;

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Ed Schonberg  

* exp_aggr.adb (Flatten): If a component association has a box,
assume that aggregate is not static.
(Safe_Aggregate): If a component association in a non-limited
aggregate has a box, assume that it cannot be expanded in place.

Index: exp_aggr.adb
===
--- exp_aggr.adb(revision 179894)
+++ exp_aggr.adb(working copy)
@@ -3398,6 +3398,15 @@
 begin
Assoc := First (Component_Associations (N));
while Present (Assoc) loop
+
+  --  If this is a box association, flattening is in general
+  --  not possible because at this point we cannot tell if the
+  --  default is static or even exists.
+
+  if Box_Present (Assoc) then
+ return False;
+  end if;
+
   Choice := First (Choices (Assoc));
 
   while Present (Choice) loop
@@ -4148,6 +4157,12 @@
 return False;
  end if;
 
+  --  If association has a box, no way to determine yet
+  --  whether default can be assigned in place.
+
+  elsif Box_Present (Expr) then
+ return False;
+
   elsif not Safe_Component (Expression (Expr)) then
  return False;
   end if;


[Ada] Unknown attribute in project member of aggregate project

2011-10-13 Thread Arnaud Charlet
If there was an unknown attribute (such as IDE'Gnat) in any member
project of an aggregate project, then gprbuild failed when it was invoked on
the aggregate project.

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Vincent Celier  

* prj-conf.adb (Get_Or_Create_Configuration_File): Call
Process_Project_Tree_Phase_1 with Packages_To_Check.
(Process_Project_And_Apply_Config): Ditto
* prj-part.ads, prj-part.adb, prj-pars.ads, prj-pars.adb (Parse):
Remove default for argument Packages_To_Check.
* prj-proc.adb (Recursive_Process): New argument
Packages_To_Check.
(Process): Ditto.
(Process_Project_Tree_Phase_1): Ditto.
(Recursive_Project.Process_Aggregated_Projects): Call
Prj.Part.Parse and Process_Project_Tree_Phase_1 with
Packages_To_Check.
* prj-proc.ads (Process): New argument Packages_To_Check
(Process_Project_Tree_Phase_1): Ditto

Index: prj-proc.adb
===
--- prj-proc.adb(revision 179894)
+++ prj-proc.adb(working copy)
@@ -145,6 +145,7 @@
procedure Recursive_Process
  (In_Tree: Project_Tree_Ref;
   Project: out Project_Id;
+  Packages_To_Check  : String_List_Access;
   From_Project_Node  : Project_Node_Id;
   From_Project_Node_Tree : Project_Node_Tree_Ref;
   Env: in out Prj.Tree.Environment;
@@ -1347,6 +1348,7 @@
procedure Process
  (In_Tree: Project_Tree_Ref;
   Project: out Project_Id;
+  Packages_To_Check  : String_List_Access;
   Success: out Boolean;
   From_Project_Node  : Project_Node_Id;
   From_Project_Node_Tree : Project_Node_Tree_Ref;
@@ -1361,6 +1363,7 @@
  From_Project_Node  => From_Project_Node,
  From_Project_Node_Tree => From_Project_Node_Tree,
  Env=> Env,
+ Packages_To_Check  => Packages_To_Check,
  Reset_Tree => Reset_Tree);
 
   if Project_Qualifier_Of
@@ -2325,6 +2328,7 @@
procedure Process_Project_Tree_Phase_1
  (In_Tree: Project_Tree_Ref;
   Project: out Project_Id;
+  Packages_To_Check  : String_List_Access;
   Success: out Boolean;
   From_Project_Node  : Project_Node_Id;
   From_Project_Node_Tree : Project_Node_Tree_Ref;
@@ -2349,6 +2353,7 @@
   Recursive_Process
 (Project=> Project,
  In_Tree=> In_Tree,
+ Packages_To_Check  => Packages_To_Check,
  From_Project_Node  => From_Project_Node,
  From_Project_Node_Tree => From_Project_Node_Tree,
  Env=> Env,
@@ -2482,6 +2487,7 @@
procedure Recursive_Process
  (In_Tree: Project_Tree_Ref;
   Project: out Project_Id;
+  Packages_To_Check  : String_List_Access;
   From_Project_Node  : Project_Node_Id;
   From_Project_Node_Tree : Project_Node_Tree_Ref;
   Env: in out Prj.Tree.Environment;
@@ -2539,6 +2545,7 @@
Recursive_Process
  (In_Tree=> In_Tree,
   Project=> New_Project,
+  Packages_To_Check  => Packages_To_Check,
   From_Project_Node  =>
 Project_Node_Of
   (With_Clause, From_Project_Node_Tree),
@@ -2596,6 +2603,7 @@
 Prj.Part.Parse
   (In_Tree   => From_Project_Node_Tree,
Project   => Loaded_Project,
+   Packages_To_Check => Packages_To_Check,
Project_File_Name => Get_Name_String (List.Path),
Errout_Handling   => Prj.Part.Never_Finalize,
Current_Directory => Get_Name_String (Project.Directory.Name),
@@ -2627,6 +2635,7 @@
   Process_Project_Tree_Phase_1
 (In_Tree=> Tree,
  Project=> List.Project,
+ Packages_To_Check  => Packages_To_Check,
  Success=> Success,
  From_Project_Node  => Loaded_Project,
  From_Project_Node_Tree => From_Project_Node_Tree,
@@ -2638,6 +2647,7 @@
   Process_Project_Tree_Phase_1
 (In_Tree=> Tree,
  Project=> List.Project,
+ Packages_To_Check  => Packages_To_Check,
  Success=> Success,
  From_Project_Node  => Loaded_Project,
  From_Project_Node_Tree => From_Project_Node_Tree,
@@ -2859,6 +2869,7 @@
 Recursive_Process

[Ada] Support for user-defined storage pools in limited function returns

2011-10-13 Thread Arnaud Charlet
This patch fixes a bug in which the global heap was used, even when a
user-defined storage pool had been specified. The bug occurred when the
function result type is immutably limited (so build-in-place is used),
and the result subtype is unconstrained or tagged (so has caller-unknown-size),
and the call site is the initial value for an allocator of an access type with
a user-defined storage pool.

The following test should run silently.

gnatmake -f -gnat05 driver

with Ada.Text_IO;
with S;
with P;

procedure Driver is
begin
   P.Alloc;

   raise Program_Error;

exception
   when S.Pool_Error =>
  null; -- OK
end Driver;

package P is

   procedure Alloc;

end P;

with S;
package body P is

   type T is tagged limited null record;

   function C return T'Class is
   begin
  return T'(null record);
   end C;

   P : S.Test_Pool;
   type T_Access is access T'Class;
   for T_Access'Storage_Pool use P;

   procedure Alloc is
  X : T_Access := new T'Class'(C);
  --  XXX Here Pool_Error must be raised.
   begin
  null;
   end Alloc;

end P;

with System.Storage_Elements;
with System.Storage_Pools;
package S is

   type Test_Pool is
 new System.Storage_Pools.Root_Storage_Pool with null record;

   procedure Allocate
(Pool : in out Test_Pool;
 Storage_Address  :out System.Address;
 Size_In_Storage_Elements : in System.Storage_Elements.Storage_Count;
 Alignment: in System.Storage_Elements.Storage_Count);

   procedure Deallocate
(Pool : in out Test_Pool;
 Storage_Address  : in System.Address;
 Size_In_Storage_Elements : in System.Storage_Elements.Storage_Count;
 Alignment: in System.Storage_Elements.Storage_Count);

   function Storage_Size (Pool : in Test_Pool)
 return System.Storage_Elements.Storage_Count;

   Pool_Error : exception;

end S;

with P;
package body S is

   procedure Allocate
(Pool : in out Test_Pool;
 Storage_Address  :out System.Address;
 Size_In_Storage_Elements : in System.Storage_Elements.Storage_Count;
 Alignment: in System.Storage_Elements.Storage_Count)
   is
   begin
  raise Pool_Error;
   end Allocate;

   procedure Deallocate
(Pool : in out Test_Pool;
 Storage_Address  : in System.Address;
 Size_In_Storage_Elements : in System.Storage_Elements.Storage_Count;
 Alignment: in System.Storage_Elements.Storage_Count)
   is
   begin
  raise Program_Error;
   end Deallocate;

   function Storage_Size (Pool : in Test_Pool)
 return System.Storage_Elements.Storage_Count
   is
   begin
  raise Program_Error;
  return 0;
   end Storage_Size;

end S;

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Bob Duff  

* exp_ch6.ads (BIP_Storage_Pool): New "extra implicit parameter"
that gets passed in the same cases where BIP_Alloc_Form is passed
(caller-unknown-size results). BIP_Storage_Pool is used when
BIP_Alloc_Form = User_Storage_Pool.  In that case, a pointer
to the user-defined storage pool is passed at the call site,
and this pool is used in callee to allocate the result.
* exp_ch6.adb (Add_Unconstrained_Actuals_To_Build_In_Place_Call): New
version of Add_Alloc_Form_Actual_To_Build_In_Place_Call. Passes
the additional BIP_Storage_Pool actual.
(Expand_N_Extended_Return_Statement): Allocate the function
result using the user-defined storage pool, if BIP_Alloc_Form =
User_Storage_Pool.
* sem_ch6.adb: Add the "extra formal" for BIP_Storage_Pool.
* exp_ch4.adb: Don't overwrite storage pool set by
Expand_N_Extended_Return_Statement.
* s-stopoo.ads, rtsfind.ads (Root_Storage_Pool_Ptr): New type,
for use in build-in-place function calls within allocators
where the access type has a user-defined storage pool.

Index: rtsfind.ads
===
--- rtsfind.ads (revision 179894)
+++ rtsfind.ads (working copy)
@@ -1346,6 +1346,7 @@
  RE_Storage_Offset,  -- System.Storage_Elements
  RE_To_Address,  -- System.Storage_Elements
 
+ RE_Root_Storage_Pool_Ptr,   -- System.Storage_Pools
  RE_Allocate_Any,-- System.Storage_Pools
  RE_Deallocate_Any,  -- System.Storage_Pools
  RE_Root_Storage_Pool,   -- System.Storage_Pools
@@ -2542,6 +2543,7 @@
  RE_Storage_Offset   => System_Storage_Elements,
  RE_To_Address   => System_Storage_Elements,
 
+ RE_Root_Storage_Pool_Ptr=> System_Storage_Pools,
  RE_Allocate_Any => System_Storage_Pools,
  RE_Deallocate_Any   => System_St

[Ada] Entity list of for loop for enumeration with rep gets truncated

2011-10-13 Thread Arnaud Charlet
When a for loop for an enumeration type with an enumeration representation
clause is expanded, it's rewritten as a loop over an integer loop parameter
and the original loop parameter is moved into a new nested block. When
the block is analyzed, the moved entity gets appended to the block scope's
entity list (as it should), but is left on the loop scope's entity list,
and no longer has a successor (because its Next_Entity is set to null). This
causes other entities on the loop to get lost from the loop's entity list.
The fix is to remove the loop parameter entity from the loop scope's list,
leaving any other declarations on that list intact (such as an itype for
the loop parameter's subtype). This problem surfaced when generating debugging
info with the GNAAMP compiler, due to encountering a loop parameter's itype
that was not found on any entity list.
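
For illustration only (not part of the patch): the fix amounts to unlinking the head of a singly linked list that also keeps a last pointer. A minimal C sketch, assuming the moved loop parameter is the first entity of the loop scope (the patch asserts this); the struct and function names are hypothetical.

```c
#include <stddef.h>

struct entity {
    struct entity *next;
};

struct scope {
    struct entity *first;
    struct entity *last;
};

/* Unlink the head entity from the scope's list. The remaining entities
   (for example an itype) stay reachable through s->first. */
void unlink_first_entity(struct scope *s)
{
    struct entity *head = s->first;

    if (head == NULL) {
        return;
    }

    s->first = head->next;
    if (s->last == head) {
        s->last = NULL; /* the list held only the head */
    }
    head->next = NULL; /* detached; ready to be appended elsewhere */
}
```

The point is the order of operations: the successor is saved into the scope's first pointer before the head's own link is cleared, so nothing behind it is lost.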

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Gary Dismukes  

* exp_ch5.adb (Expand_N_Loop_Statement): For the transformation
of a for loop for an enumeration type with an enumeration rep
clause, which involves moving the original loop parameter into
a nested block, the loop parameter's entity must be removed from
the entity list of the loop scope.

Index: exp_ch5.adb
===
--- exp_ch5.adb (revision 179894)
+++ exp_ch5.adb (working copy)
@@ -3458,6 +3458,20 @@
Statements => Statements (N,
 
End_Label => End_Label (N)));
+
+   --  The loop parameter's entity must be removed from the loop
+   --  scope's entity list, since itw will now be located in the
+   --  new block scope. Any other entities already associated with
+   --  the loop scope, such as the loop parameter's subtype, will
+   --  remain there.
+
+   pragma Assert (First_Entity (Scope (Loop_Id)) = Loop_Id);
+
+   Set_First_Entity (Scope (Loop_Id), Next_Entity (Loop_Id));
+   if Last_Entity (Scope (Loop_Id)) = Loop_Id then
+  Set_Last_Entity (Scope (Loop_Id), Empty);
+   end if;
+
Analyze (N);
 
 --  Nothing to do with other cases of for loops


Re: [Ada] Entity list of for loop for enumeration with rep gets truncated

2011-10-13 Thread Duncan Sands

Hi Arnaud,


--- exp_ch5.adb (revision 179894)
+++ exp_ch5.adb (working copy)
@@ -3458,6 +3458,20 @@
Statements => Statements (N,

End_Label => End_Label (N)));
+
+   --  The loop parameter's entity must be removed from the loop
+   --  scope's entity list, since itw will now be located in the


typo: itw -> it

Ciao, Duncan.


[Ada] Modify L2_Norm implementation to be more suitable for Complex_Vector

2011-10-13 Thread Arnaud Charlet
Using the existing definition, we'd have to first convert a complex vector
to a real one by computing the modulus ("abs") of each element. Constructing
an extra temporary vector is inefficient and may use an unexpected amount
of extra storage. The behavior for the real case is unchanged; this just
prepares for subsequent use in the Generic_Complex_Arrays package.

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Geert Bosch  

* a-ngrear.adb ("abs"): Adjust for modified L2_Norm generic
* s-gearop.ads (L2_Norm): Change profile to be suitable for
Complex_Vector
* s-gearop.adb (L2_Norm): Reimplement using direct definition,
not inner product

Index: a-ngrear.adb
===
--- a-ngrear.adb(revision 179894)
+++ a-ngrear.adb(working copy)
@@ -356,10 +356,14 @@
 
   function "abs" is new
 L2_Norm
-  (Scalar=> Real'Base,
-   Vector=> Real_Vector,
-   Inner_Product => "*",
-   Sqrt  => Sqrt);
+  (X_Scalar  => Real'Base,
+   Result_Real   => Real'Base,
+   X_Vector  => Real_Vector,
+   "abs" => "+");
+  --  While the L2_Norm by definition uses the absolute values of the
+  --  elements of X_Vector, for real values the subsequent squaring
+  --  makes this unnecessary, so we substitute the "+" identity function
+  --  instead.
 
   function "abs" is new
 Vector_Elementwise_Operation
Index: s-gearop.adb
===
--- s-gearop.adb(revision 179907)
+++ s-gearop.adb(working copy)
@@ -336,9 +336,14 @@
-- L2_Norm --
-
 
-   function L2_Norm (X : Vector) return Scalar is
+   function L2_Norm (X : X_Vector) return Result_Real'Base is
+  Sum: Result_Real'Base := 0.0;
begin
-  return Sqrt (Inner_Product (X, X));
+  for J in X'Range loop
+ Sum := Sum + Result_Real'Base (abs X (J))**2;
+  end loop;
+
+  return Sqrt (Sum);
end L2_Norm;
 
--
Index: s-gearop.ads
===
--- s-gearop.ads(revision 179894)
+++ s-gearop.ads(working copy)
@@ -291,11 +291,12 @@
-
 
generic
-  type Scalar is private;
-  type Vector is array (Integer range <>) of Scalar;
-  with function Inner_Product (Left, Right : Vector) return Scalar is <>;
-  with function Sqrt (X : Scalar) return Scalar is <>;
-   function L2_Norm (X : Vector) return Scalar;
+  type X_Scalar is private;
+  type Result_Real is digits <>;
+  type X_Vector is array (Integer range <>) of X_Scalar;
+  with function "abs" (Right : X_Scalar) return Result_Real is <>;
+  with function Sqrt (X : Result_Real'Base) return Result_Real'Base is <>;
+   function L2_Norm (X : X_Vector) return Result_Real'Base;
 
---
-- Outer_Product --


[Ada] Make local Sqrt implementation generic

2011-10-13 Thread Arnaud Charlet
This prepares for reusing the Sqrt implementation in Generic_Complex_Arrays.
The local implementation avoids having to instantiate entire new copies of
Generic_Elementary_Functions just to get square root.
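
For illustration only (not part of the patch): a minimal C sketch of the same Babylonian scheme for double. It assumes radix 2, uses frexp/ldexp for the exponent-based initial estimate, and returns NaN where the Ada code raises Argument_Error; the name babylonian_sqrt is hypothetical.

```c
#include <math.h>

double babylonian_sqrt(double x)
{
    /* Be defensive: comparisons with NaN are false, so this branch also
       catches NaN inputs. */
    if (!(x > 0.0)) {
        return (x == 0.0) ? x : NAN;
    }

    /* Infinity is its own square root. */
    if (isinf(x)) {
        return x;
    }

    /* Initial estimate 2**(e / 2), where x = m * 2**e with 0.5 <= m < 1.
       The mantissa and an odd exponent are ignored. */
    int e;
    (void) frexp(x, &e);
    double root = ldexp(1.0, e / 2);

    /* Each step roughly doubles the number of correct bits. */
    for (int i = 0; i < 8; i++) {
        double next = (root + x / root) / 2.0;
        if (next == root) {
            break;
        }
        root = next;
    }

    return root;
}
```

Because the initial estimate is within about a factor of two of the true root, a handful of iterations is enough for double precision.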

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Geert Bosch  

* a-ngrear.adb, s-gearop.adb, s-gearop.ads (Sqrt): Make generic and
move to System.Generic_Array_Operations.

Index: a-ngrear.adb
===
--- a-ngrear.adb(revision 179908)
+++ a-ngrear.adb(working copy)
@@ -102,10 +102,10 @@
procedure Swap (Left, Right : in out Real);
--  Exchange Left and Right
 
-   function Sqrt (X : Real) return Real;
-   --  Sqrt is implemented locally here, in order to avoid dragging in all of
-   --  the elementary functions. Speed of the square root is not a big concern
-   --  here. This also avoids depending on a specific floating point type.
+   function Sqrt is new Ops.Sqrt (Real);
+   --  Instant a generic square root implementation here, in order to avoid
+   --  instantiating a complete copy of Generic_Elementary_Functions.
+   --  Speed of the square root is not a big concern here.
 

-- Rotate --
@@ -120,51 +120,6 @@
end Rotate;
 
--
-   -- Sqrt --
-   --
-
-   function Sqrt (X : Real) return Real is
-  Root, Next : Real;
-
-   begin
-  --  Be defensive: any comparisons with NaN values will yield False.
-
-  if not (X > 0.0) then
- if X = 0.0 then
-return X;
- else
-raise Argument_Error;
- end if;
-  end if;
-
-  --  Compute an initial estimate based on:
-
-  -- X = M * R**E and Sqrt (X) = Sqrt (M) * R**(E / 2.0),
-
-  --  where M is the mantissa, R is the radix and E the exponent.
-
-  --  By ignoring the mantissa and ignoring the case of an odd
-  --  exponent, we get a final error that is at most R. In other words,
-  --  the result has about a single bit precision.
-
-  Root := Real (Real'Machine_Radix) ** (Real'Exponent (X) / 2);
-
-  --  Because of the poor initial estimate, use the Babylonian method of
-  --  computing the square root, as it is stable for all inputs. Every step
-  --  will roughly double the precision of the result. Just a few steps
-  --  suffice in most cases. Eight iterations should give about 2**8 bits
-  --  of precision.
-
-  for J in 1 .. 8 loop
- Next := (Root + X / Root) / 2.0;
- exit when Root = Next;
- Root := Next;
-  end loop;
-
-  return Root;
-   end Sqrt;
-
-   --
-- Swap --
--
 
Index: s-gearop.adb
===
--- s-gearop.adb(revision 179908)
+++ s-gearop.adb(working copy)
@@ -29,6 +29,8 @@
 --  --
 --
 
+with Ada.Numerics; use Ada.Numerics;
+
 package body System.Generic_Array_Operations is
 
--  The local function Check_Unit_Last computes the index
@@ -567,6 +569,56 @@
   return R;
end Scalar_Vector_Elementwise_Operation;
 
+   --
+   -- Sqrt --
+   --
+
+   function Sqrt (X : Real'Base) return Real'Base is
+  Root, Next : Real'Base;
+
+   begin
+  --  Be defensive: any comparisons with NaN values will yield False.
+
+  if not (X > 0.0) then
+ if X = 0.0 then
+return X;
+ else
+raise Argument_Error;
+ end if;
+
+  elsif X > Real'Base'Last then
+ --  X is infinity, which is its own square root
+
+ return X;
+  end if;
+
+  --  Compute an initial estimate based on:
+
+  -- X = M * R**E and Sqrt (X) = Sqrt (M) * R**(E / 2.0),
+
+  --  where M is the mantissa, R is the radix and E the exponent.
+
+  --  By ignoring the mantissa and ignoring the case of an odd
+  --  exponent, we get a final error that is at most R. In other words,
+  --  the result has about a single bit precision.
+
+  Root := Real'Base (Real'Machine_Radix) ** (Real'Exponent (X) / 2);
+
+  --  Because of the poor initial estimate, use the Babylonian method of
+  --  computing the square root, as it is stable for all inputs. Every step
+  --  will roughly double the precision of the result. Just a few steps
+  --  suffice in most cases. Eight iterations should give about 2**8 bits
+  --  of precision.
+
+  for J in 1 .. 8 loop
+ Next := (Root + X / Root) / 2.0;
+ exit when Root = Next;
+ Root := Next;
+  end loop;
+
+  return Root;
+   end Sqrt;
+
---
-- Matrix_Matrix_Product --
---
Index: s-gearop.ads
===
--- s-gearop.ads(revision 

[Ada] Fix Forward_Eliminate routine to allow use with complex matrices

2011-10-13 Thread Arnaud Charlet
Use proper "abs" function returning a real for comparing magnitude of
elements. The previous local implementation, built only on "-", was correct
only for real values. This prepares for the pure Ada reimplementation of
Generic_Complex_Arrays.

Tested on x86_64-pc-linux-gnu, committed on trunk

2011-10-13  Geert Bosch  

* s-gearop.ads (Forward_Eliminate): Add "abs" formal function
returning a Real.
* s-gearop.adb (Forward_Eliminate): Remove local "abs" function
and use formal.
* a-ngrear.adb (Forward_Eliminate): Adjust instantiation for
new profile.

Index: a-ngrear.adb
===
--- a-ngrear.adb(revision 179909)
+++ a-ngrear.adb(working copy)
@@ -33,7 +33,7 @@
 --  reason for this is new Ada 2012 requirements that prohibit algorithms such
 --  as Strassen's algorithm, which may be used by some BLAS implementations. In
 --  addition, some platforms lacked suitable compilers to compile the reference
---  BLAS/LAPACK implementation. Finally, on many platforms there may be more
+--  BLAS/LAPACK implementation. Finally, on some platforms there are be more
 --  floating point types than supported by BLAS/LAPACK.
 
 with Ada.Containers.Generic_Anonymous_Array_Sort; use Ada.Containers;
@@ -59,6 +59,7 @@
 
procedure Forward_Eliminate is new Ops.Forward_Eliminate
 (Scalar=> Real'Base,
+ Real  => Real'Base,
  Matrix=> Real_Matrix,
  Zero  => 0.0,
  One   => 1.0);
Index: s-gearop.adb
===
--- s-gearop.adb(revision 179909)
+++ s-gearop.adb(working copy)
@@ -161,9 +161,6 @@
   pragma Assert (M'First (1) = N'First (1) and then
  M'Last  (1) = N'Last (1));
 
-  function "abs" (X : Scalar) return Scalar is
-(if X < Zero then Zero - X else X);
-
   --  The following are variations of the elementary matrix row operations:
   --  row switching, row multiplication and row addition. Because in this
   --  algorithm the addition factor is always a negated value, we chose to
@@ -274,14 +271,14 @@
   for J in M'Range (2) loop
  declare
 Max_Row : Integer := Row;
-Max_Abs : Scalar := Zero;
+Max_Abs : Real'Base := 0.0;
 
  begin
 --  Find best pivot in column J, starting in row Row
 
 for K in Row .. M'Last (1) loop
declare
-  New_Abs : constant Scalar := abs M (K, J);
+  New_Abs : constant Real'Base := abs M (K, J);
begin
   if Max_Abs < New_Abs then
  Max_Abs := New_Abs;
@@ -290,7 +287,7 @@
end;
 end loop;
 
-if Zero < Max_Abs then
+if Max_Abs > 0.0 then
Switch_Row (M, N, Row, Max_Row);
Divide_Row (M, N, Row, M (Row, J));
 
Index: s-gearop.ads
===
--- s-gearop.ads(revision 179909)
+++ s-gearop.ads(working copy)
@@ -65,12 +65,14 @@
 
generic
   type Scalar is private;
+  type Real is digits <>;
   type Matrix is array (Integer range <>, Integer range <>) of Scalar;
+  with function "abs" (Right : Scalar) return Real'Base is <>;
   with function "-" (Left, Right : Scalar) return Scalar is <>;
   with function "*" (Left, Right : Scalar) return Scalar is <>;
   with function "/" (Left, Right : Scalar) return Scalar is <>;
-  with function "<" (Left, Right : Scalar) return Boolean is <>;
-  Zero, One : Scalar;
+  Zero : Scalar;
+  One  : Scalar;
procedure Forward_Eliminate
  (M   : in out Matrix;
   N   : in out Matrix;


[Ada] Fix PR ada/50589

2011-10-13 Thread Eric Botcazou
Some left-overs from an earlier patch.  Applied on the mainline.


2011-10-13  Eric Botcazou  

PR ada/50589
* s-linux-alpha.ads: Do not "with" Interfaces.C.
* s-linux-sparc.ads: Likewise.

-- 
Eric Botcazou
Index: s-linux-sparc.ads
===
--- s-linux-sparc.ads	(revision 179844)
+++ s-linux-sparc.ads	(working copy)
@@ -35,8 +35,6 @@
 --  PLEASE DO NOT add any with-clauses to this package or remove the pragma
 --  Preelaborate. This package is designed to be a bottom-level (leaf) package
 
-with Interfaces.C;
-
 package System.Linux is
pragma Preelaborate;
 
Index: s-linux-alpha.ads
===
--- s-linux-alpha.ads	(revision 179844)
+++ s-linux-alpha.ads	(working copy)
@@ -35,8 +35,6 @@
 --  PLEASE DO NOT add any with-clauses to this package or remove the pragma
 --  Preelaborate. This package is designed to be a bottom-level (leaf) package.
 
-with Interfaces.C;
-
 package System.Linux is
pragma Preelaborate;
 


[Ada] Reimplement Generic_Complex_Arrays in pure Ada

2011-10-13 Thread Arnaud Charlet
This completes the removal of dependencies on BLAS and LAPACK for these
packages. The main reason for this is limited availability of these libraries
on some platforms and for some types, in particular types wider than 64 bits.
Furthermore, some BLAS implementations may use sub-cubic implementations for
matrix multiplication, which would violate Ada 2012 accuracy requirements.

Mostly the complex versions of the various routines could use generalized
versions of the existing implementation of Generic_Real_Arrays. For
solving eigensystems on Hermitian matrices, the implementation for symmetric
real matrices is reused, through the use of an augmented matrix with doubled
dimensions.
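
For illustration only (not part of the patch): a minimal C sketch of one standard way to build such an augmented matrix. Writing the Hermitian matrix as H = A + iB, with A real symmetric and B real antisymmetric, the 2n x 2n real matrix [[A, -B], [B, A]] is symmetric and has the eigenvalues of H, each repeated twice. Whether the patch uses exactly this block layout is an assumption; the name augment_hermitian is hypothetical.

```c
#include <complex.h>
#include <stddef.h>

/* h is an n x n Hermitian matrix, row-major.
   out is a caller-provided 2n x 2n real matrix, row-major. */
void augment_hermitian(const double complex *h, size_t n, double *out)
{
    size_t w = 2 * n;

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double a = creal(h[i * n + j]);
            double b = cimag(h[i * n + j]);

            out[i * w + j] = a;             /* top-left:     A */
            out[i * w + (j + n)] = -b;      /* top-right:   -B */
            out[(i + n) * w + j] = b;       /* bottom-left:  B */
            out[(i + n) * w + (j + n)] = a; /* bottom-right: A */
        }
    }
}
```

The result can then be handed to a solver for real symmetric matrices, at the cost of working on a matrix with four times as many elements.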

This change also addresses AI05-0047, which fixes the return type of the "abs"
function to be real instead of complex.

2011-10-13  Geert Bosch  

* a-ngrear.adb (Solve): Make generic and move to
System.Generic_Array_Operations.
* s-gearop.ads (Matrix_Vector_Solution, Matrix_Matrix_Solution):
New generic solvers to  compute a vector resp. matrix Y such
that A * Y = X, approximately.
* s-gearop.adb (Matrix_Vector_Solution, Matrix_Matrix_Solution):
Implement using Forward_Eliminate and Back_Substitute
* a-ngcoar.adb: Reimplement in pure Ada to remove dependencies
on BLAS and LAPACK.
* a-ngcoar.ads ("abs"): Fix return type to be real.

Index: a-ngcoar.adb
===
--- a-ngcoar.adb(revision 179894)
+++ a-ngcoar.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---Copyright (C) 2006-2009, Free Software Foundation, Inc.   --
+--Copyright (C) 2006-2011, Free Software Foundation, Inc.   --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -30,66 +30,35 @@
 --
 
 with System.Generic_Array_Operations; use System.Generic_Array_Operations;
-with System.Generic_Complex_BLAS;
-with System.Generic_Complex_LAPACK;
+with Ada.Numerics; use Ada.Numerics;
 
 package body Ada.Numerics.Generic_Complex_Arrays is
 
-   --  Operations involving inner products use BLAS library implementations.
-   --  This allows larger matrices and vectors to be computed efficiently,
-   --  taking into account memory hierarchy issues and vector instructions
-   --  that vary widely between machines.
-
--  Operations that are defined in terms of operations on the type Real,
--  such as addition, subtraction and scaling, are computed in the canonical
--  way looping over all elements.
 
-   --  Operations for solving linear systems and computing determinant,
-   --  eigenvalues, eigensystem and inverse, are implemented using the
-   --  LAPACK library.
+   package Ops renames System.Generic_Array_Operations;
 
-   type BLAS_Real_Vector is array (Integer range <>) of Real;
-
-   package BLAS is new System.Generic_Complex_BLAS
- (Real   => Real,
-  Complex_Types  => Complex_Types,
-  Complex_Vector => Complex_Vector,
-  Complex_Matrix => Complex_Matrix);
-
-   package LAPACK is new System.Generic_Complex_LAPACK
- (Real   => Real,
-  Real_Vector=> BLAS_Real_Vector,
-  Complex_Types  => Complex_Types,
-  Complex_Vector => Complex_Vector,
-  Complex_Matrix => Complex_Matrix);
-
subtype Real is Real_Arrays.Real;
--  Work around visibility bug ???
 
-   use BLAS, LAPACK;
+   function Is_Non_Zero (X : Complex) return Boolean is (X /= (0.0, 0.0));
+   --  Needed by Back_Substitute
 
-   --  Procedure versions of functions returning unconstrained values.
-   --  This allows for inlining the function wrapper.
+   procedure Back_Substitute is new Ops.Back_Substitute
+ (Scalar=> Complex,
+  Matrix=> Complex_Matrix,
+  Is_Non_Zero   => Is_Non_Zero);
 
-   procedure Eigenvalues
- (A  : Complex_Matrix;
-  Values : out Real_Vector);
+   procedure Forward_Eliminate is new Ops.Forward_Eliminate
+(Scalar=> Complex,
+ Real  => Real'Base,
+ Matrix=> Complex_Matrix,
+ Zero  => (0.0, 0.0),
+ One   => (1.0, 0.0));
 
-   procedure Inverse
- (A  : Complex_Matrix;
-  R  : out Complex_Matrix);
-
-   procedure Solve
- (A  : Complex_Matrix;
-  X  : Complex_Vector;
-  B  : out Complex_Vector);
-
-   procedure Solve
- (A  : Complex_Matrix;
-  X  : Complex_Matrix;
-  B  : out Complex_Matrix);
-
-   procedur

[C++ Patch] PR 17212

2011-10-13 Thread Paolo Carlini

Hi,

in this simple PR, submitter remarks that there isn't a real reason not 
to have -W(no-)format-zero-length working in C++ exactly like in C.


In fact, the status quo is that the warning *is* active in C++ too, part 
of -Wformat, but it cannot be *disabled*, because 
-Wno-format-zero-length is not a legal C++ option!


Thus the below, tested x86_64-linux. Ok for mainline?

Thanks,
Paolo.


/gcc
2011-10-13  Paolo Carlini  

PR c++/17212
* c-family/c.opt ([Wformat-zero-length]): Add C++.
* doc/invoke.texi: Update.

/testsuite
2011-10-13  Paolo Carlini  

PR c++/17212
* g++.dg/warn/format6.C: New.
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 179904)
+++ doc/invoke.texi (working copy)
@@ -3189,7 +3189,7 @@ in the case of @code{scanf} formats, this option w
 warning if the unused arguments are all pointers, since the Single
 Unix Specification says that such unused arguments are allowed.
 
-@item -Wno-format-zero-length @r{(C and Objective-C only)}
+@item -Wno-format-zero-length @r{(C, C++ and Objective-C)}
 @opindex Wno-format-zero-length
 @opindex Wformat-zero-length
 If @option{-Wformat} is specified, do not warn about zero-length formats.
Index: c-family/c.opt
===
--- c-family/c.opt  (revision 179904)
+++ c-family/c.opt  (working copy)
@@ -396,7 +396,7 @@ C ObjC C++ ObjC++ Var(warn_format_y2k) Warning
 Warn about strftime formats yielding 2-digit years
 
 Wformat-zero-length
-C ObjC Var(warn_format_zero_length) Warning
+C ObjC C++ Var(warn_format_zero_length) Warning
 Warn about zero-length formats
 
 Wformat=
Index: testsuite/g++.dg/warn/format6.C
===
--- testsuite/g++.dg/warn/format6.C (revision 0)
+++ testsuite/g++.dg/warn/format6.C (revision 0)
@@ -0,0 +1,7 @@
+// PR c++/17212
+// { dg-options "-Wformat -Wno-format-zero-length" }
+
+void f()
+{
+  __builtin_printf("");
+}


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-13 Thread Kai Tietz
Hello,

this new version addresses the comments from Michael and additionally fixes
a latent issue exposed by this rewrite in fold-const.
In gimplify.c's gimple_boolify we didn't handle the case that operands
of TRUTH-expressions need to have the same operand-size for transformation to
a bitwise operation.  This shows up for Fortran, as there is more than one
boolean-kind type with different mode-sizes.  I added a testcase for this.

ChangeLog

2011-10-13  Kai Tietz  

* fold-const.c (simple_operand_p_2): New function.
(fold_truthop): Rename to
(fold_truth_andor_1): function name.
Additionally remove branching creation for logical and/or.
(fold_truth_andor): Handle branching creation for logical and/or here.
* gimplify.c (gimple_boolify): Take care that for bitwise-binary
transformation the operands have compatible types.

2011-10-13  Kai Tietz  

* gfortran.fortran-torture/compile/logical-2.f90: New test.

Bootstrapped and regression-tested for all languages plus Ada and
Obj-C++ on x86_64-pc-linux-gnu.
Ok for apply?

Regards,
Kai

Index: gcc/gcc/fold-const.c
===
--- gcc.orig/gcc/fold-const.c
+++ gcc/gcc/fold-const.c
@@ -112,13 +112,13 @@ static tree decode_field_reference (loca
 static int all_ones_mask_p (const_tree, int);
 static tree sign_bit_p (tree, const_tree);
 static int simple_operand_p (const_tree);
+static bool simple_operand_p_2 (tree);
 static tree range_binop (enum tree_code, tree, tree, int, tree, int);
 static tree range_predecessor (tree);
 static tree range_successor (tree);
 static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
 static tree fold_cond_expr_with_comparison (location_t, tree, tree,
tree, tree);
 static tree unextend (tree, int, int, tree);
-static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
 static tree optimize_minmax_comparison (location_t, enum tree_code,
tree, tree, tree);
 static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
@@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
 }
 
-/* Subroutine for fold_truthop: decode a field reference.
+/* Subroutine for fold_truth_andor_1: decode a field reference.

If EXP is a comparison reference, we return the innermost reference.

@@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
 }

-/* Subroutine for fold_truthop: determine if an operand is simple enough
+/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough
to be evaluated unconditionally.  */

 static int
@@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
   STRIP_NOPS (exp);

   return (CONSTANT_CLASS_P (exp)
- || TREE_CODE (exp) == SSA_NAME
+ || TREE_CODE (exp) == SSA_NAME
  || (DECL_P (exp)
  && ! TREE_ADDRESSABLE (exp)
  && ! TREE_THIS_VOLATILE (exp)
@@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
 registers aren't expensive.  */
  && (! TREE_STATIC (exp) || DECL_REGISTER (exp))));
 }
+
+/* Subroutine for fold_truth_andor: determine if an operand is simple enough
+   to be evaluated unconditionally.
+   I addition to simple_operand_p, we assume that comparisons and logic-not
+   operations are simple, if their operands are simple, too.  */
+
+static bool
+simple_operand_p_2 (tree exp)
+{
+  enum tree_code code;
+
+  /* Strip any conversions that don't change the machine mode.  */
+  STRIP_NOPS (exp);
+
+  code = TREE_CODE (exp);
+
+  if (TREE_CODE_CLASS (code) == tcc_comparison)
+return (!tree_could_trap_p (exp)
+   && simple_operand_p_2 (TREE_OPERAND (exp, 0))
+   && simple_operand_p_2 (TREE_OPERAND (exp, 1)));
+
+  if (TREE_SIDE_EFFECTS (exp)
+  || tree_could_trap_p (exp))
+return false;
+
+  switch (code)
+{
+case SSA_NAME:
+  return true;
+case TRUTH_NOT_EXPR:
+  return simple_operand_p_2 (TREE_OPERAND (exp, 0));
+case BIT_NOT_EXPR:
+  if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
+   return false;
+  return simple_operand_p_2 (TREE_OPERAND (exp, 0));
+default:
+  return simple_operand_p (exp);
+}
+}
+
 
 /* The following functions are subroutines to fold_range_test and allow it to
try to change a logical combination of comparisons into a range test.
@@ -4888,7 +4928,7 @@ fold_range_test (location_t loc, enum tr
   return 0;
 }
 
-/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P
+/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P
bit value.  Arrange things so the extra bits will be set to zero if and
only if C is signed-extended to its full width.  If MASK is nonzero,
it is an INTEGER_CST that should be AND'ed with the extra bits.  */
@@ -5025,8 +5065,8 @@ merge_truthop_with_opposite_arm (locatio
We return the simplifie

Re: [pph] More DECL merging. (issue5268042)

2011-10-13 Thread Diego Novillo
On Wed, Oct 12, 2011 at 23:36, Lawrence Crowl  wrote:
> Use the mangled name for merging, as this should enable us to
> handle function overloads.  We use the regular identifier for other
> declarations, as that should be sufficient and avoids the problem of
> different typedefs mangling to the same name.
>
> Merge struct members as well as namespace members.  This will
> eventually help with member declaration versus definition issues.
>
> Change test cases to reflect the above.
>
> Comment on other failing tests.
>
> Comment on failing cache handling for merge.
>
> Tested on x64.
>
>
> Index: gcc/testsuite/ChangeLog.pph
>
> 2011-10-12   Lawrence Crowl  
>
>        * g++.dg/pph/p2pr36533.cc: Mark expected fail on unexpanded intrinsic.
>        * g++.dg/pph/p4pr36533.cc: Likewise.
>        * g++.dg/pph/p4mean.cc: Likewise.
>        * g++.dg/pph/c3variables.cc: Comment on reason for fail.
>        * g++.dg/pph/c4vardef.cc: Likewise.
>
> Index: gcc/cp/ChangeLog.pph
>
> 2011-10-12   Lawrence Crowl  
>
>        * pph-streamer.h (pph_merge_name): New.
>        * pph-streamer.c (pph_merge_name): New.
>        * pph-streamer-out.c (pph_out_mergeable_tree_vec): Emit the vector
>        in declaration order.
>        (pph_out_merge_name): New.
>        (pph_write_any_tree): Use pph_out_merge_name instead of raw code.
>        * pph-streamer-in.c (pph_match_to_link): Use pph_merge_name.
>        (pph_in_binding_level): Also merge members of structs.
>        (pph_read_any_tree): Save read tree to determine if it is different
>        from the tree to be used.
>
>
> Index: gcc/testsuite/g++.dg/pph/p2pr36533.cc
> ===
> --- gcc/testsuite/g++.dg/pph/p2pr36533.cc       (revision 179880)
> +++ gcc/testsuite/g++.dg/pph/p2pr36533.cc       (working copy)
> @@ -1,2 +1,6 @@
>  /* { dg-options "-w -fpermissive" } */
> +// pph asm xdiff 25347
> +// xfail BOGUS INTRINSIC
> +// failing to recognise memset as an intrinsic
> +
>  #include "p1pr36533.h"
> Index: gcc/testsuite/g++.dg/pph/c3variables.cc
> ===
> --- gcc/testsuite/g++.dg/pph/c3variables.cc     (revision 179880)
> +++ gcc/testsuite/g++.dg/pph/c3variables.cc     (working copy)
> @@ -1,4 +1,5 @@
>  // pph asm xdiff 34997
> +// xfail BOGUS DUPVAR
>  // tentative definition emitted twice
>
>  #include "c0variables1.h"
> Index: gcc/testsuite/g++.dg/pph/p4mean.cc
> ===
> --- gcc/testsuite/g++.dg/pph/p4mean.cc  (revision 179880)
> +++ gcc/testsuite/g++.dg/pph/p4mean.cc  (working copy)
> @@ -1,4 +1,8 @@
>  /* { dg-options "-w -fpermissive" }  */
> +// pph asm xdiff 39234
> +// xfail BOGUS INTRINSIC
> +// failing to recognize sqrt as an intrinsic
> +
>  #include 
>  #include 
>  #include 
> Index: gcc/testsuite/g++.dg/pph/p4pr36533.cc
> ===
> --- gcc/testsuite/g++.dg/pph/p4pr36533.cc       (revision 179880)
> +++ gcc/testsuite/g++.dg/pph/p4pr36533.cc       (working copy)
> @@ -1,2 +1,6 @@
>  /* { dg-options "-w -fpermissive" } */
> +// pph asm xdiff 25347
> +// xfail BOGUS INTRINSIC
> +// failing to recognise memset as an intrinsic
> +
>  #include "p4pr36533.h"
> Index: gcc/testsuite/g++.dg/pph/c4vardef.cc
> ===
> --- gcc/testsuite/g++.dg/pph/c4vardef.cc        (revision 179880)
> +++ gcc/testsuite/g++.dg/pph/c4vardef.cc        (working copy)
> @@ -1,4 +1,6 @@
>  // pph asm xdiff 00553
> +// xfail BOGUS DUPVAR
> +// definition emitted twice
>
>  #include "c0vardef1.h"
>  #include "c0vardef2.h"
> Index: gcc/cp/pph-streamer-in.c
> ===
> --- gcc/cp/pph-streamer-in.c    (revision 179880)
> +++ gcc/cp/pph-streamer-in.c    (working copy)
> @@ -803,7 +803,7 @@ pph_match_to_function (tree expr ATTRIBU
>    against an LINK of a chain. */
>
>  static tree
> -pph_match_to_link (tree expr, location_t where, const char *idstr, tree* link)
> +pph_match_to_link (tree expr, location_t where, const char *idstr, tree *link)
>  {
>   enum tree_code link_code, expr_code;
>   tree idtree;
> @@ -817,7 +817,7 @@ pph_match_to_link (tree expr, location_t
>   if (link_code != expr_code)
>     return NULL;
>
> -  idtree = DECL_NAME (*link);
> +  idtree = pph_merge_name (*link);
>   if (!idtree)
>     return NULL;
>
> @@ -1072,7 +1072,7 @@ pph_in_binding_level (cp_binding_level *
>   *out_field = bl;
>
>   entity = bl->this_entity = pph_in_tree (stream);
> -  if (NAMESPACE_SCOPE_P (entity))
> +  if (NAMESPACE_SCOPE_P (entity) || DECL_CLASS_SCOPE_P (entity))
>     {
>       if (flag_pph_debug >= 3)
>         debug_tree_chain (bl->names);
> @@ -1962,7 +1962,8 @@ pph_read_any_tree (pph_stream *stream, t
>  {
>   struct lto_input_block *ib = stream->encoder.r.ib;
>   struct data_in *data_in = stream->encoder.r.data_in;
> -  tree 

Re: [pph] Remove old tracing. (issue5271041)

2011-10-13 Thread Diego Novillo
On Thu, Oct 13, 2011 at 01:01, Lawrence Crowl  wrote:

>  /* Read and return a location_t from STREAM.
> -   FIXME pph: If pph_trace didn't depend on STREAM, we could avoid having to
> -   call this function, only for it to call lto_input_location, which calls the
> -   streamer hook back to pph_read_location.  */
> +   FIXME pph: Tracing doesn't depend on STREAM any more.  We could avoid having
> +   to call this function, only for it to call lto_input_location, which calls
> +   the streamer hook back to pph_read_location.  Say what?  */

This function makes no sense, actually.  Just rename the hook from
pph_read_location to pph_in_location.

I'll fix it.


Diego.


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-13 Thread Richard Guenther
On Thu, Oct 13, 2011 at 1:25 PM, Kai Tietz  wrote:
> Hello,
>
> this new version addresses the comments from Michael and additional fixes
> an latent issue shown up by this rewrite in fold-const.
> On gimplify.c's gimple_boolify we didn't handled the case that operands
> for TRUTH-expressions need to have same operand-size for transformation to
> bitwise operation.  This shows up for Fortran, as here are more then one
> boolean-kind type with different mode-sizes.  I added a testcase for this,
>
> ChangeLog
>
> 2011-10-13  Kai Tietz  
>
>        * fold-const.c (simple_operand_p_2): New function.
>        (fold_truthop): Rename to
>        (fold_truth_andor_1): function name.
>        Additionally remove branching creation for logical and/or.
>        (fold_truth_andor): Handle branching creation for logical and/or here.
>        * gimplify.c (gimple_boolify): Take care that for bitwise-binary
>        transformation the operands have compatible types.
>
> 2011-10-13  Kai Tietz  
>
>        * gfortran.fortran-torture/compile/logical-2.f90: New test.
>
> Bootstrapped and regression-tested for all languages plus Ada and
> Obj-C++ on x86_64-pc-linux-gnu.
> Ok for apply?
>
> Regards,
> Kai
>
> Index: gcc/gcc/fold-const.c
> ===
> --- gcc.orig/gcc/fold-const.c
> +++ gcc/gcc/fold-const.c
> @@ -112,13 +112,13 @@ static tree decode_field_reference (loca
>  static int all_ones_mask_p (const_tree, int);
>  static tree sign_bit_p (tree, const_tree);
>  static int simple_operand_p (const_tree);
> +static bool simple_operand_p_2 (tree);
>  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
>  static tree range_predecessor (tree);
>  static tree range_successor (tree);
>  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
>  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
> tree, tree);
>  static tree unextend (tree, int, int, tree);
> -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
>  static tree optimize_minmax_comparison (location_t, enum tree_code,
>                                        tree, tree, tree);
>  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
> @@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
>   return lhs;
>  }
>
> -/* Subroutine for fold_truthop: decode a field reference.
> +/* Subroutine for fold_truth_andor_1: decode a field reference.
>
>    If EXP is a comparison reference, we return the innermost reference.
>
> @@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
>   return NULL_TREE;
>  }
>
> -/* Subroutine for fold_truthop: determine if an operand is simple enough
> +/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough
>    to be evaluated unconditionally.  */
>
>  static int
> @@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
>   STRIP_NOPS (exp);
>
>   return (CONSTANT_CLASS_P (exp)
> -         || TREE_CODE (exp) == SSA_NAME
> +         || TREE_CODE (exp) == SSA_NAME
>          || (DECL_P (exp)
>              && ! TREE_ADDRESSABLE (exp)
>              && ! TREE_THIS_VOLATILE (exp)
> @@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
>                 registers aren't expensive.  */
>              && (! TREE_STATIC (exp) || DECL_REGISTER (exp))));
>  }
> +
> +/* Subroutine for fold_truth_andor: determine if an operand is simple enough
> +   to be evaluated unconditionally.
> +   I addition to simple_operand_p, we assume that comparisons and logic-not
> +   operations are simple, if their operands are simple, too.  */
> +
> +static bool
> +simple_operand_p_2 (tree exp)
> +{
> +  enum tree_code code;
> +
> +  /* Strip any conversions that don't change the machine mode.  */
> +  STRIP_NOPS (exp);
> +
> +  code = TREE_CODE (exp);
> +
> +  if (TREE_CODE_CLASS (code) == tcc_comparison)
> +    return (!tree_could_trap_p (exp)
> +           && simple_operand_p_2 (TREE_OPERAND (exp, 0))
> +           && simple_operand_p_2 (TREE_OPERAND (exp, 1)));
> +
> +  if (TREE_SIDE_EFFECTS (exp)
> +      || tree_could_trap_p (exp))
> +    return false;
> +
> +  switch (code)
> +    {
> +    case SSA_NAME:
> +      return true;
> +    case TRUTH_NOT_EXPR:
> +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
> +    case BIT_NOT_EXPR:
> +      if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
> +       return false;
> +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
> +    default:
> +      return simple_operand_p (exp);
> +    }
> +}
> +
>
>  /* The following functions are subroutines to fold_range_test and allow it to
>    try to change a logical combination of comparisons into a range test.
> @@ -4888,7 +4928,7 @@ fold_range_test (location_t loc, enum tr
>   return 0;
>  }
>
> -/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P
> +/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P
>    bit value.  Arrange things so the

[PATCH] Fix PR50712

2011-10-13 Thread Richard Guenther

This fixes PR50712, an issue with IPA split uncovered by adding
verifier calls after it ... we need to also gimplify reads of
register typed memory when passing it as argument.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2011-10-13  Richard Guenther  

PR tree-optimization/50712
* ipa-split.c (split_function): Always re-gimplify parameters
when they are not gimple vals before passing them.  Properly
check for type compatibility.

* gcc.target/i386/pr50712.c: New testcase.

Index: gcc/ipa-split.c
===
*** gcc/ipa-split.c (revision 179894)
--- gcc/ipa-split.c (working copy)
*** split_function (struct split_point *spli
*** 958,964 
tree retval = NULL, real_retval = NULL;
bool split_part_return_p = false;
gimple last_stmt = NULL;
-   bool conv_needed = false;
unsigned int i;
tree arg;
  
--- 958,963 
*** split_function (struct split_point *spli
*** 1000,1011 
else
  arg = parm;
  
!   if (TYPE_MAIN_VARIANT (DECL_ARG_TYPE (parm))
!   != TYPE_MAIN_VARIANT (TREE_TYPE (arg)))
! {
!   conv_needed = true;
!   arg = fold_convert (DECL_ARG_TYPE (parm), arg);
! }
VEC_safe_push (tree, heap, args_to_pass, arg);
}
  
--- 999,1006 
else
  arg = parm;
  
!   if (!useless_type_conversion_p (DECL_ARG_TYPE (parm), TREE_TYPE (arg)))
! arg = fold_convert (DECL_ARG_TYPE (parm), arg);
VEC_safe_push (tree, heap, args_to_pass, arg);
}
  
*** split_function (struct split_point *spli
*** 1135,1148 
  
/* Produce the call statement.  */
gsi = gsi_last_bb (call_bb);
!   if (conv_needed)
! FOR_EACH_VEC_ELT (tree, args_to_pass, i, arg)
!   if (!is_gimple_val (arg))
!   {
! arg = force_gimple_operand_gsi (&gsi, arg, true, NULL_TREE,
! false, GSI_NEW_STMT);
! VEC_replace (tree, args_to_pass, i, arg);
!   }
call = gimple_build_call_vec (node->decl, args_to_pass);
gimple_set_block (call, DECL_INITIAL (current_function_decl));
  
--- 1130,1142 
  
/* Produce the call statement.  */
gsi = gsi_last_bb (call_bb);
!   FOR_EACH_VEC_ELT (tree, args_to_pass, i, arg)
! if (!is_gimple_val (arg))
!   {
!   arg = force_gimple_operand_gsi (&gsi, arg, true, NULL_TREE,
!   false, GSI_NEW_STMT);
!   VEC_replace (tree, args_to_pass, i, arg);
!   }
call = gimple_build_call_vec (node->decl, args_to_pass);
gimple_set_block (call, DECL_INITIAL (current_function_decl));
  
Index: gcc/testsuite/gcc.target/i386/pr50712.c
===
*** gcc/testsuite/gcc.target/i386/pr50712.c (revision 0)
--- gcc/testsuite/gcc.target/i386/pr50712.c (revision 0)
***
*** 0 
--- 1,33 
+ /* { dg-do compile } */
+ /* { dg-require-effective-target ilp32 } */
+ /* { dg-options "-O2" } */
+ 
+ typedef __builtin_va_list __va_list;
+ typedef __va_list __gnuc_va_list;
+ typedef __gnuc_va_list va_list;
+ struct MSVCRT__iobuf { };
+ typedef struct MSVCRT__iobuf MSVCRT_FILE;
+ typedef union _printf_arg { } printf_arg;
+ MSVCRT_FILE MSVCRT__iob[20];
+ int pf_print_a (va_list *);
+ int __attribute__((__cdecl__))
+ MSVCRT_vfprintf_s(MSVCRT_FILE* file, const char *format, va_list valist)
+ {
+   if(!((file != ((void *)0))
+|| (MSVCRT__invalid_parameter(((void *)0), ((void *)0),
+((void *)0), 0, 0),0)))
+   return -1;
+   return pf_printf_a(&valist);
+ }
+ int __attribute__((__cdecl__))
+ MSVCRT_vprintf_s(const char *format, va_list valist)
+ {
+   return MSVCRT_vfprintf_s((MSVCRT__iob+1),format,valist);
+ }
+ int __attribute__((__cdecl__))
+ MSVCRT_fprintf_s(MSVCRT_FILE* file, const char *format, ...)
+ {
+   va_list valist;
+   va_start (valist, format);
+   return MSVCRT_vfprintf_s(file, format, valist);
+ }


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-13 Thread Kai Tietz
2011/10/13 Richard Guenther :
> On Thu, Oct 13, 2011 at 1:25 PM, Kai Tietz  wrote:
>> Hello,
>>
>> this new version addresses the comments from Michael and additional fixes
>> an latent issue shown up by this rewrite in fold-const.
>> On gimplify.c's gimple_boolify we didn't handled the case that operands
>> for TRUTH-expressions need to have same operand-size for transformation to
>> bitwise operation.  This shows up for Fortran, as here are more then one
>> boolean-kind type with different mode-sizes.  I added a testcase for this,
>>
>> ChangeLog
>>
>> 2011-10-13  Kai Tietz  
>>
>>        * fold-const.c (simple_operand_p_2): New function.
>>        (fold_truthop): Rename to
>>        (fold_truth_andor_1): function name.
>>        Additionally remove branching creation for logical and/or.
>>        (fold_truth_andor): Handle branching creation for logical and/or here.
>>        * gimplify.c (gimple_boolify): Take care that for bitwise-binary
>>        transformation the operands have compatible types.
>>
>> 2011-10-13  Kai Tietz  
>>
>>        * gfortran.fortran-torture/compile/logical-2.f90: New test.
>>
>> Bootstrapped and regression-tested for all languages plus Ada and
>> Obj-C++ on x86_64-pc-linux-gnu.
>> Ok for apply?
>>
>> Regards,
>> Kai
>>
>> Index: gcc/gcc/fold-const.c
>> ===
>> --- gcc.orig/gcc/fold-const.c
>> +++ gcc/gcc/fold-const.c
>> @@ -112,13 +112,13 @@ static tree decode_field_reference (loca
>>  static int all_ones_mask_p (const_tree, int);
>>  static tree sign_bit_p (tree, const_tree);
>>  static int simple_operand_p (const_tree);
>> +static bool simple_operand_p_2 (tree);
>>  static tree range_binop (enum tree_code, tree, tree, int, tree, int);
>>  static tree range_predecessor (tree);
>>  static tree range_successor (tree);
>>  static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
>>  static tree fold_cond_expr_with_comparison (location_t, tree, tree,
>> tree, tree);
>>  static tree unextend (tree, int, int, tree);
>> -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
>>  static tree optimize_minmax_comparison (location_t, enum tree_code,
>>                                        tree, tree, tree);
>>  static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
>> @@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
>>   return lhs;
>>  }
>>
>> -/* Subroutine for fold_truthop: decode a field reference.
>> +/* Subroutine for fold_truth_andor_1: decode a field reference.
>>
>>    If EXP is a comparison reference, we return the innermost reference.
>>
>> @@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
>>   return NULL_TREE;
>>  }
>>
>> -/* Subroutine for fold_truthop: determine if an operand is simple enough
>> +/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough
>>    to be evaluated unconditionally.  */
>>
>>  static int
>> @@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
>>   STRIP_NOPS (exp);
>>
>>   return (CONSTANT_CLASS_P (exp)
>> -         || TREE_CODE (exp) == SSA_NAME
>> +         || TREE_CODE (exp) == SSA_NAME
>>          || (DECL_P (exp)
>>              && ! TREE_ADDRESSABLE (exp)
>>              && ! TREE_THIS_VOLATILE (exp)
>> @@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
>>                 registers aren't expensive.  */
>>              && (! TREE_STATIC (exp) || DECL_REGISTER (exp))));
>>  }
>> +
>> +/* Subroutine for fold_truth_andor: determine if an operand is simple enough
>> +   to be evaluated unconditionally.
>> +   I addition to simple_operand_p, we assume that comparisons and logic-not
>> +   operations are simple, if their operands are simple, too.  */
>> +
>> +static bool
>> +simple_operand_p_2 (tree exp)
>> +{
>> +  enum tree_code code;
>> +
>> +  /* Strip any conversions that don't change the machine mode.  */
>> +  STRIP_NOPS (exp);
>> +
>> +  code = TREE_CODE (exp);
>> +
>> +  if (TREE_CODE_CLASS (code) == tcc_comparison)
>> +    return (!tree_could_trap_p (exp)
>> +           && simple_operand_p_2 (TREE_OPERAND (exp, 0))
>> +           && simple_operand_p_2 (TREE_OPERAND (exp, 1)));
>> +
>> +  if (TREE_SIDE_EFFECTS (exp)
>> +      || tree_could_trap_p (exp))
>> +    return false;
>> +
>> +  switch (code)
>> +    {
>> +    case SSA_NAME:
>> +      return true;
>> +    case TRUTH_NOT_EXPR:
>> +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
>> +    case BIT_NOT_EXPR:
>> +      if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
>> +       return false;
>> +      return simple_operand_p_2 (TREE_OPERAND (exp, 0));
>> +    default:
>> +      return simple_operand_p (exp);
>> +    }
>> +}
>> +
>>
>>  /* The following functions are subroutines to fold_range_test and allow it to
>>    try to change a logical combination of comparisons into a range test.
>> @@ -4888,7 +4928,7 @@ fold_range_test (location_t loc, enum tr
>>   return 0;
>>  }
>>
>> -/* Subroutine for fol

Re: [Ada] Checks on intrinsic operators

2011-10-13 Thread Iain Sandoe


On 13 Oct 2011, at 11:22, Arnaud Charlet wrote:

An operator can be declared Import (Intrinsic) only if the current view of the
operand type(s) is a numeric type. With this patch the compiler properly
rejects the pragma if the operand type is private or incomplete.

Compiling mysystem.ads must yield:

  mysystem.ads:3:13: intrinsic operator can only apply to numeric types
  mysystem.ads:7:13: intrinsic operator can only apply to numeric types
  mysystem.ads:7:18: invalid use of incomplete type "Self"

---
package Mysystem is
  type A is private;
  function "<"  (Left, Right : A) return Boolean;
  pragma Import (Intrinsic, "<");

  type Self;
  function "+" (X, Y : Self) return Boolean;
  pragma Import (Intrinsic, "+");
  type Self is tagged null record;
private
  type A is mod 2 ** 32;
end Mysystem;


Just out of curiosity, is there a reason why some of the changes
applied say "x must do y" with an example, but the example is not
made into a test-case?

(Apologies if this has already been discussed.)
Iain



Ping shrink wrap patches

2011-10-13 Thread Alan Modra
Ping
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01002.html
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01003.html
and
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01596.html

The last one needs a tweak.
s/FUNCTION_VALUE_REGNO_P/targetm.calls.function_value_regno_p/,
or wrap the whole patch in #ifdef FUNCTION_VALUE_REGNO_P.  Perhaps I
should explain more than what I wrote in the comment too.  For targets
like powerpc that pass function arguments in registers and return a
function result in the same register as one of the arguments, the
ifcvt optimization can prevent shrink wrapping.  Without the ifcvt
optimization for a function "int foo (int x)" we might have something
like

 r29 = r3; // save r3 in callee saved reg
 if (some test) goto exit_label
 // main body of foo, calling other functions
 r3 = 0;
 return;
exit_label:
 r3 = 1;
 return;

Bernd's http://gcc.gnu.org/ml/gcc-patches/2011-10/msg00380.html quite
happily rearranges the r29 assignment to be after the "if", and shrink
wrapping occurs.  With the ifcvt optimization we get

 r29 = r3; // save r3 in callee saved reg
 r3 = 1;
 if (some test) goto exit_label
 // main body of foo, calling other functions
 r3 = 0;
exit_label:
 return;

and now the r29 assignment cannot be moved past the "if", disabling
shrink wrap.
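
For a source-level picture of the above, a function of roughly the
following shape could lead to that kind of code.  This is only a sketch:
the test "x < 0", the helper do_work and the counter work_done are made
up for illustration and do not come from any of the patches.

```c
/* Hypothetical stand-in for "calling other functions"; not from the patch.  */
static int work_done;

static void
do_work (int x)
{
  work_done += x;
}

/* On a target like powerpc the argument arrives in the same register that
   carries the return value (r3).  Only the main body needs X to survive a
   call, so only that path needs a callee-saved register and a prologue.  */
int
foo (int x)
{
  if (x < 0)      /* "some test" */
    return 1;     /* early exit: no call is made on this path */

  do_work (x);    /* main body of foo, calling other functions */
  do_work (x);    /* X is still live after the first call */
  return 0;
}
```

The early-exit path is the one shrink wrapping wants to keep free of the
register save; once the "r3 = 1" store is hoisted above the test, the
save of r3 has to stay above it as well.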

-- 
Alan Modra
Australia Development Lab, IBM


Fix gcc.dg/builtins-67.c on Solaris 8/9

2011-10-13 Thread Eric Botcazou
[To the right list this time]

The test fails with a link error, as 'round' and 'rint' are only C99.

Fixed thusly, tested on SPARC/Solaris 8, applied on the mainline as obvious.


2011-10-13  Eric Botcazou  

* gcc.dg/builtins-67.c: Guard iround and irint with HAVE_C99_RUNTIME.


-- 
Eric Botcazou
Index: gcc.dg/builtins-67.c
===
--- gcc.dg/builtins-67.c	(revision 179844)
+++ gcc.dg/builtins-67.c	(working copy)
@@ -58,14 +58,14 @@ long long llceilf (float a) { return (lo
 long long llceill (long double a) { return (long long) ceill (a); }
 #endif
 
-int iround (double a) { return (int) round (a); }
 #ifdef HAVE_C99_RUNTIME
+int iround (double a) { return (int) round (a); }
 int iroundf (float a) { return (int) roundf (a); }
 int iroundl (long double a) { return (int) roundl (a); }
 #endif
 
-int irint (double a) { return (int) rint (a); }
 #ifdef HAVE_C99_RUNTIME
+int irint (double a) { return (int) rint (a); }
 int irintf (float a) { return (int) rintf (a); }
 int irintl (long double a) { return (int) rintl (a); }
 #endif


Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-13 Thread Laurynas Biveinis
2011/10/13 Gary Funck :
> On 10/13/11 06:15:31, Laurynas Biveinis wrote:
>> [...] In your case (correct me if I misunderstood something)
>> you have one hash table, marking of which will mark more objects
>> which are required for the correct marking of the second hash table.
>> GC might be simply walking the second one first.
>
> Yes, I think that this accurately summarizes the situation, and the result.
>
> Any suggestions on how to fix this?  It seems that one fix might
> be to use a non garbage-collected hash table for the hash map.

Is it feasible to write an if_marked function for the second hash
table that would duplicate the work of the first hash table function
and then some? I.e. it would determine if an entry needs to be marked
based on information outside of both hash tables and independently of
the first one (even if duplicating its logic).
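
As a toy model of that idea (it does not use GCC's real GTY or ggc
interfaces; struct obj, struct map_entry and all function names below
are invented for illustration), the difference between an
order-dependent and an order-independent predicate might look like this:

```c
#include <stdbool.h>

/* Toy stand-in for a GC-managed object; not GCC's real data structures.  */
struct obj
{
  bool marked;          /* set once some marking walk has reached it */
  bool live_elsewhere;  /* liveness known from outside both tables */
};

/* Entry of the second table (the map).  */
struct map_entry
{
  struct obj *from;
  struct obj *to;
};

/* The decision the first table's predicate makes for its entries.  */
static bool
first_table_keeps (const struct obj *o)
{
  return o->live_elsewhere;
}

/* Order-dependent: only right if the first table was walked already.  */
static bool
map_if_marked_naive (const struct map_entry *e)
{
  return e->from->marked;
}

/* Order-independent: repeats the first table's decision itself, so the
   answer does not depend on which table the collector walks first.  */
static bool
map_if_marked_independent (const struct map_entry *e)
{
  return e->from->marked || first_table_keeps (e->from);
}
```

With an entry whose key is live but not yet marked, the naive predicate
would drop the entry if the map happens to be walked first, while the
independent one keeps it.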


-- 
Laurynas


[patch optimization]: Improve tree-ssa-ifcombine pass

2011-10-13 Thread Kai Tietz
Hello,

This patch adds further optimization to gimple's ifcombine pass for single-bit
andif operations.

New patterns recognized are:

* if ((a & 4) == 0) if ((a & 8) == 0) -> if ((a & 12) == 0)
* if ((a & 4) != 0) if ((a & 8) == 0) -> if ((a & 12) == 4)
* if ((a & 4) == 0) if ((a & 8) != 0) -> if ((a & 12) == 8)

To support that, the patch adds the required additional patterns for
if.and.if and if.or.if detection to tree_ssa_ifcombine_bb.
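
The three equivalences listed above can be checked at the source level
with a small stand-alone program.  This is just a sketch to illustrate
the patterns; the function names are made up and it is not one of the
new testcases.

```c
/* Nested single-bit tests, as written before combining.  */
static int
nested_00 (unsigned a)
{
  if ((a & 4) == 0)
    if ((a & 8) == 0)
      return 1;
  return 0;
}

static int
nested_10 (unsigned a)
{
  if ((a & 4) != 0)
    if ((a & 8) == 0)
      return 1;
  return 0;
}

static int
nested_01 (unsigned a)
{
  if ((a & 4) == 0)
    if ((a & 8) != 0)
      return 1;
  return 0;
}

/* The combined forms: one mask and one compare each.  */
static int combined_00 (unsigned a) { return (a & 12) == 0; }
static int combined_10 (unsigned a) { return (a & 12) == 4; }
static int combined_01 (unsigned a) { return (a & 12) == 8; }

/* Return 1 if every nested form agrees with its combined form for all
   values below LIMIT, else 0.  */
int
patterns_agree (unsigned limit)
{
  unsigned a;

  for (a = 0; a < limit; a++)
    if (nested_00 (a) != combined_00 (a)
        || nested_10 (a) != combined_10 (a)
        || nested_01 (a) != combined_01 (a))
      return 0;
  return 1;
}
```

Only bits 2 and 3 of the operand matter, so checking a small range of
values covers every combination.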

ChangeLog

2011-10-13  Kai Tietz  

* tree-ssa-ifcombine.c (same_phi_args_p_2): New
helper for new andif pattern edge PHI comparison.
(recognize_single_bit_test): Add new argument and
allow EQ_EXPR.
(ifcombine_ifandif): Handle == 0 cases.
(tree_ssa_ifcombine_bb): Add new ifandif pattern.

2011-10-13  Kai Tietz  

* gcc.dg/tree-ssa/ssa-ifcombine-8.c: New test.
* gcc.dg/tree-ssa/ssa-ifcombine-9.c: New test.
* gcc.dg/tree-ssa/ssa-ifcombine-10.c: New test.
* gcc.dg/tree-ssa/ssa-ifcombine-11.c: New test.
* gcc.dg/tree-ssa/ssa-ifcombine-12.c: New test.

Bootstrapped and regression tested for all languages plus Ada and
Obj-C++ on x86_64-unknown-linux-gnu.
Ok for apply?

Regards,
Kai

Index: gcc/gcc/tree-ssa-ifcombine.c
===
--- gcc.orig/gcc/tree-ssa-ifcombine.c
+++ gcc/gcc/tree-ssa-ifcombine.c
@@ -138,6 +138,54 @@ same_phi_args_p (basic_block bb1, basic_
   return true;
 }

+/* Verify if all PHI node arguments in DEST1 for edges from BB1, BB2
+   or DEST2 to DEST1 are the same.  This makes the CFG merge point
+   free from side-effects.  Return true in this case, else false.
+   If DEST1 is not equal to DEST2, then DEST2 must have an edge to DEST1
+   and must contain neither statements nor PHI nodes of its own.  */
+
+static bool
+same_phi_args_p_2 (basic_block bb1, basic_block bb2, basic_block dest1, basic_block dest2)
+{
+  edge e1 = find_edge (bb1, dest1);
+  edge e2 = find_edge (bb2, dest2);
+  gimple_stmt_iterator gsi1;
+  gimple_stmt_iterator gsi2;
+  gimple phi;
+
+  gsi1 = gsi_start_phis (dest1);
+  if (gsi_end_p (gsi1))
+return (dest1 == dest2);
+
+  /* See if we can't find a PHI in DEST2 and that we find
+ a PHI edge in DEST1 for DEST2.  */
+  gsi2 = gsi_start_phis (dest2);
+  if (gsi_end_p (gsi2))
+{
+  /* If DEST2 has no PHI, then it also must not contain
+ any statements.  */
+  if (last_stmt (dest2) != NULL)
+return false;
+  gsi2 = gsi1;
+  /* See if we can find DEST2 within PHI of DEST1.  */
+  e2 = find_edge (dest2, dest1);
+  if (!e2)
+return false;
+}
+  else if (dest1 != dest2)
+return false;
+
+  for (; !gsi_end_p (gsi1); gsi_next (&gsi1))
+{
+  phi = gsi_stmt (gsi1);
+  if (!operand_equal_p (PHI_ARG_DEF_FROM_EDGE (phi, e1),
+   PHI_ARG_DEF_FROM_EDGE (phi, e2), 0))
+   return false;
+}
+
+  return true;
+}
+
 /* Return the best representative SSA name for CANDIDATE which is used
in a bit test.  */

@@ -165,15 +213,19 @@ get_name_for_bit_test (tree candidate)
 /* Recognize a single bit test pattern in GIMPLE_COND and its defining
statements.  Store the name being tested in *NAME and the bit
in *BIT.  The GIMPLE_COND computes *NAME & (1 << *BIT).
+   The GIMPLE_COND code is either NE_EXPR or EQ_EXPR.
+   IS_CMPEQ will be set to true, if comparison is EQ_EXPR, otherwise
+   to false.
Returns true if the pattern matched, false otherwise.  */

 static bool
-recognize_single_bit_test (gimple cond, tree *name, tree *bit)
+recognize_single_bit_test (gimple cond, tree *name, tree *bit, bool *is_cmpeq)
 {
   gimple stmt;

   /* Get at the definition of the result of the bit test.  */
-  if (gimple_cond_code (cond) != NE_EXPR
+  if ((gimple_cond_code (cond) != NE_EXPR
+   && gimple_cond_code (cond) != EQ_EXPR)
   || TREE_CODE (gimple_cond_lhs (cond)) != SSA_NAME
   || !integer_zerop (gimple_cond_rhs (cond)))
 return false;
@@ -181,6 +233,8 @@ recognize_single_bit_test (gimple cond,
   if (!is_gimple_assign (stmt))
 return false;

+  *is_cmpeq = (gimple_cond_code (cond) == EQ_EXPR);
+
   /* Look at which bit is tested.  One form to recognize is
  D.1985_5 = state_3(D) >> control1_4(D);
  D.1986_6 = (int) D.1985_5;
@@ -306,6 +360,7 @@ ifcombine_ifandif (basic_block inner_con
   gimple_stmt_iterator gsi;
   gimple inner_cond, outer_cond;
   tree name1, name2, bit1, bit2;
+  bool is_cmpeq1, is_cmpeq2;

   inner_cond = last_stmt (inner_cond_bb);
   if (!inner_cond
@@ -321,25 +376,40 @@ ifcombine_ifandif (basic_block inner_con
  that case remove the outer test, merging both else edges,
  and change the inner one to test for
  name & (bit1 | bit2) == (bit1 | bit2).  */
-  if (recognize_single_bit_test (inner_cond, &name1, &bit1)
-  && recognize_single_bit_test (outer_cond, &name2, &bit2)
+  if (recognize_single_bit_test (inner_cond, &name1, &bit1, &is_cmpeq1)
+

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> Same here, I don't like it but I hardly see any alternative.  The only 
> possibility could be to prevent calling expand_compound_operation 
> completely for addresses.  Richard, what do you think?  Don't worry, 
> combine hasn't changed much since your days. :)

The problem wasn't potential changes to combine, but my memory ...

But now I refreshed it and understand the issue.

expand_compound_operation and make_compound_operation were meant to be used
as a pair.  You first do the former, then see if the result can be
simplified, then call the latter in case it couldn't.

In the SET case, we call the latter in simplify_set.  But there's also this
code in combine_simplify_rtx:

case MEM:
  /* Ensure that our address has any ASHIFTs converted to MULT in case
 address-recognizing predicates are called later.  */
  temp = make_compound_operation (XEXP (x, 0), MEM);
  SUBST (XEXP (x, 0), temp);
  break;

THAT'S the code that should do the transformation that this patch contains.
So I'd suggest doing some debugging and seeing why it isn't.  This could
just be a bug in make_compound_operation not handling the SUBREG.


Re: RFC: Add ADD_RESTRICT tree code

2011-10-13 Thread Michael Matz
Hi,

On Thu, 13 Oct 2011, Jakub Jelinek wrote:

> I'd sum up my previous mail as noting that restricted pointers are objects,
> so restrict is not property of expressions.  So e.g. I don't think
> we should add ADD_RESTRICT (or, at least, not an ADD_RESTRICT with different
> tag) on every assignment to a restrict pointer object.

Yes, if you meant to include "from non-restrict pointer objects".

> E.g. the restrict tag for cases we don't handle yet currently
> (GLOBAL_RESTRICT/PARM_RESTRICT) could be based on a DECL_UID (for fields
> on FIELD_DECL uid, I think we are not duplicating the fields anywhere,
> for VAR_DECLs based on DECL_ABSTRACT_ORIGIN's DECL_UID?)

field_decls are shared between different variants of types.  That should 
lead only to more conflicts between restrict pointers so would be 
conservatively correct, but something to keep in mind.

> We probably can't use restrict when we have t->p where p is restrict field
> or int *restrict *q with *q directly, it would be PTA's job to find that
> out.

Right.  Your reading of the standard (and after thinking about it some 
more last night I agree that it can be read like you do, but I still think 
it's an omission and loophole in it, and goes against the intent of 
restrict which was about making it easy to disambiguate for the compiler) 
implies some IMHO severe limitations on restrict from addressable objects, 
because we can't be sure anymore if something didn't change one restrict 
pointer behind our back to some other restrict pointer making them based 
on each other and some other object:

struct S {int * restrict p;};
void foo (struct S *s, struct S *t) {
  s->p[0] = 0;
  t->p[0] = 1;  // undefined if s->p == t->p; the caller was responsible 
// to not do that
}

but:

void foo (struct S *a, struct S *b) {
  *some_global_variable = something_else;
  a->p[0] = 0;
  b->p[0] = 1;  // here a->p == b->p can be well defined, when 
// a == b, and some_global_variable == &a->p.
// Due to that the caller is _not_ responsible to not
// call with a == b.
}

I don't immediately see how we can easily disambiguate s->p[0] from 
t->p[0] while also not disambiguating a->p[0] and b->p[0].


Ciao,
Michael.


[SPARC] Add workaround switch for AT697F processor

2011-10-13 Thread Eric Botcazou
Given that we support the LEON series of processors in 4.6.x and later, it may 
make sense to provide a workaround for the erratum of the AT697F processor.
The compiler isn't supposed to generate the problematic instruction sequences 
in the 4.5 and later series (unlike the 4.4 series) under normal circumstances 
but this cannot be ruled out.

Tested on SPARC/Solaris 8, applied on the mainline and 4.6 branch.
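
The hazard test performed by sparc_reorg in the patch can be restated as a pure predicate over register numbers, which may be easier to read than the RTL walk. This is a sketch that mirrors the conditions in the patch, not code from it: the enum and the function name are hypothetical, and it assumes the SPARC port's numbering in which hard registers above 31 are FP registers, as the REGNO > 31 test in the patch implies.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the double-precision insn codes the patch
   switches on.  */
enum fp_op { FP_ADD, FP_SUB, FP_MUL, FP_DIV, FP_SQRT, FP_OTHER };

/* LOAD_DEST is the register written by a single-word load; OP, DEST,
   SRC1 and SRC2 describe the FP operation that follows it (SRC2 is
   ignored for FP_SQRT).  Returns true if a nop should separate them.  */
static bool
at697f_needs_nop (unsigned load_dest, enum fp_op op,
                  unsigned dest, unsigned src1, unsigned src2)
{
  /* Only loads into odd-numbered FP registers are problematic.  */
  if (load_dest <= 31 || load_dest % 2 == 0)
    return false;

  /* The wrong dependency is on the enclosing double register.  */
  unsigned x = load_dest - 1;

  switch (op)
    {
    case FP_ADD:
    case FP_SUB:
    case FP_MUL:
    case FP_DIV:
      if (src1 != src2)
        /* Cases 1-4: one source is X and the destination is a source.  */
        return (src1 == x || src2 == x) && (dest == src1 || dest == src2);
      /* Case 5: all three operands are X; add and mul only.  */
      return src1 == x && dest == src1 && (op == FP_ADD || op == FP_MUL);

    case FP_SQRT:
      /* Case 6: source and destination are both X.  */
      return src1 == x && dest == src1;

    default:
      return false;
    }
}
```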


2011-10-13  Eric Botcazou  

* doc/invoke.texi (SPARC options): Document -mfix-at697f.
* config/sparc/sparc.opt (mfix-at697f): New option.
* config/sparc/sparc.c (TARGET_MACHINE_DEPENDENT_REORG): Define.
(sparc_reorg): New function.


-- 
Eric Botcazou
Index: config/sparc/sparc.opt
===
--- config/sparc/sparc.opt	(revision 179894)
+++ config/sparc/sparc.opt	(working copy)
@@ -184,6 +184,11 @@ mstd-struct-return
 Target Report RejectNegative Var(sparc_std_struct_return)
 Enable strict 32-bit psABI struct return checking.
 
+mfix-at697f
+Target Report RejectNegative Var(sparc_fix_at697f)
+Enable workaround for single erratum of AT697F processor
+(corresponding to erratum #13 of AT697E processor)
+
 Mask(LITTLE_ENDIAN)
 ;; Generate code for little-endian
 
Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 179894)
+++ config/sparc/sparc.c	(working copy)
@@ -444,6 +444,7 @@ static void sparc_output_mi_thunk (FILE
    HOST_WIDE_INT, tree);
 static bool sparc_can_output_mi_thunk (const_tree, HOST_WIDE_INT,
    HOST_WIDE_INT, const_tree);
+static void sparc_reorg (void);
 static struct machine_function * sparc_init_machine_status (void);
 static bool sparc_cannot_force_const_mem (enum machine_mode, rtx);
 static rtx sparc_tls_get_addr (void);
@@ -582,6 +583,9 @@ char sparc_hard_reg_printed[8];
 #undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
 #define TARGET_ASM_CAN_OUTPUT_MI_THUNK sparc_can_output_mi_thunk
 
+#undef TARGET_MACHINE_DEPENDENT_REORG
+#define TARGET_MACHINE_DEPENDENT_REORG sparc_reorg
+
 #undef TARGET_RTX_COSTS
 #define TARGET_RTX_COSTS sparc_rtx_costs
 #undef TARGET_ADDRESS_COST
@@ -10389,6 +10393,104 @@ sparc_can_output_mi_thunk (const_tree th
   return (vcall_offset >= -32768 || ! fixed_regs[5]);
 }
 
+/* We use the machine specific reorg pass to enable workarounds for errata.  */
+
+static void
+sparc_reorg (void)
+{
+  rtx insn, next;
+
+  /* The only erratum we handle for now is that of the AT697F processor.  */
+  if (!sparc_fix_at697f)
+return;
+
+  /* We need to have the (essentially) final form of the insn stream in order
+ to properly detect the various hazards.  Run delay slot scheduling.  */
+  if (optimize > 0 && flag_delayed_branch)
+dbr_schedule (get_insns ());
+
+  /* Now look for specific patterns in the insn stream.  */
+  for (insn = get_insns (); insn; insn = next)
+{
+  bool insert_nop = false;
+  rtx set;
+
+  /* Look for a single-word load into an odd-numbered FP register.  */
+  if (NONJUMP_INSN_P (insn)
+	  && (set = single_set (insn)) != NULL_RTX
+	  && GET_MODE_SIZE (GET_MODE (SET_SRC (set))) == 4
+	  && MEM_P (SET_SRC (set))
+	  && REG_P (SET_DEST (set))
+	  && REGNO (SET_DEST (set)) > 31
+	  && REGNO (SET_DEST (set)) % 2 != 0)
+	{
+	  /* The wrong dependency is on the enclosing double register.  */
+	  unsigned int x = REGNO (SET_DEST (set)) - 1;
+	  unsigned int src1, src2, dest;
+	  int code;
+
+	  /* If the insn has a delay slot, then it cannot be problematic.  */
+	  next = next_active_insn (insn);
+	  if (NONJUMP_INSN_P (next) && GET_CODE (PATTERN (next)) == SEQUENCE)
+	code = -1;
+	  else
+	{
+	  extract_insn (next);
+	  code = INSN_CODE (next);
+	}
+
+	  switch (code)
+	{
+	case CODE_FOR_adddf3:
+	case CODE_FOR_subdf3:
+	case CODE_FOR_muldf3:
+	case CODE_FOR_divdf3:
+	  dest = REGNO (recog_data.operand[0]);
+	  src1 = REGNO (recog_data.operand[1]);
+	  src2 = REGNO (recog_data.operand[2]);
+	  if (src1 != src2)
+		{
+		  /* Case [1-4]:
+ ld [address], %fx+1
+ FPOPd %f{x,y}, %f{y,x}, %f{x,y}  */
+		  if ((src1 == x || src2 == x)
+		  && (dest == src1 || dest == src2))
+		insert_nop = true;
+		}
+	  else
+		{
+		  /* Case 5:
+			 ld [address], %fx+1
+			 FPOPd %fx, %fx, %fx  */
+		  if (src1 == x
+		  && dest == src1
+		  && (code == CODE_FOR_adddf3 || code == CODE_FOR_muldf3))
+		insert_nop = true;
+		}
+	  break;
+
+	case CODE_FOR_sqrtdf2:
+	  dest = REGNO (recog_data.operand[0]);
+	  src1 = REGNO (recog_data.operand[1]);
+	  /* Case 6:
+			 ld [address], %fx+1
+			 fsqrtd %fx, %fx  */
+	  if (src1 == x && dest == src1)
+		insert_nop = true;
+	  break;
+
+	default:
+	  break;
+	}
+	}
+  else
+	next = NEXT_INSN (insn);
+
+  if (insert_nop)
+	emit_insn_after (gen_nop (), insn);
+}
+}
+
 /* How to allocate a 

Re: [C++ Patch] PR 17212

2011-10-13 Thread Jason Merrill

Why not support it in Obj-C++, too?

Jason


Re: [testsuite] require arm_little_endian in two tests

2011-10-13 Thread Richard Earnshaw
On 13/10/11 00:21, Janis Johnson wrote:
> Tests gcc.target/arm/pr48252.c and gcc.target/arm/neon-vset_lanes8.c
> expect little-endian code and fail when compiled with -mbig-endian.
> This patch skips the test if the current multilib does not generate
> little-endian code.
> 
> I'm not able to run execution tests for -mbig-endian for GCC mainline
> but have tested this patch with CodeSourcery's GCC 4.6.  OK for trunk?
> 
> 
> gcc-20111012-003
> 
> 
> 2011-10-12  Janis Johnson  
> 
>   * gcc.target/arm/pr48252.c: Require arm_little_endian.
>   * gcc.target/arm/neon-vset_lanes8.c: Likewise.
> 
> Index: gcc/testsuite/gcc.target/arm/pr48252.c
> ===
> --- gcc/testsuite/gcc.target/arm/pr48252.c(revision 344214)
> +++ gcc/testsuite/gcc.target/arm/pr48252.c(working copy)
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-require-effective-target arm_neon_hw } */
> +/* { dg-require-effective-target arm_little_endian } */
>  /* { dg-options "-O2" } */
>  /* { dg-add-options arm_neon } */
>  

I can't think of any obvious reason why this should fail in big-endian.

> Index: gcc/testsuite/gcc.target/arm/neon-vset_lanes8.c
> ===
> --- gcc/testsuite/gcc.target/arm/neon-vset_lanes8.c   (revision 344214)
> +++ gcc/testsuite/gcc.target/arm/neon-vset_lanes8.c   (working copy)
> @@ -2,6 +2,7 @@
>  
>  /* { dg-do run } */
>  /* { dg-require-effective-target arm_neon_hw } */
> +/* { dg-require-effective-target arm_little_endian } */
>  /* { dg-options "-O0" } */
>  /* { dg-add-options arm_neon } */
>  

I can see why this fails at present, the test is based on the assumption
that

int8x8_t x = {...}
puts the first element in lane 0 and subsequent elements in consecutive
lanes, *and* that this is equivalent to casting char[8] into a vector.
However, this isn't the case for big-endian.

There are two ways this could be sorted.

1) Change the testcase to:

#include "arm_neon.h"
#include <stdlib.h>
#include <string.h>

signed char x_init[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
signed char y_init[8] = { 1, 2, 3, 16, 5, 6, 7, 8 };

int main (void)
{
  int8x8_t x = vld1_s8(x_init);
  int8x8_t y = vld1_s8(y_init);

  x = vset_lane_s8 (16, x, 3);
  if (memcmp (&x, &y, sizeof (x)) != 0)
abort();
  return 0;
}

2) Change the compiler to make initializers of vectors assign elements
of initializers to consecutive lanes in a vector, rather than the
current behaviour of 'casting' an array of elements to a vector.

While the second would be my preferred change, I suspect it's too hard
to fix, and may well cause code written for other targets to break on
big-endian (altivec for example).
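
The difference between "element i goes to lane i" and "reinterpret an array as a vector" can be modelled without NEON. The sketch below treats a uint64_t as an 8-lane register whose lane i occupies bits 8*i to 8*i+7; that lane model and the function names are assumptions made for illustration, not a description of the actual NEON lane numbering rules.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* True if the host stores the least significant byte first.  */
static bool
host_is_little_endian (void)
{
  uint16_t probe = 1;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 1;
}

/* Load the model register by reinterpreting an 8-byte array, then
   report whether lane I holds array element I for every I.  */
static bool
lanes_follow_array_order (void)
{
  const unsigned char init[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  uint64_t reg;
  memcpy (&reg, init, sizeof reg);
  for (int i = 0; i < 8; i++)
    if (((reg >> (8 * i)) & 0xff) != init[i])
      return false;
  return true;
}
```

On a little-endian host the two orders coincide; on a big-endian host the reinterpreted array lands in the lanes in reverse, which is the kind of mismatch the original test trips over.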

R.



Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini

On 10/13/2011 02:51 PM, Richard Kenner wrote:

>  case MEM:
>    /* Ensure that our address has any ASHIFTs converted to MULT in case
>   address-recognizing predicates are called later.  */
>    temp = make_compound_operation (XEXP (x, 0), MEM);
>    SUBST (XEXP (x, 0), temp);
>    break;
>
> THAT'S the code that should do the transformation that this patch contains.
> So I'd suggest doing some debugging and seeing why it isn't.  This could
> just be a bug in make_compound_operation not handling the SUBREG.


Or being fooled by the 0xfffc masking, perhaps.

Paolo


Re: [patch] --enable-dynamic-string default for mingw-w64 v2

2011-10-13 Thread JonY
On 10/8/2011 23:50, Kai Tietz wrote:
> 2011/10/8 Paolo Carlini:
>> Hi,
>>
>>> Ok, fixed it, I made a very dumb mistake in configure.host, new patch
>>> attached.
>>
>> Patch is still ok with me, if Kai is ok with it (remember for next time: 
>> regenerated files are not posted, are just a distraction)
>>
>> Paolo
> 
> Ok, by me, too.
> 

Ping, did this go in trunk already?




Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-13 Thread Kai Tietz
Hello,

this new version addresses your comments.
In gimplify.c's gimplify_expr we didn't handle the case that operands
of TRUTH-AND/OR/XOR expressions need to have the same operand size when
they are transformed to bitwise-binary operations.  This shows up
for Fortran, as it has more than one boolean-kind type with
different mode sizes.  I added a testcase for this.
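
To illustrate why the operands have to agree in type before a logical operation is turned into a bitwise one, here is a small C sketch. The two typedefs loosely imitate logical kinds of different widths and are hypothetical, as are the function names; the sketch shows the idea only and does not model GIMPLE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two "logical" kinds of different widths.  Hypothetical typedefs.  */
typedef int8_t logical1;
typedef int32_t logical4;

/* Short-circuit form: may be compiled to a conditional branch.  */
static bool
and_branching (logical1 a, logical4 b)
{
  return a && b;
}

/* Branch-free form: bring each operand to the same type (bool) first,
   then combine with a bitwise AND.  */
static bool
and_bitwise (logical1 a, logical4 b)
{
  bool na = (a != 0);
  bool nb = (b != 0);
  return na & nb;
}
```

Applying & to the raw operands would be wrong for inputs such as a = 2, b = 1, and narrowing b = 256 to eight bits would lose its only set bit, which is why the conversion has to happen before the bitwise operation.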

ChangeLog

2011-10-13  Kai Tietz  

* fold-const.c (simple_operand_p_2): New function.
(fold_truthop): Rename to
(fold_truth_andor_1): function name.
Additionally remove branching creation for logical and/or.
(fold_truth_andor): Handle branching creation for logical and/or here.
* gimplify.c (gimplify_expr): Take care that for bitwise-binary
transformation the operands have compatible types.

2011-10-13  Kai Tietz  

* gfortran.fortran-torture/compile/logical-2.f90: New test.

Bootstrapped and regression-tested for all languages plus Ada and
Obj-C++ on x86_64-pc-linux-gnu.
Ok for apply?

Regards,
Kai

Index: gcc/gcc/fold-const.c
===
--- gcc.orig/gcc/fold-const.c
+++ gcc/gcc/fold-const.c
@@ -112,13 +112,13 @@ static tree decode_field_reference (loca
 static int all_ones_mask_p (const_tree, int);
 static tree sign_bit_p (tree, const_tree);
 static int simple_operand_p (const_tree);
+static bool simple_operand_p_2 (tree);
 static tree range_binop (enum tree_code, tree, tree, int, tree, int);
 static tree range_predecessor (tree);
 static tree range_successor (tree);
 static tree fold_range_test (location_t, enum tree_code, tree, tree, tree);
 static tree fold_cond_expr_with_comparison (location_t, tree, tree,
tree, tree);
 static tree unextend (tree, int, int, tree);
-static tree fold_truthop (location_t, enum tree_code, tree, tree, tree);
 static tree optimize_minmax_comparison (location_t, enum tree_code,
tree, tree, tree);
 static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *);
@@ -3500,7 +3500,7 @@ optimize_bit_field_compare (location_t l
   return lhs;
 }
 
-/* Subroutine for fold_truthop: decode a field reference.
+/* Subroutine for fold_truth_andor_1: decode a field reference.

If EXP is a comparison reference, we return the innermost reference.

@@ -3668,7 +3668,7 @@ sign_bit_p (tree exp, const_tree val)
   return NULL_TREE;
 }

-/* Subroutine for fold_truthop: determine if an operand is simple enough
+/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough
to be evaluated unconditionally.  */

 static int
@@ -3678,7 +3678,7 @@ simple_operand_p (const_tree exp)
   STRIP_NOPS (exp);

   return (CONSTANT_CLASS_P (exp)
- || TREE_CODE (exp) == SSA_NAME
+ || TREE_CODE (exp) == SSA_NAME
  || (DECL_P (exp)
  && ! TREE_ADDRESSABLE (exp)
  && ! TREE_THIS_VOLATILE (exp)
@@ -3692,6 +3692,46 @@ simple_operand_p (const_tree exp)
 registers aren't expensive.  */
  && (! TREE_STATIC (exp) || DECL_REGISTER (exp))));
 }
+
+/* Subroutine for fold_truth_andor: determine if an operand is simple enough
+   to be evaluated unconditionally.
+   In addition to simple_operand_p, we assume that comparisons and logic-not
+   operations are simple, if their operands are simple, too.  */
+
+static bool
+simple_operand_p_2 (tree exp)
+{
+  enum tree_code code;
+
+  /* Strip any conversions that don't change the machine mode.  */
+  STRIP_NOPS (exp);
+
+  code = TREE_CODE (exp);
+
+  if (TREE_CODE_CLASS (code) == tcc_comparison)
+return (!tree_could_trap_p (exp)
+   && simple_operand_p_2 (TREE_OPERAND (exp, 0))
+   && simple_operand_p_2 (TREE_OPERAND (exp, 1)));
+
+  if (TREE_SIDE_EFFECTS (exp)
+  || tree_could_trap_p (exp))
+return false;
+
+  switch (code)
+{
+case SSA_NAME:
+  return true;
+case TRUTH_NOT_EXPR:
+  return simple_operand_p_2 (TREE_OPERAND (exp, 0));
+case BIT_NOT_EXPR:
+  if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE)
+   return false;
+  return simple_operand_p_2 (TREE_OPERAND (exp, 0));
+default:
+  return simple_operand_p (exp);
+}
+}
+
 
 /* The following functions are subroutines to fold_range_test and allow it to
try to change a logical combination of comparisons into a range test.
@@ -4888,7 +4928,7 @@ fold_range_test (location_t loc, enum tr
   return 0;
 }
 
-/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P
+/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P
bit value.  Arrange things so the extra bits will be set to zero if and
only if C is signed-extended to its full width.  If MASK is nonzero,
it is an INTEGER_CST that should be AND'ed with the extra bits.  */
@@ -5025,8 +5065,8 @@ merge_truthop_with_opposite_arm (locatio
We return the simplified tree or 0 if no optimization is possible.  */

 stati

[PATCH] Handle COND_EXPR/VEC_COND_EXPR in walk_stmt_load_store_addr_ops and ssa verification

2011-10-13 Thread Jakub Jelinek
Hi!

Andrew mentioned on IRC he found walk_stmt_load_store_addr_ops
doesn't handle COND_EXPR weirdo first argument well, the following
patch is an attempt to handle that.

I've noticed similar spot in verify_ssa, though in that case I'm not
sure about whether the change is so desirable, as it doesn't seem to
handle SSA_NAMEs embedded in MEM_EXPRs, ARRAY_REFs etc. either.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Or just the gimple.c part?

2011-10-13  Jakub Jelinek  

* gimple.c (walk_stmt_load_store_addr_ops): Call visit_addr
also on COND_EXPR/VEC_COND_EXPR comparison operands if they are
ADDR_EXPRs.

* tree-ssa.c (verify_ssa): For COND_EXPR/VEC_COND_EXPR count
SSA_NAMEs in comparison operand as well.

--- gcc/gimple.c.jj 2011-10-13 11:13:39.0 +0200
+++ gcc/gimple.c2011-10-13 11:15:25.0 +0200
@@ -5313,9 +5313,24 @@ walk_stmt_load_store_addr_ops (gimple st
   || gimple_code (stmt) == GIMPLE_COND))
 {
   for (i = 0; i < gimple_num_ops (stmt); ++i)
-   if (gimple_op (stmt, i)
-   && TREE_CODE (gimple_op (stmt, i)) == ADDR_EXPR)
- ret |= visit_addr (stmt, TREE_OPERAND (gimple_op (stmt, i), 0), data);
+   {
+ tree op = gimple_op (stmt, i);
+ if (op == NULL_TREE)
+   ;
+ else if (TREE_CODE (op) == ADDR_EXPR)
+   ret |= visit_addr (stmt, TREE_OPERAND (op, 0), data);
+ /* COND_EXPR and VCOND_EXPR rhs1 argument is a comparison
+tree with two operands.  */
+ else if (i == 1 && COMPARISON_CLASS_P (op))
+   {
+ if (TREE_CODE (TREE_OPERAND (op, 0)) == ADDR_EXPR)
+   ret |= visit_addr (stmt, TREE_OPERAND (TREE_OPERAND (op, 0),
+  0), data);
+ if (TREE_CODE (TREE_OPERAND (op, 1)) == ADDR_EXPR)
+   ret |= visit_addr (stmt, TREE_OPERAND (TREE_OPERAND (op, 1),
+  0), data);
+   }
+   }
 }
   else if (is_gimple_call (stmt))
 {
--- gcc/tree-ssa.c.jj   2011-10-07 10:03:28.0 +0200
+++ gcc/tree-ssa.c  2011-10-13 11:19:30.0 +0200
@@ -1069,14 +1069,27 @@ verify_ssa (bool check_modified_stmt)
  for (i = 0; i < gimple_num_ops (stmt); i++)
{
  op = gimple_op (stmt, i);
- if (op && TREE_CODE (op) == SSA_NAME && --count < 0)
+ if (op == NULL_TREE)
+   continue;
+ if (TREE_CODE (op) == SSA_NAME)
+   --count;
+ /* COND_EXPR and VCOND_EXPR rhs1 argument is a comparison
+tree with two operands.  */
+ else if (i == 1 && COMPARISON_CLASS_P (op))
{
- error ("number of operands and imm-links don%'t agree"
-" in statement");
- print_gimple_stmt (stderr, stmt, 0, TDF_VOPS|TDF_MEMSYMS);
- goto err;
+ if (TREE_CODE (TREE_OPERAND (op, 0)) == SSA_NAME)
+   --count;
+ if (TREE_CODE (TREE_OPERAND (op, 1)) == SSA_NAME)
+   --count;
}
}
+ if (count < 0)
+   {
+ error ("number of operands and imm-links don%'t agree"
+" in statement");
+ print_gimple_stmt (stderr, stmt, 0, TDF_VOPS|TDF_MEMSYMS);
+ goto err;
+   }
 
  FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE|SSA_OP_VUSE)
{

Jakub


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-13 Thread Michael Matz
Hi,

On Thu, 13 Oct 2011, Kai Tietz wrote:

> this new version addresses the comments from Michael and additionally fixes
> a latent issue shown up by this rewrite in fold-const.
> In gimplify.c's gimple_boolify we didn't handle the case that operands
> for TRUTH-expressions need to have the same operand size for transformation to
> bitwise operation.

The requirement comes from BIT_AND_EXPR, not from any of the TRUTH_* 
expressions, hence the point of generating the BIT_AND_EXPR is the point 
to fixup the types.  Similar to this (fixes your testcase):

Index: gimplify.c
===
--- gimplify.c  (revision 179855)
+++ gimplify.c  (working copy)
   /* See if any simplifications can be done based on what the RHS is.  */
@@ -7257,6 +7264,18 @@ gimplify_expr (tree *expr_p, gimple_seq
  {
tree orig_type = TREE_TYPE (*expr_p);
*expr_p = gimple_boolify (*expr_p);
+   /* We are going to transform this into BIT operations,
+  which have stricter requirements on the operand types.  */
+   if (!useless_type_conversion_p
+(orig_type, TREE_TYPE (TREE_OPERAND (*expr_p, 0))))
+ TREE_OPERAND (*expr_p, 0)
+   = fold_convert_loc (input_location, orig_type,
+   TREE_OPERAND (*expr_p, 0));
+   if (!useless_type_conversion_p
+(orig_type, TREE_TYPE (TREE_OPERAND (*expr_p, 1))))
+ TREE_OPERAND (*expr_p, 1)
+   = fold_convert_loc (input_location, orig_type,
+   TREE_OPERAND (*expr_p, 1));
if (!useless_type_conversion_p (orig_type, TREE_TYPE (*expr_p)))
  {
*expr_p = fold_convert_loc (input_location, orig_type, *expr_p);


Ciao,
Michael.


[PATCH] Optimize V8HImode UMIN reduction using PHMINPOSUW insn and some cleanup

2011-10-13 Thread Jakub Jelinek
Hi!

This patch is partly taken from my part of the PR50374 patch,
though that patch will need some further work in the vectorizer
etc.

SSE4.1 has the phminposuw insn which can be used for reduction
instead of 3 shuffles and 3 min insns:
...
-   vpsrldq $8, %xmm0, %xmm1
-   vpminuw %xmm1, %xmm0, %xmm0
-   vpsrldq $4, %xmm0, %xmm1
-   vpminuw %xmm0, %xmm1, %xmm0
-   vpsrldq $2, %xmm0, %xmm1
-   vpminuw %xmm0, %xmm1, %xmm0
+   vphminposuw %xmm0, %xmm0
vpextrw $0, %xmm0, %eax

E.g.
#define N 32
unsigned short b[N];
__attribute__((noinline)) unsigned short
vecumin (void)
{
  int i;
  unsigned short r = 65535;
  for (i = 0; i < N; ++i)
if (r > b[i]) r = b[i];
  return r;
}
function got ~ 12.5% faster when executing it 10x
on SandyBridge.  The insn doesn't have 256-bit counterpart
in AVX unfortunately, so it is left for V8HImode only.
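
For readers who want to see why the three shift/min pairs compute the same value as one horizontal minimum, here is a scalar C model. It uses plain arrays instead of SSE registers and does not call any intrinsic; LANES, shift_and_min and the two reduction functions are names invented for this sketch.

```c
#include <stdint.h>
#include <string.h>

#define LANES 8

/* One halving step: shift the vector down by SHIFT lanes with zero
   fill (like a logical right shift of the whole register) and take the
   lane-wise unsigned minimum with the unshifted vector.  */
static void
shift_and_min (uint16_t v[LANES], int shift)
{
  uint16_t shifted[LANES];
  for (int i = 0; i < LANES; i++)
    shifted[i] = (i + shift < LANES) ? v[i + shift] : 0;
  for (int i = 0; i < LANES; i++)
    v[i] = (shifted[i] < v[i]) ? shifted[i] : v[i];
}

/* Reduction as three shift+min steps; only lane 0 of the result is
   meaningful, the upper lanes hold leftovers.  */
static uint16_t
umin_by_halving (const uint16_t in[LANES])
{
  uint16_t v[LANES];
  memcpy (v, in, sizeof v);
  shift_and_min (v, 4);
  shift_and_min (v, 2);
  shift_and_min (v, 1);
  return v[0];
}

/* Direct horizontal minimum: the value a single horizontal-minimum
   instruction would leave in lane 0.  */
static uint16_t
umin_direct (const uint16_t in[LANES])
{
  uint16_t r = in[0];
  for (int i = 1; i < LANES; i++)
    if (in[i] < r)
      r = in[i];
  return r;
}
```

Lane 0 never reads a zero-filled lane in any of the three steps, so the zero fill in the upper lanes does not disturb the result.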

The other change is just a cleanup of ix86_expand_reduc.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-10-13  Jakub Jelinek  

* config/i386/sse.md (reduc_umin_v8hi): New pattern.
* config/i386/i386.c (ix86_build_const_vector): Handle
also V32QI, V16QI, V16HI and V8HI modes.
(emit_reduc_half): New function.
(ix86_expand_reduc): Use phminposuw insn for V8HImode UMIN.
Use emit_reduc_half helper function.

* gcc.target/i386/sse4_1-phminposuw-2.c: New test.
* gcc.target/i386/sse4_1-phminposuw-3.c: New test.
* gcc.target/i386/avx-vphminposuw-2.c: New test.
* gcc.target/i386/avx-vphminposuw-3.c: New test.

--- gcc/config/i386/sse.md.jj   2011-10-13 11:13:41.0 +0200
+++ gcc/config/i386/sse.md  2011-10-13 12:26:13.0 +0200
@@ -1303,6 +1303,16 @@ (define_expand "reduc__"
   DONE;
 })
 
+(define_expand "reduc_umin_v8hi"
+  [(umin:V8HI
+ (match_operand:V8HI 0 "register_operand" "")
+ (match_operand:V8HI 1 "register_operand" ""))]
+  "TARGET_SSE4_1"
+{
+  ix86_expand_reduc (gen_uminv8hi3, operands[0], operands[1]);
+  DONE;
+})
+
 ;
 ;;
 ;; Parallel floating point comparisons
--- gcc/config/i386/i386.c.jj   2011-10-13 11:13:41.0 +0200
+++ gcc/config/i386/i386.c  2011-10-13 11:56:19.0 +0200
@@ -17008,6 +17008,10 @@ ix86_build_const_vector (enum machine_mo
 
   switch (mode)
 {
+case V32QImode:
+case V16QImode:
+case V16HImode:
+case V8HImode:
 case V8SImode:
 case V4SImode:
 case V4DImode:
@@ -33250,72 +33254,100 @@ ix86_expand_vector_extract (bool mmx_ok,
 }
 }
 
-/* Expand a vector reduction.  FN is the binary pattern to reduce;
-   DEST is the destination; IN is the input vector.  */
+/* Generate code to copy vector bits i / 2 ... i - 1 from vector SRC
+   to bits 0 ... i / 2 - 1 of vector DEST, which has the same mode.
+   The upper bits of DEST are undefined, though they shouldn't cause
+   exceptions (some bits from src or all zeros are ok).  */
 
-void
-ix86_expand_reduc (rtx (*fn) (rtx, rtx, rtx), rtx dest, rtx in)
+static void
+emit_reduc_half (rtx dest, rtx src, int i)
 {
-  rtx tmp1, tmp2, tmp3, tmp4, tmp5;
-  enum machine_mode mode = GET_MODE (in);
-  int i;
-
-  tmp1 = gen_reg_rtx (mode);
-  tmp2 = gen_reg_rtx (mode);
-  tmp3 = gen_reg_rtx (mode);
-
-  switch (mode)
+  rtx tem;
+  switch (GET_MODE (src))
 {
 case V4SFmode:
-  emit_insn (gen_sse_movhlps (tmp1, in, in));
-  emit_insn (fn (tmp2, tmp1, in));
-  emit_insn (gen_sse_shufps_v4sf (tmp3, tmp2, tmp2,
- const1_rtx, const1_rtx,
- GEN_INT (1+4), GEN_INT (1+4)));
+  if (i == 128)
+   tem = gen_sse_movhlps (dest, src, src);
+  else
+   tem = gen_sse_shufps_v4sf (dest, src, src, const1_rtx, const1_rtx,
+  GEN_INT (1 + 4), GEN_INT (1 + 4));
+  break;
+case V2DFmode:
+  tem = gen_vec_interleave_highv2df (dest, src, src);
+  break;
+case V16QImode:
+case V8HImode:
+case V4SImode:
+case V2DImode:
+  tem = gen_sse2_lshrv1ti3 (gen_lowpart (V1TImode, dest),
+   gen_lowpart (V1TImode, src),
+   GEN_INT (i / 2));
   break;
 case V8SFmode:
-  tmp4 = gen_reg_rtx (mode);
-  tmp5 = gen_reg_rtx (mode);
-  emit_insn (gen_avx_vperm2f128v8sf3 (tmp4, in, in, const1_rtx));
-  emit_insn (fn (tmp5, tmp4, in));
-  emit_insn (gen_avx_shufps256 (tmp1, tmp5, tmp5, GEN_INT (2+12)));
-  emit_insn (fn (tmp2, tmp1, tmp5));
-  emit_insn (gen_avx_shufps256 (tmp3, tmp2, tmp2, const1_rtx));
+  if (i == 256)
+   tem = gen_avx_vperm2f128v8sf3 (dest, src, src, const1_rtx);
+  else
+   tem = gen_avx_shufps256 (dest, src, src,
+GEN_INT (i == 128 ? 2 + (3 << 2) : 1));
   break;
 case V4DFmode:
-  emit_insn (gen_avx_vperm2f128v4df3 (tmp1, i

Re: [patch] --enable-dynamic-string default for mingw-w64 v2

2011-10-13 Thread Paolo Carlini
> 
> Ping, did this go in trunk already?

I would be surprised to see this happening if nobody like you or Kai actually 
does the commit ;)

P


Re: [patch] --enable-dynamic-string default for mingw-w64 v2

2011-10-13 Thread NightStrike
On Thu, Oct 13, 2011 at 9:47 AM, Paolo Carlini  wrote:
>>
>> Ping, did this go in trunk already?
>
> I would be surprised to see this happening if nobody like you or Kai actually 
> does the commit ;)
>
> P
>

Does Jon have commit access?


Re: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-13 Thread Kai Tietz
Yes, I have already sent a patch with Richard's wish.  Indeed we need
only to do this type-casting for operands on transcription of
TRUTH_(AND|OR|XOR)_EXPR to BIT_(AND|OR|XOR)_EXPR.

Cheers,
Kai


[C/C++] Fix PR c++/50608

2011-10-13 Thread Eric Botcazou
Hi,

this is a regression present on the mainline and 4.6 branch, introduced by the 
offsetof folding change.  The compiler now rejects:

int fails = __builtin_offsetof (C, b.offset);

error: cannot apply 'offsetof' to a non constant address

whereas it still accepts:

int works = (int)(&(((C*)0)->b.offset));


The problem is a lack of associativity of the offsetof folding: the first case 
is built as

  offsetof (offsetof (C.b), offset)

whereas the second is built as

  offsetof (C, b.offset)

so, in the first case, the second offsetof has a non-NULL base, which triggers 
the error about the non-constant address.

It turns out that fold_offsetof_1 already has the code to handle a non-null 
base, but it is unused because of the integer_zerop test.  The proposed fix is 
therefore to remove the limitation, which in turn makes it possible to factor 
out the common handling of a non-null base and finally to get rid of the 
STOP_REF argument.
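
The associativity that the folding is expected to respect can be shown with the standard offsetof macro in C. The struct layout below is a hypothetical stand-in for the PR's class C and its member b, and the sketch assumes the compiler accepts a nested member designator in offsetof, which GCC does.

```c
#include <stddef.h>

/* Hypothetical layout standing in for the PR's class C with member b.  */
struct B
{
  char tag;
  int offset;
};

struct C
{
  double pad;
  struct B b;
};

/* Offset computed in one step with a nested member designator.  */
static size_t
offset_one_step (void)
{
  return offsetof (struct C, b.offset);
}

/* Same offset computed as an outer offset plus an inner offset.  */
static size_t
offset_two_steps (void)
{
  return offsetof (struct C, b) + offsetof (struct B, offset);
}
```

Both functions are expected to return the same value, whichever way the compiler builds the expression internally.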

Bootstrapped/regtested on x86_64-suse-linux, OK for mainline and 4.6 branch?


2011-10-13  Eric Botcazou  

PR c++/50608
* c-parser.c (c_parser_postfix_expression) : Adjust call
to fold_offsetof.
* c-typeck.c (build_unary_op) : Call fold_offsetof_1.
c-family/
* c-common.c (c_fully_fold_internal) : Call fold_offsetof_1.
(fold_offsetof_1): Make global.  Remove STOP_REF argument and adjust.
: Return the argument.
: Remove special code for negative offset.
Call fold_build_pointer_plus instead of size_binop.
(fold_offsetof): Remove STOP_REF argument and adjust.
* c-common.h (fold_offsetof_1): Declare.
(fold_offsetof): Remove STOP_REF argument.
cp/
* semantics.c (finish_offsetof): Adjust call to fold_offsetof.
* typeck.c (cp_build_addr_expr_1): Call fold_offsetof_1.


2011-10-13  Eric Botcazou  

* g++.dg/other/offsetof7.C: New test.


-- 
Eric Botcazou
Index: c-parser.c
===
--- c-parser.c	(revision 179844)
+++ c-parser.c	(working copy)
@@ -6388,7 +6388,7 @@ c_parser_postfix_expression (c_parser *p
 	  c_parser_error (parser, "expected identifier");
 	c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
    "expected %<)%>");
-	expr.value = fold_offsetof (offsetof_ref, NULL_TREE);
+	expr.value = fold_offsetof (offsetof_ref);
 	  }
 	  break;
 	case RID_CHOOSE_EXPR:
Index: c-typeck.c
===
--- c-typeck.c	(revision 179844)
+++ c-typeck.c	(working copy)
@@ -3890,10 +3890,7 @@ build_unary_op (location_t location,
   if (val && TREE_CODE (val) == INDIRECT_REF
   && TREE_CONSTANT (TREE_OPERAND (val, 0)))
 	{
-	  tree op0 = fold_offsetof (arg, val), op1;
-
-	  op1 = fold_convert_loc (location, argtype, TREE_OPERAND (val, 0));
-	  ret = fold_build_pointer_plus_loc (location, op1, op0);
+	  ret = fold_convert_loc (location, argtype, fold_offsetof_1 (arg));
 	  goto return_build_unary_op;
 	}
 
Index: c-family/c-common.c
===
--- c-family/c-common.c	(revision 179844)
+++ c-family/c-common.c	(working copy)
@@ -1272,12 +1272,7 @@ c_fully_fold_internal (tree expr, bool i
 	  && (op1 = get_base_address (op0)) != NULL_TREE
 	  && TREE_CODE (op1) == INDIRECT_REF
 	  && TREE_CONSTANT (TREE_OPERAND (op1, 0)))
-	{
-	  tree offset = fold_offsetof (op0, op1);
-	  op1
-	= fold_convert_loc (loc, TREE_TYPE (expr), TREE_OPERAND (op1, 0));
-	  ret = fold_build_pointer_plus_loc (loc, op1, offset);
-	}
+	ret = fold_convert_loc (loc, TREE_TYPE (expr), fold_offsetof_1 (op0));
   else if (op0 != orig_op0 || in_init)
 	ret = in_init
 	  ? fold_build1_initializer_loc (loc, code, TREE_TYPE (expr), op0)
@@ -8547,20 +8542,15 @@ c_common_to_target_charset (HOST_WIDE_IN
 return uc;
 }
 
-/* Build the result of __builtin_offsetof.  EXPR is a nested sequence of
-   component references, with STOP_REF, or alternatively an INDIRECT_REF of
-   NULL, at the bottom; much like the traditional rendering of offsetof as a
-   macro.  Returns the folded and properly cast result.  */
+/* Fold an offsetof-like expression.  EXPR is a nested sequence of component
+   references with an INDIRECT_REF of a constant at the bottom; much like the
+   traditional rendering of offsetof as a macro.  Return the folded result.  */
 
-static tree
-fold_offsetof_1 (tree expr, tree stop_ref)
+tree
+fold_offsetof_1 (tree expr)
 {
-  enum tree_code code = PLUS_EXPR;
   tree base, off, t;
 
-  if (expr == stop_ref && TREE_CODE (expr) != ERROR_MARK)
-return size_zero_node;
-
   switch (TREE_CODE (expr))
 {
 case ERROR_MARK:
@@ -8577,15 +8567,15 @@ fold_offsetof_1 (tree expr, tree stop_re
 
 case NOP_EXPR:
 case INDIRECT_REF:
-  if (!integer_zerop (TREE_OPERAND (expr, 0)))
+  if (!TREE_CONSTANT (TREE_OPERAND (expr, 0)))
 	{
 	  error ("cannot apply % to a non constant ad

Re: [C++ Patch] PR 17212

2011-10-13 Thread Paolo Carlini
Hi,

> Why not support it in Obj-C++, too?

Yes I briefly wondered that but I know *so* little about that front end... Do 
you think we can just add it? Probably yes ;)

Paolo


Re: [patch] --enable-dynamic-string default for mingw-w64 v2

2011-10-13 Thread Kai Tietz
2011/10/13 Paolo Carlini :
>>
>> Ping, did this go in trunk already?
>
> I would be surprised to see this happening if nobody like you or Kai actually 
> does the commit ;)
>
> P

I will take care to apply it.

Kai


Re: RFC: Add ADD_RESTRICT tree code

2011-10-13 Thread Jakub Jelinek
On Thu, Oct 13, 2011 at 02:57:56PM +0200, Michael Matz wrote:
> struct S {int * restrict p;};
> void foo (struct S *s, struct S *t) {
>   s->p[0] = 0;
>   t->p[0] = 1;  // undefined if s->p == t->p; the caller was responsible 
> // to not do that

This is undefined only if s->p == t->p && &s->p != &t->p.  If both
s->p and t->p designate the same restricted pointer object,
it is fine.  It is just fine to call the above with:
  struct S u;
  u.p = p;
  foo (&u, &u);
but not with:
  struct S u, v;
  u.p = p;
  v.p = p;
  foo (&u, &v);

If you change it to
void foo (struct S *restrict s, struct S *restrict t)
then obviously even calling it with foo (&u, &u) is invalid.
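
A self-contained sketch of the first (valid) call, under the reading of the
standard argued for above.  The helper name call_with_same_object is made
up for illustration:

```c
struct S { int *restrict p; };

void foo (struct S *s, struct S *t)
{
  s->p[0] = 0;
  t->p[0] = 1;
}

/* Hypothetical helper: s->p and t->p designate the same restricted
   pointer object (u.p), so both stores go through one restrict pointer.  */
int call_with_same_object (void)
{
  int x = -1;
  struct S u;
  u.p = &x;
  foo (&u, &u);
  return x;
}
```

The second variant, with two distinct objects u and v whose p members hold
the same address, is the one that is undefined, so it is deliberately not
shown as runnable code.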

Jakub


Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> Or being fooled by the 0xfffc masking, perhaps.

No, I'm pretty sure that's NOT the case.  The *whole point* of the
routine is to deal with that masking.


Re: [C++ Patch] PR 17212

2011-10-13 Thread Jason Merrill

On 10/13/2011 09:53 AM, Paolo Carlini wrote:

> Yes I briefly wondered that but I know *so* little about that front end... Do 
> you think we can just add it? Probably yes ;)


Definitely.  Anything supported in C++ should also be in Obj-C++ by default.

Jason



Re: [PATCH][ARM] -m{cpu,tune,arch}=native

2011-10-13 Thread Andrew Stubbs

Ping.

On 20/09/11 11:51, Andrew Stubbs wrote:

On 09/09/11 12:55, Richard Earnshaw wrote:

The part number field is meaningless outside of the context of a
specific vendor -- only taken as a pair can they refer to a specific
part. So why is the vendor field hard-coded rather than factored into
the table of parts?

Maybe it would be better to have a table of tables, with the top-level
table being indexed by vendor id. Something like


Yes, but since I only have part numbers for one vendor, I left that sort
of thing out on the principle that it's best not to add complexity until
you need it.

Anyway, I have done it now, so here it is. :)

I've also fixed the problem that if it didn't recognise the CPU, it
defaulted to the hard default, ignoring the --with-cpu configured default.
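
A minimal sketch of the table-of-tables idea with a configured fallback.
This is not the code from the patch: the struct and function names are
hypothetical, and the few ID values are only there to make the example
self-contained.

```c
#include <stddef.h>

/* Hypothetical layout; the actual patch may differ.  */
struct part { unsigned part_no; const char *name; };
struct vendor { unsigned vendor_id; const struct part *parts; };

/* Illustrative entries only, each list terminated by a NULL name.  */
static const struct part arm_parts[] = {
  { 0xc08, "cortex-a8" },
  { 0xc09, "cortex-a9" },
  { 0, NULL }
};

/* Top-level table indexed by vendor id, terminated by a NULL parts list.  */
static const struct vendor vendors[] = {
  { 0x41, arm_parts },
  { 0, NULL }
};

/* Look up (vendor, part) as a pair; return FALLBACK (standing in for the
   configured default CPU) when the pair is not recognised.  */
const char *lookup_cpu (unsigned vendor_id, unsigned part_no,
                        const char *fallback)
{
  for (const struct vendor *v = vendors; v->parts != NULL; v++)
    if (v->vendor_id == vendor_id)
      for (const struct part *p = v->parts; p->name != NULL; p++)
        if (p->part_no == part_no)
          return p->name;
  return fallback;
}
```

Keeping the fallback as a parameter mirrors the second fix mentioned above:
an unrecognised CPU yields the configured default rather than a hard-coded
one.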

OK?

Andrew




[Patch,AVR] Fix PR46278, Take #3

2011-10-13 Thread Georg-Johann Lay
This is yet another attempt to fix PR46278 (fake X addressing).

After the previous clean-ups it is just a small change.

caller-save.c tries to eliminate call-clobbered hard registers allocated to
pseudos around function calls, and that leads to situations where reload is no
longer able to perform all requested spills because AVR has so few address
registers.

Thus, the patch adds a new target option -mstrict-X so that the user can turn
it on if desired, and then -fcaller-saves is disabled.

The patch passes the testsuite without regressions. Moreover, the testsuite
passes without regressions if all test cases are run with -mstrict-X and all
libraries (libgcc, avr-libc) are built with the new option turned on.

The sizes from the test cases attached to the PR are:

 > avr-gcc vektor-zeichen-i.c -c -std=gnu99 -Os -mmcu=avr4 -mno-strict-X &&
avr-size vektor-zeichen-i.o

   text    data     bss     dec     hex filename
   1084       0     190    1274     4fa vektor-zeichen-i.o

 > avr-gcc vektor-zeichen-i.c -c -std=gnu99 -Os -mmcu=avr4 -mstrict-X &&
avr-size vektor-zeichen-i.o

   text    data     bss     dec     hex filename
    732       0     190     922     39a vektor-zeichen-i.o

 > avr-gcc snake.c -c -std=gnu99 -Os -mmcu=avr4 -mno-strict-X && avr-size 
 > snake.o

   text    data     bss     dec     hex filename
   1537       0       0    1537     601 snake.o

 > avr-gcc snake.c -c -std=gnu99 -Os -mmcu=avr4 -mstrict-X && avr-size snake.o

   text    data     bss     dec     hex filename
   1417       0       0    1417     589 snake.o


So these programs get smaller; similarly for -O2, where the first test case
shrinks by 30%.

Even the test case testsuite/gcc.c-torture/compile/950612-1.c, which caused
spill failures with earlier patches, shrinks:

 > avr-gcc 950612-1.c -c -std=gnu99 -Os -mmcu=avr4 -mno-strict-X -save-temps
-dp && avr-size 950612-1.o

   text    data     bss     dec     hex filename
   7101       0       0    7101    1bbd 950612-1.o

 > avr-gcc 950612-1.c -c -std=gnu99 -Os -mmcu=avr4 -mstrict-X -save-temps -dp
&& avr-size 950612-1.o

   text    data     bss     dec     hex filename
   6931       0       0    6931    1b13 950612-1.o

And again similarly for -O2.

For the snake test case, there is room for improvement. The prologue with -Os
-mstrict-X reads

onRedraw_snake:
push r13
push r14
push r15
push r16
push r17
push r28
push r29
rcall .
rcall .
in r28,__SP_L__
in r29,__SP_H__
/* prologue: function */
/* frame size = 4 */
/* stack size = 11 */

and there is a frame set up without need. The variables put in the frame could
just as well live in the remaining hard registers, saving the frame pointer
setup and the frame accesses altogether.

I guess this is fallout from IRA, which assigns to stack slots, and the program
is too complex for reload to fix that. But I see similar bloat (setting up FP
without need, sometimes even without using it) for programs without this patch,
too. So it's not caused by this patch but is a general IRA/reload flaw.

The results are quite promising IMHO and I'd like to know what you think about
it and maybe it's already fine to apply?

Johann

PR target/46278
* config/avr/avr.c (avr_reg_ok_for_addr_p): Add parameter
outer_code and pass it down to avr_regno_mode_code_ok_for_base_p.
(avr_legitimate_address_p): Pass outer_code to
avr_reg_ok_for_addr_p and use that function in case PLUS.
(avr_mode_code_base_reg_class): Depend on avr_strict_X.
(avr_regno_mode_code_ok_for_base_p): Ditto, and depend on outer_code.
(avr_option_override): Disable -fcaller-saves if -mstrict-X is on.
* config/avr/avr.opt (-mstrict-X): New option.
(avr_strict_X): New variable reflecting -mstrict-X.
* doc/invoke.texi (AVR Options): Document -mstrict-X.
Index: config/avr/avr.opt
===
--- config/avr/avr.opt	(revision 179842)
+++ config/avr/avr.opt	(working copy)
@@ -61,3 +61,7 @@ Relax branches
 mpmem-wrap-around
 Target Report
 Make the linker relaxation machine assume that a program counter wrap-around occurs.
+
+mstrict-X
+Target Report Var(avr_strict_X) Init(0)
+When accessing RAM, use X as imposed by the hardware, i.e. just use pre-decrement, post-increment and indirect addressing with the X register.  Without this option, the compiler may assume that there is an addressing mode X+const similar to Y+const and Z+const and emit instructions to emulate such an addressing mode for X.
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 179843)
+++ config/avr/avr.c	(working copy)
@@ -351,6 +351,17 @@ avr_option_override (void)
 {
   flag_delete_null_pointer_checks = 0;
 
+  /* caller-save.c looks for call-clobbered hard registers that are assigned
+ to pseudos that c

Re: [PATCH] Optimize V8HImode UMIN reduction using PHMINPOSUW insn and some cleanup

2011-10-13 Thread Richard Henderson
On 10/13/2011 06:44 AM, Jakub Jelinek wrote:
>   * config/i386/sse.md (reduc_umin_v8hi): New pattern.
>   * config/i386/i386.c (ix86_build_const_vector): Handle
>   also V32QI, V16QI, V16HI and V8HI modes.
>   (emit_reduc_half): New function.
>   (ix86_expand_reduc): Use phminposuw insn for V8HImode UMIN.
>   Use emit_reduc_half helper function.
> 
>   * gcc.target/i386/sse4_1-phminposuw-2.c: New test.
>   * gcc.target/i386/sse4_1-phminposuw-3.c: New test.
>   * gcc.target/i386/avx-vphminposuw-2.c: New test.
>   * gcc.target/i386/avx-vphminposuw-3.c: New test.

Ok.

>  case V8SFmode:
> +  if (i == 256)
> + tem = gen_avx_vperm2f128v8sf3 (dest, src, src, const1_rtx);
> +  else
> + tem = gen_avx_shufps256 (dest, src, src,
> +  GEN_INT (i == 128 ? 2 + (3 << 2) : 1));

It occurs to me to wonder if we wouldn't get better performance
dropping to a 128-bit vector during the first fold.  Let the AVX
128-bit operations zero the high bits of the ymm register.

Definitely something for a future patch though.
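
For reference, the fold-by-halves shape of such a reduction can be sketched
in scalar C.  This models only the data flow (compare the upper half against
the lower half, then halve the width); it is not the RTL the expander emits,
and the function names are invented for illustration:

```c
#include <stdint.h>

/* Scalar model of an unsigned-min reduction over 8 halfwords: at each
   step the upper half is folded onto the lower half, then the active
   width is halved.  */
uint16_t umin_reduce8 (const uint16_t v[8])
{
  uint16_t tmp[8];
  for (int i = 0; i < 8; i++)
    tmp[i] = v[i];
  for (int width = 4; width >= 1; width /= 2)
    for (int i = 0; i < width; i++)
      if (tmp[i + width] < tmp[i])
        tmp[i] = tmp[i + width];
  return tmp[0];
}

/* Small fixed input for demonstration.  */
uint16_t umin_reduce8_demo (void)
{
  static const uint16_t v[8] = { 9, 4, 7, 3, 8, 6, 5, 10 };
  return umin_reduce8 (v);
}
```

After the first fold only the lower half carries live data, which is why
continuing in a narrower vector looks attractive for the wider modes.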


r~


Re: [testsuite] require arm_little_endian in two tests

2011-10-13 Thread Joseph S. Myers
On Thu, 13 Oct 2011, Richard Earnshaw wrote:

> 2) Change the compiler to make initializers of vectors assign elements
> of initializers to consecutive lanes in a vector, rather than the
> current behaviour of 'casting' an array of elements to a vector.
> 
> While the second would be my preferred change, I suspect it's too hard
> to fix, and may well cause code written for other targets to break on
> big-endian (altivec for example).

Indeed, vector initializers are part of the target-independent GNU C 
language and have target-independent semantics that the elements go in 
memory order, corresponding to the target-independent semantics of lane 
numbers where they appear in GENERIC, GIMPLE and (non-UNSPEC) RTL and any 
target-independent built-in functions that use such numbers.  (The issue 
here being, as you saw, that the lane numbers used in ARM-specific NEON 
intrinsics are for big-endian not the same as those used in 
target-independent features of GNU C and target-independent internal 
representations in GCC - hence various code to translate them between the 
two conventions when processing intrinsics into non-UNSPEC RTL, and to 
translate back when generating assembly instructions that encode lane 
numbers with the ARM conventions, as expounded at greater length at 
.)
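
A small sketch of the memory-order semantics described above, using the GNU
C vector extension (so it assumes a compiler that supports vector_size, and
a 4-byte int).  The function names are made up for illustration:

```c
#include <string.h>

/* GNU C vector of four ints (assumes 4-byte int).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Copy the vector out to a plain array and return the element at array
   index IDX, i.e. the element at that position in memory order.  */
int element_in_memory (int idx)
{
  v4si v = { 10, 20, 30, 40 };
  int a[4];
  memcpy (a, &v, sizeof a);
  return a[idx];
}

/* Subscripting the vector directly uses the same numbering.  */
int element_by_subscript (int idx)
{
  v4si v = { 10, 20, 30, 40 };
  return v[idx];
}
```

The first initializer ends up at the lowest address regardless of
endianness; the NEON intrinsic lane numbers are a separate convention that
has to be translated to and from this one.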

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Patch, Fortran, committed] PR 50659: [4.4/4.5/4.6/4.7 Regression] ICE with PROCEDURE statement

2011-10-13 Thread Janus Weil
> Committed to the 4.6 branch as r179864:

... and to 4.5 as r179923.

Cheers,
Janus



> 2011/10/9 Janus Weil :
>> Hi all,
>>
>> I have just committed as obvious a patch for an ICE-on-valid problem
>> with PROCEDURE statements:
>>
>> http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=179723
>>
>> The problem was the following: When setting up an external procedure
>> or procedure pointer (declared via a PROCEDURE statement), we copy the
>> expressions for the array bounds and string length from the interface
>> symbol given in the PROCEDURE declaration (cf.
>> 'resolve_procedure_interface'). If those expressions depend on the
>> actual args of the interface, we have to replace those args by the
>> args of the new procedure symbol that we're setting up. This is what
>> 'gfc_expr_replace_symbols' / 'replace_symbol' does. Unfortunately we
>> failed to check whether the symbol we try to replace is actually a
>> dummy!
>>
>> Contrary to Andrew's initial assumption, I think the test case is
>> valid. I could neither find a compiler which rejects it, nor a
>> restriction in the standard which makes it invalid. The relevant part
>> of F08 is probably chapter 7.1.11 ("Specification expression"). This
>> states that a specification expression can contain variables, which
>> are made accessible via use association.
>>
>> I'm planning to apply the patch to the 4.6, 4.5 and 4.4 branches soon.
>>
>> Cheers,
>> Janus
>>
>


Re: RFC: Add ADD_RESTRICT tree code

2011-10-13 Thread Michael Matz
Hi,

On Thu, 13 Oct 2011, Jakub Jelinek wrote:

> On Thu, Oct 13, 2011 at 02:57:56PM +0200, Michael Matz wrote:
> > struct S {int * restrict p;};
> > void foo (struct S *s, struct S *t) {
> >   s->p[0] = 0;
> >   t->p[0] = 1;  // undefined if s->p == t->p; the caller was responsible 
> > // to not do that
> 
> This is undefined only if s->p == t->p && &s->p != &t->p.  If both
> s->p and t->p designate the same restricted pointer object,
> it is fine.

Yeah.  But I continue to think that this reading is against the intent (or 
should be).  All the examples in the standard and rationale never say 
anything about pointers to restricted objects and the problematic cases 
one can construct with them, i.e. that one restricted pointer object might 
have different names.  That leads me to think that this aspect simply was 
overlooked or thought to be irrelevant.

For C, I'm leaning towards ignoring restrict qualifications on all 
indirectly accessed or address-taken objects.  Or better, not ignoring the 
restrict but making such pointers conflict with all other pointers, restrict 
or non-restrict (normally non-restrict and restrict don't conflict in 
theory, although for GCC they do).


Ciao,
Michael.


[Patch]: fix typo in rs6000.c (AIX bootstrap broken)

2011-10-13 Thread Tristan Gingold
Hi,

looks like an obvious typo.  Ok for trunk ?

Tristan.

2011-10-13  Tristan Gingold  

* config/rs6000/rs6000.c (rs6000_init_builtins): Fix typo.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 4fd2192..3bfe33e 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -12213,7 +12213,7 @@ rs6000_init_builtins (void)
 
 #if TARGET_XCOFF
   /* AIX libm provides clog as __clog.  */
-  if ((tdecl = builtin_decl_explicit ([BUILT_IN_CLOG))) != NULL_TREE)
+  if ((tdecl = builtin_decl_explicit (BUILT_IN_CLOG)) != NULL_TREE)
 set_user_assembler_name (tdecl, "__clog");
 #endif
 



Re: [testsuite] require arm_little_endian in two tests

2011-10-13 Thread Richard Earnshaw
On 13/10/11 15:56, Joseph S. Myers wrote:
> On Thu, 13 Oct 2011, Richard Earnshaw wrote:
> 
>> 2) Change the compiler to make initializers of vectors assign elements
>> of initializers to consecutive lanes in a vector, rather than the
>> current behaviour of 'casting' an array of elements to a vector.
>>
>> While the second would be my preferred change, I suspect it's too hard
>> to fix, and may well cause code written for other targets to break on
>> big-endian (altivec for example).
> 
> Indeed, vector initializers are part of the target-independent GNU C 
> language and have target-independent semantics that the elements go in 
> memory order, corresponding to the target-independent semantics of lane 
> numbers where they appear in GENERIC, GIMPLE and (non-UNSPEC) RTL and any 
> target-independent built-in functions that use such numbers.  (The issue 
> here being, as you saw, that the lane numbers used in ARM-specific NEON 
> intrinsics are for big-endian not the same as those used in 
> target-independent features of GNU C and target-independent internal 
> representations in GCC - hence various code to translate them between the 
> two conventions when processing intrinsics into non-UNSPEC RTL, and to 
> translate back when generating assembly instructions that encode lane 
> numbers with the ARM conventions, as expounded at greater length at 
> .)
> 

This is all rather horrible, and leads to THREE different layouts for a
128-bit vector for big-endian Neon.

GCC format
'VLD1.n' format
'ABI' format

GCC format and 'ABI' format differ in that the 64-bit words of the
128-bit vector are swapped.

All this and they are all expected to share a single machine mode.

Furthermore, the definitions in GCC are broken, in that the types
defined in arm_neon.h (eg int8x16_t) are supposed to be ABI format, not
GCC format.

Eukk! :-(

R.



Re: [pph] More DECL merging. (issue5268042)

2011-10-13 Thread Diego Novillo
I'm seeing an infinite loop in g++.dg/pph/c1limits-externalid.cc.  The
while() loop in pph_search_in_chain is not ending.  Or maybe it's
falling into the N^2 trap you mention in that routine?

I've added a short timeout to this test and XFAIL'd it so you can debug it.


Diego.


Re: RFC: Add ADD_RESTRICT tree code

2011-10-13 Thread Joseph S. Myers
On Thu, 13 Oct 2011, Michael Matz wrote:

> Yeah.  But I continue to think that this reading is against the intent (or 
> should be).  All the examples in the standard and rationale never say 
> anything about pointers to restricted objects and the problematic cases 
> one can construct with them, i.e. that one restricted pointer object might 
> have different names.  That leads me to think that this aspect simply was 
> overlooked or thought to be irrelevant.

(Restricted) pointers to restricted objects are exactly what the sentence 
"Every access that modifies X shall be considered also to modify P, for 
the purposes of this subclause." is about.  See my annotation in 
.

-- 
Joseph S. Myers
jos...@codesourcery.com


[Patch, Darwin] fix PR50699.

2011-10-13 Thread Iain Sandoe

.. this looks like an (almost) obvious fix for the bootstrap breakage...
OK for trunk?
Iain

Index: gcc/config/darwin.c
===
--- gcc/config/darwin.c (revision 179865)
+++ gcc/config/darwin.c (working copy)
@@ -2957,10 +2957,11 @@ darwin_override_options (void)
   darwin_running_cxx = (strstr (lang_hooks.name, "C++") != 0);
 }

-/* Add $LDBL128 suffix to long double builtins.  */
+#if defined (__ppc__) || defined (__ppc64__)
+/* Add $LDBL128 suffix to long double builtins for ppc darwin.  */

 static void
-darwin_patch_builtin (int fncode)
+darwin_patch_builtin (enum built_in_function fncode)
 {
   tree fn = builtin_decl_explicit (fncode);
   tree sym;
@@ -2998,6 +2999,7 @@ darwin_patch_builtins (void)
 #undef PATCH_BUILTIN_NO64
 #undef PATCH_BUILTIN_VARIADIC
 }
+#endif

 /*  CFStrings implementation.  */
 static GTY(()) tree cfstring_class_reference = NULL_TREE;



Re: [PATCH] Fix PR50712

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 4:55 AM, Richard Guenther  wrote:
>
> This fixes PR50712, an issue with IPA split uncovered by adding
> verifier calls after it ... we need to also gimplify reads of
> register typed memory when passing it as argument.
>
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>
> Richard.
>
> 2011-10-13  Richard Guenther  
>
>        PR tree-optimization/50712
>        * ipa-split.c (split_function): Always re-gimplify parameters
>        when they are not gimple vals before passing them.  Properly
>        check for type compatibility.
>
>        * gcc.target/i386/pr50712.c: New testcase.
>

This test is valid only for ia32, not ilp32. I checked in this patch
to fix it.
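
To illustrate the distinction (a sketch, not part of the checked-in patch):
ilp32 is a property of the data model, while ia32 is a property of the
instruction set, so an x32 target is ilp32 without being ia32.  The helper
names below are hypothetical:

```c
/* Data-model check: int, long and pointers are all 32 bits wide.  */
int is_ilp32 (void)
{
  return sizeof (int) == 4 && sizeof (long) == 4 && sizeof (void *) == 4;
}

/* Instruction-set check: compiled for 32-bit x86.  */
int is_ia32 (void)
{
#if defined (__i386__)
  return 1;
#else
  return 0;
#endif
}
```

Every ia32 build is ilp32, but not the other way round, which is why a test
that depends on the 32-bit x86 calling convention needs the narrower check.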

-- 
H.J.
---
Index: gcc.target/i386/pr50712.c
===
--- gcc.target/i386/pr50712.c   (revision 179925)
+++ gcc.target/i386/pr50712.c   (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-require-effective-target ilp32 } */
+/* { dg-require-effective-target ia32 } */
 /* { dg-options "-O2" } */

 typedef __builtin_va_list __va_list;
Index: ChangeLog
===
--- ChangeLog   (revision 179925)
+++ ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2011-10-13  H.J. Lu  
+
+   * gcc.target/i386/pr50712.c: Check ia32 instead of ilp32.
+
 2011-10-13  Eric Botcazou  

* gcc.dg/builtins-67.c: Guard iround and irint with HAVE_C99_RUNTIME.


Vector alignment tracking

2011-10-13 Thread Artem Shinkarov
Hi

I would like to share some plans about improving the situation with
vector alignment tracking.  First of all, I would like to start with a
well-known bug: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50716.

There are several aspects of the problem:
1) We would like to avoid the quiet segmentation fault.
2) We would like to warn a user about the potential problems
concerning assignment of vectors with different alignment.
3) We would like to replace obvious aligned vector assignments with
aligned move, and unaligned with unaligned.

All these aspects are interconnected and in order to find the problem,
we have to improve the alignment tracking facilities.

1) Currently in C we cannot provide information that an array is
aligned to a certain number.  The problem is hidden in the fact that a
pointer can represent an array or the address of an object.  And it
turns out that the current aligned attribute doesn't help here.  My
proposal is to introduce an attribute called array_aligned (I am very
flexible on the name) which can be applied only to pointers and
which would show that a pointer of this type represents an array
whose first element is aligned to the given number.

2) After we have the new attribute, we can have a pass which would
check all the pointer arithmetic expressions, and in case of vectors,
mark the assignments with __builtin_assume_aligned.

3) In a separate pass we need to mark the alignments of the function
return types, in order to propagate this information through the
flow-graph.

4) In case of LTO, it becomes possible to track all the pointer
dereferences, and depending on the parameters warn, or change aligned
assignment to unaligned and vice-versa.
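
As a point of reference for step (2), this is roughly what an explicit
__builtin_assume_aligned annotation looks like when written by hand today.
It is only a sketch: the function names are invented, and the proposed
attribute itself is not used here.

```c
#include <stddef.h>

/* Sum N floats from P, promising the compiler that P is 16-byte aligned.
   The promise is the caller's responsibility; breaking it is undefined.  */
float sum_aligned (const float *p, size_t n)
{
  const float *q = __builtin_assume_aligned (p, 16);
  float s = 0.0f;
  for (size_t i = 0; i < n; i++)
    s += q[i];
  return s;
}

/* Demonstration with a buffer whose alignment is guaranteed.  */
float sum_aligned_demo (void)
{
  static float buf[8] __attribute__ ((aligned (16)))
    = { 1, 2, 3, 4, 5, 6, 7, 8 };
  return sum_aligned (buf, 8);
}
```

The idea in the plan above is that a pass could derive such annotations
from the pointer type instead of requiring them in the source.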


As a very first draft of (1) I include the patch that introduces the
array_aligned attribute.  The attribute sets is_array_flag in the
type, and uses the alignment number to store the alignment of the array.
In this implementation, we lose information about the alignment of
the pointer itself, but I don't know if we need it in this particular
situation.  Alternatively we can keep array_alignment in a separate
field; which one is better I am not sure.


Thanks,
Artem.
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c (revision 179906)
+++ gcc/c-family/c-common.c (working copy)
@@ -341,6 +341,7 @@ static tree handle_destructor_attribute
 static tree handle_mode_attribute (tree *, tree, tree, int, bool *);
 static tree handle_section_attribute (tree *, tree, tree, int, bool *);
 static tree handle_aligned_attribute (tree *, tree, tree, int, bool *);
+static tree handle_aligned_array_attribute (tree *, tree, tree, int, bool *);
 static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_alias_ifunc_attribute (bool, tree *, tree, tree, bool *);
 static tree handle_ifunc_attribute (tree *, tree, tree, int, bool *);
@@ -643,6 +644,8 @@ const struct attribute_spec c_common_att
  handle_section_attribute, false },
   { "aligned",0, 1, false, false, false,
  handle_aligned_attribute, false },
+  { "aligned_array",  0, 1, false, false, false,
+ handle_aligned_array_attribute, false },
   { "weak",   0, 0, true,  false, false,
  handle_weak_attribute, false },
   { "ifunc",  1, 1, true,  false, false,
@@ -6682,6 +6685,26 @@ handle_section_attribute (tree *node, tr
 }
 
   return NULL_TREE;
+}
+
+/* Handle "aligned_array" attribute.  */
+static tree
+handle_aligned_array_attribute (tree *node, tree ARG_UNUSED (name), tree args,
+   int flags, bool *no_add_attrs)
+{
+  if (!TYPE_P (*node) || !POINTER_TYPE_P (*node))
+{
+  error ("array_alignment attribute must be applied to a pointer-type");
+  *no_add_attrs = true;
+}
+  else
+{
+  tree ret = handle_aligned_attribute (node, name, args, flags,
+   no_add_attrs);
+  TYPE_IS_ARRAY (*node) = true;
+  return ret;
+}
+
+  return NULL_TREE;
 }
 
 /* Handle a "aligned" attribute; arguments as in
Index: gcc/tree.h
===
--- gcc/tree.h  (revision 179906)
+++ gcc/tree.h  (working copy)
@@ -2149,6 +2149,7 @@ struct GTY(()) tree_block {
 #define TYPE_NEXT_VARIANT(NODE) (TYPE_CHECK (NODE)->type_common.next_variant)
 #define TYPE_MAIN_VARIANT(NODE) (TYPE_CHECK (NODE)->type_common.main_variant)
 #define TYPE_CONTEXT(NODE) (TYPE_CHECK (NODE)->type_common.context)
+#define TYPE_IS_ARRAY(NODE) (TYPE_CHECK (NODE)->type_common.is_array_flag)
 
 /* Vector types need to check target flags to determine type.  */
 extern enum machine_mode vector_type_mode (const_tree);
@@ -2411,6 +2412,7 @@ struct GTY(()) tree_type_common {
   unsigned lang_flag_5 : 1;
   unsigned lang_flag_6 : 1;
 
+  unsigned is_array_flag: 1;
   

[pph] shorten timeout on c1limits-externalid.cc and XFAIL (issue5278042)

2011-10-13 Thread Diego Novillo

I think this may be an infinite loop, but it may also just be taking a
long time to do the merge operations.

Tested on x86_64.  Committed to branch.


Diego.

* g++.dg/pph/c1limits-externalid.cc: Add shorter timeout.
Document failure mode.

diff --git a/gcc/testsuite/g++.dg/pph/c1limits-externalid.cc b/gcc/testsuite/g++.dg/pph/c1limits-externalid.cc
index b10f1c1..c44475f 100644
--- a/gcc/testsuite/g++.dg/pph/c1limits-externalid.cc
+++ b/gcc/testsuite/g++.dg/pph/c1limits-externalid.cc
@@ -1 +1,6 @@
+/* FIXME pph - The following timeout may cause failures on slow targets.
+   In general it takes no longer than a couple of seconds to compile
+   this test, but the new merging code is having trouble with this.  */
+/* { dg-timeout 15 } */
+/* { dg-xfail-if "MERGE INFINITE LOOP" { *-*-* } { "-fpph-map=pph.map" } } */
 #include "c0limits-externalid.h"
-- 
1.7.3.1


--
This patch is available for review at http://codereview.appspot.com/5278042


[pph] Make streamer hooks internal (issue5278043)

2011-10-13 Thread Diego Novillo

To avoid confusion, I moved the callbacks into pph-streamer.c so they
can be internal to that file.  They don't need to be called directly
ever.

Tested on x86_64.  Committed to branch.


Diego.

* pph-streamer-in.c (pph_in_mergeable_tree): Fix comment.
(pph_read_tree): Move to pph-streamer.c.
(pph_in_location): Rename from pph_read_location.
(pph_read_location): Move to pph-streamer.c.
(pph_in_mergeable_chain): Call pph_in_hwi.
(pph_in_any_tree): Fix comment.
* pph-streamer-out.c (pph_write_tree): Move to pph-streamer.c.
(pph_out_location): Rename from pph_write_location.
(pph_write_location): Move to pph-streamer.c.
* pph-streamer.c (pph_write_tree): Move from pph-streamer-out.c.
Make static.
(pph_read_tree): Move from pph-streamer-in.c.  Make static.
(pph_input_location): Move from pph-streamer-in.c.  Rename
from pph_read_location.
(pph_output_location): Move from pph-streamer-out.c. Rename
from pph_out_location.
* pph-streamer.h (pph_write_tree): Remove.
(pph_write_location): Remove.
(pph_read_tree): Remove.
(pph_read_location): Remove.
(pph_out_location): Declare.
(pph_out_tree): Declare.
(pph_in_location): Declare.
(pph_in_tree): Declare.


diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 3893ad2..f8d6393 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -517,6 +517,7 @@ static tree pph_in_any_tree (pph_stream *stream, tree *chain);
 
 
 /* Load an AST from STREAM.  Return the corresponding tree.  */
+
 tree
 pph_in_tree (pph_stream *stream)
 {
@@ -525,8 +526,7 @@ pph_in_tree (pph_stream *stream)
 }
 
 
-/* Load an AST in an ENCLOSING_NAMESPACE from STREAM.
-   Return the corresponding tree.  */
+/* Load an AST into CHAIN from STREAM.  */
 static void
 pph_in_mergeable_tree (pph_stream *stream, tree *chain)
 {
@@ -534,41 +534,23 @@ pph_in_mergeable_tree (pph_stream *stream, tree *chain)
 }
 
 
-/* Callback for reading ASTs from a stream.  Instantiate and return a
-   new tree from the PPH stream in DATA_IN.  */
-
-tree
-pph_read_tree (struct lto_input_block *ib_unused ATTRIBUTE_UNUSED,
-  struct data_in *root_data_in)
-{
-  /* Find data.  */
-  pph_stream *stream = (pph_stream *) root_data_in->sdata;
-  return pph_in_any_tree (stream, NULL);
-}
-
-
 /** lexical elements */
 
 
-/* Callback for streamer_hooks.input_location.  An offset is applied to
-   the location_t read in according to the properties of the merged
-   line_table.  IB and DATA_IN are as in lto_input_location.  This function
-   should only be called after pph_in_and_merge_line_table was called as
-   we expect pph_loc_offset to be set.  */
+/* Read and return a location_t from STREAM.  */
 
 location_t
-pph_read_location (struct lto_input_block *ib,
-   struct data_in *data_in ATTRIBUTE_UNUSED)
+pph_in_location (pph_stream *stream)
 {
   struct bitpack_d bp;
   bool is_builtin;
   unsigned HOST_WIDE_INT n;
   location_t old_loc;
 
-  bp = streamer_read_bitpack (ib);
+  bp = pph_in_bitpack (stream);
   is_builtin = bp_unpack_value (&bp, 1);
 
-  n = streamer_read_uhwi (ib);
+  n = pph_in_uhwi (stream);
   old_loc = (location_t) n;
   gcc_assert (old_loc == n);
 
@@ -576,20 +558,6 @@ pph_read_location (struct lto_input_block *ib,
 }
 
 
-/* Read and return a location_t from STREAM.
-   FIXME pph: Tracing doesn't depend on STREAM any more.  We could avoid having
-   to call this function, only for it to call lto_input_location, which calls
-   the streamer hook back to pph_read_location.  Say what?  */
-
-location_t
-pph_in_location (pph_stream *stream)
-{
-  location_t loc = pph_read_location (stream->encoder.r.ib,
-   stream->encoder.r.data_in);
-  return loc;
-}
-
-
 /* Load the tree value associated with TOKEN from STREAM.  */
 
 static void
@@ -761,7 +729,7 @@ pph_in_mergeable_chain (pph_stream *stream, tree *chain)
 {
   int i, count;
 
-  count = streamer_read_hwi (stream->encoder.r.ib);
+  count = pph_in_hwi (stream);
   for (i = 0; i < count; i++)
 pph_in_mergeable_tree (stream, chain);
 }
@@ -1954,8 +1922,8 @@ pph_in_tree_header (pph_stream *stream, enum LTO_tags tag)
 }
 
 
-/* Read a tree from the STREAM.  It ENCLOSING_NAMESPACE is not null,
-   the tree may be unified with an existing tree in that namespace.  */
+/* Read a tree from the STREAM.  If CHAIN is not null, the tree may be
+   unified with an existing tree in that chain.  */
 
 static tree
 pph_in_any_tree (pph_stream *stream, tree *chain)
diff --git a/gcc/cp/pph-streamer-out.c b/gcc/cp/pph-streamer-out.c
index 0c00054..b5020f2 100644
--- a/gcc/cp/pph-streamer-out.c
+++ b/gcc/cp/pph-streamer-out.c
@@ -641,26 +641,13 @@ pph_out_mergeable_tree (pph_stream *stream, tree t)
 }
 
 
-/* Callback for writing ASTs t

Re: [PATCH, rs6000] Preserve link stack for 476 cpus

2011-10-13 Thread Peter Bergner
On Mon, 2011-09-12 at 15:29 -0400, David Edelsohn wrote:
> First, please choose a more informative option name.
> -mpreserve-link-stack seems like something generally useful for all
> processors and someone may randomly add the option.  It always is
> useful to preserve the link stack -- that's why you're jumping through
> hoops to fix this bug.  Maybe -mpreserve-ppc476-link-stack .

Done.


> I would prefer that this patch were maintained by the chip vendors
> distributing SDKs for PPC476 instead of complicating the FSF codebase.

Talking with the chip folks, they said there were a number of companies
already downloading the FSF gcc sources and building it unpatched and
that they expected more to do so in the future, so I'm not sure how many
(if any) are actually even relying on an SDK.  So...


> Otherwise, please implement this like Xilinx FPU in rs6000.opt,
> rs6000.h, ppc476.h and config.gcc where TARGET_LINK_STACK is defined
> as 0 unless GCC explicitly is configured for powerpc476.

Here's a patch to do that, by adding a variant to the powerpc*-*-linux*
target for the 476.  I bootstrapped and regtested this as before, meaning
I also tested this with the -mpreserve-ppc476-link-stack on by default,
as well as configuring without 476 support and verified that the
TARGET_LINK_STACK tests are not only optimized away, but so is the
-mpreserve-ppc476-link-stack option itself.

Is this ok for mainline now?

Peter
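
The compile-time switch described above can be sketched in plain C.  This is only an illustration under assumptions: the `#ifndef` default stands in for whatever rs6000.h actually does, and `pic_base_sequence` is a hypothetical helper, not a function in the patch.  The two assembly strings are the ones quoted from the patch later in this thread.

```c
/* Hypothetical stand-in for the default: unless a target header such
   as 476.h overrides it, the macro is the constant 0.  */
#ifndef TARGET_LINK_STACK
#define TARGET_LINK_STACK 0
#endif

/* Return the sequence used to get the current address into LR.
   With TARGET_LINK_STACK fixed at 0 the first branch is dead code,
   so the compiler can drop both the test and the string.  */
static const char *
pic_base_sequence (void)
{
  if (TARGET_LINK_STACK)
    return "\tbl 1f\n\tb 2f\n1:\n\tblr\n2:\n";
  return "\tbcl 20,31,1f\n1:\n";
}
```

When the macro expands to a variable instead (as 476.h does with rs6000_link_stack), the same test becomes a run-time check.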


* config.gcc (powerpc*-*-linux*): Add powerpc*-*-linux*ppc476* variant.
* config/rs6000/476.h: New file.
* config/rs6000/476.opt: Likewise.
* config/rs6000/rs6000.h (TARGET_LINK_STACK): New define.
(SET_TARGET_LINK_STACK): Likewise.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Enable
TARGET_LINK_STACK for -mtune=476 and -mtune=476fp.
(rs6000_legitimize_tls_address): Emit the link stack preserving GOT
code if TARGET_LINK_STACK.
(rs6000_emit_load_toc_table): Likewise.
(output_function_profiler): Likewise.
(macho_branch_islands): Likewise.
(machopic_output_stub): Likewise.
* config/rs6000/rs6000.md (load_toc_v4_PIC_1, load_toc_v4_PIC_1b):
Convert to a define_expand.
(load_toc_v4_PIC_1_normal): New define_insn.
(load_toc_v4_PIC_1_476): Likewise.
(load_toc_v4_PIC_1b_normal): Likewise.
(load_toc_v4_PIC_1b_476): Likewise.


Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 179091)
+++ gcc/config.gcc  (working copy)
@@ -2133,6 +2133,9 @@ powerpc-*-linux* | powerpc64-*-linux*)
esac
tmake_file="${tmake_file} t-slibgcc-libgcc"
case ${target} in
+   powerpc*-*-linux*ppc476*)
+   tm_file="${tm_file} rs6000/476.h"
+   extra_options="${extra_options} rs6000/476.opt" ;;
powerpc*-*-linux*altivec*)
tm_file="${tm_file} rs6000/linuxaltivec.h" ;;
powerpc*-*-linux*spe*)
Index: gcc/config/rs6000/476.h
===
--- gcc/config/rs6000/476.h (revision 0)
+++ gcc/config/rs6000/476.h (revision 0)
@@ -0,0 +1,29 @@
+/* Enable IBM PowerPC 476 support.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Peter Bergner (berg...@vnet.ibm.com)
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#undef TARGET_LINK_STACK
+#define TARGET_LINK_STACK (rs6000_link_stack)
+
+#undef SET_TARGET_LINK_STACK
+#define SET_TARGET_LINK_STACK(X) do { TARGET_LINK_STACK = (X); } while (0)
Index: gcc/config/rs6000/476.opt
===
--- gcc/config/rs6000/476.opt   (revision 0)
+++ gcc/config/rs6000/476.opt   (revision 0)
@@ -0,0 +1,24 @@
+; IBM PowerPC 476 options.
+;
+; Copyright (C) 2011 Free Software Foundation, Inc.
+; Contributed by Peter Bergner (berg...@vnet.ibm.com)
+;
+; This file is part of GCC.
+;
+; GCC is free software; you can redistribute it and/or modify it under
+;

Re: Vector alignment tracking

2011-10-13 Thread Andi Kleen
Artem Shinkarov  writes:
>
> 1) Currently in C we cannot provide information that an array is
> aligned to a certain number.  The problem is hidden in the fact, that

Have you considered doing it the other way round: when an optimization
needs something to be aligned, make the declaration aligned?

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only


Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 7:14 AM, Richard Kenner
 wrote:
>> Or being fooled by the 0xfffffffc masking, perhaps.
>
> No, I'm pretty sure that's NOT the case.  The *whole point* of the
> routine is to deal with that masking.
>

I got

(gdb) step
make_compound_operation (x=0x7139c4c8, in_code=MEM)
at /export/gnu/import/git/gcc/gcc/combine.c:7572
7572  enum rtx_code code = GET_CODE (x);
(gdb) call debug_rtx (x)
(and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ])
(const_int 4 [0x4])) 0)
(subreg:DI (reg:SI 106) 0))
(const_int 4294967292 [0xfffffffc]))

and it produces

(gdb) call debug_rtx (x)
(and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ])
(const_int 4 [0x4])) 0)
(subreg:DI (reg:SI 106) 0))
(const_int 4294967292 [0xfffffffc]))

at the end.  make_compound_operation doesn't know how to
restore ZERO_EXTEND.

BTW, there is a small testcase at

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50696

You can reproduce it on Linux/x86-64.

-- 
H.J.


Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching

2011-10-13 Thread Matthew Gretton-Dann

This patch seems to have caused PR50717:
  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50717

Thanks,

Matt

On 19/08/11 15:49, Andrew Stubbs wrote:

On 14/07/11 15:35, Richard Guenther wrote:

Ok.


I've just committed this updated patch.

I found bugs with VOIDmode constants that have caused me to recast my
patches to is_widening_mult_rhs_p. They should be logically the same for
non VOIDmode cases, but work correctly for constants. I think the new
version is a bit easier to understand in any case.

Andrew


widening-multiplies-6.patch


2011-08-19  Andrew Stubbs

gcc/
* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument
'type'.
Use 'type' from caller, not inferred from 'rhs'.
Don't reject non-conversion statements. Do return lhs in this case.
(is_widening_mult_p): Add new argument 'type'.
Use 'type' from caller, not inferred from 'stmt'.
Pass type to is_widening_mult_rhs_p.
(convert_mult_to_widen): Pass type to is_widening_mult_p.
(convert_plusminus_to_widen): Likewise.

gcc/testsuite/
* gcc.target/arm/wmul-8.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-8.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (long long a, int *b, int *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1966,7 +1966,8 @@ struct gimple_opt_pass pass_optimize_bswap =
   }
  };

-/* Return true if RHS is a suitable operand for a widening multiplication.
+/* Return true if RHS is a suitable operand for a widening multiplication,
+   assuming a target type of TYPE.
 There are two cases:

   - RHS makes some value at least twice as wide.  Store that value
@@ -1976,27 +1977,31 @@ struct gimple_opt_pass pass_optimize_bswap =
 but leave *TYPE_OUT untouched.  */

  static bool
-is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
+is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
+   tree *new_rhs_out)
  {
gimple stmt;
-  tree type, type1, rhs1;
+  tree type1, rhs1;
enum tree_code rhs_code;

if (TREE_CODE (rhs) == SSA_NAME)
  {
-  type = TREE_TYPE (rhs);
stmt = SSA_NAME_DEF_STMT (rhs);
-  if (!is_gimple_assign (stmt))
-   return false;
-
-  rhs_code = gimple_assign_rhs_code (stmt);
-  if (TREE_CODE (type) == INTEGER_TYPE
- ? !CONVERT_EXPR_CODE_P (rhs_code)
- : rhs_code != FIXED_CONVERT_EXPR)
-   return false;
+  if (is_gimple_assign (stmt))
+   {
+ rhs_code = gimple_assign_rhs_code (stmt);
+ if (TREE_CODE (type) == INTEGER_TYPE
+ ? !CONVERT_EXPR_CODE_P (rhs_code)
+ : rhs_code != FIXED_CONVERT_EXPR)
+   rhs1 = rhs;
+ else
+   rhs1 = gimple_assign_rhs1 (stmt);
+   }
+  else
+   rhs1 = rhs;

-  rhs1 = gimple_assign_rhs1 (stmt);
type1 = TREE_TYPE (rhs1);
+
if (TREE_CODE (type1) != TREE_CODE (type)
  || TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
return false;
@@ -2016,28 +2021,27 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
return false;
  }

-/* Return true if STMT performs a widening multiplication.  If so,
-   store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT
-   respectively.  Also fill *RHS1_OUT and *RHS2_OUT such that converting
-   those operands to types *TYPE1_OUT and *TYPE2_OUT would give the
-   operands of the multiplication.  */
+/* Return true if STMT performs a widening multiplication, assuming the
+   output type is TYPE.  If so, store the unwidened types of the operands
+   in *TYPE1_OUT and *TYPE2_OUT respectively.  Also fill *RHS1_OUT and
+   *RHS2_OUT such that converting those operands to types *TYPE1_OUT
+   and *TYPE2_OUT would give the operands of the multiplication.  */

  static bool
-is_widening_mult_p (gimple stmt,
+is_widening_mult_p (tree type, gimple stmt,
tree *type1_out, tree *rhs1_out,
tree *type2_out, tree *rhs2_out)
  {
-  tree type;
-
-  type = TREE_TYPE (gimple_assign_lhs (stmt));
if (TREE_CODE (type) != INTEGER_TYPE
&& TREE_CODE (type) != FIXED_POINT_TYPE)
  return false;

-  if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out))
+  if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out,
+  rhs1_out))
  return false;

-  if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out))
+  if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out,
+  rhs2_out))
  return false;

if (*type1_out == NULL)
@@ -2089,7 +2093,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
if (TREE_CODE 

Re: Vector alignment tracking

2011-10-13 Thread Artem Shinkarov
On Thu, Oct 13, 2011 at 4:54 PM, Andi Kleen  wrote:
> Artem Shinkarov  writes:
>>
>> 1) Currently in C we cannot provide information that an array is
>> aligned to a certain number.  The problem is hidden in the fact, that
>
> Have you considered doing it the other way round: when an optimization
> needs something to be aligned, make the declaration aligned?
>
> -Andi

Andi, I can't realistically imagine how it could work.  The problem is
that for an arbitrary arr[x], I have no idea whether it should be
aligned or not.

what if

arr = ptr +  5;
v = *(vec *) arr;

I can make arr aligned, because it would be better for performance,
but obviously, the pointer expression breaks this alignment.  But the
code is valid, because unaligned move is still possible.  So I think
that checking is a more conservative approach.

Or am I missing something?

Thanks,
Artem.
> --
> a...@linux.intel.com -- Speaking for myself only
>
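
The pointer-offset case Artem describes can be made concrete with a small C sketch.  Everything here is an assumption for illustration: `is_aligned`, `load_unaligned`, the `vec4i` struct (a 16-byte stand-in for a vector type) and the 16-byte-aligned `buf` are hypothetical names, not part of any patch in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return nonzero if P is aligned to ALIGN bytes (ALIGN must be a
   power of two).  */
static int
is_aligned (const void *p, size_t align)
{
  return ((uintptr_t) p & (align - 1)) == 0;
}

/* Four ints, standing in for a 16-byte vector type.  */
typedef struct { int e[4]; } vec4i;

/* Load four ints from P without assuming any alignment, which is the
   "unaligned move" that keeps the code valid.  */
static vec4i
load_unaligned (const int *p)
{
  vec4i v;
  memcpy (&v, p, sizeof v);
  return v;
}

/* Hypothetical array whose declaration was given 16-byte alignment.  */
static _Alignas (16) int buf[16] =
  { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
```

Here `buf` itself is 16-byte aligned, but `buf + 5` sits 20 bytes into the array and is not, so aligning the declaration alone says nothing about a vector access through the offset pointer.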


Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> at the end.  make_compound_operation doesn't know how to
> restore ZERO_EXTEND.

It does in general.  See make_extraction, which it calls.  The question is
why it doesn't in this case.  That's the bug.


[committed] Drop TREE_ADDRESSABLE from BIT_FIELD_REF on lhs accessed vectors/complex

2011-10-13 Thread Jakub Jelinek
Hi!

I've noticed that
#define vector(elcount, type)  \
__attribute__((vector_size((elcount)*sizeof(type)))) type

vector (4, int)
f1 (vector (4, int) a, int b)
{
  ((int *)&a)[0] = b;
  return a;
}

as well as

vector (4, int)
f2 (vector (4, int) a, int b)
{
  a[0] = b;
  return a;
}

don't result in vec_set_optab being used, instead the argument is
forced in memory.  The problem is that update_addresses_taken
wouldn't drop TREE_ADDRESSABLE from the vector when it is no
longer address taken.  While it can't be turned into DECL_GIMPLE_REG_P,
TREE_ADDRESSABLE can go, it will still not be considered a gimple register,
but at least the expander will be free to generate better code for it.

Bootstrapped/regtested on x86_64-linux and i686-linux, preapproved by
richi on IRC, committed to trunk.

2011-10-13  Jakub Jelinek  
Richard Guenther  

* tree-ssa.c (maybe_optimize_var): Drop TREE_ADDRESSABLE
from vector or complex vars even if their DECL_UID is in not_reg_needs
bitmap.

--- gcc/tree-ssa.c.jj   2011-10-13 11:19:30.0 +0200
+++ gcc/tree-ssa.c  2011-10-13 14:27:02.0 +0200
@@ -1976,6 +1976,8 @@ maybe_optimize_var (tree var, bitmap add
 a non-register.  Otherwise we are confused and forget to
 add virtual operands for it.  */
   && (!is_gimple_reg_type (TREE_TYPE (var))
+ || TREE_CODE (TREE_TYPE (var)) == VECTOR_TYPE
+ || TREE_CODE (TREE_TYPE (var)) == COMPLEX_TYPE
  || !bitmap_bit_p (not_reg_needs, DECL_UID (var))))
 {
   TREE_ADDRESSABLE (var) = 0;

Jakub


[PATCH] vec_set for 32-byte vectors

2011-10-13 Thread Jakub Jelinek
Hi!

As noted by Kirill Yukhin (and what led to the previous tree-ssa.c patch),
vec_set wasn't wired for 32-byte vectors.
Although ix86_expand_vector_set handles 32-byte vectors just fine (even for
AVX and integer vectors), without the expander we'd force things into memory
etc.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2011-10-13  Jakub Jelinek  

* config/i386/sse.md (vec_set): Change V_128 iterator mode
to V.

--- gcc/config/i386/sse.md.jj   2011-10-13 12:26:13.0 +0200
+++ gcc/config/i386/sse.md  2011-10-13 14:50:15.0 +0200
@@ -3786,7 +3786,7 @@ (define_split
 })
 
 (define_expand "vec_set"
-  [(match_operand:V_128 0 "register_operand" "")
+  [(match_operand:V 0 "register_operand" "")
(match_operand: 1 "register_operand" "")
(match_operand 2 "const_int_operand" "")]
   "TARGET_SSE"

Jakub


Re: [PATCH] vec_set for 32-byte vectors

2011-10-13 Thread Richard Henderson
On 10/13/2011 09:21 AM, Jakub Jelinek wrote:
>   * config/i386/sse.md (vec_set): Change V_128 iterator mode to V.

Ok.


r~


Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 9:11 AM, Richard Kenner
 wrote:
>> at the end.  make_compound_operation doesn't know how to
>> restore ZERO_EXTEND.
>
> It does in general.  See make_extraction, which it calls.  The question is
> why it doesn't in this case.  That's the bug.
>

It never calls make_extraction.  There are several cases handled
for AND operation. But

(and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ])
   (const_int 4 [0x4])) 0)
   (subreg:DI (reg:SI 106) 0))
   (const_int 4294967292 [0xfffffffc]))

isn't one of them.

-- 
H.J.


Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> It never calls make_extraction.  There are several cases handled
> for AND operation. But
> 
> (and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ])
>(const_int 4 [0x4])) 0)
>(subreg:DI (reg:SI 106) 0))
>(const_int 4294967292 [0xfffffffc]))
> 
> isn't one of them.

Yes, clearly.  Otherwise it would work!  The correct fix for this problem
is to make it do that.  That's where this needs to be fixed: in
make_compound_operation.


Re: Ping shrink wrap patches

2011-10-13 Thread Richard Henderson
On 10/13/2011 05:27 AM, Alan Modra wrote:
> Ping
> http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01002.html
> http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01003.html

Ok.

> http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01596.html
> 
> The last one needs a tweak.
> s/FUNCTION_VALUE_REGNO_P/targetm.calls.function_value_regno_p/,
> or wrap the whole patch in #ifdef FUNCTION_VALUE_REGNO_P. 

Ok with the s///.


r~


Re: Ping shrink wrap patches

2011-10-13 Thread Bernd Schmidt
On 10/13/11 14:27, Alan Modra wrote:
> Without the ifcvt
> optimization for a function "int foo (int x)" we might have something
> like
> 
>  r29 = r3; // save r3 in callee saved reg
>  if (some test) goto exit_label
>  // main body of foo, calling other functions
>  r3 = 0;
>  return;
> exit_label:
>  r3 = 1;
>  return;
> 
> Bernd's http://gcc.gnu.org/ml/gcc-patches/2011-10/msg00380.html quite
> happily rearranges the r29 assignment to be after the "if", and shrink
> wrapping occurs.  With the ifcvt optimization we get
> 
>  r29 = r3; // save r3 in callee saved reg
>  r3 = 1;
>  if (some test) goto exit_label
>  // main body of foo, calling other functions
>  r3 = 0;
> exit_label:
>  return;

I wonder if this can't be described as another case for moving an insn
downwards in prepare_shrink_wrap, rather than stopping ifcvt? Doesn't
matter much however.


Bernd
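
For readers following the pseudo-assembly, a C function of roughly this shape would produce it.  This is a hypothetical sketch: `helper` and `calls` are made-up stand-ins for "other functions", and the concrete test `x == 0` is an assumption in place of "some test".

```c
/* Hypothetical callee standing in for "other functions".  */
static int calls;

static void
helper (int x)
{
  calls += x;
}

int
foo (int x)
{
  /* "some test": the early exit needs no frame and no saved copy
     of x, so it is the path shrink wrapping wants to keep cheap.  */
  if (x == 0)
    return 1;

  /* Main body: x must survive the calls, so it is copied into a
     callee-saved register (r29 in the example above).  */
  helper (x);
  helper (x);
  return 0;
}
```

The question in the thread is only where the copy into the callee-saved register and the `r3 = 1` assignment end up relative to the test; the C source is the same in both listings.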


Re: Vector alignment tracking

2011-10-13 Thread Andi Kleen
> Or am I missing something?

I often see the x86 vectorizer with -mtune=generic generate a lot of
complicated code just to adjust for potential misalignment.

My thought was just that if the alias oracle knows what the original
declaration is, and it's available for changes (e.g. LTO), it would
likely be better to just add an __attribute__((aligned()))
there.

In the general case it's probably harder, you would need some 
cost model to decide when it's worth it.

Your approach of course would still be needed for cases where this
isn't possible. But it sounded like the infrastructure you're building
could in principle do both.

-Andi


Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini

On 10/13/2011 06:35 PM, Richard Kenner wrote:
>> It never calls make_extraction.  There are several cases handled
>> for AND operation. But
>>
>> (and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ])
>>                (const_int 4 [0x4])) 0)
>>        (subreg:DI (reg:SI 106) 0))
>>    (const_int 4294967292 [0xfffffffc]))
>>
>> isn't one of them.
>
> Yes, clearly.  Otherwise it would work!  The correct fix for this problem
> is to make it do that.  That's where this needs to be fixed: in
> make_compound_operation.

An and:DI is cheaper than a zero_extend:DI of an and:SI.  So GCC is 
correct in not doing this transformation.  I think adding a case to 
make_compound_operation that simply undoes the transformation (without 
calling make_extraction) is fine if you guard it with if (in_code == MEM).


Paolo
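
The two RTL forms under discussion compute the same value, which a short C sketch can check.  The function names `and_di` and `zext_and_si` are hypothetical and only mirror the RTL spelling; this models the arithmetic, not anything in combine.c.

```c
#include <stdint.h>

/* and:DI of X with the constant 0xfffffffc.  */
static uint64_t
and_di (uint64_t x)
{
  return x & UINT64_C (0xfffffffc);
}

/* zero_extend:DI of an and:SI: mask the low 32 bits of X with -4
   (0xfffffffc in SImode), then widen with zeros.  */
static uint64_t
zext_and_si (uint64_t x)
{
  uint32_t low = (uint32_t) x;
  return (uint64_t) (low & UINT32_C (0xfffffffc));
}
```

Because the mask already clears bits 32 to 63, the upper half of the DImode input never matters, which is why the paradoxical subregs in the dump are harmless.  Which form is cheaper is the separate rtx_costs question raised above.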


Re: [PATCH, rs6000] Preserve link stack for 476 cpus

2011-10-13 Thread Richard Henderson
On 10/13/2011 08:49 AM, Peter Bergner wrote:
> +   if (TARGET_LINK_STACK)
> + asm_fprintf (file, "\tbl 1f\n\tb 2f\n1:\n\tblr\n2:\n");
> +   else
> + asm_fprintf (file, "\tbcl 20,31,1f\n1:\n");

Wouldn't it be better to set up an out-of-line "blr" insn that could
be shared by all instances?  That would solve a lot of this sort of
branch-to-branch-to-branch ugliness.

See the i386 port for an example of this, if you need it.


r~


Re: Vector alignment tracking

2011-10-13 Thread Jakub Jelinek
On Thu, Oct 13, 2011 at 06:57:47PM +0200, Andi Kleen wrote:
> > Or am I missing something?
> 
> I often see the x86 vectorizer with -mtune=generic generate a lot of
> complicated code just to adjust for potential misalignment.
> 
> My thought was just that if the alias oracle knows what the original
> declaration is, and it's available for changes (e.g. LTO), it would
> likely be better to just add an __attribute__((aligned()))
> there.
> 
> In the general case it's probably harder, you would need some 
> cost model to decide when it's worth it.

GCC already does that on certain targets, see
increase_alignment in tree-vectorizer.c.  Plus, various backends attempt
to align larger arrays more than they have to be aligned.

Jakub


Re: Ping shrink wrap patches

2011-10-13 Thread Bernd Schmidt
On 10/13/11 18:50, Bernd Schmidt wrote:
> On 10/13/11 14:27, Alan Modra wrote:
>> Without the ifcvt
>> optimization for a function "int foo (int x)" we might have something
>> like
>>
>>  r29 = r3; // save r3 in callee saved reg
>>  if (some test) goto exit_label
>>  // main body of foo, calling other functions
>>  r3 = 0;
>>  return;
>> exit_label:
>>  r3 = 1;
>>  return;
>>
>> Bernd's http://gcc.gnu.org/ml/gcc-patches/2011-10/msg00380.html quite
>> happily rearranges the r29 assignment to be after the "if", and shrink
>> wrapping occurs.  With the ifcvt optimization we get
>>
>>  r29 = r3; // save r3 in callee saved reg
>>  r3 = 1;
>>  if (some test) goto exit_label
>>  // main body of foo, calling other functions
>>  r3 = 0;
>> exit_label:
>>  return;
> 
> I wonder if this can't be described as another case for moving an insn
> downwards in prepare_shrink_wrap, rather than stopping ifcvt?

I.e. something like this? Minimally tested by inspecting some generated
assembly. I haven't found a case where it enables extra shrink-wrapping
on i686, but maybe it's different on ppc?


Bernd

Index: /local/src/egcs/scratch-trunk/gcc/function.c
===
--- /local/src/egcs/scratch-trunk/gcc/function.c(revision 179848)
+++ /local/src/egcs/scratch-trunk/gcc/function.c(working copy)
@@ -5369,13 +5369,13 @@ static void
 prepare_shrink_wrap (basic_block entry_block)
 {
   rtx insn, curr;
-  FOR_BB_INSNS_SAFE (entry_block, insn, curr)
+  FOR_BB_INSNS_REVERSE_SAFE (entry_block, insn, curr)
 {
   basic_block next_bb;
   edge e, live_edge;
   edge_iterator ei;
-  rtx set, scan;
-  unsigned destreg, srcreg;
+  rtx set, src, dst, scan;
+  unsigned destreg;
 
   if (!NONDEBUG_INSN_P (insn))
continue;
@@ -5383,12 +5383,14 @@ prepare_shrink_wrap (basic_block entry_b
   if (!set)
continue;
 
-  if (!REG_P (SET_SRC (set)) || !REG_P (SET_DEST (set)))
+  src = SET_SRC (set);
+  dst = SET_DEST (set);
+  if (!(REG_P (src) || CONSTANT_P (src)) || !REG_P (dst))
continue;
-  srcreg = REGNO (SET_SRC (set));
-  destreg = REGNO (SET_DEST (set));
-  if (hard_regno_nregs[srcreg][GET_MODE (SET_SRC (set))] > 1
- || hard_regno_nregs[destreg][GET_MODE (SET_DEST (set))] > 1)
+  destreg = REGNO (dst);
+  if (hard_regno_nregs[destreg][GET_MODE (dst)] > 1)
+   continue;
+  if (REG_P (src) && hard_regno_nregs[REGNO (src)][GET_MODE (src)] > 1)
continue;
 
   next_bb = entry_block;
@@ -5436,7 +5438,8 @@ prepare_shrink_wrap (basic_block entry_b
if (REG_NOTE_KIND (link) == REG_INC)
  record_hard_reg_sets (XEXP (link, 0), NULL, &set_regs);
 
- if (TEST_HARD_REG_BIT (set_regs, srcreg)
+ if ((REG_P (src)
+  && TEST_HARD_REG_BIT (set_regs, REGNO (src)))
  || reg_referenced_p (SET_DEST (set),
   PATTERN (scan)))
{


Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> An and:DI is cheaper than a zero_extend:DI of an and:SI.  

That depends strongly on the constants and whether the machine is 32-bit
or 64-bit. 

But that's irrelevant in this case since the and:SI will be removed (it
reflects what has already been done).


Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 10:01 AM, Paolo Bonzini  wrote:
> On 10/13/2011 06:35 PM, Richard Kenner wrote:
>>>
>>> It never calls make_extraction.  There are several cases handled
>>> for AND operation. But
>>>
>>> (and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ])
>>>                (const_int 4 [0x4])) 0)
>>>        (subreg:DI (reg:SI 106) 0))
>>>    (const_int 4294967292 [0xfffffffc]))
>>>
>>> isn't one of them.
>>
>> Yes, clearly.  Otherwise it would work!  The correct fix for this problem
>> is to make it do that.  That's where this needs to be fixed: in
>> make_compound_operation.
>
> An and:DI is cheaper than a zero_extend:DI of an and:SI.  So GCC is correct
> in not doing this transformation.  I think adding a case to
> make_compound_operation that simply undoes the transformation (without
> calling make_extraction) is fine if you guard it with if (in_code == MEM).
>

We first expand zero_extend:DI address to and:DI and then try
to restore zero_extend:DI.   Why do we do this transformation
to begin with?


-- 
H.J.


Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini
On Thu, Oct 13, 2011 at 19:19, H.J. Lu  wrote:
> On Thu, Oct 13, 2011 at 10:01 AM, Paolo Bonzini  wrote:
>> On 10/13/2011 06:35 PM, Richard Kenner wrote:

>>>> It never calls make_extraction.  There are several cases handled
>>>> for AND operation. But
>>>>
>>>> (and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ])
>>>>                (const_int 4 [0x4])) 0)
>>>>        (subreg:DI (reg:SI 106) 0))
>>>>    (const_int 4294967292 [0xfffffffc]))
>>>>
>>>> isn't one of them.
>>>
>>> Yes, clearly.  Otherwise it would work!  The correct fix for this problem
>>> is to make it do that.  That's where this needs to be fixed: in
>>> make_compound_operation.
>>
>> An and:DI is cheaper than a zero_extend:DI of an and:SI.  So GCC is correct
>> in not doing this transformation.  I think adding a case to
>> make_compound_operation that simply undoes the transformation (without
>> calling make_extraction) is fine if you guard it with if (in_code == MEM).
>>
>
> We first expand zero_extend:DI address to and:DI and then try
> to restore zero_extend:DI.   Why do we do this transformation
> to begin with?

Because outside of a MEM it may be beneficial _not_ to restore
zero_extend:DI in this case (depending on rtx_costs).

Paolo


Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 10:21 AM, Paolo Bonzini  wrote:
> On Thu, Oct 13, 2011 at 19:19, H.J. Lu  wrote:
>> On Thu, Oct 13, 2011 at 10:01 AM, Paolo Bonzini  wrote:
>>> On 10/13/2011 06:35 PM, Richard Kenner wrote:
>>>>>
>>>>> It never calls make_extraction.  There are several cases handled
>>>>> for AND operation. But
>>>>>
>>>>> (and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ])
>>>>>                (const_int 4 [0x4])) 0)
>>>>>        (subreg:DI (reg:SI 106) 0))
>>>>>    (const_int 4294967292 [0xfffffffc]))
>>>>>
>>>>> isn't one of them.
>>>>
>>>> Yes, clearly.  Otherwise it would work!  The correct fix for this problem
>>>> is to make it do that.  That's where this needs to be fixed: in
>>>> make_compound_operation.
>>>
>>> An and:DI is cheaper than a zero_extend:DI of an and:SI.  So GCC is correct
>>> in not doing this transformation.  I think adding a case to
>>> make_compound_operation that simply undoes the transformation (without
>>> calling make_extraction) is fine if you guard it with if (in_code == MEM).
>>>
>>
>> We first expand zero_extend:DI address to and:DI and then try
>> to restore zero_extend:DI.   Why do we do this transformation
>> to begin with?
>
> Because outside of a MEM it may be beneficial _not_ to restore
> zero_extend:DI in this case (depending on rtx_costs).
>

Why do we do it for MEM then?

-- 
H.J.


Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini
On Thu, Oct 13, 2011 at 19:06, Richard Kenner
 wrote:
>> An and:DI is cheaper than a zero_extend:DI of an and:SI.
>
> That depends strongly on the constants and whether the machine is 32-bit
> or 64-bit.

Yes, the rtx_costs take care of that.

> But that's irrelevant in this case since the and:SI will be removed (it
> reflects what has already been done).

Do you refer to this in make_extraction:

  /* See if this can be done without an extraction.  We never can if the
 width of the field is not the same as that of some integer mode. For
 registers, we can only avoid the extraction if the position is at the
 low-order bit and this is either not in the destination or we have the
 appropriate STRICT_LOW_PART operation available.  */

and this call to force_to_mode in particular:

new_rtx = force_to_mode (inner, tmode,
 len >= HOST_BITS_PER_WIDE_INT
 ? ~(unsigned HOST_WIDE_INT) 0
 : ((unsigned HOST_WIDE_INT) 1 << len) - 1,
 0);

and from there the call to simplify_and_const_int that does this:

  if (constop == nonzero)
return varop;

?

Then indeed it should work if you call make_extraction more greedily
than what we do now (which is, just if the constant is one less than a
power of two).

The answer to H.J.'s "Why do we do it for MEM then?" is simply
"because no one ever thought about not doing it" (because there are no
other POINTERS_EXTEND_UNSIGNED == 1 machines).  In fact it may even be
advantageous to do it in general, even if in_code != MEM.  Only
experimentation can tell.

Paolo
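
The "one less than a power of two" condition mentioned above explains why the mask in this PR is skipped.  A minimal C sketch of that test follows; `is_low_bit_mask` is a hypothetical helper written for this note, not the check combine.c actually uses.

```c
#include <stdint.h>

/* Return nonzero if MASK has the form 2^n - 1 for some n >= 1, that
   is, a contiguous run of ones starting at bit 0.  Adding 1 to such
   a value carries through every set bit, so MASK and MASK + 1 share
   no bits.  */
static int
is_low_bit_mask (uint64_t mask)
{
  return mask != 0 && (mask & (mask + 1)) == 0;
}
```

By this test 0xffffffff qualifies but 0xfffffffc, the mask in the dump, does not, because its two low bits are clear.  Calling make_extraction "more greedily" would mean also accepting masks like the latter.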

