date:20121001

Re: [Patch contrib] check_GNU_style: remove tmp file

2012-10-01 Thread Christophe Lyon

Ping?

May I commit this small patch?
http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00580.html

Thanks,

Christophe.

On 10 September 2012 14:23, Christophe Lyon  wrote:
> On 9 September 2012 12:46, Gerald Pfeifer  wrote:
>> On Mon, 3 Sep 2012, Christophe Lyon wrote:
>>> check_GNU_style.sh currently leaves a temporary file in the current
>>> directory. This patch removes it upon exit.
>>>
>>> Christophe.
>>>
>>> 2012-09-03   Christophe Lyon  
>>>
>>>   * check_GNU_style.sh: Remove temporary file upon exit.
>>
>> Shouldn't this also be removed upon abort?
>>
>> See contrib/warn_summary, for an example,
>>
>> Gerald
>
> Good point. Here is a new version, catching the same signals as warn_summary.
>
> Christophe.
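
The script under discussion is shell, but the idea raised in the review (clean up on normal exit and also when the process is aborted) can be sketched in C. This is only an analogue under stated assumptions: the file name `check_style.tmp` and all function names are hypothetical, and the signal set below is a small common subset, not necessarily the set warn_summary catches.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical name; the real script chooses its own temporary file. */
static const char *tmp_path = "check_style.tmp";

/* Remove the temporary file; harmless if it is already gone. */
static void remove_tmp(void)
{
    unlink(tmp_path);
}

/* On a termination signal, remove the file and leave immediately.
   Only async-signal-safe calls (unlink, _exit) are used here. */
static void on_signal(int signo)
{
    unlink(tmp_path);
    _exit(128 + signo);
}

/* Register cleanup for normal exit and for a few termination signals.
   Returns 0 on success, -1 if any registration fails. */
static int install_cleanup(void)
{
    if (atexit(remove_tmp) != 0)
        return -1;
    if (signal(SIGHUP, on_signal) == SIG_ERR)
        return -1;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (signal(SIGTERM, on_signal) == SIG_ERR)
        return -1;
    return 0;
}
```

Calling install_cleanup() once, right after the temporary file is created, covers both the "upon exit" and the "upon abort" cases mentioned above.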

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Jakub Jelinek

On Mon, Oct 01, 2012 at 08:47:13AM +0200, Steven Bosscher wrote:
> The test case compiles just fine at -O2, only VRP has trouble with it.
> Let's try to stick with facts, not speculation.

I was talking about the other PR, PR26854.  From what I remember when
trying it myself, and judging by the latest -O3 time reports from the
reduced testcase, IRA and reload aren't very significant there (for -O3
IRA takes ~6% and reload ~1%).

> I've put a lot of hard work into it to fix almost all scalability problems
> on this PR for gcc 4.8. LRA undoes all of that work. I understand it is
> painful for some people to hear, but I remain of opinion that LRA cannot be
> considered "ready" if it scales so much worse than everything else in the
> compiler.

Judging the whole implementation from just these corner cases and not how it
performs on other testcases (SPEC, rebuild a distro, ...) is IMHO not the
right thing, if Vlad thinks the corner cases are fixable during stage3; IMHO
we should allow LRA in, worst case it can be disabled by default even for
i?86/x86_64.

Jakub

Re: [rtl] combine a vec_concat of 2 vec_selects from the same vector

2012-10-01 Thread Eric Botcazou

> 2012-09-09  Marc Glisse  
> 
> gcc/
>   * simplify-rtx.c (simplify_binary_operation_1) :
>   Detect the identity.
>   : Handle VEC_SELECTs from the same vector.
> 
> gcc/testsuite/
>   * gcc.target/i386/vect-rebuild.c: New testcase.

OK if you adjust the above date and add the missing space at the end of:

/* Try to merge 2 VEC_SELECTs from the same vector into a single one. */ 

-- 
Eric Botcazou
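
As a rough illustration of the simplification being reviewed, the following toy C model treats a vector as a plain array. It is a sketch of the RTL semantics only, not code from simplify-rtx.c; all function names are made up for the example.

```c
#include <stddef.h>

/* Toy vec_select: copy the lanes of v named by idx[0..n-1] into out. */
static void vec_select(const int *v, const size_t *idx, size_t n, int *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = v[idx[i]];
}

/* Toy vec_concat: a (na lanes) followed by b (nb lanes). */
static void vec_concat(const int *a, size_t na,
                       const int *b, size_t nb, int *out)
{
    for (size_t i = 0; i < na; i++)
        out[i] = a[i];
    for (size_t i = 0; i < nb; i++)
        out[na + i] = b[i];
}

/* Returns 1 when idx is exactly 0, 1, ..., lanes-1, i.e. when selecting
   with idx from a vector of that many lanes yields the vector itself. */
static int select_is_identity(const size_t *idx, size_t n, size_t lanes)
{
    if (n != lanes)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (idx[i] != i)
            return 0;
    return 1;
}
```

In this model, concatenating two selects from the same vector gives the same lanes as a single select whose index list is the two index lists joined, which is the merge the patch performs; a select whose indices are 0..n-1 is the identity case.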

[Ada] Ada 2012 Legality check on requeue statements

2012-10-01 Thread Arnaud Charlet

The target of a requeue statement on a protected entry must be a variable. This
is part of AI05-0225, a binding interpretation that applies to all versions of
the language.

See ACATS test b954005.adb.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Ed Schonberg  

* sem_ch9.adb (Analyze_Requeue): The target of a requeue
statement on a protected entry must be a variable. This is part
of AI05-0225.

Index: sem_ch9.adb
===
--- sem_ch9.adb (revision 191888)
+++ sem_ch9.adb (working copy)
@@ -2379,6 +2379,18 @@
 end;
  end if;
   end if;
+
+  --  AI05-0225: the target protected object of a requeue must be a
+  --  variable. This is a binding interpretation that applies to all
+  --  versions of the language.
+
+  if Present (Target_Obj)
+and then Ekind (Scope (Entry_Id)) in Protected_Kind
+and then not Is_Variable (Target_Obj)
+  then
+ Error_Msg_N
+   ("target protected object of requeue must be a variable", N);
+  end if;
end Analyze_Requeue;
 
--

[Ada] Validity checks on subprogram parameters and results

2012-10-01 Thread Arnaud Charlet

This patch introduces two new validity checks to the GNAT compiler:

1) -gnatVl   Check non-overlapping parameters
When this check is enabled, each subprogram call is preceded by a sequence of
checks that ensure no overlap between actual parameters.

2) -gnatVv   Check proper initialization of scalars on parameters and results
When this check is enabled, each IN, IN OUT and OUT formal parameter along with
a possible function result is checked on entry and exit of a subprogram for
properly initialized scalars.
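
To make the notion of "overlap between actual parameters" concrete, here is a minimal C sketch of a storage-overlap test on two byte ranges. How GNAT actually expresses the generated check is not visible in the excerpt below, so this is an assumption-laden illustration; ranges_overlap is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the byte ranges [a, a + a_size) and [b, b + b_size)
   share at least one byte, 0 otherwise.  Empty ranges never overlap. */
static int ranges_overlap(const void *a, size_t a_size,
                          const void *b, size_t b_size)
{
    uintptr_t a0 = (uintptr_t)a;
    uintptr_t b0 = (uintptr_t)b;

    if (a_size == 0 || b_size == 0)
        return 0;
    return a0 < b0 + b_size && b0 < a0 + a_size;
}
```

A check of this kind would be evaluated for each pair of actuals before the call, for example ranges_overlap (&x, sizeof x, &y, sizeof y).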

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Hristian Kirtchev  

* checks.ads, checks.adb (Apply_Parameter_Aliasing_Checks): New routine.
(Apply_Parameter_Validity_Checks): New routines.
* exp_ch6.adb (Expand_Call): Add aliasing checks to detect
overlapping objects.
* freeze.adb: Add with and use clauses for Checks and Validsw.
(Freeze_Entity): Add checks to detect proper initialization
of scalars.
* sem_ch4.adb: Add with and use clauses for Checks and Validsw.
(Analyze_Call): Add aliasing checks to detect overlapping objects.
* sem_ch13.adb: Add with and use clauses for Validsw.
(Analyze_Aspect_Specifications): Add checks to detect proper
initialization of scalars.
* sem_prag.adb (Chain_PPC): Correct the extraction of the
subprogram name.
* sem_util.adb (Is_Object_Reference): Attribute 'Result now
produces an object.
* usage.adb (Usage): Add usage lines for validity switches 'l',
'L', 'v' and 'V'.
* validsw.adb (Reset_Validity_Check_Options): Include
processing for flags Validity_Check_Non_Overlapping_Params and
Validity_Check_Valid_Scalars_On_Params. Code reformatting.
(Save_Validity_Check_Options): Include processing
for flags Validity_Check_Non_Overlapping_Params
and Validity_Check_Valid_Scalars_On_Params.
(Set_Validity_Check_Options): Add processing for validity switches
'a', 'l', 'L', 'n', 'v' and 'V'. Code reformatting.
* validsw.ads: Add new flags Validity_Check_Non_Overlapping_Params
and Validity_Check_Valid_Scalars_On_Params along with comments
on usage.

Index: usage.adb
===
--- usage.adb   (revision 191888)
+++ usage.adb   (working copy)
@@ -399,6 +399,8 @@
Write_Line ("Fturn off checking for floating-point");
Write_Line ("iturn on checking for in params");
Write_Line ("Iturn off checking for in params");
+   Write_Line ("lturn on checking for non-overlapping params");
+   Write_Line ("Lturn off checking for non-overlapping params");
Write_Line ("mturn on checking for in out params");
Write_Line ("Mturn off checking for in out params");
Write_Line ("oturn on checking for operators/attributes");
@@ -411,6 +413,8 @@
Write_Line ("Sturn off checking for subscripts");
Write_Line ("tturn on checking for tests");
Write_Line ("Tturn off checking for tests");
+   Write_Line ("vturn on checking for 'Valid_Scalars on params");
+   Write_Line ("Vturn off checking for 'Valid_Scalars on params");
Write_Line ("nturn off all validity checks (including RM)");
 
--  Lines for -gnatw switch
Index: checks.adb
===
--- checks.adb  (revision 191888)
+++ checks.adb  (working copy)
@@ -1774,6 +1774,353 @@
 (Ck_Node, Target_Typ, Source_Typ, Do_Static => False);
end Apply_Length_Check;
 
+   -
+   -- Apply_Parameter_Aliasing_Checks --
+   -
+
+   procedure Apply_Parameter_Aliasing_Checks (Call : Node_Id) is
+  Loc: constant Source_Ptr := Sloc (Call);
+  Actual : Node_Id;
+  Actual_Typ : Entity_Id;
+  Check  : Node_Id;
+  Cond   : Node_Id := Empty;
+  Param  : Node_Id;
+  Param_Typ  : Entity_Id;
+
+   begin
+  --  Do not generate the checks in Ada 83, 95 or 05 mode because they
+  --  require an Ada 2012 construct.
+
+  if Ada_Version_Explicit < Ada_2012 then
+ return;
+  end if;
+
+  --  Inspect all pairs of parameters
+
+  Actual := First_Actual (Call);
+  while Present (Actual) loop
+ Actual_Typ := Base_Type (Etype (Actual));
+
+ if Nkind (Actual) = N_Identifier
+   and then Is_Object_Reference (Actual)
+ then
+Param := Next_Actual (Actual);
+while Present (Param) loop
+   Param_Typ := Base_Type (Etype (Param));
+
+   if Nkind (Param) = N_Identifier
+ and then Is_Object_Reference (Param)
+ and then Actual_Typ = Param_Typ
+   then
+

[i386] Fix unwind/debug info for nested functions on 64-bit Windows

2012-10-01 Thread Eric Botcazou

Hi,

in the section of ix86_expand_prologue establishing the frame for Windows 
targets, there is:

  /* Note that SEH directives need to continue tracking the stack
 pointer even after the frame pointer has been set up.  */
  if (m->fs.cfa_reg == stack_pointer_rtx || TARGET_SEH)
{
  if (m->fs.cfa_reg == stack_pointer_rtx)
m->fs.cfa_offset += allocate;

  RTX_FRAME_RELATED_P (insn) = 1;
  add_reg_note (insn, REG_FRAME_RELATED_EXPR,
gen_rtx_SET (VOIDmode, stack_pointer_rtx,
 plus_constant (Pmode, stack_pointer_rtx,
-allocate)));
}

But a few lines above there is also:

  if (eax_live)
{
  emit_insn (gen_push (eax));
  allocate -= UNITS_PER_WORD;
}
  if (r10_live)
{
  r10 = gen_rtx_REG (Pmode, R10_REG);
  emit_insn (gen_push (r10));
  allocate -= UNITS_PER_WORD;
}

and these 2 pushes aren't marked, which can result in wrong SEH unwind and 
DWARF debug info on 64-bit Windows (we have an example of each kind in Ada).
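
The accounting at stake can be shown with a toy C model. It is not GCC code and does not model anything the real helpers may do internally; WORD stands in for UNITS_PER_WORD on x86-64, and the struct and function names are hypothetical.

```c
/* Toy model: WORD plays the role of UNITS_PER_WORD on x86-64. */
enum { WORD = 8 };

struct frame_state {
    int  sp_is_cfa;   /* nonzero while the CFA is based on the stack pointer */
    long cfa_offset;  /* distance from the stack pointer to the CFA */
};

/* Account for one register push.  The push moves the stack pointer by
   one word, so one word less remains to be allocated afterwards.
   Returns the remaining allocation. */
static long push_reg(struct frame_state *fs, long allocate)
{
    if (fs->sp_is_cfa)
        fs->cfa_offset += WORD;
    return allocate - WORD;
}

/* Account for the remaining bulk stack adjustment. */
static void allocate_frame(struct frame_state *fs, long allocate)
{
    if (fs->sp_is_cfa)
        fs->cfa_offset += allocate;
}
```

With both pushes tracked, the offset after the prologue grows by exactly the originally requested allocation, whether or not eax and r10 had to be saved.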

Tested on x86_64-suse-linux and with a 4.7-based SEH-enabled compiler for
64-bit Windows.  OK for mainline and 4.7 branch?


2012-10-01  Eric Botcazou  

* config/i386/i386.c (ix86_expand_prologue): Emit frame info for the
special register pushes before frame probing and allocation.



-- 
Eric Botcazou

Index: config/i386/i386.c
===
--- config/i386/i386.c	(revision 191796)
+++ config/i386/i386.c	(working copy)
@@ -10671,7 +10671,7 @@ ix86_expand_prologue (void)
   rtx eax = gen_rtx_REG (Pmode, AX_REG);
   rtx r10 = NULL;
   rtx (*adjust_stack_insn)(rtx, rtx, rtx);
-
+  const bool sp_is_cfa_reg = (m->fs.cfa_reg == stack_pointer_rtx);
   bool eax_live = false;
   bool r10_live = false;
 
@@ -10680,16 +10680,31 @@ ix86_expand_prologue (void)
   if (!TARGET_64BIT_MS_ABI)
 eax_live = ix86_eax_live_at_start_p ();
 
+  /* Note that SEH directives need to continue tracking the stack
+	 pointer even after the frame pointer has been set up.  */
   if (eax_live)
 	{
-	  emit_insn (gen_push (eax));
+	  insn = emit_insn (gen_push (eax));
 	  allocate -= UNITS_PER_WORD;
+	  if (sp_is_cfa_reg || TARGET_SEH)
+	{
+	  if (sp_is_cfa_reg)
+		m->fs.cfa_offset += UNITS_PER_WORD;
+	  RTX_FRAME_RELATED_P (insn) = 1;
+	}
 	}
+
   if (r10_live)
 	{
 	  r10 = gen_rtx_REG (Pmode, R10_REG);
-	  emit_insn (gen_push (r10));
+	  insn = emit_insn (gen_push (r10));
 	  allocate -= UNITS_PER_WORD;
+	  if (sp_is_cfa_reg || TARGET_SEH)
+	{
+	  if (sp_is_cfa_reg)
+		m->fs.cfa_offset += UNITS_PER_WORD;
+	  RTX_FRAME_RELATED_P (insn) = 1;
+	}
 	}
 
   emit_move_insn (eax, GEN_INT (allocate));
@@ -10703,13 +10718,10 @@ ix86_expand_prologue (void)
   insn = emit_insn (adjust_stack_insn (stack_pointer_rtx,
 	   stack_pointer_rtx, eax));
 
-  /* Note that SEH directives need to continue tracking the stack
-	 pointer even after the frame pointer has been set up.  */
-  if (m->fs.cfa_reg == stack_pointer_rtx || TARGET_SEH)
+  if (sp_is_cfa_reg || TARGET_SEH)
 	{
-	  if (m->fs.cfa_reg == stack_pointer_rtx)
+	  if (sp_is_cfa_reg)
 	m->fs.cfa_offset += allocate;
-
 	  RTX_FRAME_RELATED_P (insn) = 1;
 	  add_reg_note (insn, REG_FRAME_RELATED_EXPR,
 			gen_rtx_SET (VOIDmode, stack_pointer_rtx,

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher

[ Sorry for re-send, it seems that mobile gmail sends text/html and
the sourceware mailer daemon rejects that. ]

On Monday, October 1, 2012, Jakub Jelinek  wrote:
> On Sun, Sep 30, 2012 at 06:50:50PM -0400, Vladimir Makarov wrote:
>
> I think this testcase shouldn't be a show stopper for LRA inclusion into
> 4.8, but something to look at for stage3.
>
> I think a lot of GCC passes have scalability issues on that testcase,
> that is why it must be compiled with -O1 and not higher optimization
> options,

The test case compiles just fine at -O2, only VRP has trouble with it. Let's
try to stick with facts, not speculation.

And the test case is not generated, it is the Eigen template library applied
to mpfr.

I've put a lot of hard work into it to fix almost all scalability problems
on this PR for gcc 4.8. LRA undoes all of that work. I understand it is
painful for some people to hear, but I remain of opinion that LRA cannot be
considered "ready" if it scales so much worse than everything else in the
compiler.

Ciao!
Steven

[Ada] Handling of -vPx with incorrect x

2012-10-01 Thread Arnaud Charlet

Command line switch -vPx (set verbosity level for project file facility) in
gnatmake and gnatcmd is valid only for x=0, 1, or 2. This change ensures
that any attempt to pass an invalid value generates a proper error message.

The following commands must generate the indicated errors:

$ gnat list -vP9 -Pinvalid_verbosity_prj invalid_verbosity_proc.adb
gnat: invalid verbosity level: 9

$ gnat compile -q -vP9 -Pinvalid_verbosity_prj invalid_verbosity_proc.adb
gnatmake: invalid verbosity level 9

$ gnatmake -q -vP9 -Pinvalid_verbosity_prj invalid_verbosity_proc.adb
gnatmake: invalid verbosity level 9

project invalid_verbosity_prj is end invalid_verbosity_prj;

procedure invalid_verbosity_proc is end invalid_verbosity_proc;
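
The intended behaviour (accept exactly one digit in 0..2, otherwise report the offending text) can be sketched in C as follows. This is an illustration only, not the GNAT sources; the enum and parse_vp_switch are hypothetical names, with the enum values loosely mirroring Prj.Default, Prj.Medium and Prj.High.

```c
#include <string.h>

/* Hypothetical verbosity levels for the sketch. */
enum verbosity { VERB_DEFAULT = 0, VERB_MEDIUM = 1, VERB_HIGH = 2 };

/* Parse a "-vPx" switch.
   Returns  1 and stores the level when x is a single digit 0, 1 or 2.
   Returns  0 when arg is not a -vP switch at all.
   Returns -1 when arg is a -vP switch with an invalid level; *bad then
   points at the text after "-vP" so the caller can report it. */
static int parse_vp_switch(const char *arg, enum verbosity *level,
                           const char **bad)
{
    if (strncmp(arg, "-vP", 3) != 0)
        return 0;

    if (strlen(arg) == 4 && arg[3] >= '0' && arg[3] <= '2') {
        *level = (enum verbosity)(arg[3] - '0');
        return 1;
    }

    *bad = arg + 3;
    return -1;
}
```

The key point is that the switch is recognized by its prefix first, and only then validated, so an invalid level is rejected with a message instead of falling through to other switch handling.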

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Thomas Quinot  

* gnatcmd.adb, make.adb (Scan_Make_Arg, Inspect_Switches): Recognize
and reject an invalid parameter passed to -vP.

Index: gnatcmd.adb
===
--- gnatcmd.adb (revision 191888)
+++ gnatcmd.adb (working copy)
@@ -1769,20 +1769,28 @@
 
   --  -vPx  Specify verbosity while parsing project files
 
-  elsif Argv'Length = 4
-and then Argv (Argv'First + 1 .. Argv'First + 2) = "vP"
-  then
- case Argv (Argv'Last) is
-when '0' =>
-   Current_Verbosity := Prj.Default;
-when '1' =>
-   Current_Verbosity := Prj.Medium;
-when '2' =>
-   Current_Verbosity := Prj.High;
-when others =>
-   Fail ("Invalid switch: " & Argv.all);
- end case;
+  elsif Argv (Argv'First + 1 .. Argv'First + 2) = "vP" then
+ if Argv'Length = 4
+  and then Argv (Argv'Last) in '0' .. '2'
+ then
+case Argv (Argv'Last) is
+   when '0' =>
+  Current_Verbosity := Prj.Default;
+   when '1' =>
+  Current_Verbosity := Prj.Medium;
+   when '2' =>
+  Current_Verbosity := Prj.High;
+   when others =>
 
+  --  Cannot happen
+
+  raise Program_Error;
+end case;
+ else
+Fail ("invalid verbosity level: "
+& Argv (Argv'First + 3 .. Argv'Last));
+ end if;
+
  Remove_Switch (Arg_Num);
 
   --  -Pproject_file  Specify project file to be used
Index: make.adb
===
--- make.adb(revision 191890)
+++ make.adb(working copy)
@@ -7825,11 +7825,12 @@
 
  --  -vPx  (verbosity of the parsing of the project files)
 
- elsif Argv'Last = 4
-   and then Argv (2 .. 3) = "vP"
-   and then Argv (4) in '0' .. '2'
- then
-if And_Save then
+ elsif Argv (2 .. 3) = "vP" then
+if Argv'Last /= 4 or else Argv (4) not in '0' .. '2' then
+   Make_Failed
+ ("invalid verbosity level " & Argv (4 .. Argv'Last));
+
+elsif And_Save then
case Argv (4) is
   when '0' =>
  Current_Verbosity := Prj.Default;

[Ada] Detect more cases of possible infinite loops

2012-10-01 Thread Arnaud Charlet

GNAT now issues a warning in more cases where it previously did not detect
the possibility of an infinite loop. For example, on the following code:

$ gcc -c bad.adb
bad.adb:9:13: warning: variable "Cur" is not modified in loop body
bad.adb:9:13: warning: possible infinite loop

 1. package body Bad is
 2.procedure P (Y : Integer) is
 3.begin
 4.   null;
 5.end P;
 6.procedure Q (X : Integer) is
 7.   Cur : Integer := X;
 8.begin
 9.   while Cur /= 0 loop
10.  P (Cur);
11.   end loop;
12.end Q;
13. end Bad;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Yannick Moy  

* sem_warn.adb (Check_Infinite_Loop_Warning/Test_Ref): Improve
the detection of modifications to the loop variable by noting
that, if the type of variable is elementary and the condition
does not contain a function call, then the condition cannot be
modified by side-effects from a procedure call.

Index: sem_warn.adb
===
--- sem_warn.adb(revision 191890)
+++ sem_warn.adb(working copy)
@@ -472,32 +472,41 @@
return Abandon;
 end if;
 
---  If we appear in the context of a procedure call, then also
---  abandon, since there may be issues of non-visible side
---  effects going on in the call.
+--  If the condition contains a function call, we consider it may
+--  be modified by side-effects from a procedure call. Otherwise,
+--  we consider the condition may not be modified, although that
+--  might happen if Variable is itself a by-reference parameter,
+--  and the procedure called modifies the global object referred to
+--  by Variable, but we actually prefer to issue a warning in this
+--  odd case. Note that the case where the procedure called has
+--  visibility over Variable is treated in another case below.
 
-declare
-   P : Node_Id;
+if Function_Call_Found then
+   declare
+  P : Node_Id;
 
-begin
-   P := N;
-   loop
-  P := Parent (P);
-  exit when P = Loop_Statement;
+   begin
+  P := N;
+  loop
+ P := Parent (P);
+ exit when P = Loop_Statement;
 
-  --  Abandon if at procedure call, or something strange is
-  --  going on (perhaps a node with no parent that should
-  --  have one but does not?) As always, for a warning we
-  --  prefer to just abandon the warning than get into the
-  --  business of complaining about the tree structure here!
+ --  Abandon if at procedure call, or something strange is
+ --  going on (perhaps a node with no parent that should
+ --  have one but does not?) As always, for a warning we
+ --  prefer to just abandon the warning than get into the
+ --  business of complaining about the tree structure here!
 
-  if No (P) or else Nkind (P) = N_Procedure_Call_Statement then
- return Abandon;
-  end if;
-   end loop;
-end;
+ if No (P)
+   or else Nkind (P) = N_Procedure_Call_Statement
+ then
+return Abandon;
+ end if;
+  end loop;
+   end;
+end if;
 
---  Reference to variable renaming variable in question
+ --  Reference to variable renaming variable in question
 
  elsif Is_Entity_Name (N)
and then Present (Entity (N))
@@ -509,7 +518,7 @@
  then
 return Abandon;
 
---  Call to subprogram
+ --  Call to subprogram
 
  elsif Nkind (N) in N_Subprogram_Call then

[Ada] Ada 2012 legality checks on uses of names of protected procedures

2012-10-01 Thread Arnaud Charlet

Ada 2012 AI05-0225 clarifies that most uses of the names of protected
procedures and entries require that the target object (explicit or implicit)
be a variable. This applies to calls, generic actuals, and prefixes of 'Access.
It applies in particular to such uses within the body of a protected function.

An example is ACATS test b950001.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Ed Schonberg  

* sem_util.ads, sem_util.adb (Check_Internal_Protected_Use):
reject use of protected procedure or entry within the body of
a protected function of the same protected type, when usage is
a call, an actual in an instantiation, or a prefix of 'Access.
* sem_ch8.adb (Analyze_Subprogram_Renaming): Verify that target
object in renaming of protected procedure is a variable, and
apply Check_Internal_Protected_Use.
* sem_res.adb (Analyze_Call, Analyze_Entry_Call): apply
Check_Internal_Protected_Use rather than on-line code.
* sem_attr.adb (Analyze_Access_Attribute): Verify that target
object in access to protected procedure is a variable, and apply
Check_Internal_Protected_Use.

Index: sem_util.adb
===
--- sem_util.adb(revision 191890)
+++ sem_util.adb(working copy)
@@ -1191,6 +1191,50 @@
   end if;
end Check_Implicit_Dereference;
 
+   --
+   -- Check_Internal_Protected_Use --
+   --
+
+   procedure Check_Internal_Protected_Use (N : Node_Id; Nam : Entity_Id) is
+  S: Entity_Id;
+  Prot : Entity_Id;
+
+   begin
+  S := Current_Scope;
+  while Present (S) loop
+ if S = Standard_Standard then
+return;
+
+ elsif Ekind (S) = E_Function
+   and then Ekind (Scope (S)) = E_Protected_Type
+ then
+Prot := Scope (S);
+exit;
+ end if;
+
+ S := Scope (S);
+  end loop;
+
+  if Scope (Nam) = Prot and then Ekind (Nam) /= E_Function then
+ if Nkind (N) = N_Subprogram_Renaming_Declaration then
+Error_Msg_N
+  ("within protected function cannot use protected "
+   & "procedure in renaming or as generic actual", N);
+
+ elsif Nkind (N) = N_Attribute_Reference then
+Error_Msg_N
+  ("within protected function cannot take access of "
+   & " protected procedure", N);
+
+ else
+Error_Msg_N
+  ("within protected function, protected object is constant", N);
+Error_Msg_N
+  ("\cannot call operation that may modify it", N);
+ end if;
+  end if;
+   end Check_Internal_Protected_Use;
+
---
-- Check_Later_Vs_Basic_Declarations --
---
Index: sem_util.ads
===
--- sem_util.ads(revision 191888)
+++ sem_util.ads(working copy)
@@ -170,6 +170,12 @@
--  checks whether T is a reference type, and if so it adds an interprettion
--  to Expr whose type is the designated type of the reference_discriminant.
 
+   procedure Check_Internal_Protected_Use (N : Node_Id; Nam : Entity_Id);
+   --  Within a protected function, the current object is a constant, and
+   --  internal calls to a procedure or entry are illegal. Similarly, other
+   --  uses of a protected procedure in a renaming or a generic instantiation
+   --  in the context of a protected function are illegal (AI05-0225).
+
procedure Check_Later_Vs_Basic_Declarations
  (Decls  : List_Id;
   During_Parsing : Boolean);
Index: sem_res.adb
===
--- sem_res.adb (revision 191888)
+++ sem_res.adb (working copy)
@@ -5314,15 +5314,7 @@
   --  Check that this is not a call to a protected procedure or entry from
   --  within a protected function.
 
-  if Ekind (Current_Scope) = E_Function
-and then Ekind (Scope (Current_Scope)) = E_Protected_Type
-and then Ekind (Nam) /= E_Function
-and then Scope (Nam) = Scope (Current_Scope)
-  then
- Error_Msg_N ("within protected function, protected " &
-   "object is constant", N);
- Error_Msg_N ("\cannot call operation that may modify it", N);
-  end if;
+  Check_Internal_Protected_Use (N, Nam);
 
   --  Freeze the subprogram name if not in a spec-expression. Note that we
   --  freeze procedure calls as well as function calls. Procedure calls are
@@ -6732,6 +6724,7 @@
   end if;
 
   Resolve_Actuals (N, Nam);
+  Check_Internal_Protected_Use (N, Nam);
 
   --  Create a call reference to the entry
 
Index: sem_attr.adb
===
--- sem_attr.adb(revis

[Ada] Tagged "/=" operator in GNAT tree doesn't get fully resolved with -gnatc

2012-10-01 Thread Arnaud Charlet

When an inequality operator is used for a tagged type, the tree node for
the inequality prior to expansion (such as with -gnatc) reflects the "/="
operator in Standard rather than being resolved to a logical negation
of the tagged type's equality function. This is a problem for ASIS in
Corresponding_Equality_Operator. We now ensure that the rewriting of
a tagged inequality happens during analysis rather than being deferred
to expansion.

No simple test available.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Gary Dismukes  

* sem_ch4.adb (Find_Equality_Types.Try_One_Interp): Exclude the
predefined interpretation from consideration if it's for a "/="
operator of a tagged type. This will allow Analyze_Equality_Op to
rewrite the "/=" as a logical negation of a call to the appropriate
dispatching equality function. This needs to be done during
analysis rather than expansion for the benefit of ASIS, which
otherwise gets the unresolved N_Op_Ne operator from Standard.

Index: sem_ch4.adb
===
--- sem_ch4.adb (revision 191890)
+++ sem_ch4.adb (working copy)
@@ -5612,8 +5612,24 @@
 return;
  end if;
 
+ --  If the right operand has a type compatible with T1, check for an
+ --  acceptable interpretation, unless T1 is limited (no predefined
+ --  equality available), or this is use of a "/=" for a tagged type.
+ --  In the latter case, possible interpretations of equality need to
+ --  be considered, we don't want the default inequality declared in
+ --  Standard to be chosen, and the "/=" will be rewritten as a
+ --  negation of "=" (see the end of Analyze_Equality_Op). This ensures
+ --  that that rewriting happens during analysis rather than being
+ --  delayed until expansion (this is needed for ASIS, which only sees
+ --  the unexpanded tree). Note that if the node is N_Op_Ne, but Op_Id
+ --  is Name_Op_Eq then we still proceed with the interpretation,
+ --  because that indicates the potential rewriting case where the
+ --  interpretation to consider is actually "=" and the node may be
+ --  about to be rewritten by Analyze_Equality_Op.
+
  if T1 /= Standard_Void_Type
and then Has_Compatible_Type (R, T1)
+
and then
  ((not Is_Limited_Type (T1)
 and then not Is_Limited_Composite (T1))
@@ -5622,6 +5638,11 @@
  (Is_Array_Type (T1)
and then not Is_Limited_Type (Component_Type (T1))
and then Available_Full_View_Of_Component (T1)))
+
+   and then
+ (Nkind (N) /= N_Op_Ne
+   or else not Is_Tagged_Type (T1)
+   or else Chars (Op_Id) = Name_Op_Eq)
  then
 if Found
   and then Base_Type (T1) /= Base_Type (T_F)

Re: [rtl] combine a vec_concat of 2 vec_selects from the same vector

2012-10-01 Thread Marc Glisse


On Mon, 1 Oct 2012, Eric Botcazou wrote:

>> 2012-09-09  Marc Glisse
>>
>> gcc/
>> * simplify-rtx.c (simplify_binary_operation_1) :
>> Detect the identity.
>> : Handle VEC_SELECTs from the same vector.
>>
>> gcc/testsuite/
>> * gcc.target/i386/vect-rebuild.c: New testcase.
>
> OK if you adjust the above date and add the missing space at the end of:
>
> /* Try to merge 2 VEC_SELECTs from the same vector into a single one. */


I was trying to avoid splitting in 2 lines, but ok I'll split.

Thank you for the quick reply,

--
Marc Glisse

[Ada] Next step in implementing extended overflow checking

2012-10-01 Thread Arnaud Charlet

This patch defines the four modes of overflow handling (SUPPRESSED,
CHECKED, MINIMIZED, ELIMINATED), and adds the Overflow_Checks pragma
and extended -gnato switch to set them. But for now Checked, Minimized,
and Eliminated are all treated as Checked, so the behavior is unchanged.

The following program:

 1. procedure over2 is
 2.function Ident (X : Integer) return Integer is
 3.begin
 4.   return X;
 5.end;
 6.x : integer := Ident (Integer'Last);
 7.procedure g;
 8.pragma Postcondition (x + 2 = 0);
 9.procedure g is begin null; end;
10. begin
11.g;
12. end;

compiled with -gnata -gnato10 generates

  raised SYSTEM.ASSERTIONS.ASSERT_FAILURE :
   failed postcondition from over2.adb:8

since -gnato10 turns off overflow checking in assertions
resulting in the postcondition giving a result of false.

compiled with -gnata -gnato01 generates

  raised CONSTRAINT_ERROR : over2.adb:8 overflow check failed

since -gnato01 turns on overflow checking in assertions
resulting in the overflow being detected.
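
The difference between the two outcomes can be mimicked in C. This is a sketch under assumptions: the suppressed case is modelled as two's-complement wraparound (what an unchecked addition typically produces on the tested target, not something the Ada language guarantees), and the checked case uses the GCC/Clang builtin __builtin_add_overflow. The function names are hypothetical.

```c
#include <limits.h>

/* Checked addition: returns 1 and stores a + b in *sum when the result
   fits in int, returns 0 when the addition overflows.
   __builtin_add_overflow is a GCC/Clang builtin. */
static int add_checked(int a, int b, int *sum)
{
    return !__builtin_add_overflow(a, b, sum);
}

/* Unchecked addition, modelled as wraparound.  The conversion back to
   int is implementation-defined in C; GCC reduces it modulo 2^N. */
static int add_wrapped(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}
```

With x equal to INT_MAX, add_wrapped (x, 2) yields a large negative value, so a test such as x + 2 = 0 is simply false (the failed postcondition), whereas add_checked (x, 2, &sum) reports the overflow (the Constraint_Error case).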

The following test program:

 1. pragma Overflow_Checks
 2.   (Suppressed, Assertions => Checked);
 3. procedure over21 is
 4.function Ident (X : Integer) return Integer is
 5.begin return X; end;
 6.x : integer := Ident (Integer'Last);
 7.procedure g;
 8.pragma Postcondition (x + 2 = 0);
 9.procedure g is begin null; end;
10. begin
11.g;
12. end;

compiled with -gnata results in

  raised CONSTRAINT_ERROR : over21.adb:8 overflow check failed

The following test program:

 1. pragma Overflow_Checks
 2.   (Checked, Assertions => Suppressed);
 3. procedure over22 is
 4.function Ident (X : Integer) return Integer is
 5.begin return X; end;
 6.x : integer := Ident (Integer'Last);
 7.procedure g;
 8.pragma Postcondition (x + 2 = 0);
 9.procedure g is begin null; end;
10. begin
11.g;
12. end;

compiled with -gnata generates

  raised SYSTEM.ASSERTIONS.ASSERT_FAILURE :
   failed postcondition from over22.adb:8

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Robert Dewar  

* checks.adb: Remove reference to Enable_Overflow_Checks. Use
Suppress_Options rather than Scope_Suppress.
* gnat1drv.adb (Adjust_Global_Switches): Handle new overflow
settings (Adjust_Global_Switches): Initialize Scope_Suppress
from Suppress_Options.
* opt.adb: Remove Enable_Overflow_Checks (use Suppress_Options
instead).
* opt.ads: Remove Overflow_Checks_Unsuppressed (not used).
Remove Enable_Overflow_Checks (use Suppress_Options instead).
Suppress_Options is now current setting (replaces Scope_Suppress).
* osint.adb (Initialize): Deal with initializing overflow
checking.
* par-prag.adb: Add dummy entry for pragma Overflow_Checks.
* sem.adb (Semantics): Save and restore In_Assertion_Expr. Use
Suppress_Options instead of Scope_Suppress.
* sem.ads (In_Assertion_Expr): New flag. (Scope_Suppress):
Removed, use Suppress_Options instead.
* sem_eval.adb (Compile_Time_Compare): Return Unknown in
preanalysis mode.
* sem_prag.adb (Process_Suppress_Unsuppress): Setting of
Overflow_Checks_Unsuppressed removed (not used anywhere!)
(Analyze_Pragma, case Check): Set In_Assertion_Expression
(Analyze_Pragma, case Overflow_Checks): Implement new pragma
* snames.ads-tmpl: Add names needed for handling pragma
Overflow_Checks
* switch-c.adb (Scan_Front_End_Switches) Handle -gnato? and
-gnato?? where ? is 0-3
* types.ads: Updates and fixes to comment on Suppress_Record.

Index: switch-c.adb
===
--- switch-c.adb(revision 191888)
+++ switch-c.adb(working copy)
@@ -128,9 +128,8 @@
 
   --  Handle switches that do not start with -gnat
 
-  if Ptr + 3 > Max
-or else Switch_Chars (Ptr .. Ptr + 3) /= "gnat"
-  then
+  if Ptr + 3 > Max or else Switch_Chars (Ptr .. Ptr + 3) /= "gnat" then
+
  --  There are two front-end switches that do not start with -gnat:
  --  -I, --RTS
 
@@ -755,11 +754,78 @@
 
 when 'o' =>
Ptr := Ptr + 1;
-   Suppress_Options.Suppress (Overflow_Check) := False;
-   Suppress_Options.Overflow_Checks_General := Check_All;
-   Suppress_Options.Overflow_Checks_Assertions := Check_All;
-   Opt.Enable_Overflow_Checks := True;
 
+   --  Case of no digits after the -gnato
+
+   if Ptr > Max or else Switch_Chars (Ptr) not in '0' .. '3' then
+  Suppress_Options.Overflow_Checks_General:= Checked;
+  Suppress_Options.Overflow_Checks_Assertions := Checked;
+
+

Re: [rtl] combine a vec_concat of 2 vec_selects from the same vector

2012-10-01 Thread Eric Botcazou

> > /* Try to merge 2 VEC_SELECTs from the same vector into a single one. */
> 
> I was trying to avoid splitting in 2 lines, but ok I'll split.

Indeed.  Then you can remove the '2' above, it doesn't add much.

-- 
Eric Botcazou

[Ada] Set the flag In_Assertion_Expr during analysis of assertion expressions

2012-10-01 Thread Arnaud Charlet

In_Assertion_Expr should be non-zero during analysis of assertion expressions,
even for preanalysis of these expressions. So wrap the call to
Preanalyze_Spec_Expression to provide proper increment/decrement of the flag
for assertion expressions.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Yannick Moy  

* sem_ch13.adb (Add_Invariants): Analyze the invariant expression
as an assertion expression.
* sem_ch3.adb / sem_ch3.ads (Preanalyze_Assert_Expression):
New procedure that wraps a call to Preanalyze_Spec_Expression
for assertion expressions, so that In_Assertion_Expr can be
properly adjusted.
* sem_prag.adb (Analyze_PPC_In_Decl_Part,
Check_Precondition_Postcondition, Preanalyze_CTC_Args): Call the
new Preanalyze_Assert_Expression.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 191888)
+++ sem_ch3.adb (working copy)
@@ -19306,6 +19306,17 @@
   end if;
end Check_Anonymous_Access_Components;
 
+   --
+   -- Preanalyze_Assert_Expression --
+   --
+
+   procedure Preanalyze_Assert_Expression (N : Node_Id; T : Entity_Id) is
+   begin
+  In_Assertion_Expr := In_Assertion_Expr + 1;
+  Preanalyze_Spec_Expression (N, T);
+  In_Assertion_Expr := In_Assertion_Expr - 1;
+   end Preanalyze_Assert_Expression;
+

-- Preanalyze_Spec_Expression --

Index: sem_ch3.ads
===
--- sem_ch3.ads (revision 191888)
+++ sem_ch3.ads (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 1992-2011, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2012, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -246,6 +246,10 @@
--  This mechanism is also used for aspect specifications that have an
--  expression parameter that needs similar preanalysis.
 
+   procedure Preanalyze_Assert_Expression (N : Node_Id; T : Entity_Id);
+   --  Wrapper on Preanalyze_Spec_Expression for assertion expressions, so that
+   --  In_Assertion_Expr can be properly adjusted.
+
procedure Process_Full_View (N : Node_Id; Full_T, Priv_T : Entity_Id);
--  Process some semantic actions when the full view of a private type is
--  encountered and analyzed. The first action is to create the full views
Index: sem_prag.adb
===
--- sem_prag.adb(revision 191897)
+++ sem_prag.adb(working copy)
@@ -286,9 +286,7 @@
   --  Preanalyze the boolean expression, we treat this as a spec expression
   --  (i.e. similar to a default expression).
 
-  In_Assertion_Expr := In_Assertion_Expr + 1;
-  Preanalyze_Spec_Expression (Get_Pragma_Arg (Arg1), Standard_Boolean);
-  In_Assertion_Expr := In_Assertion_Expr - 1;
+  Preanalyze_Assert_Expression (Get_Pragma_Arg (Arg1), Standard_Boolean);
 
   --  In ASIS mode, for a pragma generated from a source aspect, also
   --  analyze the original aspect expression.
@@ -296,7 +294,7 @@
   if ASIS_Mode
 and then Present (Corresponding_Aspect (N))
   then
- Preanalyze_Spec_Expression
+ Preanalyze_Assert_Expression
(Expression (Corresponding_Aspect (N)), Standard_Boolean);
   end if;
 
@@ -2178,7 +2176,7 @@
 then
--  Analyze pragma expression for correctness and for ASIS use
 
-   Preanalyze_Spec_Expression
+   Preanalyze_Assert_Expression
  (Get_Pragma_Arg (Arg1), Standard_Boolean);
 
--  In ASIS mode, for a pragma generated from a source aspect,
@@ -2187,7 +2185,7 @@
if ASIS_Mode
  and then Present (Corresponding_Aspect (N))
then
-  Preanalyze_Spec_Expression
+  Preanalyze_Assert_Expression
 (Expression (Corresponding_Aspect (N)), Standard_Boolean);
end if;
 end if;
@@ -6773,7 +6771,8 @@
 
 --pragma Check (Assertion, condition [, msg]);
 
---  So rewrite pragma in this manner, and analyze the result
+--  So rewrite pragma in this manner, transfer the message
+--  argument if present, and analyze the result
 
 Expr := Get_Pragma_Arg (Arg1);

[Ada] Checks on aliasing and initialization of scalars for parameters

2012-10-01 Thread Arnaud Charlet

This patch reimplements the checks related to aliasing and initialization of
scalars for subprogram parameters and ties them to compilation flags -gnateA
and -gnateV respectively.
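
As a rough illustration of tying each check to its own switch, here is a minimal C sketch. The struct and function names are hypothetical, and the sketch recognises only the two switch spellings mentioned above; it is not the front end's actual switch scanner.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the two new option flags.  */
struct param_check_flags
{
  bool check_aliasing_of_parameters;
  bool check_validity_of_parameters;
};

/* Recognise exactly "-gnateA" or "-gnateV" and set the matching flag.
   Returns true if SW was one of the two, false otherwise.  */
bool
scan_param_check_switch (const char *sw, struct param_check_flags *flags)
{
  assert (sw != NULL && flags != NULL);

  if (strlen (sw) != 7 || strncmp (sw, "-gnate", 6) != 0)
    return false;

  switch (sw[6])
    {
    case 'A':
      flags->check_aliasing_of_parameters = true;
      return true;
    case 'V':
      flags->check_validity_of_parameters = true;
      return true;
    default:
      return false;
    }
}
```

Each switch sets one independent flag, so either check can be enabled without the other.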

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Hristian Kirtchev  

* checks.adb (Apply_Parameter_Aliasing_Checks): Removed.
(Apply_Parameter_Aliasing_And_Validity_Checks): New routine.
(Apply_Parameter_Validity_Checks): Removed.
* checks.ads (Apply_Parameter_Aliasing_Checks): Removed.
(Apply_Parameter_Aliasing_And_Validity_Checks): New routine.
(Apply_Parameter_Validity_Checks): Removed.
* exp_ch6.adb (Expand_Call): Remove the generation of parameter
aliasing checks.
* freeze.adb: Remove with and use clauses for Validsw.
(Freeze_Entity): Update the guard and generation of aliasing
and scalar initialization checks for subprogram parameters.
* opt.ads: Add new flags Check_Aliasing_Of_Parameters and
Check_Validity_Of_Parameters along with comments on usage.
* sem_attr.adb (Analyze_Attribute): Pragma Overlaps_Storage is
no longer an Ada 2012 feature.
* sem_ch4.adb: Remove with and use clauses for Checks and Validsw.
(Analyze_Call): Remove the generation of aliasing checks for
subprogram parameters.
* sem_ch13.adb: Remove with and use clauses for Validsw.
(Analyze_Aspect_Specifications): Remove the generation of scalar
initialization checks.
* switch-c.adb (Scan_Front_End_Switches): Add processing for
-gnateA and -gnateV.
* usage.adb (Usage): Add information on switches -gnateA and
-gnateV. Remove information on validity switches 'l', 'L',
'v' and 'V'.
* validsw.adb (Reset_Validity_Check_Options): Remove the
reset of flags Validity_Check_Non_Overlapping_Params
and Validity_Check_Valid_Scalars_On_Params.
(Save_Validity_Check_Options): Remove the processing
for flags Validity_Check_Non_Overlapping_Params
and Validity_Check_Valid_Scalars_On_Params.
(Set_Validity_Check_Options): Remove the processing
for flags Validity_Check_Non_Overlapping_Params and
Validity_Check_Valid_Scalars_On_Params.
* validsw.ads: Remove flags Validity_Check_Non_Overlapping_Params
and Validity_Check_Valid_Scalars_On_Params along with their
comments on usage.

Index: switch-c.adb
===
--- switch-c.adb(revision 191895)
+++ switch-c.adb(working copy)
@@ -380,6 +380,12 @@
  Enable_Switch_Storing;
  Ptr := Ptr + 1;
 
+  --  -gnateA (aliasing checks on parameters)
+
+  when 'A' =>
+ Ptr := Ptr + 1;
+ Check_Aliasing_Of_Parameters := True;
+
   --  -gnatec (configuration pragmas)
 
   when 'c' =>
@@ -566,6 +572,22 @@
   when 'P' =>
  Treat_Categorization_Errors_As_Warnings := True;
 
+  --  -gnateS (generate SCO information)
+
+  --  Include Source Coverage Obligation information in ALI
+  --  files for the benefit of source coverage analysis tools
+  --  (xcov).
+
+  when 'S' =>
+ Generate_SCO := True;
+ Ptr := Ptr + 1;
+
+  --  -gnateV (validity checks on parameters)
+
+  when 'V' =>
+ Ptr := Ptr + 1;
+ Check_Validity_Of_Parameters := True;
+
   --  -gnatez (final delimiter of explicit switches)
 
   --  All switches that come after -gnatez have been added by
@@ -577,16 +599,6 @@
  Disable_Switch_Storing;
  Ptr := Ptr + 1;
 
-  --  -gnateS (generate SCO information)
-
-  --  Include Source Coverage Obligation information in ALI
-  --  files for the benefit of source coverage analysis tools
-  --  (xcov).
-
-  when 'S' =>
- Generate_SCO := True;
- Ptr := Ptr + 1;
-
   --  All other -gnate? switches are unassigned
 
   when others =>
Index: usage.adb
===
--- usage.adb   (revision 191890)
+++ usage.adb   (working copy)
@@ -167,6 +167,11 @@
Write_Switch_Char ("Dnn");
Write_Line ("Debug expanded generated code (max line length = nn)");
 
+   --  Line for -gnateA switch
+
+   Write_Switch_Char ("eA");
+   Write_Line ("Aliasing checks on subprogram parameters");
+
--  Line for -gnatec switch
 
Write_Switch_Char ("ec=?");
@@ -227,6 +232,11 @@
Write_Switch_Char ("eS");
Write_Line ("Generate SCO (Source Coverage Obl

[Ada] Invariant checks and multiple inheritance

2012-10-01 Thread Arnaud Charlet

This patch fixes some problems involving the use of Type_Invariant'Class on
the ancestor of a derived type that also implements an interface.

The following command:

   gnatmake -q -gnat12 -gnata test_invariant
   test_invariant

must yield:

   raised SYSTEM.ASSERTIONS.ASSERT_FAILURE :
failed inherited invariant from invariants.ads:5
---
with Carrier.Next; use Carrier.Next;
procedure Test_Invariant is
  THing : NT;
begin
  HeHe (Thing);
end;
---
package Carrier is
  type PT is tagged private;
  function Invariant(X: PT) return Boolean;
  procedure Do_AandB(X: out PT);
private
  type PT is tagged record A,B: Integer; end record;
  procedure Do_A(X: out PT; V: Integer);
  procedure Do_B(X: out PT; V: Integer);
end Carrier;
---
Package body Carrier is
  procedure Do_AandB(X: out PT) is
  begin Do_A(X,42); Do_B(X,42); end Do_AandB;
  function Invariant(X: PT) return Boolean is
  begin return X.A=X.B; end Invariant;
  procedure Do_A(X: out PT; V: Integer) is begin X.A := V; end Do_A;
  procedure Do_B(X: out PT; V: Integer) is begin X.B := V; end do_B;
end Carrier;
---
with Carrier; use Carrier;
Package Invariants is
  type T is new PT with private
with Type_Invariant'class => Invariant(PT(T));
-- type T introduced by my ignorance about visibility rules inside
-- of type invariants; maybe could be on Carrier.PT already, or
-- only on I below. The conclusion applies in all cases and
-- combinations.
private
  type T is new PT with null record;
end Invariants;
---
package Interf is
  type I is Interface
  -- if you want:
with Type_Invariant'Class => Invariant(I);
  -- maybe with the same "carrier detour" because of visibility?
  function Invariant(X: I) return boolean is abstract;
  procedure HeHe(X: out I) is abstract;
end Interf;
---
with Invariants; with Interf;
package Carrier.Next is
  type NT is new Invariants.T and Interf.I with null record;
  procedure HeHe(X: out NT); -- newly added op; definitely needs
  -- to check the invariant, or else
end Carrier.Next;
---
package body Carrier.Next is
  procedure HeHe(X: out NT) is
  begin
 DO_A(PT(X),666);   -- BREAKS THE INVARIANT
  end Hehe;   -- HeHe BETTER RAISE ASSERTION_ERROR
end Carrier.Next;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Ed Schonberg  

* aspects.ads: Type_Invariant'class is a valid aspect.
* sem_ch6.adb (Is_Public_Subprogram_For): with the exception of
initialization procedures, subprograms that do not come from
source are not public for the purpose of invariant checking.
* sem_ch13.adb (Build_Invariant_Procedure): Handle properly the
case of a non-private type in a package without a private part,
when the type inherits invariants from its ancestor.

Index: aspects.ads
===
--- aspects.ads (revision 191888)
+++ aspects.ads (working copy)
@@ -191,11 +191,12 @@
--  The following array indicates aspects that accept 'Class
 
Class_Aspect_OK : constant array (Aspect_Id) of Boolean :=
-   (Aspect_Invariant => True,
-Aspect_Pre   => True,
-Aspect_Predicate => True,
-Aspect_Post  => True,
-others   => False);
+   (Aspect_Invariant  => True,
+Aspect_Pre=> True,
+Aspect_Predicate  => True,
+Aspect_Post   => True,
+Aspect_Type_Invariant => True,
+others=> False);
 
--  The following array indicates aspects that a subtype inherits from
--  its base type. True means that the subtype inherits the aspect from
Index: sem_ch6.adb
===
--- sem_ch6.adb (revision 191888)
+++ sem_ch6.adb (working copy)
@@ -11342,10 +11342,16 @@
  --  If the subprogram declaration is not a list member, it must be
  --  an Init_Proc, in which case we want to consider it to be a
  --  public subprogram, since we do get initializations to deal with.
+ --  Other internally generated subprograms are not public.
 
- if not Is_List_Member (DD) then
+ if not Is_List_Member (DD)
+   and then Is_Init_Proc (DD)
+ then
 return True;
 
+ elsif not Comes_From_Source (DD) then
+return False;
+
  --  Otherwise we test whether the subprogram is declared in the
  --  visible declarations of the package containing the type.
 
Index: sem_ch13.adb
===
--- sem_ch13.adb(revision 191900)
+++ sem_ch13.adb(working copy)
@@ -5188,9 +5188,6 @@
  Statements => Stmts));

[Ada] Ada 2012 invariant checks on subcomponents

2012-10-01 Thread Arnaud Charlet

If a record has a subcomponent whose type has a defined invariant, then there
must be an invariant check on that component whenever a value of the record type
is created or modified by a visible primitive operation of the type.
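
The idea of a record-level check that descends into subcomponents can be sketched in C as follows. This is only an illustration: the types mirror the shape of T, R1 and R2 from the test below, but all names are hypothetical and the code is not what the expander generates.

```c
#include <stdbool.h>

/* A type with an invariant, and two records that contain it at
   increasing depth.  */
struct t  { int val; };
struct r1 { struct t v; };
struct r2 { struct r1 v; };

/* Counts how often the invariant function of T has run.  */
int check_calls = 0;

/* Invariant of T: the value must be 17.  */
bool
check_t (const struct t *x)
{
  check_calls++;
  return x->val == 17;
}

/* Invariant procedure for R1: check the component of type T.  */
bool
r1_invariant (const struct r1 *x)
{
  return check_t (&x->v);
}

/* Invariant procedure for R2: recurse into the R1 component, which in
   turn reaches the part of type T.  */
bool
r2_invariant (const struct r2 *x)
{
  return r1_invariant (&x->v);
}
```

A call to `r2_invariant` reaches the nested `struct t` exactly once, which is the behaviour a per-record invariant procedure is meant to provide.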

The command:

gnatmake -q -gnat12 -gnata main
main

must yield:

CHECK
CHECK
CHECK
CHECK
OK
CHECK
CHECK
CHECK
CHECK
CHECK
CHECK

---
with Text_IO; use Text_IO;
with Ada.Assertions; use Ada.Assertions;
with P;
procedure Main is
   V : P.R2; -- Check on the T default initialization? NOK
   Table : P.A2 (1..3);
begin
   begin
  P.P3 (Table);
   exception
  when Assertion_Error => Put_Line ("OK");
   end;
   P.Prim (V.V.V); -- Check on the T reference? OK
   P.P1 (V.V); -- Check on the part? NOK
   P.P2 (V); -- Check on the part? NOK
   declare
 Thing1 : P.R3 := P.F3 (True);
 Thing2 : P.R3 := P.F3 (False);
   begin
  null;
   end;
end Main;
--
package P is

   type T is private with Type_Invariant => Check (T);

   procedure Prim (V : in out T);

   function Check (V : in T) return Boolean;

   type R1 is record
  V : T;
   end record;

   type R2 is record
  V : R1;
   end record;

   type R3 (D : Boolean) is record
  V1 : T;
  case D is
 when True => V2 : T;
 when False => Zero : Integer := 0;
  end case;
   end record;

   type A2 is array (integer range <>) of T;
   procedure P1 (V : in out R1);

   procedure P2 (V : in out R2);
   procedure P3 (T : in out A2);
   function F3 (Yes: Boolean) return R3;

private
   type T is record
  Val : Integer := 17;
   end record;
end P;
---
with Ada.Text_IO; use Ada.Text_IO;
package body P is

   ---
   -- Check --
   ---

   function Check (V : in T) return Boolean is
   begin
  Put_Line ("CHECK");
  return V.Val = 17;
   end Check;

   procedure Prim (V : in out T) is
   begin
  null;
   end Prim;

   procedure P1 (V : in out R1) is
   begin
  null;
   end P1;

   procedure P2 (V : in out R2) is
   begin
  null;
   end P2;

   procedure P3 (T : in out A2) is
   begin
  T (T'Last).Val := 18;
  null;
   end P3;

   function F3 (Yes : Boolean) return R3 is
  Result : R3 (Yes);
   begin
  return Result;
   end;
end P;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Ed Schonberg  

* exp_ch3.adb (Build_Record_Invariant_Proc): new procedure to
generate a checking procedure for record types that may have
components whose types have type invariants declared.

Index: exp_ch3.adb
===
--- exp_ch3.adb (revision 191894)
+++ exp_ch3.adb (working copy)
@@ -118,6 +118,10 @@
--  Build record initialization procedure. N is the type declaration
--  node, and Rec_Ent is the corresponding entity for the record type.
 
+   procedure Build_Record_Invariant_Proc (R_Type : Entity_Id; Nod : Node_Id);
+   --  If the record type has components whose types have invariant, build
+   --  an invariant procedure for the record type itself.
+
procedure Build_Slice_Assignment (Typ : Entity_Id);
--  Build assignment procedure for one-dimensional arrays of controlled
--  types. Other array and slice assignments are expanded in-line, but
@@ -3611,6 +3615,174 @@
   end if;
end Build_Record_Init_Proc;
 
+   
+   -- Build_Record_Invariant_Proc --
+   
+
+   procedure Build_Record_Invariant_Proc (R_Type : Entity_Id; Nod : Node_Id) is
+  Loc : constant Source_Ptr := Sloc (Nod);
+
+  Object_Name : constant Name_Id := New_Internal_Name ('I');
+  --  Name for argument of invariant procedure
+
+  Object_Entity : constant Node_Id :=
+Make_Defining_Identifier (Loc, Object_Name);
+  --  The procedure declaration entity for the argument
+
+  Invariant_Found : Boolean;
+  --  Set if any component needs an invariant check.
+
+  Proc_Id   : Entity_Id;
+  Proc_Body : Node_Id;
+  Stmts : List_Id;
+  Type_Def  : Node_Id;
+
+  function Build_Invariant_Checks (Comp_List : Node_Id) return List_Id;
+  --  Recursive procedure that generates a list of checks for components
+  --  that need it, and recurses through variant parts when present.
+
+  function Build_Component_Invariant_Call (Comp : Entity_Id)
+  return Node_Id;
+  --  Build call to invariant procedure for a record component.
+
+  
+  -- Build_Component_Invariant_Call --
+  
+
+  function Build_Component_Invariant_Call (Comp : Entity_Id)
+  return Node_Id
+  is
+ Sel_Comp : Node_Id;
+
+  begin
+ Invariant_Found := True;
+ Sel_Comp :=
+   Make_Selected_Component (Loc,
+ Prefix  => New_Occurrence_Of (Object_Entity, Loc),
+ Selector_Name => New_Occurrence_Of (Comp, Loc));
+
+ re

[Ada] Additional invariant checks on composite types

2012-10-01 Thread Arnaud Charlet

If a composite type has a declared invariant, and some of its components are of
types that have their own invariants, the invariant checks on those components
must be added to the invariant checks for the enclosing type.
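
A minimal C sketch of combining the two levels of checking, modelled loosely on the Bars test below (values in 0 .. 100, chart total equal to 100). The names are hypothetical, and the order chosen here (component checks first, then the enclosing check) is an assumption of the sketch, not something the patch description states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum chart_status { CHART_OK, VALUE_VIOLATION, CHART_VIOLATION };

/* Component invariant: each value lies in 0 .. 100.  */
bool
legal (int value)
{
  return value >= 0 && value <= 100;
}

/* Enclosing invariant: the values add up to exactly 100.  */
bool
complete (const int *chart, size_t n)
{
  long total = 0;

  for (size_t i = 0; i < n; i++)
    total += chart[i];
  return total == 100;
}

/* Combined check: every component invariant, then the invariant of
   the composite type itself.  */
enum chart_status
check_chart (const int *chart, size_t n)
{
  assert (chart != NULL);

  for (size_t i = 0; i < n; i++)
    if (!legal (chart[i]))
      return VALUE_VIOLATION;

  return complete (chart, n) ? CHART_OK : CHART_VIOLATION;
}
```

With the two data sets used in the test below, (20, 20, 20, 20, 19) fails only the chart invariant, while (30, 30, 30, 30, -20) sums to 100 and is caught only by the component check.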

The command:

   gnatmake -q -gnat12 -gnata test_bars
   test_bars

must yield:

   chart invariant violation detected
   value invariant violation detected

with Ada.Assertions; use Ada.Assertions;
with Bars; use Bars;
with Text_IO; use Text_IO;
procedure Test_Bars is
   B : Bar_chart := Bare_Bar (5);
   D : Data (1 .. 5) := (20, 20, 20, 20, 19);
begin
   begin
  Assemble (D, B);
   exception
  when Assertion_Error => Put_Line ("chart invariant violation detected");
   end;

   declare
  D : Data (1 .. 5) := (30, 30, 30, 30, -20);
   begin
  Assemble (D, B);
   exception
  when Assertion_Error => Put_Line ("value invariant violation detected");
   end;
end;
---
package Bars is
   type Value is private
 with Invariant => Legal (Value);

   type Bar_Chart (<>) is private
 with Invariant => Complete (Bar_Chart);

   type Data is array (positive range <>) of Integer;

   function Legal (It : Value) return Boolean;
   function Complete (It : Bar_Chart) return Boolean;
   function Bare_Bar (N : Positive) return Bar_Chart;
   procedure Assemble (From : Data; Result : out Bar_Chart);
private
   type Value is new Integer;
   type Bar_Chart is array (positive range <>) of Value;
end;
--- 
package body  Bars is
   --  type Value is private
   --with Invariant => Legal (Value);

   --  type Bar_Chart is private
   --with Invariant => Complete (Bar_Chart);

   function Legal (It : Value) return Boolean is
   begin
  return It >= 0 and It <= 100;
   end;

   function Complete (It : Bar_Chart) return Boolean is
  Total : Value := 0;
   begin
  for B of It loop Total := Total + B; end loop;
  return Total = 100;
   end;
  
   function Bare_Bar (N : Positive) return Bar_Chart is
  Result : Bar_Chart (1 .. N) := (100, others => 0);
   begin
  return Result;
   end;
  
   
   procedure Assemble (From : Data; Result : out Bar_Chart) is
   begin
  for J in From'range loop
 Result (J) := Value (From (J));
  end loop;
   end;
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Ed Schonberg  

* exp_ch3.ads (Build_Array_Invariant_Proc): moved to body.
* exp_ch3.adb (Build_Array_Invariant_Proc,
Build_Record_Invariant_Proc): transform into functions.
(Insert_Component_Invariant_Checks): for composite types that have
components with specified invariants, build a checking procedure,
and make into the invariant procedure of the composite type,
or incorporate it into the user-defined invariant procedure if
one has been created.
* sem_ch3.adb (Array_Type_Declaration): Checking for invariants
on the component type is deferred to the expander.

Index: exp_ch3.adb
===
--- exp_ch3.adb (revision 191902)
+++ exp_ch3.adb (working copy)
@@ -88,6 +88,22 @@
--  used for attachment of any actions required in its construction.
--  It also supplies the source location used for the procedure.
 
+   function Build_Array_Invariant_Proc
+ (A_Type : Entity_Id;
+  Nod: Node_Id) return Node_Id;
+   --  If the component of type of array type has invariants, build procedure
+   --  that checks invariant on all components of the array. Ada 2012 specifies
+   --  that an invariant on some type T must be applied to in-out parameters
+   --  and return values that include a part of type T. If the array type has
+   --  an otherwise specified invariant, the component check procedure is
+   --  called from within the user-specified invariant. Otherwise this becomes
+   --  the invariant procedure for the array type.
+
+   function Build_Record_Invariant_Proc
+ (R_Type : Entity_Id;
+  Nod: Node_Id) return Node_Id;
+   --  Ditto for record types.
+
function Build_Discriminant_Formals
  (Rec_Id : Entity_Id;
   Use_Dl : Boolean) return List_Id;
@@ -118,10 +134,6 @@
--  Build record initialization procedure. N is the type declaration
--  node, and Rec_Ent is the corresponding entity for the record type.
 
-   procedure Build_Record_Invariant_Proc (R_Type : Entity_Id; Nod : Node_Id);
-   --  If the record type has components whose types have invariant, build
-   --  an invariant procedure for the record type itself.
-
procedure Build_Slice_Assignment (Typ : Entity_Id);
--  Build assignment procedure for one-dimensional arrays of controlled
--  types. Other array and slice assignments are expanded in-line, but
@@ -184,6 +196,14 @@
--  Treat user-defined stream operations as renaming_as_body if the
--  subprogram they rename is not frozen when the type is frozen.
 
+   procedure Insert_Component_Invariant_Checks
+ (N   : Node_Id;
+ Typ

[Ada] Front-end support for per-instance coverage analysis

2012-10-01 Thread Arnaud Charlet

This change adds circuitry to the front-end that allows the code generated
for different instances of the same generic to be identified in debugging
information. This will subsequently be used to allow per-instance coverage
analysis.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Thomas Quinot  

* sinput.ads, sinput.adb, sinput-l.adb, sinput-c.adb (Sinput): New
Instances table, tracking all generic instantiations. Source file
attribute Instance replaces previous Instantiation attribute with an
index into the Instances table.
(Iterate_On_Instances): New generic procedure.
(Create_Instantiation_Source): Record instantiations in Instances.
(Tree_Read, Tree_Write): Read/write the instance table.
* scils.ads, scos.adb (SCO_Instance_Table): New table, contains
information copied from Sinput.Instance_Table, but self-contained
within the SCO data structures.
* par_sco.ads, par_sco.adb (To_Source_Location): Move to library level.
(Record_Instance): New subprogram, used by...
(Populate_SCO_Instance_Table): New subprogram to fill
the SCO instance table from the Sinput one (called by SCO_Output).
* opt.ads (Generate_SCO_Instance_Table): New option.
* put_scos.adb (Write_Instance_Table): New subprogram, used by...
(Put_SCOs): Dump the instance table at the end of SCO information
if requested.
* get_scos.adb (Get_SCOs): Read SCO_Instance_Table.
* types.h: Add declaration for Instance_Id.
* back_end.adb (Call_Back_End): Pass instance ids in source file
information table.
(Scan_Back_End_Switches): -fdebug-instances sets
Opt.Generate_SCO_Instance_Table.
* gcc-interface/gigi.h: File_Info_Type includes instance id.
* gcc-interface/trans.c: Under -fdebug-instances, set instance
id in line map from same in file info.

Index: par_sco.adb
===
--- par_sco.adb (revision 191888)
+++ par_sco.adb (working copy)
@@ -102,6 +102,9 @@
--  excluding OR and AND) and returns True if so, False otherwise, it does
--  no other processing.
 
+   function To_Source_Location (S : Source_Ptr) return Source_Location;
+   --  Converts Source_Ptr value to Source_Location (line/col) format
+
procedure Process_Decisions
  (N   : Node_Id;
   T   : Character;
@@ -138,6 +141,9 @@
end record;
No_Dominant : constant Dominant_Info := (' ', Empty);
 
+   procedure Record_Instance (Id : Instance_Id; Inst_Sloc : Source_Ptr);
+   --  Add one entry from the instance table to the corresponding SCO table
+
procedure Traverse_Declarations_Or_Statements
  (L : List_Id;
   D : Dominant_Info := No_Dominant;
@@ -696,16 +702,37 @@
   Debug_Put_SCOs;
end pscos;
 
+   -
+   -- Record_Instance --
+   -
+
+   procedure Record_Instance (Id : Instance_Id; Inst_Sloc : Source_Ptr) is
+  Inst_Src  : constant Source_File_Index :=
+Get_Source_File_Index (Inst_Sloc);
+   begin
+  SCO_Instance_Table.Append
+((Inst_Dep_Num   => Dependency_Num (Unit (Inst_Src)),
+  Inst_Loc   => To_Source_Location (Inst_Sloc),
+  Enclosing_Instance => SCO_Instance_Index (Instance (Inst_Src;
+  pragma Assert
+(SCO_Instance_Table.Last = SCO_Instance_Index (Id));
+   end Record_Instance;
+

-- SCO_Output --

 
procedure SCO_Output is
+  procedure Populate_SCO_Instance_Table is
+new Sinput.Iterate_On_Instances (Record_Instance);
+
begin
   if Debug_Flag_Dot_OO then
  dsco;
   end if;
 
+  Populate_SCO_Instance_Table;
+
   --  Sort the unit tables based on dependency numbers
 
   Unit_Table_Sort : declare
@@ -949,26 +976,6 @@
   Pragma_Sloc : Source_Ptr := No_Location;
   Pragma_Name : Pragma_Id  := Unknown_Pragma)
is
-  function To_Source_Location (S : Source_Ptr) return Source_Location;
-  --  Converts Source_Ptr value to Source_Location (line/col) format
-
-  
-  -- To_Source_Location --
-  
-
-  function To_Source_Location (S : Source_Ptr) return Source_Location is
-  begin
- if S = No_Location then
-return No_Source_Location;
- else
-return
-  (Line => Get_Logical_Line_Number (S),
-   Col  => Get_Column_Number (S));
- end if;
-  end To_Source_Location;
-
-   --  Start of processing for Set_Table_Entry
-
begin
   SCO_Table.Append
 ((C1  => C1,
@@ -980,6 +987,21 @@
   Pragma_Name => Pragma_Name));
end Set_Table_Entry;
 
+   
+   -- To_Source_Location --
+   
+
+   function To_Source_Location (S : S

Re: vec_cond_expr adjustments

2012-10-01 Thread Richard Guenther

On Fri, Sep 28, 2012 at 6:55 PM, Marc Glisse  wrote:
> On Fri, 28 Sep 2012, Richard Guenther wrote:
>
>> On Fri, Sep 28, 2012 at 12:42 AM, Marc Glisse 
>> wrote:
>>>
>>> Hello,
>>>
>>> I have been experimenting with generating VEC_COND_EXPR from the
>>> front-end,
>>> and these are just a couple things I noticed.
>>>
>>> 1) optabs.c requires that the first argument of vec_cond_expr be a
>>> comparison, but verify_gimple_assign_ternary only checks
>>> is_gimple_condexpr,
>>> like for COND_EXPR. In the long term, it seems better to also allow
>>> ssa_name
>>> and vector_cst (thus match the is_gimple_condexpr condition), but for now
>>> I
>>> just want to know early if I created an invalid vec_cond_expr.
>>
>>
>> optabs should be fixed instead, an is_gimple_val condition is implicitly
>> val != 0.
>
>
> For vectors, I think it should be val < 0 (with an appropriate cast of val
> to a signed integer vector type if necessary). Or (val & highbit) != 0, but
> that's longer.

I don't think so.  Throughout the compiler we generally assume false == 0
and anything else is true.  (Yes, for FP there is STORE_FLAG_VALUE, but
its scope is quite limited - if we want something similar for vectors we'd
have to invent it.)
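
To make the two readings concrete, here is a small C sketch of the per-lane truth test under each convention. The function names are hypothetical and a lane is modelled as a plain 32-bit integer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reading 1: false == 0, anything else is true.  */
bool
lane_true_nonzero (int32_t lane)
{
  return lane != 0;
}

/* Reading 2: only the sign (high) bit decides.  */
bool
lane_true_signbit (int32_t lane)
{
  return lane < 0;
}
```

For the canonical mask values 0 and -1 the two readings agree. They differ for a lane such as 1, which is true under the first reading and false under the second.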

>> The tree.[ch] and gimple-fold.c hunks are ok if tested properly, the
>> tree-ssa-forwprop.c idea of using TREE_TYPE (cond), too.
>
>
> Ok, I will retest that way.
>
>
>> I don't like the tree-cfg.c change, instead re-factor optabs.c to
>> get a decomposed cond for vector_compare_rtx and appropriately
>> "decompose" a non-comparison-class cond in expand_vec_cond_expr.
>
>
> So vector_compare_rtx will take as arguments rcode, t_op0, t_op1 instead of
> cond.

Yes.

> And in expand_vec_cond_expr, if I have a condition, I pass its
> elements to vector_compare_rtx, and otherwise I use 0 and the code for
> LT_EXPR as the other arguments.

Yes, but NE_EXPR and 0 (see above).

>
>> If we for example have
>>
>> predicate = a < b;
>> x = predicate ? d : e;
>> y = predicate ? f : g;
>>
>> we ideally want to re-use the predicate computation on targets where
>> that would be optimal (and combine should be able to recover the
>> case where it is not).
>
>
> That I don't understand. The vcond instruction implemented by targets takes
> as arguments d, e, cmp, a, b and emits the comparison itself. I don't see
> how I can avoid sending to the targets both (d,e,<,a,b) and (f,g,<,a,b).
> They will notice eventually that a < b is computed twice and remove one of
> the two, but I don't see how to do that in optabs.c. Or I can compute x = a < b,
> use x < 0 as the comparison passed to the targets, and expect targets (those
> for which it is true) to recognize that < 0 is useless in a vector condition
> (PR54700), or is useless on a comparison result.

But that's a limitation of how vcond works.  ISTR there is/was a vselect
instruction as well, taking a "mask" and two vectors to select from.  At least
that's how vcond works internally for some sub-targets.
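
A mask-based select of the kind mentioned above can be sketched in scalar C like this. It is only an illustration with hypothetical names and a fixed lane count, and it assumes canonical mask lanes of all ones or all zeros. The point is that the comparison runs once and the resulting mask feeds both selects.

```c
#include <stdint.h>

#define LANES 4

/* Compute the predicate once: all-ones where a < b, zero elsewhere.  */
void
compare_lt (const int32_t *a, const int32_t *b, int32_t *mask)
{
  for (int i = 0; i < LANES; i++)
    mask[i] = a[i] < b[i] ? -1 : 0;
}

/* Select lane-wise between IF_TRUE and IF_FALSE using MASK.  */
void
select_by_mask (const int32_t *mask, const int32_t *if_true,
                const int32_t *if_false, int32_t *out)
{
  for (int i = 0; i < LANES; i++)
    out[i] = (if_true[i] & mask[i]) | (if_false[i] & ~mask[i]);
}
```

For the example `x = predicate ? d : e; y = predicate ? f : g;`, one call to `compare_lt` followed by two calls to `select_by_mask` avoids repeating the comparison.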

Richard.

> Thanks for the comments,
>
> --
> Marc Glisse

Profile housekeeping 5/n (make RTL loop optimizers to use loop bounds better)

2012-10-01 Thread Jan Hubicka

Hi,
this patch unifies the maximal iteration estimate logic between SCEV and
loop-iv.  Both are now using loop->nb_iterations_upper_bound.  I decided to
keep the same API for SCEV code as for RTL code, so I made
estimated_loop_iterations and max_loop_iterations not try to recompute
bounds (and ICE) when invoked without SCEV initialized.

The patch updates RTL optimizers to use the estimated_loop_iterations and
max_loop_iterations. This has a few advantages:
  1) The loop unroller can now take into account estimates stored in the
     loop structure by an earlier pass (I think none exist though).
     It is however better than using expected_loop_iterations since
     the profile might get out of date with expansion.

  2) The loop peeling code now uses max iteration bounds. This makes it
     possible, e.g., to peel vectorizer prologues/epilogues/scalar loops,
     so -fpeel-loops now improves my low iteration count testcase by
     about 10%.

  3) Same for loop unswitching.
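
The guard used for unswitching in the patch below can be read as the following C sketch. The function name is hypothetical, and treating a negative value as "no estimate available" is an assumption taken from the `iterations >= 0` test in the patch.

```c
#include <stdbool.h>

/* ITERATIONS is an estimated iteration count, or a negative value
   when no estimate is available (assumption of this sketch).
   Returns true when the loop is known not to roll, in which case
   the transformation is not worth doing.  */
bool
loop_known_not_to_roll (long iterations)
{
  return iterations >= 0 && iterations <= 1;
}
```

An unknown estimate therefore never blocks the transformation; only a recorded estimate of 0 or 1 does.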

I am not really a friend of the new double_int API. I copied some existing
examples but find it ugly.  Why are operators for comparisons and
division missing? Why can't from_*/to_* be a cast, at least for basic
integer types?

Regtested/bootstrapped x86_64-linux, seems sane?

I also wonder whether the loop vectorizer should update the estimates after
the loop iteration count is reduced by vectorizing.

Honza
* loop-unswitch.c (unswitch_single_loop): Use
estimated_loop_iterations_int to prevent unswitching when loop
is known to not roll.
* tree-ssa-loop-niter.c (estimated_loop_iterations): Do not segfault
when SCEV is not initialized.
(max_loop_iterations): Likewise.
* tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Use
estimated_loop_iterations_int to prevent unswitching when
loop is known to not roll.
* tree-scalar-evolution.c (scev_initialized_p): New function.
* tree-scalar-evolution.h (scev_initialized_p): Likewise.
* loop-unroll.c (decide_peel_once_rolling): Use
max_loop_iterations_int.
(unroll_loop_constant_iterations): Update
nb_iterations_upper_bound and nb_iterations_estimate.
(decide_unroll_runtime_iterations): Use
estimated_loop_iterations or max_loop_iterations.
(unroll_loop_runtime_iterations): Fix profile updating.
(decide_peel_simple): Use estimated_loop_iterations
and max_loop_iterations.
(decide_unroll_stupid): Use estimated_loop_iterations
and max_loop_iterations.
* loop-doloop.c (doloop_modify): Use max_loop_iterations_int.
(doloop_optimize): Likewise.
* loop-iv.c (iv_number_of_iterations): Use record_niter_bound.
(find_simple_exit): Likewise.
* cfgloop.h (struct niter_desc): Remove niter_max.

Index: loop-unswitch.c
===
*** loop-unswitch.c (revision 191867)
--- loop-unswitch.c (working copy)
*** unswitch_single_loop (struct loop *loop,
*** 257,262 
--- 257,263 
rtx cond, rcond = NULL_RTX, conds, rconds, acond, cinsn;
int repeat;
edge e;
+   HOST_WIDE_INT iterations;
  
/* Do not unswitch too much.  */
if (num > PARAM_VALUE (PARAM_MAX_UNSWITCH_LEVEL))
*** unswitch_single_loop (struct loop *loop,
*** 299,305 
  }
  
/* Nor if the loop usually does not roll.  */
!   if (expected_loop_iterations (loop) < 1)
  {
if (dump_file)
fprintf (dump_file, ";; Not unswitching, loop iterations < 1\n");
--- 300,307 
  }
  
/* Nor if the loop usually does not roll.  */
!   iterations = estimated_loop_iterations_int (loop);
!   if (iterations >= 0 && iterations <= 1)
  {
if (dump_file)
fprintf (dump_file, ";; Not unswitching, loop iterations < 1\n");
Index: tree-ssa-loop-niter.c
===
*** tree-ssa-loop-niter.c   (revision 191867)
--- tree-ssa-loop-niter.c   (working copy)
*** estimate_numbers_of_iterations_loop (str
*** 3012,3020 
  bool
  estimated_loop_iterations (struct loop *loop, double_int *nit)
  {
!   estimate_numbers_of_iterations_loop (loop);
if (!loop->any_estimate)
! return false;
  
*nit = loop->nb_iterations_estimate;
return true;
--- 3012,3034 
  bool
  estimated_loop_iterations (struct loop *loop, double_int *nit)
  {
!   /* When SCEV information is available, try to update loop iterations
!  estimate.  Otherwise just return whatever we recorded earlier.  */
!   if (scev_initialized_p ())
! estimate_numbers_of_iterations_loop (loop);
! 
!   /* Even if the bound is not recorded, possibly we can derrive one from
!  profile.  */
if (!loop->any_estimate)
! {
!   if (loop->header->count)
!   {
!   *nit = gcov_type_to_double_int
!  (expected_loop_iterations_unbounded (loop) + 1);
! return true;
!

Re: [patch] Minor TARGET_MEM_REF cleanup

2012-10-01 Thread Richard Guenther

On Sat, Sep 29, 2012 at 1:17 PM, Eric Botcazou  wrote:
> Hi,
>
> for simple loops like:
>
> extern int a[];
> extern int b[];
>
> void foo (int l)
> {
>   int i;
>
>   for (i = 0; i < l; i++)
> a[i] = b [i];
> }
>
> you get in the .lim3 dump:
>
> Unanalyzed memory reference 0: _5 = MEM[symbol: b, index: ivtmp.3_1, step: 4,
> offset: 0B];
> Memory reference 1: MEM[symbol: a, index: ivtmp.3_1, step: 4, offset: 0B]
>
> so the pass analyzes the store but not the load, which seems an oversight.
> The patch also folds copy_mem_ref_info into its only user and removes it.
>
> Tested on x86_64-suse-linux, OK for mainline?

Please take the opportunity to clean up simple_mem_ref_in_stmt some more.
Both loads and stores in assigns require gimple_assign_single_p, thus do

 if (!gimple_assign_single_p (stmt))
  return NULL;

before deciding on store/load.  To decide that the stmt is a load then do

  if (TREE_CODE (*lhs) == SSA_NAME
  && gimple_vuse (stmt))

it is a store if

   gimple_vdef (stmt)
   && (TREE_CODE (*rhs) == SSA_NAME
  || is_gimple_min_invariant (*rhs))

else it may still be an aggregate copy but LIM doesn't handle those
(though they may still be interesting for disambiguation ...)
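
The decision order described above can be sketched outside of GCC. This is only a toy model: struct toy_stmt, enum operand_kind and classify_simple_mem_ref are hypothetical names, and the boolean fields merely stand in for gimple_assign_single_p, gimple_vuse and gimple_vdef; it is not the code in tree-ssa-loop-im.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a GIMPLE assignment; every name here is hypothetical.  */
enum operand_kind { OP_SSA_NAME, OP_INVARIANT, OP_MEMORY };

struct toy_stmt
{
  bool single_rhs;        /* stands in for gimple_assign_single_p */
  bool has_vuse;          /* stands in for gimple_vuse (stmt) != NULL */
  bool has_vdef;          /* stands in for gimple_vdef (stmt) != NULL */
  enum operand_kind lhs;
  enum operand_kind rhs;
};

enum ref_kind { REF_NONE, REF_LOAD, REF_STORE };

/* Classify STMT in the order suggested in the review.  */
static enum ref_kind
classify_simple_mem_ref (const struct toy_stmt *stmt)
{
  assert (stmt != NULL);

  /* Both loads and stores must be single-rhs assignments.  */
  if (!stmt->single_rhs)
    return REF_NONE;

  /* Load: the result is a register and memory is read.  */
  if (stmt->lhs == OP_SSA_NAME && stmt->has_vuse)
    return REF_LOAD;

  /* Store: memory is written from a register or an invariant.  */
  if (stmt->has_vdef
      && (stmt->rhs == OP_SSA_NAME || stmt->rhs == OP_INVARIANT))
    return REF_STORE;

  /* Anything else, e.g. an aggregate copy, is left alone.  */
  return REF_NONE;
}
```

Checking single_rhs first means the load and store tests never see a statement with more than one rhs operand, which is the point of hoisting that check.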

The tree-ssa-address parts are ok as-is.

Thanks,
Richard.


>
> 2012-09-29  Eric Botcazou  
>
> * tree.h (copy_mem_ref_info): Delete.
> * tree-ssa-address.c (copy_mem_ref_info): Likewise.
> (maybe_fold_tmr): Copy flags manually.
> * tree-ssa-loop-im.c (simple_mem_ref_in_stmt): Accept TARGET_MEM_REF
> on the RHS as well.
>
>
> --
> Eric Botcazou

Re: [PATCH RFA] Implement register pressure directed hoist pass

2012-10-01 Thread Steven Bosscher

On Sat, Sep 29, 2012 at 8:37 AM, Bin Cheng  wrote:
> This is the updated patch according to your comments. Please review.
> I also re-collected code size data and found it is improved by about 0.24%
> for mips, which is better than previous data. I believe this should be
> caused by recent changes in trunk, rather than by using DF caches to
> calculate register pressure.

Hello,

Thanks for the update. The first look wasn't a very thorough review,
so I have more comments now. Sorry for that, I should have taken the
time for this the first time round...

First, as a general note: Please add a fat comment somewhere before
the code of the pass to explain how the register pressure driven
hoisting pass works. The existing implementation used to be an almost
one-to-one copy from Muchnick's book, but lately the code has moved
away from that (e.g. with Maxim's patches from last year) and it gets
a bit hard to follow without some explanation. You should at least
document your own algorithm, but I'd really appreciate if you could
also spend some time writing down how things work in general and/or
how the GCC implementation differs from Muchnick's original algorithm.

> +Use IRA to evaluate register pressure in hoist pass for decisions to hoist

"the code hoisting pass" (brownie points for an xref to the flag for
code hoisting).

> -   - do rough calc of how many regs are needed in each block, and a rough
> - calc of how many regs are available in each class and use that to
> - throttle back the code in cases where RTX_COST is minimal.

This comment still applies to LCM.

> +/* Record all regs that are set in any one insn.  Communication from
> +   mark_reg_{store,clobber} and global_conflicts.  Asm can refer to
> +   all hard-registers.  */
> +static rtx regs_set[(FIRST_PSEUDO_REGISTER > MAX_RECOG_OPERANDS
> +  ? FIRST_PSEUDO_REGISTER : MAX_RECOG_OPERANDS) * 2];
> +/* Number of regs stored in the previous array.  */
> +static int n_regs_set;

You can use DF_INSN_DEFS for this in calculate_bb_reg_pressure.

But then again...

> +   while (n_regs_set-- > 0)
> + {
> +   rtx note = find_regno_note (insn, REG_UNUSED,
> +   REGNO (regs_set[n_regs_set]));
> +   if (! note)
> + continue;
> +
> +   mark_reg_death (XEXP (note, 0));
> + }

Why not just mark all registers mentioned in REG_UNUSED notes as dead, i.e.:

for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
  if (REG_NOTE_KIND (note) == REG_UNUSED)
    mark_reg_death (XEXP (note, 0));

?
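
The single walk over the note list can be modelled without the RTL types. In this sketch struct toy_note and mark_unused_regs_dead are hypothetical stand-ins for the REG_NOTES chain and mark_reg_death; the only point is that no per-insn array of set registers is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for RTL register notes.  */
enum note_kind { NOTE_DEAD, NOTE_UNUSED, NOTE_EQUAL };

struct toy_note
{
  enum note_kind kind;
  int regno;
  struct toy_note *next;
};

/* Walk NOTES once and set DEAD[regno] for every "unused" note.
   Returns the number of registers marked.  */
static int
mark_unused_regs_dead (const struct toy_note *notes,
                       unsigned char *dead, int n_regs)
{
  int marked = 0;

  for (const struct toy_note *note = notes; note; note = note->next)
    if (note->kind == NOTE_UNUSED)
      {
        assert (note->regno >= 0 && note->regno < n_regs);
        dead[note->regno] = 1;
        marked++;
      }

  return marked;
}
```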

> +static int hoist_expr_reaches_here_p (basic_block, struct expr*, basic_block,

space between "struct expr" and "*".

> +   int *, bitmap_head *);

"bitmap_head *" => "bitmap".
Likewise here:

> /* Basic blocks that have occurrences reachable from BB.  */
> bitmap_head _from_bbs, *from_bbs = &_from_bbs;
>+/* Basic blocks through which expr is hoisted.  */
>+bitmap_head _hoisted_bbs, *hoisted_bbs = &_hoisted_bbs;

And using BITMAP_ALLOC here:
> bitmap_initialize (from_bbs, 0);
> +   if (flag_ira_hoist_pressure == 1)
> + bitmap_initialize (hoisted_bbs, 0);

(Some older code uses the "bitmap_head *" form, but that pre-dates
coretypes.h. Newer code that uses the "old form" should really be
fixed also.)

Please also use an obstack for the hoisted_bbs and from_bbs bitmaps.
They're currently allocated in GC space but that's completely
unnecessary for objects with such a clearly identifiable life time.
You can even allocate these bitmaps outside the loop, you're already
calling bitmap_clear on them.

> EXPR_BB. Stop

Two spaces after a period in comments.

> @@ -2863,7 +2909,8 @@ static int
>if (visited == NULL)
>  {
>visited_allocated_locally = 1;
> -  visited = XCNEWVEC (char, last_basic_block);
> +  visited = sbitmap_alloc (last_basic_block);
> +  sbitmap_zero (visited);
>  }

Can you please submit this as a separate patch? I'd go ahead and
commit it as obvious, but it's always good to have one change per
patch and this bit is independent of the reg-pressure dependent
hoisting.

> +fira-hoist-pressure
> +Common Report Var(flag_ira_hoist_pressure) Init(-1)
> +Use IRA based register pressure calculation
> +in hoist optimizations.

Please add the "Optimization" marker. Why initialize to -1? The
default initialization is 0, and if the flag is set it takes the value
1. If you follow that common behavior, you can replace all occurrences
of "if (flag_ira_hoist_pressure == 1)" with just "if
(flag_ira_hoist_pressure)".

> +  /* Enable register pressure hoist when optimizing for size on Thumb1 set.  */
> +  if (TARGET_THUMB1 && optimize_function_for_size_p (cfun)
> +  && flag_ira_hoist_pressure == -1)
> +flag_ira_hoist_pressure = 1;

One would expect this to be a win on all targets, but you probably
looked at th

Re: [PATCH] Fix PR middle-end/54759

2012-10-01 Thread Richard Guenther

On Sun, Sep 30, 2012 at 9:03 PM, Dehao Chen  wrote:
> Hi,
>
> This patch fixes the bug when comparing location to UNKNOWN_LOC.
>
> Bootstrapped and passed gcc regression test.
>
> Okay for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> Dehao
>
> 2012-09-30  Dehao Chen  
>
> PR middle-end/54759
> * gcc/tree-vect-loop-manip.c (slpeel_make_loop_iterate_ntimes): Use
> LOCATION_LOCUS to compare with UNKNOWN_LOCATION.
> (slpeel_tree_peel_loop_to_edge): Likewise.
> * gcc/tree-vectorizer.c (vectorize_loops): Likewise.

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Richard Guenther

On Mon, Oct 1, 2012 at 7:48 AM, Jakub Jelinek  wrote:
> On Sun, Sep 30, 2012 at 06:50:50PM -0400, Vladimir Makarov wrote:
>>   But I think that LRA cpu time problem for this test can be fixed.
>> But I don't think I can fix it for 2 weeks.  So if people believe
>> that current LRA behaviour on this PR is a stopper to include it
>> into gcc4.8 then we should postpone its inclusion until gcc4.9 when
>> I hope to fix it.
>
> I think this testcase shouldn't be a show stopper for LRA inclusion into
> 4.8, but something to look at for stage3.

I agree here.

> I think a lot of GCC passes have scalability issues on that testcase,
> that is why it must be compiled with -O1 and not higher optimization
> options, so perhaps it would be enough to choose a faster algorithm
> generating worse code for the huge functions and -O1.

Yes, we spent quite some time in making basic optimization work for
insane testcases (basically avoid quadratic or bigger complexity in any
IL size variable (number of basic-blocks, edges, instructions, pseudos, etc.)).

And indeed if you use -O2 we do have issues with existing passes (and even
at -O1 points-to analysis can wreck things, or even profile guessing - see
existing bugs for that).  Basically I would tune -O1 towards being able to
compile and optimize insane testcases with memory and compile-time
requirements that are linear in any of the above complexity measures.

Thus, falling back to the -O0 register allocation strategy at certain
thresholds for the above complexity measures is fine (existing IRA
for example has really bad scaling on the number of loops in the function,
but you can tweak with flags to make it not consider that).

> And I agree it is primarily a bug in the generator that it creates such huge
> functions, that can't perform very well.

Well, not for -O2+, yes, but at least we should try(!) hard.

Thanks,
Richard.

> Jakub

Re: [Patch,avr]: Ad PR rtl-optimization/52543: Undo the MEM->UNSPEC hack

2012-10-01 Thread Denis Chertykov

2012/9/30 Georg-Johann Lay :
>> Denis Chertykov wrote:
>> I tried to use secondary reloads a few years ago (maybe 5 or 7).
>> I definitely remember only one thing: secondary reload should be
>> avoided as long as possible.
>
>
> Currently each mov has to be decorated with moving the segment to RAMPZ and
> (depending on target) restoring RAMPZ afterwards.
>
> GCC has no concept of a segmented layout and there is no way to describe
> that.
>
> One way is to hack with UNSPEC and bypass ira/reload altogether but IMO that
> is no good solution.  Besides, that only works because the mov insns have
> special constraints (there will be writes to flash, flash does not change
> after load time, etc.)
>
>
>> The best way to learn about it is GDB ;-)
>
>
> I think reload.c:push_secondary_reload() should be the right place but it
> does not call targetm.secondary_reload so that no secondary is generated.
>
> It's hard to tell where the place is that is responsible for the bypassing
> of calling the hook.
>
>
>>> From the internals I don't see why it is skipped and the responsiveness
>>> in
>>> the gcc-help@ list on such topics is zero :-(
>>
>>
>> IMHO  it's a question to gcc@ not to gcc-help@
>
>
> Ok, I will try my luck again.
>
> Do you have an idea for a better approach, i.e. not set RAMPZ over and over
> again?

Maybe it's a question similar to the "addressing with infinite
displacement" optimization that we discussed in the
long thread "[Patch, AVR]: Fix PR46779".
Generally, the answer is here:
http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01281.html
and here:
> The biggest problem that I see, from the 950612-1.c test case
> with the current handling of the "infinite displacement frame
> pointer", is that the adjustments to the frame pointer are
> never exposed as separate instructions, so there's never a
> chance to optimize them.

You can set RAMPZ over and over but you can give GCC a chance to
optimize it out in cse or cse-postreload passes.

Denis.

PS: as you know "GCC has no concept of a segmented layout" may be
better just drop it.

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher

On Mon, Oct 1, 2012 at 9:16 AM, Jakub Jelinek  wrote:
> On Mon, Oct 01, 2012 at 08:47:13AM +0200, Steven Bosscher wrote:
>> The test case compiles just fine at -O2, only VRP has trouble with it.
>> Let's try to stick with facts, not speculation.
>
> I was talking about the other PR, PR26854, which from what I remember when
> trying it myself and even the latest -O3 time reports from the reduced
> testcase show that IRA/reload aren't there very significant (for -O3 IRA
> takes ~ 6% and reload ~ 1%).

OK, but what does LRA take? Vlad's numbers for 64-bits and looking at user time:

Reload: 503.26user
LRA: 598.70user

So if reload is ~1% of 503s then that'd be ~5s. And the only
difference between the two timings is LRA instead of reload, so LRA
takes ~100s, or 20%.
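
The estimate above can be written out as a small calculation. This is a sketch under the same assumptions as the mail: reload is about 1% of the reload build, and the two builds differ only in the reload/LRA step. The function name estimate_lra_seconds is made up for the illustration.

```c
#include <assert.h>

/* Estimate the seconds spent in LRA from two total user times.
   RELOAD_TOTAL is the build using reload, LRA_TOTAL the build using
   LRA, RELOAD_SHARE the fraction of RELOAD_TOTAL spent in reload.  */
static double
estimate_lra_seconds (double reload_total, double lra_total,
                      double reload_share)
{
  assert (reload_total > 0.0 && lra_total > 0.0);

  /* Time reload itself took in the first build.  */
  double reload_seconds = reload_total * reload_share;

  /* LRA replaces reload, so it accounts for the extra time plus
     whatever reload used to take.  */
  return (lra_total - reload_total) + reload_seconds;
}
```

With the figures quoted (503.26s, 598.70s, 1%) this gives a little over 100s, which is about 20% of the reload build's time, or roughly 17% if measured against the LRA build's own total.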

>> I've put a lot of hard work into it to fix almost all scalability problems
>> on this PR for gcc 4.8. LRA undoes all of that work. I understand it is
>> painful for some people to hear, but I remain of opinion that LRA cannot be
>> considered "ready" if it scales so much worse than everything else in the
>> compiler.
>
> Judging the whole implementation from just these corner cases and not how it
> performs on other testcases (SPEC, rebuild a distro, ...) is IMHO not the
> right thing, if Vlad thinks the corner cases are fixable during stage3; IMHO
> we should allow LRA in, worst case it can be disabled by default even for
> i?86/x86_64.

If I were asked to do a guest lecture on compiler construction (to be
clear: I'd be highly surprised if anyone asked me to, but for the sake
of argument, bear with me ;-) then I'd start by stating that
algorithms should be designed for the corner cases, because the
devil's always in the details.

But more to the point regarding stage3: It will already be a busy
stage3 if the other, probably even more significant, scalability
issues have to be fixed, i.e. var-tracking and macro expansion. And
there's also the symtab work that's bound to cause some interesting
bugs still to be shaken out. With all due respect to Vlad, and,
seriously, hats off to Vlad for tackling reload and coming up with a
much easier to understand and nicely phase-split replacement, I just
don't believe that these scalability issues can be addressed in
stage3.

It's now very late stage1, and LRA was originally scheduled for GCC
4.9. Why the sudden hurrying? Did I miss the 2 minute warning?

Ciao!
Steven

Re: Profile housekeeping 5/n (make RTL loop optimizers to use loop bounds better)

2012-10-01 Thread Richard Guenther

On Mon, Oct 1, 2012 at 11:37 AM, Jan Hubicka  wrote:
> Hi,
> this patch commonizes the maximal iteration estimate logic in between SCEV and
> loop-iv.  Both are now using loop->nb_iterations_upper_bound.  I decided to
> keep same API for SCEV code as for RTL code, so I made
> estimated_loop_iterations and max_loop_iterations to not try to recompute
> bounds and ICE when invoked without SCEV fired on.
>
> The patch updates RTL optimizers to use the estimated_loop_iterations and
> max_loop_iterations. This has a few advantages:
>   1) loop unroller can now take into account estimates stored into
>  loop structure by earlier pass (I think none exist though)
>  It is however better than using expected_loop_iterations since
>  profile might get out of date with expansion.
>
>   2) loop peeling code now uses max iterations bounds. This makes it possible,
>  e.g., to peel vectorizer prologues/epilogues/scalar loops so -fpeel-loops
>  now improves my low iteration count testcase by about 10%
>
>   3) Same for loop unswitching.
>
> I am not really friends with the new double_int API. I copied some existing
> examples but find it ugly.  Why do we miss operators for comparisons and
> division? Why can't from_*/to_* be a cast at least for basic integer types?
>
> Regtested/bootstrapped x86_64-linux, seems sane?

Yes.  Can you add a testcase or two?  I tweaked RTL unroll/peel after preserving
loops to not blindly unroll/peel everything 8 times (not sure if _I_
added testcases ...).

> I also wonder if loop vectorizer should not update the estimates after
> loop iteration count is reduced by vectorizing.

Probably yes.

Thanks,
Richard.

> Honza
> * loop-unswitch.c (unswitch_single_loop): Use
> estimated_loop_iterations_int to prevent unswitching when loop
> is known to not roll.
> * tree-ssa-loop-niter.c (estimated_loop_iterations): Do not segfault
> when SCEV is not initialized.
> (max_loop_iterations): Likewise.
> * tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Use
> estimated_loop_iterations_int to prevent unswitching when
> loop is known to not roll.
> * tree-scalar-evolution.c (scev_initialized_p): New function.
> * tree-scalar-evolution.h (scev_initialized_p): Likewise.
> * loop-unroll.c (decide_peel_once_rolling): Use
> max_loop_iterations_int.
> (unroll_loop_constant_iterations): Update
> nb_iterations_upper_bound and nb_iterations_estimate.
> (decide_unroll_runtime_iterations): Use
> estimated_loop_iterations or max_loop_iterations;
> (unroll_loop_runtime_iterations): fix profile updating.
> (decide_peel_simple): Use estimated_loop_iterations
> and max_loop_iterations.
> (decide_unroll_stupid): Use estimated_loop_iterations
> and max_loop_iterations.
> * loop-doloop.c (doloop_modify): Use max_loop_iterations_int.
> (doloop_optimize): Likewise.
> * loop-iv.c (iv_number_of_iterations): Use record_niter_bound.
> (find_simple_exit): Likewise.
> * cfgloop.h (struct niter_desc): Remove niter_max.
>
> Index: loop-unswitch.c
> ===
> *** loop-unswitch.c (revision 191867)
> --- loop-unswitch.c (working copy)
> *** unswitch_single_loop (struct loop *loop,
> *** 257,262 
> --- 257,263 
> rtx cond, rcond = NULL_RTX, conds, rconds, acond, cinsn;
> int repeat;
> edge e;
> +   HOST_WIDE_INT iterations;
>
> /* Do not unswitch too much.  */
> if (num > PARAM_VALUE (PARAM_MAX_UNSWITCH_LEVEL))
> *** unswitch_single_loop (struct loop *loop,
> *** 299,305 
>   }
>
> /* Nor if the loop usually does not roll.  */
> !   if (expected_loop_iterations (loop) < 1)
>   {
> if (dump_file)
> fprintf (dump_file, ";; Not unswitching, loop iterations < 1\n");
> --- 300,307 
>   }
>
> /* Nor if the loop usually does not roll.  */
> !   iterations = estimated_loop_iterations_int (loop);
> !   if (iterations >= 0 && iterations <= 1)
>   {
> if (dump_file)
> fprintf (dump_file, ";; Not unswitching, loop iterations < 1\n");
> Index: tree-ssa-loop-niter.c
> ===
> *** tree-ssa-loop-niter.c   (revision 191867)
> --- tree-ssa-loop-niter.c   (working copy)
> *** estimate_numbers_of_iterations_loop (str
> *** 3012,3020 
>   bool
>   estimated_loop_iterations (struct loop *loop, double_int *nit)
>   {
> !   estimate_numbers_of_iterations_loop (loop);
> if (!loop->any_estimate)
> ! return false;
>
> *nit = loop->nb_iterations_estimate;
> return true;
> --- 3012,3034 
>   bool
>   estimated_loop_iterations (struct loop *loop, double_int *nit)
>   {
> !   /* When SCEV information is available, try to update loop iterations
> !  estimate.  Othe

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher

On Mon, Oct 1, 2012 at 11:52 AM, Richard Guenther
 wrote:
>> I think this testcase shouldn't be a show stopper for LRA inclusion into
>> 4.8, but something to look at for stage3.
>
> I agree here.

I would also agree if it were not for the fact that IRA is already a
scalability bottle-neck and that has been known for a long time, too.
I have no confidence at all that if LRA goes in now, these scalability
problems will be solved in stage3 or at any next release cycle. It's
always the same thing with GCC: Once a patch is in, everyone moves on
to the next fancy new thing dropping the not-quite-broken but also
not-quite-working things on the floor.

Ciao!
Steven

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher

On Sun, Sep 30, 2012 at 7:03 PM, Richard Guenther
 wrote:
> On Sun, Sep 30, 2012 at 6:52 PM, Steven Bosscher  
> wrote:
>> Hi,
>>
>>
>> To look at it in yet another way:
>>
>>>  integrated RA   : 189.34 (16%) usr
>>>  LRA non-specific:  59.82 ( 5%) usr
>>>  LRA virtuals eliminatenon:  56.79 ( 5%) usr
>>>  LRA create live ranges  : 175.30 (15%) usr
>>>  LRA hard reg assignment : 130.85 (11%) usr
>>
>> The IRA pass is slower than the next-slowest pass (tree PRE) by almost
>> a factor 2.5.  Each of the individually-measured *phases* of LRA is
>> slower than the complete IRA *pass*. These 5 timevars together make up
>> for 52% of all compile time.
>
> That figure indeed makes IRA + LRA look bad.  Did you by chance identify
> anything obvious that can be done to improve the situation?

The "LRA create live ranges" time is mostly spent in merge_live_ranges
walking lists. Perhaps the live ranges can be represented better with
a sorted VEC, so that the start and finish points can be looked up in
logarithmic time instead of linear time.
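
As an illustration of the sorted-vector idea only: the sketch below keeps non-overlapping ranges sorted by start point in a plain array and finds the range containing a program point by binary search. struct toy_range and find_range are hypothetical; this is not LRA's live-range representation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical live range covering program points [start, finish].  */
struct toy_range
{
  int start;
  int finish;
};

/* Return the index of the range in RANGES that contains POINT, or -1.
   RANGES must be sorted by start and must not overlap.  */
static int
find_range (const struct toy_range *ranges, size_t n, int point)
{
  assert (ranges != NULL || n == 0);

  size_t lo = 0, hi = n;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (point < ranges[mid].start)
        hi = mid;                 /* look in the lower half */
      else if (point > ranges[mid].finish)
        lo = mid + 1;             /* look in the upper half */
      else
        return (int) mid;
    }

  return -1;
}
```

The lookup takes O(log n) steps, while walking a linked list of ranges takes O(n); the price is that inserting into the middle of the array is no longer constant time.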

Ciao!
Steven

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Jakub Jelinek

On Mon, Oct 01, 2012 at 12:01:36PM +0200, Steven Bosscher wrote:
> I would also agree if it were not for the fact that IRA is already a
> scalability bottle-neck and that has been known for a long time, too.
> I have no confidence at all that if LRA goes in now, these scalability
> problems will be solved in stage3 or at any next release cycle. It's
> always the same thing with GCC: Once a patch is in, everyone moves on
> to the next fancy new thing dropping the not-quite-broken but also
> not-quite-working things on the floor.

If we open a P1 bug for it for 4.8, then it will need to be resolved some
way before branching.  I think Vlad is committed to bugfixing LRA, after
all the intent is for 4.9 to enable it on more (all?) targets, and all the
bugfixing and scalability work on LRA is needed for that anyway.

Jakub

[Ada] Implement extended overflow handling for comparison ops

2012-10-01 Thread Arnaud Charlet

This patch enables extended overflow handling for comparison ops
so that the comparison can be done in an expanded type, or even
in bignum mode if operating in ELIMINATED overflow check mode.

The following test program:

   with Text_IO; use Text_IO;
   procedure Overflowm2 is
      function r1
        (a, b, c, d : Integer) return Boolean is
      begin
         return a + b + c + d <= Integer'Last;
      end;
      function r2
        (a, b, c, d : Integer) return Boolean is
      begin
         return a * b * c * d >= Integer'First;
      end;
   begin
      begin
         Put_Line
           ("r1 returns " &
            Boolean'Image
              (r1 (Integer'Last, Integer'Last,
                   -Integer'Last, -Integer'Last)));
      exception
         when Constraint_Error =>
            Put_Line ("r1 raises exception");
      end;

      begin
         Put_Line
           ("r2 returns " &
            Boolean'Image
              (r2 (Integer'Last, Integer'Last,
                   Integer'Last, 0)));
      exception
         when Constraint_Error =>
            Put_Line ("r2 raises exception");
      end;
   end Overflowm2;

In CHECKED mode (-gnato1) we get:

r1 raises exception
r2 raises exception

since the first addition in r1 and the first multiplication
in r2 result in values outside the bounds of Integer'Base.

In MINIMIZED mode (-gnato2) we get:

r1 returns TRUE
r2 raises exception

since we can compute the addition result in Long_Long_Integer,
and do the comparison in Long_Long_Integer mode, but the
second multiplication yields a value outside this range,
so that causes an overflow.

In ELIMINATED mode (-gnato3) we get:

r1 returns TRUE
r2 returns TRUE

Because now we use Bignum arithmetic for the intermediate
multiplication results, and the final comparison is also
done in bignum mode.
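
The difference between the CHECKED and MINIMIZED results can be mimicked in C, as a rough analogy rather than a description of what GNAT generates. The sketch assumes a compiler that provides the __builtin_add_overflow and __builtin_mul_overflow builtins (GCC 5 or later, Clang); int32_t plays the role of Integer and int64_t the role of Long_Long_Integer. A false return from r1_checked or r2_minimized corresponds to Constraint_Error being raised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CHECKED-like: every addition is checked in 32 bits.
   Returns false if an intermediate result overflows.  */
static bool
r1_checked (int32_t a, int32_t b, int32_t c, int32_t d, bool *result)
{
  assert (result != NULL);
  int32_t t;

  if (__builtin_add_overflow (a, b, &t)
      || __builtin_add_overflow (t, c, &t)
      || __builtin_add_overflow (t, d, &t))
    return false;

  *result = t <= INT32_MAX;
  return true;
}

/* MINIMIZED-like: the sum of four 32-bit values always fits in
   64 bits, so the comparison can be done in the wider type.  */
static bool
r1_minimized (int32_t a, int32_t b, int32_t c, int32_t d)
{
  int64_t t = (int64_t) a + b + c + d;
  return t <= INT32_MAX;
}

/* MINIMIZED-like product: computed in 64 bits, which can still
   overflow.  Returns false in that case.  */
static bool
r2_minimized (int32_t a, int32_t b, int32_t c, int32_t d, bool *result)
{
  assert (result != NULL);
  int64_t t;

  if (__builtin_mul_overflow ((int64_t) a, (int64_t) b, &t)
      || __builtin_mul_overflow (t, (int64_t) c, &t)
      || __builtin_mul_overflow (t, (int64_t) d, &t))
    return false;

  *result = t >= INT32_MIN;
  return true;
}
```

With the arguments from the test program, r1_checked fails on the first addition, r1_minimized yields true, and r2_minimized fails on the second multiplication, matching the -gnato1 and -gnato2 outputs shown above. The ELIMINATED case would need arbitrary-precision arithmetic and is not modelled here.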

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Robert Dewar  

* exp_ch4.adb (Expand_Compare_Minimize_Eliminate_Overflow):
New procedure.

Index: exp_ch4.adb
===
--- exp_ch4.adb (revision 191912)
+++ exp_ch4.adb (working copy)
@@ -140,6 +140,10 @@
procedure Expand_Short_Circuit_Operator (N : Node_Id);
--  Common expansion processing for short-circuit boolean operators
 
+   procedure Expand_Compare_Minimize_Eliminate_Overflow (N : Node_Id);
+   --  Deal with comparison in Minimize/Eliminate overflow mode. This is where
+   --  we allow comparison of "out of range" values.
+
function Expand_Composite_Equality
  (Nod: Node_Id;
   Typ: Entity_Id;
@@ -2276,6 +2280,237 @@
   end;
end Expand_Boolean_Operator;
 
+   ------------------------------------------------
+   -- Expand_Compare_Minimize_Eliminate_Overflow --
+   ------------------------------------------------
+
+   procedure Expand_Compare_Minimize_Eliminate_Overflow (N : Node_Id) is
+  Loc : constant Source_Ptr := Sloc (N);
+
+  Llo, Lhi : Uint;
+  Rlo, Rhi : Uint;
+
+  LLIB : constant Entity_Id := Base_Type (Standard_Long_Long_Integer);
+  --  Entity for Long_Long_Integer'Base
+
+  Check : constant Overflow_Check_Type := Overflow_Check_Mode (Empty);
+  --  Current checking mode
+
+  procedure Set_True;
+  procedure Set_False;
+  --  These procedures rewrite N with an occurrence of Standard_True or
+  --  Standard_False, and then makes a call to Warn_On_Known_Condition.
+
+  ---------------
+  -- Set_False --
+  ---------------
+
+  procedure Set_False is
+  begin
+ Rewrite (N, New_Occurrence_Of (Standard_False, Loc));
+ Warn_On_Known_Condition (N);
+  end Set_False;
+
+  --------------
+  -- Set_True --
+  --------------
+
+  procedure Set_True is
+  begin
+ Rewrite (N, New_Occurrence_Of (Standard_True, Loc));
+ Warn_On_Known_Condition (N);
+  end Set_True;
+
+   --  Start of processing for Expand_Compare_Minimize_Eliminate_Overflow
+
+   begin
+  --  Nothing to do unless we have a comparison operator with operands
+  --  that are signed integer types, and we are operating in either
+  --  MINIMIZED or ELIMINATED overflow checking mode.
+
+  if Nkind (N) not in N_Op_Compare
+or else Check not in Minimized_Or_Eliminated
+or else not Is_Signed_Integer_Type (Etype (Left_Opnd (N)))
+  then
+ return;
+  end if;
+
+  --  OK, this is the case we are interested in. First step is to process
+  --  our operands using the Minimize_Eliminate circuitry which applies
+  --  this processing to the two operand subtrees.
+
+  Minimize_Eliminate_Overflow_Checks (Left_Opnd (N),  Llo, Lhi);
+  Minimize_Eliminate_Overflow_Checks (Right_Opnd (N), Rlo, Rhi);
+
+  --  Se

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher

On Mon, Oct 1, 2012 at 12:14 PM, Jakub Jelinek  wrote:
> On Mon, Oct 01, 2012 at 12:01:36PM +0200, Steven Bosscher wrote:
>> I would also agree if it were not for the fact that IRA is already a
>> scalability bottle-neck and that has been known for a long time, too.
>> I have no confidence at all that if LRA goes in now, these scalability
>> problems will be solved in stage3 or at any next release cycle. It's
>> always the same thing with GCC: Once a patch is in, everyone moves on
>> to the next fancy new thing dropping the not-quite-broken but also
>> not-quite-working things on the floor.
>
> If we open a P1 bug for it for 4.8, then it will need to be resolved some
> way before branching.  I think Vlad is committed to bugfixing LRA, after
> all the intent is for 4.9 to enable it on more (all?) targets, and all the
> bugfixing and scalability work on LRA is needed for that anyway.

I don't question Vlad's commitment, but the last time I met him, he
only had two hands just like everyone else.

But I've made my point and it seems that I'm not voting with the majority.

Ciao!
Steven

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Bernd Schmidt

On 10/01/2012 12:14 PM, Jakub Jelinek wrote:
> On Mon, Oct 01, 2012 at 12:01:36PM +0200, Steven Bosscher wrote:
>> I would also agree if it were not for the fact that IRA is already a
>> scalability bottle-neck and that has been known for a long time, too.
>> I have no confidence at all that if LRA goes in now, these scalability
>> problems will be solved in stage3 or at any next release cycle. It's
>> always the same thing with GCC: Once a patch is in, everyone moves on
>> to the next fancy new thing dropping the not-quite-broken but also
>> not-quite-working things on the floor.
> 
> If we open a P1 bug for it for 4.8, then it will need to be resolved some
> way before branching.  I think Vlad is committed to bugfixing LRA, after
> all the intent is for 4.9 to enable it on more (all?) targets, and all the
> bugfixing and scalability work on LRA is needed for that anyway.

Why can't this be done on the branch? We've made the mistake of rushing
things into mainline too early a few times before, we should have
learned by now. And adding more half transitions is not something we
really want either.


Bernd

[Patch, Committed] Fix declared inline after being called warning

2012-10-01 Thread Tom de Vries

Hi,

I've committed to branch 4.7 as obvious attached patch that fixes a compiler
warning 'declared inline after being called'.

I ran into this warning when building the 4.7 branch with a gcc 4.3 compiler:
...
var-tracking.c:558: warning: 'set_dv_changed' declared inline after being called
var-tracking.c:558: warning: previous declaration of 'set_dv_changed' was here
...

Other instances of this problem are:
- http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00256.html
- http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01426.html

I've fixed this in the 4.7 branch. The problem is not present in the 4.6 branch,
and with trunk the warning doesn't trigger, I suppose because we're using g++ 
now.
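
A reduced illustration of the pattern, with made-up names (set_changed, count_changes) rather than the var-tracking.c code: the forward declaration carries the same 'inline' specifier as the later definition, so a call placed between the two sees a consistent declaration.

```c
#include <assert.h>
#include <stdbool.h>

/* The prototype already says 'inline', matching the definition below.  */
static inline void set_changed (bool *flag, bool newv);

/* This caller appears before the definition of set_changed.  */
static int
count_changes (bool *flags, int n)
{
  assert (n >= 0);

  for (int i = 0; i < n; i++)
    set_changed (&flags[i], true);

  return n;
}

static inline void
set_changed (bool *flag, bool newv)
{
  *flag = newv;
}
```

Dropping 'inline' from the prototype alone reproduces the situation the warning describes: the function is called while declared non-inline and only later declared inline.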

Built on i686-pc-linux-gnu.

Thanks,
- Tom

2012-10-01  Tom de Vries  

* var-tracking.c (set_dv_changed): Add an 'inline' function specifier to
the prototype.
Index: gcc/var-tracking.c
===
--- gcc/var-tracking.c (revision 191792)
+++ gcc/var-tracking.c (working copy)
@@ -570,7 +570,7 @@ static void dump_vars (htab_t);
 static void dump_dataflow_set (dataflow_set *);
 static void dump_dataflow_sets (void);
 
-static void set_dv_changed (decl_or_value, bool);
+static inline void set_dv_changed (decl_or_value, bool);
 static void variable_was_changed (variable, dataflow_set *);
 static void **set_slot_part (dataflow_set *, rtx, void **,
 			 decl_or_value, HOST_WIDE_INT,

[patch][lra] Use XNEWVEC and friends instead of xmalloc/xrealloc, and add some timevars

2012-10-01 Thread Steven Bosscher

Hello,

This patch uses the libiberty new-like operators instead of using
xmalloc/xrealloc.
It also adds timevars for the main LRA phases, and it fixes a warning
suggesting a space before a ';' in an only-looping for loop.
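
For readers unfamiliar with the libiberty macros, the sketch below shows the style of allocation they give. DEMO_NEWVEC, DEMO_RESIZEVEC and demo_xrealloc are local stand-ins written for this example, modelled on XNEWVEC, XRESIZEVEC and xrealloc; they are not the libiberty definitions, and make_squares is just a toy user.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for xrealloc: never returns NULL, aborts instead.  */
static void *
demo_xrealloc (void *p, size_t size)
{
  void *q = realloc (p, size ? size : 1);
  if (!q)
    {
      fputs ("out of memory\n", stderr);
      abort ();
    }
  return q;
}

/* Typed wrappers: the element type is named once and the cast and
   sizeof are generated consistently from it.  */
#define DEMO_NEWVEC(T, N) \
  ((T *) demo_xrealloc (NULL, sizeof (T) * (N)))
#define DEMO_RESIZEVEC(T, P, N) \
  ((T *) demo_xrealloc ((void *) (P), sizeof (T) * (N)))

/* Allocate a vector of N ints holding 0, 1, 4, 9, ...  */
static int *
make_squares (size_t n)
{
  int *v = DEMO_NEWVEC (int, n);
  assert (v != NULL);

  for (size_t i = 0; i < n; i++)
    v[i] = (int) (i * i);

  return v;
}
```

Compared with a bare xmalloc (n * sizeof (int)) plus a hand-written cast, the macro form leaves no room for the cast and the sizeof to disagree.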

Bootstrapped (lra-branch, of course) on x86_64-unknown-linux-gnu. OK?

Ciao!
Steven


lra_XALLOC.diff
Description: Binary data

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher

On Mon, Oct 1, 2012 at 12:10 PM, Steven Bosscher  wrote:
> The " LRA create live range" time is mostly spent in merge_live_ranges
> walking lists.

Hmm no, that's just gcc17's ancient debugger telling me lies.
lra_live_range_in_p is not even used.

/me upgrades to something newer than gdb 6.8...

Ciao!
Steven

Re: Constant-fold vector comparisons

2012-10-01 Thread Richard Guenther

On Sat, Sep 29, 2012 at 3:25 PM, Marc Glisse  wrote:
> Hello,
>
> this patch does 2 things (I should have split it in 2, but the questions go
> together):
>
> 1) it handles constant folding of vector comparisons,
>
> 2) it fixes another place where vectors are not expected (I'll probably wait
> to have front-end support and testcases to do more of those, but there is
> something to discuss).
>
> I wasn't sure what integer_truep should test exactly. For integer: == 1 or
> != 0? For vectors: == -1 or < 0? I chose the one that worked best for the
> forwprop case where I used it.
>
> It seems that before this patch, the middle-end didn't know how comparison
> results were encoded (a good reason for VEC_COND_EXPR to require a
> comparison as its first argument). I am using the OpenCL encoding that what
> matters is the high bit of each vector element. I am not quite sure what
> happens for targets (are there any?) that use a different encoding. When
> expanding vcond, they can do the comparison as they like. When expanding an
> isolated comparison, I expect they have to expand it as vcond(a it should be ok, but I could easily have missed something.
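
The encoding discussed here (a true element has all bits set, so only its high bit needs testing) can be shown on plain arrays. This is a sketch of the idea only: fold_vec_lt and vec_all_true are hypothetical helpers operating on int32_t arrays, not the fold-const.c or tree.c code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fold an element-wise "a < b" on constant vectors: each result
   element is -1 (all bits set) for true and 0 for false.  */
static void
fold_vec_lt (const int32_t *a, const int32_t *b, int32_t *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] < b[i] ? -1 : 0;
}

/* A mask counts as "true" when every element has its high bit set,
   i.e. is negative.  */
static bool
vec_all_true (const int32_t *mask, size_t n)
{
  assert (n > 0);

  for (size_t i = 0; i < n; i++)
    if (mask[i] >= 0)
      return false;

  return true;
}
```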

Comments below

>
> 2012-10-01  Marc Glisse  
>
> gcc/
> * tree.c (integer_truep): New function.
> * tree.h (integer_truep): Declare.
> * tree-ssa-forwprop.c (forward_propagate_into_cond): Call it.
> Don't use boolean_type_node for vectors.
> * fold-const.c (fold_relational_const): Handle VECTOR_CST.
>
> gcc/testsuite/
> * gcc.dg/tree-ssa/foldconst-6.c: New testcase.
>
> --
> Marc Glisse
> Index: gcc/tree.h
> ===
> --- gcc/tree.h  (revision 191850)
> +++ gcc/tree.h  (working copy)
> @@ -5272,20 +5272,25 @@ extern int integer_zerop (const_tree);
>
>  /* integer_onep (tree x) is nonzero if X is an integer constant of value 1.  */
>
>  extern int integer_onep (const_tree);
>
>  /* integer_all_onesp (tree x) is nonzero if X is an integer constant
> all of whose significant bits are 1.  */
>
>  extern int integer_all_onesp (const_tree);
>
> +/* integer_truep (tree x) is nonzero if X is an integer constant of value 1,
> +   or a vector constant of value < 0.  */
> +
> +extern bool integer_truep (const_tree);
> +
>  /* integer_pow2p (tree x) is nonzero is X is an integer constant with
> exactly one bit 1.  */
>
>  extern int integer_pow2p (const_tree);
>
>  /* integer_nonzerop (tree x) is nonzero if X is an integer constant
> with a nonzero value.  */
>
>  extern int integer_nonzerop (const_tree);
>
> Index: gcc/tree-ssa-forwprop.c
> ===
> --- gcc/tree-ssa-forwprop.c (revision 191850)
> +++ gcc/tree-ssa-forwprop.c (working copy)
> @@ -564,46 +564,46 @@ forward_propagate_into_cond (gimple_stmt
>enum tree_code code;
>tree name = cond;
>gimple def_stmt = get_prop_source_stmt (name, true, NULL);
>if (!def_stmt || !can_propagate_from (def_stmt))
> return 0;
>
>code = gimple_assign_rhs_code (def_stmt);
>if (TREE_CODE_CLASS (code) == tcc_comparison)
> tmp = fold_build2_loc (gimple_location (def_stmt),
>code,
> -  boolean_type_node,
> +  TREE_TYPE (cond),

That's obvious.

>gimple_assign_rhs1 (def_stmt),
>gimple_assign_rhs2 (def_stmt));
>else if ((code == BIT_NOT_EXPR
> && TYPE_PRECISION (TREE_TYPE (cond)) == 1)
>|| (code == BIT_XOR_EXPR
> -  && integer_onep (gimple_assign_rhs2 (def_stmt
> +  && integer_truep (gimple_assign_rhs2 (def_stmt

See below.

> {
>   tmp = gimple_assign_rhs1 (def_stmt);
>   swap = true;
> }
>  }
>
>if (tmp
>&& is_gimple_condexpr (tmp))
>  {
>if (dump_file && tmp)
> {
>   fprintf (dump_file, "  Replaced '");
>   print_generic_expr (dump_file, cond, 0);
>   fprintf (dump_file, "' with '");
>   print_generic_expr (dump_file, tmp, 0);
>   fprintf (dump_file, "'\n");
> }
>
> -  if (integer_onep (tmp))
> +  if (integer_truep (tmp))
> gimple_assign_set_rhs_from_tree (gsi_p, gimple_assign_rhs2 (stmt));
>else if (integer_zerop (tmp))
> gimple_assign_set_rhs_from_tree (gsi_p, gimple_assign_rhs3 (stmt));
>else
> {
>   gimple_assign_set_rhs1 (stmt, unshare_expr (tmp));
>   if (swap)
> {
>   tree t = gimple_assign_rhs2 (stmt);
>   gimple_assign_set_rhs2 (stmt, gimple_assign_rhs3 (stmt));
> Index: gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c
> ===
> --- gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c (revision 0)

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-10-01 Thread H.J. Lu

On Sun, Sep 30, 2012 at 11:36 PM, Sharad Singhai  wrote:
> Resend to gcc-patches
>
> I have addressed the comments by fixing all the minor issues,
> bootstrapped and tested on x86_64. I did the recommended reshuffling
> by moving non-tree code from tree-dump.c into a new file dumpfile.c.
>
> I committed two successive revisions
> r191883 Main patch with the dump infrastructure changes. However, I
> accidentally left out a new file, dumpfile.c.
> r191884 Added dumpfile.c, and did the renaming of dump_* functions
> from gimple_pretty_print.[ch].
>
> As things stand right now, r191883 is broken because of the missing
> file 'dumpfile.c', which the very next commit fixes. Anyone who got
> broken revision r191883, please svn update. I am really very sorry
> about that.
>
> I have a couple more minor patches which deal with renaming; I plan to
> address those later.
>

It caused:

FAIL: gcc.dg/tree-ssa/gen-vect-11.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/gen-vect-11.c scan-tree-dump-times vect
"vectorized 1 loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-11a.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/gen-vect-11a.c scan-tree-dump-times vect
"vectorized 1 loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-11b.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/gen-vect-11b.c scan-tree-dump-times vect
"vectorized 0 loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-11c.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/gen-vect-11c.c scan-tree-dump-times vect
"vectorized 0 loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-2.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/gen-vect-2.c scan-tree-dump-times vect
"vectorized 1 loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-25.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/gen-vect-25.c scan-tree-dump-times vect
"vectorized 2 loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-26.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
"Alignment of access forced using peeling" 1
FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
"vectorized 1 loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-28.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
"Alignment of access forced using peeling" 1
FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
"vectorized 1 loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-32.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/gen-vect-32.c scan-tree-dump-times vect
"vectorized 1 loops" 1
FAIL: gfortran.dg/vect/O3-pr36119.f90 (test for excess errors)
FAIL: gfortran.dg/vect/O3-pr39595.f (test for excess errors)
FAIL: gfortran.dg/vect/Ofast-pr50414.f90 (test for excess errors)
FAIL: gfortran.dg/vect/cost-model-pr34445.f (test for excess errors)
FAIL: gfortran.dg/vect/cost-model-pr34445a.f (test for excess errors)
FAIL: gfortran.dg/vect/fast-math-pr38968.f90 (test for excess errors)
FAIL: gfortran.dg/vect/fast-math-pr38968.f90 scan-tree-dump vect
"vectorized 1 loops"
FAIL: gfortran.dg/vect/fast-math-real8-pr40801.f90 (test for excess errors)
FAIL: gfortran.dg/vect/fast-math-real8-pr40801.f90 (test for excess errors)
FAIL: gfortran.dg/vect/fast-math-vect-8.f90 (test for excess errors)
FAIL: gfortran.dg/vect/fast-math-vect-8.f90 scan-tree-dump-times vect
"vectorized 1 loops" 1
FAIL: gfortran.dg/vect/no-fre-no-copy-prop-O3-pr51704.f90 (test for
excess errors)
FAIL: gfortran.dg/vect/no-vfa-pr32377.f90 (test for excess errors)
FAIL: gfortran.dg/vect/no-vfa-pr32377.f90 scan-tree-dump-times vect
"vectorized 2 loops" 1
FAIL: gfortran.dg/vect/no-vfa-pr32457.f90 (test for excess errors)
FAIL: gfortran.dg/vect/no-vfa-pr32457.f90 scan-tree-dump-times vect
"vectorized 0 loops" 1
FAIL: gfortran.dg/vect/pr19049.f90  -O   scan-tree-dump-times vect
"complicated access pattern" 1
FAIL: gfortran.dg/vect/pr19049.f90  -O  (test for excess errors)
FAIL: gfortran.dg/vect/pr32377.f90  -O   scan-tree-dump-times vect
"vectorized 2 loops" 1
FAIL: gfortran.dg/vect/pr32377.f90  -O  (test for excess errors)
FAIL: gfortran.dg/vect/pr32380.f  -O   scan-tree-dump-times vect
"vectorized 6 loops" 1
FAIL: gfortran.dg/vect/pr32380.f  -O  (test for excess errors)
FAIL: gfortran.dg/vect/pr33301.f  -O  (test for excess errors)
FAIL: gfortran.dg/vect/pr50178.f90  -O  (test for excess errors)
FAIL: gfortran.dg/vect/pr50412.f90  -O  (test for excess errors)
FAIL: gfortran.dg/vect/pr51058-2.f90  -O  (test for excess errors)
FAIL: gfortran.dg/vect/pr51058.f90  -O  (test for excess errors)
FAIL: gfortran.dg/vect/pr51285.f90  -O  (test for excess errors)
FAIL: gfortran.dg/vect/vect-1.f90  -O   scan-tree-dump-times vect
"vectorized 3 loops" 1
FAIL: gfortran.dg/vect/vect-1.f90  -O  (test for excess errors)
FAIL: gfortran.dg/vect/vect-2.f90  -O   scan-tree-dump-times vect
"Alignment of access forced using peeling" 3
FAIL: gfortran.dg/vect/vect-2.f90  -O   scan-tree-dump-times vect
"Vectorizing an unaligned access" 2
FAIL: gfortran.dg/vect/vect-2.f90  -O   scan-tree-dump-times vect
"vectorized 3 loops" 1
FAIL: gfortran.dg/vect/vect-2.f90  -O  (test for excess errors)
FAI

[patch] experimenting with renumbering of pseudos after expand

2012-10-01 Thread Steven Bosscher

Hello,

For most code, expand creates a lot of pseudos that are cleaned up in
subsequent passes, if they even live long enough to make it there. On
average, for cc1 preprocessed source, the number of "holes" in
regno_reg_rtx is about half the size of that array, or in other words:
regno_reg_rtx is almost a sparse array.

I've been experimenting with the attached patch to renumber pseudos
just before initializing DF. For the already-notorious test case of
PR54146 on x86_64, the patch reduces max_reg_num from 348404 to 180502
for the largest function. This reduces the memory footprint of the
test case to less than 6GB, an almost 25% reduction, and it speeds up
the DF_LR and DF_LIVE problems a bit (~15% but that's not much on the
total compile time).
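
The renumbering idea can be sketched roughly as follows. This is not the attached patch: the function name compact_regs and its interface are made up for illustration, and the register table is modelled as a plain array of pointers where NULL marks a hole.

```c
#include <stddef.h>

/* Hypothetical sketch: REGS[i] is non-NULL when register number i is
   still in use.  Entries below FIRST_PSEUDO keep their numbers; live
   entries from FIRST_PSEUDO up are slid down to close the holes.
   NEW_NUM[i] receives the new number of old register i, or -1 if it
   was a hole.  Returns the number of slots in use afterwards.  */
size_t
compact_regs (void **regs, size_t n, size_t first_pseudo, long *new_num)
{
  size_t next = first_pseudo < n ? first_pseudo : n;

  for (size_t i = 0; i < n; i++)
    {
      if (i < first_pseudo)
        new_num[i] = (long) i;
      else if (regs[i] != NULL)
        {
          /* NEXT never exceeds I, so this never overwrites a live
             entry that has not been visited yet.  */
          new_num[i] = (long) next;
          regs[next++] = regs[i];
        }
      else
        new_num[i] = -1;
    }

  /* Clear the tail that is no longer in use.  */
  for (size_t i = next; i < n; i++)
    regs[i] = NULL;

  return next;
}
```

A real implementation would also have to rewrite every reference to a renumbered pseudo, which is where the NEW_NUM map would be used.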

I had hopes that it would help for LRA compile times also, but
unfortunately the only significant change is this one:

without patch:
 LRA hard reg assignment : 130.85 (11%) usr   0.20 ( 1%) sys 131.17
(11%) wall   0 kB ( 0%) ggc

with patch:
 LRA hard reg assignment : 108.92 ( 9%) usr   0.07 ( 0%) sys 109.03 (
9%) wall   0 kB ( 0%) ggc

Anyway, putting the patch out there to show that I've done more than
just complaining ;-)

It also would suggest that the scalability challenge for LRA is not
the number of pseudos but something else (number of insns, I guess).

Ciao!
Steven


compact_regno_reg_rtx.diff
Description: Binary data

[PATCH] Fix PR47799 - debug info for early-inlining with LTO

2012-10-01 Thread Richard Guenther


This tries to emit proper debug information for early-inlined
functions from LTO LTRANS phase (thus, emit DW_TAG_inlined_subroutine
and allow gdb to set breakpoints).  We need to avoid confusing
LTO and dwarf2out with the full abstract block tree, so this
patch "flattens" the abstract block tree by always using the
ultimate origin for BLOCK_ABSTRACT_ORIGIN on blocks which are
inlined_function_outer_scope_p.  Thus, it tries to output the
minimal info dwarf2out.c needs to emit the desired debug information.
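
As a rough illustration of what "always using the ultimate origin" means, the sketch below follows a chain of origin pointers to its end. The struct and function names are hypothetical stand-ins, not GCC's BLOCK representation.

```c
#include <stddef.h>

/* Hypothetical stand-in for a BLOCK node; the field name is
   illustrative only.  */
struct block
{
  struct block *abstract_origin;  /* NULL, or the block this was copied from.  */
};

/* Follow the origin chain to its end.  A block with no origin, or one
   that is its own origin, is treated as the ultimate origin.  */
struct block *
ultimate_origin (struct block *b)
{
  while (b != NULL
         && b->abstract_origin != NULL
         && b->abstract_origin != b)
    b = b->abstract_origin;
  return b;
}

/* "Flatten" one block by pointing it directly at its ultimate origin.  */
void
flatten_origin (struct block *b)
{
  if (b != NULL && b->abstract_origin != NULL)
    b->abstract_origin = ultimate_origin (b->abstract_origin);
}
```

After flattening, a consumer only ever sees one level of indirection, whatever the depth of the inlining chain that produced the block.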

As with LTO all abstract inline instances get generated late for
early inlined functions I had to amend the "hack" for extern inline
functions to always output dies for decls that come their way
through dwarf2out_abstract_function.  And further down not crash
on a NULL DECL_INITIAL (when LTO decided to output the function
body in another LTRANS unit or if it does not get output at all).

Currently LTO-bootstrapping and testing on x86_64-unknown-linux-gnu.

Jason, are the dwarf2out bits ok with you?

I've so far toyed with examples like

int x, y;
static inline int foo (int i) { y = i; return y; }
static inline int bar (int i) { x = i; return foo (x); }
int main ()
{
  int k = 0;
  int res = bar (k);
  return res;
}

and debug information with/without LTO is now reasonably the same
and I can set breakpoints on the inlined instances.

Thanks,
Richard.

2012-10-01  Richard Guenther  

PR lto/47788
* tree-streamer-out.c (write_ts_block_tree_pointers): For
inlined functions outer scopes write the ultimate origin
as BLOCK_ABSTRACT_ORIGIN and BLOCK_SOURCE_LOCATION.
Do not stream the fragment chains.
(lto_input_ts_block_tree_pointers): Likewise.
* dwarf2out.c (gen_subprogram_die): Handle NULL DECL_INITIAL.
(dwarf2out_decl): Always output DECL_ABSTRACT function decls.

Index: gcc/tree-streamer-in.c
===
*** gcc/tree-streamer-in.c  (revision 191824)
--- gcc/tree-streamer-in.c  (working copy)
*** static void
*** 789,810 
  lto_input_ts_block_tree_pointers (struct lto_input_block *ib,
  struct data_in *data_in, tree expr)
  {
-   /* Do not stream BLOCK_SOURCE_LOCATION.  We cannot handle debug information
-  for early inlining so drop it on the floor instead of ICEing in
-  dwarf2out.c.  */
BLOCK_VARS (expr) = streamer_read_chain (ib, data_in);
  
-   /* Do not stream BLOCK_NONLOCALIZED_VARS.  We cannot handle debug 
information
-  for early inlining so drop it on the floor instead of ICEing in
-  dwarf2out.c.  */
- 
BLOCK_SUPERCONTEXT (expr) = stream_read_tree (ib, data_in);
  
!   /* Do not stream BLOCK_ABSTRACT_ORIGIN.  We cannot handle debug information
!  for early inlining so drop it on the floor instead of ICEing in
   dwarf2out.c.  */
!   BLOCK_FRAGMENT_ORIGIN (expr) = stream_read_tree (ib, data_in);
!   BLOCK_FRAGMENT_CHAIN (expr) = stream_read_tree (ib, data_in);
  
/* We re-compute BLOCK_SUBBLOCKS of our parent here instead
   of streaming it.  For non-BLOCK BLOCK_SUPERCONTEXTs we still
--- 789,810 
  lto_input_ts_block_tree_pointers (struct lto_input_block *ib,
  struct data_in *data_in, tree expr)
  {
BLOCK_VARS (expr) = streamer_read_chain (ib, data_in);
  
BLOCK_SUPERCONTEXT (expr) = stream_read_tree (ib, data_in);
  
!   /* Stream BLOCK_ABSTRACT_ORIGIN and BLOCK_SOURCE_LOCATION for
!  the limited cases we can handle - those that represent inlined
!  function scopes.  For the rest them on the floor instead of ICEing in
   dwarf2out.c.  */
!   BLOCK_ABSTRACT_ORIGIN (expr) = stream_read_tree (ib, data_in);
!   BLOCK_SOURCE_LOCATION (expr) = lto_input_location (ib, data_in);
!   /* Do not stream BLOCK_NONLOCALIZED_VARS.  We cannot handle debug 
information
!  for early inlined BLOCKs so drop it on the floor instead of ICEing in
!  dwarf2out.c.  */
! 
!   /* BLOCK_FRAGMENT_ORIGIN and BLOCK_FRAGMENT_CHAIN is not live at LTO
!  streaming time.  */
  
/* We re-compute BLOCK_SUBBLOCKS of our parent here instead
   of streaming it.  For non-BLOCK BLOCK_SUPERCONTEXTs we still
Index: gcc/tree-streamer-out.c
===
*** gcc/tree-streamer-out.c (revision 191824)
--- gcc/tree-streamer-out.c (working copy)
*** write_ts_exp_tree_pointers (struct outpu
*** 682,702 
  static void
  write_ts_block_tree_pointers (struct output_block *ob, tree expr, bool ref_p)
  {
-   /* Do not stream BLOCK_SOURCE_LOCATION.  We cannot handle debug information
-  for early inlining so drop it on the floor instead of ICEing in
-  dwarf2out.c.  */
streamer_write_chain (ob, BLOCK_VARS (expr), ref_p);
  
/* Do not stream BLOCK_NONLOCALIZED_VARS.  We cannot handle debug 
information
!  for early inlining so drop it on the floor instead of IC

Re: [RFC] Make vectorizer to skip loops with small iteration estimate

2012-10-01 Thread Richard Guenther

On Sun, 30 Sep 2012, Jan Hubicka wrote:

> Hi,
> the point of the following patch is to make the vectorizer not vectorize the
> following testcase with profile feedback:
> 
> int a[1];
> int i=5;
> int k=2;
> int val;
> __attribute__ ((noinline,noclone))
> test()
> {
>   int j;
>   for(j=0;j<k;j++)
>     a[j]=val;
> }
> main()
> {
>   while (i)
> {
>   test ();
>   i--;
> }
> }
> 
> Here the compiler should work out that the second loop iterates 2 times on
> average and thus it is not a good candidate for vectorizing.
> 
> In my first attempt I added the following:
> @@ -1474,6 +1478,18 @@ vect_analyze_loop_operations (loop_vec_i
>return false;
>  }
>  
> +  if ((estimated_niter = estimated_stmt_executions_int (loop)) != -1
> +  && (unsigned HOST_WIDE_INT) estimated_niter <= th)
> +{
> +  if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
> +fprintf (vect_dump, "not vectorized: estimated iteration count too 
> small.");
> +  if (vect_print_dump_info (REPORT_DETAILS))
> +fprintf (vect_dump, "not vectorized: estimated iteration count 
> smaller than "
> + "user specified loop bound parameter or minimum "
> + "profitable iterations (whichever is more conservative).");
> +  return false;
> +}
> +
> 
> But to my surprise it does not help.  There are two things:
> 
> 1) the value of TH is a bit low.  The way the cost model works is that
>it finds the minimal niters where the vectorized loop with all the setup costs
>is cheaper than the scalar loop with all the setup costs.  I.e.
> 
>   /* Calculate number of iterations required to make the vector version
>  profitable, relative to the loop bodies only.  The following condition
>  must hold true:
>  SIC * niters + SOC > VIC * ((niters-PL_ITERS-EP_ITERS)/VF) + VOC(A)
>  where
>  SIC = scalar iteration cost, VIC = vector iteration cost,
>  VOC = vector outside cost, VF = vectorization factor,
>  PL_ITERS = prologue iterations, EP_ITERS= epilogue iterations
>  SOC = scalar outside cost for run time cost model check.  */
> 
> This value is used for both
> 1) decision if number of iterations is too low (max iterations is known)
> 2) decision on runtime whether we want to take the vectorized path
> or the scalar path.
> 
> The vectorized loop looks like:
>   k.1_10 = k;
>   if (k.1_10 > 0)
>   {
> pretmp_2 = val;
> niters.8_4 = (unsigned int) k.1_10;
> bnd.9_13 = niters.8_4 >> 2;
> ratio_mult_vf.10_1 = bnd.9_13 << 2;
> _18 = niters.8_4 <= 3;
> _19 = ratio_mult_vf.10_1 == 0;
> _20 = _19 | _18;
> if (_20 != 0)
>   scalar loop
> else
>   vector prologue
>   }
> 
>  So the unvectorized cost is
>  SIC * niters
> 
>  The vectorized path is
>  SOC + VIC * ((niters-PL_ITERS-EP_ITERS)/VF) + VOC
>  The scalar path of vectorizer loop is
>  SIC * niters + SOC

Note that 'th' is used for the runtime profitability check which is
done at the time the setup cost has already been taken (yes, we
probably should make it more conservative but then guard the whole
set of loops by the check, not only the vectorized path).
See PR53355 for the general issue.

>It makes sense to vectorize if
>SIC * niters > SOC + VIC * ((niters-PL_ITERS-EP_ITERS)/VF) + VOC   (B)
>That is in the optimal case where we actually vectorize the overall
>speed of vectorized loop including the runtime check is better.
> 
>It makes sense to take the vector loop if
>SIC * niters > VIC * ((niters-PL_ITERS-EP_ITERS)/VF) + VOC (C)
>Because the scalar loop is taken.
> 
>The attached patch implements the formula (C) and uses it to determine the
>decision based on number of iterations estimate (that is usually provided 
> by
>the feedback)
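
For readers who want to experiment with condition (C), here is a small stand-alone C sketch. It is only an illustration of the inequality as quoted, not the vectorizer's implementation: the struct and function names are hypothetical, all quantities are plain integers, and the division by VF is integer division.

```c
#include <stdbool.h>

/* Costs as named in the quoted comment.  */
struct vect_costs
{
  long sic;       /* scalar iteration cost */
  long vic;       /* vector iteration cost */
  long voc;       /* vector outside cost */
  long vf;        /* vectorization factor */
  long pl_iters;  /* prologue iterations */
  long ep_iters;  /* epilogue iterations */
};

/* Condition (C):
   SIC * niters > VIC * ((niters - PL_ITERS - EP_ITERS) / VF) + VOC.  */
bool
vector_path_profitable_p (const struct vect_costs *c, long niters)
{
  long vec_iters = niters - c->pl_iters - c->ep_iters;

  /* Fewer iterations than prologue plus epilogue: nothing is left for
     the vector body.  */
  if (vec_iters < 0)
    vec_iters = 0;
  return c->sic * niters > c->vic * (vec_iters / c->vf) + c->voc;
}

/* Smallest niters in [1, limit] satisfying (C), or -1 if none does.  */
long
min_profitable_niters (const struct vect_costs *c, long limit)
{
  for (long n = 1; n <= limit; n++)
    if (vector_path_profitable_p (c, n))
      return n;
  return -1;
}
```

The linear search keeps the sketch obviously correct; a closed-form threshold would need care around the integer division.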
> 
>As a reality check, I tried my testcase.
> 
>9: Cost model analysis:
>  Vector inside of loop cost: 1
>  Vector prologue cost: 7
>  Vector epilogue cost: 2
>  Scalar iteration cost: 1
>  Scalar outside cost: 6
>  Vector outside cost: 9
>  prologue iterations: 0
>  epilogue iterations: 2
>  Calculated minimum iters for profitability: 4
> 
>9:   Profitability threshold = 3
> 
>9:   Profitability estimated iterations threshold = 20
> 
>This is overrated. The loop starts to be beneficial at about 4 iterations in
>reality.  I guess the values are kind of wrong.
> 
>Vector inside of loop cost and Scalar iteration cost seem to ignore the
>fact that the loops do contain some control flow that should account for at
>least one extra cycle.
> 
>Vector prologue cost seems a bit overrated for one pack operation.
> 
>Of course this is a very simple benchmark; in reality the vectorization can be
>a lot more harmful by complicating more complex control flows.
>
>So I guess we have two

Re: [PATCH] Do not mark pseudo-copies decomposable during first lower-subreg pass

2012-10-01 Thread Ulrich Weigand


>   * gcc.dg/lower-subreg-1.c: Disable on arm-*-* targets.

I just noticed that the triple is incomplete; we're supposed to use
arm*-*-* instead of just arm-*-*.

Checked in the following fix as obvious.

Bye,
Ulrich


2012-10-01  Ulrich Weigand  

* gcc.dg/lower-subreg-1.c: Disable on arm*-*-* targets.

Index: gcc/testsuite/gcc.dg/lower-subreg-1.c
===
*** gcc/testsuite/gcc.dg/lower-subreg-1.c   (revision 191805)
--- gcc/testsuite/gcc.dg/lower-subreg-1.c   (working copy)
***
*** 1,4 
! /* { dg-do compile { target { ! { mips64 || { arm-*-* ia64-*-* spu-*-* 
tilegx-*-* } } } } } */
  /* { dg-options "-O -fdump-rtl-subreg1" } */
  /* { dg-skip-if "" { { i?86-*-* x86_64-*-* } && x32 } { "*" } { "" } } */
  /* { dg-require-effective-target ilp32 } */
--- 1,4 
! /* { dg-do compile { target { ! { mips64 || { arm*-*-* ia64-*-* spu-*-* 
tilegx-*-* } } } } } */
  /* { dg-options "-O -fdump-rtl-subreg1" } */
  /* { dg-skip-if "" { { i?86-*-* x86_64-*-* } && x32 } { "*" } { "" } } */
  /* { dg-require-effective-target ilp32 } */



-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com

Re: [patch] experimenting with renumbering of pseudos after expand

2012-10-01 Thread Richard Guenther

On Mon, Oct 1, 2012 at 2:06 PM, Steven Bosscher  wrote:
> Hello,
>
> For most code, expand creates a lot of pseudos that are cleaned up in
> subsequent passes, if they even live long enough to make it there. On
> average, for cc1 preprocessed source, the number of "holes" in
> regno_reg_rtx is about half the size of that array, or in other words:
> regno_reg_rtx is almost a sparse array.
>
> I've been experimenting with the attached patch to renumber pseudos
> just before initializing DF. For the already-notorious test case of
> PR54146 on x86_64, the patch reduces max_reg_num from 348404 to 180502
> for the largest function. This reduces the memory footprint of the
> test case to less than 6GB, an almost 25% reduction, and it speeds up
> the DF_LR and DF_LIVE problems a bit (~15% but that's not much on the
> total compile time).
>
> I had hopes that it would help for LRA compile times also, but
> unfortunately the only significant change is this one:
>
> without patch:
>  LRA hard reg assignment : 130.85 (11%) usr   0.20 ( 1%) sys 131.17
> (11%) wall   0 kB ( 0%) ggc
>
> with patch:
>  LRA hard reg assignment : 108.92 ( 9%) usr   0.07 ( 0%) sys 109.03 (
> 9%) wall   0 kB ( 0%) ggc
>
> Anyway, putting the patch out there to show that I've done more than
> just complaining ;-)

;)

Certainly an interesting idea.  I wonder how many pseudos we allocate
later during optimizations - thus, would sth like the SSA name freelist
(and a way to formally release a pseudo, similar to release_ssa_name)
improve the situation?  I guess most of the wasted pseudos are
created because we are not very good at generating initial RTL - are
there maybe a few patterns we can do better on that would catch a
big fraction of the waste?
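
To illustrate the freelist idea in isolation, here is a tiny C sketch of a number allocator that reuses released numbers before handing out fresh ones. Everything in it (struct regno_pool, alloc_regno, release_regno, the fixed-size stack) is hypothetical and only loosely modelled on the notion of an SSA-name freelist; it is not GCC code.

```c
#include <stddef.h>

#define MAX_FREE 64

/* Hypothetical pseudo-number allocator with a freelist.  */
struct regno_pool
{
  unsigned next_fresh;           /* next never-used number */
  unsigned free_list[MAX_FREE];  /* released numbers, used as a stack */
  size_t n_free;                 /* number of entries in FREE_LIST */
};

/* Hand out a released number if one is available, else a fresh one.  */
unsigned
alloc_regno (struct regno_pool *p)
{
  if (p->n_free > 0)
    return p->free_list[--p->n_free];
  return p->next_fresh++;
}

/* Return REGNO to the pool.  If the freelist is full the number is
   simply leaked, which is safe but wasteful.  */
void
release_regno (struct regno_pool *p, unsigned regno)
{
  if (p->n_free < MAX_FREE)
    p->free_list[p->n_free++] = regno;
}
```

With reuse like this the highest number handed out tracks the peak number of simultaneously live registers rather than the total ever created.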

Thanks,
Richard.

> It also would suggest that the scalability challenge for LRA is not
> the number of pseudos but something else (number of insns, I guess).
>
> Ciao!
> Steven

Re: [PATCH] Fix instability of -fschedule-insn for x86

2012-10-01 Thread Igor Zamyatin

We also plan to test these changes along with LRA

On Sun, Sep 30, 2012 at 4:33 PM, Uros Bizjak  wrote:
> On Tue, Sep 18, 2012 at 1:31 PM, Uros Bizjak  wrote:
>
>>> This patch aims to fix all stability issues related to using the first
>>> scheduler in gcc
> >>> for x86 target (there are several reported issues related to this problem).
>>>
>>> Main idea of this activity is mostly to provide user a possibility to
>>> safely turn on first scheduler for his codes. In some cases this could
>>> positively affect performance, especially for in-order Atom.
>>>
>>> Below is short description of proposed changes.
>>
>>> 2012-09-18  Yuri Rumyantsev  
>>>
>>> * config/i386/i386.c (ix86_dep_by_shift_count_body) : Add
>>> check on reload_completed since it can be invoked before
>>> register allocation phase in 1st scheduler.
>>> (ia32_multipass_dfa_lookahead) : Do not use dfa_lookahead for 1st
>>> Scheduler to save compile time.
>>> (ix86_sched_reorder) : Do not perform ready list reordering for 1st
>>> Scheduler to save compile time.
>>> (insn_is_function_arg) : New function. Returns true if lhs of insn 
>>> is
>>> HW function argument register.
>>> (add_parameter_dependencies) : New function. Add output dependencies
>>> for chain of function adjacent arguments if only there is a move to
>>> likely spilled HW registers. Return first argument if at least one
>>> dependence was added or NULL otherwise.
>>> (avoid_func_arg_motion) : New function. Add output or anti 
>>> dependency
>>> from insn to first_arg to restrict code motion.
>>> (add_dependee_for_func_arg) : New function. Avoid cross block 
>>> motion of
>>> function argument through adding dependency from the first non-jump
>>> insn in bb.
>>> (ix86_dependencies_evaluation_hook) : New function. Hook for 
>>> schedule1:
> >>> avoid motion of function arguments passed in likely spilled
> >>> HW registers.
>>> (ix86_adjust_priority) : New function. Hook for schedule1: set 
>>> priority
>>> of moves from likely spilled HW registers to maximum to schedule 
>>> them
>>> as soon as possible.
>>> (ix86_sched_init_global): Do not perform multipass scheduling for 
>>> 1st
>>> Scheduler to save compile time.
>>
>> I would kindly ask scheduler expert to review the patch from the
>> scheduler functionality POV.
>
> I have received opinion from Vladimir from off-line discussion, quoted below:
>
> --quote--
> I think, it is ok.
>
>   Switching off first cycle multipass scheduling is ok.  It is mostly
> useful when the order of insns issued on the same cycle is important
> (mostly VLIW or quasi-VLIW processors).
>
>   Other solutions are necessary to decrease spills and avoid reload
> crash (can not find a register in a class) when the 1st insn
> scheduling is on.  I don't think it fully avoids possibility of the
> reload crashes but it takes into account most of cases resulting in
> the crashes and makes the crash possibility really negligible.
> Register pressure sensitive insn scheduling decreased the possibility.
>  This patch will make it negligible.  And LRA will solve all the rest
> cases of the crashes.
>
>   I don't like a bit absence in freedom of moving argument insns with
> likely spilled hard-regs between each other as they are chained in the
> original order but it is debatable because it still decreases the
> possibility of spills.
>
>   Overall, the patch is ok for me.
> --/quote--
>
> Based on this opinion, the patch is OK for mainline, if there are no
> objections from other x86 maintainers in the next couple of days
> (48h). However, please watch for possible fallout from the patch,
> compile-time ICEs and performance problems. x86 and scheduler didn't
> play well together in the past, but your patch and (in the near
> future) LRA seem to fix all these problems.
>
> Thanks,
> Uros.

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher

On Sat, Sep 29, 2012 at 10:26 PM, Steven Bosscher  wrote:
>  LRA create live ranges  : 175.30 (15%) usr   2.14 (13%) sys 177.44
> (15%) wall2761 kB ( 0%) ggc

I've tried to split this up a bit more:

process_bb_lives ~50%
create_start_finish_chains ~25%
remove_some_program_points_and_update_live_ranges ~25%

The latter two have a common structure with loops that look like this:

  for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
{
  for (r = lra_reg_info[i].live_ranges; r != NULL; r = r->next)

Perhaps it's possible to do some of the work of compress_live_ranges
during process_bb_lives, to create shorter live_ranges chains.

Also, maybe doing something else than a linked list of live_ranges
will help (unfortunately that's not a trivial change, it seems,
because the lra_live_range_t type is used everywhere and there are no
iterators or anything abstracted out like that -- just chain
walks...).

Still it does seem to me that a sorted VEC of lra_live_range objects
probably would speed things up. Question is of course how much... :-)
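
As a sketch of why a sorted vector might help, the C fragment below looks up a program point in an array of live ranges with binary search instead of walking a chain. The types and names are illustrative only and do not correspond to lra_live_range_t; the sketch also assumes the ranges are sorted by start and do not overlap.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative live range: program points START..FINISH inclusive.  */
struct range
{
  int start;
  int finish;
};

/* RANGES is sorted by START and the ranges do not overlap.  Return
   true if POINT lies in one of them, using binary search rather than
   a linked-list walk.  */
bool
point_live_p (const struct range *ranges, size_t n, int point)
{
  size_t lo = 0, hi = n;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (point < ranges[mid].start)
        hi = mid;
      else if (point > ranges[mid].finish)
        lo = mid + 1;
      else
        return true;
    }
  return false;
}
```

Whether this pays off in practice depends on how often ranges are inserted in the middle, since that is where an array is more expensive than a list.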

Ciao!
Steven

[Ada] Membership tests work with extended overflow checks

2012-10-01 Thread Arnaud Charlet

This patch implements membership tests in which the operands
can be out of range in extended overflow checking modes.

The following is a test program:

pragma Ada_2012;
with Text_IO; use Text_IO;
procedure Overflowm3 is
   subtype Int10 is Integer range 1 .. 5;
   subtype IntP is Integer with Predicate => Intp = 0;

   function r1
     (a, b, c, d : Integer) return Boolean is
   begin
      return a + b + c + d in Integer'First .. Integer'Last
        and then a + b + c + d in Integer
        and then a + b + c + d in Intp
        and then a + b + c + d not in Int10;
   end;
   function r2
     (a, b, c, d : Integer) return Boolean is
   begin
      return a * b * c * d in Integer'First .. Integer'Last
        and then a * b * c * d in Integer
        and then a * b * c * d in Intp
        and then a * b * c * d not in Int10;
   end;

begin
   begin
      Put_Line
        ("r1 returns " &
           Boolean'Image
             (r1 (Integer'Last, Integer'Last,
                  -Integer'Last, -Integer'Last)));
   exception
      when Constraint_Error =>
         Put_Line ("r1 raises exception");
   end;

   begin
      Put_Line
        ("r2 returns " &
           Boolean'Image
             (r2 (Integer'Last, Integer'Last,
                  Integer'Last, 0)));
   exception
      when Constraint_Error =>
         Put_Line ("r2 raises exception");
   end;
end Overflowm3;

In CHECKED mode (-gnato1) we get:

r1 raises exception
r2 raises exception

since the first addition in r1 and the first multiplication
in r2 result in values outside the bounds of Integer'Base.

In MINIMIZED mode (-gnato2) we get:

r1 returns TRUE
r2 raises exception

since we can compute the addition result in Long_Long_Integer,
but the second multiplication yields a value outside this
range, so that causes an overflow.
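
The MINIMIZED behaviour for r1 can be mimicked in C as a rough analogy. The sketch assumes Integer is 32 bits and Long_Long_Integer is 64 bits, which is an assumption about the target rather than something stated in the message; the function name is made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Add four 32-bit values in 64-bit arithmetic, where the sum cannot
   overflow, and only then test the result against the 32-bit range.
   This mirrors evaluating the operands of the membership test in a
   wider type.  */
bool
sum4_in_int32_range (int32_t a, int32_t b, int32_t c, int32_t d)
{
  int64_t sum = (int64_t) a + b + c + d;

  return sum >= INT32_MIN && sum <= INT32_MAX;
}
```

The same trick does not work for a product of three maximal 32-bit values, whose magnitude needs more than 63 bits, which matches r2 still raising an exception in this mode.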

In ELIMINATE mode (-gnato3) we get:

r1 returns TRUE
r2 returns TRUE

Because now we use Bignum arithmetic for the intermediate
multiplication results, and the final result is in range.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Robert Dewar  

* checks.adb (Apply_Arithmetic_Overflow_Minimized_Eliminated):
Handle case of appearing in range in membership test.
* exp_ch4.adb (Expand_Membership_Minimize_Eliminate_Overflow):
New procedure (Expand_N_In): Use
Expand_Membership_Minimize_Eliminate_Overflow.
* rtsfind.ads: Add RE_Bignum_In_LLI_Range.
* s-bignum.ads, s-bignum.adb (Bignum_In_LLI_Range): New function.
* sinfo.ads, sinfo.adb (No_Minimize_Eliminate): New flag.

Index: sinfo.adb
===
--- sinfo.adb   (revision 191888)
+++ sinfo.adb   (working copy)
@@ -2235,6 +2235,15 @@
   return Flag13 (N);
end No_Initialization;
 
+   function No_Minimize_Eliminate
+  (N : Node_Id) return Boolean is
+   begin
+  pragma Assert (False
+or else NT (N).Nkind = N_In
+or else NT (N).Nkind = N_Not_In);
+  return Flag17 (N);
+   end No_Minimize_Eliminate;
+
function No_Truncation
   (N : Node_Id) return Boolean is
begin
@@ -5288,6 +5297,15 @@
   Set_Flag13 (N, Val);
end Set_No_Initialization;
 
+   procedure Set_No_Minimize_Eliminate
+  (N : Node_Id; Val : Boolean := True) is
+   begin
+  pragma Assert (False
+or else NT (N).Nkind = N_In
+or else NT (N).Nkind = N_Not_In);
+  Set_Flag17 (N, Val);
+   end Set_No_Minimize_Eliminate;
+
procedure Set_No_Truncation
   (N : Node_Id; Val : Boolean := True) is
begin
Index: sinfo.ads
===
--- sinfo.ads   (revision 191913)
+++ sinfo.ads   (working copy)
@@ -1545,6 +1545,11 @@
--should not be taken into account (needed for in place initialization
--with aggregates).
 
+   --  No_Minimize_Eliminate (Flag17-Sem)
+   --This flag is present in membership operator nodes (N_In/N_Not_In).
+   --It is used to indicate that processing for extended overflow checking
+   --modes is not required (this is used to prevent infinite recursion).
+
--  No_Truncation (Flag17-Sem)
--Present in N_Unchecked_Type_Conversion node. This flag has an effect
--only if the RM_Size of the source is greater than the RM_Size of the
@@ -3675,6 +3680,7 @@
   --  Left_Opnd (Node2)
   --  Right_Opnd (Node3)
   --  Alternatives (List4) (set to No_List if only one set alternative)
+  --  No_Minimize_Eliminate (Flag17)
   --  plus fields for expression
 
   --  N_Not_In
@@ -3682,6 +3688,7 @@
   --  Left_Opnd (No

[Ada] Exponentiation works with extended overflow checks

2012-10-01 Thread Arnaud Charlet

This patch implements extended overflow checking modes
with the exponentiation operator.

The following is a test program:

with Text_IO; use Text_IO;
procedure Overflowm4 is
   function r1 (a, b : Integer) return Boolean is
   begin
      return a ** 2 - b ** 2 <= Integer'Last;
   end;
   function r2 (a, b : Integer) return Boolean is
   begin
      return a ** 10 - b ** 10 in Integer;
   end;
begin
   begin
      Put_Line
        ("r1 returns " &
         Boolean'Image (r1 (Integer'Last, Integer'Last)));
   exception
      when Constraint_Error =>
         Put_Line ("r1 raises exception");
   end;

   begin
      Put_Line
        ("r2 returns " &
         Boolean'Image (r2 (Integer'Last, Integer'Last)));
   exception
      when Constraint_Error =>
         Put_Line ("r2 raises exception");
   end;
end Overflowm4;

In CHECKED mode (-gnato1) we get:

r1 raises exception
r2 raises exception

since the first exponentiation in both r1 and r2 results
in values outside the bounds of Integer'Base.

In MINIMIZED mode (-gnato2) we get:

r1 returns TRUE
r2 raises exception

since we can compute the exponentiation results in r1 in
Long_Long_Integer mode, but that's not true for r2.
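
A quick way to see why the two cases differ is to bound the number of bits in the result: a value with n bits raised to the power e has at most n * e bits. The C sketch below applies that bound, assuming a 32-bit Integer and a 64-bit Long_Long_Integer; the function names are hypothetical and the test is deliberately conservative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of bits needed to represent the magnitude V (0 needs 0 bits).  */
unsigned
num_bits (uint64_t v)
{
  unsigned bits = 0;

  while (v != 0)
    {
      bits++;
      v >>= 1;
    }
  return bits;
}

/* Conservative test: BASE ** EXP has at most num_bits (BASE) * EXP
   bits of magnitude, so if that bound is at most 63 the result fits a
   signed 64-bit integer.  A false result means "might not fit", not
   "does not fit".  */
bool
pow_fits_int64_p (uint64_t base_magnitude, unsigned exp)
{
  return (uint64_t) num_bits (base_magnitude) * exp <= 63;
}
```

For a 31-bit base, squaring gives a bound of 62 bits, which fits, while the tenth power gives a bound of 310 bits, which does not.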

In ELIMINATE mode (-gnato3) we get:

r1 returns TRUE
r2 returns TRUE

Because now we use Bignum arithmetic for the exponentiation
operations in r2.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Robert Dewar  

* checks.adb (Minimize_Eliminate_Overflow_Checks): Changes
for exponentiation.
* exp_ch4.adb (Expand_N_Op_Expon): Changes for Minimize/Eliminate
overflow checks.
* s-bignum.adb (Compare): Fix bad precondition.

Index: checks.adb
===
--- checks.adb  (revision 191918)
+++ checks.adb  (working copy)
@@ -6548,7 +6548,7 @@
 
 when N_Op_Abs =>
Lo := Uint_0;
-   Hi := UI_Max (UI_Abs (Rlo), UI_Abs (Rhi));
+   Hi := UI_Max (abs Rlo, abs Rhi);
 
 --  Addition
 
@@ -6564,8 +6564,80 @@
 --  Exponentiation
 
 when N_Op_Expon =>
-   raise Program_Error;
 
+   --  Discard negative values for the exponent, since they will
+   --  simply result in an exception in any case.
+
+   if Rhi < 0 then
+  Rhi := Uint_0;
+   elsif Rlo < 0 then
+  Rlo := Uint_0;
+   end if;
+
+   --  Estimate number of bits in result before we go computing
+   --  giant useless bounds. Basically the number of bits in the
+   --  result is the number of bits in the base multiplied by the
+   --  value of the exponent. If this is big enough that the result
+   --  definitely won't fit in Long_Long_Integer, switch to bignum
+   --  mode immediately, and avoid computing giant bounds.
+
+   --  The comparison here is approximate, but conservative, it
+   --  only clicks on cases that are sure to exceed the bounds.
+
+   if Num_Bits (UI_Max (abs Llo, abs Lhi)) * Rhi + 1 > 100 then
+  Lo := No_Uint;
+  Hi := No_Uint;
+
+   --  If right operand is zero then result is 1
+
+   elsif Rhi = 0 then
+  Lo := Uint_1;
+  Hi := Uint_1;
+
+   else
+  --  High bound comes either from exponentiation of largest
+  --  positive value to largest exponent value, or from the
+  --  exponentiation of most negative value to an odd exponent.
+
+  declare
+ Hi1, Hi2 : Uint;
+
+  begin
+ if Lhi >= 0 then
+Hi1 := Lhi ** Rhi;
+ else
+Hi1 := Uint_0;
+ end if;
+
+ if Llo < 0 then
+if Rhi mod 2 = 0 then
+   Hi2 := Llo ** (Rhi - 1);
+else
+   Hi2 := Llo ** Rhi;
+end if;
+ else
+Hi2 := Uint_0;
+ end if;
+
+ Hi := UI_Max (Hi1, Hi2);
+  end;
+
+  --  Result can only be negative if base can be negative
+
+  if Llo < 0 then
+ if UI_Mod (Rhi, 2) = 0 then
+Lo := Llo ** (Rhi - 1);
+ else
+Lo := Llo ** Rhi;
+ end if;
+
+  --  Otherwise low bound is minimium ** minimum
+
+  e

[Ada] Division/Rem/Mod work with extended overflow checks

2012-10-01 Thread Arnaud Charlet

This patch implements extended overflow checking modes
with the division, rem, and mod operators. This completes
the work on extended overflow checking.

The following is a test program:

with Text_IO; use Text_IO;
procedure Overflowm5 is
   function r1 (a, b, c : Integer)
     return Integer is
   begin
      return a / b - c;
   end;
   function r2 (a, b, c : Long_Long_Integer)
     return Long_Long_Integer is
   begin
      return a / b - c;
   end;
begin
   begin
      Put_Line
        ("r1 returns" &
           Integer'Image
           (r1 (Integer'First, - 1, Integer'Last)));
   exception
      when Constraint_Error =>
         Put_Line ("r1 raises exception");
   end;

   begin
      Put_Line
        ("r2 returns" &
           Long_Long_Integer'Image
           (r2 (Long_Long_Integer'First, -1,
                Long_Long_Integer'Last)));
   exception
      when Constraint_Error =>
         Put_Line ("r2 raises exception");
   end;
end Overflowm5;

In CHECKED mode (-gnato1) we get:

r1 raises exception
r2 raises exception

since in both cases we are dividing the largest negative integer
by minus one, which generates a value one greater than the
largest positive value.

In MINIMIZED mode (-gnato2) we get:

r1 returns 1
r2 raises exception

since we can now compute the division and subtraction for r1
in Long_Long_Integer mode, but that's not true for r2.

In ELIMINATE mode (-gnato3) we get:

r1 returns 1
r2 returns 1

Because now we use Bignum arithmetic for the division and
subtraction operations in r2.
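
As a rough C analogue of r1 (a sketch only, with made-up function names;
int32_t and int64_t stand in for Integer and Long_Long_Integer, and this
is not the code GNAT emits), the CHECKED and MINIMIZED behaviours look
like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CHECKED-style (hypothetical helper): a / b - c with every step
   confined to 32 bits.  Returns false where Ada would raise
   Constraint_Error.  */
bool div_sub_checked32 (int32_t a, int32_t b, int32_t c, int32_t *out)
{
  int32_t q;
  /* Division by zero, and INT32_MIN / -1 whose result is 2**31.  */
  if (b == 0 || (a == INT32_MIN && b == -1))
    return false;
  q = a / b;
  return !__builtin_sub_overflow (q, c, out);
}

/* MINIMIZED-style (hypothetical helper): intermediates in 64 bits,
   with a single range check on the final result.  */
bool div_sub_widened64 (int32_t a, int32_t b, int32_t c, int32_t *out)
{
  int64_t q, r;
  if (b == 0)
    return false;
  q = (int64_t) a / b;   /* at most 2**31 in magnitude, fits easily */
  r = q - c;             /* cannot overflow 64 bits for 32-bit inputs */
  if (r < INT32_MIN || r > INT32_MAX)
    return false;
  *out = (int32_t) r;
  return true;
}
```

With a = INT32_MIN, b = -1 and c = INT32_MAX, the first function reports
overflow and the second produces 1, matching the r1 lines above.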

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Robert Dewar  

* checks.adb (Apply_Divide_Checks): New name for
Apply_Divide_Check.
(Minimize_Eliminate_Overflow_Checks): Add code to handle division
(and rem and mod) properly.
(Apply_Division_Check): New procedure.
(Apply_Divide_Checks): Use Apply_Division_Check.
(Apply_Divide_Checks): Use
Apply_Arithmetic_Overflow_Minimized_Eliminated.
* checks.ads (Apply_Divide_Checks): New name for
Apply_Divide_Check, also add clearer documentation for this
routine and put in alfa order.
* exp_ch4.adb (Apply_Divide_Checks): New name for
Apply_Divide_Check.
* s-bignum.adb (To_Bignum): Handle largest negative integer
properly.
* sem.adb (Analyze): Handle overflow suppression correctly.
(Analyze_List): Handle overflow suppression correctly.
* sem_res.adb (Analyze_And_Resolve): Handle overflow suppression
correctly.

Index: checks.adb
===
--- checks.adb  (revision 191919)
+++ checks.adb  (working copy)
@@ -193,14 +193,6 @@
-- Local Subprograms --
---
 
-   procedure Apply_Float_Conversion_Check
- (Ck_Node: Node_Id;
-  Target_Typ : Entity_Id);
-   --  The checks on a conversion from a floating-point type to an integer
-   --  type are delicate. They have to be performed before conversion, they
-   --  have to raise an exception when the operand is a NaN, and rounding must
-   --  be taken into account to determine the safe bounds of the operand.
-
procedure Apply_Arithmetic_Overflow_Normal (N : Node_Id);
--  Used to apply arithmetic overflow checks for all cases except operators
--  on signed arithmetic types in Minimized/Eliminate case (for which we
@@ -211,6 +203,24 @@
--  checking mode is Minimized or Eliminated (and the Do_Overflow_Check flag
--  is known to be set) and we have an signed integer arithmetic op.
 
+   procedure Apply_Division_Check
+ (N   : Node_Id;
+  Rlo : Uint;
+  Rhi : Uint;
+  ROK : Boolean);
+   --  N is an N_Op_Div, N_Op_Rem, or N_Op_Mod node. This routine applies
+   --  division checks as required if the Do_Division_Check flag is set.
+   --  Rlo and Rhi give the possible range of the right operand, these values
+   --  can be referenced and trusted only if ROK is set True.
+
+   procedure Apply_Float_Conversion_Check
+ (Ck_Node: Node_Id;
+  Target_Typ : Entity_Id);
+   --  The checks on a conversion from a floating-point type to an integer
+   --  type are delicate. They have to be performed before conversion, they
+   --  have to raise an exception when the operand is a NaN, and rounding must
+   --  be taken into account to determine the safe bounds of the operand.
+
procedure Apply_Selected_Length_Checks
  (Ck_Node: Node_Id;
   Target_Typ : Entity_Id;
@@ -1641,52 +1651,69 @@
   Reason=> CE_Discriminant_Check_Failed));
end Apply_Discriminant_Check;
 
-   
-   -- Apply_Divide_Check --
-   
+   ---

[Ada] Static predicate checks on type conversions

2012-10-01 Thread Arnaud Charlet

In Ada 2012, if a subtype has predicates, a predicate check must be applied to
the expression in a type conversion to the subtype. Furthermore, if the
expression is a scalar static constant, the predicate must be evaluated at
compile-time, and the program must be rejected if the predicate is false.

Compiling

   gcc -c -gnat12 -gnata main.adb

must yield:

   main.adb:6:16: static expression fails static predicate check on "T"

---
procedure Main is
   subtype T is Integer
   with Static_Predicate => T >= 10;
   V : T := 10;
begin
   V := 1000 / T (9);
end Main;
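
The check itself amounts to a membership test of the static value
against the predicate's list of alternatives. A minimal C sketch of that
idea, assuming the alternatives are closed integer ranges (the struct
and function names here are hypothetical and do not correspond to
anything in the compiler sources):

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical representation of one alternative: lo .. hi.  */
struct range
{
  int lo;
  int hi;
};

/* True if VALUE lies in at least one of the N alternatives, i.e. the
   predicate holds for VALUE.  */
bool satisfies_predicate (int value, const struct range *alts, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (value >= alts[i].lo && value <= alts[i].hi)
      return true;
  return false;
}

/* "T >= 10" from the example, written as the single alternative
   10 .. INT_MAX.  */
const struct range t_predicate[] = { { 10, INT_MAX } };
```

Here satisfies_predicate (9, t_predicate, 1) is false, which corresponds
to the rejected conversion T (9), while 10 passes.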

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Ed Schonberg  

* checks.adb (Apply_Predicate_Check): If the predicate is a
static one and the operand is static, evaluate the predicate at
compile time.
* sem_eval.ads, sem_eval.adb (Eval_Static_Predicate_Check): new
procedure, to evaluate a static predicate check whenever possible.
* sem_res.adb (Resolve_Type_Conversion): Apply predicate check
on the conversion if the target type has predicates.

Index: checks.adb
===
--- checks.adb  (revision 191920)
+++ checks.adb  (working copy)
@@ -2337,6 +2337,23 @@
  (Sloc (N), Reason => SE_Infinite_Recursion));
 
  else
+
+--  If the predicate is a static predicate and the operand is
+--  static, the predicate must be evaluated statically. If the
+--  evaluation fails this is a static constraint error.
+
+if Is_OK_Static_Expression (N) then
+   if  Present (Static_Predicate (Typ)) then
+  if Eval_Static_Predicate_Check (N, Typ) then
+ return;
+  else
+ Error_Msg_NE
+   ("static expression fails static predicate check on&",
+  N, Typ);
+  end if;
+   end if;
+end if;
+
 Insert_Action (N,
   Make_Predicate_Check (Typ, Duplicate_Subexpr (N)));
  end if;
Index: sem_res.adb
===
--- sem_res.adb (revision 191920)
+++ sem_res.adb (working copy)
@@ -9713,6 +9713,22 @@
 end if;
  end;
   end if;
+
+  --  Ada 2012: if target type has predicates, the result requires a
+  --  predicate check. If the context is a call to another predicate
+  --  check we must prevent infinite recursion.
+
+  if Has_Predicates (Target_Typ) then
+ if Nkind (Parent (N)) = N_Function_Call
+   and then Present (Name (Parent (N)))
+   and then Has_Predicates (Entity (Name (Parent (N
+ then
+null;
+
+ else
+Apply_Predicate_Check (N, Target_Typ);
+ end if;
+  end if;
end Resolve_Type_Conversion;
 
--
Index: sem_eval.adb
===
--- sem_eval.adb(revision 191895)
+++ sem_eval.adb(working copy)
@@ -3249,6 +3249,37 @@
   end if;
end Eval_Slice;
 
+   -
+   -- Eval_Static_Predicate_Check --
+   -
+
+   function Eval_Static_Predicate_Check
+ (N   : Node_Id;
+  Typ : Entity_Id) return Boolean
+   is
+  Loc  : constant Source_Ptr := Sloc (N);
+  Pred : constant List_Id := Static_Predicate (Typ);
+  Test : Node_Id;
+   begin
+  if No (Pred) then
+ return True;
+  end if;
+
+  --  The static predicate is a list of alternatives in the proper format
+  --  for an Ada 2012 membership test. If the argument is a literal, the
+  --  membership test can be evaluated statically. The caller transforms
+  --  a result of False into a static contraint error.
+
+  Test := Make_In (Loc,
+ Left_Opnd=> New_Copy_Tree (N),
+ Right_Opnd   => Empty,
+ Alternatives => Pred);
+  Analyze_And_Resolve (Test, Standard_Boolean);
+
+  return Nkind (Test) = N_Identifier
+and then Entity (Test) = Standard_True;
+   end Eval_Static_Predicate_Check;
+
-
-- Eval_String_Literal --
-
Index: sem_eval.ads
===
--- sem_eval.ads(revision 191888)
+++ sem_eval.ads(working copy)
@@ -317,6 +317,11 @@
procedure Eval_Unary_Op   (N : Node_Id);
procedure Eval_Unchecked_Conversion   (N : Node_Id);
 
+   function Eval_Static_Predicate_Check
+ (N  : Node_Id;
+ Typ : Entity_Id) return Boolean;
+   --  Evaluate a static predicate check applied to a scalar literal.
+
procedure Fold_Str (N : Node_Id; Val : String_Id; Static : Boolean);
--  Rewrite N with a new N_String_Literal node as the result of the compile
--  time evaluation of

[Ada] Complain when actual Symbol is present in any dimension output call

2012-10-01 Thread Arnaud Charlet

This patch prevents the user from providing the Symbol parameter (reserved for
compiler use only) in any dimension output call.


------------
-- Source --
------------


with System.Dim.Mks;    use System.Dim.Mks;
with System.Dim.Mks_IO; use System.Dim.Mks_IO;

procedure Main is
begin
   Put (8.0**(1 / 3) * m , 1, 2, 0, "error");
end Main;

------------------------------
-- Compilation & Execution --
------------------------------

$ gcc -c -gnat12 main.adb
main.adb:6:37: Symbol parameter should not be provided
main.adb:6:37: reserved for compiler use only

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-01  Vincent Pucci  

* sem_dim.adb (Has_Symbols): Complain if parameter Symbol has been
provided by the user in the dimension output call.

Index: sem_dim.adb
===
--- sem_dim.adb (revision 191911)
+++ sem_dim.adb (working copy)
@@ -2703,7 +2703,8 @@
   -
 
   function Has_Symbols return Boolean is
- Actual : Node_Id;
+ Actual : Node_Id;
+ Actual_Str : Node_Id;
 
   begin
  Actual := First (Actuals);
@@ -2711,16 +2712,49 @@
  --  Look for a symbols parameter association in the list of actuals
 
  while Present (Actual) loop
-if Nkind (Actual) = N_Parameter_Association
+--  Positional parameter association case when the actual is a
+--  string literal.
+
+if Nkind (Actual) = N_String_Literal then
+   Actual_Str := Actual;
+
+--  Named parameter association case when the selector name is
+--  Symbol.
+
+elsif Nkind (Actual) = N_Parameter_Association
   and then Chars (Selector_Name (Actual)) = Name_Symbol
 then
+   Actual_Str := Explicit_Actual_Parameter (Actual);
+
+--  Ignore all other cases
+
+else
+   Actual_Str := Empty;
+end if;
+
+if Present (Actual_Str) then
--  Return True if the actual comes from source or if the string
--  of symbols doesn't have the default value (i.e. it is "").
 
-   return Comes_From_Source (Actual)
- or else
-   String_Length
- (Strval (Explicit_Actual_Parameter (Actual))) /= 0;
+   if Comes_From_Source (Actual)
+ or else String_Length (Strval (Actual_Str)) /= 0
+   then
+  --  Complain only if the actual comes from source or if it
+  --  hasn't been fully analyzed yet.
+
+  if Comes_From_Source (Actual)
+or else not Analyzed (Actual)
+  then
+ Error_Msg_N ("Symbol parameter should not be provided",
+  Actual);
+ Error_Msg_N ("\reserved for compiler use only", Actual);
+  end if;
+
+  return True;
+
+   else
+  return False;
+   end if;
 end if;
 
 Next (Actual);

[PATCH] Fix -frounding-math builtins

2012-10-01 Thread Richard Guenther


I noticed that we attach the no-vops attribute to -frounding-math
math functions.  That's bogus as can be seen from the testcase

int fesetround(int);
double asinh(double x);

double foo (double x, int b)
{
  double y = 0.0, z;
  if (b)
y = asinh (x);
  fesetround (0x400 /*FE_DOWNWARD*/);
  z = asinh (x);
  return y + z;
}

where PRE, rightly, removes an apparent partial redundancy by
inserting an asinh call into the else block.  It does so precisely
because it does _not_ get to see the rounding-mode-clobbering
fesetround call: asinh has no virtual operand.
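
To make the effect concrete without depending on libm, here is a toy
model of the same situation.  Everything in it is made up for
illustration: a global "mode" variable stands in for the floating-point
rounding mode, set_mode for fesetround, and half for asinh.
foo_after_pre is a hand-written version of what the transformation
described above would produce, not actual compiler output.

```c
#include <assert.h>

/* Hypothetical stand-in for the rounding mode.  */
enum mode { ROUND_UP, ROUND_DOWN };
static enum mode current_mode = ROUND_UP;

/* Stand-in for fesetround: clobbers the global mode.  */
void set_mode (enum mode m)
{
  current_mode = m;
}

/* Stand-in for asinh: the result depends on the global mode.
   Halves a non-negative X, rounding up or down.  */
long half (long x)
{
  return current_mode == ROUND_UP ? (x + 1) / 2 : x / 2;
}

/* Same shape as the testcase.  */
long foo (long x, int b)
{
  long y = 0, z;
  if (b)
    y = half (x);
  set_mode (ROUND_DOWN);
  z = half (x);
  return y + z;
}

/* The same function after the second call has been treated as
   partially redundant: a call is inserted in the else block and its
   value reused after the mode change.  */
long foo_after_pre (long x, int b)
{
  long y = 0, z, tmp;
  if (b)
    tmp = y = half (x);
  else
    tmp = half (x);   /* evaluated under the old mode */
  set_mode (ROUND_DOWN);
  z = tmp;
  return y + z;
}
```

Starting from ROUND_UP, foo (7, 0) gives 3 while foo_after_pre (7, 0)
gives 4, so the reuse is only valid if the call is known not to depend
on state that set_mode changes.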

Fixed as follows.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2012-10-01  Richard Guenther  

* builtins.def (ATTR_MATHFN_FPROUNDING): Do not use no-vops
with -frounding-math.
* builtin-attrs.def (ATTR_PURE_NOTHROW_NOVOPS_LIST): Remove.
(ATTR_PURE_NOTHROW_NOVOPS_LEAF_LIST): Likewise.

Index: gcc/builtins.def
===
*** gcc/builtins.def(revision 191917)
--- gcc/builtins.def(working copy)
*** along with GCC; see the file COPYING3.
*** 163,169 
 memory.  */
  #undef ATTR_MATHFN_FPROUNDING
  #define ATTR_MATHFN_FPROUNDING (flag_rounding_math ? \
!   ATTR_PURE_NOTHROW_NOVOPS_LEAF_LIST : ATTR_CONST_NOTHROW_LEAF_LIST)
  
  /* Define an attribute list for math functions that are normally
 "impure" because some of them may write into global memory for
--- 163,169 
 memory.  */
  #undef ATTR_MATHFN_FPROUNDING
  #define ATTR_MATHFN_FPROUNDING (flag_rounding_math ? \
!   ATTR_PURE_NOTHROW_LEAF_LIST : ATTR_CONST_NOTHROW_LEAF_LIST)
  
  /* Define an attribute list for math functions that are normally
 "impure" because some of them may write into global memory for
Index: gcc/builtin-attrs.def
===
*** gcc/builtin-attrs.def   (revision 191917)
--- gcc/builtin-attrs.def   (working copy)
*** DEF_ATTR_TREE_LIST (ATTR_PURE_NOTHROW_LI
*** 127,136 
ATTR_NULL, ATTR_NOTHROW_LIST)
  DEF_ATTR_TREE_LIST (ATTR_PURE_NOTHROW_LEAF_LIST, ATTR_PURE,   \
ATTR_NULL, ATTR_NOTHROW_LEAF_LIST)
- DEF_ATTR_TREE_LIST (ATTR_PURE_NOTHROW_NOVOPS_LIST, ATTR_NOVOPS,   \
-   ATTR_NULL, ATTR_PURE_NOTHROW_LIST)
- DEF_ATTR_TREE_LIST (ATTR_PURE_NOTHROW_NOVOPS_LEAF_LIST, ATTR_NOVOPS,\
-   ATTR_NULL, ATTR_PURE_NOTHROW_LEAF_LIST)
  DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LIST, ATTR_NORETURN,\
ATTR_NULL, ATTR_NOTHROW_LIST)
  DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LEAF_LIST, ATTR_NORETURN,\
--- 127,132

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-10-01 Thread Sharad Singhai

I am sorry, I didn't enable all the languages. Will fix the fortran
test breakage shortly.

Thanks,
Sharad


On Mon, Oct 1, 2012 at 4:50 AM, H.J. Lu  wrote:
> On Sun, Sep 30, 2012 at 11:36 PM, Sharad Singhai  wrote:
>> Resend to gcc-patches
>>
>> I have addressed the comments by fixing all the minor issues,
>> bootstrapped and tested on x86_64. I did the recommended reshuffling
>> by moving non-tree code from tree-dump.c into a new file dumpfile.c.
>>
>> I committed two successive revisions
>> r191883 Main patch with the dump infrastructure changes. However, I
>> accidentally left out a new file, dumpfile.c.
>> r191884 Added dumpfile.c, and did the renaming of dump_* functions
>> from gimple_pretty_print.[ch].
>>
>> As things stand right now, r191883 is broken because of the missing
>> file 'dumpfile.c', which the very next commit fixes. Anyone who got
>> broken revision r191883, please svn update. I am really very sorry
>> about that.
>>
>> I have a couple more minor patches which deal with renaming; I plan to
>> address those later.
>>
>
> It caused:
>
> FAIL: gcc.dg/tree-ssa/gen-vect-11.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-11.c scan-tree-dump-times vect
> "vectorized 1 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-11a.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-11a.c scan-tree-dump-times vect
> "vectorized 1 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-11b.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-11b.c scan-tree-dump-times vect
> "vectorized 0 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-11c.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-11c.c scan-tree-dump-times vect
> "vectorized 0 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-2.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-2.c scan-tree-dump-times vect
> "vectorized 1 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-25.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-25.c scan-tree-dump-times vect
> "vectorized 2 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-26.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
> "Alignment of access forced using peeling" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
> "vectorized 1 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-28.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
> "Alignment of access forced using peeling" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
> "vectorized 1 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-32.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-32.c scan-tree-dump-times vect
> "vectorized 1 loops" 1
> FAIL: gfortran.dg/vect/O3-pr36119.f90 (test for excess errors)
> FAIL: gfortran.dg/vect/O3-pr39595.f (test for excess errors)
> FAIL: gfortran.dg/vect/Ofast-pr50414.f90 (test for excess errors)
> FAIL: gfortran.dg/vect/cost-model-pr34445.f (test for excess errors)
> FAIL: gfortran.dg/vect/cost-model-pr34445a.f (test for excess errors)
> FAIL: gfortran.dg/vect/fast-math-pr38968.f90 (test for excess errors)
> FAIL: gfortran.dg/vect/fast-math-pr38968.f90 scan-tree-dump vect
> "vectorized 1 loops"
> FAIL: gfortran.dg/vect/fast-math-real8-pr40801.f90 (test for excess errors)
> FAIL: gfortran.dg/vect/fast-math-real8-pr40801.f90 (test for excess errors)
> FAIL: gfortran.dg/vect/fast-math-vect-8.f90 (test for excess errors)
> FAIL: gfortran.dg/vect/fast-math-vect-8.f90 scan-tree-dump-times vect
> "vectorized 1 loops" 1
> FAIL: gfortran.dg/vect/no-fre-no-copy-prop-O3-pr51704.f90 (test for
> excess errors)
> FAIL: gfortran.dg/vect/no-vfa-pr32377.f90 (test for excess errors)
> FAIL: gfortran.dg/vect/no-vfa-pr32377.f90 scan-tree-dump-times vect
> "vectorized 2 loops" 1
> FAIL: gfortran.dg/vect/no-vfa-pr32457.f90 (test for excess errors)
> FAIL: gfortran.dg/vect/no-vfa-pr32457.f90 scan-tree-dump-times vect
> "vectorized 0 loops" 1
> FAIL: gfortran.dg/vect/pr19049.f90  -O   scan-tree-dump-times vect
> "complicated access pattern" 1
> FAIL: gfortran.dg/vect/pr19049.f90  -O  (test for excess errors)
> FAIL: gfortran.dg/vect/pr32377.f90  -O   scan-tree-dump-times vect
> "vectorized 2 loops" 1
> FAIL: gfortran.dg/vect/pr32377.f90  -O  (test for excess errors)
> FAIL: gfortran.dg/vect/pr32380.f  -O   scan-tree-dump-times vect
> "vectorized 6 loops" 1
> FAIL: gfortran.dg/vect/pr32380.f  -O  (test for excess errors)
> FAIL: gfortran.dg/vect/pr33301.f  -O  (test for excess errors)
> FAIL: gfortran.dg/vect/pr50178.f90  -O  (test for excess errors)
> FAIL: gfortran.dg/vect/pr50412.f90  -O  (test for excess errors)
> FAIL: gfortran.dg/vect/pr51058-2.f90  -O  (test for excess errors)
> FAIL: gfortran.dg/vect/pr51058.f90  -O  (test for excess errors)
> FAIL: gfortran.dg/vect/pr51285.f90  -O  (test for excess errors)
> FAIL: gfortran.dg/vect/vect-1.f90  -O   scan-tree-dump-times vect
> "vectorized 3 loops" 1
> FAIL: gfortran.dg/vect/vect-1.f90  -O  (test for excess errors)
> FAIL: gfortran.dg/vect/vect-2.f90  -O

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-10-01 Thread H.J. Lu

On Mon, Oct 1, 2012 at 6:49 AM, Sharad Singhai  wrote:
> I am sorry, I didn't enable all the languages. Will fix the fortran
> test breakage shortly.

It is not just Fortran.  There are some failures in C testcases.

> Thanks,
> Sharad
> Sharad
>
>
> On Mon, Oct 1, 2012 at 4:50 AM, H.J. Lu  wrote:
>> On Sun, Sep 30, 2012 at 11:36 PM, Sharad Singhai  wrote:
>>> Resend to gcc-patches
>>>
>>> I have addressed the comments by fixing all the minor issues,
>>> bootstrapped and tested on x86_64. I did the recommended reshuffling
>>> by moving non-tree code from tree-dump.c into a new file dumpfile.c.
>>>
>>> I committed two successive revisions
>>> r191883 Main patch with the dump infrastructure changes. However, I
>>> accidentally left out a new file, dumpfile.c.
>>> r191884 Added dumpfile.c, and did the renaming of dump_* functions
>>> from gimple_pretty_print.[ch].
>>>
>>> As things stand right now, r191883 is broken because of the missing
>>> file 'dumpfile.c', which the very next commit fixes. Anyone who got
>>> broken revision r191883, please svn update. I am really very sorry
>>> about that.
>>>
>>> I have a couple more minor patches which deal with renaming; I plan to
>>> address those later.
>>>
>>
>> It caused:
>>
>> FAIL: gcc.dg/tree-ssa/gen-vect-11.c (test for excess errors)
>> FAIL: gcc.dg/tree-ssa/gen-vect-11.c scan-tree-dump-times vect
>> "vectorized 1 loops" 1
>> FAIL: gcc.dg/tree-ssa/gen-vect-11a.c (test for excess errors)
>> FAIL: gcc.dg/tree-ssa/gen-vect-11a.c scan-tree-dump-times vect
>> "vectorized 1 loops" 1
>> FAIL: gcc.dg/tree-ssa/gen-vect-11b.c (test for excess errors)
>> FAIL: gcc.dg/tree-ssa/gen-vect-11b.c scan-tree-dump-times vect
>> "vectorized 0 loops" 1
>> FAIL: gcc.dg/tree-ssa/gen-vect-11c.c (test for excess errors)
>> FAIL: gcc.dg/tree-ssa/gen-vect-11c.c scan-tree-dump-times vect
>> "vectorized 0 loops" 1
>> FAIL: gcc.dg/tree-ssa/gen-vect-2.c (test for excess errors)
>> FAIL: gcc.dg/tree-ssa/gen-vect-2.c scan-tree-dump-times vect
>> "vectorized 1 loops" 1
>> FAIL: gcc.dg/tree-ssa/gen-vect-25.c (test for excess errors)
>> FAIL: gcc.dg/tree-ssa/gen-vect-25.c scan-tree-dump-times vect
>> "vectorized 2 loops" 1
>> FAIL: gcc.dg/tree-ssa/gen-vect-26.c (test for excess errors)
>> FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
>> "Alignment of access forced using peeling" 1
>> FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
>> "vectorized 1 loops" 1
>> FAIL: gcc.dg/tree-ssa/gen-vect-28.c (test for excess errors)
>> FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
>> "Alignment of access forced using peeling" 1
>> FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
>> "vectorized 1 loops" 1
>> FAIL: gcc.dg/tree-ssa/gen-vect-32.c (test for excess errors)
>> FAIL: gcc.dg/tree-ssa/gen-vect-32.c scan-tree-dump-times vect
>> "vectorized 1 loops" 1
>> FAIL: gfortran.dg/vect/O3-pr36119.f90 (test for excess errors)
>> FAIL: gfortran.dg/vect/O3-pr39595.f (test for excess errors)
>> FAIL: gfortran.dg/vect/Ofast-pr50414.f90 (test for excess errors)
>> FAIL: gfortran.dg/vect/cost-model-pr34445.f (test for excess errors)
>> FAIL: gfortran.dg/vect/cost-model-pr34445a.f (test for excess errors)
>> FAIL: gfortran.dg/vect/fast-math-pr38968.f90 (test for excess errors)
>> FAIL: gfortran.dg/vect/fast-math-pr38968.f90 scan-tree-dump vect
>> "vectorized 1 loops"
>> FAIL: gfortran.dg/vect/fast-math-real8-pr40801.f90 (test for excess errors)
>> FAIL: gfortran.dg/vect/fast-math-real8-pr40801.f90 (test for excess errors)
>> FAIL: gfortran.dg/vect/fast-math-vect-8.f90 (test for excess errors)
>> FAIL: gfortran.dg/vect/fast-math-vect-8.f90 scan-tree-dump-times vect
>> "vectorized 1 loops" 1
>> FAIL: gfortran.dg/vect/no-fre-no-copy-prop-O3-pr51704.f90 (test for
>> excess errors)
>> FAIL: gfortran.dg/vect/no-vfa-pr32377.f90 (test for excess errors)
>> FAIL: gfortran.dg/vect/no-vfa-pr32377.f90 scan-tree-dump-times vect
>> "vectorized 2 loops" 1
>> FAIL: gfortran.dg/vect/no-vfa-pr32457.f90 (test for excess errors)
>> FAIL: gfortran.dg/vect/no-vfa-pr32457.f90 scan-tree-dump-times vect
>> "vectorized 0 loops" 1
>> FAIL: gfortran.dg/vect/pr19049.f90  -O   scan-tree-dump-times vect
>> "complicated access pattern" 1
>> FAIL: gfortran.dg/vect/pr19049.f90  -O  (test for excess errors)
>> FAIL: gfortran.dg/vect/pr32377.f90  -O   scan-tree-dump-times vect
>> "vectorized 2 loops" 1
>> FAIL: gfortran.dg/vect/pr32377.f90  -O  (test for excess errors)
>> FAIL: gfortran.dg/vect/pr32380.f  -O   scan-tree-dump-times vect
>> "vectorized 6 loops" 1
>> FAIL: gfortran.dg/vect/pr32380.f  -O  (test for excess errors)
>> FAIL: gfortran.dg/vect/pr33301.f  -O  (test for excess errors)
>> FAIL: gfortran.dg/vect/pr50178.f90  -O  (test for excess errors)
>> FAIL: gfortran.dg/vect/pr50412.f90  -O  (test for excess errors)
>> FAIL: gfortran.dg/vect/pr51058-2.f90  -O  (test for excess errors)
>> FAIL: gfortran.dg/vect/pr51058.f90  -O  (test for excess errors)
>> FAIL: gfortran.dg/vect/pr51285.f90  -

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-10-01 Thread Sharad Singhai

On Mon, Oct 1, 2012 at 6:52 AM, H.J. Lu  wrote:
> On Mon, Oct 1, 2012 at 6:49 AM, Sharad Singhai  wrote:
>> I am sorry, I didn't enable all the languages. Will fix the fortran
>> test breakage shortly.
>
> It is not just Fortran.  There are some failures in C testcases.

I checked and those files looked like generator files for Fortran
tests and thus were not exercised in my configuration. I am really
sorry about that. I am fixing it.

UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-11.c
UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-11a.c
UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-11b.c
UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-11c.c
UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-2.c
UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-25.c
UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-26.c
UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-28.c
UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-32.c

Thanks,
Sharad

>
>> Thanks,
>> Sharad
>> Sharad
>>
>>
>> On Mon, Oct 1, 2012 at 4:50 AM, H.J. Lu  wrote:
>>> On Sun, Sep 30, 2012 at 11:36 PM, Sharad Singhai  wrote:
>>>> Resend to gcc-patches
>>>>
>>>> I have addressed the comments by fixing all the minor issues,
>>>> bootstrapped and tested on x86_64. I did the recommended reshuffling
>>>> by moving non-tree code from tree-dump.c into a new file dumpfile.c.
>>>>
>>>> I committed two successive revisions
>>>> r191883 Main patch with the dump infrastructure changes. However, I
>>>> accidentally left out a new file, dumpfile.c.
>>>> r191884 Added dumpfile.c, and did the renaming of dump_* functions
>>>> from gimple_pretty_print.[ch].
>>>>
>>>> As things stand right now, r191883 is broken because of the missing
>>>> file 'dumpfile.c', which the very next commit fixes. Anyone who got
>>>> broken revision r191883, please svn update. I am really very sorry
>>>> about that.
>>>>
>>>> I have a couple more minor patches which deal with renaming; I plan to
>>>> address those later.
>>>>
>>>
>>> It caused:
>>>
>>> FAIL: gcc.dg/tree-ssa/gen-vect-11.c (test for excess errors)
>>> FAIL: gcc.dg/tree-ssa/gen-vect-11.c scan-tree-dump-times vect
>>> "vectorized 1 loops" 1
>>> FAIL: gcc.dg/tree-ssa/gen-vect-11a.c (test for excess errors)
>>> FAIL: gcc.dg/tree-ssa/gen-vect-11a.c scan-tree-dump-times vect
>>> "vectorized 1 loops" 1
>>> FAIL: gcc.dg/tree-ssa/gen-vect-11b.c (test for excess errors)
>>> FAIL: gcc.dg/tree-ssa/gen-vect-11b.c scan-tree-dump-times vect
>>> "vectorized 0 loops" 1
>>> FAIL: gcc.dg/tree-ssa/gen-vect-11c.c (test for excess errors)
>>> FAIL: gcc.dg/tree-ssa/gen-vect-11c.c scan-tree-dump-times vect
>>> "vectorized 0 loops" 1
>>> FAIL: gcc.dg/tree-ssa/gen-vect-2.c (test for excess errors)
>>> FAIL: gcc.dg/tree-ssa/gen-vect-2.c scan-tree-dump-times vect
>>> "vectorized 1 loops" 1
>>> FAIL: gcc.dg/tree-ssa/gen-vect-25.c (test for excess errors)
>>> FAIL: gcc.dg/tree-ssa/gen-vect-25.c scan-tree-dump-times vect
>>> "vectorized 2 loops" 1
>>> FAIL: gcc.dg/tree-ssa/gen-vect-26.c (test for excess errors)
>>> FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
>>> "Alignment of access forced using peeling" 1
>>> FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
>>> "vectorized 1 loops" 1
>>> FAIL: gcc.dg/tree-ssa/gen-vect-28.c (test for excess errors)
>>> FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
>>> "Alignment of access forced using peeling" 1
>>> FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
>>> "vectorized 1 loops" 1
>>> FAIL: gcc.dg/tree-ssa/gen-vect-32.c (test for excess errors)
>>> FAIL: gcc.dg/tree-ssa/gen-vect-32.c scan-tree-dump-times vect
>>> "vectorized 1 loops" 1
>>> FAIL: gfortran.dg/vect/O3-pr36119.f90 (test for excess errors)
>>> FAIL: gfortran.dg/vect/O3-pr39595.f (test for excess errors)
>>> FAIL: gfortran.dg/vect/Ofast-pr50414.f90 (test for excess errors)
>>> FAIL: gfortran.dg/vect/cost-model-pr34445.f (test for excess errors)
>>> FAIL: gfortran.dg/vect/cost-model-pr34445a.f (test for excess errors)
>>> FAIL: gfortran.dg/vect/fast-math-pr38968.f90 (test for excess errors)
>>> FAIL: gfortran.dg/vect/fast-math-pr38968.f90 scan-tree-dump vect
>>> "vectorized 1 loops"
>>> FAIL: gfortran.dg/vect/fast-math-real8-pr40801.f90 (test for excess errors)
>>> FAIL: gfortran.dg/vect/fast-math-real8-pr40801.f90 (test for excess errors)
>>> FAIL: gfortran.dg/vect/fast-math-vect-8.f90 (test for excess errors)
>>> FAIL: gfortran.dg/vect/fast-math-vect-8.f90 scan-tree-dump-times vect
>>> "vectorized 1 loops" 1
>>> FAIL: gfortran.dg/vect/no-fre-no-copy-prop-O3-pr51704.f90 (test for
>>> excess errors)
>>> FAIL: gfortran.dg/vect/no-vfa-pr32377.f90 (test for excess errors)
>>> FAIL: gfortran.dg/vect/no-vfa-pr32377.f90 scan-tree-dump-times vect
>>> "vectorized 2 loops" 1
>>> FAIL: gfortran.dg/vect/no-vfa-pr32457.f90 (test for excess errors)
>>> FAIL: gfortran.dg/vect/no-vfa-pr32457.f90 scan-tree-dump-times vect
>>> "vectorized 0 loops" 1
>>> FAIL: gfortran.dg/vect/pr19049.f90  -O   scan-tree-dump-times vect
>>> "complicated access pattern" 1
>>> FAIL: gfortran.dg/vect/pr19049.f90  -O  (test for exc

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-10-01 Thread Richard Guenther

On Mon, Oct 1, 2012 at 3:55 PM, Sharad Singhai  wrote:
> On Mon, Oct 1, 2012 at 6:52 AM, H.J. Lu  wrote:
>> On Mon, Oct 1, 2012 at 6:49 AM, Sharad Singhai  wrote:
>>> I am sorry, I didn't enable all the languages. Will fix the fortran
>>> test breakage shortly.
>>
>> It is not just Fortran.  There are some failures in C testcases.
>
> I checked and those files looked like generator files for Fortran
> tests and thus were not exercised in my configuration. I am really
> sorry about that. I am fixing it.

As I said, you should not enable/disable anything special but
configure with all default languages enabled (no --enable-languages)
and do toplevel make -k check, preferably also exercising
multilibs with RUNTESTFLAGS="--target_board=unix/\{,-m32\}"

Richard.

> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-11.c
> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-11a.c
> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-11b.c
> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-11c.c
> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-2.c
> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-25.c
> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-26.c
> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-28.c
> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-32.c
>
> Thanks,
> Sharad
>
>>
>>> Thanks,
>>> Sharad
>>> Sharad
>>>
>>>
>>> On Mon, Oct 1, 2012 at 4:50 AM, H.J. Lu  wrote:
>>>> On Sun, Sep 30, 2012 at 11:36 PM, Sharad Singhai  wrote:
>>>>> Resend to gcc-patches
>>>>>
>>>>> I have addressed the comments by fixing all the minor issues,
>>>>> bootstrapped and tested on x86_64. I did the recommended reshuffling
>>>>> by moving non-tree code from tree-dump.c into a new file dumpfile.c.
>>>>>
>>>>> I committed two successive revisions
>>>>> r191883 Main patch with the dump infrastructure changes. However, I
>>>>> accidentally left out a new file, dumpfile.c.
>>>>> r191884 Added dumpfile.c, and did the renaming of dump_* functions
>>>>> from gimple_pretty_print.[ch].
>>>>>
>>>>> As things stand right now, r191883 is broken because of the missing
>>>>> file 'dumpfile.c', which the very next commit fixes. Anyone who got
>>>>> broken revision r191883, please svn update. I am really very sorry
>>>>> about that.
>>>>>
>>>>> I have a couple more minor patches which deal with renaming; I plan to
>>>>> address those later.
>>>>>
>>>>
>>>> It caused:
>>>>
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-11.c (test for excess errors)
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-11.c scan-tree-dump-times vect
>>>> "vectorized 1 loops" 1
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-11a.c (test for excess errors)
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-11a.c scan-tree-dump-times vect
>>>> "vectorized 1 loops" 1
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-11b.c (test for excess errors)
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-11b.c scan-tree-dump-times vect
>>>> "vectorized 0 loops" 1
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-11c.c (test for excess errors)
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-11c.c scan-tree-dump-times vect
>>>> "vectorized 0 loops" 1
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-2.c (test for excess errors)
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-2.c scan-tree-dump-times vect
>>>> "vectorized 1 loops" 1
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-25.c (test for excess errors)
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-25.c scan-tree-dump-times vect
>>>> "vectorized 2 loops" 1
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-26.c (test for excess errors)
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
>>>> "Alignment of access forced using peeling" 1
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
>>>> "vectorized 1 loops" 1
>>>> FAIL: gcc.dg/tree-ssa/gen-vect-28.c (test for excess errors)
 FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
 "Alignment of access forced using peeling" 1
 FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
 "vectorized 1 loops" 1
 FAIL: gcc.dg/tree-ssa/gen-vect-32.c (test for excess errors)
 FAIL: gcc.dg/tree-ssa/gen-vect-32.c scan-tree-dump-times vect
 "vectorized 1 loops" 1
 FAIL: gfortran.dg/vect/O3-pr36119.f90 (test for excess errors)
 FAIL: gfortran.dg/vect/O3-pr39595.f (test for excess errors)
 FAIL: gfortran.dg/vect/Ofast-pr50414.f90 (test for excess errors)
 FAIL: gfortran.dg/vect/cost-model-pr34445.f (test for excess errors)
 FAIL: gfortran.dg/vect/cost-model-pr34445a.f (test for excess errors)
 FAIL: gfortran.dg/vect/fast-math-pr38968.f90 (test for excess errors)
 FAIL: gfortran.dg/vect/fast-math-pr38968.f90 scan-tree-dump vect
 "vectorized 1 loops"
 FAIL: gfortran.dg/vect/fast-math-real8-pr40801.f90 (test for excess errors)
 FAIL: gfortran.dg/vect/fast-math-real8-pr40801.f90 (test for excess errors)
 FAIL: gfortran.dg/vect/fast-math-vect-8.f90 (test for excess errors)
 FAIL: gfortran.dg/vect/fast-math-vect-8.f90 scan-tree-dump-times vect
 "vectorized 1 loops" 1
 FAIL: gfortran.dg/vect/no-fre-no-copy-prop-O3-pr51704.f90 (test for
 excess errors)
 FAIL: gfortran.dg/vect/no-vfa-pr32377.f90 (test for excess er

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-10-01 Thread Sharad Singhai

Okay, I am retesting without any special configs and with multilibs as
you suggested.

Thanks,
Sharad

On Mon, Oct 1, 2012 at 7:00 AM, Richard Guenther
 wrote:
> On Mon, Oct 1, 2012 at 3:55 PM, Sharad Singhai  wrote:
>> On Mon, Oct 1, 2012 at 6:52 AM, H.J. Lu  wrote:
>>> On Mon, Oct 1, 2012 at 6:49 AM, Sharad Singhai  wrote:
>>>> I am sorry, I didn't enable all the languages. Will fix the fortran
>>>> test breakage shortly.
>>>
>>> It is not just Fortran.  There are some failures in C testcases.
>>
>> I checked and those files looked like generator files for Fortran
>> tests and thus were not exercised in my configuration. I am really
>> sorry about that. I am fixing it.
>
> As I said, you should not enable/disable anything special but
> configure with all default languages enabled (no --enable-languages)
> and do toplevel make -k check, preferably also exercising
> multilibs with RUNTESTFLAGS="--target_board=unix/\{,-m32\}"
>
> Richard.
>
>> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-11.c
>> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-11a.c
>> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-11b.c
>> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-11c.c
>> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-2.c
>> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-25.c
>> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-26.c
>> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-28.c
>> UNSUPPORTED: gcc.dg/tree-ssa/gen-vect-32.c
>>
>> Thanks,
>> Sharad
>>
>>>
 Thanks,
 Sharad
 Sharad


 On Mon, Oct 1, 2012 at 4:50 AM, H.J. Lu  wrote:
> On Sun, Sep 30, 2012 at 11:36 PM, Sharad Singhai  
> wrote:
>> Resend to gcc-patches
>>
>> I have addressed the comments by fixing all the minor issues,
>> bootstrapped and tested on x86_64. I did the recommended reshuffling
>> by moving non-tree code from tree-dump.c into a new file dumpfile.c.
>>
>> I committed two successive revisions
>> r191883 Main patch with the dump infrastructure changes. However, I
>> accidentally left out a new file, dumpfile.c.
>> r191884 Added dumpfile.c, and did the renaming of dump_* functions
>> from gimple_pretty_print.[ch].
>>
>> As things stand right now, r191883 is broken because of the missing
>> file 'dumpfile.c', which the very next commit fixes. Anyone who got
>> broken revision r191883, please svn update. I am really very sorry
>> about that.
>>
>> I have a couple more minor patches which deal with renaming; I plan to
>> address those later.
>>
>
> It caused:
>
> FAIL: gcc.dg/tree-ssa/gen-vect-11.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-11.c scan-tree-dump-times vect
> "vectorized 1 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-11a.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-11a.c scan-tree-dump-times vect
> "vectorized 1 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-11b.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-11b.c scan-tree-dump-times vect
> "vectorized 0 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-11c.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-11c.c scan-tree-dump-times vect
> "vectorized 0 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-2.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-2.c scan-tree-dump-times vect
> "vectorized 1 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-25.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-25.c scan-tree-dump-times vect
> "vectorized 2 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-26.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
> "Alignment of access forced using peeling" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
> "vectorized 1 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-28.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
> "Alignment of access forced using peeling" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
> "vectorized 1 loops" 1
> FAIL: gcc.dg/tree-ssa/gen-vect-32.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/gen-vect-32.c scan-tree-dump-times vect
> "vectorized 1 loops" 1
> FAIL: gfortran.dg/vect/O3-pr36119.f90 (test for excess errors)
> FAIL: gfortran.dg/vect/O3-pr39595.f (test for excess errors)
> FAIL: gfortran.dg/vect/Ofast-pr50414.f90 (test for excess errors)
> FAIL: gfortran.dg/vect/cost-model-pr34445.f (test for excess errors)
> FAIL: gfortran.dg/vect/cost-model-pr34445a.f (test for excess errors)
> FAIL: gfortran.dg/vect/fast-math-pr38968.f90 (test for excess errors)
> FAIL: gfortran.dg/vect/fast-math-pr38968.f90 scan-tree-dump vect
> "vectorized 1 loops"
> FAIL: gfortran.dg/vect/fast-math-real8-pr40801.f90 (test for excess 
> errors)
> FAIL: gfortran.dg/vect/fast-math-real8-pr40801.f90 (test for excess 
> errors)
> FAIL: gfortran.dg/vect/fast-math-vect-8.f90 (

Second ping: Re: Add a configure option to disable system header canonicalizations (issue6495088)

2012-10-01 Thread Simon Baldwin

Ping, again.


On 21 September 2012 12:45, Simon Baldwin  wrote:
>
> Ping.
>
> http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00459.html
>
> Full text of previous message and context at URL above.  No comments
> or code changes since.  Patch description left below for convenience.
>
> >
> > Add flags to disable system header canonicalizations.
> >
> > Libcpp may canonicalize system header paths with lrealpath() for
> > diagnostics, dependency output, and similar.  If gcc is held in a symlink
> > farm the canonicalized paths may be meaningless to users, and will also
> > conflict with build frameworks that (for example) disallow absolute paths
> > to header files.
> >
> > This change adds -f[no-]canonical-system-headers to the gcc command line,
> > and a configure option --[en/dis]able-canonical-system-headers to set
> > default behaviour, allowing the user to select whether or not to implement
> > r186991.  Default is enabled.  See also PR c++/52974.
> >
> > Tested for regressions with bootstrap builds of C and C++, both with and
> > without configure --disable-canonical-system-headers.
>
> --
> Google UK Limited | Registered Office: Belgrave House, 76 Buckingham
> Palace Road, London SW1W 9TQ | Registered in England Number: 3977902





[Patch] Fix PR53397

2012-10-01 Thread venkataramanan.kumar

Hi, 

The patch below fixes the FFT/SciMark regression caused by useless prefetch
generation.

The fix makes prefetching less aggressive: arrays in the inner loop are
prefetched only when the step is invariant in the entire loop nest.

GCC currently prefetches for non-constant steps when they are invariant in
the inner loop, but it does not check whether the step varies in outer loops.

In the SciMark FFT case, the trip count of the inner loop depends on a
non-constant step, which is invariant in the inner loop but varies in the
outer loop.  This makes the inner loop trip count small (at run time it is
sometimes as small as 1 iteration).

Prefetching x iterations ahead when the inner loop trip count is smaller
than x leads to useless prefetches.
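
The last point can be illustrated with a small sketch.  It is not part of
the patch; it assumes a simplified model in which iteration i issues a
prefetch for the data of iteration i + ahead, and the name
useful_prefetches is made up for this illustration.

```c
/* Simplified model (an assumption for illustration only): iteration i of
   a loop with TRIP_COUNT iterations prefetches the data that iteration
   i + AHEAD will use.  A prefetch is useful only if that later iteration
   is actually executed.  */
unsigned
useful_prefetches (unsigned trip_count, unsigned ahead)
{
  unsigned useful = 0;

  for (unsigned i = 0; i < trip_count; i++)
    if (i + ahead < trip_count)
      useful++;

  return useful;
}
```

In this model, with a trip count of 1 and a prefetch distance of 4 no
prefetch is useful, while with a trip count of 10 six of the ten are.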

Flags used: -O3 -march=amdfam10

Before 
**  **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to p...@nist.gov) **
**  **
Using   2.00 seconds min time per kenel.
Composite Score:          550.50
FFT             Mflops:    38.66    (N=1024)
SOR             Mflops:   617.61    (100 x 100)
MonteCarlo:     Mflops:   173.74
Sparse matmult  Mflops:   675.63    (N=1000, nz=5000)
LU              Mflops:  1246.88    (M=100, N=100)


After 
**  **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to p...@nist.gov) **
**  **
Using   2.00 seconds min time per kenel.
Composite Score:          639.20
FFT             Mflops:   479.19    (N=1024)
SOR             Mflops:   617.61    (100 x 100)
MonteCarlo:     Mflops:   173.18
Sparse matmult  Mflops:   679.13    (N=1000, nz=5000)
LU              Mflops:  1246.88    (M=100, N=100)

GCC regression "make check -k" passes with x86_64-unknown-linux-gnu
New tests that PASS:

gcc.dg/pr53397-1.c scan-assembler prefetcht0
gcc.dg/pr53397-1.c scan-tree-dump aprefetch "Issued prefetch"
gcc.dg/pr53397-1.c (test for excess errors)
gcc.dg/pr53397-2.c scan-tree-dump aprefetch "loop variant step"
gcc.dg/pr53397-2.c scan-tree-dump aprefetch "Not prefetching"
gcc.dg/pr53397-2.c (test for excess errors)


Checked CPU2006 and polyhedron on latest AMD processor, no regressions noted.

Ok to commit in trunk?

regards,
Venkat

gcc/ChangeLog
+2012-10-01  Venkataramanan Kumar  
+
+   * tree-ssa-loop-prefetch.c (gather_memory_references_ref):
+   Perform non constant step prefetching in inner loop, only
+   when it is invariant in the entire loop nest.
+   * testsuite/gcc.dg/pr53397-1.c: New test case.
+   Checks we are prefetching for loop invariant steps.
+   * testsuite/gcc.dg/pr53397-2.c: New test case.
+   Checks we are not prefetching for loop variant steps.
+


Index: gcc/testsuite/gcc.dg/pr53397-1.c
===
--- gcc/testsuite/gcc.dg/pr53397-1.c(revision 0)
+++ gcc/testsuite/gcc.dg/pr53397-1.c(revision 0)
@@ -0,0 +1,28 @@
+/* Prefetching when the step is loop invariant.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -fprefetch-loop-arrays -fdump-tree-aprefetch-details --param min-insn-to-prefetch-ratio=3 --param simultaneous-prefetches=10 -fdump-tree-aprefetch-details" } */
+
+
+double data[16384];
+void prefetch_when_non_constant_step_is_invariant(int step, int n)
+{
+ int a;
+ int b;
+ for (a = 1; a < step; a++) {
+for (b = 0; b < n; b += 2 * step) {
+
+  int i = 2*(b + a);
+  int j = 2*(b + a + step);
+
+
+  data[j]   = data[i];
+  data[j+1] = data[i+1];
+}
+ }
+}
+
+/* { dg-final { scan-tree-dump "Issued prefetch" "aprefetch" } } */
+/* { dg-final { scan-assembler "prefetcht0" } } */
+
+/* { dg-final { cleanup-tree-dump "aprefetch" } } */
Index: gcc/testsuite/gcc.dg/pr53397-2.c
===
--- gcc/testsuite/gcc.dg/pr53397-2.c(revision 0)
+++ gcc/testsuite/gcc.dg/pr53397-2.c(revision 0)
@@ -0,0 +1,29 @@
+/* Not prefetching when the step is loop variant.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -fprefetch-loop-arrays -fdump-tree-aprefetch-details --param min-insn-to-prefetch-ratio=3 --param simultaneous-prefetches=10 -fdump-tree-aprefetch-details" } */
+
+
+double data[16384];
+void donot_prefetch_when_non_constant_step_is_variant(int step, int n)
+{ 
+ int a;
+ int b;
+ for (a = 1; a < step; a++,step*=2) {
+for (b = 0; b < n; b += 2 * step) {
+
+  int i = 2*(b + a);
+  int j = 2*(b + a + step);
+
+
+  data[j]   = data[i];
+  data[j+1] = data[i+1];
+}
+ } 
+}
+
+/* { dg-final { scan-tree-dump "Not prefetching" "aprefetch" } } */
+/* { dg-final { scan-tree-dump "loop variant step" "aprefetch" }

Re: [PATCH, i386]: Implement atomic_fetch_sub

2012-10-01 Thread Andrew MacLeod


On 08/30/2012 05:33 PM, Richard Henderson wrote:

On 08/23/2012 08:59 AM, Andrew MacLeod wrote:

2012-08-23  Andrew MacLeod  

gcc
PR target/54087
* optabs.c (expand_atomic_fetch_op_no_fallback): New.  Factored code
from expand_atomic_fetch_op.
(expand_atomic_fetch_op):  Try atomic_{add|sub} operations in terms of
the other one if direct opcode fails.

testsuite
* gcc.dg/pr54087.c:  New testcase for atomic_sub -> atomic_add when
atomic_sub fails.

Ok.


Oops, approved but never checked in.  Just did so after a new
bootstrap/test cycle with no issues.


Andrew

Re: vec_cond_expr adjustments

2012-10-01 Thread Marc Glisse


[merging both threads, thanks for the answers]

On Mon, 1 Oct 2012, Richard Guenther wrote:


optabs should be fixed instead, an is_gimple_val condition is implicitly
val != 0.


For vectors, I think it should be val < 0 (with an appropriate cast of val
to a signed integer vector type if necessary). Or (val & highbit) != 0, but
that's longer.


I don't think so.  Throughout the compiler we generally assume false == 0
and anything else is true.  (yes, for FP there is STORE_FLAG_VALUE, but
its scope is quite limited - if we want sth similar for vectors we'd have to
invent it).


See below.


If we for example have

predicate = a < b;
x = predicate ? d : e;
y = predicate ? f : g;

we ideally want to re-use the predicate computation on targets where
that would be optimal (and combine should be able to recover the
case where it is not).


That I don't understand. The vcond instruction implemented by targets takes
as arguments d, e, cmp, a, b and emits the comparison itself. I don't see
how I can avoid sending to the targets both (d,e,<,a,b) and (f,g,<,a,b).
They will notice eventually that a

But that's a limitation of how vcond works.  ISTR there is/was a vselect
instruction as well, taking a "mask" and two vectors to select from.  At least
that's how vcond works internally for some sub-targets.


vselect seems to only appear in config/. Would it be defined as:
vselect(m,a,b)=(a&m)|(b&~m) ? I would almost be tempted to just define a 
pattern in .md files and let combine handle it, although it might be one 
instruction too long for that (and if m is x=y).

Or would it match the OpenCL select: "For each component of a vector type,
result[i] = if MSB of c[i] is set ? b[i] : a[i]."? Or the pattern with &
and | but with a precondition that the value of each element of the mask
must be 0 or ±1?
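
The two candidate semantics can be compared on a single 32-bit element
with the sketch below.  The function names are hypothetical and this says
nothing about how any target defines its vselect pattern; it only spells
out the formula (a&m)|(b&~m) and the quoted OpenCL rule.

```c
#include <stdint.h>

/* Bitwise select: each result bit comes from A where the mask bit is set
   and from B where it is clear, i.e. (a & m) | (b & ~m).  */
uint32_t
select_bitwise (uint32_t m, uint32_t a, uint32_t b)
{
  return (a & m) | (b & ~m);
}

/* OpenCL-style select on one element: only the most significant bit of
   the condition C matters; B is chosen when it is set, A otherwise.  */
uint32_t
select_msb (uint32_t a, uint32_t b, uint32_t c)
{
  return (c & UINT32_C (0x80000000)) ? b : a;
}
```

For masks that are all-zeros or all-ones the two agree (note the swapped
operand order), but for a mask of 1 the bitwise form mixes bits of both
inputs while the MSB form returns one input unchanged.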

I don't find vcond that bad, as long as targets check for trivial 
comparisons in the expansion (what trivial means may depend on the 
platform). It is quite flexible for targets.



On Mon, 1 Oct 2012, Richard Guenther wrote:


tmp = fold_build2_loc (gimple_location (def_stmt),
   code,
-  boolean_type_node,
+  TREE_TYPE (cond),


That's obvious.


Ok, I'll test and commit that line separately.


+  if (TREE_CODE (op0) == VECTOR_CST && TREE_CODE (op1) == VECTOR_CST)
+{
+  int count = VECTOR_CST_NELTS (op0);
+  tree *elts =  XALLOCAVEC (tree, count);
+  gcc_assert (TREE_CODE (type) == VECTOR_TYPE);
+
+  for (int i = 0; i < count; i++)
+   {
+ tree elem_type = TREE_TYPE (type);
+ tree elem0 = VECTOR_CST_ELT (op0, i);
+ tree elem1 = VECTOR_CST_ELT (op1, i);
+
+ elts[i] = fold_relational_const (code, elem_type,
+  elem0, elem1);
+
+ if(elts[i] == NULL_TREE)
+   return NULL_TREE;
+
+ elts[i] = fold_negate_const (elts[i], elem_type);


I think you need to invent something new similar to STORE_FLAG_VALUE
or use STORE_FLAG_VALUE here.  With the above you try to map
{0, 1} to {0, -1} which is only true if the operation on the element types
returns {0, 1} (thus, STORE_FLAG_VALUE is 1).


Er, seems to me that constant folding of a scalar comparison in the
front/middle-end only returns {0, 1}.


+/* Return true if EXPR is an integer constant representing true.  */
+
+bool
+integer_truep (const_tree expr)
+{
+  STRIP_NOPS (expr);
+
+  switch (TREE_CODE (expr))
+{
+case INTEGER_CST:
+  /* Do not just test != 0, some places expect the value 1.  */
+  return (TREE_INT_CST_LOW (expr) == 1
+ && TREE_INT_CST_HIGH (expr) == 0);


I wonder if using STORE_FLAG_VALUE is better here (note that it
usually differs for FP vs. integral comparisons and the mode passed
to STORE_FLAG_VALUE is that of the comparison result).


I notice there is already a VECTOR_STORE_FLAG_VALUE (used only once in
simplify-rtx, in a way that seems a bit strange but I'll try to
understand that later). Thanks for showing me this macro, it seems
important indeed. However the STORE_FLAG_VALUE mechanism seems to be for
the RTL level.

It looks like it would be possible to have 3 different semantics:
source code is OpenCL, middle-end whatever we want (0 / 1 for instance),
and back-end is whatever the target wants. The front-end would generate
for a
That said, until we are sure what semantics we want here (forwprop
for example doesn't look at 'comparisons' but operations on special
values and types) I'd prefer to not introduce integer_truep ().


I completely agree that defining the semantics comes first :-)

--
Marc Glisse

Re: [patch, mips] Patch for new mips triplet - mips-mti-elf

2012-10-01 Thread Steve Ellcey

On Sun, 2012-09-30 at 19:53 +0100, Richard Sandiford wrote:

> Sorry for only noticing now, but this produced:
> 
> ERROR: gcc.target/mips/pr37362.c  -O0 : syntax error in target selector 
> "target ! mips*-sde-elf mips*-mti-elf" for " dg-do 2 compile { target { ! 
> mips*-sde-elf mips*-mti-elf } } "
> ...
> 
> We need another set of braces.  Tested on mipsisa64-elf and applied.
> 
> Richard

Thanks for fixing this, I am not sure why I didn't notice it in my
testing.

Steve Ellcey
sell...@mips.com

Re: Tweak IRA checks for singleton register classes

2012-10-01 Thread Vladimir Makarov


On 12-09-30 2:21 PM, Richard Sandiford wrote:

IRA has code to check whether there is only a single acceptable register
for a given operand.  This code uses conditions like:

   ira_class_hard_regs_num[cl] != 0
   && (ira_class_hard_regs_num[cl] <= ira_reg_class_max_nregs[cl][mode])

i.e. the number of registers needed to store the mode is >=
the number of allocatable registers in the class.  Then:

   ira_class_hard_regs[cl][0]

gives the register in question.

MIPS has a slightly strange situation in which HI can only be allocated
alongside LO; it can't be allocated independently.  At the moment,
HI and LO have their own register classes (MD0_REG and MD1_REG,
with the mapping depending on endianness) and MD_REGS is used when both
HI and LO are required.  There is also ACC_REGS, which is equivalent to
MD_REGS when the DSP ASE is not being used.  MD_REGS and ACC_REGS are
already mapped to constraints.

Having MD0_REG and MD1_REG leads to some confusing costs and makes
HI and LO irregular WRT the DSP ASE accumulator registers.  I've been
experimenting with patches to remove these classes and just have MD_REGS.
I wanted to get to a situation where this change has no effect on cc1 .ii
files for -mno-dsp; the patch below is one of those needed to get to that
stage.

MD_REGS has only one SImode register.  As described above, the same goes
for ACC_REGS unless the DSP ASE is being used.  However, both classes
fail the check above because HI (which doesn't accept SImode) is also
allocatable.  That is, the classes have two allocatable registers,
but only one of them can be used for SImode.

The patch below adds a new array for tracking which class/mode
combinations specify a single register, and for recording which
register that is.  The net effect will be the same on almost all
targets.
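
A toy model may help show why the count-based check misses the MD_REGS
case.  This is only a sketch: the two-register class, the helper names
old_single_reg_p and class_singleton, and the boolean arrays are
assumptions for illustration and are not the code in the patch, which
keeps its result in the ira_class_singleton array.

```c
#include <stdbool.h>

#define N_REGS 2   /* Toy class: index 0 plays LO, index 1 plays HI.  */

/* Count-based test: the class is treated as single-register only when
   the number of allocatable registers does not exceed the number of
   registers needed to hold the mode.  */
bool
old_single_reg_p (int class_hard_regs_num, int class_max_nregs)
{
  return class_hard_regs_num != 0 && class_hard_regs_num <= class_max_nregs;
}

/* Per-class, per-mode idea: return the only register of the class that
   can hold the mode, or -1 if there is none or more than one.  */
int
class_singleton (const bool allocatable[N_REGS], const bool mode_ok[N_REGS])
{
  int found = -1;

  for (int r = 0; r < N_REGS; r++)
    if (allocatable[r] && mode_ok[r])
      {
        if (found >= 0)
          return -1;
        found = r;
      }

  return found;
}
```

With both registers allocatable but only register 0 accepting the mode,
the count-based test says "not single" (2 > 1) while the per-mode scan
finds register 0.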

I deliberately didn't change:

  for (p2 = &reg_class_subclasses[cl2][0];
       *p2 != LIM_REG_CLASSES; p2++)
    if (ira_class_hard_regs_num[*p2] > 0
        && (ira_reg_class_max_nregs[*p2][mode]
            <= ira_class_hard_regs_num[*p2]))
      cost = MAX (cost, ira_register_move_cost[mode][cl1][*p2]);

  for (p1 = &reg_class_subclasses[cl1][0];
       *p1 != LIM_REG_CLASSES; p1++)
    if (ira_class_hard_regs_num[*p1] > 0
        && (ira_reg_class_max_nregs[*p1][mode]
            <= ira_class_hard_regs_num[*p1]))
      cost = MAX (cost, ira_register_move_cost[mode][*p1][cl2]);

from ira_init_register_move_cost because that had more effect
than I was expecting and wasn't needed for the MIPS patch.
It could be done as a follow-up if I ever find time...

I checked that this produced no difference in assembly output for
a set of x86_64 gcc .ii files (tested with -O2 -march=native on gcc20).
Also tested on x86_64-linux-gnu (including -m32) and mipsisa64-elf.
OK to install?

It is OK with me, Richard, although it was hard for me to understand the
correctness of the following changes.


Thanks for the patch.

Index: gcc/ira-lives.c
===
--- gcc/ira-lives.c 2012-09-30 12:56:14.344185269 +0100
+++ gcc/ira-lives.c 2012-09-30 17:45:14.962463976 +0100
@@ -849,9 +849,10 @@ single_reg_class (const char *constraint
  next_cl = (c == 'r'
 ? GENERAL_REGS
 : REG_CLASS_FROM_CONSTRAINT (c, constraints));
- if ((cl != NO_REGS && next_cl != cl)
- || (ira_class_hard_regs_num[next_cl]
- > ira_reg_class_max_nregs[next_cl][GET_MODE (op)]))
+ if (cl == NO_REGS
+ ? ira_class_singleton[next_cl][GET_MODE (op)] < 0
+ : (ira_class_singleton[cl][GET_MODE (op)]
+!= ira_class_singleton[next_cl][GET_MODE (op)]))
  
  	return NO_REGS;

  cl = next_cl;
  break;
@@ -861,10 +862,10 @@ single_reg_class (const char *constraint
  next_cl
= single_reg_class (recog_data.constraints[c - '0'],
recog_data.operand[c - '0'], NULL_RTX);
- if ((cl != NO_REGS && next_cl != cl)
- || next_cl == NO_REGS
- || (ira_class_hard_regs_num[next_cl]
- > ira_reg_class_max_nregs[next_cl][GET_MODE (op)]))
+ if (cl == NO_REGS
+ ? ira_class_singleton[next_cl][GET_MODE (op)] < 0
+ : (ira_class_singleton[cl][GET_MODE (op)]
+!= ira_class_singleton[next_cl][GET_MODE (op)]))
return NO_REGS;
  cl = next_cl;
  break;

Re: [PATCH, i386]: Implement atomic_fetch_sub

2012-10-01 Thread Paolo Bonzini

On 03/08/2012 17:08, Richard Henderson wrote:
> On 2012-08-03 08:01, Uros Bizjak wrote:
>> On Fri, Aug 3, 2012 at 4:40 PM, Richard Henderson  wrote:
>>> On 2012-08-03 01:51, Uros Bizjak wrote:
>>>> The same reasoning goes for dynamic negation: for neg %eax,%eax value
>>>> 0x8000 stays the same, but we have changed (x)sub to an (x)add in
>>>> the code stream.
>>>
>>> So?  Did you think the xadd will trap?
>>
>> No, but can we ignore the fact that we changed xsub -0x8000, mem
>> to xadd -0x08000, mem?
> 
> Yes, since it'll have the same effect on the bits.

In fact we can even use this trick for "xxor"...

Paolo
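
The "same effect on the bits" argument can be checked with wrapping 32-bit
arithmetic.  The sketch below assumes the value under discussion is the
32-bit minimum, 0x80000000, the one nonzero value that neg leaves
unchanged; the function names are made up, and the last function is only
one possible reading of the "xxor" remark.

```c
#include <stdint.h>

/* Subtract C from X using wrapping 32-bit arithmetic.  */
uint32_t
sub_direct (uint32_t x, uint32_t c)
{
  return x - c;
}

/* The same result computed as an addition of the negated operand.
   Unsigned negation wraps, so 0x80000000 negates to itself.  */
uint32_t
sub_via_add (uint32_t x, uint32_t c)
{
  uint32_t neg_c = (uint32_t) 0 - c;
  return x + neg_c;
}

/* Adding 0x80000000 can only change the top bit (the carry out of it is
   discarded), so it equals an exclusive-or with 0x80000000.  */
uint32_t
xor_top_bit_via_add (uint32_t x)
{
  return x + UINT32_C (0x80000000);
}
```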

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Vladimir Makarov


On 12-10-01 6:30 AM, Bernd Schmidt wrote:

On 10/01/2012 12:14 PM, Jakub Jelinek wrote:

On Mon, Oct 01, 2012 at 12:01:36PM +0200, Steven Bosscher wrote:

I would also agree if it were not for the fact that IRA is already a
scalability bottle-neck and that has been known for a long time, too.
I have no confidence at all that if LRA goes in now, these scalability
problems will be solved in stage3 or at any next release cycle. It's
always the same thing with GCC: Once a patch is in, everyone moves on
to the next fancy new thing dropping the not-quite-broken but also
not-quite-working things on the floor.

If we open a P1 bug for it for 4.8, then it will need to be resolved some
way before branching.  I think Vlad is committed to bugfixing LRA, after
all the intent is for 4.9 to enable it on more (all?) targets, and all the
bugfixing and scalability work on LRA is needed for that anyway.

Why can't this be done on the branch? We've made the mistake of rushing
things into mainline too early a few times before, we should have
learned by now. And adding more half transitions is not something we
really want either.

I should state clearly that the transition will not happen in a short
time because of the complexity of the task.  I believe that LRA will
coexist with reload for 1-2 releases.  I have only ported LRA to 9 major
targets.  Completing the transition will also depend on the maintainers
of secondary targets, because I alone cannot port LRA to all supported
targets.  This was discussed with a lot of people at the 2012 GNU Tools
Cauldron.  Maintaining LRA on the branch is a big burden; even x86-64 is
sometimes broken after a merge with the trunk.


When I proposed merging LRA into gcc4.8, I had in mind that:
  o moving most changes from the LRA branch will help LRA maintenance on
the branch, and I'll have more time to work on other targets and problems.
  o the earlier we start the transition, the better it will be for LRA,
because LRA on the trunk will get more feedback and better testing.


I've chosen x86/x86-64 for this because I am confident in this port.  On
the majority of tests, it generates faster, smaller code (even for these
two extreme tests it generates 15% smaller code) in less time.  IMO, the
slow compilation of the extreme tests is much less important than what
I've just mentioned.


But because I got clear objections from at least two people and no clear
support for the LRA inclusion (there were just no objections to including
it), I will not insist on the LRA merge now.


I believe in the importance of this work, as LLVM is catching up with GCC
on the RA front by implementing a new RA for LLVM 3.0.  I believe we
should get rid of reload as outdated, hard to maintain, and preventing
the implementation of new RA optimizations.


In any case submitting the patches was a good thing to do because I got 
a lot of feedback.  I still appreciate any comments on the patches.

Re: [RFC] Make vectorizer to skip loops with small iteration estimate

2012-10-01 Thread Jan Hubicka

> > 
> >  So the unvectorized cost is
> >  SIC * niters
> > 
> >  The vectorized path is
> >  SOC + VIC * ((niters-PL_ITERS-EP_ITERS)/VF) + VOC
> >  The scalar path of vectorizer loop is
> >  SIC * niters + SOC
> 
> Note that 'th' is used for the runtime profitability check which is
> done at the time the setup cost has already been taken (yes, we

Yes, I understand that.
> probably should make it more conservative but then guard the whole
> set of loops by the check, not only the vectorized path).
> See PR53355 for the general issue.

Yep, we may reduce the cost of SOC by outputting early guard for non-vectorized
path better than we do now. However...
> >Of course this is very simple benchmark, in reality the vectorization
> >can be a lot more harmful by complicating more complex control flows.
> >
> >So I guess we have two options
> > 1) go with the new formula and try to make cost model a bit more
> >    realistic.
> > 2) stay with original formula that is quite close to reality, but I
> >    think more by an accident.
> 
> I think we need to improve it as whole, thus I'd prefer 2).

... I do not see why.
Even if we make the check cheaper we will only distribute part of SOC to vector
prologues/epilogues.

Still I think the formula is wrong, i.e. it accounts for SOC where it should not.

The cost of scalar path without vectorization is 
  niters * SIC
while with vectorization we have scalar path
  niters * SIC + SOC
and vector path
  SOC + VIC * ((niters-PL_ITERS-EP_ITERS)/VF) + VOC

So SOC cancels out in the runtime check.
I still think we need two formulas - one determining if vectorization is
profitable, other specifying the threshold for scalar path at runtime (that
will generally give lower values).
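
The cancellation can be checked numerically with a sketch that codes the
formulas above as written.  The struct, the function names and the linear
search are assumptions for illustration and are not the vectorizer's code;
the field comments give the usual reading of the symbols.

```c
/* Cost parameters, named after the symbols in the formulas above.  */
struct vect_costs
{
  int sic;       /* scalar iteration cost */
  int vic;       /* vector iteration cost */
  int soc;       /* scalar outside cost (run time check) */
  int voc;       /* vector outside cost */
  int pl_iters;  /* prologue iterations */
  int ep_iters;  /* epilogue iterations */
  int vf;        /* vectorization factor */
};

/* Scalar path of the vectorized loop: niters * SIC + SOC.  */
long
scalar_path_cost (const struct vect_costs *c, long niters)
{
  return niters * c->sic + c->soc;
}

/* Vector path: SOC + VIC * ((niters - PL_ITERS - EP_ITERS) / VF) + VOC.  */
long
vector_path_cost (const struct vect_costs *c, long niters)
{
  return c->soc
         + c->vic * ((niters - c->pl_iters - c->ep_iters) / c->vf)
         + c->voc;
}

/* Smallest iteration count, up to LIMIT, at which the vector path is
   strictly cheaper than the scalar path; -1 if there is none.  */
long
runtime_threshold (const struct vect_costs *c, long limit)
{
  for (long n = c->pl_iters + c->ep_iters; n <= limit; n++)
    if (vector_path_cost (c, n) < scalar_path_cost (c, n))
      return n;
  return -1;
}
```

Because SOC appears on both sides of the comparison, changing it leaves
the threshold returned by runtime_threshold unchanged.
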
> > 2) Even when loop iterates 2 times, it is estimated to 4 iterations by
> >estimated_stmt_executions_int with the profile feedback.
> >The reason is loop_ch pass.  Given a rolled loop with exit probability
> >30%, proceeds by duplicating the header with original probabilities.
> >This makes the loop to be executed with 60% probability.  Because the
> >loop body counts remain the same (and they should), the expected number
> >of iterations increase by the decrease of entry edge to the header.
> > 
> >I wonder what to do about this.  Obviously without path profiling
> >loop_ch can not really do a good job.  We can artificially make
> >header to succeed more likely, that is the reality, but that requires
> >non-trivial loop profile updating.
> > 
> >We can also simply record the iteration bound into loop structure 
> >and ignore that the profile is not realistic
> 
> But we don't preserve loop structure from header copying ...

From what point do we keep the loop structure?  In general I would like to
eventually drop value histograms into the loop structure, specifying the
number of iterations with profile feedback.
> 
> >Finally we can duplicate loop headers before profiling.  I implemented
> >that via early_ch pass executed only with profile generation or feedback.
> >I guess it makes sense to do, even if it breaks the assumption that
> >we should do strictly -Os generation on paths where
> 
> Well, there are CH cases that do not increase code size and I doubt
> that loop header copying is generally bad for -Os ... we are not
> good at handling non-copied loop headers.

There is a comment saying
  /* Loop header copying usually increases size of the code.  This used not to
     be true, since quite often it is possible to verify that the condition is
     satisfied in the first iteration and therefore to eliminate it.  Jump
     threading handles these cases now.  */
  if (optimize_loop_for_size_p (loop))
    return false;

I am not sure how much backing it has.  Scheduling loop_ch as part of the early
passes, just after the profile pass, makes optimize_loop_for_size_p return true
even for functions that are later found cold by profile feedback.  I do not see
that being a big issue.

I tested enabling loop_ch in early passes with -fprofile-feedback and it is SPEC
neutral.  Given that it improves loop count estimates, I would still like
mainline to do that.  I do not like these quite important estimates being wrong
most of the time.

> 
> Btw, I added a "similar" check in vect_analyze_loop_operations:
> 
>   if ((LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
>&& (LOOP_VINFO_INT_NITERS (loop_vinfo) < vectorization_factor))
>   || ((max_niter = max_stmt_executions_int (loop)) != -1
>   && (unsigned HOST_WIDE_INT) max_niter < vectorization_factor))
> {
>   if (dump_kind_p (MSG_MISSED_OPTIMIZATION))
> dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>  "not vectorized: iteration count too small.");
>   if (dump_kind_p (MSG_MISSED_OPTIMIZATION))
> dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>  "not vectorized: iteration cou

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-10-01 Thread Michael Meissner

Your change on September 30th, breaks the powerpc port because the
REPORT_DETAILS value in the enumeration is no longer there, and the
rs6000_density_test function was using that.  Please in the future, when you
are making global changes, grep for uses of enum values in all of the machine
dependent directories so we can avoid breakage like this.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-10-01 Thread Michael Meissner

On Mon, Oct 01, 2012 at 02:02:26PM -0400, Michael Meissner wrote:
> Your change on September 30th, breaks the powerpc port because the
> REPORT_DETAILS value in the enumeration is no longer there, and the
> rs6000_density_test function was using that.  Please in the future, when you
> are making global changes, grep for uses of enum values in all of the machine
> dependent directories so we can avoid breakage like this.

Also, in looking at the changes, given we are already up to 28 TDF_ flags, I
would recommend immediately adding a new type that is the TDF flagword type.
Thus it will be a lot simpler when we add 4 more TDF flags and have to change
the type from int to HOST_WIDE_INT.
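
A minimal sketch of what such a flagword type could look like.  The type
name, the flag names and the helper are hypothetical, and uint64_t merely
stands in for HOST_WIDE_INT:

```c
#include <stdint.h>

/* Hypothetical dedicated type for dump flag words.  Widening it later
   means touching this one typedef instead of every "int flags".  */
typedef uint64_t sketch_dump_flags_t;

/* Hypothetical flags; the second one would not fit in a 32-bit int.  */
#define SKETCH_TDF_LOW   ((sketch_dump_flags_t) 1 << 0)
#define SKETCH_TDF_HIGH  ((sketch_dump_flags_t) 1 << 33)

/* Return nonzero if FLAG is set in FLAGS.  */
static inline int
sketch_dump_flag_set_p (sketch_dump_flags_t flags, sketch_dump_flags_t flag)
{
  return (flags & flag) != 0;
}
```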

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899

Re: [patch, libfortran] Fix PR 54736, memory corruption with GFORTRAN_CONVERT_UNIT

2012-10-01 Thread Thomas Koenig


Hello world,

the previous version of the patch has an issue that Shane pointed
out in the PR.  This version should work; at least it survived
all the test cases I could come up with.

Regression-tested (again).  OK for trunk?  Also for 4.6 and 4.7?

Thomas

2012-10-01  Thomas König  

PR libfortran/54736
* runtime/environ.c (search_unit):  Correct logic
for binary search.
(mark_single):  Fix index errors.

Index: runtime/environ.c
===
--- runtime/environ.c	(Revision 191857)
+++ runtime/environ.c	(Arbeitskopie)
@@ -459,21 +459,35 @@ search_unit (int unit, int *ip)
 {
   int low, high, mid;
 
-  low = -1;
-  high = n_elist;
-  while (high - low > 1)
+  if (n_elist == 0)
 {
+  *ip = 0;
+  return 0;
+}
+
+  low = 0;
+  high = n_elist - 1;
+
+  do 
+{
   mid = (low + high) / 2;
-  if (unit <= elist[mid].unit)
-	high = mid;
+  if (unit == elist[mid].unit)
+	{
+	  *ip = mid;
+	  return 1;
+	}
+  else if (unit > elist[mid].unit)
+	low = mid + 1;
   else
-	low = mid;
-}
-  *ip = high;
-  if (elist[high].unit == unit)
-return 1;
+	high = mid - 1;
+} while (low <= high);
+
+  if (unit > elist[mid].unit)
+*ip = mid + 1;
   else
-return 0;
+*ip = mid;
+
+  return 0;
 }
 
 /* This matches a keyword.  If it is found, return the token supplied,
@@ -588,13 +602,13 @@ mark_single (int unit)
 }
   if (search_unit (unit, &i))
 {
-  elist[unit].conv = endian;
+  elist[i].conv = endian;
 }
   else
 {
-  for (j=n_elist; j>=i; j--)
+  for (j=n_elist-1; j>=i; j--)
 	elist[j+1] = elist[j];
-
+
   n_elist += 1;
   elist[i].unit = unit;
   elist[i].conv = endian;
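
For reference, a self-contained sketch of the same idea outside libgfortran:
a binary search over a sorted array that reports either the matching index
or the insertion point, followed by the shift-and-insert step.  The struct
and function names are stand-ins, and the loop is a while-loop variant
rather than the do/while form in the patch.

```c
/* Hypothetical stand-in for the elist entries in runtime/environ.c.  */
typedef struct { int unit; int conv; } sketch_entry;

/* Return 1 and set *ip to the index of UNIT if it is present; otherwise
   return 0 and set *ip to the index where UNIT has to be inserted to
   keep the array sorted.  */
static int
sketch_search (const sketch_entry *list, int n, int unit, int *ip)
{
  int low = 0, high = n - 1;

  while (low <= high)
    {
      int mid = low + (high - low) / 2;
      if (list[mid].unit == unit)
        {
          *ip = mid;
          return 1;
        }
      else if (unit > list[mid].unit)
        low = mid + 1;
      else
        high = mid - 1;
    }
  *ip = low;
  return 0;
}

/* Update UNIT if present, else insert it.  Returns the new element
   count.  The caller guarantees room for n + 1 entries.  */
static int
sketch_mark (sketch_entry *list, int n, int unit, int conv)
{
  int i, j;

  if (sketch_search (list, n, unit, &i))
    {
      list[i].conv = conv;
      return n;
    }
  /* Shift the tail up by one, starting from the last valid element.  */
  for (j = n - 1; j >= i; j--)
    list[j + 1] = list[j];
  list[i].unit = unit;
  list[i].conv = conv;
  return n + 1;
}
```

Note that the shift starts at n - 1, the last valid element; starting at n
would copy one element from beyond the valid part of the array.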

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-10-01 Thread Gabriel Dos Reis

On Mon, Oct 1, 2012 at 1:27 PM, Michael Meissner
 wrote:
> On Mon, Oct 01, 2012 at 02:02:26PM -0400, Michael Meissner wrote:
>> Your change on September 30th, breaks the powerpc port because the
>> REPORT_DETAILS value in the enumeration is no longer there, and the
>> rs6000_density_test function was using that.  Please in the future, when you
>> are making global changes, grep for uses of enum values in all of the machine
>> dependent directories so we can avoid breakage like this.
>
> Also, in looking at the changes, given we are already up to 28 TDF_ flags, I
> would recommend immediately adding a new type that is the TDF flagword type.
> Thus it will be a lot simpler when we add 4 more TDF flags and have to change
> the type from int to HOST_WIDE_INT.

Agreed that we need an abstraction here.
-- Gaby

Re: RFC: LRA for x86/x86-64 [4/9]

2012-10-01 Thread Richard Sandiford

Thanks a lot for doing this.  When you finally get to the stage of
"rm reload.c reload1.c", please do it in a screen session and save
the log for posterity.

Vladimir Makarov  writes:
> +/* Return register bank of given hard regno for the current target.  */
> +DEFHOOK
> +(register_bank,
> + "A target hook which returns the register bank number to which the\
> +  register @var{hard_regno} belongs to.  The smaller the number, the\
> +  more preferable the hard register usage (when all other conditions are\
> +  the same).  This hook can be used to prefer some hard register over\
> +  others in LRA.  For example, some x86-64 register usage needs\
> +  additional prefix which makes instructions longer.  The hook can\
> +  return bigger bank number for such registers make them less favorable\
> +  and as result making the generated code smaller.\
> +  \
> +  The default version of this target hook returns always zero.",
> + int, (int),
> + default_register_bank)

This is a horribly bikeshed-level comment, sorry, but I wonder if
something like "register_priority" would be better.  Register classes
are in some ways an extension of register banks, so it wasn't obvious
from the name why we needed both.
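
To make the "priority" reading concrete, a tiny sketch of such a hook.  The
function name and the register numbering are hypothetical: it assumes hard
registers 0-7 are the legacy x86 registers and 8-15 are the ones that need
a REX prefix in 64-bit mode.

```c
/* Smaller return value means the register is preferred, all other
   conditions being equal.  Numbering is an assumption of this sketch,
   not the x86 port's actual hard register layout.  */
static int
sketch_register_priority (int hard_regno)
{
  /* Registers that cost an extra prefix byte get a worse priority.  */
  return (hard_regno >= 8 && hard_regno <= 15) ? 1 : 0;
}
```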

> +/* Return true if maximal address displacement can be different.  */
> +DEFHOOK
> +(different_addr_displacement_p,
> + "A target hook which returns true if an address with the same structure\
> +  can have different maximal legitimate displacement.  For example, the\
> +  displacement can depend on memory mode or on operand combinations in\
> +  the insn.\
> +  \
> +  The default version of this target hook returns always false.",
> + bool, (void),
> + default_different_addr_displacement_p)

If I read the patch correctly, this is only used in:

+   if (lra_reg_spill_p || targetm.different_addr_displacement_p ())
+ lra_set_used_insn_alternative (insn, -1);

and so we keep the current alternative when neither spill_class_mode
nor different_addr_displacement_p is defined.  How many targets on the
LRA branch are like that?  I would have expected most targets with limited
address displacements would have to return true for the above hook,
because multiword loads and stores typically have to be split into word
loads and stores.  Same goes for strict-alignment targets, where wider
modes often have slightly lower maximal displacements.

E.g. for MIPS, SImode loads and stores have a displacement range of
[-32768, 32764], but DImode loads and stores only accept [-32768, 32760].
So the maximal displacement depends on mode, even though the instruction set
is pretty regular.
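
The arithmetic behind those two ranges can be sketched as follows.  This
assumes a 32-bit target where a DImode access is split into two word
accesses and where each word access uses a word-aligned offset within the
signed 16-bit field; the names are made up for the example.

```c
enum
{
  SKETCH_WORD = 4,
  /* Largest word-aligned offset that fits in a signed 16-bit field.  */
  SKETCH_MAX_WORD_DISP = 32767 & ~(SKETCH_WORD - 1)
};

/* Largest displacement usable for an access of MODE_SIZE bytes: the
   last word touched lives at disp + MODE_SIZE - SKETCH_WORD and must
   itself still be addressable.  */
static int
sketch_max_disp (int mode_size)
{
  int extra = mode_size > SKETCH_WORD ? mode_size - SKETCH_WORD : 0;
  return SKETCH_MAX_WORD_DISP - extra;
}
```

With these assumptions sketch_max_disp (4) gives 32764 and
sketch_max_disp (8) gives 32760.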

Targets with full address-size displacements can use the default false return,
but it looks like the x86 port defines spill_class_mode instead, so AIUI
the value isn't really tested on Core i7.  What's the impact of that compared
to the other x86 targets that don't set X86_TUNE_GENERAL_REGS_SSE_SPILL?
Is LRA just quicker for them, or will it make different decisions
(compared to Core i7) even for non-SSE insns?

> +/* Determine class of registers which could be used for spilled
> +   pseudos instead of memory.  */
> +DEFHOOK
> +(spill_class,
> + "This hook defines a class of registers which could be used for spilled pseudos\
> +  of given class instead of memory",
> + reg_class_t, (reg_class_t),
> + NULL)

Should probably say that NO_REGS means "none".

> +/* Determine mode for spilling pseudos into registers instead of memory.  */
> +DEFHOOK
> +(spill_class_mode,
> + "This hook defines mode in which a pseudo of given mode and of the first\
> +  register class can be spilled into the second register class",
> + enum machine_mode, (reg_class_t, reg_class_t, enum machine_mode),
> + NULL)

It looks like the only use is in:

+ || (targetm.spill_class_mode (rclass, spill_class,
+   PSEUDO_REGNO_MODE (regno))
+ != PSEUDO_REGNO_MODE (regno))

So would it make sense to have a single hook like:

/* Determine mode for spilling pseudos into registers instead of memory.  */
DEFHOOK
(spill_class,
 "This hook defines a class of registers which could be used for spilling\
  pseudos of the given mode and class, or @code{NO_REGS} if only memory\
  should be used.  Not defining this hook is equivalent to returning\
  @code{NO_REGS} for all inputs."
 reg_class_t, (reg_class_t, enum machine_mode),
 NULL)

?  It means that setup_reg_spill_flag needs a class-x-mode walk, but
(bad excuse ahoy) we have plenty of those already.  If we really wanted
to avoid the extra loop, we could make VOIDmode mean "any mode",
although that does make the interface a bit more clunky.
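
A rough sketch of how the single-hook variant and the class-x-mode walk
could fit together.  All enum values and function names here are stand-ins
for illustration, not the real GCC definitions, and the choice of spilling
general registers into SSE registers is only an example policy.

```c
/* Stand-ins for reg_class_t and enum machine_mode.  */
enum sketch_reg_class { SK_NO_REGS, SK_GENERAL_REGS, SK_SSE_REGS };
enum sketch_mode { SK_SImode, SK_DImode, SK_TImode };

/* Combined hook: class to spill into for pseudos of the given class
   and mode, or SK_NO_REGS if only memory should be used.  */
static enum sketch_reg_class
sketch_spill_class (enum sketch_reg_class rclass, enum sketch_mode mode)
{
  if (rclass == SK_GENERAL_REGS && (mode == SK_SImode || mode == SK_DImode))
    return SK_SSE_REGS;
  return SK_NO_REGS;
}

/* The class-x-mode walk: nonzero if any combination can be spilled
   into registers at all.  */
static int
sketch_any_reg_spill_p (void)
{
  int c, m;

  for (c = SK_NO_REGS; c <= SK_SSE_REGS; c++)
    for (m = SK_SImode; m <= SK_TImode; m++)
      if (sketch_spill_class ((enum sketch_reg_class) c,
                              (enum sketch_mode) m) != SK_NO_REGS)
        return 1;
  return 0;
}
```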

Richard

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Ian Lance Taylor

On Mon, Oct 1, 2012 at 10:51 AM, Vladimir Makarov  wrote:
>
> When I proposed merge LRA to gcc4.8, I had in mind that:
>   o moving most changes from LRA branch will help LRA maintenance on the
> branch and I'll have more time to work on other targets and problems.
>   o the earlier we start the transition, the better it will be for LRA
> because LRA on the trunk will have more feedback and better testing.
>
> I've chosen x86/x86-64 for this because I am confident in this port.  On
> majority of tests, it generates faster, smaller code (even for these two
> extreme tests it generates 15% smaller code) for less time.  IMO, the slow
> compilation of the extreme tests are much less important than what I've just
> mentioned.
>
> But because I got clear objections from at least two people and no clear
> support for the LRA inclusion (there were just no objections to include it),
> I will not insist on LRA merge now.

I believe that we should proceed with the LRA merge as Vlad has
proposed, and treat the compilation time slowdowns on specific test
cases as bugs to be addressed.

Clearly these slowdowns are not good.  However, requiring significant
work like LRA to be better or equal to the current code in every
single case is making the perfect the enemy of the good.  We must
weigh the benefits and drawbacks, not require that there be no
drawbacks at all.  In this case I believe that the benefits of LRA
significantly outweigh the drawbacks.

Steven is correct in saying that there is a tendency to move on and
never address GCC bugs.  However, there is also a counter-vailing
tendency to fix GCC bugs.  Anyhow I'm certainly not saying that in all
cases it's OK to accept a merge with regressions; I'm saying that in
this specific case it is OK.

(I say all this based on Vlad's descriptions, I have not actually
looked at the patches.)

Ian

Re: RFC: LRA for x86/x86-64 [4/9]

2012-10-01 Thread Paul_Koning

On Oct 1, 2012, at 2:51 PM, Richard Sandiford wrote:

> ...
> E.g. for MIPS, SImode loads and stores have a displacement range of
> [-32768, 32764], but DImode loads and stores only accept [-32768, 32760].
> So the maximal displacement depends on mode, even though the instruction set
> is pretty regular.

It may be that the case doesn't arise in code GCC generates, but I don't think 
that's true.  The offset field is always a 2's complement 16 bit integer, hence 
in the range -32768..32767.  The alignment required in loading multibyte data 
with aligned load/store instructions applies to the final address, not the 
offset.  For example, if R1 contains 1, then LD r2,32767(r1) will work.

paul

Re: [Patch contrib] check_GNU_style: remove tmp file

2012-10-01 Thread Ian Lance Taylor

On Mon, Sep 10, 2012 at 5:23 AM, Christophe Lyon
 wrote:
>
> Good point. Here is a new version, catching the same signals as warn_summary.

This is OK.

Thanks.

(Minor note: it's easier if you put the ChangeLog entry in the body of the
message rather than in the patch.)

Ian

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread David Miller

From: Ian Lance Taylor 
Date: Mon, 1 Oct 2012 11:55:56 -0700

> Steven is correct in saying that there is a tendency to move on and
> never address GCC bugs.  However, there is also a counter-vailing
> tendency to fix GCC bugs.  Anyhow I'm certainly not saying that in all
> cases it's OK to accept a merge with regressions; I'm saying that in
> this specific case it is OK.

I think it's more important in this case to recognize Steven's real
point, which is that for an identical situation (IRA), and with an
identical patch author, we had similar bugs.  They were promised to be
worked on, and yet some of those regressions are still very much with
us.

The likelihood of a repeat is therefore very real.

I really don't have a lot of confidence given what has happened in
the past.  I also don't understand what's so evil about sorting this
out on a branch.  It's the perfect carrot to get the compile time
regressions fixed.

Re: PR 53889: Add __gthread_recursive_mutex_destroy

2012-10-01 Thread Ian Lance Taylor

On Sun, Sep 30, 2012 at 11:41 AM, Jonathan Wakely  wrote:
> There is no __gthread_recursive_mutex_destroy function in the gthreads API.
>
> Trying to use __gthread_mutex_destroy fails to compile on platforms
> where the mutex
> types are different. To avoid resource leaks libstdc++ needs to hack
> around the missing function with overloaded functions and SFINAE
> tricks to detect how a recursive mutex can be destroyed.
>
> This patch extends the gthreads API to include
> __gthread_recursive_mutex_destroy, defining it for each gthread model,
> and removing the hacks from libstdc++.

> +return rtems_gxx_mutex_destroy( __mutex );

Space before '(', not space after.

Doing anything else here is going to be painful, but this assumes that
RTEMS uses the same representation for non-recursive and recursive
mutexes.  That is currently true, but it deserves a comment.


> --- a/libgcc/config/i386/gthr-win32.h
> +++ b/libgcc/config/i386/gthr-win32.h
> +static inline void
> +__gthread_recursive_mutex_destroy (__gthread_recursive_mutex_t *mutex)
> +{
> +  __gthread_mutex_t __mutex2;
> +  __mutex2.sema = mutex->sema;
> +  __gthr_win32_mutex_destroy (&__mutex2);
> +}

I think it would be better to put this in
libgcc/config/i386/gthr-win32.c, like the other functions.  Then you
can just call CloseHandle.

> --- a/libgcc/config/mips/gthr-mipssde.h
> +++ b/libgcc/config/mips/gthr-mipssde.h
>
> +static inline int
> +__gthread_recursive_mutex_destroy (__gthread_recursive_mutex_t *__mutex)
> +{
> +  return __gthread_mutex_destroy(__mutex);
> +}

Will this even compile?  It doesn't look like it.

Ian

[SH] PR 51244 - Handle T bit -> 0x7FFFFFFF / 0x80000000

2012-10-01 Thread Oleg Endo

Hello,

This handles the case where the T bit is stored to a reg as the value
0x7FFFFFFF or 0x80000000.
Tested on rev 191894 with
make -k check RUNTESTFLAGS="--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

and no new failures.
OK?

Cheers,
Oleg

gcc/ChangeLog:

PR target/51244
* config/sh/sh.md (*mov_t_msb_neg): New insn and two 
accompanying unnamed split patterns.

testsuite/ChangeLog:

PR target/51244
* gcc.target/sh/pr51244-12.c: New.
Index: gcc/testsuite/gcc.target/sh/pr51244-12.c
===
--- gcc/testsuite/gcc.target/sh/pr51244-12.c	(revision 0)
+++ gcc/testsuite/gcc.target/sh/pr51244-12.c	(revision 0)
@@ -0,0 +1,68 @@
+/* Check that the negc instruction is generated as expected for the cases
+   below.  If we see a movrt or #-1 negc sequence it means that the pattern
+   which handles the inverted case does not work properly.  */
+/* { dg-do compile { target "sh*-*-*" } } */
+/* { dg-options "-O1" } */
+/* { dg-skip-if "" { "sh*-*-*" } { "-m5*" } { "" } } */
+/* { dg-final { scan-assembler-times "negc" 10 } } */
+/* { dg-final { scan-assembler-not "movrt|#-1|add|sub" } } */
+
+int
+test00 (int a, int b, int* x)
+{
+  return (a == b) ? 0x7FFFFFFF : 0x80000000;
+}
+
+int
+test00_inv (int a, int b)
+{
+  return (a != b) ? 0x80000000 : 0x7FFFFFFF;
+}
+
+int
+test01 (int a, int b)
+{
+  return (a >= b) ? 0x7FFFFFFF : 0x80000000;
+}
+
+int
+test01_inv (int a, int b)
+{
+  return (a < b) ? 0x80000000 : 0x7FFFFFFF;
+}
+
+int
+test02 (int a, int b)
+{
+  return (a > b) ? 0x7FFFFFFF : 0x80000000;
+}
+
+int
+test02_inv (int a, int b)
+{
+  return (a <= b) ? 0x80000000 : 0x7FFFFFFF;
+}
+
+int
+test03 (int a, int b)
+{
+  return ((a & b) == 0) ? 0x7FFFFFFF : 0x80000000;
+}
+
+int
+test03_inv (int a, int b)
+{
+  return ((a & b) != 0) ? 0x80000000 : 0x7FFFFFFF;
+}
+
+int
+test04 (int a)
+{
+  return ((a & 0x55) == 0) ? 0x7FFFFFFF : 0x80000000;
+}
+
+int
+test04_inv (int a)
+{
+  return ((a & 0x55) != 0) ? 0x80000000 : 0x7FFFFFFF;
+}
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 191894)
+++ gcc/config/sh/sh.md	(working copy)
@@ -10769,6 +10769,51 @@
 	(set (reg:SI T_REG) (const_int 1))
 	(use (match_dup 2))])])
 
+;; Use negc to store the T bit in a MSB of a reg in the following way:
+;;	T = 0: 0x80000000 -> reg
+;;	T = 1: 0x7FFFFFFF -> reg
+;; This works because 0 - 0x80000000 = 0x80000000.
+(define_insn_and_split "*mov_t_msb_neg"
+  [(set (match_operand:SI 0 "arith_reg_dest")
+	(minus:SI (const_int -2147483648)  ;; 0x80000000
+		  (match_operand 1 "t_reg_operand")))
+   (clobber (reg:SI T_REG))]
+  "TARGET_SH1"
+  "#"
+  "&& can_create_pseudo_p ()"
+  [(set (match_dup 2) (const_int -2147483648))
+   (parallel [(set (match_dup 0) (minus:SI (neg:SI (match_dup 2))
+ (reg:SI T_REG)))
+	  (clobber (reg:SI T_REG))])]
+{
+  operands[2] = gen_reg_rtx (SImode);
+})
+
+;; These are essentially the same as above, but with the inverted T bit.
+;; Combine recognizes the split patterns, but does not take them sometimes
+;; if the T_REG clobber is specified.  Instead it tries to split out the
+;; T bit negation.  Since these splits are supposed to be taken only by
+;; combine, it will see the T_REG clobber of the *mov_t_msb_neg insn, so this
+;; should be fine.
+(define_split
+  [(set (match_operand:SI 0 "arith_reg_dest")
+	(plus:SI (match_operand 1 "negt_reg_operand")
+		 (const_int 2147483647)))]  ;; 0x7fffffff
+  "TARGET_SH1 && can_create_pseudo_p ()"
+  [(parallel [(set (match_dup 0)
+		   (minus:SI (const_int -2147483648) (reg:SI T_REG)))
+	  (clobber (reg:SI T_REG))])])
+
+(define_split
+  [(set (match_operand:SI 0 "arith_reg_dest")
+	(if_then_else:SI (match_operand 1 "t_reg_operand")
+			 (const_int 2147483647)  ;; 0x7fffffff
+			 (const_int -2147483648)))]  ;; 0x80000000
+  "TARGET_SH1 && can_create_pseudo_p ()"
+  [(parallel [(set (match_dup 0)
+		   (minus:SI (const_int -2147483648) (reg:SI T_REG)))
+	  (clobber (reg:SI T_REG))])])
+
 ;; The *negnegt pattern helps the combine pass to figure out how to fold 
 ;; an explicit double T bit negation.
 (define_insn_and_split "*negnegt"
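
The arithmetic the new pattern relies on can be checked with a small C model
of the data result of "negc Rm,Rn" (Rn = 0 - Rm - T in 32-bit wrapping
arithmetic).  The function name is made up, and the model ignores the
borrow that the real instruction writes back to T.

```c
#include <stdint.h>

/* Data result of SH "negc Rm,Rn": 0 - Rm - T, modulo 2^32.  */
static uint32_t
sketch_negc (uint32_t rm, unsigned t)
{
  return (uint32_t) 0 - rm - (uint32_t) (t & 1);
}
```

With Rm = 0x80000000 this yields 0x80000000 for T = 0 and 0x7FFFFFFF for
T = 1, which is the (minus:SI (const_int -2147483648) (reg:SI T_REG))
expression used in the patterns.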

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Vladimir Makarov

On 10/01/2012 03:19 PM, David Miller wrote:

> From: Ian Lance Taylor 
> Date: Mon, 1 Oct 2012 11:55:56 -0700
>
>> Steven is correct in saying that there is a tendency to move on and
>> never address GCC bugs.  However, there is also a counter-vailing
>> tendency to fix GCC bugs.  Anyhow I'm certainly not saying that in all
>> cases it's OK to accept a merge with regressions; I'm saying that in
>> this specific case it is OK.
>
> I think it's more important in this case to recognize Steven's real
> point, which is that for an identical situation (IRA), and with an
> identical patch author, we had similar bugs.  They were promised to be
> worked on, and yet some of those regressions are still very much with
> us.

That is not true.  I worked on many compile-time regression bugs.  I
remember one serious degradation of compilation time on
all_cp2k_gfortran.f90.  I solved the problem and made IRA work faster
and generate much better code than the old RA.

http://blog.gmane.org/gmane.comp.gcc.patches/month=20080501/page=15

About the other two PRs mentioned by Steven:

PR26854.  I worked on this bug even when IRA was on the branch and again
made GCC with IRA 5% faster on this test than GCC with the old RA.

PR 54146 is 3 months old.  There was a lot of work on other optimizations
before IRA became important.  That happened only 2 months ago.  I have had
no time to work on it yet, but I am going to.

People sometimes see that RA takes a lot of compilation time, but that is
in the nature of RA.  I'd recommend first checking how the old RA
behaves and only then calling it a degradation.

And please, don't listen to just one side.

> The likelihood of a repeat is therefore very real.
>
> I really don't have a lot of confidence given what has happened in
> the past.  I also don't understand what's so evil about sorting this
> out on a branch.  It's the perfect carrot to get the compile time
> regressions fixed.

Wrong assumptions result in wrong conclusions.

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher

On Mon, Oct 1, 2012 at 9:19 PM, David Miller  wrote:
> From: Ian Lance Taylor 
> Date: Mon, 1 Oct 2012 11:55:56 -0700
>
>> Steven is correct in saying that there is a tendency to move on and
>> never address GCC bugs.  However, there is also a counter-vailing
>> tendency to fix GCC bugs.  Anyhow I'm certainly not saying that in all
>> cases it's OK to accept a merge with regressions; I'm saying that in
>> this specific case it is OK.
>
> I think it's more important in this case to recognize Steven's real
> point, which is that for an identical situation (IRA), and with an
> identical patch author, we had similar bugs.  They were promised to be
> worked on, and yet some of those regressions are still very much with
> us.

My point is not to single out Vlad here! I don't think this patch
author is any worse or better than the next one. There are enough other
examples, e.g. VRP is from other contributors and it has had a
few horrible pieces of code from the start that just don't get
addressed, or var-tracking for which cleaning up a few serious compile
time problems will be a Big Job for stage3. It's the general pattern
that worries me.

Ciao!
Steven

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher

On Mon, Oct 1, 2012 at 9:51 PM, Vladimir Makarov  wrote:
>> I think it's more important in this case to recognize Steven's real
>> point, which is that for an identical situation (IRA), and with an
>> identical patch author, we had similar bugs.  They were promised to be
>> worked on, and yet some of those regressions are still very much with
>> us.
>
> That is not true.  I worked on many compile-time regression bugs.  I remember
> one serious degradation of compilation time on all_cp2k_gfortran.f90.  I
> solved the problem and made IRA work faster and generate much better
> code than the old RA.
>
> http://blog.gmane.org/gmane.comp.gcc.patches/month=20080501/page=15
>
> About the other two PRs mentioned by Steven:
>
> PR26854.  I worked on this bug even when IRA was on the branch and again
> made GCC with IRA 5% faster on this test than GCC with the old RA.
>
> PR 54146 is 3 months old.  There was a lot of work on other optimizations
> before IRA became important.  That happened only 2 months ago.  I have had
> no time to work on it yet, but I am going to.

This is also not quite true, see PR37448, which shows the same problems as
the test case for PR54146.

I just think scalability is a very important issue. If some pass or
algorithm scales badly on some measure, then users _will_ run into that
at some point and report bugs about it (if you're lucky enough to have
a user patient enough to sit out the long compile time :-) ). Also,
good scalability opens up opportunities. For example, historically GCC
has been conservative on inlining heuristics to avoid compile time
explosions. I think it's better to address the causes of that
explosion and to avoid introducing new potential bottlenecks.

> People sometimes see that RA takes a lot of compilation time but it is in
> the nature of RA.  I'd recommend first to check how the old RA behaves and
> then call it a degradation.

There's no question that RA is one of the hardest problems the
compiler has to solve, being NP-complete and all that. I like LRA's
iterative approach, but if you know you're going to solve a hard
problem with a number of potentially expensive iterations, there's even
more reason to make scalability a design goal!

As I said earlier in this thread, I was really looking forward to IRA
at the time you worked on it, because it is supposed to be a regional
allocator and I had expected that to mean it could, well, allocate
per-region which is usually very helpful for scalability (partition
your function and insert compensation code on strategically picked
region boundaries). But that's not what IRA has turned out to be.
(Instead, its regional nature is one of the reasons for its
scalability problems.)  IRA is certainly not worse than old global.c
in very many ways, and LRA looks like a well thought-through and
welcome replacement of old reload. But scalability is an issue in the
design of IRA and LRA looks to be the same in that regard.

Ciao!
Steven

Re: RFC: LRA for x86/x86-64 [4/9]

2012-10-01 Thread Jeff Law


On 09/27/2012 07:44 PM, Vladimir Makarov wrote:

> On 09/27/2012 08:07 PM, Joseph S. Myers wrote:
>> On Thu, 27 Sep 2012, Vladimir Makarov wrote:
>>
>>> Hook spill_class returns a value of enum reg_class which is defined in
>>> target-depend include file.
>>
>> That's what reg_class_t is for: avoiding enum reg_class in hook
>> interfaces.
>
> Ok.  Thanks for pointing this out.
>
> Here is the modified patch.
>
> 2012-09-27  Vladimir Makarov  
>
>  * targhooks.h (default_lra_p): Declare.
>  (default_register_bank): Ditto.
>  (default_different_addr_displacement_p): Ditto.
>  * targhooks.c (default_lra_p): New function.
>  (default_register_bank): Ditto.
>  (default_different_addr_displacement_p): Ditto.
>  * target.def (lra_p): New hook.
>  (register_bank): Ditto.
>  (different_addr_displacement_p): Ditto.
>  (spill_class, spill_class_mode): New hooks.
>  * doc/tm.texi.in: Add TARGET_LRA_P, TARGET_REGISTER_BANK,
>  TARGET_DIFFERENT_ADDR_DISPLACEMENT_P, TARGET_SPILL_CLASS, and
>  TARGET_SPILL_CLASS_MODE.
>  * doc/tm.texi: Update.
>
> The change also requires some modification in the 9th patch. The
> ChangeLog for the patch should be the same as before.

This looks fine to me.

jeff

[PATCH, libbacktrace]: Compile with -fasynchronous-unwind-tables

2012-10-01 Thread Uros Bizjak

Hello!

Without -fasynchronous-unwind-tables, FDE is not generated for
backtrace_full and backtrace_simple wrappers. Without FDE, unwinding
terminates at these functions.

The attached patch fixes this problem by adding
-fasynchronous-unwind-tables, thereby forcing FDEs for all
functions. With this change, btest passes OK, and the previously failing
log and runtime/pprof tests from the libgo testsuite also pass OK.

BTW: It would be enough to compile only backtrace.c and simple.c with
-fasynchronous-unwind-tables, since critical wrapper functions live
here.

2012-10-01  Uros Bizjak  

PR other/54761
* Makefile.am (AM_CFLAGS): Add -fasynchronous-unwind-tables.
* Makefile.in: Regenerate.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32} and alphaev68-pc-linux-gnu (where it fixes all mentioned
unwinding failures).

OK for mainline?

Uros.
Index: ChangeLog
===
--- ChangeLog   (revision 191932)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2012-10-01  Uros Bizjak  
+
+   PR other/54761
+   * Makefile.am (AM_CFLAGS): Add -fasynchronous-unwind-tables.
+   * Makefile.in: Regenerate.
+
 2012-09-29  Ian Lance Taylor  
 
PR other/54749
Index: Makefile.am
===
--- Makefile.am (revision 191932)
+++ Makefile.am (working copy)
@@ -34,7 +34,7 @@
 AM_CPPFLAGS = -I $(top_srcdir)/../include -I $(top_srcdir)/../libgcc \
-I ../libgcc -I ../gcc/include -I $(MULTIBUILDTOP)../../gcc/include
 
-AM_CFLAGS = $(WARN_FLAGS) $(PIC_FLAG)
+AM_CFLAGS = $(WARN_FLAGS) $(PIC_FLAG) -fasynchronous-unwind-tables
 
 noinst_LTLIBRARIES = libbacktrace.la
 
Index: Makefile.in
===
--- Makefile.in (revision 191932)
+++ Makefile.in (working copy)
@@ -253,7 +253,7 @@
 AM_CPPFLAGS = -I $(top_srcdir)/../include -I $(top_srcdir)/../libgcc \
-I ../libgcc -I ../gcc/include -I $(MULTIBUILDTOP)../../gcc/include
 
-AM_CFLAGS = $(WARN_FLAGS) $(PIC_FLAG)
+AM_CFLAGS = $(WARN_FLAGS) $(PIC_FLAG) -fasynchronous-unwind-tables
 noinst_LTLIBRARIES = libbacktrace.la
 libbacktrace_la_SOURCES = \
backtrace.h \

Re: [patch][lra] Use XNEWVEC and friends instead of xmalloc/xrealloc, and add some timevars

2012-10-01 Thread Mike Stump

On Oct 1, 2012, at 4:05 AM, Steven Bosscher  wrote:
> This patch uses the libiberty new-like operators instead of using
> xmalloc/xrealloc.

So, in headers that can be used by C compiles, we should use the abstraction…  
Should we prefer it for translation units that are C++ only?
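
For context, a self-contained sketch of what the new-like allocation macros
boil down to.  The sk_ and SK_ names are stand-ins written for this example;
the real XNEWVEC and XRESIZEVEC live in libiberty.h and sit on top of
xmalloc and xrealloc.

```c
#include <stdlib.h>

/* Stand-ins for xmalloc/xrealloc: abort instead of returning NULL.  */
static void *
sk_xmalloc (size_t n)
{
  void *p = malloc (n ? n : 1);
  if (!p)
    abort ();
  return p;
}

static void *
sk_xrealloc (void *q, size_t n)
{
  void *p = realloc (q, n ? n : 1);
  if (!p)
    abort ();
  return p;
}

/* The macros carry the element type, so the cast and the sizeof are
   written once and cannot disagree with each other.  */
#define SK_NEWVEC(T, N) \
  ((T *) sk_xmalloc (sizeof (T) * (N)))
#define SK_RESIZEVEC(T, P, N) \
  ((T *) sk_xrealloc ((void *) (P), sizeof (T) * (N)))
```

The explicit cast is what keeps the same code valid when the file is
compiled as C++, where void * does not convert implicitly.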

[PATCH] Fix powerpc breakage, was: Add option for dumping to stderr (issue6190057)

2012-10-01 Thread Michael Meissner

I tracked down some of the other code that previously used REPORT_DETAILS, and
MSG_NOTE is the new way to do the same thing.  This bootstraps and no
unexpected errors occur during make check.  Is it ok to install?

2012-10-01  Michael Meissner  

* config/rs6000/rs6000.c (toplevel): Include dumpfile.h.
(rs6000_density_test): Rework to accommodate 09-30 change by Sharad
Singhai.

* config/rs6000/t-rs6000 (rs6000.o): Add dumpfile.h dependency.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 191932)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -58,6 +58,7 @@
 #include "tm-constrs.h"
 #include "opts.h"
 #include "tree-vectorizer.h"
+#include "dumpfile.h"
 #if TARGET_XCOFF
 #include "xcoffout.h"  /* get declarations of xcoff_*_section_name */
 #endif
@@ -3518,11 +3519,11 @@ rs6000_density_test (rs6000_cost_data *d
   && vec_cost + not_vec_cost > DENSITY_SIZE_THRESHOLD)
 {
   data->cost[vect_body] = vec_cost * (100 + DENSITY_PENALTY) / 100;
-  if (vect_print_dump_info (REPORT_DETAILS))
-   fprintf (vect_dump,
-"density %d%%, cost %d exceeds threshold, penalizing "
-"loop body cost by %d%%", density_pct, 
-vec_cost + not_vec_cost, DENSITY_PENALTY);
+  if (dump_kind_p (MSG_NOTE))
+   dump_printf_loc (MSG_NOTE, vect_location,
+"density %d%%, cost %d exceeds threshold, penalizing "
+"loop body cost by %d%%", density_pct,
+vec_cost + not_vec_cost, DENSITY_PENALTY);
 }
 }
 
Index: gcc/config/rs6000/t-rs6000
===
--- gcc/config/rs6000/t-rs6000  (revision 191932)
+++ gcc/config/rs6000/t-rs6000  (working copy)
@@ -26,7 +26,7 @@ rs6000.o: $(CONFIG_H) $(SYSTEM_H) corety
   $(OBSTACK_H) $(TREE_H) $(EXPR_H) $(OPTABS_H) except.h function.h \
   output.h dbxout.h $(BASIC_BLOCK_H) toplev.h $(GGC_H) $(HASHTAB_H) \
   $(TM_P_H) $(TARGET_H) $(TARGET_DEF_H) langhooks.h reload.h gt-rs6000.h \
-  cfgloop.h $(OPTS_H) $(COMMON_TARGET_H)
+  cfgloop.h $(OPTS_H) $(COMMON_TARGET_H) dumpfile.h
 
 rs6000-c.o: $(srcdir)/config/rs6000/rs6000-c.c \
 $(srcdir)/config/rs6000/rs6000-protos.h \

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899

Re: [PATCH] Fix powerpc breakage, was: Add option for dumping to stderr (issue6190057)

2012-10-01 Thread Gabriel Dos Reis

On Mon, Oct 1, 2012 at 4:37 PM, Michael Meissner
 wrote:
> I tracked down some of the other code that previously used REPORT_DETAILS, and
> MSG_NOTE is the new way to do the same thing.  This bootstraps and no
> unexpected errors occur during make check.  Is it ok to install?

yes -- qualifies as "obvious".  Thanks!


>
> 2012-10-01  Michael Meissner  
>
> * config/rs6000/rs6000.c (toplevel): Include dumpfile.h.
> (rs6000_density_test): Rework to accommodate 09-30 change by Sharad
> Singhai.
>
> * config/rs6000/t-rs6000 (rs6000.o): Add dumpfile.h dependency.
>
> Index: gcc/config/rs6000/rs6000.c
> ===
> --- gcc/config/rs6000/rs6000.c  (revision 191932)
> +++ gcc/config/rs6000/rs6000.c  (working copy)
> @@ -58,6 +58,7 @@
>  #include "tm-constrs.h"
>  #include "opts.h"
>  #include "tree-vectorizer.h"
> +#include "dumpfile.h"
>  #if TARGET_XCOFF
>  #include "xcoffout.h"  /* get declarations of xcoff_*_section_name */
>  #endif
> @@ -3518,11 +3519,11 @@ rs6000_density_test (rs6000_cost_data *d
>&& vec_cost + not_vec_cost > DENSITY_SIZE_THRESHOLD)
>  {
>data->cost[vect_body] = vec_cost * (100 + DENSITY_PENALTY) / 100;
> -  if (vect_print_dump_info (REPORT_DETAILS))
> -   fprintf (vect_dump,
> -"density %d%%, cost %d exceeds threshold, penalizing "
> -"loop body cost by %d%%", density_pct,
> -vec_cost + not_vec_cost, DENSITY_PENALTY);
> +  if (dump_kind_p (MSG_NOTE))
> +   dump_printf_loc (MSG_NOTE, vect_location,
> +"density %d%%, cost %d exceeds threshold, penalizing "
> +"loop body cost by %d%%", density_pct,
> +vec_cost + not_vec_cost, DENSITY_PENALTY);
>  }
>  }
>
> Index: gcc/config/rs6000/t-rs6000
> ===
> --- gcc/config/rs6000/t-rs6000  (revision 191932)
> +++ gcc/config/rs6000/t-rs6000  (working copy)
> @@ -26,7 +26,7 @@ rs6000.o: $(CONFIG_H) $(SYSTEM_H) corety
>$(OBSTACK_H) $(TREE_H) $(EXPR_H) $(OPTABS_H) except.h function.h \
>output.h dbxout.h $(BASIC_BLOCK_H) toplev.h $(GGC_H) $(HASHTAB_H) \
>$(TM_P_H) $(TARGET_H) $(TARGET_DEF_H) langhooks.h reload.h gt-rs6000.h \
> -  cfgloop.h $(OPTS_H) $(COMMON_TARGET_H)
> +  cfgloop.h $(OPTS_H) $(COMMON_TARGET_H) dumpfile.h
>
>  rs6000-c.o: $(srcdir)/config/rs6000/rs6000-c.c \
>  $(srcdir)/config/rs6000/rs6000-protos.h \
>
> --
> Michael Meissner, IBM
> 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
> meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
>

Convert more non-GTY htab_t to hash_table.

2012-10-01 Thread Lawrence Crowl

Change more non-GTY hash tables to use the new type-safe template hash table.
Constify member function parameters that can be const.
Correct a couple of expressions in formerly uninstantiated templates.

The new code is 0.362% faster in bootstrap, with a 99.5% confidence of
being faster.

Tested on x86-64.

Okay for trunk?


Index: gcc/java/ChangeLog

2012-10-01  Lawrence Crowl  

* Make-lang.in (JAVA_OBJS): Add dependence on hash-table.o.
(JCFDUMP_OBJS): Add dependence on hash-table.o.
(jcf-io.o): Add dependence on hash-table.h.
* jcf-io.c (memoized_class_lookups): Change to use type-safe hash table.

Index: gcc/c/ChangeLog

2012-10-01  Lawrence Crowl  

* Make-lang.in (c-decl.o): Add dependence on hash-table.h.
* c-decl.c (detect_field_duplicates_hash): Change to new type-safe
hash table.

Index: gcc/objc/ChangeLog

2012-10-01  Lawrence Crowl  

* Make-lang.in (OBJC_OBJS): Add dependence on hash-table.o.
(objc-act.o): Add dependence on hash-table.h.
* objc-act.c (objc_detect_field_duplicates): Change to new type-safe
hash table.

Index: gcc/ChangeLog

2012-10-01  Lawrence Crowl  

* Makefile.in (fold-const.o): Add dependence on hash-table.h.
(dse.o): Likewise.
(cfg.o): Likewise.
* fold-const.c (fold_checksum_tree): Change to new type-safe hash table.
* (print_fold_checksum): Likewise.
* cfg.c (var bb_original): Likewise.
* (var bb_copy): Likewise.
* (var loop_copy): Likewise.
* hash-table.h (template hash_table): Constify parameters for find...
and remove_elt... member functions.
(hash_table::empty) Correct size expression.
(hash_table::clear_slot) Correct deleted entry assignment.
* dse.c (var rtx_group_table): Change to new type-safe hash table.

Index: gcc/cp/ChangeLog

2012-10-01  Lawrence Crowl  

* Make-lang.in (class.o): Add dependence on hash-table.h.
(tree.o): Likewise.
(semantics.o): Likewise.
* class.c (fixed_type_or_null): Change to new type-safe hash table.
* tree.c (verify_stmt_tree): Likewise.
(verify_stmt_tree_r): Likewise.
* semantics.c (struct nrv_data): Likewise.


Index: gcc/java/Make-lang.in
===
--- gcc/java/Make-lang.in   (revision 191941)
+++ gcc/java/Make-lang.in   (working copy)
@@ -83,10 +83,10 @@ JAVA_OBJS = java/class.o java/decl.o jav
   java/zextract.o java/jcf-io.o java/win32-host.o java/jcf-parse.o java/mangle.o \
   java/mangle_name.o java/builtins.o java/resource.o \
   java/jcf-depend.o \
-  java/jcf-path.o java/boehm.o java/java-gimplify.o
+  java/jcf-path.o java/boehm.o java/java-gimplify.o hash-table.o

 JCFDUMP_OBJS = java/jcf-dump.o java/jcf-io.o java/jcf-depend.o java/jcf-path.o \
-   java/win32-host.o java/zextract.o ggc-none.o
+   java/win32-host.o java/zextract.o ggc-none.o hash-table.o

 JVGENMAIN_OBJS = java/jvgenmain.o java/mangle_name.o

@@ -326,7 +326,7 @@ java/java-gimplify.o: java/java-gimplify
 # jcf-io.o needs $(ZLIBINC) added to cflags.
 CFLAGS-java/jcf-io.o += $(ZLIBINC)
 java/jcf-io.o: java/jcf-io.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
-  $(JAVA_TREE_H) java/zipfile.h
+  $(JAVA_TREE_H) java/zipfile.h $(HASH_TABLE_H)

 # jcf-path.o needs a -D.
 CFLAGS-java/jcf-path.o += \
Index: gcc/java/jcf-io.c
===
--- gcc/java/jcf-io.c   (revision 191941)
+++ gcc/java/jcf-io.c   (working copy)
@@ -31,7 +31,7 @@ The Free Software Foundation is independ
 #include "jcf.h"
 #include "tree.h"
 #include "java-tree.h"
-#include "hashtab.h"
+#include "hash-table.h"
 #include 

 #include "zlib.h"
@@ -271,20 +271,34 @@ find_classfile (char *filename, JCF *jcf
   return open_class (filename, jcf, fd, dep_name);
 }

-/* Returns 1 if the CLASSNAME (really a char *) matches the name
-   stored in TABLE_ENTRY (also a char *).  */

-static int
-memoized_class_lookup_eq (const void *table_entry, const void *classname)
+/* Hash table helper.  */
+
+struct charstar_hash : typed_noop_remove <char>
 {
-  return strcmp ((const char *)classname, (const char *)table_entry) == 0;
+  typedef const char T;
+  static inline hashval_t hash (const T *candidate);
+  static inline bool equal (const T *existing, const T *candidate);
+};
+
+inline hashval_t
+charstar_hash::hash (const T *candidate)
+{
+  return htab_hash_string (candidate);
 }

+inline bool
+charstar_hash::equal (const T *existing, const T *candidate)
+{
+  return strcmp (existing, candidate) == 0;
+}
+
+
 /* A hash table keeping track of class names that were not found
during class lookup.  (There is no need to cache the values
associated with names that were found; they are saved in
IDENTIFIER_CLASS_VALUE.)  */
-static htab_t memoized_class_lookups;
+static hash_table <charstar_hash> memoized_class_lookups;

 /* Returns a freshly malloc
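
The descriptor pattern in the jcf-io.c hunk above (an element typedef plus static hash and equal functions, replacing void-pointer callbacks) can be illustrated outside GCC. The following is only a sketch under stated assumptions: cstring_descriptor, descriptor_hash, descriptor_equal and descriptor_set are hypothetical names, std::unordered_set stands in for GCC's hash_table, and FNV-1a stands in for htab_hash_string.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_set>

// Hypothetical descriptor modeled on charstar_hash: the element type plus
// static hash and equal functions, all type-checked at compile time.
struct cstring_descriptor
{
  typedef const char value_type;

  static std::size_t hash (const value_type *candidate)
  {
    // FNV-1a over the bytes of the string; a stand-in for htab_hash_string.
    std::uint64_t h = 14695981039346656037ULL;
    for (; *candidate != '\0'; ++candidate)
      {
        h ^= static_cast<unsigned char> (*candidate);
        h *= 1099511628211ULL;
      }
    return static_cast<std::size_t> (h);
  }

  static bool equal (const value_type *existing, const value_type *candidate)
  {
    return std::strcmp (existing, candidate) == 0;
  }
};

// Adapters so that a standard container can consume such a descriptor.
template <typename Descriptor>
struct descriptor_hash
{
  std::size_t operator() (const typename Descriptor::value_type *p) const
  { return Descriptor::hash (p); }
};

template <typename Descriptor>
struct descriptor_equal
{
  bool operator() (const typename Descriptor::value_type *a,
                   const typename Descriptor::value_type *b) const
  { return Descriptor::equal (a, b); }
};

// A set of C strings compared by content rather than by pointer value.
template <typename Descriptor>
using descriptor_set
  = std::unordered_set<const typename Descriptor::value_type *,
                       descriptor_hash<Descriptor>,
                       descriptor_equal<Descriptor> >;
```

A descriptor_set<cstring_descriptor> then finds a name stored through one buffer when looked up through a different buffer with the same contents, which is the behavior a memoized "class not found" table needs.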

[patch][lra] a few bitmap obstacks for lra-assigns

2012-10-01 Thread Steven Bosscher

Hello,

This eliminates a few large loops in lra-assigns.c. They're not the
most costly loops, but the lifetimes of the bitmaps are well-defined,
and destroying a bitmap obstack is much cheaper than looping over all
bitmaps calling bitmap_clear. The saving is small but you have to
start somewhere...

Bootstrapped lra-branch and tested on x86_64-unknown-linux-gnu, along
with the patch from earlier today. OK for the branch?

Ciao!
Steven


lra_assign_bitobstacks.diff
Description: Binary data
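
Since the attachment is not shown, here is a rough sketch of why releasing a whole allocation region at once can be cheaper than clearing every object in it. It is not GCC's bitmap obstack: word_arena is a hypothetical fixed-capacity bump allocator, used only to show that the release step is a single assignment regardless of how many bitmaps were carved out of it.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bump allocator handing out zeroed blocks of 64-bit words.
class word_arena
{
public:
  explicit word_arena (std::size_t capacity_words)
    : storage_ (capacity_words), used_ (0) {}

  // Returns a zeroed block of WORDS words, or nullptr if the arena is full.
  std::uint64_t *alloc (std::size_t words)
  {
    if (words > storage_.size () - used_)
      return nullptr;
    std::uint64_t *p = storage_.data () + used_;
    std::fill (p, p + words, 0);
    used_ += words;
    return p;
  }

  // Releases every block at once; no per-block work is needed.
  void release_all () { used_ = 0; }

  std::size_t used () const { return used_; }

private:
  std::vector<std::uint64_t> storage_;
  std::size_t used_;
};
```

This only works when every block allocated from the arena dies at the same point, which is the "well-defined lifetime" condition mentioned in the message.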

[PATCH rs6000 testsuite] Fix a couple tests for VSX scalar instructions

2012-10-01 Thread Pat Haugen

This patch fixes a couple failures that occur if the testsuite is run 
with -mvsx and the VSX scalar sqrt instructions are generated. Ok for trunk?


-Pat


testsuite/ChangeLog:
2012-10-01  Pat Haugen 

* gcc.target/powerpc/pr46728-1.c: Accept xssqrtdp.
* gcc.target/powerpc/pr46728-2.c: Likewise.


Index: gcc/testsuite/gcc.target/powerpc/pr46728-1.c
===
--- gcc/testsuite/gcc.target/powerpc/pr46728-1.c(revision 191713)
+++ gcc/testsuite/gcc.target/powerpc/pr46728-1.c(working copy)
@@ -27,5 +27,5 @@ main (int argc, char *argv[])
 }
 
 
-/* { dg-final { scan-assembler-times "fsqrt" 2 { target powerpc*-*-* } } } */
+/* { dg-final { scan-assembler-times "fsqrt|xssqrtdp" 2 { target powerpc*-*-* } } } */
 /* { dg-final { scan-assembler-not "pow" { target powerpc*-*-* } } } */
Index: gcc/testsuite/gcc.target/powerpc/pr46728-2.c
===
--- gcc/testsuite/gcc.target/powerpc/pr46728-2.c(revision 191713)
+++ gcc/testsuite/gcc.target/powerpc/pr46728-2.c(working copy)
@@ -27,5 +27,5 @@ main (int argc, char *argv[])
 }
 
 
-/* { dg-final { scan-assembler-times "fsqrt" 4 { target powerpc*-*-* } } } */
+/* { dg-final { scan-assembler-times "fsqrt|xssqrtdp" 4 { target powerpc*-*-* } } } */
 /* { dg-final { scan-assembler-not "pow" { target powerpc*-*-* } } } */
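
The effect of widening the pattern from "fsqrt" to "fsqrt|xssqrtdp" can be sketched as a count of regular-expression matches over the assembler text. This is an illustration only: count_matches is a hypothetical helper, and it uses std::regex (ECMAScript syntax) rather than the Tcl regular expressions that DejaGnu itself applies.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Counts non-overlapping matches of PATTERN in ASSEMBLY.
std::ptrdiff_t count_matches (const std::string &assembly,
                              const std::string &pattern)
{
  const std::regex re (pattern);
  return std::distance (std::sregex_iterator (assembly.begin (),
                                              assembly.end (), re),
                        std::sregex_iterator ());
}
```

For assembler output that contains one fsqrt and one xssqrtdp, the old pattern counts one match and the alternation counts two, so a test expecting two square roots passes whichever instruction the compiler picks.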

[patch][lra] Comment typo fix

2012-10-01 Thread Steven Bosscher

I suppose no-one would object if I commit this as obvious at some point?

Index: lra-constraints.c
===
--- lra-constraints.c   (revision 191858)
+++ lra-constraints.c   (working copy)
@@ -4293,7 +4293,7 @@ update_ebb_live_info (rtx head, rtx tail
{
  if (prev_bb != NULL)
{
- /* Udpate DF_LR_IN (prev_bb):  */
+ /* Update DF_LR_IN (prev_bb):  */
  EXECUTE_IF_SET_IN_BITMAP (&check_only_regs, 0, j, bi)
if (bitmap_bit_p (&live_regs, j))
  bitmap_set_bit (DF_LR_IN (prev_bb), j);

Re: [patch][lra] Comment typo fix

2012-10-01 Thread Robert Dewar


On 10/1/2012 6:09 PM, Steven Bosscher wrote:

I suppose no-one would object if I commit this as obvious at some point?

Index: lra-constraints.c
===
--- lra-constraints.c   (revision 191858)
+++ lra-constraints.c   (working copy)
@@ -4293,7 +4293,7 @@ update_ebb_live_info (rtx head, rtx tail
 {
   if (prev_bb != NULL)
 {
- /* Udpate DF_LR_IN (prev_bb):  */
+ /* Update DF_LR_IN (prev_bb):  */
   EXECUTE_IF_SET_IN_BITMAP (&check_only_regs, 0, j, bi)
 if (bitmap_bit_p (&live_regs, j))
   bitmap_set_bit (DF_LR_IN (prev_bb), j);



took me a few readings to see the change you had made, amazing how
the brain reads what it expects to see :-)

MAINTAINERS: add myself as dwarf debugging code maintainer

2012-10-01 Thread Cary Coutant

2012-10-01  Cary Coutant  

* MAINTAINERS: Add myself as dwarf debugging code maintainer.


Index: MAINTAINERS
===
--- MAINTAINERS (revision 191942)
+++ MAINTAINERS (working copy)
@@ -185,6 +185,7 @@ caller-save.c   Jeff Law   l...@redhat.com
 callgraph  Jan Hubicka j...@suse.cz
 debugging code Jim Wilson  wil...@tuliptree.org
 dwarf debugging code   Jason Merrill   ja...@redhat.com
+dwarf debugging code   Cary Coutant    ccout...@google.com
 c++ runtime libs   Paolo Carlini   paolo.carl...@oracle.com
 c++ runtime libs   Gabriel Dos Reis    g...@integrable-solutions.net
 c++ runtime libs   Ulrich Drepper  drep...@gmail.com

Re: [PATCH] Fix powerpc breakage, was: Add option for dumping to stderr (issue6190057)

2012-10-01 Thread Xinliang David Li

On Mon, Oct 1, 2012 at 2:37 PM, Michael Meissner
 wrote:
> I tracked down some of the other code that previously used REPORT_DETAILS, and
> MSG_NOTE is the new way to do the same thing.  This bootstraps and no
> unexpected errors occur during make check.  Is it ok to install?
>
> 2012-10-01  Michael Meissner  
>
> * config/rs6000/rs6000.c (toplevel): Include dumpfile.h.
> (rs6000_density_test): Rework to accommodate 09-30 change by Sharad
> Singhai.
>
> * config/rs6000/t-rs6000 (rs6000.o): Add dumpfile.h dependency.
>
> Index: gcc/config/rs6000/rs6000.c
> ===
> --- gcc/config/rs6000/rs6000.c  (revision 191932)
> +++ gcc/config/rs6000/rs6000.c  (working copy)
> @@ -58,6 +58,7 @@
>  #include "tm-constrs.h"
>  #include "opts.h"
>  #include "tree-vectorizer.h"
> +#include "dumpfile.h"
>  #if TARGET_XCOFF
>  #include "xcoffout.h"  /* get declarations of xcoff_*_section_name */
>  #endif
> @@ -3518,11 +3519,11 @@ rs6000_density_test (rs6000_cost_data *d
>&& vec_cost + not_vec_cost > DENSITY_SIZE_THRESHOLD)
>  {
>data->cost[vect_body] = vec_cost * (100 + DENSITY_PENALTY) / 100;
> -  if (vect_print_dump_info (REPORT_DETAILS))
> -   fprintf (vect_dump,
> -"density %d%%, cost %d exceeds threshold, penalizing "
> -"loop body cost by %d%%", density_pct,
> -vec_cost + not_vec_cost, DENSITY_PENALTY);
> +  if (dump_kind_p (MSG_NOTE))

Is this check needed? Seems redundant.

David


> +   dump_printf_loc (MSG_NOTE, vect_location,
> +"density %d%%, cost %d exceeds threshold, penalizing "
> +"loop body cost by %d%%", density_pct,
> +vec_cost + not_vec_cost, DENSITY_PENALTY);
>  }
>  }
>
> Index: gcc/config/rs6000/t-rs6000
> ===
> --- gcc/config/rs6000/t-rs6000  (revision 191932)
> +++ gcc/config/rs6000/t-rs6000  (working copy)
> @@ -26,7 +26,7 @@ rs6000.o: $(CONFIG_H) $(SYSTEM_H) corety
>$(OBSTACK_H) $(TREE_H) $(EXPR_H) $(OPTABS_H) except.h function.h \
>output.h dbxout.h $(BASIC_BLOCK_H) toplev.h $(GGC_H) $(HASHTAB_H) \
>$(TM_P_H) $(TARGET_H) $(TARGET_DEF_H) langhooks.h reload.h gt-rs6000.h \
> -  cfgloop.h $(OPTS_H) $(COMMON_TARGET_H)
> +  cfgloop.h $(OPTS_H) $(COMMON_TARGET_H) dumpfile.h
>
>  rs6000-c.o: $(srcdir)/config/rs6000/rs6000-c.c \
>  $(srcdir)/config/rs6000/rs6000-protos.h \
>
> --
> Michael Meissner, IBM
> 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
> meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
>

Re: [PATCH, libbacktrace]: Compile with -fasynchronous-unwind-tables

2012-10-01 Thread Ian Lance Taylor

On Mon, Oct 1, 2012 at 2:12 PM, Uros Bizjak  wrote:
>
> Without -fasynchronous-unwind-tables, FDE is not generated for
> backtrace_full and backtrace_simple wrappers. Without FDE, unwinding
> terminates at these functions.

I'm not opposed to -fasynchronous-unwind-tables, but now that you
bring it up I'm fairly certain that it would suffice to use
-funwind-tables.  I've been testing mainly on x86_64, and I forgot
that on x86_64 -funwind-tables is the default.  Sorry about that.  And
-fasynchronous-unwind-tables is the default also, so I could be wrong
that -funwind-tables is all that is needed.

> Attached patch fixes this problem by adding
> -fasynchronous-unwind-tables, and this way forcing FDEs for all
> functions. With this change, btest passes OK, failing log and
> runtime/pprof from libgo testsuite also pass OK.

This is basically fine but libbacktrace may be compiled by the host
compiler and that may not be GCC, so please add a configure test to
see if the compiler accepts the -fasynchronous-unwind-tables option.

Ian

Re: [PATCH] Fix powerpc breakage, was: Add option for dumping to stderr (issue6190057)

2012-10-01 Thread Sharad Singhai

Thanks for tracking down and fixing the powerpc port.

The "dump_kind_p ()" check is redundant, but it is the canonical form
here. I think blocks of dump code guarded by "if dump_kind_p (...)"
might be easier to read/maintain.

Sharad
Sharad


On Mon, Oct 1, 2012 at 3:45 PM, Xinliang David Li  wrote:
> On Mon, Oct 1, 2012 at 2:37 PM, Michael Meissner
>  wrote:
>> I tracked down some of the other code that previously used REPORT_DETAILS, and
>> MSG_NOTE is the new way to do the same thing.  This bootstraps and no
>> unexpected errors occur during make check.  Is it ok to install?
>>
>> 2012-10-01  Michael Meissner  
>>
>> * config/rs6000/rs6000.c (toplevel): Include dumpfile.h.
>> (rs6000_density_test): Rework to accommodate 09-30 change by Sharad
>> Singhai.
>>
>> * config/rs6000/t-rs6000 (rs6000.o): Add dumpfile.h dependency.
>>
>> Index: gcc/config/rs6000/rs6000.c
>> ===
>> --- gcc/config/rs6000/rs6000.c  (revision 191932)
>> +++ gcc/config/rs6000/rs6000.c  (working copy)
>> @@ -58,6 +58,7 @@
>>  #include "tm-constrs.h"
>>  #include "opts.h"
>>  #include "tree-vectorizer.h"
>> +#include "dumpfile.h"
>>  #if TARGET_XCOFF
>>  #include "xcoffout.h"  /* get declarations of xcoff_*_section_name */
>>  #endif
>> @@ -3518,11 +3519,11 @@ rs6000_density_test (rs6000_cost_data *d
>>&& vec_cost + not_vec_cost > DENSITY_SIZE_THRESHOLD)
>>  {
>>data->cost[vect_body] = vec_cost * (100 + DENSITY_PENALTY) / 100;
>> -  if (vect_print_dump_info (REPORT_DETAILS))
>> -   fprintf (vect_dump,
>> -"density %d%%, cost %d exceeds threshold, penalizing "
>> -"loop body cost by %d%%", density_pct,
>> -vec_cost + not_vec_cost, DENSITY_PENALTY);
>> +  if (dump_kind_p (MSG_NOTE))
>
> Is this check needed? Seems redundant.
>
> David
>
>
>> +   dump_printf_loc (MSG_NOTE, vect_location,
>> +"density %d%%, cost %d exceeds threshold, penalizing "
>> +"loop body cost by %d%%", density_pct,
>> +vec_cost + not_vec_cost, DENSITY_PENALTY);
>>  }
>>  }
>>
>> Index: gcc/config/rs6000/t-rs6000
>> ===
>> --- gcc/config/rs6000/t-rs6000  (revision 191932)
>> +++ gcc/config/rs6000/t-rs6000  (working copy)
>> @@ -26,7 +26,7 @@ rs6000.o: $(CONFIG_H) $(SYSTEM_H) corety
>>$(OBSTACK_H) $(TREE_H) $(EXPR_H) $(OPTABS_H) except.h function.h \
>>output.h dbxout.h $(BASIC_BLOCK_H) toplev.h $(GGC_H) $(HASHTAB_H) \
>>$(TM_P_H) $(TARGET_H) $(TARGET_DEF_H) langhooks.h reload.h gt-rs6000.h \
>> -  cfgloop.h $(OPTS_H) $(COMMON_TARGET_H)
>> +  cfgloop.h $(OPTS_H) $(COMMON_TARGET_H) dumpfile.h
>>
>>  rs6000-c.o: $(srcdir)/config/rs6000/rs6000-c.c \
>>  $(srcdir)/config/rs6000/rs6000-protos.h \
>>
>> --
>> Michael Meissner, IBM
>> 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
>> meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
>>

Re: [PATCH] Fix powerpc breakage, was: Add option for dumping to stderr (issue6190057)

2012-10-01 Thread Xinliang David Li

On Mon, Oct 1, 2012 at 4:05 PM, Sharad Singhai  wrote:
> Thanks for tracking down and fixing the powerpc port.
>
> The "dump_kind_p ()" check is redundant but canonical form here. I
> think blocks of dump code guarded by "if dump_kind_p (...)" might be
> easier to read/maintain.
>

I find it confusing to be honest. The redundant check serves no purpose.

David

> Sharad
>
>
> On Mon, Oct 1, 2012 at 3:45 PM, Xinliang David Li  wrote:
>> On Mon, Oct 1, 2012 at 2:37 PM, Michael Meissner
>>  wrote:
>>> I tracked down some of the other code that previously used REPORT_DETAILS, and
>>> MSG_NOTE is the new way to do the same thing.  This bootstraps and no
>>> unexpected errors occur during make check.  Is it ok to install?
>>>
>>> 2012-10-01  Michael Meissner  
>>>
>>> * config/rs6000/rs6000.c (toplevel): Include dumpfile.h.
>>> (rs6000_density_test): Rework to accommodate 09-30 change by Sharad
>>> Singhai.
>>>
>>> * config/rs6000/t-rs6000 (rs6000.o): Add dumpfile.h dependency.
>>>
>>> Index: gcc/config/rs6000/rs6000.c
>>> ===
>>> --- gcc/config/rs6000/rs6000.c  (revision 191932)
>>> +++ gcc/config/rs6000/rs6000.c  (working copy)
>>> @@ -58,6 +58,7 @@
>>>  #include "tm-constrs.h"
>>>  #include "opts.h"
>>>  #include "tree-vectorizer.h"
>>> +#include "dumpfile.h"
>>>  #if TARGET_XCOFF
>>>  #include "xcoffout.h"  /* get declarations of xcoff_*_section_name */
>>>  #endif
>>> @@ -3518,11 +3519,11 @@ rs6000_density_test (rs6000_cost_data *d
>>>&& vec_cost + not_vec_cost > DENSITY_SIZE_THRESHOLD)
>>>  {
>>>data->cost[vect_body] = vec_cost * (100 + DENSITY_PENALTY) / 100;
>>> -  if (vect_print_dump_info (REPORT_DETAILS))
>>> -   fprintf (vect_dump,
>>> -"density %d%%, cost %d exceeds threshold, penalizing "
>>> -"loop body cost by %d%%", density_pct,
>>> -vec_cost + not_vec_cost, DENSITY_PENALTY);
>>> +  if (dump_kind_p (MSG_NOTE))
>>
>> Is this check needed? Seems redundant.
>>
>> David
>>
>>
>>> +   dump_printf_loc (MSG_NOTE, vect_location,
>>> +"density %d%%, cost %d exceeds threshold, penalizing "
>>> +"loop body cost by %d%%", density_pct,
>>> +vec_cost + not_vec_cost, DENSITY_PENALTY);
>>>  }
>>>  }
>>>
>>> Index: gcc/config/rs6000/t-rs6000
>>> ===
>>> --- gcc/config/rs6000/t-rs6000  (revision 191932)
>>> +++ gcc/config/rs6000/t-rs6000  (working copy)
>>> @@ -26,7 +26,7 @@ rs6000.o: $(CONFIG_H) $(SYSTEM_H) corety
>>>$(OBSTACK_H) $(TREE_H) $(EXPR_H) $(OPTABS_H) except.h function.h \
>>>output.h dbxout.h $(BASIC_BLOCK_H) toplev.h $(GGC_H) $(HASHTAB_H) \
>>>$(TM_P_H) $(TARGET_H) $(TARGET_DEF_H) langhooks.h reload.h gt-rs6000.h \
>>> -  cfgloop.h $(OPTS_H) $(COMMON_TARGET_H)
>>> +  cfgloop.h $(OPTS_H) $(COMMON_TARGET_H) dumpfile.h
>>>
>>>  rs6000-c.o: $(srcdir)/config/rs6000/rs6000-c.c \
>>>  $(srcdir)/config/rs6000/rs6000-protos.h \
>>>
>>> --
>>> Michael Meissner, IBM
>>> 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
>>> meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
>>>
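
The disagreement in this sub-thread is easier to follow with a small model of a guard that the callee repeats. The sketch below assumes, as the replies suggest, that the printing routine checks the message kind itself; kind_enabled, emit, enabled_kinds, sink and the KIND_* values are all hypothetical stand-ins, not the dumpfile.h API.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical message kinds and global state, for illustration only.
enum { KIND_NOTE = 1 << 0, KIND_MISSED = 1 << 1 };
static unsigned enabled_kinds = 0;
static std::string sink;  // collects output in place of a dump file

// Plays the role of the outer check at the call site.
bool kind_enabled (unsigned kind)
{
  return (enabled_kinds & kind) != 0;
}

// Plays the role of the printing routine: it tests KIND on its own, so
// wrapping a single call in "if (kind_enabled (...))" changes no output.
void emit (unsigned kind, const char *fmt, ...)
{
  if (!kind_enabled (kind))
    return;
  char buf[256];
  va_list ap;
  va_start (ap, fmt);
  std::vsnprintf (buf, sizeof buf, fmt, ap);
  va_end (ap);
  sink += buf;
}
```

In this model the outer check produces exactly the same output as the bare call. What it can still do is skip the evaluation of the call's arguments, or cover a block of several dump statements with one visible condition, which is the readability argument made above.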

Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch #2d

2012-10-01 Thread Michael Meissner

2012-10-01  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_option_override_internal): If
-mcpu= is not specified and the compiler is not configured
using --with-cpu=, use the bits from the TARGET_DEFAULT to
set the initial options.

I reworked the patch to allow TARGET_DEFAULT bits to be set if there is no
-mcpu= and the compiler was not configured using --with-cpu=, so that
we don't first clear all of the ISA bits, set them from the cpu, and then merge
back in the TARGET_DEFAULT bits.

Somebody asked what is set when this function gets called.  The
target_flags variable starts with the initial settings (TARGET_DEFAULT), and
then all of the switches that the user sets or resets are applied.  A bit in
the target_flags_explicit variable is set only if the user explicitly used
that switch, whether to turn it on or off.

So for instance, if the user passed -mpopcntb -mno-vsx on Linux 64-bit systems,
target_flags would be 0x150001 (MASK_PPC_GFXOPT | MASK_POWERPC64 | MASK_64BIT |
MASK_POPCNTB) and target_flags_explicit would be 0x201 (MASK_POPCNTB |
MASK_VSX).

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 191942)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -2446,21 +2446,34 @@ rs6000_option_override_internal (bool gl
   rs6000_cpu_index = cpu_index = main_target_opt->x_rs6000_cpu_index;
   have_cpu = true;
 }
+  else if (implicit_cpu)
+{
+  rs6000_cpu_index = cpu_index = rs6000_cpu_name_lookup (implicit_cpu);
+  have_cpu = true;
+}
   else
 {
-  const char *default_cpu =
-(implicit_cpu ? implicit_cpu
- : (TARGET_POWERPC64 ? "powerpc64" : "powerpc"));
-
+  const char *default_cpu = (TARGET_POWERPC64 ? "powerpc64" : "powerpc");
   rs6000_cpu_index = cpu_index = rs6000_cpu_name_lookup (default_cpu);
-  have_cpu = implicit_cpu != 0;
+  have_cpu = false;
 }
 
   gcc_assert (cpu_index >= 0);
 
-  target_flags &= ~set_masks;
-  target_flags |= (processor_target_table[cpu_index].target_enable
-  & set_masks);
+  /* If we have a cpu, either through an explicit -mcpu= or if the
+ compiler was configured with --with-cpu=, replace all of the ISA bits
+ with those from the cpu, except for options that were explicitly set.  If
+ we don't have a cpu, do not override the target bits set in
+ TARGET_DEFAULT.  */
+  if (have_cpu)
+{
+  target_flags &= ~set_masks;
+  target_flags |= (processor_target_table[cpu_index].target_enable
+  & set_masks);
+}
+  else
+target_flags |= (processor_target_table[cpu_index].target_enable
+& ~target_flags_explicit);
 
   if (rs6000_tune_index >= 0)
 tune_index = rs6000_tune_index;

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
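
The two code paths in the hunk above can be modeled with plain bit masks. This is a sketch, not the rs6000 code: apply_cpu_defaults is a hypothetical function, kMaskAltivec is an invented bit used only for the demonstration, the POPCNTB and VSX values are inferred from the 0x201 example in the message, and set_masks is assumed to have had the explicitly set bits removed already.

```cpp
#include <cstdint>

// Values inferred from the message's example (0x201 == POPCNTB | VSX).
constexpr std::uint32_t kMaskPopcntb = 0x001;
constexpr std::uint32_t kMaskVsx = 0x200;
// Invented bit, standing in for some ISA option from the cpu table.
constexpr std::uint32_t kMaskAltivec = 0x400;

// Mirrors the two branches of the patch.
//   have_cpu:  replace the ISA bits covered by set_masks with the cpu's bits.
//   otherwise: keep the TARGET_DEFAULT bits and only add cpu bits that the
//              user did not mention explicitly.
std::uint32_t apply_cpu_defaults (std::uint32_t target_flags,
                                  std::uint32_t target_flags_explicit,
                                  std::uint32_t cpu_enable,
                                  std::uint32_t set_masks,
                                  bool have_cpu)
{
  if (have_cpu)
    {
      target_flags &= ~set_masks;
      target_flags |= cpu_enable & set_masks;
    }
  else
    target_flags |= cpu_enable & ~target_flags_explicit;
  return target_flags;
}
```

Starting from the message's example (target_flags 0x150001, target_flags_explicit 0x201) with the invented bit also on by default, a cpu entry that enables nothing clears that bit when a cpu was named and leaves it alone when none was; in both cases the explicit -mno-vsx keeps VSX off.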

[v3] libstdc++/54757

2012-10-01 Thread Paolo Carlini


Hi,

sanity checked x86_64-linux (both cases), committed to mainline.

Thanks,
Paolo.

/
2012-10-01  Paolo Carlini  

PR libstdc++/54757
* include/ext/random (rice_distribution<>::operator()): Use std::hypot
only if _GLIBCXX_USE_C99_MATH_TR1.
* include/ext/random.tcc (rice_distribution<>::__generate_impl):
Likewise.
Index: include/ext/random
===
--- include/ext/random  (revision 191942)
+++ include/ext/random  (working copy)
@@ -1042,7 +1042,11 @@
{
  result_type __x = this->_M_ndx(__urng);
  result_type __y = this->_M_ndy(__urng);
+#if _GLIBCXX_USE_C99_MATH_TR1
  return std::hypot(__x, __y);
+#else
+ return std::sqrt(__x * __x + __y * __y);
+#endif
}
 
   template
@@ -1054,7 +1058,11 @@
__px(__p.nu(), __p.sigma()), __py(result_type(0), __p.sigma());
  result_type __x = this->_M_ndx(__px, __urng);
  result_type __y = this->_M_ndy(__py, __urng);
+#if _GLIBCXX_USE_C99_MATH_TR1
  return std::hypot(__x, __y);
+#else
+ return std::sqrt(__x * __x + __y * __y);
+#endif
}
 
   template_M_ndx(__px, __urng);
result_type __y = this->_M_ndy(__py, __urng);
+#if _GLIBCXX_USE_C99_MATH_TR1
*__f++ = std::hypot(__x, __y);
+#else
+   *__f++ = std::sqrt(__x * __x + __y * __y);
+#endif
  }
   }
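
The two branches of the #if compute the same quantity, but they are not numerically identical. The short sketch below (naive_hypot and library_hypot are hypothetical wrapper names) shows the difference: the fallback squares its arguments first, so the intermediate sum can overflow even when the result is representable, whereas std::hypot is specified to avoid undue overflow and underflow.

```cpp
#include <cmath>

// Fallback form from the #else branch: x * x + y * y can overflow to
// infinity even though the true length fits in a double.
double naive_hypot (double x, double y)
{
  return std::sqrt (x * x + y * y);
}

// Preferred form when the C99 math functions are available.
double library_hypot (double x, double y)
{
  return std::hypot (x, y);
}
```

For ordinary magnitudes such as (3, 4) both return 5; for arguments near 1e200 the fallback returns infinity on a typical IEEE double target while std::hypot stays finite. For the values a Rice distribution normally draws, the fallback should be adequate.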

Re: [PATCH rs6000 testsuite] Fix a couple tests for VSX scalar instructions

2012-10-01 Thread David Edelsohn

On Mon, Oct 1, 2012 at 6:03 PM, Pat Haugen  wrote:
> This patch fixes a couple failures that occur if the testsuite is run with
> -mvsx and the VSX scalar sqrt instructions are generated. Ok for trunk?
>
> -Pat
>
>
> testsuite/ChangeLog:
> 2012-10-01  Pat Haugen 
>
> * gcc.target/powerpc/pr46728-1.c: Accept xssqrtdp.
> * gcc.target/powerpc/pr46728-2.c: Likewise.

LGTM.

Thanks, David

Re: [PATCH] Fix test breakage, was: Add option for dumping to stderr (issue6190057)

2012-10-01 Thread Sharad Singhai

Here is a patch to fix test breakage caused by r191883. Bootstrapped
on x86_64 and tested with
make -k check RUNTESTFLAGS="--target_board=unix/\{,-m32\}".

Okay for trunk?

Thanks,
Sharad

2012-10-01  Sharad Singhai  

* tree-vect-stmts.c (vectorizable_operation): Add missing return.

testsuite/ChangeLog

* gfortran.dg/vect/vect.exp: Change verbose vectorizer dump options
to fix test failures caused by r191883.
* gcc.dg/tree-ssa/gen-vect-11.c: Likewise.
* gcc.dg/tree-ssa/gen-vect-2.c: Likewise.
* gcc.dg/tree-ssa/gen-vect-32.c: Likewise.
* gcc.dg/tree-ssa/gen-vect-25.c: Likewise.
* gcc.dg/tree-ssa/gen-vect-11a.c: Likewise.
* gcc.dg/tree-ssa/gen-vect-26.c: Likewise.
* gcc.dg/tree-ssa/gen-vect-11b.c: Likewise.
* gcc.dg/tree-ssa/gen-vect-11c.c: Likewise.
* gcc.dg/tree-ssa/gen-vect-28.c: Likewise.
* testsuite/gcc.target/i386/vect-double-1.c: Fix test. Missing entry
from r191883.


Index: testsuite/gfortran.dg/vect/vect.exp
===
--- testsuite/gfortran.dg/vect/vect.exp (revision 191883)
+++ testsuite/gfortran.dg/vect/vect.exp (working copy)
@@ -26,7 +26,7 @@ set DEFAULT_VECTCFLAGS ""

 # These flags are used for all targets.
 lappend DEFAULT_VECTCFLAGS "-O2" "-ftree-vectorize" "-fno-vect-cost-model" \
-  "-ftree-vectorizer-verbose=4" "-fdump-tree-vect-stats"
+  "-fdump-tree-vect-details"

 # If the target system supports vector instructions, the default action
 # for a test is 'run', otherwise it's 'compile'.  Save current default.
Index: testsuite/gcc.dg/tree-ssa/gen-vect-11.c
===
--- testsuite/gcc.dg/tree-ssa/gen-vect-11.c (revision 191883)
+++ testsuite/gcc.dg/tree-ssa/gen-vect-11.c (working copy)
@@ -1,6 +1,6 @@
 /* { dg-do run { target vect_cmdline_needed } } */
-/* { dg-options "-O2 -ftree-vectorize -ftree-vectorizer-verbose=3 -fwrapv -fdump-tree-vect-stats" } */
-/* { dg-options "-O2 -ftree-vectorize -ftree-vectorizer-verbose=3 -fwrapv -fdump-tree-vect-stats -mno-sse" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-O2 -ftree-vectorize -fwrapv -fdump-tree-vect-details" } */
+/* { dg-options "-O2 -ftree-vectorize -fwrapv -fdump-tree-vect-details -mno-sse" { target { i?86-*-* x86_64-*-* } } } */

 #include 

Index: testsuite/gcc.dg/tree-ssa/gen-vect-2.c
===
--- testsuite/gcc.dg/tree-ssa/gen-vect-2.c (revision 191883)
+++ testsuite/gcc.dg/tree-ssa/gen-vect-2.c (working copy)
@@ -1,6 +1,6 @@
 /* { dg-do run { target vect_cmdline_needed } } */
-/* { dg-options "-O2 -ftree-vectorize -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats" } */
-/* { dg-options "-O2 -ftree-vectorize -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats -mno-sse" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse" { target { i?86-*-* x86_64-*-* } } } */

 #include 

Index: testsuite/gcc.dg/tree-ssa/gen-vect-32.c
===
--- testsuite/gcc.dg/tree-ssa/gen-vect-32.c (revision 191883)
+++ testsuite/gcc.dg/tree-ssa/gen-vect-32.c (working copy)
@@ -1,6 +1,6 @@
 /* { dg-do run { target vect_cmdline_needed } } */
-/* { dg-options "-O2 -ftree-vectorize -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats" } */
-/* { dg-options "-O2 -ftree-vectorize -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats -mno-sse" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse" { target { i?86-*-* x86_64-*-* } } } */

 #include 

Index: testsuite/gcc.dg/tree-ssa/gen-vect-25.c
===
--- testsuite/gcc.dg/tree-ssa/gen-vect-25.c (revision 191883)
+++ testsuite/gcc.dg/tree-ssa/gen-vect-25.c (working copy)
@@ -1,6 +1,6 @@
 /* { dg-do run { target vect_cmdline_needed } } */
-/* { dg-options "-O2 -ftree-vectorize -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats" } */
-/* { dg-options "-O2 -ftree-vectorize -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats -mno-sse" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse" { target { i?86-*-* x86_64-*-* } } } */

 #include 

Index: testsuite/gcc.dg/tree-ssa/gen-vect-11a.c
===
--- testsuite/gcc.dg/tree-ssa/gen-vect-11a.c (revision 191883)
+++ testsuite/gcc.dg/tree-ssa/gen-vect-11a.c (working copy)
@@ -1,6 +1,6 @@
 /* { dg-do run { target vect_cmdline_needed } } */
-/* { dg-options "-O2 -ftree-vectorize -ftree-vectorizer-verbose=3
-fdump-tree-vect-stats"
