date:20121003

Re: RFC: LRA for x86/x86-64 [7/9]

2012-10-03 Thread Steven Bosscher

On Tue, Oct 2, 2012 at 3:42 PM, Richard Sandiford
 wrote:
>
>> +/* Compress pseudo live ranges by removing program points where
>> +   nothing happens.  Complexity of many algorithms in LRA is linear
>> +   function of program points number.  To speed up the code we try to
>> +   minimize the number of the program points here.  */
>> +static void
>> +remove_some_program_points_and_update_live_ranges (void)
>
> Genuine question, but could we do this on the fly instead,
> by not incrementing curr_point if the current point had no value?
>
> I suppose the main complication would be checking cases where
> all births are recorded by extending the previous just-closed live
> range rather than starting a new one, in which case it's the previous
> point that needs to be reused.  Hmm...

It does seem to be something worth investigating further. Things like:

Compressing live ranges: from 1742579 to 554532 - 31%
Compressing live ranges: from 1742569 to 73069 - 4%

look extreme, but they're actually the norm. For the same test case
(PR54146 still) but looking only at functions with initially 1000-
program points, you get this picture:

Compressing live ranges: from 1766 to 705 - 39%
Compressing live ranges: from 1449 to 370 - 25%
Compressing live ranges: from 3939 to 1093 - 27%
Compressing live ranges: from 3939 to 1093 - 27%
Compressing live ranges: from 3939 to 1093 - 27%
Compressing live ranges: from 3939 to 1093 - 27%
Compressing live ranges: from 2464 to 676 - 27%
Compressing live ranges: from 1433 to 379 - 26%
Compressing live ranges: from 1261 to 348 - 27%
Compressing live ranges: from 2835 to 755 - 26%
Compressing live ranges: from 5426 to 1678 - 30%
Compressing live ranges: from 5227 to 1477 - 28%
Compressing live ranges: from 1845 to 467 - 25%
Compressing live ranges: from 4868 to 1378 - 28%
Compressing live ranges: from 4875 to 1388 - 28%
Compressing live ranges: from 4882 to 1384 - 28%
Compressing live ranges: from 5688 to 1714 - 30%
Compressing live ranges: from 4943 to 1310 - 26%
Compressing live ranges: from 2976 to 792 - 26%
Compressing live ranges: from 5463 to 1526 - 27%
Compressing live ranges: from 2854 to 730 - 25%
Compressing live ranges: from 1810 to 745 - 41%
Compressing live ranges: from 2771 to 904 - 32%
Compressing live ranges: from 4916 to 1429 - 29%
Compressing live ranges: from 6505 to 2238 - 34%
Compressing live ranges: from 6493 to 166 - 2%
Compressing live ranges: from 5498 to 1734 - 31%
Compressing live ranges: from 1810 to 745 - 41%
Compressing live ranges: from 5043 to 1420 - 28%
Compressing live ranges: from 3094 to 788 - 25%
Compressing live ranges: from 4563 to 1311 - 28%
Compressing live ranges: from 4557 to 158 - 3%
Compressing live ranges: from 1050 to 274 - 26%
Compressing live ranges: from 1602 to 434 - 27%
Compressing live ranges: from 2474 to 600 - 24%
Compressing live ranges: from 2718 to 866 - 31%
Compressing live ranges: from 2097 to 716 - 34%
Compressing live ranges: from 4152 to 1099 - 26%
Compressing live ranges: from 5065 to 1514 - 29%
Compressing live ranges: from 1236 to 359 - 29%
Compressing live ranges: from 1722 to 541 - 31%
Compressing live ranges: from 5186 to 1401 - 27%

Unfortunately the dump doesn't mention how many live ranges could be
merged thanks to the compression.

It'd also be good to understand why the compression ratios are so
small, and consistently around ~30%. Maybe curr_point includes things
it should ignore (DEBUG_INSNs, NOTEs, ...)?

Ciao!
Steven

[Ada] Connect_Socket with timeout does not report failure correctly

2012-10-03 Thread Arnaud Charlet

This change fixes an oversight in the Connect_Socket variant that includes
a timeout check. The connection is initiated asynchronously, and a select(2)
call is performed to wait for it to complete, with a timeout. When that call
returns however, the connection is not always succsefully established. It
may also happen that an error occurred during the connection attempt
(typically, connection refused by the remote host), which needs to be reported
through the raising of an exception. This change adds the missing error
check.

The following test case should report:
Client is trying to connect to Port: 11552 
Expected exception raised

with Text_IO;use Text_IO;
with Gnat.Sockets; use Gnat.Sockets;
with Ada.Command_Line;
with Ada.Exceptions; use Ada.Exceptions;

procedure Timed_Conn_Error is
  Port  : constant Port_Type := 11552;
  --  There should be no listening server on this port

  Socket: Socket_Type := No_Socket;
  Address   : Sock_Addr_Type;
  Host  : constant String := Host_Name;
  Status: Selector_Status;
begin
  Address.Addr := Addresses (Get_Host_By_Name (Host), 1);
  Address.Port := Port;

  Create_Socket (Socket);
  Set_Socket_Option (Socket, Socket_Level, (Reuse_Address, True));
  Put_Line ("Client is trying to connect to Port:" & Port'Img & " ");
  Connect_Socket (Socket => Socket, Server => Address, Timeout => Forever, 
Status => Status);

  begin
String'Output (Stream (Socket), "Hello");
  exception
when E : others =>
  Put_Line ("Unexpected exception, should not reach this point");
  Put_Line ("Info: " & Exception_Information (E));
  end;

exception when E : others =>
   Put_Line ("Expected exception raised");
   Put_Line ("Info: " & Exception_Information (E));
end Timed_Conn_Error;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-03  Thomas Quinot  

* g-socket.adb (Connect_Socket, version with timeout): When the
newly-connected socket is reported as available for writing, check
whether it has a pending asynchronous error prior to returning.

Index: g-socket.adb
===
--- g-socket.adb(revision 192025)
+++ g-socket.adb(working copy)
@@ -123,7 +123,7 @@
function Resolve_Error
  (Error_Value : Integer;
   From_Errno  : Boolean := True) return Error_Type;
-   --  Associate an enumeration value (error_type) to en error value (errno).
+   --  Associate an enumeration value (error_type) to an error value (errno).
--  From_Errno prevents from mixing h_errno with errno.
 
function To_Name   (N  : String) return Name_Type;
@@ -702,6 +702,13 @@
   Req : Request_Type;
   --  Used to set Socket to non-blocking I/O
 
+  Conn_Err : aliased Integer;
+  --  Error status of the socket after completion of select(2)
+
+  Res   : C.int;
+  Conn_Err_Size : aliased C.int := Conn_Err'Size / 8;
+  --  For getsockopt(2) call
+
begin
   if Selector /= null and then not Is_Open (Selector.all) then
  raise Program_Error with "closed selector";
@@ -735,10 +742,32 @@
  Selector => Selector,
  Status   => Status);
 
+  --  Check error condition (the asynchronous connect may have terminated
+  --  with an error, e.g. ECONNREFUSED) if select(2) completed.
+
+  if Status = Completed then
+ Res := C_Getsockopt
+   (C.int (Socket), SOSC.SOL_SOCKET, SOSC.SO_ERROR,
+Conn_Err'Address, Conn_Err_Size'Access);
+
+ if Res /= 0 then
+Conn_Err := Socket_Errno;
+ end if;
+
+  else
+ Conn_Err := 0;
+  end if;
+
   --  Reset the socket to blocking I/O
 
   Req := (Name => Non_Blocking_IO, Enabled => False);
   Control_Socket (Socket, Request => Req);
+
+  --  Report error condition if any
+
+  if Conn_Err /= 0 then
+ Raise_Socket_Error (Conn_Err);
+  end if;
end Connect_Socket;

[Ada] Additional information on subtype conformance error

2012-10-03 Thread Arnaud Charlet

This patch adds a clarifying message info when subtype conformance fails, due
to a missing null exclusion indicatar in a formal that must  match a controlling
access formal. This Ada 2005 rule was checked partially in the context of
subprogram renamings but not for 'Access attribute references.

Compiling alpha.adb in gnat05 mode must be rejected with:

alpha.adb:6:19: not subtype conformant with declaration at beta.ads:3
alpha.adb:6:19: controlling formal "Ref" of "Updated" excludes null,
declaration must exclude null as well

---
  with Beta;
  package Alpha is
 type Object is tagged limited null record;

 procedure Start;

 procedure Updated (Ref : access Object) is null;
  end Alpha;
---

  package body Alpha is

 procedure Start is
 begin
Beta.Set (Updated'Access);
 end Start;
  end Alpha;
---
  limited with Alpha;
  package Beta is
 type Callback is access procedure (Ref : access Alpha.Object);

 procedure Set (Proc : in Callback) is null;
  end Beta;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-03  Ed Schonberg  

* sem_ch6.adb (Check_Conformance): Additional info when subtype
conformance fails, due to a missing null exclusion indicatar in
a formal that must match a controlling access formal.

Index: sem_ch6.adb
===
--- sem_ch6.adb (revision 192025)
+++ sem_ch6.adb (working copy)
@@ -5756,14 +5756,31 @@
 
declare
   TSS_Name : constant TSS_Name_Type := Get_TSS_Name (New_Id);
+
begin
   if TSS_Name /= TSS_Stream_Read
 and then TSS_Name /= TSS_Stream_Write
 and then TSS_Name /= TSS_Stream_Input
 and then TSS_Name /= TSS_Stream_Output
   then
- Conformance_Error
-   ("\type of & does not match!", New_Formal);
+ --  Here we have a definite conformance error. It is worth
+ --  special casesing the error message for the case of a
+ --  controlling formal (which excludes null).
+
+ if Is_Controlling_Formal (New_Formal) then
+Error_Msg_Node_2 := Scope (New_Formal);
+Conformance_Error
+ ("\controlling formal& of& excludes null, "
+   & "declaration must exclude null as well",
+New_Formal);
+
+ --  Normal case (couldn't we give more detail here???)
+
+ else
+Conformance_Error
+  ("\type of & does not match!", New_Formal);
+ end if;
+
  return;
   end if;
end;

[Ada] Ada/C++ missing call to constructor with defaults

2012-10-03 Thread Arnaud Charlet

When the type of an object is a CPP untagged type and the object initialization
requires calling its default C++ constructor, the Ada compiler did not
generate the call to a C++ constructor which has all parameters with
defaults (and hence it covers the default C++ constructor).

// c_class.h
class Tester {
  public:
Tester(unsigned int a_num = 5, char* a_className = 0);
};

// c_class.cc


#include "c_class.h"
#include 

Tester::Tester(unsigned int a_num, char* a_className) {
  std::cout << " ctor Tester called " << a_num << ":";

  if (a_className == 0) {
 std::cout << "null";
  } else {
 std::cout << a_className;
  }

  std::cout << std::endl;
}

--  c_class_h.ads

with Interfaces.C; use Interfaces.C;
with Interfaces.C.Strings;

package c_class_h is
   package Class_Tester is
  type Tester is limited record
 null;
  end record;
  pragma Import (CPP, Tester);

  function New_Tester  --  Modified by hand
(a_num : unsigned := 5;
 a_className : Interfaces.C.Strings.chars_ptr
 := Interfaces.C.Strings.Null_Ptr)
 return Tester;  -- c_class.h:3
  pragma CPP_Constructor (New_Tester, "_ZN6TesterC1EjPc");
   end;
   use Class_Tester;
end c_class_h;

--  main.adb

with c_class_h; use c_class_h;
procedure Main is
   use Class_Tester;

   Ptr : access Tester := new Tester;  -- TEST
   pragma Unreferenced (Ptr);
begin
   null;
end main;

--  ada2cpp.gpr
project Ada2Cppc is
   for Languages use ("Ada", "C++");
   for Main use ("main.adb");

   package Naming is
 for Implementation_Suffix ("C++") use ".cc";
   end Naming;

   for Source_Dirs use (".");
   for Object_Dir use "obj";

   package Compiler is
  for Default_Switches ("ada") use
("-g", "-gnato", "-gnatwa", "-gnatQ", "-gnat05", "-gnatD");
   end Compiler;

   package Builder is
  for Default_Switches ("ada") use ("-g");
   end Builder;

   package Ide is
  for Compiler_Command ("ada") use "gnatmake";
  for Compiler_Command ("c") use "gcc";
   end Ide;
end Ada2Cppc;

Command: gprbuild -q -P ada2cppc.gpr; obj/main
Output:
 ctor Tester called 5:null

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-03  Javier Miranda  

* exp_disp.adb (Set_CPP_Constructors_Old): Handle constructor of
untagged type that has all its parameters with defaults and hence it
covers the default constructor.

Index: exp_disp.adb
===
--- exp_disp.adb(revision 192025)
+++ exp_disp.adb(working copy)
@@ -8459,6 +8459,8 @@
  P : Node_Id;
  Parms : List_Id;
 
+ Covers_Default_Constructor : Entity_Id := Empty;
+
   begin
  --  Look for the constructor entities
 
@@ -8490,7 +8492,8 @@
Make_Defining_Identifier (Loc,
  Chars (Defining_Identifier (P))),
  Parameter_Type =>
-   New_Copy_Tree (Parameter_Type (P;
+   New_Copy_Tree (Parameter_Type (P)),
+ Expression => New_Copy_Tree (Expression (P;
  Next (P);
   end loop;
end if;
@@ -8508,6 +8511,17 @@
Set_Convention (Init, Convention_CPP);
Set_Is_Public  (Init);
Set_Has_Completion (Init);
+
+   --  If this constructor has parameters and all its parameters
+   --  have defaults then it covers the default constructor. The
+   --  semantic analyzer ensures that only one constructor with
+   --  defaults covers the default constructor.
+
+   if Present (Parameter_Specifications (Parent (E)))
+ and then Needs_No_Actuals (E)
+   then
+  Covers_Default_Constructor := Init;
+   end if;
 end if;
 
 Next_Entity (E);
@@ -8519,6 +8533,49 @@
  if not Found then
 Set_Is_Abstract_Type (Typ);
  end if;
+
+ --  Handle constructor that has all its parameters with defaults and
+ --  hence it covers the default constructor. We generate a wrapper IP
+ --  which calls the covering constructor.
+
+ if Present (Covers_Default_Constructor) then
+declare
+   Body_Stmts: List_Id;
+   Wrapper_Id: Entity_Id;
+   Wrapper_Body_Node : Node_Id;
+begin
+   Loc := Sloc (Covers_Default_Constructor);
+
+   Body_Stmts := New_List (
+ Make_Procedure_Call_Statement (Loc,
+   Name => New_Reference_To (Covers_Default_Constructor, Loc),
+   Parameter_Associations => New_List (
+ Make_Identifier (Loc, Name_uInit;
+
+   Wrapper_Id := Make_Defining_Identifier (Loc,
+

[Ada] Ada/C++ missing call to allocation of C++ object with defaults

2012-10-03 Thread Arnaud Charlet

When the type of an object is a CPP untagged type, the object is allocated
in memory through the "new" construct, and the object initialization
requires calling its C++ constructor passing defaults, the Ada compiler
did not generate the call to a C++ constructor which has all parameters with
defaults (and hence it covers the default C++ constructor). The following
test now executes well.

// c_class.h
class Tester {
  public:
Tester(unsigned int a_num = 5, char* a_className = 0);
};

// c_class.cc

#include "c_class.h"
#include 

Tester::Tester(unsigned int a_num, char* a_className) {
  std::cout << " ctor Tester called " << a_num << ":";

  if (a_className == 0) {
 std::cout << "null";
  } else {
 std::cout << a_className;
  }

  std::cout << std::endl;
}

-- c_class_h.ads
with Interfaces.C; use Interfaces.C;
with Interfaces.C.Strings;

package c_class_h is
   package Class_Tester is
  type Tester is limited record
 null;
  end record;
  pragma Import (CPP, Tester);

  function New_Tester
(a_num : unsigned := 5;
 a_className : Interfaces.C.Strings.chars_ptr
 := Interfaces.C.Strings.Null_Ptr)
 return Tester;  -- c_class.h:3
  pragma CPP_Constructor (New_Tester, "_ZN6TesterC1EjPc");
   end;
   use Class_Tester;
end c_class_h;


project Ada2Cppc is
   for Languages use ("Ada", "C++");
   for Main use ("main.adb");

   package Naming is
 for Implementation_Suffix ("C++") use ".cc";
   end Naming;

   for Source_Dirs use (".");

   package Compiler is
  for Default_Switches ("ada") use ("-gnat05");
   end Compiler;

   package Builder is
  for Default_Switches ("ada") use ("-g");
   end Builder;

   package Ide is
  for Compiler_Command ("ada") use "gnatmake";
  for Compiler_Command ("c") use "gcc";
   end Ide;
end Ada2Cppc;

Command: gprbuild -q -P ada2cppc.gpr; ./main
Output:
 ctor Tester called 5:null

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-03  Javier Miranda  

* exp_ch4.adb (Expand_N_Allocator_Expression): Minor code
reorganization and cleanup. Done to ensure proper management of
the C++ constructor covering tagged and untagged types and also
non-default constructors.
* exp_ch6.ads, exp_ch6.adb (Make_CPP_Constructor_Call_In_Allocator):
New subprogram.

Index: exp_ch4.adb
===
--- exp_ch4.adb (revision 192027)
+++ exp_ch4.adb (working copy)
@@ -867,6 +867,15 @@
--  Start of processing for Expand_Allocator_Expression
 
begin
+  --  Handle call to C++ constructor
+
+  if Is_CPP_Constructor_Call (Exp) then
+ Make_CPP_Constructor_Call_In_Allocator
+   (Allocator => N,
+Function_Call => Exp);
+ return;
+  end if;
+
   --  In the case of an Ada 2012 allocator whose initial value comes from a
   --  function call, pass "the accessibility level determined by the point
   --  of call" (AI05-0234) to the function. Conceptually, this belongs in
@@ -899,59 +908,7 @@
   --  Case of tagged type or type requiring finalization
 
   if Is_Tagged_Type (T) or else Needs_Finalization (T) then
- if Is_CPP_Constructor_Call (Exp) then
 
---  Generate:
---Pnnn : constant ptr_T := new (T);
---Init (Pnnn.all,...);
-
---  Allocate the object without an expression
-
-Node := Relocate_Node (N);
-Set_Expression (Node, New_Reference_To (Etype (Exp), Loc));
-
---  Avoid its expansion to avoid generating a call to the default
---  C++ constructor.
-
-Set_Analyzed (Node);
-
-Temp := Make_Temporary (Loc, 'P', N);
-
-Temp_Decl :=
-  Make_Object_Declaration (Loc,
-Defining_Identifier => Temp,
-Constant_Present=> True,
-Object_Definition   => New_Reference_To (PtrT, Loc),
-Expression  => Node);
-Insert_Action (N, Temp_Decl);
-
-Apply_Accessibility_Check (Temp);
-
---  Locate the enclosing list and insert the C++ constructor call
-
-declare
-   P : Node_Id;
-
-begin
-   P := Parent (Node);
-   while not Is_List_Member (P) loop
-  P := Parent (P);
-   end loop;
-
-   Insert_List_After_And_Analyze (P,
- Build_Initialization_Call (Loc,
-   Id_Ref  =>
- Make_Explicit_Dereference (Loc,
-   Prefix => New_Reference_To (Temp, Loc)),
-   Typ => Etype (Exp),
-   Constructor_Ref => Exp));
-end;
-
-Rewrite (N, New_Reference_To (Temp, Loc));
-Analyze_And_Resolve (N, PtrT);
-return;
- end if;
-
  --

[Ada] Further fixes to MINIMIZED overflow checking mode

2012-10-03 Thread Arnaud Charlet

This patch has three components. First we set SUPPRESSED mode as the
default for -gnatg mode (including the run-time). This has no effect
right now, but avoids hidden problems that may appear if in future
we make overflow checking the default behavior.

Second, in some obscure cases, the overflow checking code was
generating bogus warnings for redundant type conversions from
generated overflow checking code in MINIMIZED/ELIMINATED mode.
No simple test has been found for this, and the conditions
are quite obscure (and not well understood!) but the patch
decisively eliminates such bogus warnings.

Finally, we properly handle the case where the result is
converted to a larger type. The following test program:

 1. with Text_IO; use Text_IO;
 2. procedure convov is
 3.function a (x : integer)
 4.  return Long_Long_Integer
 5.is
 6.begin
 7.   return Long_Long_Integer (x * x);
 8.end;
 9. begin
10.Put_Line (a (Integer'Last)'Img);
11. end;

when run in -gnato2 or -gnato3 mode, compiles quietly
and outputs a single line:

 4611686014132420609

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-03  Robert Dewar  

* checks.adb (Minimize_Eliminate_Overflow_Checks): Properly
handle case of top level expression within type conversion
* gnat1drv.adb (Adjust_Global_Switches): Set SUPPRESSED as
default for overflow checking for -gnatg mode (includes run-time).
* sem_res.adb (Resolve_Type_Conversion): Avoid bogus warnings
about redundant conversions from MINIMIZED/EXTENDED mode checking

Index: checks.adb
===
--- checks.adb  (revision 192027)
+++ checks.adb  (working copy)
@@ -7404,6 +7404,16 @@
 
   elsif Top_Level
 and then not (Bignum_Operands or Long_Long_Integer_Operands)
+
+--  One further refinement. If we are at the top level, but our parent
+--  is a type conversion, then go into bignum or long long integer node
+--  since the result will be converted to that type directly without
+--  going through the result type, and we may avoid an overflow. This
+--  is the case for example of Long_Long_Integer (A ** 4), where A is
+--  of type Integer, and the result A ** 4 fits in Long_Long_Integer
+--  but does not fit in Integer.
+
+and then Nkind (Parent (N)) /= N_Type_Conversion
   then
  --  Here we will keep the original types, but we do need an overflow
  --  check, so we will set Do_Overflow_Check to True (actually it is
@@ -7561,12 +7571,6 @@
 
   if Nkind (N) = N_Op_Expon and then Etype (Right_Opnd (N)) = LLIB then
  Convert_To_And_Rewrite (Standard_Natural, Right_Opnd (N));
-
- --  Now Long_Long_Integer_Operands may have to be reset if that was
- --  the only long long integer operand, i.e. we now have long long
- --  integer operands only if the left operand is long long integer.
-
- Long_Long_Integer_Operands := Etype (Left_Opnd (N)) = LLIB;
   end if;
 
   --  Here we will do the operation in Long_Long_Integer. We do this even
Index: sem_res.adb
===
--- sem_res.adb (revision 192025)
+++ sem_res.adb (working copy)
@@ -9624,6 +9624,13 @@
 then
null;
 
+--  Never warn on conversion to Long_Long_Integer'Base since
+--  that is most likely an artifact of the extended overflow
+--  checking and comes from complex expanded code.
+
+elsif Orig_T = Base_Type (Standard_Long_Long_Integer) then
+   null;
+
 --  Here we give the redundant conversion warning. If it is an
 --  entity, give the name of the entity in the message. If not,
 --  just mention the expression.
Index: gnat1drv.adb
===
--- gnat1drv.adb(revision 192025)
+++ gnat1drv.adb(working copy)
@@ -334,6 +334,12 @@
   if Opt.Suppress_Options.Overflow_Checks_General /= Not_Set then
  null;
 
+  --  By default suppress overflow checks in -gnatg mode
+
+  elsif GNAT_Mode then
+ Suppress_Options.Overflow_Checks_General:= Suppressed;
+ Suppress_Options.Overflow_Checks_Assertions := Suppressed;
+
   --  If we have backend divide and overflow checks, then by default
   --  overflow checks are minimized, which is a reasonable setting.

Re: [C++ PATCH] -Wsizeof-pointer-memaccess warning (take 2)

2012-10-03 Thread Dodji Seketeli

Jakub Jelinek  a écrit:

> --- gcc/cp/call.c.jj  2012-09-27 12:45:49.0 +0200
> +++ gcc/cp/call.c 2012-10-01 17:53:17.594609236 +0200
> @@ -557,7 +557,10 @@ null_ptr_cst_p (tree t)
>  {
>/* Core issue 903 says only literal 0 is a null pointer constant.  */
>if (cxx_dialect < cxx0x)
> - t = integral_constant_value (t);
> + {
> +   t = integral_constant_value (t);
> +   t = maybe_constant_value (t);
> + }

Just for my education, why couldn't maybe_constant_value just call
integral_constant_value, so that we just use maybe_constant_value here?

Even after reading the comments of these two functions, /having/ to use
them like this seems confusing to the casual reader in me.

-- 
Dodji

Re: PATCH trunk: gengtype honoring mark_hook-s inside struct inide union-s

2012-10-03 Thread Laurynas Biveinis

Hello Basile -

> 2012-10-02  Basile Starynkevitch  
>
> * gengtype.c (walk_type): Emit mark_hook when inside a
>   struct of a union member.

Can you send me off-list the gengtype output before and after the fix?

+   const char *structmarkhook = NULL;

struct_mark_hook or just mark_hook please.

+   oprintf (d->of, "%*s/*structmarkhook %s */ %s (&%s));\n",
+d->indent, "", t->u.s.tag,  structmarkhook, oldval);

No need to output the comment.

Thanks,
-- 
Laurynas

Re: [C++ PATCH] -Wsizeof-pointer-memaccess warning (take 2)

2012-10-03 Thread Jakub Jelinek

On Wed, Oct 03, 2012 at 11:14:37AM +0200, Dodji Seketeli wrote:
> Jakub Jelinek  a écrit:
> 
> > --- gcc/cp/call.c.jj2012-09-27 12:45:49.0 +0200
> > +++ gcc/cp/call.c   2012-10-01 17:53:17.594609236 +0200
> > @@ -557,7 +557,10 @@ null_ptr_cst_p (tree t)
> >  {
> >/* Core issue 903 says only literal 0 is a null pointer constant.  */
> >if (cxx_dialect < cxx0x)
> > -   t = integral_constant_value (t);
> > +   {
> > + t = integral_constant_value (t);
> > + t = maybe_constant_value (t);
> > +   }
> 
> Just for my education, why couldn't maybe_constant_value just call
> integral_constant_value, so that we just use maybe_constant_value here?

Perhaps maybe_constant_value already does all that integral_constant_value
does, but I'm not sure about it.  maybe_constant_value starts with a lot of
conditions where it doesn't do anything if they are true,
integral_constant_value doesn't have any such conditions, on the other side
performs far less - just loops through DECL_INITIAL values looking for a
const.  As I said earlier, while the initial hacks around early folding
of SIZEOF_EXPR were quite ugly, the amount of changes to allow late folding
of SIZEOF_EXPR was huge, with user visible changes etc.

Jakub

RE: [PATCH, i386]: Fix PR51109, symbol size in scheduler state machine is reduced

2012-10-03 Thread Gopalasubramanian, Ganesh

Testing was done before posting the patch. It was successful.

Regards
Ganesh

-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Thursday, September 27, 2012 5:57 PM
To: Gopalasubramanian, Ganesh
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH, i386]: Fix PR51109, symbol size in scheduler state machine 
is reduced

On Thu, Sep 27, 2012 at 10:30 AM, Gopalasubramanian, Ganesh 
 wrote:

> This is a fix for PR 51109.
>
> There are three changes
>
> 1.  Microcoded instructions are considered as single issue instructions 
> and are therefore issued to a separate execution unit.
> 2.  The multiplier unit is attached to execution unit 1 (ieu1). Since ieu 
> is handled as a separate automaton in the patch, separate "mult" automaton is 
> not required.
> 3.  The integer execution units (2AGUs and 2EXs) are now decoupled. Now, 
> they are described as separate automatons.
>
> Is it OK for upstream?
>
> Regards
> Ganesh
>
> 2012-09-27  Ganesh Gopalasubramanian  
> 
>
> PR 51109
> * gcc/config/i386/bdver1.md (bdver1_int): Automaton has been
> split to reduce state transitions.

OK for mainline, if tested according to [1].

[1] http://gcc.gnu.org/contribute.html#testing

Thanks,
Uros.

Re: [patch][lra] Use XNEWVEC and friends instead of xmalloc/xrealloc, and add some timevars

2012-10-03 Thread Steven Bosscher

On Mon, Oct 1, 2012 at 1:05 PM, Steven Bosscher  wrote:
> Hello,
>
> This patch uses the libiberty new-like operators instead of using
> xmalloc/xrealloc.
> It also adds timevars for the main LRA phases, and it fixes a warning
> suggesting a space before a ';' in an only-looping for loop.
>
> Bootstrapped (lra-branch, of course) on x86_64-unknown-linux-gnu. OK?

Hi Vlad,

Is this patch OK for the lra-branch?

Ciao!
Steven

Re: PATCH trunk: gengtype honoring mark_hook-s inside struct inside union-s

2012-10-03 Thread Basile Starynkevitch

On Wed, Oct 03, 2012 at 12:21:02PM +0300, Laurynas Biveinis wrote:
> Hello Basile -
> 
> > 2012-10-02  Basile Starynkevitch  
> >
> > * gengtype.c (walk_type): Emit mark_hook when inside a
> >   struct of a union member.
> 
> Can you send me off-list the gengtype output before and after the fix?

I messed something, the example I did send was wrong. Let's start all over 
again.

Consider the following file 
(I named this example on purpose differently than in the previous message)

/* file basilemarkh.h */

   struct  GTY ((mark_hook("mymarker"))) mytest_st {
 int myflag;
 tree mytree;
 gimple mygimple;
   };

   #define MYUTAG 1
   union GTY ((desc ("%0.u_int"))) myutest_un
   {
 int GTY ((skip)) u_int;
 struct mytest_st GTY ((tag ("MYUTAG"))) u_mytest;
   };

   static GTY (()) union myutest_un *myutestptr;

   static inline void mymarker (struct mytest_st *s)
   {
 s->myflag = 1;
   }

   /* eof basilemarkh.h */
###

With the gengtype from 4.7 in Debian/Sid/AMD64 (a 4.7.2 I believe) 
/usr/lib/gcc/x86_64-linux-gnu/4.7/gengtype  -D -v \
   -r /usr/lib/gcc/x86_64-linux-gnu/4.7/gtype.state \
   -P _g-4.7-basilemarkh.h basilemarkh.h

 from generated  _g-4.7-basilemarkh.h 
/* functions code */

void
gt_ggc_mx_myutest_un (void *x_p)
{
  union myutest_un * const x = (union myutest_un *)x_p;
  if (ggc_test_and_set_mark (x))
{
  switch ((*x).u_int)
{
case MYUTAG:  ## no mark hook here, should have a call to 
mymarker
  gt_ggc_m_9tree_node ((*x).u_mytest.mytree);
  gt_ggc_m_18gimple_statement_d ((*x).u_mytest.mygimple);
  break;
default:
  break;
}
}
}
### end of excerpt from generated  _g-4.7-basilemarkh.h 

With the gengtype from unpatched trunk svn rev 192031 the emitted 
gt_ggc_mx_myutest_un is exactly the same.
Of course the  ### comment above has been added manually by me Basile.

So I applied and I am proposing the following patch to gcc trunk 192031
(Laurynas, I did take your remarks into account)
# patch to trunk
Index: gcc/gengtype.c
===
--- gcc/gengtype.c  (revision 192031)
+++ gcc/gengtype.c  (working copy)
@@ -2810,6 +2810,7 @@ walk_type (type_p t, struct walk_type_data *d)
const char *oldval = d->val;
const char *oldprevval1 = d->prev_val[1];
const char *oldprevval2 = d->prev_val[2];
+   const char *struct_mark_hook = NULL;
const int union_p = t->kind == TYPE_UNION;
int seen_default_p = 0;
options_p o;
@@ -2833,7 +2834,14 @@ walk_type (type_p t, struct walk_type_data *d)
  if (!desc && strcmp (o->name, "desc") == 0
  && o->kind == OPTION_STRING)
desc = o->info.string;
+ else if (!struct_mark_hook && strcmp (o->name, "mark_hook") == 0
+  && o->kind == OPTION_STRING)
+   struct_mark_hook = o->info.string;
 
+   if (struct_mark_hook) 
+   oprintf (d->of, "%*s%s (&%s));\n",
+d->indent, "", struct_mark_hook, oldval);
+ 
d->prev_val[2] = oldval;
d->prev_val[1] = oldprevval2;
if (union_p)

# end of patch

with the following ChangeLog entry
# gcc/ChangeLog entry
2012-10-03  Basile Starynkevitch  

* gengtype.c (walk_type): Emit mark_hook when inside a
  struct of a union member.

# end of gcc/ChangeLog entry

If you apply my proposed patch the emitted function is
### from generated _gtrunk-basilemarkh.h with my patch to gengtype.c
/* functions code */

void
gt_ggc_mx_myutest_un (void *x_p)
{
  union myutest_un * const x = (union myutest_un *)x_p;
  if (ggc_test_and_set_mark (x))
{
  switch ((*x).u_int)
{
case MYUTAG:
  mymarker (&(*x).u_mytest));
  gt_ggc_m_9tree_node ((*x).u_mytest.mytree);
  gt_ggc_m_18gimple_statement_d ((*x).u_mytest.mygimple);
  break;
default:
  break;
}
}
}
 end of excerpt from generated  _gtrunk-basilemarkh.h


If you just use the gengtype from trunk 192031 without my patch
you get the same code as from gcc-4.7's gengtype:
 from generated _g192031-basilemarkh.h 
void
gt_ggc_mx_myutest_un (void *x_p)
{
  union myutest_un * const x = (union myutest_un *)x_p;
  if (ggc_test_and_set_mark (x))
{
  switch ((*x).u_int)
{
case MYUTAG:
  gt_ggc_m_9tree_node ((*x).u_mytest.mytree);
  gt_ggc_m_18gimple_statement_d ((*x).u_mytest.mygimple);
  break;
default:
  break;
}
}
}
### end of excerpt from generated _g192031-basilemarkh.h


Is my patch here ok for trunk?

Regards.
-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement l

Re: [PATCH v2, rtl-optimization]: Fix PR54457, [x32] Fail to combine 64bit index + constant

2012-10-03 Thread Paolo Bonzini

Il 30/09/2012 11:02, Richard Sandiford ha scritto:
> Uros Bizjak  writes:
>> On Thu, Sep 27, 2012 at 8:20 PM, Jakub Jelinek  wrote:
>>> On Thu, Sep 27, 2012 at 08:04:58PM +0200, Uros Bizjak wrote:
 After some off-line discussion with Richard, attached is v2 of the patch.

 2012-09-27  Uros Bizjak  

 PR rtl-optimization/54457
 * simplify-rtx.c (simplify_subreg):
   Simplify (subreg:SI (op:DI ((x:DI) (y:DI)), 0)
   to (op:SI (subreg:SI (x:DI) 0) (subreg:SI (x:DI) 0)).
>>>
>>> Is that a good idea even for WORD_REGISTER_OPERATIONS targets?
>>
>> I have bootstrapped and regtested [1] the patch on
>> alphaev68-pc-linux-gnu, a WORD_REGISTER_OPERATIONS target, and there
>> were no additional failures.
> 
> Thanks.  Given Jakub's question/concern, I'd really prefer a third
> opinion rather than approving it myself, but... OK if no-one objects
> within 24hrs.

I used to have a patch doing roughly the same thing (for more operations
but not MULT).  I never submitted because I didn't have the time to
audit all targets after changing the canonicalization.

Perhaps you can take the best of both worlds.

Paolo
change canonicalization

Index: gcc/Makefile.in
===
--- gcc/Makefile.in	(branch pr39726)
+++ gcc/Makefile.in	(working copy)
@@ -2844,7 +2844,7 @@ jump.o : jump.c $(CONFIG_H) $(SYSTEM_H) 
 simplify-rtx.o : simplify-rtx.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
$(RTL_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) $(REAL_H) insn-config.h \
$(RECOG_H) $(EXPR_H) $(TOPLEV_H) output.h $(FUNCTION_H) $(GGC_H) $(TM_P_H) \
-   $(TREE_H) $(TARGET_H)
+   $(TREE_H) $(TARGET_H) $(OPTABS_H)
 cgraph.o : cgraph.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
langhooks.h $(TOPLEV_H) $(FLAGS_H) $(GGC_H) $(TARGET_H) $(CGRAPH_H) \
gt-cgraph.h output.h intl.h $(BASIC_BLOCK_H) debug.h $(HASHTAB_H) \
Index: gcc/simplify-rtx.c
===
--- gcc/simplify-rtx.c	(branch pr39726)
+++ gcc/simplify-rtx.c	(working copy)
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.  
 #include "output.h"
 #include "ggc.h"
 #include "target.h"
+#include "optabs.h"
 
 /* Simplification and canonicalization of RTL.  */
 
@@ -5191,6 +5192,8 @@ rtx
 simplify_subreg (enum machine_mode outermode, rtx op,
 		 enum machine_mode innermode, unsigned int byte)
 {
+  int is_lowpart;
+
   /* Little bit of sanity checking.  */
   gcc_assert (innermode != VOIDmode);
   gcc_assert (outermode != VOIDmode);
@@ -5300,11 +5303,13 @@ simplify_subreg (enum machine_mode outer
   return NULL_RTX;
 }
 
+  is_lowpart = subreg_lowpart_offset (outermode, innermode) == byte;
+
   /* Merge implicit and explicit truncations.  */
 
   if (GET_CODE (op) == TRUNCATE
   && GET_MODE_SIZE (outermode) < GET_MODE_SIZE (innermode)
-  && subreg_lowpart_offset (outermode, innermode) == byte)
+  && is_lowpart)
 return simplify_gen_unary (TRUNCATE, outermode, XEXP (op, 0),
 			   GET_MODE (XEXP (op, 0)));
 
@@ -5343,7 +5348,7 @@ simplify_subreg (enum machine_mode outer
 	 The information is used only by alias analysis that can not
 	 grog partial register anyway.  */
 
-	  if (subreg_lowpart_offset (outermode, innermode) == byte)
+	  if (is_lowpart)
 	ORIGINAL_REGNO (x) = ORIGINAL_REGNO (op);
 	  return x;
 	}
@@ -5393,6 +5398,51 @@ simplify_subreg (enum machine_mode outer
   return NULL_RTX;
 }
 
+  /* Try to move a subreg inside an arithmetic operation.  */
+  if (is_lowpart && ARITHMETIC_P (op)
+  && GET_MODE_CLASS (outermode) == MODE_INT
+  && GET_MODE_CLASS (innermode) == MODE_INT)
+{
+  enum insn_code ic;
+  enum machine_mode cnt_mode;
+  switch (GET_CODE (op))
+	{
+	case ABS:
+	case NEG:
+	  return simplify_gen_unary (GET_CODE (op), outermode,
+ rtl_hooks.gen_lowpart_no_emit
+  (outermode, XEXP (op, 0)),
+ outermode);
+
+	case ASHIFT:
+	  /* i386 always uses QImode for the shift count.  Get the
+	 appropriate mode from the optab.  */
+	  ic = ashl_optab->handlers[outermode].insn_code;
+	  if (ic != CODE_FOR_nothing)
+	cnt_mode = insn_data[ic].operand[2].mode;
+	  else
+	cnt_mode = outermode;
+	  return simplify_gen_binary (GET_CODE (op), outermode,
+  rtl_hooks.gen_lowpart_no_emit
+   (outermode, XEXP (op, 0)),
+  rtl_hooks.gen_lowpart_no_emit
+   (cnt_mode, XEXP (op, 1)));
+
+	case PLUS:
+	case MINUS:
+	case AND:
+	case IOR:
+	case XOR:
+	  return simplify_gen_binary (GET_CODE (op), outermode,
+  rtl_hooks.gen_lowpart_no_emit
+   (outermode, XEXP (op, 0)),
+  rtl_hooks.gen_lowpart_no_emit
+   (outermode, XEXP (op, 1)));
+	default:
+	  break;
+	}
+}
+
   /* Optimize SUBREG truncations of zero and sign extended values.  */
   if ((GET_CODE (op) == ZERO_EXTEND
|| GET_CODE (op) == SIGN_EXTEND)
Index: gcc/combine.c

Re: RFC: LRA for x86/x86-64 [7/9] -- continuation

2012-10-03 Thread Richard Sandiford

Hi Vlad,

Some comments on lra-spills.c and lra-coalesce.c.

> +   The pass creates necessary stack slots and assign spilled pseudos
> +   to the stack slots in following way:

s/assign/assigns/

> +   (or insn memory constraints) might be not satisfied any more.

s/might be not/might not be/

> +   For some targets, the pass can spill some pseudos into hard
> +   registers of different class (usually into vector registers)
> +   instead of spilling them into memory if it is possible and
> +   profitable.   Spilling GENERAL_REGS pseudo into SSE registers for
> +   modern Intel x86/x86-64 processors is an example of such
> +   optimization.  And this is actually recommended by Intel
> +   optimization guide.

Maybe mention core i7 specifically?  "Modern" is a bit dangerous
in code that'll live a long time.

> +/* The structure describes a stack slot which can be used for several
> +   spilled pseudos.  */
> +struct slot
> +{

Looks like this describes "a register or stack slot" given the hard_regno case.

> +/* Array containing info about the stack slots.   The array element is
> +   indexed by the stack slot number in the range [0..slost_num).  */

Typo: slots_num

> +  /* Each pseudo has an inherent size which comes from its own mode,
> + and a total size which provides room for paradoxical subregs
> + which refer to the pseudo reg in wider modes.
> + 
> + We can use a slot already allocated if it provides both enough
> + inherent space and enough total space.  Otherwise, we allocate a
> + new slot, making sure that it has no less inherent space, and no
> + less total space, then the previous slot.   */

The second part of the comment seems a bit misplaced, since the following
code doesn't reuse stack slots.  This is done elsewhere instead.
Maybe the first part would be better above the inherent_size assignment.

> +  /* If we have any adjustment to make, or if the stack slot is the
> + wrong mode, make a new stack slot.   */
> +  x = adjust_address_nv (x, GET_MODE (regno_reg_rtx[i]), adjust);

We don't make a new slot here.

> +/* Sort pseudos according their slot numbers putting ones with smaller
> +   numbers first, or last when the frame pointer is not needed.   So
> +   pseudos with the first slot will be finally addressed with smaller
> +   address displacement.  */
> +static int
> +pseudo_reg_slot_compare (const void *v1p, const void *v2p)
> +{
> +  const int regno1 = *(const int *) v1p;
> +  const int regno2 = *(const int *) v2p;
> +  int diff, slot_num1, slot_num2;
> +  int total_size1, total_size2;
> +
> +  slot_num1 = pseudo_slots[regno1].slot_num;
> +  slot_num2 = pseudo_slots[regno2].slot_num;
> +  if ((diff = slot_num1 - slot_num2) != 0)
> +return (frame_pointer_needed
> + || !FRAME_GROWS_DOWNWARD == STACK_GROWS_DOWNWARD ? diff : -diff);

The comment doesn't quite describe the condition.  Maybe:

/* Sort pseudos according to their slots, putting the slots in the order
   that they should be allocated.  Slots with lower numbers have the highest
   priority and should get the smallest displacement from the stack or
   frame pointer (whichever is being used).

   The first allocated slot is always closest to the frame pointer,
   so prefer lower slot numbers when frame_pointer_needed.  If the stack
   and frame grow in the same direction, then the first allocated slot is
   always closest to the initial stack pointer and furthest away from the
   final stack pointer, so allocate higher numbers first when using the
   stack pointer in that case.  The reverse is true if the stack and
   frame grow in opposite directions.  */

> +  total_size1 = MAX (PSEUDO_REGNO_BYTES (regno1),
> +  GET_MODE_SIZE (lra_reg_info[regno1].biggest_mode));
> +  total_size2 = MAX (PSEUDO_REGNO_BYTES (regno2),
> +  GET_MODE_SIZE (lra_reg_info[regno2].biggest_mode));
> +  if ((diff = total_size2 - total_size1) != 0)
> +return diff;

I think this could do with a bit more commentary.  When is biggest_mode
ever smaller than PSEUDO_REGNO_BYTES?  Is that for pseudos that are only
ever referenced as lowpart subregs?  If so, why does PSEUDO_REGNO_BYTES
matter for those registers here but not when calculating biggest_mode?

> +/* Assign spill hard registers to N pseudos in PSEUDO_REGNOS.  Put the
> +   pseudos which did not get a spill hard register at the beginning of
> +   array PSEUDO_REGNOS.   Return the number of such pseudos.  */

It'd be worth saying that PSEUDO_REGNOS is sorted in order of highest
frequency first.

> +  bitmap set_jump_crosses = regstat_get_setjmp_crosses ();

I notice you use "set_jump" here and "setjump" in parts of 7a.patch.
Probably better to use setjmp across the board.

> +  /* Hard registers which can not be used for any purpose at given
> + program point because they are unallocatable or already allocated
> + for other pseudos.   */ 
> +  HARD_REG_SET *reserved_hard_regs;
> +
> +  if

Re: [C++ PATCH] Fix ICE with COMPOUND_EXPR in constexpr (PR c++/54777)

2012-10-03 Thread Jason Merrill


OK.

Jason

[PATCH] Improve debug info for partial inlining (PR debug/54519, take 2)

2012-10-03 Thread Jakub Jelinek

On Tue, Sep 11, 2012 at 03:59:56PM +0200, Jakub Jelinek wrote:
> As discussed in the PR, right now we do a very bad job for debug info
> of partially inlined functions (both when they are kept only partially
> inlined, or when partial inlining is performed, but doesn't seem to be
> useful and foo.part.N is inlined back, either into the original function, or
> into a function into which foo has been inlined first).
> 
> This patch improves that by doing something similar to what ipa-prop.c does,
> in particular for arguments that aren't actually passed to foo.part.N
> we add debug args and corresponding debug bind and debug source bind stmts
> to provide better debug info (if foo.part.N isn't inlined, then
> DW_OP_GNU_parameter_ref is going to be used together with corresponding call
> site arguments).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, some of the tests
> still fail with some option combinations, am going to file a DF VTA PR for
> that momentarily.  Ok for trunk?

Now that Alex fixed the DF VTA issue, most of the non-LTO FAILs are gone,
the remaining one on x86_64 looks like a GDB bug, and there is one -Os
failure on pr54519-3.c which could be either alias oracle or var-tracking
bug/missing feature - basically there is a non-addressable parameter
in stack slot, and vt_canon_true_dep -> canon_true_dependence thinks an
argument push insn might alias with it, because it doesn't have a MEM_EXPR
and ao_ref_from_mem fails.

Posting an updated patch, because last night I found that even when the
patch should have fixed
static inline void foo (int x, int y) { asm volatile ("nop"); }
static inline void bar (int z) { foo (z, 0); foo (z, 1); }
int main ()
{
  bar (0);
  bar (1);
  return 0;
}
it didn't, there was a confusion when DECL_ORIGIN should be used and when
not.  This version of the patch fixes that, on this testcase (adjusted
added as pr54519-6.c) p x, p y and up; p z now work well.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-10-03  Jakub Jelinek  

PR debug/54519
* ipa-split.c (split_function): Add debug args and
debug source and normal stmts for args_to_skip which are
gimple regs.
* tree-inline.c (copy_debug_stmt): When inlining, adjust
source debug bind stmts to debug binds of corresponding
DEBUG_EXPR_DECL.

* gcc.dg/guality/pr54519-1.c: New test.
* gcc.dg/guality/pr54519-2.c: New test.
* gcc.dg/guality/pr54519-3.c: New test.
* gcc.dg/guality/pr54519-4.c: New test.
* gcc.dg/guality/pr54519-5.c: New test.
* gcc.dg/guality/pr54519-6.c: New test.

--- gcc/ipa-split.c.jj  2012-09-25 14:26:52.612821323 +0200
+++ gcc/ipa-split.c 2012-10-02 17:41:31.030357922 +0200
@@ -1059,6 +1059,7 @@ split_function (struct split_point *spli
   gimple last_stmt = NULL;
   unsigned int i;
   tree arg, ddef;
+  VEC(tree, gc) **debug_args = NULL;

   if (dump_file)
 {
@@ -1232,6 +1233,65 @@ split_function (struct split_point *spli
   gimple_set_block (call, DECL_INITIAL (current_function_decl));
   VEC_free (tree, heap, args_to_pass);

+  if (args_to_skip)
+for (parm = DECL_ARGUMENTS (current_function_decl), num = 0;
+parm; parm = DECL_CHAIN (parm), num++)
+  if (bitmap_bit_p (args_to_skip, num)
+ && is_gimple_reg (parm))
+   {
+ tree ddecl;
+ gimple def_temp;
+
+ arg = get_or_create_ssa_default_def (cfun, parm);
+ if (!MAY_HAVE_DEBUG_STMTS)
+   continue;
+ if (debug_args == NULL)
+   debug_args = decl_debug_args_insert (node->symbol.decl);
+ ddecl = make_node (DEBUG_EXPR_DECL);
+ DECL_ARTIFICIAL (ddecl) = 1;
+ TREE_TYPE (ddecl) = TREE_TYPE (parm);
+ DECL_MODE (ddecl) = DECL_MODE (parm);
+ VEC_safe_push (tree, gc, *debug_args, DECL_ORIGIN (parm));
+ VEC_safe_push (tree, gc, *debug_args, ddecl);
+ def_temp = gimple_build_debug_bind (ddecl, unshare_expr (arg),
+ call);
+ gsi_insert_after (&gsi, def_temp, GSI_NEW_STMT);
+   }
+  if (debug_args != NULL)
+{
+  unsigned int i;
+  tree var, vexpr;
+  gimple_stmt_iterator cgsi;
+  gimple def_temp;
+
+  push_cfun (DECL_STRUCT_FUNCTION (node->symbol.decl));
+  var = BLOCK_VARS (DECL_INITIAL (node->symbol.decl));
+  i = VEC_length (tree, *debug_args);
+  cgsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR));
+  do
+   {
+ i -= 2;
+ while (var != NULL_TREE
+&& DECL_ABSTRACT_ORIGIN (var)
+   != VEC_index (tree, *debug_args, i))
+   var = TREE_CHAIN (var);
+ if (var == NULL_TREE)
+   break;
+ vexpr = make_node (DEBUG_EXPR_DECL);
+ parm = VEC_index (tree, *debug_args, i);
+ DECL_ARTIFICIAL (vexpr) = 1;
+ TREE_TYPE (vexpr) = TREE_TYPE (parm);
+ DECL_MODE (vexpr) = DECL_MODE (parm)

[PATCH] Fix up DW_TAG_formal_parameter placement

2012-10-03 Thread Jakub Jelinek

Hi!

With the PR54519 patch I've just posted, I've noticed, I've noticed on the
same testcase from yesterday's IRC:
static inline void foo (int x, int y) { asm volatile ("nop"); }
static inline void bar (int z) { foo (z, 0); foo (z, 1); }
int main ()
{
  bar (0);
  bar (1);
  return 0;
}
that while I can print x and y just fine, if I do bt, x, y and z printed
in the backtrace are all optimized out.
The problem is that first tree versioning for foo.isra.* or bar.isra.*
deposits the optimized away parameters as VAR_DECLs into the DECL_INITIAL
block (which is fine), but then during inlining they end up in the remapped
block of DECL_INITIAL, not the new block added above it into which inliner
puts parameters.  So in the debug info we have
DW_TAG_inlined_subroutine
  DW_TAG_formal_parameter for non-optimized away parameters
  DW_TAG_lexical_block
DW_TAG_formal_parameter for optimized away parameters
and the debugger (expectably) looks only at DW_TAG_inlined_subroutine
DIE's immediate children for the formal parameters to print during
backtrace.
Fixed up by moving the VAR_DECLs for parameters optimized away by versioning
to BLOCK_SUPERCONTEXT during inlining, at that point we know both of the
blocks have the same scope, and if the original DECL_INITIAL doesn't contain
any other vars, we can actually end up with shorter/more correct debug info
as well as memory savings due to being able to GC the remapped DECL_INITIAL
block.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-10-03  Jakub Jelinek  

* tree-inline.c (expand_call_inline): Move VAR_DECLs with
PARM_DECL origins from remapped DECL_INITIAL's BLOCK_VARS
into id->block's BLOCK_VARS.

--- gcc/tree-inline.c.jj2012-10-02 17:43:13.0 +0200
+++ gcc/tree-inline.c   2012-10-02 19:43:52.576382413 +0200
@@ -3946,7 +3946,29 @@ expand_call_inline (basic_block bb, gimp
   initialize_inlined_parameters (id, stmt, fn, bb);
 
   if (DECL_INITIAL (fn))
-prepend_lexical_block (id->block, remap_blocks (DECL_INITIAL (fn), id));
+{
+  tree *var;
+
+  prepend_lexical_block (id->block, remap_blocks (DECL_INITIAL (fn), id));
+  gcc_checking_assert (BLOCK_SUBBLOCKS (id->block)
+  && (BLOCK_CHAIN (BLOCK_SUBBLOCKS (id->block))
+  == NULL_TREE));
+  /* Move vars for PARM_DECLs from DECL_INITIAL block to id->block,
+otherwise DW_TAG_formal_parameter will not be children of
+DW_TAG_inlined_subroutine, but of a DW_TAG_lexical_block
+under it.  The parameters can be then evaluated in the debugger,
+but don't show in backtraces.  */
+  for (var = &BLOCK_VARS (BLOCK_SUBBLOCKS (id->block)); *var; )
+   if (TREE_CODE (DECL_ORIGIN (*var)) == PARM_DECL)
+ {
+   tree v = *var;
+   *var = TREE_CHAIN (v);
+   TREE_CHAIN (v) = BLOCK_VARS (id->block);
+   BLOCK_VARS (id->block) = v;
+ }
+   else
+ var = &TREE_CHAIN (*var);
+}
 
   /* Return statements in the function body will be replaced by jumps
  to the RET_LABEL.  */


Jakub

[PATCH] Fix recently introduced -fcompare-debug failures in the scheduler (PR rtl-optimization/54792)

2012-10-03 Thread Jakub Jelinek

Hi!

While bootstrapping the PR54519 patches last night, I've noticed a
comparison failure on i686-linux.  The problem was -g vs. -g0
swapping two instructions in tree-inline.o, caused by find_modifiable_mems
scanning [head, tail) instead of [head, tail] sequence of instructions.
Without -g0 on tree-inline.c we had a MEM load as last insn in a bb (i.e.
tail) which wasn't processed by find_modifiable_mems, while with -g
there were some DEBUG_INSNs after it and find_mems was called on it
and triggerring the optimization.

Fixed by scanning also the tail insn, the same way as other spots in
sched-deps.c handle [head, tail] walk.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-10-03  Jakub Jelinek  

PR rtl-optimization/54792
* sched-deps.c (find_modifiable_mems): Scan also TAIL insn.

--- gcc/sched-deps.c.jj 2012-10-01 10:08:43.0 +0200
+++ gcc/sched-deps.c2012-10-03 11:35:57.215589982 +0200
@@ -4816,10 +4816,10 @@ find_mem (struct mem_inc_info *mii, rtx
 void
 find_modifiable_mems (rtx head, rtx tail)
 {
-  rtx insn;
+  rtx insn, next_tail = NEXT_INSN (tail);
   int success_in_block = 0;
 
-  for (insn = head; insn != tail; insn = NEXT_INSN (insn))
+  for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
 {
   struct mem_inc_info mii;
 

Jakub

RFC: Using DECL_NO_LIMIT_STACK as a backend specific flag

2012-10-03 Thread Nick Clifton

Hi Guys,

  Is there a target specific way to add a flag bit to a function decl ?

  For the RX port I want to be able to mark a function decl for which I
  have issued a warning message.  Unfortunately I could find no easy way
  to do this other than stealing one of the generic bits (no_limit_stack
  in this case).  Is there a proper way to do this ?

  I have attached my current implementation to demonstrate the problem.

Cheers
  Nick

Index: gcc/config/rx/rx.c
===
--- gcc/config/rx/rx.c  (revision 192034)
+++ gcc/config/rx/rx.c  (working copy)
@@ -1288,6 +1288,24 @@
   target_reinit ();
 }
 
+  if (current_is_fast_interrupt && rx_warn_multiple_fast_interrupts)
+{
+  static tree prev_fast_interrupt = NULL_TREE;
+
+  if (prev_fast_interrupt == NULL_TREE)
+   prev_fast_interrupt = fndecl;
+  else if (prev_fast_interrupt != fndecl
+  && ! DECL_NO_LIMIT_STACK (fndecl))
+   {
+ warning (0, "multiple fast interrupt routines seen: %qE and %qE",
+  fndecl, prev_fast_interrupt);
+ /* FIXME: We use the no_limit_stack bit in the tree_function_decl
+structure to mark functions for which we have issued this warning.
+There probably ought to be another way to do this.  */
+ DECL_NO_LIMIT_STACK (fndecl) = 1;
+   }
+}
+
   rx_previous_fndecl = fndecl;
 }
 
Index: gcc/config/rx/rx.opt
===
--- gcc/config/rx/rx.opt(revision 192034)
+++ gcc/config/rx/rx.opt(working copy)
@@ -118,3 +118,9 @@
 mpid
 Target Mask(PID)
 Enables Position-Independent-Data (PID) mode.
+
+;---
+
+mwarn-multiple-fast-interrupts
+Target Report Var(rx_warn_multiple_fast_interrupts) Init(1) Warning
+Warn when multiple, different, fast interrupt handlers are in the compilation 
unit.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 192034)
+++ gcc/doc/invoke.texi (working copy)
@@ -859,6 +859,7 @@
 -mmax-constant-size=@gol
 -mint-register=@gol
 -mpid@gol
+-mno-warn-multiple-fast-interrupts@gol
 -msave-acc-in-interrupts}
 
 @emph{S/390 and zSeries Options}
@@ -17875,6 +17876,15 @@
 By default this feature is not enabled.  The default can be restored
 via the @option{-mno-pid} command-line option.
 
+@item -mno-warn-multiple-fast-interrupts
+@itemx -mwarn-multiple-fast-interrupts
+@opindex mno-warn-multiple-fast-interrupts
+@opindex mwarn-multiple-fast-interrupts
+Prevents GCC from issuing a warning message if it finds more than one
+fast interrupt handler when it is compiling a file.  The default is to
+issue a warning for each extra fast interrupt handler found, as the RX
+only supports one such interrupt.
+
 @end table
 
 @emph{Note:} The generic GCC command-line option @option{-ffixed-@var{reg}}

Re: RFC: Using DECL_NO_LIMIT_STACK as a backend specific flag

2012-10-03 Thread Ian Lance Taylor

On Wed, Oct 3, 2012 at 6:31 AM, Nick Clifton  wrote:
>
>   Is there a target specific way to add a flag bit to a function decl ?
>
>   For the RX port I want to be able to mark a function decl for which I
>   have issued a warning message.  Unfortunately I could find no easy way
>   to do this other than stealing one of the generic bits (no_limit_stack
>   in this case).  Is there a proper way to do this ?

Can't you just keep a list of the decls for which you have issued the
warning?  How many decls do you expect to be on that list?

Ian

Re: [PATCH] Fix recently introduced -fcompare-debug failures in the scheduler (PR rtl-optimization/54792)

2012-10-03 Thread Vladimir Makarov


On 12-10-03 9:01 AM, Jakub Jelinek wrote:

Hi!

While bootstrapping the PR54519 patches last night, I've noticed a
comparison failure on i686-linux.  The problem was -g vs. -g0
swapping two instructions in tree-inline.o, caused by find_modifiable_mems
scanning [head, tail) instead of [head, tail] sequence of instructions.
Without -g0 on tree-inline.c we had a MEM load as last insn in a bb (i.e.
tail) which wasn't processed by find_modifiable_mems, while with -g
there were some DEBUG_INSNs after it and find_mems was called on it
and triggerring the optimization.

Fixed by scanning also the tail insn, the same way as other spots in
sched-deps.c handle [head, tail] walk.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-10-03  Jakub Jelinek  

PR rtl-optimization/54792
* sched-deps.c (find_modifiable_mems): Scan also TAIL insn.



Yes.  Thanks, Jakub.

Re: [patch][lra] Use XNEWVEC and friends instead of xmalloc/xrealloc, and add some timevars

2012-10-03 Thread Vladimir Makarov


On 12-10-03 6:53 AM, Steven Bosscher wrote:

On Mon, Oct 1, 2012 at 1:05 PM, Steven Bosscher  wrote:

Hello,

This patch uses the libiberty new-like operators instead of using
xmalloc/xrealloc.
It also adds timevars for the main LRA phases, and it fixes a warning
suggesting a space before a ';' in an only-looping for loop.

Bootstrapped (lra-branch, of course) on x86_64-unknown-linux-gnu. OK?

Hi Vlad,

Is this patch OK for the lra-branch?


Yes.  Thanks, Steven.

I found only one typo:

+DEFTIMEVAR (TV_LRA_ELIMINATE , "LRA virtuals eliminatenon")
^

Re: RFC: LRA for x86/x86-64 [7/9]

2012-10-03 Thread Vladimir Makarov


On 12-10-03 3:13 AM, Steven Bosscher wrote:

On Tue, Oct 2, 2012 at 3:42 PM, Richard Sandiford
 wrote:

+/* Compress pseudo live ranges by removing program points where
+   nothing happens.  Complexity of many algorithms in LRA is linear
+   function of program points number.  To speed up the code we try to
+   minimize the number of the program points here.  */
+static void
+remove_some_program_points_and_update_live_ranges (void)

Genuine question, but could we do this on the fly instead,
by not incrementing curr_point if the current point had no value?

I suppose the main complication would be checking cases where
all births are recorded by extending the previous just-closed live
range rather than starting a new one, in which case it's the previous
point that needs to be reused.  Hmm...

It does seem to be something worth investigating further. Things like:

Compressing live ranges: from 1742579 to 554532 - 31%
Compressing live ranges: from 1742569 to 73069 - 4%

look extreme, but they're actually the norm. For the same test case
(PR54146 still) but looking only at functions with initially 1000-
program points, you get this picture:

Compressing live ranges: from 1766 to 705 - 39%
Compressing live ranges: from 1449 to 370 - 25%
Compressing live ranges: from 3939 to 1093 - 27%
Compressing live ranges: from 3939 to 1093 - 27%
Compressing live ranges: from 3939 to 1093 - 27%
Compressing live ranges: from 3939 to 1093 - 27%
Compressing live ranges: from 2464 to 676 - 27%
Compressing live ranges: from 1433 to 379 - 26%
Compressing live ranges: from 1261 to 348 - 27%
Compressing live ranges: from 2835 to 755 - 26%
Compressing live ranges: from 5426 to 1678 - 30%
Compressing live ranges: from 5227 to 1477 - 28%
Compressing live ranges: from 1845 to 467 - 25%
Compressing live ranges: from 4868 to 1378 - 28%
Compressing live ranges: from 4875 to 1388 - 28%
Compressing live ranges: from 4882 to 1384 - 28%
Compressing live ranges: from 5688 to 1714 - 30%
Compressing live ranges: from 4943 to 1310 - 26%
Compressing live ranges: from 2976 to 792 - 26%
Compressing live ranges: from 5463 to 1526 - 27%
Compressing live ranges: from 2854 to 730 - 25%
Compressing live ranges: from 1810 to 745 - 41%
Compressing live ranges: from 2771 to 904 - 32%
Compressing live ranges: from 4916 to 1429 - 29%
Compressing live ranges: from 6505 to 2238 - 34%
Compressing live ranges: from 6493 to 166 - 2%
Compressing live ranges: from 5498 to 1734 - 31%
Compressing live ranges: from 1810 to 745 - 41%
Compressing live ranges: from 5043 to 1420 - 28%
Compressing live ranges: from 3094 to 788 - 25%
Compressing live ranges: from 4563 to 1311 - 28%
Compressing live ranges: from 4557 to 158 - 3%
Compressing live ranges: from 1050 to 274 - 26%
Compressing live ranges: from 1602 to 434 - 27%
Compressing live ranges: from 2474 to 600 - 24%
Compressing live ranges: from 2718 to 866 - 31%
Compressing live ranges: from 2097 to 716 - 34%
Compressing live ranges: from 4152 to 1099 - 26%
Compressing live ranges: from 5065 to 1514 - 29%
Compressing live ranges: from 1236 to 359 - 29%
Compressing live ranges: from 1722 to 541 - 31%
Compressing live ranges: from 5186 to 1401 - 27%

Unfortunately the dump doesn't mention how many live ranges could be
merged thanks to the compression.

It'd also be good to understand why the compression ratios are so
small, and consistently around ~30%.
This sentence is not clear to me.  30% means 3 times less points. It is 
pretty good compression.

  Maybe curr_point includes things
it should ignore (DEBUG_INSNs, NOTEs, ...)?

After the compression, there are only points important for conflict 
info, i.e. only points where some reg dies or start living.  Even more 
if on the subsequent points there are only life starts or only deaths,  
they are represented by one point after the compression.

Re: RFC: Using DECL_NO_LIMIT_STACK as a backend specific flag

2012-10-03 Thread nick clifton


Hi Ian,


Can't you just keep a list of the decls for which you have issued the
warning?


Yes - that would work too.  In fact I agree that this would be cleaner 
solution in my particular case.  I'll create a new patch...




How many decls do you expect to be on that list?


Not very many.  Maybe two or three at most.

But I am interested to know if there is a way for targets to add their 
own information to function decls (and decls in general).  If not for 
this particular case, then for problems to come in the future.


Cheers
  Nick

[patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-03 Thread Steven Bosscher

On Wed, Oct 3, 2012 at 4:56 PM, Vladimir Makarov  wrote:
> On 12-10-03 3:13 AM, Steven Bosscher wrote:
>>
>> On Tue, Oct 2, 2012 at 3:42 PM, Richard Sandiford
>>  wrote:

 +/* Compress pseudo live ranges by removing program points where
 +   nothing happens.  Complexity of many algorithms in LRA is linear
 +   function of program points number.  To speed up the code we try to
 +   minimize the number of the program points here.  */
 +static void
 +remove_some_program_points_and_update_live_ranges (void)
>>>
>>> Genuine question, but could we do this on the fly instead,
>>> by not incrementing curr_point if the current point had no value?
>>>
>>> I suppose the main complication would be checking cases where
>>> all births are recorded by extending the previous just-closed live
>>> range rather than starting a new one, in which case it's the previous
>>> point that needs to be reused.  Hmm...
>>
>> It does seem to be something worth investigating further. Things like:
>>
>> Compressing live ranges: from 1742579 to 554532 - 31%
>> Compressing live ranges: from 1742569 to 73069 - 4%
>>
>> look extreme, but they're actually the norm. For the same test case
>> (PR54146 still) but looking only at functions with initially 1000-
>> program points, you get this picture:
>>
>> Compressing live ranges: from 1766 to 705 - 39%
>> Compressing live ranges: from 1449 to 370 - 25%
>> Compressing live ranges: from 3939 to 1093 - 27%
>> Compressing live ranges: from 3939 to 1093 - 27%
>> Compressing live ranges: from 3939 to 1093 - 27%
>> Compressing live ranges: from 3939 to 1093 - 27%
>> Compressing live ranges: from 2464 to 676 - 27%
>> Compressing live ranges: from 1433 to 379 - 26%
>> Compressing live ranges: from 1261 to 348 - 27%
>> Compressing live ranges: from 2835 to 755 - 26%
>> Compressing live ranges: from 5426 to 1678 - 30%
>> Compressing live ranges: from 5227 to 1477 - 28%
>> Compressing live ranges: from 1845 to 467 - 25%
>> Compressing live ranges: from 4868 to 1378 - 28%
>> Compressing live ranges: from 4875 to 1388 - 28%
>> Compressing live ranges: from 4882 to 1384 - 28%
>> Compressing live ranges: from 5688 to 1714 - 30%
>> Compressing live ranges: from 4943 to 1310 - 26%
>> Compressing live ranges: from 2976 to 792 - 26%
>> Compressing live ranges: from 5463 to 1526 - 27%
>> Compressing live ranges: from 2854 to 730 - 25%
>> Compressing live ranges: from 1810 to 745 - 41%
>> Compressing live ranges: from 2771 to 904 - 32%
>> Compressing live ranges: from 4916 to 1429 - 29%
>> Compressing live ranges: from 6505 to 2238 - 34%
>> Compressing live ranges: from 6493 to 166 - 2%
>> Compressing live ranges: from 5498 to 1734 - 31%
>> Compressing live ranges: from 1810 to 745 - 41%
>> Compressing live ranges: from 5043 to 1420 - 28%
>> Compressing live ranges: from 3094 to 788 - 25%
>> Compressing live ranges: from 4563 to 1311 - 28%
>> Compressing live ranges: from 4557 to 158 - 3%
>> Compressing live ranges: from 1050 to 274 - 26%
>> Compressing live ranges: from 1602 to 434 - 27%
>> Compressing live ranges: from 2474 to 600 - 24%
>> Compressing live ranges: from 2718 to 866 - 31%
>> Compressing live ranges: from 2097 to 716 - 34%
>> Compressing live ranges: from 4152 to 1099 - 26%
>> Compressing live ranges: from 5065 to 1514 - 29%
>> Compressing live ranges: from 1236 to 359 - 29%
>> Compressing live ranges: from 1722 to 541 - 31%
>> Compressing live ranges: from 5186 to 1401 - 27%
>>
>> Unfortunately the dump doesn't mention how many live ranges could be
>> merged thanks to the compression.
>>
>> It'd also be good to understand why the compression ratios are so
>> small, and consistently around ~30%.
>
> This sentence is not clear to me.  30% means 3 times less points. It is
> pretty good compression.
>
>>   Maybe curr_point includes things
>> it should ignore (DEBUG_INSNs, NOTEs, ...)?
>>
> After the compression, there are only points important for conflict info,
> i.e. only points where some reg dies or start living.  Even more if on the
> subsequent points there are only life starts or only deaths,  they are
> represented by one point after the compression.

With this patch the amount of compression is reduced. Without the
patch the compression ratio is typically around 30%, with the patch
this goes up to ~65%. The worst compression ratios (where worse is
better in this case :-) are:

$ grep Compressing log4 | egrep " [78].%"
Compressing live ranges: from 31 to 22 - 70%, pre_count 17, post_count 15
Compressing live ranges: from 128 to 94 - 73%, pre_count 87, post_count 77
Compressing live ranges: from 32 to 23 - 71%, pre_count 16, post_count 15
Compressing live ranges: from 38 to 29 - 76%, pre_count 19, post_count 18
Compressing live ranges: from 40 to 28 - 70%, pre_count 26, post_count 24
Compressing live ranges: from 40 to 28 - 70%, pre_count 26, post_count 24
Compressing live ranges: from 40 to 28 - 70%, pre_count 26, post_count 24
Compressing live ranges: from 60 to 4

Re: [PATCH] Use __cxa_atexit on OpenBSD

2012-10-03 Thread Gerald Pfeifer

On Sat, 15 Sep 2012, Ian Lance Taylor wrote:
>> 2012-09-02  Mark Kettenis  
>>
>> * config.gcc (*-*-openbsd4.[3-9]|*-*-openbsd[5-9]*): Set
>> default_use_cxa_atexit to yes.
> This is OK.

I committed this to trunk and plan on doing so for the 4.7 branch
as well (so that OpenBSD can benefit from a release branch of GCC
carrying this) unless there are objections.

Gerald

Commit: RX: Warn about multiple fast interrupt routines.

2012-10-03 Thread Nick Clifton

Hi Guys,

  I am applying the patch below to the RX backend.  It adds support for
  issuing warning messages when multiple fast interrupt routines are
  defined in the same compilation unit.  The RX only supports one fast
  interrupt so it is probably a mistake to define more than one.  If the
  programmer really does want to define them then the command line
  -mno-warn-multiple-fast-interrupts can be used.  Of course it is
  possible to define multiple fast interrupt handlers in multiple
  compilation units and then link them together, but Renesas were not
  interested in solving this problem.

Cheers
  Nick

PS.  I have even remembered to include a patch to the html documentation
as well!


gcc/ChangeLog
2012-10-03  Nick Clifton  

* config/rx/rx.c (struct decl_chain): New local structure.
(warned_decls): New local variable.  Contains a stack of decls for
which warnings have been issued.
(add_warned_decl): Adds a decl to the stack.
(already_warned): Returns true if a given decl is on the stack.
(rx_set_current_function): Issue a warning if multiple fast
interrupt handlers are defined.
* config/rx/rx.opt (mwarn-multiple-fast-interrupts): New option.
* doc/invoke.texi: Document the option.

Index: gcc/config/rx/rx.c
===
--- gcc/config/rx/rx.c  (revision 192034)
+++ gcc/config/rx/rx.c  (working copy)
@@ -1256,6 +1256,41 @@
 }
 }
 
+struct decl_chain
+{
+  tree fndecl;
+  struct decl_chain * next;
+};
+
+/* Stack of decls for which we have issued warnings.  */
+static struct decl_chain * warned_decls = NULL;
+
+static void
+add_warned_decl (tree fndecl)
+{
+  struct decl_chain * warned = (struct decl_chain *) xmalloc (sizeof * warned);
+
+  warned->fndecl = fndecl;
+  warned->next = warned_decls;
+  warned_decls = warned;
+}
+
+/* Returns TRUE if FNDECL is on our list of warned about decls.  */
+
+static bool
+already_warned (tree fndecl)
+{
+  struct decl_chain * warned;
+
+  for (warned = warned_decls;
+   warned != NULL;
+   warned = warned->next)
+if (warned->fndecl == fndecl)
+  return true;
+
+  return false;
+}
+
 /* Perform any actions necessary before starting to compile FNDECL.
For the RX we use this to make sure that we have the correct
set of register masks selected.  If FNDECL is NULL then we are
@@ -1288,6 +1323,24 @@
   target_reinit ();
 }
 
+  if (current_is_fast_interrupt && rx_warn_multiple_fast_interrupts)
+{
+  /* We do not warn about the first fast interrupt routine that
+we see.  Instead we just push it onto the stack.  */
+  if (warned_decls == NULL)
+   add_warned_decl (fndecl);
+
+  /* Otherwise if this fast interrupt is one for which we have
+not already issued a warning, generate one and then push
+it onto the stack as well.  */
+  else if (! already_warned (fndecl))
+   {
+ warning (0, "multiple fast interrupt routines seen: %qE and %qE",
+  fndecl, warned_decls->fndecl);
+ add_warned_decl (fndecl);
+   }
+}
+
   rx_previous_fndecl = fndecl;
 }
 
Index: gcc/config/rx/rx.opt
===
--- gcc/config/rx/rx.opt(revision 192034)
+++ gcc/config/rx/rx.opt(working copy)
@@ -118,3 +118,9 @@
 mpid
 Target Mask(PID)
 Enables Position-Independent-Data (PID) mode.
+
+;---
+
+mwarn-multiple-fast-interrupts
+Target Report Var(rx_warn_multiple_fast_interrupts) Init(1) Warning
+Warn when multiple, different, fast interrupt handlers are in the compilation 
unit.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 192034)
+++ gcc/doc/invoke.texi (working copy)
@@ -859,6 +859,7 @@
 -mmax-constant-size=@gol
 -mint-register=@gol
 -mpid@gol
+-mno-warn-multiple-fast-interrupts@gol
 -msave-acc-in-interrupts}
 
 @emph{S/390 and zSeries Options}
@@ -17875,6 +17876,15 @@
 By default this feature is not enabled.  The default can be restored
 via the @option{-mno-pid} command-line option.
 
+@item -mno-warn-multiple-fast-interrupts
+@itemx -mwarn-multiple-fast-interrupts
+@opindex mno-warn-multiple-fast-interrupts
+@opindex mwarn-multiple-fast-interrupts
+Prevents GCC from issuing a warning message if it finds more than one
+fast interrupt handler when it is compiling a file.  The default is to
+issue a warning for each extra fast interrupt handler found, as the RX
+only supports one such interrupt.
+
 @end table
 
 @emph{Note:} The generic GCC command-line option @option{-ffixed-@var{reg}}

Index: htdocs/gcc-4.8/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.32
diff -u -3 -p -r1.32 changes.html
--- htdocs/gcc-4.8/changes.html 2 Oct 2012 18:

Re: Propagate profile counts during switch expansion

2012-10-03 Thread Xinliang David Li

What is the status of switch expansion GIMPLE rewrite? If it is not
planned for 4.8, It will be desirable to include this fix into trunk.
It also helps set up a good base line to test against regression.


thanks,

David

On Tue, Oct 2, 2012 at 6:09 PM, Easwaran Raman  wrote:
> Hi,
>  This patch propagates the profile counts during RTL expansion. In
> many cases, there is no way to determine the exact count of an edge
> generated during the expansion. So this patch uses some simple
> heuristics to estimate the edge counts but ensures that the counts of
> the basic blocks corresponding to the cases are (nearly the) same as
> at the gimple level.
>
> Bootstrapped and profile-bootstrapped on an x86_64/linux machine. OK for 
> trunk?
>
> - Easwaran
>
> --
> 2012-10-02   Easwaran Raman  
>
> * cfgbuild.c (gen_probabilities_from_existing_counts): New function.
> (compute_outgoing_frequencies): If at least one successor of a
> BB has non-zero profile count, use the counts to compute
> probabilities.
> * expr.c (do_tablejump): Add a REG_BR_PROB note on the
> jump to default label.
> (try_tablejump): Add a parameter to specify the probability
> of jumping to the default label.
> * expr.h (try_tablejump): Add a new parameter.
> * stmt.c (case_node): Add new fields COUNT and SUBTREE_COUNT.
> (do_jump_if_equal): Pass probability for REG_BR_PROB note.
> (add_case_node): Pass execution count of the case node and use
> it to initialize COUNT field.
> (emit_case_decision_tree): Pass default_count to emit_case_nodes.
> (get_outgoing_edge_counts): New function.
> (add_prob_note_to_last_insn): Likewise.
> (case_probability): New macro.
> (emit_case_dispatch_table): Compute probability of jumping to default
> label and apply note to the jump.
> (expand_case): Compute and propagate default edge count to
> emit_case_dispatch_table.
> (expand_sjlj_dispatch_table): Update calls to add_case_node and
> emit_case_dispatch_table.
> (balance_case_nodes): Update subtree_counts.
> (emit_case_nodes): Compute edge probabilities and add note.
>
> gcc/testsuite/ChangeLog:
> 2012-10-02   Easwaran Raman  
> * gcc.dg/tree-prof/switch-case-1.c: New test case.
> * gcc.dg/tree-prof/switch-case-2.c: New test case.
>
> Index: gcc/testsuite/gcc.dg/tree-prof/switch-case-1.c
> ===
> --- gcc/testsuite/gcc.dg/tree-prof/switch-case-1.c (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-prof/switch-case-1.c (revision 0)
> @@ -0,0 +1,40 @@
> +/* { dg-options "-O2 -fdump-rtl-expand-all" } */
> +int g;
> +
> +__attribute__((noinline)) void foo (int  n)
> +{
> +  switch (n)
> +{
> +case 1:
> +  g++; break;
> +case 2:
> +  g += 2; break;
> +case 3:
> +  g += 1; break;
> +case 4:
> +  g += 3; break;
> +case 5:
> +  g += 4; break;
> +case 6:
> +  g += 5; break;
> +case 7:
> +  g += 6; break;
> +case 8:
> +  g += 7; break;
> +case 9:
> +  g += 8; break;
> +default:
> +  g += 8; break;
> +   }
> +}
> +
> +int main ()
> +{
> + int i;
> + for (i = 0; i < 1; i++)
> +   foo ((i * i) % 5);
> + return 0;
> +}
> +/* { dg-final-use { scan-rtl-dump-times ";; basic block\[^\\n\]*count
> 4000" 2 "expand"} } */
> +/* { dg-final-use { scan-rtl-dump-times ";; basic block\[^\\n\]*count
> 2000" 1 "expand"} } */
> +/* { dg-final-use { cleanup-rtl-dump "expand" } } */
> Index: gcc/testsuite/gcc.dg/tree-prof/switch-case-2.c
> ===
> --- gcc/testsuite/gcc.dg/tree-prof/switch-case-2.c (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-prof/switch-case-2.c (revision 0)
> @@ -0,0 +1,40 @@
> +/* { dg-options "-O2 -fdump-rtl-expand-all" } */
> +int g;
> +
> +__attribute__((noinline)) void foo (int  n)
> +{
> +  switch (n)
> +{
> +case 99:
> +  g += 2; break;
> +case 1:
> +  g++; break;
> +case 100:
> +  g += 1; break;
> +case 4:
> +  g += 3; break;
> +case 5:
> +  g += 4; break;
> +case 6:
> +  g += 5; break;
> +case 7:
> +  g += 6; break;
> +case 8:
> +  g += 7; break;
> +case 9:
> +  g += 8; break;
> +default:
> +  g += 8; break;
> +   }
> +}
> +
> +int main ()
> +{
> + int i;
> + for (i = 0; i < 1; i++)
> +   foo ((i * i) % 5);
> + return 0;
> +}
> +/* { dg-final-use { scan-rtl-dump-times ";; basic block\[^\\n\]*count
> 4000" 2 "expand"} } */
> +/* { dg-final-use { scan-rtl-dump-times ";; basic block\[^\\n\]*count
> 2000" 1 "expand"} } */
> +/* { dg-final-use { cleanup-rtl-dump "expand" } } */
> Index: gcc/expr.c
> ===
> --- gcc/expr.c (revision 191879)
> +++ gcc/expr.c (working copy)
> @@ -154,7 +154,7 @@ static rtx do_store_flag (sepops, rtx, enum machin
>  #ifdef PUSH_ROUNDING
>  static void emit_single_push_insn (enum machine_mode, rtx, tree);
>  #endif
> -static void do_tablejump (rtx, enum machine_mode, rtx, rtx, rtx);
> +stati

[Patch, Fortran, OOP] PR 54784: [4.7/4.8 Regression] wrong code in polymorphic allocation with SOURCE

2012-10-03 Thread Janus Weil

Hi all,

here is a small patch for a wrong-code regression with polymorphic
allocation. The problem is that we falsely detect the allocation
variable to be a polymorphic array (although it is a scalar). For
further details see the PR, in particular comment 4.

Regtested on x86_64-unknown-linux-gnu. Ok for trunk and 4.7?

Cheers,
Janus


2012-10-03  Janus Weil  

PR fortran/54784
* trans-stmt.c (gfc_trans_allocate): Correctly determine the reference
to the _data component for polymorphic allocation with SOURCE.

2012-10-03  Janus Weil  

PR fortran/54784
* gfortran.dg/class_allocate_13.f90: New.


pr54784.diff
Description: Binary data


class_allocate_13.f90
Description: Binary data

Re: Convert more non-GTY htab_t to hash_table.

2012-10-03 Thread Lawrence Crowl

On 10/2/12, Ian Lance Taylor  wrote:
> On Oct 2, 2012 Lawrence Crowl  wrote:
> > On 10/2/12, Richard Guenther  wrote:
> > > You are changing a hashtable used by fold checking, did you
> > > test with fold checking enabled?
> >
> > I didn't know I had to do anything beyond the normal make check.
> > What do I do?
>
> Fold checking is not enabled by default because of high overhead
> and general pointlessness.  To enable it, when you run configure,
> use --enable-checking=yes,fold.

So, why have the feature if it is pointless?  Just curious.

-- 
Lawrence Crowl

Re: Convert more non-GTY htab_t to hash_table.

2012-10-03 Thread Jakub Jelinek

On Wed, Oct 03, 2012 at 09:54:45AM -0700, Lawrence Crowl wrote:
> On 10/2/12, Ian Lance Taylor  wrote:
> > On Oct 2, 2012 Lawrence Crowl  wrote:
> > > On 10/2/12, Richard Guenther  wrote:
> > > > You are changing a hashtable used by fold checking, did you
> > > > test with fold checking enabled?
> > >
> > > I didn't know I had to do anything beyond the normal make check.
> > > What do I do?
> >
> > Fold checking is not enabled by default because of high overhead
> > and general pointlessness.  To enable it, when you run configure,
> > use --enable-checking=yes,fold.
> 
> So, why have the feature if it is pointless?  Just curious.

It is not pointless, just seldom used.  The reason for the verification code
is to (very expensively) verify that fold_binary etc. doesn't modify
passed expressions in place, instead if it needs to make changes, it
allocates new trees.  It is not something for every day use
(unlike say tree or rtl checking which can be enabled on faster build boxes
for daily bootstraps on the trunk), but something to be checked once a year
or so (similarly to --enable-checking=valgrind, which is even more
expensive).

Jakub

Re: abs(long long)

2012-10-03 Thread Gabriel Dos Reis

On Tue, Oct 2, 2012 at 1:53 PM, Marc Glisse  wrote:

>> See what we did in c/cmath and c_global/cmath.
>
>
> Note that llabs is quite different from asin.

Is asin the function you took out of c/cmath?

>  __builtin_llabs generates an
> ABS_EXPR, which will later be expanded either to a special instruction or to
> a condition. It never generates a call to llabs (I am not sure exactly if
> Paolo's instructions to use llabs meant he wanted an actual library call).

distinction without difference.  Again see c/cmath.


> __builtin_asin on the other hand is never expanded inline (except maybe for
> special constant input like 0) and expands to a call to the library function
> asin.
>
> Would the attached patch be better, assuming it passes testing? For lldiv,
> there is no builtin (for good reason).
>
> * include/c_std/cstdlib (abs(long long)): Define with
> __builtin_llabs when we have long long.
>
> (abs(__int128)): Define when we have __int128.

This change is OK

> (div(long long, long long)): Use lldiv.

not this one.

>
>
> --
> Marc Glisse
> Index: cstdlib
> ===
> --- cstdlib (revision 191941)
> +++ cstdlib (working copy)
> @@ -128,21 +128,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>using ::strtod;
>using ::strtol;
>using ::strtoul;
>using ::system;
>  #ifdef _GLIBCXX_USE_WCHAR_T
>using ::wcstombs;
>using ::wctomb;
>  #endif // _GLIBCXX_USE_WCHAR_T
>
>inline long
> -  abs(long __i) { return labs(__i); }
> +  abs(long __i) { return __builtin_labs(__i); }
> +
> +#ifdef _GLIBCXX_USE_LONG_LONG
> +  inline long long
> +  abs(long long __x) { return __builtin_llabs (__x); }
> +#endif
> +
> +#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128)
> +  inline __int128
> +  abs(__int128 __x) { return __x >= 0 ? __x : -__x; }
> +#endif
>
>inline ldiv_t
>div(long __i, long __j) { return ldiv(__i, __j); }
>
>  _GLIBCXX_END_NAMESPACE_VERSION
>  } // namespace
>
>  #if _GLIBCXX_USE_C99
>
>  #undef _Exit
> @@ -161,29 +171,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
>using ::lldiv_t;
>  #endif
>  #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC
>extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN;
>  #endif
>  #if !_GLIBCXX_USE_C99_DYNAMIC
>using ::_Exit;
>  #endif
>
> -  inline long long
> -  abs(long long __x) { return __x >= 0 ? __x : -__x; }
> -
>  #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
>using ::llabs;
>
>inline lldiv_t
>div(long long __n, long long __d)
> -  { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; }
> +  { return ::lldiv (__n, __d); }
>
>using ::lldiv;
>  #endif
>
>  #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
>extern "C" long long int (atoll)(const char *) throw ();
>extern "C" long long int
>  (strtoll)(const char * __restrict, char ** __restrict, int) throw ();
>extern "C" unsigned long long int
>  (strtoull)(const char * __restrict, char ** __restrict, int) throw ();
> @@ -198,21 +205,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>  _GLIBCXX_END_NAMESPACE_VERSION
>  } // namespace __gnu_cxx
>
>  namespace std
>  {
>  #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
>using ::__gnu_cxx::lldiv_t;
>  #endif
>using ::__gnu_cxx::_Exit;
> -  using ::__gnu_cxx::abs;
>  #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
>using ::__gnu_cxx::llabs;
>using ::__gnu_cxx::div;
>using ::__gnu_cxx::lldiv;
>  #endif
>using ::__gnu_cxx::atoll;
>using ::__gnu_cxx::strtof;
>using ::__gnu_cxx::strtoll;
>using ::__gnu_cxx::strtoull;
>using ::__gnu_cxx::strtold;
>

Re: Propagate profile counts during switch expansion

2012-10-03 Thread Steven Bosscher

On Wed, Oct 3, 2012 at 6:12 PM, Xinliang David Li  wrote:
> What is the status of switch expansion GIMPLE rewrite? If it is not
> planned for 4.8, It will be desirable to include this fix into trunk.

I could work on it for GCC 4.8 (there's not a lot of work left to be
done for it now) but we haven't really decided yet where the pass
should be scheduled and I also would like to wait a bit to see how the
SJLJ changes work out.  So I talked about this with Easwaran, I think
his patch should be included into the trunk now. I'll adapt it for the
move to GIMPLE.

> It also helps set up a good base line to test against regression.

Agreed.

Ciao!
Steven

Re: [PATCH] PowerPC VLE port

2012-10-03 Thread Andrew Pinski

On Wed, Oct 3, 2012 at 6:59 AM, James Lemke  wrote:
> Ping..

> @@ -847,7 +1106,6 @@

>   && (DEFAULT_ABI != ABI_AIX || SYMBOL_REF_FUNCTION_P 
> (op))")))

>

>  ;; Return 1 if op is an operand that can be loaded via the GOT.

> -;; or non-special register register field no cr0

>  (define_predicate "got_operand"

>   (match_code "symbol_ref,const,label_ref"))


Most likely should be submitted and committed (as obvious) separately.
> @@ -6694,7 +6771,7 @@

>

>   switch (mode)

> {

>-  case  QImode:

>+case QImode:

> case HImode:

>   if (dest == NULL)

>   dest = gen_reg_rtx (mode);

Likewise.

> @@ -14944,11 +15104,10 @@

>   return;

>

> case 'Q':

>-  if (TARGET_MFCRF)

>-  fputc (',', file);

>-/* FALLTHRU */

>-  else

>+  if (! TARGET_MFCRF)

>   return;

>+  fputc (',', file);

>+  /* FALLTHRU */

>

> case 'R':

>   /* X is a CR register.  Print the mask for `mtcrf'.  */

Likewise.

>@@ -15893,7 +16071,7 @@

> }

>

> /* Return the string to output a conditional branch to LABEL, which is

>-   the operand number of the label, or -1 if the branch is really a

>+   the operand template of the label, or NULL if the branch is really a

>conditional return.

>

>OP is the conditional expression.  XEXP (OP, 0) is assumed to be a

This looks like it should be done separately also.

> +bool

> +valid_vle_sd4_field (rtx mem, enum machine_mode mode)

No comment before this function.

>- (simple_return "")])

>+ (simple_return "1")])

Submitted separately it looks.
> libgcc/longlong.h
Gets sync'd with glibc's version sometimes.

Also have you thought have just adding a vle.md for all the needed
patterns and disabling the patterns in rs6000.md for VLE and not using
%^/%+/%- ? I think that would have been a cleaner implementation of
vle than adding support for it to the current patterns.  Also does not
have the maintenance issue of always having to check if a new pattern
needs the %^/%+/%-.


Thanks,
Andrew Pinski


>
>
>
>  Original Message 
> Subject: [PATCH] PowerPC VLE port
> Date: Mon, 24 Sep 2012 21:44:02 -0400
> From: James Lemke 
> To: GCC Patches 
>
> The initial patch for this port caused much comment.  I have attached
> an updated patch trying to address many of those points.
> All comments are welcome.  I would prefer comments are made on-list.
>
> I have tried to simplify this patch by:
> 1) Removing as many compound tests as possible and now rely on feature
> flags.
> I have removed TARGET_VLE_ISEL, TARGET_VLE_MULTIPLE, TARGET_VLE_ISEL64 &
> TARGET_MFCRF_NOVLE.
> -mcpu now implies the following options:
> -mcpu=e200z[0-2]: -mvle -misel -mno-mfcrf -mmultiple -msoft-float
> -mcpu=e200z[367]: -mvle -misel -mno-mfcrf -mmultiple -mfloat-gprs=single \
>-mspe=yes -mabi=spe
> 2) When gcc is configured for a non-VLE target, TARGET_VLE evaluates to "0"
> so that most VLE-specific compiler code is optimized away.
> 3) Separated all VLE-only items from rs6000.md to a new file, vle.md.
> In the cases where there was strong commonality there is still VLE code in
> rs6000.md.
>
> On r191665 I have run the DejaGNU suite.  A bootstrap is running.
>
> Comments?
> Jim.
>
> --
> Jim Lemke
> Mentor Graphics / CodeSourcery
> Orillia Ontario,  +1-613-963-1073
>
>
>

Re: PATCH: PR target/54741: Check SSE and YMM state support for -march=native

2012-10-03 Thread H.J. Lu

On Tue, Oct 2, 2012 at 12:35 PM, H.J. Lu  wrote:
> Hi,
>
> This patch checks SSE and YMM state support for -march=native.  Tested
> on Linux/x86-64.  OK to install?
>
> Thanks.
>
>
> H.J.
> ---
> 2012-10-02  H.J. Lu  
>
> PR target/54741
> *  config/i386/driver-i386.c (XCR_XFEATURE_ENABLED_MASK): New.
> (XSTATE_FP): Likewise.
> (XSTATE_SSE): Likewise.
> (XSTATE_YMM): Likewise.
> (host_detect_local_cpu): Disable AVX, AVX2, FMA, FMA4 and XOP if
> SSE and YMM states aren't supported.
>
> diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
> index bda4e02..4dffc51 100644
> --- a/gcc/config/i386/driver-i386.c
> +++ b/gcc/config/i386/driver-i386.c
> @@ -390,6 +390,7 @@ const char *host_detect_local_cpu (int argc, const char 
> **argv)
>unsigned int has_hle = 0, has_rtm = 0;
>unsigned int has_rdrnd = 0, has_f16c = 0, has_fsgsbase = 0;
>unsigned int has_rdseed = 0, has_prfchw = 0, has_adx = 0;
> +  unsigned int has_osxsave = 0;
>
>bool arch;
>
> @@ -431,6 +432,7 @@ const char *host_detect_local_cpu (int argc, const char 
> **argv)
>has_sse4_1 = ecx & bit_SSE4_1;
>has_sse4_2 = ecx & bit_SSE4_2;
>has_avx = ecx & bit_AVX;
> +  has_osxsave = ecx & bit_OSXSAVE;
>has_cmpxchg16b = ecx & bit_CMPXCHG16B;
>has_movbe = ecx & bit_MOVBE;
>has_popcnt = ecx & bit_POPCNT;
> @@ -460,6 +462,26 @@ const char *host_detect_local_cpu (int argc, const char 
> **argv)
>has_adx = ebx & bit_ADX;
>  }
>
> +  /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv.  */
> +#define XCR_XFEATURE_ENABLED_MASK  0x0
> +#define XSTATE_FP  0x1
> +#define XSTATE_SSE 0x2
> +#define XSTATE_YMM 0x4
> +  if (has_osxsave)
> +asm (".byte 0x0f; .byte 0x01; .byte 0xd0"
> +: "=a" (eax), "=d" (edx)
> +: "c" (XCR_XFEATURE_ENABLED_MASK));
> +
> +  /* Check if SSE and YMM states are supported.  */
> +  if ((eax & (XSTATE_SSE | XSTATE_YMM)) == (XSTATE_SSE | XSTATE_YMM))
> +{
> +  has_avx = 0;
> +  has_avx2 = 0;
> +  has_fma = 0;
> +  has_fma4 = 0;
> +  has_xop = 0;
> +}
> +
>/* Check cpuid level of extended features.  */
>__cpuid (0x8000, ext_level, ebx, ecx, edx);
>

This is very embarrassing.  Thanks to Andrew, I checked his fix

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54741#c18

into trunk, 4.6 and 4.7 branches.


-- 
H.J.

Re: Propagate profile counts during switch expansion

2012-10-03 Thread Xinliang David Li

thanks for the update!

David

On Wed, Oct 3, 2012 at 10:37 AM, Steven Bosscher  wrote:
> On Wed, Oct 3, 2012 at 6:12 PM, Xinliang David Li  wrote:
>> What is the status of switch expansion GIMPLE rewrite? If it is not
>> planned for 4.8, It will be desirable to include this fix into trunk.
>
> I could work on it for GCC 4.8 (there's not a lot of work left to be
> done for it now) but we haven't really decided yet where the pass
> should be scheduled and I also would like to wait a bit to see how the
> SJLJ changes work out.  So I talked about this with Easwaran, I think
> his patch should be included into the trunk now. I'll adapt it for the
> move to GIMPLE.
>
>> It also helps set up a good base line to test against regression.
>
> Agreed.
>
> Ciao!
> Steven

Re: [PATCH] PowerPC VLE port

2012-10-03 Thread James Lemke




  ;; Return 1 if op is an operand that can be loaded via the GOT.
-;; or non-special register register field no cr0
  (define_predicate "got_operand"
   (match_code "symbol_ref,const,label_ref"))


Most likely should be submitted and committed (as obvious) separately.


Yes, will do.


   switch (mode)
 {
-  case  QImode:
+case QImode:
 case HImode:
   if (dest == NULL)
dest = gen_reg_rtx (mode);


Likewise.


Will do.


 case 'Q':
-  if (TARGET_MFCRF)
-   fputc (',', file);
-/* FALLTHRU */
-  else
+  if (! TARGET_MFCRF)
return;
+  fputc (',', file);
+  /* FALLTHRU */

 case 'R':
   /* X is a CR register.  Print the mask for `mtcrf'.  */


Likewise.


Will do.


/* Return the string to output a conditional branch to LABEL, which is
-   the operand number of the label, or -1 if the branch is really a
+   the operand template of the label, or NULL if the branch is really a
conditional return.

OP is the conditional expression.  XEXP (OP, 0) is assumed to be a


This looks like it should be done separately also.


OK.


+bool
+valid_vle_sd4_field (rtx mem, enum machine_mode mode)

No comment before this function.


I'll fix that.


-  (simple_return "")])
+  (simple_return "1")])


Submitted separately it looks.


Sure.


libgcc/longlong.h

Gets sync'd with glibc's version sometimes.


Are you asking me to change something here?


Also have you thought have just adding a vle.md for all the needed
patterns and disabling the patterns in rs6000.md for VLE and not using
%^/%+/%- ? I think that would have been a cleaner implementation of
vle than adding support for it to the current patterns.  Also does not
have the maintenance issue of always having to check if a new pattern
needs the %^/%+/%-.


The idea was discussed.  A disadvantage is the code duplication it
would cause, which is also a maintenance headache.  So we opted for the
current implementation.  If the mainstream development causes a problem
for VLE it will be up to those interested in VLE to correct %^ etc.

--
Jim Lemke
Mentor Graphics / CodeSourcery
Orillia Ontario,  +1-613-963-1073

RFA: add lock_length attribute to break branch-shortening cycles

2012-10-03 Thread Joern Rennecke


The ARCompact architecture has some pipelining features that result in
the vanilla branch shortening not always converging.

Moreover, there are some short, complex branch instructions with very short
offsets; replacing them with a multi-insn sequence when the offset doesn't
reach makes the code significantly longer.  Thus, when starting branch
shortening with pessimistic assumptions, the short branches are often not
used because of the pessimistic branch length causing the offsets going out
of range.
This problem can be avoided when starting with a low estimate and working
upwards.  However, that makes the incidence of infinite branch shortening
cycles higher, and also makes it impossible to just break out after some
iteration count.

To address these issues, I've made the generator programs recognize the
optional lock_length attribute.

To quote from the documentation added for this feature:

 If you define the `lock_length' attribute, branch shortening will work
the other way round: it starts out assuming minimum instruction lengths
and iterates from there.  In addition, the value of the `lock_length'
attribute does not decrease across iterations, and the value computed
for the `length' attribute will be no smaller than that of the
`lock_length' attribute.

bootstrapped and regression tested on i686-pc-linux-gnu
2012-10-03  Joern Rennecke  

* final.c (get_attr_length_1): Use direct recursion rather than
calling get_attr_length.
(get_attr_lock_length): New function.
(INSN_VARIABLE_LENGTH_P): Define.
(shorten_branches): Take HAVE_ATTR_lock_length into account.
Don't overwrite non-delay slot insn lengths with the lengths of
delay slot insns with same uid.
* genattrtab.c (lock_length_str): New variable.
(make_length_attrs): New parameter base.
(main): Initialize lock_length_str.
Generate lock_lengths attributes.
* genattr.c (gen_attr): Emit declarations for lock_length attribute
related functions.
* doc/md.texi (node Insn Lengths): Document lock_length attribute.

Index: doc/md.texi
===
--- doc/md.texi (revision 192036)
+++ doc/md.texi (working copy)
@@ -8004,6 +8004,20 @@ (define_insn "jump"
   (const_int 6)))])
 @end smallexample
 
+@cindex lock_length
+Usually, branch shortening is done assuming the worst case (i.e. longest)
+lengths, and then iterating (if optimizing) to smaller lengths till 
+no further changed occur.  This does not work so well for architectures
+that have very small minimum offsets and considerable jumps in instruction
+lengths.
+
+If you define the @code{lock_length} attribute, branch shortening will
+work the other way round: it starts out assuming minimum instruction
+lengths and iterates from there.  In addition, the value of the
+@code{lock_length} attribute does not decrease across iterations, and
+the value computed for the @code{length} attribute will be no smaller
+than that of the @code{lock_length} attribute.
+
 @end ifset
 @ifset INTERNALS
 @node Constant Attributes
Index: final.c
===
--- final.c (revision 192036)
+++ final.c (working copy)
@@ -312,6 +312,7 @@ dbr_sequence_length (void)
`insn_current_length'.  */
 
 static int *insn_lengths;
+static char *uid_lock_length;
 
 VEC(int,heap) *insn_addresses_;
 
@@ -447,6 +448,20 @@ get_attr_length (rtx insn)
   return get_attr_length_1 (insn, insn_default_length);
 }
 
+#ifdef HAVE_ATTR_lock_length
+int
+get_attr_lock_length (rtx insn)
+{
+  if (uid_lock_length && insn_lengths_max_uid > INSN_UID (insn))
+return uid_lock_length[INSN_UID (insn)];
+  return get_attr_length_1 (insn, insn_min_lock_length);
+}
+#define INSN_VARIABLE_LENGTH_P(INSN) \
+  (insn_variable_length_p (INSN) || insn_variable_lock_length_p (INSN))
+#else
+#define INSN_VARIABLE_LENGTH_P(INSN) (insn_variable_length_p (INSN))
+#endif
+
 /* Obtain the current length of an insn.  If branch shortening has been done,
get its actual length.  Otherwise, get its minimum length.  */
 int
@@ -865,6 +880,7 @@ shorten_branches (rtx first ATTRIBUTE_UN
   rtx body;
   int uid;
   rtx align_tab[MAX_CODE_ALIGN];
+  int (*length_fun) (rtx);
 
 #endif
 
@@ -875,6 +891,10 @@ shorten_branches (rtx first ATTRIBUTE_UN
   free (uid_shuid);
 
   uid_shuid = XNEWVEC (int, max_uid);
+#ifdef HAVE_ATTR_lock_length
+  uid_lock_length = XNEWVEC (char, max_uid);
+  memset (uid_lock_length, 0, max_uid);
+#endif
 
   if (max_labelno != max_label_num ())
 {
@@ -1067,6 +1087,10 @@ shorten_branches (rtx first ATTRIBUTE_UN
 #endif /* CASE_VECTOR_SHORTEN_MODE */
 
   /* Compute initial lengths, addresses, and varying flags for each insn.  */
+  length_fun = insn_default_length;
+#ifdef HAVE_ATTR_lock_length
+  length_fun = insn_min_length;
+#endif
   for (insn_current_address = 0, insn = first;
insn != 0;

[PATCH] Fix PR54782

2012-10-03 Thread Dehao Chen

Hi,

Attached patch fixes PR54782. phi_arg_location is not correctly
updated in move_block_to_fn. This patch fixes the problem.

Bootstrapped and passed gcc regression tests on x86.

Okay for trunk?

Thanks,
Dehao

gcc/ChangeLog:
2012-10-03  Dehao Chen  

PR middle-end/54782
* tree-cfg.c (move_block_to_fn): Update lexical block for phi_args.

gcc/testsuite/ChangeLog:
2012-10-03  Dehao Chen  

PR middle-end/54782
* gcc.dg/pr54782.c: New test.
Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 192041)
+++ gcc/tree-cfg.c  (working copy)
@@ -6322,6 +6322,7 @@ move_block_to_fn (struct function *dest_cfun, basi
   use_operand_p use;
   tree op = PHI_RESULT (phi);
   ssa_op_iter oi;
+  unsigned i;
 
   if (virtual_operand_p (op))
{
@@ -6340,6 +6341,20 @@ move_block_to_fn (struct function *dest_cfun, basi
SET_USE (use, replace_ssa_name (op, d->vars_map, dest_cfun->decl));
}
 
+  for (i = 0; i < EDGE_COUNT (bb->preds); i++)
+   {
+ location_t locus = gimple_phi_arg_location (phi, i);
+ if (locus != UNKNOWN_LOCATION)
+   {
+ tree block = LOCATION_BLOCK (locus);
+ if (d->orig_block == NULL_TREE
+ || block == d->orig_block)
+   gimple_phi_arg_set_location (phi, i, d->new_block ?
+   COMBINE_LOCATION_DATA (line_table, locus, d->new_block) :
+   LOCATION_LOCUS (locus));
+   }
+   }
+
   gsi_next (&si);
 }
 
Index: gcc/testsuite/gcc.dg/pr54782.c
===
--- gcc/testsuite/gcc.dg/pr54782.c  (revision 0)
+++ gcc/testsuite/gcc.dg/pr54782.c  (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O -ffast-math -ftree-parallelize-loops=2 -g" } */
+
+struct S
+{
+  int n;
+  float *a;
+};
+
+int
+foo (struct S *s)
+{
+  float sum = 0;
+  int i;
+  for (i = 0; i < s->n; i++)
+sum += s->a[i];
+  return sum;
+}

Add myself to MAINTAINERS

2012-10-03 Thread Lawrence Crowl

Adding myself as write after approval.

Index: MAINTAINERS
===
--- MAINTAINERS (revision 191941)
+++ MAINTAINERS (working copy)
@@ -342,6 +342,7 @@ R. Kelley Cook
 kc...@gcc.gnu.org
 Christian Cornelssen   cc...@cs.tu-berlin.de
 François-Xavier Coudertfxcoud...@gcc.gnu.org
 Cary Coutant   ccout...@google.com
+Lawrence Crowl cr...@google.com
 Ian Dall   i...@beware.dropbear.id.au
 David Daneydavid.da...@caviumnetworks.com
 Bud Davis  jmda...@link.com


-- 
Lawrence Crowl

Re: [PATCH] Fix PR54782

2012-10-03 Thread Jakub Jelinek

On Wed, Oct 03, 2012 at 11:26:09AM -0700, Dehao Chen wrote:
> @@ -6340,6 +6341,20 @@ move_block_to_fn (struct function *dest_cfun, basi
>   SET_USE (use, replace_ssa_name (op, d->vars_map, dest_cfun->decl));
>   }
>  
> +  for (i = 0; i < EDGE_COUNT (bb->preds); i++)
> + {
> +   location_t locus = gimple_phi_arg_location (phi, i);
> +   if (locus != UNKNOWN_LOCATION)
> + {
> +   tree block = LOCATION_BLOCK (locus);
> +   if (d->orig_block == NULL_TREE
  || block == d->orig_block)
> + gimple_phi_arg_set_location (phi, i, d->new_block ?
> + COMBINE_LOCATION_DATA (line_table, locus, d->new_block) :
> + LOCATION_LOCUS (locus));
> + }
> + }
> +

The formatting is wrong on this.  ? and : would need to go to the next line
and even the indentation of part of an argument is weird.
IMHO better just do:
  if (d->orig_block == NULL_TREE || block == d->orig_block)
{
  if (d->new_block)
locus = COMBINE_LOCATION_DATA (line_table, locus,
   d->new_block);
  else
locus = LOCATION_LOCUS (locus);
  gimple_phi_arg_set_location (phi, i, locus);
}

It will be far more readable.
And/or to decrease indentation level you could do
  if (locus == UNKNOWN_LOCATION)
continue;
  block = LOCATION_BLOCK (locus);
  ...

Jakub

Re: Propagate profile counts during switch expansion

2012-10-03 Thread Jan Hubicka

> thanks for the update!
OK, I will review the patch tomorrow then. It is a good incremental step.  I
would certainly like to see the gimple expansion in 4.8 however.

Honza

RFA: darwin PATCH to fix build, internal visibility

2012-10-03 Thread Jason Merrill


This patch fixes a couple of Darwin issues I noticed with a cross-compiler:

1) Adds a couple of consts to avoid const-correctness errors.
2) Treats visibility "internal" like "hidden" rather than like 
"default".  The documentation says that internal is hidden + 
processor-specific semantics, so treating it as just hidden makes sense 
to me.


OK for trunk?

Jason
commit 907f8d9588da2e65cf938985639bfaf0dde53c6f
Author: Jason Merrill 
Date:   Wed Oct 3 09:55:25 2012 -0400

	* config/darwin.c (darwin_assemble_visibility): Treat
	VISIBILITY_INTERNAL as hidden.

	* config/darwin-c.c (find_subframework_file): Add missing const.
	(framework_construct_pathname): Likewise.

diff --git a/gcc/config/darwin-c.c b/gcc/config/darwin-c.c
index a642f66..8e48c30 100644
--- a/gcc/config/darwin-c.c
+++ b/gcc/config/darwin-c.c
@@ -267,7 +267,7 @@ static struct framework_header framework_header_dirs[] = {
 static char *
 framework_construct_pathname (const char *fname, cpp_dir *dir)
 {
-  char *buf;
+  const char *buf;
   size_t fname_len, frname_len;
   cpp_dir *fast_dir;
   char *frname;
@@ -344,7 +344,7 @@ find_subframework_file (const char *fname, const char *pname)
 {
   char *sfrname;
   const char *dot_framework = ".framework/";
-  char *bufptr;
+  const char *bufptr;
   int sfrname_len, i, fname_len;
   struct cpp_dir *fast_dir;
   static struct cpp_dir subframe_dir;
diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c
index 54c92d1..5a9f50a 100644
--- a/gcc/config/darwin.c
+++ b/gcc/config/darwin.c
@@ -2623,7 +2623,7 @@ darwin_assemble_visibility (tree decl, int vis)
 {
   if (vis == VISIBILITY_DEFAULT)
 ;
-  else if (vis == VISIBILITY_HIDDEN)
+  else if (vis == VISIBILITY_HIDDEN || vis == VISIBILITY_INTERNAL)
 {
   fputs ("\t.private_extern ", asm_out_file);
   assemble_name (asm_out_file,
@@ -2631,7 +2631,7 @@ darwin_assemble_visibility (tree decl, int vis)
   fputs ("\n", asm_out_file);
 }
   else
-warning (OPT_Wattributes, "internal and protected visibility attributes "
+warning (OPT_Wattributes, "protected visibility attribute "
 	 "not supported in this configuration; ignored");
 }

[libcpp] Free some variables

2012-10-03 Thread Tobias Burnus

Build on x86-64-linux with C/C++/Fortran. I will now do an all-language 
build/regtest.

OK when it passes?

Tobias
2012-10-03  Tobias Burnus  

	* files.c (read_file_guts, _cpp_save_file_entries): Free memory
	before returning.
	* lex.c (warn_about_normalization): Ditto.
	* mkdeps.c (deps_save): Ditto.
	* pch.c (cpp_valid_state): Ditto.

diff --git a/libcpp/files.c b/libcpp/files.c
index 5b3a37b..6fc24e2 100644
--- a/libcpp/files.c
+++ b/libcpp/files.c
@@ -668,12 +668,13 @@ read_file_guts (cpp_reader *pfile, _cpp_file *file)
 	}
 }
 
   if (count < 0)
 {
   cpp_errno (pfile, CPP_DL_ERROR, file->path);
+  free (buf);
   return false;
 }
 
   if (regular && total != size && STAT_SIZE_RELIABLE (file->st))
 cpp_error (pfile, CPP_DL_WARNING,
 	   "%s is shorter than expected", file->path);
@@ -1756,12 +1757,13 @@ _cpp_save_file_entries (cpp_reader *pfile, FILE *fp)
 	  FILE *ff;
 	  int oldfd = f->fd;
 
 	  if (!open_file (f))
 	{
 	  open_file_failed (pfile, f, 0);
+	  free (result);
 	  return false;
 	}
 	  ff = fdopen (f->fd, "rb");
 	  md5_stream (ff, result->entries[count].sum);
 	  fclose (ff);
 	  f->fd = oldfd;
diff --git a/libcpp/lex.c b/libcpp/lex.c
index ab904db..23809bc 100644
--- a/libcpp/lex.c
+++ b/libcpp/lex.c
@@ -1091,12 +1091,13 @@ warn_about_normalization (cpp_reader *pfile,
   if (NORMALIZE_STATE_RESULT (s) == normalized_C)
 	cpp_warning_with_line (pfile, CPP_W_NORMALIZE, token->src_loc, 0,
 			   "`%.*s' is not in NFKC", (int) sz, buf);
   else
 	cpp_warning_with_line (pfile, CPP_W_NORMALIZE, token->src_loc, 0,
 			   "`%.*s' is not in NFC", (int) sz, buf);
+  free (buf);
 }
 }
 
 /* Returns TRUE if the sequence starting at buffer->cur is invalid in
an identifier.  FIRST is TRUE if this starts an identifier.  */
 static bool
diff --git a/libcpp/mkdeps.c b/libcpp/mkdeps.c
index af11ac3..b576813 100644
--- a/libcpp/mkdeps.c
+++ b/libcpp/mkdeps.c
@@ -396,31 +396,39 @@ deps_save (struct deps *deps, FILE *f)
 int
 deps_restore (struct deps *deps, FILE *fd, const char *self)
 {
   unsigned int i, count;
   size_t num_to_read;
   size_t buf_size = 512;
-  char *buf = XNEWVEC (char, buf_size);
+  char *buf;
 
   /* Number of dependences.  */
   if (fread (&count, 1, sizeof (count), fd) != sizeof (count))
 return -1;
 
+  buf = XNEWVEC (char, buf_size);
+
   /* The length of each dependence string, followed by the string.  */
   for (i = 0; i < count; i++)
 {
   /* Read in # bytes in string.  */
   if (fread (&num_to_read, 1, sizeof (size_t), fd) != sizeof (size_t))
-	return -1;
+	{
+	  free (buf);
+	  return -1;
+	}
   if (buf_size < num_to_read + 1)
 	{
 	  buf_size = num_to_read + 1 + 127;
 	  buf = XRESIZEVEC (char, buf, buf_size);
 	}
   if (fread (buf, 1, num_to_read, fd) != num_to_read)
-	return -1;
+	{
+	  free (buf);
+	  return -1;
+	}
   buf[num_to_read] = '\0';
 
   /* Generate makefile dependencies from .pch if -nopch-deps.  */
   if (self != NULL && filename_cmp (buf, self) != 0)
 deps_add_dep (deps, buf);
 }
diff --git a/libcpp/pch.c b/libcpp/pch.c
index d278f14..001bf3f 100644
--- a/libcpp/pch.c
+++ b/libcpp/pch.c
@@ -707,13 +707,12 @@ cpp_valid_state (cpp_reader *r, const char *name, int fd)
 
   /* We win!  */
   return 0;
 
  error:
   cpp_errno (r, CPP_DL_ERROR, "while reading precompiled header");
-  return -1;
 
  fail:
   free (namebuf);
   free (undeftab);
   free (nl.defs);
   return 1;

patch for allocation reginfo in advance

2012-10-03 Thread Vladimir Makarov


The following patch was submitted as approved by Jeff Law and Paolo Bonzini.

The patch was successfully boostraped on x86/x86-64.

Committed as rev. 192048.

2012-10-03  Vladimir Makarov  

* reginfo.c (max_regno_since_last_resize): New.
(reg_preferred_class, reg_alternate_class): Add assert.
(allocate_reg_info): Initialize allocated reg info.
(resize_reg_info): Make bigger reg_info and initialize new memory.
(reginfo_init): Initialize max_regno_since_last_resize.
(setup_reg_classes): Change assert.

Index: reginfo.c
===
--- reginfo.c	(revision 192046)
+++ reginfo.c	(working copy)
@@ -839,6 +839,8 @@ static struct reg_pref *reg_pref;
 
 /* Current size of reg_info.  */
 static int reg_info_size;
+/* Max_reg_num still last resize_reg_info call.  */
+static int max_regno_since_last_resize;
 
 /* Return the reg_class in which pseudo reg number REGNO is best allocated.
This function is sometimes called before the info has been computed.
@@ -849,6 +851,7 @@ reg_preferred_class (int regno)
   if (reg_pref == 0)
 return GENERAL_REGS;
 
+  gcc_assert (regno < reg_info_size);
   return (enum reg_class) reg_pref[regno].prefclass;
 }
 
@@ -858,6 +861,7 @@ reg_alternate_class (int regno)
   if (reg_pref == 0)
 return ALL_REGS;
 
+  gcc_assert (regno < reg_info_size);
   return (enum reg_class) reg_pref[regno].altclass;
 }
 
@@ -868,45 +872,64 @@ reg_allocno_class (int regno)
   if (reg_pref == 0)
 return NO_REGS;
 
+  gcc_assert (regno < reg_info_size);
   return (enum reg_class) reg_pref[regno].allocnoclass;
 }
 
 
 
-/* Allocate space for reg info.  */
+/* Allocate space for reg info and initilize it.  */
 static void
 allocate_reg_info (void)
 {
-  reg_info_size = max_reg_num ();
+  int i;
+
+  max_regno_since_last_resize = max_reg_num ();
+  reg_info_size = max_regno_since_last_resize * 3 / 2 + 1;
   gcc_assert (! reg_pref && ! reg_renumber);
   reg_renumber = XNEWVEC (short, reg_info_size);
   reg_pref = XCNEWVEC (struct reg_pref, reg_info_size);
   memset (reg_renumber, -1, reg_info_size * sizeof (short));
+  for (i = 0; i < reg_info_size; i++)
+{
+  reg_pref[i].prefclass = GENERAL_REGS;
+  reg_pref[i].altclass = ALL_REGS;
+  reg_pref[i].allocnoclass = GENERAL_REGS;
+}
 }
 
 
-/* Resize reg info. The new elements will be uninitialized.  Return
-   TRUE if new elements (for new pseudos) were added.  */
+/* Resize reg info. The new elements will be initialized.  Return TRUE
+   if new pseudos were added since the last call.  */
 bool
 resize_reg_info (void)
 {
-  int old;
+  int old, i;
+  bool change_p;
 
   if (reg_pref == NULL)
 {
   allocate_reg_info ();
   return true;
 }
-  if (reg_info_size == max_reg_num ())
-return false;
+  change_p = max_regno_since_last_resize != max_reg_num ();
+  max_regno_since_last_resize = max_reg_num ();
+  if (reg_info_size >= max_reg_num ())
+return change_p;
   old = reg_info_size;
-  reg_info_size = max_reg_num ();
+  reg_info_size = max_reg_num () * 3 / 2 + 1;
   gcc_assert (reg_pref && reg_renumber);
   reg_renumber = XRESIZEVEC (short, reg_renumber, reg_info_size);
   reg_pref = XRESIZEVEC (struct reg_pref, reg_pref, reg_info_size);
   memset (reg_pref + old, -1,
 	  (reg_info_size - old) * sizeof (struct reg_pref));
   memset (reg_renumber + old, -1, (reg_info_size - old) * sizeof (short));
+  for (i = old; i < reg_info_size; i++)
+{
+  reg_pref[i].prefclass = GENERAL_REGS;
+  reg_pref[i].altclass = ALL_REGS;
+  reg_pref[i].allocnoclass = GENERAL_REGS;
+}
   return true;
 }
 
@@ -938,6 +961,7 @@ reginfo_init (void)
   /* This prevents dump_reg_info from losing if called
  before reginfo is run.  */
   reg_pref = NULL;
+  reg_info_size = max_regno_since_last_resize = 0;
   /* No more global register variables may be declared.  */
   no_global_reg_vars = 1;
   return 1;
@@ -964,7 +988,7 @@ struct rtl_opt_pass pass_reginfo_init =
 
 
 
-/* Set up preferred, alternate, and cover classes for REGNO as
+/* Set up preferred, alternate, and allocno classes for REGNO as
PREFCLASS, ALTCLASS, and ALLOCNOCLASS.  */
 void
 setup_reg_classes (int regno,
@@ -973,7 +997,7 @@ setup_reg_classes (int regno,
 {
   if (reg_pref == NULL)
 return;
-  gcc_assert (reg_info_size == max_reg_num ());
+  gcc_assert (reg_info_size >= max_reg_num ());
   reg_pref[regno].prefclass = prefclass;
   reg_pref[regno].altclass = altclass;
   reg_pref[regno].allocnoclass = allocnoclass;

Re: abs(long long)

2012-10-03 Thread Marc Glisse


On Wed, 3 Oct 2012, Gabriel Dos Reis wrote:


On Tue, Oct 2, 2012 at 1:53 PM, Marc Glisse  wrote:


* include/c_std/cstdlib (abs(long long)): Define with
__builtin_llabs when we have long long.

(abs(__int128)): Define when we have __int128.


This change is OK


Thanks, I'll commit that part only.


(div(long long, long long)): Use lldiv.


not this one.


Ok. Note that glibc's implementation does more than just / and %. Possible 
reasons are:

1) glibc does useless work
2) libstdc++ has a bug
3) there are platforms supported by glibc but not by libstdc++

I choose to believe it is option 3.

--
Marc Glisse

Re: [PATCH] Fix PR54782

2012-10-03 Thread Dehao Chen

Thanks for the comments. The patch was updated as attached.

Dehao

On Wed, Oct 3, 2012 at 11:46 AM, Jakub Jelinek  wrote:
> On Wed, Oct 03, 2012 at 11:26:09AM -0700, Dehao Chen wrote:
>> @@ -6340,6 +6341,20 @@ move_block_to_fn (struct function *dest_cfun, basi
>>   SET_USE (use, replace_ssa_name (op, d->vars_map, dest_cfun->decl));
>>   }
>>
>> +  for (i = 0; i < EDGE_COUNT (bb->preds); i++)
>> + {
>> +   location_t locus = gimple_phi_arg_location (phi, i);
>> +   if (locus != UNKNOWN_LOCATION)
>> + {
>> +   tree block = LOCATION_BLOCK (locus);
>> +   if (d->orig_block == NULL_TREE
>   || block == d->orig_block)
>> + gimple_phi_arg_set_location (phi, i, d->new_block ?
>> + COMBINE_LOCATION_DATA (line_table, locus, d->new_block) :
>> + LOCATION_LOCUS (locus));
>> + }
>> + }
>> +
>
> The formatting is wrong on this.  ? and : would need to go to the next line
> and even the indentation of part of an argument is weird.
> IMHO better just do:
>   if (d->orig_block == NULL_TREE || block == d->orig_block)
> {
>   if (d->new_block)
> locus = COMBINE_LOCATION_DATA (line_table, locus,
>d->new_block);
>   else
> locus = LOCATION_LOCUS (locus);
>   gimple_phi_arg_set_location (phi, i, locus);
> }
>
> It will be far more readable.
> And/or to decrease indentation level you could do
>   if (locus == UNKNOWN_LOCATION)
> continue;
>   block = LOCATION_BLOCK (locus);
>   ...
>
> Jakub
Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 192041)
+++ gcc/tree-cfg.c  (working copy)
@@ -6322,6 +6322,7 @@ move_block_to_fn (struct function *dest_cfun, basi
   use_operand_p use;
   tree op = PHI_RESULT (phi);
   ssa_op_iter oi;
+  unsigned i;
 
   if (virtual_operand_p (op))
{
@@ -6340,6 +6341,23 @@ move_block_to_fn (struct function *dest_cfun, basi
SET_USE (use, replace_ssa_name (op, d->vars_map, dest_cfun->decl));
}
 
+  for (i = 0; i < EDGE_COUNT (bb->preds); i++)
+   {
+ location_t locus = gimple_phi_arg_location (phi, i);
+ tree block = LOCATION_BLOCK (locus);
+
+ if (locus == UNKNOWN_LOCATION)
+   continue;
+ if (d->orig_block == NULL_TREE || block == d->orig_block)
+   {
+ if (d->new_block == NULL_TREE)
+   locus = LOCATION_LOCUS (locus);
+ else
+   locus = COMBINE_LOCATION_DATA (line_table, locus, d->new_block);
+ gimple_phi_arg_set_location (phi, i, locus);
+   }
+   }
+
   gsi_next (&si);
 }
 
Index: gcc/testsuite/gcc.dg/pr54782.c
===
--- gcc/testsuite/gcc.dg/pr54782.c  (revision 0)
+++ gcc/testsuite/gcc.dg/pr54782.c  (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O -ffast-math -ftree-parallelize-loops=2 -g" } */
+
+struct S
+{
+  int n;
+  float *a;
+};
+
+int
+foo (struct S *s)
+{
+  float sum = 0;
+  int i;
+  for (i = 0; i < s->n; i++)
+sum += s->a[i];
+  return sum;
+}

Re: [PATCH] Fix PR54782

2012-10-03 Thread Jakub Jelinek

On Wed, Oct 03, 2012 at 01:09:27PM -0700, Dehao Chen wrote:
> Thanks for the comments. The patch was updated as attached.

Ok, thanks.

Jakub

Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-10-03 Thread David Edelsohn

Tobias,

Sorry to return to this patch so long after the discussion, but I have
been investigating testsuite failures on AIX and this patch introduced
a number of Fortran failures.

+  if (sym->module)
+   name = gfc_get_string (".__%s_MOD_%s", sym->module, sym->name);
+  else
+   name = gfc_get_string (".%s", sym->name);

Symbols beginning with "." have a well-known meaning on AIX, so the
compiler cannot assume that it is safe to mangle a name by prepending
a period.

For instance, the test allocatable_function_5.s generates the
following global symbol definitions in the assembler:

.globl __m_MOD_mbar
.globl .__m_MOD_mbar
.csect __m_MOD_mbar[DS]
__m_MOD_mbar:   <--- function
descriptor __m_MOD_mbar
.long .__m_MOD_mbar, TOC[tc0], 0
.csect .text[PR]
.__m_MOD_mbar:  <--- function address
.__m_MOD_mbar

...

.globl .__m_MOD_mbar
.csect .data[RW],4
.align 2
.__m_MOD_mbar:  <--- mangled string
name .__m_MOD_mbar
.space 4

Unsurprisingly, two symbols with the same name are a problem and the
AIX assembler complains about the redefinition.

I am not sure why you chose a period and how best to correct this.

Thanks, David

Re: patch to fix

2012-10-03 Thread Marc Glisse


On Wed, 3 Oct 2012, Kenneth Zadeck wrote:


The patch defines a new datatype, a 'wide_int' (defined in
wide-int.[ch], and this datatype will be used to perform all of the
integer constant math in the compiler.  Externally, wide-int is very
similar to double-int except that it does not have the limitation that
math must be done on exactly two HOST_WIDE_INTs.

Internally, a wide_int is a structure that contains a fixed sized
array of HOST_WIDE_INTs, a length field and a mode.  The size of the
array is determined at generation time by dividing the number of bits
of the largest integer supported on the target by the number of bits
in a HOST_WIDE_INT of the host.  Thus, with this format, any length of
integer can be supported on any host.


Hello,

did you consider making the size of wide_int a template parameter, now 
that we are using C++? All with a convenient typedef or macro so it 
doesn't show. I am asking because in vrp I do some arithmetic that 
requires 2*N+1 bits where N is the size of double_int.


--
Marc Glisse

Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-10-03 Thread Tobias Burnus


David,

David Edelsohn wrote:

I am not sure why you chose a period and how best to correct this.


Well, in principle any name which the user cannot enter would do. (Not 
enter: At least not as valid Fortran identifier.)


The reason for choosing "." is that  is used elsewhere in 
gfortran for such identifier for the string-length variable belonging to 
, e.g. "._result" in trans-decl.c. I assume the reason that it 
didn't pop up with those is that those are local variables, but I 
wouldn't be surprised if it would break elsewhere.


I wonder whether "@" would work, otherwise, one could also use "_". The 
only other problem is that it will break the ABI. On the other hand, 
it's a rather new feature and if we bump the .mod version number, the 
chance that one effectively forces the user to re-compile is rather 
high. So far we always bumped the .mod version number as something 
changed. There are also some other patches pending which effectively 
lead to a bump in the .mod version.


(The .mod version won't affect code which doesn't use modules such as 
BLAS/LAPACK or any Fortran 66/77 code, but those won't be affected by 
the ABI change anyway as there the name doesn't propagate as it does 
with modules..)



Thanks for investigating the test-suite failure.

Tobias

opts.c, gcc.c: Plug some memory leaks - and an out-of-bounds memory access

2012-10-03 Thread Tobias Burnus


Found using http://scan5.coverity.com/

Build on x86-64-gnu-linux with C/C++/Fortran. I will now do an 
all-language build/regtest.

OK when it passes?

(Note to the save_string call: I reduced it by 2: The "+1" in the call 
makes it long (out of bounds) and the "+1" in temp_filename_length is 
not needed (but also doesn't harm) as "tmp" is null terminated and 
save_string adds another '\0' after copying "len" bytes.)


Tobias
2012-10-03  Tobias Burnus  

	* gcc.c (record_temp_file, add_sysrooted_prefix, process_command,
	do_self_spec, compare_debug_dump_opt_spec_function): Plug memleaks.
	(do_spec_1): Ditto, fix out-of-bound access.
	* opts.c (common_handle_option): Plug memleak.

diff --git a/gcc/gcc.c b/gcc/gcc.c
index af3c34a..2d56d40 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -1981,21 +1981,24 @@ record_temp_file (const char *filename, int always_delete, int fail_delete)
   always_delete_queue = temp;
 
 already1:;
 }
 
   if (fail_delete)
 {
   struct temp_file *temp;
   for (temp = failure_delete_queue; temp; temp = temp->next)
 	if (! filename_cmp (name, temp->name))
-	  goto already2;
+	  {
+	free (name);
+	goto already2;
+	  }
 
   temp = XNEW (struct temp_file);
   temp->next = failure_delete_queue;
   temp->name = name;
   failure_delete_queue = temp;
 
 already2:;
 }
 }
 
@@ -2455,22 +2458,29 @@ add_sysrooted_prefix (struct path_prefix *pprefix, const char *prefix,
   if (target_system_root)
 {
   char *sysroot_no_trailing_dir_separator = xstrdup (target_system_root);
   size_t sysroot_len = strlen (target_system_root);
 
   if (sysroot_len > 0
 	  && target_system_root[sysroot_len - 1] == DIR_SEPARATOR)
 	sysroot_no_trailing_dir_separator[sysroot_len - 1] = '\0';
 
   if (target_sysroot_suffix)
-	  prefix = concat (target_sysroot_suffix, prefix, NULL);
-  prefix = concat (sysroot_no_trailing_dir_separator, prefix, NULL);
+	{
+	  char *tmp;
+	  tmp = concat (target_sysroot_suffix, prefix, NULL);
+	  prefix = concat (sysroot_no_trailing_dir_separator, tmp, NULL);
+	  free (tmp);
+	}
+  else
+	prefix = concat (sysroot_no_trailing_dir_separator, prefix, NULL);
+
   free (sysroot_no_trailing_dir_separator);
 
   /* We have to override this because GCC's notion of sysroot
 	 moves along with GCC.  */
   component = "GCC";
 }
 
   add_prefix (pprefix, prefix, component, priority,
 	  require_machine_suffix, os_multilib);
 }
@@ -3564,21 +3574,21 @@ set_option_handlers (struct cl_option_handlers *handlers)
 
 /* Create the vector `switches' and its contents.
Store its length in `n_switches'.  */
 
 static void
 process_command (unsigned int decoded_options_count,
 		 struct cl_decoded_option *decoded_options)
 {
   const char *temp;
   char *temp1;
-  const char *tooldir_prefix;
+  char *tooldir_prefix, *tooldir_prefix2;
   char *(*get_relative_prefix) (const char *, const char *,
 const char *) = NULL;
   struct cl_option_handlers handlers;
   unsigned int j;
 
   gcc_exec_prefix = getenv ("GCC_EXEC_PREFIX");
 
   n_switches = 0;
   n_infiles = 0;
   added_libraries = 0;
@@ -3913,36 +3923,38 @@ process_command (unsigned int decoded_options_count,
   add_prefix (&exec_prefixes, standard_libexec_prefix, "BINUTILS",
 		  PREFIX_PRIORITY_LAST, 2, 0);
   add_prefix (&exec_prefixes, standard_exec_prefix, "BINUTILS",
 		  PREFIX_PRIORITY_LAST, 2, 0);
 #endif
   add_prefix (&startfile_prefixes, standard_exec_prefix, "BINUTILS",
 		  PREFIX_PRIORITY_LAST, 1, 0);
 }
 
   gcc_assert (!IS_ABSOLUTE_PATH (tooldir_base_prefix));
-  tooldir_prefix = concat (tooldir_base_prefix, spec_machine,
-			   dir_separator_str, NULL);
+  tooldir_prefix2 = concat (tooldir_base_prefix, spec_machine,
+			dir_separator_str, NULL);
 
   /* Look for tools relative to the location from which the driver is
  running, or, if that is not available, the configured prefix.  */
   tooldir_prefix
 = concat (gcc_exec_prefix ? gcc_exec_prefix : standard_exec_prefix,
 	  spec_machine, dir_separator_str,
-	  spec_version, dir_separator_str, tooldir_prefix, NULL);
+	  spec_version, dir_separator_str, tooldir_prefix2, NULL);
+  free (tooldir_prefix2);
 
   add_prefix (&exec_prefixes,
 	  concat (tooldir_prefix, "bin", dir_separator_str, NULL),
 	  "BINUTILS", PREFIX_PRIORITY_LAST, 0, 0);
   add_prefix (&startfile_prefixes,
 	  concat (tooldir_prefix, "lib", dir_separator_str, NULL),
 	  "BINUTILS", PREFIX_PRIORITY_LAST, 0, 1);
+  free (tooldir_prefix);
 
 #if defined(TARGET_SYSTEM_ROOT_RELOCATABLE) && !defined(VMS)
   /* If the normal TARGET_SYSTEM_ROOT is inside of $exec_prefix,
  then consider it to relocate with the rest of the GCC installation
  if GCC_EXEC_PREFIX is set.
  ``make_relative_prefix'' is not compiled for VMS, so don't call it.  */
   if (target_system_root && !target_system_root_changed && gcc_exec_prefix)
 {
   char *tmp_prefix = get_relative_

Re: [wwwdocs] Buildstat update for 4.4

2012-10-03 Thread Gerald Pfeifer

On Tue, 2 Oct 2012, Tom G. Christensen wrote:
> Latest results for 4.4.x

Thanks, Tom.  Applied.

Gerald

[lra] patch taking LRA review into account

2012-10-03 Thread Vladimir Makarov

The following patch takes comments and proposals from Jeff Law's and 
Richard Sandiford's reviews of LRA.


The patch was successfully bootstraped on x86/x86-64.

Commited as rev. 192050.

2012-10-03  Vladimir Makarov  

* sched-vis.c (print_value_slim): Fix the comment.
* targhooks.h (default_register_bank): Rename to
default_register_priority.
* targhooks.c (default_register_bank): Ditto.
* target.def (register_bank): Rename to register_priority.
(spill_class): Add a new argument
(spill_class_mode): Remove.
* doc/tm.texi.in (TARGET_REGISTER_BANK): Rename to
TARGET_REGISTER_PRIORITY.
(TARGET_SPILL_CLASS_MODE): Remove.
* doc/tm.texi: Update.
* lra.c (setup_reg_spill_flag): Provide a new argument to
spill_class.
* lra-assigns.c (find_hard_regno_for): Use register_priority
instead of register_bank.
* lra-spills.c (assign_spill_hard_regs): Don't use hook
spill_class_mode.  Provide a new argument to spill_class hook.
* config/i386/i386.c (ix86_register_bank): Rename to
ix86_register_priority.
(ix86_spill_class): Add new parameter.
(ix86_spill_class_mode): Remove.
(TARGET_REGISTER_BANK): Rename to TARGET_REGISTER_PRIORITY.
(TARGET_SPILL_CLASS_MODE): Remove.

Index: config/i386/i386.c
===
--- config/i386/i386.c	(revision 192048)
+++ config/i386/i386.c	(working copy)
@@ -31705,9 +31705,9 @@ ix86_lra_p (void)
   return true;
 }
 
-/* Return a register bank number for hard reg REGNO.  */
+/* Return a register priority for hard reg REGNO.  */
 static int
-ix86_register_bank (int hard_regno)
+ix86_register_priority (int hard_regno)
 {
   /* ebp and r13 as the base always wants a displacement, r12 as the
  base always wants an index.  So discourage their usage in an
@@ -40525,37 +40525,20 @@ ix86_autovectorize_vector_sizes (void)
 
 
 
-/* Return class of registers which could be used for pseudo of class
-   RCLASS for spilling instead of memory.  Return NO_REGS if it is not
-   possible or non-profitable.  */
-static enum reg_class
-ix86_spill_class (enum reg_class rclass)
+/* Return class of registers which could be used for pseudo of MODE
+   and of class RCLASS for spilling instead of memory.  Return NO_REGS
+   if it is not possible or non-profitable.  */
+static reg_class_t
+ix86_spill_class (reg_class_t rclass, enum machine_mode mode)
 {
   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL
   && hard_reg_set_subset_p (reg_class_contents[rclass],
-reg_class_contents[GENERAL_REGS]))
+reg_class_contents[GENERAL_REGS])
+  && (mode == SImode || (TARGET_64BIT && mode == DImode)))
 return SSE_REGS;
   return NO_REGS;
 }
 
-/* Return mode in which pseudo of MODE and RCLASS can be spilled into
-   a register of class SPILL_CLASS.  Return VOIDmode if it is not
-   possible.  */
-static enum machine_mode
-ix86_spill_class_mode (enum reg_class rclass, enum reg_class spill_class,
-		   enum machine_mode mode)
-{
-  if (! TARGET_SSE || ! TARGET_GENERAL_REGS_SSE_SPILL
-  || ! hard_reg_set_subset_p (reg_class_contents[rclass],
-  reg_class_contents[GENERAL_REGS])
-  || spill_class != SSE_REGS)
-return VOIDmode;
-  if (mode == SImode || (TARGET_64BIT && mode == DImode))
-return mode;
-  return VOIDmode;
-}
-
-
 /* Implement targetm.vectorize.init_cost.  */
 
 static void *
@@ -40961,8 +40944,8 @@ ix86_memmodel_check (unsigned HOST_WIDE_
 #undef TARGET_LRA_P
 #define TARGET_LRA_P ix86_lra_p
 
-#undef TARGET_REGISTER_BANK
-#define TARGET_REGISTER_BANK ix86_register_bank
+#undef TARGET_REGISTER_PRIORITY
+#define TARGET_REGISTER_PRIORITY ix86_register_priority
 
 #undef TARGET_LEGITIMATE_CONSTANT_P
 #define TARGET_LEGITIMATE_CONSTANT_P ix86_legitimate_constant_p
@@ -40990,9 +40973,6 @@ ix86_memmodel_check (unsigned HOST_WIDE_
 #undef TARGET_SPILL_CLASS
 #define TARGET_SPILL_CLASS ix86_spill_class
 
-#undef TARGET_SPILL_CLASS_MODE
-#define TARGET_SPILL_CLASS_MODE ix86_spill_class_mode
-
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-i386.h"
Index: doc/tm.texi
===
--- doc/tm.texi	(revision 192048)
+++ doc/tm.texi	(working copy)
@@ -2897,20 +2897,16 @@ as below:
 A target hook which returns true if we use LRA instead of reload pass.  It means that LRA was ported to the target.The default version of this target hook returns always false.
 @end deftypefn
 
-@deftypefn {Target Hook} int TARGET_REGISTER_BANK (int)
-A target hook which returns the register bank number to which the  register @var{hard_regno} belongs to.  The smaller the number, the  more preferable the hard register usage (when all other conditions are  the same).  This hook can be used to prefer some hard register over  others in LRA.  For example, some x86-64 register usage needs  additional prefi

Re: RFC: LRA for x86/x86-64 [2/9]

2012-10-03 Thread Vladimir Makarov


On 09/28/2012 11:36 AM, Jeff Law wrote:

On 09/28/2012 09:21 AM, Vladimir Makarov wrote:

On 12-09-28 4:43 AM, Steven Bosscher wrote:

On Fri, Sep 28, 2012 at 12:57 AM, Vladimir Makarov
 wrote:

LRA outputs a lot debug information about insns.  I found that using
slim
insn/rtl presentation helps a lot for LRA debuging. The following 
patch

makes slim presentation printing functions visible to LRA. It also
implements one more such function.

2012-09-27  Vladimir Makarov  

 * rtl.h (debug_bb_n_slim, debug_bb_slim, print_value_slim): New
 prototypes.
 (debug_rtl_slim, debug_insn_slim): Ditto.
 * sched-vis.c (print_value_slim): New.

I have patches in the works to use the slim RTL dumping format more,
too, and to use the pretty-printer code so that printing strings with
escaped characters can be made more transparent (e.g. for use in
GraphViz dumps).

That would be nice.  Slim printing is very useful for LRA which prints a
lot of changes in RTL code during all its work.  Regular printing is
unreadable because of its volume.  For LRA debugging I usually find a
suspicious place in slim dump and if I need more info I use regular dump
of the suspicious insn.

Perhaps it's time to rename sched-vis.c to print-rtl-slim.c? :-)


Yes, the name sched-vis.c is very misleading.
It seems to me this change ought to go forward now (rename to 
print-rtl-slim.c and add print_value_slim.


WRT print_value_slim we have this block comment:




+/* Prints rtxes, I customarily classified as values.  They're
+   constants, registers, labels, symbols and memory accesses. Print
+   them to file F.  */


That block comment just doesn't make sense when I read it.  It seems 
like.


/* Print X, an RTL value node, to file F in slim format.  Include
   additional information if VERBOSE is nonzero.

   Value nodes are constants, registers, labels, symbols and
   memory.  */


With that change I think you could make the obvious changes necessary 
to rename to print-rtl-slim and check this patch in.


   Unfortunately, it is not easy.  The right thing to do is to make it 
independent from scheduler. There are dependecies on insn scheduler in 
this file.  Hooks print_insn and current_sched_info are used in the code.


So I just fixed the comment and will work on removing dependencies 
later if I have time.

[SH] PR 54760 - Add thread pointer built-ins and GBR displacement addressing

2012-10-03 Thread Oleg Endo

Hello,

This adds the two common built-in functions __builtin_thread_pointer and
__builtin_set_thread_pointer to the SH port.
I've done it in a way so that hopefully it can be transitioned to target
independent thread pointer built-ins easily, as suggested by Richard a
while ago:
http://gcc.gnu.org/ml/gcc-patches/2012-07/msg00946.html

Originally I wanted to wait until the target independent bits are in,
but somehow the thread mentioned above died and I got impatient.

I've also added support for SH's GBR based displacement addressing
modes.  They are not used for general purpose mem loads/stores by the
compiler, but rather when code accesses data behind the thread pointer
(thread control block or something).
The way GBR displacement address opportunities are discovered might not
be the best way of doing this sort of thing, but it works.  The insn
walking could potentially slow down compile times, but it is only
enabled for functions where the GBR is referenced, so it shouldn't be so
bad.  Alternatives and suggestions are highly appreciated.  :)

Tested on rev 191894 with 'make all' (c,c++) and
'make -k check-gcc RUNTESTFLAGS="sh.exp --target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"'

For code that doesn't reference GBR there are no functional changes.
TLS code that references GBR might trigger the 'sh_find_equiv_gbr_addr'
function.  Unfortunately TLS tests don't seem to work on sh-sim, so I
could not test this part.

Cheers,
Oleg

gcc/ChangeLog:

PR target/54760
* config/sh/sh.md (define_constants): Add UNSPECV_GBR.
(get_thread_pointer, set_thread_pointer): New expanders.
(load_gbr): Rename to store_gbr.  Remove GBR_REG use.
(store_gbr): New insn.
(*mov_gbr_load, *mov_gbr_store): New insns and 
accompanying unnamed splits.
* config/sh/predicates.md (general_movsrc_operand, 
general_movdst_operand): Reject GBR addresses.
* config/sh/sh-protos.h (sh_find_equiv_gbr_addr): New 
declaration.
* config/sh/sh.c (prepare_move_operands): Use gen_store_gbr 
instead of gen_load_gbr in TLS_MODEL_LOCAL_EXEC case.
(sh_address_cost, sh_legitimate_address_p, sh_secondary_reload):
Handle GBR addresses.
(builtin_description): Add is_enabled member.
(shmedia_builtin, sh1_builtin): New functions.
(signature_args): Add SH_BLTIN_VP.
(bdesc): Use shmedia_builtin for existing built-ins.  Add 
__builtin_thread_pointer and __builtin_set_thread_pointer as 
sh1_builtin.
(sh_media_init_builtins, sh_init_builtins): Merge into single 
function sh_init_builtins.  Add is_enabled checking.
(sh_media_builtin_decl, sh_builtin_decl): Merge into single 
function sh_builtin_decl.  Add is_enabled checking.
(base_reg_disp): New class.
(sh_find_base_reg_disp, sh_find_equiv_gbr_addr): New functions.

testsuite/ChangeLog:

PR target/54760
* gcc.target/sh/pr54706-1.c: New.
* gcc.target/sh/pr54706-2.c: New.
* gcc.target/sh/pr54706-3.c: New.

Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 191894)
+++ gcc/config/sh/sh.md	(working copy)
@@ -175,6 +175,7 @@
   (UNSPECV_WINDOW_END	10)
   (UNSPECV_CONST_END	11)
   (UNSPECV_EH_RETURN	12)
+  (UNSPECV_GBR		13)
 ])
 
 ;; -
@@ -10029,13 +10030,165 @@
   DONE;
 })
 
-(define_insn "load_gbr"
-  [(set (match_operand:SI 0 "register_operand" "=r") (reg:SI GBR_REG))
-   (use (reg:SI GBR_REG))]
+;;--
+;; Thread pointer getter and setter.
+;;
+;; On SH the thread pointer is kept in the GBR.
+;; These patterns are usually expanded from the respective built-in functions.
+(define_expand "get_thread_pointer"
+  [(set (match_operand:SI 0 "register_operand") (reg:SI GBR_REG))]
+  "TARGET_SH1")
+
+(define_insn "store_gbr"
+  [(set (match_operand:SI 0 "register_operand" "=r") (reg:SI GBR_REG))]
   ""
   "stc	gbr,%0"
   [(set_attr "type" "tls_load")])
 
+(define_expand "set_thread_pointer"
+  [(set (reg:SI GBR_REG)
+	(unspec_volatile:SI [(match_operand:SI 0 "register_operand")]
+	 UNSPECV_GBR))]
+  "TARGET_SH1")
+
+(define_insn "load_gbr"
+  [(set (reg:SI GBR_REG)
+	(unspec_volatile:SI [(match_operand:SI 0 "register_operand" "r")]
+	 UNSPECV_GBR))]
+  "TARGET_SH1"
+  "ldc	%0,gbr"
+  [(set_attr "type" "move")])
+
+;;--
+;; Thread pointer relative memory loads and stores.
+;;
+;; On SH there are GBR displacement address modes which can be utilized to
+;; access memory behind the thread pointer.
+;; Since we do not allow using GBR for general purpose memory accesses, these
+;; GBR addressing modes are formed by the combine pass.
+;; This could b

Re: [SH] PR 54760 - Add thread pointer built-ins and GBR displacement addressing

2012-10-03 Thread Oleg Endo

On Wed, 2012-10-03 at 23:21 +0200, Oleg Endo wrote:
> testsuite/ChangeLog:
> 
>   PR target/54760
>   * gcc.target/sh/pr54706-1.c: New.
>   * gcc.target/sh/pr54706-2.c: New.
>   * gcc.target/sh/pr54706-3.c: New.

Obviously there is a typo in the file names for the test cases.
Attached is the corrected patch + changelog.

Cheers,
Oleg

gcc/ChangeLog:

PR target/54760
* config/sh/sh.md (define_constants): Add UNSPECV_GBR.
(get_thread_pointer, set_thread_pointer): New expanders.
(load_gbr): Rename to store_gbr.  Remove GBR_REG use.
(store_gbr): New insn.
(*mov_gbr_load, *mov_gbr_store): New insns and 
accompanying unnamed splits.
* config/sh/predicates.md (general_movsrc_operand, 
general_movdst_operand): Reject GBR addresses.
* config/sh/sh-protos.h (sh_find_equiv_gbr_addr): New 
declaration.
* config/sh/sh.c (prepare_move_operands): Use gen_store_gbr 
instead of gen_load_gbr in TLS_MODEL_LOCAL_EXEC case.
(sh_address_cost, sh_legitimate_address_p, sh_secondary_reload):
Handle GBR addresses.
(builtin_description): Add is_enabled member.
(shmedia_builtin, sh1_builtin): New functions.
(signature_args): Add SH_BLTIN_VP.
(bdesc): Use shmedia_builtin for existing built-ins.  Add 
__builtin_thread_pointer and __builtin_set_thread_pointer as 
sh1_builtin.
(sh_media_init_builtins, sh_init_builtins): Merge into single 
function sh_init_builtins.  Add is_enabled checking.
(sh_media_builtin_decl, sh_builtin_decl): Merge into single 
function sh_builtin_decl.  Add is_enabled checking.
(base_reg_disp): New class.
(sh_find_base_reg_disp, sh_find_equiv_gbr_addr): New functions.

testsuite/ChangeLog:

PR target/54760
* gcc.target/sh/pr54760-1.c: New.
* gcc.target/sh/pr54760-2.c: New.
* gcc.target/sh/pr54760-3.c: New.
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 191894)
+++ gcc/config/sh/sh.md	(working copy)
@@ -175,6 +175,7 @@
   (UNSPECV_WINDOW_END	10)
   (UNSPECV_CONST_END	11)
   (UNSPECV_EH_RETURN	12)
+  (UNSPECV_GBR		13)
 ])

 ;; -
@@ -10029,13 +10030,165 @@
   DONE;
 })

-(define_insn "load_gbr"
-  [(set (match_operand:SI 0 "register_operand" "=r") (reg:SI GBR_REG))
-   (use (reg:SI GBR_REG))]
+;;--
+;; Thread pointer getter and setter.
+;;
+;; On SH the thread pointer is kept in the GBR.
+;; These patterns are usually expanded from the respective built-in functions.
+(define_expand "get_thread_pointer"
+  [(set (match_operand:SI 0 "register_operand") (reg:SI GBR_REG))]
+  "TARGET_SH1")
+
+(define_insn "store_gbr"
+  [(set (match_operand:SI 0 "register_operand" "=r") (reg:SI GBR_REG))]
   ""
   "stc	gbr,%0"
   [(set_attr "type" "tls_load")])

+(define_expand "set_thread_pointer"
+  [(set (reg:SI GBR_REG)
+	(unspec_volatile:SI [(match_operand:SI 0 "register_operand")]
+	 UNSPECV_GBR))]
+  "TARGET_SH1")
+
+(define_insn "load_gbr"
+  [(set (reg:SI GBR_REG)
+	(unspec_volatile:SI [(match_operand:SI 0 "register_operand" "r")]
+	 UNSPECV_GBR))]
+  "TARGET_SH1"
+  "ldc	%0,gbr"
+  [(set_attr "type" "move")])
+
+;;--
+;; Thread pointer relative memory loads and stores.
+;;
+;; On SH there are GBR displacement address modes which can be utilized to
+;; access memory behind the thread pointer.
+;; Since we do not allow using GBR for general purpose memory accesses, these
+;; GBR addressing modes are formed by the combine pass.
+;; This could be done with fewer patterns than below by using a mem predicate
+;; for the GBR mem, but then reload would try to reload addresses with a
+;; zero displacement for some strange reason.
+
+(define_insn "*mov_gbr_load"
+  [(set (match_operand:QIHISI 0 "register_operand" "=z")
+	(mem:QIHISI (plus:SI (reg:SI GBR_REG)
+			 (match_operand:QIHISI 1 "gbr_displacement"]
+  "TARGET_SH1"
+  "mov.	@(%O1,gbr),%0"
+  [(set_attr "type" "load")])
+
+(define_insn "*mov_gbr_load"
+  [(set (match_operand:QIHISI 0 "register_operand" "=z")
+	(mem:QIHISI (reg:SI GBR_REG)))]
+  "TARGET_SH1"
+  "mov.	@(0,gbr),%0"
+  [(set_attr "type" "load")])
+
+(define_insn "*mov_gbr_load"
+  [(set (match_operand:SI 0 "register_operand" "=z")
+	(sign_extend:SI
+	  (mem:QIHI (plus:SI (reg:SI GBR_REG)
+			 (match_operand:QIHI 1 "gbr_displacement")]
+  "TARGET_SH1"
+  "mov.	@(%O1,gbr),%0"
+  [(set_attr "type" "load")])
+
+(define_insn "*mov_gbr_load"
+  [(set (match_operand:SI 0 "register_operand" "=z")
+	(sign_extend:SI (mem:QIHI (reg:SI GBR_REG]
+  "TARGET_SH1"
+  "mov.	@(0,gbr),%0"
+  [(set_attr "type" "load")])
+
+(define_insn "*mov_gbr_store"
+  [

Re: patch to fix

2012-10-03 Thread Kenneth Zadeck

i have already converted the vrp code, so i have some guess at where you 
are talking about.  (of course correct me if i am wrong).


in the code that computes the range when two variables are multiplied 
together needs to do a multiplication that produces a result that is 
twice as wide as the inputs.


my library is able to do that with one catch (and this is a big catch): 
the target has to have an integer mode that is twice as big as the mode 
of the operands. The issue is that wide-ints actually carry around the 
mode of the value in order to get the bitsize and precision of the 
operands (it does not have the type, because this code has to both work 
on the rtl and tree level and i generally do not want the signness anyway).


my current code in vrp checks to see if such a mode exists and if it 
does, it produces the product.   if the mode does not exist, it returns 
bottom.   What this means is that for most (many or some) targets that 
have a TImode, the largest thing that particular vrp discover ranges for 
is a DImode value.   We could get around this by defining the next 
larger mode than what the target really needs but i wonder how much 
mileage you are going to get out of that with really large numbers.


Of course you could have something else in mind.

kenny

On 10/03/2012 04:47 PM, Marc Glisse wrote:

On Wed, 3 Oct 2012, Kenneth Zadeck wrote:


The patch defines a new datatype, a 'wide_int' (defined in
wide-int.[ch], and this datatype will be used to perform all of the
integer constant math in the compiler.  Externally, wide-int is very
similar to double-int except that it does not have the limitation that
math must be done on exactly two HOST_WIDE_INTs.

Internally, a wide_int is a structure that contains a fixed sized
array of HOST_WIDE_INTs, a length field and a mode.  The size of the
array is determined at generation time by dividing the number of bits
of the largest integer supported on the target by the number of bits
in a HOST_WIDE_INT of the host.  Thus, with this format, any length of
integer can be supported on any host.


Hello,

did you consider making the size of wide_int a template parameter, now 
that we are using C++? All with a convenient typedef or macro so it 
doesn't show. I am asking because in vrp I do some arithmetic that 
requires 2*N+1 bits where N is the size of double_int.

[Patch, Fortran, F08] PR 45521: GENERIC resolution with ALLOCATABLE/POINTER and PROCEDURE

2012-10-03 Thread Janus Weil

Hi all,

the attached patch implements an F08 feature, which allows to
distinguish two specific procedures in a generic interface, based on
the POINTER and ALLOCATABLE attribute of their arguments.

In addition to this, the patch fixes a bug in rejecting data actual
arguments passed to procedure formal arguments.

The patch was regtested on x86_64-unknown-linux-gnu. Ok for trunk?

Cheers,
Janus



2012-10-03  Janus Weil  

PR fortran/45521
* interface.c (generic_correspondence): Implement additional
distinguishability criteria of F08.
(compare_actual_formal): Reject data object as actual argument for
procedure formal argument.

2012-10-03  Janus Weil  

PR fortran/45521
* gfortran.dg/generic_25.f90: New.
* gfortran.dg/generic_26.f90: New.
* gfortran.dg/generic_27.f90: New.


pr45521_v2.diff
Description: Binary data


generic_25.f90
Description: Binary data


generic_26.f90
Description: Binary data


generic_27.f90
Description: Binary data

libbacktrace patch committed: Fix leb128 overflow test

2012-10-03 Thread Ian Lance Taylor

This patch to libbacktrace fixes the overflow test for the leb128
reading routines.  What matters is not the shift after the loop, but the
shift within the loop.  This also removes the setting of unit_buf.start
in build_address_map, which was simply wrong and was causing the error
message to print the wrong offset in the .debug_info section.
Bootstrapped and ran libbacktrace testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian


2012-10-03  Ian Lance Taylor  

* dwarf.c (read_uleb128): Fix overflow test.
(read_sleb128): Likewise.
(build_address_map): Don't change unit_buf.start.


Index: dwarf.c
===
--- dwarf.c	(revision 191858)
+++ dwarf.c	(working copy)
@@ -524,10 +524,12 @@ read_uleb128 (struct dwarf_buf *buf)
 {
   uint64_t ret;
   unsigned int shift;
+  int overflow;
   unsigned char b;
 
   ret = 0;
   shift = 0;
+  overflow = 0;
   do
 {
   const unsigned char *p;
@@ -536,14 +538,17 @@ read_uleb128 (struct dwarf_buf *buf)
   if (!advance (buf, 1))
 	return 0;
   b = *p;
-  ret |= ((uint64_t) (b & 0x7f)) << shift;
+  if (shift < 64)
+	ret |= ((uint64_t) (b & 0x7f)) << shift;
+  else if (!overflow)
+	{
+	  dwarf_buf_error (buf, "LEB128 overflows uint64_t");
+	  overflow = 1;
+	}
   shift += 7;
 }
   while ((b & 0x80) != 0);
 
-  if (shift > 64)
-dwarf_buf_error (buf, "LEB128 overflows uint64_5");
-
   return ret;
 }
 
@@ -554,10 +559,12 @@ read_sleb128 (struct dwarf_buf *buf)
 {
   uint64_t val;
   unsigned int shift;
+  int overflow;
   unsigned char b;
 
   val = 0;
   shift = 0;
+  overflow = 0;
   do
 {
   const unsigned char *p;
@@ -566,15 +573,18 @@ read_sleb128 (struct dwarf_buf *buf)
   if (!advance (buf, 1))
 	return 0;
   b = *p;
-  val |= ((uint64_t) (b & 0x7f)) << shift;
+  if (shift < 64)
+	val |= ((uint64_t) (b & 0x7f)) << shift;
+  else if (!overflow)
+	{
+	  dwarf_buf_error (buf, "signed LEB128 overflows uint64_t");
+	  overflow = 1;
+	}
   shift += 7;
 }
   while ((b & 0x80) != 0);
 
-  if (shift > 64)
-dwarf_buf_error (buf, "signed LEB128 overflows uint64_t");
-
-  if ((b & 0x40) != 0)
+  if ((b & 0x40) != 0 && shift < 64)
 val |= ((uint64_t) -1) << shift;
 
   return (int64_t) val;
@@ -1262,7 +1272,6 @@ build_address_map (struct backtrace_stat
 	}
 
   unit_buf = info;
-  unit_buf.start = info.buf;
   unit_buf.left = len;
 
   if (!advance (&info, len))

[SH] PR 33135 - Remove mieee option in libgcc

2012-10-03 Thread Oleg Endo

Hello,

Since the -mieee behavior has been fixed, is enabled by default on SH
and the additional flags in libgcc can be removed.

OK?

Cheers,
Oleg

libgcc/ChangeLog:

PR target/33135
* config/sh/t-sh (HOST_LIBGCC2_CFLAGS): Delete.
* config/sh/t-netbsd (HOST_LIBGCC2_CFLAGS): Delete.
* config/sh/t-linux (HOST_LIBGCC2_CFLAGS): Remove mieee option.
Index: libgcc/config/sh/t-sh
===
--- libgcc/config/sh/t-sh	(revision 192050)
+++ libgcc/config/sh/t-sh	(working copy)
@@ -59,5 +59,3 @@
 libgcc-4-300.a: div_table-4-300.o
 	$(AR_CREATE_FOR_TARGET) $@ div_table-4-300.o
 
-HOST_LIBGCC2_CFLAGS += -mieee
-
Index: libgcc/config/sh/t-netbsd
===
--- libgcc/config/sh/t-netbsd	(revision 192050)
+++ libgcc/config/sh/t-netbsd	(working copy)
@@ -1,3 +1,2 @@
 LIB1ASMFUNCS_CACHE = _ic_invalidate
 
-HOST_LIBGCC2_CFLAGS += -mieee
Index: libgcc/config/sh/t-linux
===
--- libgcc/config/sh/t-linux	(revision 192051)
+++ libgcc/config/sh/t-linux	(working copy)
@@ -2,7 +2,7 @@
 
 LIB2ADD = $(srcdir)/config/sh/linux-atomic.c
 
-HOST_LIBGCC2_CFLAGS += -mieee -DNO_FPSCR_VALUES
+HOST_LIBGCC2_CFLAGS += -DNO_FPSCR_VALUES
 
 # Silence atomic built-in related warnings in linux-atomic.c.
 # Unfortunately the conflicting types warning can't be disabled selectively.

Re: RFA: darwin PATCH to fix build, internal visibility

2012-10-03 Thread Mike Stump

On Oct 3, 2012, at 12:04 PM, Jason Merrill  wrote:
> This patch fixes a couple of Darwin issues I noticed with a cross-compiler:
> 
> 1) Adds a couple of consts to avoid const-correctness errors.
> 2) Treats visibility "internal" like "hidden" rather than like "default".  
> The documentation says that internal is hidden + processor-specific 
> semantics, so treating it as just hidden makes sense to me.
> 
> OK for trunk?

Ok.

[PATCH] rs6000: Remove 'A' output modifier

2012-10-03 Thread Segher Boessenkool

It was used for old POWER only, which has been removed.

Tested etc.; okay to apply?


Segher


2012-10-04  Segher Boessenkool  

gcc/
* config/rs6000/rs6000.c (print_operand) ['A']: Delete.

---
 gcc/config/rs6000/rs6000.c |   11 ---
 1 files changed, 0 insertions(+), 11 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 211087b..846a939 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -14676,17 +14676,6 @@ print_operand (FILE *file, rtx x, int code)
 {
   /* %a is output_address.  */
 
-case 'A':
-  /* If X is a constant integer whose low-order 5 bits are zero,
-write 'l'.  Otherwise, write 'r'.  This is a kludge to fix a bug
-in the AIX assembler where "sri" with a zero shift count
-writes a trash instruction.  */
-  if (GET_CODE (x) == CONST_INT && (INTVAL (x) & 31) == 0)
-   putc ('l', file);
-  else
-   putc ('r', file);
-  return;
-
 case 'b':
   /* If constant, low-order 16 bits of constant, unsigned.
 Otherwise, write normally.  */
-- 
1.7.7.6

Re: Commit: RX: Warn about multiple fast interrupt routines.

2012-10-03 Thread Gerald Pfeifer

On Wed, 3 Oct 2012, Nick Clifton wrote:
> PS.  I have even remembered to include a patch to the html
> documentation as well!

Yes, I was going to mention this appreciatively. ;-)

A minor change I applied on top is the following, adding a closing
 and a dash in command-line.

Thanks,
Gerald

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.33
retrieving revision 1.35
diff -u -3 -p -r1.33 -r1.35
--- changes.html3 Oct 2012 16:12:56 -   1.33
+++ changes.html3 Oct 2012 22:33:46 -   1.35
@@ -244,7 +244,7 @@ by this change.
 This target will now issue a warning message whenever multiple fast
 interrupt handlers are found in the same cpmpilation unit.  This feature 
can
 be turned off by the new -mno-warn-multiple-fast-interrupts
-command line option.
+command-line option.
   
 
 SH

Re: Convert more non-GTY htab_t to hash_table.

2012-10-03 Thread Lawrence Crowl

On 10/2/12, Richard Guenther  wrote:
> On Mon, 1 Oct 2012, Lawrence Crowl wrote:
>
>> Change more non-GTY hash tables to use the new type-safe template hash
>> table.
>> Constify member function parameters that can be const.
>> Correct a couple of expressions in formerly uninstantiated templates.
>>
>> The new code is 0.362% faster in bootstrap, with a 99.5% confidence of
>> being faster.
>>
>> Tested on x86-64.
>>
>> Okay for trunk?
>
> You are changing a hashtable used by fold checking, did you test
> with fold checking enabled?
>
> +/* Data structures used to maintain mapping between basic blocks and
> +   copies.  */
> +static hash_table  bb_original;
> +static hash_table  bb_copy;
>
> note that because hash_table has a constructor we now get global
> CTORs for all statics :(  (and mx-protected local inits ...)
> Can you please try to remove the constructor from hash_table to
> avoid this overhead?  (as a followup - that is, don't initialize htab)
>
> The cfg.c, dse.c and hash-table.h parts are ok for trunk, I'll leave the
> rest to
> respective maintainers of the pieces of the compiler.
>
> Thanks,
> Richard.
>
>>
>> Index: gcc/java/ChangeLog
>>
>> 2012-10-01  Lawrence Crowl  
>>
>>  * Make-lang.in (JAVA_OBJS): Add dependence on hash-table.o.
>>  (JCFDUMP_OBJS): Add dependence on hash-table.o.
>>  (jcf-io.o): Add dependence on hash-table.h.
>>  * jcf-io.c (memoized_class_lookups): Change to use type-safe hash table.
>>
>> Index: gcc/c/ChangeLog
>>
>> 2012-10-01  Lawrence Crowl  
>>
>>  * Make-lang.in (c-decl.o): Add dependence on hash-table.h.
>>  * c-decl.c (detect_field_duplicates_hash): Change to new type-safe
>>  hash table.
>>
>> Index: gcc/objc/ChangeLog
>>
>> 2012-10-01  Lawrence Crowl  
>>
>>  * Make-lang.in (OBJC_OBJS): Add dependence on hash-table.o.
>>  (objc-act.o): Add dependence on hash-table.h.
>>  * objc-act.c (objc_detect_field_duplicates): Change to new type-safe
>>  hash table.
>>
>> Index: gcc/ChangeLog
>>
>> 2012-10-01  Lawrence Crowl  
>>
>>  * Makefile.in (fold-const.o): Add depencence on hash-table.h.
>>  (dse.o): Likewise.
>>  (cfg.o): Likewise.
>>  * fold-const.c (fold_checksum_tree): Change to new type-safe hash table.
>>  * (print_fold_checksum): Likewise.
>>  * cfg.c (var bb_original): Likewise.
>>  * (var bb_copy): Likewise.
>>  * (var loop_copy): Likewise.
>>  * hash-table.h (template hash_table): Constify parameters for find...
>>  and remove_elt... member functions.
>> (hash_table::empty) Correct size expression.
>> (hash_table::clear_slot) Correct deleted entry assignment.
>>  * dse.c (var rtx_group_table): Change to new type-safe hash table.
>>
>> Index: gcc/cp/ChangeLog
>>
>> 2012-10-01  Lawrence Crowl  
>>
>>  * Make-lang.in (class.o): Add dependence on hash-table.h.
>>  (tree.o): Likewise.
>>  (semantics.o): Likewise.
>>  * class.c (fixed_type_or_null): Change to new type-safe hash table.
>>  * tree.c (verify_stmt_tree): Likewise.
>>  (verify_stmt_tree_r): Likewise.
>>  * semantics.c (struct nrv_data): Likewise.
>>
>>
>> Index: gcc/java/Make-lang.in
>> ===
>> --- gcc/java/Make-lang.in(revision 191941)
>> +++ gcc/java/Make-lang.in(working copy)
>> @@ -83,10 +83,10 @@ JAVA_OBJS = java/class.o java/decl.o jav
>>java/zextract.o java/jcf-io.o java/win32-host.o java/jcf-parse.o
>> java/mangle.o \
>>java/mangle_name.o java/builtins.o java/resource.o \
>>java/jcf-depend.o \
>> -  java/jcf-path.o java/boehm.o java/java-gimplify.o
>> +  java/jcf-path.o java/boehm.o java/java-gimplify.o hash-table.o
>>
>>  JCFDUMP_OBJS = java/jcf-dump.o java/jcf-io.o java/jcf-depend.o
>> java/jcf-path.o \
>> -java/win32-host.o java/zextract.o ggc-none.o
>> +java/win32-host.o java/zextract.o ggc-none.o hash-table.o
>>
>>  JVGENMAIN_OBJS = java/jvgenmain.o java/mangle_name.o
>>
>> @@ -326,7 +326,7 @@ java/java-gimplify.o: java/java-gimplify
>>  # jcf-io.o needs $(ZLIBINC) added to cflags.
>>  CFLAGS-java/jcf-io.o += $(ZLIBINC)
>>  java/jcf-io.o: java/jcf-io.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
>> -  $(JAVA_TREE_H) java/zipfile.h
>> +  $(JAVA_TREE_H) java/zipfile.h $(HASH_TABLE_H)
>>
>>  # jcf-path.o needs a -D.
>>  CFLAGS-java/jcf-path.o += \
>> Index: gcc/java/jcf-io.c
>> ===
>> --- gcc/java/jcf-io.c(revision 191941)
>> +++ gcc/java/jcf-io.c(working copy)
>> @@ -31,7 +31,7 @@ The Free Software Foundation is independ
>>  #include "jcf.h"
>>  #include "tree.h"
>>  #include "java-tree.h"
>> -#include "hashtab.h"
>> +#include "hash-table.h"
>>  #include 
>>
>>  #include "zlib.h"
>> @@ -271,20 +271,34 @@ find_classfile (char *filename, JCF *jcf
>>return open_class (filename, jcf, fd, dep_name);
>>  }
>>
>> -/* Returns 1 if the CLASSNAME (really a char *) matches the name
>> -

Re: Convert more non-GTY htab_t to hash_table.

2012-10-03 Thread Lawrence Crowl

Sorry, one more time with the right file contents.

On 10/2/12, Richard Guenther  wrote:
> You are changing a hashtable used by fold checking, did you test
> with fold checking enabled?

One small thinko fixed.  Patch passes tests.

> The cfg.c, dse.c and hash-table.h parts are ok for trunk, I'll
> leave the rest to respective maintainers of the pieces of the
> compiler.

+cc
java: tro...@redhat.com
c: r...@redhat.com
objc: mikest...@comcast.net
cp: ja...@redhat.com

Change more non-GTY hash tables to use the new type-safe template hash table.
Constify member function parameters that can be const.
Correct a couple of expressions in formerly uninstantiated templates.

The new code is 0.362% faster in bootstrap, with a 99.5% confidence of
being faster.

Tested on x86-64.

Okay for trunk?

Index: gcc/java/ChangeLog

2012-10-01  Lawrence Crowl  

* Make-lang.in (JAVA_OBJS): Add dependence on hash-table.o.
(JCFDUMP_OBJS): Add dependence on hash-table.o.
(jcf-io.o): Add dependence on hash-table.h.
* jcf-io.c (memoized_class_lookups): Change to use type-safe hash table.

Index: gcc/c/ChangeLog

2012-10-01  Lawrence Crowl  

* Make-lang.in (c-decl.o): Add dependence on hash-table.h.
* c-decl.c (detect_field_duplicates_hash): Change to new type-safe
hash table.

Index: gcc/objc/ChangeLog

2012-10-01  Lawrence Crowl  

* Make-lang.in (OBJC_OBJS): Add dependence on hash-table.o.
(objc-act.o): Add dependence on hash-table.h.
* objc-act.c (objc_detect_field_duplicates): Change to new type-safe
hash table.

Index: gcc/ChangeLog

2012-10-01  Lawrence Crowl  

* Makefile.in (fold-const.o): Add depencence on hash-table.h.
(dse.o): Likewise.
(cfg.o): Likewise.
* fold-const.c (fold_checksum_tree): Change to new type-safe hash table.
* (print_fold_checksum): Likewise.
* cfg.c (var bb_original): Likewise.
* (var bb_copy): Likewise.
* (var loop_copy): Likewise.
* hash-table.h (template hash_table): Constify parameters for find...
and remove_elt... member functions.
(hash_table::empty) Correct size expression.
(hash_table::clear_slot) Correct deleted entry assignment.
* dse.c (var rtx_group_table): Change to new type-safe hash table.

Index: gcc/cp/ChangeLog

2012-10-01  Lawrence Crowl  

* Make-lang.in (class.o): Add dependence on hash-table.h.
(tree.o): Likewise.
(semantics.o): Likewise.
* class.c (fixed_type_or_null): Change to new type-safe hash table.
* tree.c (verify_stmt_tree): Likewise.
(verify_stmt_tree_r): Likewise.
* semantics.c (struct nrv_data): Likewise.

Index: gcc/java/Make-lang.in
===
--- gcc/java/Make-lang.in   (revision 191941)
+++ gcc/java/Make-lang.in   (working copy)
@@ -83,10 +83,10 @@ JAVA_OBJS = java/class.o java/decl.o jav
   java/zextract.o java/jcf-io.o java/win32-host.o java/jcf-parse.o
java/mangle.o \
   java/mangle_name.o java/builtins.o java/resource.o \
   java/jcf-depend.o \
-  java/jcf-path.o java/boehm.o java/java-gimplify.o
+  java/jcf-path.o java/boehm.o java/java-gimplify.o hash-table.o

 JCFDUMP_OBJS = java/jcf-dump.o java/jcf-io.o java/jcf-depend.o
java/jcf-path.o \
-   java/win32-host.o java/zextract.o ggc-none.o
+   java/win32-host.o java/zextract.o ggc-none.o hash-table.o

 JVGENMAIN_OBJS = java/jvgenmain.o java/mangle_name.o

@@ -326,7 +326,7 @@ java/java-gimplify.o: java/java-gimplify
 # jcf-io.o needs $(ZLIBINC) added to cflags.
 CFLAGS-java/jcf-io.o += $(ZLIBINC)
 java/jcf-io.o: java/jcf-io.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
-  $(JAVA_TREE_H) java/zipfile.h
+  $(JAVA_TREE_H) java/zipfile.h $(HASH_TABLE_H)

 # jcf-path.o needs a -D.
 CFLAGS-java/jcf-path.o += \
Index: gcc/java/jcf-io.c
===
--- gcc/java/jcf-io.c   (revision 191941)
+++ gcc/java/jcf-io.c   (working copy)
@@ -31,7 +31,7 @@ The Free Software Foundation is independ
 #include "jcf.h"
 #include "tree.h"
 #include "java-tree.h"
-#include "hashtab.h"
+#include "hash-table.h"
 #include 

 #include "zlib.h"
@@ -271,20 +271,34 @@ find_classfile (char *filename, JCF *jcf
   return open_class (filename, jcf, fd, dep_name);
 }

-/* Returns 1 if the CLASSNAME (really a char *) matches the name
-   stored in TABLE_ENTRY (also a char *).  */

-static int
-memoized_class_lookup_eq (const void *table_entry, const void *classname)
+/* Hash table helper.  */
+
+struct charstar_hash : typed_noop_remove 
 {
-  return strcmp ((const char *)classname, (const char *)table_entry) == 0;
+  typedef const char T;
+  static inline hashval_t hash (const T *candidate);
+  static inline bool equal (const T *existing, const T *candidate);
+};
+
+inline hashval_t
+charstar_hash::hash (const T *candidate)
+{
+  return htab_hash_string (cand

Re: [PATCH] rs6000: Remove 'A' output modifier

2012-10-03 Thread David Edelsohn

On Wed, Oct 3, 2012 at 6:28 PM, Segher Boessenkool
 wrote:
> It was used for old POWER only, which has been removed.
>
> Tested etc.; okay to apply?
>
>
> Segher
>
>
> 2012-10-04  Segher Boessenkool  
>
> gcc/
> * config/rs6000/rs6000.c (print_operand) ['A']: Delete.

Okay.

Thanks, David

Re: patch to fix

2012-10-03 Thread Mike Stump

On Oct 3, 2012, at 1:47 PM, Marc Glisse  wrote:
> did you consider making the size of wide_int a template parameter, now that 
> we are using C++? All with a convenient typedef or macro so it doesn't show. 
> I am asking because in vrp I do some arithmetic that requires 2*N+1 bits 
> where N is the size of double_int.

No, not really.  I'd maybe answer it this way, we put in a type (singular) to 
support all integral constants in all languages on a port.  Since we only 
needed 1, there was little need to templatize it.  By supporting all integral 
constants in all languages, there is little need for more.  If Ada say, wanted 
a 2048 bit integer, then, we just have it drop off the size it wants someplace 
and we would mix that in on a MAX(….) line, net result, the type we use would 
then directly support the needs of Ada.  If vpr wanted 2x of all existing 
modes, we could simply change the MAX equation and essentially double it; if 
people need that.  This comes as a cost, as the intermediate wide values are 
fixed size allocated (not variable); so these all would be larger.  For the 
longer lived values, no change, as these are variably sized as one would expect.

Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-10-03 Thread David Edelsohn

For C and C++, identifiers beginning with underscore and upper case
letter or with two underscores are reserve to the implementation.  C++
uses _Z for mangling.

Maybe Fortran could prepend "_F".  Something beginning with an
underscore seems like a much better choice, given the rules about
reserved identifiers.

Thanks, David

On Wed, Oct 3, 2012 at 5:00 PM, Tobias Burnus  wrote:
> David,
>
>
> David Edelsohn wrote:
>>
>> I am not sure why you chose a period and how best to correct this.
>
>
> Well, in principle any name which the user cannot enter would do. (Not
> enter: At least not as valid Fortran identifier.)
>
> The reason for choosing "." is that  is used elsewhere in
> gfortran for such identifier for the string-length variable belonging to
> , e.g. "._result" in trans-decl.c. I assume the reason that it
> didn't pop up with those is that those are local variables, but I wouldn't
> be surprised if it would break elsewhere.
>
> I wonder whether "@" would work, otherwise, one could also use "_". The only
> other problem is that it will break the ABI. On the other hand, it's a
> rather new feature and if we bump the .mod version number, the chance that
> one effectively forces the user to re-compile is rather high. So far we
> always bumped the .mod version number as something changed. There are also
> some other patches pending which effectively lead to a bump in the .mod
> version.
>
> (The .mod version won't affect code which doesn't use modules such as
> BLAS/LAPACK or any Fortran 66/77 code, but those won't be affected by the
> ABI change anyway as there the name doesn't propagate as it does with
> modules..)
>
>
> Thanks for investigating the test-suite failure.
>
> Tobias

Re: [wwwdocs] Buildstat update for 4.6

2012-10-03 Thread Gerald Pfeifer

On Tue, 2 Oct 2012, Tom G. Christensen wrote:
> Latest results for 4.6.x

Thanks, applied.  A bit unusual to see submissions of older versions
of one branch at this time, and from one and the same submitter...

Gerald

Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-10-03 Thread David Edelsohn

Maybe something like the following:

Index: trans-decl.c
===
--- trans-decl.c(revision 192019)
+++ trans-decl.c(working copy)
@@ -1097,9 +1097,9 @@

   /* Also prefix the mangled name.  */
   if (sym->module)
-   name = gfc_get_string (".__%s_MOD_%s", sym->module, sym->name);
+   name = gfc_get_string ("_F_%s_MOD_%s", sym->module, sym->name);
   else
-   name = gfc_get_string (".%s", sym->name);
+   name = gfc_get_string ("_F_%s", sym->name);

   length = build_decl (input_location,
   VAR_DECL, get_identifier (name),

Thanks, David

On Wed, Oct 3, 2012 at 5:00 PM, Tobias Burnus  wrote:
> David,
>
>
> David Edelsohn wrote:
>>
>> I am not sure why you chose a period and how best to correct this.
>
>
> Well, in principle any name which the user cannot enter would do. (Not
> enter: At least not as valid Fortran identifier.)
>
> The reason for choosing "." is that  is used elsewhere in
> gfortran for such identifier for the string-length variable belonging to
> , e.g. "._result" in trans-decl.c. I assume the reason that it
> didn't pop up with those is that those are local variables, but I wouldn't
> be surprised if it would break elsewhere.
>
> I wonder whether "@" would work, otherwise, one could also use "_". The only
> other problem is that it will break the ABI. On the other hand, it's a
> rather new feature and if we bump the .mod version number, the chance that
> one effectively forces the user to re-compile is rather high. So far we
> always bumped the .mod version number as something changed. There are also
> some other patches pending which effectively lead to a bump in the .mod
> version.
>
> (The .mod version won't affect code which doesn't use modules such as
> BLAS/LAPACK or any Fortran 66/77 code, but those won't be affected by the
> ABI change anyway as there the name doesn't propagate as it does with
> modules..)
>
>
> Thanks for investigating the test-suite failure.
>
> Tobias

Re: [wwwdocs] Buildstat update for 4.5

2012-10-03 Thread Gerald Pfeifer

On Tue, 2 Oct 2012, Tom G. Christensen wrote:
> Latest results for 4.5.x

Thanks, Tom!

Gerald

Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-10-03 Thread David Edelsohn

Another suggestion from Segher is "_F.", which is both reserve and
cannot conflict with C/C++ because identifiers cannot contain ".".

- David

Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-10-03 Thread Mike Stump

On Oct 3, 2012, at 4:49 PM, David Edelsohn  wrote:
> Another suggestion from Segher is "_F.", which is both reserve and
> cannot conflict with C/C++ because identifiers cannot contain ".".

The problem with that are machines that don't like "." in identifiers (aka 
NO_DOT_IN_LABEL)…  so one just has to be a little careful in using it.

Re: [Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars

2012-10-03 Thread David Edelsohn

On Wed, Oct 3, 2012 at 8:29 PM, Mike Stump  wrote:
> On Oct 3, 2012, at 4:49 PM, David Edelsohn  wrote:
>> Another suggestion from Segher is "_F.", which is both reserve and
>> cannot conflict with C/C++ because identifiers cannot contain ".".
>
> The problem with that are machines that don't like "." in identifiers (aka 
> NO_DOT_IN_LABEL)…  so one just has to be a little careful in using it.

That's a good reason to use "_F".  And NO_DOT_IN_LABEL systems will
not function with the current "." mangling.

- David

Ping: RFA: Process '*' in '@'-output-template alternatives

2012-10-03 Thread Joern Rennecke


The following patch is still awaiting review:

2011-09-19  J"orn Rennecke  

* genoutput.c (process_template): Process '*' in '@' alternatives.
* doc/md.texi (node Output Statement): Provide example for the above.

http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01422.html

Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch #2b

2012-10-03 Thread David Edelsohn

@@ -1115,7 +1118,8 @@ static const struct attribute_spec rs600
   { NULL,0, 0, false, false, false, NULL, false }
 };
 
-#ifndef MASK_STRICT_ALIGN
+#ifndef OPTION_MASK_STRICT_ALIGN
+#define OPTION_MASK_STRICT_ALIGN 0
 #define MASK_STRICT_ALIGN 0
 #endif
 #ifndef TARGET_PROFILE_KERNEL

Why does this fragment define OPTION_MASK_STRICT_ALIGN but does not
remove definition of MASK_STRICT_ALIGN?

Similarly for

-#ifndef MASK_64BIT
+#ifndef OPTION_MASK_64BIT
+#define OPTION_MASK_64BIT 0
 #define MASK_64BIT 0
 #endif

Why define both OPTION_MASK_64BIT and MASK_64BIT?

And

+/* Map OPTION_ back into TARGET_ options in rs6000_isa_flags.  */

Why set up correspondence for all OPTION_xxx flags back to TARGET_xxx flags?

Thanks, David

Re: [SH] PR 54760 - Add thread pointer built-ins and GBR displacement addressing

2012-10-03 Thread Kaz Kojima

Oleg Endo  wrote:
> This adds the two common built-in functions __builtin_thread_pointer and
> __builtin_set_thread_pointer to the SH port.
> I've done it in a way so that hopefully it can be transitioned to target
> independent thread pointer built-ins easily, as suggested by Richard a
> while ago:
> http://gcc.gnu.org/ml/gcc-patches/2012-07/msg00946.html
> 
> Originally I wanted to wait until the target independent bits are in,
> but somehow the thread mentioned above died and I got impatient.
> 
> I've also added support for SH's GBR based displacement addressing
> modes.  They are not used for general purpose mem loads/stores by the
> compiler, but rather when code accesses data behind the thread pointer
> (thread control block or something).
> The way GBR displacement address opportunities are discovered might not
> be the best way of doing this sort of thing, but it works.  The insn
> walking could potentially slow down compile times, but it is only
> enabled for functions where the GBR is referenced, so it shouldn't be so
> bad.  Alternatives and suggestions are highly appreciated.  :)
> 
> Tested on rev 191894 with 'make all' (c,c++) and
> 'make -k check-gcc RUNTESTFLAGS="sh.exp --target_board=sh-sim
> \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"'
> 
> For code that doesn't reference GBR there are no functional changes.
> TLS code that references GBR might trigger the 'sh_find_equiv_gbr_addr'
> function.  Unfortunately TLS tests don't seem to work on sh-sim, so I
> could not test this part.

sh4-unknown-linux-gnu build failed during compiling libmudflap:

/exp/ldroot/dodes/xsh-gcc/./gcc/xgcc -B/exp/ldroot/dodes/xsh-gcc/./gcc/ 
-B/usr/local/sh4-unknown-linux-gnu/bin/ -B/usr/local/sh4-unknown-linux-gnu/lib/ 
-isystem /usr/local/sh4-unknown-linux-gnu/include -isystem 
/usr/local/sh4-unknown-linux-gnu/sys-include -DHAVE_CONFIG_H -I. 
-I../../../LOCAL/trunk/libmudflap -DLIBMUDFLAPTH -g -O2 -MT 
libmudflapth_la-mf-runtime.lo -MD -MP -MF .deps/libmudflapth_la-mf-runtime.Tpo 
-c ../../../LOCAL/trunk/libmudflap/mf-runtime.c -o libmudflapth_la-mf-runtime.o
../../../LOCAL/trunk/libmudflap/mf-runtime.c: In function 
'begin_recursion_protect1':
../../../LOCAL/trunk/libmudflap/mf-runtime.c:152:1: internal compiler error: 
Segmentation fault
 }
 ^
0x8529c60 crash_signal
../../LOCAL/trunk/gcc/toplev.c:335
0x8771a87 sh_find_base_reg_disp
../../LOCAL/trunk/gcc/config/sh/sh.c:13344
0x8791554 sh_find_equiv_gbr_addr(rtx_def*, rtx_def*)
../../LOCAL/trunk/gcc/config/sh/sh.c:13395
0x87ce6cf gen_split_1029(rtx_def*, rtx_def**)
../../LOCAL/trunk/gcc/config/sh/sh.md:10184
0x87ea6e0 split_1
../../LOCAL/trunk/gcc/config/sh/sh.md:10183
0x87ea6e0 split_3
../../LOCAL/trunk/gcc/config/sh/sh.md:7082
0x82bf8a1 try_split(rtx_def*, rtx_def*, int)
../../LOCAL/trunk/gcc/emit-rtl.c:3503
0x849c642 split_insn
../../LOCAL/trunk/gcc/recog.c:2809
0x84a08b5 split_all_insns()
../../LOCAL/trunk/gcc/recog.c:2899
0x84a09a7 rest_of_handle_split_all_insns
../../LOCAL/trunk/gcc/recog.c:3751

Looks prev_nonnote_insn returns a barrier there:

(gdb) fr 0
#0  0x08771a87 in sh_find_base_reg_disp (insn=, x=0xb79ddd40, 
base_reg=0x0, disp=0) at ../../LOCAL/trunk/gcc/config/sh/sh.c:13344
13344 if (p != NULL && GET_CODE (p) == SET && REG_P (XEXP (p, 0))
(gdb) p p
$1 = (rtx) 0xafafafaf
(gdb) p i
$2 = (rtx_def *) 0xb79ecc3c
(gdb) call debug_rtx(i)
(barrier 216 215 217)

>   (builtin_description): Add is_enabled member.
>   (shmedia_builtin, sh1_builtin): New functions.
>   (signature_args): Add SH_BLTIN_VP.
>   (bdesc): Use shmedia_builtin for existing built-ins.  Add 
>   __builtin_thread_pointer and __builtin_set_thread_pointer as 
>   sh1_builtin.
>   (sh_media_init_builtins, sh_init_builtins): Merge into single 
>   function sh_init_builtins.  Add is_enabled checking.
>   (sh_media_builtin_decl, sh_builtin_decl): Merge into single 
>   function sh_builtin_decl.  Add is_enabled checking.

It would be better to separate this part except new thread pointer
builtins into an independent patch which should be tested also with
sh64-elf build, though now unified sh64-elf build is failing.
I'd like to commit a quick fix for sh64-elf build failure.

Regards,
kaz

[patch committed SH] Fix sh64-elf build failure

2012-10-03 Thread Kaz Kojima

Hi,

I've committed the patch below to fix sh64-elf build failure.
SHmedia and SHcompact using call cookie require special return insns
and will require an extra work to enable simple_return.  I simply
disable it for these targets ATM.

Regards,
kaz
--
2012-10-04  Kaz Kojima  

* config/sh/sh.c (sh_can_use_simple_return_p): Return false for
SHmedia and SHcompact using call cookie.
* config/sh/sh.md (epilogue): Emit non-inlined return insns for
SHmedia and SHcompact using call cookie.

Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md (revision 192060)
+++ gcc/config/sh/sh.md (working copy)
@@ -10460,6 +10460,13 @@
   ""
 {
   sh_expand_epilogue (false);
+  if (TARGET_SHMEDIA
+  || (TARGET_SHCOMPACT
+ && (crtl->args.info.call_cookie & CALL_COOKIE_RET_TRAMP (1
+{
+  emit_jump_insn (gen_return ());
+  DONE;
+}
 })
 
 (define_expand "eh_return"
Index: gcc/config/sh/sh.c
===
--- gcc/config/sh/sh.c  (revision 192060)
+++ gcc/config/sh/sh.c  (working copy)
@@ -13134,6 +13134,12 @@
   HARD_REG_SET live_regs_mask;
   int d;
 
+  /* Some targets require special return insns.  */
+  if (TARGET_SHMEDIA
+  || (TARGET_SHCOMPACT
+ && (crtl->args.info.call_cookie & CALL_COOKIE_RET_TRAMP (1
+return false;
+
   if (! reload_completed || frame_pointer_needed)
 return false;

Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-03 Thread Vladimir Makarov


On 12-10-03 11:35 AM, Steven Bosscher wrote:

On Wed, Oct 3, 2012 at 4:56 PM, Vladimir Makarov  wrote:

On 12-10-03 3:13 AM, Steven Bosscher wrote:

On Tue, Oct 2, 2012 at 3:42 PM, Richard Sandiford
 wrote:

+/* Compress pseudo live ranges by removing program points where
+   nothing happens.  Complexity of many algorithms in LRA is linear
+   function of program points number.  To speed up the code we try to
+   minimize the number of the program points here.  */
+static void
+remove_some_program_points_and_update_live_ranges (void)

Genuine question, but could we do this on the fly instead,
by not incrementing curr_point if the current point had no value?

I suppose the main complication would be checking cases where
all births are recorded by extending the previous just-closed live
range rather than starting a new one, in which case it's the previous
point that needs to be reused.  Hmm...

It does seem to be something worth investigating further. Things like:

Compressing live ranges: from 1742579 to 554532 - 31%
Compressing live ranges: from 1742569 to 73069 - 4%

look extreme, but they're actually the norm. For the same test case
(PR54146 still) but looking only at functions with initially 1000-
program points, you get this picture:

Compressing live ranges: from 1766 to 705 - 39%
Compressing live ranges: from 1449 to 370 - 25%
Compressing live ranges: from 3939 to 1093 - 27%
Compressing live ranges: from 3939 to 1093 - 27%
Compressing live ranges: from 3939 to 1093 - 27%
Compressing live ranges: from 3939 to 1093 - 27%
Compressing live ranges: from 2464 to 676 - 27%
Compressing live ranges: from 1433 to 379 - 26%
Compressing live ranges: from 1261 to 348 - 27%
Compressing live ranges: from 2835 to 755 - 26%
Compressing live ranges: from 5426 to 1678 - 30%
Compressing live ranges: from 5227 to 1477 - 28%
Compressing live ranges: from 1845 to 467 - 25%
Compressing live ranges: from 4868 to 1378 - 28%
Compressing live ranges: from 4875 to 1388 - 28%
Compressing live ranges: from 4882 to 1384 - 28%
Compressing live ranges: from 5688 to 1714 - 30%
Compressing live ranges: from 4943 to 1310 - 26%
Compressing live ranges: from 2976 to 792 - 26%
Compressing live ranges: from 5463 to 1526 - 27%
Compressing live ranges: from 2854 to 730 - 25%
Compressing live ranges: from 1810 to 745 - 41%
Compressing live ranges: from 2771 to 904 - 32%
Compressing live ranges: from 4916 to 1429 - 29%
Compressing live ranges: from 6505 to 2238 - 34%
Compressing live ranges: from 6493 to 166 - 2%
Compressing live ranges: from 5498 to 1734 - 31%
Compressing live ranges: from 1810 to 745 - 41%
Compressing live ranges: from 5043 to 1420 - 28%
Compressing live ranges: from 3094 to 788 - 25%
Compressing live ranges: from 4563 to 1311 - 28%
Compressing live ranges: from 4557 to 158 - 3%
Compressing live ranges: from 1050 to 274 - 26%
Compressing live ranges: from 1602 to 434 - 27%
Compressing live ranges: from 2474 to 600 - 24%
Compressing live ranges: from 2718 to 866 - 31%
Compressing live ranges: from 2097 to 716 - 34%
Compressing live ranges: from 4152 to 1099 - 26%
Compressing live ranges: from 5065 to 1514 - 29%
Compressing live ranges: from 1236 to 359 - 29%
Compressing live ranges: from 1722 to 541 - 31%
Compressing live ranges: from 5186 to 1401 - 27%

Unfortunately the dump doesn't mention how many live ranges could be
merged thanks to the compression.

It'd also be good to understand why the compression ratios are so
small, and consistently around ~30%.

This sentence is not clear to me.  30% means 3 times less points. It is
pretty good compression.


   Maybe curr_point includes things
it should ignore (DEBUG_INSNs, NOTEs, ...)?


After the compression, there are only points important for conflict info,
i.e. only points where some reg dies or start living.  Even more if on the
subsequent points there are only life starts or only deaths,  they are
represented by one point after the compression.

With this patch the amount of compression is reduced. Without the
patch the compression ratio is typically around 30%, with the patch
this goes up to ~65%. The worst compression ratios (where worse is
better in this case :-) are:

$ grep Compressing log4 | egrep " [78].%"
Compressing live ranges: from 31 to 22 - 70%, pre_count 17, post_count 15
Compressing live ranges: from 128 to 94 - 73%, pre_count 87, post_count 77
Compressing live ranges: from 32 to 23 - 71%, pre_count 16, post_count 15
Compressing live ranges: from 38 to 29 - 76%, pre_count 19, post_count 18
Compressing live ranges: from 40 to 28 - 70%, pre_count 26, post_count 24
Compressing live ranges: from 40 to 28 - 70%, pre_count 26, post_count 24
Compressing live ranges: from 40 to 28 - 70%, pre_count 26, post_count 24
Compressing live ranges: from 60 to 43 - 71%, pre_count 37, post_count 33
Compressing live ranges: from 60 to 43 - 71%, pre_count 37, post_count 33
Compressing live ranges: from 125 to 89 - 71%, pre_count 71, post_count 62
Compressing live ranges: fr

Re: [PATCH] Improve debug info for partial inlining (PR debug/54519, take 2)

2012-10-03 Thread Alexandre Oliva

On Oct  3, 2012, Jakub Jelinek  wrote:

> basically there is a non-addressable parameter in stack slot, and
> vt_canon_true_dep -> canon_true_dependence thinks an argument push
> insn might alias with it, because it doesn't have a MEM_EXPR and
> ao_ref_from_mem fails.

I have a pending (still unreviewed) patch that might address :-), so to
speak, this problem.

for  gcc/ChangeLog
from  Alexandre Oliva  

	PR debug/53671
	PR debug/49888
	* var-tracking.c (attrs_list_by_loc_eq): New.
	(track_stack_pointer): New.
	(dataflow_set_merge): Use it.
	(vt_initialize): Record the initial stack pointer in the
	dataflow set.

Index: gcc/var-tracking.c
===
--- gcc/var-tracking.c.orig	2012-06-26 17:33:12.991323578 -0300
+++ gcc/var-tracking.c	2012-06-26 17:51:55.316453808 -0300
@@ -1462,6 +1462,17 @@ attrs_list_member (attrs list, decl_or_v
   return NULL;
 }
 
+/* Return the entry whose LOC field equals LOC.  */
+
+static attrs
+attrs_list_by_loc_eq (attrs list, rtx loc)
+{
+  for (; list; list = list->next)
+if (list->loc == loc)
+  return list;
+  return NULL;
+}
+
 /* Insert the triplet DECL, OFFSET, LOC to the list *LISTP.  */
 
 static void
@@ -4028,6 +4039,86 @@ variable_merge_over_src (variable s2var,
   return 1;
 }
 
+/* Add to DST any needed binding for the canonicalization of the stack
+   pointer to yield the same expression as in SRC1 and SRC2, if they
+   both yield the same expression.
+
+   Return TRUE iff we found an equivalence.
+
+   ??? The return value, that was useful during testing, ended up
+   unused, but this single-use static function will be inlined, and
+   then the return value computation will be optimized out, so I'm
+   leaving it in.
+
+   ??? We use this kludge to avoid accidental aliasing between
+   incoming arguments and register-saving or outgoing-args pushes.  We
+   shouldn't have to add explicit stack pointer tracking for that:
+   intersect_loc_chains ought to be able to take information from the
+   static cselib table and recognize additional equivalences, but we
+   don't have a sufficiently efficient algorithm to do that yet.  */
+
+static bool
+track_stack_pointer (dataflow_set *dst, dataflow_set *src1, dataflow_set *src2)
+{
+  attrs dattr, s1attr, s2attr;
+  rtx daddr, s1addr, s2addr;
+  decl_or_value dv;
+
+  for (dattr = dst->regs[STACK_POINTER_REGNUM];
+   (dattr = attrs_list_by_loc_eq (dattr, stack_pointer_rtx))
+	 && (dattr->offset || !dv_is_value_p (dattr->dv));
+   dattr = dattr->next)
+;
+
+  for (s1attr = src1->regs[STACK_POINTER_REGNUM];
+   (s1attr = attrs_list_by_loc_eq (s1attr, stack_pointer_rtx))
+	 && (s1attr->offset || !dv_is_value_p (s1attr->dv));
+   s1attr = s1attr->next)
+;
+  if (!s1attr)
+return false;
+
+  for (s2attr = src2->regs[STACK_POINTER_REGNUM];
+   (s2attr = attrs_list_by_loc_eq (s2attr, stack_pointer_rtx))
+	 && (s2attr->offset || !dv_is_value_p (s2attr->dv));
+   s2attr = s2attr->next)
+;
+  if (!s2attr)
+return false;
+
+  if (dattr)
+daddr = vt_canonicalize_addr (dst, dv_as_value (dattr->dv));
+  else
+daddr = NULL;
+  s1addr = vt_canonicalize_addr (src1, dv_as_value (s1attr->dv));
+  s2addr = vt_canonicalize_addr (src2, dv_as_value (s2attr->dv));
+
+  if (!rtx_equal_p (s1addr, s2addr))
+return false;
+
+  if (daddr && rtx_equal_p (daddr, s1addr))
+return true;
+
+  dst_can_be_shared = false;
+  if (daddr)
+dv = dattr->dv;
+  else if (vt_get_canonicalize_base (s1addr)
+	   != (cfa_base_rtx ? cfa_base_rtx : arg_pointer_rtx))
+return false;
+  else
+{
+  cselib_val *val = cselib_lookup (s1addr, GET_MODE (s1addr), 1, VOIDmode);
+  cselib_preserve_value (val);
+  dv = dv_from_value (val->val_rtx);
+}
+
+  var_reg_decl_set (dst, stack_pointer_rtx,
+		VAR_INIT_STATUS_INITIALIZED,
+		dv, 0, NULL_RTX, INSERT);
+
+  return true;
+}
+
 /* Combine dataflow set information from SRC2 into DST, using PDST
to carry over information across passes.  */
 
@@ -4066,6 +4157,8 @@ dataflow_set_merge (dataflow_set *dst, d
   FOR_EACH_HTAB_ELEMENT (shared_hash_htab (dsm.cur->vars), var, variable, hi)
 variable_merge_over_cur (var, &dsm);
 
+  track_stack_pointer (dst, src1, src2);
+
   if (dsm.src_onepart_cnt)
 dst_can_be_shared = false;
 
@@ -9682,6 +9775,18 @@ vt_initialize (void)
 	  expr = plus_constant (GET_MODE (reg), reg, ofst);
 	  cselib_add_permanent_equiv (val, expr, get_insns ());
 	}
+
+  /* VAL below will generally be the one set within the
+	 conditional block, but if OFST happens to be zero, we'll be
+	 happy to use the one corresponding to REG.
+
+	 ??? We shouldn't need this any more once dataflow merges
+	 start using equivalences from the cselib table in addition to
+	 those in dataflow sets.  */
+  var_reg_decl_set (&VTI (ENTRY_BLOCK_PTR)->out,
+			stack_pointer_rtx, VAR_INIT_STATUS_INITIALIZED,
+			dv_from_value (val->val_rtx), 0, NULL_RTX,
+			INSERT);

Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-03 Thread Steven Bosscher

On Thu, Oct 4, 2012 at 5:30 AM, Vladimir Makarov  wrote:
> I was going to look at this code too but I was interesting in generation of
> less points and live ranges.  It is strange that in my profiles,
> remove_some_program_points_and_update_live_ranges takes 0.6% of compiler
> time on these huge tests.   So I was not interesting to speed up the
> function and may be therefore you have no visible change in compilation
> time.

Right. The compression algorithm doesn't care much about the initial
number of program points, only about the number of live ranges before
and after compression. I had expected a bigger effect on the number of
live ranges before compression.

0.6% sounds really very different from my timings. How much time does
create_start_finish_chains take for you?

> I don't object the idea of the patch.  I need some time to look at it (the
> different results on a function is a bit scary for me) and check simulator
> times on other tests.

Understood.

FWIW you can find the test results from before and after the patch here:
http://gcc.gnu.org/ml/gcc-testresults/2012-10/msg00267.html
http://gcc.gnu.org/ml/gcc-testresults/2012-10/msg00303.html

Ciao!
Steven

82 matches

Mail list logo