Re: [patch,avr] Fix PR 65657 - read from __memx address space tramples arguments to function call

2015-04-16 Thread Senthil Kumar Selvaraj
On Thu, Apr 16, 2015 at 11:02:05AM +0200, Georg-Johann Lay wrote:
> Am 04/16/2015 um 08:43 AM schrieb Senthil Kumar Selvaraj:
> >This patch fixes PR 65657.
> 
> The following artifact appears to be PR63633.
> 

I did see that one - unfortunately, that fix won't help here. IIUC, you
check if input/output operand hard regs are in the clobber list, 
and if yes, you generate pseudos to save and restore clobbered hard
regs.

In this case, the reg is actually clobbered by a different insn (one
that loads the next argument to the function). So unless I blindly generate 
pseudos for 
all regs in the clobber list, the clobbering will still happen.

FWIW, I did try saving/restoring all clobbered regs, and it did fix the 
problem - just that it appeared like a (worse) hack to me. Aren't we
manually replicating what RA/reload should be doing?

What do you think?

> >When cfgexpand.c expands a function call, it first figures out the
> >registers in which the arguments will go, followed by expansion of the
> >arguments themselves (right to left). It later emits mov insns to set
> >the precomputed registers with the expanded RTXes.
> >
> >If one of the arguments is a __memx char pointer dereference, the mov
> >eventually expands to gen_xload_A (for certain cases), which
> >clobbers R22, R21 and Z.  This causes problems if one of those
> >clobbered registers was precomputed to hold another argument.
> >
> >In general, call expansion does not appear to take clobbers into account -
> >when it checks for argument overlap, the RTX (args[i].value) is only a MEM
> >in QImode for the memx deref - the clobber shows up when it eventually
> >calls emit_move_insn, at which point, it is too late.
> >
> >This does not happen for a int pointer dereference - turns out that
> >precompute_register_parameters does a copy_to_mode_reg if the
> >cost of args[i].value is more than COSTS_N_INSNS(1) i.e., it creates a
> >pseudo and later assigns the pseudo to the arg register. This is done
> >before any moves to arg registers is done, so other arguments are not
> >overwritten.
> >
> >Doing the same thing - providing a better cost estimate for a MEM rtx in
> >the non-default address space, makes this problem go away, and that is
> >what this patch does. Regression testing does not show any new failures.
> >
> >The fix does seem tangential to the core problem - not sure if there is
> >a better way to fix this?
> >
> >If not, could someone commit this please? I don't have commit access.
> >
> >Regards
> >Senthil
> >
> >gcc/ChangeLog
> >
> >2015-04-16  Senthil Kumar Selvaraj  
> >
> > * config/avr/avr.c (avr_rtx_costs_1): Increase cost for
> > MEM rtx in non-default address space.
> >
> >gcc/testsuite/ChangeLog
> >
> >2015-04-16  Senthil Kumar Selvaraj  
> >
> > * gcc.target/avr/pr65657.c: New.
> >
> >diff --git gcc/config/avr/avr.c gcc/config/avr/avr.c
> >index 68d5ddc..c9b8678 100644
> >--- gcc/config/avr/avr.c
> >+++ gcc/config/avr/avr.c
> >@@ -9959,7 +9959,11 @@ avr_rtx_costs_1 (rtx x, int codearg, int outer_code 
> >ATTRIBUTE_UNUSED,
> >return true;
> >
> >  case MEM:
> >-  *total = COSTS_N_INSNS (GET_MODE_SIZE (mode));
> >+  /* MEM rtx with non-default address space is more
> >+ expensive. Not expressing that results in reg
> >+ clobber during expand (PR 65657). */
> >+  *total = COSTS_N_INSNS (GET_MODE_SIZE (mode)
> >+  + (MEM_ADDR_SPACE(x) == ADDR_SPACE_RAM ? 0 : 5));
> >return true;
> 
> IMO costs should not be used to fix wrong code or ICEs.  Presumably the fix
> is similar to the one used by PR63633.

Yes, fixing it by adjusting costs is fragile, but it appeared cleaner to me.

Regards
Senthil

> 
> Johann
> 
> >  case NEG:
> >diff --git gcc/testsuite/gcc.target/avr/pr65657.c 
> >gcc/testsuite/gcc.target/avr/pr65657.c
> >new file mode 100644
> >index 000..d908fe3
> >--- /dev/null
> >+++ gcc/testsuite/gcc.target/avr/pr65657.c
> >@@ -0,0 +1,39 @@
> >+/* { dg-do run } */
> >+/* { dg-options "-Os" } */
> >+
> >+/* When cfgexpand expands the call to foo, it
> >+   assigns registers R22:R23 to 0xABCD and R24
> >+   to *x. Because x is a __memx address space
> >+   pointer, the dereference results in an RTL
> >+   insn that clobbers R22 among other
> >+   registers (xload_A).
> >+
> >+   Call expansion does not take this into account
> >+   and ends up clobbering R22 set previously to
> >+   hold part of the second argument.
> >+*/
> >+
> >+#include 
> >+
> >+void __attribute__((noinline))
> >+__attribute__((noclone))
> >+foo (char c, volatile unsigned int d)
> >+{
> >+if (d != 0xABCD)
> >+  abort();
> >+if (c != 'A')
> >+abort();
> >+}
> >+
> >+void __attribute__((noinline))
> >+__attribute__((noclone))
> >+readx(const char __memx *x)
> >+{
> >+foo (*x, 0xABCD);
> >+}
> >+
> >+const char __memx arr[] = { 'A' };
> >+int main()
> >+{
> >+   readx (arr);
> >+}
> >
> 


Re: [PING][PATCH][4.9]Backport "Fix register corruption bug in ree"

2015-04-16 Thread Renlin Li

Ping~

Regards,
Renlin Li

On 16/04/15 10:09, wrote:

Ping~
Anybody has time to review it?


Regards,
Renlin Li

On 06/02/15 17:48, Renlin Li wrote:

Hi all,

This is a backport patch for branch 4.9. You can find the original=20
patch here:https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00356.html
And it has been commit on the trunk as r215205.

This fixes a few libstdc++-v3 test suite failures.
x86_64 bootstraps Okay, aarch64_be-none-elf libstdc++-v3 tested Okay.

Okay to commit on branch 4.9?

Regards,
Renlin Li

2015-02-06  Renlin Li

 Backport from mainline
 2014-09-12  Wilco Dijkstra

 * gcc/ree.c (combine_reaching_defs): Ensure inserted copy don't=20
change
 the number of hard registers.





[Patch/rtl-expand] Take tree range info into account to improve LSHIFT_EXP expanding

2015-04-16 Thread Jiong Wang

This is a rework of

  https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01998.html

After second thinking, I feel it's better to fix this in earlier stage
during RTL expand which is more generic, and we also avoid making the
already complex combine pass complexer.

Currently gcc expand wide mode left shift to some generic complex
instruction sequences, while if we have known the high part of wide mode
all comes from sign extension, the expand logic could be simplifed.

Given the following example,

T A = (T) B  << const_imm_shift

We know the high part of A are all comes from sign extension, if

* T is the next wider type of word_mode.

For example, for aarch64, if type T is 128int (TImode), and B is with
type SImode or DImode, then tree analyzer know that the high part of
TImode result all comes from sign extension, and kept them in range info.

 |<   T  >|
 |   high |   low |
  |<- sizel ->|

For above example, we could simplify the expand logic into
 1. low = low << const_imm_shift;
 2. high = low >> (sizel - const_imm_shift)  */

We can utilize the arithmetic right shift to do the sign
extension. Those reduntant instructions will be optimized out later.

For actual .s improvement,

AArch64
===

  __int128_t
  foo (int data)
  {
return (__int128_t) data << 50;
  }

  old:
sxtwx2, w0
asr x1, x2, 63
lsl x0, x2, 50
lsl x1, x1, 50
orr x1, x1, x2, lsr 14
 
  new:
sxtwx1, w0
lsl x0, x1, 50
asr x1, x1, 14


ARM (.fpu softvfp)
===

  long long
  shift (int data)
  {
return (long long) data << 20;
  }
 
  old:
stmfd   sp!, {r4, r5}
mov r5, r0, asr #31
mov r3, r0
mov r0, r0, asl #20
mov r1, r5, asl #20
orr r1, r1, r3, lsr #12
ldmfd   sp!, {r4, r5}
bx  lr

  new:
mov r1, r0
mov r0, r0, asl #20
mov r1, r1, asr #12
bx  lr

Test


  x86 bootstrap OK, regression test OK.
  AArch64 bootstrap OK, regression test on board OK.

Regards,
Jiong

2015-04-116  Jiong.Wang  

gcc/
  * expr.c (expand_expr_real_2): Take tree range info into account when
  expanding LSHIFT_EXPR.

gcc/testsuite
  * gcc.dg/wide_shift_64_1.c: New testcase.
  * gcc.dg/wide_shift_128_1.c: Ditto.
  * gcc.target/aarch64/ashlti3_1.c: Ditto.
  * gcc.target/arm/ashldisi_1.c: Ditto.
  
diff --git a/gcc/expr.c b/gcc/expr.c
index 89ca129..96d64cc 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8984,23 +8984,85 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 
 case LSHIFT_EXPR:
 case RSHIFT_EXPR:
-  /* If this is a fixed-point operation, then we cannot use the code
-	 below because "expand_shift" doesn't support sat/no-sat fixed-point
- shifts.   */
-  if (ALL_FIXED_POINT_MODE_P (mode))
-	goto binop;
-
-  if (! safe_from_p (subtarget, treeop1, 1))
-	subtarget = 0;
-  if (modifier == EXPAND_STACK_PARM)
-	target = 0;
-  op0 = expand_expr (treeop0, subtarget,
-			 VOIDmode, EXPAND_NORMAL);
-  temp = expand_variable_shift (code, mode, op0, treeop1, target,
-unsignedp);
-  if (code == LSHIFT_EXPR)
-	temp = REDUCE_BIT_FIELD (temp);
-  return temp;
+  {
+	/* If this is a fixed-point operation, then we cannot use the code
+	   below because "expand_shift" doesn't support sat/no-sat fixed-point
+	   shifts.  */
+	if (ALL_FIXED_POINT_MODE_P (mode))
+	  goto binop;
+
+	if (! safe_from_p (subtarget, treeop1, 1))
+	  subtarget = 0;
+	if (modifier == EXPAND_STACK_PARM)
+	  target = 0;
+
+	op0 = expand_expr (treeop0, subtarget,
+			   VOIDmode, EXPAND_NORMAL);
+
+	/* If mode == GET_MODE_WIDER_MODE (word_mode),
+	   then normally, there will no native instructions to support
+	   this wide mode left shift.
+
+	   given below example,
+
+	   T A = (T) B  << C
+
+	   |<		T	   >|
+	   |   high |   low |
+
+			|<- sizel ->|
+
+	   if from range info, we could deduce that the high part are all sign
+	   bit extension, then this left shift operation could be largely
+	   simplified into.
+
+	 1. low = low << C;
+	 2. high = low >> (sizel - C)  */
+
+	int o_bits = GET_MODE_SIZE (mode) * BITS_PER_UNIT;
+	wide_int min, max;
+
+	if (code == LSHIFT_EXPR
+	&& !unsignedp
+	&& mode == GET_MODE_WIDER_MODE (word_mode)
+	&& !have_insn_for (LSHIFT_EXPR, mode)
+	&& TREE_CONSTANT (treeop1)
+	&& get_range_info (treeop0, &min, &max) == VR_RANGE
+	&& (wi::cmp (min,
+			 wide_int::from (wi::min_value
+	 ((unsigned) (BITS_PER_WORD),
+	  SIGNED), o_bits, SIGNED),
+			 SIGNED) != -1)
+	&& (wi::cmp (max,
+			 wide_int::from (wi::max_value
+	 ((unsigned)(BITS_PER_WORD),
+	  SIGNED), o_bits, SIGNED),
+			 SIGNED) != 1))
+	  {
+	rtx low = simplify_gen_subreg (word_mode, op0, mode, 0);
+	rtx t_low = simplify_gen_subreg (word_mode, target, mode, 0);
+	rtx t_high = simplify_gen_subreg (word_mode, target, mode,
+	  UNITS_PER_WORD);
+	tree high_shi

[PATCH, i386]: Rewrite print_reg

2015-04-16 Thread Uros Bizjak
Hello!

Attached patch rewrites print_reg to optimize it and make it more
readable. Currently, it looks like 64bit registers were bolted on as
an afterthought.

2015-04-16  Uros Bizjak  

* config/i386/i386.c (print_reg): Rewrite function.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32} and
committed to mainline SVN.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 222131)
+++ config/i386/i386.c  (working copy)
@@ -15191,8 +15191,9 @@ void
 print_reg (rtx x, int code, FILE *file)
 {
   const char *reg;
+  int msize;
   unsigned int regno;
-  bool duplicated = code == 'd' && TARGET_AVX;
+  bool duplicated;
 
   if (ASSEMBLER_DIALECT == ASM_ATT)
 putc ('%', file);
@@ -15204,82 +15205,49 @@ print_reg (rtx x, int code, FILE *file)
   return;
 }
 
-  regno = true_regnum (x);
-  gcc_assert (regno != ARG_POINTER_REGNUM
- && regno != FRAME_POINTER_REGNUM
- && regno != FLAGS_REG
- && regno != FPSR_REG
- && regno != FPCR_REG);
+  if (code == 'y' && STACK_TOP_P (x))
+{
+  fputs ("st(0)", file);
+  return;
+}
 
   if (code == 'w')
-code = 2;
+msize = 2;
   else if (code == 'b')
-code = 1;
+msize = 1;
   else if (code == 'k')
-code = 4;
+msize = 4;
   else if (code == 'q')
-code = 8;
-  else if (code == 'y')
-code = 3;
+msize = 8;
   else if (code == 'h')
-code = 0;
+msize = 0;
   else if (code == 'x')
-code = 16;
+msize = 16;
   else if (code == 't')
-code = 32;
+msize = 32;
   else if (code == 'g')
-code = 64;
+msize = 64;
   else
-code = GET_MODE_SIZE (GET_MODE (x));
+msize = GET_MODE_SIZE (GET_MODE (x));
 
-  /* Irritatingly, AMD extended registers use different naming convention
- from the normal registers: "r%d[bwd]"  */
-  if (REX_INT_REGNO_P (regno))
-{
-  gcc_assert (TARGET_64BIT);
-  putc ('r', file);
-  fprint_ul (file, regno - FIRST_REX_INT_REG + 8);
-  switch (code)
-   {
- case 0:
-   error ("extended registers have no high halves");
-   break;
- case 1:
-   putc ('b', file);
-   break;
- case 2:
-   putc ('w', file);
-   break;
- case 4:
-   putc ('d', file);
-   break;
- case 8:
-   /* no suffix */
-   break;
- default:
-   error ("unsupported operand size for extended register");
-   break;
-   }
-  return;
-}
+  regno = true_regnum (x);
 
-  reg = NULL;
-  switch (code)
+  gcc_assert (regno != ARG_POINTER_REGNUM
+ && regno != FRAME_POINTER_REGNUM
+ && regno != FLAGS_REG
+ && regno != FPSR_REG
+ && regno != FPCR_REG);
+
+  duplicated = code == 'd' && TARGET_AVX;
+
+  switch (msize)
 {
-case 3:
-  if (STACK_TOP_P (x))
-   {
- reg = "st(0)";
- break;
-   }
-  /* FALLTHRU */
 case 8:
 case 4:
+  if (LEGACY_INT_REGNO_P (regno))
+   putc (msize == 8 ? 'r' : 'e', file);
+case 16:
 case 12:
-  if (LEGACY_INT_REG_P (x))
-   putc (code == 8 && TARGET_64BIT ? 'r' : 'e', file);
-  /* FALLTHRU */
-case 16:
 case 2:
 normal:
   reg = hi_reg_name[regno];
@@ -15296,19 +15264,49 @@ print_reg (rtx x, int code, FILE *file)
   break;
 case 32:
 case 64:
-  if (SSE_REG_P (x))
+  if (SSE_REGNO_P (regno))
{
  gcc_assert (!duplicated);
- putc (code == 32 ? 'y' : 'z', file);
- fputs (hi_reg_name[regno] + 1, file);
- return;
+ putc (msize == 32 ? 'y' : 'z', file);
+ reg = hi_reg_name[regno] + 1;
+ break;
}
-  break;
+  goto normal;
 default:
   gcc_unreachable ();
 }
 
   fputs (reg, file);
+
+  /* Irritatingly, AMD extended registers use
+ different naming convention: "r%d[bwd]"  */
+  if (REX_INT_REGNO_P (regno))
+{
+  gcc_assert (TARGET_64BIT);
+  switch (msize)
+   {
+ case 0:
+   error ("extended registers have no high halves");
+   break;
+ case 1:
+   putc ('b', file);
+   break;
+ case 2:
+   putc ('w', file);
+   break;
+ case 4:
+   putc ('d', file);
+   break;
+ case 8:
+   /* no suffix */
+   break;
+ default:
+   error ("unsupported operand size for extended register");
+   break;
+   }
+  return;
+}
+
   if (duplicated)
 {
   if (ASSEMBLER_DIALECT == ASM_ATT)


[PATCH] -Warray-bounds TLC

2015-04-16 Thread Richard Biener

The following applies the patch produced earlier this year, applying
TLC to array bound warnings and catching a few more cases.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-04-16  Richard Biener  

PR tree-optimization/64277
* tree-vrp.c (check_array_ref): Fix anti-range handling,
simplify upper bound handling.
(search_for_addr_array): Simplify.
(check_array_bounds): Handle ADDR_EXPRs here.
(check_all_array_refs): Simplify.

* gcc.dg/Warray-bounds-12.c: New testcase.
* gcc.dg/Warray-bounds-13.c: Likewise.
* c-c++-common/ubsan/bounds-4.c: Disable -Warray-bounds.
* c-c++-common/ubsan/bounds-6.c: Likewise.

Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c.orig 2015-01-27 10:47:58.812933951 +0100
--- gcc/tree-vrp.c  2015-01-27 11:18:02.129291796 +0100
*** check_array_ref (location_t location, tr
*** 6533,6538 
--- 6533,6546 
up_bound_p1 = int_const_binop (PLUS_EXPR, up_bound,
 build_int_cst (TREE_TYPE (up_bound), 1));
  
+   /* Empty array.  */
+   if (tree_int_cst_equal (low_bound, up_bound_p1))
+ {
+   warning_at (location, OPT_Warray_bounds,
+ "array subscript is above array bounds");
+   TREE_NO_WARNING (ref) = 1;
+ }
+ 
if (TREE_CODE (low_sub) == SSA_NAME)
  {
vr = get_value_range (low_sub);
*** check_array_ref (location_t location, tr
*** 6546,6554 
if (vr && vr->type == VR_ANTI_RANGE)
  {
if (TREE_CODE (up_sub) == INTEGER_CST
!   && tree_int_cst_lt (up_bound, up_sub)
&& TREE_CODE (low_sub) == INTEGER_CST
!   && tree_int_cst_lt (low_sub, low_bound))
  {
warning_at (location, OPT_Warray_bounds,
  "array subscript is outside array bounds");
--- 6554,6564 
if (vr && vr->type == VR_ANTI_RANGE)
  {
if (TREE_CODE (up_sub) == INTEGER_CST
!   && (ignore_off_by_one
! ? tree_int_cst_lt (up_bound, up_sub)
! : tree_int_cst_le (up_bound, up_sub))
&& TREE_CODE (low_sub) == INTEGER_CST
!   && tree_int_cst_le (low_sub, low_bound))
  {
warning_at (location, OPT_Warray_bounds,
  "array subscript is outside array bounds");
*** check_array_ref (location_t location, tr
*** 6557,6566 
  }
else if (TREE_CODE (up_sub) == INTEGER_CST
   && (ignore_off_by_one
!  ? (tree_int_cst_lt (up_bound, up_sub)
! && !tree_int_cst_equal (up_bound_p1, up_sub))
!  : (tree_int_cst_lt (up_bound, up_sub)
! || tree_int_cst_equal (up_bound_p1, up_sub
  {
if (dump_file && (dump_flags & TDF_DETAILS))
{
--- 6567,6574 
  }
else if (TREE_CODE (up_sub) == INTEGER_CST
   && (ignore_off_by_one
!  ? !tree_int_cst_le (up_sub, up_bound_p1)
!  : !tree_int_cst_le (up_sub, up_bound)))
  {
if (dump_file && (dump_flags & TDF_DETAILS))
{
*** check_array_ref (location_t location, tr
*** 6593,6617 
  static void
  search_for_addr_array (tree t, location_t location)
  {
-   while (TREE_CODE (t) == SSA_NAME)
- {
-   gimple g = SSA_NAME_DEF_STMT (t);
- 
-   if (gimple_code (g) != GIMPLE_ASSIGN)
-   return;
- 
-   if (get_gimple_rhs_class (gimple_assign_rhs_code (g))
- != GIMPLE_SINGLE_RHS)
-   return;
- 
-   t = gimple_assign_rhs1 (g);
- }
- 
- 
-   /* We are only interested in addresses of ARRAY_REF's.  */
-   if (TREE_CODE (t) != ADDR_EXPR)
- return;
- 
/* Check each ARRAY_REFs in the reference chain. */
do
  {
--- 6601,6606 
*** check_array_bounds (tree *tp, int *walk_
*** 6701,6712 
if (TREE_CODE (t) == ARRAY_REF)
  check_array_ref (location, t, false /*ignore_off_by_one*/);
  
!   if (TREE_CODE (t) == MEM_REF
!   || (TREE_CODE (t) == RETURN_EXPR && TREE_OPERAND (t, 0)))
! search_for_addr_array (TREE_OPERAND (t, 0), location);
! 
!   if (TREE_CODE (t) == ADDR_EXPR)
! *walk_subtree = FALSE;
  
return NULL_TREE;
  }
--- 6690,6700 
if (TREE_CODE (t) == ARRAY_REF)
  check_array_ref (location, t, false /*ignore_off_by_one*/);
  
!   else if (TREE_CODE (t) == ADDR_EXPR)
! {
!   search_for_addr_array (t, location);
!   *walk_subtree = FALSE;
! }
  
return NULL_TREE;
  }
*** check_all_array_refs (void)
*** 6736,6764 
{
  gimple stmt = gsi_stmt (si);
  struct walk_stmt_info wi;
! if (!gimple_has_location (stmt))
continue;
  
! if (is_gimple_call (stmt))
!   {
! size_t i;
! size_t n = gimple_call_num_args (stmt);
! for (i = 0; i < n; i++)
!   {
!  

[PATCH] Fix PR65774

2015-04-16 Thread Richard Biener

The following avoids invoking bit-tracking on complex types which
have zero TYPE_PRECISION where wi::sext () invokes undefined behavior.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2015-04-16  Richard Biener  

PR tree-optimization/65774
* tree-ssa-ccp.c (evaluate_stmt): Constrain types we invoke
bit-value tracking on.

Index: gcc/tree-ssa-ccp.c
===
--- gcc/tree-ssa-ccp.c  (revision 222140)
+++ gcc/tree-ssa-ccp.c  (working copy)
@@ -1764,35 +1773,28 @@ evaluate_stmt (gimple stmt)
{
  enum tree_code subcode = gimple_assign_rhs_code (stmt);
  tree rhs1 = gimple_assign_rhs1 (stmt);
- switch (get_gimple_rhs_class (subcode))
-   {
-   case GIMPLE_SINGLE_RHS:
- if (INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
- || POINTER_TYPE_P (TREE_TYPE (rhs1)))
-   val = get_value_for_expr (rhs1, true);
- break;
-
-   case GIMPLE_UNARY_RHS:
- if ((INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
-  || POINTER_TYPE_P (TREE_TYPE (rhs1)))
- && (INTEGRAL_TYPE_P (gimple_expr_type (stmt))
- || POINTER_TYPE_P (gimple_expr_type (stmt
-   val = bit_value_unop (subcode, gimple_expr_type (stmt), rhs1);
- break;
-
-   case GIMPLE_BINARY_RHS:
- if (INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
- || POINTER_TYPE_P (TREE_TYPE (rhs1)))
-   {
- tree lhs = gimple_assign_lhs (stmt);
- tree rhs2 = gimple_assign_rhs2 (stmt);
- val = bit_value_binop (subcode,
-TREE_TYPE (lhs), rhs1, rhs2);
-   }
- break;
+ tree lhs = gimple_assign_lhs (stmt);
+ if ((INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+  || POINTER_TYPE_P (TREE_TYPE (lhs)))
+ && (INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
+ || POINTER_TYPE_P (TREE_TYPE (rhs1
+   switch (get_gimple_rhs_class (subcode))
+ {
+ case GIMPLE_SINGLE_RHS:
+   val = get_value_for_expr (rhs1, true);
+   break;
+
+ case GIMPLE_UNARY_RHS:
+   val = bit_value_unop (subcode, TREE_TYPE (lhs), rhs1);
+   break;
+
+ case GIMPLE_BINARY_RHS:
+   val = bit_value_binop (subcode, TREE_TYPE (lhs), rhs1,
+  gimple_assign_rhs2 (stmt));
+   break;
 
-   default:;
-   }
+ default:;
+ }
}
   else if (code == GIMPLE_COND)
{


LTO early debug

2015-04-16 Thread Richard Biener

The LTO early debug info prototype project has been completed.

The provided patch against the debug-early branch shows that the 
general idea is sound and works.  Now on to the details.


What works?
---

Simple C and C++ testcases (didn't test more), esp. now libstdc++
pretty-printers finally work!  For example

#include 
#include 
int main()
{
  std::string s = "Hello";
  std::cout << s << std::endl;
}

with GCC 5 and a -flto -g compile produces

(gdb) start
Temporary breakpoint 1, main () at t2.C:5
5 std::string s = "Hello";
(gdb) info locals
s = 
(gdb) ptype s
type = struct basic_string {

}

while the patched compiler produces

(gdb) start
Temporary breakpoint 1, main () at t2.C:5
5 std::string s = "Hello";
(gdb) info locals
s = ""
(gdb) n
6 std::cout << s << std::endl;
(gdb) info locals
s = "Hello"
(gdb) ptype s
type = std::string

exactly the same as with non-LTO operation.


What does not work?
---

Currently the simple "tooling" (driver support for linking in the
.debug_info sections produced early at compile-time into the final
executable) does not work for fat LTO objects (thus also when a linker
plugin is not available).  The tooling might also break automatic
LTO linker plugin loading as the slim LTO files are not marked slim.

You can run into new ICEs in dwarf2out.c (also for non-LTO operation).
The libstdc++ testsuite doesn't get very far with -flto -g (we
ICE building testsuite_abi.cc).

VLAs don't work (see below).

Somehow constructor invocations show duplicate parameters and no
locations (but DWARF looks sane).


Future work
---

Make the WPA stage side more efficient (do not use DIEs to store/retrieve
the tree <-> DIE symbol + offset info).

Fix the tooling to support fat LTO objects (and make slim objects slim
again).  The only reasonable way to do this is to emit the early
debug info into .gnu.lto_. prefixed sections (??? can/need we redirect
relocations into separate LTO sections as well?), then at WPA (or
unpartitioned LTO) time move all such sections from the LTO inputs
to a (single?) temporary file which serves as additional output the
linker can consume.  We'd need to re-name the sections back at that
time (does that work for a single file and the relocations?  Otherwise
we need unique section names or simply N files).  In theory this
should work with using simple-object and thus be not ELF specific(?).

Fix all the ICEs.

VLAs and stuff don't work (but gcc.dg/guality/vla-1.c also fails quite a
bit on trunk).  I've fixed it up somewhat but the issue is that the
gimplified type size decls are not getting ignored and that they
get location info.  The DWARF looks good but gdb appearantly doesn't
see it (yes, there seem to be some gdb issues as well).
-- Looks like we have to pull up a type chain up to the point that
can refer to the upper bound DIE for the artificial decl (also makes
it unnecessary to output this early).  Will be somewhat fun but
not too difficult.

We should re-think how we handle abstract origins for functions
we currently output for inlines.  The early debug can serve as
abstract origin for example.  Currently with LTO we get quite
easily confused by having two DIEs for the same decl (also LTO
streaming still drops abstract origins from early inlining, something
no longer necessary with early debug).

late dwarf generation should be refactored out of early dwarf generation.
late dwarf for global variables should be created from where we output
the variable (similar to functions), not from a loop in toplev.c

And of course the attached patch needs to be split up, a changelog
written and formally tested and submitted (after fixing all of the
above ;))


Now looking forward to get debug-early merged to trunk (I'll do
some LTO testing on the bare branch - I suspect some issues I ran
into are pre-existing on the branch).

Thanks,
Richard.diff --git a/gcc/dwarf2asm.c b/gcc/dwarf2asm.c
index b817aaf..78e594e 100644
--- a/gcc/dwarf2asm.c
+++ b/gcc/dwarf2asm.c
@@ -44,6 +44,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "hash-map.h"
 #include "ggc.h"
 #include "tm_p.h"
+#include "function.h"
+#include "emit-rtl.h"
 
 
 /* Output an unaligned integer with the given value and size.  Prefer not
@@ -216,6 +218,34 @@ dw2_asm_output_offset (int size, const char *label,
   va_end (ap);
 }
 
+void
+dw2_asm_output_offset_offset (int size, const char *label, HOST_WIDE_INT 
offset,
+ section *base ATTRIBUTE_UNUSED,
+ const char *comment, ...)
+{
+  va_list ap;
+
+  va_start (ap, comment);
+
+#ifdef ASM_OUTPUT_DWARF_OFFSET
+  FIXME
+  ASM_OUTPUT_DWARF_OFFSET (asm_out_file, size, label, base);
+#else
+  dw2_assemble_integer (size, gen_rtx_PLUS (Pmode,
+   gen_rtx_SYMBOL_REF (Pmode, label),
+   gen_int_mode (offset, Pmode)));
+#endif
+
+  if (flag_debug_asm && comment)
+ 

Re: [C++ Patch] PR 65340

2015-04-16 Thread Paolo Carlini

Hi again,

On 04/15/2015 03:31 AM, Jason Merrill wrote:
There are various places in the compiler that error and continue if 
tf_error is set, but return error_mark_node immediately if not; it 
seems reasonable to follow that pattern in the places that don't 
currently check the return value.

Thus I tested the below. Ok for trunk?

Thanks,
Paolo.

///
2015-04-16  Paolo Carlini  

* call.c (build_op_delete_call, build_over_call): Check mark_used
return value.
* class.c (resolve_address_of_overloaded_function): Likewise.
* decl.c (cxx_maybe_build_cleanup): Likewise.
* pt.c (gen_elem_of_pack_expansion_instantiation, tsubst_baselink,
tsubst_qualified_id, tsubst_copy, tsubst_copy_and_build): Likewise.
* rtti.c (build_dynamic_cast_1): Likewise.
* semantics.c (process_outer_var_ref): Likewise.
* typeck.c (build_class_member_access_expr,
cp_build_function_call_vec, cp_build_addr_expr_1): Likewise.
Index: call.c
===
--- call.c  (revision 222143)
+++ call.c  (working copy)
@@ -5968,7 +5968,8 @@ build_op_delete_call (enum tree_code code, tree ad
  argarray[0] = addr;
  for (i = 1; i < nargs; i++)
argarray[i] = CALL_EXPR_ARG (placement, i);
- mark_used (fn);
+ if (!mark_used (fn, complain) && !(complain & tf_error))
+   return error_mark_node;
  return build_cxx_call (fn, nargs, argarray, complain);
}
   else
@@ -7400,7 +7401,8 @@ build_over_call (struct z_candidate *cand, int fla
 the implementation elided its use.  */
   if (!trivial || DECL_DELETED_FN (fn))
{
- mark_used (fn);
+ if (!mark_used (fn, complain) && !(complain & tf_error))
+   return error_mark_node;
  already_used = true;
}
 
Index: class.c
===
--- class.c (revision 222143)
+++ class.c (working copy)
@@ -7755,8 +7755,8 @@ resolve_address_of_overloaded_function (tree targe
   /* Make =delete work with SFINAE.  */
   if (DECL_DELETED_FN (fn) && !(flags & tf_error))
return error_mark_node;
-  
-  mark_used (fn);
+  if (!mark_used (fn, flags) && !(flags & tf_error))
+   return error_mark_node;
 }
 
   /* We could not check access to member functions when this
Index: decl.c
===
--- decl.c  (revision 222143)
+++ decl.c  (working copy)
@@ -14587,7 +14587,8 @@ cxx_maybe_build_cleanup (tree decl, tsubst_flags_t
 ordinary FUNCTION_DECL.  */
   fn = lookup_name (id);
   arg = build_address (decl);
-  mark_used (decl);
+  if (!mark_used (decl, complain) && !(complain & tf_error))
+   return error_mark_node;
   cleanup = cp_build_function_call_nary (fn, complain, arg, NULL_TREE);
   if (cleanup == error_mark_node)
return error_mark_node;
@@ -14627,10 +14628,11 @@ cxx_maybe_build_cleanup (tree decl, tsubst_flags_t
 SET_EXPR_LOCATION (cleanup, UNKNOWN_LOCATION);
 
   if (cleanup
-  && !lookup_attribute ("warn_unused", TYPE_ATTRIBUTES (TREE_TYPE (decl
-/* Treat objects with destructors as used; the destructor may do
-   something substantive.  */
-mark_used (decl);
+  && !lookup_attribute ("warn_unused", TYPE_ATTRIBUTES (TREE_TYPE (decl)))
+  /* Treat objects with destructors as used; the destructor may do
+something substantive.  */
+  && !mark_used (decl, complain) && !(complain & tf_error))
+return error_mark_node;
 
   return cleanup;
 }
Index: pt.c
===
--- pt.c(revision 222143)
+++ pt.c(working copy)
@@ -9833,7 +9833,8 @@ gen_elem_of_pack_expansion_instantiation (tree pat
  if (index == 0)
{
  aps = make_argument_pack_select (arg_pack, index);
- mark_used (parm);
+ if (!mark_used (parm, complain) && !(complain & tf_error))
+   return error_mark_node;
  register_local_specialization (aps, parm);
}
  else
@@ -12603,8 +12604,9 @@ tsubst_baselink (tree baselink, tree object_type,
point.)  */
 if (BASELINK_P (baselink))
   fns = BASELINK_FUNCTIONS (baselink);
-if (!template_id_p && !really_overloaded_fn (fns))
-  mark_used (OVL_CURRENT (fns));
+if (!template_id_p && !really_overloaded_fn (fns)
+   && !mark_used (OVL_CURRENT (fns), complain) && !(complain & tf_error))
+  return error_mark_node;
 
 /* Add back the template arguments, if present.  */
 if (BASELINK_P (baselink) && template_id_p)
@@ -12719,7 +12721,8 @@ tsubst_qualified_id (tree qualified_id, tree args,
   check_accessibility_of_qualified_id (expr, /*object_type=*/NULL_TREE,
   sco

[Obvious][AArch64 Testsuite] Fix comments in vldN_lane_1.c

2015-04-16 Thread Alan Lawrence
The comments in vldN_lane_1.c say it is testing vld{1,2,3}{,q}_dup. This is 
wrong, it is testing vld{1,2,3}{,q}_lane, as per test filename; I've pushed the 
attached as  r222148.


gcc/testsuite/ChangeLog:

gcc.target/aarch64/vldN_lane_1.c: Correct dup->lane in comments.
diff --git a/gcc/testsuite/gcc.target/aarch64/vldN_lane_1.c b/gcc/testsuite/gcc.target/aarch64/vldN_lane_1.c
index e450b7b2b961db56acf5ef5b88e0dc185e81e754..13ab45459f4a1f55c60ed0337e0ef71e24918b01 100644
--- a/gcc/testsuite/gcc.target/aarch64/vldN_lane_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/vldN_lane_1.c
@@ -54,11 +54,11 @@ test_vld##STRUCT##Q##_lane##SUFFIX (const BASE##_t *data,		\
 }
 
 
-/* Tests of vld2_dup and vld2q_dup.  */
+/* Tests of vld2_lane and vld2q_lane.  */
 VARIANTS (TESTMETH, 2)
-/* Tests of vld3_dup and vld3q_dup.  */
+/* Tests of vld3_lane and vld3q_lane.  */
 VARIANTS (TESTMETH, 3)
-/* Tests of vld4_dup and vld4q_dup.  */
+/* Tests of vld4_lane and vld4q_lane.  */
 VARIANTS (TESTMETH, 4)
 
 #define CHECK(BASE, Q, ELTS, SUFFIX, LANE, STRUCT)			\


Re: [PATCH] Save an instruction with -mprofile-kernel style profiling

2015-04-16 Thread David Edelsohn
On Wed, Apr 15, 2015 at 9:57 PM, Anton Blanchard  wrote:
> Save an instruction with -mprofile-kernel style profiling and require
> the mcount handler to save the LR to the stack.
>
> -mprofile-kernel is an undocumented gcc option that was introduced
> specifically for the Linux kernel, so it should be safe to make this
> change. When we do add support for this option in the kernel we plan
> to support both the old and new behaviour.
>
> gcc/
> 2015-04-16  Anton Blanchard  
>
> * config/rs6000/rs6000.c (rs6000_output_function_prologue):
> No need for -mprofile-kernel to save LR to stack.

The PPC kernel maintainers own the interface and ABI, so you can
ensure the caller and callee are compatible.

The patch is fine with me.

Thanks, David


Re: [PATCH] libstdc++: Fix exceptions being generated when compiling with -fno-exceptions

2015-04-16 Thread Federico Lenarduzzi
Jonathan,

I don't have commit access though, so if you could commit this for me
it'd be great.

Regards.

2015-04-15 11:55 GMT-03:00 Jonathan Wakely :
> On 15/04/15 11:15 -0300, Federico Lenarduzzi wrote:
>>
>> The idea of this patch is that using this function not depends from a
>> configuration
>> flag but depends of a flag which participates in the selection of the
>> correct multilib.
>
>
> OK, then I agree it's better to make it conditional on
> __cpp_exceptions which can have different values for your different
> multilibs, whereas _GLIBCXX_VERBOSE would be the same in all of them.
>
> I have no objections to the change.



-- 
Federico Lenarduzzi
Software Engineer

Taller Technologies Argentina
San Lorenzo 47, 3rd Floor, Office 5
Cordoba, Argentina
Phone: +54 351 4220701
Mobile: +54 358 4289083
Skype: federico.lenarduzzi1


Re: [PATCH] libstdc++: Fix exceptions being generated when compiling with -fno-exceptions

2015-04-16 Thread Jonathan Wakely

On 16/04/15 11:20 -0300, Federico Lenarduzzi wrote:

Jonathan,

I don't have commit access though, so if you could commit this for me
it'd be great.


Yes, but I'm waiting until I do the uncaught_exceptions() commit.



Re: [C++ Patch] PR 65340

2015-04-16 Thread Jason Merrill

OK, thanks.

Jason


Re: [PATCH] Do not discard the constructors of empty structs [PR c++/64527]

2015-04-16 Thread Jason Merrill

On 04/15/2015 09:00 PM, Patrick Palka wrote:

-   if (!cleared || num_nonzero_elements > 0)


How about adding || TREE_SIDE_EFFECTS (TREE_OPERAND (*expr_p), 1) to 
this test rather than removing it entirely?


Jason



Re: [Patch][testsuite,i386]: Remove include of values.h from avx512

2015-04-16 Thread Kirill Yukhin
Hello,

Thanks!

On 15 Apr 12:30, Ryan Mansfield wrote:
> OK for trunk and gcc-5-branch?
The patch is OK for trunk and gcc-5-branch (when it is open).

--
K


Re: [libstdc++ PATCH] Add support for std::uncaught_exceptions

2015-04-16 Thread Jonathan Wakely

On 15/04/15 14:10 -0300, Daniel Gutson wrote:

You might want to check this thread
https://gcc.gnu.org/ml/libstdc++/2015-04/msg00104.html
which is specially useful when creating multilibs with and without exceptions
and enabling linker's sections garbage collection.
Maybe you could commit the change for both uncaught_exception and
uncaught_exceptions?


I'll be doing both commits.


Re: [PATCH] Do not discard the constructors of empty structs [PR c++/64527]

2015-04-16 Thread Patrick Palka

On Thu, 16 Apr 2015, Jason Merrill wrote:


On 04/15/2015 09:00 PM, Patrick Palka wrote:

-   if (!cleared || num_nonzero_elements > 0)


How about adding || TREE_SIDE_EFFECTS (TREE_OPERAND (*expr_p), 1) to this 
test rather than removing it entirely?


That works too.  Does the following patch look OK if testing succeeds?

gcc/
PR c++/64527
* gcc/gimplify.c (gimplify_init_constructor): Always emit a
side-effecting constructor.

gcc/testsuite/
* g++.dg/init/pr64527.C: New test.
---
 gcc/gimplify.c  | 11 ---
 gcc/testsuite/g++.dg/init/pr64527.C | 21 +
 2 files changed, 29 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/init/pr64527.C

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index ff0a225..29d3f4d 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -3994,6 +3994,9 @@ gimplify_init_constructor (tree *expr_p, gimple_seq 
*pre_p, gimple_seq *post_p,
pre_p, post_p, &preeval_data);
  }

+   bool ctor_has_side_effects_p
+ = TREE_SIDE_EFFECTS (TREE_OPERAND (*expr_p, 1));
+
if (cleared)
  {
/* Zap the CONSTRUCTOR element list, which simplifies this case.
@@ -4006,9 +4009,11 @@ gimplify_init_constructor (tree *expr_p, gimple_seq 
*pre_p, gimple_seq *post_p,
  }

/* If we have not block cleared the object, or if there are nonzero
-  elements in the constructor, add assignments to the individual
-  scalar fields of the object.  */
-   if (!cleared || num_nonzero_elements > 0)
+  elements in the constructor, or if the constructor has side effects,
+  add assignments to the individual scalar fields of the object.  */
+   if (!cleared
+   || num_nonzero_elements > 0
+   || ctor_has_side_effects_p)
  gimplify_init_ctor_eval (object, elts, pre_p, cleared);

*expr_p = NULL_TREE;
diff --git a/gcc/testsuite/g++.dg/init/pr64527.C 
b/gcc/testsuite/g++.dg/init/pr64527.C
new file mode 100644
index 000..36b8214
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/pr64527.C
@@ -0,0 +1,21 @@
+// { dg-do run { target c++11 } }
+
+static int g;
+
+struct A {
+  A() { g = 1; }
+};
+
+struct accessor {
+  A a;
+  int x;
+};
+
+int
+main (void)
+{
+  (void) accessor{};
+
+  if (g != 1)
+__builtin_abort ();
+}
--
2.4.0.rc2


[RFC stage 1] Proposed new warning: -Wmisleading-indentation

2015-04-16 Thread David Malcolm
Attached is a work-in-progress patch for a new
  -Wmisleading-indentation
warning I've been experimenting with, for GCC 6.

It kind-of-works, but there are some issues, so I wanted to get feedback
on it here.

The idea is to issue a warning when the C/C++ compiler's view of the
block structure doesn't match that of a naive human reader of the code
via indentation.

For example, CVE-2014-1266 (aka "goto fail") had:

  hashOut.data = hashes + SSL_MD5_DIGEST_LEN;
  hashOut.length = SSL_SHA1_DIGEST_LEN;
  if ((err = SSLFreeBuffer(&hashCtx)) != 0)
  goto fail;
  if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0)
  goto fail;
  if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0)
  goto fail;
  if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
  goto fail;
  if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
  goto fail;
  goto fail;  /* MISTAKE! THIS LINE SHOULD NOT BE HERE */
  if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
  goto fail;

  err = sslRawVerify(...);

where the second "goto fail;" here:

  if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
  goto fail;
  goto fail;  /* MISTAKE! THIS LINE SHOULD NOT BE HERE */

is indented as if it's guarded by the conditional.  It isn't, making the
real meaning of the code hard to discern, and masking a security
vulnerability (where the stray goto led to the  certificate-checking
code later in the function being dead code, skipping it with err == 0,
for success).

The patch successfully issues a warning on a simplified version of this:

if ((err = foo (b)) != 0)
goto fail;
goto fail;

./Wmisleading-indentation-6.c: In function ‘foo’:
./Wmisleading-indentation-6.c:15:3: warning: statement is indented as if
it were guarded by... [-Wmisleading-indentation]
   goto fail;
   ^
./Wmisleading-indentation-6.c:13:2: note: ...this clause, but it is not
  if ((err = foo (b)) != 0)
  ^

and on various other testcases.

That said, there are several major issues with the patch as it stands:

(A) works with the C FE, but not with C++ yet.  The implementation
assumes a single lexer consuming the token stream, so it doesn't yet
work with the C++ FE's ideas of tentatively-consumed tokens
(save/commit/rollback), and with its lexer stack.  I *think* that can be
fixed by relying on the fact that the integer locations of the token
stream are normally monotonic, and using that to capture and handle
those two C++ FE impl details.

(B) GTY/PCH/etc, and performance;  I know that it leaks visual_block
instances, and there's probably a way to do this with much less dynamic
memory allocation, but I wanted to get it right before making it fast.

(C) no ChangeLog yet, though it's heavily commented.

(D) tabs vs spaces.  This is probably the biggest can of worms.

Consider:

{
  if (flagA)
if (flagB)
  if (flagC)
foo ();
bar (); /* would like to warn about this */
}

If this is indented using spaces&tabs with 8-space tabs, we have:

{
<2 spaces>if (flagA)
<4 spaces>if (flagB)
<6 spaces>if (flagC)
<1 tab>foo ();
<1 tab>bar ();
}

What does this look like in an editor?  Consider emacs, which uses uses
0-based offsets to what I call "visual columns".  The column numbers for
the start of the first token for each line are reported by emacs as:

<0>{
<2>if (flagA)
<4>if (flagB)
<6>if (flagC)
<8>foo ();
<8>bar ();
<0>}

However within libcpp and gcc, in linemap's expanded_location and in
diagnostic messages, the "column" numbers are actually 1-based counts of
*characters*, so the "column" numbers emitted in diagnostics for the
start of the first token in each line are actually:

<1>{
<3>if (flagA)
<5>if (flagB)
<7>if (flagC)
<2>foo ();
<2>bar ();
<1>}

Note the transition at tab offsets, where we go from increasing visual
column numbers:

<6>if (flagC)
<8>foo ();

to a decrease in the character count:

<7>if (flagC)
<2>foo ();

Hence to handle mixed spaces+tabs, we need some way to model "visual
columns".  I think this is doable (at the cost of adding a field to
cpp_token, and to c_token/cp_token), provided we can assume, per-buffer,
either
  (i) a consistent value for tabs in terms of spaces, or
  (ii) that we don't have mixed tabs/spaces in this buffer

An issue here is how to determine (i), or if it's OK to default to 8 and
have a command-line option (param?) to override it? (though what about,
say, each header file?)

Thoughts on this, and on the patch?

Thanks

Dave

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 30a717e..f71a33d 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1462,6 +1462,7 @@ OBJS = \
 	var-tracking.o \
 	varasm.o \
 	varpool.o \
+	visual-parser.o \
 	vmsdbgout.o \
 	vtable-verify.o \
 	web.o \
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 7d0a2cd..3795087 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -494,6 +494,10 @@ Wmain
 LangEnabledBy(C ObjC C++ ObjC++,Wpedantic, 2, 0)
 ;
 
+Wmislead

[C++ Patch] Rename tsubst_flags_t function parameter

2015-04-16 Thread Paolo Carlini

Hi,

while working on the last patch I noticed that in a couple of functions 
in class.c we name the tsubst_flags_t parameter flags instead of 
complain. Besides the inconsistency, I find that choice especially 
misleading, because normally in the front end a parameter of name flags 
is just an int. Objections to the below?


Thanks,
Paolo.

/
2015-04-16  Paolo Carlini  

* class.c (resolve_address_of_overloaded_function, instantiate_type):
Rename tsubst_flags_t parameter flags -> complain.
Index: class.c
===
--- class.c (revision 222150)
+++ class.c (working copy)
@@ -7471,7 +7471,7 @@ pop_lang_context (void)
 static tree
 resolve_address_of_overloaded_function (tree target_type,
tree overload,
-   tsubst_flags_t flags,
+   tsubst_flags_t complain,
bool template_only,
tree explicit_targs,
tree access_path)
@@ -7531,7 +7531,7 @@ resolve_address_of_overloaded_function (tree targe
 target_type = build_reference_type (target_type);
   else
 {
-  if (flags & tf_error)
+  if (complain & tf_error)
error ("cannot resolve overloaded function %qD based on"
   " conversion to type %qT",
   DECL_NAME (OVL_FUNCTION (overload)), target_type);
@@ -7667,7 +7667,7 @@ resolve_address_of_overloaded_function (tree targe
   if (matches == NULL_TREE)
 {
   /* There were *no* matches.  */
-  if (flags & tf_error)
+  if (complain & tf_error)
{
  error ("no matches converting function %qD to type %q#T",
 DECL_NAME (OVL_CURRENT (overload)),
@@ -7695,7 +7695,7 @@ resolve_address_of_overloaded_function (tree targe
 
   if (match)
{
- if (flags & tf_error)
+ if (complain & tf_error)
{
  error ("converting overloaded function %qD to type %q#T is 
ambiguous",
 DECL_NAME (OVL_FUNCTION (overload)),
@@ -7717,11 +7717,11 @@ resolve_address_of_overloaded_function (tree targe
   fn = TREE_PURPOSE (matches);
 
   if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fn)
-  && !(flags & tf_ptrmem_ok) && !flag_ms_extensions)
+  && !(complain & tf_ptrmem_ok) && !flag_ms_extensions)
 {
   static int explained;
 
-  if (!(flags & tf_error))
+  if (!(complain & tf_error))
return error_mark_node;
 
   permerror (input_location, "assuming pointer to member %qD", fn);
@@ -7742,7 +7742,7 @@ resolve_address_of_overloaded_function (tree targe
   if (fn == NULL)
return error_mark_node;
   /* Mark all the versions corresponding to the dispatcher as used.  */
-  if (!(flags & tf_conv))
+  if (!(complain & tf_conv))
mark_versions_used (fn);
 }
 
@@ -7750,12 +7750,12 @@ resolve_address_of_overloaded_function (tree targe
  determining conversion sequences, we should not consider the
  function used.  If this conversion sequence is selected, the
  function will be marked as used at this point.  */
-  if (!(flags & tf_conv))
+  if (!(complain & tf_conv))
 {
   /* Make =delete work with SFINAE.  */
-  if (DECL_DELETED_FN (fn) && !(flags & tf_error))
+  if (DECL_DELETED_FN (fn) && !(complain & tf_error))
return error_mark_node;
-  if (!mark_used (fn, flags) && !(flags & tf_error))
+  if (!mark_used (fn, complain) && !(complain & tf_error))
return error_mark_node;
 }
 
@@ -7765,11 +7765,11 @@ resolve_address_of_overloaded_function (tree targe
   if (DECL_FUNCTION_MEMBER_P (fn))
 {
   gcc_assert (access_path);
-  perform_or_defer_access_check (access_path, fn, fn, flags);
+  perform_or_defer_access_check (access_path, fn, fn, complain);
 }
 
   if (TYPE_PTRFN_P (target_type) || TYPE_PTRMEMFUNC_P (target_type))
-return cp_build_addr_expr (fn, flags);
+return cp_build_addr_expr (fn, complain);
   else
 {
   /* The target must be a REFERENCE_TYPE.  Above, cp_build_unary_op
@@ -7783,7 +7783,7 @@ resolve_address_of_overloaded_function (tree targe
 
 /* This function will instantiate the type of the expression given in
RHS to match the type of LHSTYPE.  If errors exist, then return
-   error_mark_node. FLAGS is a bit mask.  If TF_ERROR is set, then
+   error_mark_node. COMPLAIN is a bit mask.  If TF_ERROR is set, then
we complain on errors.  If we are not complaining, never modify rhs,
as overload resolution wants to try many possible instantiations, in
the hope that at least one will work.
@@ -7792,16 +7792,16 @@ resolve_address_of_overloaded_function (tree targe
function, or a pointer to member function.  */
 
 tree
-instantiate_type (tree lhstype, tree rhs, tsubst_flags_t flags)
+instantiate_type (tree lh

[committed, PATCH]: Replace __buitlin_alloca with __builtin_alloca

2015-04-16 Thread H.J. Lu
I checked this patch into trunk and will backport it to GCC 5 branch.


H.J.
---
Index: ChangeLog
===
--- ChangeLog   (revision 222150)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2015-04-16  H.J. Lu  
+
+   * gcc.target/i386/mpx/alloca-1-lbv.c (mpx_test): Replace
+   __buitlin_alloca with __builtin_alloca.
+
 2015-04-16  Alan Lawrence  
 
* gcc.target/aarch64/vldN_lane_1.c: Correct dup->lane in comments.
Index: gcc.target/i386/mpx/alloca-1-lbv.c
===
--- gcc.target/i386/mpx/alloca-1-lbv.c  (revision 222150)
+++ gcc.target/i386/mpx/alloca-1-lbv.c  (working copy)
@@ -16,7 +16,7 @@ int rd (int *p, int i)
 
 int mpx_test (int argc, const char **argv)
 {
-  int *buf = (int *)__buitlin_alloca (100 * sizeof(int));
+  int *buf = (int *)__builtin_alloca (100 * sizeof(int));
 
   rd (buf, -1);
 


Re: [committed, PATCH]: Replace __buitlin_alloca with __builtin_alloca

2015-04-16 Thread Jakub Jelinek
On Thu, Apr 16, 2015 at 08:35:27AM -0700, H.J. Lu wrote:
> I checked this patch into trunk and will backport it to GCC 5 branch.

Ok for GCC 5 branch now.

> --- ChangeLog (revision 222150)
> +++ ChangeLog (working copy)
> @@ -1,3 +1,8 @@
> +2015-04-16  H.J. Lu  
> +
> + * gcc.target/i386/mpx/alloca-1-lbv.c (mpx_test): Replace
> + __buitlin_alloca with __builtin_alloca.
> +
>  2015-04-16  Alan Lawrence  
>  
>   * gcc.target/aarch64/vldN_lane_1.c: Correct dup->lane in comments.
> Index: gcc.target/i386/mpx/alloca-1-lbv.c
> ===
> --- gcc.target/i386/mpx/alloca-1-lbv.c(revision 222150)
> +++ gcc.target/i386/mpx/alloca-1-lbv.c(working copy)
> @@ -16,7 +16,7 @@ int rd (int *p, int i)
>  
>  int mpx_test (int argc, const char **argv)
>  {
> -  int *buf = (int *)__buitlin_alloca (100 * sizeof(int));
> +  int *buf = (int *)__builtin_alloca (100 * sizeof(int));
>  
>rd (buf, -1);
>  

Jakub


Re: PATCH] PR target/65612: Multiversioning doesn't work with DSO nor PIE

2015-04-16 Thread Mike Stump
On Apr 14, 2015, at 8:07 AM, H.J. Lu  wrote:
>> I can confirm that the most current patch bootstraps on
>> x86_64-apple-darwin14 and that all of the new tests show up as
>> unsupported in the test suite.
>>  Jack
> 
> I am re-posting this patch.  OK for trunk?

If Jack is happy, I’m happy.  :-)  That leaves the x86 people to comment on it.


Re: [patch,avr] Fix PR 65657 - read from __memx address space tramples arguments to function call

2015-04-16 Thread Georg-Johann Lay

Am 04/16/2015 um 11:28 AM schrieb Senthil Kumar Selvaraj:

On Thu, Apr 16, 2015 at 11:02:05AM +0200, Georg-Johann Lay wrote:

Am 04/16/2015 um 08:43 AM schrieb Senthil Kumar Selvaraj:

This patch fixes PR 65657.


The following artifact appears to be PR63633.



I did see that one - unfortunately, that fix won't help here. IIUC, you
check if input/output operand hard regs are in the clobber list,
and if yes, you generate pseudos to save and restore clobbered hard
regs.

In this case, the reg is actually clobbered by a different insn (one


Arrgh, yes...


that loads the next argument to the function). So unless I blindly generate 
pseudos for
all regs in the clobber list, the clobbering will still happen.

FWIW, I did try saving/restoring all clobbered regs, and it did fix the
problem - just that it appeared like a (worse) hack to me. Aren't we
manually replicating what RA/reload should be doing?


As it appears, we'll have to do it by hand.  The attaches patch is just a 
sketch that indicates how the problem could be approached.  Notice the new 
assertions in the split expanders; they will throw ICE until the fix is 
actually installed.


The critical insn are generated in movMM expander and are changed to have no 
clobbers (SCRATCHes actually).  An a later pass, when appropriate life info can 
be made available, run yet another avr pass that


1a) Save-restore needed hard regs around the insn.

2a) Kick out hard regs overlapping the clobbers, e.g. in set_src, into new 
pseudos.  Maybe that could happen due to some hard regs progagation, or we can 
use a new predicate similar combine_pseudo_register_operand.


3) Replace scratch -> hard regs for all scratch_operands.

2b) Restore respective hard regs from their pseudos.

1b) Restore respective hard regs from their pseudos.


And maybe we can get better code by allocating things like address register by 
hand and get better code then.


When I implemented some of the libgcc insns I tried to express the operand by 
means of constraints, e.h. for (reg:HI 22) and let register allocator do the job.


The resulting code was functional but *horrific*.

The register allocator is not yet ready to generate efficient code in such 
demanding situations...




What do you think?



IMO sooner or later we'll need such an infrastructure; maybe also for non-mov 
insn that are implemented by transparent libcalls like divmod, mul, etc.



When cfgexpand.c expands a function call, it first figures out the
registers in which the arguments will go, followed by expansion of the
arguments themselves (right to left). It later emits mov insns to set
the precomputed registers with the expanded RTXes.

If one of the arguments is a __memx char pointer dereference, the mov
eventually expands to gen_xload_A (for certain cases), which
clobbers R22, R21 and Z.  This causes problems if one of those
clobbered registers was precomputed to hold another argument.

In general, call expansion does not appear to take clobbers into account -


We had been warned that using hard regs is evil...  But without that technique 
the code quality would decrease way too much.



when it checks for argument overlap, the RTX (args[i].value) is only a MEM
in QImode for the memx deref - the clobber shows up when it eventually
calls emit_move_insn, at which point, it is too late.


Such situations could only be handled by a target hook which allowed to expand 
specific trees by hand...  Such a hook could cater for insn that must use hard 
registers.



This does not happen for a int pointer dereference - turns out that
precompute_register_parameters does a copy_to_mode_reg if the
cost of args[i].value is more than COSTS_N_INSNS(1) i.e., it creates a
pseudo and later assigns the pseudo to the arg register. This is done
before any moves to arg registers is done, so other arguments are not
overwritten.

Doing the same thing - providing a better cost estimate for a MEM rtx in
the non-default address space, makes this problem go away, and that is
what this patch does. Regression testing does not show any new failures.


Can you tell something about overall code quality? If it is not significantly 
worse then I'd propose to apply your rtx-costs solution soon.  The full fix 
will take more time to work it out.



Johann



Re: [patch,avr] Fix PR 65657 - read from __memx address space tramples arguments to function call

2015-04-16 Thread Georg-Johann Lay

...and the sketch against 4.9
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 221321)
+++ config/avr/avr.c	(working copy)
@@ -287,6 +287,94 @@ avr_to_int_mode (rtx x)
 }
 
 
+static void
+avr_rest_of_handle_expand_xload (void)
+{
+  basic_block bb;
+
+  FOR_EACH_BB_FN (bb, cfun)
+{
+  rtx insn, next;
+
+  FOR_BB_INSNS_SAFE (bb, insn, next)
+{
+  enum attr_fixregs fixregs;
+
+  if (-1 == recog_memoized (insn)
+  || FIXREGS_NO == (fixregs = get_attr_fixregs (insn)))
+{
+  continue;
+}
+
+  avr_edump ("%?: Must fix insn:\n  %r\n\n", insn);
+  if (dump_file)
+avr_fdump (dump_file, ";; Fixing insn :\n  %r\n\n", insn);
+
+  extract_insn (insn);
+
+  switch (fixregs)
+{
+default:
+  gcc_unreachable();
+
+case FIXREGS_XLOADQI_A:
+  if (dump_file)
+avr_fdump (dump_file, "$2 = %r\n\n", recog_data.operand[2]);
+  break;
+
+case FIXREGS_XLOAD_A:
+  if (dump_file)
+{
+  avr_fdump (dump_file, "$2 = %r\n", recog_data.operand[2]);
+  avr_fdump (dump_file, "$3 = %r\n", recog_data.operand[3]);
+  avr_fdump (dump_file, "$4 = %r\n\n", recog_data.operand[4]);
+}
+  break;
+}
+} // insn
+}
+}
+
+
+static const pass_data avr_pass_data_expand_xload =
+{
+  RTL_PASS,   // type
+  "", // name (will be patched)
+  OPTGROUP_NONE,  // optinfo_flags
+  false,  // has_gate
+  true,   // has_execute
+  TV_NONE,// tv_id
+  0,  // properties_required
+  0,  // properties_provided
+  0,  // properties_destroyed
+  0,  // todo_flags_start
+  // todo_flags_finish
+  TODO_df_finish | TODO_verify_rtl_sharing | TODO_verify_flow
+};
+
+
+class avr_pass_expand_xload : public rtl_opt_pass
+{
+public:
+  avr_pass_expand_xload (gcc::context *ctxt, const char *name)
+: rtl_opt_pass (avr_pass_data_expand_xload, ctxt)
+  {
+this->name = name;
+  }
+
+  unsigned int execute (void)
+  {
+df_lr_add_problem ();
+df_live_add_problem ();
+df_analyze ();
+
+avr_rest_of_handle_expand_xload();
+
+return 0;
+  }
+}; // avr_pass_recompute_notes
+
+
 static const pass_data avr_pass_data_recompute_notes =
 {
   RTL_PASS,   // type
@@ -326,6 +414,11 @@ public:
 static void
 avr_register_passes (void)
 {
+  /* This avr-specific fixed PR63633 for the case of mov insns.  */
+
+  register_pass (new avr_pass_expand_xload (g, "avr-expand-xload"),
+ PASS_POS_INSERT_BEFORE, "split1", 1);
+
   /* This avr-specific pass (re)computes insn notes, in particular REG_DEAD
  notes which are used by `avr.c::reg_unused_after' and branch offset
  computations.  These notes must be correct, i.e. there must be no
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 221321)
+++ config/avr/avr.md	(working copy)
@@ -165,6 +165,13 @@ (define_attr "isa"
standard"
   (const_string "standard"))
 
+
+(define_attr "fixregs"
+  "xload_A, xloadQI_A,
+   no"
+  (const_string "no"))
+
+
 (define_attr "enabled" ""
   (cond [(eq_attr "isa" "standard")
  (const_int 1)
@@ -494,9 +501,9 @@ (define_insn "load__libgcc"
 ;; "xload8qi_A"
 ;; "xload8qq_A" "xload8uqq_A"
 (define_insn_and_split "xload8_A"
-  [(set (match_operand:ALL1 0 "register_operand" "=r")
-(match_operand:ALL1 1 "memory_operand""m"))
-   (clobber (reg:HI REG_Z))]
+  [(set (match_operand:ALL1 0 "register_operand"  "=r")
+(match_operand:ALL1 1 "memory_operand" "m"))
+   (clobber (match_operand:HI 2 "scratch_operand" "=z"))] ;; HI 30
   "can_create_pseudo_p()
&& !avr_xload_libgcc_p (mode)
&& avr_mem_memx_p (operands[1])
@@ -505,6 +512,8 @@ (define_insn_and_split "xload8_A"
   "&& 1"
   [(clobber (const_int 0))]
   {
+gcc_assert (SCRATCH != GET_CODE (operands[2]));
+
 /* ; Split away the high part of the address.  GCC's register allocator
; in not able to allocate segment registers and reload the resulting
; expressions.  Notice that no address register can hold a PSImode.  */
@@ -520,7 +529,9 @@ (define_insn_and_split "xload8_A"
 set_mem_addr_space (SET_SRC (single_set (insn)),
  MEM_ADDR_SPACE (operands[1]));
 DONE;
-  })
+  }
+  [(set_attr "fixregs" "xloadQI_A")])
+
 
 ;; "xloadqi_A" "xloadqq_A" "xloaduqq_A"
 ;; "xloadhi_A" "xloadhq_A" "xloaduhq_A" "xloadha_A" "xloaduha_A"
@@ -530,9 +541,9 @@ (define_insn_and_split "xload8_A"
 (define_insn_and_split "xload_A"
   [(set (match_operand:MOVMODE 0 "register_operand" "=r")
 (match_operand:MOVMODE 1 "memory_operand""m"))
-   (clobber 

RE: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro

2015-04-16 Thread Petar Jovanovic
-Original Message-
From: Moore, Catherine [mailto:catherine_mo...@mentor.com] 
Sent: Friday, February 13, 2015 2:37 AM
To: Petar Jovanovic; 'Matthew Fortune'; gcc-patches@gcc.gnu.org; 'Maciej W.
Rozycki'
Cc: Moore, Catherine
Subject: RE: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro

> Hi Petar,
> 
> There isn't any need to execute a large testcase.  Instead, try adding a
short version of your test to the directory gcc/testsuite/gcc.target/mips.
> There are examples of other tests there they check for specific assembler
sequences by using scan-assembler.   See umips-swp-1.c (and others) for a
specific example of how to do this.
> Let me know if you need additional help.
> Thanks,
> Catherine

Hi Catherine,

Sorry for late response, I have missed to follow up on this.
IIUC, scan-assembler will scan assembly file generated from a test file.
This particular change will - on the other hand - modify crtend.o and thus
init section. Is there a 'scan-assembler' functionality over a final linked
executable, as that would actually test the change?

Let me know. Thanks.

Regards,
Petar



Re: [patch,avr] Fix PR 65657 - read from __memx address space tramples arguments to function call

2015-04-16 Thread Denis Chertykov
2015-04-16 19:47 GMT+03:00 Georg-Johann Lay :
>
> Am 04/16/2015 um 11:28 AM schrieb Senthil Kumar Selvaraj:
>>
>> On Thu, Apr 16, 2015 at 11:02:05AM +0200, Georg-Johann Lay wrote:
>>>
>>> Am 04/16/2015 um 08:43 AM schrieb Senthil Kumar Selvaraj:

 This patch fixes PR 65657.
>>>
>>>
>>> The following artifact appears to be PR63633.
>>>
>>
>> I did see that one - unfortunately, that fix won't help here. IIUC, you
>> check if input/output operand hard regs are in the clobber list,
>> and if yes, you generate pseudos to save and restore clobbered hard
>> regs.
>>
>> In this case, the reg is actually clobbered by a different insn (one
>
>
> Arrgh, yes...
>
>> that loads the next argument to the function). So unless I blindly generate 
>> pseudos for
>> all regs in the clobber list, the clobbering will still happen.
>>
>> FWIW, I did try saving/restoring all clobbered regs, and it did fix the
>> problem - just that it appeared like a (worse) hack to me. Aren't we
>> manually replicating what RA/reload should be doing?
>
>
> As it appears, we'll have to do it by hand.  The attaches patch is just a 
> sketch that indicates how the problem could be approached.  Notice the new 
> assertions in the split expanders; they will throw ICE until the fix is 
> actually installed.
>
> The critical insn are generated in movMM expander and are changed to have no 
> clobbers (SCRATCHes actually).  An a later pass, when appropriate life info 
> can be made available, run yet another avr pass that
>
> 1a) Save-restore needed hard regs around the insn.
>
> 2a) Kick out hard regs overlapping the clobbers, e.g. in set_src, into new 
> pseudos.  Maybe that could happen due to some hard regs progagation, or we 
> can use a new predicate similar combine_pseudo_register_operand.
>
> 3) Replace scratch -> hard regs for all scratch_operands.
>
> 2b) Restore respective hard regs from their pseudos.
>
> 1b) Restore respective hard regs from their pseudos.
>
>
> And maybe we can get better code by allocating things like address register 
> by hand and get better code then.
>
> When I implemented some of the libgcc insns I tried to express the operand by 
> means of constraints, e.h. for (reg:HI 22) and let register allocator do the 
> job.
>
> The resulting code was functional but *horrific*.
>
> The register allocator is not yet ready to generate efficient code in such 
> demanding situations...
>
>>
>> What do you think?
>>
>
> IMO sooner or later we'll need such an infrastructure; maybe also for non-mov 
> insn that are implemented by transparent libcalls like divmod, mul, etc.
>
 When cfgexpand.c expands a function call, it first figures out the
 registers in which the arguments will go, followed by expansion of the
 arguments themselves (right to left). It later emits mov insns to set
 the precomputed registers with the expanded RTXes.

 If one of the arguments is a __memx char pointer dereference, the mov
 eventually expands to gen_xload_A (for certain cases), which
 clobbers R22, R21 and Z.  This causes problems if one of those
 clobbered registers was precomputed to hold another argument.

 In general, call expansion does not appear to take clobbers into account -
>
>
> We had been warned that using hard regs is evil...  But without that 
> technique the code quality would decrease way too much.
>
 when it checks for argument overlap, the RTX (args[i].value) is only a MEM
 in QImode for the memx deref - the clobber shows up when it eventually
 calls emit_move_insn, at which point, it is too late.
>
>
> Such situations could only be handled by a target hook which allowed to 
> expand specific trees by hand...  Such a hook could cater for insn that must 
> use hard registers.
>
 This does not happen for a int pointer dereference - turns out that
 precompute_register_parameters does a copy_to_mode_reg if the
 cost of args[i].value is more than COSTS_N_INSNS(1) i.e., it creates a
 pseudo and later assigns the pseudo to the arg register. This is done
 before any moves to arg registers is done, so other arguments are not
 overwritten.

 Doing the same thing - providing a better cost estimate for a MEM rtx in
 the non-default address space, makes this problem go away, and that is
 what this patch does. Regression testing does not show any new failures.
>
>
> Can you tell something about overall code quality? If it is not significantly 
> worse then I'd propose to apply your rtx-costs solution soon.  The full fix 
> will take more time to work it out.

I'm agree with Georg.


Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation

2015-04-16 Thread Mike Stump
On Apr 16, 2015, at 8:01 AM, David Malcolm  wrote:
> Attached is a work-in-progress patch for a new
>  -Wmisleading-indentation
> warning I've been experimenting with, for GCC 6.

Seems like a nice idea in general.

Does it also handle:

if (cone);
  stmt;

?  Would be good to add that to the test suite, as that is another hard to spot 
common error that should be caught.

I do think that it is reasonable to warn for things like:

  stmt;
stmt;

one of those two lines is likely misindented, though, maybe you want to start 
with the high payback things first.

> An issue here is how to determine (i), or if it's OK to default to 8

Yes, 8 is the proper value to default it to.

> and have a command-line option (param?) to override it? (though what about,
> say, each header file?)

I’ll abstain from this.  The purist in me says no option for other than 8, life 
goes on.  20 years ago, someone was confused over hard v soft tabbing and what 
exactly the editor key TAB does.  That confusion is over, the 8 people have 
won.  Catering to other than 8 gives the impression that the people that lost 
still have a chance at winning.  :-)

> Thoughts on this, and on the patch?

Would be nice to have a stricter version that warns about all wildly 
inconsistently or wrongly indented lines.

{
  stmt;
stmt;  // must be same as above
}

{
stmt; // must be indented at least 1
}

if (cond)
stmt;  // must be indented at least 1

[PATCH][AArch64] Fix PR/65770 vstN_lane on bigendian

2015-04-16 Thread Alan Lawrence
As per bugzilla entry, indices in the generated assembly for bigendian are 
flipped when they should not be (and, flipped always relative to a Q-register!).


This flips the lane indices back again at assembly time, fixing PR. The 
"indices" contained in the RTL are still wrong for D registers, but these are 
only parameters to an UNSPEC and so never acted upon. (Nonetheless I intend to 
fix this anomaly in later patches).


Tested check-gcc on aarch64-none-elf and aarch64_be-none-elf.
New test (initially failing on bigendian) now passing on both.

gcc/ChangeLog:

PR target/65770
config/aarch64/aarch64-simd.md (vec_store_lanesoi_lane,
vec_store_lanesci_lane, vec_store_lanesxi_lane):
Flip lane index back at assembly time for bigendian.

gcc/testsuite/ChangeLog:

PR target/65770
gcc.target/aarch64/vstN_lane_1.c: New file.
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 055757036d54d0d5cf5df4bd05419e39ea119f46..b84374443a08a89a7b7c372b1585e128ac8b7fdd 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3954,6 +3954,7 @@
   [(set_attr "type" "neon_store2_2reg")]
 )
 
+;; RTL uses GCC vector extension indices, so flip only for assembly.
 (define_insn "vec_store_lanesoi_lane"
   [(set (match_operand: 0 "aarch64_simd_struct_operand" "=Utv")
 	(unspec: [(match_operand:OI 1 "register_operand" "w")
@@ -3961,7 +3962,10 @@
 		(match_operand:SI 2 "immediate_operand" "i")]
UNSPEC_ST2_LANE))]
   "TARGET_SIMD"
-  "st2\\t{%S1. - %T1.}[%2], %0"
+  {
+operands[2] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[2])));
+return "st2\\t{%S1. - %T1.}[%2], %0";
+  }
   [(set_attr "type" "neon_store3_one_lane")]
 )
 
@@ -4045,6 +4049,7 @@
   [(set_attr "type" "neon_store3_3reg")]
 )
 
+;; RTL uses GCC vector extension indices, so flip only for assembly.
 (define_insn "vec_store_lanesci_lane"
   [(set (match_operand: 0 "aarch64_simd_struct_operand" "=Utv")
 	(unspec: [(match_operand:CI 1 "register_operand" "w")
@@ -4052,7 +4057,10 @@
 		(match_operand:SI 2 "immediate_operand" "i")]
UNSPEC_ST3_LANE))]
   "TARGET_SIMD"
-  "st3\\t{%S1. - %U1.}[%2], %0"
+  {
+operands[2] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[2])));
+return "st3\\t{%S1. - %U1.}[%2], %0";
+  }
   [(set_attr "type" "neon_store3_one_lane")]
 )
 
@@ -4136,6 +4144,7 @@
   [(set_attr "type" "neon_store4_4reg")]
 )
 
+;; RTL uses GCC vector extension indices, so flip only for assembly.
 (define_insn "vec_store_lanesxi_lane"
   [(set (match_operand: 0 "aarch64_simd_struct_operand" "=Utv")
 	(unspec: [(match_operand:XI 1 "register_operand" "w")
@@ -4143,7 +4152,10 @@
 		(match_operand:SI 2 "immediate_operand" "i")]
UNSPEC_ST4_LANE))]
   "TARGET_SIMD"
-  "st4\\t{%S1. - %V1.}[%2], %0"
+  {
+operands[2] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[2])));
+return "st4\\t{%S1. - %V1.}[%2], %0";
+  }
   [(set_attr "type" "neon_store4_one_lane")]
 )
 
diff --git a/gcc/testsuite/gcc.target/aarch64/vstN_lane_1.c b/gcc/testsuite/gcc.target/aarch64/vstN_lane_1.c
new file mode 100644
index ..a695aa1954036ef1c1782b14ddb3c46ec78b5f0b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vstN_lane_1.c
@@ -0,0 +1,75 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-inline" } */
+
+#include 
+
+extern void abort (void);
+
+#define VARIANTS(VARIANT, STRUCT)	\
+VARIANT (uint8, , 8, _u8, 6, STRUCT)	\
+VARIANT (uint16, , 4, _u16, 3, STRUCT)	\
+VARIANT (uint32, , 2, _u32, 1, STRUCT)	\
+VARIANT (uint64, , 1, _u64, 0, STRUCT)	\
+VARIANT (int8, , 8, _s8, 5, STRUCT)	\
+VARIANT (int16, , 4, _s16, 2, STRUCT)	\
+VARIANT (int32, , 2, _s32, 0, STRUCT)	\
+VARIANT (int64, , 1, _s64, 0, STRUCT)	\
+VARIANT (poly8, , 8, _p8, 7, STRUCT)	\
+VARIANT (poly16, , 4, _p16, 1, STRUCT)	\
+VARIANT (float32, , 2, _f32, 1, STRUCT)	\
+VARIANT (float64, , 1, _f64, 0, STRUCT)	\
+VARIANT (uint8, q, 16, _u8, 14, STRUCT)	\
+VARIANT (uint16, q, 8, _u16, 4, STRUCT)	\
+VARIANT (uint32, q, 4, _u32, 3, STRUCT)	\
+VARIANT (uint64, q, 2, _u64, 0, STRUCT)	\
+VARIANT (int8, q, 16, _s8, 13, STRUCT)	\
+VARIANT (int16, q, 8, _s16, 6, STRUCT)	\
+VARIANT (int32, q, 4, _s32, 2, STRUCT)	\
+VARIANT (int64, q, 2, _s64, 1, STRUCT)	\
+VARIANT (poly8, q, 16, _p8, 12, STRUCT)	\
+VARIANT (poly16, q, 8, _p16, 5, STRUCT)	\
+VARIANT (float32, q, 4, _f32, 1, STRUCT)\
+VARIANT (float64, q, 2, _f64, 0, STRUCT)
+
+#define TESTMETH(BASE, Q, ELTS, SUFFIX, LANE, STRUCT)			\
+int	\
+test_vst##STRUCT##Q##_lane##SUFFIX (const BASE##_t *data)		\
+{	\
+  BASE##x##ELTS##x##STRUCT##_t vectors;	\
+  for (int i = 0; i < STRUCT; i++, data += ELTS)			\
+vectors.val[i] = vld1##Q##SUFFIX (data);\
+  BASE##_t temp[STRUCT];		\
+  vst##STRUCT##Q##_lane##SUFFIX (temp, vectors, LANE);			\
+  for (int i = 0; i < STRUCT; i++)	\
+{	\
+  if (temp[i] != vget

RE: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro

2015-04-16 Thread Maciej W. Rozycki
On Thu, 16 Apr 2015, Petar Jovanovic wrote:

> > There isn't any need to execute a large testcase.  Instead, try adding a
> short version of your test to the directory gcc/testsuite/gcc.target/mips.
> > There are examples of other tests there they check for specific assembler
> sequences by using scan-assembler.   See umips-swp-1.c (and others) for a
> specific example of how to do this.
> > Let me know if you need additional help.
> > Thanks,
> > Catherine
> 
> Hi Catherine,
> 
> Sorry for late response, I have missed to follow up on this.
> IIUC, scan-assembler will scan assembly file generated from a test file.
> This particular change will - on the other hand - modify crtend.o and thus
> init section. Is there a 'scan-assembler' functionality over a final linked
> executable, as that would actually test the change?

 You'd need `objdump' for doing that and there is no ready-to-use solution 
within the GCC test suite available, you'd have to cook something up 
yourself, perhaps starting with `[find_binutils_prog objdump]'.

 Another solution might be using an auxiliary linker script (`-Wl,-T,...' 
or maybe just an implicit linker script will do; see the LD manual for 
details) to place the sections the jump is made between apart manually at 
addresses appropriate for JAL to fail.  They span does not have to be 
large -- when placed correctly, even `JAL .' can jump to the wrong place, 
which is what LD is supposed to catch as a relocation error -- so a test 
executable successfully linked with your correction in place won't be 
large, it doesn't have to take more than 2 virtual pages.

 The downside of the latter solution are possible portability issues with 
the linker script.  You won't have to run your executable AFAICT from 
glibc BZ #17601 as you'll get a link error if a jump target is out of 
range.  So you could make it a mere link test, no need to run it.

  Maciej


Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation

2015-04-16 Thread Manuel López-Ibáñez

On 16/04/15 17:01, David Malcolm wrote:

Attached is a work-in-progress patch for a new
   -Wmisleading-indentation
warning I've been experimenting with, for GCC 6.


It sounds very cool...


(D) tabs vs spaces.  This is probably the biggest can of worms.


I would suggest to be very conservative when warning. In case of doubt, do not 
warn.



An issue here is how to determine (i), or if it's OK to default to 8 and
have a command-line option (param?) to override it? (though what about,
say, each header file?)


In case of a potential warning, you could detect if there is a mix of tab and 
spaces for the lines affected that may make the warning bogus, and not warn in 
that case. That is,


<6 spaces>if (flagC)
<1 tab>foo ();
<1 tab>bar ();

warns but

<6 spaces>if (flagC)
<8 spaces>foo ();
<2   tabs>bar ();

does not because you don't know how many spaces those 2 tabs take.

Also, in order to detect whether something is a tab or a space, why not simply 
re-open and seek? The input file is probably already in memory. See how 
diagnostic_show_locus in diagnostics.c gets lines from given locations.


Not even Emacs, which is a GNU package, can get tabs vs. spaces right and 
compile-goto-error goes to the wrong column. That seems a can of worms not 
worth opening.



+
+   This is fed three things by the frontend:
+
+   (A) a series of location_t by the frontend's tokenizer,
+   corresponding to the locations of the token stream.  It uses
+   this to model a stack of "visual blocks", corresponding to
+   indentation levels within the source code.

I don't understand why you need more than three, one for the conditional 
statement, another for the solo statement and another for the next-statement. 
I also don't get why you need to pass the whole block and new_stmt to 
visual_block, don't the locations returned by EXPR_LOCATION suffice?


Given that, the C++ tentative parsing should not be a problem since there is no 
tentative parsing (or there should not be) of if/else/while, and you only care 
about the first token when parsing the other two statements.


@@ -236,6 +236,9 @@ c_lex_one_token (c_parser *parser, c_token *token)
   token->keyword = RID_MAX;
   token->pragma_kind = PRAGMA_NONE;

+  if (vis_parser)
+vis_parser->on_token (token->location);
+

Why you need to save every location? Why not just do this when the parser 
detects an if/else/while? Then,


@@ -5081,7 +5086,11 @@ c_parser_if_body (c_parser *parser, bool *if_p)
   else if (c_parser_next_token_is (parser, CPP_OPEN_BRACE))
 add_stmt (c_parser_compound_statement (parser));
   else
-c_parser_statement_after_labels (parser);
+{
+  c_parser_statement_after_labels (parser);
+  if (vis_parser)
+   vis_parser->on_solo_stmt (block, if_loc);
+}
   return c_end_compound_stmt (body_loc, block, flag_isoc99);
 }

You can simply here peek the location of the next token before and after 
c_parser_statement_after_labels (parser). Then warn if they are equal or if 
they are on the same line, no? You may need to skip pragmas.



+   Note that we can't simply use statement locations; for example, in:
+ if (flag)
+   x = 1;
+   the "if (flag)"'s location is at the open-paren, and that of the
+   assignment "x = 1;" is at the equals-sign, so any attempt to use
+   statement locations can be fooled by varying the spacing:

Aren't these bugs (or missing features) in the FE locations? I know that the 
location of the assignment is at '=' because we don't have (yet) locations for 
constants or variables (all binary expressions should have 3 locations!).


In any case, don't you have the location of the token that starts the 
statements? This is all you need as far as I understand.


And why the 'if' points to the open paren? One can easily find the open-paren 
location if needed by seeking in the input buffer.



+ /* FIXME: what about entirely empty lines???
+Presumably they simply don't get tokens.  */

Empty lines no, but preprocessor directives do.

Cheers,

Manuel.


Re: [PATCH, i386]: Merge *_sse patterns with *_mixed using enabled attribute.

2015-04-16 Thread Uros Bizjak
Hello!

This part merges *fop__1_sse with *fop__1_mixed.

2015-04-16  Uros Bizjak  

* config/i386/predicates.md (register_mixssei387nonimm_operand): New.
* config/i386/i386.md (*fop__1_mixed): Merge with
*fop__1_sse using enabled attribute.  Use
register_mixssei387nonimm_operand operand 1 predicate. Change
alternative 3 constraints from "x" to "v".

Bootsttapped and regression tested on x86_64-linux-gnu {,-m32} and
committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 222145)
+++ config/i386/i386.md (working copy)
@@ -13602,12 +13602,26 @@
   (const_string "fop")))
(set_attr "mode" "")])
 
+(define_insn "*rcpsf2_sse"
+  [(set (match_operand:SF 0 "register_operand" "=x")
+   (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "xm")]
+  UNSPEC_RCP))]
+  "TARGET_SSE_MATH"
+  "%vrcpss\t{%1, %d0|%d0, %1}"
+  [(set_attr "type" "sse")
+   (set_attr "atom_sse_attr" "rcp")
+   (set_attr "btver2_sse_attr" "rcp")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "SF")])
+
 (define_insn "*fop__1_mixed"
-  [(set (match_operand:MODEF 0 "register_operand" "=f,f,x,x")
+  [(set (match_operand:MODEF 0 "register_operand" "=f,f,x,v")
(match_operator:MODEF 3 "binary_fp_operator"
- [(match_operand:MODEF 1 "nonimmediate_operand" "0,fm,0,x")
-  (match_operand:MODEF 2 "nonimmediate_operand" "fm,0,xm,xm")]))]
-  "SSE_FLOAT_MODE_P (mode) && TARGET_MIX_SSE_I387
+ [(match_operand:MODEF 1
+"register_mixssei387nonimm_operand" "0,fm,0,v")
+  (match_operand:MODEF 2
+"nonimmediate_operand"  "fm,0,xm,vm")]))]
+  "SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH
&& !COMMUTATIVE_ARITH_P (operands[3])
&& !(MEM_P (operands[1]) && MEM_P (operands[2]))"
   "* return output_387_binary_op (insn, operands);"
@@ -13628,39 +13642,13 @@
   (const_string "fop")))
(set_attr "isa" "*,*,noavx,avx")
(set_attr "prefix" "orig,orig,orig,vex")
-   (set_attr "mode" "")])
+   (set_attr "mode" "")
+   (set (attr "enabled")
+ (cond [(eq_attr "alternative" "0,1")
+  (symbol_ref "TARGET_MIX_SSE_I387")
+  ]
+   (const_string "*")))])
 
-(define_insn "*rcpsf2_sse"
-  [(set (match_operand:SF 0 "register_operand" "=x")
-   (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "xm")]
-  UNSPEC_RCP))]
-  "TARGET_SSE_MATH"
-  "%vrcpss\t{%1, %d0|%d0, %1}"
-  [(set_attr "type" "sse")
-   (set_attr "atom_sse_attr" "rcp")
-   (set_attr "btver2_sse_attr" "rcp")
-   (set_attr "prefix" "maybe_vex")
-   (set_attr "mode" "SF")])
-
-(define_insn "*fop__1_sse"
-  [(set (match_operand:MODEF 0 "register_operand" "=x,x")
-   (match_operator:MODEF 3 "binary_fp_operator"
- [(match_operand:MODEF 1 "register_operand" "0,x")
-  (match_operand:MODEF 2 "nonimmediate_operand" "xm,xm")]))]
-  "SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH
-   && !COMMUTATIVE_ARITH_P (operands[3])"
-  "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type")
-(cond [(match_operand:MODEF 3 "mult_operator")
- (const_string "ssemul")
-  (match_operand:MODEF 3 "div_operator")
- (const_string "ssediv")
-  ]
-  (const_string "sseadd")))
-   (set_attr "isa" "noavx,avx")
-   (set_attr "prefix" "orig,vex")
-   (set_attr "mode" "")])
-
 ;; This pattern is not fully shadowed by the pattern above.
 (define_insn "*fop__1_i387"
   [(set (match_operand:MODEF 0 "register_operand" "=f,f")
Index: config/i386/predicates.md
===
--- config/i386/predicates.md   (revision 222145)
+++ config/i386/predicates.md   (working copy)
@@ -115,6 +115,12 @@
 (match_operand 0 "nonmemory_operand")
 (match_operand 0 "general_operand")))
 
+;; Match register operands, include memory operand for TARGET_MIX_SSE_I387.
+(define_predicate "register_mixssei387nonimm_operand"
+  (if_then_else (match_test "TARGET_MIX_SSE_I387")
+(match_operand 0 "nonimmediate_operand")
+(match_operand 0 "register_operand")))
+
 ;; Return true if VALUE is symbol reference
 (define_predicate "symbol_operand"
   (match_code "symbol_ref"))


[PATCH] PR target/65780: [5/6 Regression] Uninitialized common handling in executables

2015-04-16 Thread H.J. Lu
Uninitialized common symbol behavior in executables is target and linker
dependent.  default_binds_local_p_3 is made public and updated to take an
argument to indicate if common symbol may be local.  If common symbol
may be local, default_binds_local_p_3 will treat non-external variable
as defined locally.  default_binds_local_p_2 is changed to treat common
symbol as local for non-PIE binaries.

For i386, common symbol is local only for non-PIE binaries.  For x86-64,
common symbol is local only for non-PIE binaries or linker supports copy
reloc in PIE binaries.  If a target treats common symbol as local only
for non-PIE binaries, it can define TARGET_BINDS_LOCAL_P as
default_binds_local_p_2.

Tested on Linux/x86-64 using -m32 with binutils master and 2.24.  OK for
trunk and 5 branch?


H.J.
---
gcc/

PR target/65780
* output.h (default_binds_local_p_3): New.
* varasm.c (default_binds_local_p_3): Make it public.  Take an
argument to indicate if common symbol may be local.  If common
symbol may be local, treat non-external variable as defined
locally.
(default_binds_local_p_2): Pass !flag_pic to default_binds_local_p_3.
(default_binds_local_p_1): Pass false to default_binds_local_p_3.
* config/i386/i386.c (ix86_binds_local_p): New.
(TARGET_BINDS_LOCAL_P): Replace default_binds_local_p_2 with
ix86_binds_local_p.

gcc/testsuite/

PR target/65780
* gcc.dg/pr65780-1.c: New test.
* gcc.dg/pr65780-2.c: Likewise.
* gcc.target/i386/pr32219-9.c: Likewise.
* gcc.target/i386/pr32219-1.c (xxx): Make it initialized common
symbol.
* gcc.target/i386/pr64317.c (c): Initialize.
---
 gcc/config/i386/i386.c| 15 ++-
 gcc/output.h  |  1 +
 gcc/testsuite/gcc.dg/pr65780-1.c  | 12 
 gcc/testsuite/gcc.dg/pr65780-2.c  | 13 +
 gcc/testsuite/gcc.target/i386/pr32219-1.c |  3 ++-
 gcc/testsuite/gcc.target/i386/pr32219-9.c | 15 +++
 gcc/testsuite/gcc.target/i386/pr64317.c   |  2 +-
 gcc/varasm.c  | 31 +++
 8 files changed, 77 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr65780-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr65780-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr32219-9.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1f20ff3..72e6bc2 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -51793,6 +51793,19 @@ ix86_initialize_bounds (tree var, tree lb, tree ub, 
tree *stmts)
   return 2;
 }
 
+/* For i386, common symbol is local only for non-PIE binaries.  For
+   x86-64, common symbol is local only for non-PIE binaries or linker
+   supports copy reloc in PIE binaries.   */
+
+static bool
+ix86_binds_local_p (const_tree exp)
+{
+  return default_binds_local_p_3 (exp, flag_shlib != 0, true, true,
+ (!flag_pic
+  || (TARGET_64BIT
+  && HAVE_LD_PIE_COPYRELOC != 0)));
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_RETURN_IN_MEMORY
 #define TARGET_RETURN_IN_MEMORY ix86_return_in_memory
@@ -51927,7 +51940,7 @@ ix86_initialize_bounds (tree var, tree lb, tree ub, 
tree *stmts)
 #define TARGET_BINDS_LOCAL_P darwin_binds_local_p
 #else
 #undef TARGET_BINDS_LOCAL_P
-#define TARGET_BINDS_LOCAL_P default_binds_local_p_2
+#define TARGET_BINDS_LOCAL_P ix86_binds_local_p
 #endif
 #if TARGET_DLLIMPORT_DECL_ATTRIBUTES
 #undef TARGET_BINDS_LOCAL_P
diff --git a/gcc/output.h b/gcc/output.h
index 53e47d0..81d2ad2 100644
--- a/gcc/output.h
+++ b/gcc/output.h
@@ -587,6 +587,7 @@ extern bool default_use_anchors_for_symbol_p (const_rtx);
 extern bool default_binds_local_p (const_tree);
 extern bool default_binds_local_p_1 (const_tree, int);
 extern bool default_binds_local_p_2 (const_tree);
+extern bool default_binds_local_p_3 (const_tree, bool, bool, bool, bool);
 extern void default_globalize_label (FILE *, const char *);
 extern void default_globalize_decl_name (FILE *, tree);
 extern void default_emit_unwind_label (FILE *, tree, int, int);
diff --git a/gcc/testsuite/gcc.dg/pr65780-1.c b/gcc/testsuite/gcc.dg/pr65780-1.c
new file mode 100644
index 000..b586211
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr65780-1.c
@@ -0,0 +1,12 @@
+/* PR target/65780 */
+/* { dg-do link { target *-*-linux* *-*-gnu* } } */
+/* { dg-options "-O2" } */
+
+int optopt;
+
+int
+main ()
+{
+  optopt = 4;
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr65780-2.c b/gcc/testsuite/gcc.dg/pr65780-2.c
new file mode 100644
index 000..bff3323
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr65780-2.c
@@ -0,0 +1,13 @@
+/* PR target/65780 */
+/* { dg-do link { target *-*-linux* *-*-gnu* } } */
+/* { dg-require-effective-target pie } */
+/* { dg-options "-O2 -fpie" } */
+
+int optopt;
+

Re: [Patch, Fortran, pr60322, addendum] was: [Patch 1/2, Fortran, pr60322] [OOP] Incorrect bounds on polymorphic dummy array

2015-04-16 Thread Paul Richard Thomas
Hi Andre,

The delta patch is OK for trunk and eventual backport to 5.2.

Thanks for all the hard work

Paul

On 14 April 2015 at 19:00, Andre Vehreschild  wrote:
> Hi all,
>
> during further testing of a big Fortran software I encounter two bugs with
> class arrays, that are somehow connected to pr60322. I therefore propose an
> extended patch for pr60322. Because Paul has already reviewed most the 
> extended
> patch, I give you two patches:
>
> 1. a full patch, fixing all the issues connected to pr60322, and
> 2. a delta patch to get from the reviewed patch to the latest version.
>
> With the second patch I hope to get a faster review, because it is
> significantly shorter.
>
> Now what was the issue? To be precise there were two issues:
>
> i. a pointer to a class array (CLASS_DATA(sym).attr.class_pointer == 1) was
> dereferenced, which lead to an ICE (the patch for this in the delta is chunk 5
> in gfc_conv_expr_descriptor, and
>
> ii. (and this was a severe brain cracker) in chains of references consisting 
> of
> more then one class-(array)-ref always the _vptr of the first symbol was taken
> and not the _vptr of the currently dereferenced class object. This occurred
> when fortran code similiar to this was executed:
>
> type innerT
>   integer, allocatable :: arr(:)
> end type
>
> type T
>   class(innerT) :: mat(:,:)
> end type
>
> class(T) :: o
>
> allocate(o%mat(2,2))
> allocate(o%mat(:,:)%arr(10)) ! This is obviously pseudo code,
> ! but I think you get what is meant.
>
> o%mat(1,1)%arr(1) = 1
>
> In the last line the address to get to arr(1) was computed using the
> _vptr->size of o and not of o%mat(1,1). To fix this gfc_component_ref () now
> computes the class' _vptr-ref whenever it does a _data-ref (chunk 1 of
> trans-expr.c in the delta patch). The _vptr-ref is stored in gfc_se, where I
> added the new member class_vptr. The gfc_se->class_vptr is then used in
> array-refs (chunk 2 of trans.c) to get the size of the array elements of the
> correct level.
>
> The other chunks of the delta patch are:
> - parameter passing fixes, and
> - documentation fixes as requested for the version 5 of the pr60322 patch.
>
> I hope this helps in getting the patch reviewed quickly.
>
> Bootstraps and regtests ok on x86_64-linux-gnu/F21.
>
> Ok for trunk -> 6.0?
> Ok, for backport to 5.2, once available?
>
> Note, the patches may apply with shifts, as I forgot to update before taking
> the diffs.
>
> Regards,
> Andre
>
> On Thu, 9 Apr 2015 14:37:09 +0200
> Andre Vehreschild  wrote:
>
>> Hi Paul, hi all,
>>
>> Paul, thanks for the review. Answers to your questions are inline below:
>>
>> On Sun, 5 Apr 2015 11:13:05 +0200
>> Paul Richard Thomas  wrote:
>> 
>> > +  /* The dummy is returned for pointer, allocatable or assumed rank 
>> > arrays.
>> > + The check for pointerness needs to be repeated here (it is done in
>> > + IS_CLASS_ARRAY (), too), because for class arrays that are pointers,
>> > as
>> > + is the one of the sym, which is incorrect here.  */
>> >
>> > What does this mean, please?
>>
>> The first sentence is about regular arrays and should be unchanged from the
>> original source. Then I have to check for class (arrays) that are pointers,
>> i.e., independent of whether the sym is a class array or a regular pointer to
>> a class object. (The latter shouldn't make it into the routine anyway.)
>> IS_CLASS_ARRAY () returns false for too many reasons to be of use here. I 
>> have
>> to apologize and confess that the comment was a mere note to myself to not
>> return to use is_classarray in the if below. Let me rephrase the comment to
>> be:
>>
>> /* The dummy is returned for pointer, allocatable or assumed rank arrays.
>>For class arrays the information if sym is an allocatable or pointer
>>object needs to be checked explicitly (IS_CLASS_ARRAY can be false for
>>too many reasons to be of use here).  */
>>
>> > +  /* Returning the descriptor for dummy class arrays is hazardous,
>> > because
>> > + some caller is expecting an expression to apply the component refs 
>> > to.
>> > + Therefore the descriptor is only created and stored in
>> > + sym->backend_decl's GFC_DECL_SAVED_DESCRIPTOR.  The caller is then
>> > + responsible to extract it from there, when the descriptor is
>> > + desired.  */
>> > +  if (IS_CLASS_ARRAY (sym)
>> > +  && (!DECL_LANG_SPECIFIC (sym->backend_decl)
>> > +  || !GFC_DECL_SAVED_DESCRIPTOR (sym->backend_decl)))
>> > +{
>> > +  decl = gfc_build_dummy_array_decl (sym, sym->backend_decl);
>> > +  /* Prevent the dummy from being detected as unused if it is copied.
>> > */
>> > +  if (sym->backend_decl != NULL && decl != sym->backend_decl)
>> > +DECL_ARTIFICIAL (sym->backend_decl) = 1;
>> > +  sym->backend_decl = decl;
>> > +}
>> >
>> > The comments, such as the above are often going well beyond column 72,
>> > into the 80's. I know that much of the existing code violates t

Re: [PATCH] Do not discard the constructors of empty structs [PR c++/64527]

2015-04-16 Thread Jason Merrill

OK.

Jason


Re: [PATCH] Fix PR65204

2015-04-16 Thread H.J. Lu
On Mon, Apr 13, 2015 at 7:03 PM, H.J. Lu  wrote:
> On Wed, Feb 25, 2015 at 5:26 AM, Richard Biener  wrote:
>>
>> This fixes missed tracking of alignment of non-invariant addresses
>> in CCP.
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for GCC 6.
>>
>> Richard.
>>
>> 2015-02-25  Richard Biener  
>>
>> PR tree-optimization/65204
>> * tree-ssa-ccp.c (evaluate_stmt): Always evaluate address
>> takens for bit-CCP.
>>
>
> This caused:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65758

This also caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65788

and r222085 didn't fix it.

-- 
H.J.


Re: [PATCH V2] IRA: Speed up setup_left_conflict_sizes_p

2015-04-16 Thread Jeff Law

On 03/11/2015 07:34 PM, Zhouyi Zhou wrote:

From: Zhouyi Zhou 

In function setup_left_conflict_sizes_p, left conflict subnodes sizes
are computed in a bottom-to-up fashion through the regnodes foreast.

Speed up the process from prevent node itself to involve in this
computation.

I have no write access to GCC SVN repository, if it OK, can you commit
for me?

(Thanks Richard for reviewing)

Bootstrap and regtest scheduled on x86_64 GNU/Linux
Signed-off-by: Zhouyi Zhou 
---
  gcc/ChangeLog   |5 +
  gcc/ira-color.c |8 +++-
  2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 53f582b..a495dfc 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2015-03-12  Zhouyi Zhou  
+
+   * ira-color.c (setup_left_conflict_sizes_p): Prevent node itself
+   from computing left conflict subnodes size.

Approved and installed after the usual bootstrap and regression testing.

Thanks,
jeff



[C++ PATCH PING] Reject trailing return type for operator auto()

2015-04-16 Thread Ville Voutilainen
On 28 December 2014 at 20:21, Ville Voutilainen
 wrote:
> Any comments on this?

Re-ping. :) The original message is
https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01614.html

>
> On 19 December 2014 at 09:21, Ville Voutilainen
>  wrote:
>> Tested on Linux-x64.
>>
>> /cp
>> 2014-12-19  Ville Voutilainen  
>>
>> Reject trailing return type for an operator auto().
>> * decl.c (grokdeclarator): Reject trailing return types for
>> all conversion operators, don't handle conversion operators
>> in the previous checks that deal with auto.
>>
>> /testsuite
>> 2014-12-19  Ville Voutilainen  
>>
>> Reject trailing return type for an operator auto().
>> * g++.dg/cpp0x/auto9.C: Adjust.


[PATCH] fix PR target/65535

2015-04-16 Thread Andreas Tobler

Hi all,

the below is an attempt to warn a user when she/he builds a cross 
compiler for *-*-freebsd* without giving a major version number.


Ok for trunk?
Thanks,
Andreas

2015-04-16  Andreas Tobler  

* config.gcc: Exit with a comment when we do not have a major version
number for the FreeBSD target.

Index: config.gcc
===
--- config.gcc  (revision 222153)
+++ config.gcc  (working copy)
@@ -664,6 +664,11 @@
   gas=yes
   gnu_ld=yes
   fbsd_major=`echo ${target} | sed -e 's/.*freebsd//g' | sed -e 
's/\..*//g'`

+  if test "$fbsd_major" = ""; then
+echo "Specify the major version number of the targeted FreeBSD release"
+echo "like this: --target=amd64-unknown-freebsd10.1"
+exit 1
+  fi
   tm_defines="${tm_defines} FBSD_MAJOR=${fbsd_major}"
   tmake_file="t-slibgcc"
   case ${enable_threads} in


RE: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro

2015-04-16 Thread Moore, Catherine


> -Original Message-
> From: Maciej W. Rozycki [mailto:ma...@linux-mips.org]
> Sent: Thursday, April 16, 2015 2:23 PM
> To: Petar Jovanovic
> Cc: Moore, Catherine; 'Matthew Fortune'; gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro
> 
> On Thu, 16 Apr 2015, Petar Jovanovic wrote:
> 
> > > There isn't any need to execute a large testcase.  Instead, try
> > > adding a
> > short version of your test to the directory gcc/testsuite/gcc.target/mips.
> > > There are examples of other tests there they check for specific
> > > assembler
> > sequences by using scan-assembler.   See umips-swp-1.c (and others) for a
> > specific example of how to do this.
> > > Let me know if you need additional help.
> > > Thanks,
> > > Catherine
> >
> > Hi Catherine,
> >
> > Sorry for late response, I have missed to follow up on this.
> > IIUC, scan-assembler will scan assembly file generated from a test file.
> > This particular change will - on the other hand - modify crtend.o and
> > thus init section. Is there a 'scan-assembler' functionality over a
> > final linked executable, as that would actually test the change?
> 
Hi Petar,
It looks like I answered a little too quickly the first time around.  I'm sorry.
In any case, Maciej is correct.  I think a link-time test should do it.
Thanks,
Catherine

>  You'd need `objdump' for doing that and there is no ready-to-use solution
> within the GCC test suite available, you'd have to cook something up
> yourself, perhaps starting with `[find_binutils_prog objdump]'.
> 
>  Another solution might be using an auxiliary linker script (`-Wl,-T,...'
> or maybe just an implicit linker script will do; see the LD manual for
> details) to place the sections the jump is made between apart manually at
> addresses appropriate for JAL to fail.  They span does not have to be large --
> when placed correctly, even `JAL .' can jump to the wrong place, which is
> what LD is supposed to catch as a relocation error -- so a test executable
> successfully linked with your correction in place won't be large, it doesn't
> have to take more than 2 virtual pages.
> 
>  The downside of the latter solution are possible portability issues with the
> linker script.  You won't have to run your executable AFAICT from glibc BZ
> #17601 as you'll get a link error if a jump target is out of range.  So you 
> could
> make it a mere link test, no need to run it.
> 
>   Maciej


Re: [C++ Patch] Rename tsubst_flags_t function parameter

2015-04-16 Thread Jason Merrill

OK.

Jason


[PATCH] Fix dwarf2out ICE (PR debug/65771)

2015-04-16 Thread Jakub Jelinek
Hi!

As mentioned in the PR, on the following testcase we ICE, because for
  # DEBUG D#2 => b
  # DEBUG D#1 => a[D#2].t
  # DEBUG c => D#1
during expansion we get the a[D#2].t added as MEM_EXPR of a MEM, and because
we can't mem_loc_descriptor that MEM (I'll post separately a trunk only
patch that fixes that in this case, but generally not all MEMs can be
represented in debug info), we try harder and try to use MEM_EXPR in
loc_list_from_tree, but that one ICEs on DEBUG_EXPR_DECL.  There is nothing
we can do for those at this point, debug_exprs are only useful to
var-tracking, so returning NULL is the only thing we can do for those.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and 5.1?

2015-04-16  Jakub Jelinek  

PR debug/65771
* dwarf2out.c (loc_list_from_tree): Return NULL
for DEBUG_EXPR_DECL.

* gcc.dg/debug/pr65771.c: New test.

--- gcc/dwarf2out.c.jj  2015-04-16 16:51:52.0 +0200
+++ gcc/dwarf2out.c 2015-04-16 16:57:28.866134980 +0200
@@ -14642,6 +14642,7 @@ loc_list_from_tree (tree loc, int want_a
 
 case TARGET_MEM_REF:
 case SSA_NAME:
+case DEBUG_EXPR_DECL:
   return NULL;
 
 case COMPOUND_EXPR:
--- gcc/testsuite/gcc.dg/debug/pr65771.c.jj 2015-04-16 17:00:23.811328842 
+0200
+++ gcc/testsuite/gcc.dg/debug/pr65771.c2015-04-16 17:00:13.0 
+0200
@@ -0,0 +1,15 @@
+/* PR debug/65771 */
+/* { dg-do link } */
+/* { dg-require-effective-target tls } */
+
+struct S { int s; int t; };
+__thread struct S a[10];
+int b;
+
+int
+main ()
+{
+  int c = a[b].t;
+  (void) c;
+  return 0;
+}

Jakub


[PATCH, 5.1, rs6000] Fix PR65787

2015-04-16 Thread Bill Schmidt
Hi,

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65787 identifies an issue
where the powerpc64le vector swap optimization miscompiles some code.
The code for handling vector extract operations did not expect to find
those operations wrapped in a PARALLEL with a CLOBBER, but this test
shows that this can now happen.  This patch adds that possibility to the
handling for vector extract.  I've added a small test case to verify
that we now catch this.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Is this ok for trunk and gcc-5-branch?

After this is complete I will investigate whether we need to backport
this to 4.8 and 4.9 also.

Thanks,
Bill


[gcc]

2015-04-16  Bill Schmidt  

PR target/65787
* config/rs6000/rs6000.c (rtx_is_swappable_p): Handle case where
vec_extract operation is wrapped in a PARALLEL with a CLOBBER.
(adjust_extract): Likewise.

[gcc/testsuite]

2015-04-16  Bill Schmidt  

PR target/65787
* gcc.target/powerpc/pr65787.c: New.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 222158)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -34204,6 +34204,20 @@ rtx_is_swappable_p (rtx op, unsigned int *special)
   else
return 0;
 
+case PARALLEL: {
+  /* A vec_extract operation may be wrapped in a PARALLEL with a
+clobber, so account for that possibility.  */
+  unsigned int len = XVECLEN (op, 0);
+
+  if (len != 2)
+   return 0;
+
+  if (GET_CODE (XVECEXP (op, 0, 1)) != CLOBBER)
+   return 0;
+
+  return rtx_is_swappable_p (XVECEXP (op, 0, 0), special);
+}
+
 case UNSPEC:
   {
/* Various operations are unsafe for this optimization, at least
@@ -34603,7 +34617,10 @@ permute_store (rtx_insn *insn)
 static void
 adjust_extract (rtx_insn *insn)
 {
-  rtx src = SET_SRC (PATTERN (insn));
+  rtx pattern = PATTERN (insn);
+  if (GET_CODE (pattern) == PARALLEL)
+pattern = XVECEXP (pattern, 0, 0);
+  rtx src = SET_SRC (pattern);
   /* The vec_select may be wrapped in a vec_duplicate for a splat, so
  account for that.  */
   rtx sel = GET_CODE (src) == VEC_DUPLICATE ? XEXP (src, 0) : src;
Index: gcc/testsuite/gcc.target/powerpc/pr65787.c
===
--- gcc/testsuite/gcc.target/powerpc/pr65787.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr65787.c  (working copy)
@@ -0,0 +1,21 @@
+/* { dg-do compile { target { powerpc64le-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O3" } */
+/* { dg-final { scan-assembler "xxsldwi \[0-9\]*,\[0-9\]*,\[0-9\]*,3" } } */
+/* { dg-final { scan-assembler-not "xxpermdi" } } */
+
+/* This test verifies that a vector extract operand properly has its
+   lane changed by the swap optimization.  Element 2 of LE corresponds
+   to element 1 of BE.  When doublewords are swapped, this becomes
+   element 3 of BE, so we need to shift the vector left by 3 words
+   to be able to extract the correct value from BE element zero.  */
+
+typedef float  v4f32 __attribute__ ((__vector_size__ (16)));
+
+void foo (float);
+extern v4f32 x, y;
+
+int main() {
+  v4f32 z = x + y;
+  foo (z[2]);
+}




[PATCH] Improve debug info generation for TLS + const (PR debug/65771)

2015-04-16 Thread Jakub Jelinek
On Thu, Apr 16, 2015 at 11:11:06PM +0200, Jakub Jelinek wrote:
> during expansion we get the a[D#2].t added as MEM_EXPR of a MEM, and because
> we can't mem_loc_descriptor that MEM (I'll post separately a trunk only
> patch that fixes that in this case, but generally not all MEMs can be
> represented in debug info), we try harder and try to use MEM_EXPR in

Here is the patch for it, bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?
We handle TLS SYMBOL_REFs in mem_loc_descriptor, but in the testcase it
is surrounded by (const (plus (symbol_ref) (const_int))).

2015-04-16  Jakub Jelinek  

PR debug/65771
* dwarf2out.c (mem_loc_descriptor): For CONST, fallback to
trying mem_loc_descriptor on XEXP (rtl, 0).

--- gcc/dwarf2out.c.jj  2015-02-26 16:37:09.0 +0100
+++ gcc/dwarf2out.c 2015-04-16 16:45:44.584429608 +0200
@@ -12799,7 +12799,12 @@ mem_loc_descriptor (rtx rtl, machine_mod
}
 
   if (!const_ok_for_output (rtl))
-   break;
+   {
+ if (GET_CODE (rtl) == CONST)
+   mem_loc_result = mem_loc_descriptor (XEXP (rtl, 0), mode, mem_mode,
+initialized);
+ break;
+   }
 
 symref:
   mem_loc_result = new_addr_loc_descr (rtl, dtprel_false);


Jakub


[patch, c, ping] Fix PR c/48956: diagnostics for conversions involving complex types (reviewed)

2015-04-16 Thread Mikhail Maltsev
I would like to ping the following patch:
https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01925.html
Review: https://gcc.gnu.org/ml/gcc-patches/2015-01/msg02672.html

I fixed minor issues mentioned in the review and updated the changelog
message. Rebased on current trunk (r222157), bootstrapped and regtested
on x86_64-unknown-linux-gnu.
If it is OK for trunk, please assist with applying (I don't have write
access), and I will then create a new PR in bugzilla for the remaining
cases mentioned in review.

-- 
Regards,
Mikhail Maltsev
gcc/c-family/ChangeLog:

2015-04-17  Mikhail Maltsev  

[PR c/48956] Complex values conversion diagnostics
* c-common.c (int_safely_convertible_to_real_p): New function
(unsafe_conversion_p): Check conversions involving complex types
(conversion_warning): Added new warning message for conversions
which discard imaginary component
* c-common.h: Added new enumerator to enum conversion_safety for such
unsafe conversions

gcc/testsuite/ChangeLog:

2015-04-17  Mikhail Maltsev  

[PR c/48956] Complex values conversion diagnostics
* gcc.dg/Wconversion-complex-c99.c: New test.
* gcc.dg/Wconversion-complex-gnu.c: New test.


diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 7fe7fa6..fd536db 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -2643,17 +2643,42 @@ shorten_binary_op (tree result_type, tree op0, tree 
op1, bool bitwise)
   return result_type;
 }
 
-/* Checks if expression EXPR of real/integer type cannot be converted
-   to the real/integer type TYPE. Function returns non-zero when:
+/* Returns true iff any integer value of type FROM_TYPE can be represented as
+   real of type TO_TYPE.  This is a helper function for unsafe_conversion_p.  
*/
+
+static bool
+int_safely_convertible_to_real_p (const_tree from_type, const_tree to_type)
+{
+  tree type_low_bound = TYPE_MIN_VALUE (from_type);
+  tree type_high_bound = TYPE_MAX_VALUE (from_type);
+  REAL_VALUE_TYPE real_low_bound =
+ real_value_from_int_cst (0, type_low_bound);
+  REAL_VALUE_TYPE real_high_bound =
+ real_value_from_int_cst (0, type_high_bound);
+
+  return exact_real_truncate (TYPE_MODE (to_type), &real_low_bound)
+&& exact_real_truncate (TYPE_MODE (to_type), &real_high_bound);
+}
+
+/* Checks if expression EXPR of complex/real/integer type cannot be converted
+   to the complex/real/integer type TYPE.  Function returns non-zero when:
* EXPR is a constant which cannot be exactly converted to TYPE.
* EXPR is not a constant and size of EXPR's type > than size of TYPE,
- for EXPR type and TYPE being both integers or both real.
+ for EXPR type and TYPE being both integers or both real, or both
+ complex.
+   * EXPR is not a constant of complex type and TYPE is a real or
+ an integer.
* EXPR is not a constant of real type and TYPE is an integer.
* EXPR is not a constant of integer type which cannot be
  exactly converted to real type.
+
Function allows conversions between types of different signedness and
can return SAFE_CONVERSION (zero) in that case.  Function can produce
-   signedness warnings if PRODUCE_WARNS is true.  */
+   signedness warnings if PRODUCE_WARNS is true.
+
+   Function allows conversions from complex constants to non-complex types,
+   provided that imaginary part is zero and real part can be safely converted
+   to TYPE.  */
 
 enum conversion_safety
 unsafe_conversion_p (location_t loc, tree type, tree expr, bool produce_warns)
@@ -2664,6 +2689,11 @@ unsafe_conversion_p (location_t loc, tree type, tree 
expr, bool produce_warns)
 
   if (TREE_CODE (expr) == REAL_CST || TREE_CODE (expr) == INTEGER_CST)
 {
+  /* If type is complex, we are interested in compatibility with
+underlying type.  */
+  if (TREE_CODE (type) == COMPLEX_TYPE)
+ type = TREE_TYPE (type);
+
   /* Warn for real constant that is not an exact integer converted
 to integer type.  */
   if (TREE_CODE (expr_type) == REAL_TYPE
@@ -2713,6 +2743,63 @@ unsafe_conversion_p (location_t loc, tree type, tree 
expr, bool produce_warns)
}
}
 }
+
+  else if (TREE_CODE (expr) == COMPLEX_CST)
+{
+  tree imag_part = TREE_IMAGPART (expr);
+  /* Conversion from complex constant with zero imaginary part,
+perform check for conversion of real part.  */
+  if ((TREE_CODE (imag_part) == REAL_CST
+  && real_zerop (imag_part))
+ || (TREE_CODE (imag_part) == INTEGER_CST
+ && integer_zerop (imag_part)))
+   /* Note: in this branch we use recursive call to unsafe_conversion_p
+  with different type of EXPR, but it is still safe, because when EXPR
+  is a constant, it's type is not used in text of generated warnings
+  (otherwise they could sound misleading).  */
+   return unsafe_conver

Re: [PR65768] Check rtx_cost when propagating constant

2015-04-16 Thread Kugan


On 15/04/15 21:18, Richard Biener wrote:
> On Wed, Apr 15, 2015 at 11:05 AM, Steven Bosscher  
> wrote:
>> On Wed, Apr 15, 2015 at 9:53 AM, Kugan wrote:
>>> 2015-04-15  Kugan Vivekanandarajah  < >
>>> Zhenqiang Chen  <>
>>>
>>> PR target/65768
>>> * cprop.c (try_replace_reg): Check cost of constants before 
>>> propagating.
>>
>>
>>> +
>>> +  /* For CONSTANT_P (to), loop2_invariant pass might hoist it out the loop.
>>> + And it can be shared by different references.  So skip propagation if
>>> + it makes INSN's rtx cost higher.  */
>>> +
>>
>> So only undo if the insn is inside a loop (i.e.
>> BLOCK_FOR_INSN(insn)->loop_father != NULL) and this is a
>> post-pass_loop2 cprop run?
> 
> post loop2 loops are destroyed.  When loops are available loop_father
> is always non-NULL, the proper check is for loop_outer (->loop_father) == 
> NULL.
> or loop_depth (->loop_father) != 0.

Thanks Steven and Richard for the comments. If the loop information is
present, we could have used this. But even otherwise, we are just
limiting the cprop of an expensive constant (based on the rtx_cost). I
understand that Richard was a bit concerned about extending the live
range but this is a trade off. As per his previous mail, Zhenqiang did
some benchmarking. I am happy to do further benchmarking if you want to
see that.

Probably the rematerialization that is being introduced in IRA/LRA can
redo this if it sees there is high register pressure. Any thoughts?

Thanks,
Kugan


[PATCH] remove need for store_values_directly

2015-04-16 Thread tbsaunde+gcc
From: Trevor Saunders 

Hi,

Last stage 1 I introduced a second form of hash_table that stored elements of
value_type in addition to the old form that stored elements of type value_type
*.  That lead to a fair bit of code dupplication in hash_table, but it
simplified the transition by allowing it to take place one hash table at a
time.  Now I'm switching the rest of the hash_table users to use the new setup,
and removing supporot for the old one.

this was bootstrapped and regtested on x86_64-unknown-linux-gnu, and I ran make
all-gcc for the following crosses to check the hash tables they use were
correctly converted
arm-linux-androideabi
i686-apple-darwin
i686-solaris2.11
i686-w64-mingw32
ia64-linux
mips64-linux
nvptx-elf
ppc64-linux

Is this ok?

Trev

gcc/

* hash-table.h: Remove version of hash_table that stored value_type *.
* asan.c, attribs.c, bitmap.c, cfg.c, cgraph.h, config/arm/arm.c,
config/i386/winnt.c, config/ia64/ia64.c, config/mips/mips.c,
config/sol2.c, coverage.c, cselib.c, dse.c, dwarf2cfi.c,
dwarf2out.c, except.c, gcse.c, genmatch.c, ggc-common.c,
gimple-ssa-strength-reduction.c, gimplify.c, haifa-sched.c,
hard-reg-set.h, hash-map.h, hash-set.h, ipa-devirt.c, ipa-icf.h,
ipa-profile.c, ira-color.c, ira-costs.c, loop-invariant.c,
loop-iv.c, loop-unroll.c, lto-streamer.h, plugin.c, postreload-gcse.c,
reginfo.c, statistics.c, store-motion.c, trans-mem.c, tree-cfg.c,
tree-eh.c, tree-hasher.h, tree-into-ssa.c, tree-parloops.c,
tree-sra.c, tree-ssa-coalesce.c, tree-ssa-dom.c, tree-ssa-live.c,
tree-ssa-loop-im.c, tree-ssa-loop-ivopts.c, tree-ssa-phiopt.c,
tree-ssa-pre.c, tree-ssa-reassoc.c, tree-ssa-sccvn.c,
tree-ssa-structalias.c, tree-ssa-tail-merge.c,
tree-ssa-threadupdate.c, tree-vectorizer.c, tree-vectorizer.h,
valtrack.h, var-tracking.c, vtable-verify.c, vtable-verify.h: Adjust.


libcc1/

* plugin.cc: Adjust for hash_table changes.

java/

* jcf-io.c: Adjust for hash_table changes.

lto/

* lto.c: Adjust for hash_table changes.

objc/

* objc-act.c: Adjust for hash_table changes.

diff --git a/gcc/asan.c b/gcc/asan.c
index 9e4a629..7b70ee2 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -407,11 +407,11 @@ asan_mem_ref_get_end (const asan_mem_ref *ref, tree len)
 struct asan_mem_ref_hasher
   : typed_noop_remove 
 {
-  typedef asan_mem_ref value_type;
-  typedef asan_mem_ref compare_type;
+  typedef asan_mem_ref *value_type;
+  typedef asan_mem_ref *compare_type;
 
-  static inline hashval_t hash (const value_type *);
-  static inline bool equal (const value_type *, const compare_type *);
+  static inline hashval_t hash (const asan_mem_ref *);
+  static inline bool equal (const asan_mem_ref *, const asan_mem_ref *);
 };
 
 /* Hash a memory reference.  */
diff --git a/gcc/attribs.c b/gcc/attribs.c
index c18bff2..7b7e2a9 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -67,21 +67,21 @@ substring_hash (const char *str, int l)
 
 struct attribute_hasher : typed_noop_remove 
 {
-  typedef attribute_spec value_type;
-  typedef substring compare_type;
-  static inline hashval_t hash (const value_type *);
-  static inline bool equal (const value_type *, const compare_type *);
+  typedef attribute_spec *value_type;
+  typedef substring *compare_type;
+  static inline hashval_t hash (const attribute_spec *);
+  static inline bool equal (const attribute_spec *, const substring *);
 };
 
 inline hashval_t
-attribute_hasher::hash (const value_type *spec)
+attribute_hasher::hash (const attribute_spec *spec)
 {
   const int l = strlen (spec->name);
   return substring_hash (spec->name, l);
 }
 
 inline bool
-attribute_hasher::equal (const value_type *spec, const compare_type *str)
+attribute_hasher::equal (const attribute_spec *spec, const substring *str)
 {
   return (strncmp (spec->name, str->str, str->length) == 0
  && !spec->name[str->length]);
diff --git a/gcc/bitmap.c b/gcc/bitmap.c
index d43a39f..71d5b11 100644
--- a/gcc/bitmap.c
+++ b/gcc/bitmap.c
@@ -61,20 +61,20 @@ struct loc
 
 struct bitmap_desc_hasher : typed_noop_remove 
 {
-  typedef bitmap_descriptor_d value_type;
-  typedef loc compare_type;
-  static inline hashval_t hash (const value_type *);
-  static inline bool equal (const value_type *, const compare_type *);
+  typedef bitmap_descriptor_d *value_type;
+  typedef loc *compare_type;
+  static inline hashval_t hash (const bitmap_descriptor_d *);
+  static inline bool equal (const bitmap_descriptor_d *, const loc *);
 };
 
 inline hashval_t
-bitmap_desc_hasher::hash (const value_type *d)
+bitmap_desc_hasher::hash (const bitmap_descriptor_d *d)
 {
   return htab_hash_pointer (d->file) + d->line;
 }
 
 inline bool
-bitmap_desc_hasher::equal (const value_type *d, const compare_type *l)
+bitmap_desc_hasher::equal (const bitmap_descriptor_d *d, const loc *l)
 {
   return d->file == l->file && d->function == l->function 

Re: [PATCH] Fix inlining checks wrt optimize attribute

2015-04-16 Thread Christian Bruel
On 01/22/2015 11:33 AM, Richard Biener wrote:
> On Thu, 22 Jan 2015, Christian Bruel wrote:
> 
>> Hi Richard,
>>
>> I thought one of my current issue would be solved by this patch, but it is 
>> not
>> : I have some inlining failures with the attribute target on ARM. (e.g
>> inline-3.c) where obvious early inline fails with because we fail into the
>> last can_inline_edge_p case:
>>
>> opt_for_fn (callee->decl, optimize)
>> >= opt_for_fn (caller->decl, optimize)))
>>
>> when callee and caller are both -O2 and targetm.target_option.can_inline_p 
>> was
>> true, they should be inlined as in the general case (no
>> DECL_FUNCTION_SPECIFIC_OPTIMIZATION)
>>
>> I'm currently testing this additional change:
>>
>> Index: ipa-inline.c
>> ===
>> --- ipa-inline.c (revision 219989)
>> +++ ipa-inline.c (working copy)
>> @@ -489,7 +489,7 @@
>>else if (opt_for_fn (callee->decl, optimize_size)
>> < opt_for_fn (caller->decl, optimize_size)
>> || (opt_for_fn (callee->decl, optimize)
>> -   >= opt_for_fn (caller->decl, optimize)))
>> +   > opt_for_fn (caller->decl, optimize)))
>>  {
>>if (estimate_edge_time (e)
>>>= 20 + inline_edge_summary (e)->call_stmt_time)
>>
>> Since this is a hot topic for you, I though you would have useful comments on
>> this before I ask for a commit (when stage 4 close) ?
> 
> Yeah - the above looks like an obvious change to me.  Thus,
> approved if it passes bootstrap/regtest.
> 
> Thanks,
> Richard.
> 

thanks, sorry for the delay (stage1 blocked)

committed with the Changelog:

* ipa-inline.c (can_inline_edge_p): Allow inlining of functions with
same attributes.

This frees the road for

 - [PATCH, x86] [PR target/64835] Add
TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE hook
 https://gcc.gnu.org/ml/gcc-patches/2015-04/msg00594.html

 and

 - [PATCH, ARM] attribute target (thumb,arm) [0-6]
  https://gcc.gnu.org/ml/gcc-patches/2015-04/msg00706.html

with new regressions tests.





[PATCH] Perform more pattern matching during SSA propagation

2015-04-16 Thread Richard Biener

The following is an old patch that teaches the SSA propagator to mark
stmts it will never re-visit again and thus are safe for pattern
matching.

Re-bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-04-16  Richard Biener  

* tree-ssa-ccp.c (likely_value): See if we have operands that
are marked as never simulate again and return CONSTANT in this
case.
* tree-ssa-propagate.c (simulate_stmt): Mark stmts that do
not have any operands that will be simulated again as
not being simulated again.

* gcc.dg/tree-ssa/ssa-ccp-36.c: New testcase.
* gcc.dg/tree-ssa/pr37508.c: Adjust.
* gfortran.dg/reassoc_6.f: Remove XFAIL.

Index: gcc/tree-ssa-ccp.c
===
*** gcc/tree-ssa-ccp.c.orig 2015-01-30 13:20:25.794134935 +0100
--- gcc/tree-ssa-ccp.c  2015-02-02 15:48:47.810028364 +0100
*** static ccp_lattice_t
*** 645,650 
--- 645,651 
  likely_value (gimple stmt)
  {
bool has_constant_operand, has_undefined_operand, all_undefined_operands;
+   bool has_nsa_operand;
tree use;
ssa_op_iter iter;
unsigned i;
*** likely_value (gimple stmt)
*** 667,672 
--- 668,674 
has_constant_operand = false;
has_undefined_operand = false;
all_undefined_operands = true;
+   has_nsa_operand = false;
FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
  {
ccp_prop_value_t *val = get_value (use);
*** likely_value (gimple stmt)
*** 678,683 
--- 680,689 
  
if (val->lattice_val == CONSTANT)
has_constant_operand = true;
+ 
+   if (SSA_NAME_IS_DEFAULT_DEF (use)
+ || !prop_simulate_again_p (SSA_NAME_DEF_STMT (use)))
+   has_nsa_operand = true;
  }
  
/* There may be constants in regular rhs operands.  For calls we
*** likely_value (gimple stmt)
*** 750,757 
  
/* We do not consider virtual operands here -- load from read-only
   memory may have only VARYING virtual operands, but still be
!  constant.  */
if (has_constant_operand
|| gimple_references_memory_p (stmt))
  return CONSTANT;
  
--- 756,765 
  
/* We do not consider virtual operands here -- load from read-only
   memory may have only VARYING virtual operands, but still be
!  constant.  Also we can combine the stmt with definitions from
!  operands whose definitions are not simulated again.  */
if (has_constant_operand
+   || has_nsa_operand
|| gimple_references_memory_p (stmt))
  return CONSTANT;
  
Index: gcc/tree-ssa-propagate.c
===
*** gcc/tree-ssa-propagate.c.orig   2015-01-30 13:20:25.817135239 +0100
--- gcc/tree-ssa-propagate.c2015-02-02 15:48:47.825028438 +0100
*** simulate_stmt (gimple stmt)
*** 364,369 
--- 364,370 
  FOR_EACH_EDGE (e, ei, bb->succs)
add_control_edge (e);
}
+   return;
  }
else if (val == SSA_PROP_INTERESTING)
  {
*** simulate_stmt (gimple stmt)
*** 377,382 
--- 378,422 
if (taken_edge)
add_control_edge (taken_edge);
  }
+ 
+   /* If there are no SSA uses on the stmt whose defs are simulated
+  again then this stmt will be never visited again.  */
+   bool has_simulate_again_uses = false;
+   use_operand_p use_p;
+   ssa_op_iter iter;
+   if (gimple_code  (stmt) == GIMPLE_PHI)
+ {
+   edge_iterator ei;
+   edge e;
+   tree arg;
+   FOR_EACH_EDGE (e, ei, gimple_bb (stmt)->preds)
+   if (!(e->flags & EDGE_EXECUTABLE)
+   || ((arg = PHI_ARG_DEF_FROM_EDGE (stmt, e))
+   && TREE_CODE (arg) == SSA_NAME
+   && !SSA_NAME_IS_DEFAULT_DEF (arg)
+   && prop_simulate_again_p (SSA_NAME_DEF_STMT (arg
+ {
+   has_simulate_again_uses = true;
+   break;
+ }
+ }
+   else
+ FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE)
+   {
+   gimple def_stmt = SSA_NAME_DEF_STMT (USE_FROM_PTR (use_p));
+   if (!gimple_nop_p (def_stmt)
+   && prop_simulate_again_p (def_stmt))
+ {
+   has_simulate_again_uses = true;
+   break;
+ }
+   }
+   if (!has_simulate_again_uses)
+ {
+   if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "marking stmt to be not simulated again\n");
+   prop_set_simulate_again (stmt, false);
+ }
  }
  
  /* Process an SSA edge worklist.  WORKLIST is the SSA edge worklist to
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-36.c
===
*** /dev/null   1970-01-01 00:00:00.0 +
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-36.c  2015-02-02 15:48:47.840028513 
+0100
***
*** 0 
--- 1,15 
+ /* { dg-do compile } */
+ /* { dg-o

Re: unused comparison warning (bug 62182)

2015-04-16 Thread Marek Polacek
On Thu, Apr 16, 2015 at 12:51:33AM +0200, Arnaud Bienner wrote:
> Hi,
> 
> I've submitted a patch to bug 62182 [1], and I would like to have some
> feedback about it (this is still WIP as noted in the bug).
> As it is my first patch to gcc, I'm not sure what is the best way to
> discuss/review patches (here or in bugzilla).
> Anyway, please let me know your thoughts :)

Thanks for working on this.  Several comments:

- Do you have a copyright assignment on file?  (Not sure if it's needed for
  this particular patch.)
- We'll need testcases.  You should e.g. ensure that the warning
  works with -Wunused-comparison and doesn't show up with 
-Wno-unused-comparison,
  that casting to void suppresses the warning, etc.  You can look into
  gcc/testsuite/gcc.dg.
- New options need documenting in invoke.texi.  Only mentioning the new
  option is not enough.  See e.g. "@item -Wlogical-not-parentheses" paragraph
  for an example.
- As this is a C/C++ FE warning, please move it from common.opt to
  c-family/c.opt.
- Could you detail how this patch has been tested?
- Please adhere to the GNU coding style (though we can sort this out in later
  reviews).

Thanks,

Marek


RE: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible

2015-04-16 Thread Thomas Preud'homme
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Monday, April 13, 2015 8:48 PM
> 
> I know there were several followups between Steven and yourself.
> With
> stage1 now open, can you post a final version and do a final
> bootstrap/test with it?

Here is what came out of our discussion with Steven:

The RTL cprop pass in GCC operates by doing a local constant/copy propagation
first and then a global one. In the local one, if a constant cannot be 
propagated
(eg. due to constraints of the destination instruction) a copy propagation is
done instead. However, at the global level copy propagation is only tried if no
constant can be propagated, ie. if a constant can be propagated but the
constraints of the destination instruction forbids it, no copy propagation will 
be
tried. This patch fixes this issue.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2015-04-15  Thomas Preud'homme  
Steven Bosscher 

* cprop.c (cprop_reg_p): New.
(hash_scan_set): Use above function to check if register can be
propagated.
(find_avail_set): Return up to two sets, one whose source is
a register and one whose source is a constant.  Sets are returned in
an array passed as parameter rather than as a return value.
(cprop_insn): Use a do while loop rather than a goto.  Try each of the
sets returned by find_avail_set, starting with the one whose source is
a constant. Use cprop_reg_p to check if register can be propagated.
(do_local_cprop): Use cprop_reg_p to check if register can be
propagated.
(implicit_set_cond_p): Likewise.

*** gcc/testsuite/ChangeLog ***

2015-04-15  Thomas Preud'homme  
Steven Bosscher 

* gcc.target/arm/pr64616.c: New file.


And the patch is:


diff --git a/gcc/cprop.c b/gcc/cprop.c
index c9fb2fc..78541cf 100644
--- a/gcc/cprop.c
+++ b/gcc/cprop.c
@@ -285,6 +285,15 @@ cprop_constant_p (const_rtx x)
   return CONSTANT_P (x) && (GET_CODE (x) != CONST || shared_const_p (x));
 }
 
+/* Determine whether the rtx X should be treated as a register that can
+   be propagated.  Any pseudo-register is fine.  */
+
+static bool
+cprop_reg_p (const_rtx x)
+{
+  return REG_P (x) && !HARD_REGISTER_P (x);
+}
+
 /* Scan SET present in INSN and add an entry to the hash TABLE.
IMPLICIT is true if it's an implicit set, false otherwise.  */
 
@@ -295,8 +304,7 @@ hash_scan_set (rtx set, rtx_insn *insn, struct hash_table_d 
*table,
   rtx src = SET_SRC (set);
   rtx dest = SET_DEST (set);
 
-  if (REG_P (dest)
-  && ! HARD_REGISTER_P (dest)
+  if (cprop_reg_p (dest)
   && reg_available_p (dest, insn)
   && can_copy_p (GET_MODE (dest)))
 {
@@ -321,9 +329,8 @@ hash_scan_set (rtx set, rtx_insn *insn, struct hash_table_d 
*table,
src = XEXP (note, 0), set = gen_rtx_SET (VOIDmode, dest, src);
 
   /* Record sets for constant/copy propagation.  */
-  if ((REG_P (src)
+  if ((cprop_reg_p (src)
   && src != dest
-  && ! HARD_REGISTER_P (src)
   && reg_available_p (src, insn))
  || cprop_constant_p (src))
insert_set_in_table (dest, src, insn, table, implicit);
@@ -821,15 +828,15 @@ try_replace_reg (rtx from, rtx to, rtx_insn *insn)
   return success;
 }
 
-/* Find a set of REGNOs that are available on entry to INSN's block.  Return
-   NULL no such set is found.  */
+/* Find a set of REGNOs that are available on entry to INSN's block.  If found,
+   SET_RET[0] will be assigned a set with a register source and SET_RET[1] a
+   set with a constant source.  If not found the corresponding entry is set to
+   NULL.  */
 
-static struct cprop_expr *
-find_avail_set (int regno, rtx_insn *insn)
+static void
+find_avail_set (int regno, rtx_insn *insn, struct cprop_expr *set_ret[2])
 {
-  /* SET1 contains the last set found that can be returned to the caller for
- use in a substitution.  */
-  struct cprop_expr *set1 = 0;
+  set_ret[0] = set_ret[1] = NULL;
 
   /* Loops are not possible here.  To get a loop we would need two sets
  available at the start of the block containing INSN.  i.e. we would
@@ -869,8 +876,10 @@ find_avail_set (int regno, rtx_insn *insn)
  If the source operand changed, we may still use it for the next
  iteration of this loop, but we may not use it for substitutions.  */
 
-  if (cprop_constant_p (src) || reg_not_set_p (src, insn))
-   set1 = set;
+  if (cprop_constant_p (src))
+   set_ret[1] = set;
+  else if (reg_not_set_p (src, insn))
+   set_ret[0] = set;
 
   /* If the source of the set is anything except a register, then
 we have reached the end of the copy chain.  */
@@ -881,10 +890,6 @@ find_avail_set (int regno, rtx_insn *insn)
 and see if we have an available copy into SRC.  */
   regno = REGNO (src);
 }
-
-  /* SET1 holds the last set that was available and anticipatable at
- INSN.  */
-  return set

[PING^2] [PATCH][5a/5] Postpone expanding va_arg until pass_stdarg

2015-04-16 Thread Tom de Vries

[stage1 ping^2]
On 10-03-15 16:30, Tom de Vries wrote:

[stage1 ping]
On 22-02-15 14:13, Tom de Vries wrote:

On 19-02-15 14:03, Richard Biener wrote:

On Thu, 19 Feb 2015, Tom de Vries wrote:


On 19-02-15 11:29, Tom de Vries wrote:

Hi,

I'm posting this patch series for stage1:
- 0001-Disable-lang_hooks.gimplify_expr-in-free_lang_data.patch
- 0002-Add-gimple_find_sub_bbs.patch
- 0003-Factor-optimize_va_list_gpr_fpr_size-out-of-pass_std.patch
- 0004-Handle-internal_fn-in-operand_equal_p.patch
- 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch

The patch series - based on Michael's initial patch - postpones expanding
va_arg
until pass_stdarg, which makes pass_stdarg more robust.

Bootstrapped and reg-tested on x86_64 using all languages, with unix/ and
unix/-m32 testing.

I'll post the patches in reply to this email.



This patch postpones expanding va_arg until pass_stdarg.

We add a new internal function IFN_VA_ARG. During gimplification, we map
VA_ARG_EXPR onto a CALL_EXPR with IFN_VA_ARG, which is then gimplified in to a
gimple_call. At pass_stdarg, we expand the IFN_VA_ARG gimple_call into actual
code.

There are a few implementation details worth mentioning:
- passing the type beyond gimplification is done by adding a NULL pointer-
   to-type to IFN_VA_ARG.
- there is special handling for IFN_VA_ARG that would be most suited to be
   placed in gimplify_va_arg_expr. However, that function lacks the scope for
   the special handling, so it's placed awkwardly in gimplify_modify_expr.
- there's special handling in case the va_arg type is variable-sized.
   gimplify_modify_expr adds a WITH_SIZE_EXPR to the CALL_EXPR IFN_VA_ARG for
   variable-sized types. However, this is gimplified into a gimple_call which
   does not have the possibility to wrap it's result in a WITH_SIZE_EXPR. So
   we're adding the size argument of the WITH_SIZE_EXPR as argument to
   IFN_VA_ARG, and at expansion in pass_stdarg, wrap the result of the
   gimplification of IFN_VA_ARG in a WITH_SIZE_EXPR, such that the subsequent
   gimplify_assign will generate a memcpy if necessary.
- when gimplifying the va_arg argument ap, it may not be addressable. So
   gimplification will generate a copy ap.1 = ap, and use &ap.1 as argument.
   This means that we have to copy back the ap.1 value to ap after IFN_VA_ARG.
   The copy is classified by the va_list_gpr/fpr_size optimization as an
   escape,  so it inhibits optimization. The tree-ssa/stdarg-2.c f15 update is
   because of that.

OK for stage1?


Looks mostly good, though it looks like with -O0 this doesn't delay
lowering of va-arg and thus won't "fix" offloading.  Can you instead
introduce a PROP_gimple_lva, provide it by the stdarg pass and add
a pass_lower_vaarg somewhere where pass_lower_complex_O0 is run
that runs of !PROP_gimple_lva (and also provides it), and require
PROP_gimple_lva by pass_expand?  (just look for PROP_gimple_lcx for
the complex stuff to get an idea what needs to be touched)



Updated according to comments.

Furthermore (having updated the patch series to recent trunk), I'm dropping the
ACCEL_COMPILER bit in pass_stdarg::gate. AFAIU the comment there relates to this
patch.

Retested as before.

OK for stage1?



Ping.


Ping again.

Patch originally posted at: 
https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01332.html .


Thanks,
- Tom


Btw, I'm wondering if as run-time optimization we can tentatively set
PROP_gimple_lva at the start of the gimple pass, and unset it in
gimplify_va_arg_expr. That way we would avoid the loop in expand_ifn_va_arg_1
(over all bbs and gimples) in functions without va_arg.



Taken care of in follow-up patch 5b.




[PING^2] [PATCH][5b/5] Avoid running expand_ifn_va_arg_1 in functions without va_arg

2015-04-16 Thread Tom de Vries

[stage1 ping^2]
On 10-03-15 16:31, Tom de Vries wrote:

[stage1 ping]
[was: Postpone expanding va_arg until pass_stdarg]
On 24-02-15 07:48, Tom de Vries wrote:

On 23-02-15 11:09, Tom de Vries wrote:

On 23-02-15 09:26, Michael Matz wrote:

Hi,

On Sun, 22 Feb 2015, Tom de Vries wrote:


Btw, I'm wondering if as run-time optimization we can tentatively set
PROP_gimple_lva at the start of the gimple pass, and unset it in
gimplify_va_arg_expr. That way we would avoid the loop in
expand_ifn_va_arg_1 (over all bbs and gimples) in functions without
va_arg.


That makes sense.



I'll put this follow-up patch through testing.


Bootstrapped and reg-tested as before together with the rest of the patch 
series.

OK for stage1?



Ping.



Ping again.

Patch originally posted at: 
https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01402.html .


Thanks,
- Tom




Re: [PING^2] [PATCH][5a/5] Postpone expanding va_arg until pass_stdarg

2015-04-16 Thread Richard Biener
On Thu, 16 Apr 2015, Tom de Vries wrote:

> [stage1 ping^2]
> On 10-03-15 16:30, Tom de Vries wrote:
> > [stage1 ping]
> > On 22-02-15 14:13, Tom de Vries wrote:
> > > On 19-02-15 14:03, Richard Biener wrote:
> > > > On Thu, 19 Feb 2015, Tom de Vries wrote:
> > > > 
> > > > > On 19-02-15 11:29, Tom de Vries wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > I'm posting this patch series for stage1:
> > > > > > - 0001-Disable-lang_hooks.gimplify_expr-in-free_lang_data.patch
> > > > > > - 0002-Add-gimple_find_sub_bbs.patch
> > > > > > - 0003-Factor-optimize_va_list_gpr_fpr_size-out-of-pass_std.patch
> > > > > > - 0004-Handle-internal_fn-in-operand_equal_p.patch
> > > > > > - 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch
> > > > > > 
> > > > > > The patch series - based on Michael's initial patch - postpones
> > > > > > expanding
> > > > > > va_arg
> > > > > > until pass_stdarg, which makes pass_stdarg more robust.
> > > > > > 
> > > > > > Bootstrapped and reg-tested on x86_64 using all languages, with
> > > > > > unix/ and
> > > > > > unix/-m32 testing.
> > > > > > 
> > > > > > I'll post the patches in reply to this email.
> > > > > > 
> > > > > 
> > > > > This patch postpones expanding va_arg until pass_stdarg.
> > > > > 
> > > > > We add a new internal function IFN_VA_ARG. During gimplification, we
> > > > > map
> > > > > VA_ARG_EXPR onto a CALL_EXPR with IFN_VA_ARG, which is then gimplified
> > > > > in to a
> > > > > gimple_call. At pass_stdarg, we expand the IFN_VA_ARG gimple_call into
> > > > > actual
> > > > > code.
> > > > > 
> > > > > There are a few implementation details worth mentioning:
> > > > > - passing the type beyond gimplification is done by adding a NULL
> > > > > pointer-
> > > > >to-type to IFN_VA_ARG.
> > > > > - there is special handling for IFN_VA_ARG that would be most suited
> > > > > to be
> > > > >placed in gimplify_va_arg_expr. However, that function lacks the
> > > > > scope for
> > > > >the special handling, so it's placed awkwardly in
> > > > > gimplify_modify_expr.
> > > > > - there's special handling in case the va_arg type is variable-sized.
> > > > >gimplify_modify_expr adds a WITH_SIZE_EXPR to the CALL_EXPR
> > > > > IFN_VA_ARG for
> > > > >variable-sized types. However, this is gimplified into a
> > > > > gimple_call which
> > > > >does not have the possibility to wrap it's result in a
> > > > > WITH_SIZE_EXPR. So
> > > > >we're adding the size argument of the WITH_SIZE_EXPR as argument to
> > > > >IFN_VA_ARG, and at expansion in pass_stdarg, wrap the result of the
> > > > >gimplification of IFN_VA_ARG in a WITH_SIZE_EXPR, such that the
> > > > > subsequent
> > > > >gimplify_assign will generate a memcpy if necessary.
> > > > > - when gimplifying the va_arg argument ap, it may not be addressable.
> > > > > So
> > > > >gimplification will generate a copy ap.1 = ap, and use &ap.1 as
> > > > > argument.
> > > > >This means that we have to copy back the ap.1 value to ap after
> > > > > IFN_VA_ARG.
> > > > >The copy is classified by the va_list_gpr/fpr_size optimization as
> > > > > an
> > > > >escape,  so it inhibits optimization. The tree-ssa/stdarg-2.c f15
> > > > > update is
> > > > >because of that.
> > > > > 
> > > > > OK for stage1?
> > > > 
> > > > Looks mostly good, though it looks like with -O0 this doesn't delay
> > > > lowering of va-arg and thus won't "fix" offloading.  Can you instead
> > > > introduce a PROP_gimple_lva, provide it by the stdarg pass and add
> > > > a pass_lower_vaarg somewhere where pass_lower_complex_O0 is run
> > > > that runs of !PROP_gimple_lva (and also provides it), and require
> > > > PROP_gimple_lva by pass_expand?  (just look for PROP_gimple_lcx for
> > > > the complex stuff to get an idea what needs to be touched)
> > > > 
> > > 
> > > Updated according to comments.
> > > 
> > > Furthermore (having updated the patch series to recent trunk), I'm
> > > dropping the
> > > ACCEL_COMPILER bit in pass_stdarg::gate. AFAIU the comment there relates
> > > to this
> > > patch.
> > > 
> > > Retested as before.
> > > 
> > > OK for stage1?
> > > 
> > 
> > Ping.
> 
> Ping again.
> 
> Patch originally posted at:
> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01332.html .

Ok.

Thanks,
Richard.

> Thanks,
> - Tom
> 
> > > Btw, I'm wondering if as run-time optimization we can tentatively set
> > > PROP_gimple_lva at the start of the gimple pass, and unset it in
> > > gimplify_va_arg_expr. That way we would avoid the loop in
> > > expand_ifn_va_arg_1
> > > (over all bbs and gimples) in functions without va_arg.
> > > 
> > 
> > Taken care of in follow-up patch 5b.


Re: [PING^2] [PATCH][5b/5] Avoid running expand_ifn_va_arg_1 in functions without va_arg

2015-04-16 Thread Richard Biener
On Thu, 16 Apr 2015, Tom de Vries wrote:

> [stage1 ping^2]
> On 10-03-15 16:31, Tom de Vries wrote:
> > [stage1 ping]
> > [was: Postpone expanding va_arg until pass_stdarg]
> > On 24-02-15 07:48, Tom de Vries wrote:
> > > On 23-02-15 11:09, Tom de Vries wrote:
> > > > On 23-02-15 09:26, Michael Matz wrote:
> > > > > Hi,
> > > > > 
> > > > > On Sun, 22 Feb 2015, Tom de Vries wrote:
> > > > > 
> > > > > > Btw, I'm wondering if as run-time optimization we can tentatively
> > > > > > set
> > > > > > PROP_gimple_lva at the start of the gimple pass, and unset it in
> > > > > > gimplify_va_arg_expr. That way we would avoid the loop in
> > > > > > expand_ifn_va_arg_1 (over all bbs and gimples) in functions without
> > > > > > va_arg.
> > > > > 
> > > > > That makes sense.
> > > > > 
> > > > 
> > > > I'll put this follow-up patch through testing.
> > > 
> > > Bootstrapped and reg-tested as before together with the rest of the patch
> > > series.
> > > 
> > > OK for stage1?
> > > 
> > 
> > Ping.
> > 
> 
> Ping again.
> 
> Patch originally posted at:
> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01402.html .

Ok.

Thanks,
Richard.

> Thanks,
> - Tom
> 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [patch,avr] Fix PR 65657 - read from __memx address space tramples arguments to function call

2015-04-16 Thread Georg-Johann Lay

Am 04/16/2015 um 08:43 AM schrieb Senthil Kumar Selvaraj:

This patch fixes PR 65657.


The following artifact appears to be PR63633.


When cfgexpand.c expands a function call, it first figures out the
registers in which the arguments will go, followed by expansion of the
arguments themselves (right to left). It later emits mov insns to set
the precomputed registers with the expanded RTXes.

If one of the arguments is a __memx char pointer dereference, the mov
eventually expands to gen_xload_A (for certain cases), which
clobbers R22, R21 and Z.  This causes problems if one of those
clobbered registers was precomputed to hold another argument.

In general, call expansion does not appear to take clobbers into account -
when it checks for argument overlap, the RTX (args[i].value) is only a MEM
in QImode for the memx deref - the clobber shows up when it eventually
calls emit_move_insn, at which point, it is too late.

This does not happen for a int pointer dereference - turns out that
precompute_register_parameters does a copy_to_mode_reg if the
cost of args[i].value is more than COSTS_N_INSNS(1) i.e., it creates a
pseudo and later assigns the pseudo to the arg register. This is done
before any moves to arg registers is done, so other arguments are not
overwritten.

Doing the same thing - providing a better cost estimate for a MEM rtx in
the non-default address space, makes this problem go away, and that is
what this patch does. Regression testing does not show any new failures.

The fix does seem tangential to the core problem - not sure if there is
a better way to fix this?

If not, could someone commit this please? I don't have commit access.

Regards
Senthil

gcc/ChangeLog

2015-04-16  Senthil Kumar Selvaraj  

* config/avr/avr.c (avr_rtx_costs_1): Increase cost for
MEM rtx in non-default address space.

gcc/testsuite/ChangeLog

2015-04-16  Senthil Kumar Selvaraj  

* gcc.target/avr/pr65657.c: New.

diff --git gcc/config/avr/avr.c gcc/config/avr/avr.c
index 68d5ddc..c9b8678 100644
--- gcc/config/avr/avr.c
+++ gcc/config/avr/avr.c
@@ -9959,7 +9959,11 @@ avr_rtx_costs_1 (rtx x, int codearg, int outer_code 
ATTRIBUTE_UNUSED,
return true;

  case MEM:
-  *total = COSTS_N_INSNS (GET_MODE_SIZE (mode));
+  /* MEM rtx with non-default address space is more
+ expensive. Not expressing that results in reg
+ clobber during expand (PR 65657). */
+  *total = COSTS_N_INSNS (GET_MODE_SIZE (mode)
+  + (MEM_ADDR_SPACE(x) == ADDR_SPACE_RAM ? 0 : 5));
return true;


IMO costs should not be used to fix wrong code or ICEs.  Presumably the fix is 
similar to the one used by PR63633.


Johann


  case NEG:
diff --git gcc/testsuite/gcc.target/avr/pr65657.c 
gcc/testsuite/gcc.target/avr/pr65657.c
new file mode 100644
index 000..d908fe3
--- /dev/null
+++ gcc/testsuite/gcc.target/avr/pr65657.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+/* { dg-options "-Os" } */
+
+/* When cfgexpand expands the call to foo, it
+   assigns registers R22:R23 to 0xABCD and R24
+   to *x. Because x is a __memx address space
+   pointer, the dereference results in an RTL
+   insn that clobbers R22 among other
+   registers (xload_A).
+
+   Call expansion does not take this into account
+   and ends up clobbering R22 set previously to
+   hold part of the second argument.
+*/
+
+#include 
+
+void __attribute__((noinline))
+__attribute__((noclone))
+foo (char c, volatile unsigned int d)
+{
+if (d != 0xABCD)
+  abort();
+if (c != 'A')
+abort();
+}
+
+void __attribute__((noinline))
+__attribute__((noclone))
+readx(const char __memx *x)
+{
+foo (*x, 0xABCD);
+}
+
+const char __memx arr[] = { 'A' };
+int main()
+{
+   readx (arr);
+}