[PATCH] Propagate -fdiagnostics-* options in lto-wrapper

2015-09-06 Thread Chung-Lin Tang
Hi,
Currently, most non-target-specific options are dropped when crossing the
LTO/offload processing boundary. However, since many target backends still
issue quite a number of warnings, it makes sense to save and propagate the
associated diagnostic options, to keep warning behavior consistent.

For example, currently:
$ x86_64-pc-linux-gnu-gcc -fopenacc test.c -fno-diagnostics-show-caret
y.c: In function 'main._omp_fn.0':
y.c:6:11: warning: using num_workers (32), ignoring 500
   #pragma acc parallel num_workers(500)
   ^
(note: this warning message is triggered by nvptx code currently only
on gomp-4_0-branch, but illustrates the point)

The caret still shows, because -fno-diagnostics-show-caret does not
reach the accel compiler. -flto should have a similar issue.

The attached patch allows a series of -fdiagnostics-* options to be
propagated by lto-wrapper.  I've tested this patch without regressions.
Is this okay for trunk?
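
As a rough illustration of the idea (not the patch itself), forwarding can be
seen as a whitelist check on the option index. The enum values and the
function name forward_option_p below are hypothetical stand-ins; the real
OPT_* indices are generated from GCC's .opt files, and the real logic lives in
merge_and_complain and append_compiler_options.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's generated OPT_* option indices.  */
enum opt_index
{
  OPT_fdiagnostics_show_caret,
  OPT_fdiagnostics_show_option,
  OPT_fdiagnostics_show_location_,
  OPT_fshow_column,
  OPT_fPIC,
  OPT_Wall,
  OPT_LAST
};

/* Return true if OPT should be forwarded across the LTO/offload
   boundary.  Everything not listed is dropped, mirroring the
   "skip unless whitelisted" shape of the switch in the patch.  */
static bool
forward_option_p (enum opt_index opt)
{
  assert (opt < OPT_LAST);
  switch (opt)
    {
    case OPT_fdiagnostics_show_caret:
    case OPT_fdiagnostics_show_option:
    case OPT_fdiagnostics_show_location_:
    case OPT_fshow_column:
    case OPT_fPIC:
      return true;
    default:
      return false;
    }
}
```

In this sketch a diagnostic option such as OPT_fdiagnostics_show_caret is
forwarded, while an unlisted one such as OPT_Wall is not.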

Thanks,
Chung-Lin

2015-09-06  Chung-Lin Tang  

* lto-wrapper.c (merge_and_complain): Add OPT_fdiagnostics_show_caret,
OPT_fdiagnostics_show_option, OPT_fdiagnostics_show_location_, and
OPT_fshow_column to handled saved option cases.
(append_compiler_options): Do not skip the above added options.
Index: lto-wrapper.c
===
--- lto-wrapper.c	(revision 227508)
+++ lto-wrapper.c	(working copy)
@@ -232,6 +232,10 @@ merge_and_complain (struct cl_decoded_option **dec
 	break;
 
 	  /* Fallthru.  */
+	case OPT_fdiagnostics_show_caret:
+	case OPT_fdiagnostics_show_option:
+	case OPT_fdiagnostics_show_location_:
+	case OPT_fshow_column:
 	case OPT_fPIC:
 	case OPT_fpic:
 	case OPT_fPIE:
@@ -479,6 +483,10 @@ append_compiler_options (obstack *argv_obstack, st
 	 on any CL_TARGET flag and a few selected others.  */
   switch (option->opt_index)
 	{
+	case OPT_fdiagnostics_show_caret:
+	case OPT_fdiagnostics_show_option:
+	case OPT_fdiagnostics_show_location_:
+	case OPT_fshow_column:
 	case OPT_fPIC:
 	case OPT_fpic:
 	case OPT_fPIE:


[Patch, fortran] PR40054 and PR63921 - Implement pointer function assignment - redux

2015-09-06 Thread Paul Richard Thomas
Dear All,

The attached patch more or less implements the assignment of
expressions to the result of a pointer function. To wit:

my_ptr_fcn (arg1, arg2...) = expr

arg1 would usually be the target, pointed to by the function. The
patch parses these statements and resolves them into:

temp_ptr => my_ptr_fcn (arg1, arg2...)
temp_ptr = expr

I say more or less implemented because I have ducked one of the
headaches here. At the end of the specification block, there is an
ambiguity between statement functions and pointer function
assignments. I do not even try to resolve this ambiguity and require
that there be at least one other type of executable statement before
these beasts. This can undoubtedly be fixed but the effort seems to me
to be unwarranted at the present time.

This version of the patch extends the coverage of allowed rvalues to
any legal expression. Also, all the problems with error.c have been
dealt with by Manuel's patch.

I am grateful to Dominique for reminding me of PR40054 and pointing
out PR63921. After a remark of his on #gfortran, I fixed the checking
of the standard to pick up all the offending lines with F2003 and
earlier.


Bootstraps and regtests on FC21/x86_64 - OK for trunk?

Cheers

Paul

2015-09-06  Paul Thomas  

PR fortran/40054
PR fortran/63921
* decl.c (get_proc_name): Return if statement function is
found.
* match.c (gfc_match_ptr_fcn_assign): New function.
* match.h : Add prototype for gfc_match_ptr_fcn_assign.
* parse.c : Add static flag 'in_specification_block'.
(decode_statement): If in specification block match a statement
function, otherwise if standard embraces F2008 try to match a
pointer function assignment.
(parse_interface): Set 'in_specification_block' on exiting from
parse_spec.
(parse_spec): Set and then reset 'in_specification_block'.
(gfc_parse_file): Set 'in_specification_block'.
* resolve.c (get_temp_from_expr): Extend to include other
expressions than variables and constants as rvalues.
(resolve_ptr_fcn_assign): New function.
(gfc_resolve_code): Call resolve_ptr_fcn_assign.
* symbol.c (gfc_add_procedure): Add a sentence to the error to
flag up the ambiguity between a statement function and pointer
function assignment at the end of the specification block.

2015-09-06  Paul Thomas  

PR fortran/40054
PR fortran/63921
* gfortran.dg/fmt_tab_1.f90: Change from run to compile and set
standard as legacy.
* gfortran.dg/ptr_func_assign_1.f08: New test.
* gfortran.dg/ptr_func_assign_2.f08: New test.


Re: [Patch, fortran] PR40054 and PR63921 - Implement pointer function assignment - redux

2015-09-06 Thread Paul Richard Thomas
It helps to attach the patch :-)

Paul

On 6 September 2015 at 13:42, Paul Richard Thomas
 wrote:
> Dear All,
>
> The attached patch more or less implements the assignment of
> expressions to the result of a pointer function. To wit:
>
> my_ptr_fcn (arg1, arg2...) = expr
>
> arg1 would usually be the target, pointed to by the function. The
> patch parses these statements and resolves them into:
>
> temp_ptr => my_ptr_fcn (arg1, arg2...)
> temp_ptr = expr
>
> I say more or less implemented because I have ducked one of the
> headaches here. At the end of the specification block, there is an
> ambiguity between statement functions and pointer function
> assignments. I do not even try to resolve this ambiguity and require
> that there be at least one other type of executable statement before
> these beasts. This can undoubtedly be fixed but the effort seems to me
> to be unwarranted at the present time.
>
> This version of the patch extends the coverage of allowed rvalues to
> any legal expression. Also, all the problems with error.c have been
> dealt with by Manuel's patch.
>
> I am grateful to Dominique for reminding me of PR40054 and pointing
> out PR63921. After a remark of his on #gfortran, I fixed the checking
> of the standard to pick up all the offending lines with F2003 and
> earlier.
>
>
> Bootstraps and regtests on FC21/x86_64 - OK for trunk?
>
> Cheers
>
> Paul
>
> 2015-09-06  Paul Thomas  
>
> PR fortran/40054
> PR fortran/63921
> * decl.c (get_proc_name): Return if statement function is
> found.
> * match.c (gfc_match_ptr_fcn_assign): New function.
> * match.h : Add prototype for gfc_match_ptr_fcn_assign.
> * parse.c : Add static flag 'in_specification_block'.
> (decode_statement): If in specification block match a statement
> function, otherwise if standard embraces F2008 try to match a
> pointer function assignment.
> (parse_interface): Set 'in_specification_block' on exiting from
> parse_spec.
> (parse_spec): Set and then reset 'in_specification_block'.
> (gfc_parse_file): Set 'in_specification_block'.
> * resolve.c (get_temp_from_expr): Extend to include other
> expressions than variables and constants as rvalues.
> (resolve_ptr_fcn_assign): New function.
> (gfc_resolve_code): Call resolve_ptr_fcn_assign.
> * symbol.c (gfc_add_procedure): Add a sentence to the error to
> flag up the ambiguity between a statement function and pointer
> function assignment at the end of the specification block.
>
> 2015-09-06  Paul Thomas  
>
> PR fortran/40054
> PR fortran/63921
> * gfortran.dg/fmt_tab_1.f90: Change from run to compile and set
> standard as legacy.
> * gfortran.dg/ptr_func_assign_1.f08: New test.
> * gfortran.dg/ptr_func_assign_2.f08: New test.



-- 
Outside of a dog, a book is a man's best friend. Inside of a dog it's
too dark to read.

Groucho Marx
Index: gcc/fortran/decl.c
===
*** gcc/fortran/decl.c  (revision 227508)
--- gcc/fortran/decl.c  (working copy)
*** get_proc_name (const char *name, gfc_sym
*** 901,906 
--- 901,908 
  return rc;
  
sym = *result;
+   if (sym->attr.proc == PROC_ST_FUNCTION)
+ return rc;
  
if (sym->attr.module_procedure
&& sym->attr.if_source == IFSRC_IFBODY)
Index: gcc/fortran/match.c
===
*** gcc/fortran/match.c (revision 227508)
--- gcc/fortran/match.c (working copy)
*** match
*** 4886,4892 
  gfc_match_st_function (void)
  {
gfc_error_buffer old_error;
- 
gfc_symbol *sym;
gfc_expr *expr;
match m;
--- 4886,4891 
*** gfc_match_st_function (void)
*** 4926,4931 
--- 4925,4990 
return MATCH_YES;
  
  undo_error:
+   gfc_pop_error (&old_error);
+   return MATCH_NO;
+ }
+ 
+ 
+ /* Match an assignment to a pointer function (F2008). This could, in
+general be ambiguous with a statement function. In this implementation
+it remains so if it is the first statement after the specification
+block.  */
+ 
+ match
+ gfc_match_ptr_fcn_assign (void)
+ {
+   gfc_error_buffer old_error;
+   locus old_loc;
+   gfc_symbol *sym;
+   gfc_expr *expr;
+   match m;
+   char name[GFC_MAX_SYMBOL_LEN + 1];
+ 
+   old_loc = gfc_current_locus;
+   m = gfc_match_name (name);
+   if (m != MATCH_YES)
+ return m;
+ 
+   gfc_find_symbol (name, NULL, 1, &sym);
+   if (sym && sym->attr.flavor != FL_PROCEDURE)
+ return MATCH_NO;
+ 
+   gfc_push_error (&old_error);
+ 
+   if (sym && sym->attr.function)
+ goto match_actual_arglist;
+ 
+   gfc_current_locus = old_loc;
+   m = gfc_match_symbol (&sym, 0);
+   if (m != MATCH_YES)
+ return m;
+ 
+   if (!gfc_add_procedure (&sym->attr, PROC_UNKNOWN, sym->name, NULL))
+ goto undo_error;
+ 
+ match_actual_arglist:
+   gfc_current_locus = old_loc;
+   m = gfc_match (" %e", &expr);
+   if (m != MA

[PATCH] Fix PR64078

2015-09-06 Thread Bernd Edlinger
Hi,

we observed sporadic failures of the following two test cases (see PR64078):
c-c++-common/ubsan/object-size-9.c and c-c++-common/ubsan/object-size-10.c

For object-size-9.c this happens reproducibly when the -fpic option is used.
With that option it is slightly less desirable to inline the functions, but
if an explicit "inline" is added, the function is still inlined, even if
-fpic is used.

It may also happen randomly when the sanitizer tries to dump memory around
an object that lies next to a non-accessible page. In this case the
sanitizer prints "", which is not what the test case expects. As a
workaround I added a large alignment attribute to make sure that the object
cannot be at a page boundary.


Boot-strapped and regression-tested x86_64-linux-gnu.
OK for trunk?


Thanks,
Bernd.
  2015-09-06  Bernd Edlinger  

PR testsuite/64078
* c-c++-common/ubsan/object-size-9.c (s): Add alignment attribute.
(f2, f3): Add inline attribute.
* c-c++-common/ubsan/object-size-10.c (a, b): Add alignment attribute.


patch-pr64078.diff
Description: Binary data


[0/7] Type promotion pass and elimination of zext/sext

2015-09-06 Thread Kugan

This is a new version of the patch posted in
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
more testing and split the patch to make it easier to review.
There are still a couple of issues to be addressed and I am working on them.

1. AARCH64 bootstrap now fails with the commit
94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
in stage2 and fwprop.c is failing. It looks to me that there is a latent
issue which gets exposed by my patch. I can also reproduce this on x86_64
if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
time being, I am using patch
0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
workaround. This needs to be fixed before the patches are ready to be
committed.

2. vector-compare-1.c from c-c++-common/torture fails to assemble with
-O3 -g (Error: unaligned opcodes detected in executable segment). It works
fine if I remove the -g. I am looking into it; this needs to be fixed as well.

In the meantime, I would appreciate it if you could take some time to review this.

I have bootstrapped on x86_64-linux-gnu, arm-linux-gnu and
aarch64-linux-gnu (with the workaround) and regression tested.

Thanks,
Kugan


[1/7] Add new tree code SEXT_EXPR

2015-09-06 Thread Kugan

This patch adds support for new tree code SEXT_EXPR.
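
For readers unfamiliar with the operation, here is a minimal standalone sketch
of sign extension from a given bit width. It assumes the second operand is a
width in bits (the fold-const.c hunk passes it to wi::sext); the function name
sext_from_bits is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low PREC bits of VALUE to 64 bits.  Bits above
   PREC in VALUE are ignored.  Hypothetical helper for illustration.  */
static int64_t
sext_from_bits (uint64_t value, unsigned int prec)
{
  assert (prec >= 1 && prec <= 64);
  if (prec == 64)
    return (int64_t) value;

  uint64_t sign = (uint64_t) 1 << (prec - 1);   /* the sign bit */
  uint64_t low = value & ((sign << 1) - 1);     /* keep the low PREC bits */

  /* Flipping the sign bit and subtracting it again maps values with
     the sign bit set to negative numbers and leaves the rest alone.  */
  return (int64_t) (low ^ sign) - (int64_t) sign;
}
```

For example, extending 0xFF from 8 bits gives -1, while extending 0x7F from
8 bits gives 127.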

gcc/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  

* cfgexpand.c (expand_debug_expr): Handle SEXT_EXPR.
* expr.c (expand_expr_real_2): Likewise.
* fold-const.c (int_const_binop_1): Likewise.
* tree-cfg.c (verify_gimple_assign_binary): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
(op_symbol_code): Likewise.
* tree.def: Define new tree code SEXT_EXPR.
From 9e9fd271b84580ae40ce21eb39f9be8072e6dd12 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Mon, 17 Aug 2015 13:37:15 +1000
Subject: [PATCH 1/8] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c |  4 
 gcc/expr.c  | 16 
 gcc/fold-const.c|  4 
 gcc/tree-cfg.c  | 12 
 gcc/tree-inline.c   |  1 +
 gcc/tree-pretty-print.c | 11 +++
 gcc/tree.def|  4 
 7 files changed, 52 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index d567a87..bbc3c10 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5071,6 +5071,10 @@ expand_debug_expr (tree exp)
 case FMA_EXPR:
   return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+case SEXT_EXPR:
+  return op0;
+
+
 default:
 flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index 1e820b4..bcd87c0 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9273,6 +9273,22 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
   target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
   return target;
 
+case SEXT_EXPR:
+	{
+	  rtx op0 = expand_normal (treeop0);
+	  rtx temp;
+	  if (!target)
+	target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
+
+	  machine_mode inner_mode
+	= smallest_mode_for_size (tree_to_shwi (treeop1),
+  MODE_INT);
+	  temp = convert_modes (inner_mode,
+TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
+	  convert_move (target, temp, 0);
+	  return target;
+	}
+
 default:
   gcc_unreachable ();
 }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index c826e67..473f930 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -984,6 +984,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
   res = wi::bit_and (arg1, arg2);
   break;
 
+case SEXT_EXPR:
+  res = wi::sext (arg1, arg2.to_uhwi ());
+  break;
+
 case RSHIFT_EXPR:
 case LSHIFT_EXPR:
   if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 5ac73b3..c9ad28d 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3756,6 +3756,18 @@ verify_gimple_assign_binary (gassign *stmt)
 return false;
   }
 
+case SEXT_EXPR:
+  {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	|| !INTEGRAL_TYPE_P (rhs1_type)
+	|| TREE_CODE (rhs2) != INTEGER_CST)
+	  {
+	error ("invalid operands in sext expr");
+	return true;
+	  }
+	return false;
+  }
+
 case VEC_WIDEN_LSHIFT_HI_EXPR:
 case VEC_WIDEN_LSHIFT_LO_EXPR:
   {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index e1ceea4..272c409 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3884,6 +3884,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
 case BIT_XOR_EXPR:
 case BIT_AND_EXPR:
 case BIT_NOT_EXPR:
+case SEXT_EXPR:
 
 case TRUTH_ANDIF_EXPR:
 case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 7cd1fe7..04f6777 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1794,6 +1794,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
   }
   break;
 
+case SEXT_EXPR:
+  pp_string (pp, "SEXT_EXPR <");
+  dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+  pp_string (pp, ", ");
+  dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+  pp_greater (pp);
+  break;
+
 case MODIFY_EXPR:
 case INIT_EXPR:
   dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3414,6 +3422,9 @@ op_symbol_code (enum tree_code code)
 case MIN_EXPR:
   return "min";
 
+case SEXT_EXPR:
+  return "sext from bit";
+
 default:
   return "<<< ??? >>>";
 }
diff --git a/gcc/tree.def b/gcc/tree.def
index 56580af..d614544 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -752,6 +752,10 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/*  Sign-extend operation.  It will sign extend first operand from
+ the sign bit specified by the second operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
value of the expression is determined from the first operand

[2/7] Add new type promotion pass

2015-09-06 Thread Kugan

This pass applies type promotion to SSA names in the function and
inserts appropriate truncations to preserve the semantics.  The idea of
this pass is to promote operations in such a way that we minimize the
generation of subregs in RTL, which in turn results in the removal of
redundant zero/sign extensions.
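
A source-level analogy may help (a sketch only; the pass itself works on
GIMPLE SSA names, not on C source). Both hypothetical functions below compute
the same 8-bit sum, but the second keeps the accumulator in a wider type and
truncates once at the end, which is the kind of form the pass aims for.

```c
#include <assert.h>
#include <stdint.h>

/* Narrow form: the accumulator is wrapped to 8 bits after every
   addition, which corresponds to an extension/truncation per step.  */
static uint8_t
sum_narrow (const uint8_t *v, unsigned int n)
{
  assert (v != 0 || n == 0);
  uint8_t acc = 0;
  for (unsigned int i = 0; i < n; i++)
    acc = (uint8_t) (acc + v[i]);
  return acc;
}

/* Promoted form: the accumulator lives in a 32-bit type and a single
   truncation at the end preserves the 8-bit semantics, because
   wrapping modulo 2^32 is compatible with wrapping modulo 2^8.  */
static uint8_t
sum_promoted (const uint8_t *v, unsigned int n)
{
  assert (v != 0 || n == 0);
  uint32_t acc = 0;
  for (unsigned int i = 0; i < n; i++)
    acc += v[i];
  return (uint8_t) acc;
}
```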

gcc/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  

* Makefile.in: Add gimple-ssa-type-promote.o.
* common.opt: New option -ftree-type-promote.
* doc/invoke.texi: Document -ftree-type-promote.
* gimple-ssa-type-promote.c: New file.
* passes.def: Define new pass_type_promote.
* timevar.def: Define new TV_TREE_TYPE_PROMOTE.
* tree-pass.h (make_pass_type_promote): New.
* tree-ssanames.c (set_range_info): Adjust range_info.
From c63cc2e1253a7d3544ba35a15dda2fde0d0380e4 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Mon, 17 Aug 2015 13:44:50 +1000
Subject: [PATCH 2/8] Add type promotion pass

---
 gcc/Makefile.in   |   1 +
 gcc/common.opt|   4 +
 gcc/doc/invoke.texi   |  10 +
 gcc/gimple-ssa-type-promote.c | 809 ++
 gcc/passes.def|   1 +
 gcc/timevar.def   |   1 +
 gcc/tree-pass.h   |   1 +
 gcc/tree-ssanames.c   |   3 +-
 8 files changed, 829 insertions(+), 1 deletion(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 3d1c1e5..2fb5174 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1494,6 +1494,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 94d1d88..b5a93b0 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2378,6 +2378,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c0ec0fd..7eeabcd 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8956,6 +8956,16 @@ enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  Idea of
+this pass is to promote operations such a way that we can minimise
+generation of subreg in RTL, that intern results in removal of
+redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 000..62b5fdc
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,809 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate tru

[3/7] Optimize ZEXT_EXPR with tree-vrp

2015-09-06 Thread Kugan
This patch adds tree-vrp handling and optimization for ZEXT_EXPR.
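
As background for the range computation, a value sign-extended from prec bits
always lies in [-2^(prec-1), 2^(prec-1) - 1]. The sketch below computes that
fallback range in plain C; the struct and function names are hypothetical, and
it uses a 64-bit shift so that widths above 31 do not overflow an int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical closed integer range [min, max].  */
struct int_range
{
  int64_t min;
  int64_t max;
};

/* Return the widest possible range of a value that has been
   sign-extended from PREC bits.  */
static struct int_range
sext_full_range (unsigned int prec)
{
  assert (prec >= 1 && prec <= 63);
  struct int_range r;
  r.max = ((int64_t) 1 << (prec - 1)) - 1;
  r.min = -r.max - 1;
  return r;
}
```

For prec = 8 this gives [-128, 127]. Tighter bounds are possible when the
known-zero and known-one bits of the operand determine the sign bit.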



gcc/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  

* tree-vrp.c (extract_range_from_binary_expr_1): Handle SEXT_EXPR.
(simplify_bit_ops_using_ranges): Likewise.
(simplify_stmt_using_ranges): Likewise.
From 7143e0575f309f70d838edf436b555fb93a6c4bb Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Mon, 17 Aug 2015 13:45:52 +1000
Subject: [PATCH 3/8] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/tree-vrp.c | 75 ++
 1 file changed, 75 insertions(+)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 21fbed0..d579b49 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2327,6 +2327,7 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
   && code != LSHIFT_EXPR
   && code != MIN_EXPR
   && code != MAX_EXPR
+  && code != SEXT_EXPR
   && code != BIT_AND_EXPR
   && code != BIT_IOR_EXPR
   && code != BIT_XOR_EXPR)
@@ -2887,6 +2888,55 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
   extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
   return;
 }
+  else if (code == SEXT_EXPR)
+{
+  gcc_assert (range_int_cst_p (&vr1));
+  unsigned int prec = tree_to_uhwi (vr1.min);
+  type = vr0.type;
+  wide_int tmin, tmax;
+  wide_int type_min, type_max;
+  wide_int may_be_nonzero, must_be_nonzero;
+
+  gcc_assert (!TYPE_UNSIGNED (expr_type));
+  type_min = wi::shwi (1 << (prec - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.min)));
+  type_max = wi::shwi (((1 << (prec - 1)) - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+  if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+ &may_be_nonzero,
+ &must_be_nonzero))
+	{
+	  HOST_WIDE_INT int_may_be_nonzero = may_be_nonzero.to_uhwi ();
+	  HOST_WIDE_INT int_must_be_nonzero = must_be_nonzero.to_uhwi ();
+
+	  if (int_must_be_nonzero & (1 << (prec - 1)))
+	{
+	  /* If to-be-extended sign bit is one.  */
+	  tmin = type_min;
+	  tmax = may_be_nonzero;
+	}
+	  else if ((int_may_be_nonzero & (1 << (prec - 1))) == 0)
+	{
+	  /* If to-be-extended sign bit is zero.  */
+	  tmin = must_be_nonzero;
+	  tmax = may_be_nonzero;
+	}
+	  else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+	}
+  else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+  tmin = wi::sext (tmin, prec - 1);
+  tmax = wi::sext (tmax, prec - 1);
+  min = wide_int_to_tree (expr_type, tmin);
+  max = wide_int_to_tree (expr_type, tmax);
+}
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
 {
@@ -9254,6 +9304,30 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple stmt)
 	  break;
 	}
   break;
+case SEXT_EXPR:
+	{
+	  gcc_assert (is_gimple_min_invariant (op1));
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int mask;
+	  HOST_WIDE_INT may_be_nonzero = may_be_nonzero0.to_uhwi ();
+	  HOST_WIDE_INT must_be_nonzero = must_be_nonzero0.to_uhwi ();
+	  mask = wi::shwi (((1 << (prec - 1)) - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+	  mask = wi::bit_not (mask);
+	  if (must_be_nonzero & (1 << (prec - 1)))
+	{
+	  /* If to-be-extended sign bit is one.  */
+	  if (wi::bit_and (must_be_nonzero0, mask) == mask)
+		op = op0;
+	}
+	  else if ((may_be_nonzero & (1 << (prec - 1))) == 0)
+	{
+	  /* If to-be-extended sign bit is zero.  */
+	  if (wi::bit_and (may_be_nonzero0, mask) == 0)
+		op = op0;
+	}
+	}
+  break;
 default:
   gcc_unreachable ();
 }
@@ -9955,6 +10029,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	 if all the bits being cleared are already cleared or
 	 all the bits being set are already set.  */
-- 
1.9.1



[4/7] Use correct promoted mode sign for result of GIMPLE_CALL

2015-09-06 Thread Kugan


For the following testcase (compiling with -O1; -O2 works fine), we have
a stmt with stmt code SSA_NAME (_7 = _6) for which _6 is defined by
a GIMPLE_CALL. In this case, we are using the wrong SUBREG promoted mode,
resulting in wrong code. Simple SSA_NAME copies are generally optimized
away, but when they are not, we can end up using the wrong promoted mode.
The attached patch fixes the case where there is one copy. I think it might
be better to do this in a while loop, but I don't think it can happen in
practice. Please let me know what you think.

  _6 = bar5 (-10);
  ...
  _7 = _6;
  _3 = (long unsigned int) _6;
  ...
  if (_3 != l5.0_4)


for
extern void abort (void);

__attribute__ ((noinline))
static unsigned short int foo5 (int x)
{
  return x;
}

__attribute__ ((noinline))
short int bar5 (int x)
{
  return foo5 (x + 6);
}

unsigned long l5 = (short int) -4;

int
main (void)
{
  if (bar5 (-10) != l5)
    abort ();
  return 0;
}

gcc/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  

* expr.c (expand_expr_real_1): Set proper SUBREG_PROMOTED_MODE for
SSA_NAME that was set by GIMPLE_CALL and assigned to another
SSA_NAME of same type.
From 64ac68bfda1d3e8487827512e6d163b384e8a1cf Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Wed, 2 Sep 2015 12:18:41 +1000
Subject: [PATCH 4/8] use correct promoted sign for result of GIMPLE_CALL

---
 gcc/expr.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index bcd87c0..6dac3cf 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9633,7 +9633,22 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	   gimple_call_fntype (g),
 	   2);
 	  else
-	pmode = promote_ssa_mode (ssa_name, &unsignedp);
+	{
+	  tree rhs;
+	  gimple stmt;
+	  if (code == SSA_NAME
+		  && is_gimple_assign (g)
+		  && (rhs = gimple_assign_rhs1 (g))
+		  && TREE_CODE (rhs) == SSA_NAME
+		  && (stmt = SSA_NAME_DEF_STMT (rhs))
+		  && gimple_code (stmt) == GIMPLE_CALL
+		  && !gimple_call_internal_p (stmt))
+		pmode = promote_function_mode (type, mode, &unsignedp,
+	   gimple_call_fntype (stmt),
+	   2);
+	  else
+		pmode = promote_ssa_mode (ssa_name, &unsignedp);
+	}
 	  gcc_assert (GET_MODE (decl_rtl) == pmode);
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
-- 
1.9.1



[5/7] Allow gimple debug stmt in widen mode

2015-09-06 Thread Kugan
Allow GIMPLE_DEBUG with values in promoted register.


gcc/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  

* expr.c (expand_expr_real_1): Set proper SUBREG_PROMOTED_MODE for
SSA_NAME that was set by GIMPLE_CALL and assigned to another
SSA_NAME of same type.
From a28de63bcbb9f315cee7e41be11b65b3ff521a91 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Tue, 1 Sep 2015 08:40:40 +1000
Subject: [PATCH 5/8] debug stmt in widen mode

---
 gcc/cfgexpand.c   | 11 ---
 gcc/gimple-ssa-type-promote.c |  7 ---
 gcc/rtl.h |  2 ++
 3 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index bbc3c10..036085a 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5240,7 +5240,6 @@ expand_debug_locations (void)
 	tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
 	rtx val;
 	rtx_insn *prev_insn, *insn2;
-	machine_mode mode;
 
 	if (value == NULL_TREE)
 	  val = NULL_RTX;
@@ -5275,16 +5274,6 @@ expand_debug_locations (void)
 
 	if (!val)
 	  val = gen_rtx_UNKNOWN_VAR_LOC ();
-	else
-	  {
-	mode = GET_MODE (INSN_VAR_LOCATION (insn));
-
-	gcc_assert (mode == GET_MODE (val)
-			|| (GET_MODE (val) == VOIDmode
-			&& (CONST_SCALAR_INT_P (val)
-|| GET_CODE (val) == CONST_FIXED
-|| GET_CODE (val) == LABEL_REF)));
-	  }
 
 	INSN_VAR_LOCATION_LOC (insn) = val;
 	prev_insn = PREV_INSN (insn);
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index 62b5fdc..6805b9c 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -570,13 +570,6 @@ fixup_uses (tree use, tree promoted_type, tree old_type)
   bool do_not_promote = false;
   switch (gimple_code (stmt))
 	{
-	case GIMPLE_DEBUG:
-	{
-	  gsi = gsi_for_stmt (stmt);
-	  gsi_remove (&gsi, true);
-	  break;
-	}
-
 	case GIMPLE_ASM:
 	case GIMPLE_CALL:
 	case GIMPLE_RETURN:
diff --git a/gcc/rtl.h b/gcc/rtl.h
index ac56133..c3cdf96 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -2100,6 +2100,8 @@ wi::int_traits ::decompose (HOST_WIDE_INT *,
 	   targets is 1 rather than -1.  */
 	gcc_checking_assert (INTVAL (x.first)
 			 == sext_hwi (INTVAL (x.first), precision)
+			 || INTVAL (x.first)
+			 == (INTVAL (x.first) & ((1 << precision) - 1))
 			 || (x.second == BImode && INTVAL (x.first) == 1));
 
   return wi::storage_ref (&INTVAL (x.first), 1, precision);
-- 
1.9.1



[5/7] Allow gimple debug stmt in widen mode

2015-09-06 Thread Kugan
Allow GIMPLE_DEBUG with values in promoted register.


gcc/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  

* expr.c (expand_expr_real_1): Set proper SUBREG_PROMOTED_MODE for
SSA_NAME that was set by GIMPLE_CALL and assigned to another
SSA_NAME of same type.
>From a28de63bcbb9f315cee7e41be11b65b3ff521a91 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Tue, 1 Sep 2015 08:40:40 +1000
Subject: [PATCH 5/8] debug stmt in widen mode

---
 gcc/cfgexpand.c   | 11 ---
 gcc/gimple-ssa-type-promote.c |  7 ---
 gcc/rtl.h |  2 ++
 3 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index bbc3c10..036085a 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5240,7 +5240,6 @@ expand_debug_locations (void)
 	tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
 	rtx val;
 	rtx_insn *prev_insn, *insn2;
-	machine_mode mode;
 
 	if (value == NULL_TREE)
 	  val = NULL_RTX;
@@ -5275,16 +5274,6 @@ expand_debug_locations (void)
 
 	if (!val)
 	  val = gen_rtx_UNKNOWN_VAR_LOC ();
-	else
-	  {
-	mode = GET_MODE (INSN_VAR_LOCATION (insn));
-
-	gcc_assert (mode == GET_MODE (val)
-			|| (GET_MODE (val) == VOIDmode
-			&& (CONST_SCALAR_INT_P (val)
-|| GET_CODE (val) == CONST_FIXED
-|| GET_CODE (val) == LABEL_REF)));
-	  }
 
 	INSN_VAR_LOCATION_LOC (insn) = val;
 	prev_insn = PREV_INSN (insn);
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index 62b5fdc..6805b9c 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -570,13 +570,6 @@ fixup_uses (tree use, tree promoted_type, tree old_type)
   bool do_not_promote = false;
   switch (gimple_code (stmt))
 	{
-	case GIMPLE_DEBUG:
-	{
-	  gsi = gsi_for_stmt (stmt);
-	  gsi_remove (&gsi, true);
-	  break;
-	}
-
 	case GIMPLE_ASM:
 	case GIMPLE_CALL:
 	case GIMPLE_RETURN:
diff --git a/gcc/rtl.h b/gcc/rtl.h
index ac56133..c3cdf96 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -2100,6 +2100,8 @@ wi::int_traits ::decompose (HOST_WIDE_INT *,
 	   targets is 1 rather than -1.  */
 	gcc_checking_assert (INTVAL (x.first)
 			 == sext_hwi (INTVAL (x.first), precision)
+			 || INTVAL (x.first)
+			 == (INTVAL (x.first) & ((1 << precision) - 1))
 			 || (x.second == BImode && INTVAL (x.first) == 1));
 
   return wi::storage_ref (&INTVAL (x.first), 1, precision);
-- 
1.9.1
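
For readers following the rtl.h hunk above: the relaxed check accepts a
constant that is either sign-extended or zero-extended from the given
precision. The following is only a rough standalone model of that
condition, not the GCC code itself. The helper names (sext_model,
zext_ok, accepted) are hypothetical, the BImode clause is left out, and
the mask is built in 64-bit arithmetic (the patch itself writes
1 << precision) so the sketch stays well defined for wide precisions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for sext_hwi: sign-extend the low PRECISION
   bits of V.  Assumes 1 <= precision <= 64.  */
static int64_t sext_model(int64_t v, unsigned precision)
{
  if (precision >= 64)
    return v;
  uint64_t mask = (UINT64_C(1) << precision) - 1;
  uint64_t low = (uint64_t) v & mask;
  uint64_t sign = UINT64_C(1) << (precision - 1);
  /* Flip and subtract the sign bit to propagate it upwards.  */
  return (int64_t) ((low ^ sign) - sign);
}

/* True if V has no bits set above the low PRECISION bits,
   i.e. it is already in zero-extended form.  */
static bool zext_ok(int64_t v, unsigned precision)
{
  if (precision >= 64)
    return true;
  uint64_t mask = (UINT64_C(1) << precision) - 1;
  return (uint64_t) v == ((uint64_t) v & mask);
}

/* Model of the relaxed assertion: either canonical form passes.  */
static bool accepted(int64_t v, unsigned precision)
{
  return v == sext_model(v, precision) || zext_ok(v, precision);
}
```

With an 8-bit precision, both -1 (sign-extended) and 255 (zero-extended)
pass in this model, while 256 is rejected under either form.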



[6/7] Temporary workaround to get aarch64 bootstrap

2015-09-06 Thread Kugan

An AArch64 bootstrap problem started happening with commit
94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
in one of the bootstrap stages, and because of this fwprop.c is failing.
It looks to me that there is a latent issue which gets exposed by my
patch. I can also reproduce this on x86_64 if I use the same PROMOTE_MODE
that is used in the aarch64 port. For the time being, I am using the patch
0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
workaround.
From 6a10c856374446ab6d18eb9ce840c08cac440a61 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Tue, 1 Sep 2015 08:44:59 +1000
Subject: [PATCH 6/8] temporary workaround for bootstrap failure due to copy
 coalescing

---
 gcc/tree-ssa-coalesce.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index 6468012..b18f0b8 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -1384,11 +1384,13 @@ gimple_can_coalesce_p (tree name1, tree name2)
 	 SSA_NAMEs.  Now, if a parm or result has BLKmode, do not
 	 coalesce its SSA versions with those of any other variables,
 	 because it may be passed by reference.  */
-  return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
+  return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)));
+#if 0
 	|| (/* The case var1 == var2 is already covered above.  */
 	!parm_in_stack_slot_p (var1)
 	&& !parm_in_stack_slot_p (var2)
 	&& promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
+#endif
 }
 
   /* If the types are not the same, check for a canonical type match.  This
-- 
1.9.1



[7/7] Adjust-arm-test cases

2015-09-06 Thread Kugan


gcc/testsuite/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  

* gcc.target/arm/mla-2.c: Scan for wider mode operation.
* gcc.target/arm/wmul-1.c: Likewise.
* gcc.target/arm/wmul-2.c: Likewise.
* gcc.target/arm/wmul-3.c: Likewise.
* gcc.target/arm/wmul-9.c: Likewise.
From 305c526b4019fc11260c474143f6829be2cc3f54 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Wed, 2 Sep 2015 12:21:46 +1000
Subject: [PATCH 7/8] adjust arm testcases

---
 gcc/testsuite/gcc.target/arm/mla-2.c  | 2 +-
 gcc/testsuite/gcc.target/arm/wmul-1.c | 2 +-
 gcc/testsuite/gcc.target/arm/wmul-2.c | 2 +-
 gcc/testsuite/gcc.target/arm/wmul-3.c | 2 +-
 gcc/testsuite/gcc.target/arm/wmul-9.c | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mla-2.c b/gcc/testsuite/gcc.target/arm/mla-2.c
index 1e3ca20..474bce0 100644
--- a/gcc/testsuite/gcc.target/arm/mla-2.c
+++ b/gcc/testsuite/gcc.target/arm/mla-2.c
@@ -7,4 +7,4 @@ long long foolong (long long x, short *a, short *b)
 return x + *a * *b;
 }
 
-/* { dg-final { scan-assembler "smlalbb" } } */
+/* { dg-final { scan-assembler "smla" } } */
diff --git a/gcc/testsuite/gcc.target/arm/wmul-1.c b/gcc/testsuite/gcc.target/arm/wmul-1.c
index d50..d4e7b41 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-1.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-1.c
@@ -16,4 +16,4 @@ int mac(const short *a, const short *b, int sqr, int *sum)
   return sqr;
 }
 
-/* { dg-final { scan-assembler-times "smlabb" 2 } } */
+/* { dg-final { scan-assembler-times "mla" 2 } } */
diff --git a/gcc/testsuite/gcc.target/arm/wmul-2.c b/gcc/testsuite/gcc.target/arm/wmul-2.c
index 2ea55f9..0e32674 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-2.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-2.c
@@ -10,4 +10,4 @@ void vec_mpy(int y[], const short x[], short scaler)
y[i] += ((scaler * x[i]) >> 31);
 }
 
-/* { dg-final { scan-assembler-times "smulbb" 1 } } */
+/* { dg-final { scan-assembler-times "mul" 1 } } */
diff --git a/gcc/testsuite/gcc.target/arm/wmul-3.c b/gcc/testsuite/gcc.target/arm/wmul-3.c
index 144b553..46d709c 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-3.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-3.c
@@ -16,4 +16,4 @@ int mac(const short *a, const short *b, int sqr, int *sum)
   return sqr;
 }
 
-/* { dg-final { scan-assembler-times "smulbb" 2 } } */
+/* { dg-final { scan-assembler-times "mul" 2 } } */
diff --git a/gcc/testsuite/gcc.target/arm/wmul-9.c b/gcc/testsuite/gcc.target/arm/wmul-9.c
index 40ed021..415a114 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-9.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-9.c
@@ -8,4 +8,4 @@ foo (long long a, short *b, char *c)
   return a + *b * *c;
 }
 
-/* { dg-final { scan-assembler "smlalbb" } } */
+/* { dg-final { scan-assembler "mlal" } } */
-- 
1.9.1



[Aarch64] Use vector wide add for mixed-mode adds

2015-09-06 Thread Michael Collison
This patch is designed to address code that was not being vectorized due 
to missing widening patterns in the aarch64 backend. Code such as:


int t6(int len, void * dummy, short * __restrict x)
{
  len = len & ~31;
  int result = 0;
  __asm volatile ("");
  for (int i = 0; i < len; i++)
    result += x[i];
  return result;
}

Validated on aarch64-none-elf, aarch64_be-none-elf, and
aarch64-none-linux-gnu.
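
As a rough illustration of what the new widen_ssum expansion does for a
full-width vector, here is a plain C model of one step. It adds the
sign-extended low half of the narrow vector to the accumulator (the
saddw step), then the sign-extended high half (the saddw2 step). This is
only a sketch. The lane counts (eight 16-bit inputs, four 32-bit
accumulators) and the function names are assumptions chosen for the
example, not taken from the patch.

```c
#include <stdint.h>

/* Model one widening-sum step on an assumed 8 x 16-bit vector V with a
   4 x 32-bit accumulator ACC.  */
static void widen_ssum_model(int32_t acc[4], const int16_t v[8])
{
  /* Low half: sign-extend lanes 0..3 and add (saddw-like step).  */
  for (int i = 0; i < 4; i++)
    acc[i] += (int32_t) v[i];
  /* High half: sign-extend lanes 4..7 and add (saddw2-like step).  */
  for (int i = 0; i < 4; i++)
    acc[i] += (int32_t) v[4 + i];
}

/* Sum LEN shorts using the model above, then reduce the accumulator
   lanes.  Assumes LEN is a multiple of 8, in the spirit of the
   "len & ~31" masking in the example loop.  */
static int sum_shorts_model(const int16_t *x, int len)
{
  int32_t acc[4] = { 0, 0, 0, 0 };
  for (int i = 0; i + 8 <= len; i += 8)
    widen_ssum_model(acc, x + i);
  return acc[0] + acc[1] + acc[2] + acc[3];
}
```

The point of the model is that the accumulator stays in the wide type
throughout, so the loop needs no separate unpack of the short elements
before the add.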


Note that there are three non-execution tree dump vectorization 
regressions where previously code was being vectorized.  They are:


Passed now fails  [PASS => FAIL]:
  gcc.dg/vect/slp-multitypes-4.c -flto -ffat-lto-objects  scan-tree-dump-times vect "vectorized 1 loops" 1
  gcc.dg/vect/slp-multitypes-4.c -flto -ffat-lto-objects  scan-tree-dump-times vect "vectorizing stmts using SLP" 1
  gcc.dg/vect/slp-multitypes-4.c scan-tree-dump-times vect "vectorized 1 loops" 1
  gcc.dg/vect/slp-multitypes-4.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
  gcc.dg/vect/slp-multitypes-5.c -flto -ffat-lto-objects  scan-tree-dump-times vect "vectorized 1 loops" 1
  gcc.dg/vect/slp-multitypes-5.c -flto -ffat-lto-objects  scan-tree-dump-times vect "vectorizing stmts using SLP" 1
  gcc.dg/vect/slp-multitypes-5.c scan-tree-dump-times vect "vectorized 1 loops" 1
  gcc.dg/vect/slp-multitypes-5.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
  gcc.dg/vect/slp-reduc-3.c -flto -ffat-lto-objects  scan-tree-dump-times vect "vectorizing stmts using SLP" 1
  gcc.dg/vect/slp-reduc-3.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
  gcc.dg/vect/vect-125.c -flto -ffat-lto-objects  scan-tree-dump vect "vectorized 1 loops"
  gcc.dg/vect/vect-125.c scan-tree-dump vect "vectorized 1 loops"

I would like to treat these as separate bugs and resolve them separately.




2015-09-04  Michael Collison  

* config/aarch64/aarch64-simd.md (widen_ssum, widen_usum,
aarch64_w_internal): New patterns
* config/aarch64/iterators.md (Vhalf, VDBLW): New mode attributes.
* gcc.target/aarch64/saddw-1.c: New test.
* gcc.target/aarch64/saddw-2.c: New test.
* gcc.target/aarch64/uaddw-1.c: New test.
* gcc.target/aarch64/uaddw-2.c: New test.
* gcc.target/aarch64/uaddw-3.c: New test.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 9777418..d6c5d61 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2636,6 +2636,60 @@

 ;; w.

+(define_expand "widen_ssum3"
+  [(set (match_operand: 0 "register_operand" "")
+(plus: (sign_extend: (match_operand:VQW 1 "register_operand" ""))
+  (match_operand: 2 "register_operand" "")))]
+  "TARGET_SIMD"
+  {
+rtx p = aarch64_simd_vect_par_cnst_half (mode, false);
+rtx temp = gen_reg_rtx (GET_MODE (operands[0]));
+
+emit_insn (gen_aarch64_saddw_internal (temp, operands[2],
+operands[1], p));
+emit_insn (gen_aarch64_saddw2 (operands[0], temp, operands[1]));
+DONE;
+  }
+)
+
+(define_expand "widen_ssum3"
+  [(set (match_operand: 0 "register_operand" "")
+(plus: (sign_extend:
+   (match_operand:VD_BHSI 1 "register_operand" ""))
+  (match_operand: 2 "register_operand" "")))]
+  "TARGET_SIMD"
+{
+  emit_insn (gen_aarch64_saddw (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
+(define_expand "widen_usum3"
+  [(set (match_operand: 0 "register_operand" "=&w")
+(plus: (zero_extend: (match_operand:VQW 1 "register_operand" ""))
+  (match_operand: 2 "register_operand" "")))]
+  "TARGET_SIMD"
+  {
+rtx p = aarch64_simd_vect_par_cnst_half (mode, false);
+rtx temp = gen_reg_rtx (GET_MODE (operands[0]));
+
+emit_insn (gen_aarch64_uaddw_internal (temp, operands[2],
+ operands[1], p));
+emit_insn (gen_aarch64_uaddw2 (operands[0], temp, operands[1]));
+DONE;
+  }
+)
+
+(define_expand "widen_usum3"
+  [(set (match_operand: 0 "register_operand" "")
+(plus: (zero_extend:
+   (match_operand:VD_BHSI 1 "register_operand" ""))
+  (match_operand: 2 "register_operand" "")))]
+  "TARGET_SIMD"
+{
+  emit_insn (gen_aarch64_uaddw (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
 (define_insn "aarch64_w"
   [(set (match_operand: 0 "register_operand" "=w")
 (ADDSUB: (match_operand: 1 "register_operand" "w")
@@ -2646,6 +2700,18 @@
   [(set_attr "type" "neon__widen")]
 )

+(define_insn "aarch64_w_internal"
+  [(set (match_operand: 0 "register_operand" "=w")
+(ADDSUB: (match_operand: 1 "register_operand" "w")
+(ANY_EXTEND:
+  (vec_select:
+   (match_operand:VQW 2 "register_operand" "w")
+   (match_operand:VQW 3 "vect_par_cnst_lo_half" "")]
+  "TARGET_SIMD"
+  "w\\t%0., %1., 
%2."

+  [(s