Stream ODR types

2014-09-11 Thread Jan Hubicka
Hi,
this patch adds computation and streaming of mangled type names.  As suggested
by Jason, it simply calls DECL_ASSEMBLER_NAME on all named types and lets C++
supply them.  This makes it possible to establish precise ODR type equivalency
at LTO (until now we could do that only for complete class types with virtual
methods attached to them).  LTO type merging is then updated to register all
types in the ODR type hash.  This causes warnings to be output for ODR
violations.  Here are the ones output for Firefox:
http://kam.mff.cuni.cz/~hubicka/odr-warnings-firefox.txt
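
To illustrate why having a mangled name for every type helps, here is a
minimal sketch; it is not the code from the patch.  It assumes a hypothetical
fixed-size table (odr_table) keyed on the mangled name string, and a plain
integer "layout" standing in for the structural comparison that the real
code would perform.  The names register_type, hash_name and ODR_SLOTS are
made up for the example, and the strings passed in are placeholders rather
than real mangled names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ODR_SLOTS 64   /* hypothetical table size, enough for a sketch */

struct odr_entry
{
  const char *mangled;  /* name used as the key; NULL marks an empty slot */
  unsigned layout;      /* stand-in for the structure of the type */
};

struct odr_entry odr_table[ODR_SLOTS];

/* FNV-1a string hash; an assumption of this sketch, the patch itself
   reuses the hash value already stored in the identifier node.  */
unsigned
hash_name (const char *s)
{
  unsigned h = 2166136261u;
  for (; *s; ++s)
    h = (h ^ (unsigned char) *s) * 16777619u;
  return h;
}

/* Register a type under its mangled name.  Returns 0 if the name is new
   or matches the earlier registration, 1 if the same name was already
   registered with a different layout (a candidate ODR violation), and
   -1 if the table is full.  */
int
register_type (const char *mangled, unsigned layout)
{
  assert (mangled != NULL);
  unsigned start = hash_name (mangled) % ODR_SLOTS;
  for (unsigned i = 0; i < ODR_SLOTS; ++i)
    {
      struct odr_entry *e = &odr_table[(start + i) % ODR_SLOTS];
      if (e->mangled == NULL)
        {
          e->mangled = mangled;
          e->layout = layout;
          return 0;
        }
      if (strcmp (e->mangled, mangled) == 0)
        return e->layout != layout;
    }
  return -1;
}
```

Registering the same name twice with the same layout is silent; registering
it with a different layout is the case where a warning would be issued.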

As discussed earlier, in addition to the ODR warnings, which seem useful, I
would like to use it for TBAA analysis for ODR types that are not structurally
equivalent to non-ODR types, so C++ programs will get better alias analysis,
and for other tricks, such as more aggressively merging ODR types.

I believe this makes sense alongside (is orthogonal to) early debug info (for
warnings, TBAA and devirtualization).  It can also be used to merge debug
information more aggressively, as done by LLVM.

The change increases LTO object files by about 2% (uncompressed by 6%) and also
increases WPA memory use and streaming times by about the same percentage.
This is not small, and thus I made it optional (enabled by default for now).
We can see how the benefits relate to this cost once the other three parts are
implemented.

Bootstrapped/regtested x86_64-linux, seems sane?

Honza

* common.opt (flto-odr-type-merging): New flag.
* ipa-devirt.c (hash_type_name): Use ODR names for hashing if available.
(types_same_for_odr): Likewise.
(odr_subtypes_equivalent_p): Likewise.
(add_type_duplicate): Do not walk type variants.
(register_odr_type): New function.
* ipa-utils.h (register_odr_type): Declare.
(odr_type_p): New function.
* langhooks.c (lhd_set_decl_assembler_name): Do not compute
TYPE_DECLs
* doc/invoke.texi (-flto-odr-type-merging): Document.
* tree.c (need_assembler_name_p): Compute ODR names when asked
for it.
* tree.h (DECL_ASSEMBLER_NAME): Update comment.

* lto.c (lto_read_decls): Register ODR types.

Index: common.opt
===
--- common.opt  (revision 215103)
+++ common.opt  (working copy)
@@ -1560,6 +1560,10 @@ flto-compression-level=
 Common Joined RejectNegative UInteger Var(flag_lto_compression_level) Init(-1)
 -flto-compression-level=   Use zlib compression level  for 
IL
 
+flto-odr-type-merging
+Common Report Var(flag_lto_odr_type_mering) Init(1)
+Merge C++ types using One Definition Rule
+
 flto-report
 Common Report Var(flag_lto_report) Init(0)
 Report various link-time optimization statistics
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 215103)
+++ ipa-devirt.c(working copy)
@@ -287,7 +287,13 @@ hash_type_name (tree t)
   if (type_in_anonymous_namespace_p (t))
 return htab_hash_pointer (t);
 
-  /* For polymorphic types, we can simply hash the virtual table.  */
+  /* ODR types have name specified.  */
+  if (TYPE_NAME (t)
+  && DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (t)))
+return IDENTIFIER_HASH_VALUE (DECL_ASSEMBLER_NAME (TYPE_NAME (t)));
+
+  /* For polymorphic types that was compiled with -fno-lto-odr-type-merging
+ we can simply hash the virtual table.  */
   if (TREE_CODE (t) == RECORD_TYPE
   && TYPE_BINFO (t) && BINFO_VTABLE (TYPE_BINFO (t)))
 {
@@ -305,8 +311,14 @@ hash_type_name (tree t)
   return hash;
 }
 
-  /* Rest is not implemented yet.  */
-  gcc_unreachable ();
+  /* Builtin types may appear as main variants of ODR types and are unique.
+ Sanity check we do not get anything that looks non-builtin.  */
+  gcc_checking_assert (TREE_CODE (t) == INTEGER_TYPE
+  || TREE_CODE (t) == VOID_TYPE
+  || TREE_CODE (t) == COMPLEX_TYPE
+  || TREE_CODE (t) == REAL_TYPE
+  || TREE_CODE (t) == POINTER_TYPE);
+  return htab_hash_pointer (t);
 }
 
 /* Return the computed hashcode for ODR_TYPE.  */
@@ -347,42 +359,61 @@ types_same_for_odr (const_tree type1, co
   || type_in_anonymous_namespace_p (type2))
 return false;
 
-  /* See if types are obvoiusly different (i.e. different codes
- or polymorphis wrt non-polymorphic).  This is not strictly correct
- for ODR violating programs, but we can't do better without streaming
- ODR names.  */
-  if (TREE_CODE (type1) != TREE_CODE (type2))
-return false;
-  if (TREE_CODE (type1) == RECORD_TYPE
-  && (TYPE_BINFO (type1) == NULL_TREE) != (TYPE_BINFO (type1) == 
NULL_TREE))
-return false;
-  if (TREE_CODE (type1) == RECORD_TYPE && TYPE_BINFO (type1)
-  && (BINFO_VTABLE (TYPE_BINFO (type1)) == NULL_TREE)
-!= (BINFO_VTABLE (TYPE_BINFO (type2)) == NULL_TREE))
-return false;
 
-  /* At the moment we have no w

Re: [PATCH] gcc parallel make check

2014-09-11 Thread Jakub Jelinek
On Wed, Sep 10, 2014 at 11:23:34PM +0200, Jakub Jelinek wrote:
> On Wed, Sep 10, 2014 at 11:08:22PM +0200, Jakub Jelinek wrote:
> > Perhaps better approach might be if we have some way how to synchronize 
> > among
> > multiple expect processes and spawn only as many expects (of course, per
> > check target) as there are CPUs.  E.g. if mkdir is atomic on all
> > hosts/filesystems we care about, we could have some shared directory that
> > make would clear before spawning all the expects, and after checking
> > runtest_file_p we could attempt to mkdir something (e.g. testcase filename
> > with $(srcdir) part removed, or *.exp filename / counter what test are we
> > considering or something similar) in the shared directory, if that would
> > succeed, it would tell us that we are the process that should run the test,
> > if that failed, we'd know some other runtest did that.
> > Or perhaps not for every single test, but every 10 or 100 tests or
> > something.
> > 
> > E.g. we could just override runtest_file_p itself, so that it would first
> > call the original dejagnu version, and then do this check.
> 
> Seems file mkdir in tcl doesn't error on pre-existing directory, so perhaps
> [open $path {WRONLY EXCL CREAT}] ?
> Now, does this work properly on all hosts we care about?

Here is a proof of concept on the tcl side.
To get a large sequence of numbers in the Makefile, I guess we can use
something like
check_p_numbers0:=1 2 3 4 5 6 7 8 9
check_p_numbers1:=0 $(check_p_numbers0)
check_p_numbers2:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers1)))
check_p_numbers3:=$(patsubst %,0%,$(check_p_numbers1)) $(check_p_numbers2)
check_p_numbers4:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers3)))
check_p_numbers5:=$(patsubst %,0%,$(check_p_numbers3)) $(check_p_numbers4)
check_p_numbers6:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers5)))
check_p_numbers:=$(check_p_numbers0) $(check_p_numbers2) $(check_p_numbers4) $(check_p_numbers6)
(and then what
check_p_subdirs=$(wordlist 1,$(words $(check_$*_parallelize)),$(check_p_numbers))
uses, just with $(check_$*_parallelize) replaced with something to match the
number of desired goals).
Looking at some of the *.exp tests, it seems only some of them (though, the
majority of the time consuming ones) actually use runtest_file_p, e.g.
compat.exp or struct-layout-1.exp and several others don't.

So, IMHO what we should do in the Makefile is, right inside
@if [ -z "$(filter-out --target_board=%,$(filter-out 
--extra_opts%,$(RUNTESTFLAGS)))" ] \
&& [ "$(filter -j, $(MFLAGS))" = "-j" ]; then \
first rm -rf $(TESTSUITEDIR)/$*-parallel; mkdir $(TESTSUITEDIR)/$*-parallel
so that we start with empty dir, compute check_p_subdirs from actual -jN
number, then in check-parallel-gcc_1 etc. goals (but not in
check-parallel-gcc) set
GCC_RUNTEST_PARALLELIZE_DIR=$(TESTSUITEDIR)/$(check_p_tool)-parallel
in the environment and use RUNTESTFLAGS with selected known to be
parallelizable *.exp files (dg.exp execute.exp compile.exp and the like),
and use all the other *.exp files for check-parallel-gcc.
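
For illustration, the claiming step that the shared directory relies on can
be sketched in C.  This is a rough analogue of the Tcl
[open $path {WRONLY EXCL CREAT}] idea, not the proof of concept itself: it
assumes POSIX open() with O_CREAT | O_EXCL, and the function name claim_slot
and the 4096-byte path buffer are made up for the example.  Whether the
exclusive create is reliable on every host and network filesystem is exactly
the open question above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Try to claim slot COUNTER inside directory DIR by creating a marker
   file exclusively.  Returns 1 if this process created the file (so it
   should run the test), 0 if another process claimed the slot first,
   and -1 on any other error.  */
int
claim_slot (const char *dir, unsigned counter)
{
  char path[4096];
  assert (dir != NULL);

  int n = snprintf (path, sizeof path, "%s/%u", dir, counter);
  if (n < 0 || (size_t) n >= sizeof path)
    return -1;

  /* O_EXCL makes the create fail with EEXIST if the file already
     exists, so only one caller can succeed for a given path.  */
  int fd = open (path, O_WRONLY | O_CREAT | O_EXCL, 0644);
  if (fd >= 0)
    {
      close (fd);
      return 1;
    }
  return errno == EEXIST ? 0 : -1;
}
```

Each runtest process would call this with its own running counter and skip
the test whenever the result is 0.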

Thoughts on this?

Unfortunately, I am not sure how that would work with the
check-subtargets stuff if people are used to parallelizing testing across
multiple machines (but it is unclear to me how they are merging the log/sum
files from the multiple machines anyway).  I am not sure whether this works
over NFS/AFS and other networked filesystems; if it does, presumably they could
arrange for the *-parallel directories to be shared.

I can't find how to query the -jN value passed to make check by the user
though; both $(MFLAGS) and $(MAKEFLAGS) only contain something like
--jobserver-fds=3,5 -j, from which it is not possible to find out how many
goals would be a reasonable upper limit.  Running too many goals would
waste time (once scheduled, a goal would only wildcard all the tests and,
for each of them, find in the *-parallel directory that the test has already
been run); running too few could prevent good parallelization.

--- gcc/testsuite/lib/gcc-defs.exp.jj   2014-09-01 09:43:28.0 +0200
+++ gcc/testsuite/lib/gcc-defs.exp  2014-09-11 08:37:43.871943270 +0200
@@ -188,6 +188,30 @@ if { [info procs runtest_file_p] == "" }
 }
 }
 
+if { [info exists env(GCC_RUNTEST_PARALLELIZE_DIR)] \
+ && [info procs runtest_file_p] != [list] \
+ && [info procs gcc_parallelize_saved_runtest_file_p] == [list] } then {
+rename runtest_file_p gcc_parallelize_saved_runtest_file_p
+global gcc_runtest_parallelize_counter
+
+set gcc_runtest_parallelize_counter 0
+proc runtest_file_p { runtests testcase } {
+   global gcc_runtest_parallelize_counter
+   if ![gcc_parallelize_saved_runtest_file_p $runtests $testcase] {
+   return 0
+   }
+
+   set dir [getenv GCC_RUNTEST_PARALLELIZE_DIR]
+   set path $dir/$gcc_runtest_parallelize_counter
+   set gcc_runtest_parallelize_counter [expr 
{$gcc_runt

RE: [PATCH][ARM] Fix -fcall-saved-rX for X > 7 with -Os -mthumb

2014-09-11 Thread Thomas Preud'homme
Ping?

> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme
> Sent: Wednesday, August 20, 2014 9:28 AM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH][ARM] Fix -fcall-saved-rX for X > 7
>
> This patch makes -fcall-saved-rX for X > 7 work on Thumb targets when
> optimizing for size. It works by adding a new field
> x_user_set_call_save_regs in struct target_hard_regs to track whether an
> entry in fields x_fixed_regs, x_call_used_regs and x_call_really_used_regs
> was user set or still has its default value. Then it can decide whether to
> set a given high register as caller saved or not when optimizing for size
> based on this information.
>
> ChangeLog are as follows:
>
> *** gcc/ChangeLog ***
>
> 2014-08-15  Thomas Preud'homme  
>
> * config/arm/arm.c (arm_conditional_register_usage): Only set high
> registers as caller saved when optimizing for size *and* the user did
> not ask otherwise through -fcall-saved-* switch.
> * hard-reg-set.h (x_user_set_call_save_regs): New.
> (user_set_call_save_regs): Define.
> * reginfo.c (init_reg_sets): Initialize user_set_call_save_regs.
> (fix_register): Indicate in user_set_call_save_regs that the value set
> in call_save_regs and fixed_regs is user set.
>
>
> *** gcc/testsuite/ChangeLog ***
>
> 2014-08-15  Thomas Preud'homme  
>
> * gcc.target/arm/fcall-save-rhigh.c: New.
>
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 2f8d327..8324fa3 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -30084,7 +30084,8 @@ arm_conditional_register_usage (void)
>  stacking them.  */
>for (regno = FIRST_HI_REGNUM;
>  regno <= LAST_HI_REGNUM; ++regno)
> - fixed_regs[regno] = call_used_regs[regno] = 1;
> + if (!user_set_call_save_regs[regno])
> +   fixed_regs[regno] = call_used_regs[regno] = 1;
>  }
>
>/* The link register can be clobbered by any branch insn,
> diff --git a/gcc/hard-reg-set.h b/gcc/hard-reg-set.h
> index b8ab3df..b523637 100644
> --- a/gcc/hard-reg-set.h
> +++ b/gcc/hard-reg-set.h
> @@ -614,6 +614,11 @@ struct target_hard_regs {
>
>char x_call_really_used_regs[FIRST_PSEUDO_REGISTER];
>
> +  /* Indexed by hard register number, contains 1 for registers
> + whose saving at function call was decided by the user
> + with -fcall-saved-*, -fcall-used-* or -ffixed-*.  */
> +  char x_user_set_call_save_regs[FIRST_PSEUDO_REGISTER];
> +
>/* The same info as a HARD_REG_SET.  */
>HARD_REG_SET x_call_used_reg_set;
>
> @@ -685,6 +690,8 @@ extern struct target_hard_regs
> *this_target_hard_regs;
>(this_target_hard_regs->x_call_used_regs)
>  #define call_really_used_regs \
>(this_target_hard_regs->x_call_really_used_regs)
> +#define user_set_call_save_regs \
> +  (this_target_hard_regs->x_user_set_call_save_regs)
>  #define call_used_reg_set \
>(this_target_hard_regs->x_call_used_reg_set)
>  #define call_fixed_reg_set \
> diff --git a/gcc/reginfo.c b/gcc/reginfo.c
> index 7668be0..0b35f7f 100644
> --- a/gcc/reginfo.c
> +++ b/gcc/reginfo.c
> @@ -183,6 +183,7 @@ init_reg_sets (void)
>memcpy (call_really_used_regs, initial_call_really_used_regs,
> sizeof call_really_used_regs);
>  #endif
> +  memset (user_set_call_save_regs, 0, sizeof user_set_call_save_regs);
>  #ifdef REG_ALLOC_ORDER
>memcpy (reg_alloc_order, initial_reg_alloc_order, sizeof
> reg_alloc_order);
>  #endif
> @@ -742,6 +743,7 @@ fix_register (const char *name, int fixed, int
> > call_used)
> if (fixed == 0)
>   call_really_used_regs[i] = call_used;
>  #endif
> +   user_set_call_save_regs[i] = 1;
>   }
>   }
>  }
> diff --git a/gcc/testsuite/gcc.target/arm/fcall-save-rhigh.c
> b/gcc/testsuite/gcc.target/arm/fcall-save-rhigh.c
> new file mode 100644
> index 000..a321a2b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/fcall-save-rhigh.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-final { scan-assembler "mov\\s+r.\\s*,\\s*r8" } } */
> +/* { dg-require-effective-target arm_thumb1_ok } */
> +/* { dg-options "-Os -mthumb -mcpu=cortex-m0 -fcall-saved-r8" } */
> +
> +void
> +save_regs (void)
> +{
> +  asm volatile ("" ::: "r7", "r8");
> +}
>
> Ok for trunk?
>
> Best regards,
>
> Thomas
>





Re: [PATCH] gcc parallel make check

2014-09-11 Thread Jakub Jelinek
On Thu, Sep 11, 2014 at 09:51:23AM +0200, Jakub Jelinek wrote:
> I can't find how to query the -jN value passed to make check by the user
> though, both $(MFLAGS) and $(MAKEFLAGS) only contain something like
> --jobserver-fds=3,5 -j from which it is not possible to find out how many
> goals would be the upper reasonable limit.  Running too many goals would
> waste time (once scheduled, the goal would only wildcard all the test, and
> for all of them find in the *-parallel directory the test has been run
> already), running too few could prevent good parallelization.

After a little googling, it seems there is no way to do that :(, unless
one e.g. attempts to find the command line of the topmost parent make
and scan it through ps or something.

There is an option to touch, say, a *-parallel/finished file once any of the
check-parallel-gcc-{1,2,...} goals is done (because when it finishes, it
means all the parallelizable tests for the particular check-$lang have either
finished, or at least touched their file) and not start runtest at all if
finished already exists, but I guess it would still be undesirable to have
tens of thousands of goals by default, so perhaps we could go with, say,
128 subgoals by default and have some env var to override it, so on the
really highly parallel boxes you'd specify
make -j512 -k check GCC_TEST_PARALLEL_SLOTS=512
or similar.
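
As a sketch of the "default with an override" behaviour described above,
assuming the variable is read from the environment: the name
GCC_TEST_PARALLEL_SLOTS and the default of 128 come from this message, while
the helper names and the choice to fall back to the default on malformed
input are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define DEFAULT_SLOTS 128

/* Parse a slot count from TEXT.  Falls back to DEFAULT_SLOTS when TEXT
   is missing, empty, not a plain decimal number, out of range, or less
   than 1.  */
long
parallel_slots_from (const char *text)
{
  if (text == NULL || *text == '\0')
    return DEFAULT_SLOTS;

  char *end;
  errno = 0;
  long n = strtol (text, &end, 10);
  if (errno != 0 || *end != '\0' || n < 1)
    return DEFAULT_SLOTS;
  return n;
}

/* Number of subgoals to use, taking the override from the environment
   if one is set.  */
long
parallel_slots (void)
{
  long n = parallel_slots_from (getenv ("GCC_TEST_PARALLEL_SLOTS"));
  assert (n >= 1);
  return n;
}
```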

Jakub


Re: Stream ODR types

2014-09-11 Thread Richard Biener
On Thu, 11 Sep 2014, Jan Hubicka wrote:

> Hi,
> this patch adds computation and streaming of mangled type names.  As 
> suggested by Jason,
> it simple calls DECL_ASSEMBLER_NAME on all names types and lets C++ supply 
> them.
> This makes it possible to stablish precise ODR type equivalency at LTO (till 
> now we can
> do that only for complete class types with virtual methods attached to them).
> Lto type merging is then updated to register all types into the ODR type 
> hash.  This
> makes warnings to be output for ODR violations. Here are ones output for 
> Firefox:
> http://kam.mff.cuni.cz/~hubicka/odr-warnings-firefox.txt
> 
> As discussed earlier, in addition to ODR warnings that seems useful, I would
> like to use it for TBAA analysis for ODR types that are not structurally
> equivalent to non-ODR types, so C++ programs will get better alias analysis 
> and
> for other tricks, such as more agresively merging ODR types.
> 
> I believe this makes sense (is orthogonal) with early debug info (for 
> warnings, TBAA
> and devirtualization).  It can be also used to more agresively merge debug 
> information
> as done by LLVM.
> 
> The change increase LTO object fules by about 2% (uncompressed by 6%) and also
> increase WPA memory use and streaming times by about same percentage.  It is
> not small and thus I made it optional (enabled by default for now).  We could 
> see
> how benefits relate to this cost once the other three parts are implemented.
> 
> Bootstrapped/regtested x86_64-linux, seems sane?

It looks sane, but when early debug is completed we likely will drop
all the elaborated types from decls.  Thus to keep the ODR type you'd
have to keep (and compute early as well) their DECL_ASSEMBLER_NAME?

Can't we just store a hash of the assembler name?  From alias analysis
perspective false aliasing due to a hash collision is harmless, no?
Maybe not for ODR warnings though.  At least a hash would be way
cheaper than those usually very large strings.

You probably want to restrict ODR types to aggregates?

Richard.

> Honza
> 
>   * common.opt (flto-odr-type-merging): New flag.
>   * ipa-deivrt.c (hash_type_name): Use ODR names for hasing if availale.
>   (types_same_for_odr): Likewise.
>   (odr_subtypes_equivalent_p): Likewise.
>   (add_type_duplicate): Do not walk type variants.
>   (register_odr_type): New function.
>   * ipa-utils.h (register_odr_type): Declare.
>   (odr_type_p): New function.
>   * langhooks.c (lhd_set_decl_assembler_name): Do not compute
>   TYPE_DECLs
>   * doc/invoke.texi (-flto-odr-type-merging): Document.
>   * tree.c (need_assembler_name_p): Compute ODR names when asked
>   for it.
>   * tree.h (DECL_ASSEMBLER_NAME): Update comment.
> 
>   * lto.c (lto_read_decls): Register ODR types.
> 
> Index: common.opt
> ===
> --- common.opt(revision 215103)
> +++ common.opt(working copy)
> @@ -1560,6 +1560,10 @@ flto-compression-level=
>  Common Joined RejectNegative UInteger Var(flag_lto_compression_level) 
> Init(-1)
>  -flto-compression-level= Use zlib compression level  for 
> IL
>  
> +flto-odr-type-merging
> +Common Report Var(flag_lto_odr_type_mering) Init(1)
> +Merge C++ types using One Definition Rule
> +
>  flto-report
>  Common Report Var(flag_lto_report) Init(0)
>  Report various link-time optimization statistics
> Index: ipa-devirt.c
> ===
> --- ipa-devirt.c  (revision 215103)
> +++ ipa-devirt.c  (working copy)
> @@ -287,7 +287,13 @@ hash_type_name (tree t)
>if (type_in_anonymous_namespace_p (t))
>  return htab_hash_pointer (t);
>  
> -  /* For polymorphic types, we can simply hash the virtual table.  */
> +  /* ODR types have name specified.  */
> +  if (TYPE_NAME (t)
> +  && DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (t)))
> +return IDENTIFIER_HASH_VALUE (DECL_ASSEMBLER_NAME (TYPE_NAME (t)));
> +
> +  /* For polymorphic types that was compiled with -fno-lto-odr-type-merging
> + we can simply hash the virtual table.  */
>if (TREE_CODE (t) == RECORD_TYPE
>&& TYPE_BINFO (t) && BINFO_VTABLE (TYPE_BINFO (t)))
>  {
> @@ -305,8 +311,14 @@ hash_type_name (tree t)
>return hash;
>  }
>  
> -  /* Rest is not implemented yet.  */
> -  gcc_unreachable ();
> +  /* Builtin types may appear as main variants of ODR types and are unique.
> + Sanity check we do not get anything that looks non-builtin.  */
> +  gcc_checking_assert (TREE_CODE (t) == INTEGER_TYPE
> +|| TREE_CODE (t) == VOID_TYPE
> +|| TREE_CODE (t) == COMPLEX_TYPE
> +|| TREE_CODE (t) == REAL_TYPE
> +|| TREE_CODE (t) == POINTER_TYPE);
> +  return htab_hash_pointer (t);
>  }
>  
>  /* Return the computed hashcode for ODR_TYPE.  */
> @@ -347,42 +359,61 @@ types_same_for_odr (const_tree type1

[AArch64] Tighten predicates on SIMD shift intrinsics

2014-09-11 Thread James Greenhalgh

Hi,

There is a set of SIMD shift intrinsics that have very tight requirements
on the range of immediates they can accept, but have been written with very
loose predicates, bailing out with an error in final if there has been an
issue.

This is a problem if some pass figures out that a value passed to the related
non-immediate-form intrinsics is constant and tries to use the immediate
form. This can result in a bogus error.

This patch tightens all such predicates, preventing the compiler from
trying to emit the immediate-form instructions where they are
inappropriate.
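
As an illustration of the idea, here is a minimal sketch of a bounds check
that returns a bool for a predicate to use, rather than printing an error.
It is plain C, not the .md predicates from the patch.  It assumes the lower
bound is inclusive and the upper bound exclusive, which is how the removed
calls such as (operands[2], 1, bit_width + 1) read; the function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if VALUE lies in [LOW, HIGH).  No diagnostics are issued;
   the caller decides what to do with an out-of-range value.  */
bool
const_in_bounds (long value, long low, long high)
{
  assert (low < high);
  return value >= low && value < high;
}

/* Shift amounts 1..BIT_WIDTH, matching the (1, bit_width + 1) calls.  */
bool
shift_imm_offset_ok (long amount, int bit_width)
{
  return const_in_bounds (amount, 1, bit_width + 1);
}

/* Shift amounts 0..BIT_WIDTH, matching the (0, bit_width + 1) calls.  */
bool
shift_imm_bitsize_ok (long amount, int bit_width)
{
  return const_in_bounds (amount, 0, bit_width + 1);
}
```

With a check like this in the operand predicate, an out-of-range constant
simply fails to match the immediate-form pattern.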

Cross-tested for aarch64-none-elf with no issues.

OK?

Thanks,
James

---
gcc/

2014-09-11  James Greenhalgh  

* config/aarch64/aarch64-protos.h (aarch64_simd_const_bounds): Change
return type to bool.
* config/aarch64/aarch64-simd.md (aarch64_qshl): Use
new predicates.
(aarch64_shll2_n): Likewise.
(aarch64_shr_n): Likewise.
(aarch64_sra_n): Likewise.
(aarch64_si_n): Likewise.
(aarch64_qshl_n): Likewise.
* config/aarch64/aarch64.c (aarch64_simd_const_bounds): Change
return type to bool; don't print errors.
* config/aarch64/iterators.md (ve_mode): New.
(offsetlr): Remap to infix text for use in new predicates.
* config/aarch64/predicates.md (aarch64_simd_shift_imm_qi): New.
(aarch64_simd_shift_imm_hi): Likewise.
(aarch64_simd_shift_imm_si): Likewise.
(aarch64_simd_shift_imm_di): Likewise.
(aarch64_simd_shift_imm_offset_qi): Likewise.
(aarch64_simd_shift_imm_offset_hi): Likewise.
(aarch64_simd_shift_imm_offset_si): Likewise.
(aarch64_simd_shift_imm_offset_di): Likewise.
(aarch64_simd_shift_imm_bitsize_qi): Likewise.
(aarch64_simd_shift_imm_bitsize_hi): Likewise.
(aarch64_simd_shift_imm_bitsize_si): Likewise.
(aarch64_simd_shift_imm_bitsize_di): Likewise.

gcc/testsuite/

2014-09-08  James Greenhalgh  

* gcc.target/aarch64/simd/vqshlb_1.c: New.
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 35f89ff..9de7af7 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -205,6 +205,7 @@ bool aarch64_regno_ok_for_base_p (int, bool);
 bool aarch64_regno_ok_for_index_p (int, bool);
 bool aarch64_simd_check_vect_par_cnst_half (rtx op, enum machine_mode mode,
 	bool high);
+bool aarch64_simd_const_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
 bool aarch64_simd_imm_scalar_p (rtx x, enum machine_mode mode);
 bool aarch64_simd_imm_zero_p (rtx, enum machine_mode);
 bool aarch64_simd_scalar_immediate_valid_for_move (rtx, enum machine_mode);
@@ -255,7 +256,6 @@ void aarch64_emit_call_insn (rtx);
 /* Initialize builtins for SIMD intrinsics.  */
 void init_aarch64_simd_builtins (void);
 
-void aarch64_simd_const_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
 void aarch64_simd_disambiguate_copy (rtx *, rtx *, rtx *, unsigned int);
 
 /* Emit code to place a AdvSIMD pair result in memory locations (with equal
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 6a45e91512ffe1c8c2ecd2b1ba4336baf87f7256..9e688e310027c772cfe5ecd4a158796b143998c5 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3715,12 +3715,12 @@ (define_insn "aarch64_qshl
 (define_insn "aarch64_shll_n"
   [(set (match_operand: 0 "register_operand" "=w")
 	(unspec: [(match_operand:VDW 1 "register_operand" "w")
-			 (match_operand:SI 2 "immediate_operand" "i")]
+			 (match_operand:SI 2
+			   "aarch64_simd_shift_imm_bitsize_" "i")]
  VSHLL))]
   "TARGET_SIMD"
   "*
   int bit_width = GET_MODE_UNIT_SIZE (mode) * BITS_PER_UNIT;
-  aarch64_simd_const_bounds (operands[2], 0, bit_width + 1);
   if (INTVAL (operands[2]) == bit_width)
   {
 return \"shll\\t%0., %1., %2\";
@@ -3741,7 +3741,6 @@ (define_insn "aarch64_shll2_n
   "TARGET_SIMD"
   "*
   int bit_width = GET_MODE_UNIT_SIZE (mode) * BITS_PER_UNIT;
-  aarch64_simd_const_bounds (operands[2], 0, bit_width + 1);
   if (INTVAL (operands[2]) == bit_width)
   {
 return \"shll2\\t%0., %1., %2\";
@@ -3757,13 +3756,11 @@ (define_insn "aarch64_shll2_n
 (define_insn "aarch64_shr_n"
   [(set (match_operand:VSDQ_I_DI 0 "register_operand" "=w")
 (unspec:VSDQ_I_DI [(match_operand:VSDQ_I_DI 1 "register_operand" "w")
-			   (match_operand:SI 2 "immediate_operand" "i")]
+			   (match_operand:SI 2
+			 "aarch64_simd_shift_imm_offset_" "i")]
 			  VRSHR_N))]
   "TARGET_SIMD"
-  "*
-  int bit_width = GET_MODE_UNIT_SIZE (mode) * BITS_PER_UNIT;
-  aarch64_simd_const_bounds (operands[2], 1, bit_width + 1);
-  return \"shr\\t%0, %1, %2\";"
+  "shr\\t%0, %1, %2"
   [(set_attr "type" "neon_sat_shift_imm")]
 )
 
@@ -3773,13 +3770,11 @@ (define_insn "aarch64_sra_n"
   [(set (match_operand:VSDQ_I_DI 0 "register_operand" "=w")
 	(unspec:VSDQ_I_DI [(match_operand:V

Re: [PATCH i386 AVX512] [36/n] Extend gather insn patterns.

2014-09-11 Thread Uros Bizjak
On Wed, Sep 10, 2014 at 7:40 PM, Uros Bizjak  wrote:

>> Patch in the bottom extends gather instructions support.
>>
>> Bootstrapped.
>> AVX-512* tests on top of patch-set all pass
>> under simulator.
>>
>> Is it ok for trunk?
>>
>> gcc/
>> * config/i386/sse.md
>> (define_expand "_gathersi"): Rename from
>> "avx512f_gathersi".
>> (define_insn "*avx512f_gathersi"): Use VI48F.
>> (define_insn "*avx512f_gathersi_2"): Ditto.
>> (define_expand "_gatherdi"): Rename from
>> "avx512f_gatherdi".
>> (define_insn "*avx512f_gatherdi"): Use VI48F.
>> (define_insn "*avx512f_gatherdi_2"): Use VI48F, add 128/256-bit
>> wide versions.
>> (define_expand "_scattersi"): Rename from
>> "avx512f_scattersi".
>> (define_insn "*avx512f_scattersi"): Use VI48F.
>> (define_expand "_scatterdi"): Rename from
>> "avx512f_scatterdi".
>> (define_insn "*avx512f_scatterdi"): Use VI48F.
>>
>
> ...
>
>>  (define_insn "*avx512f_gatherdi_2"
>> -  [(set (match_operand:VI48F_512 0 "register_operand" "=&v")
>> -   (unspec:VI48F_512
>> +  [(set (match_operand:VI48F 0 "register_operand" "=&v")
>> +   (unspec:VI48F
>>   [(pc)
>>(match_operand:QI 6 "register_operand" "1")
>>(match_operator: 5 "vsib_mem_operator"
>> @@ -16762,22 +16762,27 @@
>>"TARGET_AVX512F"
>>  {
>>if (mode != mode)
>> -return "vgatherq\t{%5, 
>> %t0%{%1%}|%t0%{%1%}, %g5}";
>> +{
>> +  if (GET_MODE_SIZE (mode) != 64)
>
> Something is wrong here. Mode iterator is VI48F that always has mode
> size != 64, so the condition is always true.

Oh, I just mixed mode bitsize with mode size. Those sizes are huge ;)

The patch is OK.

Thanks,
Uros.


[PATCH][match-and-simplify] Dump what patterns get applied

2014-09-11 Thread Richard Biener

The following adds dumping of which patterns get applied with -details.
This is useful for tracking down which patterns cause a miscompile.
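
For reference, the filename handling for the dump output can be sketched as
a standalone function.  This is an illustration only: it assumes '/' as the
directory separator where the patch uses DIR_SEPARATOR, and the function
name and buffer-based interface are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a source location as "basename:line" into BUF, dropping any
   leading directories from FILE.  Returns what snprintf returns.  */
int
format_dump_location (char *buf, size_t size, const char *file, int line)
{
  assert (buf != NULL && file != NULL);
  const char *base = strrchr (file, '/');
  base = base ? base + 1 : file;
  return snprintf (buf, size, "%s:%d", base, line);
}
```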

Applied.

Richard.

2014-09-11  Richard Biener  

* genmatch.c (output_line_directive): Add variant for dump files.
(dt_simplify::gen): Write to dump file with TDF_DETAILS what
patterns get applied.
* gimple-match-head.c: Include dumpfile.h.
* generic-match-head.c: Likewise.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 215124)
+++ gcc/genmatch.c  (working copy)
@@ -64,16 +64,28 @@ fatal_at (const cpp_token *tk, const cha
 }
 
 static void
-output_line_directive (FILE *f, source_location location)
+output_line_directive (FILE *f, source_location location,
+  bool dumpfile = false)
 {
   const line_map *map;
   linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, &map);
   expanded_location loc = linemap_expand_location (line_table, map, location);
-  /* Other gen programs really output line directives here, at least for
- development it's right now more convenient to have line information
- from the generated file.  Still keep the directives as comment for now
- to easily back-point to the meta-description.  */
-  fprintf (f, "/* #line %d \"%s\" */\n", loc.line, loc.file);
+  if (dumpfile)
+{
+  /* When writing to a dumpfile only dump the filename.  */
+  const char *file = strrchr (loc.file, DIR_SEPARATOR);
+  if (!file)
+   file = loc.file;
+  else
+   ++file;
+  fprintf (f, "%s:%d", file, loc.line);
+}
+  else
+/* Other gen programs really output line directives here, at least for
+   development it's right now more convenient to have line information
+   from the generated file.  Still keep the directives as comment for now
+   to easily back-point to the meta-description.  */
+fprintf (f, "/* #line %d \"%s\" */\n", loc.line, loc.file);
 }
 
 
@@ -1710,9 +1722,8 @@ dt_operand::gen_generic_kids (FILE *f)
 void
 dt_simplify::gen (FILE *f, bool gimple)
 {
-  output_line_directive (f, s->result_location);
-
   fprintf (f, "{\n");
+  output_line_directive (f, s->result_location);
   fprintf (f, "tree captures[%u] ATTRIBUTE_UNUSED = {};\n", 
dt_simplify::capture_max);
 
   for (unsigned i = 0; i < dt_simplify::capture_max; ++i)
@@ -1729,15 +1740,16 @@ dt_simplify::gen (FILE *f, bool gimple)
   for (int i = s->ifexpr_vec.length () - 1; i >= 0; --i)
{
  if_or_with &w = s->ifexpr_vec[i];
- output_line_directive (f, w.location);
  if (w.is_with)
{
  fprintf (f, "{\n");
+ output_line_directive (f, w.location);
  w.cexpr->gen_transform (f, NULL, true, 1, "type");
  n_braces++;
}
  else
{
+ output_line_directive (f, w.location);
  fprintf (f, "if (");
  if (i == 0 || s->ifexpr_vec[i-1].is_with)
w.cexpr->gen_transform (f, NULL, true, 1, "type");
@@ -1768,6 +1780,11 @@ dt_simplify::gen (FILE *f, bool gimple)
   n_braces++;
 }
 
+  fprintf (f, "if (dump_file && (dump_flags & TDF_DETAILS)) "
+  "fprintf (dump_file, \"Applying pattern ");
+  output_line_directive (f, s->result_location, true);
+  fprintf (f, ", %%s:%%d\\n\", __FILE__, __LINE__);\n");
+
   if (gimple)
 {
   if (s->result->type == operand::OP_EXPR)
Index: gcc/gimple-match-head.c
===
--- gcc/gimple-match-head.c (revision 215009)
+++ gcc/gimple-match-head.c (working copy)
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.
 #include "expr.h"
 #include "tree-dfa.h"
 #include "builtins.h"
+#include "dumpfile.h"
 #include "gimple-match.h"
 
 #define integral_op_p(node) INTEGRAL_TYPE_P(TREE_TYPE(node))
Index: gcc/generic-match-head.c
===
--- gcc/generic-match-head.c(revision 215009)
+++ gcc/generic-match-head.c(working copy)
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.
 #include "expr.h"
 #include "tree-dfa.h"
 #include "builtins.h"
+#include "dumpfile.h"
 
 #define INTEGER_CST_P(node) (TREE_CODE(node) == INTEGER_CST)
 #define integral_op_p(node) INTEGRAL_TYPE_P(TREE_TYPE(node))


RE: [PATCH] RE: gcc parallel make check

2014-09-11 Thread VandeVondele Joost
> could it be that the pattern in normal1 should have been '[ab]*/ de*/ 
> [ep]*/*' ?

I've checked that this fixes the bug in the current trunk split, i.e. files are
still tested, but now only once. Consider this change added to the previously
submitted patch.



Re: [PATCH] Introduce LABEL_REF_LABEL

2014-09-11 Thread Richard Biener
On Thu, Sep 11, 2014 at 2:37 AM, David Malcolm  wrote:
> The following patch adds a macro:
>
>   /* Get the label that a LABEL_REF references.  */
>   #define LABEL_REF_LABEL(LABREF) XCEXP (LABREF, 0, LABEL_REF)
>
> and uses it in place of XEXP (foo, 0) for "foo" known to be a
> LABEL_REF, throughout the "gcc" directory, in the hope of
> (A) improving the clarity of the code
> (B) perhaps making it easier to make things typesafe in future patches.
>
> It's rather verbose compared to XEXP (x, 0), but I think it's clearer,
> and easier to grep for.  Maybe "LABEL_REF_LAB"?
>
> I haven't gone through the config subdirectories yet.
>
> Bootstrapped on x86_64-unknown-linux-gnu (Fedora 20), and has been
> rebuilt as part of a config-list.mk build for all working
> configurations.
>
> OK for trunk?

Ok.

Thanks,
Richard.

> gcc/ChangeLog:
> * rtl.h (LABEL_REF_LABEL): New macro.
>
> * alias.c (rtx_equal_for_memref_p): Use LABEL_REF_LABEL in place
> of XEXP (, 0), where we know that we have a LABEL_REF.
> * cfgbuild.c (make_edges): Likewise.
> (purge_dead_tablejump_edges): Likewise.
> * cfgexpand.c (convert_debug_memory_address): Likewise.
> * cfgrtl.c (patch_jump_insn): Likewise.
> * combine.c (distribute_notes): Likewise.
> * cse.c (hash_rtx_cb): Likewise.
> (exp_equiv_p): Likewise.
> (fold_rtx): Likewise.
> (check_for_label_ref): Likewise.
> * cselib.c (rtx_equal_for_cselib_1): Likewise.
> (cselib_hash_rtx): Likewise.
> * emit-rtl.c (mark_label_nuses): Likewise.
> * explow.c (convert_memory_address_addr_space): Likewise.
> * final.c (output_asm_label): Likewise.
> (output_addr_const): Likewise.
> * gcse.c (add_label_notes): Likewise.
> * genconfig.c (walk_insn_part): Likewise.
> * genrecog.c (validate_pattern): Likewise.
> * ifcvt.c (cond_exec_get_condition): Likewise.
> (noce_emit_store_flag): Likewise.
> (noce_get_alt_condition): Likewise.
> (noce_get_condition): Likewise.
> * jump.c (maybe_propagate_label_ref): Likewise.
> (mark_jump_label_1): Likewise.
> (redirect_exp_1): Likewise.
> (rtx_renumbered_equal_p): Likewise.
> * lra-constraints.c (operands_match_p): Likewise.
> * reload.c (operands_match_p): Likewise.
> (find_reloads): Likewise.
> * reload1.c (set_label_offsets): Likewise.
> * reorg.c (get_branch_condition): Likewise.
> * rtl.c (rtx_equal_p_cb): Likewise.
> (rtx_equal_p): Likewise.
> * rtlanal.c (reg_mentioned_p): Likewise.
> (rtx_referenced_p): Likewise.
> (get_condition): Likewise.
> * sched-vis.c (print_value): Likewise.
> * varasm.c (const_hash_1): Likewise.
> (compare_constant): Likewise.
> (const_rtx_hash_1): Likewise.
> (output_constant_pool_1): Likewise.
> ---
>  gcc/alias.c   |  2 +-
>  gcc/cfgbuild.c|  4 ++--
>  gcc/cfgexpand.c   |  2 +-
>  gcc/cfgrtl.c  |  2 +-
>  gcc/combine.c |  4 ++--
>  gcc/cse.c | 20 ++--
>  gcc/cselib.c  |  4 ++--
>  gcc/emit-rtl.c|  4 ++--
>  gcc/explow.c  |  2 +-
>  gcc/final.c   |  4 ++--
>  gcc/gcse.c|  6 +++---
>  gcc/genconfig.c   |  4 ++--
>  gcc/genrecog.c|  4 ++--
>  gcc/ifcvt.c   |  8 
>  gcc/jump.c| 16 
>  gcc/lra-constraints.c |  2 +-
>  gcc/reload.c  | 13 +++--
>  gcc/reload1.c |  6 +++---
>  gcc/reorg.c   |  6 +++---
>  gcc/rtl.c |  4 ++--
>  gcc/rtl.h |  4 
>  gcc/rtlanal.c |  8 +---
>  gcc/sched-vis.c   |  2 +-
>  gcc/varasm.c  | 11 ++-
>  24 files changed, 75 insertions(+), 67 deletions(-)
>
> diff --git a/gcc/alias.c b/gcc/alias.c
> index 602e9e0..a098cb7 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -1521,7 +1521,7 @@ rtx_equal_for_memref_p (const_rtx x, const_rtx y)
>return REGNO (x) == REGNO (y);
>
>  case LABEL_REF:
> -  return XEXP (x, 0) == XEXP (y, 0);
> +  return LABEL_REF_LABEL (x) == LABEL_REF_LABEL (y);
>
>  case SYMBOL_REF:
>return XSTR (x, 0) == XSTR (y, 0);
> diff --git a/gcc/cfgbuild.c b/gcc/cfgbuild.c
> index e5ac8d6..00dab3e 100644
> --- a/gcc/cfgbuild.c
> +++ b/gcc/cfgbuild.c
> @@ -277,7 +277,7 @@ make_edges (basic_block min, basic_block max, int 
> update_p)
>   && GET_CODE (SET_SRC (tmp)) == IF_THEN_ELSE
>   && GET_CODE (XEXP (SET_SRC (tmp), 2)) == LABEL_REF)
> make_label_edge (edge_cache, bb,
> -XEXP (XEXP (SET_SRC (tmp), 2), 0), 0);
> +LABEL_REF_LABEL (XEXP (SET_SRC (tmp), 2)), 
> 0);
> }
>
>   /* If this is a computed jump, then mark it as reaching
> @@ -415,7 +

Re: [PATCHv2] Vimrc config with GNU formatting

2014-09-11 Thread Richard Biener
On Wed, Sep 10, 2014 at 10:09 AM, Yury Gribov  wrote:
> Hi all,
>
> This is a second version of patch which adds a Vim config (.local.vimrc)
> to root folder to allow automatic setup of GNU formatting for C/C++/Java/Lex
> GCC files.
>
> I've updated the code with comments from Richard and Bernhard (which fixed
> formatting
> of lonely closing bracket).
>
> The patch caused a lively debate with Segher who wanted .local.vimrc to not
> be enabled
> by default. We basically have two options:
> 1) put .local.vimrc to root (just like .dir-locals.el config for Emacs)
> 2) put both .local.vimrc and .dir-locals.el to contrib and add Makefile
> targets
> to create symlinks in root folder per user's request
> I personally prefer 2) because this would IMHO improve the quality of
> patches
> (e.g. no more silly tab-whitespace formatting bugs).
>
> Thoughts? Ok to commit?

It doesn't handle indenting switch/case correctly.  I get

 switch (x)
   {
   case X:
  {
 int foo;
...

that is, the { after the case label is wrongly indented.  The same happens
for
  {
   {
   }
  }

we seem to get two soft-tabs here.
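
For reference, the layout that GNU style calls for in these two cases looks
roughly like the following.  The function and its names are made up for
illustration; only the indentation matters.  Spaces are used throughout here,
whereas the GCC sources replace every eight columns of leading whitespace
with a tab.

```c
/* Hypothetical function, shown only for its layout.  */
int
classify (int x)
{
  int result = 0;

  switch (x)
    {
    case 1:
      {
        /* A block after a case label: brace two columns in from the
           label, contents two more columns in from the brace.  */
        int foo = 10;
        result = foo;
        break;
      }
    default:
      result = -1;
      break;
    }

  /* Bare nested blocks: each level steps in by two columns.  */
  {
    {
      result += 1;
    }
  }

  return result;
}
```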

Richard.

> -Y
>
>
>


Fix some tests

2014-09-11 Thread Bernd Schmidt
These are copy&paste errors (except the l3 instead of 13 which is just 
odd) and they show up on nvptx as type mismatches in the assembly 
because of the implicit declaration. Two of the tests had ^M characters 
which I also removed.


Tested on x86_64-linux and committed as obvious.


Bernd
commit 93065446e241d0ac36fd0cfb22471022aca307c0
Author: Bernd Schmidt 
Date:   Wed Sep 10 16:30:56 2014 +0200

Fix declarations in some tests.

	* gcc.dg/compat/struct-by-value-13_main.c (struct_by_value_13_x):
	Fix declaration.
	* gcc.dg/compat/struct-by-value-16a_main.c (struct_by_value_16a_x):
	Fix declaration.
	* gcc.dg/compat/struct-by-value-17a_main.c (struct_by_value_17a_x):
	Fix declaration.
	* gcc.dg/compat/struct-by-value-18a_main.c (struct_by_value_18a_x):
	Fix declaration.

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 1432e77..d83fee5 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,14 @@
+2014-09-11  Bernd Schmidt  
+
+* gcc.dg/compat/struct-by-value-13_main.c (struct_by_value_13_x):
+Fix declaration.
+* gcc.dg/compat/struct-by-value-16a_main.c (struct_by_value_16a_x):
+Fix declaration.
+* gcc.dg/compat/struct-by-value-17a_main.c (struct_by_value_17a_x):
+Fix declaration.
+* gcc.dg/compat/struct-by-value-18a_main.c (struct_by_value_18a_x):
+Fix declaration.
+
 2014-09-10  Jan Hubicka  
 
 	PR tree-optimization/63186
diff --git a/gcc/testsuite/gcc.dg/compat/struct-by-value-13_main.c b/gcc/testsuite/gcc.dg/compat/struct-by-value-13_main.c
index b853bb8..41f4927 100644
--- a/gcc/testsuite/gcc.dg/compat/struct-by-value-13_main.c
+++ b/gcc/testsuite/gcc.dg/compat/struct-by-value-13_main.c
@@ -2,7 +2,7 @@
variable-length argument lists.  All struct members are type
_Complex int.  */
 
-extern void struct_by_value_l3_x (void);
+extern void struct_by_value_13_x (void);
 extern void exit (int);
 int fails;
 
diff --git a/gcc/testsuite/gcc.dg/compat/struct-by-value-16a_main.c b/gcc/testsuite/gcc.dg/compat/struct-by-value-16a_main.c
index 6a71d15..1520e94 100644
--- a/gcc/testsuite/gcc.dg/compat/struct-by-value-16a_main.c
+++ b/gcc/testsuite/gcc.dg/compat/struct-by-value-16a_main.c
@@ -1,14 +1,14 @@
-/* Test structures passed by value, including to a function with a
-   variable-length argument lists.  All struct members are of type
-   _Complex float.  */
-
-extern void struct_by_value_16_x (void);
-extern void exit (int);
-int fails;
-
-int
-main ()
-{
-  struct_by_value_16a_x ();
-  exit (0);
-}
+/* Test structures passed by value, including to a function with a
+   variable-length argument lists.  All struct members are of type
+   _Complex float.  */
+
+extern void struct_by_value_16a_x (void);
+extern void exit (int);
+int fails;
+
+int
+main ()
+{
+  struct_by_value_16a_x ();
+  exit (0);
+}
diff --git a/gcc/testsuite/gcc.dg/compat/struct-by-value-17a_main.c b/gcc/testsuite/gcc.dg/compat/struct-by-value-17a_main.c
index 1db0021..f5baefc 100644
--- a/gcc/testsuite/gcc.dg/compat/struct-by-value-17a_main.c
+++ b/gcc/testsuite/gcc.dg/compat/struct-by-value-17a_main.c
@@ -2,7 +2,7 @@
variable-length argument lists.  All struct members are of type
_Complex double.  */
 
-extern void struct_by_value_17_x (void);
+extern void struct_by_value_17a_x (void);
 extern void exit (int);
 int fails;
 
diff --git a/gcc/testsuite/gcc.dg/compat/struct-by-value-18a_main.c b/gcc/testsuite/gcc.dg/compat/struct-by-value-18a_main.c
index 5b9dfd9..ce7edfb 100644
--- a/gcc/testsuite/gcc.dg/compat/struct-by-value-18a_main.c
+++ b/gcc/testsuite/gcc.dg/compat/struct-by-value-18a_main.c
@@ -1,14 +1,14 @@
-/* Test structures passed by value, including to a function with a
-   variable-length argument lists.  All struct members are of type
-   _Complex long double.  */
-
-extern void struct_by_value_18_x (void);
-extern void exit (int);
-int fails;
-
-int
-main ()
-{
-  struct_by_value_18a_x ();
-  exit (0);
-}
+/* Test structures passed by value, including to a function with a
+   variable-length argument lists.  All struct members are of type
+   _Complex long double.  */
+
+extern void struct_by_value_18a_x (void);
+extern void exit (int);
+int fails;
+
+int
+main ()
+{
+  struct_by_value_18a_x ();
+  exit (0);
+}


Re: [PATCHv2] Vimrc config with GNU formatting

2014-09-11 Thread pinskia


> On Sep 10, 2014, at 9:47 PM, Yury Gribov  wrote:
> 
> Segher Boessenkool writes:
>> I am saying it is very anti-social to make
>> people's editor behave differently from what they are used to.
>> ...
>> The Emacs dir-locals file simply
>> configures some settings for files with certain major modes in that dir.
>> For example, ours says that c-mode files should use GNU style.  This is
>> quite harmless, and probably what most Emacs users want.
> 
> Hm, so autoformatting in Emacs is good because that's what most users 
> want but autoformatting in Vim is bad because that's not what people are 
> used to?

I don't like auto formatting in any editor. I don't use emacs, though; I use 
vim. I think using auto formatting is cheating and shows a lack of 
understanding of why coding styles exist.  And some folks already have to deal 
with two more formatting styles: Linux kernel and GNU. So if you add auto 
formatting to one, some folks are going to get confused.


Thanks,
Andrew

> 
>> First, you are encouraging
>> the use of a plugin that is a gaping wide security hole.
> 
> I don't think so. The comment mentions that the user can either install a 
> (rather widespread btw) plugin or just call the config from their .vimrc.
> 
>> Secondly, this is a very poor imitation of the mechanism Vim has for dealing
>> with filetypes, namely, ftplugins.
> 
> I'm ready to accept technical suggestions on how to do the thing 
> properly. So what exactly do you propose?
> 
>> [Snipped some overly optimistic stuff about this all increasing the quality
>> of posted patches.  Hint: the most frequently made formatting error is
>> forgetting to put two spaces at the end of a sentence.
> 
> Dunno, I was relying on personal experience. And searching for "two|2 
> spaces" on http://gcc.gnu.org/ml/gcc-patches returns 2000 results 
> whereas "eight|8 spaces" only 700.
> 
> -Y
> 


Re: [PATCHv2] Vimrc config with GNU formatting

2014-09-11 Thread Richard Biener
On Thu, Sep 11, 2014 at 11:06 AM, Richard Biener
 wrote:
> On Wed, Sep 10, 2014 at 10:09 AM, Yury Gribov  wrote:
>> Hi all,
>>
>> This is a second version of patch which adds a Vim config (.local.vimrc)
>> to root folder to allow automatic setup of GNU formatting for C/C++/Java/Lex
>> GCC files.
>>
>> I've updated the code with comments from Richard and Bernhard (which fixed
>> formatting
>> of lonely closing bracket).
>>
>> The patch caused a lively debate with Segher who wanted .local.vimrc to not
>> be enabled
>> by default. We basically have two options:
>> 1) put .local.vimrc to root (just like .dir-locals.el config for Emacs)
>> 2) put both .local.vimrc and .dir-locals.el to contrib and add Makefile
>> targets
>> to create symlinks in root folder per user's request
>> I personally prefer 2) because this would IMHO improve the quality of
>> patches
>> (e.g. no more silly tab-whitespace formatting bugs).
>>
>> Thoughts? Ok to commit?
>
> It doesn't handle indenting switch/case correctly.  I get
>
>  switch (x)
>{
>case X:
>   {
>  int foo;
> ...
>
> that is, the { after the case label is wrongly indented.  The same happens
> for
>   {
>{
>}
>   }
>
> we seem to get two soft-tabs here.


setlocal cinoptions=>s,n-s,{s,:s,=s,g0,hs,p5,t0,+s,(0,u0,w1,m0

does better but still oddly handles

  switch (x)
{
case X:
{
tree x;

thus indents a brace two spaces too much (but the stmts are
correctly indented).  The following is handled fine:

  switch (x)
{
case X:
   foo ();

Richard.

> Richard.
>
>> -Y
>>
>>
>>


Re: [PATCHv2] Vimrc config with GNU formatting

2014-09-11 Thread Yury Gribov

On 09/11/2014 01:14 PM, pins...@gmail.com wrote:

I don't like auto formatting in any editor.


Ok, so +1 for moving to contrib/.


And some folks already have to deal with two more formatting styles:
Linux kernel and GNU. So if you add auto formatting to one,
some folks are going to get confused.


This config will only turn on autoformatting for GCC sources.
It's not going to influence other directories.

-Y


Re: [PATCHv2] Vimrc config with GNU formatting

2014-09-11 Thread Yury Gribov

On 09/11/2014 01:18 PM, Richard Biener wrote:
On Thu, Sep 11, 2014 at 11:06 AM, Richard Biener
  wrote:

>On Wed, Sep 10, 2014 at 10:09 AM, Yury Gribov  wrote:

>>Hi all,
>>
>>This is a second version of patch which adds a Vim config (.local.vimrc)
>>to root folder to allow automatic setup of GNU formatting for C/C++/Java/Lex
>>GCC files.
>>
>>I've updated the code with comments from Richard and Bernhard (which fixed
>>formatting
>>of lonely closing bracket).
>>
>>The patch caused a lively debate with Segher who wanted .local.vimrc to not
>>be enabled
>>by default. We basically have two options:
>>1) put .local.vimrc to root (just like .dir-locals.el config for Emacs)
>>2) put both .local.vimrc and .dir-locals.el to contrib and add Makefile
>>targets
>>to create symlinks in root folder per user's request
>>I personally prefer 2) because this would IMHO improve the quality of
>>patches
>>(e.g. no more silly tab-whitespace formatting bugs).
>>
>>Thoughts? Ok to commit?

>
>It doesn't handle indenting switch/case correctly.  I get
>
>  switch (x)
>{
>case X:
>   {
>  int foo;
>...
>
>that is, the { after the case label is wrongly indented.  The same happens
>for
>   {
>{
>}
>   }
>
>we seem to get two soft-tabs here.

setlocal cinoptions=>s,n-s,{s,:s,=s,g0,hs,p5,t0,+s,(0,u0,w1,m0

does better but still oddly handles


Also fails for

  if (1)
{
x = 2;
}



ptx preliminary address space fixes [1/4]

2014-09-11 Thread Bernd Schmidt
I'm getting ready to submit the ptx port and the various changes that 
are necessary for it. To start with, here are some patches to deal with 
address space issues.


ptx has the concept of an implicit address space: everything (objects on 
the stack as well as constant data and global variables) lives in its 
own address space. These are applied by a lower-as pass which I'll 
submit later.


Since address spaces are more pervasive than on other targets and can 
show up in places the compiler doesn't yet expect them (such as local 
variables), this uncovers a few bugs in the optimizers. These are 
typically of the kind where we recreate a memory reference and aren't 
quite careful enough to preserve the existing address space.


The first patch below just introduces a utility function that the 
following patches will use.  All were bootstrapped and tested together 
on x86_64-linux. Ok?
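
To make the bug pattern concrete without the GCC internals, here is a toy
model in plain C.  All names (toy_type, toy_apply_as, toy_rebuild_ref) and the
bit layout of the qualifier word are assumptions made for this sketch; the
real helper builds a new qualified type, whereas the toy version edits a
struct in place.

```c
#include <assert.h>
#include <stddef.h>

/* Toy qualifier bits; the layout is an assumption for this sketch.  */
#define TOY_QUAL_CONST     0x1
#define TOY_QUAL_VOLATILE  0x2
#define TOY_AS_SHIFT       8
#define TOY_AS_MASK        (0xFF << TOY_AS_SHIFT)
#define TOY_AS_GENERIC     0

struct toy_type
{
  int quals;                  /* const/volatile bits plus address space */
  struct toy_type *elem;      /* element type for arrays, else NULL */
};

/* Extract the address space number stored in T's qualifiers.  */
static int
toy_type_as (const struct toy_type *t)
{
  return (t->quals & TOY_AS_MASK) >> TOY_AS_SHIFT;
}

/* Give T (and, for arrays, its element types) address space AS while
   keeping the other qualifiers.  Passing TOY_AS_GENERIC removes an
   existing address space.  */
static void
toy_apply_as (struct toy_type *t, int as)
{
  int quals = t->quals & ~TOY_AS_MASK;
  if (as != TOY_AS_GENERIC)
    quals |= (as & 0xFF) << TOY_AS_SHIFT;
  t->quals = quals;
  if (t->elem != NULL)
    toy_apply_as (t->elem, as);
}

/* The bug pattern: a reference is rebuilt from a fresh type, and the
   address space of the original is lost unless it is copied over.  */
static struct toy_type
toy_rebuild_ref (const struct toy_type *orig, int preserve_as)
{
  struct toy_type fresh = { orig->quals & ~TOY_AS_MASK, NULL };
  if (preserve_as)
    toy_apply_as (&fresh, toy_type_as (orig));
  return fresh;
}
```

With preserve_as set to 0 the rebuilt reference silently ends up in the
generic address space, which is the kind of mistake the following patches
address in the optimizers.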



Bernd
commit 9a63fbecf0ccf9dd9cf18073958e4cfccf6ecaf2
Author: Bernd Schmidt 
Date:   Wed Sep 10 16:32:27 2014 +0200

	* tree.c (apply_as_to_type): New function.
	* tree.h (apply_as_to_type): Declare.

diff --git a/gcc/tree.c b/gcc/tree.c
index d1d67ef..a7438b2 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -6156,6 +6156,21 @@ handle_dll_attribute (tree * pnode, tree name, tree args, int flags,
 
 #endif /* TARGET_DLLIMPORT_DECL_ATTRIBUTES  */
 
+/* Build a type like TYPE, but with address space AS (which can be
+   ADDR_SPACE_GENERIC to remove an existing address space), and return it.  */
+
+tree
+apply_as_to_type (tree type, addr_space_t as)
+{
+  int quals = TYPE_QUALS_NO_ADDR_SPACE (type);
+  if (!ADDR_SPACE_GENERIC_P (as))
+quals |= ENCODE_QUAL_ADDR_SPACE (as);
+  type = build_qualified_type (type, quals);
+  if (TREE_CODE (type) == ARRAY_TYPE)
+TREE_TYPE (type) = apply_as_to_type (TREE_TYPE (type), as);
+  return type;
+}
+
 /* Set the type qualifiers for TYPE to TYPE_QUALS, which is a bitmask
of the various TYPE_QUAL values.  */
 
diff --git a/gcc/tree.h b/gcc/tree.h
index e000e4e..8e1aa6b 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -3845,6 +3845,8 @@ extern tree build_qualified_type (tree, int);
 
 extern tree build_aligned_type (tree, unsigned int);
 
+extern tree apply_as_to_type (tree, addr_space_t);
+
 /* Like build_qualified_type, but only deals with the `const' and
`volatile' qualifiers.  This interface is retained for backwards
compatibility with the various front-ends; new code should use


ptx preliminary address space fixes [2/4]

2014-09-11 Thread Bernd Schmidt
This is a bug in SRA which replaces a memory reference without taking 
care to use the correct address space.


Bootstrapped and tested together with the other patches on x86_64-linux. 
 Ok?



Bernd
commit 6b9be6e3081c313c024aeabe2d70bc0f8146b429
Author: Bernd Schmidt 
Date:   Wed Sep 10 16:32:56 2014 +0200

	* tree-sra.c (build_ref_for_offset): Use existing address space
	when replacing a memory reference.

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 8259dba..c69cc90 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -1561,6 +1561,7 @@ build_ref_for_offset (location_t loc, tree base, HOST_WIDE_INT offset,
   if (align < TYPE_ALIGN (exp_type))
 exp_type = build_aligned_type (exp_type, align);
 
+  exp_type = apply_as_to_type (exp_type, TYPE_ADDR_SPACE (TREE_TYPE (prev_base)));
   mem_ref = fold_build2_loc (loc, MEM_REF, exp_type, base, off);
   if (TREE_THIS_VOLATILE (prev_base))
 TREE_THIS_VOLATILE (mem_ref) = 1;


ptx preliminary address space fixes [3/4]

2014-09-11 Thread Bernd Schmidt
The vectorizer can also replace a memory reference without ensuring it 
uses the correct address space.


Bootstrapped and tested together with the other patches on x86_64-linux. 
 Ok?



Bernd
commit e85dbde1aa3396b5e202aa736f96b232a6e11e86
Author: Bernd Schmidt 
Date:   Wed Sep 10 16:33:40 2014 +0200

	* tree-vect-stmts.c (vectorizable_store, vectorizable_load): Apply
	address spaces to the type of the MEM_REF as needed.

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 26eb2d4..5fca144 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -5026,6 +5026,9 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
   && TREE_CODE (scalar_dest) != MEM_REF)
 return false;
 
+  tree dest_type = TREE_TYPE (scalar_dest);
+  addr_space_t as = TYPE_ADDR_SPACE (dest_type);
+
   gcc_assert (gimple_assign_single_p (stmt));
   op = gimple_assign_rhs1 (stmt);
   if (!vect_is_simple_use (op, stmt, loop_vinfo, bb_vinfo, &def_stmt,
@@ -5038,6 +5041,11 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
 }
 
   elem_type = TREE_TYPE (vectype);
+  if (!ADDR_SPACE_GENERIC_P (as))
+{
+  elem_type = apply_as_to_type (elem_type, as);
+  vectype = apply_as_to_type (vectype, as);
+}
   vec_mode = TYPE_MODE (vectype);
 
   /* FORNOW. In some cases can vectorize even if data-type not supported
@@ -5379,7 +5387,7 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
 		   vect_permute_store_chain().  */
 		vec_oprnd = result_chain[i];
 
-	  data_ref = build2 (MEM_REF, TREE_TYPE (vec_oprnd), dataref_ptr,
+	  data_ref = build2 (MEM_REF, vectype, dataref_ptr,
  dataref_offset
  ? dataref_offset
  : build_int_cst (reference_alias_ptr_type
@@ -5692,8 +5700,17 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
   if (!STMT_VINFO_DATA_REF (stmt_info))
 return false;
 
+  tree rhs = gimple_assign_rhs1 (stmt);
+  tree rhstype = TREE_TYPE (rhs);
+  addr_space_t as = TYPE_ADDR_SPACE (rhstype);
+
   elem_type = TREE_TYPE (vectype);
   mode = TYPE_MODE (vectype);
+  if (!ADDR_SPACE_GENERIC_P (as))
+{
+  elem_type = apply_as_to_type (elem_type, as);
+  vectype = apply_as_to_type (vectype, as);
+}
 
   /* FORNOW. In some cases can vectorize even if data-type not supported
 (e.g. - data copies).  */


ptx preliminary address space fixes [4/4]

2014-09-11 Thread Bernd Schmidt
This one isn't a wrong-code issue, just a missed optimization.  The 
strlen optimizations need to be made to look through 
ADDR_SPACE_CONVERT_EXPR to work on ptx.


Bootstrapped and tested together with the other patches on x86_64-linux. 
 Ok?



Bernd
commit 35d765aeba4ea7d0ba829b2b502c8c7af0c24728
Author: Bernd Schmidt 
Date:   Wed Sep 10 16:34:04 2014 +0200

	* tree-ssa-strlen.c (strlen_optimize_stmt): Look through
	ADDR_SPACE_CONVERT_EXPR.

diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index bb42cc7..f72ecf5 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -1930,6 +1930,7 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi)
   if (TREE_CODE (lhs) == SSA_NAME && POINTER_TYPE_P (TREE_TYPE (lhs)))
 	{
 	  if (gimple_assign_single_p (stmt)
+	  || gimple_assign_rhs_code (stmt) == ADDR_SPACE_CONVERT_EXPR
 	  || (gimple_assign_cast_p (stmt)
 		  && POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt)
 	{


Fix some more decl types in the Fortran frontend

2014-09-11 Thread Bernd Schmidt
This shows up as failures of gfortran.dg/bessel_[67].f90 on ptx. 
According to the man page, the correct declarations are


double jn(int n, double x);
double yn(int n, double x);

and our calls match this, but the argument types are switched in the 
decls built by the Fortran frontend. On ptx, such errors are diagnosed 
by the assembler. Fixed by the following patch, bootstrapped and tested 
on x86_64-linux. Ok?


I'm also seeing some new Fortran testsuite failures after the recent 
merge to gomp-4_0-branch, due to a similar issue. That one has been 
harder to figure out; one of the affected tests is 
gfortran.dg/array_assignment_5.f90.  Here, we call 
_gfortran_spread_char_scalar, which is declared in the library as a 
function taking 6 args, and we pass 6 args to it. However, the decl for 
it only has 5 arguments. It's unclear to me where the problem is in the 
construction of the argument list for the decl. I'm also unsure why this 
has shown up only very recently.



Bernd
commit 87c261fd190c9dea09a793b1295e7cff9bb6044e
Author: Bernd Schmidt 
Date:   Wed Sep 10 18:02:53 2014 +0200

Fix type mismatches in intrinsic functions.

	* f95-lang.c (build_builtin_fntypes): Switch order of args for type
	with index 2.

diff --git a/gcc/fortran/f95-lang.c b/gcc/fortran/f95-lang.c
index da3a0d0..27cfc87 100644
--- a/gcc/fortran/f95-lang.c
+++ b/gcc/fortran/f95-lang.c
@@ -608,9 +608,9 @@ build_builtin_fntypes (tree *fntype, tree type)
   fntype[0] = build_function_type_list (type, type, NULL_TREE);
   /* type (*) (type, type) */
   fntype[1] = build_function_type_list (type, type, type, NULL_TREE);
-  /* type (*) (type, int) */
-  fntype[2] = build_function_type_list (type,
-type, integer_type_node, NULL_TREE);
+  /* type (*) (int, type) */
+  fntype[2] = build_function_type_list (type, integer_type_node, type,
+NULL_TREE);
   /* type (*) (void) */
   fntype[3] = build_function_type_list (type, NULL_TREE);
   /* type (*) (type, &int) */


Re: Fix some more decl types in the Fortran frontend

2014-09-11 Thread FX
Changing fntype[2] looks wrong to me, as it is also used for powi(double, 
int), where the argument order matches the current version:

>   gfc_define_builtin ("__builtin_powi", mfunc_double[2],
>   BUILT_IN_POWI, "powi", ATTR_CONST_NOTHROW_LEAF_LIST);

(I don’t see any other use of this, but I might be missing something.)

It looks like fntype[5] is actually what you need, and it’s already 
constructed! However, there is even more mystery here, because it is currently 
used for __builtin_scalbn, which doesn’t seem right: 
http://pubs.opengroup.org/onlinepubs/009695399/functions/scalbln.html

So I suspect looking a bit more in depth is required! Also, testcases that 
exercise this fndecl matching (which you would see fail on ptx) would be a 
great addition to the testsuite, once you commit (for powi & scalbn, which do 
not appear to be covered right now, otherwise you would have seen regressions).
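
To illustrate why one function type cannot serve both argument orders, here
is a small stand-alone C sketch.  The typedefs, toy_powi, toy_order_fn and
the table are hypothetical stand-ins (so the example does not depend on
libm); they only mirror the two signature shapes discussed above.

```c
#include <assert.h>

/* Two distinct signatures, mirroring the shapes discussed above.  */
typedef double (*real_int_fn) (double, int);   /* like powi (double, int) */
typedef double (*int_real_fn) (int, double);   /* like jn (int, double) */

/* Hypothetical stand-in with the (double, int) argument order.  */
static double
toy_powi (double base, int power)
{
  double r = 1.0;
  for (int i = 0; i < power; i++)
    r *= base;
  return r;
}

/* Hypothetical stand-in with the (int, double) argument order.  */
static double
toy_order_fn (int n, double x)
{
  return n * x;
}

/* A table with one slot per signature: each function has to be
   registered in the slot whose argument order matches it.  */
struct toy_fntypes
{
  real_int_fn real_int;
  int_real_fn int_real;
};

static const struct toy_fntypes toy_table = { toy_powi, toy_order_fn };
```

Swapping the two initializers would be a type mismatch that a C compiler
diagnoses, which is the same class of error the ptx assembler reports for the
mismatched decls.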

Cheers,
FX

[patch] libstdc++/63219 remove stray template parameter

2014-09-11 Thread Jonathan Wakely

Fix a mistake in the declaration.

The trunk patch also fixes a few warnings that show up with
-Wsystem-headers.

Tested x86_64-linux, committed to trunk and the 4.9 branch.
commit 4f02e789f77c1d72a75fad0206dbde7728f41c23
Author: Jonathan Wakely 
Date:   Thu Sep 11 10:38:49 2014 +0100

PR libstdc++/63219
* include/bits/regex.h (match_results::format): Remove stray template
parameter.
* include/bits/regex_compiler.h (_RegexTranslator::_RegexTranslator):
Remove parameter name to avoid -Wunused-parameter warning.
* include/bits/regex_executor.h (_State_info::_State_info): Reorder
mem-initializers to avoid -Wreorder warning.
* include/bits/regex_executor.tcc (_Executor::_M_word_boundary):
Remove parameter name to avoid -Wunused-parameter warning.
* include/bits/regex_scanner.tcc (_Scanner::_M_advance): Add braces
to avoid -Wempty-body warning when not in debug mode.

diff --git a/libstdc++-v3/include/bits/regex.h 
b/libstdc++-v3/include/bits/regex.h
index e556350..9dc83fd 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -1814,7 +1814,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /**
* @pre   ready() == true
*/
-  template
+  template
basic_string
format(const basic_string& __fmt,
   match_flag_type __flags = regex_constants::format_default) const
diff --git a/libstdc++-v3/include/bits/regex_compiler.h 
b/libstdc++-v3/include/bits/regex_compiler.h
index ca116de..1193a5a 100644
--- a/libstdc++-v3/include/bits/regex_compiler.h
+++ b/libstdc++-v3/include/bits/regex_compiler.h
@@ -212,7 +212,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef _CharT   _StrTransT;
 
   explicit
-  _RegexTranslator(const _TraitsT& __traits)
+  _RegexTranslator(const _TraitsT&)
   { }
 
   _CharT
diff --git a/libstdc++-v3/include/bits/regex_executor.h 
b/libstdc++-v3/include/bits/regex_executor.h
index 40d3443..130bc74 100644
--- a/libstdc++-v3/include/bits/regex_executor.h
+++ b/libstdc++-v3/include/bits/regex_executor.h
@@ -159,7 +159,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
  explicit
  _State_info(_StateIdT __start, size_t __n)
- : _M_start(__start), _M_visited_states(new bool[__n]())
+ : _M_visited_states(new bool[__n]()), _M_start(__start)
  { }
 
  bool _M_visited(_StateIdT __i)
diff --git a/libstdc++-v3/include/bits/regex_executor.tcc 
b/libstdc++-v3/include/bits/regex_executor.tcc
index 3c68668..3ca7de3 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -407,7 +407,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 bool _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
-_M_word_boundary(_State<_TraitsT> __state) const
+_M_word_boundary(_State<_TraitsT>) const
 {
   // By definition.
   bool __ans = false;
diff --git a/libstdc++-v3/include/bits/regex_scanner.tcc 
b/libstdc++-v3/include/bits/regex_scanner.tcc
index 818e47b..1dc2fd9 100644
--- a/libstdc++-v3/include/bits/regex_scanner.tcc
+++ b/libstdc++-v3/include/bits/regex_scanner.tcc
@@ -83,7 +83,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   else if (_M_state == _S_state_in_brace)
_M_scan_in_brace();
   else
-   _GLIBCXX_DEBUG_ASSERT(false);
+   {
+ _GLIBCXX_DEBUG_ASSERT(false);
+   }
 }
 
   // Differences between styles:
commit 12c8deb05de9f5f56ed0a867a90a738200c01ddc
Author: Jonathan Wakely 
Date:   Thu Sep 11 11:04:13 2014 +0100

PR libstdc++/63219
* include/bits/regex.h (match_results::format): Remove stray template
parameter.

diff --git a/libstdc++-v3/include/bits/regex.h 
b/libstdc++-v3/include/bits/regex.h
index e556350..9dc83fd 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -1814,7 +1814,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /**
* @pre   ready() == true
*/
-  template
+  template
basic_string
format(const basic_string& __fmt,
   match_flag_type __flags = regex_constants::format_default) const


[PATCH][match-and-simplify] Fix PR41043

2014-09-11 Thread Richard Biener

So I managed to re-introduce PR41043 by implementing the
fold_unary (T1)(X * Y) -> (T1)X * (T1)Y pattern on GIMPLE.
That is because it re-applies recursively on a large
multiplication tree and because if there are multiple uses
of SSA names doing that will expand (aka un-CSE) the
multiplication tree.  Oops.  The following restricts it
to at most one SSA name or two single-use SSA names.
In the end this kind of demotion should be done by a
pass, not by expression simplification (it's not really
a simplification after all).
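
As a reminder of why the transform is valid at all for wrapping types, here
is a small stand-alone C sketch using unsigned types.  The function names are
made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* (T1)(X * Y): multiply in the wide type, then narrow.  */
static uint8_t
narrow_after_mult (uint32_t x, uint32_t y)
{
  return (uint8_t) (x * y);
}

/* (T1)X * (T1)Y: narrow both operands first, multiply, and narrow the
   (int-promoted) product again.  */
static uint8_t
mult_after_narrow (uint32_t x, uint32_t y)
{
  return (uint8_t) ((uint8_t) x * (uint8_t) y);
}
```

Both functions agree because the low eight bits of a product depend only on
the low eight bits of the operands.  The cost issue described above is
separate from correctness: if the wide product has other uses, it stays
around and the narrow multiplication is added on top of it.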

Applied.

Richard.

2014-09-11  Richard Biener  

PR middle-end/41043
* match-conversions.pd ((T1)(X * Y) -> (T1)X * (T1)Y): Restrict
to a single or single-use SSA names.

Index: gcc/match-conversions.pd
===
--- gcc/match-conversions.pd(revision 215110)
+++ gcc/match-conversions.pd(working copy)
@@ -56,7 +56,15 @@
  (convert (mult @0 @1))
  (if (INTEGRAL_TYPE_P (type)
   && INTEGRAL_TYPE_P (TREE_TYPE (@0))
-  && TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (@0)))
+  && TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (@0))
+  /* ???  These kind of patterns are a bad idea - see PR41043.  We
+create a lot of redundant statements if operands are used multiple
+times.  Maybe we want a flag for this.  But eventually these
+kind of transforms should be done in a pass.  */
+  && (GENERIC
+  || TREE_CODE (@0) != SSA_NAME || TREE_CODE (@1) != SSA_NAME
+ || ((TREE_CODE (@0) != SSA_NAME || has_single_use (@0))
+  && (TREE_CODE (@1) != SSA_NAME || has_single_use (@1)
   (if (TYPE_OVERFLOW_WRAPS (type))
(mult (convert @0) (convert @1)))
   (with { tree utype = unsigned_type_for (type); }


[PATCH][match-and-simplify] Complete associate_* patterns

2014-09-11 Thread Richard Biener

The following patch completes transitioning 
tree-ssa-forwprop.c:associate_* to patterns.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2014-09-11  Richard Biener  

* match-plusminus.pd: Complete associate_plusminus patterns.
Properly guard all patterns.  Add associate_pointerplus_align
pattern.

Index: gcc/match-plusminus.pd
===
--- gcc/match-plusminus.pd  (revision 215009)
+++ gcc/match-plusminus.pd  (working copy)
@@ -17,101 +17,138 @@ You should have received a copy of the G
 along with GCC; see the file COPYING3.  If not see
 .  */
 
-/* ???  Have simplify groups guarded with common
-   predicates on the outermost type?  */
 
-/* Contract negates.  */
-(simplify
-  (plus:c @0 (negate @1))
-  (if (!TYPE_SATURATING (type))
-   (minus @0 @1)))
-(simplify
-  (minus @0 (negate @1))
-  (if (!TYPE_SATURATING (type))
-   (plus @0 @1)))
+/* From tree-ssa-forwprop.c:associate_plusminus.  */
 
+/* We can't reassociate at all for saturating types.  */
+(if (!TYPE_SATURATING (type))
 
-/* Match patterns that allow contracting a plus-minus pair
-   irrespective of overflow issues.
-   ???  !TYPE_SATURATING condition missing.
-   ???  !FLOAT_TYPE_P && !FIXED_POINT_TYPE_P condition missing
-   because of saturation to +-Inf.  */
+ /* Contract negates.  */
+ (simplify
+  (plus:c @0 (negate @1))
+  (minus @0 @1))
+ (simplify
+  (minus @0 (negate @1))
+  (plus @0 @1))
 
-(if (!TYPE_SATURATING (type)
-&& !FLOAT_TYPE_P (type) && !FIXED_POINT_TYPE_P (type))
+ /* We can't reassociate floating-point or fixed-point plus or minus
+because of saturation to +-Inf.  */
+ (if (!FLOAT_TYPE_P (type) && !FIXED_POINT_TYPE_P (type))
+
+  /* Match patterns that allow contracting a plus-minus pair
+ irrespective of overflow issues.  */
+  /* (A +- B) - A   ->  +- B */
+  /* (A +- B) -+ B  ->  A */
+  /* A - (A +- B)   -> -+ B */
+  /* A +- (B -+ A)  ->  +- B */
   (simplify
-(minus (plus @0 @1) @0)
+(minus (plus:c @0 @1) @0)
 @1)
-
   (simplify
 (minus (minus @0 @1) @0)
 (negate @1))
-
   (simplify
-(minus (plus @0 @1) @1)
+(plus:c (minus @0 @1) @1)
 @0)
-
   (simplify
-(plus:c (minus @0 @1) @1)
-@0))
-
-/* (CST +- A) +- CST -> CST' +- A.  */
-/* simplify handles constant folding for us so we can
-   implement these as re-association patterns.
-   Watch out for operand order and constant canonicalization
-   we do!  A - CST -> A + -CST, CST + A -> A + CST.  */
-(simplify
-  (plus (plus @0 INTEGER_CST@1) INTEGER_CST@2)
-  /* If the constant operation overflows we cannot do the transform
- as we would introduce undefined overflow, for example
- with (a - 1) + INT_MIN.  */
-  (if (!TREE_OVERFLOW (@1 = int_const_binop (PLUS_EXPR, @1, @2)))
-   (plus @0 @1)))
-(simplify
-  (plus (minus INTEGER_CST@0 @1) INTEGER_CST@2)
-  (minus (plus @0 @2) @1))
-/* TODO:
-   (A +- CST) +- CST  ->  A +- CST
-   ~A + A ->  -1
-   ~A + 1 ->  -A
-   A - (A +- B)   ->  -+ B
-   A +- (B +- A)  ->  +- B
-   CST +- (CST +- A)  ->  CST +- A
-   CST +- (A +- CST)  ->  CST +- A
-   A + ~A ->  -1
-   (T)(P + A) - (T)P  -> (T)A
- */
+   (minus @0 (plus:c @0 @1))
+   (negate @1))
+  (simplify
+   (minus @0 (minus @0 @1))
+   @1)
 
-/* ~A + A -> -1 */
-(simplify
-  (plus:c (bit_not @0) @0)
-  { build_all_ones_cst (type); })
+  /* (A +- CST) +- CST -> A + CST  */
+  (for outer_op (plus minus)
+   (for inner_op (plus minus)
+(simplify
+ (outer_op (inner_op @0 INTEGER_CST@1) INTEGER_CST@2)
+ /* If the constant operation overflows we cannot do the transform
+   as we would introduce undefined overflow, for example
+   with (a - 1) + INT_MIN.  */
+ (with { tree cst = int_const_binop (outer_op == inner_op
+? PLUS_EXPR : MINUS_EXPR, @1, @2); }
+  (if (!TREE_OVERFLOW (cst))
+   (inner_op @0 { cst; } ))
+
+  /* (CST - A) +- CST -> CST - A  */
+  (for outer_op (plus minus)
+   (simplify
+(outer_op (minus INTEGER_CST@1 @0) INTEGER_CST@2)
+(with { tree cst = int_const_binop (outer_op, @1, @2); }
+ (if (!TREE_OVERFLOW (cst))
+  (minus { cst; } @0)
 
-/* ~A + 1 -> -A */
-(simplify
-  (plus (bit_not integral_op_p@0) integer_onep)
-  (negate @0)) 
+  /* ~A + A -> -1 */
+  (simplify
+   (plus:c (bit_not @0) @0)
+   { build_all_ones_cst (type); })
 
-/* A - (A +- B) -> -+ B */
-(simplify
-  (minus @0 (plus:c @0 @1))
-  (negate @1))
-(simplify
-  (minus @0 (minus @0 @1))
-  @1)
+  /* ~A + 1 -> -A */
+  (simplify
+   (plus (bit_not integral_op_p@0) integer_onep@1)
+   (if (TREE_CODE (TREE_TYPE (@1)) != COMPLEX_TYPE
+   || (TREE_CODE (@1) == COMPLEX_CST
+   && integer_onep (TREE_REALPART (@1))
+   && integer_onep (TREE_IMAGPART (@1
+(negate @0)))
+
+  /* (T)(P + A) - (T)P -> (T) A */
+  (for add (plus pointer_

Re: [PATCH] RE: gcc parallel make check

2014-09-11 Thread Jonathan Wakely
On 11 September 2014 07:22, VandeVondele  Joost wrote:
> Jakub,
>
>> First of all, the -j2 testing shows more tests tested in gcc and libstdc++:
>>
>>-# of expected passes   10133
>>+# of expected passes   10152
>>
>>+PASS: 23_containers/set/modifiers/erase/abi_tag.cc (test for excess errors)
>>[...]
>>
>>Not sure where the bug is, could be e.g. in i386.exp for gcc, but for
>>libstdc++ less likely to be there rather than in the split.
>
> I looked into this, and believe this problem is already in current trunk, and 
> not due to my patch. I.e. unmodified trunk also has these tests executed 
> several times:
>
> libstdc++-v3/testsuite/normal4/libstdc++.log.sep:PASS: 
> 23_containers/map/modifiers/erase/abi_tag.cc
> libstdc++-v3/testsuite/normal1/libstdc++.log.sep:PASS: 
> 23_containers/map/modifiers/erase/abi_tag.cc
>
>  I believe the current trunk pattern could indeed match those twice 
> (Makefile.in in trunk):
>   normal1) \
> dirs="`cd $$srcdir; echo [ab]* de* [ep]*/*`";; \
>   normal4) \
> dirs="`cd $$srcdir; echo 23_*/[a-km-tw-z]*`";; \
>
> could it be that the pattern in normal1 should have been '[ab]*/ de*/ 
> [ep]*/*' ?

Yes, we are running these tests multiple times:

PASS: 23_containers/map/modifiers/erase/abi_tag.cc (test for excess errors)
PASS: 23_containers/multimap/modifiers/erase/abi_tag.cc (test for excess errors)
PASS: 23_containers/multiset/modifiers/erase/abi_tag.cc (test for excess errors)
PASS: 23_containers/set/modifiers/erase/abi_tag.cc (test for excess errors)
PASS: 26_numerics/complex/abi_tag.cc (test for excess errors)

I'll fix that.


[PATCH 1/n] OpenMP 4.0 offloading infrastructure

2014-09-11 Thread Ilya Verbin
Hello,

I would like to start merging offloading-related patches from gomp-4_0-branch 
to trunk.
This is the first patch (from r202620), which adds a splay tree and memory 
mapping to libgomp.
I temporarily removed device 257 from it.  Bootstrapped and regtested on 
x86_64-linux.

2014-09-11  Jakub Jelinek  

* splay-tree.h: New file.
* target.c (splay_tree_node, splay_tree, splay_tree_key): New typedefs.
(struct target_mem_desc, struct splay_tree_key_s): New structures.
(splay_compare): New inline function.
(resolve_device): Use default_device_var ICV.
(dev_splay_tree, dev_env_lock): New variables.
(gomp_map_vars_existing, gomp_map_vars, gomp_unmap_tgt,
gomp_unmap_vars, gomp_update): New functions.
(GOMP_target): Arrange for host callback to be performed in a
separate initial thread and contention group, inheriting ICVs from
gomp_global_icv etc.
(GOMP_target_data): Use gomp_map_vars.
(GOMP_target_end_data): Use gomp_unmap_vars.
(GOMP_target_update): Use gomp_update.

Thanks,
  -- Ilya

---

diff --git a/libgomp/splay-tree.h b/libgomp/splay-tree.h
new file mode 100644
index 000..eb8011a
--- /dev/null
+++ b/libgomp/splay-tree.h
@@ -0,0 +1,232 @@
+/* A splay-tree datatype.
+   Copyright 1998-2014
+   Free Software Foundation, Inc.
+   Contributed by Mark Mitchell (m...@markmitchell.com).
+
+   This file is part of the GNU OpenMP Library (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* The splay tree code copied from include/splay-tree.h and adjusted,
+   so that all the data lives directly in splay_tree_node_s structure
+   and no extra allocations are needed.
+
+   Files including this header should before including it add:
+typedef struct splay_tree_node_s *splay_tree_node;
+typedef struct splay_tree_s *splay_tree;
+typedef struct splay_tree_key_s *splay_tree_key;
+   define splay_tree_key_s structure, and define
+   splay_compare inline function.  */
+
+/* For an easily readable description of splay-trees, see:
+
+ Lewis, Harry R. and Denenberg, Larry.  Data Structures and Their
+ Algorithms.  Harper-Collins, Inc.  1991.
+
+   The major feature of splay trees is that all basic tree operations
+   are amortized O(log n) time for a tree with n nodes.  */
+
+/* The nodes in the splay tree.  */
+struct splay_tree_node_s {
+  struct splay_tree_key_s key;
+  /* The left and right children, respectively.  */
+  splay_tree_node left;
+  splay_tree_node right;
+};
+
+/* The splay tree.  */
+struct splay_tree_s {
+  splay_tree_node root;
+};
+
+/* Rotate the edge joining the left child N with its parent P.  PP is the
+   grandparents' pointer to P.  */
+
+static inline void
+rotate_left (splay_tree_node *pp, splay_tree_node p, splay_tree_node n)
+{
+  splay_tree_node tmp;
+  tmp = n->right;
+  n->right = p;
+  p->left = tmp;
+  *pp = n;
+}
+
+/* Rotate the edge joining the right child N with its parent P.  PP is the
+   grandparents' pointer to P.  */
+
+static inline void
+rotate_right (splay_tree_node *pp, splay_tree_node p, splay_tree_node n)
+{
+  splay_tree_node tmp;
+  tmp = n->left;
+  n->left = p;
+  p->right = tmp;
+  *pp = n;
+}
+
+/* Bottom up splay of KEY.  */
+
+static void
+splay_tree_splay (splay_tree sp, splay_tree_key key)
+{
+  if (sp->root == NULL)
+return;
+
+  do {
+int cmp1, cmp2;
+splay_tree_node n, c;
+
+n = sp->root;
+cmp1 = splay_compare (key, &n->key);
+
+/* Found.  */
+if (cmp1 == 0)
+  return;
+
+/* Left or right?  If no child, then we're done.  */
+if (cmp1 < 0)
+  c = n->left;
+else
+  c = n->right;
+if (!c)
+  return;
+
+/* Next one left or right?  If found or no child, we're done
+   after one rotation.  */
+cmp2 = splay_compare (key, &c->key);
+if (cmp2 == 0
+   || (cmp2 < 0 && !c->left)
+   || (cmp2 > 0 && !c->right))
+  {
+   if (cmp1 < 0)
+ rotate_left (&sp->root, n, c);
+   else
+ rotate_right (&sp->root, n, c);
+   return;
+  }
+
+/* Now we 
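
Editorial illustration (not part of the posted patch): the two rotation helpers above can be exercised in isolation. The sketch below is a minimal stand-alone version, assuming a hypothetical int-keyed `struct node` in place of the `splay_tree_node_s` / `splay_tree_key_s` pair that the including file has to provide; the `root_key_after_*` helpers are likewise made up for the demonstration. Note that, as in the patch, `rotate_left` lifts the *left* child above its parent.

```c
#include <stddef.h>

/* Hypothetical int-keyed node, for illustration only.  The patch embeds a
   struct splay_tree_key_s that the including file must define.  */
struct node
{
  int key;
  struct node *left;
  struct node *right;
};

/* Rotate the edge joining the left child N with its parent P.  PP points
   at whichever pointer currently refers to P (the root or a child slot).  */
static void
rotate_left (struct node **pp, struct node *p, struct node *n)
{
  struct node *tmp = n->right;
  n->right = p;
  p->left = tmp;
  *pp = n;
}

/* Mirror image: N is the right child of P.  */
static void
rotate_right (struct node **pp, struct node *p, struct node *n)
{
  struct node *tmp = n->left;
  n->left = p;
  p->right = tmp;
  *pp = n;
}

/* Build the tree 2 -> (left) 1, lift 1 above 2, and return the key found
   at the root afterwards, or -1 if the links are not as expected.  */
static int
root_key_after_rotate_left (void)
{
  struct node one = { 1, NULL, NULL };
  struct node two = { 2, &one, NULL };
  struct node *root = &two;

  rotate_left (&root, &two, &one);
  return (root == &one && one.right == &two && two.left == NULL)
	 ? root->key : -1;
}

/* Build the tree 1 -> (right) 2, lift 2 above 1, and return the key found
   at the root afterwards, or -1 if the links are not as expected.  */
static int
root_key_after_rotate_right (void)
{
  struct node two = { 2, NULL, NULL };
  struct node one = { 1, NULL, &two };
  struct node *root = &one;

  rotate_right (&root, &one, &two);
  return (root == &two && two.left == &one && one.right == NULL)
	 ? root->key : -1;
}
```

Passing the address of the slot that refers to P (here the local `root`) is what lets a single helper serve both the tree root and interior nodes.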

Re: ptx preliminary address space fixes [1/4]

2014-09-11 Thread Richard Biener
On Thu, Sep 11, 2014 at 12:11 PM, Bernd Schmidt  wrote:
> I'm getting ready to submit the ptx port and the various changes that are
> necessary for it. To start with, here are some patches to deal with address
> space issues.
>
> ptx has the concept of an implicit address space: everything (objects on the
> stack as well as constant data and global variables) lives in its own
> address space. These are applied by a lower-as pass which I'll submit later.
>
> Since address spaces are more pervasive than on other targets and can show
> up in places the compiler doesn't yet expect them (such as local variables),
> this uncovers a few bugs in the optimizers. These are typically of the kind
> where we recreate a memory reference and aren't quite careful enough to
> preserve the existing address space.
>
> The first patch below just introduces a utility function that the following
> patches will use.  All were bootstrapped and tested together on
> x86_64-linux. Ok?

+tree
+apply_as_to_type (tree type, addr_space_t as)
+{
+  int quals = TYPE_QUALS_NO_ADDR_SPACE (type);
+  if (!ADDR_SPACE_GENERIC_P (as))
+quals |= ENCODE_QUAL_ADDR_SPACE (as);
+  type = build_qualified_type (type, quals);

please optimize the case of quals == TYPE_QUALS (type).

+  if (TREE_CODE (type) == ARRAY_TYPE)
+TREE_TYPE (type) = apply_as_to_type (TREE_TYPE (type), as);

why is this necessary for ARRAY_TYPE but not for something like
a RECORD_TYPE or a POINTER_TYPE?

[having address-spaces on types rather than decls and (indirect)
memory references seems odd anyway - that is, as far as the
middle-/back-end is concerned]

The name apply_as_to_type looks odd to me - other address-space
related functions use addr_space - can you change it to that please?

Thanks,
Richard.

>
> Bernd


Re: ptx preliminary address space fixes [2/4]

2014-09-11 Thread Richard Biener
On Thu, Sep 11, 2014 at 12:12 PM, Bernd Schmidt  wrote:
> This is a bug in SRA which replaces a memory reference without taking care
> to use the correct address space.
>
> Bootstrapped and tested together with the other patches on x86_64-linux.
> Ok?

Ok (with adjustments necessary for renaming apply_as_to_type).

Thanks,
Richard.

>
> Bernd


Re: [PATCH 1/n] OpenMP 4.0 offloading infrastructure

2014-09-11 Thread Jakub Jelinek
On Thu, Sep 11, 2014 at 03:28:02PM +0400, Ilya Verbin wrote:
> I would like to start merging offloading-related patches from gomp-4_0-branch 
> to trunk.
> This is the first patch (from r202620), which adds a splay tree and memory 
> mapping to libgomp.
> I temporarily removed device 257 from it.  Bootstrapped and regtested on 
> x86_64-linux.

I don't think it is useful to split the patches by the sequence in which
they were added to the tree.  I'd prefer patches for functional parts of the
differences between trunk and the corresponding offloading branch.
So, one patch should be all the libgomp changes except for the addition of
plugins, another one the generic middle-end changes, then the libmicoffload
support library, then the libgomp plugin for Intel MIC and finally the rest
of the Intel MIC enablement.  E.g. the patch you've posted contains tons of
FIXMEs that really shouldn't make it into trunk, not even briefly.

Jakub


Re: ptx preliminary address space fixes [3/4]

2014-09-11 Thread Richard Biener
On Thu, Sep 11, 2014 at 12:12 PM, Bernd Schmidt  wrote:
> The vectorizer can also replace a memory reference without ensuring it uses
> the correct address space.
>
> Bootstrapped and tested together with the other patches on x86_64-linux.
> Ok?

Seeing this it would be nice to abstract away the exact place we store
the address-space in a memory reference.  So - can you add a helper
reference_addr_space () please?  Thus do

  addr_space_t as = reference_addr_space (scalar_dest);

+  if (!ADDR_SPACE_GENERIC_P (as))
+{
+  elem_type = apply_as_to_type (elem_type, as);
+  vectype = apply_as_to_type (vectype, as);
+}

but then I wonder why not simply build the correct vector types
in the first place in vect_analyze_data_refs?

Or apply the addr-space to the memory reference with a new helper
reference_apply_addr_space

- data_ref = build2 (MEM_REF, TREE_TYPE (vec_oprnd), dataref_ptr,
 dataref_offset
 ? dataref_offset
 : build_int_cst (reference_alias_ptr_type
..
 reference_apply_addr_space (data_ref, as);

at least that's how it's abstracted on the RTL side.  I think I'd prefer
if things would be working that way on the tree level, too.

Thanks,
Richard.

>
> Bernd


Re: [PATCH][match-and-simplify] Complete associate_* patterns

2014-09-11 Thread Marc Glisse

On Thu, 11 Sep 2014, Richard Biener wrote:


+ /* We can't reassociate floating-point or fixed-point plus or minus
+because of saturation to +-Inf.  */
+ (if (!FLOAT_TYPE_P (type) && !FIXED_POINT_TYPE_P (type))


Do you remember if there was a particular reason not to add
  || flag_associative_math
when this was in forwprop?

--
Marc Glisse


Re: ptx preliminary address space fixes [4/4]

2014-09-11 Thread Richard Biener
On Thu, Sep 11, 2014 at 12:12 PM, Bernd Schmidt  wrote:
> This one isn't a wrong-code issue, just a missed optimization.  The strlen
> optimizations need to be made to look through ADDR_SPACE_CONVERT_EXPR to
> work on ptx.
>
> Bootstrapped and tested together with the other patches on x86_64-linux.
> Ok?

Did you try adding ADDR_SPACE_CONVERT_EXPR to the tree codes
handled in gimple_assign_cast_p?

Thanks,
Richard.

>
> Bernd


Re: [PATCH][match-and-simplify] Complete associate_* patterns

2014-09-11 Thread Richard Biener
On Thu, 11 Sep 2014, Marc Glisse wrote:

> On Thu, 11 Sep 2014, Richard Biener wrote:
> 
> > + /* We can't reassociate floating-point or fixed-point plus or minus
> > +because of saturation to +-Inf.  */
> > + (if (!FLOAT_TYPE_P (type) && !FIXED_POINT_TYPE_P (type))
> 
> Do you remember if there was a particular reason not to add
>   || flag_associative_math
> when this was in forwprop?

No idea, honestly.  The initial drop-in was meant to somewhat mitigate
the fact that tree-ssa-reassoc.c doesn't associate TYPE_OVERFLOW_UNDEFINED
expressions at all, so it was designed as "integer only".

Richard.
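
Editorial illustration (not part of the original thread): the guard being discussed exists because a contraction such as (A + B) - A -> B is exact for wrapping integers but not for floating point once the intermediate sum saturates to +Inf. A minimal sketch in plain C, assuming IEEE double arithmetic and default rounding; the function names are made up for the example.

```c
#include <float.h>
#include <limits.h>

/* (A + B) - A on doubles.  The volatile store forces the intermediate
   sum to be rounded to double before the subtraction happens.  */
static double
plus_minus_double (double a, double b)
{
  volatile double sum = a + b;
  return sum - a;
}

/* The same expression on unsigned int, where arithmetic wraps modulo
   2^N and the contraction to B is always exact.  */
static unsigned int
plus_minus_unsigned (unsigned int a, unsigned int b)
{
  return (a + b) - a;
}
```

With `a = b = DBL_MAX` the sum overflows to +Inf, so the double version no longer returns `b`, while the unsigned version returns `b` even when `a + b` wraps.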


[PATCH][match-and-simplify] Minor fixes to match-plusminus.pd

2014-09-11 Thread Richard Biener

Match the tree-ssa-forwprop.c code more closely.

Committed.

Richard.

2014-09-11  Richard Biener  

* match-plusminus.pd: More closely match tree-ssa-forwprop.c code.

Index: gcc/match-plusminus.pd
===
--- gcc/match-plusminus.pd  (revision 215163)
+++ gcc/match-plusminus.pd  (working copy)
@@ -61,21 +61,21 @@ along with GCC; see the file COPYING3.
   (for outer_op (plus minus)
(for inner_op (plus minus)
 (simplify
- (outer_op (inner_op @0 INTEGER_CST@1) INTEGER_CST@2)
+ (outer_op (inner_op @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
  /* If the constant operation overflows we cannot do the transform
as we would introduce undefined overflow, for example
with (a - 1) + INT_MIN.  */
- (with { tree cst = int_const_binop (outer_op == inner_op
-? PLUS_EXPR : MINUS_EXPR, @1, @2); }
-  (if (!TREE_OVERFLOW (cst))
+ (with { tree cst = fold_binary (outer_op == inner_op
+? PLUS_EXPR : MINUS_EXPR, type, @1, @2); }
+  (if (cst && !TREE_OVERFLOW (cst))
(inner_op @0 { cst; } ))
 
   /* (CST - A) +- CST -> CST - A  */
   (for outer_op (plus minus)
(simplify
-(outer_op (minus INTEGER_CST@1 @0) INTEGER_CST@2)
-(with { tree cst = int_const_binop (outer_op, @1, @2); }
- (if (!TREE_OVERFLOW (cst))
+(outer_op (minus CONSTANT_CLASS_P@1 @0) CONSTANT_CLASS_P@2)
+(with { tree cst = fold_binary (outer_op, type, @1, @2); }
+ (if (cst && !TREE_OVERFLOW (cst))
   (minus { cst; } @0)
 
   /* ~A + A -> -1 */
@@ -85,7 +85,7 @@ along with GCC; see the file COPYING3.
 
   /* ~A + 1 -> -A */
   (simplify
-   (plus (bit_not integral_op_p@0) integer_onep@1)
+   (plus (bit_not @0) integer_onep@1)
(if (TREE_CODE (TREE_TYPE (@1)) != COMPLEX_TYPE
|| (TREE_CODE (@1) == COMPLEX_CST
&& integer_onep (TREE_REALPART (@1))
@@ -136,7 +136,9 @@ along with GCC; see the file COPYING3.
 (simplify
   (pointer_plus @0 (convert?@2 (minus@3 (convert @1) (convert @0
   /* Conditionally look through a sign-changing conversion.  */
-  (if (TYPE_PRECISION (TREE_TYPE (@2)) == TYPE_PRECISION (TREE_TYPE (@3)))
+  (if (TYPE_PRECISION (TREE_TYPE (@2)) == TYPE_PRECISION (TREE_TYPE (@3))
+   && ((GIMPLE && useless_type_conversion_p (type, TREE_TYPE (@1))
+   || (GENERIC && type == TREE_TYPE (@1)
@1))
 
 /* From tree-ssa-forwprop.c:associate_pointerplus_align.  */
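
Editorial illustration (not part of the patch): the `(A +- CST) +- CST` pattern above folds the two constants first and gives up if that fold overflows. The sketch below mimics only that decision on plain `int` values rather than on trees. `struct folded` and `fold_constants` are hypothetical names, and `__builtin_add_overflow` / `__builtin_sub_overflow` are GCC/Clang extensions (GCC 5 and later), so this is a sketch under those assumptions.

```c
#include <limits.h>
#include <stdbool.h>

/* Result of trying to combine two constants.  */
struct folded
{
  bool ok;	/* True if the fold did not overflow.  */
  int value;	/* Combined constant; meaningful only when OK.  */
};

/* Combine C1 and C2 the way the pattern picks its operation: add them
   when the outer and inner operators are the same, subtract otherwise.
   Report overflow instead of silently wrapping.  */
static struct folded
fold_constants (bool same_op, int c1, int c2)
{
  struct folded r = { false, 0 };
  bool overflow = same_op
		  ? __builtin_add_overflow (c1, c2, &r.value)
		  : __builtin_sub_overflow (c1, c2, &r.value);
  r.ok = !overflow;
  return r;
}
```

For the comment's example `(a - 1) + INT_MIN` the operators differ, so the constants combine as `1 - INT_MIN`, which overflows; `fold_constants (false, 1, INT_MIN).ok` is false and the transform would be skipped.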


[AArch64] Cheap fix for argument types of vmull_high_lane_{us}{16,32}

2014-09-11 Thread James Greenhalgh

Hi,

I'd been putting this patch off in the hope that I might find
time to move these intrinsics to a C/builtin implementation, but it
is probably better to get them right for now and come back to improving
them later.

All four of these suffer from the same problem: their "lane" argument
should be a 64-bit rather than a 128-bit vector.

Fix it the obvious way.

Tested cross on aarch64-none-eabi.

OK?

Thanks,
James

---
2014-09-11  James Greenhalgh  

* config/aarch64/arm_neon.h (vmull_high_lane_s16): Fix argument
types.
(vmull_high_lane_s32): Likewise.
(vmull_high_lane_u16): Likewise.
(vmull_high_lane_u32): Likewise.
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index c31f7e3..77e3688 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -8249,7 +8249,7 @@ vmul_n_u32 (uint32x2_t a, uint32_t b)
 #define vmull_high_lane_s16(a, b, c)\
   __extension__ \
 ({  \
-   int16x8_t b_ = (b);  \
+   int16x4_t b_ = (b);  \
int16x8_t a_ = (a);  \
int32x4_t result;\
__asm__ ("smull2 %0.4s, %1.8h, %2.h[%3]" \
@@ -8262,7 +8262,7 @@ vmul_n_u32 (uint32x2_t a, uint32_t b)
 #define vmull_high_lane_s32(a, b, c)\
   __extension__ \
 ({  \
-   int32x4_t b_ = (b);  \
+   int32x2_t b_ = (b);  \
int32x4_t a_ = (a);  \
int64x2_t result;\
__asm__ ("smull2 %0.2d, %1.4s, %2.s[%3]" \
@@ -8275,7 +8275,7 @@ vmul_n_u32 (uint32x2_t a, uint32_t b)
 #define vmull_high_lane_u16(a, b, c)\
   __extension__ \
 ({  \
-   uint16x8_t b_ = (b); \
+   uint16x4_t b_ = (b); \
uint16x8_t a_ = (a); \
uint32x4_t result;   \
__asm__ ("umull2 %0.4s, %1.8h, %2.h[%3]" \
@@ -8288,7 +8288,7 @@ vmul_n_u32 (uint32x2_t a, uint32_t b)
 #define vmull_high_lane_u32(a, b, c)\
   __extension__ \
 ({  \
-   uint32x4_t b_ = (b); \
+   uint32x2_t b_ = (b); \
uint32x4_t a_ = (a); \
uint64x2_t result;   \
__asm__ ("umull2 %0.2d, %1.4s, %2.s[%3]" \

Re: [PATCH 1/n] OpenMP 4.0 offloading infrastructure

2014-09-11 Thread Ilya Verbin
On 11 Sep 13:34, Jakub Jelinek wrote:
> I don't think it is useful to split the patches by the sequence in which
> they were added to the tree.  I'd prefer patches for functional parts of the
> differences between trunk and the corresponding offloading branch.
> So, one patch should be all the libgomp changes except for the addition of
> plugins, another one the generic middle-end changes, then the libmicoffload
> support library, then the libgomp plugin for Intel MIC and finally the rest
> of the Intel MIC enablement.  E.g. the patch you've posted contains tons of
> FIXMEs that really shouldn't make it into trunk, not even briefly.
> 
>   Jakub

Ok.  All other libgomp changes depend on the defines from configure.
So I'll start with the patch that adds the new configure options.

Thanks,
  -- Ilya


[PATCH][match-and-simplify] Pattern for simplify_conversion_from_bitmask

2014-09-11 Thread Richard Biener

Applied.

Richard.

2014-09-11  Richard Biener  

* match-plusminus.pd: Fix typo.
* match-conversions.pd: Add pattern for
simplify_conversion_from_bitmask.

Index: gcc/match-conversions.pd
===
--- gcc/match-conversions.pd(revision 215162)
+++ gcc/match-conversions.pd(working copy)
@@ -202,3 +202,19 @@
(unsigned) significand_size (TYPE_MODE (inter_type))
>= inside_prec - !inside_unsignedp)
 (convert @0))
+
+/* From tree-ssa-forwprop.c:simplify_conversion_from_bitmask.  */
+
+/* If we have a narrowing conversion to an integral
+   type that is fed by a BIT_AND_EXPR, we might be
+   able to remove the BIT_AND_EXPR if it merely
+   masks off bits outside the final type (and nothing
+   else.  */
+(simplify
+  (convert (bit_and @0 INTEGER_CST@1))
+  (if (INTEGRAL_TYPE_P (type)
+   && INTEGRAL_TYPE_P (TREE_TYPE (@0))
+   && TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@0))
+   && operand_equal_p (@1, build_low_bits_mask (TREE_TYPE (@1),
+   TYPE_PRECISION (type)), 0))
+   (convert @0)))
Index: gcc/match-plusminus.pd
===
--- gcc/match-plusminus.pd  (revision 215169)
+++ gcc/match-plusminus.pd  (working copy)
@@ -137,8 +137,8 @@ along with GCC; see the file COPYING3.
   (pointer_plus @0 (convert?@2 (minus@3 (convert @1) (convert @0
   /* Conditionally look through a sign-changing conversion.  */
   (if (TYPE_PRECISION (TREE_TYPE (@2)) == TYPE_PRECISION (TREE_TYPE (@3))
-   && ((GIMPLE && useless_type_conversion_p (type, TREE_TYPE (@1))
-   || (GENERIC && type == TREE_TYPE (@1)
+   && ((GIMPLE && useless_type_conversion_p (type, TREE_TYPE (@1)))
+   || (GENERIC && type == TREE_TYPE (@1
@1))
 
 /* From tree-ssa-forwprop.c:associate_pointerplus_align.  */
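
Editorial illustration (not part of the patch): the new match-conversions.pd pattern drops a BIT_AND_EXPR when the mask keeps exactly the low bits that the narrowing conversion keeps anyway. A minimal sketch in plain C, assuming unsigned operands; `low_bits_mask` here is a hypothetical stand-in for what `build_low_bits_mask` produces, not GCC's implementation.

```c
#include <limits.h>

/* Mask with the low PRECISION bits set (hypothetical helper).  */
static unsigned int
low_bits_mask (unsigned int precision)
{
  unsigned int width = sizeof (unsigned int) * CHAR_BIT;
  return precision >= width ? ~0u : (1u << precision) - 1u;
}

/* Narrowing conversion fed by a mask of exactly the destination width.  */
static unsigned char
narrow_masked (unsigned int x)
{
  return (unsigned char) (x & low_bits_mask (CHAR_BIT));
}

/* The simplified form: the conversion alone discards the same bits.  */
static unsigned char
narrow_plain (unsigned int x)
{
  return (unsigned char) x;
}
```

Both functions return the same value for every input, which is the equivalence the pattern relies on; a mask that also cleared some low bits would not qualify.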


Re: [PATCH][match-and-simplify] Minor fixes to match-plusminus.pd

2014-09-11 Thread Marc Glisse

  /* ~A + 1 -> -A */
  (simplify
   (plus (bit_not @0) integer_onep@1)
   (if (TREE_CODE (TREE_TYPE (@1)) != COMPLEX_TYPE
|| (TREE_CODE (@1) == COMPLEX_CST
&& integer_onep (TREE_REALPART (@1))
&& integer_onep (TREE_IMAGPART (@1))))
(negate @0)))

the complex part cannot happen, since integer_onep already checks that the 
imaginary part is 0. I was thinking of adding a predicate: 
integer_each_onep (or other name) that would be equivalent to integer_onep 
except for complex where it would check for 1+i instead of 1. We already 
have 2 separate predicates for -1 that only differ for complex. And we 
would probably want a corresponding build_ function to produce such 
constants.


--
Marc Glisse


Re: [PATCH][match-and-simplify] Minor fixes to match-plusminus.pd

2014-09-11 Thread Richard Biener
On Thu, 11 Sep 2014, Marc Glisse wrote:

>   /* ~A + 1 -> -A */
>   (simplify
>(plus (bit_not @0) integer_onep@1)
>(if (TREE_CODE (TREE_TYPE (@1)) != COMPLEX_TYPE
> || (TREE_CODE (@1) == COMPLEX_CST
> && integer_onep (TREE_REALPART (@1))
> && integer_onep (TREE_IMAGPART (@1
> (negate @0)))
> 
> the complex part cannot happen, since integer_onep already checks that the
> imaginary part is 0. I was thinking of adding a predicate: integer_each_onep
> (or other name) that would be equivalent to integer_onep except for complex
> where it would check for 1+i instead of 1. We already have 2 separate
> predicates for -1 that only differ for complex. And we would probably want a
> corresponding build_ function to produce such constants.

Ah, indeed.  The forwprop code reads:

  else if ((TREE_CODE (TREE_TYPE (rhs2)) != COMPLEX_TYPE
&& integer_onep (rhs2))
   || (TREE_CODE (rhs2) == COMPLEX_CST
   && integer_onep (TREE_REALPART (rhs2))
   && integer_onep (TREE_IMAGPART (rhs2))))

but yes, a new predicate would be nice.  In the pattern we can fix
it by using CONSTANT_CLASS_P as the predicate for @1 (or no
predicate at all).

Richard.
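
Editorial illustration (not part of the original thread): for plain (non-complex) integers the `~A + 1 -> -A` rewrite is just the two's-complement identity. A minimal sketch on `unsigned int`, chosen so that neither side can hit signed-overflow undefined behaviour; the function names are made up for the example.

```c
#include <limits.h>

/* Left-hand side of the pattern: bitwise NOT, then add one.  */
static unsigned int
not_plus_one (unsigned int a)
{
  return ~a + 1u;
}

/* Right-hand side: arithmetic negation, modulo 2^N.  */
static unsigned int
negate (unsigned int a)
{
  return 0u - a;
}
```

The two agree for every input, including 0 and `UINT_MAX`. The complex case discussed above is different because the constant would have to be 1+1i rather than 1 for the identity to hold component-wise.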


[linaro/gcc-4_9-branch] Merge from gcc-4_9-branch and backports

2014-09-11 Thread Yvan Roux
Hi all

we have merged the gcc-4_9-branch into linaro/gcc-4_9-branch up to
revision 214896 as r215060.  We have also backported this set of revisions:

r211717 as r214313 : [AArch32] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
r212927 as r214314 : [AArch32] Enable arm target in
ira-shrinkwrap-prep* testcases
r212978 as r214739 : Testsuite: fix check_effective_target_arm_nothumb
r212989 as r214312 : PR 61876: Do not convert cast + __builtin_round
into __builtin_lround unless -fno-math-errno is used
r213304 as r214314 : [AArch64] Fix Thumb2 testsuite fallout
r213378 as r214502 : [AArch64_be] Fix vec_select hi/lo mask confusions.
r213379 as r214504 : [AArch64_be] Don't fold reduction intrinsics
r213485 as r214505 : [AArch64][1/2] Fix offset glitch in load reg pair pattern
r213486 as r214505 : fix ChangeLog for 213485
r213487 as r214505 : [AArch64][2/2] Add constrain to address offset in
storewb_pair/loadwb_pair insns
r213488 as r214506 : [AArch64] Improve TARGET_LEGITIMIZE_ADDRESS_P hook
r213489 as r214506 : add missing testcase for 213488
r213490 as r214507 : [AArch64] Removed unused get_lane and dup_lane builtins.
r213551 as r214509 : [sched-deps] Generalise usage of macro fusion to
work on any two insns
r213556 as r214509 : Fix wrong ChangeLog date from 213551
r213557 as r214511 : [doc] Document clrsb optab and fix some inconsistencies
r213627 as r214516 : [AArch64] Some aarch64-builtins.c cleanup.
r213628 as r214312 : [convert.c] PR 61876: Guard transformation to
lrint by -fno-math-errno
r213630 as r214512 : [AArch32] Adjust clz, rbit and rev patterns for
-mrestrict-it
r213632 as r214513 : [AArch32/AArch64] Add CRC32 scheduling
information to Cortex-A53 and Cortex-A57
r213651 as r214809 : [AArch64] Use REG_P and CONST_INT_P instead of
GET_CODE + comparison
r213659 as r214844 : [AArch64] Prefer dup to zip for vec_perm_const;
enable dup for bigendian; add testcase.
r213692 as r214313 : [AArch32] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
r213701 as r214517 : Testsuiteisms.
r213711 as r214514 : [AArch64] Use MOVN to generate 64-bit negative
immediates where sensible
r213713 as r214515 : [AArch64] Delete f_sels, f_seld types, use fcsel instead
r214503 as r214845 : [AArch64] Fix typo
r214526 as r214847 : PR target/60606 target/61330 fix ICE
r215004 as r215069 : [AArch64] PR target/63190

This will be part of our 2014.09 4.9 release.

Thanks,
Yvan


[PATCH i386 AVX512] [37/n] Extend max/min insn patterns.

2014-09-11 Thread Kirill Yukhin
Hello,
The patch at the bottom extends the integer max/min patterns.
It also seems that a rounding variant was being generated
for the maxmin patterns; that bug is fixed.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md (VI128_256): Delete.
(define_mode_iterator VI124_256): New.
(define_mode_iterator VI124_256_AVX512F_AVX512BW): Ditto.
(define_expand "3"): Delete.
(define_expand "3"): New.
(define_insn "*avx2_3"): Rename from
"*avx2_3" and update mode iterator.
(define_expand "3_mask"): New.
(define_insn "*avx512bw_3"): Ditto.
(define_insn "3"): Update mode
iterator.
(define_expand "3"): Update pattern generation
in the presence of AVX-512.
--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 40b8f83..92f94b9 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -290,9 +290,6 @@
 (define_mode_iterator VI8_256_512
   [V8DI (V4DI "TARGET_AVX512VL")])
 
-(define_mode_iterator VI128_256
-  [V4DI V2DI V4SI (V16QI "TARGET_AVX512BW") (V8HI "TARGET_AVX512BW")])
-
 (define_mode_iterator VI1_AVX2
   [(V32QI "TARGET_AVX2") V16QI])
 
@@ -499,8 +496,12 @@
 (define_mode_iterator VI48_128 [V4SI V2DI])
 
 ;; Various 256bit and 512 vector integer mode combinations
-(define_mode_iterator VI124_256_48_512
-  [V32QI V16HI V8SI (V8DI "TARGET_AVX512F") (V16SI "TARGET_AVX512F")])
+(define_mode_iterator VI124_256 [V32QI V16HI V8SI])
+(define_mode_iterator VI124_256_AVX512F_AVX512BW
+  [V32QI V16HI V8SI
+   (V64QI "TARGET_AVX512BW")
+   (V32HI "TARGET_AVX512BW")
+   (V16SI "TARGET_AVX512F")])
 (define_mode_iterator VI48_256 [V8SI V4DI])
 (define_mode_iterator VI48_512 [V16SI V8DI])
 (define_mode_iterator VI4_256_8_512 [V8SI V8DI])
@@ -9449,71 +9450,100 @@
   [(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
-(define_expand "3"
-  [(set (match_operand:VI124_256_48_512 0 "register_operand")
-   (maxmin:VI124_256_48_512
- (match_operand:VI124_256_48_512 1 "")
- (match_operand:VI124_256_48_512 2 "")))]
-  "TARGET_AVX2 &&  && "
+(define_expand "3"
+  [(set (match_operand:VI124_256_AVX512F_AVX512BW 0 "register_operand")
+   (maxmin:VI124_256_AVX512F_AVX512BW
+ (match_operand:VI124_256_AVX512F_AVX512BW 1 "nonimmediate_operand")
+ (match_operand:VI124_256_AVX512F_AVX512BW 2 "nonimmediate_operand")))]
+  "TARGET_AVX2"
   "ix86_fixup_binary_operands_no_copy (, mode, operands);")
 
-(define_insn "*avx2_3"
-  [(set (match_operand:VI124_256_48_512 0 "register_operand" "=v")
-   (maxmin:VI124_256_48_512
- (match_operand:VI124_256_48_512 1 "" "%v")
- (match_operand:VI124_256_48_512 2 "" 
"")))]
-  "TARGET_AVX2 && ix86_binary_operator_ok (, mode, operands)
-   &&  && "
-  "vp\t{%2, %1, 
%0|%0, %1, %2}"
+(define_insn "*avx2_3"
+  [(set (match_operand:VI124_256 0 "register_operand" "=v")
+   (maxmin:VI124_256
+ (match_operand:VI124_256 1 "nonimmediate_operand" "%v")
+ (match_operand:VI124_256 2 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX2 && ix86_binary_operator_ok (, mode, operands)"
+  "vp\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseiadd")
(set_attr "prefix_extra" "1")
-   (set_attr "prefix" "maybe_evex")
+   (set_attr "prefix" "vex")
(set_attr "mode" "OI")])
 
+(define_expand "3_mask"
+  [(set (match_operand:VI48_AVX512VL 0 "register_operand")
+   (vec_merge:VI48_AVX512VL
+ (maxmin:VI48_AVX512VL
+   (match_operand:VI48_AVX512VL 1 "nonimmediate_operand")
+   (match_operand:VI48_AVX512VL 2 "nonimmediate_operand"))
+ (match_operand:VI48_AVX512VL 3 "vector_move_operand")
+ (match_operand: 4 "register_operand")))]
+  "TARGET_AVX512F"
+  "ix86_fixup_binary_operands_no_copy (, mode, operands);")
+
+(define_insn "*avx512bw_3"
+  [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
+   (maxmin:VI48_AVX512VL
+ (match_operand:VI48_AVX512VL 1 "nonimmediate_operand" "%v")
+ (match_operand:VI48_AVX512VL 2 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX512F && ix86_binary_operator_ok (, mode, operands)"
+  "vp\t{%2, %1, 
%0|%0, %1, %2}"
+  [(set_attr "type" "sseiadd")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "maybe_evex")
+   (set_attr "mode" "")])
+
 (define_insn "3"
-  [(set (match_operand:VI128_256 0 "register_operand" "=v")
-(maxmin:VI128_256
-  (match_operand:VI128_256 1 "register_operand" "v")
-  (match_operand:VI128_256 2 "nonimmediate_operand" "vm")))]
-  "TARGET_AVX512VL"
+  [(set (match_operand:VI12_AVX512VL 0 "register_operand" "=v")
+(maxmin:VI12_AVX512VL
+  (match_operand:VI12_AVX512VL 1 "register_operand" "v")
+  (match_operand:VI12_AVX512VL 2 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX512BW"
   "vp\t{%2, %1, 
%0|%0, %1, %2}"
   [(set_attr "type" "sseiadd")
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
 (d

[PATCH i386 AVX512] [38/n] Extend vpternlog, valign, vrotate insn patterns.

2014-09-11 Thread Kirill Yukhin
Hello,
The patch at the bottom extends the patterns for rotate, ternlog and align.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_mode_iterator VI48_AVX512VL): New.
(define_expand "_vternlog_maskz"): Rename from
"avx512f_vternlog_maskz" and update mode iterator.
(define_insn "_vternlog"): Rename
from "avx512f_vternlog" and update mode iterator.
(define_insn "_vternlog_mask"): Rename from
"avx512f_vternlog_mask" and update mode iterator.
(define_insn "_align"): Rename
from "avx512f_align" and update mode
iterator.
(define_insn "_v"): Rename from
"avx512f_v" and update mode iterator.
(define_insn "_"): Rename from
"avx512f_" and update mode iterator.
(define_insn "clz2"): Use VI48_AVX512VL.
(define_insn "conflict"): Ditto.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 92f94b9..73bdd22 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -7158,27 +7158,27 @@
   [(set_attr "prefix" "evex")
(set_attr "mode"  "")])
 
-(define_expand "avx512f_vternlog_maskz"
-  [(match_operand:VI48_512 0 "register_operand")
-   (match_operand:VI48_512 1 "register_operand")
-   (match_operand:VI48_512 2 "register_operand")
-   (match_operand:VI48_512 3 "nonimmediate_operand")
+(define_expand "_vternlog_maskz"
+  [(match_operand:VI48_AVX512VL 0 "register_operand")
+   (match_operand:VI48_AVX512VL 1 "register_operand")
+   (match_operand:VI48_AVX512VL 2 "register_operand")
+   (match_operand:VI48_AVX512VL 3 "nonimmediate_operand")
(match_operand:SI 4 "const_0_to_255_operand")
(match_operand: 5 "register_operand")]
   "TARGET_AVX512F"
 {
-  emit_insn (gen_avx512f_vternlog_maskz_1 (
+  emit_insn (gen__vternlog_maskz_1 (
 operands[0], operands[1], operands[2], operands[3],
 operands[4], CONST0_RTX (mode), operands[5]));
   DONE;
 })
 
-(define_insn "avx512f_vternlog"
-  [(set (match_operand:VI48_512 0 "register_operand" "=v")
-   (unspec:VI48_512
- [(match_operand:VI48_512 1 "register_operand" "0")
-  (match_operand:VI48_512 2 "register_operand" "v")
-  (match_operand:VI48_512 3 "nonimmediate_operand" "vm")
+(define_insn "_vternlog"
+  [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
+   (unspec:VI48_AVX512VL
+ [(match_operand:VI48_AVX512VL 1 "register_operand" "0")
+  (match_operand:VI48_AVX512VL 2 "register_operand" "v")
+  (match_operand:VI48_AVX512VL 3 "nonimmediate_operand" "vm")
   (match_operand:SI 4 "const_0_to_255_operand")]
  UNSPEC_VTERNLOG))]
   "TARGET_AVX512F"
@@ -7187,13 +7187,13 @@
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
-(define_insn "avx512f_vternlog_mask"
-  [(set (match_operand:VI48_512 0 "register_operand" "=v")
-   (vec_merge:VI48_512
- (unspec:VI48_512
-   [(match_operand:VI48_512 1 "register_operand" "0")
-(match_operand:VI48_512 2 "register_operand" "v")
-(match_operand:VI48_512 3 "nonimmediate_operand" "vm")
+(define_insn "_vternlog_mask"
+  [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
+   (vec_merge:VI48_AVX512VL
+ (unspec:VI48_AVX512VL
+   [(match_operand:VI48_AVX512VL 1 "register_operand" "0")
+(match_operand:VI48_AVX512VL 2 "register_operand" "v")
+(match_operand:VI48_AVX512VL 3 "nonimmediate_operand" "vm")
 (match_operand:SI 4 "const_0_to_255_operand")]
UNSPEC_VTERNLOG)
  (match_dup 1)
@@ -7227,12 +7227,12 @@
 [(set_attr "prefix" "evex")
  (set_attr "mode" "")])
 
-(define_insn "avx512f_align"
-  [(set (match_operand:VI48_512 0 "register_operand" "=v")
-(unspec:VI48_512 [(match_operand:VI48_512 1 "register_operand" "v")
- (match_operand:VI48_512 2 "nonimmediate_operand" "vm")
- (match_operand:SI 3 "const_0_to_255_operand")]
-UNSPEC_ALIGN))]
+(define_insn "_align"
+  [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
+(unspec:VI48_AVX512VL [(match_operand:VI48_AVX512VL 1 "register_operand" "v")
+  (match_operand:VI48_AVX512VL 2 "nonimmediate_operand" "vm")
+  (match_operand:SI 3 "const_0_to_255_operand")]
+ UNSPEC_ALIGN))]
   "TARGET_AVX512F"
   "valign\t{%3, %2, %1, %0|%0, %1, %2, %3}";
   [(set_attr "prefix" "evex")
@@ -9430,20 +9430,20 @@
(set_attr "prefix" "orig,vex")
(set_attr "mode" "")])
 
-(define_insn "avx512f_v"
-  [(set (match_operand:VI48_512 0 "register_operand" "=v")
-   (any_rotate:VI48_512
- (match_operand:VI48_512 1 "register_operand" "v")
- (match_operand:VI48_512 2 "nonimmediate_operand" "vm")))]
+(define_insn "_v"
+  [(set (match_operand:VI48_AVX512VL 0

ptx preliminary rtl patches [2/4]

2014-09-11 Thread Bernd Schmidt
There's some code in get_uncond_jump_length to emit and then delete a 
label and a jump.  On ptx we skip register allocation and reload, and 
this fails a "reload_completed || bb != NULL" assert in df_insn_delete. 
Fixed by instead emitting the two insns into a sequence which we then 
just discard.


Bootstrapped and tested on x86_64-linux, together with the other 
patches.  Ok?



Bernd
	* bb-reorder.c (get_uncond_jump_length): Avoid using delete_insn,
	emit into a sequence instead.

diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
index b3f770d..789f1e9 100644
--- a/gcc/bb-reorder.c
+++ b/gcc/bb-reorder.c
@@ -1374,13 +1374,12 @@ get_uncond_jump_length (void)
   rtx_insn *label, *jump;
   int length;
 
-  label = emit_label_before (gen_label_rtx (), get_insns ());
+  start_sequence ();
+  label = emit_label (gen_label_rtx ());
   jump = emit_jump_insn (gen_jump (label));
-
   length = get_attr_min_length (jump);
+  end_sequence ();
 
-  delete_insn (jump);
-  delete_insn (label);
   return length;
 }
 


ptx preliminary rtl patches [1/4]

2014-09-11 Thread Bernd Schmidt
The nvptx backend is somewhat unusual in that call insns set a pseudo. 
The combiner is surprised by this and allows combining them into other 
insns, which remain as INSN rather than CALL_INSN. Aborts ensue.


Bootstrapped and tested on x86_64-linux, together with the other 
patches.  Ok?



Bernd
	* combine.c (try_combine): Don't allow a call as one of the source
	insns.

diff --git a/gcc/combine.c b/gcc/combine.c
index 0ec7f85..fe95b41 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2543,7 +2543,10 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
 
   /* Exit early if one of the insns involved can't be used for
  combinations.  */
-  if (cant_combine_insn_p (i3)
+  if (CALL_P (i2)
+  || (i1 && CALL_P (i1))
+  || (i0 && CALL_P (i0))
+  || cant_combine_insn_p (i3)
   || cant_combine_insn_p (i2)
   || (i1 && cant_combine_insn_p (i1))
   || (i0 && cant_combine_insn_p (i0))


ptx preliminary rtl patches [3/4]

2014-09-11 Thread Bernd Schmidt
nvptx will be the first port to use BImode and have 
STORE_FLAG_VALUE==-1. That has exposed a bug in combine where we can end 
up calling num_sign_bit_copies for a BImode value. However, the return 
value is always 1 in that case, so it doesn't tell us anything and is 
going to be misinterpreted by the caller.


Bootstrapped and tested on x86_64-linux, together with the other 
patches.  Ok?



Bernd
	* combine.c (combine_simplify_rtx): Avoid using num_sign_bit_copies
	for single-bit modes.

diff --git a/gcc/combine.c b/gcc/combine.c
index fe95b41..49c6baa 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -5801,10 +5801,14 @@ combine_simplify_rtx (rtx x, enum machine_mode op0_mode, int in_dest,
 	;
 
 	  else if (STORE_FLAG_VALUE == -1
-	  && new_code == NE && GET_MODE_CLASS (mode) == MODE_INT
-	  && op1 == const0_rtx
-	  && (num_sign_bit_copies (op0, mode)
-		  == GET_MODE_PRECISION (mode)))
+		   && new_code == NE && GET_MODE_CLASS (mode) == MODE_INT
+		   && op1 == const0_rtx
+		   /* There's always at least one sign bit copy in a
+		  one-bit mode, so the call to num_sign_bit_copies
+		  tells us nothing in that case.  */
+		   && GET_MODE_PRECISION (mode) > 1
+		   && (num_sign_bit_copies (op0, mode)
+		   == GET_MODE_PRECISION (mode)))
 	return gen_lowpart (mode,
 expand_compound_operation (op0));
 
@@ -5824,6 +5828,7 @@ combine_simplify_rtx (rtx x, enum machine_mode op0_mode, int in_dest,
 		   && new_code == EQ && GET_MODE_CLASS (mode) == MODE_INT
 		   && op1 == const0_rtx
 		   && mode == GET_MODE (op0)
+		   && GET_MODE_PRECISION (mode) > 1
 		   && (num_sign_bit_copies (op0, mode)
 		   == GET_MODE_PRECISION (mode)))
 	{
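
A toy illustration of why the extra GET_MODE_PRECISION (mode) > 1 guard is needed. This is only a sketch on concrete values, not GCC's rtx-based analysis, and the helper names are hypothetical. In a one-bit mode the count of sign-bit copies is 1 for every value, so comparing it with the precision passes vacuously.

```cpp
#include <cstdint>

// Hypothetical stand-in for num_sign_bit_copies on a concrete value:
// counts how many of the top bits (within `precision` bits) equal the
// sign bit.  The sign bit itself counts, so the result is always >= 1.
// Expects 1 <= precision <= 64.
int toy_num_sign_bit_copies(std::uint64_t value, int precision) {
    const std::uint64_t sign = (value >> (precision - 1)) & 1u;
    int copies = 1;
    for (int bit = precision - 2; bit >= 0; --bit) {
        if (((value >> bit) & 1u) != sign)
            break;
        ++copies;
    }
    return copies;
}

// Unguarded check: "every bit is a copy of the sign bit".
bool all_sign_copies(std::uint64_t value, int precision) {
    return toy_num_sign_bit_copies(value, precision) == precision;
}

// Guarded check, mirroring the condition added by the patch.
bool all_sign_copies_guarded(std::uint64_t value, int precision) {
    return precision > 1 && all_sign_copies(value, precision);
}
```

With precision 1, all_sign_copies is true for both possible values, so it carries no information; the guarded variant simply declines to draw a conclusion in that case.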


ptx preliminary rtl patches [4/4]

2014-09-11 Thread Bernd Schmidt
It turns out that we're calling eliminate_regs for global variables 
which can't possibly have eliminable regs in their decl. At that point, 
reg_eliminate can be NULL. This patch avoids unnecessary work, and 
allows us to add an assert to eliminate_regs later.


Bootstrapped and tested on x86_64-linux, together with the other 
patches.  Ok?



Bernd
	* dbxout.c (dbxout_symbol): Don't call eliminate_regs on TREE_STATIC
	decls.

diff --git a/gcc/dbxout.c b/gcc/dbxout.c
index d856bdd..ffef1f5 100644
--- a/gcc/dbxout.c
+++ b/gcc/dbxout.c
@@ -2887,7 +2887,8 @@ dbxout_symbol (tree decl, int local ATTRIBUTE_UNUSED)
   if (!decl_rtl)
 	DBXOUT_DECR_NESTING_AND_RETURN (0);
 
-  decl_rtl = eliminate_regs (decl_rtl, VOIDmode, NULL_RTX);
+  if (!TREE_STATIC (decl))
+	decl_rtl = eliminate_regs (decl_rtl, VOIDmode, NULL_RTX);
 #ifdef LEAF_REG_REMAP
   if (crtl->uses_only_leaf_regs)
 	leaf_renumber_regs_insn (decl_rtl);


Re: [PATCH][ARM] Enable auto-vectorization for copysignf

2014-09-11 Thread Christophe Lyon
Hi Jiong,

On 9 September 2014 12:59, Ramana Radhakrishnan
 wrote:
> On Mon, Aug 18, 2014 at 11:31 AM, Jiong Wang  wrote:
>> this patch enables auto-vectorization for copysignf by using the vector
>> bit selection instruction on arm32 when NEON is available.
>>

I've noticed that your new testcase fails (the scan-tree-dump-times
line), in the following cases:
* forcing -march=armv5t in RUNTESTFLAGS (targets
arm-none-linux-gnueabi and arm-none-linux-gnueabihf)
* target armeb-none-linux-gnueabihf

You can have a look at:
http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215067/report-build-info.html

If you go 1 level up at
http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215067/
you'll be able to browse into the per-target subdirs and get the .sum
files if you need them.

Christophe.


>> for a simple testcase:
>>
>>   for (i = 0; i < N; i++)
>> r[i] = __builtin_copysignf (a[i], b[i]);
>>
>>
>> assuming a vectorization factor of 4, the generated instruction sequence is:
>>
>> vmov.i32 q10, #2147483648  @ v4si
>> .L2:
>> vld1.64 {d18-d19}, [ip:64]
>> add r3, r3, #16
>> add ip, ip, #16
>> vldr d16, [r3, #-16]
>> vldr d17, [r3, #-8]
>> vbif q8, q9, q10
>
>
>
> Ok.
>
> Ramana
>
>
>>
>> thanks.
>>
>> gcc/
>>   * config/arm/arm.c (NEON_COPYSIGNF): New enum.
>>   (arm_init_neon_builtins): Support NEON_COPYSIGNF.
>>   (arm_builtin_vectorized_function): Likewise.
>>   * config/arm/arm_neon_builtins.def: New macro for copysignf.
>>   * config/arm/neon.md (neon_copysignf): New pattern for vector
>> copysignf.
>>
>> gcc/testsuite/
>>   * gcc.target/arm/vect-copysignf.c: New testcase.
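
For readers following the thread: the constant loaded by vmov.i32 above, 2147483648, is 0x80000000, i.e. the sign bit of a 32-bit float. A minimal scalar sketch of the same bit-selection idea follows. It assumes IEEE-754 binary32 floats, and the function names copysign_bitselect and copysign_loop are hypothetical, not part of the patch.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copysignf expressed as a bit select.
// Assumes float is IEEE-754 binary32, so bit 31 is the sign bit.
float copysign_bitselect(float magnitude, float sign) {
    const std::uint32_t sign_mask = 0x80000000u;  // == 2147483648
    std::uint32_t m, s;
    std::memcpy(&m, &magnitude, sizeof m);
    std::memcpy(&s, &sign, sizeof s);
    // Take every bit except the sign from `magnitude`, the sign bit from `sign`.
    const std::uint32_t r = (m & ~sign_mask) | (s & sign_mask);
    float out;
    std::memcpy(&out, &r, sizeof out);
    return out;
}

// The loop from the quoted testcase, written against the helper above.
void copysign_loop(float* r, const float* a, const float* b, int n) {
    for (int i = 0; i < n; i++)
        r[i] = copysign_bitselect(a[i], b[i]);
}
```

Each iteration is independent and uses only bitwise operations, which is presumably what makes the loop a good fit for a vector bit-select instruction.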


C++ PATCH for c++/63139 (variadic alias templates)

2014-09-11 Thread Jason Merrill
This testcase was breaking because multiple substitutions of ContainerEndA> into T... would lose the original arguments when we pull 
the pattern out of the intermediate expansion.  The added assert 
guards against that happening again, and we avoid the problem by 
simplifying that case: if we are substituting into 
a trivial expansion like T... we can just return the argument pack.


Tested x86_64-pc-linux-gnu, applying to trunk and 4.9.
commit 25baa58252bce8ac42d96710c48107df3e2c56c8
Author: Jason Merrill 
Date:   Wed Sep 10 21:58:58 2014 -0400

	PR c++/63139
	* pt.c (tsubst_pack_expansion): Simplify substitution into T
	(tsubst): Don't throw away PACK_EXPANSION_EXTRA_ARGS.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 3c93178..7f7ab93 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -9913,6 +9913,11 @@ tsubst_pack_expansion (tree t, tree args, tsubst_flags_t complain,
 	}
 }
 
+  /* If the expansion is just T..., return the matching argument pack.  */
+  if (!unsubstituted_packs
+  && TREE_PURPOSE (packs) == pattern)
+return ARGUMENT_PACK_ARGS (TREE_VALUE (packs));
+
   /* We cannot expand this expansion expression, because we don't have
  all of the argument packs we need.  */
   if (use_pack_expansion_extra_args_p (packs, len, unsubstituted_packs))
@@ -11831,7 +11836,11 @@ tsubst (tree t, tree args, tsubst_flags_t complain, tree in_decl)
 		   gen_elem_of_pack_expansion_instantiation will
 		   build the resulting pack expansion from it.  */
 		if (PACK_EXPANSION_P (arg))
-		  arg = PACK_EXPANSION_PATTERN (arg);
+		  {
+		/* Make sure we aren't throwing away arg info.  */
+		gcc_assert (!PACK_EXPANSION_EXTRA_ARGS (arg));
+		arg = PACK_EXPANSION_PATTERN (arg);
+		  }
 	  }
 	  }
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic161.C b/gcc/testsuite/g++.dg/cpp0x/variadic161.C
new file mode 100644
index 000..ac6eaf6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/variadic161.C
@@ -0,0 +1,51 @@
+// PR c++/63139
+// { dg-do compile { target c++11 } }
+
+template
+struct type_list {};
+
+template
+struct make_type_list
+{
+using type = type_list;
+};
+
+// The bug disappears if you use make_type_list directly.
+template
+using make_type_list_t = typename make_type_list::type;
+
+
+struct ContainerEndA {};
+
+template
+struct ContainerA
+{
+using type = make_type_list_t;
+};
+
+
+struct ContainerEndB {};
+
+template
+struct ContainerB
+{
+using type = make_type_list_t;
+};
+
+template
+struct is_same
+{
+  static const bool value = false;
+};
+
+template
+struct is_same
+{
+  static const bool value = true;
+};
+
+#define SA(X) static_assert((X), #X)
+
+SA((is_same::type, type_list>::value));
+SA((!is_same::type, type_list>::value));
+SA((!is_same::type, ContainerB<>::type>::value));
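
The angle-bracketed template argument lists in the testcase above appear to have been lost in archiving. The following is a sketch of the idea being tested, not the committed file: the template parameter lists are my reconstruction (an assumption), the names EndA and EndB are shortened stand-ins, and std::is_same replaces the hand-written is_same.

```cpp
#include <type_traits>

template <typename... Ts>
struct type_list {};

template <typename... Ts>
struct make_type_list {
    using type = type_list<Ts...>;
};

// Variadic alias template: the construct PR c++/63139 is about.
template <typename... Ts>
using make_type_list_t = typename make_type_list<Ts...>::type;

struct EndA {};
struct EndB {};

// Two containers that append different marker types to their pack.
template <typename... Ts>
struct ContainerA {
    using type = make_type_list_t<Ts..., EndA>;
};

template <typename... Ts>
struct ContainerB {
    using type = make_type_list_t<Ts..., EndB>;
};

// Each container must keep its own marker; the two must not be confused.
static_assert(std::is_same<ContainerA<>::type, type_list<EndA>>::value, "A keeps EndA");
static_assert(std::is_same<ContainerB<>::type, type_list<EndB>>::value, "B keeps EndB");
static_assert(!std::is_same<ContainerA<>::type, ContainerB<>::type>::value, "A and B differ");
```

A compiler that mixes up the arguments substituted through the alias would fail one of the static assertions at compile time.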


Re: [PATCH][AArch64 Testsuite] Add test of vld[234]q? intrinsic

2014-09-11 Thread Christophe Lyon
On 9 September 2014 12:19, Marcus Shawcroft  wrote:
> On 8 September 2014 11:35, Alan Lawrence  wrote:
>> This adds a test of all the variants of vld2, vld2q, vld3, vld3q, vld4, and
>> vld4q. These all use typexNxM structs and the OI/CI/XImode mechanism, so the
>> test cross-checks this against plain ol' vst1(q?).
>>
>> Cross-tested on aarch64-none-elf (passing), also on aarch64_be-none-elf
>> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59810).
>>

Hi Alan,

On my side, your new test fails at execution on aarch64_be-none-elf:
http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215072/report-build-info.html

This seems strange since you tested it too.
I am using the Foundation Model.

Christophe.

>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/aarch64/vldN_1.c: New test.
>
> OK /Marcus


Re: [PATCH][ARM] Enable auto-vectorization for copysignf

2014-09-11 Thread Jiong Wang


On 11/09/14 14:43, Christophe Lyon wrote:

Hi Jiong,

On 9 September 2014 12:59, Ramana Radhakrishnan
 wrote:

On Mon, Aug 18, 2014 at 11:31 AM, Jiong Wang  wrote:

this patch enables auto-vectorization for copysignf by using the vector
bit selection instruction on arm32 when NEON is available.


I've noticed that your new testcase fails (the scan-tree-dump-times
line), in the following cases:
* forcing -march=armv5t in RUNTESTFLAGS (targets
arm-none-linux-gnueabi and arm-none-linux-gnueabihf)
* target armeb-none-linux-gnueabihf

You can have a look at:
http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215067/report-build-info.html

If you go 1 level up at
http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215067/
you'll be able to browse into the per-target subdirs and get the .sum
files if you need them.


Christophe,

  the auto-test system is great!

  the testcase only passes when both the hardware and ABI options meet the requirements.

  I tried to skip those environments where NEON is not available with 
"dg-require-effective-target arm_neon_hw"

  there may be something not covered. I'll have a look.

  thanks.

-- Jiong



Christophe.



for a simple testcase:

   for (i = 0; i < N; i++)
 r[i] = __builtin_copysignf (a[i], b[i]);


assuming a vectorization factor of 4, the generated instruction sequence is:

 vmov.i32 q10, #2147483648  @ v4si
.L2:
 vld1.64 {d18-d19}, [ip:64]
 add r3, r3, #16
 add ip, ip, #16
 vldr d16, [r3, #-16]
 vldr d17, [r3, #-8]
 vbif q8, q9, q10



Ok.

Ramana



thanks.

gcc/
   * config/arm/arm.c (NEON_COPYSIGNF): New enum.
   (arm_init_neon_builtins): Support NEON_COPYSIGNF.
   (arm_builtin_vectorized_function): Likewise.
   * config/arm/arm_neon_builtins.def: New macro for copysignf.
   * config/arm/neon.md (neon_copysignf): New pattern for vector
copysignf.

gcc/testsuite/
   * gcc.target/arm/vect-copysignf.c: New testcase.





Re: [PATCH][AArch64 Testsuite] Add test of vld[234]q? intrinsic

2014-09-11 Thread Alan Lawrence
Yes, I had seen that, and the failure is expected. AFAICT the test is correct; 
but the implementation of vld[234] is incorrect on bigendian, because of 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59810 .


HTH, Alan.



Christophe Lyon wrote:

On 9 September 2014 12:19, Marcus Shawcroft  wrote:

On 8 September 2014 11:35, Alan Lawrence  wrote:

This adds a test of all the variants of vld2, vld2q, vld3, vld3q, vld4, and
vld4q. These all use typexNxM structs and the OI/CI/XImode mechanism, so the
test cross-checks this against plain ol' vst1(q?).

Cross-tested on aarch64-none-elf (passing), also on aarch64_be-none-elf
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59810).



Hi Alan,

On my side, your new test fails at execution on aarch64_be-none-elf:
http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215072/report-build-info.html

This seems strange since you tested it too.
I am using the Foundation Model.

Christophe.


gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vldN_1.c: New test.

OK /Marcus







[PATCH 4.9] Backported r214946: One-liner: fix type of an add in SIMD registers

2014-09-11 Thread Alan Lawrence

Original patch applied cleanly to 4.9 HEAD as r215175.

Marcus Shawcroft wrote:

On 20 August 2014 10:25, Alan Lawrence  wrote:

The SIMD-register variant is miscategorized as "alu_reg" despite not using
any ALU registers, and should be "neon_add" for e.g. scheduling.

Tested with check-gcc and check-g++ on aarch64-none-elf and
aarch64_be-none-elf.

gcc/ChangeLog:

* config/aarch64/aarch64.md (adddi3_aarch64): set type to neon_add.


OK and back port please.

/Marcus






Re: [AArch64] Cheap fix for argument types of vmull_high_lane_{us}{16,32}

2014-09-11 Thread Marcus Shawcroft
On 11 September 2014 13:15, James Greenhalgh  wrote:
>
> Hi,
>
> I'd been putting this patch off in the hope that I might find
> time to move these intrinsics to a C/builtin implementation, but it
> is probably better to get them right for now and come back to improving
> them later.
>
> All four of these suffer the same problem, their "lane" argument should
> be a 64-bit rather than 128-bit vector.
>
> Fix it the obvious way.
>
> Tested cross on aarch64-none-eabi.
>
> OK?

OK /Marcus


RE: [PATCH] RE: gcc parallel make check

2014-09-11 Thread VandeVondele Joost

>> could it be that the pattern in normal1 should have been '[ab]*/ de*/ 
>> [ep]*/*' ?
>
>Yes, we are running these tests multiple times:
>
>PASS: 23_containers/map/modifiers/erase/abi_tag.cc (test for excess errors)
>PASS: 23_containers/multimap/modifiers/erase/abi_tag.cc (test for excess 
>errors)
>PASS: 23_containers/multiset/modifiers/erase/abi_tag.cc (test for excess 
>errors)
>PASS: 23_containers/set/modifiers/erase/abi_tag.cc (test for excess errors)
>PASS: 26_numerics/complex/abi_tag.cc (test for excess errors)
>
>I'll fix that.

Actually, the proper pattern should presumably be '[ab]*/* de*/* [ep]*/*' even 
though it seems to make no difference in testing. I'll have this included in 
yet another version of the parallel make check patch (plus some further 
reschuffling as requested by Jakub), so I think there is no need for you to fix 
this now.


Re: [PATCH] RE: gcc parallel make check

2014-09-11 Thread Jonathan Wakely
On 11 September 2014 15:45, VandeVondele  Joost
 wrote:
>
>>> could it be that the pattern in normal1 should have been '[ab]*/ de*/ 
>>> [ep]*/*' ?
>>
>>Yes, we are running these tests multiple times:
>>
>>PASS: 23_containers/map/modifiers/erase/abi_tag.cc (test for excess errors)
>>PASS: 23_containers/multimap/modifiers/erase/abi_tag.cc (test for excess 
>>errors)
>>PASS: 23_containers/multiset/modifiers/erase/abi_tag.cc (test for excess 
>>errors)
>>PASS: 23_containers/set/modifiers/erase/abi_tag.cc (test for excess errors)
>>PASS: 26_numerics/complex/abi_tag.cc (test for excess errors)
>>
>>I'll fix that.
>
> Actually, the proper pattern should presumably be '[ab]*/* de*/* [ep]*/*' 
> even though it seems to make no difference in testing.

Yes, that's what I'm testing.

> I'll have this included in yet another version of the parallel make check 
> patch (plus some further reshuffling as requested by Jakub), so I think 
> there is no need for you to fix this now.

This can (and should) be fixed now, without waiting for some other change.


Re: [PATCH] gcc parallel make check

2014-09-11 Thread Jakub Jelinek
On Thu, Sep 11, 2014 at 10:06:40AM +0200, Jakub Jelinek wrote:
> There is an option to touch say *-parallel/finished file once any of the
> check-parallel-gcc-{1,2,...} goals is done (because when it finishes, it
> means all the tests for the particular check-$lang that are parallelizable
> have either finished, or at least touched their file) and not start runtest
> at all if finished already exists, but guess it would be still undesirable to have
> tens of thousands of goals by default, so perhaps we could go with say
> 128 subgoals by default and have some env var to override it, so on the
> really highly parallel boxes you'd specify
> make -j512 -k check GCC_TEST_PARALLEL_SLOTS=512
> or similar.

Here is a patch I'm testing now:

--- gcc/Makefile.in.jj  2014-09-08 22:12:56.0 +0200
+++ gcc/Makefile.in 2014-09-11 16:06:36.641219430 +0200
@@ -513,34 +513,10 @@ xm_include_list=@xm_include_list@
 xm_defines=@xm_defines@
 lang_checks=
 lang_checks_parallelized=
-dg_target_exps:=aarch64.exp,alpha.exp,arm.exp,avr.exp,bfin.exp,cris.exp
-dg_target_exps:=$(dg_target_exps),epiphany.exp,frv.exp,i386.exp,ia64.exp
-dg_target_exps:=$(dg_target_exps),m68k.exp,microblaze.exp,mips.exp,powerpc.exp
-dg_target_exps:=$(dg_target_exps),rx.exp,s390.exp,sh.exp,sparc.exp,spu.exp
-dg_target_exps:=$(dg_target_exps),tic6x.exp,xstormy16.exp
-# This lists a couple of test files that take most time during check-gcc.
-# When doing parallelized check-gcc, these can run in parallel with the
-# remaining tests.  Each word in this variable stands for work for one
-# make goal and one extra make goal is added to handle all the *.exp
-# files not handled explicitly already.  If multiple *.exp files
-# should be run in the same runtest invocation (usually if they aren't
-# very long running, but still should be split of from the check-parallel-$lang
-# remaining tests runtest invocation), they should be concatenated with commas.
-# Note that [a-zA-Z] wildcards need to have []s prefixed with \ (needed
-# by tcl) and as the *.exp arguments are mached both as is and with
-# */ prefixed to it in runtest_file_p, it is usually desirable to include
-# a subdirectory name.
-check_gcc_parallelize=execute.exp=execute/2* \
- execute.exp=execute/\[013-9a-fA-F\]* \
- execute.exp=execute/\[pP\]*,dg.exp \
- 
execute.exp=execute/\[g-oq-zG-OQ-Z\]*,compile.exp=compile/2* \
- compile.exp=compile/\[9pP\]*,builtins.exp \
- compile.exp=compile/\[013-8a-oq-zA-OQ-Z\]* \
- dg-torture.exp,ieee.exp \
- vect.exp,unsorted.exp \
- guality.exp \
- struct-layout-1.exp,stackalign.exp \
- $(dg_target_exps)
+# Upper limit to which it is useful to parallelize this lang target.
+# It doesn't make sense to try e.g. 128 goals for small testsuites
+# like objc or go.
+check_gcc_parallelize=1
 lang_opt_files=@lang_opt_files@ $(srcdir)/c-family/c.opt $(srcdir)/common.opt
 lang_specs_files=@lang_specs_files@
 lang_tree_files=@lang_tree_files@
@@ -3631,27 +3607,32 @@ $(filter-out $(lang_checks_parallelized)
export TCL_LIBRARY ; fi ; \
$(RUNTEST) --tool $* $(RUNTESTFLAGS))
 
-$(patsubst %,%-subtargets,$(filter-out $(lang_checks_parallelized),$(lang_checks))): check-%-subtargets:
+$(patsubst %,%-subtargets,$(lang_checks)): check-%-subtargets:
@echo check-$*
 
 check_p_tool=$(firstword $(subst _, ,$*))
-check_p_vars=$(check_$(check_p_tool)_parallelize)
+check_p_count=$(check_$(check_p_tool)_parallelize)
 check_p_subno=$(word 2,$(subst _, ,$*))
-check_p_comma=,
-check_p_subwork=$(subst $(check_p_comma), ,$(if $(check_p_subno),$(word 
$(check_p_subno),$(check_p_vars
-check_p_numbers=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
+check_p_numbers0:=1 2 3 4 5 6 7 8 9
+check_p_numbers1:=0 $(check_p_numbers0)
+check_p_numbers2:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers1)))
+check_p_numbers3:=$(patsubst %,0%,$(check_p_numbers1)) $(check_p_numbers2)
+check_p_numbers4:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers3)))
+check_p_numbers5:=$(patsubst %,0%,$(check_p_numbers3)) $(check_p_numbers4)
+check_p_numbers6:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers5)))
+check_p_numbers:=$(check_p_numbers0) $(check_p_numbers2) $(check_p_numbers4) $(check_p_numbers6)
 check_p_subdir=$(subst _,,$*)
-check_p_subdirs=$(wordlist 1,$(words $(check_$*_parallelize)),$(check_p_numbers))
+check_p_subdirs=$(wordlist 1,$(check_p_count),$(wordlist 1,$(or $(GCC_TEST_PARALLEL_SLOTS),128),$(check_p_numbers)))
 
 # For parallelized check-% targets, this decides whether parallelization
 # is desirable (if -jN is used and RUNTESTFLAGS doesn't contain anything
 # but optional --target_board or --extra_opts arguments).  If desirable,
 # recursive make is run with check-parallel-$lang{,1,2,3,4,5} etc
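
The check_p_numbers0 .. check_p_numbers6 definitions in the hunk above build the word list 1 .. 9999 purely by prefixing digits, since make has no arithmetic. As a reading aid, here is a sketch of the same construction outside make. The C++ helper names are hypothetical; only the variable roles (n0 .. n6, the slot default of 128) come from the patch.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using Words = std::vector<std::string>;

// Like $(patsubst %,<prefix>%,<words>): prepend prefix to every word.
Words with_prefix(const std::string& prefix, const Words& words) {
    Words out;
    for (const std::string& w : words)
        out.push_back(prefix + w);
    return out;
}

// Like $(foreach i,<digits>,$(patsubst %,$(i)%,<words>)).
Words foreach_prefix(const Words& digits, const Words& words) {
    Words out;
    for (const std::string& d : digits) {
        const Words part = with_prefix(d, words);
        out.insert(out.end(), part.begin(), part.end());
    }
    return out;
}

Words concat(Words a, const Words& b) {
    a.insert(a.end(), b.begin(), b.end());
    return a;
}

// Mirrors check_p_numbers: the words "1" .. "9999" in ascending order.
Words check_p_numbers() {
    const Words n0 = {"1", "2", "3", "4", "5", "6", "7", "8", "9"};
    const Words n1 = concat(Words{"0"}, n0);            // 0..9
    const Words n2 = foreach_prefix(n0, n1);            // 10..99
    const Words n3 = concat(with_prefix("0", n1), n2);  // 00..99
    const Words n4 = foreach_prefix(n0, n3);            // 100..999
    const Words n5 = concat(with_prefix("0", n3), n4);  // 000..999
    const Words n6 = foreach_prefix(n0, n5);            // 1000..9999
    return concat(concat(concat(n0, n2), n4), n6);
}

// Mirrors check_p_subdirs: at most `count` goals, capped by the slot
// limit (GCC_TEST_PARALLEL_SLOTS in the patch, defaulting to 128).
Words check_p_subdirs(std::size_t count, std::size_t slots = 128) {
    const Words all = check_p_numbers();
    const std::size_t n = std::min({count, slots, all.size()});
    return Words(all.begin(), all.begin() + n);
}
```

The zero-padded intermediate lists (n3, n5) exist only so that the next round of prefixing yields every value with the right number of digits; they never appear in the final list.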

Re: [C++ Patch] PR 61489

2014-09-11 Thread Jason Merrill

Do we need a documentation update?

Jason


Re: [PATCH][AArch64 Testsuite] Add test of vld[234]q? intrinsic

2014-09-11 Thread Christophe Lyon
Ha OK, I had misunderstood your first email, and thought you had the
test also pass in big endian.

Thanks for the clarification.


On 11 September 2014 15:56, Alan Lawrence  wrote:
> Yes, I had seen that, and the failure is expected. AFAICT the test is
> correct; but the implementation of vld[234] is incorrect on bigendian,
> because of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59810 .
>
> HTH, Alan.
>
>
>
>
> Christophe Lyon wrote:
>>
>> On 9 September 2014 12:19, Marcus Shawcroft 
>> wrote:
>>>
>>> On 8 September 2014 11:35, Alan Lawrence  wrote:

>>>> This adds a test of all the variants of vld2, vld2q, vld3, vld3q, vld4,
>>>> and
>>>> vld4q. These all use typexNxM structs and the OI/CI/XImode mechanism, so
>>>> the
>>>> test cross-checks this against plain ol' vst1(q?).
>>>>
>>>> Cross-tested on aarch64-none-elf (passing), also on aarch64_be-none-elf
>>>> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59810).

>>
>> Hi Alan,
>>
>> On my side, your new test fails at execution on aarch64_be-none-elf:
>>
>> http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215072/report-build-info.html
>>
>> This seems strange since you tested it too.
>> I am using the Foundation Model.
>>
>> Christophe.
>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> * gcc.target/aarch64/vldN_1.c: New test.
>>>
>>> OK /Marcus
>>
>>
>
>


Re: [AArch64] Cheap fix for argument types of vmull_high_lane_{us}{16,32}

2014-09-11 Thread James Greenhalgh
On Thu, Sep 11, 2014 at 03:26:49PM +0100, Marcus Shawcroft wrote:
> On 11 September 2014 13:15, James Greenhalgh  wrote:
> >
> > Hi,
> >
> > I'd been putting this patch off in the hope that I might find
> > time to move these intrinsics to a C/builtin implementation, but it
> > is probably better to get them right for now and come back to improving
> > them later.
> >
> > All four of these suffer the same problem, their "lane" argument should
> > be a 64-bit rather than 128-bit vector.
> >
> > Fix it the obvious way.
> >
> > Tested cross on aarch64-none-eabi.
> >
> > OK?
> 
> OK /Marcus
> 

Thanks Marcus,

After your offline pre-approval I've also backported this fix to the 4.9
branch as revision 215178.

Cheers,
James



RE: [PATCH 1/4] AArch64: Fix register_move_cost

2014-09-11 Thread Wilco Dijkstra
Patch attached for commit as I don't have write access.

> -Original Message-
> From: Marcus Shawcroft [mailto:marcus.shawcr...@gmail.com]
> Sent: 04 September 2014 16:23
> To: Wilco Dijkstra
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH 1/4] AArch64: Fix register_move_cost
> 
> On 4 September 2014 15:44, Wilco Dijkstra  wrote:
> > Hi,
> >
> > This is a set of patches improving register costs on AArch64. The first fixes
> > aarch64_register_move_cost() to support CALLER_SAVE_REGS and POINTER_REGS so costs
> > are calculated correctly in the register allocator.
> >
> > ChangeLog:
> > 2014-09-04  Wilco Dijkstra  
> >
> > * gcc/config/aarch64/aarch64.c (aarch64_register_move_cost):
> > Add cost handling of CALLER_SAVE_REGS and POINTER_REGS.
> 
> OK /Marcus
---
 gcc/config/aarch64/aarch64.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 023f9fd..56b8eda 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5932,6 +5932,13 @@ aarch64_register_move_cost (enum machine_mode mode,
   const struct cpu_regmove_cost *regmove_cost
 = aarch64_tune_params->regmove_cost;
 
+  /* Caller save and pointer regs are equivalent to GENERAL_REGS.  */
+  if (to == CALLER_SAVE_REGS || to == POINTER_REGS)
+to = GENERAL_REGS;
+
+  if (from == CALLER_SAVE_REGS || from == POINTER_REGS)
+from = GENERAL_REGS;
+
   /* Moving between GPR and stack cost is the same as GP2GP.  */
   if ((from == GENERAL_REGS && to == STACK_REG)
   || (to == GENERAL_REGS && from == STACK_REG))
-- 
1.9.1
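
To make the effect of the hunk above concrete, here is a toy model of the lookup. It is an assumption-laden sketch: the enum, the struct and the dispatch after the normalization step are hypothetical stand-ins, and only the normalization itself and the stack rule are taken from the diff.

```cpp
// Hypothetical stand-ins for the register classes named in the patch.
enum class RegClass { General, CallerSave, Pointer, Stack, Fp };

// Hypothetical cost table with the four fields used by the AArch64 code.
struct RegmoveCost {
    int gp2gp;
    int gp2fp;
    int fp2gp;
    int fp2fp;
};

// The step the patch adds: caller-save and pointer classes are treated
// exactly like the general class.
RegClass normalize(RegClass c) {
    return (c == RegClass::CallerSave || c == RegClass::Pointer)
               ? RegClass::General
               : c;
}

// Toy dispatch; the ordering of the checks after normalization is assumed.
int toy_register_move_cost(RegClass from, RegClass to, const RegmoveCost& cost) {
    from = normalize(from);
    to = normalize(to);

    // Moving between GPR and stack costs the same as GP2GP.
    if ((from == RegClass::General && to == RegClass::Stack) ||
        (to == RegClass::General && from == RegClass::Stack))
        return cost.gp2gp;

    if (from == RegClass::General && to == RegClass::General)
        return cost.gp2gp;
    if (from == RegClass::General)
        return cost.gp2fp;
    if (to == RegClass::General)
        return cost.fp2gp;
    return cost.fp2fp;
}
```

Without the normalization, a move from a pointer class to a caller-save class would match none of the general-register checks and fall through to the FP2FP cost, which is the kind of miscosting the patch description refers to.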



RE: [PATCH 2/4] AArch64: Fix cost for Q register moves

2014-09-11 Thread Wilco Dijkstra
Patch attached for commit as I don't have write access.

ChangeLog:
2014-09-11  Wilco Dijkstra  

* gcc/config/aarch64/aarch64.c (aarch64_register_move_cost):
Fix Q register move handling.  (generic_regmove_cost): Undo raised 
FP2FP move cost as Q register moves are now handled correctly.

> -Original Message-
> From: Marcus Shawcroft [mailto:marcus.shawcr...@gmail.com]
> Sent: 04 September 2014 16:54
> To: Wilco Dijkstra
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH 2/4] AArch64: Fix cost for Q register moves
> 
> On 4 September 2014 16:41, Wilco Dijkstra  wrote:
> >> From: Marcus Shawcroft [mailto:marcus.shawcr...@gmail.com]
> >> > -  NAMED_PARAM (FP2FP, 4)
> >> > +  NAMED_PARAM (FP2FP, 2)
> >>
> >> This is not directly related to the change below and it is missing
> >> from the ChangeLog.   Originally this number had to be > 2 in order
> >> for secondary reload to kick in.  See the comment above the second
> >> hunk of this patch.  Why is it OK to lower this number ?
> >
> > It is related because the GET_MODE_SIZE bug means it never returns the
> > correct cost, but instead returns the FP2FP cost. So the FP2FP cost had
> > to be artificially increased. With the fix this is no longer required.
> 
> Yep, I read the code again, I understand.  You still need to fix the
> ChangeLog.  OK to commit with a fixed ChangeLog.
> 
> Cheers
> 
> /Marcus
---
 gcc/config/aarch64/aarch64.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 56b8eda..62b0168 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -215,10 +215,7 @@ static const struct cpu_regmove_cost generic_regmove_cost =
   NAMED_PARAM (GP2GP, 1),
   NAMED_PARAM (GP2FP, 2),
   NAMED_PARAM (FP2GP, 2),
-  /* We currently do not provide direct support for TFmode Q->Q move.
- Therefore we need to raise the cost above 2 in order to have
- reload handle the situation.  */
-  NAMED_PARAM (FP2FP, 4)
+  NAMED_PARAM (FP2FP, 2)
 };
 
 /* Generic costs for vector insn classes.  */
@@ -5961,7 +5958,7 @@ aarch64_register_move_cost (enum machine_mode mode,
  secondary reload.  A general register is used as a scratch to move
  the upper DI value and the lower DI value is moved directly,
  hence the cost is the sum of three moves. */
-  if (! TARGET_SIMD && GET_MODE_SIZE (mode) == 128)
+  if (! TARGET_SIMD && GET_MODE_SIZE (mode) == 16)
 return regmove_cost->GP2FP + regmove_cost->FP2GP + regmove_cost->FP2FP;
 
   return regmove_cost->FP2FP;
-- 
1.9.1
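
The effect of the GET_MODE_SIZE fix can be sketched in isolation. GET_MODE_SIZE reports a size in bytes, so the old comparison against 128 could never match a 16-byte (128-bit) value, and the cost function always fell through to the plain FP2FP cost. The following is only a simplified model under that reading of the thread; the struct and function names are hypothetical stand-ins, not the GCC sources.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the cost table in the patch.  */
struct regmove_cost_sketch
{
  int gp2gp, gp2fp, fp2gp, fp2fp;
};

/* Simplified model of the FP-to-FP branch of the cost hook.
   mode_size_bytes plays the role of GET_MODE_SIZE (mode), which is
   measured in bytes, so a 128-bit Q register value has size 16.  */
int
fp2fp_move_cost_sketch (const struct regmove_cost_sketch *c, bool have_simd,
                        int mode_size_bytes)
{
  /* Without SIMD, a 128-bit move goes through a general register,
     hence the cost is the sum of three moves.  */
  if (!have_simd && mode_size_bytes == 16)
    return c->gp2fp + c->fp2gp + c->fp2fp;

  return c->fp2fp;
}
```

With the generic table from the patch (GP2FP 2, FP2GP 2, FP2FP 2), the no-SIMD 128-bit case now costs 6 instead of silently costing 2, which is why the artificial FP2FP value of 4 is no longer needed.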



RE: [PATCH 3/4] AArch64: Cleanup inconsistent use of __extension__

2014-09-11 Thread Wilco Dijkstra
OK, I'll skip this patch for now as HAVE_DESIGNATED_INITIALIZERS should
always be false, so there is no point in cleaning it up.

> -----Original Message-----
> From: Marcus Shawcroft [mailto:marcus.shawcr...@gmail.com]
> Sent: 04 September 2014 16:42
> To: Wilco Dijkstra
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH 3/4] AArch64: Cleanup inconsistent use of __extension__
> 
> On 4 September 2014 16:39, Marcus Shawcroft  
> wrote:
> > On 4 September 2014 15:45, Wilco Dijkstra  wrote:
> >> Cleanup inconsistent use of __extension__.
> >>
> >> ChangeLog:
> >> 2014-09-04  Wilco Dijkstra  
> >>
> >> * gcc/config/aarch64/aarch64.c: Cleanup use of __extension__.
> >
> > Write a proper ChangeLog entry please.
> > /Marcus
> 
> 
> Actually, on second thoughts, I think it better to just remove the
> three lines of the first spurious instance of:
> 
> #if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
> __extension__
> #endif
> 
> and leave the other instances alone.
> 
> /Marcus





RE: [PATCH 4/4] AArch64: Add regmove_costs for Cortex-A57 and A53

2014-09-11 Thread Wilco Dijkstra
I've kept the integer move costs at 1 - patch attached for commit as I don't 
have write access.

ChangeLog:
2014-09-11  Wilco Dijkstra  

* gcc/config/aarch64/aarch64.c:
(cortexa57_regmove_cost): New cost table for A57. 
(cortexa53_regmove_cost): New cost table for A53.
Increase GP2FP/FP2GP cost to spilling from integer to FP registers.

> -----Original Message-----
> From: Marcus Shawcroft [mailto:marcus.shawcr...@gmail.com]
> Sent: 04 September 2014 17:40
> To: Wilco Dijkstra
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH 4/4] AArch64: Add regmove_costs for Cortex-A57 and A53
> 
> On 4 September 2014 15:47, Wilco Dijkstra  wrote:
> > This patch adds regmove_costs for Cortex-A57 and A53, and sets the cost of 
> > GP2FP/FP2GP
> higher than
> > memory cost to block the register allocator allocating integer values in FP 
> > registers.
> >
> > Overall these patches give 2-3% speedup on SPEC.
> >
> > This passes all regression tests (with this fix
> > https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00356.html).
> >
> > OK for commit?
> >
> > Wilco
> >
> > ChangeLog:
> > 2014-09-04  Wilco Dijkstra  
> >
> > * gcc/config/aarch64/aarch64.c:
> > Add cortexa57_regmove_cost and cortexa53_regmove_cost to avoid
> > spilling from integer to FP registers.
> 
> Write a proper ChangeLog entry please.
> 
> Keep the GP2GP cost aligned with generic until we have justification
> to change it.
> 
> /Marcus
---
 gcc/config/aarch64/aarch64.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 62b0168..bb092ca 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -218,6 +218,26 @@ static const struct cpu_regmove_cost generic_regmove_cost =
   NAMED_PARAM (FP2FP, 2)
 };
 
+static const struct cpu_regmove_cost cortexa57_regmove_cost =
+{
+  NAMED_PARAM (GP2GP, 1),
+  /* Avoid the use of slow int<->fp moves for spilling by setting
+ their cost higher than memmov_cost.  */
+  NAMED_PARAM (GP2FP, 5),
+  NAMED_PARAM (FP2GP, 5),
+  NAMED_PARAM (FP2FP, 2)
+};
+
+static const struct cpu_regmove_cost cortexa53_regmove_cost =
+{
+  NAMED_PARAM (GP2GP, 1),
+  /* Avoid the use of slow int<->fp moves for spilling by setting
+ their cost higher than memmov_cost.  */
+  NAMED_PARAM (GP2FP, 5),
+  NAMED_PARAM (FP2GP, 5),
+  NAMED_PARAM (FP2FP, 2)
+};
+
 /* Generic costs for vector insn classes.  */
 #if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
 __extension__
@@ -275,7 +295,7 @@ static const struct tune_params cortexa53_tunings =
 {
   &cortexa53_extra_costs,
   &generic_addrcost_table,
-  &generic_regmove_cost,
+  &cortexa53_regmove_cost,
   &generic_vector_cost,
   NAMED_PARAM (memmov_cost, 4),
   NAMED_PARAM (issue_rate, 2)
@@ -285,7 +305,7 @@ static const struct tune_params cortexa57_tunings =
 {
   &cortexa57_extra_costs,
   &cortexa57_addrcost_table,
-  &generic_regmove_cost,
+  &cortexa57_regmove_cost,
   &cortexa57_vector_cost,
   NAMED_PARAM (memmov_cost, 4),
   NAMED_PARAM (issue_rate, 3)
-- 
1.9.1
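
The intent of the new tables can be illustrated with a toy comparison. This is only a sketch: the real register allocator weighs many more factors, and the struct and function below are hypothetical names chosen for illustration. It assumes the simplest possible rule, namely that an int<->fp move is attractive for spilling only while it is cheaper than a memory move.

```c
#include <stdbool.h>

/* Hypothetical subset of the tuning parameters discussed in the patch.  */
struct tune_sketch
{
  int gp2fp;        /* cost of a general -> FP register move */
  int fp2gp;        /* cost of an FP -> general register move */
  int memmov_cost;  /* cost of a memory move */
};

/* Toy rule: parking an integer value in an FP register looks cheaper
   than spilling to memory only if both directions of the int<->fp
   move cost less than a memory move.  */
bool
fp_reg_spill_looks_cheaper (const struct tune_sketch *t)
{
  return t->gp2fp < t->memmov_cost && t->fp2gp < t->memmov_cost;
}
```

Under this rule, the previous A53/A57 setup (generic GP2FP/FP2GP of 2 against memmov_cost 4) favours FP registers for spills, while the new cost of 5 does not.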



Re: [wwwdocs] Mention Cilk Plus support

2014-09-11 Thread Aldy Hernandez
Gerald Pfeifer  writes:


> Hi Igor,
>
> On Wed, 10 Sep 2014, Zamyatin, Igor wrote:
>> + Complete support for http://cilk.org";>Cilk Plus 
>> features was added to GCC
>> + [2014-09-02]

features *were* added, plural.

>> + Contributed by Jakub Jelinek, Iyer Balaji and Igor
>> Zamyatin.

What?  No Aldy Hernandez?  I spent the better part of a year working on
this.  Although perhaps it's better if no one remembers that ;-).

Aldy
>
> can you please make this "Cilk Plus support" or "Full Cilk Plus support"
> in the title, and then use the current title as the first part of the
> more detailed description?  (You can use the latest AVX announcement as
> an example.)
>
> Gerald


Re: Fix Libreoffice LTO build failure

2014-09-11 Thread Aldy Hernandez
Jan Hubicka  writes:


> Hi,
> Libreoffice fails to build because ltrans tries to fetch in variable 
> constructor
> that is not shipped there.  Fixed thus.
>
> Bootstrapped/regtested x86_64-linux.
>
> Honza
>
>   * varpool.c (varpool_node::ctor_useable_for_folding_p): Do not try
>   to access removed nodes.
>
> Index: varpool.c
> ===
> --- varpool.c (revision 215100)
> +++ varpool.c (working copy)
> @@ -316,6 +316,11 @@ varpool_node::ctor_useable_for_folding_p
>&& !real_node->lto_file_data)
>  return false;
>  
> +  /* Avoid attempts to load constructors that was not streamed.  */

that *were* not streamed.  Plural.

Aldy


Re: ptx preliminary rtl patches [2/4]

2014-09-11 Thread Steven Bosscher
On Thu, Sep 11, 2014 at 3:25 PM, Bernd Schmidt wrote:
> Bootstrapped and tested on x86_64-linux, together with the other patches.
> Ok?

This is OK.

Ciao!
Steven


Re: [PATCH][ARM] Enable auto-vectorization for copysignf

2014-09-11 Thread Jiong Wang


On 11/09/14 14:55, Jiong Wang wrote:

On 11/09/14 14:43, Christophe Lyon wrote:

Hi Jiong,

On 9 September 2014 12:59, Ramana Radhakrishnan
 wrote:

On Mon, Aug 18, 2014 at 11:31 AM, Jiong Wang  wrote:

this patch enable auto-vectorization for copysignf by using vector
bit selection instruction on arm32 when neon available.


I've noticed that your new testcase fails (the scan-tree-dump-times
line), in the following cases:
* forcing -march=armv5t in RUNTESTFLAGS (targets
arm-none-linux-gnueabi and arm-none-linux-gnueabihf)
* target armeb-none-linux-gnueabihf

You can have a look at:
http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215067/report-build-info.html

If you go 1 level up at
http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/215067/
you'll be able to browse into the per-target subdirs and get the .sum
files if you need them.

Christophe,

the auto-test system is great!

the testcase only passes when both the hardware and the ABI options meet the
requirements.

I tried to skip environments where NEON is not available by using
"dg-require-effective-target arm_neon_hw"

there may be something not covered. I'll have a look.

thanks.


   the scan of

"/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */"

   is a little bit fragile: it needs both -mfpu and -mfloat-abi to meet the
   requirements, and thus causes trouble if the test environment has
   complicated option combinations, like the Linaro test farm.

   currently, I haven't found any good way in DejaGnu to accurately detect
   which options are in use.

   I tried specifying options like -march=armv7 -mfloat-abi=hard and doing a
   compile-only test, but even then these options may be overridden by options
   specified explicitly by the user, so the test passes the DejaGnu
   check-*-target checks and then fails on the later actual compile.

   the one way I can think of which is 100% safe on all test environments is:

 * keep the test as a "run" test only, as the correctness is important.
 * remove the scan of "vectorized 1 loops".

-- Jiong



-- Jiong


Christophe.



for a simple testcase:

for (i = 0; i < N; i++)
  r[i] = __builtin_copysignf (a[i], b[i]);


assuming a vectorization factor of 4, the generated instruction sequence is:

  vmov.i32  q10, #2147483648  @ v4si
.L2:
  vld1.64   {d18-d19}, [ip:64]
  add       r3, r3, #16
  add       ip, ip, #16
  vldr      d16, [r3, #-16]
  vldr      d17, [r3, #-8]
  vbif      q8, q9, q10
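
The bit-selection idea behind this sequence can be written out for a single lane in plain C. This is a minimal scalar sketch, assuming IEEE 754 single-precision floats; the function name is hypothetical and the code is not taken from the patch. The mask 0x80000000 is the same constant as the #2147483648 loaded by vmov.i32.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of one vector lane: take the sign bit from B and every
   other bit from A.  Assumes float is IEEE 754 binary32.  */
float
copysign_bitselect_sketch (float a, float b)
{
  const uint32_t sign_mask = UINT32_C (0x80000000);  /* 2147483648 */
  uint32_t ua, ub, ur;
  float r;

  /* memcpy is the portable way to reinterpret the bits.  */
  memcpy (&ua, &a, sizeof ua);
  memcpy (&ub, &b, sizeof ub);

  /* Bit select: magnitude bits from A, sign bit from B.  */
  ur = (ua & ~sign_mask) | (ub & sign_mask);

  memcpy (&r, &ur, sizeof r);
  return r;
}
```

The vectorized form applies the same select to four lanes at once, which is why a single mask register is enough for the whole loop.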


Ok.

Ramana



thanks.

gcc/
* config/arm/arm.c (NEON_COPYSIGNF): New enum.
(arm_init_neon_builtins): Support NEON_COPYSIGNF.
(arm_builtin_vectorized_function): Likewise.
* config/arm/arm_neon_builtins.def: New macro for copysignf.
* config/arm/neon.md (neon_copysignf): New pattern for vector
copysignf.

gcc/testsuite/
* gcc.target/arm/vect-copysignf.c: New testcase.








Re: ptx preliminary rtl patches [3/4]

2014-09-11 Thread Steven Bosscher
On Thu, Sep 11, 2014 at 3:26 PM, Bernd Schmidt wrote:
> nvptx will be the first port to use BImode and have STORE_FLAG_VALUE==-1.
> That has exposed a bug in combine where we can end up calling
> num_sign_bit_copies for a BImode value. However, the return value is always
> 1 in that case, so it doesn't tell us anything and is going to be
> misinterpreted by the caller.
>
> Bootstrapped and tested on x86_64-linux, together with the other patches.
> Ok?

This should be handled in num_sign_bit_copies itself, i.e. handle BImode there.

Ciao!
Steven


Re: ptx preliminary rtl patches [4/4]

2014-09-11 Thread Steven Bosscher
On Thu, Sep 11, 2014 at 3:27 PM, Bernd Schmidt wrote:
> It turns out that we're calling eliminate_regs for global variables which
> can't possibly have eliminable regs in their decl. At that point,
> reg_eliminate can be NULL. This patch avoids unnecessary work, and allows us
> to add an assert to eliminate_regs later.
>
> Bootstrapped and tested on x86_64-linux, together with the other patches.
> Ok?

Why not use is_global_var()?

Ciao!
Steven


[jit] MAINTAINERS: Add myself as jit maintainer

2014-09-11 Thread David Malcolm
Committed to branch dmalcolm/jit:

ChangeLog.jit:
* MAINTAINERS (Various Maintainers): Add myself as jit maintainer.

gcc/jit/ChangeLog.jit:
* TODO.rst (Initial Release): Update for addition of myself as
maintainer.
---
 ChangeLog.jit         | 4 ++++
 MAINTAINERS           | 1 +
 gcc/jit/ChangeLog.jit | 5 +++++
 gcc/jit/TODO.rst      | 2 --
 4 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/ChangeLog.jit b/ChangeLog.jit
index 131f5a5..5d2db3f 100644
--- a/ChangeLog.jit
+++ b/ChangeLog.jit
@@ -1,3 +1,7 @@
+2014-09-11  David Malcolm  
+
+   * MAINTAINERS (Various Maintainers): Add myself as jit maintainer.
+
 2013-10-03  David Malcolm  
 
* configure.ac: Add --enable-host-shared
diff --git a/MAINTAINERS b/MAINTAINERS
index cb55e9c..4d436fb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -256,6 +256,7 @@ testsuite               Janis Johnson           jani...@codesourcery.com
 register allocation     Vladimir Makarov        vmaka...@redhat.com
 gdbhooks.py             David Malcolm           dmalc...@redhat.com
 SLSR                    Bill Schmidt            wschm...@linux.vnet.ibm.com
+jit                     David Malcolm           dmalc...@redhat.com
 
 Note that individuals who maintain parts of the compiler need approval to
 check in changes outside of the parts of the compiler they maintain.
diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index 029e29a..651b285 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,8 @@
+2014-09-11  David Malcolm  
+
+   * TODO.rst (Initial Release): Update for addition of myself as
+   maintainer.
+
 2014-09-10  David Malcolm  
 
* TODO.rst (Test suite): Multithreaded test is done.
diff --git a/gcc/jit/TODO.rst b/gcc/jit/TODO.rst
index ed8ffcc..bb43f1f 100644
--- a/gcc/jit/TODO.rst
+++ b/gcc/jit/TODO.rst
@@ -57,8 +57,6 @@ Initial Release
 
 * add a SONAME to the library (and potentially version the symbols?)
 
-* add myself as maintainer
-
 * do we need alternative forms of division (floor vs rounding)?
 
 * are we missing any ops?
-- 
1.8.5.3



Re: [PATCHv2] Vimrc config with GNU formatting

2014-09-11 Thread Yury Gribov

On 09/11/2014 02:10 PM, Yury Gribov wrote:

On 09/11/2014 01:18 PM, Richard Biener wrote:
On Thu, Sep 11, 2014 at 11:06 AM, Richard Biener
  wrote:

>On Wed, Sep 10, 2014 at 10:09 AM, Yury Gribov
wrote:

>>Hi all,
>>
>>This is a second version of patch which adds a Vim config
(.local.vimrc)
>>to root folder to allow automatic setup of GNU formatting for
C/C++/Java/Lex
>>GCC files.
>>
>>I've updated the code with comments from Richard and Bernhard
(which fixed
>>formatting
>>of lonely closing bracket).
>>
>>The patch caused a lively debate with Segher who wanted
.local.vimrc to not
>>be enabled
>>by default. We basically have two options:
>>1) put .local.vimrc to root (just like .dir-locals.el config for
Emacs)
>>2) put both .local.vimrc and .dir-locals.el to contrib and add
Makefile
>>targets
>>to create symlinks in root folder per user's request
>>I personally prefer 2) because this would IMHO improve the quality of
>>patches
>>(e.g. no more silly tab-whitespace formatting bugs).
>>
>>Thoughts? Ok to commit?

>
>It doesn't handle indenting switch/case correctly.  I get
>
>  switch (x)
>{
>case X:
>   {
>  int foo;
>...
>
>that is, the { after the case label is wrongly indented.  The same
happens
>for
>   {
>{
>}
>   }
>
>we seem to get two soft-tabs here.

setlocal cinoptions=>s,n-s,{s,:s,=s,g0,hs,p5,t0,+s,(0,u0,w1,m0

does better but still oddly handles


Also fails for

   if (1)
 {
 x = 2;
 }


Ok, it took some time. Basically we want the brace symbol to behave 
differently in two contexts:


1) add no additional offset when it does not follow a control-flow statement:
void
f ()
{
  int x;
  {
  }
}

2) but add one shiftwidth otherwise:
void
f ()
{
  if (1)
    {
    }
}

My understanding is that {N is too rigid: it always adds the same 
amount to the current indent. Thus we see parasitic whitespace in the 
first case.


I wonder what would be the best way to handle this. We could just live 
with that (free {}'s are rare anyway) or I could try to hack a custom 
indentexpr (this will of course increase the complexity of the patch).


-Y


Re: ptx preliminary rtl patches [3/4]

2014-09-11 Thread Bernd Schmidt

On 09/11/2014 05:55 PM, Steven Bosscher wrote:

On Thu, Sep 11, 2014 at 3:26 PM, Bernd Schmidt wrote:

nvptx will be the first port to use BImode and have STORE_FLAG_VALUE==-1.
That has exposed a bug in combine where we can end up calling
num_sign_bit_copies for a BImode value. However, the return value is always
1 in that case, so it doesn't tell us anything and is going to be
misinterpreted by the caller.

Bootstrapped and tested on x86_64-linux, together with the other patches.
Ok?


This should be handled in num_sign_bit_copies itself, i.e. handle BImode there.


What do you expect that function to do different? It returns the correct 
value.



Bernd




Re: ptx preliminary rtl patches [3/4]

2014-09-11 Thread Steven Bosscher
On Thu, Sep 11, 2014 at 6:06 PM, Bernd Schmidt wrote:
> On 09/11/2014 05:55 PM, Steven Bosscher wrote:
>>
>> On Thu, Sep 11, 2014 at 3:26 PM, Bernd Schmidt wrote:
>>>
>>> nvptx will be the first port to use BImode and have STORE_FLAG_VALUE==-1.
>>> That has exposed a bug in combine where we can end up calling
>>> num_sign_bit_copies for a BImode value. However, the return value is
>>> always
>>> 1 in that case, so it doesn't tell us anything and is going to be
>>> misinterpreted by the caller.
>>>
>>> Bootstrapped and tested on x86_64-linux, together with the other patches.
>>> Ok?
>>
>>
>> This should be handled in num_sign_bit_copies itself, i.e. handle BImode
>> there.
>
>
> What do you expect that function to do different? It returns the correct
> value.
>

No different. Just that if you want to check whether DECL is a global
variable then we have a predicate for it. So why use TREE_STATIC
instead?

In other words: Just trying to make/keep certain checks consistent. (A
hopeless cause, but a noble one... ;-)

Ciao!
Steven


[PATCH 1/2] Extend libiberty to allow append stdout and stderr to existing files.

2014-09-11 Thread Maxim Ostapenko
While working on the ICE debugging patch, I noticed that the libiberty 
interface doesn't allow appending stdout and stderr to existing files.


This small patch provides two new flags for pex_run and extends the 
open_write interface to handle the issue.


Does this patch look sane?

-Maxim
libiberty/ChangeLog:

2014-09-11  Max Ostapenko  

	* pex-common.h (struct pex_funcs): Add new parameter for open_write field.
	* pex-unix.c (pex_unix_open_write): Add support for new parameter.
	* pex-djgpp.c (pex_djgpp_open_write): Likewise.
	* pex-win32.c (pex_win32_open_write): Likewise.
	* pex-common.c (pex_run_in_environment): Likewise.


include/ChangeLog:

2014-09-11  Max Ostapenko  

	* libiberty.h (PEX_STDOUT_APPEND): New flag.
	(PEX_STDERR_APPEND): Likewise.

diff --git a/include/libiberty.h b/include/libiberty.h
index 56b8b43..bcc1f9a 100644
--- a/include/libiberty.h
+++ b/include/libiberty.h
@@ -445,6 +445,11 @@ extern struct pex_obj *pex_init (int flags, const char *pname,
on Unix.  */
 #define PEX_BINARY_ERROR	0x80
 
+/* Append stdout to existing file instead of truncating it.  */
+#define PEX_STDOUT_APPEND	0x100
+
+/* The same as PEX_STDOUT_APPEND, but for STDERR.  */
+#define PEX_STDERR_APPEND	0x200
 
 /* Execute one program.  Returns NULL on success.  On error returns an
error string (typically just the name of a system call); the error
diff --git a/libiberty/pex-common.c b/libiberty/pex-common.c
index 6fd3fde..146010a 100644
--- a/libiberty/pex-common.c
+++ b/libiberty/pex-common.c
@@ -267,7 +267,8 @@ pex_run_in_environment (struct pex_obj *obj, int flags, const char *executable,
   if (out < 0)
 {
   out = obj->funcs->open_write (obj, outname,
-(flags & PEX_BINARY_OUTPUT) != 0);
+(flags & PEX_BINARY_OUTPUT) != 0,
+(flags & PEX_STDOUT_APPEND) != 0);
   if (out < 0)
 	{
 	  *err = errno;
@@ -319,8 +320,9 @@ pex_run_in_environment (struct pex_obj *obj, int flags, const char *executable,
 }
   else
 {
-  errdes = obj->funcs->open_write (obj, errname, 
-   (flags & PEX_BINARY_ERROR) != 0);
+  errdes = obj->funcs->open_write (obj, errname,
+   (flags & PEX_BINARY_ERROR) != 0,
+   (flags & PEX_STDERR_APPEND) != 0);
   if (errdes < 0)
 	{
 	  *err = errno;
diff --git a/libiberty/pex-common.h b/libiberty/pex-common.h
index af338e6..b6db248 100644
--- a/libiberty/pex-common.h
+++ b/libiberty/pex-common.h
@@ -104,7 +104,7 @@ struct pex_funcs
   /* Open file NAME for writing.  If BINARY is non-zero, open in
  binary mode.  Return >= 0 on success, -1 on error.  */
   int (*open_write) (struct pex_obj *, const char */* name */,
- int /* binary */);
+ int /* binary */, int /* append */);
   /* Execute a child process.  FLAGS, EXECUTABLE, ARGV, ERR are from
  pex_run.  IN, OUT, ERRDES, TOCLOSE are all descriptors, from
  open_read, open_write, or pipe, or they are one of STDIN_FILE_NO,
diff --git a/libiberty/pex-djgpp.c b/libiberty/pex-djgpp.c
index 0721139..b014ffa 100644
--- a/libiberty/pex-djgpp.c
+++ b/libiberty/pex-djgpp.c
@@ -43,7 +43,7 @@ extern int errno;
 #endif
 
 static int pex_djgpp_open_read (struct pex_obj *, const char *, int);
-static int pex_djgpp_open_write (struct pex_obj *, const char *, int);
+static int pex_djgpp_open_write (struct pex_obj *, const char *, int, int);
 static pid_t pex_djgpp_exec_child (struct pex_obj *, int, const char *,
   char * const *, char * const *,
   int, int, int, int,
@@ -90,10 +90,12 @@ pex_djgpp_open_read (struct pex_obj *obj ATTRIBUTE_UNUSED,
 
 static int
 pex_djgpp_open_write (struct pex_obj *obj ATTRIBUTE_UNUSED,
-		  const char *name, int binary)
+		  const char *name, int binary, int append)
 {
   /* Note that we can't use O_EXCL here because gcc may have already
  created the temporary file via make_temp_file.  */
+  if (append)
+return -1;
   return open (name,
 	   (O_WRONLY | O_CREAT | O_TRUNC
 		| (binary ? O_BINARY : O_TEXT)),
diff --git a/libiberty/pex-unix.c b/libiberty/pex-unix.c
index addf8ee..0715115 100644
--- a/libiberty/pex-unix.c
+++ b/libiberty/pex-unix.c
@@ -301,7 +301,7 @@ pex_wait (struct pex_obj *obj, pid_t pid, int *status, struct pex_time *time)
 static void pex_child_error (struct pex_obj *, const char *, const char *, int)
  ATTRIBUTE_NORETURN;
 static int pex_unix_open_read (struct pex_obj *, const char *, int);
-static int pex_unix_open_write (struct pex_obj *, const char *, int);
+static int pex_unix_open_write (struct pex_obj *, const char *, int, int);
 static pid_t pex_unix_exec_child (struct pex_obj *, int, const char *,
  char * const *, char * const *,
  int, int, int, int,
@@ -350,11 +350,12 @@ pex_unix_open_read (struct pex_obj *obj ATTRIBUTE_UNUSED, const char *name,
 
 static int
 pex_unix_open_write (struct pex_obj *obj ATTRIBUTE_UNUSED, const char *name,
-		 int binary ATTRIBUTE_UNUSED)
+		 int binary ATTRIBUTE_UNUSED, int append)
 {
   /* Note that w

Re: ptx preliminary rtl patches [3/4]

2014-09-11 Thread Bernd Schmidt

On 09/11/2014 06:15 PM, Steven Bosscher wrote:

On Thu, Sep 11, 2014 at 6:06 PM, Bernd Schmidt wrote:

On 09/11/2014 05:55 PM, Steven Bosscher wrote:


On Thu, Sep 11, 2014 at 3:26 PM, Bernd Schmidt wrote:


nvptx will be the first port to use BImode and have STORE_FLAG_VALUE==-1.
That has exposed a bug in combine where we can end up calling
num_sign_bit_copies for a BImode value. However, the return value is
always
1 in that case, so it doesn't tell us anything and is going to be
misinterpreted by the caller.

Bootstrapped and tested on x86_64-linux, together with the other patches.
Ok?



This should be handled in num_sign_bit_copies itself, i.e. handle BImode
there.



What do you expect that function to do different? It returns the correct
value.



No different. Just that if you want to check whether DECL is a global
variable then we have a predicate for it. So why use TREE_STATIC
instead?

In other words: Just trying to make/keep certain checks consistent. (A
hopeless cause, but a noble one... ;-)


You're talking about a different patch here. This one is about 
num_sign_bit_copies.


I can certainly use is_global_var if the patch is ok with that change.


Bernd




[PATCH 2/2] Add patch for debugging compiler ICEs.

2014-09-11 Thread Maxim Ostapenko

Hi, Joseph,

Thanks for your review! I've added comments for new functions and 
replaced POSIX subprocess interfaces with libiberty's ones.


In general, when cc1 or cc1plus ICEs, we try to reproduce the bug by 
running the compiler 3 times and comparing stderr and stdout on each 
attempt with the ones obtained from the previous compiler run (we use 
temporary dump files to do this). If these files are identical, we add 
the GCC configuration (e.g. target, configure options and version), the 
compiler command line and the preprocessed source code to the last dump 
file, which contains the backtrace. Following Jakub's approach, we exit 
with ICE_EXIT_CODE instead of FATAL_EXIT_CODE in case of a DK_FATAL error 
to distinguish ICEs from other fatal errors, so the try_generate_repro 
routine will be able to run even if fatal_error occurred in the compiler.
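
The "are two dump files identical" step of this scheme can be sketched as a plain byte-by-byte comparison. The ChangeLog below lists a files_equal_p function, but its body is not shown in this thread, so the helper here is a hypothetical illustration with an assumed name and signature, not the submitted code.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: return true only if both files can be opened
   and contain exactly the same bytes.  */
bool
dump_files_identical_sketch (const char *path_a, const char *path_b)
{
  FILE *fa = fopen (path_a, "rb");
  FILE *fb = fopen (path_b, "rb");
  bool same = (fa != NULL && fb != NULL);

  while (same)
    {
      int ca = fgetc (fa);
      int cb = fgetc (fb);

      if (ca != cb)
        same = false;   /* differing byte, or one file ended early */
      else if (ca == EOF)
        break;          /* both files ended at the same point */
    }

  if (fa != NULL)
    fclose (fa);
  if (fb != NULL)
    fclose (fb);
  return same;
}
```

A reproducibility check along the lines described above would call such a helper on the dumps of consecutive attempts and only report the bug if every pair matches.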


We've noticed that on rare occasions a particularly severe segfault can 
cause GCC to abort without ICE-ing. These (hopefully rare) errors will 
be missed by our patch, because the SIGSEGV handler is not able to catch 
the signal due to a corrupted stack. It could make sense to allocate a 
separate stack for the SIGSEGV handler to resolve this situation.


-Maxim
On 09/10/2014 08:37 PM, Joseph S. Myers wrote:

On Wed, 10 Sep 2014, Jakub Jelinek wrote:


On Tue, Sep 09, 2014 at 10:51:23PM +, Joseph S. Myers wrote:

On Thu, 28 Aug 2014, Maxim Ostapenko wrote:


diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 0cc7593..67b8c5b 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -492,7 +492,7 @@ diagnostic_action_after_output (diagnostic_context *context,
real_abort ();
diagnostic_finish (context);
fnotice (stderr, "compilation terminated.\n");
-  exit (FATAL_EXIT_CODE);
+  exit (ICE_EXIT_CODE);

Why?  This is the case for fatal_error.  FATAL_EXIT_CODE seems right for
this, and ICE_EXIT_CODE wrong.

So that the driver can understand the difference between an ICE and other
fatal errors (e.g. sorry etc.).
Users are typically using the driver and for them it matters what exit code
is returned from the driver, not from cc1/cc1plus etc.

Well, I think the next revision of the patch submission needs more
explanation in this area.  What exit codes do cc1 and the driver now
return for (normal error, fatal error, ICE), and what do they return after
the patch, and how does the change to the fatal_error case avoid incorrect
changes if either cc1 or the driver called fatal_error (as opposed to
either cc1 or the driver having an ICE)?  Maybe that explanation should be
in the form of a comment on this exit call, explaining why the
counterintuitive use of ICE_EXIT_CODE in the DK_FATAL case is correct.



2014-09-04  Jakub Jelinek  
	Max Ostapenko  

	* common.opt: New option.
	* doc/invoke.texi: Describe new option.
	* diagnostic.c (diagnostic_action_after_output): Exit with
	ICE_EXIT_CODE instead of FATAL_EXIT_CODE.
	* gcc.c (execute): Don't free first string early, but at the end
	of the function.  Call retry_ice if compiler exited with
	ICE_EXIT_CODE.
	(main): Factor out common code.
	(print_configuration): New function.
	(try_fork): Likewise.
	(redirect_stdout_stderr): Likewise.
	(files_equal_p): Likewise.
	(check_repro): Likewise.
	(run_attempt): Likewise.
	(do_report_bug): Likewise.
	(append_text): Likewise.
	(try_generate_repro): Likewise

diff --git a/gcc/common.opt b/gcc/common.opt
index 7d78803..ce71f09 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1120,6 +1120,11 @@ fdump-noaddr
 Common Report Var(flag_dump_noaddr)
 Suppress output of addresses in debugging dumps
 
+freport-bug
+Common Driver Var(flag_report_bug)
+Collect and dump debug information into temporary file if ICE in C/C++
+compiler occurred.
+
 fdump-passes
 Common Var(flag_dump_passes) Init(0)
 Dump optimization passes
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 73666d6..dbc928b 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -494,7 +494,10 @@ diagnostic_action_after_output (diagnostic_context *context,
 	real_abort ();
   diagnostic_finish (context);
   fnotice (stderr, "compilation terminated.\n");
-  exit (FATAL_EXIT_CODE);
+  /* Exit with ICE_EXIT_CODE rather than FATAL_EXIT_CODE so the driver
+ understands the difference between an ICE and other fatal errors
+ (DK_SORRY and DK_ERROR).  */
+  exit (ICE_EXIT_CODE);
 
 default:
   gcc_unreachable ();
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 863b382..565421c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -6336,6 +6336,11 @@ feasible to use diff on debugging dumps for compiler invocations with
 different compiler binaries and/or different
 text / bss / data / heap / stack / dso start locations.
 
+@item -freport-bug
+@opindex freport-bug
+Collect and dump debug information into temporary file if ICE in C/C++
+compiler occurred.
+
 @item -fdump-unnumbered
 @opindex fdump-unnumbered
 When doing debugging dumps, suppress instruction 

Re: Fix some more decl types in the Fortran frontend

2014-09-11 Thread Bernd Schmidt

On 09/11/2014 12:37 PM, FX wrote:

Changing the fntype[2] looks wrong to me, as it is also used for
powi(double, int) , where the argument order matches the current
version:


Ah, sorry. I only looked at mathbuiltins.def and didn't spot the other use.


It looks like fntype[5] is actually what you need, and it’s already
constructed! However, there is even more mystery here, because it is
currently used for __builtin_scalbn, which doesn’t seem right:
http://pubs.opengroup.org/onlinepubs/009695399/functions/scalbln.html

 So I suspect looking a bit more in depth is required! Also,
testcases that exercise this fndecl matching (which you would see
fail on ptx) would be a great addition to the testsuite, once you
commit (for powi & scalbn, which do not look covered right now,
otherwise you would have seen regressions).


So it looks like the following patch would be the right thing? I'm 
afraid I failed to construct a compilable Fortran testcase for scalbn.



Bernd

commit 5f170b2710aaa5e098d74c71fcd206ef209f0b60
Author: Bernd Schmidt 
Date:   Wed Sep 10 18:02:53 2014 +0200

Fix type mismatches in intrinsic functions.

	* f95-lang.c (gfc_init_builtin_functions): Use type index 2 for
	scalbn, scalbnl and scalbnf.
	* mathbuiltins.def (JN, YN): Use type index 5.

diff --git a/gcc/fortran/f95-lang.c b/gcc/fortran/f95-lang.c
index da3a0d0..e485201 100644
--- a/gcc/fortran/f95-lang.c
+++ b/gcc/fortran/f95-lang.c
@@ -784,11 +784,11 @@ gfc_init_builtin_functions (void)
   gfc_define_builtin ("__builtin_fabsf", mfunc_float[0], 
 		  BUILT_IN_FABSF, "fabsf", ATTR_CONST_NOTHROW_LEAF_LIST);
  
-  gfc_define_builtin ("__builtin_scalbnl", mfunc_longdouble[5], 
+  gfc_define_builtin ("__builtin_scalbnl", mfunc_longdouble[2], 
 		  BUILT_IN_SCALBNL, "scalbnl", ATTR_CONST_NOTHROW_LEAF_LIST);
-  gfc_define_builtin ("__builtin_scalbn", mfunc_double[5], 
+  gfc_define_builtin ("__builtin_scalbn", mfunc_double[2], 
 		  BUILT_IN_SCALBN, "scalbn", ATTR_CONST_NOTHROW_LEAF_LIST);
-  gfc_define_builtin ("__builtin_scalbnf", mfunc_float[5], 
+  gfc_define_builtin ("__builtin_scalbnf", mfunc_float[2], 
 		  BUILT_IN_SCALBNF, "scalbnf", ATTR_CONST_NOTHROW_LEAF_LIST);
  
   gfc_define_builtin ("__builtin_fmodl", mfunc_longdouble[1], 
diff --git a/gcc/fortran/mathbuiltins.def b/gcc/fortran/mathbuiltins.def
index d5bf60d..d06a90b 100644
--- a/gcc/fortran/mathbuiltins.def
+++ b/gcc/fortran/mathbuiltins.def
@@ -42,10 +42,10 @@ DEFINE_MATH_BUILTIN_C (TAN,   "tan",0)
 DEFINE_MATH_BUILTIN_C (TANH,  "tanh",   0)
 DEFINE_MATH_BUILTIN   (J0,"j0", 0)
 DEFINE_MATH_BUILTIN   (J1,"j1", 0)
-DEFINE_MATH_BUILTIN   (JN,"jn", 2)
+DEFINE_MATH_BUILTIN   (JN,"jn", 5)
 DEFINE_MATH_BUILTIN   (Y0,"y0", 0)
 DEFINE_MATH_BUILTIN   (Y1,"y1", 0)
-DEFINE_MATH_BUILTIN   (YN,"yn", 2)
+DEFINE_MATH_BUILTIN   (YN,"yn", 5)
 DEFINE_MATH_BUILTIN   (ERF,   "erf",0)
 DEFINE_MATH_BUILTIN   (ERFC,  "erfc",   0)
 DEFINE_MATH_BUILTIN   (TGAMMA,"tgamma", 0)


Re: Fix some more decl types in the Fortran frontend

2014-09-11 Thread FX
> So it looks like the following patch would be the right thing?

I would think so.

FX


Re: [PATCH 1/2] Extend libiberty to allow append stdout and stderr to existing files.

2014-09-11 Thread Ian Lance Taylor
On Thu, Sep 11, 2014 at 9:18 AM, Maxim Ostapenko
 wrote:
>
> Working on ICE debugging patch, I've noted that libiberty interface doesn't
> allow to append stdout and stderr to existing files.
>
> This small patch provides two new flags for pex_run and extends open_write
> interface to handle the issue.
>
> Does this patch look sane?

I'm not sure why you want to do this, but the patch looks sane.

Ian
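
For readers following the thread: the Unix part of the posted patch is cut off in this archive, so the following is not the submitted code. It is only a sketch of one plausible way an open_write-style helper could honour an append argument on a POSIX system, with a hypothetical function name.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical sketch: open NAME for writing.  If APPEND is non-zero,
   keep the existing contents and write at the end; otherwise truncate
   the file first.  Returns >= 0 on success, -1 on error.  */
int
open_write_sketch (const char *name, int append)
{
  int flags = O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC);

  return open (name, flags, S_IRUSR | S_IWUSR);
}
```

The two cases differ only in O_APPEND versus O_TRUNC, which is why a single extra int parameter on open_write is enough to carry the PEX_STDOUT_APPEND and PEX_STDERR_APPEND flags down to the platform layer.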


Re: ptx preliminary rtl patches [3/4]

2014-09-11 Thread Steven Bosscher
On Thu, Sep 11, 2014 at 6:19 PM, Bernd Schmidt wrote:
>>> What do you expect that function to do different? It returns the correct
>>> value.
>>>
>>
>> No different. Just that if you want to check whether DECL is a global
>> variable then we have a predicate for it. So why use TREE_STATIC
>> instead?
>>
>> In other words: Just trying to make/keep certain checks consistent. (A
>> hopeless cause, but a noble one... ;-)
>
>
> You're talking about a different patch here. This one is about
> num_sign_bit_copies.


Ah. *sigh* can't even keep two patches in my mind at any one time.

The point about num_sign_bit_copies is that it doesn't really return
the correct value IMHO, if there isn't really a correct value to speak
of: What is the sign of TRUE or FALSE, the only two values a BImode
value can take?

A 1-bit precision integer can have value 0 or -1 and in that case
num_sign_bit_copies should be 0. But for a BImode value, it seems to
me that asking for the sign bit or sign bit copies is just wrong.

Ciao!
Steven


Re: ptx preliminary rtl patches [3/4]

2014-09-11 Thread Bernd Schmidt

On 09/11/2014 06:34 PM, Steven Bosscher wrote:
> On Thu, Sep 11, 2014 at 6:19 PM, Bernd Schmidt wrote:
>>>> What do you expect that function to do different? It returns the correct
>>>> value.
>>>
>>> No different. Just that if you want to check whether DECL is a global
>>> variable then we have a predicate for it. So why use TREE_STATIC
>>> instead?
>>>
>>> In other words: Just trying to make/keep certain checks consistent. (A
>>> hopeless cause, but a noble one... ;-)
>>
>> You're talking about a different patch here. This one is about
>> num_sign_bit_copies.
>
> Ah. *sigh* can't even keep two patches in my mind at any one time.
>
> The point about num_sign_bit_copies is that it doesn't really return
> the correct value IMHO, if there isn't really a correct value to speak
> of: What is the sign of TRUE or FALSE, the only two values a BImode
> value can take?
>
> A 1-bit precision integer can have value 0 or -1 and in that case
> num_sign_bit_copies should be 0. But for a BImode value, it seems to
> me that asking for the sign bit or sign bit copies is just wrong.

I strongly disagree. It's the same as for any other integer - there's 
one sign bit, and since there aren't any other bits, the number of sign 
bit copies is always exactly 1.
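
To make the convention concrete, here is a small stand-alone sketch that
counts sign-bit copies with the sign bit itself included in the count.  It
is not GCC's num_sign_bit_copies; count_sign_bit_copies is a hypothetical
helper working on plain integers rather than RTL, written only to show that
under this convention a 1-bit value always yields 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: view the low PRECISION bits of VALUE as a two's
   complement number and count how many of its top bits are equal to the
   sign bit, counting the sign bit itself.  */
unsigned
count_sign_bit_copies (uint64_t value, unsigned precision)
{
  assert (precision >= 1 && precision <= 64);
  uint64_t sign = (value >> (precision - 1)) & 1;
  unsigned copies = 1;  /* The sign bit is a copy of itself.  */

  /* Walk downwards from the bit just below the sign bit.  */
  for (unsigned bit = precision - 1; bit-- > 0; )
    {
      if (((value >> bit) & 1) != sign)
        break;
      copies++;
    }
  return copies;
}
```

With precision 1 the loop body never runs, so both possible values give 1;
an 8-bit 0x01 gives 7 and an 8-bit 0x80 gives 1.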



Bernd




Re: [PATCH] gcc parallel make check

2014-09-11 Thread Tom Tromey
> "Jakub" == Jakub Jelinek  writes:

Jakub> I fear that is going to be too expensive, because e.g. all the
Jakub> caching that dejagnu and our tcl stuff does would be gone, all
Jakub> the tests for lp64 etc.  would need to be repeated for each test.

In gdb I arranged to have this stuff saved in a special cache directory.
See gdb/testsuite/lib/cache.exp for the mechanism.

Tom


Re: DBL_DENORM_MIN should never be 0

2014-09-11 Thread Joseph S. Myers
On Thu, 11 Sep 2014, Marc Glisse wrote:

> I don't know what kind of test you have in mind, so I added a runtime test. I
> am just guessing that it probably fails on alpha because of PR 58757, I can't
> test. Computing d+d may be even more likely to trigger potential issues, if
> that's the goal.

Yes, a runtime test.  I don't think there should be an xfail without it 
actually having been tested to fail (and then such an xfail should come 
with a comment referencing the bug filed in Bugzilla).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [C++ Patch] PR 61489

2014-09-11 Thread Paolo Carlini

Hi,

On 09/11/2014 05:06 PM, Jason Merrill wrote:
> Do we need a documentation update?

I agree. Would something like the below do?

Thanks,
Paolo.


2014-09-11  Paolo Carlini  

PR c++/61489
* doc/invoke.texi ([-Wmissing-field-initializers]): Update.

/cp
2014-09-11  Paolo Carlini  

PR c++/61489
* typeck2.c (process_init_constructor_record): Do not warn about
missing field initializer if EMPTY_CONSTRUCTOR_P (init).

/testsuite
2014-09-11  Paolo Carlini  

PR c++/61489
* g++.dg/warn/Wmissing-field-initializers-1.C: New.
* g++.old-deja/g++.other/warn5.C: Adjust.
Index: cp/typeck2.c
===
--- cp/typeck2.c(revision 215117)
+++ cp/typeck2.c(working copy)
@@ -1359,7 +1359,8 @@ process_init_constructor_record (tree type, tree i
  next = massage_init_elt (TREE_TYPE (field), next, complain);
 
  /* Warn when some struct elements are implicitly initialized.  */
- if (complain & tf_warning)
+ if ((complain & tf_warning)
+ && !EMPTY_CONSTRUCTOR_P (init))
warning (OPT_Wmissing_field_initializers,
 "missing initializer for member %qD", field);
}
@@ -1382,7 +1383,8 @@ process_init_constructor_record (tree type, tree i
 
  /* Warn when some struct elements are implicitly initialized
 to zero.  */
- if (complain & tf_warning)
+ if ((complain & tf_warning)
+ && !EMPTY_CONSTRUCTOR_P (init))
warning (OPT_Wmissing_field_initializers,
 "missing initializer for member %qD", field);
 
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 215117)
+++ doc/invoke.texi (working copy)
@@ -4912,6 +4912,14 @@ struct s @{ int f, g, h; @};
 struct s x = @{ .f = 3, .g = 4 @};
 @end smallexample
 
+In C++ this option also does not warn about the empty @{ @}
+initializer, for example:
+
+@smallexample
+struct s @{ int f, g, h; @};
+s x = @{ @};
+@end smallexample
+
 This warning is included in @option{-Wextra}.  To get other @option{-Wextra}
 warnings without this one, use @option{-Wextra 
-Wno-missing-field-initializers}.
 
Index: testsuite/g++.dg/warn/Wmissing-field-initializers-1.C
===
--- testsuite/g++.dg/warn/Wmissing-field-initializers-1.C   (revision 0)
+++ testsuite/g++.dg/warn/Wmissing-field-initializers-1.C   (working copy)
@@ -0,0 +1,31 @@
+// PR c++/61489
+// { dg-options "-Wmissing-field-initializers" }
+
+struct mystruct1 {
+  int a, b;
+};
+
+struct aux2 {
+  aux2();
+};
+
+struct mystruct2 {
+  aux2 a, b;
+};
+
+struct aux3 {
+  int x;
+};
+
+struct mystruct3 {
+  aux3 a, b;
+};
+
+mystruct1 obj11 = {};
+mystruct1 obj12 = {0};   // { dg-warning "missing initializer" }
+
+mystruct2 obj21 = {};
+mystruct2 obj22 = {aux2()};  // { dg-warning "missing initializer" }
+
+mystruct3 obj31 = {};
+mystruct3 obj32 = {0};   // { dg-warning "missing initializer" }
Index: testsuite/g++.old-deja/g++.other/warn5.C
===
--- testsuite/g++.old-deja/g++.other/warn5.C(revision 215117)
+++ testsuite/g++.old-deja/g++.other/warn5.C(working copy)
@@ -16,4 +16,4 @@ X *foo ()
   return new X ();  // gets bogus warning
 }
 
-X x = {};   // { dg-warning "" } missing initializer
+X x = {};
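
The condition added in both hunks of cp/typeck2.c can be modelled as a tiny
predicate.  This is only a sketch under stated assumptions:
should_warn_missing_field, SK_TF_WARNING and the element count are
hypothetical stand-ins for the front end's complain flags and for
EMPTY_CONSTRUCTOR_P (init); none of this is code from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the tf_warning bit of the complain flags.  */
enum { SK_TF_WARNING = 1 << 1 };

/* Warn about an implicitly initialized field only when warnings are
   requested and the initializer list is not the empty "{}".
   N_EXPLICIT_INITS == 0 models EMPTY_CONSTRUCTOR_P (init).  */
bool
should_warn_missing_field (int complain, size_t n_explicit_inits)
{
  return (complain & SK_TF_WARNING) != 0 && n_explicit_inits != 0;
}
```

This matches the new test: "mystruct1 obj11 = {};" has no explicit
initializers and stays quiet, while "mystruct1 obj12 = {0};" still warns.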


Re: Stream ODR types

2014-09-11 Thread Jason Merrill

On 09/11/2014 03:06 AM, Jan Hubicka wrote:
> http://kam.mff.cuni.cz/~hubicka/odr-warnings-firefox.txt

> /aux/hubicka/firefox4/content/media/fmp4/ffmpeg/libav53/include/libavcodec/avcodec.h:997:0:
>  note: the first difference of corresponding definitions is field ‘data’
>  uint8_t *data[AV_NUM_DATA_POINTERS];
>  ^
> /aux/hubicka/firefox4/content/media/fmp4/ffmpeg/libav54/include/libavcodec/avcodec.h:997:0:
>  note: a field of same name but different type is defined in another translation unit
>  uint8_t *data[AV_NUM_DATA_POINTERS];
>  ^
> /usr/include/stdint.h:49:24: note: type ‘uint8_t’ should match type ‘uint8_t’
>  typedef unsigned char  uint8_t;
> ^
> /usr/include/stdint.h:49:24: note: the incompatible type is defined here

Hmm, how can uint8_t be incompatible with itself?

> /aux/hubicka/firefox4/dom/base/nsJSEnvironment.cpp:311:0: note: a field with different name is defined in another translation unit

This should print the different name in case the difference is due to a
macro (as it is here).

> +Eventually we should start saving mangled names in TYPE_NAME.
> +Then this condition will become non-trivial.  */


I think this sentence is out of date now.  :)

Looks good otherwise.

Jason



Re: [C++ Patch] PR 61489

2014-09-11 Thread Jason Merrill

OK, thanks.

Jason


Re: Stream ODR types

2014-09-11 Thread Jan Hubicka
> On Thu, 11 Sep 2014, Jan Hubicka wrote:
> 
> > Hi,
> > this patch adds computation and streaming of mangled type names.  As
> > suggested by Jason, it simply calls DECL_ASSEMBLER_NAME on all named types
> > and lets C++ supply them.  This makes it possible to establish precise ODR
> > type equivalency at LTO (till now we could do that only for complete class
> > types with virtual methods attached to them).  LTO type merging is then
> > updated to register all types into the ODR type hash.  This makes warnings
> > be output for ODR violations.  Here are the ones output for Firefox:
> > http://kam.mff.cuni.cz/~hubicka/odr-warnings-firefox.txt
> >
> > As discussed earlier, in addition to the ODR warnings, which seem useful, I
> > would like to use it for TBAA analysis for ODR types that are not
> > structurally equivalent to non-ODR types, so C++ programs will get better
> > alias analysis, and for other tricks, such as more aggressively merging
> > ODR types.
> >
> > I believe this makes sense (is orthogonal) with early debug info (for
> > warnings, TBAA and devirtualization).  It can also be used to more
> > aggressively merge debug information, as done by LLVM.
> >
> > The change increases LTO object files by about 2% (uncompressed by 6%) and
> > also increases WPA memory use and streaming times by about the same
> > percentage.  It is not small and thus I made it optional (enabled by
> > default for now).  We could see how the benefits relate to this cost once
> > the other three parts are implemented.
> >
> > Bootstrapped/regtested x86_64-linux, seems sane?
> 
> It looks sane, but when early debug is completed we likely will drop
> all the elaborated types from decls.  Thus to keep the ODR type you'd
> have to keep (and compute early as well) their DECL_ASSEMBLER_NAME?

I currently compute it in free_lang_data.  Obviously we can compute it earlier
(in the frontend) if needed.
> 
> Can't we just store a hash of the assembler name?  From alias analysis
> perspective false aliasing due to a hash collision is harmless, no?
> Maybe not for ODR warnings though.  At least a hash would be way
> cheaper than those usually very large strings

Hmm, interesting idea.  False positives are harmless for alias analysis.  They
do matter for type inheritance graph construction, but if we decide we will
only ever care about polymorphic types, we can always use the virtual table
name to resolve conflicts.

We will get false ODR violation warnings, but the chances would be very low.
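
A rough sketch of the hashing idea, to show where the false positives come
from: two types are treated as the same ODR type whenever the hashes of
their mangled names agree.  The 64-bit FNV-1a hash is an arbitrary choice
made for this illustration, and hash_mangled_name and odr_names_may_match
are hypothetical helpers; nothing here is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 64-bit FNV-1a hash of a NUL-terminated name.  */
uint64_t
hash_mangled_name (const char *name)
{
  assert (name != NULL);
  uint64_t h = 14695981039346656037ULL;  /* FNV offset basis.  */
  for (; *name != '\0'; name++)
    {
      h ^= (unsigned char) *name;
      h *= 1099511628211ULL;              /* FNV prime.  */
    }
  return h;
}

/* Nonzero when the two names hash alike.  Equal names always match;
   distinct names match only on a hash collision, which is the (rare)
   source of a false ODR warning or of spurious aliasing.  */
int
odr_names_may_match (const char *a, const char *b)
{
  return hash_mangled_name (a) == hash_mangled_name (b);
}
```

Only the 8-byte hash would need to be streamed per type instead of the full
string, which is where the size saving would come from.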
> 
> You probably want to restrict ODR types to aggregates?

For ODR warnings and TBAA I think I want other types, too.  But yes, we need to
gracefully handle component types that do not have names, and we could drop the
names of types and handle them as component types as seems fit.

OK, so if you agree, I will go ahead with this patch and we can resolve these
details incrementally.

Honza
> 
> Richard.


Re: Stream ODR types

2014-09-11 Thread Jan Hubicka
> On 09/11/2014 03:06 AM, Jan Hubicka wrote:
> >http://kam.mff.cuni.cz/~hubicka/odr-warnings-firefox.txt
> 
> >/aux/hubicka/firefox4/content/media/fmp4/ffmpeg/libav53/include/libavcodec/avcodec.h:997:0:
> > note: the first difference of corresponding definitions is field ‘data’
> > uint8_t *data[AV_NUM_DATA_POINTERS];
> > ^
> >/aux/hubicka/firefox4/content/media/fmp4/ffmpeg/libav54/include/libavcodec/avcodec.h:997:0:
> > note: a field of same name but different type is defined in another 
> >translation unit
> > uint8_t *data[AV_NUM_DATA_POINTERS];
> > ^
> >/usr/include/stdint.h:49:24: note: type ‘uint8_t’ should match type ‘uint8_t’
> > typedef unsigned char  uint8_t;
> >^
> >/usr/include/stdint.h:49:24: note: the incompatible type is defined here
> 
> Hmm, how can uint8_t be incompatible with itself?

I can imagine that the array has a different size and the warning is confused,
or that there is a bug due to builtin types streaming (they do give me a hard
time, because the LTO frontend supplies builtin types without names, while C++
builtin types and their variants do have names).  I will debug that.

Honza


RE: [PATCH] gcc parallel make check

2014-09-11 Thread VandeVondele Joost
> Here is a patch I'm testing now:

Hi Jakub,

I also tested your patch to compare timings against a newer patch (v8) that
I'll send soon.

== patch v8 == make -j32 -k ==
check-fortran   4m58.178s
check-c++       ~10m
check-c         ~10m
check           15m29.873s

== patch Jakub
check-c++       ~20m
check-fortran   3m31.237s
check-c         8m8

On the positive side, your patch provides a further speedup for e.g. Fortran
and C testing (where it splits things nicely).  The libstdc++ bottleneck is
not solved, but I guess that is expected.

As you have presumably found as well, your patch introduces a number of
failures, because some tests seem to have additional dependencies, either
explicit or implicit:

e.g. in gfortran.dg/binding_label_tests_10_main.f03
! { dg-do compile }
! This file must be compiled AFTER binding_label_tests_10.f03, which it 
! should be because dejagnu will sort the files.
module binding_label_tests_10_main

in gfortran.dg/class_45b.f03 
! { dg-do link }
! { dg-additional-sources class_45a.f03 }

This could clearly trigger in the current splitting scheme as well; we have
only been lucky that the dependencies seem to be 'well behaved' in having the
same initial letter in the filename.

Joost

Re: [GOOGLE] Fix gcda build info support

2014-09-11 Thread Teresa Johnson
On Wed, Sep 10, 2014 at 3:31 PM, Xinliang David Li  wrote:
> Can you share the buildinfo reader code with the merger by defining
> some hooks for different callbacks?

Do you mean the two blobs of code guarded by 'if (tag ==
GCOV_TAG_BUILD_INFO)' that I added here and the existing one in
gcov_exit_merge_gcda further down in the same file? Sure, I could
outline that and pass in the gi_ptr for the merger case. Let me know
if you meant something else.
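
As a rough picture of what such an outlined helper could look like, here is
a stand-alone sketch that skips an optional build-info record in a
tag/length/payload word stream, so that both the scanner and the merger
could call the same code.  Everything in it is an assumption made for
illustration: skip_build_info, the in-memory array and the SK_TAG_* values
are hypothetical and do not reflect the real gcda tags or the libgcov API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag values, not the real GCOV_TAG_* constants.  */
enum { SK_TAG_BUILD_INFO = 0xB1, SK_TAG_FUNCTION = 0xF1 };

/* Records are laid out as: tag word, length word (number of payload
   words), payload.  If a build-info record starts at POS, return the index
   of the first word after it; if some other record starts at POS, return
   POS unchanged; return -1 if the stream is truncated.  */
long
skip_build_info (const uint32_t *words, size_t n_words, size_t pos)
{
  assert (words != NULL);
  if (pos >= n_words)
    return -1;
  if (words[pos] != SK_TAG_BUILD_INFO)
    return (long) pos;            /* Nothing to skip.  */
  if (pos + 1 >= n_words)
    return -1;                    /* Length word is missing.  */
  size_t length = words[pos + 1];
  if (length > n_words - (pos + 2))
    return -1;                    /* Payload runs past the end.  */
  return (long) (pos + 2 + length);
}
```

A merger that needs the strings rather than just the position could take an
extra callback argument that is invoked on the payload before it is skipped.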

Teresa

>
> David
>
> On Wed, Sep 10, 2014 at 10:24 AM, Teresa Johnson  wrote:
>> While porting recent support for a build info section in the gcda from
>> google/4_8 to 4_9 and doing manual testing, I discovered that it does
>> not interact well with the COMDAT fixup handling. This patch fixes the
>> issue, and adds a test case that exposes the problem without the fix.
>>
>> Here is the google/4_8 patch - I plan to commit there first then port
>> it along with the original build info patch to 4_9.
>>
>> Passes regression tests - ok for google branches?
>>
>> Thanks,
>> Teresa
>>
>> 2014-09-10  Teresa Johnson  
>>
>> libgcc:
>> * libgcov-driver.c (gcov_scan_to_function_data): Rename from
>> gcov_scan_summary_end, scan past BUILD_INFO section.
>> (gcov_dump_module_info): Rename gcov_scan_summary_end to
>> gcov_scan_to_function_data.
>>
>> gcc/testsuite:
>> * g++.dg/tree-prof/lipo/buildinfo.txt: Input for
>> -fprofile-generate-buildinfo option.
>> * g++.dg/tree-prof/lipo/comdat_fixup_0.C: New test.
>> * g++.dg/tree-prof/lipo/comdat_fixup_1.C: Ditto.
>> * g++.dg/tree-prof/lipo/comdat_fixup_2.C: Ditto.
>> * g++.dg/tree-prof/lipo/comdat_fixup.h: Ditto.
>> * lib/profopt.exp: Declare srcdir for use in test options.
>>
>> Index: libgcc/libgcov-driver.c
>> ===
>> --- libgcc/libgcov-driver.c (revision 214976)
>> +++ libgcc/libgcov-driver.c (working copy)
>> @@ -428,13 +428,15 @@ struct gcov_filename_aux{
>>  #include "libgcov-driver-system.c"
>>
>>  /* Scan through the current open gcda file corresponding to GI_PTR
>> -   to locate the end position of the last summary, returned in
>> -   SUMMARY_END_POS_P.  Return 0 on success, -1 on error.  */
>> +   to locate the end position just before function data should be rewritten,
>> +   returned in SUMMARY_END_POS_P. E.g. scan past the last summary and other
>> +   sections that won't be rewritten, like the build info.  Return 0 on success,
>> +   -1 on error.  */
>>  static int
>> -gcov_scan_summary_end (struct gcov_info *gi_ptr,
>> -   gcov_position_t *summary_end_pos_p)
>> +gcov_scan_to_function_data (struct gcov_info *gi_ptr,
>> +gcov_position_t *summary_end_pos_p)
>>  {
>> -  gcov_unsigned_t tag, version, stamp;
>> +  gcov_unsigned_t tag, version, stamp, i, length;
>>tag = gcov_read_unsigned ();
>>if (tag != GCOV_DATA_MAGIC)
>>  {
>> @@ -467,6 +469,28 @@ static int
>>  return -1;
>>  }
>>
>> +  /* If there is a build info section, scan past it as well.  */
>> +  if (tag == GCOV_TAG_BUILD_INFO)
>> +{
>> +  length = gcov_read_unsigned ();
>> +  gcov_unsigned_t num_strings = 0;
>> +  char **build_info_strings = gcov_read_build_info (length, &num_strings);
>> +  if (!build_info_strings)
>> +{
>> +  gcov_error ("profiling:%s:Error reading build info\n", gi_filename);
>> +  return -1;
>> +}
>> +
>> +  for (i = 0; i < num_strings; i++)
>> +free (build_info_strings[i]);
>> +  free (build_info_strings);
>> +
>> +  *summary_end_pos_p = gcov_position ();
>> +  tag = gcov_read_unsigned ();
>> +}
>> +  /* The next section should be the function counters.  */
>> +  gcc_assert (tag == GCOV_TAG_FUNCTION);
>> +
>>return 0;
>>  }
>>
>> @@ -1031,10 +1055,10 @@ gcov_dump_module_info (struct gcov_filename_aux *g
>>
>>if (changed)
>>  {
>> -  /* Scan file to find the end of the summary section, which is
>> +  /* Scan file to find the start of the function section, which is
>>   where we will start re-writing the counters.  */
>>gcov_position_t summary_end_pos;
>> -  if (gcov_scan_summary_end (gi_ptr, &summary_end_pos) == -1)
>> +  if (gcov_scan_to_function_data (gi_ptr, &summary_end_pos) == -1)
>>  gcov_error ("profiling:%s:Error scanning summaries\n",
>>  gi_filename);
>>else
>> Index: gcc/testsuite/g++.dg/tree-prof/lipo/buildinfo.txt
>> ===
>> --- gcc/testsuite/g++.dg/tree-prof/lipo/buildinfo.txt   (revision 0)
>> +++ gcc/testsuite/g++.dg/tree-prof/lipo/buildinfo.txt   (revision 0)
>> @@ -0,0 +1 @@
>> +Test -fprofile-generate-buildinfo option
>> Index: gcc/testsuite/g++.dg/tree-prof/lipo/comdat_fixup_0.C
>> ===

Re: [GOOGLE] Fix gcda build info support

2014-09-11 Thread Xinliang David Li
Yes, that is what I meant.

David

On Thu, Sep 11, 2014 at 10:09 AM, Teresa Johnson  wrote:
> On Wed, Sep 10, 2014 at 3:31 PM, Xinliang David Li  wrote:
>> Can you share the buildinfo reader code with the merger by defining
>> some hooks for different callbacks?
>
> Do you mean the two blobs of code guarded by 'if (tag ==
> GCOV_TAG_BUILD_INFO)' that I added here and the existing one in
> gcov_exit_merge_gcda further down in the same file? Sure, I could
> outline that and pass in the gi_ptr for the merger case. Let me know
> if you meant something else.
>
> Teresa

Re: [PATCH] gcc parallel make check

2014-09-11 Thread Jakub Jelinek
On Thu, Sep 11, 2014 at 05:04:56PM +, VandeVondele  Joost wrote:
> > Here is a patch I'm testing now:
> 
> I also tested your patch to compare timings vs a newer patch (v8) I'll send 
> soon
> 
> == patch v8 == make -j32 -k ==
> check-fortran   4m58.178s
> check-c++ ~10m
> check-c   ~10m
> check  15m29.873s
> 
> == patch Jakub
> check-c++ ~20m
> check-fortran   3m31.237s 
> check-c 8m8
> 
> on the positive side, your patch provides a further speedup e.g. fortran
> and c testing (where it splits things nicely).  The libstdc++ bottleneck
> is not solved, but I guess that is expected.

The same technique can be of course used for libstdc++, I just didn't want
to do that until the -C gcc testing is changed.

> As you have presumably found as well, your patch introduces a number of
> failures, because some tests seem to have additional dependencies, either
> explicit or implicit:

I found more issues; in particular, it seemed that struct-layout-1.exp,
gnu-encoding.exp, plugin.exp and some go*.exp don't call runtest_file_p
the same number of times and with the same arguments in all invocations.
And then there are these Fortran inter-test dependencies, which Tobias told
me are PR56408.

Unfortunately my remote testing box is unreachable now and I'm still waiting
for DDR4 modules to finish building a better workstation, so I can't test this
right now.  The patch below is intended to serialize the content of the
problematic *.exp tests (the first runtest to reach one of those will simply
run all the tests from that *.exp file, the others will skip it).
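
One way to get "the first runtest runs it, the others skip it" is an atomic
claim on a marker file.  The sketch below shows that idea with POSIX
O_CREAT | O_EXCL.  It is only an assumption about a possible mechanism: the
testsuite harness is Tcl, claim_exp_file is a hypothetical name, and the
patch may well do this differently.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: try to claim MARKER_PATH for this process.
   Returns 1 if we created the marker (we run the whole *.exp file),
   0 if it already existed (someone else runs it, we skip),
   -1 on any other error.  */
int
claim_exp_file (const char *marker_path)
{
  assert (marker_path != NULL);
  /* O_EXCL makes creation fail with EEXIST if the file is already there,
     so exactly one of several racing processes succeeds.  */
  int fd = open (marker_path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
  if (fd >= 0)
    {
      close (fd);
      return 1;
    }
  return errno == EEXIST ? 0 : -1;
}
```

The marker files would have to live in a directory shared by all parallel
runtest invocations and be removed before the next test run.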

For go I currently have no idea why that happens; a quick hack would be to
just disable parallelization of go temporarily and let Ian investigate.

For PR56408 we need some fix.

Jakub


Re: [PATCH] Implement -fsanitize=object-size

2014-09-11 Thread Marek Polacek
Sorry I let this slide.

On Mon, Jul 14, 2014 at 01:54:13PM +0200, Jakub Jelinek wrote:
> On Sun, Jul 13, 2014 at 07:55:44PM +0200, Marek Polacek wrote:
> > 2014-07-13  Marek Polacek  
> > 
> > * ubsan.h (struct ubsan_mismatch_data):
> 
> Missing description.

Fixed.
 
> > +  gcc_assert (TREE_CODE (size) == INTEGER_CST);
> > +  /* See if we can discard the check.  */
> > +  if (tree_to_uhwi (size) == (unsigned HOST_WIDE_INT) -1)
> 
> This would be integer_all_onesp (size).

Fixed.

> > +static void
> > +instrument_object_size (gimple_stmt_iterator *gsi, bool is_lhs)
> > +{
> > +  gimple stmt = gsi_stmt (*gsi);
> > +  location_t loc = gimple_location (stmt);
> > +  tree t = is_lhs ? gimple_get_lhs (stmt) : gimple_assign_rhs1 (stmt);
> > +
> > +  if (TREE_CODE (t) != MEM_REF)
> > +return;
> 
> I think this is undesirable.  IMHO you want to call here
> get_inner_reference, and if the given size is equal to maxsize, consider
> instrumenting it, otherwise you don't instrument e.g. COMPONENT_REFs and
> many other things.  Look at what e.g. asan.c or even ubsan.c does; the
> question is what exactly to do with bitfields, but supposedly we should
> require that the DECL_BIT_FIELD_REPRESENTATIVE is accessible in that case.

Adjusted.  For bit-fields I just bail out; it's this check:
  || bitsize != size_in_bytes * BITS_PER_UNIT)

> Also, I wonder if using base, ptr, objsz, ckind arguments are best for the
> builtin, I'd think you want instead base, ptr+size-base, objsz, ckind.
> Reasons:
> a) the size addition when expanding UBSAN_OBJECT_SIZE will not work
>reliably, the middle end considers all pointer conversions useless,
>so you can very well end up with a different TREE_TYPE of the pointer
>type
> b) sanopt runs very late, there aren't many GIMPLE optimization passes,
>so to optimize the condition checks you pretty much rely on RTL passes
> c) for e.g. gimple_fold_call it will be much easier if it can remove
>redundant UBSAN_OBJECT_SIZE calls if it can just compare two constants
 
Ok, that's better.  I rewrote this part.

> > +  tree ptr = TREE_OPERAND (t, 0);
> > +  tree sizet, base = ptr;
> > +  gimple g;
> > +  gimple def_stmt;
> > +
> > +  while (TREE_CODE (base) == SSA_NAME)
> > +{
> > +  def_stmt = SSA_NAME_DEF_STMT (base);
> > +  if (is_gimple_assign (def_stmt))
> > +   base = gimple_assign_rhs1 (def_stmt);
> 
> This looks too dangerous.  All you should look through are:
> a) gimple_assign_ssa_name_copy_p
> b) gimple_assign_cast_p if the rhs1 also has POINTER_TYPE_P
> c) gimple_assign_rhs_code == POINTER_PLUS_EXPR

Fixed.

> I'm also including a testcase, which shows why instrumenting
> also COMPONENT_REFs etc. is important (see my reference to
> get_inner_reference above) and also that IMHO we should instrument
> not just when the base is a pointer, but also when it is a decl,

Fixed, we now properly instrument the testcase you posted, because I've
added instrumentation even of VAR_DECLs.

> but in that case we should avoid instrumenting when -fsanitize=bounds
> is on and we know it will handle it (in particular, if there was e.g.
> char d[8]; int e; in the struct definition instead).
 
I haven't fixed this, because it seems that when doing the object-size
instrumentation I can't tell whether the array has been
bounds-instrumented or not.  So for some structs we can issue both
=bounds and =object-size diagnostics.

> Note, the testcase ICEs with -O2 -fsanitize=bounds, can you please look
> at that first and fix it separately?

Already fixed.

> Other comments, in a form of a patch:
> 1) the gimple_fold_call bit shows that we should for the quite common
>case where __bos is folded into -1 remove the UBSAN_OBJECT_SIZE call
>immediately, not worth keeping it around through many other passes

Sure.

> 2) if you add -O2 to the dg-options, that just means the tests are done
>8 times or how many with -O2 all the time.  Better skip it unless
>-O2

Ugh.

> 3) when the second argument is something that can be directly compared
>against the third argument, you can in gimple_fold_call fold not just
>the "don't know" cases, but also when the third argument is >= the
>second and both are INTEGER_CSTs - then we know at compile time
>we are ok.

Thanks, I applied the patch.  And I've added some more optimizations to
gimple_fold_call.
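
The folding rule from points 1) and 3) boils down to a comparison of two
constants.  A minimal sketch, assuming the second argument is the end
offset of the access relative to the base (ptr+size-base) and the third is
the object size; object_size_check_is_redundant and OBJSZ_UNKNOWN are
hypothetical names and this is not the code in gimple-fold.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the all-ones "don't know" object size.  */
#define OBJSZ_UNKNOWN UINT64_MAX

/* Return true if a check of ACCESS_END against OBJECT_SIZE can be dropped
   at compile time: either the object size is unknown, or both values are
   constants and the access provably ends inside the object.  */
bool
object_size_check_is_redundant (uint64_t access_end, uint64_t object_size)
{
  if (object_size == OBJSZ_UNKNOWN)
    return true;                  /* Nothing useful to check against.  */
  return object_size >= access_end;
}
```

Only when this returns false does the run-time check need to stay.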

So, how does this look now?

Bootstrapped/regtested on x86_64-linux, passes even bootstrap-ubsan.

2014-09-11  Marek Polacek  

* asan.c (pass_sanopt::execute): Handle IFN_UBSAN_OBJECT_SIZE.
* doc/invoke.texi: Document -fsanitize=object-size.
* flag-types.h (sanitize_code): Add SANITIZE_OBJECT_SIZE and
or it into SANITIZE_UNDEFINED.
* gimple-fold.c (gimple_fold_call): Optimize IFN_UBSAN_OBJECT_SIZE.
* internal-fn.c (expand_UBSAN_OBJECT_SIZE): New function.
* internal-fn.def (UBSAN_OBJECT_SIZE): Define.
* opts.c (common_handle_option): Handle -fsanitize=ob

Re: DBL_DENORM_MIN should never be 0

2014-09-11 Thread Marc Glisse

On Thu, 11 Sep 2014, Joseph S. Myers wrote:

> On Thu, 11 Sep 2014, Marc Glisse wrote:
>
>> I don't know what kind of test you have in mind, so I added a runtime test. I
>> am just guessing that it probably fails on alpha because of PR 58757, I can't
>> test. Computing d+d may be even more likely to trigger potential issues, if
>> that's the goal.
>
> Yes, a runtime test.  I don't think there should be an xfail without it
> actually having been tested to fail (and then such an xfail should come
> with a comment referencing the bug filed in Bugzilla).


Would it be ok with the attached testcase then? (same ChangeLog).

--
Marc Glisse

/* { dg-do run } */
/* { dg-options "-std=c11" } */

/* Test that the smallest positive value is not 0. This needs to be true
   even when denormals are not supported, so we do not pass any flag
   like -mieee.  If it fails on alpha, see PR 58757.  */

#include <float.h>

int main(){
  volatile float f = FLT_TRUE_MIN;
  volatile double d = DBL_TRUE_MIN;
  volatile long double l = LDBL_TRUE_MIN;
  if (f == 0 || d == 0 || l == 0)
    __builtin_abort ();
  return 0;
}


Re: [PATCH i386 AVX512] [37/n] Extend max/min insn patterns.

2014-09-11 Thread Uros Bizjak
On Thu, Sep 11, 2014 at 3:00 PM, Kirill Yukhin  wrote:

> Patch in the bottom extends integer max/min patterns.
> Also, it seems, like rounding variant was generated
> for maxmin patterns. Bug fixed.
>
> Bootstrapped.
> AVX-512* tests on top of patch-set all pass
> under simulator.
>
> Is it ok for trunk?
>
> gcc/
> * config/i386/sse.md (VI128_256): Delete.
> (define_mode_iterator VI124_256): New.
> (define_mode_iterator VI124_256_AVX512F_AVX512BW): Ditto.
> (define_expand "3"): Delete.
> (define_expand "3"): New.
> (define_insn "*avx2_3"): Rename from
> "*avx2_3" and update mode iterator.
> (define_expand "3_mask"): New.
> (define_insn "*avx512bw_3"): 
> Ditto.
> (define_insn "3"): Update mode
> iterator.
> (define_expand "3"): Update pattern generation

"3"

> in presence of AVX-512.

The patch is OK.

(BTW: Sometimes "svn diff -x -upw" comes in handy to exclude whitespace
changes. I don't know the equivalent option for git, though.)

Thanks,
Uros.


Re: [PATCH i386 AVX512] [38/n] Extend vpternlog, valign, vrotate insn patterns.

2014-09-11 Thread Uros Bizjak
On Thu, Sep 11, 2014 at 3:16 PM, Kirill Yukhin  wrote:

> Patch in the bottom extends patterns for rotate, ternlog and align.
>
> Bootstrapped.
> AVX-512* tests on top of patch-set all pass
> under simulator.
>
> Is it ok for trunk?
>
> gcc/
> * config/i386/sse.md
> (define_mode_iterator VI48_AVX512VL): New.
> (define_expand "_vternlog_maskz"): Rename from
> "avx512f_vternlog_maskz" and update mode iterator.
> (define_insn "_vternlog"): Rename
> from "avx512f_vternlog" and update mode iterator.
> (define_insn "_vternlog_mask"): Rename from
> "avx512f_vternlog_mask" and update mode iterator.
> (define_insn "_align"): Rename
> from "avx512f_align" and update mode
> iterator.
> (define_insn "_v"): Rename from
> "avx512f_v" and update mode iterator.
> (define_insn "_"): Rename from
> "avx512f_" and update mode iterator.
> (define_insn "clz2"): Use VI48_AVX512VL.

Use VI48_AVX512VL mode iterator.

> (define_insn "conflict"): Ditto.

Nice, almost mechanical patch!

OK.

Thanks,
Uros.


Re: [PATCH] gcc parallel make check

2014-09-11 Thread Jakub Jelinek
On Thu, Sep 11, 2014 at 07:26:37PM +0200, Jakub Jelinek wrote:
> right now.  The patch below intends to serialize the content of the
> problematic *.exp tests (the first runtest to reach one of those will simply
> run all the tests from that *.exp file, others will skip it).

Forgotten patch below.  BTW, something will probably need to be done about
acats too, either a similar approach or just splitting the chapters into
a few more jobs, because otherwise in make -C check -j48 acats dominated
the testing time for me.

--- gcc/Makefile.in.jj  2014-09-08 22:12:56.0 +0200
+++ gcc/Makefile.in 2014-09-11 16:58:01.076371437 +0200
@@ -513,34 +513,10 @@ xm_include_list=@xm_include_list@
 xm_defines=@xm_defines@
 lang_checks=
 lang_checks_parallelized=
-dg_target_exps:=aarch64.exp,alpha.exp,arm.exp,avr.exp,bfin.exp,cris.exp
-dg_target_exps:=$(dg_target_exps),epiphany.exp,frv.exp,i386.exp,ia64.exp
-dg_target_exps:=$(dg_target_exps),m68k.exp,microblaze.exp,mips.exp,powerpc.exp
-dg_target_exps:=$(dg_target_exps),rx.exp,s390.exp,sh.exp,sparc.exp,spu.exp
-dg_target_exps:=$(dg_target_exps),tic6x.exp,xstormy16.exp
-# This lists a couple of test files that take most time during check-gcc.
-# When doing parallelized check-gcc, these can run in parallel with the
-# remaining tests.  Each word in this variable stands for work for one
-# make goal and one extra make goal is added to handle all the *.exp
-# files not handled explicitly already.  If multiple *.exp files
-# should be run in the same runtest invocation (usually if they aren't
-# very long running, but still should be split of from the check-parallel-$lang
-# remaining tests runtest invocation), they should be concatenated with commas.
-# Note that [a-zA-Z] wildcards need to have []s prefixed with \ (needed
-# by tcl) and as the *.exp arguments are mached both as is and with
-# */ prefixed to it in runtest_file_p, it is usually desirable to include
-# a subdirectory name.
-check_gcc_parallelize=execute.exp=execute/2* \
- execute.exp=execute/\[013-9a-fA-F\]* \
- execute.exp=execute/\[pP\]*,dg.exp \
- execute.exp=execute/\[g-oq-zG-OQ-Z\]*,compile.exp=compile/2* \
- compile.exp=compile/\[9pP\]*,builtins.exp \
- compile.exp=compile/\[013-8a-oq-zA-OQ-Z\]* \
- dg-torture.exp,ieee.exp \
- vect.exp,unsorted.exp \
- guality.exp \
- struct-layout-1.exp,stackalign.exp \
- $(dg_target_exps)
+# Upper limit to which it is useful to parallelize this lang target.
+# It doesn't make sense to try e.g. 128 goals for small testsuites
+# like objc or go.
+check_gcc_parallelize=1
 lang_opt_files=@lang_opt_files@ $(srcdir)/c-family/c.opt $(srcdir)/common.opt
 lang_specs_files=@lang_specs_files@
 lang_tree_files=@lang_tree_files@
@@ -3631,27 +3607,32 @@ $(filter-out $(lang_checks_parallelized)
export TCL_LIBRARY ; fi ; \
$(RUNTEST) --tool $* $(RUNTESTFLAGS))
 
-$(patsubst %,%-subtargets,$(filter-out $(lang_checks_parallelized),$(lang_checks))): check-%-subtargets:
+$(patsubst %,%-subtargets,$(lang_checks)): check-%-subtargets:
@echo check-$*
 
 check_p_tool=$(firstword $(subst _, ,$*))
-check_p_vars=$(check_$(check_p_tool)_parallelize)
+check_p_count=$(check_$(check_p_tool)_parallelize)
 check_p_subno=$(word 2,$(subst _, ,$*))
-check_p_comma=,
-check_p_subwork=$(subst $(check_p_comma), ,$(if $(check_p_subno),$(word $(check_p_subno),$(check_p_vars
-check_p_numbers=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
+check_p_numbers0:=1 2 3 4 5 6 7 8 9
+check_p_numbers1:=0 $(check_p_numbers0)
+check_p_numbers2:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers1)))
+check_p_numbers3:=$(patsubst %,0%,$(check_p_numbers1)) $(check_p_numbers2)
+check_p_numbers4:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers3)))
+check_p_numbers5:=$(patsubst %,0%,$(check_p_numbers3)) $(check_p_numbers4)
+check_p_numbers6:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers5)))
+check_p_numbers:=$(check_p_numbers0) $(check_p_numbers2) $(check_p_numbers4) $(check_p_numbers6)
 check_p_subdir=$(subst _,,$*)
-check_p_subdirs=$(wordlist 1,$(words $(check_$*_parallelize)),$(check_p_numbers))
+check_p_subdirs=$(wordlist 1,$(check_p_count),$(wordlist 1,$(or $(GCC_TEST_PARALLEL_SLOTS),128),$(check_p_numbers)))
 
 # For parallelized check-% targets, this decides whether parallelization
 # is desirable (if -jN is used and RUNTESTFLAGS doesn't contain anything
 # but optional --target_board or --extra_opts arguments).  If desirable,
 # recursive make is run with check-parallel-$lang{,1,2,3,4,5} etc. goals,
 # which can be executed in parallel, as they are run in separate directories.
-# check-parallel-$lang{1,2,3,4,5} etc. goals invoke runtest with the longest
-# running *.exp files from the testsuite, as determined b
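
For readers following the check_p_numbers* construction in the diff above,
here is a minimal Python sketch of what the nested $(foreach)/$(patsubst)
chain appears to expand to. It is a reading aid under stated assumptions, not
part of the patch: prefix_each is a hypothetical stand-in for
$(patsubst %,<prefix>%,<list>), and check_p_subdirs takes the two inputs
(check_p_count and GCC_TEST_PARALLEL_SLOTS) as plain arguments.

```python
def prefix_each(prefix, words):
    """Hypothetical stand-in for $(patsubst %,<prefix>%,<words>)."""
    return [prefix + w for w in words]


numbers0 = [str(d) for d in range(1, 10)]                            # 1 .. 9
numbers1 = ["0"] + numbers0                                          # 0 .. 9
numbers2 = [w for i in numbers0 for w in prefix_each(i, numbers1)]   # 10 .. 99
numbers3 = prefix_each("0", numbers1) + numbers2                     # 00 .. 99
numbers4 = [w for i in numbers0 for w in prefix_each(i, numbers3)]   # 100 .. 999
numbers5 = prefix_each("0", numbers3) + numbers4                     # 000 .. 999
numbers6 = [w for i in numbers0 for w in prefix_each(i, numbers5)]   # 1000 .. 9999

# Only the lists without leading zeros are concatenated, as in the patch.
check_p_numbers = numbers0 + numbers2 + numbers4 + numbers6


def check_p_subdirs(count, slots=""):
    """Mirror the two nested $(wordlist ...) calls.

    slots plays the role of GCC_TEST_PARALLEL_SLOTS; an empty value falls
    back to 128, like $(or $(GCC_TEST_PARALLEL_SLOTS),128).
    """
    limit = int(slots) if slots else 128
    return check_p_numbers[:limit][:count]
```

The zero-padded intermediate lists (numbers3, numbers5) exist only so that
the next prefixing step yields every value of the following width; the final
list is the plain sequence of decimal numbers with one to four digits.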

Re: [PATCH i386 AVX512] [37/n] Extend max/min insn patterns.

2014-09-11 Thread H.J. Lu
On Thu, Sep 11, 2014 at 11:10 AM, Uros Bizjak  wrote:
> On Thu, Sep 11, 2014 at 3:00 PM, Kirill Yukhin wrote:
>
>> Patch in the bottom extends integer max/min patterns.
>> Also, it seems, like rounding variant was generated
>> for maxmin patterns. Bug fixed.
>>
>> Bootstrapped.
>> AVX-512* tests on top of patch-set all pass
>> under simulator.
>>
>> Is it ok for trunk?
>>
>> gcc/
>> * config/i386/sse.md (VI128_256): Delete.
>> (define_mode_iterator VI124_256): New.
>> (define_mode_iterator VI124_256_AVX512F_AVX512BW): Ditto.
>> (define_expand "3"): Delete.
>> (define_expand "3"): New.
>> (define_insn "*avx2_3"): Rename from
>> "*avx2_3" and update mode iterator.
>> (define_expand "3_mask"): New.
>> (define_insn "*avx512bw_3"): Ditto.
>> (define_insn "3"): Update mode
>> iterator.
>> (define_expand "3"): Update pattern generation
>
> "3"
>
>> in presence of AVX-512.
>
> The patch is OK.
>
> (BTW: Sometimes "svn diff -x -upw" comes in handy to exclude whitespace
> changes. I don't know the equivalent option for git, though.)
>

git diff -w

   -w, --ignore-all-space
       Ignore whitespace when comparing lines. This ignores differences
       even if one line has whitespace where the other line has none.

-- 
H.J.


RE: [PATCH] gcc parallel make check

2014-09-11 Thread VandeVondele Joost
> And these Fortran inter-test dependencies, which Tobias told me is
> PR56408.
> For PR56408 we need some fix.

BTW, is there anything special about Fortran? At least 180 test files
contain 'dg-additional-sources', some in a very non-local way:

./objc.dg/foreach-2.m: /* { dg-additional-sources "../objc-obj-c++-shared/nsconstantstring-class-impl.m" } */
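
A survey like the one above could be reproduced with a small script. The
following is a rough sketch under stated assumptions, not the command used
for the count quoted here: additional_sources and non_local are hypothetical
helpers, only the double-quoted form of the directive is recognized, and
"non-local" is taken to mean that a listed source contains a directory
component.

```python
import os
import re
import tempfile

# Assumption: the directive argument is a double-quoted, space-separated list.
DIRECTIVE = re.compile(r'dg-additional-sources\s+"([^"]*)"')


def additional_sources(root):
    """Map each file under root to the sources named by its directives."""
    found = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                text = fh.read()
            names = [s for m in DIRECTIVE.finditer(text)
                     for s in m.group(1).split()]
            if names:
                found[path] = names
    return found


def non_local(found):
    """Keep only entries that reference a source outside their own directory."""
    result = {}
    for path, sources in found.items():
        outside = [s for s in sources if "/" in s]
        if outside:
            result[path] = outside
    return result
```

Calling len(additional_sources(root)) on a testsuite directory gives the
number of files using the directive, and non_local narrows that down to
cases like the foreach-2.m example.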

Joost
