Re: ubsan PATCH to fix compile-time hog with operator overloading (PR sanitizer/78208)

2016-11-18 Thread Jakub Jelinek
On Thu, Nov 17, 2016 at 05:39:57PM -0800, Marek Polacek wrote:
> > Yes, there will be 119 SAVE_EXPRs, and when you -fdump-tree-original it,
> > it will be just insanely huge, but each SAVE_EXPR appears exactly twice
> > in its containing SAVE_EXPR and the second time cp_genericize_r sees
> > the SAVE_EXPR, it will do the
> > 1138  /* Other than invisiref parms, don't walk the same tree 
> > twice.  */
> > 1139  if (p_set->contains (stmt))
> > 1140{
> > 1141  *walk_subtrees = 0;
> > 1142  return NULL_TREE;
> > 1143}
> > early exit.
>  
> Clearly I misread things here.  I blame the cold I've caught!
> 
> In any case, I'm sorry for wasting your time and I'll just close the PR.

I bet it is a compile-time hog with -fsanitize=undefined
-fdump-tree-original.  So, if anything, we'd need some hack in the SAVE_EXPR
generic printing code, say track the depth of SAVE_EXPRs being printed
concurrently and if we go above a certain level of them (say 10), change
into a different mode where we print the content of SAVE_EXPR just the
first time we encounter it + e.g. the SAVE_EXPR address and then when
encountering the same SAVE_EXPR again, just print the address.  When
the SAVE_EXPR nesting level shrinks below 10, clear the pointer set
of SAVE_EXPRs and resume normal behavior.
-fsanitize=undefined -fdump-tree-gimple is fast.

Jakub


Re: [PATCH v3] bb-reorder: Improve compgotos pass (PR71785)

2016-11-18 Thread Richard Biener
On Thu, 17 Nov 2016, Segher Boessenkool wrote:

> For code like the testcase in PR71785 GCC factors all the indirect branches
> to a single dispatcher that then everything jumps to.  This is because
> having many indirect branches with each many jump targets does not scale
> in large parts of the compiler.  Very late in the pass pipeline (right
> before peephole2) the indirect branches are then unfactored again, by
> the duplicate_computed_gotos pass.
> 
> This pass works by replacing branches to such a common dispatcher by a
> copy of the dispatcher.  For code like this testcase this does not work
> so well: most cases do a single addition instruction right before the
> dispatcher, but not all, and we end up with only two indirect jumps: the
> one without the addition, and the one with the addition in its own basic
> block, and now everything else jumps _there_.
> 
> This patch rewrites the algorithm to deal with this.  It also makes it
> simpler: it does not need the "candidates" array anymore, it does not
> need RTL layout mode, it does not need cleanup_cfg, and it does not
> need to keep track of what blocks it already visited.
> 
> Tested on powerpc64-linux {-m32,-m64}, and on the testcase, and on a version
> of the testcase that has 2000 cases instead of 4.  Is this okay for trunk?

Looks good to me - a single question below:

> 
> Segher
> 
> 
> 2016-11-17  Segher Boessenkool  
> 
>   PR rtl-optimization/71785
>   * bb-reorder.c (maybe_duplicate_computed_goto): New function.
>   (duplicate_computed_gotos): New function.
>   (pass_duplicate_computed_gotos::execute): Rewrite.
> 
> ---
>  gcc/bb-reorder.c | 214 
> ---
>  1 file changed, 93 insertions(+), 121 deletions(-)
> 
> diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
> index 85bc569..90de2de 100644
> --- a/gcc/bb-reorder.c
> +++ b/gcc/bb-reorder.c
> @@ -2587,11 +2587,100 @@ make_pass_reorder_blocks (gcc::context *ctxt)
>return new pass_reorder_blocks (ctxt);
>  }
>  
> +/* Duplicate a block (that we already know ends in a computed jump) into its
> +   predecessors, where possible.  Return whether anything is changed.  */
> +static bool
> +maybe_duplicate_computed_goto (basic_block bb, int max_size)
> +{
> +  if (single_pred_p (bb))
> +return false;
> +
> +  /* Make sure that the block is small enough.  */
> +  rtx_insn *insn;
> +  FOR_BB_INSNS (bb, insn)
> +if (INSN_P (insn))
> +  {
> + max_size -= get_attr_min_length (insn);
> + if (max_size < 0)
> +return false;
> +  }
> +
> +  bool changed = false;
> +  edge e;
> +  edge_iterator ei;
> +  for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); )
> +{
> +  basic_block pred = e->src;
> +
> +  /* Do not duplicate BB into PRED if that is the last predecessor, or if
> +  we cannot merge a copy of BB with PRED.  */
> +  if (single_pred_p (bb)
> +   || !single_succ_p (pred)
> +   || e->flags & EDGE_COMPLEX
> +   || pred->index < NUM_FIXED_BLOCKS
> +   || (JUMP_P (BB_END (pred)) && CROSSING_JUMP_P (BB_END (pred
> + {
> +   ei_next (&ei);
> +   continue;
> + }
> +
> +  if (dump_file)
> + fprintf (dump_file, "Duplicating computed goto bb %d into bb %d\n",
> +  bb->index, e->src->index);
> +
> +  /* Remember if PRED can be duplicated; if so, the copy of BB merged
> +  with PRED can be duplicated as well.  */
> +  bool can_dup_more = can_duplicate_block_p (pred);
> +
> +  /* Make a copy of BB, merge it into PRED.  */
> +  basic_block copy = duplicate_block (bb, e, NULL);
> +  emit_barrier_after_bb (copy);
> +  reorder_insns_nobb (BB_HEAD (copy), BB_END (copy), BB_END (pred));
> +  merge_blocks (pred, copy);

there is can_merge_blocks_p and I would have expected the RTL CFG hooks
to do the insn "merging" (you use reorder_insns_nobb for this which is
actually deprecated?).  I suppose if you'd specify pred as third arg
to duplicate_block the CFG hook would have done its work 
(can_merge_blocks_p checks a->next_bb == b)?  I'm not that familiar
with CFG-RTL mode.

> +
> +  changed = true;
> +
> +  /* Try to merge the resulting merged PRED into further predecessors.  
> */
> +  if (can_dup_more)
> + maybe_duplicate_computed_goto (pred, max_size);
> +}
> +
> +  return changed;
> +}
> +
>  /* Duplicate the blocks containing computed gotos.  This basically unfactors
> computed gotos that were factored early on in the compilation process to
> -   speed up edge based data flow.  We used to not unfactoring them again,
> -   which can seriously pessimize code with many computed jumps in the source
> -   code, such as interpreters.  See e.g. PR15242.  */
> +   speed up edge based data flow.  We used to not unfactor them again, which
> +   can seriously pessimize code with many computed jumps in the source code,
> +   such as interpreters.  See e.g. PR15242.  */
> +static void
> +duplic

Re: [PATCH v3] bb-reorder: Improve compgotos pass (PR71785)

2016-11-18 Thread Segher Boessenkool
On Fri, Nov 18, 2016 at 09:09:40AM +0100, Richard Biener wrote:
> > +  /* Make a copy of BB, merge it into PRED.  */
> > +  basic_block copy = duplicate_block (bb, e, NULL);
> > +  emit_barrier_after_bb (copy);
> > +  reorder_insns_nobb (BB_HEAD (copy), BB_END (copy), BB_END (pred));
> > +  merge_blocks (pred, copy);
> 
> there is can_merge_blocks_p and I would have expected the RTL CFG hooks
> to do the insn "merging" (you use reorder_insns_nobb for this which is
> actually deprecated?).  I suppose if you'd specify pred as third arg
> to duplicate_block the CFG hook would have done its work 
> (can_merge_blocks_p checks a->next_bb == b)?  I'm not that familiar
> with CFG-RTL mode.

The third arg to duplicate_block ("after") doesn't do anything in either
RTL mode: it only is passed to move_block_after, but that is a noop for
the RTL modes (and its result is not checked by duplicate_block, ouch).

can_merge_blocks_p will always return true here (except I forgot to check
for simplejump_p -- I'll add that).

rtl_merge_blocks does not check rtl_can_merge_blocks_p (and you get a
quite spectacular ICE when you get it wrong: everything between the two
blocks is deleted :-) ).  I'll make a separate patch that checks it
(layout mode already does).

reorder_insns_nobb is deprecated but it does an important job for which
there is no replacement.  In many cases you can do better than doing
manual surgery on the insn stream of course.

Thanks for the review,


Segher


Re: [PATCH] [AArch64] Fix PR78382

2016-11-18 Thread Kyrill Tkachov

Hi Naveen,

On 18/11/16 05:30, Hurugalawadi, Naveen wrote:

Hi,

Please find attached the patch that fixes PR78382.

The "SYMBOL_SMALL_TLSGD" was not handled for ILP32.
Hence it generates error when compiled for ILP32.
The attached patch adds the support and handles it properly as expected
for ILP32.

Please review the patch and let me know if its okay?

Regression tested on AArch64 with no regressions.


Bootstrap on an aarch64 system is also required.


Thanks,
Naveen

2016-11-18  Naveen H.S  

* config/aarch64/aarch64.c (aarch64_load_symref_appropriately):
Handle SYMBOL_SMALL_TLSGD for ILP32.
* config/aarch64/aarch64.md : tlsgd_small modified into
tlsgd_small_ to support SImode and DImode.
*tlsgd_small modified into *tlsgd_small_ to support SImode and
DImode.



diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 11d41cf..1688f0d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1374,10 +1374,17 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
 case SYMBOL_SMALL_TLSGD:
   {
rtx_insn *insns;
-   rtx result = gen_rtx_REG (Pmode, R0_REGNUM);
+   rtx result;
+   if (TARGET_ILP32)
+ result = gen_rtx_REG (SImode, R0_REGNUM);
+   else
+ result = gen_rtx_REG (DImode, R0_REGNUM);
 
Just use gen_rtx_REG (ptr_mode, R0_REGNUM)?


start_sequence ();
-   aarch64_emit_call_insn (gen_tlsgd_small (result, imm));
+   if (TARGET_ILP32)
+ aarch64_emit_call_insn (gen_tlsgd_small_si (result, imm));
+   else
+ aarch64_emit_call_insn (gen_tlsgd_small_di (result, imm));
insns = get_insns ();
end_sequence ();
 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md

index a652a7c..4833c7f 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5089,20 +5089,20 @@
 ;; The TLS ABI specifically requires that the compiler does not schedule
 ;; instructions in the TLS stubs, in order to enable linker relaxation.
 ;; Therefore we treat the stubs as an atomic sequence.
-(define_expand "tlsgd_small"
+(define_expand "tlsgd_small_"
  [(parallel [(set (match_operand 0 "register_operand" "")
   (call (mem:DI (match_dup 2)) (const_int 1)))
-(unspec:DI [(match_operand:DI 1 "aarch64_valid_symref" "")] 
UNSPEC_GOTSMALLTLS)
+(unspec:DI [(match_operand:PTR 1 "aarch64_valid_symref" "")] 
UNSPEC_GOTSMALLTLS)
 (clobber (reg:DI LR_REGNUM))])]
  ""
 {
   operands[2] = aarch64_tls_get_addr ();
 })
 
-(define_insn "*tlsgd_small"

+(define_insn "*tlsgd_small_"
   [(set (match_operand 0 "register_operand" "")
(call (mem:DI (match_operand:DI 2 "" "")) (const_int 1)))
-   (unspec:DI [(match_operand:DI 1 "aarch64_valid_symref" "S")] 
UNSPEC_GOTSMALLTLS)
+   (unspec:DI [(match_operand:PTR 1 "aarch64_valid_symref" "S")] 
UNSPEC_GOTSMALLTLS)
(clobber (reg:DI LR_REGNUM))
   ]
   ""
diff --git a/gcc/testsuite/gcc.target/aarch64/pr78382.c 
b/gcc/testsuite/gcc.target/aarch64/pr78382.c
new file mode 100644
index 000..6c98e5e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr78382.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -fpic -mabi=ilp32 -mtls-dialect=trad" } */
+
+__thread int abc;
+void
+foo ()
+{
+  int *p;
+  p = &abc;
+}




Re: [PATCH, GCC/ARM] Make arm_feature_set agree with type of FL_* macros

2016-11-18 Thread Kyrill Tkachov


On 17/11/16 20:41, Thomas Preudhomme wrote:



On 17/11/16 09:10, Kyrill Tkachov wrote:


On 16/11/16 18:09, Thomas Preudhomme wrote:

Hi,

I've rebased the patch to make arm_feature_set agree with type of FL_* macros
on top of trunk rather than on top of the optional -mthumb patch. That
involved doing the changes to gcc/config/arm/arm-protos.h rather than
gcc/config/arm/arm-flags.h. I also took advantage of the fact that each line
is changed to change the indentation to tabs and add dots in comments missing
one.

For reference, please find below the original patch description:

Currently arm_feature_set is defined in gcc/config/arm/arm-flags as an array
of 2 unsigned long. However, the flags stored in these two entries are
(signed) int, being combinations of bits set via expression of the form 1 <<
bitno. This creates 3 issues:

1) undefined behavior when setting the msb (1 << 31)
2) undefined behavior when storing a flag with msb set (negative int) into one
of the unsigned array entries (positive int)
3) waste of space since the top 32 bits of each entry is not used

This patch changes the definition of FL_* macro to be unsigned int by using
the form 1U << bitno instead and changes the definition of arm_feature_set to
be an array of 2 unsigned (int) entries.

*** gcc/ChangeLog ***

2016-10-15  Thomas Preud'homme 

* config/arm/arm-protos.h (FL_NONE, FL_ANY, FL_CO_PROC, FL_ARCH3M,
FL_MODE26, FL_MODE32, FL_ARCH4, FL_ARCH5, FL_THUMB, FL_LDSCHED,
FL_STRONG, FL_ARCH5E, FL_XSCALE, FL_ARCH6, FL_VFPV2, FL_WBUF,
FL_ARCH6K, FL_THUMB2, FL_NOTM, FL_THUMB_DIV, FL_VFPV3, FL_NEON,
FL_ARCH7EM, FL_ARCH7, FL_ARM_DIV, FL_ARCH8, FL_CRC32, FL_SMALLMUL,
FL_NO_VOLATILE_CE, FL_IWMMXT, FL_IWMMXT2, FL_ARCH6KZ, FL2_ARCH8_1,
FL2_ARCH8_2, FL2_FP16INST): Reindent comment, add final dot when
missing and make value unsigned.
(arm_feature_set): Use unsigned entries instead of unsigned long.


Bootstrapped on arm-linux-gnueabihf targeting Thumb-2 state.

Is this ok for trunk?

Best regards,

Thomas

On 14/11/16 18:56, Thomas Preudhomme wrote:

My apologize, I realized when trying to apply the patch that I wrote it on top
of the optional -mthumb patch instead of the reverse. I'll rebase it to not
screw up bisect.

Best regards,

Thomas

On 14/11/16 14:47, Kyrill Tkachov wrote:


On 14/11/16 14:07, Thomas Preudhomme wrote:

Hi,

Currently arm_feature_set is defined in gcc/config/arm/arm-flags as an array
of 2 unsigned long. However, the flags stored in these two entries are
(signed) int, being combinations of bits set via expression of the form 1 <<
bitno. This creates 3 issues:

1) undefined behavior when setting the msb (1 << 31)
2) undefined behavior when storing a flag with msb set (negative int) into one
of the unsigned array entries (positive int)
3) waste of space since the top 32 bits of each entry is not used

This patch changes the definition of FL_* macro to be unsigned int by using
the form 1U << bitno instead and changes the definition of arm_feature_set to
be an array of 2 unsigned (int) entries.

Bootstrapped on arm-linux-gnueabihf targeting Thumb-2 state.

Is this ok for trunk?



Ok.
Thanks,
Kyrill


Best regards,

Thomas




+#define FL_ARCH7  (1U << 22)/* Architecture 7. */
+#define FL_ARM_DIV(1U << 23)/* Hardware divide (ARM mode).  */
+#define FL_ARCH8  (1U << 24)/* Architecture 8. */
+#define FL_CRC32  (1U << 25)/* ARMv8 CRC32 instructions.  */
+#define FL_SMALLMUL   (1U << 26)/* Small multiply supported.  */
+#define FL_NO_VOLATILE_CE  (1U << 27)/* No volatile memory in IT block.  */
+
+#define FL_IWMMXT (1U << 29)/* XScale v2 or "Intel Wireless MMX
+   technology".  */
+#define FL_IWMMXT2(1U << 30)/* "Intel Wireless MMX2
+technology".  */
+#define FL_ARCH6KZ(1U << 31)/* ARMv6KZ architecture.  */
+
+#define FL2_ARCH8_1   (1U << 0)/* Architecture 8.1.  */
+#define FL2_ARCH8_2   (1U << 1)/* Architecture 8.2.  */
+#define FL2_FP16INST  (1U << 2)/* FP16 Instructions for ARMv8.2 and
+   later.  */

Since you're here can you please replace the "Architecture " by
"ARMv(7,8,8.1,8.2)-A"
in these entries.


That seems to make it less accurate though. For a start, most of them would also apply to the R profile. There is also a couple of them that would apply to the M profile: for instance FL_ARCH7 would be set for ARMv7-M and FL_ARCH8 would 
be set for ARMv8-M.




Fair point.
Ok as is then,
Kyrill


Best regards,

Thomas




Re: [PATCH][AArch64] Separate shrink wrapping hooks implementation

2016-11-18 Thread Kyrill Tkachov


On 17/11/16 17:45, Segher Boessenkool wrote:

On Thu, Nov 17, 2016 at 04:50:46PM +, Kyrill Tkachov wrote:

Is loading/storing a pair as cheap as loading/storing a single register?
In that case you could shrink-wrap per pair of registers instead.

I suppose it can vary by microarchitecture. For the purposes of codegen
I'd say
it's more expensive than load/storing a single register (as there's more
memory bandwidth required after all)
but cheaper than two separate loads stores (alignment quirks
notwithstanding).
Interesting idea. That could help with code size too. I'll try it out.

I'm encountering some difficulties implementing this idea.
I want to still keep the per-register structures across the hooks but
basically restrict the number
of components in a basic block to an even number of FPRs and GPRs. I tried
doing this in COMPONENTS_FOR_BB

So your COMPONENTS_FOR_BB returns both components in a pair whenever one
of those is needed?  That should work afaics.



I mean I still want to have one component per register and since
emit_{prologue,epilogue}_components knows how to form pairs from the
components passed down to it I just need to restrict the number of
components in any particular basic block to an even number.
So say a function can wrap 5 registers: x22,x23,x24,x25,x26.
I want get_separate_components to return 5 components since in that hook
we don't know how these registers are distributed across each basic block.
components_for_bb has that information.
In components_for_bb I want to restrict the components for a basic block to
an even number, so if normally all 5 registers would be valid for wrapping
in that bb I'd only choose 4 so I could form 2 pairs. But selecting only 4
of the 5 registers, say only x22,x23,x24,x25 leads to x26 not being saved
or restored at all, even during the normal prologue and epilogue because
x26 was marked as a component in components_for_bb and therefore omitted from
the prologue and epilogue.
So I'm thinking x26 should be removed from the wrappable components of
a basic block by disqualify_components. I'm trying that approach now.

Thanks,
Kyrill


but apparently this ended up not saving/restoring some of the registers at
all because the components that were
"filtered out" that way still made their way to the bitmap passed into
SET_HANDLED_COMPONENTS and so the normal
prologue/epilogue didn't end up saving and restoring them.

I am not sure what this means?  "filtered out"?


Segher




Re: [PATCH 1/2] PR77822

2016-11-18 Thread Segher Boessenkool
Hi Dominik,

On Thu, Nov 17, 2016 at 04:53:47PM +0100, Dominik Vogt wrote:
> +/* A convenience macro to determine whether a SIZE lies inclusively
> +   within [1, RANGE], POS lies inclusively within between
> +   [0, RANGE - 1] and the sum lies inclusively within [1, RANGE].  */
> +#define SIZE_POS_IN_RANGE(SIZE, POS, RANGE) \
> +  (IN_RANGE (POS, 0, (RANGE) - 1) \
> +   && IN_RANGE (SIZE, 1, RANGE) \
> +   && IN_RANGE ((SIZE) + (POS), 1, RANGE))

You should put parens around every use of SIZE, POS, RANGE -- there could
be a comma operator in the macro argument.

You don't check if SIZE+POS overflows / wraps around.  If you really don't
care about that, you can lose the

> +   && IN_RANGE (SIZE, 1, RANGE) \

part as well?


Segher


Re: Add a load_extend_op wrapper

2016-11-18 Thread Richard Sandiford
Eric Botcazou  writes:
>> 2016-11-15  Richard Sandiford  
>>  Alan Hayward  
>>  David Sherwood  
>> 
>>  * rtl.h (load_extend_op): Declare.
>>  * rtlanal.c (load_extend_op): New function.
>
> I'd make it an inline function.

Sorry, I'd committed it before I got this message.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* rtlanal.c (load_extend_op): Move to...
* rtl.h: ...here and make inline.

diff --git a/gcc/rtl.h b/gcc/rtl.h
index df5172b..b21514f 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -2954,7 +2954,6 @@ extern void set_insn_deleted (rtx);
 
 /* Functions in rtlanal.c */
 
-extern rtx_code load_extend_op (machine_mode);
 extern rtx single_set_2 (const rtx_insn *, const_rtx);
 extern bool contains_symbol_ref_p (const_rtx);
 extern bool contains_symbolic_reference_p (const_rtx);
@@ -3771,5 +3770,17 @@ struct GTY(()) cgraph_rtl_info {
   unsigned function_used_regs_valid: 1;
 };
 
+/* If loads from memories of mode MODE always sign or zero extend,
+   return SIGN_EXTEND or ZERO_EXTEND as appropriate.  Return UNKNOWN
+   otherwise.  */
+
+inline rtx_code
+load_extend_op (machine_mode mode)
+{
+  if (SCALAR_INT_MODE_P (mode)
+  && GET_MODE_PRECISION (mode) < BITS_PER_WORD)
+return LOAD_EXTEND_OP (mode);
+  return UNKNOWN;
+}
 
 #endif /* ! GCC_RTL_H */
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index f07a77a..55a9d2c 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -3863,19 +3863,6 @@ subreg_nregs_with_regno (unsigned int regno, const_rtx x)
   return info.nregs;
 }
 
-/* If loads from memories of mode MODE always sign or zero extend,
-   return SIGN_EXTEND or ZERO_EXTEND as appropriate.  Return UNKNOWN
-   otherwise.  */
-
-rtx_code
-load_extend_op (machine_mode mode)
-{
-  if (SCALAR_INT_MODE_P (mode)
-  && GET_MODE_PRECISION (mode) < BITS_PER_WORD)
-return LOAD_EXTEND_OP (mode);
-  return UNKNOWN;
-}
-
 struct parms_set_data
 {
   int nregs;



Re: [PATCH] [AArch64] Fix PR78382

2016-11-18 Thread Hurugalawadi, Naveen
Hi Kyrill,

Thanks for the comment.

Bootstrapped successfully on AArch64 (thunder) system.
And also regression tested on AArch64(thunder) with no regressions.

Thanks,
Naveen

[PATCH GCC]Move simplification from fold-cond.c to match.pd and extend it

2016-11-18 Thread Bin Cheng
Hi,
This is a rework of https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02005.html.  
According to review comment, I extended the original patch and made it covering 
last kind simplification of fold_cond_expr_with_comparison (Well, this patch 
handles <, <=, > and >=.  == will be handled by a follow up).  This patch also 
adds several tests, some tests are for existing fold_cond_expr_with_comparison 
simplification but not covered yet; others are for new extension.
Bootstrap and test on x86_64 and AArch64 along with following patches, is it OK?

Thanks,
bin

2016-11-16  Bin Cheng  

* fold-const.c (fold_cond_expr_with_comparison): Move simplification
for A cmp C1 ? A : C2 to below, also simplify remaining code.
* match.pd: Move and extend simplification from above to here:
(cond (cmp (convert1? x) c1) (convert2? x) c2) -> (minmax (x c)).
* tree-if-conv.c (ifcvt_follow_ssa_use_edges): New func.
(predicate_scalar_phi): Call fold_stmt using the new valueize func.

gcc/testsuite/ChangeLog
2016-11-16  Bin Cheng  

* gcc.dg/fold-cond_expr-1.c: New test.
* gcc.dg/fold-condcmpconv-1.c: New test.
* gcc.dg/fold-condcmpconv-2.c: New test.
* gcc.dg/tree-ssa/pr66726.c: Adjust test strings.diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index c597414..1e61ccf 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -5210,95 +5210,18 @@ fold_cond_expr_with_comparison (location_t loc, tree 
type,
}
 }
 
-  /* If this is A op C1 ? A : C2 with C1 and C2 constant integers,
- we might still be able to simplify this.  For example,
- if C1 is one less or one more than C2, this might have started
- out as a MIN or MAX and been transformed by this function.
- Only good for INTEGER_TYPEs, because we need TYPE_MAX_VALUE.  */
+  /* If this is A == C1 ? A : C2 with C1 and C2 constant integers,
+ we simplify it into A == C1 ? C1 : C2.  */
 
-  if (INTEGRAL_TYPE_P (type)
+  if (comp_code == EQ_EXPR
+  && INTEGRAL_TYPE_P (type)
   && TREE_CODE (arg01) == INTEGER_CST
+  && TREE_CODE (arg1) != INTEGER_CST
   && TREE_CODE (arg2) == INTEGER_CST)
-switch (comp_code)
-  {
-  case EQ_EXPR:
-   if (TREE_CODE (arg1) == INTEGER_CST)
- break;
-   /* We can replace A with C1 in this case.  */
-   arg1 = fold_convert_loc (loc, type, arg01);
-   return fold_build3_loc (loc, COND_EXPR, type, arg0, arg1, arg2);
-
-  case LT_EXPR:
-   /* If C1 is C2 + 1, this is min(A, C2), but use ARG00's type for
-  MIN_EXPR, to preserve the signedness of the comparison.  */
-   if (! operand_equal_p (arg2, TYPE_MAX_VALUE (type),
-  OEP_ONLY_CONST)
-   && operand_equal_p (arg01,
-   const_binop (PLUS_EXPR, arg2,
-build_int_cst (type, 1)),
-   OEP_ONLY_CONST))
- {
-   tem = fold_build2_loc (loc, MIN_EXPR, TREE_TYPE (arg00), arg00,
-  fold_convert_loc (loc, TREE_TYPE (arg00),
-arg2));
-   return fold_convert_loc (loc, type, tem);
- }
-   break;
-
-  case LE_EXPR:
-   /* If C1 is C2 - 1, this is min(A, C2), with the same care
-  as above.  */
-   if (! operand_equal_p (arg2, TYPE_MIN_VALUE (type),
-  OEP_ONLY_CONST)
-   && operand_equal_p (arg01,
-   const_binop (MINUS_EXPR, arg2,
-build_int_cst (type, 1)),
-   OEP_ONLY_CONST))
- {
-   tem = fold_build2_loc (loc, MIN_EXPR, TREE_TYPE (arg00), arg00,
-  fold_convert_loc (loc, TREE_TYPE (arg00),
-arg2));
-   return fold_convert_loc (loc, type, tem);
- }
-   break;
-
-  case GT_EXPR:
-   /* If C1 is C2 - 1, this is max(A, C2), but use ARG00's type for
-  MAX_EXPR, to preserve the signedness of the comparison.  */
-   if (! operand_equal_p (arg2, TYPE_MIN_VALUE (type),
-  OEP_ONLY_CONST)
-   && operand_equal_p (arg01,
-   const_binop (MINUS_EXPR, arg2,
-build_int_cst (type, 1)),
-   OEP_ONLY_CONST))
- {
-   tem = fold_build2_loc (loc, MAX_EXPR, TREE_TYPE (arg00), arg00,
-  fold_convert_loc (loc, TREE_TYPE (arg00),
-arg2));
-   return fold_convert_loc (loc, type, tem);
- }
-   break;
-
-  case GE_EXPR:
-   /* If C1 is C2 + 1, this is max(A, C2), with the same care as above.  */
-   if (! operand_equal_p (arg2, TYPE_MAX_VALUE (type),
- 

Re: [PATCH 1/4] Remove build dependence on HSA run-time

2016-11-18 Thread Jakub Jelinek
On Sun, Nov 13, 2016 at 08:02:41PM +0100, Martin Jambor wrote:
> @@ -143,6 +240,12 @@ init_enviroment_variables (void)
>  suppress_host_fallback = true;
>else
>  suppress_host_fallback = false;
> +
> +  hsa_runtime_lib = getenv ("HSA_RUNTIME_LIB");
> +  if (hsa_runtime_lib == NULL)
> +hsa_runtime_lib = HSA_RUNTIME_LIB "libhsa-runtime64.so";

libgomp is very much env var driven, but the above one is IMHO just
too dangerous in suid/sgid apps, allowing one to select a library
of their own choice to dlopen is an instant exploit possibility,
so such env var should be only considered in non-priviledged processes.
It is possible to try dlopen (hsa_runtime_lib) and if that fails, try
dlopen ("libhsa-runtime64.so"), where it would search the library only
in the system paths (note, the dynamic linker handles LD_LIBRARY_PATH,
LD_PRELOAD etc. safely in priviledges processes).

So I'd recommend to use secure_getenv instead.  E.g. see how libgfortran
checks for it in configure and even provides a fallback version for it.
In the HSA plugin case, I think the fallback should be static function
in the plugin.
Otherwise it looks reasonable, thanks for working on that.

Jakub


[PATCH][wwwdocs] Document new arm/aarch64 CPU support in a more compact way

2016-11-18 Thread Kyrill Tkachov

Hi all,

This patch brings the new CPU support announcements in line with the format 
used in the GCC 6 notes.
That is, rather than have a separate "The  is now supported via 
the..." entry for each new core
just list them and give a use example with the -mcpu,-mtune options.

Ok to commit?

Thanks,
Kyrill
Index: htdocs/gcc-7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
retrieving revision 1.24
diff -U 3 -r1.24 changes.html
--- htdocs/gcc-7/changes.html	9 Nov 2016 14:28:59 -	1.24
+++ htdocs/gcc-7/changes.html	16 Nov 2016 10:23:04 -
@@ -287,14 +287,14 @@
processing floating-point instructions.
  
  
-   The ARM Cortex-A73 processor is now supported via the
-   -mcpu=cortex-a73 and -mtune=cortex-a73
-   options as well as the equivalent target attributes and pragmas.
- 
- 
-   The Broadcom Vulcan processor is now supported via the
-   -mcpu=vulcan and -mtune=vulcan options as
-   well as the equivalent target attributes and pragmas.
+   Support has been added for the following processors
+   (GCC identifiers in parentheses): ARM Cortex-A73
+   (cortex-a73) and Broadcom Vulcan (vulcan).
+   The GCC identifiers can be used
+   as arguments to the -mcpu or -mtune options,
+   for example: -mcpu=cortex-a73 or
+   -mtune=vulcan or as arguments to the equivalent target
+   attributes and pragmas.
  

 
@@ -316,19 +316,14 @@
armv8-m.main+dsp options.
  
  
-   The ARM Cortex-A73 processor is now supported via the
-   -mcpu=cortex-a73 and -mtune=cortex-a73
-   options.
- 
- 
-   The ARM Cortex-M23 processor is now supported via the
-   -mcpu=cortex-m23 and -mtune=cortex-m23
-   options.
- 
- 
-   The ARM Cortex-M33 processor is now supported via the
-   -mcpu=cortex-m33 and -mtune=cortex-m33
-   options.
+   Support has been added for the following processors
+   (GCC identifiers in parentheses): ARM Cortex-A73
+   (cortex-a73), ARM Cortex-M23 (cortex-m23) and
+   ARM Cortex-M33 (cortex-m33).
+   The GCC identifiers can be used
+   as arguments to the -mcpu or -mtune options,
+   for example: -mcpu=cortex-a73 or
+   -mtune=cortex-m33.
  

 


Re: [PATCH 2/4] HSA specific built-ins

2016-11-18 Thread Jakub Jelinek
On Sun, Nov 13, 2016 at 08:39:35PM +0100, Martin Jambor wrote:
> Hello,
> 
> this patch adds a small file hsa-builtins.def which defines a few
> builtins that I then use in OpenMP lowering and expansion.
> 
> After we split gridification stuff in omp-low.c to a separate file, we
> should be able to only conditionally include the file and remove the
> weird conditional ifdef.
> 
> OK for trunk?

Does this work well even with lto and jit FEs?  Ok for trunk if it does.

> 2016-11-11  Martin Jambor  
> 
> gcc/
>   * hsa-builtins.def: New file.
>   * Makefile.in (BUILTINS_DEF): Add hsa-builtins.def dependency.
>   * builtins.def: Include hsa-builtins.def.
>   (DEF_HSA_BUILTIN): New macro.
> 
> fortran/
>   * f95-lang.c (DEF_HSA_BUILTIN): New macro.

Jakub


Re: [PATCH 3/4] OpenMP lowering changes from the hsa branch

2016-11-18 Thread Jakub Jelinek
On Sun, Nov 13, 2016 at 10:42:01PM +0100, Martin Jambor wrote:
> +  size_t collapse = gimple_omp_for_collapse (for_stmt);
> +  struct omp_for_data_loop *loops
> += (struct omp_for_data_loop *)
> +alloca (gimple_omp_for_collapse (for_stmt)
> + * sizeof (struct omp_for_data_loop));

Use
  struct omp_for_data_loop *loops
= XALLOCAVEC (struct omp_for_data_loop,
  gimple_omp_for_collapse (for_stmt));
instead?

> @@ -14133,7 +14183,7 @@ const pass_data pass_data_expand_omp =
>  {
>GIMPLE_PASS, /* type */
>"ompexp", /* name */
> -  OPTGROUP_NONE, /* optinfo_flags */
> +  OPTGROUP_OPENMP, /* optinfo_flags */
>TV_NONE, /* tv_id */
>PROP_gimple_any, /* properties_required */
>PROP_gimple_eomp, /* properties_provided */

What about the simdclone, omptargetlink, diagnose_omp_blocks passes?  What about
openacc specific passes (oaccdevlow)?  And Alex is hopefully going to add
ompdevlow pass soon.

Otherwise LGTM.

Jakub


Re: [fixincludes] Fix macOS 10.12 and (PR sanitizer/78267)

2016-11-18 Thread Rainer Orth
Hi Bruce, Mike,

> On Fri, Nov 11, 2016 at 3:18 AM, Mike Stump  wrote:
>>
>> No objections from me.
>>
> Or me.  Thanks!

as discussed at length in the PR, the fixincludes route alone isn't
enough to get libsanitizer to build on Darwin 15: we stay with undefined
references to a function which had to fixinclude'd out due to use of the
Blocks extension.  On closer inspection, it turns out that the Darwin 16
 effectively disables os_trace unless using clang >= 7.3.
So there's no point in jumping through hoops to achieve the same with a
large effort.

OTOH, the fixes in my fixincludes patch fix real incompatibilities
between Darwin headers an GCC, so we agreed to keep them even if
libsanitizer would build without.

So the current suggestion is to combine my fixincludes patch and Jack's
patch to disable  use if !__BLOCKS__.

Bootstrapped without regressions on Darwin 16 by myself, and on 15 and
14 by Jack.

I guess this is ok for mainline now to restore bootstrap?

It may be that we add some more fixincludes fixes later, even if not
required for libsanitizer.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2016-11-10  Rainer Orth  

PR sanitizer/78267
* inclhack.def (darwin_availabilityinternal, darwin_os_trace_1)
(darwin_os_trace_2, darwin_os_trace_3): New fixes.
(hpux_stdint_least_fast): Remove spurious _EOFix_.
* fixincl.x: Regenerate.
* tests/bases/AvailabilityInternal.h: New file.
* tests/bases/os/trace.h: New file.

2016-11-14  Jack Howarth  

PR sanitizer/78267
* sanitizer_common/sanitizer_mac.cc: Include  only if
compiler supports blocks extension.

# HG changeset patch
# Parent  14eb543b5cb5e7a2bdea7430e3bf3b771bf0e54f
Fix macOS 10.12  and  (PR sanitizer/78267)

diff --git a/fixincludes/inclhack.def b/fixincludes/inclhack.def
--- a/fixincludes/inclhack.def
+++ b/fixincludes/inclhack.def
@@ -1338,6 +1338,32 @@ fix = {
 };
 
 /*
+ *  macOS 10.12  uses __attribute__((availability))
+ *  unconditionally.
+ */
+fix = {
+hackname  = darwin_availabilityinternal;
+mach  = "*-*-darwin*";
+files = AvailabilityInternal.h;
+select= "#define[ \t]+(__API_[ADU]\\([^)]*\\)).*";
+c_fix = format;
+c_fix_arg = <<- _EOFix_
+	#if defined(__has_attribute)
+	  #if __has_attribute(availability)
+	%0
+	  #else
+	#define %1
+	  #endif
+	#else
+	#define %1
+	#endif
+	_EOFix_;
+
+test_text = "#define __API_A(x) __attribute__((availability(__API_AVAILABLE_PLATFORM_##x)))\n"
+		"#define __API_D(msg,x) __attribute__((availability(__API_DEPRECATED_PLATFORM_##x,message=msg)))";
+};
+
+/*
  *  For the AAB_darwin7_9_long_double_funcs fix to be useful,
  *  you have to not use "" includes.
  */
@@ -1410,6 +1436,62 @@ fix = {
 };
 
 /*
+ *  Mac OS X 10.11  uses attribute on function definition.
+ */
+fix = {
+  hackname  = darwin_os_trace_1;
+  mach  = "*-*-darwin*";
+  files = os/trace.h;
+  select= "^(_os_trace_verify_printf.*) (__attribute__.*)";
+  c_fix = format;
+  c_fix_arg = "%1";
+  test_text = "_os_trace_verify_printf(const char *msg, ...) __attribute__((format(printf, 1, 2)))";
+};
+
+/*
+ *  Mac OS X 10.1[012]  os_trace_payload_t typedef uses Blocks
+ *  extension without guard.
+ */
+fix = {
+  hackname  = darwin_os_trace_2;
+  mach  = "*-*-darwin*";
+  files = os/trace.h;
+  select= "typedef.*\\^os_trace_payload_t.*";
+  c_fix = format;
+  c_fix_arg = "#if __BLOCKS__\n%0\n#endif";
+  test_text = "typedef void (^os_trace_payload_t)(xpc_object_t xdict);";
+};
+
+/*
+ *  In Mac OS X 10.1[012] , need to guard users of
+ *  os_trace_payload_t typedef, too.
+ */
+fix = {
+  hackname  = darwin_os_trace_3;
+  mach  = "*-*-darwin*";
+  files = os/trace.h;
+  select= <<- _EOSelect_
+	__(API|OSX)_.*
+	OS_EXPORT.*
+	.*
+	_os_trace.*os_trace_payload_t payload);
+	_EOSelect_;
+  c_fix = format;
+  c_fix_arg = "#if __BLOCKS__\n%0\n#endif";
+  test_text = <<- _EOText_
+	__API_AVAILABLE(macosx(10.10), ios(8.0), watchos(2.0), tvos(8.0))
+	OS_EXPORT OS_NOTHROW OS_NOT_TAIL_CALLED
+	void
+	_os_trace_with_buffer(void *dso, const char *message, uint8_t type, const void *buffer, size_t buffer_size, os_trace_payload_t payload);
+
+	__OSX_AVAILABLE_STARTING(__MAC_10_12, __IPHONE_10_0)
+	OS_EXPORT OS_NOTHROW
+	void
+	_os_trace_internal(void *dso, uint8_t type, const char *format, const uint8_t *buf, size_t buf_size, os_trace_payload_t payload);
+	_EOText_;
+};
+
+/*
  *  __private_extern__ doesn't exist in FSF GCC.  Even if it did,
  *  why would you ever put it in a system header file?
  */
@@ -2638,7 +2720,6 @@ fix = {
 c-fix-arg = "#  define	UINT_%164_MAX	__UINT64_MAX__";
 test-text = "#  define   UINT_FAST64_MAXULLONG_MAX\n"
 		"#  define   UINT_LEAST64_MAXULLONG_MAX\n";
-	_EOFix_;
 };
 
 /*
diff --git a/fixincludes/tests/base/A

Re: Add a load_extend_op wrapper

2016-11-18 Thread Eric Botcazou
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> 
> Thanks,
> Richard
> 
> 
> gcc/
>   * rtlanal.c (load_extend_op): Move to...
>   * rtl.h: ...here and make inline.

OK, thanks.

-- 
Eric Botcazou


Re: [Patch 1/5] OpenACC tile clause support, OMP_CLAUSE_TILE adjustments

2016-11-18 Thread Jakub Jelinek
On Thu, Nov 17, 2016 at 05:27:14PM +0800, Chung-Lin Tang wrote:
> I've updated the patch as you suggested.
> Here's the v2 of the first patch, mainly gimplify.c adjusted.
> 
> About the ChangeLog issues, I'll make sure the final committed diffs will 
> solve them.

Ok.

Jakub


Re: [fixincludes, v3] Don't define libstdc++-internal macros in Solaris 10+

2016-11-18 Thread Rainer Orth
Hi Jonathan,

> On 03/11/16 15:11 +0100, Rainer Orth wrote:
>>Fortunately, this is all easily fixed by wrapping the affected templates
>>in a new macro.  That's what this patch does.  The new libstdc++
>>acinclude.m4 test may well need wording changes in comments etc. and can
>>perhaps be shortened a bit, bit it worked for me.
>
> See below.

thanks for the thorough review.

> The change is OK in principle, but I'd prefer more meaningful names
> for the macros ...

Fine with me: I know close to nothing of C++, so please bear with me ;-)

>>--- a/libstdc++-v3/acinclude.m4
>>+++ b/libstdc++-v3/acinclude.m4
>>@@ -2181,7 +2181,8 @@ AC_DEFUN([GLIBCXX_CHECK_STDIO_PROTO], [
>> ])
>>
>> dnl
>>-dnl Check whether required C++11 overloads are present in .
>>+dnl Check whether required C++11 overloads and templates are present
>>+dnl in .
>
> The standard doesn't actually require templates to be used here, it
> only requires "sufficient additional overloads" so that integral
> arguments work. We happen to do that with templates, and apparently so
> do the Solaris headers, but other implementations are possible. So a
> more accurate description would be:
>
> dnl Check whether required C++11 overloads for FP and integral types
> dnl are present in .
>
> And rather than PROTO1 and PROTO2 the macros could be called PROTO_FP
> for the overloads for FP types, and PROTO_INT (or just PROTO_I) for
> the overloads for integral types (which happen to be templates in our
> header).

I went for the longer/more expressive names.

> The suggestions above would make it shorter, while still remaining
> accurate to what the real code in the header does. It could be reduced
> even further without altering the meaning for the purposes of this
> test:
>
>>+namespace std {
>>+  template
>>+constexpr typename __gnu_cxx::__enable_if
>>+  <__is_integer<_Tp>::__value, double>::__type
>>+log2(_Tp __x)
>>+{ return __builtin_log2(__x); }
>
>template
>  struct __enable_if_int;
>
>template<>
>  struct __enable_if_int
>  { typedef double __type; };
>
>template
>  constexpr typename __enable_if_int<_Tp>::__type
>  log2(_Tp __x)
>  { return __builtin_log2(__x); }
>
> I don't mind whether you go with this or just remove the unnecessary
> structs, typedef and primary template bodies. Either is OK.

I went for the latter to keep the testcase close to the original.

Please find attached the revised version of the patch.  I hope I've
incorporated all of your comments.

@Bruce: To avoid the hpux11_fabsf test breakage from my original
fixincludes part, Dave suggested in private mail to replace the

  bypass = '__cplusplus'

by

  mach   = '*-hp-hpux11*'

so the fix only applies to the systems that need them and doesn't
get interference from other testcases.  So I've got clean make check
results in fixincludes.

The patch has again been bootstrapped without regressions on
i386-pc-solaris2.12, sparc-sun-solaris2.12, i386-pc-solaris2.11 (both
with only the fp overloads and nothing at all), and
x86_64-pc-linux-gnu.

I plan to repeat that testing on the gcc-6 and gcc-5 branches;
differences between the branches were minimal for the previons versions
(like regenerating fixincl.x or an adjacent comment difference in
c_global/cmath).

Ok for mainline now and the backports after some soak time?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2016-10-27  Rainer Orth  
John David Anglin  

libstdc++-v3:
* acinclude.m4 (GLIBCXX_CHECK_MATH11_PROTO): Update comments.
(__CORRECT_ISO_CPP11_MATH_H_PROTO): Rename to ...
(__CORRECT_ISO_CPP11_MATH_H_PROTO_FP): ... this.
Add test for C++11  templates.
* configure: Regenerate.
* config.h.in: Regenerate.

* include/c_global/cmath [__cplusplus >= 201103L]: Reflect
__CORRECT_ISO_CPP11_MATH_H_PROTO to
__CORRECT_ISO_CPP11_MATH_H_PROTO_FP rename.
* include/c_global/cmath [_GLIBCXX_USE_C99_MATH &&
!_GLIBCXX_USE_C99_FP_MACROS_DYNAMIC && __cplusplus >= 201103L]
(std::fpclassify): Wrap in !__CORRECT_ISO_CPP11_MATH_H_PROTO_INT.
(std::isfinite): Likewise.
(std::isinf): Likewise.
(std::isnan): Likewise.
(std::isnormal): Likewise.
(std::signbit): Likewise.
(std::isgreater): Likewise.
(std::isgreaterequal): Likewise.
(std::isless): Likewise.
(std::islessequal): Likewise.
(std::islessgreater): Likewise.
(std::isunordered): Likewise.
[__cplusplus >= 201103L && _GLIBCXX_USE_C99_MATH_TR1]
(std::acosh): Likewise.
(std::asinh): Likewise.
(std::atanh): Likewise.
(std::cbrt): Likewise.
(std::copysign): Likewise.
 

Re: [PATCH v3] bb-reorder: Improve compgotos pass (PR71785)

2016-11-18 Thread Segher Boessenkool
On Fri, Nov 18, 2016 at 03:07:27AM -0600, Segher Boessenkool wrote:
> rtl_merge_blocks does not check rtl_can_merge_blocks_p (and you get a
> quite spectacular ICE when you get it wrong: everything between the two
> blocks is deleted :-) ).  I'll make a separate patch that checks it
> (layout mode already does).

This fails in cfgcleanup.c (in merge_blocks_move_successor_nojumps),
on the "a->next_bb == b" test.  Oh well.


Segher


Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-18 Thread Iain Sandoe

> On 17 Nov 2016, at 21:19, Jeff Law  wrote:
> 
> On 11/08/2016 05:49 PM, Iain Sandoe wrote:
>> 
>>> On 8 Nov 2016, at 13:39, Mike Stump  wrote:
>>> 
>>> On Nov 8, 2016, at 1:05 PM, Iain Sandoe 
>>> wrote:
 
 Simple for the simple case is already part of my patch, but
 capability for the expert and non-simple case is also present,
>>> 
>>> I'm trying to ask a specific question, what does the patch allow
>>> that can't be done otherwise?  Kinda a trivial question, and I
>>> don't see any answer.
>>> 
>>> I'm not looking for, it allows it to work, or it makes the expert
>>> case work.  I'm look for the specific question, and the specific
>>> information you want, and why ld -v doesn't get it.
>> 
>> ld -v gets it when you can execute ld. It doesn’t get it when the
>> $host ld is not executable on $build.
> Exactly!
> 
>> Providing the option to give the version allows that without
>> requiring the complexity of other (possibly valid) solutions.  If you
>> know that you’re building (my patched) ld64-253.9 for powerpc-darwin9
>> (crossed from x86-64-darwin14) it’s easy, just put —with-ld64=253.9
>> ..
>> 
>> I think we’ve debated this enough - I’m OK with keeping my extra
>> facility locally and will resubmit the patch with it removed in due
>> course, Iain
> Your call. But ISTM the ability to specify the linker version or even better, 
> its behaviour is a notable improvement for these crosses.

Thanks, at least I’m not going completely crazy ;-)

However, I’d already revised the patch to take the approach Mike wanted, and 
thus didn’t add the documentation that Joseph noted was missing.  It takes 
quite a while to re-test this across the Darwin patch, so I’m going to put 
forward the amended patch (as Mike wanted it) and we can discuss ways to deal 
with native and Canadian crosses later.

FWIW, it’s possible (but rather kludgy) with the attached patch to override 
gcc_gc_ld64_version on the make line to build a native/Canadian cross with a 
“non-standard” ld64 version (I did at least build a native X powerpc-darwin9 on 
x86_64-darwin14 which was not a brick).

OK now for trunk?
open branches?
Iain

gcc/
* configure.ac (with-ld64): New var, set for Darwin, set on
detection of ld64, gcc_cv_ld64_export_dynamic: New, New test.
* darwin.h: Use LD64_HAS_DYNAMIC export. DEF_LD64: New, define.
* darwin10.h(DEF_LD64): Update for this target version.
* darwin12.h(LINK_GCC_C_SEQUENCE_SPEC): Remove rdynamic test.
(DEF_LD64): Update for this target version.




PR71767-part2-revised.patch
Description: Binary data


Re: [Patch 2/5] OpenACC tile clause support, omp-low parts

2016-11-18 Thread Jakub Jelinek
Hi!

On Thu, Nov 17, 2016 at 05:31:40PM +0800, Chung-Lin Tang wrote:
> +#ifndef ACCEL_COMPILER
> +  span = integer_one_node;
> +#else
> +  if (!e_mask)
> +/* Not paritioning.  */
> +span = integer_one_node;
...
This goes against the recent trend of avoiding #if/#ifdef guarded blocks
of code as much as possible, the ACCEL_COMPILER only hunk is significant
and will usually not be enabled, so people will not notice breakages in it
until building an accel compiler.
What about
#ifndef ACCEL_COMPILER
  if (true)
span = integer_one_node;
  else
#endif
  if (!e_mask)
/* Not paritioning.  */
span_integer_one_node;
  else if
...
or what I've proposed earlier:
#ifndef ACCEL_COMPILER  
   
  e_mask = 0;   
   
#endif  
   
  if (!e_mask)  
   
...

Ok with that fixed.

Jakub


Re: [Patch 3/5] OpenACC tile clause support, C/C++ front-end parts

2016-11-18 Thread Jakub Jelinek
On Thu, Nov 17, 2016 at 05:34:34PM +0800, Chung-Lin Tang wrote:
> Updated C/C++ front-end patches, adjusted as reviewed.

Jason is right, finish_omp_clauses will verify the tile operands
when !processing_template_decl are non-negative host INTEGER_CSTs,
so can't you just tsubst it like OMP_CLAUSE_COLLAPSE?  If the operand
is not a constant expression, presumably it will not be INTEGER_CST.
On the other side, OMP_CLAUSE_TILE has now 3 operands instead of just 1,
don't you need to do something during instantiation for the other 2
operands?

Jakub


Re: [Patch 4/5] OpenACC tile clause support, Fortran front-end parts

2016-11-18 Thread Jakub Jelinek
On Sat, Nov 12, 2016 at 08:51:00AM -0800, Cesar Philippidis wrote:
> On 11/11/2016 02:34 AM, Jakub Jelinek wrote:
> > On Thu, Nov 10, 2016 at 06:46:46PM +0800, Chung-Lin Tang wrote:
> 
> And here's the patch.

The patch doesn't look like OpenACC tile clause fortran support,
but bind/nohost clause C/C++ support.

Jakub


Re: [PATCH, ARM] Enable ldrd/strd peephole rules unconditionally

2016-11-18 Thread Christophe Lyon
On 17 November 2016 at 10:23, Kyrill Tkachov
 wrote:
>
> On 09/11/16 12:58, Bernd Edlinger wrote:
>>
>> Hi!
>>
>>
>> This patch enables the ldrd/strd peephole rules unconditionally.
>>
>> It is meant to fix cases, where the patch to reduce the sha512
>> stack usage splits ldrd/strd instructions into separate ldr/str insns,
>> but is technically independent from the other patch:
>>
>> See https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00523.html
>>
>> It was necessary to change check_effective_target_arm_prefer_ldrd_strd
>> to retain the true prefer_ldrd_strd tuning flag.
>>
>>
>> Bootstrapped and reg-tested on arm-linux-gnueabihf.
>> Is it OK for trunk?
>
>
> This is ok.
> Thanks,
> Kyrill
>

Hi Bernd,

Since you committed this patch (r242549), I'm seeing the new test
failing on some arm*-linux-gnueabihf configurations:

FAIL:  gcc.target/arm/pr53447-5.c scan-assembler-times ldrd 10
FAIL:  gcc.target/arm/pr53447-5.c scan-assembler-times strd 9

See 
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/242549/report-build-info.html
for a map of failures.

Am I missing something?

Thanks,

Christophe


>> Thanks
>> Bernd.
>
>


RE: [PATCH] MIPS: If a test in the MIPS testsuite requires standard library support check the sysroot supports the required test options.

2016-11-18 Thread Toma Tabacu
> From: Moore, Catherine [mailto:catherine_mo...@mentor.com]
> Sent: 17 November 2016 21:53
> To: Matthew Fortune; Toma Tabacu; Andrew Bennett; 'gcc-
> patc...@gcc.gnu.org'
> Cc: Moore, Catherine
> Subject: RE: [PATCH] MIPS: If a test in the MIPS testsuite requires standard
> library support check the sysroot supports the required test options.
> 
> 
> 
> > -Original Message-
> > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> > ow...@gcc.gnu.org] On Behalf Of Matthew Fortune
> > Sent: Thursday, November 17, 2016 8:47 AM
> > To: Toma Tabacu ; Andrew Bennett
> > ; Moore, Catherine
> > ; 'gcc-patches@gcc.gnu.org'  > patc...@gcc.gnu.org>
> > Subject: RE: [PATCH] MIPS: If a test in the MIPS testsuite requires
> > standard library support check the sysroot supports the required test
> > options.
> >
> > Toma Tabacu  writes:
> > > Hi,
> > >
> > > The patch below is a rebased version of Andrew's patch plus a few
> > more changes
> > > to test options.
> > >
> > > I have tested Andrew's patch by passing -msoft-float to the testsuite
> > without
> > > having a soft-float library available, and saw that the inline-memcpy-
> > {1,2,3,4,5}.c
> > > and memcpy-1.c tests were also failing to find standard library
> > headers.
> > > In the version below, I have added (REQUIRES_STDLIB) to them as
> > well.
> > >
> > > However, I believe that the memcpy-1.c test was removed from the
> > first version
> > > of Andrew's patch in response to Matthew's comments, but I don't
> > think it
> > > should be.
> > >
> > > Tested with mips-img-linux-gnu and mips-mti-linux-gnu.
> >
> > This looks OK to me but I then again I helped with the design for this.
> >
> > I'd like to give Catherine a chance to take a look though as the feature
> > is unusual.
> >
> > One comment below.
> >
> > > diff --git a/gcc/testsuite/gcc.target/mips/mips.exp
> > > b/gcc/testsuite/gcc.target/mips/mips.exp
> > > index e22d782..ccd4ecb 100644
> > > --- a/gcc/testsuite/gcc.target/mips/mips.exp
> > > +++ b/gcc/testsuite/gcc.target/mips/mips.exp
> > > @@ -1420,6 +1426,22 @@ proc mips-dg-options { args } {
> > >   }
> > >  }
> > >
> > > +# If the test is marked as requiring standard libraries check
> > > +# that the sysroot has support for the current set of test options.
> > > +if { [mips_have_test_option_p options "REQUIRES_STDLIB"] } {
> > > + mips_push_test_options saved_options $extra_tool_flags
> > > + set output [mips_preprocess "" {
> > > +   #include 
> > > + } 1]
> > > + mips_pop_test_options saved_options
> > > +
> > > + # If the preprocessing of the stdlib.h file produced errors mark
> > > + # the test as unsupported.
> > > + if { ![string equal $output ""] } {
> > > + set do_what [lreplace $do_what 1 1 "N"]
> >
> > We should describe what we expect the format of do_what to be at
> > the time
> > of writing in case it ever changes. i.e. a comment to say what the
> > second
> > character means and therefore make it clear that setting it to 'n' makes
> > the test unsupported.
> >
> 
> This patch looks okay to me after updating the comment as Matthew suggested.

The version below has a more detailed comment about marking tests as 
unsupported.
Matthew, does it look good to you ?

Also, should we document our expectations for the rest of do_what's format 
(elements 0 and 2) ?

Regards,
Toma Tabacu

gcc/testsuite/ChangeLog:

* gcc.target/mips/inline-memcpy-1.c (dg-options): Add
(REQUIRES_STDLIB).
* gcc.target/mips/inline-memcpy-2.c: Ditto.
* gcc.target/mips/inline-memcpy-3.c: Ditto.
* gcc.target/mips/inline-memcpy-4.c: Ditto.
* gcc.target/mips/inline-memcpy-5.c: Ditto.
* gcc.target/mips/loongson-shift-count-truncated-1.c: Ditto.
* gcc.target/mips/loongson-simd.c: Ditto.
* gcc.target/mips/memcpy-1.c: Ditto.
* gcc.target/mips/mips-3d-1.c: Ditto.
* gcc.target/mips/mips-3d-2.c: Ditto.
* gcc.target/mips/mips-3d-3.c: Ditto.
* gcc.target/mips/mips-3d-4.c: Ditto.
* gcc.target/mips/mips-3d-5.c: Ditto.
* gcc.target/mips/mips-3d-6.c: Ditto.
* gcc.target/mips/mips-3d-7.c: Ditto.
* gcc.target/mips/mips-3d-8.c: Ditto.
* gcc.target/mips/mips-3d-9.c: Ditto.
* gcc.target/mips/mips-ps-1.c: Ditto.
* gcc.target/mips/mips-ps-2.c: Ditto.
* gcc.target/mips/mips-ps-3.c: Ditto.
* gcc.target/mips/mips-ps-4.c: Ditto.
* gcc.target/mips/mips-ps-6.c: Ditto.
* gcc.target/mips/mips16-attributes.c: Ditto.
* gcc.target/mips/mips32-dsp-run.c: Ditto.
* gcc.target/mips/mips32-dsp.c: Ditto.
* gcc.target/mips/save-restore-1.c: Ditto.
* gcc.target/mips/mips.exp (mips_option_groups): Add stdlib.
(mips_preprocess): Add ignore_output argument that when set
will not return the pre-processed output.
(mips_arch_info): Update arguments for the call to
mips_preprocess.
(mips-dg-init): Ditto.
(m

Re: [PATCH 1/2] PR77822

2016-11-18 Thread Dominik Vogt
On Fri, Nov 18, 2016 at 03:31:40AM -0600, Segher Boessenkool wrote:
> Hi Dominik,
> 
> On Thu, Nov 17, 2016 at 04:53:47PM +0100, Dominik Vogt wrote:
> > +/* A convenience macro to determine whether a SIZE lies inclusively
> > +   within [1, RANGE], POS lies inclusively within between
> > +   [0, RANGE - 1] and the sum lies inclusively within [1, RANGE].  */
> > +#define SIZE_POS_IN_RANGE(SIZE, POS, RANGE) \
> > +  (IN_RANGE (POS, 0, (RANGE) - 1) \
> > +   && IN_RANGE (SIZE, 1, RANGE) \
> > +   && IN_RANGE ((SIZE) + (POS), 1, RANGE))
> 
> You should put parens around every use of SIZE, POS, RANGE -- there could
> be a comma operator in the macro argument.

Good point.  That wouldn't matter for a backend macro, but in
system.h it does.

> You don't check if SIZE+POS overflows / wraps around.  If you really don't
> care about that, you can lose the
> 
> > +   && IN_RANGE (SIZE, 1, RANGE) \
> 
> part as well?

The macro is _intended_ for checking the (possibly bogus)
arguments of zero_extract and sing_extract that combine may
generate.  SIZE is some unsigned value, but POS might be unsigned
or signed, and actually have a negative value (INTVAL
(operands[x]) or UINTVAL (operands[x]))), and RANGE is typically
64 or another small value.

Anyway, the macro should work for any inputs (except RANGE <= 0).

How about this:

--
/* A convenience macro to determine whether a SIZE lies inclusively
   within [1, UPPER], POS lies inclusively within between
   [0, UPPER - 1] and the sum lies inclusively within [1, UPPER].
   UPPER must be >= 1.  */
#define SIZE_POS_IN_RANGE(SIZE, POS, UPPER) \
  (IN_RANGE ((POS), 0, (unsigned HOST_WIDE_INT) (UPPER) - 1) \
   && IN_RANGE ((SIZE), 1, (unsigned HOST_WIDE_INT) (UPPER) \
   - (unsigned HOST_WIDE_INT)(POS)))
--

IN_RANGE(POS...) makes sure that POS is a non-negative number
smaller than UPPER, so (UPPER) - (POS) does not wrap.  Or is there
some case that the new macro does not handle?

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH] OpenACC routines -- middle end

2016-11-18 Thread Jakub Jelinek
On Fri, Nov 11, 2016 at 03:43:02PM -0800, Cesar Philippidis wrote:
> + error_at (OMP_CLAUSE_LOCATION (c),
> +   "%qs specifies a conflicting level of parallelism",
> +   omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
> + inform (OMP_CLAUSE_LOCATION (c_level),
> + "... to the previous %qs clause here",

I think the '... ' part is unnecessary.
Perhaps word it better like we word errors/warnings for mismatched
attributes etc.?

> +incompatible:
> +  if (c_diag != NULL_TREE)
> + error_at (OMP_CLAUSE_LOCATION (c_diag),
> +   "incompatible %qs clause when applying"
> +   " %<%s%> to %qD, which has already been"
> +   " marked as an accelerator routine",
> +   omp_clause_code_name[OMP_CLAUSE_CODE (c_diag)],
> +   routine_str, fndecl);
> +  else if (c_diag_p != NULL_TREE)
> + error_at (loc,
> +   "missing %qs clause when applying"
> +   " %<%s%> to %qD, which has already been"
> +   " marked as an accelerator routine",
> +   omp_clause_code_name[OMP_CLAUSE_CODE (c_diag_p)],
> +   routine_str, fndecl);
> +  else
> + gcc_unreachable ();
> +  if (c_diag_p != NULL_TREE)
> + inform (OMP_CLAUSE_LOCATION (c_diag_p),
> + "... with %qs clause here",
> + omp_clause_code_name[OMP_CLAUSE_CODE (c_diag_p)]);

Again, I think this usually would be something like "previous %qs clause"
or similar in the inform.  Generally, I think the error message should
be self-contained and infom should be just extra information, rather than
error containing first half of the diagnostic message and inform the second
one.  E.g. for translations, while such a sentence crossing the two
diagnostic routines might make sense in english, it might look terrible in
other languages.

> +  else
> + {
> +   /* In the front ends, we don't preserve location information for the
> +  OpenACC routine directive itself.  However, that of c_level_p
> +  should be close.  */
> +   location_t loc_routine = OMP_CLAUSE_LOCATION (c_level_p);
> +   inform (loc_routine, "... without %qs clause near to here",
> +   omp_clause_code_name[OMP_CLAUSE_CODE (c_diag)]);
> + }
> +  /* Incompatible.  */
> +  return -1;
> +}
> +
> +  return 0;

Jakub


Re: [PATCH] OpenACC routines -- c front end

2016-11-18 Thread Jakub Jelinek
On Fri, Nov 11, 2016 at 03:43:23PM -0800, Cesar Philippidis wrote:
> @@ -11801,12 +11807,11 @@ c_parser_oacc_shape_clause (c_parser *parser, 
> omp_clause_code kind,
>   }
>  
> location_t expr_loc = c_parser_peek_token (parser)->location;
> -   c_expr cexpr = c_parser_expr_no_commas (parser, NULL);
> -   cexpr = convert_lvalue_to_rvalue (expr_loc, cexpr, false, true);
> -   tree expr = cexpr.value;
> +   tree expr = c_parser_expr_no_commas (parser, NULL).value;
> if (expr == error_mark_node)
>   goto cleanup_error;
>  
> +   mark_exp_read (expr);
> expr = c_fully_fold (expr, false, NULL);
>  
> /* Attempt to statically determine when the number isn't a

Why?  Are the arguments of the clauses lvalues?

> @@ -11867,12 +11872,12 @@ c_parser_oacc_shape_clause (c_parser *parser, 
> omp_clause_code kind,
> seq */
>  
>  static tree
> -c_parser_oacc_simple_clause (c_parser *parser, enum omp_clause_code code,
> -  tree list)
> +c_parser_oacc_simple_clause (c_parser * /* parser */, location_t loc,

Just leave it as c_parser *, or better yet remove the argument if you don't
need it.

> +  else
> + {
> +   //TODO? TREE_USED (decl) = 1;

This would be /* FIXME: TREE_USED (decl) = 1;  */
but wouldn't it be better to figure out if you want to do that or not?

Jakub


Re: Ping: Re: [PATCH 1/2] gcc: Remove unneeded global flag.

2016-11-18 Thread Christophe Lyon
On 16 November 2016 at 23:12, Andrew Burgess
 wrote:
> * Mike Stump  [2016-11-16 12:59:53 -0800]:
>
>> On Nov 16, 2016, at 12:09 PM, Andrew Burgess  
>> wrote:
>> > My only remaining concern is the new tests, I've tried to restrict
>> > them to targets that I suspect they'll pass on with:
>> >
>> >/* { dg-final-use { scan-assembler "\.section\[\t 
>> > \]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold\.0" { target 
>> > *-*-linux* *-*-gnu* } } } */
>> >
>> > but I'm still nervous that I'm going to introduce test failures.  Is
>> > there any advice / guidance I should follow before I commit, or are
>> > folk pretty relaxed so long as I've made a reasonable effort?
>>
>> So, if you are worried about the way the line is constructed, I usually test 
>> it by misspelling the *-*-linux* *-*-gnu* part as *-*-linNOTux* *-*-gnNOTu* 
>> and see if the test then doesn't run on your machine.  If it doesn't then 
>> you can be pretty confident that only machines that match the target triplet 
>> can be impacted.  I usually do this type of testing by running the test case 
>> in isolation (not the full tests suite).  Anyway, do the best you can, and 
>> don't worry about t it too much, learn from the experience, even if it goes 
>> wrong in some way.  If it did go wrong, just be responsive (don't check it 
>> in just before a 6 week vacation) about fixing it, if you can.
>>
>
> Thanks for the feedback.
>
> Change committed as revision 242519.  If anyone sees any issues just
> let me know.
>

Hi Andrew,

Sorry for the delay, there are so many commits these days, with so
many regression
reports to check manually before reporting here

So, your new test fails on arm* targets:

  gcc.dg/tree-prof/section-attr-1.c scan-assembler .section[\t
]*.text.unlikely[\\n\\r]+[\t ]*.size[\t ]*foo.cold.0
  gcc.dg/tree-prof/section-attr-2.c scan-assembler .section[\t
]*.text.unlikely[\\n\\r]+[\t ]*.size[\t ]*foo.cold.0
  gcc.dg/tree-prof/section-attr-3.c scan-assembler .section[\t
]*.text.unlikely[\\n\\r]+[\t ]*.size[\t ]*foo.cold.0

Christophe



> Thanks,
> Andrew


Re: [PATCH] OpenACC routines -- fortran front end

2016-11-18 Thread Jakub Jelinek
On Fri, Nov 11, 2016 at 03:44:07PM -0800, Cesar Philippidis wrote:
> --- a/gcc/fortran/gfortran.h
> +++ b/gcc/fortran/gfortran.h
> @@ -314,6 +314,15 @@ enum save_state
>  { SAVE_NONE = 0, SAVE_EXPLICIT, SAVE_IMPLICIT
>  };
>  
> +/* Flags to keep track of ACC routine states.  */
> +enum oacc_function
> +{ OACC_FUNCTION_NONE = 0,

Please add a newline after {.

>if (clauses)
>  {
>unsigned mask = 0;
>  
>if (clauses->gang)
> - level = GOMP_DIM_GANG, mask |= GOMP_DIM_MASK (level);
> + {
> +   level = GOMP_DIM_GANG, mask |= GOMP_DIM_MASK (level);
> +   ret = OACC_FUNCTION_GANG;
> + }
>if (clauses->worker)
> - level = GOMP_DIM_WORKER, mask |= GOMP_DIM_MASK (level);
> + {
> +   level = GOMP_DIM_WORKER, mask |= GOMP_DIM_MASK (level);
> +   ret = OACC_FUNCTION_WORKER;
> + }
>if (clauses->vector)
> - level = GOMP_DIM_VECTOR, mask |= GOMP_DIM_MASK (level);
> + {
> +   level = GOMP_DIM_VECTOR, mask |= GOMP_DIM_MASK (level);
> +   ret = OACC_FUNCTION_VECTOR;
> + }

As you have {}s around, please use
level = GOMP_DIM_*;
mask |= GOMP_DIM_MASK (level);
ret = OACC_FUNCTION_*;

>if (clauses->seq)
>   level = GOMP_DIM_MAX, mask |= GOMP_DIM_MASK (level);
>  
>if (mask != (mask & -mask))
> - gfc_error ("Multiple loop axes specified for routine");
> + ret = OACC_FUNCTION_NONE;
>  }
>  
> -  if (level < 0)
> -level = GOMP_DIM_MAX;
> -
> -  return level;
> +  return ret;
>  }
>  
>  match
>  gfc_match_oacc_routine (void)
>  {
>locus old_loc;
> -  gfc_symbol *sym = NULL;
>match m;
> +  gfc_intrinsic_sym *isym = NULL;
> +  gfc_symbol *sym = NULL;
>gfc_omp_clauses *c = NULL;
>gfc_oacc_routine_name *n = NULL;
> +  oacc_function dims = OACC_FUNCTION_NONE;
> +  bool seen_error = false;
>  
>old_loc = gfc_current_locus;
>  
> @@ -2287,45 +2314,52 @@ gfc_match_oacc_routine (void)
>if (m == MATCH_YES)
>  {
>char buffer[GFC_MAX_SYMBOL_LEN + 1];
> -  gfc_symtree *st;
> +  gfc_symtree *st = NULL;
>  
>m = gfc_match_name (buffer);
>if (m == MATCH_YES)
>   {
> -   st = gfc_find_symtree (gfc_current_ns->sym_root, buffer);
> +   if ((isym = gfc_find_function (buffer)) == NULL
> +   && (isym = gfc_find_subroutine (buffer)) == NULL)
> + {
> +   st = gfc_find_symtree (gfc_current_ns->sym_root, buffer);
> +   if (st == NULL && gfc_current_ns->proc_name->attr.contained

Please add a newline before &&.

> +   && gfc_current_ns->parent)
> + st = gfc_find_symtree (gfc_current_ns->parent->sym_root,
> +buffer);
> + }

> @@ -5934,6 +6033,21 @@ gfc_resolve_oacc_blocks (gfc_code *code, gfc_namespace 
> *ns)
>ctx.private_iterators = new hash_set;
>ctx.previous = omp_current_ctx;
>ctx.is_openmp = false;
> +
> +  if (code->ext.omp_clauses->gang)
> +dims = OACC_FUNCTION_GANG;
> +  if (code->ext.omp_clauses->worker)
> +dims = OACC_FUNCTION_WORKER;
> +  if (code->ext.omp_clauses->vector)
> +dims = OACC_FUNCTION_VECTOR;
> +  if (code->ext.omp_clauses->seq)
> +dims = OACC_FUNCTION_SEQ;

Shouldn't these be else if ?
> +
> +  if (dims == OACC_FUNCTION_NONE && ctx.previous != NULL

Again, as the whole condition doesn't fit on one line, please
put && on a new line.
> +  && !ctx.previous->is_openmp)
> +dims = ctx.previous->dims;

Jakub


Re: [PATCH] OpenACC routines -- test cases

2016-11-18 Thread Jakub Jelinek
On Fri, Nov 11, 2016 at 03:44:29PM -0800, Cesar Philippidis wrote:
> This patch contains new and adjusted, runtime and compiler test cases
> for the new OpenACC routine functionality.
> 
> Is this ok for trunk?

Ok (though of course, if the diagnostic wording is adjusted, then
the test will need to match).

Jakub


Re: [PATCH, Darwin] Fix PR57438 by avoiding empty function bodies and trailing labels.

2016-11-18 Thread Iain Sandoe
Hi Segher,
Thanks for the review, 

> On 6 Nov 2016, at 22:09, Segher Boessenkool  
> wrote:
> 

> On Sun, Nov 06, 2016 at 12:13:16PM -0800, Iain Sandoe wrote:
>> 2016-11-06  Iain Sandoe  
>> 
>>  PR target/57438
>>  * config/i386/i386.c (ix86_code_end): Note that we emitted code where 
>> the
>>  function might otherwise appear empty for picbase thunks.
>>  (ix86_output_function_epilogue): If we find a zero-sized function 
>> assume that
>>  reaching it is UB and trap.  If we find a trailing label append a nop.
>>  * config/rs6000/rs6000.c (rs6000_output_function_epilogue): If we find
>>  a zero-sized function assume that reaching it is UB and trap.  If we 
>> find a
>>  trailing label, append a nop.
> 
> These lines are too long.
fixed.
> 
>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>> index b0d2b64..326e2e9 100644
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -30148,11 +30148,16 @@ rs6000_output_function_epilogue (FILE *file,
>> {
>> #if TARGET_MACHO
>>   macho_branch_islands ();
>> -  /* Mach-O doesn't support labels at the end of objects, so if
>> - it looks like we might want one, insert a NOP.  */
>> +
>> +  if (TARGET_MACHO)
> 
> Both #if and if?

As discussed on IRC, I was under the impression that it is desired to move away 
from #ifdef towards if() and  I have been adding those where locally things 
have been touched - in this case it was only partially possible.

However, FAOD - you pointed out that for the RS6000 back-end #ifdef are still 
preferred, and I’ve removed this change.

> +/* Mach-O doesn't support labels at the end of objects, so if
>> +   it looks like we might want one, take special action.
>> +
>> +   First, collect any sequence deleted debug labels.  */
> 
> "sequence of”.
indeed
> 
>> +/* If we have:
>> +   label:
>> +  barrier
>> +   That need to be guarded.  */
> 
> "then this needs to be guarded.", or similar?
amended,
> 
>> +/* up to now we've only seen notes or barriers.  */
> 
> Sentences start with a capital letter.
fixed
> 
>> +/* See if have a completely empty function body.  */
> 
> "we have”
amended
> 
>> +while (insn && ! INSN_P (insn))
>> +  insn = PREV_INSN (insn);
>> +/* If we don't find any, we've got an empty function body; i.e.
> 
> "any insn”?
amended
> 
>> +   completely empty - without a return or branch.  GCC declares
>> +   that reaching __builtin_unreachable() means UB (but we want
>> +   finite-sized function bodies; to help the user out, let's trap
>> +   the case.  */
> 
> Zero is finite size; "non-empty function bodies”?
yeah… amended.
> 
> The rs6000 code looks fine, thanks; but please work on the text a bit :-)

Sorry, that was poor proof-reading (too many times looking at the same code, I 
guess).

OK now for trunk?
Open branches?
Iain

PR target/57438
* config/i386/i386.c (ix86_code_end): Note that we emitted code
where the function might otherwise appear empty for picbase thunks.
(ix86_output_function_epilogue): If we find a zero-sized function
assume that reaching it is UB and trap.  If we find a trailing label
append a nop.
* config/rs6000/rs6000.c (rs6000_output_function_epilogue): If we
find a zero-sized function assume that reaching it is UB and trap.
If we find a trailing label, append a nop.



PR57438-revised.patch
Description: Binary data




RE: [PATCH] MIPS: If a test in the MIPS testsuite requires standard library support check the sysroot supports the required test options.

2016-11-18 Thread Matthew Fortune
Toma Tabacu  writes:
> The version below has a more detailed comment about marking tests as
> unsupported.
> Matthew, does it look good to you ?
> 
> Also, should we document our expectations for the rest of do_what's
> format (elements 0 and 2) ?

I don’t think it is necessary. The explanation is just to make it clear
how the modification should lead to the UNSUPPORTED state so that it
can be re-implemented if necessary in the future with the same effect.

> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/mips/inline-memcpy-1.c (dg-options): Add
>   (REQUIRES_STDLIB).
>   * gcc.target/mips/inline-memcpy-2.c: Ditto.
>   * gcc.target/mips/inline-memcpy-3.c: Ditto.
>   * gcc.target/mips/inline-memcpy-4.c: Ditto.
>   * gcc.target/mips/inline-memcpy-5.c: Ditto.
>   * gcc.target/mips/loongson-shift-count-truncated-1.c: Ditto.
>   * gcc.target/mips/loongson-simd.c: Ditto.
>   * gcc.target/mips/memcpy-1.c: Ditto.
>   * gcc.target/mips/mips-3d-1.c: Ditto.
>   * gcc.target/mips/mips-3d-2.c: Ditto.
>   * gcc.target/mips/mips-3d-3.c: Ditto.
>   * gcc.target/mips/mips-3d-4.c: Ditto.
>   * gcc.target/mips/mips-3d-5.c: Ditto.
>   * gcc.target/mips/mips-3d-6.c: Ditto.
>   * gcc.target/mips/mips-3d-7.c: Ditto.
>   * gcc.target/mips/mips-3d-8.c: Ditto.
>   * gcc.target/mips/mips-3d-9.c: Ditto.
>   * gcc.target/mips/mips-ps-1.c: Ditto.
>   * gcc.target/mips/mips-ps-2.c: Ditto.
>   * gcc.target/mips/mips-ps-3.c: Ditto.
>   * gcc.target/mips/mips-ps-4.c: Ditto.
>   * gcc.target/mips/mips-ps-6.c: Ditto.
>   * gcc.target/mips/mips16-attributes.c: Ditto.
>   * gcc.target/mips/mips32-dsp-run.c: Ditto.
>   * gcc.target/mips/mips32-dsp.c: Ditto.
>   * gcc.target/mips/save-restore-1.c: Ditto.
>   * gcc.target/mips/mips.exp (mips_option_groups): Add stdlib.
>   (mips_preprocess): Add ignore_output argument that when set
>   will not return the pre-processed output.
>   (mips_arch_info): Update arguments for the call to
>   mips_preprocess.
>   (mips-dg-init): Ditto.
>   (mips-dg-options): Check if a test having test option
>   (REQUIRES_STDLIB) has the required sysroot support for
>   the current test options.

OK. Committed as r242587.

Thanks,
Matthew



Re: [fixincludes, v3] Don't define libstdc++-internal macros in Solaris 10+

2016-11-18 Thread Jonathan Wakely

On 18/11/16 12:10 +0100, Rainer Orth wrote:

The change is OK in principle, but I'd prefer more meaningful names
for the macros ...


Fine with me: I know close to nothing of C++, so please bear with me ;-)


No problem, that's what the rest of us are here to help with :-)


I don't mind whether you go with this or just remove the unnecessary
structs, typedef and primary template bodies. Either is OK.


I went for the latter to keep the testcase close to the original.


OK, that's probably simpler to maintain in case we change the code in
the headers in future.


Please find attached the revised version of the patch.  I hope I've
incorporated all of your comments.


Looks good.


Ok for mainline now and the backports after some soak time?


Yes, the libstdc++ parts are OK, thanks.



Re: [PATCH v3] bb-reorder: Improve compgotos pass (PR71785)

2016-11-18 Thread Segher Boessenkool
On Fri, Nov 18, 2016 at 09:09:40AM +0100, Richard Biener wrote:
> > This patch rewrites the algorithm to deal with this.  It also makes it
> > simpler: it does not need the "candidates" array anymore, it does not
> > need RTL layout mode, it does not need cleanup_cfg, and it does not
> > need to keep track of what blocks it already visited.
> > 
> > Tested on powerpc64-linux {-m32,-m64}, and on the testcase, and on a version
> > of the testcase that has 2000 cases instead of 4.  Is this okay for trunk?
> 
> Looks good to me - a single question below:

And here is a testcase, okay as well?


Segher


2016-11-18  Segher Boessenkool  

gcc/testsuite/
PR rtl-optimization/71785
* gcc.target/powerpc/pr71785.c: New file.

---
 gcc/testsuite/gcc.target/powerpc/pr71785.c | 52 ++
 1 file changed, 52 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr71785.c

diff --git a/gcc/testsuite/gcc.target/powerpc/pr71785.c 
b/gcc/testsuite/gcc.target/powerpc/pr71785.c
new file mode 100644
index 000..613dcd1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr71785.c
@@ -0,0 +1,52 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not {\mb\M} } } */
+
+/* Check that all computed gotos in this testcase end up unfactored completely.
+   If some is not there will be a unconditional jump left; if all works fine,
+   all are gone.  */
+
+typedef enum opcode
+{
+   OP_A,
+   OP_B,
+   OP_END
+} opcode;
+
+typedef struct op
+{
+   opcode opcode;
+   int arg;
+} op;
+
+extern void do_stuff_b(int arg);
+extern void do_stuff_c(int arg);
+
+extern int someglobal;
+
+void
+eval(op *op)
+{
+   static const void *dispatch_table[] = {
+   &&CASE_OP_A,
+   &&CASE_OP_B,
+   &&CASE_OP_C,
+   &&CASE_OP_END
+   };
+
+   goto *dispatch_table[op->opcode];
+CASE_OP_A:
+   someglobal++;
+   op++;
+   goto *dispatch_table[op->opcode];
+CASE_OP_B:
+   do_stuff_b(op->arg);
+   op++;
+   goto *dispatch_table[op->opcode];
+CASE_OP_C:
+   do_stuff_c(op->arg);
+   op++;
+   goto *dispatch_table[op->opcode];
+CASE_OP_END:
+   return;
+}
-- 
1.9.3



Re: [PATCH][AArch64] Separate shrink wrapping hooks implementation

2016-11-18 Thread Segher Boessenkool
On Fri, Nov 18, 2016 at 09:29:13AM +, Kyrill Tkachov wrote:
> >So your COMPONENTS_FOR_BB returns both components in a pair whenever one
> >of those is needed?  That should work afaics.
> 
> I mean I still want to have one component per register and since
> emit_{prologue,epilogue}_components knows how to form pairs from the
> components passed down to it I just need to restrict the number of
> components in any particular basic block to an even number.
> So say a function can wrap 5 registers: x22,x23,x24,x25,x26.
> I want get_separate_components to return 5 components since in that hook
> we don't know how these registers are distributed across each basic block.
> components_for_bb has that information.
> In components_for_bb I want to restrict the components for a basic block to
> an even number, so if normally all 5 registers would be valid for wrapping
> in that bb I'd only choose 4 so I could form 2 pairs. But selecting only 4
> of the 5 registers, say only x22,x23,x24,x25 leads to x26 not being saved
> or restored at all, even during the normal prologue and epilogue because
> x26 was marked as a component in components_for_bb and therefore omitted 
> from
> the prologue and epilogue.
> So I'm thinking x26 should be removed from the wrappable components of
> a basic block by disqualify_components. I'm trying that approach now.

My suggestion was, in components_for_bb, whenever you mark x22 as needed
you also mark x23 as needed, and whenever you mark x23 as needed you also
mark x22.  I think this is a lot simpler?


Segher


RE: [PATCH,testsuite] MIPS: Downgrade from R6 to R5 to prevent redundant testing of branch-cost-1.c.

2016-11-18 Thread Matthew Fortune
Moore, Catherine  writes:
> > gcc/testsuite/ChangeLog:
> >
> > 2016-11-15  Toma Tabacu  
> >
> > * gcc.target/mips/branch-cost-1.c (dg-options): Use
> > (HAS_MOVN) instead
> > of isa>=4, in order to downgrade to R5.
> >
> > diff --git a/gcc/testsuite/gcc.target/mips/branch-cost-1.c
> > b/gcc/testsuite/gcc.target/mips/branch-cost-1.c
> > index 61c3029..7f7ebbe 100644
> > --- a/gcc/testsuite/gcc.target/mips/branch-cost-1.c
> > +++ b/gcc/testsuite/gcc.target/mips/branch-cost-1.c
> > @@ -1,4 +1,4 @@
> > -/* { dg-options "-mbranch-cost=1 isa>=4" } */
> > +/* { dg-options "-mbranch-cost=1 (HAS_MOVN)" } */
> >  /* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
> >  NOMIPS16 int
> >  foo (int x, int y, int z, int k)
> 
> I committed this patch for you.  Have you requested  write access to the
> repository?

I was thinking the same thing given Toma has now submitted a few patches.

Please fill out the form below and reference either myself or Catherine:

https://sourceware.org/cgi-bin/pdw/ps_form.cgi

Thanks,
Matthew


[Patch, Fortran] PR 78392: ICE in gfc_trans_auto_array_allocation, at fortran/trans-array.c:5979

2016-11-18 Thread Janus Weil
Hi all,

the attached patch fixes an ice-on-valid problem, simply by removing
an assert. The generated code works as expected and the patch regtests
cleanly on x86_64-linux-gnu. Ok for trunk?

Cheers,
Janus



2016-11-18  Janus Weil  

PR fortran/78392
* trans-array.c (gfc_trans_auto_array_allocation): Remove an assert.

2016-11-18  Janus Weil  

PR fortran/78392
* gfortran.dg/saved_automatic_2.f90: New test case.
Index: gcc/fortran/trans-array.c
===
--- gcc/fortran/trans-array.c   (Revision 242586)
+++ gcc/fortran/trans-array.c   (Arbeitskopie)
@@ -5976,7 +5976,6 @@ gfc_trans_auto_array_allocation (tree decl, gfc_sy
   type = TREE_TYPE (type);
 
   gcc_assert (!sym->attr.use_assoc);
-  gcc_assert (!TREE_STATIC (decl));
   gcc_assert (!sym->module);
 
   if (sym->ts.type == BT_CHARACTER
! { dg-do run }
!
! PR 78392: ICE in gfc_trans_auto_array_allocation, at fortran/trans-array.c:5979
!
! Contributed by Janus Weil 

module mytypes
   implicit none
 contains
   pure integer function get_i ()
 get_i = 13
   end function
end module

program test
  use mytypes
  implicit none
  integer, dimension(get_i()), save :: x
  if (size(x) /= 13) call abort()
end


Re: Fix PR78154

2016-11-18 Thread Prathamesh Kulkarni
On 17 November 2016 at 15:24, Richard Biener  wrote:
> On Thu, 17 Nov 2016, Prathamesh Kulkarni wrote:
>
>> On 17 November 2016 at 14:21, Richard Biener  wrote:
>> > On Thu, 17 Nov 2016, Prathamesh Kulkarni wrote:
>> >
>> >> Hi Richard,
>> >> Following your suggestion in PR78154, the patch checks if stmt
>> >> contains call to memmove (and friends) in gimple_stmt_nonzero_warnv_p
>> >> and returns true in that case.
>> >>
>> >> Bootstrapped+tested on x86_64-unknown-linux-gnu.
>> >> Cross-testing on arm*-*-*, aarch64*-*-* in progress.
>> >> Would it be OK to commit this patch in stage-3 ?
>> >
>> > As people noted we have returns_nonnull for this and that is already
>> > checked.  So please make sure the builtins get this attribute instead.
>> OK thanks, I will add the returns_nonnull attribute to the required
>> string builtins.
>> I noticed some of the string builtins don't have RET1 in builtins.def:
>> strcat, strncpy, strncat have ATTR_NOTHROW_NONNULL_LEAF.
>> Should they instead be having ATTR_RET1_NOTHROW_NONNULL_LEAF similar
>> to entries for memmove, strcpy ?
>
> Yes, I think so.
Hi,
In the attached patch I added returns_nonnull attribute to
ATTR_RET1_NOTHROW_NONNULL_LEAF,
and changed few builtins like strcat, strncpy, strncat and
corresponding _chk builtins to use ATTR_RET1_NOTHROW_NONNULL_LEAF.
Does the patch look correct ?

Thanks,
Prathamesh
>
> Richard.
diff --git a/gcc/builtin-attrs.def b/gcc/builtin-attrs.def
index 8dc59c9..da82da5 100644
--- a/gcc/builtin-attrs.def
+++ b/gcc/builtin-attrs.def
@@ -108,6 +108,7 @@ DEF_ATTR_IDENT (ATTR_TYPEGENERIC, "type generic")
 DEF_ATTR_IDENT (ATTR_TM_REGPARM, "*tm regparm")
 DEF_ATTR_IDENT (ATTR_TM_TMPURE, "transaction_pure")
 DEF_ATTR_IDENT (ATTR_RETURNS_TWICE, "returns_twice")
+DEF_ATTR_IDENT (ATTR_RETURNS_NONNULL, "returns_nonnull")
 
 DEF_ATTR_TREE_LIST (ATTR_NOVOPS_LIST, ATTR_NOVOPS, ATTR_NULL, ATTR_NULL)
 
@@ -195,8 +196,11 @@ DEF_ATTR_TREE_LIST (ATTR_CONST_NOTHROW_NONNULL, 
ATTR_CONST, ATTR_NULL, \
ATTR_NOTHROW_NONNULL)
 /* Nothrow leaf functions whose pointer parameter(s) are all nonnull,
and which return their first argument.  */
-DEF_ATTR_TREE_LIST (ATTR_RET1_NOTHROW_NONNULL_LEAF, ATTR_FNSPEC, 
ATTR_LIST_STR1, \
+DEF_ATTR_TREE_LIST (ATTR_RET1_NOTHROW_NONNULL_LEAF_1, ATTR_RETURNS_NONNULL, 
ATTR_NULL, \
ATTR_NOTHROW_NONNULL_LEAF)
+DEF_ATTR_TREE_LIST (ATTR_RET1_NOTHROW_NONNULL_LEAF, ATTR_FNSPEC, 
ATTR_LIST_STR1, \
+   ATTR_RET1_NOTHROW_NONNULL_LEAF_1)
+
 /* Nothrow const leaf functions whose pointer parameter(s) are all nonnull.  */
 DEF_ATTR_TREE_LIST (ATTR_CONST_NOTHROW_NONNULL_LEAF, ATTR_CONST, ATTR_NULL, \
ATTR_NOTHROW_NONNULL_LEAF)
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 219feeb..c697b0a 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -646,13 +646,13 @@ DEF_LIB_BUILTIN(BUILT_IN_MEMCHR, "memchr", 
BT_FN_PTR_CONST_PTR_INT_SIZE,
 DEF_LIB_BUILTIN(BUILT_IN_MEMCMP, "memcmp", 
BT_FN_INT_CONST_PTR_CONST_PTR_SIZE, ATTR_PURE_NOTHROW_NONNULL_LEAF)
 DEF_LIB_BUILTIN_CHKP   (BUILT_IN_MEMCPY, "memcpy", 
BT_FN_PTR_PTR_CONST_PTR_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
 DEF_LIB_BUILTIN_CHKP   (BUILT_IN_MEMMOVE, "memmove", 
BT_FN_PTR_PTR_CONST_PTR_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
-DEF_EXT_LIB_BUILTIN_CHKP (BUILT_IN_MEMPCPY, "mempcpy", 
BT_FN_PTR_PTR_CONST_PTR_SIZE, ATTR_NOTHROW_NONNULL_LEAF)
+DEF_EXT_LIB_BUILTIN_CHKP (BUILT_IN_MEMPCPY, "mempcpy", 
BT_FN_PTR_PTR_CONST_PTR_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
 DEF_LIB_BUILTIN_CHKP   (BUILT_IN_MEMSET, "memset", BT_FN_PTR_PTR_INT_SIZE, 
ATTR_RET1_NOTHROW_NONNULL_LEAF)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_RINDEX, "rindex", 
BT_FN_STRING_CONST_STRING_INT, ATTR_PURE_NOTHROW_NONNULL_LEAF)
-DEF_EXT_LIB_BUILTIN_CHKP (BUILT_IN_STPCPY, "stpcpy", 
BT_FN_STRING_STRING_CONST_STRING, ATTR_NOTHROW_NONNULL_LEAF)
-DEF_EXT_LIB_BUILTIN(BUILT_IN_STPNCPY, "stpncpy", 
BT_FN_STRING_STRING_CONST_STRING_SIZE, ATTR_NOTHROW_NONNULL_LEAF)
+DEF_EXT_LIB_BUILTIN_CHKP (BUILT_IN_STPCPY, "stpcpy", 
BT_FN_STRING_STRING_CONST_STRING, ATTR_RET1_NOTHROW_NONNULL_LEAF)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_STPNCPY, "stpncpy", 
BT_FN_STRING_STRING_CONST_STRING_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_STRCASECMP, "strcasecmp", 
BT_FN_INT_CONST_STRING_CONST_STRING, ATTR_PURE_NOTHROW_NONNULL_LEAF)
-DEF_LIB_BUILTIN_CHKP   (BUILT_IN_STRCAT, "strcat", 
BT_FN_STRING_STRING_CONST_STRING, ATTR_NOTHROW_NONNULL_LEAF)
+DEF_LIB_BUILTIN_CHKP   (BUILT_IN_STRCAT, "strcat", 
BT_FN_STRING_STRING_CONST_STRING, ATTR_RET1_NOTHROW_NONNULL_LEAF)
 DEF_LIB_BUILTIN_CHKP   (BUILT_IN_STRCHR, "strchr", 
BT_FN_STRING_CONST_STRING_INT, ATTR_PURE_NOTHROW_NONNULL_LEAF)
 DEF_LIB_BUILTIN(BUILT_IN_STRCMP, "strcmp", 
BT_FN_INT_CONST_STRING_CONST_STRING, ATTR_PURE_NOTHROW_NONNULL_LEAF)
 DEF_LIB_BUILTIN_CHKP   (BUILT_IN_STRCPY, "strcpy", 
BT_FN_STRING_STRING_CONST_STRING, ATTR_RET1_NOTHROW_NONNULL_LEAF)
@@ -661,9 +661,9 @@

Re: [PATCH] Fix PR78189

2016-11-18 Thread Christophe Lyon
On 11 November 2016 at 09:56, Richard Biener  wrote:
> On Thu, 10 Nov 2016, Christophe Lyon wrote:
>
>> On 10 November 2016 at 09:34, Richard Biener  wrote:
>> > On Wed, 9 Nov 2016, Christophe Lyon wrote:
>> >
>> >> On 9 November 2016 at 09:36, Bin.Cheng  wrote:
>> >> > On Tue, Nov 8, 2016 at 9:11 AM, Richard Biener  
>> >> > wrote:
>> >> >> On Mon, 7 Nov 2016, Christophe Lyon wrote:
>> >> >>
>> >> >>> Hi Richard,
>> >> >>>
>> >> >>>
>> >> >>> On 7 November 2016 at 09:01, Richard Biener  wrote:
>> >> >>> >
>> >> >>> > The following fixes an oversight when computing alignment in the
>> >> >>> > vectorizer.
>> >> >>> >
>> >> >>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to 
>> >> >>> > trunk.
>> >> >>> >
>> >> >>> > Richard.
>> >> >>> >
>> >> >>> > 2016-11-07  Richard Biener  
>> >> >>> >
>> >> >>> > PR tree-optimization/78189
>> >> >>> > * tree-vect-data-refs.c (vect_compute_data_ref_alignment): 
>> >> >>> > Fix
>> >> >>> > alignment computation.
>> >> >>> >
>> >> >>> > * g++.dg/torture/pr78189.C: New testcase.
>> >> >>> >
>> >> >>> > Index: gcc/testsuite/g++.dg/torture/pr78189.C
>> >> >>> > ===
>> >> >>> > --- gcc/testsuite/g++.dg/torture/pr78189.C  (revision 0)
>> >> >>> > +++ gcc/testsuite/g++.dg/torture/pr78189.C  (working copy)
>> >> >>> > @@ -0,0 +1,41 @@
>> >> >>> > +/* { dg-do run } */
>> >> >>> > +/* { dg-additional-options "-ftree-slp-vectorize 
>> >> >>> > -fno-vect-cost-model" } */
>> >> >>> > +
>> >> >>> > +#include 
>> >> >>> > +
>> >> >>> > +struct A
>> >> >>> > +{
>> >> >>> > +  void * a;
>> >> >>> > +  void * b;
>> >> >>> > +};
>> >> >>> > +
>> >> >>> > +struct alignas(16) B
>> >> >>> > +{
>> >> >>> > +  void * pad;
>> >> >>> > +  void * misaligned;
>> >> >>> > +  void * pad2;
>> >> >>> > +
>> >> >>> > +  A a;
>> >> >>> > +
>> >> >>> > +  void Null();
>> >> >>> > +};
>> >> >>> > +
>> >> >>> > +void B::Null()
>> >> >>> > +{
>> >> >>> > +  a.a = nullptr;
>> >> >>> > +  a.b = nullptr;
>> >> >>> > +}
>> >> >>> > +
>> >> >>> > +void __attribute__((noinline,noclone))
>> >> >>> > +NullB(void * misalignedPtr)
>> >> >>> > +{
>> >> >>> > +  B* b = reinterpret_cast(reinterpret_cast> >> >>> > *>(misalignedPtr) - offsetof(B, misaligned));
>> >> >>> > +  b->Null();
>> >> >>> > +}
>> >> >>> > +
>> >> >>> > +int main()
>> >> >>> > +{
>> >> >>> > +  B b;
>> >> >>> > +  NullB(&b.misaligned);
>> >> >>> > +  return 0;
>> >> >>> > +}
>> >> >>> > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
>> >> >>> > index 9346cfe..b03cb1e 100644
>> >> >>> > --- gcc/tree-vect-data-refs.c
>> >> >>> > +++ gcc/tree-vect-data-refs.c
>> >> >>> > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct 
>> >> >>> > data_reference *dr)
>> >> >>> >base = ref;
>> >> >>> >while (handled_component_p (base))
>> >> >>> >  base = TREE_OPERAND (base, 0);
>> >> >>> > +  unsigned int base_alignment;
>> >> >>> > +  unsigned HOST_WIDE_INT base_bitpos;
>> >> >>> > +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
>> >> >>> > +  /* As data-ref analysis strips the MEM_REF down to its base 
>> >> >>> > operand
>> >> >>> > + to form DR_BASE_ADDRESS and adds the offset to DR_INIT we 
>> >> >>> > have to
>> >> >>> > + adjust things to make base_alignment valid as the alignment of
>> >> >>> > + DR_BASE_ADDRESS.  */
>> >> >>> >if (TREE_CODE (base) == MEM_REF)
>> >> >>> > -base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
>> >> >>> > -  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 
>> >> >>> > 1)), 0));
>> >> >>> > -  unsigned int base_alignment = get_object_alignment (base);
>> >> >>> > +{
>> >> >>> > +  base_bitpos -= mem_ref_offset (base).to_short_addr () * 
>> >> >>> > BITS_PER_UNIT;
>> >> >>> > +  base_bitpos &= (base_alignment - 1);
>> >> >>> > +}
>> >> >>> > +  if (base_bitpos != 0)
>> >> >>> > +base_alignment = base_bitpos & -base_bitpos;
>> >> >>> > +  /* Also look at the alignment of the base address DR analysis
>> >> >>> > + computed.  */
>> >> >>> > +  unsigned int base_addr_alignment = get_pointer_alignment 
>> >> >>> > (base_addr);
>> >> >>> > +  if (base_addr_alignment > base_alignment)
>> >> >>> > +base_alignment = base_addr_alignment;
>> >> >>> >
>> >> >>> >if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
>> >> >>> >  DR_VECT_AUX (dr)->base_element_aligned = true;
>> >> >>>
>> >> >>> Since you committed this patch (r241892), I'm seeing execution 
>> >> >>> failures:
>> >> >>>   gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
>> >> >>>   gcc.dg/vect/pr40074.c execution test
>> >> >>> on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
>> >> >>> --with-fpu=neon-fp16
>> >> >>> (using qemu as simulator)
>> >> >>
>> >> >> The difference is that we now vectorize the testcase with versioning
>> >> >> for alignment (but it should never execute the vectorized varia

Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-18 Thread Christophe Lyon
On 15 November 2016 at 15:41, Yuri Rumyantsev  wrote:
> Hi All,
>
> Here is patch for non-masked epilogue vectoriziation.
>
> Bootstrap and regression testing did not show any new failures.
>
> Is it OK for trunk?
>
> Thanks.
> Changelog:
>
> 2016-11-15  Yuri Rumyantsev  
>
> * params.def (PARAM_VECT_EPILOGUES_NOMASK): New.
> * tree-if-conv.c (tree_if_conversion): Make public.
> * * tree-if-conv.h: New file.
> * tree-vect-data-refs.c (vect_analyze_data_ref_dependences) Avoid
> dynamic alias checks for epilogues.
> * tree-vect-loop-manip.c (vect_do_peeling): Return created epilog.
> * tree-vect-loop.c: include tree-if-conv.h.
> (new_loop_vec_info): Add zeroing orig_loop_info field.
> (vect_analyze_loop_2): Don't try to enhance alignment for epilogues.
> (vect_analyze_loop): Add argument ORIG_LOOP_INFO which is not NULL
> if epilogue is vectorized, set up orig_loop_info field of loop_vinfo
> using passed argument.
> (vect_transform_loop): Check if created epilogue should be returned
> for further vectorization with less vf.  If-convert epilogue if
> required. Print vectorization success for epilogue.
> * tree-vectorizer.c (vectorize_loops): Add epilogue vectorization
> if it is required, pass loop_vinfo produced during vectorization of
> loop body to vect_analyze_loop.
> * tree-vectorizer.h (struct _loop_vec_info): Add new field
> orig_loop_info.
> (LOOP_VINFO_ORIG_LOOP_INFO): New.
> (LOOP_VINFO_EPILOGUE_P): New.
> (LOOP_VINFO_ORIG_VECT_FACTOR): New.
> (vect_do_peeling): Change prototype to return epilogue.
> (vect_analyze_loop): Add argument of loop_vec_info type.
> (vect_transform_loop): Return created loop.
>
> gcc/testsuite/
>
> * lib/target-supports.exp (check_avx2_hw_available): New.
> (check_effective_target_avx2_runtime): New.
> * gcc.dg/vect/vect-tail-nomask-1.c: New test.
>

Hi,

This new test fails on arm-none-eabi (using default cpu/fpu/mode):
  gcc.dg/vect/vect-tail-nomask-1.c -flto -ffat-lto-objects execution test
  gcc.dg/vect/vect-tail-nomask-1.c execution test

It does pass on the same target if configured --with-cpu=cortex-a9.

Christophe



>
> 2016-11-14 20:04 GMT+03:00 Richard Biener :
>> On November 14, 2016 4:39:40 PM GMT+01:00, Yuri Rumyantsev 
>>  wrote:
>>>Richard,
>>>
>>>I checked one of the tests designed for epilogue vectorization using
>>>patches 1 - 3 and found out that build compiler performs vectorization
>>>of epilogues with --param vect-epilogues-nomask=1 passed:
>>>
>>>$ gcc -Ofast -mavx2 t1.c -S --param vect-epilogues-nomask=1 -o
>>>t1.new-nomask.s -fdump-tree-vect-details
>>>$ grep VECTORIZED -c t1.c.156t.vect
>>>4
>>> Without param only 2 loops are vectorized.
>>>
>>>Should I simply add a part of tests related to this feature or I must
>>>delete all not necessary changes also?
>>
>> Please remove all not necessary changes.
>>
>> Richard.
>>
>>>Thanks.
>>>Yuri.
>>>
>>>2016-11-14 16:40 GMT+03:00 Richard Biener :
 On Mon, 14 Nov 2016, Yuri Rumyantsev wrote:

> Richard,
>
> In my previous patch I forgot to remove couple lines related to aux
>>>field.
> Here is the correct updated patch.

 Yeah, I noticed.  This patch would be ok for trunk (together with
 necessary parts from 1 and 2) if all not required parts are removed
 (and you'd add the testcases covering non-masked tail vect).

 Thus, can you please produce a single complete patch containing only
 non-masked epilogue vectoriziation?

 Thanks,
 Richard.

> Thanks.
> Yuri.
>
> 2016-11-14 15:51 GMT+03:00 Richard Biener :
> > On Fri, 11 Nov 2016, Yuri Rumyantsev wrote:
> >
> >> Richard,
> >>
> >> I prepare updated 3 patch with passing additional argument to
> >> vect_analyze_loop as you proposed (untested).
> >>
> >> You wrote:
> >> tw, I wonder if you can produce a single patch containing just
> >> epilogue vectorization, that is combine patches 1-3 but rip out
> >> changes only needed by later patches?
> >>
> >> Did you mean that I exclude all support for vectorization
>>>epilogues,
> >> i.e. exclude from 2-nd patch all non-related changes
> >> like
> >>
> >> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> >> index 11863af..32011c1 100644
> >> --- a/gcc/tree-vect-loop.c
> >> +++ b/gcc/tree-vect-loop.c
> >> @@ -1120,6 +1120,12 @@ new_loop_vec_info (struct loop *loop)
> >>LOOP_VINFO_PEELING_FOR_GAPS (res) = false;
> >>LOOP_VINFO_PEELING_FOR_NITER (res) = false;
> >>LOOP_VINFO_OPERANDS_SWAPPED (res) = false;
> >> +  LOOP_VINFO_CAN_BE_MASKED (res) = false;
> >> +  LOOP_VINFO_REQUIRED_MASKS (res) = 0;
> >> +  LOOP_VINFO_COMBINE_EPILOGUE (res) = false;
> >> +  LOOP_VINFO_MASK_EPILOGUE (res) = false;
> >> +  LOOP_VINFO_NEED_MASKING (res) = false;
> >> +  LOOP_VINFO_ORIG_LOOP_INFO (res) = NULL;
> >
> > Yes.
> >
> >> Did you mean also that new combined patch must be working patch,
>>>i.e.
>>

[patch,avr] Disallow LDS / STS for .rodata on AVR_TINY.

2016-11-18 Thread Georg-Johann Lay
Currently, Binutils still comes with an AVR_TINY linker description file 
which puts .rodata into RAM so that LDS is applicable for reading such 
objects.


However, as these devices come with a linearised address model, linker 
descriptions which put .rodata into flash are much more reasonable. 
This patch caters for such situations:  As .rodata virtual addresses 
will be offset by 0x4000, respective objects may no more be accessed by LDS.


Moreover, objects with a section attribute won't be accessed by LDS.

Ok for trunk?

Johann

PR target/78093
* config/avr/avr.c (avr_decl_maybe_lds_p): New static function.
(avr_encode_section_info) [TARGET_ABSDATA && AVR_TINY]: Use it.
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 242544)
+++ config/avr/avr.c	(working copy)
@@ -10102,6 +10102,33 @@ avr_section_type_flags (tree decl, const
 }
 
 
+/* A helper for the next function.  NODE is a decl that is associated with
+   a symbol.  Return TRUE if the respective object may be accessed by LDS.
+   There migth still be other reasons for why LDS is not appropriate.
+   This function is only useful for AVR_TINY. */
+
+static bool
+avr_decl_maybe_lds_p (tree node)
+{
+  if (!node
+  || TREE_CODE (node) != VAR_DECL
+  || DECL_SECTION_NAME (node) != NULL)
+return false;
+
+  if (TREE_READONLY (node))
+return true;
+
+  // C++ requires peeling arrays.
+
+  do
+node = TREE_TYPE (node);
+  while (ARRAY_TYPE == TREE_CODE (node));
+
+  return (node != error_mark_node
+  && !TYPE_READONLY (node));
+}
+
+
 /* Implement `TARGET_ENCODE_SECTION_INFO'.  */
 
 static void
@@ -10193,7 +10220,8 @@ avr_encode_section_info (tree decl, rtx
   if (avr_decl_absdata_p (decl, DECL_ATTRIBUTES (decl))
   || (TARGET_ABSDATA
   && !progmem_p
-  && !addr_attr)
+  && !addr_attr
+  && avr_decl_maybe_lds_p (decl))
   || (addr_attr
   // If addr_attr is non-null, it has an argument.  Peek into it.
   && TREE_INT_CST_LOW (TREE_VALUE (TREE_VALUE (addr_attr))) < 0xc0))


[PATCH] S390: Lower requirements for successful htm tests.

2016-11-18 Thread Dominik Vogt
The attached patch makes the htm tests on s390 less sensitive to
spurious abort.  Please check the commit comment for details.  The
modified tests have been run once on a zEC12.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/testsuite/ChangeLog

* gcc.target/s390/htm-builtins-1.c (DEFAULT_MAX_REPETITIONS)
(DEFAULT_REQUIRED_QUORUM, NUM_WARMUP_RUNS): Lower requirements for
successful test.
* gcc.target/s390/htm-builtins-2.c (DEFAULT_MAX_REPETITIONS)
(DEFAULT_REQUIRED_QUORUM): Likewise.
>From cd354b024c5e129575465b9088b0e0d03340c8f0 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Fri, 18 Nov 2016 14:06:28 +0100
Subject: [PATCH] S390: Lower requirements for successful htm tests.

Due to spurious aborts, requiring four successful tests out of five fails too
often; accept four out of seven instead.

Reduce number of warmup passes.
---
 gcc/testsuite/gcc.target/s390/htm-builtins-1.c | 6 +++---
 gcc/testsuite/gcc.target/s390/htm-builtins-2.c | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gcc.target/s390/htm-builtins-1.c 
b/gcc/testsuite/gcc.target/s390/htm-builtins-1.c
index c90490f..ff43be9 100644
--- a/gcc/testsuite/gcc.target/s390/htm-builtins-1.c
+++ b/gcc/testsuite/gcc.target/s390/htm-builtins-1.c
@@ -13,9 +13,9 @@
 
 /*  local definitions -- */
 
-#define DEFAULT_MAX_REPETITIONS 5
-#define DEFAULT_REQUIRED_QUORUM ((DEFAULT_MAX_REPETITIONS) - 1)
-#define NUM_WARMUP_RUNS 10
+#define DEFAULT_MAX_REPETITIONS 7
+#define DEFAULT_REQUIRED_QUORUM 4
+#define NUM_WARMUP_RUNS 2
 
 /*  local macros --- */
 
diff --git a/gcc/testsuite/gcc.target/s390/htm-builtins-2.c 
b/gcc/testsuite/gcc.target/s390/htm-builtins-2.c
index 15b0d12..bb9d346 100644
--- a/gcc/testsuite/gcc.target/s390/htm-builtins-2.c
+++ b/gcc/testsuite/gcc.target/s390/htm-builtins-2.c
@@ -13,8 +13,8 @@
 
 /*  local definitions -- */
 
-#define DEFAULT_MAX_REPETITIONS 5
-#define DEFAULT_REQUIRED_QUORUM ((DEFAULT_MAX_REPETITIONS) - 1)
+#define DEFAULT_MAX_REPETITIONS 7
+#define DEFAULT_REQUIRED_QUORUM 4
 #define DEFAULT_ABORT_ADDRESS (0x12345678u)
 
 /*  local macros --- */
-- 
2.3.0



Re: Rework subreg_get_info

2016-11-18 Thread Richard Sandiford
Joseph Myers  writes:
> On Tue, 15 Nov 2016, Richard Sandiford wrote:
>> Richard Sandiford  writes:
>> > This isn't intended to change the behaviour, just rewrite the
>> > existing logic in a different (and hopefully clearer) way.
>> > The new form -- particularly the part based on the "block"
>> > concept -- is easier to convert to polynomial sizes.
>> >
>> > Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>> 
>> Sorry, I should have said: this was also tested by compiling the
>> testsuite before and after the change at -O2 -ftree-vectorize on:
>> 
>> aarch64-linux-gnueabi alpha-linux-gnu arc-elf arm-linux-gnueabi
>> arm-linux-gnueabihf avr-elf bfin-elf c6x-elf cr16-elf cris-elf
>> epiphany-elf fr30-elf frv-linux-gnu ft32-elf h8300-elf
>> hppa64-hp-hpux11.23 ia64-linux-gnu i686-pc-linux-gnu
>> i686-apple-darwin iq2000-elf lm32-elf m32c-elf m32r-elf
>> m68k-linux-gnu mcore-elf microblaze-elf mips-linux-gnu
>> mipsisa64-linux-gnu mmix mn10300-elf moxie-rtems msp430-elf
>> nds32le-elf nios2-linux-gnu nvptx-none pdp11 powerpc-linux-gnu
>> powerpc-eabispe powerpc64-linux-gnu powerpc-ibm-aix7.0 rl78-elf
>
> e500 double (both DFmode and TFmode) was the case that motivated the 
> original creation of subreg_get_info.  I think powerpc-linux-gnuspe 
> --enable-e500-double would be a good case for verifying the patch doesn't 
> change generated code.  (Though when I tried building it from mainline 
> sources last week I got an ICE in LRA building __multc3, so testing it 
> might be problematic at present.)

Thanks for the pointer.  I've now tried comparing the testsuite
output on that combination and there were no differences.  31 tests
triggered an internal compiler error but >17800 compiled successfully,
so hopefully the test was at least somewhat meaningful.

(For the record: I got a build failure due to addsi3 FAILing,
but I hacked around that by converting it to a force_reg/DONE.)

Thanks,
Richard


Re: [v3 PATCH] LWG 2766, LWG 2749

2016-11-18 Thread Jonathan Wakely

On 17/11/16 23:38 +0200, Ville Voutilainen wrote:

The first patch does everything but the container adaptor parts (which
are wrong in the p/r) and the tuple part (which is icky). The second
part does the tuple parts, and needs some serious RFC. I know
it's ugly as sin, but it works. I don't think we should try to teach
our tuple to give the right answer for is_constructible etc., because
the last attempt at it broke boost and would break reasonable existing
code in addition to just boost - such things are not Stage 3 material,
imnsho.


I agree, but if we'd just refused to support such undefined behaviour
in stage 1 we wouldn't now be in a position of saying we can't change
it in stage 3.

Oh well, I'll review this ASAP.



Re: [v3 PATCH] LWG 2766, LWG 2749

2016-11-18 Thread Ville Voutilainen
On 18 November 2016 at 15:54, Jonathan Wakely  wrote:
> I agree, but if we'd just refused to support such undefined behaviour
> in stage 1 we wouldn't now be in a position of saying we can't change
> it in stage 3.

I want to support the code that the previous attempt breaks. I don't
think I can do so without
concepts.


Re: [PATCH 1/2] PR77822

2016-11-18 Thread Segher Boessenkool
On Fri, Nov 18, 2016 at 01:09:24PM +0100, Dominik Vogt wrote:
> How about this:
> 
> --
> /* A convenience macro to determine whether a SIZE lies inclusively
>within [1, UPPER], POS lies inclusively within between
>[0, UPPER - 1] and the sum lies inclusively within [1, UPPER].
>UPPER must be >= 1.  */
> #define SIZE_POS_IN_RANGE(SIZE, POS, UPPER) \
>   (IN_RANGE ((POS), 0, (unsigned HOST_WIDE_INT) (UPPER) - 1) \
>&& IN_RANGE ((SIZE), 1, (unsigned HOST_WIDE_INT) (UPPER) \
>- (unsigned HOST_WIDE_INT)(POS)))
   ^ missing space here

> IN_RANGE(POS...) makes sure that POS is a non-negative number
> smaller than UPPER, so (UPPER) - (POS) does not wrap.  Or is there
> some case that the new macro does not handle?

I think it works fine.

With a name like UPPER, you may want to have SIZE_POS_IN_RANGE work like
IN_RANGE, i.e. UPPER is inclusive then.  Dunno.


Segher


Re: [PATCH] debug/PR78112: remove recent duplicates for DW_TAG_subprogram attributes

2016-11-18 Thread Christophe Lyon
On 10 November 2016 at 12:06, Pierre-Marie de Rodat  wrote:
> On 11/09/2016 10:02 AM, Richard Biener wrote:
>>
>> Using scan-assembler-times on the dwarf?
>
>
> I always have a bad feeling about this kind of check as I imagine it can
> break very easily with legit changes. But I have nothing better to
> contribute, so I’ve added one such testcase. ;-)
>
>>> Ok to commit?
>>
>>
>> Ok.
>
>
> This is committed as r242035. Thanks!
>
Hi,

Part of the new testcase added with this commit fails on arm* targets:
  g++.dg/pr78112.C   scan-assembler-times DW_AT_object_pointer 37

Christophe



> --
> Pierre-Marie de Rodat


Re: [Patch, Fortran] PR 78392: ICE in gfc_trans_auto_array_allocation, at fortran/trans-array.c:5979

2016-11-18 Thread Dominique d'Humières
Hi Janus,

> the attached patch fixes an ice-on-valid problem, simply by removing an 
> assert. ...

I have several instances in my test suite showing that the proposed patch 
removes the ICE but generates wrong code:

pr42359, second test, => ICE on another place
pr54613, sixth and eighth tests,

Thanks for working on the issue,

Dominique



Re: [PATCH v3] PR77359: Properly align local variables in functions calling alloca.

2016-11-18 Thread Andreas Krebbel
On Fri, Nov 11, 2016 at 10:33:16AM +0100, Dominik Vogt wrote:
> gcc/ChangeLog
> 
>   * config/rs6000/rs6000.c (rs6000_stack_info): PR/77359: Properly align
>   local variables in functions calling alloca.  Also update the ASCII
>   drawings
>   * config/rs6000/rs6000.h (STARTING_FRAME_OFFSET, STACK_DYNAMIC_OFFSET):
>   PR/77359: Likewise.
>   * config/rs6000/aix.h (STARTING_FRAME_OFFSET, STACK_DYNAMIC_OFFSET):
>   PR/77359: Copy AIX specific versions of the rs6000.h macros to aix.h.

Applied. Thanks!

-Andreas-




Re: [PATCH] S390: Lower requirements for successful htm tests.

2016-11-18 Thread Andreas Krebbel
On Fri, Nov 18, 2016 at 02:48:30PM +0100, Dominik Vogt wrote:
> gcc/testsuite/ChangeLog
> 
>   * gcc.target/s390/htm-builtins-1.c (DEFAULT_MAX_REPETITIONS)
>   (DEFAULT_REQUIRED_QUORUM, NUM_WARMUP_RUNS): Lower requirements for
>   successful test.
>   * gcc.target/s390/htm-builtins-2.c (DEFAULT_MAX_REPETITIONS)
>   (DEFAULT_REQUIRED_QUORUM): Likewise.

Applied. Thanks!

-Andreas-



libgo patch committed: Remove long-obsolete "old" directories

2016-11-18 Thread Ian Lance Taylor
The directories old/regexp and old/template were removed from the
master Go library in 2012 (https://golang.org/cl/5979046) but somehow
that was not reflected in libgo.  This patch removes them from libgo.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian


patch.txt.bz2
Description: BZip2 compressed data


Re: [PATCH 1/2] PR77822

2016-11-18 Thread Dominik Vogt
On Fri, Nov 18, 2016 at 08:02:08AM -0600, Segher Boessenkool wrote:
> On Fri, Nov 18, 2016 at 01:09:24PM +0100, Dominik Vogt wrote:
> > IN_RANGE(POS...) makes sure that POS is a non-negative number
> > smaller than UPPER, so (UPPER) - (POS) does not wrap.  Or is there
> > some case that the new macro does not handle?
> 
> I think it works fine.
> 
> With a name like UPPER, you may want to have SIZE_POS_IN_RANGE work like
> IN_RANGE, i.e. UPPER is inclusive then.  Dunno.

Yeah, maybe rather call it RANGE to avoid too much similarity.
Some name that is so vague that one has to read the documentation.
I'll post an updated patch later.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



[PATCH] substring_loc info needs default track-macro-expansion (PR preprocessor/78324)

2016-11-18 Thread David Malcolm
gcc.dg/tree-ssa/builtin-sprintf-2.c is showing intermittent failures, which
valgrind shows to be a read past the end of a buffer.

The root cause is that the on-demand substring loc code isn't set up
to cope with -ftrack-macro-expansion != 2 (the default).

In the failing case, it attempts to use this location as the location
of the literal:

../../src/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c:95:1: note:
 RNG (0,  0,  0, "%hhi", i)
 ^~~

rather than:

 RNG (0,  0,  0, "%hhi", i)
  ^~~~

On re-parsing to generate the on-demand substring locations, it thus
erroneously attempts to parse the 'R' as a raw string, and so this
code within cpp_interpret_string_1 fires:

  if (*p == 'R')
{
  const uchar *prefix;

  /* Skip over 'R"'.  */
  p += 2;
  prefix = p;
  while (*p != '(')
p++;

and the issue happens in the "while" loop, as it erroneously
walks through this memory:

(gdb) p strs.m_vec.m_vecdata[0]
$20 = {len = 3, text = 0xc9bcca0 "RNG"}

trying to find the open parenthesis, and starts reading beyond the
allocated buffer.

The fix is to require that -ftrack-macro-expansion == 2 (the default)
for on-demand string literal locations to be available, failing
gracefully to simply using the whole location_t otherwise.

Doing so requires some fixups to existing test cases:
[gcc.dg/tree-ssa/builtin-sprintf-warn-{1,4}.c both have
-ftrack-macro-expansion=0.  In the latter case, it seems to be
unnecessary, so this patch removes it.  In the former case, it
seems to be needed, but there are some expected locations in
test_sprintf_note that are changed by the patch.  It's not clear to
me how to fix that, so for now the patch removes the expected
column numbers from that function within the test case.

Successfully bootstrapped®rtested on x86_64-pc-linux-gnu.

I think I can self-approve the input.c change and new test cases,
but I'm not sure about the changes to Martin's test cases.

Martin: are the changes to your test cases OK by you, or is there
a better way to rewrite them?

gcc/ChangeLog:
PR preprocessor/78324
* input.c (get_substring_ranges_for_loc): Fail gracefully if
-ftrack-macro-expansion has a value other than 2.

gcc/testsuite/ChangeLog:
PR preprocessor/78324
* gcc.dg/plugin/diagnostic-test-string-literals-1.c
(test_multitoken_macro): New function.
* gcc.dg/plugin/diagnostic-test-string-literals-3.c: New test
case.
* gcc.dg/plugin/diagnostic-test-string-literals-4.c: New test
case.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the new test
cases.
* gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test_sprintf_note):
Drop expected column numbers from dg-warning directives.
* gcc.dg/tree-ssa/builtin-sprintf-warn-4.c: Drop
-ftrack-macro-expansion=0.
---
 gcc/input.c|  9 +
 .../plugin/diagnostic-test-string-literals-1.c | 16 
 .../plugin/diagnostic-test-string-literals-3.c | 43 ++
 .../plugin/diagnostic-test-string-literals-4.c | 43 ++
 gcc/testsuite/gcc.dg/plugin/plugin.exp |  4 +-
 .../gcc.dg/tree-ssa/builtin-sprintf-warn-1.c   |  6 +--
 .../gcc.dg/tree-ssa/builtin-sprintf-warn-4.c   |  2 +-
 7 files changed, 118 insertions(+), 5 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-3.c
 create mode 100644 
gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-4.c

diff --git a/gcc/input.c b/gcc/input.c
index 728f4dd..611e18b 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -1322,6 +1322,15 @@ get_substring_ranges_for_loc (cpp_reader *pfile,
   if (strloc == UNKNOWN_LOCATION)
 return "unknown location";
 
+  /* Reparsing the strings requires accurate location information.
+ If -ftrack-macro-expansion has been overridden from its default
+ of 2, then we might have a location of a macro expansion point,
+ rather than the location of the literal itself.
+ Avoid this by requiring that we have full macro expansion tracking
+ for substring locations to be available.  */
+  if (cpp_get_options (pfile)->track_macro_expansion != 2)
+return "track_macro_expansion != 2";
+
   /* If string concatenation has occurred at STRLOC, get the locations
  of all of the literal tokens making up the compound string.
  Otherwise, just use STRLOC.  */
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c 
b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
index 3e44936..76a085e 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-1.c
@@ -243,6 +243,22 @@ test_macro (void)
{ dg-end-multiline-output "" } */
 }
 
+void
+test_multitoken_macro (void)
+{
+#define RANGE ("0123456789")  /* { dg-error "unab

Re: Fix PR77881: combine improvement

2016-11-18 Thread Bin.Cheng
On Wed, Nov 16, 2016 at 3:05 PM, Andreas Schwab  wrote:
> On Nov 14 2016, Michael Matz  wrote:
>
>>   PR missed-optimization/77881
>>   * combine.c (simplify_comparison): Remove useless subregs
>>   also inside the loop, not just after it.
>>   (make_compound_operation): Recognize some subregs as being
>>   masking as well.
>
> This breaks gcc.c-torture/execute/cbrt.c execution test on aarch64.
Hi,
I can confirm that, also new PR opened for tracking.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78422

Thanks,
bin
>
> Andreas.
>
> --
> Andreas Schwab, SUSE Labs, sch...@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."


Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-18 Thread Yuri Rumyantsev
It is very strange that this test failed on arm, since it requires
target avx2 to check vectorizer dumps:

/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" {
target avx2_runtime } } } */
/* { dg-final { scan-tree-dump-times "LOOP EPILOGUE VECTORIZED
\\(VS=16\\)" 2 "vect" { target avx2_runtime } } } */

Could you please clarify what is the reason of the failure?

Thanks.

2016-11-18 16:20 GMT+03:00 Christophe Lyon :
> On 15 November 2016 at 15:41, Yuri Rumyantsev  wrote:
>> Hi All,
>>
>> Here is patch for non-masked epilogue vectoriziation.
>>
>> Bootstrap and regression testing did not show any new failures.
>>
>> Is it OK for trunk?
>>
>> Thanks.
>> Changelog:
>>
>> 2016-11-15  Yuri Rumyantsev  
>>
>> * params.def (PARAM_VECT_EPILOGUES_NOMASK): New.
>> * tree-if-conv.c (tree_if_conversion): Make public.
>> * * tree-if-conv.h: New file.
>> * tree-vect-data-refs.c (vect_analyze_data_ref_dependences) Avoid
>> dynamic alias checks for epilogues.
>> * tree-vect-loop-manip.c (vect_do_peeling): Return created epilog.
>> * tree-vect-loop.c: include tree-if-conv.h.
>> (new_loop_vec_info): Add zeroing orig_loop_info field.
>> (vect_analyze_loop_2): Don't try to enhance alignment for epilogues.
>> (vect_analyze_loop): Add argument ORIG_LOOP_INFO which is not NULL
>> if epilogue is vectorized, set up orig_loop_info field of loop_vinfo
>> using passed argument.
>> (vect_transform_loop): Check if created epilogue should be returned
>> for further vectorization with less vf.  If-convert epilogue if
>> required. Print vectorization success for epilogue.
>> * tree-vectorizer.c (vectorize_loops): Add epilogue vectorization
>> if it is required, pass loop_vinfo produced during vectorization of
>> loop body to vect_analyze_loop.
>> * tree-vectorizer.h (struct _loop_vec_info): Add new field
>> orig_loop_info.
>> (LOOP_VINFO_ORIG_LOOP_INFO): New.
>> (LOOP_VINFO_EPILOGUE_P): New.
>> (LOOP_VINFO_ORIG_VECT_FACTOR): New.
>> (vect_do_peeling): Change prototype to return epilogue.
>> (vect_analyze_loop): Add argument of loop_vec_info type.
>> (vect_transform_loop): Return created loop.
>>
>> gcc/testsuite/
>>
>> * lib/target-supports.exp (check_avx2_hw_available): New.
>> (check_effective_target_avx2_runtime): New.
>> * gcc.dg/vect/vect-tail-nomask-1.c: New test.
>>
>
> Hi,
>
> This new test fails on arm-none-eabi (using default cpu/fpu/mode):
>   gcc.dg/vect/vect-tail-nomask-1.c -flto -ffat-lto-objects execution test
>   gcc.dg/vect/vect-tail-nomask-1.c execution test
>
> It does pass on the same target if configured --with-cpu=cortex-a9.
>
> Christophe
>
>
>
>>
>> 2016-11-14 20:04 GMT+03:00 Richard Biener :
>>> On November 14, 2016 4:39:40 PM GMT+01:00, Yuri Rumyantsev 
>>>  wrote:
Richard,

I checked one of the tests designed for epilogue vectorization using
patches 1 - 3 and found out that build compiler performs vectorization
of epilogues with --param vect-epilogues-nomask=1 passed:

$ gcc -Ofast -mavx2 t1.c -S --param vect-epilogues-nomask=1 -o
t1.new-nomask.s -fdump-tree-vect-details
$ grep VECTORIZED -c t1.c.156t.vect
4
 Without param only 2 loops are vectorized.

Should I simply add a part of tests related to this feature or I must
delete all not necessary changes also?
>>>
>>> Please remove all not necessary changes.
>>>
>>> Richard.
>>>
Thanks.
Yuri.

2016-11-14 16:40 GMT+03:00 Richard Biener :
> On Mon, 14 Nov 2016, Yuri Rumyantsev wrote:
>
>> Richard,
>>
>> In my previous patch I forgot to remove couple lines related to aux
field.
>> Here is the correct updated patch.
>
> Yeah, I noticed.  This patch would be ok for trunk (together with
> necessary parts from 1 and 2) if all not required parts are removed
> (and you'd add the testcases covering non-masked tail vect).
>
> Thus, can you please produce a single complete patch containing only
> non-masked epilogue vectoriziation?
>
> Thanks,
> Richard.
>
>> Thanks.
>> Yuri.
>>
>> 2016-11-14 15:51 GMT+03:00 Richard Biener :
>> > On Fri, 11 Nov 2016, Yuri Rumyantsev wrote:
>> >
>> >> Richard,
>> >>
>> >> I prepare updated 3 patch with passing additional argument to
>> >> vect_analyze_loop as you proposed (untested).
>> >>
>> >> You wrote:
>> >> tw, I wonder if you can produce a single patch containing just
>> >> epilogue vectorization, that is combine patches 1-3 but rip out
>> >> changes only needed by later patches?
>> >>
>> >> Did you mean that I exclude all support for vectorization
epilogues,
>> >> i.e. exclude from 2-nd patch all non-related changes
>> >> like
>> >>
>> >> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
>> >> index 11863af..32011c1 100644
>> >> --- a/gcc/tree-vect-loop.c
>> >> +++ b/gcc/tree-vect-loop.c
>> >> @@ -1120,6 +1120,12 @@ new_loop_vec_info (struct loop *loop)
>> >>LOOP_

Re: [PATCH, ARM] Enable ldrd/strd peephole rules unconditionally

2016-11-18 Thread Bernd Edlinger
On 11/18/16 12:58, Christophe Lyon wrote:
> On 17 November 2016 at 10:23, Kyrill Tkachov
>  wrote:
>>
>> On 09/11/16 12:58, Bernd Edlinger wrote:
>>>
>>> Hi!
>>>
>>>
>>> This patch enables the ldrd/strd peephole rules unconditionally.
>>>
>>> It is meant to fix cases, where the patch to reduce the sha512
>>> stack usage splits ldrd/strd instructions into separate ldr/str insns,
>>> but is technically independent from the other patch:
>>>
>>> See https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00523.html
>>>
>>> It was necessary to change check_effective_target_arm_prefer_ldrd_strd
>>> to retain the true prefer_ldrd_strd tuning flag.
>>>
>>>
>>> Bootstrapped and reg-tested on arm-linux-gnueabihf.
>>> Is it OK for trunk?
>>
>>
>> This is ok.
>> Thanks,
>> Kyrill
>>
>
> Hi Bernd,
>
> Since you committed this patch (r242549), I'm seeing the new test
> failing on some arm*-linux-gnueabihf configurations:
>
> FAIL:  gcc.target/arm/pr53447-5.c scan-assembler-times ldrd 10
> FAIL:  gcc.target/arm/pr53447-5.c scan-assembler-times strd 9
>
> See 
> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/242549/report-build-info.html
> for a map of failures.
>
> Am I missing something?

Hi Christophe,

as always many thanks for your testing...

I have apparently only looked at the case -mfloat-abi=soft here, which
is what my other patch is going to address.  But all targets with
-mfpu=neon -mfloat-abi=hard can also use vldr.64 instead of ldrd
and vstr.64 instead of strd, which should be accepted as well.

So the attached patch should fix at least most of the fallout.

Is it OK for trunk?


Thanks
Bernd.
2016-11-18  Bernd Edlinger  

	* gcc.target/arm/pr53447-5.c: Fix test expectations for neon-fpu.

Index: gcc/testsuite/gcc.target/arm/pr53447-5.c
===
--- gcc/testsuite/gcc.target/arm/pr53447-5.c	(revision 242588)
+++ gcc/testsuite/gcc.target/arm/pr53447-5.c	(working copy)
@@ -15,5 +15,8 @@ void foo(long long* p)
   p[9] -= p[10];
 }
 
-/* { dg-final { scan-assembler-times "ldrd" 10 } } */
-/* { dg-final { scan-assembler-times "strd" 9 } } */
+/* We accept neon instructions vldr.64 and vstr.64 as well.
+   Note: DejaGnu counts patterns with alternatives twice,
+   so actually there are only 10 loads and 9 stores.  */
+/* { dg-final { scan-assembler-times "(ldrd|vldr\\.64)" 20 } } */
+/* { dg-final { scan-assembler-times "(strd|vstr\\.64)" 18 } } */


Re: [PATCH, vec-tails] Support loop epilogue vectorization

2016-11-18 Thread Christophe Lyon
On 18 November 2016 at 16:46, Yuri Rumyantsev  wrote:
> It is very strange that this test failed on arm, since it requires
> target avx2 to check vectorizer dumps:
>
> /* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" {
> target avx2_runtime } } } */
> /* { dg-final { scan-tree-dump-times "LOOP EPILOGUE VECTORIZED
> \\(VS=16\\)" 2 "vect" { target avx2_runtime } } } */
>
> Could you please clarify what is the reason of the failure?

It's not the scan-dumps that fail, but the execution.
The test calls abort() for some reason.

It will take me a while to rebuild the test manually in the right
debug environment to provide you with more traces.



>
> Thanks.
>
> 2016-11-18 16:20 GMT+03:00 Christophe Lyon :
>> On 15 November 2016 at 15:41, Yuri Rumyantsev  wrote:
>>> Hi All,
>>>
>>> Here is patch for non-masked epilogue vectoriziation.
>>>
>>> Bootstrap and regression testing did not show any new failures.
>>>
>>> Is it OK for trunk?
>>>
>>> Thanks.
>>> Changelog:
>>>
>>> 2016-11-15  Yuri Rumyantsev  
>>>
>>> * params.def (PARAM_VECT_EPILOGUES_NOMASK): New.
>>> * tree-if-conv.c (tree_if_conversion): Make public.
>>> * * tree-if-conv.h: New file.
>>> * tree-vect-data-refs.c (vect_analyze_data_ref_dependences) Avoid
>>> dynamic alias checks for epilogues.
>>> * tree-vect-loop-manip.c (vect_do_peeling): Return created epilog.
>>> * tree-vect-loop.c: include tree-if-conv.h.
>>> (new_loop_vec_info): Add zeroing orig_loop_info field.
>>> (vect_analyze_loop_2): Don't try to enhance alignment for epilogues.
>>> (vect_analyze_loop): Add argument ORIG_LOOP_INFO which is not NULL
>>> if epilogue is vectorized, set up orig_loop_info field of loop_vinfo
>>> using passed argument.
>>> (vect_transform_loop): Check if created epilogue should be returned
>>> for further vectorization with less vf.  If-convert epilogue if
>>> required. Print vectorization success for epilogue.
>>> * tree-vectorizer.c (vectorize_loops): Add epilogue vectorization
>>> if it is required, pass loop_vinfo produced during vectorization of
>>> loop body to vect_analyze_loop.
>>> * tree-vectorizer.h (struct _loop_vec_info): Add new field
>>> orig_loop_info.
>>> (LOOP_VINFO_ORIG_LOOP_INFO): New.
>>> (LOOP_VINFO_EPILOGUE_P): New.
>>> (LOOP_VINFO_ORIG_VECT_FACTOR): New.
>>> (vect_do_peeling): Change prototype to return epilogue.
>>> (vect_analyze_loop): Add argument of loop_vec_info type.
>>> (vect_transform_loop): Return created loop.
>>>
>>> gcc/testsuite/
>>>
>>> * lib/target-supports.exp (check_avx2_hw_available): New.
>>> (check_effective_target_avx2_runtime): New.
>>> * gcc.dg/vect/vect-tail-nomask-1.c: New test.
>>>
>>
>> Hi,
>>
>> This new test fails on arm-none-eabi (using default cpu/fpu/mode):
>>   gcc.dg/vect/vect-tail-nomask-1.c -flto -ffat-lto-objects execution test
>>   gcc.dg/vect/vect-tail-nomask-1.c execution test
>>
>> It does pass on the same target if configured --with-cpu=cortex-a9.
>>
>> Christophe
>>
>>
>>
>>>
>>> 2016-11-14 20:04 GMT+03:00 Richard Biener :
 On November 14, 2016 4:39:40 PM GMT+01:00, Yuri Rumyantsev 
  wrote:
>Richard,
>
>I checked one of the tests designed for epilogue vectorization using
>patches 1 - 3 and found out that build compiler performs vectorization
>of epilogues with --param vect-epilogues-nomask=1 passed:
>
>$ gcc -Ofast -mavx2 t1.c -S --param vect-epilogues-nomask=1 -o
>t1.new-nomask.s -fdump-tree-vect-details
>$ grep VECTORIZED -c t1.c.156t.vect
>4
> Without param only 2 loops are vectorized.
>
>Should I simply add a part of tests related to this feature or I must
>delete all not necessary changes also?

 Please remove all not necessary changes.

 Richard.

>Thanks.
>Yuri.
>
>2016-11-14 16:40 GMT+03:00 Richard Biener :
>> On Mon, 14 Nov 2016, Yuri Rumyantsev wrote:
>>
>>> Richard,
>>>
>>> In my previous patch I forgot to remove couple lines related to aux
>field.
>>> Here is the correct updated patch.
>>
>> Yeah, I noticed.  This patch would be ok for trunk (together with
>> necessary parts from 1 and 2) if all not required parts are removed
>> (and you'd add the testcases covering non-masked tail vect).
>>
>> Thus, can you please produce a single complete patch containing only
>> non-masked epilogue vectoriziation?
>>
>> Thanks,
>> Richard.
>>
>>> Thanks.
>>> Yuri.
>>>
>>> 2016-11-14 15:51 GMT+03:00 Richard Biener :
>>> > On Fri, 11 Nov 2016, Yuri Rumyantsev wrote:
>>> >
>>> >> Richard,
>>> >>
>>> >> I prepare updated 3 patch with passing additional argument to
>>> >> vect_analyze_loop as you proposed (untested).
>>> >>
>>> >> You wrote:
>>> >> tw, I wonder if you can produce a single patch containing just
>>> >> epilogue vectorization, that is combine patches 1-3 but rip out
>>> >> changes only needed by later patches?
>>> >>
>>> >> Did you mean that

Re: Fix PR77881: combine improvement

2016-11-18 Thread Michael Matz
Hi,

On Fri, 18 Nov 2016, Bin.Cheng wrote:

> On Wed, Nov 16, 2016 at 3:05 PM, Andreas Schwab  wrote:
> > On Nov 14 2016, Michael Matz  wrote:
> >
> >>   PR missed-optimization/77881
> >>   * combine.c (simplify_comparison): Remove useless subregs
> >>   also inside the loop, not just after it.
> >>   (make_compound_operation): Recognize some subregs as being
> >>   masking as well.
> >
> > This breaks gcc.c-torture/execute/cbrt.c execution test on aarch64.
> Hi,
> I can confirm that, also new PR opened for tracking.
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78422

See PR78390 for a patch (comment #8) fixing the aarch64 problem.


Ciao,
Michael.


Re: PING [PATCH] enable -fprintf-return-value by default

2016-11-18 Thread Martin Sebor

On 11/17/2016 10:34 PM, Sandra Loosemore wrote:

On 11/16/2016 09:49 AM, Martin Sebor wrote:

I'm looking for an approval of the attached patch.

I've adjusted the documentation based on Sandra's input (i.e.,
documented the negative of the option rather than the positive;
thank you for the review, btw.)

On 11/08/2016 08:13 PM, Martin Sebor wrote:

The -fprintf-return-value optimization has been disabled since
the last time it caused a bootstrap failure on powerpc64le.  With
the underlying problems fixed GCC has bootstrapped fine on all of
powerpc64, powerpc64le and x86_64 and tested with no regressions.
I'd like to re-enable the option.  The attached patch does that.


[snip]

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi(revision 242500)
+++ gcc/doc/invoke.texi(working copy)
@@ -384,7 +384,7 @@ Objective-C and Objective-C++ Dialects}.
 -fno-toplevel-reorder -fno-trapping-math -fno-zero-initialized-in-bss
@gol
 -fomit-frame-pointer -foptimize-sibling-calls @gol
 -fpartial-inlining -fpeel-loops -fpredictive-commoning @gol
--fprefetch-loop-arrays -fprintf-return-value @gol
+-fprefetch-loop-arrays -fno-printf-return-value @gol
 -fprofile-correction @gol
 -fprofile-use -fprofile-use=@var{path} -fprofile-values @gol
 -fprofile-reorder-functions @gol


Please keep this list alphabetized -- the other "-fno-*" options are
sorted as such.  The documentation parts of the patch are OK with that
fixed, but I can't approve changing the default for the option.


I kept the option in the same place on the assumption that it was
the "printf" radix of the name, not the "no-" prefix, that these
were alphabetized by.

But now that you point it out and I've looked more carefully at
some of the other options, I see that in some sections they are
listed strictly alphabetically (-fno-foo after -fmoo) while in
others it's the way you suggest.  AFAICS, the former style looks
prevalent in the C++ Language Options and the in Warning Options,
for example.  The latter seems to be more popular in the
Optimization Options section (though there are counterexamples).

I'm happy to follow either of these conventions as long as its
consistent.  Otherwise, without both kinds of examples to choose
from, I don't think I can trust my brain to remember yet another
rule.

Martin


libgo patch committed: Don't call __go_alloc/__go_free from environment routines

2016-11-18 Thread Ian Lance Taylor
This patch to libgo changes the environment support routines to not
call __go_alloc/__go_free, but to instead use malloc/free.  This code
just needs temporary buffers, it is natural to write it in C because
it calls setenv/unsetenv, and C code can safely call malloc/free but
with the future GC can not safely allocate memory on the Go heap.
Patch bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.
Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 242592)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-2ab785788691ad289f838a0b3a6bc9013d0fc337
+fc4ca600b2fc6de81fd3c4014542d6a50593db1a
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/runtime/go-setenv.c
===
--- libgo/runtime/go-setenv.c   (revision 242581)
+++ libgo/runtime/go-setenv.c   (working copy)
@@ -9,10 +9,7 @@
 #include 
 #include 
 
-#include "go-alloc.h"
 #include "runtime.h"
-#include "arch.h"
-#include "malloc.h"
 
 /* Set the C environment from Go.  This is called by syscall.Setenv.  */
 
@@ -25,7 +22,6 @@ setenv_c (String k, String v)
   unsigned char *kn;
   const byte *vs;
   unsigned char *vn;
-  intgo len;
 
   ks = k.str;
   if (ks == NULL)
@@ -39,25 +35,23 @@ setenv_c (String k, String v)
 
 #ifdef HAVE_SETENV
 
-  if (ks != NULL && ks[k.len] != 0)
+  if (ks[k.len] != 0)
 {
-  // Objects that are explicitly freed must be at least 16 bytes in size,
-  // so that they are not allocated using tiny alloc.
-  len = k.len + 1;
-  if (len < TinySize)
-   len = TinySize;
-  kn = __go_alloc (len);
+  kn = malloc (k.len + 1);
+  if (kn == NULL)
+   runtime_throw ("out of malloc memory");
   __builtin_memcpy (kn, ks, k.len);
+  kn[k.len] = '\0';
   ks = kn;
 }
 
-  if (vs != NULL && vs[v.len] != 0)
+  if (vs[v.len] != 0)
 {
-  len = v.len + 1;
-  if (len < TinySize)
-   len = TinySize;
-  vn = __go_alloc (len);
+  vn = malloc (v.len + 1);
+  if (vn == NULL)
+   runtime_throw ("out of malloc memory");
   __builtin_memcpy (vn, vs, v.len);
+  vn[v.len] = '\0';
   vs = vn;
 }
 
@@ -66,19 +60,20 @@ setenv_c (String k, String v)
 #else /* !defined(HAVE_SETENV) */
 
   len = k.len + v.len + 2;
-  if (len < TinySize)
-len = TinySize;
-  kn = __go_alloc (len);
+  kn = malloc (len);
+  if (kn == NULL)
+runtime_throw ("out of malloc memory");
   __builtin_memcpy (kn, ks, k.len);
   kn[k.len] = '=';
   __builtin_memcpy (kn + k.len + 1, vs, v.len);
   kn[k.len + v.len + 1] = '\0';
   putenv ((char *) kn);
+  kn = NULL; /* putenv takes ownership of the string.  */
 
 #endif /* !defined(HAVE_SETENV) */
 
   if (kn != NULL)
-__go_free (kn);
+free (kn);
   if (vn != NULL)
-__go_free (vn);
+free (vn);
 }
Index: libgo/runtime/go-unsetenv.c
===
--- libgo/runtime/go-unsetenv.c (revision 242581)
+++ libgo/runtime/go-unsetenv.c (working copy)
@@ -9,10 +9,7 @@
 #include 
 #include 
 
-#include "go-alloc.h"
 #include "runtime.h"
-#include "arch.h"
-#include "malloc.h"
 
 /* Unset an environment variable from Go.  This is called by
syscall.Unsetenv.  */
@@ -24,7 +21,6 @@ unsetenv_c (String k)
 {
   const byte *ks;
   unsigned char *kn;
-  intgo len;
 
   ks = k.str;
   if (ks == NULL)
@@ -33,14 +29,11 @@ unsetenv_c (String k)
 
 #ifdef HAVE_UNSETENV
 
-  if (ks != NULL && ks[k.len] != 0)
+  if (ks[k.len] != 0)
 {
-  // Objects that are explicitly freed must be at least 16 bytes in size,
-  // so that they are not allocated using tiny alloc.
-  len = k.len + 1;
-  if (len < TinySize)
-   len = TinySize;
-  kn = __go_alloc (len);
+  kn = malloc (k.len + 1);
+  if (kn == NULL)
+   runtime_throw ("out of malloc memory");
   __builtin_memcpy (kn, ks, k.len);
   ks = kn;
 }
@@ -50,5 +43,5 @@ unsetenv_c (String k)
 #endif /* !defined(HAVE_UNSETENV) */
 
   if (kn != NULL)
-__go_free (kn);
+free (kn);
 }


Re: [PATCH] avoid calling alloca(0)

2016-11-18 Thread Martin Sebor

On 11/18/2016 12:58 AM, Jakub Jelinek wrote:

On Thu, Nov 17, 2016 at 05:32:31PM -0700, Martin Sebor wrote:

The point is warning on an alloca (0) may not be as clear cut as it
might seem.  It's probably a reasonable thing to do on the host, but on
a target, which might be embedded and explicitly overriding the builtin
alloca, warning on alloca (0) is less of a slam dunk.


I confess I haven't heard of such an implementation before but after
some searching I have managed to find a few examples of it online,
such as in QNX or in the BlackBerry SDK, or in the GCC shipped by
Texas Instruments.  For instance:


In the libiberty/alloca.c, I don't see anything special about alloca (0).
Yes, alloca (0) reclaims garbage, but so does alloca (1) and alloca (4035).
alloca (0) just returns NULL after it.  The diffutils link is the same
alloca.c as in libiberty.  Returning NULL or returning a non-unique pointer
for alloca (0) is just fine, it is the same thing as returning NULL or
returning a non-unique pointer from malloc (0).  We definitely don't want
to warn for malloc (0), and IMNSHO don't want to warn for alloca (0),
because it behaves the same.


I disagree.  At a minimum, the return value of malloc(0) is
implementation-defined and so relying on it being either null
or non-null is non-portable and can be a source of subtle bugs.
But malloc(0) has also been known to result from unsigned overflow
which has led to vulnerabilities and exploits (famously by the JPG
COM vulnerability exploit, but there are plenty of others, including
for instance CVE-2016-2851).  Much academic research has been devoted
to this problem and to solutions to detect and prevent it.
The following paper has some good background and references:

https://www.usenix.org/legacy/event/woot10/tech/full_papers/Vanegue.pdf

Coding standards rules have been developed requiring conforming
software to avoid allocating zero bytes for these reasons.  TS
1796, the C Secure Coding Rules technical specification, has such
a requirement.  It was derived from the CERT C Secure Coding rule
MEM04-C. Beware of zero-length allocations:

  https://www.securecoding.cert.org/confluence/x/GQI

The same argument applies to alloca(0) with the added caveat that,
unlike with the other allocation functions, a non-null return value
need not be distinct.

In the absence of a policy or guidelines it's a matter of opinion
whether or not this warning belongs in either -Wall or -Wextra.
Based on the severity of the problems that allocating zero size
has been linked to I think it could be successfully argued that
it should be in -Wall (the problems are obviously more serious
than those that have ever been associated with the -Wunused-type
warnings, for example).  I put it in -Wextra only because I was
trying to be sensitive to the "no false positive" argument.

All this said, before debating the fine details of under which
conditions each of the new warninsg should be enabled, I was
hoping to get comments on the meat of the patch that implements
the warnings.

Martin


[PATCH] Fix PR78413

2016-11-18 Thread Bill Schmidt
Hi,

The if-conversion patch for PR77848 missed a case where an outer loop
should not be versioned for vectorization; this was caught by an assert
in tests recorded in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78413.
This patch fixes the problem by ensuring that both inner and outer loop
latches have a single predecessor before versioning an outer loop.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Is this ok for trunk?

Thanks,
Bill


[gcc]

2016-11-18  Bill Schmidt  

PR tree-optimization/78413
* tree-if-conv.c (versionable_outer_loop_p): Require that both
inner and outer loop latches have single predecessors.

[gcc/testsuite]

2016-11-18  Bill Schmidt  

PR tree-optimization/78413
* gcc.dg/tree-ssa/pr78413.c: New test.


Index: gcc/testsuite/gcc.dg/tree-ssa/pr78413.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr78413.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr78413.c (working copy)
@@ -0,0 +1,35 @@
+/* PR78413.  These previously failed in tree if-conversion due to a loop
+   latch with multiple predecessors that the code did not anticipate.  */
+/* { dg-do compile } */
+/* { dg-options "-O3 -ffast-math" } */
+
+extern long long int llrint(double x);
+int a;
+double b;
+__attribute__((cold)) void decode_init() {
+  int c, d = 0;
+  for (; d < 12; d++) {
+if (d)
+  b = 0;
+c = 0;
+for (; c < 6; c++)
+  a = b ? llrint(b) : 0;
+  }
+}
+
+struct S {
+  _Bool bo;
+};
+int a, bb, c, d;
+void fn1() {
+  do
+do
+  do {
+   struct S *e = (struct S *)1;
+   do
+ bb = a / (e->bo ? 2 : 1);
+   while (bb);
+  } while (0);
+while (d);
+  while (c);
+}
Index: gcc/tree-if-conv.c
===
--- gcc/tree-if-conv.c  (revision 242551)
+++ gcc/tree-if-conv.c  (working copy)
@@ -2575,6 +2575,8 @@ version_loop_for_if_conversion (struct loop *loop)
 - The loop has a single exit.
 - The loop header has a single successor, which is the inner
   loop header.
+- Each of the inner and outer loop latches have a single
+  predecessor.
 - The loop exit block has a single predecessor, which is the
   inner loop's exit block.  */
 
@@ -2586,7 +2588,9 @@ versionable_outer_loop_p (struct loop *loop)
   || loop->inner->next
   || !single_exit (loop)
   || !single_succ_p (loop->header)
-  || single_succ (loop->header) != loop->inner->header)
+  || single_succ (loop->header) != loop->inner->header
+  || !single_pred_p (loop->latch)
+  || !single_pred_p (loop->inner->latch))
 return false;
   
   basic_block outer_exit = single_pred (loop->latch);



Re: [PATCH] avoid calling alloca(0)

2016-11-18 Thread Jakub Jelinek
On Fri, Nov 18, 2016 at 09:21:38AM -0700, Martin Sebor wrote:
> >In the libiberty/alloca.c, I don't see anything special about alloca (0).
> >Yes, alloca (0) reclaims garbage, but so does alloca (1) and alloca (4035).
> >alloca (0) just returns NULL after it.  The diffutils link is the same
> >alloca.c as in libiberty.  Returning NULL or returning a non-unique pointer
> >for alloca (0) is just fine, it is the same thing as returning NULL or
> >returning a non-unique pointer from malloc (0).  We definitely don't want
> >to warn for malloc (0), and IMNSHO don't want to warn for alloca (0),
> >because it behaves the same.
> 
> I disagree.  At a minimum, the return value of malloc(0) is
> implementation-defined and so relying on it being either null
> or non-null is non-portable and can be a source of subtle bugs.

Most apps know what malloc (0) means and treat it correctly, they know they
shouldn't dereference the pointer, because it is either NULL or holds an
array with 0 elements.  I fail to see why you would want to warn.

> But malloc(0) has also been known to result from unsigned overflow
> which has led to vulnerabilities and exploits (famously by the JPG
> COM vulnerability exploit, but there are plenty of others, including
> for instance CVE-2016-2851).  Much academic research has been devoted
> to this problem and to solutions to detect and prevent it.

How is it different from overflowing to 1 or 2 or 27?  What is special on
the value 0?

> In the absence of a policy or guidelines it's a matter of opinion
> whether or not this warning belongs in either -Wall or -Wextra.

It IMHO doesn't belong to either of these.

Jakub


Re: [PATCH] Fix PR78413

2016-11-18 Thread Markus Trippelsdorf
On 2016.11.18 at 10:27 -0600, Bill Schmidt wrote:
> ===
> --- gcc/testsuite/gcc.dg/tree-ssa/pr78413.c   (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr78413.c   (working copy)
> @@ -0,0 +1,35 @@
> +/* PR78413.  These previously failed in tree if-conversion due to a loop
> +   latch with multiple predecessors that the code did not anticipate.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -ffast-math" } */

Please add -fno-strict-aliasing, otherwise fn1() wouldn't ICE.

> +extern long long int llrint(double x);
> +int a;
> +double b;
> +__attribute__((cold)) void decode_init() {
> +  int c, d = 0;
> +  for (; d < 12; d++) {
> +if (d)
> +  b = 0;
> +c = 0;
> +for (; c < 6; c++)
> +  a = b ? llrint(b) : 0;
> +  }
> +}
> +
> +struct S {
> +  _Bool bo;
> +};
> +int a, bb, c, d;
> +void fn1() {
> +  do
> +do
> +  do {
> + struct S *e = (struct S *)1;
> + do
> +   bb = a / (e->bo ? 2 : 1);
> + while (bb);
> +  } while (0);
> +while (d);
> +  while (c);
> +}

-- 
Markus


[COMMITTED] Add myself to MAINTAINERS (Write After Approval).

2016-11-18 Thread Toma Tabacu
Committed as r242595.

Thanks,
Toma Tabacu

Index: ChangeLog
===
--- ChangeLog   (revision 242594)
+++ ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2016-11-18  Toma Tabacu  
+
+   * MAINTAINERS (Write After Approval): Add myself.
+
 2016-11-15  Matthias Klose  
 
* Makefile.def: Remove references to GCJ.
Index: MAINTAINERS
===
--- MAINTAINERS (revision 242594)
+++ MAINTAINERS (working copy)
@@ -587,6 +587,7 @@
 Robert Suchanek

 Andrew Sutton  
 Gabriele Svelto
+Toma Tabacu
 Sriraman Tallam
 Chung-Lin Tang 
 Samuel Tardieu 


Re: [PATCH] Fix PR78413

2016-11-18 Thread Bill Schmidt

> On Nov 18, 2016, at 10:33 AM, Markus Trippelsdorf  
> wrote:
> 
> On 2016.11.18 at 10:27 -0600, Bill Schmidt wrote:
>> ===
>> --- gcc/testsuite/gcc.dg/tree-ssa/pr78413.c  (revision 0)
>> +++ gcc/testsuite/gcc.dg/tree-ssa/pr78413.c  (working copy)
>> @@ -0,0 +1,35 @@
>> +/* PR78413.  These previously failed in tree if-conversion due to a loop
>> +   latch with multiple predecessors that the code did not anticipate.  */
>> +/* { dg-do compile } */
>> +/* { dg-options "-O3 -ffast-math" } */
> 
> Please add -fno-strict-aliasing, otherwise fn1() wouldn't ICE.

Whoops.  Yes indeed, sorry I missed the flag difference for the second failure.

Bill




Re: [PATCH, GCC/ARM, ping] Optional -mthumb for Thumb only targets

2016-11-18 Thread Thomas Preudhomme

On 11/11/16 14:35, Kyrill Tkachov wrote:


On 08/11/16 13:36, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 25/10/16 18:07, Thomas Preudhomme wrote:

Hi,

Currently when a user compiles for a thumb-only target (such as Cortex-M
processors) without specifying the -mthumb option GCC throws the error "target
CPU does not support ARM mode". This is suboptimal from a usability point of
view: the -mthumb could be deduced from the -march or -mcpu option when there is
no ambiguity.

This patch implements this behavior by extending the DRIVER_SELF_SPECS to
automatically append -mthumb to the command line for thumb-only targets. It does
so by checking the last -march option if any is given or the last -mcpu option
otherwise. There is no ordering issue because conflicting -mcpu and -march is
already handled.

Note that the logic cannot be implemented in function arm_option_override
because we need to provide the modified command line to the GCC driver for
finding the right multilib path and the function arm_option_override is executed
too late for that effect.

ChangeLog entries are as follow:

*** gcc/ChangeLog ***

2016-10-18  Terry Guo  
Thomas Preud'homme 

PR target/64802
* common/config/arm/arm-common.c (arm_target_thumb_only): New function.
* config/arm/arm-opts.h: Include arm-flags.h.
(struct arm_arch_core_flag): Define.
(arm_arch_core_flags): Define.
* config/arm/arm-protos.h: Include arm-flags.h.
(FL_NONE, FL_ANY, FL_CO_PROC, FL_ARCH3M, FL_MODE26, FL_MODE32,
FL_ARCH4, FL_ARCH5, FL_THUMB, FL_LDSCHED, FL_STRONG, FL_ARCH5E,
FL_XSCALE, FL_ARCH6, FL_VFPV2, FL_WBUF, FL_ARCH6K, FL_THUMB2, FL_NOTM,
FL_THUMB_DIV, FL_VFPV3, FL_NEON, FL_ARCH7EM, FL_ARCH7, FL_ARM_DIV,
FL_ARCH8, FL_CRC32, FL_SMALLMUL, FL_NO_VOLATILE_CE, FL_IWMMXT,
FL_IWMMXT2, FL_ARCH6KZ, FL2_ARCH8_1, FL2_ARCH8_2, FL2_FP16INST,
FL_TUNE, FL_FOR_ARCH2, FL_FOR_ARCH3, FL_FOR_ARCH3M, FL_FOR_ARCH4,
FL_FOR_ARCH4T, FL_FOR_ARCH5, FL_FOR_ARCH5T, FL_FOR_ARCH5E,
FL_FOR_ARCH5TE, FL_FOR_ARCH5TEJ, FL_FOR_ARCH6, FL_FOR_ARCH6J,
FL_FOR_ARCH6K, FL_FOR_ARCH6Z, FL_FOR_ARCH6ZK, FL_FOR_ARCH6KZ,
FL_FOR_ARCH6T2, FL_FOR_ARCH6M, FL_FOR_ARCH7, FL_FOR_ARCH7A,
FL_FOR_ARCH7VE, FL_FOR_ARCH7R, FL_FOR_ARCH7M, FL_FOR_ARCH7EM,
FL_FOR_ARCH8A, FL2_FOR_ARCH8_1A, FL2_FOR_ARCH8_2A, FL_FOR_ARCH8M_BASE,
FL_FOR_ARCH8M_MAIN, arm_feature_set, ARM_FSET_MAKE,
ARM_FSET_MAKE_CPU1, ARM_FSET_MAKE_CPU2, ARM_FSET_CPU1, ARM_FSET_CPU2,
ARM_FSET_EMPTY, ARM_FSET_ANY, ARM_FSET_HAS_CPU1, ARM_FSET_HAS_CPU2,
ARM_FSET_HAS_CPU, ARM_FSET_ADD_CPU1, ARM_FSET_ADD_CPU2,
ARM_FSET_DEL_CPU1, ARM_FSET_DEL_CPU2, ARM_FSET_UNION, ARM_FSET_INTER,
ARM_FSET_XOR, ARM_FSET_EXCLUDE, ARM_FSET_IS_EMPTY,
ARM_FSET_CPU_SUBSET): Move to ...
* config/arm/arm-flags.h: This new file.
* config/arm/arm.h (TARGET_MODE_SPEC_FUNCTIONS): Define.
(EXTRA_SPEC_FUNCTIONS): Add TARGET_MODE_SPEC_FUNCTIONS to its value.
(TARGET_MODE_SPECS): Define.
(DRIVER_SELF_SPECS): Add TARGET_MODE_SPECS to its value.


*** gcc/testsuite/ChangeLog ***

2016-10-11  Thomas Preud'homme 

PR target/64802
* gcc.target/arm/optional_thumb-1.c: New test.
* gcc.target/arm/optional_thumb-2.c: New test.
* gcc.target/arm/optional_thumb-3.c: New test.


No regression when running the testsuite for -mcpu=cortex-m0 -mthumb,
-mcpu=cortex-m0 -marm and -mcpu=cortex-a8 -marm

Is this ok for trunk?



This looks like a useful usability improvement.
This is ok after a bootstrap on an arm-none-linux-gnueabihf target.

Sorry for the delay,
Kyrill


I've rebased the patch on top of the arm_feature_set type consistency fix [1] 
and committed it. The committed patch is in attachment for reference.


[1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01680.html

Best regards,

Thomas
diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c
index f3b674339a50460d55920ca8d26275a550bbbc1e..473417a2e5f04488197c27ead2b65680bddec274 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -98,6 +98,29 @@ arm_rewrite_mcpu (int argc, const char **argv)
   return arm_rewrite_selected_cpu (argv[argc - 1]);
 }
 
+/* Called by the driver to check whether the target denoted by current
+   command line options is a Thumb-only target.  ARGV is an array of
+   -march and -mcpu values (ie. it contains the rhs after the equal
+   sign) and we use the last one of them to make a decision.  The
+   number of elements in ARGV is given in ARGC.  */
+const char *
+arm_target_thumb_only (int argc, const char **argv)
+{
+  unsigned int opt;
+
+  if (argc)
+{
+  for (opt = 0; opt < (ARRAY_SIZE (arm_arch_core_flags)); opt++)
+	if ((strcmp (argv[argc - 1], arm_arch_core_flags[opt].name) == 0)
+	&& !ARM_FSET_HAS_CPU1(arm_arch_core_flags[opt].flags, FL_N

Re: [PATCH] substring_loc info needs default track-macro-expansion (PR preprocessor/78324)

2016-11-18 Thread Martin Sebor

Martin: are the changes to your test cases OK by you, or is there
a better way to rewrite them?


Thanks for looking into it!

Since the purpose of the test_sprintf_note function in the test is
to verify the location of the caret within the warnings I think we
should keep it if it's possible.  Would either removing the P macro
or moving the function to a different file that doesn't use the
-ftrack-macro-expansion=0 option work?

Martin


diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c
index 5779a95..b6a6011 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c
@@ -181,13 +181,13 @@ void test_sprintf_note (void)
   /* Diagnostic column numbers are 1-based.  */

   P (buffer (0),/* { dg-message "format output 4 bytes into a 
destination of size 0" } */
- "%c%s%i", '1', "2", 3);/* { dg-warning "7:.%c. directive writing 1 byte 
into a region of size 0" } */
+ "%c%s%i", '1', "2", 3);/* { dg-warning ".%c. directive writing 1 byte into 
a region of size 0" } */

   P (buffer (1),/* { dg-message "format output 6 bytes into a 
destination of size 1" } */
- "%c%s%i", '1', "23", 45);  /* { dg-warning "9:.%s. directive writing 2 bytes 
into a region of size 0" } */
+ "%c%s%i", '1', "23", 45);  /* { dg-warning ".%s. directive writing 2 bytes 
into a region of size 0" } */

   P (buffer (2),/* { dg-message "format output 6 bytes into a 
destination of size 2" } */
- "%c%s%i", '1', "2", 345);  /* { dg-warning "11:.%i. directive writing 3 bytes 
into a region of size 0" } */
+ "%c%s%i", '1', "2", 345);  /* { dg-warning ".%i. directive writing 3 bytes 
into a region of size 0" } */

   /* It would be nice if the caret in the location range for the format
  string below could be made to point at the closing quote of the format
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c 
b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c
index faa5806..b587d00 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Wformat -Wformat-length=1 -fdiagnostics-show-caret 
-ftrack-macro-expansion=0" } */
+/* { dg-options "-Wformat -Wformat-length=1 -fdiagnostics-show-caret" } */

 extern int sprintf (char*, const char*, ...);






Re: [PATCH] avoid calling alloca(0)

2016-11-18 Thread Eric Gallager
On 11/18/16, Jakub Jelinek  wrote:
> On Fri, Nov 18, 2016 at 09:21:38AM -0700, Martin Sebor wrote:
>> >In the libiberty/alloca.c, I don't see anything special about alloca
>> > (0).
>> >Yes, alloca (0) reclaims garbage, but so does alloca (1) and alloca
>> > (4035).
>> >alloca (0) just returns NULL after it.  The diffutils link is the same
>> >alloca.c as in libiberty.  Returning NULL or returning a non-unique
>> > pointer
>> >for alloca (0) is just fine, it is the same thing as returning NULL or
>> >returning a non-unique pointer from malloc (0).  We definitely don't
>> > want
>> >to warn for malloc (0), and IMNSHO don't want to warn for alloca (0),
>> >because it behaves the same.
>>
>> I disagree.  At a minimum, the return value of malloc(0) is
>> implementation-defined and so relying on it being either null
>> or non-null is non-portable and can be a source of subtle bugs.
>
> Most apps know what malloc (0) means and treat it correctly, they know they
> shouldn't dereference the pointer, because it is either NULL or holds an
> array with 0 elements.  I fail to see why you would want to warn.
>


Just as a reference point, my old version of the clang static analyzer
warns for malloc(0); but not alloca(0); though. For example:


$ cat alloc_0.c
#include 
#include 

void fn(void)
{
char *ptr0 = (char *)malloc(0);
void *ptr1 = alloca(0);
*ptr0 = *(char *)ptr1;
}

$ clang -Wall -Wextra -pedantic --analyze -c alloc_0.c
alloc_0.c:6:23: warning: Call to 'malloc' has an allocation size of 0 bytes
char *ptr0 = (char *)malloc(0);
 ^  ~
1 warning generated.


Doing some more Googling on the topic finds debates as to whether this
warning is warranted or not, but it seems like people are pretty used
to dealing with it from clang already, so they probably wouldn't mind
too much if gcc started being consistent with it.


Re: [PATCH PR78114]Refine gfortran.dg/vect/fast-math-mgrid-resid.f

2016-11-18 Thread Michael Matz
Hi,

On Thu, 17 Nov 2016, Bin.Cheng wrote:

> B) Depending on ilp, I think below test strings fail for long time with 
> haswell:
> ! { dg-final { scan-tree-dump-times "Executing predictive commoning
> without unrolling" 1 "pcom" { target lp64 } } }
> ! { dg-final { scan-tree-dump-times "Executing predictive commoning
> without unrolling" 2 "pcom" { target ia32 } } }
> Because vectorizer choose vf==4 in this case, and there is no
> predictive commoning opportunities at all.
> Also the newly added test string fails in this case too because the
> prolog peeled iterates more than 1 times.

Btw, this probably means that on haswell (or other archs with vf==4) mgrid 
is slower than necessary.  On mgrid you really really want predictive 
commoning to happen.  Vectorization isn't that interesting there.


Ciao,
Michael.


[PATCH GCC]Move simplification of (A == C1) ? A : C2 to match.pd

2016-11-18 Thread Bin Cheng
Hi,
This is a follow up patch for 
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01898.html
It moves remaining simplification for (A == C1) ? A : C2 in 
fold_cond_expr_with_comparison
to match.pd.  Bootstrap and test on x86_64 and AArch64, is it OK?

Thanks,
bin

2016-11-17  Bin Cheng  

* fold-const.c (fold_cond_expr_with_comparison): Move simplification
for A == C1 ? A : C2 to below.
* match.pd: Move from above to here:
(cond (eq (convert1? x) c1) (convert2? x) c2)
  -> (cond (eq x c1) c1 c2).diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 1e61ccf..1877dac 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -5210,19 +5210,6 @@ fold_cond_expr_with_comparison (location_t loc, tree 
type,
}
 }
 
-  /* If this is A == C1 ? A : C2 with C1 and C2 constant integers,
- we simplify it into A == C1 ? C1 : C2.  */
-
-  if (comp_code == EQ_EXPR
-  && INTEGRAL_TYPE_P (type)
-  && TREE_CODE (arg01) == INTEGER_CST
-  && TREE_CODE (arg1) != INTEGER_CST
-  && TREE_CODE (arg2) == INTEGER_CST)
-{
-  arg1 = fold_convert_loc (loc, type, arg01);
-  return fold_build3_loc (loc, COND_EXPR, type, arg0, arg1, arg2);
-}
-
   return NULL_TREE;
 }
 
diff --git a/gcc/match.pd b/gcc/match.pd
index 4beac4e..a8d94de 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1939,15 +1939,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* Simplification moved from fold_cond_expr_with_comparison.  It may also
be extended.  */
-/* (cond (cmp (convert1? x) c1) (convert2? x) c2) -> (minmax (x c)) if:
+/* This pattern implements two kinds simplification:
+
+   Case 1)
+   (cond (cmp (convert1? x) c1) (convert2? x) c2) -> (minmax (x c)) if:
  1) Conversions are type widening from smaller type.
  2) Const c1 equals to c2 after canonicalizing comparison.
  3) Comparison has tree code LT, LE, GT or GE.
This specific pattern is needed when (cmp (convert x) c) may not
be simplified by comparison patterns because of multiple uses of
x.  It also makes sense here because simplifying across multiple
-   referred var is always benefitial for complicated cases.  */
-(for cmp (lt le gt ge)
+   referred var is always benefitial for complicated cases.
+
+   Case 2)
+   (cond (eq (convert1? x) c1) (convert2? x) c2) -> (cond (eq x c1) c1 c2).  */
+(for cmp (lt le gt ge eq)
  (simplify
   (cond (cmp@0 (convert1? @1) INTEGER_CST@3) (convert2? @1) INTEGER_CST@2)
   (with
@@ -1966,37 +1972,45 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 && (TYPE_UNSIGNED (from_type)
 || TYPE_SIGN (c2_type) == TYPE_SIGN (from_type)
{
-if (wi::to_widest (@3) == (wi::to_widest (@2) - 1))
-  {
-/* X <= Y - 1 equals to X < Y.  */
-if (cmp_code == LE_EXPR)
-  code = LT_EXPR;
-/* X > Y - 1 equals to X >= Y.  */
-if (cmp_code == GT_EXPR)
-  code = GE_EXPR;
-  }
-if (wi::to_widest (@3) == (wi::to_widest (@2) + 1))
-  {
-/* X < Y + 1 equals to X <= Y.  */
-if (cmp_code == LT_EXPR)
-  code = LE_EXPR;
-/* X >= Y + 1 equals to X > Y.  */
-if (cmp_code == GE_EXPR)
-  code = GT_EXPR;
-  }
-if (code != cmp_code || wi::to_widest (@2) == wi::to_widest (@3))
+if (code != EQ_EXPR)
   {
-if (cmp_code == LT_EXPR || cmp_code == LE_EXPR)
-  code = MIN_EXPR;
-if (cmp_code == GT_EXPR || cmp_code == GE_EXPR)
-  code = MAX_EXPR;
+if (wi::to_widest (@3) == (wi::to_widest (@2) - 1))
+  {
+/* X <= Y - 1 equals to X < Y.  */
+if (cmp_code == LE_EXPR)
+  code = LT_EXPR;
+/* X > Y - 1 equals to X >= Y.  */
+if (cmp_code == GT_EXPR)
+  code = GE_EXPR;
+  }
+if (wi::to_widest (@3) == (wi::to_widest (@2) + 1))
+  {
+/* X < Y + 1 equals to X <= Y.  */
+if (cmp_code == LT_EXPR)
+  code = LE_EXPR;
+/* X >= Y + 1 equals to X > Y.  */
+if (cmp_code == GE_EXPR)
+  code = GT_EXPR;
+  }
+if (code != cmp_code || wi::to_widest (@2) == wi::to_widest (@3))
+  {
+if (cmp_code == LT_EXPR || cmp_code == LE_EXPR)
+  code = MIN_EXPR;
+if (cmp_code == GT_EXPR || cmp_code == GE_EXPR)
+  code = MAX_EXPR;
+  }
   }
+/* Can do A == C1 ? A : C2  ->  A == C1 ? C1 : C2?  */
+else if (!int_fits_type_p (@3, from_type))
+  code = ERROR_MARK;
}
}
(if (code == MAX_EXPR)
 (convert (max @1 (convert:from_type @2)))
 (if (code == MIN_EXPR)
- (convert (min @1 (convert:from_type @2
+ (convert (min @1 (convert

[PATCH GCC]Simplify (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2)

2016-11-18 Thread Bin Cheng
Hi,
This is a rework of https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02007.html.  
Though review comments suggested it could be merged with last kind 
simplification
of fold_cond_expr_with_comparison, it's not really applicable.  As a matter of 
fact, 
the suggestion stands for patch 
@https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02005.html.
I had previous patch (https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01898.html)
moving fold_cond_expr_with_comparison to match.pd pattern and incorporated
that patch into it.  For this one, the rework is trivial, just renames several 
variable
tags as suggested.  Bootstrap and test on x86_64 and AArch64, is it OK?

Thanks,
bin

2016-11-17  Bin Cheng  

* match.pd: Add new pattern:
(cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2).

gcc/testsuite/ChangeLog
2016-11-17  Bin Cheng  

* gcc.dg/fold-bopcond-1.c: New test.
* gcc.dg/fold-bopcond-2.c: New test.diff --git a/gcc/match.pd b/gcc/match.pd
index a8d94de..e47f77a 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2012,6 +2012,107 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (code == EQ_EXPR)
   (cond (cmp @1 (convert:from_type @3)) (convert:from_type @3) @2)))
 
+/* (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2) if:
+
+ 1) OP is PLUS or MINUS.
+ 2) CMP is LT, LE, GT or GE.
+ 3) C3 == (C1 op C2), and computation doesn't have undefined behavior.
+
+   This pattern also handles special cases like:
+
+ A) Operand x is a unsigned to signed type conversion and c1 is
+   integer zero.  In this case,
+ (signed type)x  < 0  <=>  x  > MAX_VAL(signed type)
+ (signed type)x >= 0  <=>  x <= MAX_VAL(signed type)
+ B) Const c1 may not equal to (C3 op' C2).  In this case we also
+   check equality for (c1+1) and (c1-1) by adjusting comparison
+   code.
+
+   TODO: Though signed type is handled by this pattern, it cannot be
+   simplified at the moment because C standard requires additional
+   type promotion.  In order to match&simplify it here, the IR needs
+   to be cleaned up by other optimizers, i.e, VRP.  */
+(for op (plus minus)
+ (for cmp (lt le gt ge)
+  (simplify
+   (cond (cmp (convert? @X) INTEGER_CST@1) (op @X INTEGER_CST@2) INTEGER_CST@3)
+   (with { tree from_type = TREE_TYPE (@X), to_type = TREE_TYPE (@1); }
+(if (types_match (from_type, to_type)
+/* Check if it is special case A).  */
+|| (TYPE_UNSIGNED (from_type)
+&& !TYPE_UNSIGNED (to_type)
+&& TYPE_PRECISION (from_type) == TYPE_PRECISION (to_type)
+&& integer_zerop (@1)
+&& (cmp == LT_EXPR || cmp == GE_EXPR)))
+ (with
+  {
+   bool overflow = false;
+   enum tree_code code, cmp_code = cmp;
+   tree real_c1, c1 = @1, c2 = @2, c3 = @3;
+   tree op_type = TREE_TYPE (@X);
+   signop sgn = TYPE_SIGN (op_type);
+   widest_int wmin = widest_int::from (wi::min_value (op_type), sgn);
+   widest_int wmax = widest_int::from (wi::max_value (op_type), sgn);
+
+   /* Handle special case A), given x of unsigned type:
+   ((signed type)x  < 0) <=> (x  > MAX_VAL(signed type))
+   ((signed type)x >= 0) <=> (x <= MAX_VAL(signed type))  */
+   if (!types_match (from_type, to_type))
+ {
+   if (cmp_code == LT_EXPR)
+ cmp_code = GT_EXPR;
+   if (cmp_code == GE_EXPR)
+ cmp_code = LE_EXPR;
+   c1 = wide_int_to_tree (op_type, wi::max_value (to_type));
+ }
+   /* To simplify this pattern, we require c3 = (c1 op c2).  Here we
+  compute (c3 op' c2) and check if it equals to c1 with op' being
+  the inverted operator of op.  Make sure overflow doesn't happen
+  if it is undefined.  */
+   if (op == PLUS_EXPR)
+ real_c1 = wide_int_to_tree (op_type,
+ wi::sub (c3, c2, sgn, &overflow));
+   else
+ real_c1 = wide_int_to_tree (op_type,
+ wi::add (c3, c2, sgn, &overflow));
+   code = cmp_code;
+   if (!overflow || !TYPE_OVERFLOW_UNDEFINED (op_type))
+ {
+   /* Check if c1 equals to real_c1.  Boundary condition is handled
+  by adjusting comparison operation if necessary.  */
+   if (wi::to_widest (c1) == (wi::to_widest (real_c1) - 1))
+ {
+   /* X <= Y - 1 equals to X < Y.  */
+   if (cmp_code == LE_EXPR)
+ code = LT_EXPR;
+   /* X > Y - 1 equals to X >= Y.  */
+   if (cmp_code == GT_EXPR)
+ code = GE_EXPR;
+ }
+   if (wi::to_widest (c1) == (wi::to_widest (real_c1) + 1))
+ {
+   /* X < Y + 1 equals to X <= Y.  */
+   if (cmp_code == LT_EXPR)
+ code = LE_EXPR;
+   /* X >= Y + 1 equals to X > Y.  */
+   if (cmp_code == GE_EXPR)
+ 

Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-18 Thread Mike Stump
On Nov 18, 2016, at 3:13 AM, Iain Sandoe  wrote:
> 
> Thanks, at least I’m not going completely crazy ;-)

I'll just note for completeness that Jeff also couldn't explain a failure of 
your latest patch.  If you run into one, let me know.

> OK now for trunk?

Ok.

> open branches?

Ok.



Re: [PATCH PR78114]Refine gfortran.dg/vect/fast-math-mgrid-resid.f

2016-11-18 Thread Bin.Cheng
On Fri, Nov 18, 2016 at 4:52 PM, Michael Matz  wrote:
> Hi,
>
> On Thu, 17 Nov 2016, Bin.Cheng wrote:
>
>> B) Depending on ilp, I think below test strings fail for long time with 
>> haswell:
>> ! { dg-final { scan-tree-dump-times "Executing predictive commoning
>> without unrolling" 1 "pcom" { target lp64 } } }
>> ! { dg-final { scan-tree-dump-times "Executing predictive commoning
>> without unrolling" 2 "pcom" { target ia32 } } }
>> Because vectorizer choose vf==4 in this case, and there is no
>> predictive commoning opportunities at all.
>> Also the newly added test string fails in this case too because the
>> prolog peeled iterates more than 1 times.
>
> Btw, this probably means that on haswell (or other archs with vf==4) mgrid
> is slower than necessary.  On mgrid you really really want predictive
> commoning to happen.  Vectorization isn't that interesting there.
Interesting, I will check if there is difference between 2/4 vf.  we
do have cases that smaller vf is better and should be chosen, though
for different reasons.

Thanks,
bin
>
>
> Ciao,
> Michael.


[PATCH, Fortran, pr78395, v1] [OOP] error on polymorphic assignment

2016-11-18 Thread Andre Vehreschild
Hi all,

attached patch fixes the issue which was given by nesting calls to typebound
procedures. The expression of the inner typebound procedure call was resolved
correctly, but in the case of it's having a class type the ref-list was
discarded. Leaving the list of references untouched, resolves the wrong
error-message and generates correct code.

When checking the shortened example in comment #3 one gets a segfault, because
v6 is not allocated explicitly. The initial example made sure, that v6 was
allocated. Reading through the standard, I did not find, whether the
auto-allocation is applicable here. I therefore have extended the testcase by
an allocate(v6). Dominique pointed out, that there are already some prs for
adding an on-demand -fcheck=something runtime check for not allocated objects.
But that does not solve the question, whether v6 should be auto-allocated
when assigned by a typebound-procedure (ifort and cray need v6 allocated do,
i.e., they don't auto-allocate). Btw, when using the in gcc-7 available
polymorphic assign, then v6 is actually auto-allocated and the program runs
fine. So what are your opinions on the auto-allocation issue?

This patch fixes the wrong error messages in both gcc-7 and gcc-6.
Bootstraped and regtested on x86_64-linux/F23 for gcc-7 and -6. Ok for trunk
and gcc-6?

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
gcc/testsuite/ChangeLog:

2016-11-18  Andre Vehreschild  

PR fortran/78395
* gfortran.dg/typebound_operator_21.f03: New test.


gcc/fortran/ChangeLog:

2016-11-18  Andre Vehreschild  

PR fortran/78395
* resolve.c (resolve_typebound_function): Prevent stripping of refs,
when the base-expression is a class' typed one.

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 825bb12..589a673 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -6140,7 +6140,7 @@ resolve_typebound_function (gfc_expr* e)
 	  gfc_free_ref_list (class_ref->next);
 	  class_ref->next = NULL;
 	}
-  else if (e->ref && !class_ref)
+  else if (e->ref && !class_ref && expr->ts.type != BT_CLASS)
 	{
 	  gfc_free_ref_list (e->ref);
 	  e->ref = NULL;
diff --git a/gcc/testsuite/gfortran.dg/typebound_operator_21.f03 b/gcc/testsuite/gfortran.dg/typebound_operator_21.f03
new file mode 100644
index 000..ea374a1
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/typebound_operator_21.f03
@@ -0,0 +1,78 @@
+! { dg-do run }
+!
+! Test that pr78395 is fixed.
+! Contributed by Chris and Janus Weil
+
+module types_mod
+  implicit none
+
+  type, public :: t1
+integer :: a
+  contains
+procedure :: get_t2
+  end type
+
+  type, public :: t2
+integer :: b
+  contains
+procedure, pass(rhs) :: mul2
+procedure :: assign
+generic :: operator(*) => mul2
+generic :: assignment(=) => assign
+  end type
+
+contains
+
+  function get_t2(this)
+class(t1), intent(in) :: this
+class(t2), allocatable :: get_t2
+type(t2), allocatable :: local
+allocate(local)
+local%b = this%a
+call move_alloc(local, get_t2)
+  end function
+
+  function mul2(lhs, rhs)
+class(t2), intent(in) :: rhs
+integer, intent(in) :: lhs
+class(t2), allocatable :: mul2
+type(t2), allocatable :: local
+allocate(local)
+local%b = rhs%b*lhs
+call move_alloc(local, mul2)
+  end function
+
+  subroutine assign(this, rhs)
+class(t2), intent(out) :: this
+class(t2), intent(in)  :: rhs
+select type(rhs)
+type is(t2)
+  this%b = rhs%b
+class default
+  error stop
+end select
+  end subroutine
+
+end module
+
+
+program minimal
+  use types_mod
+  implicit none
+
+  class(t1), allocatable :: v4
+  class(t2), allocatable :: v6
+
+  allocate(v4, source=t1(4))
+  allocate(v6)
+  v6 = 3 * v4%get_t2() 
+
+  select type (v6)
+type is (t2)
+  if (v6%b /= 12) error stop
+class default
+  error stop
+  end select
+  deallocate(v4, v6)
+end
+


Re: [PATCH, Fortran, pr78395, v1] [OOP] error on polymorphic assignment

2016-11-18 Thread Thomas Koenig

Hi Andre,



Btw, when using the in gcc-7 available
polymorphic assign, then v6 is actually auto-allocated and the program runs
fine. So what are your opinions on the auto-allocation issue?


It is my experience that such questions can speedily and correctly
be answered by the regulars on comp.lang.fortran.

I have no opinion either way :-)

Regards

Thomas



Re: [PATCH] Fix combine's make_extraction (PR rtl-optimization/78378)

2016-11-18 Thread Michael Matz
Hi,

On Wed, 16 Nov 2016, Jakub Jelinek wrote:

> If inner is a MEM, make_extraction requires that pos is a multiple of 
> bytes and deals with offsetting it.  Or otherwise requires that pos is a 
> multiple of BITS_PER_WORD and for REG inner it handles that too.  But if 
> inner is something different, it calls just force_to_mode to the target 
> mode, which only really works if pos is 0.

This should also fix the aarch64 fail for gcc.c-torture/execute/cbrt.c .  
(At least I came up with the same patch in PR78390)


Ciao,
Michael.


Patch ping

2016-11-18 Thread Jakub Jelinek
Hi!

I'd like to ping 2 patches:

http://gcc.gnu.org/ml/gcc-patches/2016-11/msg01074.html
- C++ ABI - mangling of TLS aux symbols; either the posted
  patch or one with if (abi_version_at_least (11))

http://gcc.gnu.org/ml/gcc-patches/2016-11/msg00351.html
- DWARF Solaris bootstrap fix

Jakub


Re: [PATCH] avoid calling alloca(0)

2016-11-18 Thread Martin Sebor

On 11/18/2016 09:29 AM, Jakub Jelinek wrote:

On Fri, Nov 18, 2016 at 09:21:38AM -0700, Martin Sebor wrote:

In the libiberty/alloca.c, I don't see anything special about alloca (0).
Yes, alloca (0) reclaims garbage, but so does alloca (1) and alloca (4035).
alloca (0) just returns NULL after it.  The diffutils link is the same
alloca.c as in libiberty.  Returning NULL or returning a non-unique pointer
for alloca (0) is just fine, it is the same thing as returning NULL or
returning a non-unique pointer from malloc (0).  We definitely don't want
to warn for malloc (0), and IMNSHO don't want to warn for alloca (0),
because it behaves the same.


I disagree.  At a minimum, the return value of malloc(0) is
implementation-defined and so relying on it being either null
or non-null is non-portable and can be a source of subtle bugs.


Most apps know what malloc (0) means and treat it correctly, they know they
shouldn't dereference the pointer, because it is either NULL or holds an
array with 0 elements.  I fail to see why you would want to warn.


Because people make mistakes and warnings help us avoid them (isn't
that obvious?)  Just because we get it right most of the time doesn't
mean we get right all of the time.  The papers and exploits I pointed
to should provide ample evidence that zero allocations are a problem
that can have serious (and costly) consequences.  We (i.e., Red Hat)
recognize this risk and have made it our goal to help stem some of
these problems.


But malloc(0) has also been known to result from unsigned overflow
which has led to vulnerabilities and exploits (famously by the JPG
COM vulnerability exploit, but there are plenty of others, including
for instance CVE-2016-2851).  Much academic research has been devoted
to this problem and to solutions to detect and prevent it.


How is it different from overflowing to 1 or 2 or 27?  What is special on
the value 0?


It's a case that, unlike the others, can be readily detected.  It
would be nice to detect the others as well but GCC can't do that
yet.  That doesn't mean we shouldn't try to detect at least the
small subset for now.  It doesn't have to be perfect for it to be
useful.




In the absence of a policy or guidelines it's a matter of opinion
whether or not this warning belongs in either -Wall or -Wextra.


It IMHO doesn't belong to either of these.


You've made that quite clear.  I struggle to reconcile your
position in this case with the one in debate about the
-Wimplicit-fallthrough option where you insisted on the exact
opposite despite the overwhelming number of false positives
caused by it.  Why is it appropriate for that option to be in
-Wextra and not this one?

Martin


Re: [PATCH, Darwin] Fix PR57438 by avoiding empty function bodies and trailing labels.

2016-11-18 Thread Mike Stump
On Nov 18, 2016, at 4:33 AM, Iain Sandoe  wrote:
> 
> As discussed on IRC, I was under the impression that it is desired to move 
> away from #ifdef towards if() and  I have been adding those where locally 
> things have been touched - in this case it was only partially possible.
> 
> However, FAOD - you pointed out that for the RS6000 back-end #ifdef are still 
> preferred,

Shudder.  We can encourage anyone that likes #if, to like if () instead.

> OK now for trunk?

Ok; I'm pretty sure that change can only impact darwin.  If you wanted to 
reduce the test case to 3 cases, I think that would also show the problem that 
show that your patch fixes it, ok with such a change, if you want.

> Open branches?

I'm fine with back porting.



Re: [PATCH, Darwin] Fix PR57438 by avoiding empty function bodies and trailing labels.

2016-11-18 Thread Iain Sandoe
Hi Mike,

> On 18 Nov 2016, at 17:16, Mike Stump  wrote:
> 
> On Nov 18, 2016, at 4:33 AM, Iain Sandoe  wrote:
>> 
>> As discussed on IRC, I was under the impression that it is desired to move 
>> away from #ifdef towards if() and  I have been adding those where locally 
>> things have been touched - in this case it was only partially possible.
>> 
>> However, FAOD - you pointed out that for the RS6000 back-end #ifdef are 
>> still preferred,
> 
> Shudder.  We can encourage anyone that likes #if, to like if () instead.
> 
>> OK now for trunk?
> 
> Ok; I'm pretty sure that change can only impact darwin.  If you wanted to 
> reduce the test case to 3 cases, I think that would also show the problem 
> that show that your patch fixes it, ok with such a change, if you want.

I’d like to do that;  is there a way to force a jump table for a limited set of 
cases? 
(the example was about the smallest I could get where GCC elected to produce a 
jump table instead of branches)

Iain

> 
>> Open branches?
> 
> I'm fine with back porting.
> 



Re: [PATCH v2][PR libgfortran/78314] Fix ieee_support_halting

2016-11-18 Thread Szabolcs Nagy
On 16/11/16 16:53, Szabolcs Nagy wrote:
> ieee_support_halting only checked the availability of status
> flags, not trapping support.  On some targets the later can
> only be checked at runtime: feenableexcept reports if
> enabling traps failed.
> 
> So check trapping support by enabling/disabling it.
> 
> Updated the test that enabled trapping to check if it is
> supported.
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.
> 
> gcc/testsuite/
> 2016-11-16  Szabolcs Nagy  
> 
>   PR libgfortran/78314
>   * gfortran.dg/ieee/ieee_6.f90: Use ieee_support_halting.
> 
> libgfortran/
> 2016-11-16  Szabolcs Nagy  
> 
>   PR libgfortran/78314
>   * config/fpu-glibc.h (support_fpu_trap): Use feenableexcept.
> 

it seems this broke ieee_8.f90 which tests compile time
vs runtime value of ieee_support_halting

if fortran needs this, then support_halting should be
always false on arm and aarch64.

but i'm not familiar enough with fortran to tell if
there is some better workaround.




Re: [PATCH] DWARF: make signedness explicit for enumerator const values

2016-11-18 Thread Pierre-Marie de Rodat

On 11/14/2016 01:05 PM, Mark Wielaard wrote:

You can either choose a signed/unsigned form not giving the consumer
a hint about the size of the underlying constant value or one of the
sized data forms that don't give a hint about the value
representation/signedness. So in both cases the consumer looses
without an actual (base) type telling them how to interpret the
constant.


I see, thank you for the explanation! :-)

I agree with you, so I’ll revise to instead add a DW_AT_type attribute 
to enumeration DIEs to point to a base type.


--
Pierre-Marie de Rodat


Re: [fixincludes] Fix macOS 10.12 and (PR sanitizer/78267)

2016-11-18 Thread Bruce Korb
I think restoring bootstrap is likely a good thing.

On Fri, Nov 18, 2016 at 2:45 AM, Rainer Orth
 wrote:
>
> I guess this is ok for mainline now to restore bootstrap?


Re: [PATCH] debug/PR78112: remove recent duplicates for DW_TAG_subprogram attributes

2016-11-18 Thread Pierre-Marie de Rodat

Hi Christophe,

On 11/18/2016 03:03 PM, Christophe Lyon wrote:

Hi,

Part of the new testcase added with this commit fails on arm* targets:
  g++.dg/pr78112.C   scan-assembler-times DW_AT_object_pointer 37


Thank you for the report.

As I said on the bugzilla 
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78112#c17), I reproduce 
inconsistent results on this even on a very reduced testcase. So I 
wonder if I should just remove this testcase.


--
Pierre-Marie de Rodat


Re: [PATCH] avoid calling alloca(0)

2016-11-18 Thread Jakub Jelinek
On Fri, Nov 18, 2016 at 10:14:09AM -0700, Martin Sebor wrote:
> Because people make mistakes and warnings help us avoid them (isn't
> that obvious?)  Just because we get it right most of the time doesn't
> mean we get right all of the time.  The papers and exploits I pointed
> to should provide ample evidence that zero allocations are a problem
> that can have serious (and costly) consequences.  We (i.e., Red Hat)
> recognize this risk and have made it our goal to help stem some of
> these problems.

Are you talking about cases where user writes malloc (0) explicitly, or
where user writes malloc (n); and the optimizers figure out n is 0,
either always, or e.g. because the compiler decided to duplicate the code
and and in one of the branches it proves it is 0, or you want to add a
runtime warning when malloc is called with 0?
E.g. I don't see how writing malloc (0) in the code should have anything
to do with any overflows.  The cases where jump threading creates cases
where we call functions with constant arguments and we then warn on them
is also problematic, often such code is even dead except the compiler is not
able to prove it.

> >It IMHO doesn't belong to either of these.
> 
> You've made that quite clear.  I struggle to reconcile your
> position in this case with the one in debate about the
> -Wimplicit-fallthrough option where you insisted on the exact
> opposite despite the overwhelming number of false positives
> caused by it.  Why is it appropriate for that option to be in
> -Wextra and not this one?

It also matters what one can do to tell the compiler the code is fine.
For e.g. -Wimplicit-fallthrough and many other warnings one can add
a comment, or attribute, etc. to get the warning out of the way.
But the patch you've posted for the alloca (0) stuff contained
changes of n to n + !n, so the warning basically asks people to slow
down their code for no reason, just to get the warning out of the way.

Jakub


Re: [PATCH, Darwin] Fix PR57438 by avoiding empty function bodies and trailing labels.

2016-11-18 Thread Segher Boessenkool
On Fri, Nov 18, 2016 at 05:19:31PM +, Iain Sandoe wrote:
> I’d like to do that;  is there a way to force a jump table for a limited set 
> of cases? 
> (the example was about the smallest I could get where GCC elected to produce 
> a jump table instead of branches)

--param case-values-threshold ?


Segher


Re: Import libcilkrts Build 4467 (PR target/68945)

2016-11-18 Thread Ilya Verbin
2016-11-17 20:01 GMT+03:00 Jeff Law :
> On 11/17/2016 09:56 AM, Ilya Verbin wrote:
>>
>> 2016-11-17 18:50 GMT+03:00 Rainer Orth :
>>>
>>> Rainer Orth  writes:
>>>
 I happened to notice that my libcilkrts SPARC port has been applied
 upstream.  So to reach closure on this issue for the GCC 7 release, I'd
 like to import upstream into mainline which seems to be covered by the
 free-for-all clause in https://gcc.gnu.org/svnwrite.html#policies, even
 though https://gcc.gnu.org/codingconventions.html#upstream lists nothing
 specific and we have no listed maintainer.
>>>
>>>
>>> I initially used Ilya's intel.com address, which bounced.  Now using the
>>> current address listed in MAINTAINERS...
>>
>>
>> Yeah, I don't work for Intel anymore. And I'm not a libcilkrts
>> maintainer, so I can't approve it.
>
> Do you want to be?  IIRC I was going to nominate you, but held off knowing
> your situation was going to change.
>
> If you're interested in maintainer positions, I can certainly still nominate
> you.

I have little experience with this library, and no longer have a
connection with Cilk developers an Intel, so I'm not interested.

  -- Ilya


Re: PING [PATCH] enable -fprintf-return-value by default

2016-11-18 Thread Sandra Loosemore

On 11/18/2016 09:01 AM, Martin Sebor wrote:

On 11/17/2016 10:34 PM, Sandra Loosemore wrote:

On 11/16/2016 09:49 AM, Martin Sebor wrote:

I'm looking for an approval of the attached patch.

I've adjusted the documentation based on Sandra's input (i.e.,
documented the negative of the option rather than the positive;
thank you for the review, btw.)

On 11/08/2016 08:13 PM, Martin Sebor wrote:

The -fprintf-return-value optimization has been disabled since
the last time it caused a bootstrap failure on powerpc64le.  With
the underlying problems fixed GCC has bootstrapped fine on all of
powerpc64, powerpc64le and x86_64 and tested with no regressions.
I'd like to re-enable the option.  The attached patch does that.


[snip]

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi(revision 242500)
+++ gcc/doc/invoke.texi(working copy)
@@ -384,7 +384,7 @@ Objective-C and Objective-C++ Dialects}.
 -fno-toplevel-reorder -fno-trapping-math -fno-zero-initialized-in-bss
@gol
 -fomit-frame-pointer -foptimize-sibling-calls @gol
 -fpartial-inlining -fpeel-loops -fpredictive-commoning @gol
--fprefetch-loop-arrays -fprintf-return-value @gol
+-fprefetch-loop-arrays -fno-printf-return-value @gol
 -fprofile-correction @gol
 -fprofile-use -fprofile-use=@var{path} -fprofile-values @gol
 -fprofile-reorder-functions @gol


Please keep this list alphabetized -- the other "-fno-*" options are
sorted as such.  The documentation parts of the patch are OK with that
fixed, but I can't approve changing the default for the option.


I kept the option in the same place on the assumption that it was
the "printf" radix of the name, not the "no-" prefix, that these
were alphabetized by.

But now that you point it out and I've looked more carefully at
some of the other options, I see that in some sections they are
listed strictly alphabetically (-fno-foo after -fmoo) while in
others it's the way you suggest.  AFAICS, the former style looks
prevalent in the C++ Language Options and the in Warning Options,
for example.  The latter seems to be more popular in the
Optimization Options section (though there are counterexamples).

I'm happy to follow either of these conventions as long as its
consistent.  Otherwise, without both kinds of examples to choose
from, I don't think I can trust my brain to remember yet another
rule.


Well, how about the rule is that you look at the convention of the 
specific list you are adding something to?  At least that way we retain 
local consistency and don't mess up those parts of the documentation 
that we have already tried to organize in some way.  There's so much 
material in the command-line options chapter that it would be hard to 
figure out how to present it even if the content were completely static, 
much less when people are adding/renaming/modifying options all the time.


-Sandra



Re: [fixincludes] Fix macOS 10.12 and (PR sanitizer/78267)

2016-11-18 Thread Mike Stump
On Nov 18, 2016, at 2:45 AM, Rainer Orth  wrote:
> So the current suggestion is to combine my fixincludes patch and Jack's
> patch to disable  use if !__BLOCKS__.

> I guess this is ok for mainline now to restore bootstrap?

I think we are down to everyone likes this, Ok.  The big question is, does this 
need a back port?


If you fix includes virtual members or data members of C/C++ classes, just be 
careful disappearing content as that can change the layout of the structure or 
class.



libgo patch committed: Move sched variable from C to Go

2016-11-18 Thread Ian Lance Taylor
This patch simply moves the schedt (aka Sched) type and the sched
variable from C to Go.  This is a step toward further changes.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 242594)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-fc4ca600b2fc6de81fd3c4014542d6a50593db1a
+bf4762823c4543229867436399be3ae30b4d13bb
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/runtime2.go
===
--- libgo/go/runtime/runtime2.go(revision 242581)
+++ libgo/go/runtime/runtime2.go(working copy)
@@ -550,9 +550,6 @@ const (
_MaxGomaxprocs = 1 << 8
 )
 
-/*
-Commented out for gccgo for now.
-
 type schedt struct {
// accessed atomically. keep at top to ensure alignment on 32-bit 
systems.
goidgen  uint64
@@ -578,18 +575,17 @@ type schedt struct {
runqsize int32
 
// Global cache of dead G's.
-   gflock   mutex
-   gfreeStack   *g
-   gfreeNoStack *g
-   ngfree   int32
+   gflock mutex
+   gfree  *g
+   ngfree int32
 
// Central cache of sudog structs.
sudoglock  mutex
sudogcache *sudog
 
-   // Central pool of available defer structs of different sizes.
+   // Central pool of available defer structs.
deferlock mutex
-   deferpool [5]*_defer
+   deferpool *_defer
 
gcwaiting  uint32 // gc is waiting to run
stopwait   int32
@@ -608,7 +604,6 @@ type schedt struct {
procresizetime int64 // nanotime() of last change to gomaxprocs
totaltime  int64 // ∫gomaxprocs dt up to procresizetime
 }
-*/
 
 // The m.locked word holds two pieces of state counting active calls to 
LockOSThread/lockOSThread.
 // The low bit (LockExternal) is a boolean reporting whether any LockOSThread 
call is active.
@@ -772,8 +767,10 @@ var (
 
ncpu int32
 
-// forcegc forcegcstate
-// sched   schedt
+   //  forcegc forcegcstate
+
+   sched schedt
+
 // newprocsint32
 
 // Information about what cpu features are available.
Index: libgo/go/runtime/stubs.go
===
--- libgo/go/runtime/stubs.go   (revision 242581)
+++ libgo/go/runtime/stubs.go   (working copy)
@@ -520,3 +520,9 @@ func dumpregs(*_siginfo_t, unsafe.Pointe
 
 // Temporary for gccgo until we port panic.go.
 func startpanic()
+
+// Temporary for gccgo until we port proc.go.
+//go:linkname getsched runtime.getsched
+func getsched() *schedt {
+   return &sched
+}
Index: libgo/runtime/proc.c
===
--- libgo/runtime/proc.c(revision 242581)
+++ libgo/runtime/proc.c(working copy)
@@ -351,48 +351,18 @@ runtime_mcall(void (*pfn)(G*))
 //
 // Design doc at http://golang.org/s/go11sched.
 
-typedef struct Sched Sched;
-struct Sched {
-   Lock;
-
-   uint64  goidgen;
-   M*  midle;   // idle m's waiting for work
-   int32   nmidle;  // number of idle m's waiting for work
-   int32   nmidlelocked; // number of locked m's waiting for work
-   int32   mcount;  // number of m's that have been created
-   int32   maxmcount;  // maximum number of m's allowed (or die)
-
-   P*  pidle;  // idle P's
-   uint32  npidle;
-   uint32  nmspinning;
-
-   // Global runnable queue.
-   G*  runqhead;
-   G*  runqtail;
-   int32   runqsize;
-
-   // Global cache of dead G's.
-   Lockgflock;
-   G*  gfree;
-
-   uint32  gcwaiting;  // gc is waiting to run
-   int32   stopwait;
-   Notestopnote;
-   uint32  sysmonwait;
-   Notesysmonnote;
-   uint64  lastpoll;
-
-   int32   profilehz;  // cpu profiling rate
-};
+typedef struct schedt Sched;
 
 enum
 {
-   // Number of goroutine ids to grab from runtime_sched.goidgen to local 
per-P cache at once.
+   // Number of goroutine ids to grab from runtime_sched->goidgen to local 
per-P cache at once.
// 16 seems to provide enough amortization, but other than that it's 
mostly arbitrary number.
GoidCacheBatch = 16,
 };
 
-Sched  runtime_sched;
+extern Sched* runtime_getsched() __asm__ (GOSYM_PREFIX "runtime.getsched");
+
+static Sched*  runtime_sched;
 int32  runtime_gomaxprocs;
 uint32 runtime_needextram = 1;
 M  runtime_m0;
@@ -471,6 +441,8 @@ runtime_schedinit(void)
const byte *p;
Eface i;
 
+   runtime_sched = runtime_getsched();
+
m = &runtime_m0;
g = &runtime_g0;
m->g0 = g;
@@ -479,7 +451,7 @@ runtime_schedinit(void)
 
initcontext();
 
-   runtime_sched.maxmcount = 

Re: [Patch v4 0/17] Add support for _Float16 to AArch64 and ARM

2016-11-18 Thread James Greenhalgh
On Fri, Nov 11, 2016 at 03:37:17PM +, James Greenhalgh wrote:
> 
> Hi,
> 
> This patch set enables the _Float16 type specified in ISO/IEC TS 18661-3
> for AArch64 and ARM. The patch set has been posted over the past two months,
> with many of the target-independent changes approved. I'm reposting it in
> entirity in the form I hope to commit it to trunk.
> 
> The patch set can be roughly split in three; first, hookization of
> TARGET_FLT_EVAL_METHOD, and changes to the excess precision logic in the
> compiler to handle the new values for FLT_EVAL_METHOD defined in
> ISO/IEC TS-18661-3. Second, the AArch64 changes required to enable _Float16,
> and finally the ARM changes required to enable _Float16.
> 
> The broad goals and an outline of each patch in the patch set were
> described in https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02383.html .
> As compared to the original submission, the patch set has grown an ARM
> port, and has had several rounds of technical review on the target
> independent aspects.
> 
> This has resulted in many of the patches already being approved, a full
> summary of the status of each ticket is immediately below.
> 
> Clearly the focus for review of this patch set now needs to be the AArch64
> and ARM ports, I hope the appropriate maintainers will be able to do so in
> time for the patch set to be accepted for GCC 7.
> 
> I've built and tested the full patch set on ARM (cross and native),
> AArch64 (cross and native) and x86_64 (native) with no identified issues.

All the target independent changes are now approved, and all the ARM patches
have been approved (two are conditional on minor changes).

I'd like to apply the target independent and ARM changes to trunk, while I
wait for approval of the AArch64 changes. The changes for the two ports should
be independent. Would that be acceptable, or would you prefer me to wait
for review of the AArch64 changes?

I will then send a follow-up patch for doc/extend.texi detailing the
availability of _Float16 on ARM (I'm holding off on doing this until I know
which order the ARM and AArch64 parts will go in).

Thanks,
James

> --
> Target independent changes
> 
> 10 patches, 9 previously approved, 1 New implementing testsuite
> changes to enable _Float16 tests in more circumstances on ARM.
> --
> 
> [Patch 1/17] Add a new target hook for describing excess precision intentions
> 
>   Approved: https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00781.html
> 
> [Patch 2/17] Implement TARGET_C_EXCESS_PRECISION for i386
> 
>   Blanket approved by Jeff in:
> https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02402.html
> 
> [Patch 3/17] Implement TARGET_C_EXCESS_PRECISION for s390
> 
>   Approved: https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01554.html
> 
> [Patch 4/17] Implement TARGET_C_EXCESS_PRECISION for m68k
> 
>   Blanket approved by Jeff in:
> https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02402.html
>   And by Andreas: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02414.html
> 
>   There was a typo in the original patch, fixed in:
> https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01173.html
>   which I would apply as an "obvious" fix to the original patch.
> 
> [Patch 5/17] Add -fpermitted-flt-eval-methods=[c11|ts-18661-3]
> 
>   Approved: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02405.html
> 
>   Joseph had a comment in
>   https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00335.html that the tests
>   should check FLT_EVAL_METHOD from  rather than
>   __FLT_EVAL_METHOD__. Rather than implement that suggestion, I added tests
>   to patch 6 which tested the  macro, and left the tests in this
>   patch testing the internal macro.
> 
> [Patch 6/17] Migrate excess precision logic to use TARGET_EXCESS_PRECISION
> 
>   Approved (after removing a rebase bug):
>   https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00231.html
> 
> [Patch 7/17] Delete TARGET_FLT_EVAL_METHOD and poison it.
> 
>   Approved: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02401.html
> 
> [Patch 8/17] Make _Float16 available if HFmode is available
> 
>   Approved: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02403.html
> 
> [Patch libgcc 9/17] Update soft-fp from glibc
> 
>   Self approved under policy that we can update libraries which GCC mirrors
>   without further approval.
> 
> [Patch testsuite patch 10/17] Add options for floatN when checking effective 
> target for support
> 
>   NEW!
> 
> 
> AArch64 changes
> 
> 3 patches, none reviewed
> 
> 
> [Patch AArch64 11/17] Add floatdihf2 and floatunsdihf2 patterns
> 
>   Not reviewed, last pinged (^6):
>   https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00584.html
> 
> [Patch libgcc AArch64 12/17] Enable hfmode soft-float conversions and 
> truncations
> 
>   Not reviewed:
>   https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02395.html
> 
> [Patch AArch64 13/17] Enable _Float16 for AArch64
> 
>   Not reviewed:
>   https://gcc.gnu.org/ml/gcc-patches/2016-10/ms

Re: [PATCH] enable -fprintf-return-value by default

2016-11-18 Thread Jeff Law

On 11/08/2016 08:13 PM, Martin Sebor wrote:

The -fprintf-return-value optimization has been disabled since
the last time it caused a bootstrap failure on powerpc64le.  With
the underlying problems fixed GCC has bootstrapped fine on all of
powerpc64, powerpc64le and x86_64 and tested with no regressions.
I'd like to re-enable the option.  The attached patch does that.

Thanks
Martin


gcc-fprintf-return-value.diff


gcc/c-family/ChangeLog:

* c.opt (-fprintf-return-value): Enable by default.

gcc/ChangeLog:

* doc/invoke.texi (-fprintf-return-value): Document that option
is enabled by default.

OK once you and Sandra are in-sync on the doc changes.

jeff



Re: [PATCH] avoid calling alloca(0)

2016-11-18 Thread Jeff Law

On 11/17/2016 05:32 PM, Martin Sebor wrote:


I confess I haven't heard of such an implementation before but after
some searching I have managed to find a few examples of it online,
such as in QNX or in the BlackBerry SDK, or in the GCC shipped by
Texas Instruments.  For instance:
I have the "advantage" of being a GCC gray beard and had the misfortune 
of maintaining a system that had one of those allocas (hpux) *and* 
having to debug heap exhaustions in GCC that would occur due to this 
kind of construct


for (some large number of iterations)
   frobit (...)

Where frobit would allocate a metric ton of stuff with alloca.

Note the calls to alloca down in frobit would all appear to be at the 
same stack depth (alloca literally recorded the SP value and the PA was 
a fixed frame target).  So the calls to alloca within frobit wouldn't 
deallocate anything.





http://www.qnx.com/developers/docs/6.5.0SP1.update/com.qnx.doc.neutrino_lib_ref/a/alloca.html


Their GCC-derived compilers apparently enable it with -fno-builtins.

Right.



That said, I don't view this as reason to exclude the warning from
-Wextra.  The vendor-provided compilers are obviously customized
versions of GCC with their own features, including their own set
of options and warnings.  It shouldn't stop us from enabling those
we expect to be useful to the typical GCC user on typical targets,
and especially those that can help expose sources of common bugs.
Recognizing I'm preaching to choir here: Calling alloca with any
argument is considered dangerous enough to be discouraged in most
man pages.  Alloca(0) is a potentially dangerous corner case of
an already dangerous API.  It seems at least as problematic as
any of the warnings already in -Wextra, and I'd say as many in
-Wall.
Right.  We're well in the "living dangerously" space.  But I could claim 
that on one of these targets, warning about an alloca (0) would likely 
result in a developer removing it or forcing it to allocate some small 
amount of space to shut up the warning.  That in turn runs the risk of 
making the target code more prone to heap exhaustion and possibly less 
secure, particularly if the developer isn't aware of the special nature 
of alloca (0).







Even on systems with this unusual alloca implementation, since
alloca(0) is special (and expensive), warning on such calls would
likely be useful.  In fact, since calling alloca(0) in these
environments is actually important, warning on them coukd help
detect their unintended absence.  The few such calls that are
intended can be easily tweaked to suppress the warning.
It certainly is special and often expensive.  But I'd come back to the 
likely behavior of the developer.  I suspect unless there's a comment 
indicating the alloca (0) is intentional, they're likely to just remove it.


So I see both sides here and I'm not sure what the best path forward 
should be.


jeff


Re: [PATCH] avoid calling alloca(0)

2016-11-18 Thread Jeff Law

On 11/18/2016 12:58 AM, Jakub Jelinek wrote:

On Thu, Nov 17, 2016 at 05:32:31PM -0700, Martin Sebor wrote:

The point is warning on an alloca (0) may not be as clear cut as it
might seem.  It's probably a reasonable thing to do on the host, but on
a target, which might be embedded and explicitly overriding the builtin
alloca, warning on alloca (0) is less of a slam dunk.


I confess I haven't heard of such an implementation before but after
some searching I have managed to find a few examples of it online,
such as in QNX or in the BlackBerry SDK, or in the GCC shipped by
Texas Instruments.  For instance:


In the libiberty/alloca.c, I don't see anything special about alloca (0).
Yes, alloca (0) reclaims garbage, but so does alloca (1) and alloca (4035).
alloca (0) just returns NULL after it.  The diffutils link is the same
alloca.c as in libiberty.  Returning NULL or returning a non-unique pointer
for alloca (0) is just fine, it is the same thing as returning NULL or
returning a non-unique pointer from malloc (0).  We definitely don't want
to warn for malloc (0), and IMNSHO don't want to warn for alloca (0),
because it behaves the same.

Maybe that's the key.  We don't want to warn for

alloca (0);

But we should warn for

whatever = alloca (0);


The former is a clear request for GC of the alloca space.  The latter it 
much more likely a request for space where something went wrong 
computing the amount of space.


jeff


Re: PING [PATCH] enable -fprintf-return-value by default

2016-11-18 Thread Martin Sebor

On 11/18/2016 10:38 AM, Sandra Loosemore wrote:

On 11/18/2016 09:01 AM, Martin Sebor wrote:

On 11/17/2016 10:34 PM, Sandra Loosemore wrote:

On 11/16/2016 09:49 AM, Martin Sebor wrote:

I'm looking for an approval of the attached patch.

I've adjusted the documentation based on Sandra's input (i.e.,
documented the negative of the option rather than the positive;
thank you for the review, btw.)

On 11/08/2016 08:13 PM, Martin Sebor wrote:

The -fprintf-return-value optimization has been disabled since
the last time it caused a bootstrap failure on powerpc64le.  With
the underlying problems fixed GCC has bootstrapped fine on all of
powerpc64, powerpc64le and x86_64 and tested with no regressions.
I'd like to re-enable the option.  The attached patch does that.


[snip]

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi(revision 242500)
+++ gcc/doc/invoke.texi(working copy)
@@ -384,7 +384,7 @@ Objective-C and Objective-C++ Dialects}.
 -fno-toplevel-reorder -fno-trapping-math -fno-zero-initialized-in-bss
@gol
 -fomit-frame-pointer -foptimize-sibling-calls @gol
 -fpartial-inlining -fpeel-loops -fpredictive-commoning @gol
--fprefetch-loop-arrays -fprintf-return-value @gol
+-fprefetch-loop-arrays -fno-printf-return-value @gol
 -fprofile-correction @gol
 -fprofile-use -fprofile-use=@var{path} -fprofile-values @gol
 -fprofile-reorder-functions @gol


Please keep this list alphabetized -- the other "-fno-*" options are
sorted as such.  The documentation parts of the patch are OK with that
fixed, but I can't approve changing the default for the option.


I kept the option in the same place on the assumption that it was
the "printf" radix of the name, not the "no-" prefix, that these
were alphabetized by.

But now that you point it out and I've looked more carefully at
some of the other options, I see that in some sections they are
listed strictly alphabetically (-fno-foo after -fmoo) while in
others it's the way you suggest.  AFAICS, the former style looks
prevalent in the C++ Language Options and the in Warning Options,
for example.  The latter seems to be more popular in the
Optimization Options section (though there are counterexamples).

I'm happy to follow either of these conventions as long as its
consistent.  Otherwise, without both kinds of examples to choose
from, I don't think I can trust my brain to remember yet another
rule.


Well, how about the rule is that you look at the convention of the
specific list you are adding something to?  At least that way we retain
local consistency and don't mess up those parts of the documentation
that we have already tried to organize in some way.  There's so much
material in the command-line options chapter that it would be hard to
figure out how to present it even if the content were completely static,
much less when people are adding/renaming/modifying options all the time.


I think it would be be ideal if all the options were sorted the same
way in all sections.  Is there some command to have texinfo sort them
for us?  If not, can we write a script to sort them, either each time
just before generating the docs or manually?  (I'm happy to help.)
Otherwise, consistency will continue to be an elusive goal.

There are at least two reasons why I don't think following the style
used by each section is likely to yield good results (and clearly
hasn't to date).  First, the big sections already have examples of
both approaches (or simply options out of order).  In some of them
it can also be hard to tell if the radix of the options you're
looking to for guidance starts with an 'n'.  Second, when adding
more than one option to different sections (such as the
-Wformat-length and -fprintf-format-length options) having to
remember to apply a different sort order for each (warnings are
sorted by radix but optimization options, for the most parts,
strictly alphabetically), seems also likely to trip people up.

Martin

PS I don't mind moving the -fno-printf-return-value option up or
down and I will do it before committing the patch.  I would just
prefer to be able not to have to remember and worry about all
these subtle rules.  There are too many of them and they tend
to take time and energy away from things that matter more (like
fixing bugs).


Re: PING [PATCH] enable -fprintf-return-value by default

2016-11-18 Thread Jeff Law

On 11/18/2016 11:52 AM, Martin Sebor wrote:


I think it would be be ideal if all the options were sorted the same
way in all sections.  Is there some command to have texinfo sort them
for us?  If not, can we write a script to sort them, either each time
just before generating the docs or manually?  (I'm happy to help.)
Otherwise, consistency will continue to be an elusive goal.

I'm not aware of texinfo way to do this automatically.




There are at least two reasons why I don't think following the style
used by each section is likely to yield good results (and clearly
hasn't to date).  First, the big sections already have examples of
both approaches (or simply options out of order).  In some of them
it can also be hard to tell if the radix of the options you're
looking to for guidance starts with an 'n'.  Second, when adding
more than one option to different sections (such as the
-Wformat-length and -fprintf-format-length options) having to
remember to apply a different sort order for each (warnings are
sorted by radix but optimization options, for the most parts,
strictly alphabetically), seems also likely to trip people up.
Let's split this issue off by moving the option into the location Sandra 
has asked so that we're at least kindof, sorta, locally consistent. 
That allows your patch to go forward.


Then separately we can see if we can bring more sense to the larger 
issue.  Sandra has tried to work towards bring sanity to our 
documentation (which has grown like field bindweed over time) and we can 
include a discussion about this issue in that larger effort.




PS I don't mind moving the -fno-printf-return-value option up or
down and I will do it before committing the patch.  I would just
prefer to be able not to have to remember and worry about all
these subtle rules.  There are too many of them and they tend
to take time and energy away from things that matter more (like
fixing bugs).
Understood.  But that's also part of the reason why we delegate things 
-- there's a million little things to remember and nobody can remember 
them all.  So it's a balance between saying "we should clean this up and 
bring consistency here now" and "the maintainer has asked for a change, 
let's do that and address the consistency issues separately".


There's obviously pros and cons to each decision which I don't enumerate ;-)

Jeff



Re: [fixincludes] Fix macOS 10.12 and (PR sanitizer/78267)

2016-11-18 Thread Bruce Korb
On Fri, Nov 18, 2016 at 9:42 AM, Mike Stump  wrote:
> On Nov 18, 2016, at 2:45 AM, Rainer Orth  
> wrote:
>> So the current suggestion is to combine my fixincludes patch and Jack's
>> patch to disable  use if !__BLOCKS__.
>
>> I guess this is ok for mainline now to restore bootstrap?
>
> I think we are down to everyone likes this, Ok.  The big question is, does 
> this need a back port?

My thinking on that subject is that since include fixes do not directly affect
the compiler and, instead, facilitate its use on various platforms, it
is therefore
reasonably safe to back port.  Especially if adequate guards (selection tests)
are included in the fix to prevent it from taking action on the wrong files.
But I also think it is really a "steering committee" decision.

For me, I think it is safe enough.


Re: [PATCH] Enable Intel AVX512_4FMAPS and AVX512_4VNNIW instructions

2016-11-18 Thread Jakub Jelinek
Hi!

On Thu, Nov 17, 2016 at 02:18:57PM -0800, H.J. Lu wrote:
> > Hi HJ, could you please commit it?
> 
> Done.

I'm seeing lots of ICEs with this.

E.g. reduced:

typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
typedef unsigned char __mmask8;
typedef float __v4sf __attribute__ ((__vector_size__ (16)));

static inline  __m128 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
_mm_setzero_ps (void)
{
  return __extension__ (__m128){ 0.0f, 0.0f, 0.0f, 0.0f };
}

 __m128
foo (__mmask8 __U, __m128 __A, __m128 __B, __m128 __C, __m128 __D, __m128 __E, 
__m128 *__F)
{
  return (__m128) __builtin_ia32_4fmaddss_mask ((__v4sf) __B,
  (__v4sf) __C,
  (__v4sf) __D,
  (__v4sf) __E,
  (__v4sf) __A,
  (const __v4sf *) __F,
  (__v4sf) _mm_setzero_ps (),
  (__mmask8) __U);
}

ICEs with -mavx5124fmaps -O0, but succeeds with
-mavx512vl -mavx5124fmaps -O0 or -mavx5124fmaps -O2.

fcn_mask = gen_avx5124fmaddps_4fmaddss_mask;
fcn_maskz = gen_avx5124fmaddps_4fmaddss_maskz;
msk_mov   = gen_avx512vl_loadv4sf_mask;

looks wrong, while -mavx5124fmaps implies -mavx512f, it doesn't
imply -mavx512vl, so using -mavx512vl insns unconditionally is just wrong.
You need some fallback if avx512vl isn't available, perhaps use
avx512f 512-bit masked insns with bits in the mask forced to pick only the
ones you want?

Also, seems there are various formatting issues in the change,
e.g. shortly after s4fma_expand: there is indentation by 3 chars relative to
above { instead of 2, gen_rtx_SUBREG (V16SFmode, tmp, 0)); has extra 1 char
indentation, some lines too long.

Jakub


  1   2   >