Re: [1/2] Add get_next_strinfo helper function

2017-05-31 Thread Jakub Jelinek
On Wed, May 31, 2017 at 07:58:54AM +0100, Richard Sandiford wrote:
> Ping
> 
> Richard Sandiford  writes:
> > This patch just adds a helper function for getting the next strinfo
> > in a chain, since part 2 adds another place where we do that.
> >
> > Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> >
> > Thanks,
> > Richard
> >
> >
> > 2017-05-16  Richard Sandiford  
> >
> > gcc/
> > * tree-ssa-strlen.c (get_next_strinfo): New function.
> > (get_stridx_plus_constant): Use it.
> > (zero_length_string): Likewise.
> > (adjust_related_strinfos): Likewise.
> > (adjust_last_stmt): Likewise.

Ok, thanks.

Jakub
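
For readers unfamiliar with the strlen pass, the helper described above can be pictured roughly as follows. This is only a simplified sketch: the strinfo_sketch struct, the table vector and get_strinfo_sketch are stand-ins invented for illustration, not the declarations in tree-ssa-strlen.c, and the consistency checks in the real helper may differ.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for the pass's strinfo record.
struct strinfo_sketch
{
  int idx;    // index of this entry in the table
  int first;  // index of the first entry of the chain
  int prev;   // index of the previous entry, 0 if none
  int next;   // index of the next entry, 0 if none
};

// Hypothetical lookup; slot 0 is unused so that 0 can mean "none".
strinfo_sketch *
get_strinfo_sketch (std::vector<strinfo_sketch> &table, int idx)
{
  if (idx <= 0 || static_cast<std::size_t> (idx) >= table.size ())
    return nullptr;
  return &table[idx];
}

// Return the next entry in SI's chain, or nullptr if there is none or
// the link no longer looks consistent.
strinfo_sketch *
get_next_strinfo_sketch (std::vector<strinfo_sketch> &table,
                         const strinfo_sketch *si)
{
  if (si->next == 0)
    return nullptr;
  strinfo_sketch *nextsi = get_strinfo_sketch (table, si->next);
  if (nextsi == nullptr
      || nextsi->first != si->first
      || nextsi->prev != si->idx)
    return nullptr;
  return nextsi;
}
```

Putting the lookup and the consistency checks in one place means every caller that walks a chain gets the same behaviour.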


Re: [PATCH] add more detail to -Wconversion and -Woverflow (PR 80731)

2017-05-31 Thread Christophe Lyon
On 30 May 2017 at 23:28, Martin Sebor  wrote:
> On 05/29/2017 08:02 AM, Christophe Lyon wrote:
>>
>> On 25 May 2017 at 00:16, Martin Sebor  wrote:
>>>
>>> On 05/24/2017 11:08 AM, Joseph Myers wrote:


 On Wed, 17 May 2017, Martin Sebor wrote:

> @@ -1036,31 +1079,76 @@ warnings_for_convert_and_check (location_t loc,
> tree type, tree expr,
>   /* This detects cases like converting -129 or 256 to
>  unsigned char.  */
>   if (!int_fits_type_p (expr, c_common_signed_type (type)))
> -   warning_at (loc, OPT_Woverflow,
> -   "large integer implicitly truncated to unsigned
> type");
> +   {
> + if (cst)
> +   warning_at (loc, OPT_Woverflow,
> +   (TYPE_UNSIGNED (exprtype)
> +? "conversion from %qT to %qT "
> +"changes value from %qE to %qE"
> +: "unsigned conversion from %qT to %qT "
> +"changes value from %qE to %qE"),
> +   exprtype, type, expr, result);
> + else
> +   warning_at (loc, OPT_Woverflow,
> +   (TYPE_UNSIGNED (exprtype)
> +? "conversion from %qT to %qT "
> +"changes the value of %qE"
> +: "unsigned conversion from %qT to %qT "
> +"changes the value of %qE"),
> +   exprtype, type, expr);
> +   }



 You need to use G_() around both arguments to ?:, otherwise only one
 will
 get extracted for translation.
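
The shape being asked for can be sketched as below. The G_ macro is defined locally as a stand-in so the fragment compiles on its own (in GCC it is a marker that expands to its argument), and conversion_msgid is a hypothetical helper, not a function from the patch.

```cpp
#include <string>

// Stand-in for GCC's G_() marker, defined here only so that the sketch
// is self-contained.
#define G_(msgid) msgid

// Hypothetical helper: pick the diagnostic text for a conversion, with
// each arm of ?: marked separately so that both strings are extracted.
const char *
conversion_msgid (bool expr_is_unsigned)
{
  return (expr_is_unsigned
          ? G_("conversion from %qT to %qT changes value from %qE to %qE")
          : G_("unsigned conversion from %qT to %qT "
               "changes value from %qE to %qE"));
}
```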

> diff --git a/gcc/testsuite/c-c++-common/pr68657-1.c
> b/gcc/testsuite/c-c++-common/pr68657-1.c
> index 84f3e54..33fdf86 100644
> --- a/gcc/testsuite/c-c++-common/pr68657-1.c
> +++ b/gcc/testsuite/c-c++-common/pr68657-1.c
> @@ -5,14 +5,14 @@
>  void
>  f1 (void)
>  {
> -  unsigned int a = -5; /* { dg-error "negative integer implicitly
> converted to unsigned type" } */
> +  unsigned int a = -5; /* { dg-error "unsigned conversion from .int.
> to
> .unsigned int. changes value from .-5. to .4294967291." } */



 The more specific match would fail for targets with 16-bit int.  You
 need
 to keep it less specific in this test (if you want to test the more
 specific text as well, another test could be added for that, restricted
 to
 the int32 effective-target).
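
A small worked example of why the exact value in the expected message depends on the width of int. Fixed-width types are used here as stand-ins for "unsigned int" on 16-bit and 32-bit targets; the variable names are illustrative only.

```cpp
#include <cstdint>

// Value of -5 after conversion to a 16-bit unsigned type (what
// "unsigned int" would be on a target with 16-bit int).
constexpr std::uint16_t minus5_as_u16 = static_cast<std::uint16_t> (-5);

// Value of -5 after conversion to a 32-bit unsigned type.
constexpr std::uint32_t minus5_as_u32 = static_cast<std::uint32_t> (-5);

static_assert (minus5_as_u16 == 65531u, "2^16 - 5");
static_assert (minus5_as_u32 == 4294967291u, "2^32 - 5");
```

A dg-error pattern that spells out 4294967291 therefore only matches where int is 32 bits wide.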

 (The changes to Wconversion-real-integer-3.C and
 Wconversion-real-integer2.C are OK in that those tests are restricted to
 int32plus, although in theory 64-bit int would be an issue there.)

> +  /* According to 6.3.1.3 of C11:
> + -3-  Otherwise, the new type is signed and the value cannot be
> +  represented in it; either the result is
> implementation-defined
> + or an implementation-defined signal is raised.
> +
> + In GCC such conversios wrap and diagnosed by mentioning
> "overflow"
> + if the absolut value of the operand is in excess of the maximum
> of
> + the destination of type, and "conversion" otherwise, as follows:
> */



 s/conversios/conversions/; s/absolut/absolute/

 OK with those changes.
>>>
>>>
>>>
>>> Thanks for the careful review!  Done and committed in r248431.
>>>
>>
>> Hi,
>>
>> I have noticed failures on arm*:
>>   Executed from: gcc.dg/fixed-point/fixed-point.exp
>> gcc.dg/fixed-point/int-warning.c  (test for warnings, line 12)
>> gcc.dg/fixed-point/int-warning.c  (test for warnings, line 13)
>> gcc.dg/fixed-point/int-warning.c  (test for warnings, line 14)
>> gcc.dg/fixed-point/int-warning.c  (test for warnings, line 15)
>> gcc.dg/fixed-point/int-warning.c  (test for warnings, line 16)
>> gcc.dg/fixed-point/int-warning.c  (test for warnings, line 17)
>> gcc.dg/fixed-point/int-warning.c  (test for warnings, line 18)
>> gcc.dg/fixed-point/int-warning.c  (test for warnings, line 19)
>> gcc.dg/fixed-point/int-warning.c  (test for warnings, line 20)
>> gcc.dg/fixed-point/int-warning.c  (test for warnings, line 21)
>> gcc.dg/fixed-point/int-warning.c  (test for warnings, line 22)
>> gcc.dg/fixed-point/int-warning.c  (test for warnings, line 23)
>> gcc.dg/fixed-point/int-warning.c (test for excess errors)
>>
>> Excess errors:
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/fixed-point/int-warning.c:12:8:
>> warning: overflow in conversion from '_Accum' to 'signed char' chages
>> value from '5.0e+2' to '-12' [-Woverflow]
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/fixed-point/int-warning.c:13:8:
>> warning: overflow in conversion from '_Accum' to 's

Re: [PATCH GCC][4/6]Relax minimal segment length of DR_B for merging alias check

2017-05-31 Thread Richard Biener
On Tue, May 30, 2017 at 5:29 PM, Bin.Cheng  wrote:
> On Tue, May 30, 2017 at 12:27 PM, Richard Biener
>  wrote:
>> On Thu, May 25, 2017 at 5:16 PM, Bin.Cheng  wrote:
>>> On Tue, May 23, 2017 at 5:23 PM, Bin Cheng  wrote:
 Hi,
 As commented in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80815#c1,
 we can relax the minimal segment length of DR_B for merging.  With this change,
 the new test can be improved to only one alias check.  Note the
 condition is still accurate after this patch, it won't introduce false
 alias.
 Bootstrap and test on x86_64 and AArch64, is it OK?
>>> Updated patch w.r.t. the change in the previous patch.
>>>
>>> Bootstrap and test on x86_64 and AArch64.
>>
>> Please omit unnecessary braces.  Ok with that change.
>>
>> Note that
>>
>>   if (tree_fits_uhwi_p (dr_b1->seg_len))
>>     {
>>       min_seg_len_b = dr_b1->seg_len;
>>       if (tree_int_cst_sign_bit (dr_b1->seg_len))
>>         min_seg_len_b = wi::neg (min_seg_len_b);
>>
>> the tree_fits_uhwi_p check is somewhat bogus now that min_seg_len_b is
>> a wide-int.
>> It should probably be changed to TREE_CODE (dr_b1->seg_len) == INTEGER_CST
>> which also means  that
>>
>>   min_seg_len_b = wi::abs (dr_b1->seg_len);
>>
>> should work.
> Thanks for reviewing.  Here is updated patch.  Bootstrap and test on
> x86_64.  Is it OK?

Ok.

Richard.

> Thanks,
> bin
>>
>> Richard.
>>
>>
>>> Thanks,
>>> bin

 2017-05-22  Bin Cheng  

 * tree-data-ref.c (prune_runtime_alias_test_list): Relax minimal
 segment length for dr_b.

 gcc/testsuite/ChangeLog
 2017-05-22  Bin Cheng  

 * gcc.dg/vect/pr80815-3.c: New test.


Re: Optimisation of std::binary_search of the <algorithm> header

2017-05-31 Thread jay pokarna
Hey,
Could you tell me how I can measure the time taken by my algorithm
and compare it with the built-in functions?
My algorithm works similarly to std::binary_search.

Also, could you recommend some data that would be helpful for comparing
my function with std::binary_search?

Thanks,
Jay Pokarna

On Wed, May 31, 2017 at 3:50 AM, Mike Stump  wrote:
> On May 29, 2017, at 1:05 AM, jay pokarna  wrote:
>>
>> Could you give me the contact of the standard committee?
>
> https://isocpp.org/std/the-committee
>



-- 
Regards,
Jay Pokarna
CS Sophomore
Wordpress | Linkedin
Birla Institute of Technology and Science, Pilani
Pilani Campus
Rajasthan - 333031.
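
One common way to make such a comparison is to time both searches over the same sorted data and the same keys with std::chrono::steady_clock. The sketch below assumes a hand-written my_binary_search as a stand-in for the algorithm under test; that function, run_bench and make_sorted_evens are names invented for this illustration.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the algorithm under test: a hand-written
// binary search over a sorted vector.
bool
my_binary_search (const std::vector<int> &v, int key)
{
  std::size_t lo = 0, hi = v.size ();
  while (lo < hi)
    {
      std::size_t mid = lo + (hi - lo) / 2;
      if (v[mid] < key)
        lo = mid + 1;
      else
        hi = mid;
    }
  return lo < v.size () && v[lo] == key;
}

struct bench_result
{
  std::size_t hits;    // number of keys found; also keeps the loop live
  double nanoseconds;  // total wall-clock time for all lookups
};

// Run SEARCH once per key and report the elapsed time.
template <typename Search>
bench_result
run_bench (const std::vector<int> &data, const std::vector<int> &keys,
           Search search)
{
  std::size_t hits = 0;
  auto start = std::chrono::steady_clock::now ();
  for (int k : keys)
    hits += search (data, k) ? 1 : 0;
  auto stop = std::chrono::steady_clock::now ();
  std::chrono::duration<double, std::nano> elapsed = stop - start;
  return { hits, elapsed.count () };
}

// Sorted even numbers 0, 2, 4, ...; searching for odd keys then
// exercises the "not found" path as well.
std::vector<int>
make_sorted_evens (std::size_t n)
{
  std::vector<int> v (n);
  for (std::size_t i = 0; i < n; ++i)
    v[i] = static_cast<int> (2 * i);
  return v;
}
```

The standard routine can be passed as a lambda wrapping std::binary_search (data.begin (), data.end (), key). Useful inputs typically include a range of sizes (some fitting in cache, some not), keys that are present and keys that are absent, and repeated runs built with optimization enabled.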


Re: [PATCH 2/2] DWARF: make it possible to emit debug info for declarations only

2017-05-31 Thread Richard Biener
On Tue, May 30, 2017 at 5:47 PM, Pierre-Marie de Rodat
 wrote:
> Thank you for your review, Richard.
>
> On 05/30/2017 01:59 PM, Richard Biener wrote:
>>
>> I think the issue is unfortunate in the C frontend as well.  So I believe
>> we can
>> go without a new langhook and instead make sure
>> dwarf2out_early_global_decl
>> is not called for uninteresting decls (which means eventually pushing the
>> call(s) of that hook more towards the FEs).
>
>
> It is called by rest_of_decl_compilation, which seems itself to be called a
> lot on FUNCTION_DECL nodes. Before I dive into this consequent change: this
> would lead for instance to add a parameter to rest_of_compilation to control
> whether it must call the early_global_decl hook, and then to update all
> callers accordingly. Is this what you actually have in mind?

Actually for the bigger picture I'd refactor rest_of_decl_compilation, not
calling it from the frontends but relying on finalize_decl/function.  The missing
part would then be calling the dwarf hook which should eventually be done
at some of the places the frontends now call rest_of_decl_compilation.

>
>> For C/C++ it would be reasonable to output debug info for external
>> declarations
>> that end up being used for example.
>
>
> I guess that could be done indeed. :-)

But for an easier way (you might still explore the above ;)) just remove
the guards from dwarf2out.c and handle it more like types that we
prune if they end up being unused (OTOH I guess we don't refer to
the decl DIEs from "calls" because not all calls are referred to with
standard DWARF -- the GNU callsite stuff refers to them I think but those
get generated too late).

That said, when early_finish is called the cgraph and IPA references
exist and thus you can sort-of see which functions are "used".

Richard.

> --
> Pierre-Marie de Rodat


Re: Alternative check for vector refs with same alignment

2017-05-31 Thread Richard Biener
On Wed, May 31, 2017 at 8:55 AM, Richard Sandiford
 wrote:
> Richard Sandiford  writes:
>> "Bin.Cheng"  writes:
>>> On Wed, May 3, 2017 at 11:07 AM, Richard Biener
>>>  wrote:
 On Wed, May 3, 2017 at 9:54 AM, Richard Sandiford
  wrote:
> vect_find_same_alignment_drs uses the ddr dependence distance
> to tell whether two references have the same alignment.  Although
> that's safe with the current code, there's no particular reason
> why a dependence distance of 0 should mean that the accesses start
> on the same byte.  E.g. a reference to a full complex value could
> in principle depend on a reference to the imaginary component.
> A later patch adds support for this kind of dependence.
>
> On the other side, checking modulo vf is pessimistic when the step
> divided by the element size is a factor of 2.
>
> This patch instead looks for cases in which the drs have the same
> base, offset and step, and for which the difference in their constant
> initial values is a multiple of the alignment.

 I'm a bit wary about trusting operand_equal_p over dependence analysis.
 So, did you (can you) add an assert that the new code computes
 same alignment in all cases where the old one did?
>>
>> FWIW, the operand_equal_p for the base addresses is the same as the one
>> used by the dependence analysis:
>>
>>   /* If the references do not access the same object, we do not know
>>  whether they alias or not.  We do not care about TBAA or alignment
>>  info so we can use OEP_ADDRESS_OF to avoid false negatives.
>>  But the accesses have to use compatible types as otherwise the
>>  built indices would not match.  */
>>   if (!operand_equal_p (DR_BASE_OBJECT (a), DR_BASE_OBJECT (b),
>>                         OEP_ADDRESS_OF)
>>       || !types_compatible_p (TREE_TYPE (DR_BASE_OBJECT (a)),
>>                               TREE_TYPE (DR_BASE_OBJECT (b))))
>>     {
>>       DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>       return res;
>>     }
>>
>>> At the moment operand_equal_p method looks more powerful than
>>> dependence analysis, for example, it can handle the same memory
>>> reference with pointer/array_ref forms like in PR65206.  However,
>>> given dependence check is not expensive here, is it possible to build
>>> the patch on top of it when it fails?
>>
>> The old check isn't valid after my later patches, because there's
>> no guarantee that the accesses start on the same byte.  And like
>> you say, the new check is more powerful in some ways (including
>> the modulo vf thing I mentioned).
>>
>> So I'm not sure we can do anything useful with the dependence distance
>> information.  Sometimes it would give false positives and sometimes
>> it would give false negatives.
>
> Upthread you said "otherwise ok" apart from the "I'm a bit wary..." part.
> Is the original patch OK given the above?

Yes.

Thanks,
Richard.

> Thanks,
> Richard
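
The check described in the patch summary (same base, offset and step, with constant initial values differing by a multiple of the alignment) can be sketched as follows. This is an illustration under simplifying assumptions: dr_sketch reduces the symbolic parts of a data reference to integer ids so that plain comparison can stand in for operand_equal_p, and none of these names come from the vectorizer.

```cpp
#include <cstdint>

// Hypothetical, simplified view of a data reference.
struct dr_sketch
{
  int base_id;        // stands in for the base address expression
  int offset_id;      // stands in for the variable offset expression
  std::int64_t step;  // step in bytes per iteration
  std::int64_t init;  // constant initial offset in bytes
};

// True if A and B are known to have the same misalignment with respect
// to ALIGN bytes: same base, offset and step, and constant initial
// values that differ by a multiple of ALIGN.
bool
same_alignment_p (const dr_sketch &a, const dr_sketch &b, std::int64_t align)
{
  if (a.base_id != b.base_id
      || a.offset_id != b.offset_id
      || a.step != b.step)
    return false;
  return (a.init - b.init) % align == 0;
}
```

Unlike a test on dependence distance, this does not assume that a distance of 0 means both accesses start on the same byte.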


[PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Martin Liška
Hi.

After a discussion with Jakub on IRC, I decided to implement it in a slightly
different way:

I added to common.opt:
Common RejectNegative Joined UInteger Var(flag_no_sanitize_fn) PerFunction
No sanitize flags for a function

and this per function flag is used to save no_sanitize values, then checked in:

bool
sanitize_flags_p (unsigned int flag, const_tree fn)
{
  unsigned int result_flags = flag_sanitize & flag;

  if (fn != NULL)
    result_flags &= ~opt_for_fn (fn, flag_no_sanitize_fn);
  return result_flags;
}
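
To make the masking concrete, here is a stand-alone model of the same logic. The flag values and the function name are hypothetical (the real bits are defined in flag-types.h), and the per-function value is passed in directly instead of being looked up with opt_for_fn.

```cpp
// Hypothetical flag bits, for illustration only.
enum : unsigned int
{
  SK_ADDRESS = 1u << 0,
  SK_THREAD = 1u << 1,
  SK_UNDEFINED = 1u << 2
};

// Model of the check: a sanitizer is active for a function if it is
// enabled globally and not listed in the function's no_sanitize mask.
bool
sanitize_flags_sketch (unsigned int global_flags,
                       unsigned int fn_no_sanitize,
                       unsigned int query)
{
  unsigned int result_flags = global_flags & query;
  result_flags &= ~fn_no_sanitize;
  return result_flags != 0;
}
```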

The patch bootstraps on ppc64le-redhat-linux and survives regression tests.
The only issue I have is that I probably need a dummy option for
flag_no_sanitize_fn, because:

FAIL: compiler driver --help=optimizers option(s): "^ +-.*[^:.]$" absent from 
output: "  -###No sanitize flags for a function"

Should I add a new IgnoreOption flag for the option?

Thanks for feedback
Martin
From 5e85b19a83ee9e8c2642c47ff0c4bd8c27bbf71b Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 30 May 2017 11:36:50 +0200
Subject: [PATCH] Implement no_sanitize function attribute

gcc/cp/ChangeLog:

2017-05-29  Martin Liska  

	* class.c (build_base_path): Use sanitize_flags_p function.
	* cp-gimplify.c (cp_genericize_r): Likewise.
	(cp_genericize_tree): Likewise.
	(cp_genericize): Likewise.
	* cp-ubsan.c (cp_ubsan_instrument_vptr_p): Likewise.
	* decl.c (compute_array_index_type): Likewise.
	(start_preparsed_function): Likewise.
	* decl2.c (one_static_initialization_or_destruction): Likewise.
	* init.c (finish_length_check): Likewise.
	* lambda.c (maybe_add_lambda_conv_op): Use
	add_no_sanitize_value.
	* typeck.c (get_member_function_from_ptrfunc): Save and restore
	also flag_sanitize_local.
	(cp_build_binary_op): Use sanitize_flags_p.
	(build_static_cast_1): Likewise.

gcc/c/ChangeLog:

2017-05-29  Martin Liska  

	* c-convert.c (convert): Use sanitize_flags_p.
	* c-decl.c (grokdeclarator): Likewise.
	* c-typeck.c (convert_for_assignment): Use sanitize_flags_p.
	(c_finish_return): Likewise.
	(build_binary_op): Likewise.

gcc/c-family/ChangeLog:

2017-05-29  Martin Liska  

	* c-attribs.c (add_no_sanitize_value): New function.
	(handle_no_sanitize_attribute): Likewise.
	(handle_no_sanitize_address_attribute): Use
	add_no_sanitize_value.
	(handle_no_sanitize_thread_attribute): Likewise.
	(handle_no_address_safety_analysis_attribute): Likewise.
	(handle_no_sanitize_undefined_attribute): Likewise.
	* c-common.h (add_no_sanitize_value): Declare.
	* c-ubsan.c (ubsan_instrument_division): Use sanitize_flags_p.
	(ubsan_instrument_shift): Likewise.
	(ubsan_instrument_bounds): Likewise.
	(ubsan_maybe_instrument_array_ref): Likewise.
	(ubsan_maybe_instrument_reference_or_call): Likewise.
	* c-ubsan.h (do_ubsan_in_current_function): Remove.

gcc/ChangeLog:

2017-05-29  Martin Liska  

	* asan.c (asan_sanitize_stack_p): Use sanitize_flags_p.
	(gate_asan): Likewise.
	* asan.h (asan_no_sanitize_address_p): Remove.
	* builtins.def: Fix coding style.
	* common.opt: Add flag_no_sanitize_fn.
	* convert.c (convert_to_integer_1): Use sanitize_flags_p.
	* doc/extend.texi: Document the new attribute.
	* flag-types.h (enum sanitize_code): Rename SANITIZE_NONDEFAULT
	to SANITIZE_UNDEFINED_NONDEFAULT.
	* gcc.c (sanitize_spec_function): Use the new enum value.
	* gimple-fold.c (optimize_atomic_compare_exchange_p):
	Use sanitize_flags_p.
	* gimplify.c (gimplify_function_tree): Use sanitize_flags_p.
	* ipa-inline.c (sanitize_attrs_match_for_inline_p): Likewise.
	* opts.c (parse_no_sanitize_attribute): New function.
	(common_handle_option): Use renamed enum value.
	* opts.h (parse_no_sanitize_attribute): Declare.
	* tree.c (sanitize_flags_p): New function.
	* tree.h (sanitize_flags_p): Declare the function.
	* tsan.c: Use sanitize_flags_p.
	* ubsan.c (ubsan_expand_null_ifn): Likewise.
	(instrument_mem_ref): Likewise.
	(instrument_bool_enum_load): Likewise.
	(do_ubsan_in_current_function): Remove.
	(pass_ubsan::execute): Use sanitize_flags_p.
	* ubsan.h (do_ubsan_in_current_function): Remove.

gcc/testsuite/ChangeLog:

2017-05-29  Martin Liska  

	* c-c++-common/ubsan/attrib-2.c (float_cast2): Add no_sanitize
	attribute test.
	* gcc.dg/asan/use-after-scope-4.c (main): Likewise.
---
 gcc/asan.c|  8 +--
 gcc/asan.h|  7 --
 gcc/builtins.def  |  3 +-
 gcc/c-family/c-attribs.c  | 94 +--
 gcc/c-family/c-common.h   |  1 +
 gcc/c-family/c-ubsan.c| 22 +++
 gcc/c-family/c-ubsan.h|  3 -
 gcc/c/c-convert.c |  5 +-
 gcc/c/c-decl.c|  5 +-
 gcc/c/c-typeck.c  | 15 ++---
 gcc/common.opt|  6 +-
 gcc/convert.c |  3 +-
 gcc/cp/class.c|  3 +-
 g

Re: [PATCH] Optimize divmod expansion (PR middle-end/79665)

2017-05-31 Thread Georg-Johann Lay

On 23.02.2017 06:59, Jeff Law wrote:

On 02/22/2017 02:40 PM, Jakub Jelinek wrote:

Hi!

If both arguments of integer division or modulo are known to be non-negative
in the corresponding signed type, then signed as well as unsigned division/modulo
shall have the exact same result and therefore we can choose between those
two depending on which one is faster (or shorter for -Os), which varies
a lot depending on target and especially for constant divisors on the exact
divisor.  expand_divmod itself is too complicated and we don't even have
the ability to ask about costs e.g. for highpart multiplication without
actually expanding it, so this patch just in that case tries both sequences,
computes their costs and uses the cheaper (and for equal cost honors the
actual original signedness of the operation).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-02-22  Jakub Jelinek  

PR middle-end/79665
* internal-fn.c (get_range_pos_neg): Moved to ...
* tree.c (get_range_pos_neg): ... here.  No longer static.
* tree.h (get_range_pos_neg): New prototype.
* expr.c (expand_expr_real_2) : If both
arguments
are known to be in between 0 and signed maximum inclusive, try to
expand both unsigned and signed divmod and use the cheaper one from
those.

OK.
jeff


Hi, this causes a performance degradation for avr.

When optimizing for speed, and with a known denominator, then v6 uses
s/umulMM3_highpart insn to avoid division because no div instruction is
available.

unsigned scale256 (unsigned val)
{
  return val / 255;
}

With this patch, v7 now uses __divmodhi4 which is very expensive but
the costs are not computed because rtlanal.c:seq_cost assumes a cost of
ONE:

  for (; seq; seq = NEXT_INSN (seq))
    {
      set = single_set (seq);
      if (set)
        cost += set_rtx_cost (set, speed);
      else
        cost++;
    }

because divmod is not a single_set:
(gdb) p seq
$10 = (const rtx_insn *) 0x7730d500
(gdb) pr
warning: Expression is not an assignment (and might have no effect)
(insn 14 13 0 (parallel [
(set (reg:HI 52)
(div:HI (reg:HI 47)
(reg:HI 54)))
(set (reg:HI 53)
(mod:HI (reg:HI 47)
(reg:HI 54)))
(clobber (reg:QI 21 r21))
(clobber (reg:HI 22 r22))
(clobber (reg:HI 24 r24))
(clobber (reg:HI 26 r26))
]) "scale.c":7 -1
 (nil))
(gdb)

Hence the divmod appears to be much less expensive than the unsigned
variant that computed the costs for mult_highpart.


Johann
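
As background on why a highpart multiplication can replace the division, here is a generic sketch for a 16-bit unsigned operand (the width of unsigned on avr). It illustrates the arithmetic only and is not the sequence GCC emits; the function name is made up for this example.

```cpp
#include <cstdint>

// Divide a 16-bit unsigned value by 255 without a division instruction:
// multiply by ceil(2^23 / 255) = 32897 and shift right by 23.  The
// rounding error 32897 * 255 - 2^23 = 127 is small enough that the
// result is exact for every 16-bit input.
std::uint16_t
div255_by_multiply (std::uint16_t val)
{
  std::uint32_t product = static_cast<std::uint32_t> (val) * 32897u;
  return static_cast<std::uint16_t> (product >> 23);
}
```

Taking bits 16..31 of the 32-bit product is the "highpart" step; the remaining shift by 7 finishes the division.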


Re: [PATCH] Optimize divmod expansion (PR middle-end/79665)

2017-05-31 Thread Jakub Jelinek
On Wed, May 31, 2017 at 10:06:34AM +0200, Georg-Johann Lay wrote:
> Hi, this causes a performance degradation for avr.
> 
> When optimizing for speed, and with a known denominator, then v6 uses
> s/umulMM3_highpart insn to avoid division because no div instruction is
> available.
> 
> unsigned scale256 (unsigned val)
> {
>   return val / 255;
> }
> 
> With this patch, v7 now uses __divmodhi4 which is very expensive but
> the costs are not computed because rtlanal.c:seq_cost assumes a cost of
> ONE:
> 
>   for (; seq; seq = NEXT_INSN (seq))
>     {
>       set = single_set (seq);
>       if (set)
>         cost += set_rtx_cost (set, speed);
>       else
>         cost++;
>     }
> 
> because divmod is not a single_set:
> (gdb) p seq
> $10 = (const rtx_insn *) 0x7730d500
> (gdb) pr
> warning: Expression is not an assignment (and might have no effect)
> (insn 14 13 0 (parallel [
> (set (reg:HI 52)
> (div:HI (reg:HI 47)
> (reg:HI 54)))
> (set (reg:HI 53)
> (mod:HI (reg:HI 47)
> (reg:HI 54)))
> (clobber (reg:QI 21 r21))
> (clobber (reg:HI 22 r22))
> (clobber (reg:HI 24 r24))
> (clobber (reg:HI 26 r26))
> ]) "scale.c":7 -1
>  (nil))
> (gdb)
> 
> Hence the divmod appears to be much less expensive than the unsigned
> variant that computed the costs for mult_highpart.

Then you should fix the cost computation - be able to use a target hook
on insns that are not a single set or something similar.

Jakub


Re: [PATCH, GCC/ARM/gcc-7-branch] Backport PR71607

2017-05-31 Thread Richard Sandiford
Prakhar Bahuguna  writes:
> This patch tackles the issue reported in PR71607. This patch takes a different
> approach for disabling the creation of literal pools. Instead of disabling the
> patterns that would normally transform the rtl into actual literal pools, it
> disables the creation of this literal pool rtl by making the target hook
> TARGET_CANNOT_FORCE_CONST_MEM return true if arm_disable_literal_pool is true.
> I added patterns to split floating point constants for both SF and DFmode. A
> pattern to handle the addressing of label_refs had to be included as well since
> all "memory_operand" patterns are disabled when TARGET_CANNOT_FORCE_CONST_MEM
> returns true. Also the pattern for splitting 32-bit immediates had to be
> changed, it was not accepting unsigned 32-bit integers with the MSB
> set. I believe const_int_operand expects the mode of the operand to be set to
> VOIDmode and not SImode. I have only changed it in the patterns that were
> affecting this code, though I suggest looking into changing it in the rest of
> the ARM backend.

I couldn't see the const_int_operand bit in the attached patch, but:
const_int_operand *should* usually be used with the logical integer
mode, such as SImode or DImode.

const_ints are supposed to be stored in sign-extended form, so a 32-bit
integer with the MSB set should be 0xffffffff80000000|x instead of
0x80000000|x.  It's a bug if you have one where that isn't true.

In the patch it looks like this could come from:

> diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
> index befdea9edd9..d8f77e2ffe4 100644
> --- a/gcc/config/arm/vfp.md
> +++ b/gcc/config/arm/vfp.md
> @@ -2079,3 +2079,40 @@
>  ;; fmdhr et al (VFPv1)
>  ;; Support for xD (single precision only) variants.
>  ;; fmrrs, fmsrr
> +
> +;; Split an immediate DF move to two immediate SI moves.
> +(define_insn_and_split "no_literal_pool_df_immediate"
> +  [(set (match_operand:DF 0 "s_register_operand" "")
> + (match_operand:DF 1 "const_double_operand" ""))]
> +  "TARGET_THUMB2 && arm_disable_literal_pool
> +  && !(TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE
> +   && vfp3_const_double_rtx (operands[1]))"
> +  "#"
> +  "&& !reload_completed"
> +  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
> +   (set (subreg:SI (match_dup 1) 4) (match_dup 3))
> +   (set (match_dup 0) (match_dup 1))]
> +  "
> +  long buf[2];
> +  real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
> +  operands[2] = GEN_INT ((int) buf[0]);
> +  operands[3] = GEN_INT ((int) buf[1]);
> +  operands[1] = gen_reg_rtx (DFmode);
> +  ")
> +
> +;; Split an immediate SF move to one immediate SI move.
> +(define_insn_and_split "no_literal_pool_sf_immediate"
> +  [(set (match_operand:SF 0 "s_register_operand" "")
> + (match_operand:SF 1 "const_double_operand" ""))]
> +  "TARGET_THUMB2 && arm_disable_literal_pool
> +  && !(TARGET_HARD_FLOAT && vfp3_const_double_rtx (operands[1]))"
> +  "#"
> +  "&& !reload_completed"
> +  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
> +   (set (match_dup 0) (match_dup 1))]
> +  "
> +  long buf;
> +  real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
> +  operands[2] = GEN_INT ((int) buf);
> +  operands[1] = gen_reg_rtx (SFmode);
> +  ")

...these two splits, where the GEN_INTs should probably be:

  gen_int_mode (..., SImode);

instead.

Thanks,
Richard
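
The sign-extension requirement can be illustrated with a small stand-alone function. This is a conceptual model of the canonical form for a 32-bit constant held in a 64-bit container, not GCC's implementation of gen_int_mode; the function name is invented for the example.

```cpp
#include <cstdint>

// Canonicalise a 32-bit quantity the way a sign-extended 64-bit
// container expects: bits 32..63 become copies of bit 31.
std::int64_t
sign_extend_32 (std::uint64_t value)
{
  std::uint64_t low = value & 0xffffffffu;
  std::uint64_t sign = 0x80000000u;
  // Flipping and then subtracting the sign bit propagates it upwards.
  return static_cast<std::int64_t> ((low ^ sign) - sign);
}
```

A value with bit 31 set but the upper bits clear is exactly the non-canonical form described above.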


[PATCH, Committed] Add self to MAINTAINERS

2017-05-31 Thread Prakhar Bahuguna
I have added myself to the Write After Approval section of the MAINTAINERS
list.

ChangeLog:

2017-05-31  Prakhar Bahuguna  

* MAINTAINERS: Add self to Write After Approval

-- 

Prakhar Bahuguna


Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Alexander Monakov
On Wed, 31 May 2017, Martin Liška wrote:
> I added to common.opt:
> Common RejectNegative Joined UInteger Var(flag_no_sanitize_fn) PerFunction
> No sanitize flags for a function

This needs a period at the end ("for a function.").

> FAIL: compiler driver --help=optimizers option(s): "^ +-.*[^:.]$" absent from 
> output: "  -###No sanitize flags for a function"
> 
> Should I add new flag IgnoreOption for the option?

Adding the missing period should fix this failure.

Alexander

Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Jakub Jelinek
On Wed, May 31, 2017 at 10:04:53AM +0200, Martin Liška wrote:
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 13305558d2d..5e9942d5100 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -222,9 +222,13 @@ bool flag_opts_finished
>  Variable
>  unsigned int flag_sanitize
>  
> +###
> +Common RejectNegative Joined UInteger Var(flag_no_sanitize_fn) PerFunction
> +No sanitize flags for a function

This looks weird, you are redefining the -### option which is normally
a driver option.
I would have thought you just want a Variable, like the one right below
this.  Aren't all "Variable"s per-function?

Jakub


Re: [PATCH][x86][PR73350][PR80862]

2017-05-31 Thread Kirill Yukhin
Hello Julia,
On 26 May 09:13, Koval, Julia wrote:
> Hi,
> This patch fixes these PR's. Ok for trunk?
> 
> gcc/
>   * config/i386/subst.md (round): Fix round pattern.
>   * config/i386/i386.c (ix86_erase_embedded_rounding):
>   Fix erasing rounding for the fixed pattern.
> 
> Thanks,
> Julia

Let me copy-paste parts of your patch here.
diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md
index 0bc22fd..2e632b9 100644
--- a/gcc/config/i386/subst.md
+++ b/gcc/config/i386/subst.md
@@ -137,12 +137,12 @@
 
 (define_subst "round"
   [(set (match_operand:SUBST_A 0)
-(match_operand:SUBST_A 1))]
+   (match_operand:SUBST_A 1))]
   "TARGET_AVX512F"
-  [(parallel[
- (set (match_dup 0)
-  (match_dup 1))
- (unspec [(match_operand:SI 2 "const_4_or_8_to_11_operand")] 
UNSPEC_EMBEDDED_ROUNDING)])])
+  [(set (match_dup 0)
+   (unspec:SUBST_A [(match_dup 1)
+   (match_operand:SI 2 "const_4_or_8_to_11_operand")]
+UNSPEC_EMBEDDED_ROUNDING))])
 
 (define_subst_attr "round_saeonly_name" "round_saeonly" "" "_round")
 (define_subst_attr "round_saeonly_mask_operand2" "mask" "%r2" "%r4")
-- 
2.5.5

So, you propose to put RC as the third argument to the set expression.
I am not sure that we may use (set ...) in such a way.

The GCC Internals manual states that:
(set lval x)
  Represents the action of storing the value of x into the place represented by lval.

Might this lead to problems? At first glance the answer looks like NO, and we
might add stuff back to the set expression.

Jakub, Richard, could you please comment on this?

--
Thanks, K


Re: [PATCH] Optimize divmod expansion (PR middle-end/79665)

2017-05-31 Thread Georg-Johann Lay

On 31.05.2017 10:15, Jakub Jelinek wrote:

On Wed, May 31, 2017 at 10:06:34AM +0200, Georg-Johann Lay wrote:

Hi, this causes a performance degradation for avr.

When optimizing for speed, and with a known denominator, then v6 uses
s/umulMM3_highpart insn to avoid division because no div instruction is
available.

unsigned scale256 (unsigned val)
{
  return val / 255;
}

With this patch, v7 now uses __divmodhi4 which is very expensive but
the costs are not computed because rtlanal.c:seq_cost assumes a cost of
ONE:

  for (; seq; seq = NEXT_INSN (seq))
    {
      set = single_set (seq);
      if (set)
        cost += set_rtx_cost (set, speed);
      else
        cost++;
    }

because divmod is not a single_set:
(gdb) p seq
$10 = (const rtx_insn *) 0x7730d500
(gdb) pr
warning: Expression is not an assignment (and might have no effect)
(insn 14 13 0 (parallel [
(set (reg:HI 52)
(div:HI (reg:HI 47)
(reg:HI 54)))
(set (reg:HI 53)
(mod:HI (reg:HI 47)
(reg:HI 54)))
(clobber (reg:QI 21 r21))
(clobber (reg:HI 22 r22))
(clobber (reg:HI 24 r24))
(clobber (reg:HI 26 r26))
]) "scale.c":7 -1
 (nil))
(gdb)

Hence the divmod appears to be much less expensive than the unsigned
variant that computed the costs for mult_highpart.


Then you should fix the cost computation - be able to use a target hook
on insns that are not a single set or something similar.

Jakub



Are you saying that cost computation in GCC is fundamentally flawed
for anything that is not a single_set?

Johann



Re: [PATCH] Optimize divmod expansion (PR middle-end/79665)

2017-05-31 Thread Jakub Jelinek
On Wed, May 31, 2017 at 10:48:07AM +0200, Georg-Johann Lay wrote:
> > > because divmod is not a single_set:
> > > (gdb) p seq
> > > $10 = (const rtx_insn *) 0x7730d500
> > > (gdb) pr
> > > warning: Expression is not an assignment (and might have no effect)
> > > (insn 14 13 0 (parallel [
> > > (set (reg:HI 52)
> > > (div:HI (reg:HI 47)
> > > (reg:HI 54)))
> > > (set (reg:HI 53)
> > > (mod:HI (reg:HI 47)
> > > (reg:HI 54)))
> > > (clobber (reg:QI 21 r21))
> > > (clobber (reg:HI 22 r22))
> > > (clobber (reg:HI 24 r24))
> > > (clobber (reg:HI 26 r26))
> > > ]) "scale.c":7 -1
> > >  (nil))
> > > (gdb)
> > > 
> > > Hence the divmod appears to be much less expensive than the unsigned
> > > variant that computed the costs for mult_highpart.
> > 
> > Then you should fix the cost computation - be able to use a target hook
> > on insns that are not a single set or something similar.
> 
> > Are you saying that cost computation in GCC is fundamentally flawed
> > for anything that is not a single_set?

The division/modulo optimization I've added as well as many other spots
in GCC rely on reasonable cost, just grep e.g. all places that call
seq_cost.  So, if it returns something that is a very wrong estimate,
it won't affect just that single optimization, but all others.  Therefore,
you should fix the cost computation, rather than disabling all the places
that use the costs.  Many targets have instructions with multiple sets,
so I'm surprised assuming cost of 1 for them doesn't break many more things.
I think either we should have a separate target hook for multiple sets
instructions, or just call the targetm.rtx_costs on the PARALLEL in that
case and see if the targets compute something reasonable for it, otherwise
either use the cost of the first set, or maximum of all sets (that might be
best) or something similar.

Jakub
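
The difference between the flat cost of 1 and the "maximum of all sets" fallback mentioned above can be modelled in a few lines. Everything here is hypothetical: insn_sketch only records a cost per SET, the cost numbers used with it are made up, and the real single_set also accepts some PARALLELs that this model treats as multi-set.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical model of an insn: the cost of each SET it contains.
// A single-set insn has exactly one entry; a divmod PARALLEL has two.
struct insn_sketch
{
  std::vector<int> set_costs;
};

// Models the behaviour described in the thread: anything that is not a
// single set contributes a flat cost of 1.
int
seq_cost_flat (const std::vector<insn_sketch> &seq)
{
  int cost = 0;
  for (const insn_sketch &insn : seq)
    cost += insn.set_costs.size () == 1 ? insn.set_costs[0] : 1;
  return cost;
}

// One of the suggested fallbacks: use the maximum cost of all sets for
// a multi-set insn.
int
seq_cost_max (const std::vector<insn_sketch> &seq)
{
  int cost = 0;
  for (const insn_sketch &insn : seq)
    {
      if (insn.set_costs.empty ())
        cost += 1;
      else
        cost += *std::max_element (insn.set_costs.begin (),
                                   insn.set_costs.end ());
    }
  return cost;
}
```

With a divmod insn whose two sets each cost, say, 40, the flat variant reports 1 while the maximum variant reports 40, so only the latter lets the multiply-based sequence win the comparison.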


Re: [PATCH 2/2] DWARF: make it possible to emit debug info for declarations only

2017-05-31 Thread Pierre-Marie de Rodat

On 05/31/2017 09:34 AM, Richard Biener wrote:

Actually for the bigger picture I'd refactor rest_of_decl_compilation, not
calling it from the frontends but relying on finalize_decl/function.  The missing
part would then be calling the dwarf hook which should eventually be done
at some of the places the frontends now call rest_of_decl_compilation.
[…]
But for an easier way (you might still explore the above ;)) just remove
the guards from dwarf2out.c and handle it more like types that we
prune if they end up being unused (OTOH I guess we don't refer to
the decl DIEs from "calls" because not all calls are referred to with
standard DWARF -- the GNU callsite stuff refers to them I think but those
get generated too late).

That said, when early_finish is called the cgraph and IPA references
exist and thus you can sort-of see which functions are "used".


Ok, thanks. I'll give the first option a try, then. :-)

--
Pierre-Marie de Rodat


Re: [PATCH 9/13] D: D2 Testsuite Dejagnu files.

2017-05-31 Thread Matthias Klose
On 30.05.2017 16:32, Mike Stump wrote:
> On May 28, 2017, at 2:16 PM, Iain Buclaw  wrote:
>>
>> This patch adds D language support to the GCC test suite.
> 
> Ok.  If you could ensure that gcc without D retains all its goodness and 
> that gcc with D works on 2 different systems, that will help ensure 
> integration smoothness.
> 
> Something this large can be integration tested on a svn/git branch, if you 
> need others to help out.

I built the library (x86 and ARM32) and the D frontend on several Debian
architectures and OSes (Linux, KFreeBSD, Hurd) in the past, but can do that with
the proposed patches again. A svn/git branch would be helpful for that, if a
recent test is required.

Matthias


[arm-embedded] Enable Purecode for ARMv8-M Baseline

2017-05-31 Thread Prakhar Bahuguna
We have decided to apply the following patch to ARM/embedded-7-branch and
ARM/embedded-6-branch to enable Purecode support for ARMv8-M Baseline targets.

ChangeLog:

2017-05-31  Prakhar Bahuguna  

Backport from mainline
2017-05-04  Prakhar Bahuguna  
Andre Simoes Dias Vieira  

gcc/
* config/arm/arm.md (movsi): Add TARGET_32BIT in addition to the
TARGET_HAVE_MOVT conditional.
(movt splitter): Likewise.
* config/arm/arm.c (arm_option_check_internal): Change arm_arch_thumb2
to TARGET_HAVE_MOVT, and merge with -mslow-flash-data check.
(const_ok_for_arm): Change else to else if (TARGET_THUMB2) and add else
block for Thumb-1 with MOVT.
(thumb2_legitimate_address_p): Move code block ...
(can_avoid_literal_pool_for_label_p): ... into this new function.
(thumb1_legitimate_address_p): Add check for TARGET_HAVE_MOVT and
literal pool.
(thumb_legitimate_constant_p): Add conditional on TARGET_HAVE_MOVT
* doc/invoke.texi (-mpure-code): Change "ARMv7-M targets" for
"M-profile targets with the MOVT instruction".

gcc/testsuite/
* gcc.target/arm/pure-code/pure-code.exp: Add conditional for
check_effective_target_arm_thumb1_movt_ok.

-- 

Prakhar Bahuguna
>From a9a11e668170e9f832ba76fb9dd35284069756cf Mon Sep 17 00:00:00 2001
From: thopre01 
Date: Thu, 4 May 2017 10:26:25 +
Subject: [PATCH] [ARM] Enable Purecode for ARMv8-M Baseline

This patch adds support for purecode to ARMv8-M Baseline, in addition to
the existing support for ARMv7-M and ARMv8-M Mainline.
---
 gcc/config/arm/arm.c   | 78 ++
 gcc/config/arm/arm.md  |  6 +-
 gcc/doc/invoke.texi|  3 +-
 .../gcc.target/arm/pure-code/pure-code.exp |  5 +-
 4 files changed, 58 insertions(+), 34 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index f3a6b64b168..acee644fa98 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2832,16 +2832,16 @@ arm_option_check_internal (struct gcc_options *opts)
   flag_pic = 0;
 }
 
-  /* We only support -mslow-flash-data on armv7-m targets.  */
-  if (target_slow_flash_data
-  && ((!(arm_arch7 && !arm_arch_notm) && !arm_arch7em)
- || (TARGET_THUMB1_P (flags) || flag_pic || TARGET_NEON)))
-error ("-mslow-flash-data only supports non-pic code on armv7-m targets");
-
-  /* We only support pure-code on Thumb-2 M-profile targets.  */
-  if (target_pure_code
-  && (!arm_arch_thumb2 || arm_arch_notm || flag_pic || TARGET_NEON))
-error ("-mpure-code only supports non-pic code on armv7-m targets");
+  /* We only support -mpure-code and -mslow-flash-data on M-profile targets
+ with MOVT.  */
+  if ((target_pure_code || target_slow_flash_data)
+  && (!TARGET_HAVE_MOVT || arm_arch_notm || flag_pic || TARGET_NEON))
+{
+  const char *flag = (target_pure_code ? "-mpure-code" :
+"-mslow-flash-data");
+  error ("%s only supports non-pic code on M-profile targets with the "
+"MOVT instruction", flag);
+}
 
 }
 
@@ -4076,7 +4076,7 @@ const_ok_for_arm (HOST_WIDE_INT i)
   || (i & ~0xfc000003) == 0))
return TRUE;
 }
-  else
+  else if (TARGET_THUMB2)
 {
   HOST_WIDE_INT v;
 
@@ -4092,6 +4092,14 @@ const_ok_for_arm (HOST_WIDE_INT i)
   if (i == v)
return TRUE;
 }
+  else if (TARGET_HAVE_MOVT)
+{
+  /* Thumb-1 Targets with MOVT.  */
+  if (i > 0xffff)
+   return FALSE;
+  else
+   return TRUE;
+}
 
   return FALSE;
 }
@@ -7699,6 +7707,32 @@ arm_legitimate_address_outer_p (machine_mode mode, rtx 
x, RTX_CODE outer,
   return 0;
 }
 
+/* Return true if we can avoid creating a constant pool entry for x.  */
+static bool
+can_avoid_literal_pool_for_label_p (rtx x)
+{
+  /* Normally we can assign constant values to target registers without
+ the help of constant pool.  But there are cases we have to use constant
+ pool like:
+ 1) assign a label to register.
+ 2) sign-extend a 8bit value to 32bit and then assign to register.
+
+ Constant pool access in format:
+ (set (reg r0) (mem (symbol_ref (".LC0"))))
+ will cause the use of literal pool (later in function arm_reorg).
+ So here we mark such format as an invalid format, then the compiler
+ will adjust it into:
+ (set (reg r0) (symbol_ref (".LC0")))
+ (set (reg r0) (mem (reg r0))).
+ No extra register is required, and (mem (reg r0)) won't cause the use
+ of literal pools.  */
+  if (arm_disable_literal_pool && GET_CODE (x) == SYMBOL_REF
+  && CONSTANT_POOL_ADDRESS_P (x))
+return 1;
+  return 0;
+}
+
+
 /* Return nonzero if X is a valid Thumb-2 address operand.  */
 static int
 thumb2_legitimate_address_p (machine_mode mode, rtx x, int strict_p)

[PATCH] Rename __builtin_ia32_kmov16 to __builtin_ia32_kmovw in gcc-{5,6}-branch

2017-05-31 Thread Senkevich, Andrew
Hi,

attached patches are for renaming __builtin_ia32_kmov16 to __builtin_ia32_kmovw 
in GCC 5.* and 6.* branches since it was renamed in master.
Bootstrapped and regtested on x86_64-linux-gnu.

gcc/
* config/i386/i386.c (__builtin_ia32_kmovw): Renamed from
__builtin_ia32_kmov16 since it was renamed in master.
* config/i386/avx512fintrin.h: Ditto.

Are they Ok to commit?


--
Andrew



rename_kmov_builtin_gcc-5-branch.patch
Description: rename_kmov_builtin_gcc-5-branch.patch


rename_kmov_builtin_gcc-6-branch.patch
Description: rename_kmov_builtin_gcc-6-branch.patch


Re: [PATCH][x86][PR73350][PR80862]

2017-05-31 Thread Kirill Yukhin
On 31 May 11:38, Kirill Yukhin wrote:
> Hello Julia,
> On 26 May 09:13, Koval, Julia wrote:
> > Hi,
> > This patch fixes these PR's. Ok for trunk?
> > 
> > gcc/
> > * config/i386/subst.md (round): Fix round pattern.
> > * config/i386/i386.c (ix86_erase_embedded_rounding):
> > Fix erasing rounding for the fixed pattern.
> > 
> > Thanks,
> > Julia
> 
> Let me copy-paste parts of your patch here.
> diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md
> index 0bc22fd..2e632b9 100644
> --- a/gcc/config/i386/subst.md
> +++ b/gcc/config/i386/subst.md
> @@ -137,12 +137,12 @@
>  
>  (define_subst "round"
>[(set (match_operand:SUBST_A 0)
> -(match_operand:SUBST_A 1))]
> + (match_operand:SUBST_A 1))]
>"TARGET_AVX512F"
> -  [(parallel[
> - (set (match_dup 0)
> -  (match_dup 1))
> - (unspec [(match_operand:SI 2 "const_4_or_8_to_11_operand")] 
> UNSPEC_EMBEDDED_ROUNDING)])])
> +  [(set (match_dup 0)
> + (unspec:SUBST_A [(match_dup 1)
> + (match_operand:SI 2 "const_4_or_8_to_11_operand")]
> +  UNSPEC_EMBEDDED_ROUNDING))])
>  
>  (define_subst_attr "round_saeonly_name" "round_saeonly" "" "_round")
>  (define_subst_attr "round_saeonly_mask_operand2" "mask" "%r2" "%r4")
> -- 
> 2.5.5
> 
> So, you propose to put RC as third argument to the set expression.
> I am not sure that we might use (set ...) in such a way.
Whoops, I was wrong. You are setting w/ SET_SRC as UNSPEC which
is eliminated conditionally, which is much better to me.

Few nits:
1.
diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md
index 0bc22fd..2e632b9 100644
--- a/gcc/config/i386/subst.md
+++ b/gcc/config/i386/subst.md
@@ -137,12 +137,12 @@
 
 (define_subst "round"
   [(set (match_operand:SUBST_A 0)
-(match_operand:SUBST_A 1))]
+   (match_operand:SUBST_A 1))]
   "TARGET_AVX512F"
Junk.

2. Check indentation pls

3. Mention PR in ChangeLog entry

4. Add reg test please

Overall I like this approach. We must somehow set explicit dependency
between RC and actual op, why not this way?

Could you pls make sure that CSE is still working for ops w/ identical RC?

To be paranoid: is it possible to check skylake-avx512 w/ and w/o the patch
on Spec2k6?

--
Thanks, K


Re: Default std::vector default and move constructor

2017-05-31 Thread Jonathan Wakely

On 29/05/17 22:55 +0200, François Dumont wrote:

Hi

   It wasn't such a big deal to restore value-init of the allocator. 
So here is the updated patch.


   I used:
 _Bvector_impl() _GLIBCXX_NOEXCEPT_IF( noexcept(_Bit_alloc_type()) )

   rather than using is_nothrow_default_constructible. Any advantage 
in one approach or the other ?


Well in general the is_nothrow_default_constructible trait also tells
you if the type is default-constructible at all, but the form above
won't compile if it isn't default-constructible. In this specific case
it doesn't matter, because that constructor won't compile anyway if
the allocator isn't default-constructible.
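
A small sketch of the difference, using three hypothetical allocator-like types (none of them from libstdc++): both forms agree whenever the type is default-constructible, and only the trait remains well-formed when it is not.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types, for illustration only.
struct NothrowAlloc   { NothrowAlloc() noexcept {} };
struct ThrowingAlloc  { ThrowingAlloc() {} };
struct NoDefaultAlloc { explicit NoDefaultAlloc(int) {} };

// For default-constructible types the two forms give the same answer.
static_assert(noexcept(NothrowAlloc()), "");
static_assert(std::is_nothrow_default_constructible<NothrowAlloc>::value, "");
static_assert(!noexcept(ThrowingAlloc()), "");
static_assert(!std::is_nothrow_default_constructible<ThrowingAlloc>::value, "");

// The trait is still well-formed without a default constructor and
// simply yields false ...
static_assert(!std::is_nothrow_default_constructible<NoDefaultAlloc>::value, "");
// ... whereas writing noexcept(NoDefaultAlloc()) here would not compile.
```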



   I'll complete testing and add a test on this value-initialization 
before commit if you agree.


Thanks.


   Tests still running but I'm pretty sure it will work the same.


Yes, it should do.

I'm going to commit a fix for PR80893 in vector::_M_initialize
but I don't think it will conflict with your changes.



Re: Default std::vector default and move constructor

2017-05-31 Thread Jonathan Wakely

On 28/05/17 22:13 +0200, François Dumont wrote:
Sure, but just as one person's freedom stops where another's begins, so do 
those requirements :-). Because the Standard says that an allocator 
will be value-init when there is no default-init, it makes use of the 
C++11 default constructor more complicated.


It makes the std::lib implementors' job harder, for the benefit of
users. That is the correct trade off.

We don't get to ignore the guarantees of the standard just because
they're difficult.



Re: [PATCH 2/4 v3][PR 67328] Analyze some bit tests in VRP

2017-05-31 Thread Richard Biener
On Mon, 29 May 2017, Yuri Gribov wrote:

> This improves VRP handling for bitfield comparisons added by previous patch.
> 
> -I

+is_masked_range_test (tree name, tree valt, enum tree_code cond_code, 
bool is_else_edge,
+ tree *new_name,

long line

+  wide_int mask = maskt,
+inv_mask = ~mask,
+val = valt;  // Assume VALT is INTEGER_CST

indent is off here, please use separate declarations:

  wide_int mask = maskt;
  wide_int inv_mask = ~mask;
...

+//  bool is_range = (cond_code == EQ_EXPR) ^ is_else_edge;
+  bool is_range = cond_code == EQ_EXPR;
+

do not leave dead code around.

+  if (is_range)
+{
+  *low_code = val == min ? (enum tree_code) 0 : GE_EXPR;
+  *high_code = val == max ? (enum tree_code) 0 : LE_EXPR;
+}

please use ERROR_MARK here instead of (enum tree_code) 0.

+   {
+ if (low_code)

and check with != ERROR_MARK here.

+  if (is_masked_range_test (name, val, comp_code, is_else_edge, 
&name, &low, &low_code, &high, &high_code))
+   {

long line again.

Otherwise looks ok.

Thanks,
Richard.



Re: [PATCH 4/4 v3][PR 67328] Optimize some masked comparisons to efficient bittest

2017-05-31 Thread Richard Biener
On Mon, 29 May 2017, Yuri Gribov wrote:

> This no longer fixes the PR but still works in some cases as
> demonstrated by the test. So I decided to keep it.

As Richard noticed you don't need widest_ints but can use wide_ints.
Please use == 0 instead of ! on wide-ints as well.

+(for cmp (le gt)
+ (simplify
..
+  (switch
+   (if (cmp == LE_EXPR)
+   (eq:type (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) { 
build_zero_cst (ty); }))
+   (if (cmp == GT_EXPR)
+   (ne:type (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) { 
build_zero_cst (ty); })

long lines plus you can simplify this with using

 (for cmp (le gt)
  eqcmp (eq ne)
 ...

 (eqcmp (bit_and @1 { wide_int_to_tree (ty, hi_bits); }) 
{build_zero_cst (ty); }

no need to spell out :type on the result as well.
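
For readers following along, the identity behind the pattern can be sketched as below.  This assumes an unsigned operand and a constant of the form 2^k - 1, which is how I read the quoted fragment; hi_bits mirrors the name in the excerpt, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// For unsigned x and a constant c of the form 2^k - 1, "x <= c" holds
// exactly when none of the bits above the low k bits is set.
inline bool le_by_compare(std::uint32_t x, std::uint32_t c) {
  return x <= c;
}

inline bool le_by_bittest(std::uint32_t x, std::uint32_t c) {
  const std::uint32_t hi_bits = ~c;   // every bit not covered by c
  return (x & hi_bits) == 0;
}

// The gt case is simply the negation, which is why le/gt pair with eq/ne.
inline bool gt_by_bittest(std::uint32_t x, std::uint32_t c) {
  return (x & ~c) != 0;
}
```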

Richard.


Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Martin Liška
On 05/31/2017 10:35 AM, Jakub Jelinek wrote:
> On Wed, May 31, 2017 at 10:04:53AM +0200, Martin Liška wrote:
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index 13305558d2d..5e9942d5100 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -222,9 +222,13 @@ bool flag_opts_finished
>>  Variable
>>  unsigned int flag_sanitize
>>  
>> +###
>> +Common RejectNegative Joined UInteger Var(flag_no_sanitize_fn) PerFunction
>> +No sanitize flags for a function
> 
> This looks weird, you are redefining the -### option which is normally
> a driver option.

I know. I was thinking that it's also a 'dummy' value.

> I would have thought you just want a Variable, like the one right below
> this.  Aren't all "Variable"s per-function?

Unfortunately not. Well, probably adding a new type 'PerFunctionVariable' would be
a solution. Then optc-save-gen.awk needs to be taught how to save/restore these 
variables.

Is it the way we want to go?

Martin

> 
>   Jakub
> 



Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Martin Liška
On 05/31/2017 10:31 AM, Alexander Monakov wrote:
> On Wed, 31 May 2017, Martin Liška wrote:
>> I added to common.opt:
>> Common RejectNegative Joined UInteger Var(flag_no_sanitize_fn) PerFunction
>> No sanitize flags for a function
> 
> This needs a period at the end ("for a function.").

Ah, I see.

> 
>> FAIL: compiler driver --help=optimizers option(s): "^ +-.*[^:.]$" absent 
>> from output: "  -###No sanitize flags for a function"
>>
>> Should I add new flag IgnoreOption for the option?
> 
> Adding the missing period should fix this failure.

Good, however as Jakub noticed, overwriting '-###' is probably not a solution.

Martin

> 
> Alexander
> 



Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Jakub Jelinek
On Wed, May 31, 2017 at 01:24:47PM +0200, Martin Liška wrote:
> On 05/31/2017 10:35 AM, Jakub Jelinek wrote:
> > On Wed, May 31, 2017 at 10:04:53AM +0200, Martin Liška wrote:
> >> diff --git a/gcc/common.opt b/gcc/common.opt
> >> index 13305558d2d..5e9942d5100 100644
> >> --- a/gcc/common.opt
> >> +++ b/gcc/common.opt
> >> @@ -222,9 +222,13 @@ bool flag_opts_finished
> >>  Variable
> >>  unsigned int flag_sanitize
> >>  
> >> +###
> >> +Common RejectNegative Joined UInteger Var(flag_no_sanitize_fn) PerFunction
> >> +No sanitize flags for a function
> > 
> > This looks weird, you are redefining the -### option which is normally
> > a driver option.
> 
> I know. I was thinking that it's also a 'dummy' value.

It is not.

> > I would have thought you just want a Variable, like the one right below
> > this.  Aren't all "Variable"s per-function?
> 
> Unfortunately not. Well, probably adding new type 'PerFunctionVariable' would 
> be
> solution. Then optc-save-gen.awk needs to be learned how to save/restore 
> these variables.
> 
> Is it the way we want to go?

Yes.  We already have TargetVariable.  Or allow specifying
Variable PerFunction

CCing Joseph as option handling maintainer.

Jakub


[OBVIOUS][PATCH] Fix typo in a comment in cpuid.h (PR target/79155).

2017-05-31 Thread Martin Liška
Hello.

Installing as obvious as it only touches comment.

Martin
>From 4b0eebe5accdc7aa0782acccdd61a151c0a48378 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 31 May 2017 13:35:41 +0200
Subject: [PATCH] Fix typo in a comment in cpuid.h (PR target/79155).

gcc/ChangeLog:

2017-05-31  Martin Liska  

	PR target/79155
	* config/i386/cpuid.h: Fix typo in a comment in cpuid.h.
---
 gcc/config/i386/cpuid.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
index f915d2dbd5a..b3b0f912c98 100644
--- a/gcc/config/i386/cpuid.h
+++ b/gcc/config/i386/cpuid.h
@@ -179,7 +179,7 @@
 
 
 /* Return highest supported input value for cpuid instruction.  ext can
-   be either 0x0 or 0x8000000 to return highest supported value for
+   be either 0x0 or 0x80000000 to return highest supported value for
basic or extended cpuid information.  Function returns 0 if cpuid
is not supported or whatever cpuid returns in eax register.  If sig
pointer is non-null, then first four bytes of the signature
-- 
2.12.2



Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Richard Biener
On Wed, May 31, 2017 at 1:33 PM, Jakub Jelinek  wrote:
> On Wed, May 31, 2017 at 01:24:47PM +0200, Martin Liška wrote:
>> On 05/31/2017 10:35 AM, Jakub Jelinek wrote:
>> > On Wed, May 31, 2017 at 10:04:53AM +0200, Martin Liška wrote:
>> >> diff --git a/gcc/common.opt b/gcc/common.opt
>> >> index 13305558d2d..5e9942d5100 100644
>> >> --- a/gcc/common.opt
>> >> +++ b/gcc/common.opt
>> >> @@ -222,9 +222,13 @@ bool flag_opts_finished
>> >>  Variable
>> >>  unsigned int flag_sanitize
>> >>
>> >> +###
>> >> +Common RejectNegative Joined UInteger Var(flag_no_sanitize_fn) 
>> >> PerFunction
>> >> +No sanitize flags for a function
>> >
>> > This looks weird, you are redefining the -### option which is normally
>> > a driver option.
>>
>> I know. I was thinking that it's also a 'dummy' value.
>
> It is not.
>
>> > I would have thought you just want a Variable, like the one right below
>> > this.  Aren't all "Variable"s per-function?
>>
>> Unfortunately not. Well, probably adding new type 'PerFunctionVariable' 
>> would be
>> solution. Then optc-save-gen.awk needs to be learned how to save/restore 
>> these variables.
>>
>> Is it the way we want to go?
>
> Yes.  We already have TargetVariable.  Or allow specifying
> Variable PerFunction
>
> CCing Joseph as option handling maintainer.

Just wanting to add that "ab-"using options/variables to implement
what are really
function attributes doesn't look very clean.  Unless the plan is to get rid of
function attributes in favor of per-function options.

I'll also note that eventually global variables may want to be no-sanitized
(for asan maybe).  And we don't (yet) have per-variable options.

Richard.

> Jakub


Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Jakub Jelinek
On Wed, May 31, 2017 at 01:46:00PM +0200, Richard Biener wrote:
> Just wanting to add that "ab-"using options/variables to implement
> what are really
> function attributes doesn't look very clean.  Unless the plan is to get rid of
> function attributes in favor of per-function options.

Function attribute here is one thing (the way user writes it) and that
combined with the command line options determines the sanitization performed
(the function attributes only say what sanitization flags should be
ignored).  The proposed per-function variable is just a cache of this
information, because parsing function attributes every time is way too
expensive.
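
A rough sketch of that caching idea follows.  The SAN_* bit values, FunctionInfo and sanitize_p are hypothetical names for illustration, not the patch's actual code; only the notion of a command-line flag_sanitize mask comes from the thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sanitizer bits.
enum : std::uint32_t {
  SAN_ADDRESS   = 1u << 0,
  SAN_UNDEFINED = 1u << 1,
  SAN_THREAD    = 1u << 2
};

// Hypothetical per-function record: the mask is computed once, when the
// function's attributes are processed, instead of re-parsing the
// attribute list on every query.
struct FunctionInfo {
  std::uint32_t no_sanitize_mask;
};

// A sanitizer applies to a function if it was requested on the command
// line and the function's attributes did not opt out of it.
inline bool sanitize_p(std::uint32_t flag_sanitize, const FunctionInfo &fn,
                       std::uint32_t kind) {
  return (flag_sanitize & ~fn.no_sanitize_mask & kind) != 0;
}
```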

Jakub


Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Martin Liška
On 05/31/2017 01:46 PM, Richard Biener wrote:
> On Wed, May 31, 2017 at 1:33 PM, Jakub Jelinek  wrote:
>> On Wed, May 31, 2017 at 01:24:47PM +0200, Martin Liška wrote:
>>> On 05/31/2017 10:35 AM, Jakub Jelinek wrote:
>>>> On Wed, May 31, 2017 at 10:04:53AM +0200, Martin Liška wrote:
>>>>> diff --git a/gcc/common.opt b/gcc/common.opt
>>>>> index 13305558d2d..5e9942d5100 100644
>>>>> --- a/gcc/common.opt
>>>>> +++ b/gcc/common.opt
>>>>> @@ -222,9 +222,13 @@ bool flag_opts_finished
>>>>>  Variable
>>>>>  unsigned int flag_sanitize
>>>>>
>>>>> +###
>>>>> +Common RejectNegative Joined UInteger Var(flag_no_sanitize_fn) 
>>>>> PerFunction
>>>>> +No sanitize flags for a function
>>>>
>>>> This looks weird, you are redefining the -### option which is normally
>>>> a driver option.
>>>
>>> I know. I was thinking that it's also a 'dummy' value.
>>
>> It is not.
>>
>>>> I would have thought you just want a Variable, like the one right below
>>>> this.  Aren't all "Variable"s per-function?
>>>
>>> Unfortunately not. Well, probably adding new type 'PerFunctionVariable' 
>>> would be
>>> solution. Then optc-save-gen.awk needs to be learned how to save/restore 
>>> these variables.
>>>
>>> Is it the way we want to go?
>>
>> Yes.  We already have TargetVariable.  Or allow specifying
>> Variable PerFunction
>>
>> CCing Joseph as option handling maintainer.
> 
> Just wanting to add that "ab-"using options/variables to implement
> what are really
> function attributes doesn't look very clean.  Unless the plan is to get rid of
> function attributes in favor of per-function options.

Well, that was what I did in my original version of the patch. I basically
transformed all no_sanitize_address, no_sanitize_undefined and others to a
single DECL_ATTRIBUTE called 'no_sanitize' where I masked all of these into an
integer. The feedback I was given by Jakub recommended not doing it.

> 
> I'll also note that eventually global variables may want to be no-sanitized
> (for asan maybe).  And we don't (yet) have per-variable options.

Good remark.

Martin

> 
> Richard.
> 
>> Jakub



Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Martin Liška
On 05/31/2017 01:51 PM, Jakub Jelinek wrote:
> On Wed, May 31, 2017 at 01:46:00PM +0200, Richard Biener wrote:
>> Just wanting to add that "ab-"using options/variables to implement
>> what are really
>> function attributes doesn't look very clean.  Unless the plan is to get rid 
>> of
>> function attributes in favor of per-function options.
> 
> Function attribute here is one thing (the way user writes it) and that
> combined with the command line options determines the sanitization performed
> (the function attributes only say what sanitization flags should be
> ignored).  The proposed per-function variable is just a cache of this
> information, because parsing function attributes every time is way too
> expensive.

But on the other hand every function decorated with such attribute will lead
to having a separate copy of struct cl_optimization, which is quite big 
structure.

Another question is how often such attribute is used, I guess usage is quite 
rare?

Martin

> 
>   Jakub
> 



[build] Support --sysroot with Solaris ld

2017-05-31 Thread Rainer Orth
The Solaris linker recently gained sysroot support.  The following patch
enables that, although there isn't much to do:

* Until recently, ld --help output went to stderr, not being caught by
  gcc/configure's tests which only checked stdout.  However, older ld
  versions still differ here and libtool long has been checking both
  stdout and stderr, so this seems a pretty obvious change to me.

  The only point worth mentioning is that I've guarded the --as-needed
  check for non-GNU ld: before, the native Solaris -z ignore/-z record
  forms were used, now this would use the compat options
  --as-needed/--no-as-needed which occur in the --help output.

* While Solaris ld *does* support --sysroot for gld compatibility, we've
  always preferred the native forms of the options, -z sysroot in this
  case, which the sol2.h part of the patch implements.

Tested in i386-pc-solaris2.12 and sparc-sun-solaris2.12 builds with both
ld and gld, checking that auto-host.h has no unexpected changes.

Also tested with i386-pc-solaris2.12 x sparc-sun-solaris2.12 and
sparc-sun-solaris2.12 x i386-pc-solaris2.12 crosses with both ld (which
has been a cross-linker for quite some time) and gld, checking that -z
sysroot/--sysroot was used as expected.

Ok for mainline?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2017-05-16  Rainer Orth  

* configure.ac (gcc_cv_ld_static_dynamic): Also check stderr for
$gcc_cv_ld --help output.
(gcc_cv_ld_demangle): Likewise.
(gcc_cv_ld_eh_frame_hdr): Likewise.
(gcc_cv_ld_pie): Likewise.
(gcc_cv_ld_as_needed): Likewise.  Prefer native forms unless $gnu_ld.
(gcc_cv_ld_buildid): Likewise.
(gcc_cv_ld_sysroot): Likewise.
(ld_bndplt_support): Likewise.
(ld_pushpopstate_support): Likewise.
* configure: Regenerate.
* config/sol2.h [!USE_GLD] (SYSROOT_SPEC): Define.

# HG changeset patch
# Parent  75c2ebacfcb74a24e20ed0ec0acd6eeafeff5e86
Support --sysroot with Solaris ld

diff --git a/gcc/config/sol2.h b/gcc/config/sol2.h
--- a/gcc/config/sol2.h
+++ b/gcc/config/sol2.h
@@ -334,6 +334,11 @@ along with GCC; see the file COPYING3.  
 #endif
 
 #ifndef USE_GLD
+/* Prefer native form with Solaris ld.  */
+#define SYSROOT_SPEC "-z sysroot=%R"
+#endif
+
+#ifndef USE_GLD
 /* With Sun ld, use mapfile to enforce direct binding to libgcc_s unwinder.  */
 #define LINK_LIBGCC_MAPFILE_SPEC \
   "%{shared|shared-libgcc:-M %slibgcc-unwind.map}"
diff --git a/gcc/configure.ac b/gcc/configure.ac
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3569,8 +3569,8 @@ if test $in_tree_ld = yes ; then
   fi
 elif test x$gcc_cv_ld != x; then
   # Check if linker supports -Bstatic/-Bdynamic option
-  if $gcc_cv_ld --help 2>/dev/null | grep -- -Bstatic > /dev/null \
- && $gcc_cv_ld --help 2>/dev/null | grep -- -Bdynamic > /dev/null; then
+  if $gcc_cv_ld --help 2>&1 | grep -- -Bstatic > /dev/null \
+ && $gcc_cv_ld --help 2>&1 | grep -- -Bdynamic > /dev/null; then
   gcc_cv_ld_static_dynamic=yes
   else
 case "$target" in
@@ -3614,7 +3614,7 @@ if test x"$demangler_in_ld" = xyes; then
 fi
   elif test x$gcc_cv_ld != x -a x"$gnu_ld" = xyes; then
 # Check if the GNU linker supports --demangle option
-if $gcc_cv_ld --help 2>/dev/null | grep no-demangle > /dev/null; then
+if $gcc_cv_ld --help 2>&1 | grep no-demangle > /dev/null; then
   gcc_cv_ld_demangle=yes
 fi
   fi
@@ -4949,7 +4949,7 @@ if test $in_tree_ld = yes ; then
 elif test x$gcc_cv_ld != x; then
   if echo "$ld_ver" | grep GNU > /dev/null; then
 # Check if linker supports --eh-frame-hdr option
-if $gcc_cv_ld --help 2>/dev/null | grep eh-frame-hdr > /dev/null; then
+if $gcc_cv_ld --help 2>&1 | grep eh-frame-hdr > /dev/null; then
   gcc_cv_ld_eh_frame_hdr=yes
 fi
   else
@@ -5020,7 +5020,7 @@ if test $in_tree_ld = yes ; then
   fi
 elif test x$gcc_cv_ld != x; then
   # Check if linker supports -pie option
-  if $gcc_cv_ld --help 2>/dev/null | grep -- -pie > /dev/null; then
+  if $gcc_cv_ld --help 2>&1 | grep -- -pie > /dev/null; then
 gcc_cv_ld_pie=yes
 case "$target" in
   *-*-solaris2*)
@@ -5346,19 +5346,19 @@ if test $in_tree_ld = yes ; then
 gcc_cv_ld_as_needed=yes
   fi
 elif test x$gcc_cv_ld != x; then
-	# Check if linker supports --as-needed and --no-as-needed options
-	if $gcc_cv_ld --help 2>/dev/null | grep as-needed > /dev/null; then
-		gcc_cv_ld_as_needed=yes
-	else
-	  case "$target" in
-	# Solaris 2 ld always supports -z ignore/-z record.
-	*-*-solaris2*)
-	  gcc_cv_ld_as_needed=yes
-	  gcc_cv_ld_as_needed_option="-z ignore"
-	  gcc_cv_ld_no_as_needed_option="-z record"
-	  ;;
-	  esac
-	fi
+  # Check if linker supports --as-needed and --no-as-needed options
+  if $gcc_cv_ld --help 2>&1 | grep as-needed > /dev/null; then
+gcc_cv_ld_as_needed=yes
+  fi
+  c

Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Jakub Jelinek
On Wed, May 31, 2017 at 01:57:48PM +0200, Martin Liška wrote:
> On 05/31/2017 01:51 PM, Jakub Jelinek wrote:
> > On Wed, May 31, 2017 at 01:46:00PM +0200, Richard Biener wrote:
> >> Just wanting to add that "ab-"using options/variables to implement
> >> what are really
> >> function attributes doesn't look very clean.  Unless the plan is to get 
> >> rid of
> >> function attributes in favor of per-function options.
> > 
> > Function attribute here is one thing (the way user writes it) and that
> > combined with the command line options determines the sanitization performed
> > (the function attributes only say what sanitization flags should be
> > ignored).  The proposed per-function variable is just a cache of this
> > information, because parsing function attributes every time is way too
> > expensive.
> 
> But on the other hand every function decorated with such attribute will lead
> to having a separate copy of struct cl_optimization, which is quite big 
> structure.

Separate?  I thought cl_optimization structs are shared, so if you have 2
functions that have the same no_sanitize* attributes and all other
optimization flags same as well, they should share OPTIMIZATION_NODE.
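
The sharing described here is essentially interning.  A minimal sketch, with OptionSet and OptionTable as hypothetical stand-ins (this is not how GCC's optimization nodes are actually implemented): two functions whose option sets compare equal receive the same node.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <tuple>

// Hypothetical, much-reduced stand-in for a per-function option set.
struct OptionSet {
  int optimize;
  unsigned no_sanitize;
};

inline bool operator<(const OptionSet &a, const OptionSet &b) {
  return std::tie(a.optimize, a.no_sanitize)
         < std::tie(b.optimize, b.no_sanitize);
}

// Interning table: equal option sets map to one shared node.
class OptionTable {
 public:
  const OptionSet *intern(const OptionSet &opts) {
    // std::set keeps element addresses stable across later insertions.
    return &*nodes_.insert(opts).first;
  }
  std::size_t size() const { return nodes_.size(); }

 private:
  std::set<OptionSet> nodes_;
};
```

Under this scheme the extra cost is one node per distinct combination of flags, not one per decorated function.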

Jakub


[C++ Patch] PR 80896 ("[[nodiscard]] is ignored for functions returning references")

2017-05-31 Thread Paolo Carlini

Hi,

this one appears to be a rather simple case of missing diagnostic: in 
convert_to_void we aren't calling maybe_warn_nodiscard when we strip an 
INDIRECT_REF wrapping a CALL_EXPR thus we don't issue the diagnostic 
that we normally provide for plain CALL_EXPRs (eg, for a func returning 
a plain int). Tested x86_64-linux.


Thanks, Paolo.

//

/cp
2017-05-31  Paolo Carlini  

PR c++/80896
* cvt.c (convert_to_void): Possibly call maybe_warn_nodiscard
for case INDIRECT_REF too in the main switch.

/testsuite
2017-05-31  Paolo Carlini  

PR c++/80896
* g++.dg/cpp1z/nodiscard5.C: New.
Index: testsuite/g++.dg/cpp1z/nodiscard5.C
===
--- testsuite/g++.dg/cpp1z/nodiscard5.C (revision 0)
+++ testsuite/g++.dg/cpp1z/nodiscard5.C (working copy)
@@ -0,0 +1,7 @@
+// PR c++/80896
+// { dg-do compile { target c++11 } }
+
+int x = 42;
+[[nodiscard]] int& func() { return x; }
+
+int main() { func(); }  // { dg-warning "ignoring return value" }
Index: cp/cvt.c
===
--- cp/cvt.c(revision 248728)
+++ cp/cvt.c(working copy)
@@ -1296,6 +1296,8 @@ convert_to_void (tree expr, impl_conv_void implici
 && !is_reference)
   warning_at (loc, OPT_Wunused_value, "value computed is not 
used");
 expr = TREE_OPERAND (expr, 0);
+   if (TREE_CODE (expr) == CALL_EXPR)
+ maybe_warn_nodiscard (expr, implicit);
   }
 
break;


Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Richard Biener
On Wed, May 31, 2017 at 1:51 PM, Jakub Jelinek  wrote:
> On Wed, May 31, 2017 at 01:46:00PM +0200, Richard Biener wrote:
>> Just wanting to add that "ab-"using options/variables to implement
>> what are really
>> function attributes doesn't look very clean.  Unless the plan is to get rid 
>> of
>> function attributes in favor of per-function options.
>
> Function attribute here is one thing (the way user writes it) and that
> combined with the command line options determines the sanitization performed
> (the function attributes only say what sanitization flags should be
> ignored).  The proposed per-function variable is just a cache of this
> information, because parsing function attributes every time is way too
> expensive.

True, but isn't that just an excuse to not improve attribute list
representation?

Ideally we'd have sth like attributes.def and a sorted vector of
integer id, args
pairs.  Using a sorted vector of the existing stuff (compared to the tree list)
might also help.

Yes, we'd get (quite?) a bit less attribute list sharing this way but
we can still
share the actual tree-whatever thing that represents the args.
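
Roughly what I have in mind, as a sketch only: AttrId, AttrEntry and AttrList are hypothetical names, the ids are imagined as generated from an attributes.def, and args is an opaque pointer standing in for the shared argument tree.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical ids, as if generated from an attributes.def file.
enum AttrId { ATTR_NOINLINE = 1, ATTR_NO_SANITIZE = 2, ATTR_SECTION = 3 };

struct AttrEntry {
  AttrId id;
  const void *args;   // opaque stand-in for the shared argument tree
};

// Attribute list kept sorted by id, so lookup is a binary search
// instead of a walk over a linked list with string compares.
class AttrList {
 public:
  void add(AttrId id, const void *args) {
    entries_.insert(find_pos(id), AttrEntry{id, args});
  }

  const AttrEntry *lookup(AttrId id) const {
    std::vector<AttrEntry>::const_iterator pos = std::lower_bound(
        entries_.begin(), entries_.end(), id, less_than_id);
    return (pos != entries_.end() && pos->id == id) ? &*pos : nullptr;
  }

 private:
  static bool less_than_id(const AttrEntry &e, AttrId key) {
    return e.id < key;
  }
  std::vector<AttrEntry>::iterator find_pos(AttrId id) {
    return std::lower_bound(entries_.begin(), entries_.end(), id,
                            less_than_id);
  }
  std::vector<AttrEntry> entries_;
};
```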

Richard.

>
> Jakub


Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Richard Biener
On Wed, May 31, 2017 at 2:01 PM, Jakub Jelinek  wrote:
> On Wed, May 31, 2017 at 01:57:48PM +0200, Martin Liška wrote:
>> On 05/31/2017 01:51 PM, Jakub Jelinek wrote:
>> > On Wed, May 31, 2017 at 01:46:00PM +0200, Richard Biener wrote:
>> >> Just wanting to add that "ab-"using options/variables to implement
>> >> what are really
>> >> function attributes doesn't look very clean.  Unless the plan is to get 
>> >> rid of
>> >> function attributes in favor of per-function options.
>> >
>> > Function attribute here is one thing (the way user writes it) and that
>> > combined with the command line options determines the sanitization 
>> > performed
>> > (the function attributes only say what sanitization flags should be
>> > ignored).  The proposed per-function variable is just a cache of this
>> > information, because parsing function attributes every time is way too
>> > expensive.
>>
>> But on the other hand every function decorated with such attribute will lead
>> to having a separate copy of struct cl_optimization, which is quite big 
>> structure.
>
> Separate?  I thought cl_optimization structs are shared, so if you have 2
> functions that have the same no_sanitize* attributes and all other
> optimization flags same as well, they should share OPTIMIZATION_NODE.

Yes.  For optimizing size we might want to tweak the machinery to use a bool
bitfield instead of {un,}signed ints for things that are really just
flags (true/false).
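
A rough sketch of the size effect, assuming eight on/off options. The struct names are made up for illustration, and the exact sizes depend on the ABI.

```cpp
#include <cassert>

// Eight on/off options stored as full ints.
struct flags_as_ints
{
  int f0, f1, f2, f3, f4, f5, f6, f7;
};

// The same eight options stored as one-bit bool bitfields.
struct flags_as_bits
{
  bool f0 : 1, f1 : 1, f2 : 1, f3 : 1;
  bool f4 : 1, f5 : 1, f6 : 1, f7 : 1;
};

// On common ABIs this is 32 bytes versus 1 byte.
static_assert (sizeof (flags_as_bits) < sizeof (flags_as_ints),
               "bitfields should be smaller than one int per flag");
```

The trade-off is that a bitfield member cannot have its address taken, so any code that passes a pointer to a flag would need adjusting.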

Richard.

> Jakub


[PATCH] Fix PR80880

2017-05-31 Thread Richard Biener

Approved by Ilya in the PR.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-05-31  Richard Biener  

PR target/80880
* config/i386/i386.c (ix86_expand_builtin): Remove assert
for arg being an SSA name when expanding IX86_BUILTIN_BNDRET.

* gcc.target/i386/pr80880.c: New testcase.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 248722)
+++ gcc/config/i386/i386.c  (working copy)
@@ -37584,7 +37584,6 @@ ix86_expand_builtin (tree exp, rtx targe
 
 case IX86_BUILTIN_BNDRET:
   arg0 = CALL_EXPR_ARG (exp, 0);
-  gcc_assert (TREE_CODE (arg0) == SSA_NAME);
   target = chkp_get_rtl_bounds (arg0);
 
   /* If no bounds were specified for returned value,
Index: gcc/testsuite/gcc.target/i386/pr80880.c
===
--- gcc/testsuite/gcc.target/i386/pr80880.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/pr80880.c (working copy)
@@ -0,0 +1,10 @@
+/* PR target/65523 */
+/* { dg-do compile { target { ! x32 } } } */
+/* { dg-options "-O -fcheck-pointer-bounds -mmpx" } */
+
+int *fn1()
+{
+  int *r = fn1();
+  if (r == (void *)0)
+return r;
+}


Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Martin Liška
On 05/31/2017 02:06 PM, Richard Biener wrote:
> On Wed, May 31, 2017 at 2:01 PM, Jakub Jelinek  wrote:
>> On Wed, May 31, 2017 at 01:57:48PM +0200, Martin Liška wrote:
>>> On 05/31/2017 01:51 PM, Jakub Jelinek wrote:
>>>> On Wed, May 31, 2017 at 01:46:00PM +0200, Richard Biener wrote:
>>>>> Just wanting to add that "ab-"using options/variables to implement
>>>>> what are really
>>>>> function attributes doesn't look very clean.  Unless the plan is to get
>>>>> rid of
>>>>> function attributes in favor of per-function options.
>>>>
>>>> Function attribute here is one thing (the way user writes it) and that
>>>> combined with the command line options determines the sanitization
>>>> performed
>>>> (the function attributes only say what sanitization flags should be
>>>> ignored).  The proposed per-function variable is just a cache of this
>>>> information, because parsing function attributes every time is way too
>>>> expensive.
>>>
>>> But on the other hand, every function decorated with such an attribute will
>>> lead
>>> to having a separate copy of struct cl_optimization, which is quite a big
>>> structure.
>>
>> Separate?  I thought cl_optimization structs are shared, so if you have 2
>> functions that have the same no_sanitize* attributes and all other
>> optimization flags same as well, they should share OPTIMIZATION_NODE.
> 
> Yes.  For optimizing size we might want to tweak the machinery to use a bool
> bitfield instead of {un,}signed ints for things that are really just
> flags (true/false).

I've written that down on my TODO list.  I'll work on it some time during
stage1.

Martin

> 
> Richard.
> 
>> Jakub



Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Martin Liška
On 05/31/2017 02:04 PM, Richard Biener wrote:
> On Wed, May 31, 2017 at 1:51 PM, Jakub Jelinek  wrote:
>> On Wed, May 31, 2017 at 01:46:00PM +0200, Richard Biener wrote:
>>> Just wanting to add that "ab-"using options/variables to implement
>>> what are really
>>> function attributes doesn't look very clean.  Unless the plan is to get rid 
>>> of
>>> function attributes in favor of per-function options.
>>
>> Function attribute here is one thing (the way user writes it) and that
>> combined with the command line options determines the sanitization performed
>> (the function attributes only say what sanitization flags should be
>> ignored).  The proposed per-function variable is just a cache of this
>> information, because parsing function attributes every time is way too
>> expensive.
> 
> True, but isn't that just an excuse to not improve attribute list
> representation?
> 
> Ideally we'd have sth like attributes.def and a sorted vector of
> integer id, args
> pairs.  Using a sorted vector of the existing stuff (compared to the tree 
> list)
> might also help.

Then it would be tree-wise very similar to CONSTRUCTOR, which also contains a
vector of (index, value) pairs?

> 
> Yes, we'd get (quite?) a bit less attribute list sharing this way but
> we can still
> share the actual tree-whatever thing that represents the args.

Any estimate of how difficult such a transformation would be?

Martin

> 
> Richard.
> 
>>
>> Jakub



Re: [PATCH 1/2] Port Doxygen support script from Perl to Python; add unittests

2017-05-31 Thread Martin Liška
On 04/28/2017 02:03 PM, Martin Liška wrote:
> You were not brave enough to port remaining pattern in 
> contrib/filter_knr2ansi.pl,
> right :) ?

Well, it shows that the script just screws up in many places, and I bet we
don't have many K&R declarations (if any).  I'm attaching a diff showing how
it transforms input files.  That said, I suggest we eventually stop using the
script.

Martin
--- /tmp/content	2017-05-31 14:07:13.220516780 +0200
+++ /tmp/content.after	2017-05-31 14:08:51.718531194 +0200
@@ -372,7 +372,7 @@
Returns false if MEM is not suitable for the alias-oracle.  */
 
 static bool
-ao_ref_from_mem (ao_ref *ref, const_rtx mem)
+ao_ref_from_mem ()
 {
   tree expr = MEM_EXPR (mem);
   tree base;
@@ -456,7 +456,7 @@
two rtxen may alias, false otherwise.  */
 
 static bool
-rtx_refs_may_alias_p (const_rtx x, const_rtx mem, bool tbaa_p)
+rtx_refs_may_alias_p ()
 {
   ao_ref ref1, ref2;
 
@@ -474,7 +474,7 @@
such an entry, or NULL otherwise.  */
 
 static inline alias_set_entry *
-get_alias_set_entry (alias_set_type alias_set)
+get_alias_set_entry ()
 {
   return (*alias_sets)[alias_set];
 }
@@ -483,7 +483,7 @@
the two MEMs cannot alias each other.  */
 
 static inline int
-mems_in_disjoint_alias_sets_p (const_rtx mem1, const_rtx mem2)
+mems_in_disjoint_alias_sets_p ()
 {
   return (flag_strict_aliasing
 	  && ! alias_sets_conflict_p (MEM_ALIAS_SET (mem1),
@@ -493,7 +493,7 @@
 /* Return true if the first alias set is a subset of the second.  */
 
 bool
-alias_set_subset_of (alias_set_type set1, alias_set_type set2)
+alias_set_subset_of ()
 {
   alias_set_entry *ase2;
 
@@ -557,7 +557,7 @@
 /* Return 1 if the two specified alias sets may conflict.  */
 
 int
-alias_sets_conflict_p (alias_set_type set1, alias_set_type set2)
+alias_sets_conflict_p ()
 {
   alias_set_entry *ase1;
   alias_set_entry *ase2;
@@ -631,7 +631,7 @@
 /* Return 1 if the two specified alias sets will always conflict.  */
 
 int
-alias_sets_must_conflict_p (alias_set_type set1, alias_set_type set2)
+alias_sets_must_conflict_p ()
 {
   /* Disable TBAA oracle with !flag_strict_aliasing.  */
   if (!flag_strict_aliasing)
@@ -656,7 +656,7 @@
NULL_TREE, it means we know nothing about the storage.  */
 
 int
-objects_must_conflict_p (tree t1, tree t2)
+objects_must_conflict_p ()
 {
   alias_set_type set1, set2;
 
@@ -698,7 +698,7 @@
set of this parent is the alias set that must be used for T itself.  */
 
 tree
-component_uses_parent_alias_set_from (const_tree t)
+component_uses_parent_alias_set_from ()
 {
   const_tree found = NULL_TREE;
 
@@ -761,7 +761,7 @@
alias-set zero.  */
 
 static bool
-ref_all_alias_ptr_type_p (const_tree t)
+ref_all_alias_ptr_type_p ()
 {
   return (TREE_CODE (TREE_TYPE (t)) == VOID_TYPE
 	  || TYPE_REF_CAN_ALIAS_ALL (t));
@@ -772,7 +772,7 @@
special about dereferencing T.  */
 
 static alias_set_type
-get_deref_alias_set_1 (tree t)
+get_deref_alias_set_1 ()
 {
   /* All we care about is the type.  */
   if (! TYPE_P (t))
@@ -791,7 +791,7 @@
either a type or an expression.  */
 
 alias_set_type
-get_deref_alias_set (tree t)
+get_deref_alias_set ()
 {
   /* If we're not doing any alias analysis, just assume everything
  aliases everything else.  */
@@ -817,7 +817,7 @@
can be used for assigning an alias set.  */
  
 static tree
-reference_alias_ptr_type_1 (tree *t)
+reference_alias_ptr_type_1 ()
 {
   tree inner;
 
@@ -869,7 +869,7 @@
set for T and the replacement.  */
 
 tree
-reference_alias_ptr_type (tree t)
+reference_alias_ptr_type ()
 {
   /* If the frontend assigns this alias-set zero, preserve that.  */
   if (lang_hooks.get_alias_set (t) == 0)
@@ -894,7 +894,7 @@
from get_deref_alias_set.  */
 
 bool
-alias_ptr_types_compatible_p (tree t1, tree t2)
+alias_ptr_types_compatible_p ()
 {
   if (TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2))
 return true;
@@ -910,7 +910,7 @@
 /* Create emptry alias set entry.  */
 
 alias_set_entry *
-init_alias_set_entry (alias_set_type set)
+init_alias_set_entry ()
 {
   alias_set_entry *ase = ggc_alloc ();
   ase->alias_set = set;
@@ -927,7 +927,7 @@
expression.  Call language-specific routine for help, if needed.  */
 
 alias_set_type
-get_alias_set (tree t)
+get_alias_set ()
 {
   alias_set_type set;
 
@@ -1212,7 +1212,7 @@
 /* Return a brand-new alias set.  */
 
 alias_set_type
-new_alias_set (void)
+new_alias_set ()
 {
   if (alias_sets == 0)
 vec_safe_push (alias_sets, (alias_set_entry *) NULL);
@@ -1234,7 +1234,7 @@
subset of alias set zero.  */
 
 void
-record_alias_subset (alias_set_type superset, alias_set_type subset)
+record_alias_subset ()
 {
   alias_set_entry *superset_entry;
   alias_set_entry *subset_entry;
@@ -1291,7 +1291,7 @@
only record the component type if it is not marked non-aliased.  */
 
 void
-record_component_aliases (tree type)
+record_component_aliases ()
 {
   alias_set_type superset = get_alias_set (type);
   tree field;
@@ -1365,7 +1365,7 @@
 static GTY(()) alias_set_type 

Re: [PATCH TEST]Rectify test case gcc.dg/tree-ssa/ivopt_mult_4.c

2017-05-31 Thread Bin.Cheng
On Fri, May 26, 2017 at 12:49 PM, Richard Biener
 wrote:
> On Thu, May 25, 2017 at 8:00 PM, Bin Cheng  wrote:
>> Hi,
>> I believe this tests has been wrongly modified previously.  It is to test 
>> that the exit check on
>> pointer shouldn't be replaced by integer IV.  Somehow GCC starts replacing 
>> the check on
>> integer IV with pointer IV.  It's valid, though inefficient.  And somehow we 
>> starting checking
>> this iv replacement.   This patch rectifies it by specifically checking the 
>> check on pointer
>> shouldn't be replaced.
>
> So maybe it should then test that the pointer test prevails?  Or
> rather that it doesn't replace
> any exit test?  If 'p' changes for '_2' for unrelated reasons the
> pattern will be not testing what
> it is supposed to test...
Thanks for reviewing.  I updated the patch to test that the condition on
p_limit2 still exists before expansion.  Is it OK?

Thanks,
bin
>
> Richard.
>
>> Bootstrap and test in series on x86_64.  Is it OK?
>> Thanks,
>> bin
>> gcc/testsuite/ChangeLog
>> 2017-05-11  Bin Cheng  
>>
>> * gcc.dg/tree-ssa/ivopt_mult_4.c: Explicitly check comparison
>> on pointer should not be replaced.
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_4.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_4.c
index effb052..e69e416 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_4.c
@@ -1,6 +1,6 @@
 
 /* { dg-do compile { target {{ i?86-*-* x86_64-*-* } && lp64 } } } */
-/* { dg-options "-O2 -m64 -fdump-tree-ivopts-details" } */
+/* { dg-options "-O2 -m64 -fdump-tree-optimized" } */
 
 /* iv i's step 16 so its period is smaller than the max iterations
  * i.e. replacing if (p2 > p_limit2) with testing of i may result in
@@ -21,4 +21,4 @@ long foo(long* p, long* p2, int N1, int N2)
   return s;
 }
 
-/* { dg-final { scan-tree-dump "Replacing exit test" "ivopts"} } */
+/* { dg-final { scan-tree-dump "if \\(.*p_limit2.*\\)" "optimized"} } */
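
To see what the new scan pattern accepts, here is a small sketch using std::regex. Two things are assumptions: the sample dump lines in the usage note are invented, and the testsuite actually uses Tcl regular expressions, which behave the same as ECMAScript only for simple patterns like this one.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Return true if LINE contains an "if (...)" condition mentioning p_limit2,
// mirroring the shape of the scan-tree-dump pattern in the patch.
bool
has_pointer_exit_test (const std::string &line)
{
  static const std::regex pattern (R"re(if \(.*p_limit2.*\))re");
  return std::regex_search (line, pattern);
}
```

A line such as "if (p2_14 > p_limit2_11)" matches, while an exit test rewritten in terms of an integer IV, e.g. "if (ivtmp.8_3 != _25)", does not.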


[PATCH] PR libstdc++/80893 Fix null dereference in vector<bool>

2017-05-31 Thread Jonathan Wakely

vector<bool> does addressof(*ptr) where ptr is returned by
allocate(n), but if n==0 that pointer might not be dereferenceable.
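
A minimal sketch of the guard, not the library code itself. The checked_ptr type is a simplified, hypothetical stand-in for a user-defined pointer that throws when a null value is dereferenced.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>

// Toy fancy pointer that throws if a null pointer is dereferenced.
struct checked_ptr
{
  unsigned long *value;

  unsigned long &operator* () const
  {
    if (!value)
      throw std::runtime_error ("dereferenced null pointer");
    return *value;
  }
};

// Convert Q to a raw pointer, dereferencing it only when N is nonzero.
unsigned long *
start_of_storage (checked_ptr q, std::size_t n)
{
  return n ? std::addressof (*q) : nullptr;
}
```

Calling start_of_storage with a null checked_ptr and n == 0 returns a null pointer without ever invoking operator*.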

While testing the fix I also found some bugs in the
__gnu_test::PointerBase helper that needed correcting.

PR libstdc++/80893
* include/bits/stl_bvector.h (vector::_M_initialize): Avoid
null pointer dereference when size is zero.
* testsuite/23_containers/vector/bool/80893.cc: New.
* testsuite/util/testsuite_allocator.h (PointerBase::PointerBase):
Add non-explicit constructor from nullptr.
(PointerBase::derived() const): Add const-qualified overload.

Tested powerpc64le-linux, committed to trunk.

commit bdb028b38ace766538150d5ef7874123d0689cd7
Author: Jonathan Wakely 
Date:   Wed May 31 11:40:14 2017 +0100

PR libstdc++/80893 Fix null dereference in vector<bool>

PR libstdc++/80893
* include/bits/stl_bvector.h (vector::_M_initialize): Avoid
null pointer dereference when size is zero.
* testsuite/23_containers/vector/bool/80893.cc: New.
* testsuite/util/testsuite_allocator.h (PointerBase::PointerBase):
Add non-explicit constructor from nullptr.
(PointerBase::derived() const): Add const-qualified overload.

diff --git a/libstdc++-v3/include/bits/stl_bvector.h 
b/libstdc++-v3/include/bits/stl_bvector.h
index 37e000a..78195c1 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -1089,9 +1089,17 @@ template
 void
 _M_initialize(size_type __n)
 {
-  _Bit_pointer __q = this->_M_allocate(__n);
-  this->_M_impl._M_end_of_storage = __q + _S_nword(__n);
-  this->_M_impl._M_start = iterator(std::__addressof(*__q), 0);
+  if (__n)
+   {
+ _Bit_pointer __q = this->_M_allocate(__n);
+ this->_M_impl._M_end_of_storage = __q + _S_nword(__n);
+ this->_M_impl._M_start = iterator(std::__addressof(*__q), 0);
+   }
+  else
+   {
+ this->_M_impl._M_end_of_storage = _Bit_pointer();
+ this->_M_impl._M_start = iterator(0, 0);
+   }
   this->_M_impl._M_finish = this->_M_impl._M_start + difference_type(__n);
 }
 
diff --git a/libstdc++-v3/testsuite/23_containers/vector/bool/80893.cc 
b/libstdc++-v3/testsuite/23_containers/vector/bool/80893.cc
new file mode 100644
index 000..0545b38
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/vector/bool/80893.cc
@@ -0,0 +1,74 @@
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// libstdc++/80893
+
+#include <vector>
+#include <testsuite_allocator.h>
+
+struct DereferencedInvalidPointer { };
+
+// User-defined pointer type that throws if a null pointer is dereferenced.
+template<typename T>
+struct Pointer : __gnu_test::PointerBase<Pointer<T>, T>
+{
+  using __gnu_test::PointerBase<Pointer<T>, T>::PointerBase;
+
+  T& operator*() const
+  {
+    if (!this->value)
+      throw DereferencedInvalidPointer();
+    return *this->value;
+  }
+};
+
+// Minimal allocator using Pointer
+template<typename T>
+struct Alloc
+{
+  typedef T value_type;
+  typedef Pointer<T> pointer;
+
+  Alloc() = default;
+  template<typename U>
+    Alloc(const Alloc<U>&) { }
+
+  pointer allocate(std::size_t n)
+  {
+    if (n)
+      return pointer(std::allocator<T>().allocate(n));
+    return nullptr;
+  }
+
+  void deallocate(pointer p, std::size_t n)
+  {
+    if (n)
+      std::allocator<T>().deallocate(p.value, n);
+  }
+};
+
+template<typename T>
+bool operator==(Alloc<T>, Alloc<T>) { return true; }
+
+template<typename T>
+bool operator!=(Alloc<T>, Alloc<T>) { return false; }
+
+int main()
+{
+  std::vector<bool, Alloc<bool>> v(0);
+  std::vector<bool, Alloc<bool>> w(v);
+}
diff --git a/libstdc++-v3/testsuite/util/testsuite_allocator.h 
b/libstdc++-v3/testsuite/util/testsuite_allocator.h
index 813fc81..56c2708 100644
--- a/libstdc++-v3/testsuite/util/testsuite_allocator.h
+++ b/libstdc++-v3/testsuite/util/testsuite_allocator.h
@@ -570,6 +570,8 @@ namespace __gnu_test
 
   explicit PointerBase(T* p = nullptr) : value(p) { }
 
+  PointerBase(std::nullptr_t) : value(nullptr) { }
+
   template(std::declval()))>
PointerBase(const PointerBase& p) : value(p.value) { }
@@ -603,7 +605,11 @@ namespace __gnu_test
   }
 
 private:
-  Derived& derived() { return static_cast<Derived&>(*this); }
+  Derived&
+  derived() { return static_cast<Derived&>(*this); }
+
+  const Deri

Re: [PATCH, GCC/ARM/gcc-7-branch] Backport PR71607

2017-05-31 Thread Prakhar Bahuguna
On 31/05/2017 09:19:40, Richard Sandiford wrote:
> const_ints are supposed to be stored in sign-extended form, so a 32-bit
> integer with the MSB set should be 0x8000|x instead of
> 0x8000|x.  It's a bug if you have one where that isn't true.
> 
> In the patch it looks like this could come from:
> ...these two splits, where the GEN_INTs should probably be:
> 
>   gen_int_mode (..., SImode);
> 
> instead.

Hi Richard, thanks for the tip. Is there a test case that could produce an
incorrect result? I've attempted to create one using negative doubles and
floats but haven't succeeded.

Thanks,

-- 

Prakhar Bahuguna


Re: [PATCH, GCC/ARM/gcc-7-branch] Backport PR71607

2017-05-31 Thread Richard Sandiford
Prakhar Bahuguna  writes:
> On 31/05/2017 09:19:40, Richard Sandiford wrote:
>> const_ints are supposed to be stored in sign-extended form, so a 32-bit
>> integer with the MSB set should be 0x8000|x instead of
>> 0x8000|x.  It's a bug if you have one where that isn't true.
>> 
>> In the patch it looks like this could come from:
>> ...these two splits, where the GEN_INTs should probably be:
>> 
>>   gen_int_mode (..., SImode);
>> 
>> instead.
>
> Hi Richard, thanks for the tip. Is there a test case that could produce an
> incorrect result? I've attempted to create one using negative doubles and
> floats but haven't succeeded.

Just to check, are you testing with --enable-checking=yes,rtl?

When the values you tried were split, did you get the sign-extended form
or the zero-extended form?
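
For reference, a small sketch of the two forms being contrasted. It assumes a 64-bit host integer and uses made-up helper names; it is not the GCC API.

```cpp
#include <cassert>
#include <cstdint>

// Return the 64-bit sign-extended form of the 32-bit value V.
std::int64_t
sign_extend_32 (std::uint32_t v)
{
  const std::int64_t sign_bit = INT64_C (1) << 31;
  // Flipping the sign bit and subtracting it again propagates bit 31 upward.
  return (static_cast<std::int64_t> (v) ^ sign_bit) - sign_bit;
}

// Return the 64-bit zero-extended form of V.
std::int64_t
zero_extend_32 (std::uint32_t v)
{
  return static_cast<std::int64_t> (v);
}
```

For a value with the MSB set the two results differ in the upper 32 bits, which is the difference the question above is trying to pin down.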

Thanks,
Richard


Re: [PATCH] Fix expand_builtin_atomic_fetch_op for pre-op (PR80902)

2017-05-31 Thread Segher Boessenkool
Ping.

(Sorry for the very aggressive ping; this fixes 764 testsuite failures
on powerpc-linux).


Segher


On Sun, May 28, 2017 at 12:31:12PM +, Segher Boessenkool wrote:
> __atomic_add_fetch adds a value to some memory, and returns the result.
> If there is no direct support for this, expand_builtin_atomic_fetch_op
> is asked to implement this as __atomic_fetch_add (which returns the
> original value of the mem), followed by the addition.  Now, the
> __atomic_add_fetch could have been a tail call, but we shouldn't
> perform the __atomic_fetch_add as a tail call: following code would
> not be executed, and in fact thrown away because there is a barrier
> after tail calls.
> 
> This fixes it.
> 
> Tested on powerpc64-linux {-m32,-m64}.  Is this okay for trunk?
> 
> 
> Segher
> 
> 
> 2017-05-28  Segher Boessenkool  
> 
>   PR middle-end/80902
>   * builtins.c (expand_builtin_atomic_fetch_op): If emitting code after
>   a call, force the call to not be a tail call.
> 
> ---
>  gcc/builtins.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 4f6c9c4..3a70693 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -6079,6 +6079,12 @@ expand_builtin_atomic_fetch_op (machine_mode mode, 
> tree exp, rtx target,
>gcc_assert (TREE_OPERAND (addr, 0) == fndecl);
>TREE_OPERAND (addr, 0) = builtin_decl_explicit (ext_call);
>  
> +  /* If we will emit code after the call, the call can not be a tail call.
> + If it is emitted as a tail call, a barrier is emitted after it, and
> + then all trailing code is removed.  */
> +  if (!ignore)
> +CALL_EXPR_TAILCALL (exp) = 0;
> +
>/* Expand the call here so we can emit trailing code.  */
>ret = expand_call (exp, target, ignore);
>  
> -- 
> 1.9.3
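
The fallback described in the quoted patch can be pictured with std::atomic. This is only an analogy at the source level, with a hypothetical function name; the real code works on RTL inside expand_builtin_atomic_fetch_op.

```cpp
#include <atomic>
#include <cassert>

// Emulate "add, then return the new value" using only a primitive that
// returns the old value.
unsigned int
add_fetch_via_fetch_add (std::atomic<unsigned int> &mem, unsigned int val)
{
  unsigned int old_value = mem.fetch_add (val);  // value before the add
  // Trailing code: this addition must still run after the call returns,
  // which is why the call itself cannot be emitted as a tail call.
  return old_value + val;
}
```

If the fetch_add step were treated as a tail call, the final addition would never execute and the caller would see the old value.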


[PATCH] Fix configure.ac to respect --{enable,disable}-werror option.

2017-05-31 Thread Martin Liška
Hi.

One has to set stage2_werror_flags in action-if-{not,}-given
in order to properly respect the configure option.

Ready to be installed?
Martin
From 77244d330010b2a17ca81fc866e2904e2f3fece0 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 31 May 2017 15:27:05 +0200
Subject: [PATCH] Fix configure.ac to respect --{enable,disable}-werror option.

ChangeLog:

2017-05-31  Martin Liska  

	* configure.ac: Add handling of stage2_werror_flags to
	action-if-given and to action-if-not-given.
	* configure: Regenerate.
---
 configure| 7 +--
 configure.ac | 7 +--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index 82aa619fad1..2cbb4b7ab9d 100755
--- a/configure
+++ b/configure
@@ -14641,13 +14641,13 @@ fi
 # Check whether --enable-werror was given.
 if test "${enable_werror+set}" = set; then :
   enableval=$enable_werror;
-fi
-
 case ${enable_werror} in
   yes) stage2_werror_flag="--enable-werror-always" ;;
   *) stage2_werror_flag="" ;;
 esac
 
+else
+
 if test -d ${srcdir}/gcc && test x"`cat $srcdir/gcc/DEV-PHASE`" = xexperimental; then
   case $BUILD_CONFIG in
   bootstrap-debug)
@@ -14657,6 +14657,9 @@ if test -d ${srcdir}/gcc && test x"`cat $srcdir/gcc/DEV-PHASE`" = xexperimental;
   esac
 fi
 
+fi
+
+
 
 
 # Specify what files to not compare during bootstrap.
diff --git a/configure.ac b/configure.ac
index 78d2d593106..82faf06946d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -3508,12 +3508,14 @@ AC_SUBST(stage1_checking)
 # Enable -Werror in bootstrap stage2 and later.
 AC_ARG_ENABLE(werror,
 [AS_HELP_STRING([--enable-werror],
-		[enable -Werror in bootstrap stage2 and later])], [], [])
+		[enable -Werror in bootstrap stage2 and later])],
+[
 case ${enable_werror} in
   yes) stage2_werror_flag="--enable-werror-always" ;;
   *) stage2_werror_flag="" ;;
 esac
-
+],
+[
 if test -d ${srcdir}/gcc && test x"`cat $srcdir/gcc/DEV-PHASE`" = xexperimental; then
   case $BUILD_CONFIG in
   bootstrap-debug)
@@ -3522,6 +3524,7 @@ if test -d ${srcdir}/gcc && test x"`cat $srcdir/gcc/DEV-PHASE`" = xexperimental;
   stage2_werror_flag="--enable-werror-always" ;;
   esac
 fi
+])
 
 AC_SUBST(stage2_werror_flag)
 
-- 
2.12.2



Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Richard Biener
On Wed, May 31, 2017 at 2:28 PM, Martin Liška  wrote:
> On 05/31/2017 02:04 PM, Richard Biener wrote:
>> On Wed, May 31, 2017 at 1:51 PM, Jakub Jelinek  wrote:
>>> On Wed, May 31, 2017 at 01:46:00PM +0200, Richard Biener wrote:
>>>> Just wanting to add that "ab-"using options/variables to implement
>>>> what are really
>>>> function attributes doesn't look very clean.  Unless the plan is to get
>>>> rid of
>>>> function attributes in favor of per-function options.
>>>
>>> Function attribute here is one thing (the way user writes it) and that
>>> combined with the command line options determines the sanitization performed
>>> (the function attributes only say what sanitization flags should be
>>> ignored).  The proposed per-function variable is just a cache of this
>>> information, because parsing function attributes every time is way too
>>> expensive.
>>
>> True, but isn't that just an excuse to not improve attribute list
>> representation?
>>
>> Ideally we'd have sth like attributes.def and a sorted vector of
>> integer id, args
>> pairs.  Using a sorted vector of the existing stuff (compared to the tree 
>> list)
>> might also help.
>
> Then it would be tree-wise very similar to CONSTRUCTOR which also contains 
> vector
> of (index, value) pairs?
>
>>
>> Yes, we'd get (quite?) a bit less attribute list sharing this way but
>> we can still
>> share the actual tree-whatever thing that represents the args.
>
> Any estimation how difficult such transformation would be?

attribute lists are dealt with in quite some places (with or without
helpers) so I guess it would be somewhat invasive but largely
mechanical.  Using a .def file vs. the current strings can be
done separately -- after all we can also sort strings.  I suspect
doing the string -> ID transform pays off faster (still linear search
but integer comparison instead of string compare).
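
A sketch of the string -> ID step on its own, keeping the search linear as described. The name table and function names are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical table mapping attribute names to small integer ids.
static const char *const attr_names[]
  = { "always_inline", "no_sanitize", "noinline" };
static const int num_attrs = sizeof attr_names / sizeof attr_names[0];

// Do the string comparison once, up front; return -1 for unknown names.
int
attr_name_to_id (const char *name)
{
  for (int i = 0; i < num_attrs; i++)
    if (std::strcmp (attr_names[i], name) == 0)
      return i;
  return -1;
}

// Later lookups are still linear but compare integers only.
bool
has_attr_id (const int *ids, int n, int id)
{
  for (int i = 0; i < n; i++)
    if (ids[i] == id)
      return true;
  return false;
}
```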

Richard.

> Martin
>
>>
>> Richard.
>>
>>>
>>> Jakub
>


Re: [PATCH TEST]Rectify test case gcc.dg/tree-ssa/ivopt_mult_4.c

2017-05-31 Thread Richard Biener
On Wed, May 31, 2017 at 2:43 PM, Bin.Cheng  wrote:
> On Fri, May 26, 2017 at 12:49 PM, Richard Biener
>  wrote:
>> On Thu, May 25, 2017 at 8:00 PM, Bin Cheng  wrote:
>>> Hi,
>>> I believe this tests has been wrongly modified previously.  It is to test 
>>> that the exit check on
>>> pointer shouldn't be replaced by integer IV.  Somehow GCC starts replacing 
>>> the check on
>>> integer IV with pointer IV.  It's valid, though inefficient.  And somehow 
>>> we starting checking
>>> this iv replacement.   This patch rectifies it by specifically checking the 
>>> check on pointer
>>> shouldn't be replaced.
>>
>> So maybe it should then test that the pointer test prevails?  Or
>> rather that it doesn't replace
>> any exit test?  If 'p' changes for '_2' for unrelated reasons the
>> pattern will be not testing what
>> it is supposed to test...
> Thanks for reviewing, I updated patch testing if condition on p_limit2
> still exists before expanding.  Is it OK?

Ok.

Richard.

> Thanks,
> bin
>>
>> Richard.
>>
>>> Bootstrap and test in series on x86_64.  Is it OK?
>>> Thanks,
>>> bin
>>> gcc/testsuite/ChangeLog
>>> 2017-05-11  Bin Cheng  
>>>
>>> * gcc.dg/tree-ssa/ivopt_mult_4.c: Explicitly check comparison
>>> on pointer should not be replaced.


Re: [PATCH] Fix configure.ac to respect --{enable,disable}-werror option.

2017-05-31 Thread Richard Biener
On Wed, May 31, 2017 at 3:31 PM, Martin Liška  wrote:
> Hi.
>
> One has to set stage2_werror_flags in action-if-{not,}-given
> in order to properly respect the configure option.
>
> Ready to be installed?

Ok.

Richard.

> Martin


Re: [PATCH 4/5 v3] Vect peeling cost model

2017-05-31 Thread Christophe Lyon
Hi,

On 23 May 2017 at 17:59, Robin Dapp  wrote:
> gcc/ChangeLog:
>
> 2017-05-23  Robin Dapp  
>
> * tree-vect-data-refs.c (vect_get_data_access_cost):
> Workaround for SLP handling.
> (vect_enhance_data_refs_alignment):
> Compute costs for doing no peeling at all, compare to the best
> peeling costs so far and avoid peeling if cheaper.

Since this commit (r248678), I've noticed regressions on some arm targets.
  Executed from: gcc.dg/tree-ssa/tree-ssa.exp
gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect "Alignment
of access forced using peeling" 1
gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
"Vectorizing an unaligned access" 0
gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect "Alignment
of access forced using peeling" 1
gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
"Vectorizing an unaligned access" 0

For instance with --target arm-linux-gnueabihf --with-cpu=cortex-a5
--with-fpu=vfpv3-d16-fp16
(using cortex-a9+neon makes the test pass).

Thanks,

Christophe


Re: [PATCH, rs6000] Fold vector absolutes in GIMPLE

2017-05-31 Thread Will Schmidt
On Tue, 2017-05-30 at 09:00 +0200, Richard Biener wrote:
> On Mon, May 29, 2017 at 2:21 PM, Segher Boessenkool
>  wrote:
> > On Mon, May 29, 2017 at 01:35:22PM +0200, Richard Biener wrote:
> >> >> What's the documented behavior for vec_abs with respect to an
> >> >argument
> >> >> of value INT_MIN?
> >> >
> >> >The documentation says:
> >> >
> >> > "For integer vectors, the arithmetic is modular."
> >>
> >> This means that folding as ABS_EXPR is not safe for !TYPE_OVERFLOW_WRAPS
> >> Integral vector types.
> >
> > Is it still fine if TYPE_OVERFLOW_UNDEFINED?  So essentially always
> > except with -ftrapv?
> 
> The docs say it needs to wrap so the correct check is TYPE_OVERFLOW_WRAPS.
> It's not fine with TYPE_OVERFLOW_UNDEFINED as we will conclude the result
> can never be INT_MIN while the spec says it can.

Ok, thanks for the review.

So it looks like I should bail with something like: 
...
case VSX_BUILTIN_XVABSDP:
  {
arg0 = gimple_call_arg (stmt, 0);
lhs = gimple_call_lhs (stmt);
if (TYPE_OVERFLOW_WRAPS(TREE_TYPE(arg1))
   return false;
...

How can I test this scenario?  At a glance, a testcase snippet doesn't
appear to error out.  Am I quietly losing an overflow indicator?

vector signed int
test1_min (vector signed int x)
{
  vector signed int y = {INT_MIN,INT_MIN,INT_MIN,INT_MIN};
  return vec_abs (y);
}

generates gimple code:
  y = { -2147483648, -2147483648, -2147483648, -2147483648 };
  D.2579 = __builtin_altivec_abs_v4si (y);
or after folding:
  y = { -2147483648, -2147483648, -2147483648, -2147483648 };
  D.2579 = ABS_EXPR <y>;




> 
> Richard.
> 
> >
> >
> > Segher
> 




Re: [PATCH, rs6000] Fold vector absolutes in GIMPLE

2017-05-31 Thread Richard Biener
On Wed, May 31, 2017 at 3:56 PM, Will Schmidt  wrote:
> On Tue, 2017-05-30 at 09:00 +0200, Richard Biener wrote:
>> On Mon, May 29, 2017 at 2:21 PM, Segher Boessenkool
>>  wrote:
>> > On Mon, May 29, 2017 at 01:35:22PM +0200, Richard Biener wrote:
>> >> >> What's the documented behavior for vec_abs with respect to an
>> >> >argument
>> >> >> of value INT_MIN?
>> >> >
>> >> >The documentation says:
>> >> >
>> >> > "For integer vectors, the arithmetic is modular."
>> >>
>> >> This means that folding as ABS_EXPR is not safe for !TYPE_OVERFLOW_WRAPS
>> >> Integral vector types.
>> >
>> > Is it still fine if TYPE_OVERFLOW_UNDEFINED?  So essentially always
>> > except with -ftrapv?
>>
>> The docs say it needs to wrap so the correct check is TYPE_OVERFLOW_WRAPS.
>> It's not fine with TYPE_OVERFLOW_UNDEFINED as we will conclude the result
>> can never be INT_MIN while the spec says it can.
>
> Ok, thanks for the review.
>
> So it looks like I should bail with something like:
> ...
> case VSX_BUILTIN_XVABSDP:
>   {
> arg0 = gimple_call_arg (stmt, 0);
> lhs = gimple_call_lhs (stmt);
> if (TYPE_OVERFLOW_WRAPS(TREE_TYPE(arg1))
>return false;

No, you want

if (! TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg1)))
  return false;

that will likely render the transform useless unless -fwrapv is given.

What we miss in the middle-end is a ABSU_EXPR that computes the
unsigned result of the absolute value (of the signed operand).  That's
always well-defined.  So you'd then lower to

y = { -2147483648, -2147483648, -2147483648, -2147483648 };
D.1234 = ABSU_EXPR <y>;
D.2579 = VIEW_CONVERT <D.1234>;

RTL expansion of ABSU_EXPR can re-use RTL abs since there's
nothing undefined on RTL.
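
In scalar terms the proposed lowering behaves like the sketch below. The function names are illustrative only, and the last step assumes a two's complement int.

```cpp
#include <cassert>
#include <climits>
#include <cstring>

// Absolute value of a signed int computed in unsigned arithmetic, so it is
// well defined for every input, including INT_MIN.
unsigned int
absu (int x)
{
  unsigned int ux = static_cast<unsigned int> (x);  // modular conversion
  return x < 0 ? 0u - ux : ux;
}

// Reinterpret the unsigned result as signed without changing any bits,
// playing the role of the VIEW_CONVERT step.
int
view_as_signed (unsigned int u)
{
  int r;
  std::memcpy (&r, &u, sizeof r);
  return r;
}
```

For INT_MIN the unsigned result is INT_MAX + 1, and viewing those bits as signed gives INT_MIN again, which is the modular behaviour the vec_abs documentation asks for.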

Richard.

> ...
>
> How can I test this scenario?  At a glance, a testcase snippet doesn't
> appear to error out.  Am I quietly losing an overflow indicator?
>
> vector signed int
> test1_min (vector signed int x)
> {
>   vector signed int y = {INT_MIN,INT_MIN,INT_MIN,INT_MIN};
>   return vec_abs (y);
> }
>
> generates gimple code:
>   y = { -2147483648, -2147483648, -2147483648, -2147483648 };
>   D.2579 = __builtin_altivec_abs_v4si (y);
> or after folding:
>   y = { -2147483648, -2147483648, -2147483648, -2147483648 };
>   D.2579 = ABS_EXPR <y>;
>
>
>
>
>>
>> Richard.
>>
>> >
>> >
>> > Segher
>>
>
>


Re: [PATCH] Optimize divmod expansion (PR middle-end/79665)

2017-05-31 Thread Georg-Johann Lay

On 31.05.2017 11:00, Jakub Jelinek wrote:

On Wed, May 31, 2017 at 10:48:07AM +0200, Georg-Johann Lay wrote:

because divmod is not a single_set:
(gdb) p seq
$10 = (const rtx_insn *) 0x7730d500
(gdb) pr
warning: Expression is not an assignment (and might have no effect)
(insn 14 13 0 (parallel [
(set (reg:HI 52)
(div:HI (reg:HI 47)
(reg:HI 54)))
(set (reg:HI 53)
(mod:HI (reg:HI 47)
(reg:HI 54)))
(clobber (reg:QI 21 r21))
(clobber (reg:HI 22 r22))
(clobber (reg:HI 24 r24))
(clobber (reg:HI 26 r26))
]) "scale.c":7 -1
 (nil))
(gdb)

Hence the divmod appears to be much less expensive than the unsigned
variant that computed the costs for mult_highpart.


Then you should fix the cost computation - be able to use a target hook
on insns that are not a single set or something similar.


Are you saying that cost computation in GCC is fundamentally flawed
for anything that is not a single_set?


The division/modulo optimization I've added as well as many other spots
in GCC rely on reasonable cost, just grep e.g. all places that call
seq_cost.  So, if it returns something that is a very wrong estimate,
it won't affect just that single optimization, but all others.  Therefore,
you should fix the cost computation, rather than disabling all the places
that use the costs.  Many targets have instructions with multiple sets,


I didn't intend to disable anything...

Would the following addition be in order?

gcc/
PR middle-end/80929
* rtlanal.c (seq_cost) [PARALLEL]: Get cost from insn_rtx_cost
instead of assuming cost of 1.

Index: rtlanal.c
===
--- rtlanal.c   (revision 248737)
+++ rtlanal.c   (working copy)
@@ -5300,6 +5300,8 @@ seq_cost (const rtx_insn *seq, bool spee
   set = single_set (seq);
   if (set)
 cost += set_rtx_cost (set, speed);
+  else if (PARALLEL == GET_CODE (PATTERN (seq)))
+   cost += insn_rtx_cost (PATTERN (seq), speed);
   else
 cost++;
 }



so I'm surprised assuming cost of 1 for them doesn't break many more things.


Maybe because PARALLEL is not common, and when expand tests for costs of
DIV or MOD, it passes respective RTXes to the RTL cost functions, and
*not* what the target expands in divmod insns.


I think either we should have a separate target hook for multiple sets
instructions, or just call the targetm.rtx_costs on the PARALLEL in that
case and see if the targets compute something reasonable for it, otherwise
either use the cost of the first set, or maximum of all sets (that might be
best) or something similar.

Jakub



The patch uses whatever insn_rtx_cost comes up with.  For PARALLEL,
it's the cost of the 1st SET which is reasonable imo (at least for the
divmod case).

Johann
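
The alternatives discussed above (flat cost of 1, cost of the first SET,
maximum over all SETs) can be compared with a toy model.  This is a sketch
in Python, not GCC code: an insn pattern is modelled as a list of
(kind, cost) pairs, the per-set costs 40 and 44 are invented, and
single_set here is a simplification of the real function.

```python
# Toy model (not GCC code): a pattern is a list of (kind, cost) pairs,
# where kind is "set" or "clobber" and cost is a made-up per-set cost.

def single_set(pattern):
    """Return the only set in the pattern, or None if there are 0 or 2+ sets."""
    sets = [elt for elt in pattern if elt[0] == "set"]
    return sets[0] if len(sets) == 1 else None

def seq_cost(seq, multi_set_policy):
    """Sum insn costs; multi_set_policy prices insns that are not a single set."""
    total = 0
    for pattern in seq:
        only = single_set(pattern)
        if only is not None:
            total += only[1]
        else:
            total += multi_set_policy(pattern)
    return total

def flat_one(pattern):
    """Assume a cost of 1 for every non-single-set insn."""
    return 1

def first_set(pattern):
    """Use the cost of the first set in the pattern."""
    return next((cost for kind, cost in pattern if kind == "set"), 1)

def max_of_sets(pattern):
    """Use the maximum cost over all sets in the pattern."""
    return max((cost for kind, cost in pattern if kind == "set"), default=1)

# A divmod-like parallel: two sets plus clobbers (costs are invented).
divmod_insn = [("set", 40), ("set", 44), ("clobber", 0), ("clobber", 0)]
for policy in (flat_one, first_set, max_of_sets):
    print(policy.__name__, seq_cost([divmod_insn], policy))
```

With the flat policy the divmod-like insn looks almost free next to any
alternative sequence, which is the effect described at the top of this mail.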




Re: [PATCH, rs6000] Fold vector absolutes in GIMPLE

2017-05-31 Thread Ramana Radhakrishnan
On Wed, May 31, 2017 at 3:02 PM, Richard Biener
 wrote:
> On Wed, May 31, 2017 at 3:56 PM, Will Schmidt  
> wrote:
>> On Tue, 2017-05-30 at 09:00 +0200, Richard Biener wrote:
>>> On Mon, May 29, 2017 at 2:21 PM, Segher Boessenkool
>>>  wrote:
>>> > On Mon, May 29, 2017 at 01:35:22PM +0200, Richard Biener wrote:
>>> >> >> What's the documented behavior for vec_abs with respect to an
>>> >> >argument
>>> >> >> of value INT_MIN?
>>> >> >
>>> >> >The documentation says:
>>> >> >
>>> >> > "For integer vectors, the arithmetic is modular."
>>> >>
>>> >> This means that folding as ABS_EXPR is not safe for !TYPE_OVERFLOW_WRAPS
>>> >> Integral vector types.
>>> >
>>> > Is it still fine if TYPE_OVERFLOW_UNDEFINED?  So essentially always
>>> > except with -ftrapv?
>>>
>>> The docs say it needs to wrap so the correct check is TYPE_OVERFLOW_WRAPS.
>>> It's not fine with TYPE_OVERFLOW_UNDEFINED as we will conclude the result
>>> can never be INT_MIN while the spec says it can.
>>
>> Ok, thanks for the review.
>>
>> So it looks like I should bail with something like:
>> ...
>> case VSX_BUILTIN_XVABSDP:
>>   {
>> arg0 = gimple_call_arg (stmt, 0);
>> lhs = gimple_call_lhs (stmt);
>> if (TYPE_OVERFLOW_WRAPS(TREE_TYPE(arg1))
>>return false;
>
> No, you want
>
> if (! TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg1)))
>   return false;
>
> that will likely render the transform useless unless -fwrapv is given.
>
> What we miss in the middle-end is a ABSU_EXPR that computes the
> unsigned result of the absolute value (of the signed operand).  That's
> always well-defined.  So you'd then lower to
>
> y = { -2147483648, -2147483648, -2147483648, -2147483648 };
> D.1234 = ABSU_EXPR <y>;
> D.2579 = VIEW_CONVERT ;
>
> RTL expansion of ABSU_EXPR can re-use RTL abs since there's
> nothing undefined on RTL.


There is a PR for this in BZ, though can't find it in a quick search
... We can use this on arm and aarch64 as well IIRC.

regards
Ramana

>
> Richard.
>
>> ...
>>
>> How can I test this scenario?  At a glance, a testcase snippet doesn't
>> appear to error out.  Am I quietly losing an overflow indicator?
>>
>> vector signed int
>> test1_min (vector signed int x)
>> {
>>   vector signed int y = {INT_MIN,INT_MIN,INT_MIN,INT_MIN};
>>   return vec_abs (y);
>> }
>>
>> generates gimple code:
>>   y = { -2147483648, -2147483648, -2147483648, -2147483648 };
>>   D.2579 = __builtin_altivec_abs_v4si (y);
>> or after folding:
>>   y = { -2147483648, -2147483648, -2147483648, -2147483648 };
>>   D.2579 = ABS_EXPR <y>;
>>
>>
>>
>>
>>>
>>> Richard.
>>>
>>> >
>>> >
>>> > Segher
>>>
>>
>>


[PATCH] Port Doxygen support script from Perl to Python; add unittests

2017-05-31 Thread Martin Liška
Hello.

After discussion with Richi, he approved installing the patches alongside
the current Perl scripts.  I'm attaching these patches and will send a
follow-up patch that removes the legacy Perl scripts.  That patch will be
subject to the normal review process.

Thanks
Martin
>From 818d9da7892bcdb70df6fb456f7ea9243f155f3f Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 28 Apr 2017 13:50:24 +0200
Subject: [PATCH 1/3] Port Doxygen support script from Perl to Python; add
 unittests

contrib/ChangeLog:

2017-05-31  David Malcolm  
	Martin Liska  

	* filter_params.py: New, porting the perl script to python,
	adding a test suite.
	* filter_gcc_for_doxygen_new: New file.
---
 contrib/filter_gcc_for_doxygen_new |  12 
 contrib/filter_params.py   | 144 +
 2 files changed, 156 insertions(+)
 create mode 100755 contrib/filter_gcc_for_doxygen_new
 create mode 100644 contrib/filter_params.py

diff --git a/contrib/filter_gcc_for_doxygen_new b/contrib/filter_gcc_for_doxygen_new
new file mode 100755
index 000..d1109a50c88
--- /dev/null
+++ b/contrib/filter_gcc_for_doxygen_new
@@ -0,0 +1,12 @@
+#!/bin/sh
+
+# This filters GCC source before Doxygen can get confused by it;
+# this script is listed in the doxyfile.  The output is not very
+# pretty, but at least we get output that Doxygen can understand.
+#
+# $1 is a source file of some kind.  The source we wish doxygen to
+# process is put on stdout.
+
+dir=`dirname $0`
+python $dir/filter_params.py $1
+exit 0
diff --git a/contrib/filter_params.py b/contrib/filter_params.py
new file mode 100644
index 000..f94d201bbf8
--- /dev/null
+++ b/contrib/filter_params.py
@@ -0,0 +1,144 @@
+#!/usr/bin/python
+"""
+Filters out some of the #defines used throughout the GCC sources:
+- GTY(()) marks declarations for gengtype.c
+- PARAMS(()) is used for K&R compatibility. See ansidecl.h.
+
+When passed one or more filenames, acts on those files and prints the
+results to stdout.
+
+When run without a filename, runs a unit-testing suite.
+"""
+import re
+import sys
+import unittest
+
+# Optional whitespace
+OPT_WS = '\s*'
+
+def filter_src(text):
+    """
+    str -> str.  We operate on the whole of the source file at once
+    (rather than individual lines) so that we can have multiline
+    regexes.
+    """
+
+    # Convert C comments from GNU coding convention of:
+    #    /* FIRST_LINE
+    #       NEXT_LINE
+    #       FINAL_LINE.  */
+    # to:
+    #    /** @verbatim FIRST_LINE
+    #       NEXT_LINE
+    #       FINAL_LINE.  @endverbatim */
+    # so that doxygen will parse them.
+    #
+    # Only comments that begin on the left-most column are converted.
+    text = re.sub(r'^/\* ',
+                  r'/** @verbatim ',
+                  text,
+                  flags=re.MULTILINE)
+    text = re.sub(r'\*/',
+                  r' @endverbatim */',
+                  text)
+
+    # Remove GTY markings (potentially multiline ones):
+    text = re.sub('GTY' + OPT_WS + r'\(\(.*?\)\)\s+',
+                  '',
+                  text,
+                  flags=(re.MULTILINE|re.DOTALL))
+
+    # Strip out 'ATTRIBUTE_UNUSED'
+    text = re.sub('\sATTRIBUTE_UNUSED',
+                  '',
+                  text)
+
+    # PARAMS(()) is used for K&R compatibility. See ansidecl.h.
+    text = re.sub('PARAMS' + OPT_WS + r'\(\((.*?)\)\)',
+                  r'(\1)',
+                  text)
+
+    return text
+
+class FilteringTests(unittest.TestCase):
+    '''
+    Unit tests for filter_src.
+    '''
+    def assert_filters_to(self, src_input, expected_result):
+        # assertMultiLineEqual was added to unittest in 2.7/3.1
+        if hasattr(self, 'assertMultiLineEqual'):
+            assertion = self.assertMultiLineEqual
+        else:
+            assertion = self.assertEqual
+        assertion(expected_result, filter_src(src_input))
+
+    def test_comment_example(self):
+        self.assert_filters_to(
+            ('/* FIRST_LINE\n'
+             '   NEXT_LINE\n'
+             '   FINAL_LINE.  */\n'),
+            ('/** @verbatim FIRST_LINE\n'
+             '   NEXT_LINE\n'
+             '   FINAL_LINE.   @endverbatim */\n'))
+
+    def test_oneliner_comment(self):
+        self.assert_filters_to(
+            '/* Returns the string representing CLASS.  */\n',
+            ('/** @verbatim Returns the string representing CLASS.   @endverbatim */\n'))
+
+    def test_multiline_comment(self):
+        self.assert_filters_to(
+            ('/* The thread-local storage model associated with a given VAR_DECL\n'
+             "   or SYMBOL_REF.  This isn't used much, but both trees and RTL refer\n"
+             "   to it, so it's here.  */\n"),
+            ('/** @verbatim The thread-local storage model associated with a given VAR_DECL\n'
+             "   or SYMBOL_REF.  This isn't used much, but both trees and RTL refer\n"
+             "   to it, so it's here.   @endverbatim */\n"))
+
+    def test_GTY(self):
+        self.as

[PATCH] Fix PR66313

2017-05-31 Thread Richard Biener

So I've come back to PR66313 and found a solution to the tailrecursion
missed optimization when fixing the factoring folding to use an unsigned
type when we're not sure of overflow.

The folding part is identical to my last try from 2015, the tailrecursion
part makes us handle intermittent stmts that were introduced by foldings
that "clobber" our quest walking the single-use chain of stmts between
the call and the return (and failing at all stmts that are not part
of said chain).  A simple solution is to move the stmts that are not
part of the chain and that we can move before the call.  That handles
the leaf conversions that now appear for tree-ssa/tailrecursion-6.c

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.
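
As an illustration of why the factoring needs a wrapping type when the
common factor may be zero, here is a toy Python model of 32-bit signed
arithmetic.  It is a sketch only: the helpers (checked, wrap_s32 and the
three variants) are invented for this example and do not correspond to
functions in fold-const.c.

```python
# Toy model of 32-bit signed arithmetic with overflow detection (not GCC code).
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def checked(value):
    """Return value if it fits in a signed 32-bit int, else raise."""
    if not INT_MIN <= value <= INT_MAX:
        raise OverflowError(value)
    return value

def wrap_s32(value):
    """Reduce modulo 2**32 and reinterpret the bits as signed."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value

def original(a, b, c):
    """a*c + b*c, with every step checked for signed overflow."""
    return checked(checked(a * c) + checked(b * c))

def factored_signed(a, b, c):
    """(a + b) * c in the signed type: the inner add may overflow."""
    return checked(checked(a + b) * c)

def factored_unsigned(a, b, c):
    """(a + b) * c computed with wrapping, then converted back to signed."""
    return wrap_s32(wrap_s32(a + b) * c)

a, b, c = INT_MAX, 1, 0
print(original(a, b, c))           # 0, no overflow anywhere
print(factored_unsigned(a, b, c))  # 0 as well
try:
    factored_signed(a, b, c)
except OverflowError:
    print("signed factoring overflows in a + b")
```

The original expression never overflows for these inputs, while the naive
signed factoring does; doing the factored computation in a wrapping type
gives the same value without introducing the overflow.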

2017-05-31  Richard Biener  

PR middle-end/66313
* fold-const.c (fold_plusminus_mult_expr): If the factored
factor may be zero use a wrapping type for the inner operation.
* tree-tailcall.c (independent_of_stmt_p): Pass in to_move bitmap
and handle moved defs.
(process_assignment): Properly guard the unary op case.  Return a
tri-state indicating that moving the stmt before the call may allow
to continue.  Pass through to_move.
(find_tail_calls): Handle moving unrelated defs before
the call.

* c-c++-common/ubsan/pr66313.c: New testcase.
* gcc.dg/tree-ssa/loop-15.c: Adjust.

Index: gcc/fold-const.c
===
*** gcc/fold-const.c.orig   2015-10-29 12:32:33.302782318 +0100
--- gcc/fold-const.c2015-10-29 14:08:39.936497739 +0100
*** fold_plusminus_mult_expr (location_t loc
*** 6916,6925 
  }
same = NULL_TREE;
  
!   if (operand_equal_p (arg01, arg11, 0))
! same = arg01, alt0 = arg00, alt1 = arg10;
!   else if (operand_equal_p (arg00, arg10, 0))
  same = arg00, alt0 = arg01, alt1 = arg11;
else if (operand_equal_p (arg00, arg11, 0))
  same = arg00, alt0 = arg01, alt1 = arg10;
else if (operand_equal_p (arg01, arg10, 0))
--- 6916,6926 
  }
same = NULL_TREE;
  
!   /* Prefer factoring a common non-constant.  */
!   if (operand_equal_p (arg00, arg10, 0))
  same = arg00, alt0 = arg01, alt1 = arg11;
+   else if (operand_equal_p (arg01, arg11, 0))
+ same = arg01, alt0 = arg00, alt1 = arg10;
else if (operand_equal_p (arg00, arg11, 0))
  same = arg00, alt0 = arg01, alt1 = arg10;
else if (operand_equal_p (arg01, arg10, 0))
*** fold_plusminus_mult_expr (location_t loc
*** 6974,6987 
}
  }
  
!   if (same)
  return fold_build2_loc (loc, MULT_EXPR, type,
fold_build2_loc (loc, code, type,
 fold_convert_loc (loc, type, alt0),
 fold_convert_loc (loc, type, alt1)),
fold_convert_loc (loc, type, same));
  
!   return NULL_TREE;
  }
  
  /* Subroutine of native_encode_expr.  Encode the INTEGER_CST
--- 6975,7010 
}
  }
  
!   if (!same)
! return NULL_TREE;
! 
!   if (! INTEGRAL_TYPE_P (type)
!   || TYPE_OVERFLOW_WRAPS (type)
!   /* We are neither factoring zero nor minus one.  */
!   || TREE_CODE (same) == INTEGER_CST)
  return fold_build2_loc (loc, MULT_EXPR, type,
fold_build2_loc (loc, code, type,
 fold_convert_loc (loc, type, alt0),
 fold_convert_loc (loc, type, alt1)),
fold_convert_loc (loc, type, same));
  
!   /* Same may be zero and thus the operation 'code' may overflow.  Likewise
!  same may be minus one and thus the multiplication may overflow.  Perform
!  the operations in an unsigned type.  */
!   tree utype = unsigned_type_for (type);
!   tree tem = fold_build2_loc (loc, code, utype,
! fold_convert_loc (loc, utype, alt0),
! fold_convert_loc (loc, utype, alt1));
!   /* If the sum evaluated to a constant that is not -INF the multiplication
!  cannot overflow.  */
!   if (TREE_CODE (tem) == INTEGER_CST
!   && ! wi::eq_p (tem, wi::min_value (TYPE_PRECISION (utype), SIGNED)))
! return fold_build2_loc (loc, MULT_EXPR, type,
!   fold_convert (type, tem), same);
! 
!   return fold_convert_loc (loc, type,
!  fold_build2_loc (loc, MULT_EXPR, utype, tem,
!   fold_convert_loc (loc, utype, same)));
  }
  
  /* Subroutine of native_encode_expr.  Encode the INTEGER_CST
Index: gcc/testsuite/c-c++-common/ubsan/pr66313.c
===
*** /dev/null   1970-01-01 00:00:00.0 +
--- gcc/testsuite/c-c++-common/ubsan/pr66313.c  2015-10-29 14:08:39.969498105 
+0100
***
*** 0 
--- 1,26 
+ /* { dg-do run } */
+ /* { dg-options 

Re: [PATCH] Port Doxygen support script from Perl to Python; add unittests

2017-05-31 Thread Martin Liška
Hi.

This is the patch that removes the legacy Perl scripts and sets default
values for both OUTPUT_DIRECTORY and INPUT_FILTER.

Ready for trunk?
Thanks,
Martin
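
For reference, the GTY and PARAMS substitutions that filter_params.py
performs (see the first patch in this thread) can be tried in isolation.
The sketch below copies those two regexes, with OPT_WS written as a raw
string; the function name strip_markers and the sample inputs are made up
for illustration.

```python
import re

# Optional whitespace, as in filter_params.py but written as a raw string.
OPT_WS = r'\s*'

def strip_markers(text):
    """Apply only the GTY and PARAMS substitutions to a source string."""
    # Remove GTY markings (potentially multiline ones).
    text = re.sub('GTY' + OPT_WS + r'\(\(.*?\)\)\s+', '', text,
                  flags=(re.MULTILINE | re.DOTALL))
    # PARAMS ((args)) becomes (args).
    text = re.sub('PARAMS' + OPT_WS + r'\(\((.*?)\)\)', r'(\1)', text)
    return text

print(strip_markers('typedef struct GTY(()) alias_pair {\n'), end='')
print(strip_markers('char *strcpy PARAMS ((char *dest, char *source));\n'),
      end='')
```

The first call drops the GTY marker together with the whitespace after it;
the second rewrites the K&R-compatibility wrapper into a plain prototype.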


Re: [PATCH] Port Doxygen support script from Perl to Python; add unittests

2017-05-31 Thread Martin Liška
..adding missing patch
>From 3021b695a8111e1552176529ab3342cdd2ae3a43 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 3 May 2017 11:42:41 +0200
Subject: [PATCH] Doxygen: add default location for filters and output folder.

contrib/ChangeLog:

2017-05-03  Martin Liska  

	* gcc.doxy: Add default location for filters and output folder.
	* filter_gcc_for_doxygen_new: Rename to filter_gcc_for_doxygen.
	* filter_params.pl: Remove.
---
 contrib/filter_gcc_for_doxygen |  6 +++---
 contrib/filter_gcc_for_doxygen_new | 12 
 contrib/filter_params.pl   | 14 --
 contrib/gcc.doxy   |  8 ++--
 4 files changed, 5 insertions(+), 35 deletions(-)
 delete mode 100755 contrib/filter_gcc_for_doxygen_new
 delete mode 100755 contrib/filter_params.pl

diff --git a/contrib/filter_gcc_for_doxygen b/contrib/filter_gcc_for_doxygen
index 3787eebbf0e..d1109a50c88 100755
--- a/contrib/filter_gcc_for_doxygen
+++ b/contrib/filter_gcc_for_doxygen
@@ -1,12 +1,12 @@
 #!/bin/sh
 
 # This filters GCC source before Doxygen can get confused by it;
-# this script is listed in the doxyfile. The output is not very
+# this script is listed in the doxyfile.  The output is not very
 # pretty, but at least we get output that Doxygen can understand.
 #
-# $1 is a source file of some kind. The source we wish doxygen to
+# $1 is a source file of some kind.  The source we wish doxygen to
 # process is put on stdout.
 
 dir=`dirname $0`
-perl $dir/filter_params.pl < $1 | perl $dir/filter_knr2ansi.pl 
+python $dir/filter_params.py $1
 exit 0
diff --git a/contrib/filter_gcc_for_doxygen_new b/contrib/filter_gcc_for_doxygen_new
deleted file mode 100755
index d1109a50c88..000
--- a/contrib/filter_gcc_for_doxygen_new
+++ /dev/null
@@ -1,12 +0,0 @@
-#!/bin/sh
-
-# This filters GCC source before Doxygen can get confused by it;
-# this script is listed in the doxyfile.  The output is not very
-# pretty, but at least we get output that Doxygen can understand.
-#
-# $1 is a source file of some kind.  The source we wish doxygen to
-# process is put on stdout.
-
-dir=`dirname $0`
-python $dir/filter_params.py $1
-exit 0
diff --git a/contrib/filter_params.pl b/contrib/filter_params.pl
deleted file mode 100755
index 22dae6cc561..000
--- a/contrib/filter_params.pl
+++ /dev/null
@@ -1,14 +0,0 @@
-#!/usr/bin/perl
-
-# Filters out some of the #defines used throughout the GCC sources:
-# - GTY(()) marks declarations for gengtype.c
-# - PARAMS(()) is used for K&R compatibility. See ansidecl.h.
-
-while (<>) {
-s/^\/\* /\/\*\* \@verbatim /;
-s/\*\// \@endverbatim \*\//;
-s/GTY[ \t]*\(\(.*\)\)//g;
-s/[ \t]ATTRIBUTE_UNUSED//g;
-s/PARAMS[ \t]*\(\((.*?)\)\)/\($1\)/sg;
-print;
-}
diff --git a/contrib/gcc.doxy b/contrib/gcc.doxy
index 7a284e754aa..a8eeb03c9a0 100644
--- a/contrib/gcc.doxy
+++ b/contrib/gcc.doxy
@@ -11,16 +11,12 @@
 # Values that contain spaces should be placed between quotes (" ")
 
 
-#-
-# NOTE: YOU MUST EDIT THE FOLLOWING HARDWIRED PATHS BEFORE USING THIS FILE.
-#-
-
 # The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute) 
 # base path where the generated documentation will be put. 
 # If a relative path is entered, it will be relative to the location 
 # where doxygen was started. If left blank the current directory will be used.
 
-OUTPUT_DIRECTORY   = @OUTPUT_DIRECTORY@
+OUTPUT_DIRECTORY   = gcc-doxygen
 
 # The INPUT_FILTER tag can be used to specify a program that doxygen should 
 # invoke to filter for each input file. Doxygen will invoke the filter program 
@@ -30,7 +26,7 @@ OUTPUT_DIRECTORY   = @OUTPUT_DIRECTORY@
 # to standard output.  If FILTER_PATTERNS is specified, this tag will be 
 # ignored.
 
-INPUT_FILTER   = @INPUT_FILTER@
+INPUT_FILTER   = contrib/filter_gcc_for_doxygen
 
 #-
 
-- 
2.12.2



Re: [PATCH v2] Implement no_sanitize function attribute

2017-05-31 Thread Martin Liška
On 05/31/2017 03:31 PM, Richard Biener wrote:
> On Wed, May 31, 2017 at 2:28 PM, Martin Liška  wrote:
>> On 05/31/2017 02:04 PM, Richard Biener wrote:
>>> On Wed, May 31, 2017 at 1:51 PM, Jakub Jelinek  wrote:
>>>> On Wed, May 31, 2017 at 01:46:00PM +0200, Richard Biener wrote:
>>>>> Just wanting to add that "ab-"using options/variables to implement
>>>>> what are really
>>>>> function attributes doesn't look very clean.  Unless the plan is to get
>>>>> rid of
>>>>> function attributes in favor of per-function options.
>>>>
>>>> Function attribute here is one thing (the way user writes it) and that
>>>> combined with the command line options determines the sanitization
>>>> performed
>>>> (the function attributes only say what sanitization flags should be
>>>> ignored).  The proposed per-function variable is just a cache of this
>>>> information, because parsing function attributes every time is way too
>>>> expensive.
>>>
>>> True, but isn't that just an excuse to not improve attribute list
>>> representation?
>>>
>>> Ideally we'd have sth like attributes.def and a sorted vector of
>>> integer id, args
>>> pairs.  Using a sorted vector of the existing stuff (compared to the tree 
>>> list)
>>> might also help.
>>
>> Then it would be tree-wise very similar to CONSTRUCTOR which also contains 
>> vector
>> of (index, value) pairs?
>>
>>>
>>> Yes, we'd get (quite?) a bit less attribute list sharing this way but
>>> we can still
>>> share the actual tree-whatever thing that represents the args.
>>
>> Any estimation how difficult such transformation would be?
> 
> attribute lists are dealt with in quite some places (with or without
> helpers) so I guess it would be somewhat invasive but largely
> mechanical.  Using a .def file vs. the current strings can be
> done separately -- after all we can also sort strings.  I suspect
> doing the string -> ID transform pays off faster (still linear search
> but integer comparison instead of string compare).

Ok, I'm ready to do the transformation in this stage1. That said, Jakub, will
you be fine with the original patch (rebase will be needed) as it is, using
DECL_ATTRIBUTE?

Thanks,
Martin
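
A rough sketch of the representation discussed above, in Python rather
than GCC's C++: attributes become (integer id, args) pairs kept in a
vector sorted by id.  Everything here is hypothetical: ATTR_IDS stands in
for an attributes.def-style table, and the function names are invented.
The sketch goes one step further than the linear search mentioned above
and uses a binary search, which the sorted order makes possible.

```python
import bisect

# Hypothetical attribute IDs, standing in for entries of an attributes.def.
ATTR_IDS = {"always_inline": 0, "no_sanitize": 1, "noreturn": 2, "unused": 3}

def make_attr_vector(attrs):
    """Build a vector of (id, args) pairs sorted by integer id."""
    pairs = [(ATTR_IDS[name], args) for name, args in attrs]
    return sorted(pairs, key=lambda entry: entry[0])

def lookup_attribute(vec, name):
    """Binary-search the sorted vector; return the args, or None if absent."""
    attr_id = ATTR_IDS[name]
    # The 1-tuple (attr_id,) sorts just before any (attr_id, args) pair,
    # so only integer ids are ever compared, never the args.
    pos = bisect.bisect_left(vec, (attr_id,))
    if pos < len(vec) and vec[pos][0] == attr_id:
        return vec[pos][1]
    return None

vec = make_attr_vector([("unused", ()),
                        ("no_sanitize", ("address", "undefined"))])
print(lookup_attribute(vec, "no_sanitize"))  # ('address', 'undefined')
print(lookup_attribute(vec, "noreturn"))     # None
```

The args tuple is the part that could still be shared between declarations,
as suggested above, even if the vectors themselves are not.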

> 
> Richard.
> 
>> Martin
>>
>>>
>>> Richard.
>>>

>>>> Jakub
>>



Re: SSA range class and removal of VR_ANTI_RANGEs

2017-05-31 Thread Aldy Hernandez

On 05/23/2017 08:11 AM, Jakub Jelinek wrote:

On Tue, May 23, 2017 at 06:48:15AM -0400, Aldy Hernandez wrote:


[ughh, one more time, but CCing the list.]

Sorry for the delayed response.  I was fighting with Firefox + LTO to 
gather some data :).



I'm worried a lot here about compile time memory usage increase, especially
with EVRP and IPA-VRP and even more so with LTO.
The changes mean that for every SSA_NAME of any kind we now need 8 more
bytes, and for every SSA_NAME that has range info (typically both range info
and nonzero_bits; in the old implementation the two were tied together for a
good reason, many ranges also imply some non-zero bits and from non-zero
bits one can in many cases derive a range) much more than that (through
the nonzero_bits_def you get all the overhead of trailing_wide_ints -
the 3 fields it has, just save on the actual 2 lengths and 2 HWI sets,
but then you add 3x 8 bytes and 6x size of the whole wide_int which is
from what I understood not really meant as the memory storage of wide ints
in data structures, but as something not memory efficient you can work
quickly on (so ideal for automatic variables in functions that work with
those).  So it is a 8 byte growth for SSA_NAMEs without ranges and
8+3*8+6*32-2*4-2*8*x growth for SSA_NAMEs with ranges if x is the number
of HWIs needed to represent the integers of that type (so most commonly
x=1).  For x=1 8+3*8+6*32-2*4-2*8*x is 200 bytes growth.
With say 10000000 SSA_NAMEs, 5000000 of them with ranges, that will be
already a 1GB difference, dunno how many SSA_NAMEs are there e.g. in firefox
LTO build.
Can the nonzero_bits stuff be folded back into irange (and have code to
maintain nonzero_bits in addition to ranges as before (from setting a range
compute or update nonzero_bits and vice versa)?  Can you use
trailing_wide_int for the irange storage of the values?  Can you allocate
only as many HWIs as you really need, rather than always 6?
Again, it can be different between a class you use for accessing the
information and manipulating it and for actual underlying storage on
SSA_NAMEs.
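
The growth formula quoted above is easy to check numerically.  The sketch
below is plain arithmetic in Python; the function name is invented, and
the count of 5 million SSA_NAMEs with ranges is an assumption chosen
because it is what a 1GB difference at 200 bytes each implies.

```python
def range_growth_bytes(x):
    """Extra bytes per SSA_NAME with range info, per the formula above.

    x is the number of HWIs needed to represent an integer of the type.
    """
    return 8 + 3 * 8 + 6 * 32 - 2 * 4 - 2 * 8 * x

print(range_growth_bytes(1))  # 200
print(range_growth_bytes(2))  # 184

names_with_ranges = 5_000_000  # assumption: implied by the 1GB estimate
extra_gb = names_with_ranges * range_growth_bytes(1) / 1e9
print(extra_gb)               # 1.0
```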


Before I fire off some stats, a few things.  Yes, we can do better. Yes, 
we can use trailing wide ints or another idiom.  Yes, we could join 
nonzero bits with the range info (if absolutely necessary, because as 
I'll show below, the extra word doesn't blow up memory as much as 
advertised).


Thanks to Martin Liska and Markus I was able to build a three-month old 
firefox branch with GCC7 LTO.  I built firefox with -O2 -flto=9, and 
then looked at the biggest partition at the end of VRP1 (that is, 
warn_array_bound_p == true).


The short version is that I don't see anywhere close to 10 million 
SSA_NAMEs.  Of the SSA_NAMEs we do get, 25% are pointers and irrelevant. 
  22% of the non-pointer SSA_NAMEs actually have range information, and 
of this range info 32% is actually useless because it spans the entire 
domain.  So, unless I'm misunderstanding something, even in the worst 
case scenario (in firefox + LTO anyways), I don't see such a big blow up 
of memory.


Now on to the actual numbers.  Remember, this is only for one of the 
partitions during the ltrans stage of LTO, running VRP1, since this 
seems to be the granular level of a compilation process.  If you're 
trying to compile 20 partitions at the same time, get more memory :).


The biggest number of SSA_NAMEs I saw was actually 472,225.  Of these, 
357,032 were non-pointers, so could conceivably have range information. 
In reality, 77,398 had range information, so 16% of all pointer and 
non-pointer SSA_NAMEs actually have range information.


Now here's the interesting bit... in analyzing those 77,398 ranges, I 
noticed that a great many of them were useless.  They consisted of the 
range for the entire domain even with no NON_ZERO_BITS set.  The new 
implementation will obviously not have this limitation :-).  Taking this 
into account, I see a total of 52,718 non-useless ranges out of 472,225. 
 That's about 11%.


In the compilation of this large partition, G.stats.total_allocated is 
2,192,677,688, so about 2 gigs of memory.  Of which the total 
non-useless range info is 1,686,976 for vanilla GCC7.  This looks like 
less than 2 megs. I calculated the range info memory consumption with 
(sizeof (range_info_def) + trailing_wide_ints <3>::extra_size (precision)).


Assuming a pathological case of 200 extra bytes in each of the 52,718 
non-useless ranges, we'd have 10 megabytes of extra data.  And that's in 
a compilation that already consumes 2 gigs of memory.  Isn't that like 
less than half of a percent?  We could even account for that extra 
pointer in SSA_NAME and add less than 4 megs to that number (472,225 * 8 
bytes).
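
For anyone who wants to re-derive the percentages, the arithmetic is
sketched below in Python.  The counts are the ones given in this mail; the
variable names are made up, and the 200 bytes per range is the
pathological figure from the earlier estimate, not a measurement.

```python
# Counts reported above for the largest ltrans partition.
total_names = 472_225
non_pointer = 357_032
with_range = 77_398
useful_range = 52_718
total_allocated = 2_192_677_688  # bytes, G.stats.total_allocated
range_bytes_gcc7 = 1_686_976     # bytes of non-useless range info, GCC 7

def pct(part, whole, digits=1):
    """Percentage of whole represented by part, rounded."""
    return round(100.0 * part / whole, digits)

print(pct(total_names - non_pointer, total_names))  # share that are pointers
print(pct(with_range, non_pointer))                 # non-pointers with ranges
print(pct(with_range, total_names))                 # all names with ranges
print(pct(with_range - useful_range, with_range))   # ranges spanning the domain
print(pct(useful_range, total_names))               # non-useless ranges

print(range_bytes_gcc7 // useful_range)             # bytes per range today

extra_ranges = useful_range * 200   # pathological 200 extra bytes per range
extra_pointer = total_names * 8     # one extra pointer per SSA_NAME
print(extra_ranges, pct(extra_ranges, total_allocated, 2))
print(extra_pointer)
```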


Again, do not take this as an excuse for the careless memory layout of 
my patch (after all, it was meant as a proof of concept).  I will 
re-arrange things and optimize things a bit.  However, I also don't 
think that adding a word or two here is going to 

Re: [PATCH 4/5 v3] Vect peeling cost model

2017-05-31 Thread Robin Dapp
> Since this commit (r248678), I've noticed regressions on some arm targets.
>   Executed from: gcc.dg/tree-ssa/tree-ssa.exp
> gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect "Alignment
> of access forced using peeling" 1
> gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
> "Vectorizing an unaligned access" 0
> gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect "Alignment
> of access forced using peeling" 1
> gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
> "Vectorizing an unaligned access" 0
> 
> For instance with --target arm-linux-gnueabihf --with-cpu=cortex-a5
> --with-fpu=vfpv3-d16-fp16
> (using cortex-a9+neon makes the test pass).

I do not have access to an arm machine for testing but could these
regressions be "ok" as in we no longer perform peeling because costs for
not peeling <= costs for peeling and we still vectorize? (Just guessing)
Or are these real regressions that prevent vectorization? Does the
"vectorized 1 loops" check fail?

Regards
 Robin



Re: [PATCH 4/5 v3] Vect peeling cost model

2017-05-31 Thread Christophe Lyon
On 31 May 2017 at 16:27, Robin Dapp  wrote:
>> Since this commit (r248678), I've noticed regressions on some arm targets.
>>   Executed from: gcc.dg/tree-ssa/tree-ssa.exp
>> gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect "Alignment
>> of access forced using peeling" 1
>> gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
>> "Vectorizing an unaligned access" 0
>> gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect "Alignment
>> of access forced using peeling" 1
>> gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect
>> "Vectorizing an unaligned access" 0
>>
>> For instance with --target arm-linux-gnueabihf --with-cpu=cortex-a5
>> --with-fpu=vfpv3-d16-fp16
>> (using cortex-a9+neon makes the test pass).
>
> I do not have access to an arm machine for testing but could these
> regressions be "ok" as in we no longer perform peeling because costs for
> not peeling <= costs for peeling and we still vectorize? (Just guessing)
> Or are these real regressions that prevent vectorization? Does the
> "vectorized 1 loops" check fail?

I know it's not very practical, and I would also have to start a manual build
with the right config to get all the details because all my builds are in
temporary workspaces.

I reported only the regressions, so yes "vectorized 1 loops" still passes.

Thanks,

Christophe

>
> Regards
>  Robin
>


Re: [PATCH 2/7] [ARC] Avoid use of hard registers before reg-alloc.

2017-05-31 Thread Andrew Burgess
* Claudiu Zissulescu  [2017-05-19 12:30:57 +0200]:

> gcc/
> 2017-04-10  Claudiu Zissulescu  
> 
>   * config/arc/arc.md (mulsi3): Avoid use of hard registers before
>   reg-alloc when having mul64 or mul32x16 instructions.
>   (mulsidi3): Likewise.
>   (umulsidi3): Likewise.
>   (mulsi32x16): New pattern.
>   (mulsi64): Likewise.
>   (mulsidi64): Likewise.
>   (umulsidi64): Likewise.
>   (MUL32x16_REG): Define.
>   (mul64_600): Use MUL32x16_REG.
>   (mac64_600): Likewise.
>   (umul64_600): Likewise.
>   (umac64_600): Likewise.


Looks good, thanks,

Andrew

> ---
>  gcc/config/arc/arc.md | 168 
> +++---
>  1 file changed, 119 insertions(+), 49 deletions(-)
> 
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index db5867c..c0ad86c 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -176,6 +176,7 @@
> (ILINK2_REGNUM 30)
> (RETURN_ADDR_REGNUM 31)
> (MUL64_OUT_REG 58)
> +   (MUL32x16_REG 56)
> (ARCV2_ACC 58)
>  
> (LP_COUNT 60)
> @@ -1940,29 +1941,17 @@
>  }
>else if (TARGET_MUL64_SET)
>  {
> -  emit_insn (gen_mulsi_600 (operands[1], operands[2],
> - gen_mlo (), gen_mhi ()));
> -  emit_move_insn (operands[0], gen_mlo ());
> -  DONE;
> + rtx tmp = gen_reg_rtx (SImode);
> + emit_insn (gen_mulsi64 (tmp, operands[1], operands[2]));
> + emit_move_insn (operands[0], tmp);
> + DONE;
>  }
>else if (TARGET_MULMAC_32BY16_SET)
>  {
> -  if (immediate_operand (operands[2], SImode)
> -   && INTVAL (operands[2]) >= 0
> -   && INTVAL (operands[2]) <= 65535)
> - {
> -   emit_insn (gen_umul_600 (operands[1], operands[2],
> -  gen_acc2 (), gen_acc1 ()));
> -   emit_move_insn (operands[0], gen_acc2 ());
> -   DONE;
> - }
> -  operands[2] = force_reg (SImode, operands[2]);
> -  emit_insn (gen_umul_600 (operands[1], operands[2],
> -gen_acc2 (), gen_acc1 ()));
> -  emit_insn (gen_mac_600 (operands[1], operands[2],
> -gen_acc2 (), gen_acc1 ()));
> -  emit_move_insn (operands[0], gen_acc2 ());
> -  DONE;
> + rtx tmp = gen_reg_rtx (SImode);
> + emit_insn (gen_mulsi32x16 (tmp, operands[1], operands[2]));
> + emit_move_insn (operands[0], tmp);
> + DONE;
>  }
>else
>  {
> @@ -1974,6 +1963,35 @@
>  }
>  })
>  
> +(define_insn_and_split "mulsi32x16"
> + [(set (match_operand:SI 0 "register_operand""=w")
> + (mult:SI (match_operand:SI 1 "register_operand"  "%c")
> +  (match_operand:SI 2 "nonmemory_operand" "ci")))
> +  (clobber (reg:DI MUL32x16_REG))]
> + "TARGET_MULMAC_32BY16_SET"
> + "#"
> + "TARGET_MULMAC_32BY16_SET && reload_completed"
> + [(const_int 0)]
> + {
> +  if (immediate_operand (operands[2], SImode)
> +&& INTVAL (operands[2]) >= 0
> +&& INTVAL (operands[2]) <= 65535)
> + {
> +  emit_insn (gen_umul_600 (operands[1], operands[2],
> +gen_acc2 (), gen_acc1 ()));
> +  emit_move_insn (operands[0], gen_acc2 ());
> +  DONE;
> + }
> +   emit_insn (gen_umul_600 (operands[1], operands[2],
> +gen_acc2 (), gen_acc1 ()));
> +   emit_insn (gen_mac_600 (operands[1], operands[2],
> +gen_acc2 (), gen_acc1 ()));
> +   emit_move_insn (operands[0], gen_acc2 ());
> +   DONE;
> +  }
> + [(set_attr "type" "multi")
> +  (set_attr "length" "8")])
> +
>  ; mululw conditional execution without a LIMM clobbers an input register;
>  ; we'd need a different pattern to describe this.
>  ; To make the conditional execution valid for the LIMM alternative, we
> @@ -2011,6 +2029,24 @@
> (set_attr "predicable" "no, no, yes")
> (set_attr "cond" "nocond, canuse_limm, canuse")])
>  
> +(define_insn_and_split "mulsi64"
> + [(set (match_operand:SI 0 "register_operand""=w")
> + (mult:SI (match_operand:SI 1 "register_operand"  "%c")
> +  (match_operand:SI 2 "nonmemory_operand" "ci")))
> +  (clobber (reg:DI MUL64_OUT_REG))]
> + "TARGET_MUL64_SET"
> + "#"
> + "TARGET_MUL64_SET && reload_completed"
> +  [(const_int 0)]
> +{
> +  emit_insn (gen_mulsi_600 (operands[1], operands[2],
> + gen_mlo (), gen_mhi ()));
> +  emit_move_insn (operands[0], gen_mlo ());
> +  DONE;
> +}
> +  [(set_attr "type" "multi")
> +   (set_attr "length" "8")])
> +
>  (define_insn "mulsi_600"
>[(set (match_operand:SI 2 "mlo_operand" "")
>   (mult:SI (match_operand:SI 0 "register_operand"  "%Rcq#q,c,c,c")
> @@ -2155,8 +2191,7 @@
>   (mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" ""))
>(sign_extend:DI (match_operand:SI 2 "nonmemory_operand" ""]
>"TARGET_ANY_MPY"
> -"
> -{
> +  {
>if (TARGET_PLUS_MACD)
>  {
>   if (CONST_INT_P (operands

Re: [PATCH 3/7] [ARC] Allow r30 to be used by the reg-alloc.

2017-05-31 Thread Andrew Burgess
* Claudiu Zissulescu  [2017-05-19 12:30:58 +0200]:

> gcc/
> 2016-12-12  Claudiu Zissulescu  
> 
>   * config/arc/arc.c (arc_conditional_register_usage): Allow r30 to
>   be used by the reg-alloc.

Looks good, thanks,

Andrew


> ---
>  gcc/config/arc/arc.c | 9 -
>  gcc/config/arc/arc.h | 3 ++-
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index fd4bf2c..ff86f6c 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -1551,7 +1551,14 @@ arc_conditional_register_usage (void)
>/* For ARCv2 the core register set is changed.  */
>strcpy (rname29, "ilink");
>strcpy (rname30, "r30");
> -  fixed_regs[30] = call_used_regs[30] = 1;
> +  call_used_regs[30] = 1;
> +  fixed_regs[30] = 0;
> +
> +  arc_regno_reg_class[30] = WRITABLE_CORE_REGS;
> +  SET_HARD_REG_BIT (reg_class_contents[WRITABLE_CORE_REGS], 30);
> +  SET_HARD_REG_BIT (reg_class_contents[CHEAP_CORE_REGS], 30);
> +  SET_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], 30);
> +  SET_HARD_REG_BIT (reg_class_contents[MPY_WRITABLE_CORE_REGS], 30);
> }
>  
>if (TARGET_MUL64_SET)
> diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
> index 0a4c745..fbc1195 100644
> --- a/gcc/config/arc/arc.h
> +++ b/gcc/config/arc/arc.h
> @@ -641,7 +641,8 @@ extern enum reg_class arc_regno_reg_class[];
>((REGNO) < 29 || ((REGNO) == ARG_POINTER_REGNUM) || ((REGNO) == 63)
> \
> || ((unsigned) reg_renumber[REGNO] < 29)  \
> || ((unsigned) (REGNO) == (unsigned) arc_tp_regno)
> \
> -   || (fixed_regs[REGNO] == 0 && IN_RANGE (REGNO, 32, 59)))
> +   || (fixed_regs[REGNO] == 0 && IN_RANGE (REGNO, 32, 59))   \
> +   || ((REGNO) == 30 && fixed_regs[REGNO] == 0))
>  
>  #define REGNO_OK_FOR_INDEX_P(REGNO) REGNO_OK_FOR_BASE_P(REGNO)
>  
> -- 
> 1.9.1
> 


Re: SSA range class and removal of VR_ANTI_RANGEs

2017-05-31 Thread Jakub Jelinek
On Wed, May 31, 2017 at 10:20:51AM -0400, Aldy Hernandez wrote:
> The biggest number of SSA_NAMEs I saw was actually 472,225.  Of these,
> 357,032 were non-pointers, so could conceivably have range information. In
> reality, 77,398 had range information, so 16% of all pointer and non-pointer
> SSA_NAMEs actually have range information.

I've tried to look just at insn-recog.c with the usual stage3 flags +
-fsanitize=address,undefined and I see there
ssa_name_nodes_created
4092344
(of course, that doesn't mean there are 4M SSA_NAMEs all live at the same
time, but I think they don't go through ggc_free and thus the only way that
number goes down is during GC).  There are both 72 and 80 bytes alloc pools;
even a 32MB increase is something we shouldn't ignore.
Furthermore, as e.g. PR80917 shows, we really should track nonzero bits next
to value ranges; the current tracking of it only in tree-ssa-ccp, which
doesn't have ASSERT_EXPRs nor any kind of framework to do something similar
without them, is insufficient.
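
For a sense of scale, the 32MB figure can be reproduced with a little arithmetic.  This is only a sketch: it assumes every SSA_NAME grows by the same 8 bytes (the gap between the 72-byte and 80-byte pools), and extra_bytes is a made-up helper, not anything in GCC.

```cpp
#include <cstdint>

// Assumption for illustration: each SSA_NAME grows from OLD_SIZE to
// NEW_SIZE bytes, e.g. by moving from the 72-byte to the 80-byte pool.
constexpr std::uint64_t
extra_bytes (std::uint64_t n_names, std::uint64_t old_size,
             std::uint64_t new_size)
{
  return n_names * (new_size - old_size);
}

// ssa_name_nodes_created from the insn-recog.c run above.
constexpr std::uint64_t growth = extra_bytes (4092344, 72, 80);

// 32,738,752 bytes: about 32.7 MB, or roughly 31.2 MiB.
static_assert (growth == 32738752, "unexpected growth estimate");
```

As noted above, not all of those names are live at once, so this is an upper bound on what is created rather than on peak memory.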

Jakub


Re: [PATCH] Rename __builtin_ia32_kmov16 to __builtin_ia32_kmovw in gcc-{5,6}-branch

2017-05-31 Thread Uros Bizjak
On Wed, May 31, 2017 at 12:33 PM, Senkevich, Andrew wrote:
> Hi,
>
> attached patches are for renaming __builtin_ia32_kmov16 to 
> __builtin_ia32_kmovw in GCC 5.* and 6.* branches since it was renamed in 
> master.
> Bootstrapped and regtested on x86_64-linux-gnu.
>
> gcc/
> * config/i386/i386.c (__builtin_ia32_kmovw): Renamed from
> __builtin_ia32_kmov16 since it was renamed in master.
> * config/i386/avx512fintrin.h: Ditto.
>
> Are they Ok to commit?

Various undocumented builtins are considered an internal, unpublished
interface, so IMO there is no need to backport their renames.

Uros.


Re: [PATCH 4/7] [ARC] Change predicate movv2hi to avoid scaled addresses.

2017-05-31 Thread Andrew Burgess
* Claudiu Zissulescu [2017-05-19 12:30:59 +0200]:

> From: Claudiu Zissulescu 
> 
> 2016-12-17  Claudiu Zissulescu  
> 
>   * config/arc/simdext.md (movv2hi_insn): Change predicate to avoid
>   scaled addresses.

Seems reasonable.

Thanks,
Andrew


> ---
>  gcc/config/arc/simdext.md | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/arc/simdext.md b/gcc/config/arc/simdext.md
> index 5253033..6c102d3 100644
> --- a/gcc/config/arc/simdext.md
> +++ b/gcc/config/arc/simdext.md
> @@ -1356,7 +1356,7 @@
> }")
>  
>  (define_insn_and_split "*movv2hi_insn"
> -  [(set (match_operand:V2HI 0 "nonimmediate_operand" "=r,r,r,m")
> +  [(set (match_operand:V2HI 0 "move_dest_operand" "=r,r,r,m")
>   (match_operand:V2HI 1 "general_operand"   "i,r,m,r"))]
>"(register_operand (operands[0], V2HImode)
>  || register_operand (operands[1], V2HImode))"
> -- 
> 1.9.1
> 


Re: SSA range class and removal of VR_ANTI_RANGEs

2017-05-31 Thread Richard Biener
On May 31, 2017 5:10:04 PM GMT+02:00, Jakub Jelinek  wrote:
>On Wed, May 31, 2017 at 10:20:51AM -0400, Aldy Hernandez wrote:
>> The biggest number of SSA_NAMEs I saw was actually 472,225.  Of these,
>> 357,032 were non-pointers, so could conceivably have range information. In
>> reality, 77,398 had range information, so 16% of all pointer and non-pointer
>> SSA_NAMEs actually have range information.
>
>I've tried to look just at insn-recog.c with the usual stage3 flags +
>-fsanitize=address,undefined and I see there
>ssa_name_nodes_created
>4092344
>(of course, that doesn't mean there are 4M SSA_NAMEs all live at the same
>time, but I think they don't go through ggc_free and thus the only way that
>number goes down is during GC.  There are both 72 and 80 bytes alloc pools,
>even a 32MB increase is something we shouldn't ignore.
>Furthermore, as e.g. PR80917 shows, we really should track nonzero bits next
>to value ranges, the current tracking of it only in tree-ssa-ccp which
>doesn't have ASSERT_EXPRs nor any kind of framework to do something similar
>without them is insufficient.

I think the important part to recognize is that the VR type used during 
propagation does (and likely should) not have to be the same as that used for 
long-term storage alongside SSA names.

The first thing to do is improve the one we use internally in VRP where we can 
also add a separate bit-lattice easily (though we have to iterate until both 
converge).
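
To make the "iterate until both converge" point concrete, here is a toy sketch of a range and a nonzero-bits mask refining each other.  Everything in it (toy_range, mask_up_to_msb, refine) is invented for illustration and is not VRP's lattice; it handles unsigned values only and ignores the lower bound and the empty-range case.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical toy lattice element: an unsigned range [lo, hi] plus a
// mask of the bits that may be nonzero.  Not GCC's value_range.
struct toy_range
{
  std::uint64_t lo, hi;
  std::uint64_t nonzero;
};

// Mask covering every bit up to and including the highest set bit of X.
std::uint64_t
mask_up_to_msb (std::uint64_t x)
{
  std::uint64_t m = 0;
  while (x != 0)
    {
      m = (m << 1) | 1;
      x >>= 1;
    }
  return m;
}

// Let each half refine the other until neither changes.
toy_range
refine (toy_range r)
{
  for (;;)
    {
      toy_range prev = r;
      // A value with only NONZERO bits set cannot exceed NONZERO.
      r.hi = std::min (r.hi, r.nonzero);
      // A value no larger than HI has no bits above HI's top bit.
      r.nonzero &= mask_up_to_msb (r.hi);
      if (r.hi == prev.hi && r.nonzero == prev.nonzero)
        return r;
    }
}
```

Starting from hi = 0x1ff and nonzero = 0xf0f, the first pass only trims the mask to 0x10f; it takes a second pass for that trimmed mask to pull hi down to 0x10f, which is why a single step in each direction is not enough.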

Richard.

>   Jakub



Re: [PATCH, GCC/ARM/gcc-7-branch] Backport PR71607

2017-05-31 Thread Prakhar Bahuguna
On 31/05/2017 14:11:43, Richard Sandiford wrote:
> Prakhar Bahuguna  writes:
> > On 31/05/2017 09:19:40, Richard Sandiford wrote:
> >> const_ints are supposed to be stored in sign-extended form, so a 32-bit
> >> integer with the MSB set should be 0xffffffff80000000|x instead of
> >> 0x80000000|x.  It's a bug if you have one where that isn't true.
> >> 
> >> In the patch it looks like this could come from:
> >> ...these two splits, where the GEN_INTs should probably be:
> >> 
> >>   gen_int_mode (..., SImode);
> >> 
> >> instead.
> >
> > Hi Richard, thanks for the tip. Is there a test case that could produce an
> > incorrect result? I've attempted to create one using negative doubles and
> > floats but haven't succeeded.
> 
> Just to check, are you testing with --enable-checking=yes,rtl?
> 
> When the values you tried were split, did you get the sign-extended form
> or the zero-extended form?
> 
> Thanks,
> Richard

I've now rebuilt with --enable-checking=yes,rtl and it appears that the split
values are being correctly sign-extended in the rtl and appear correctly in the
assembly.

However, if you believe it is safer to use gen_int_mode(), I'll respin the
patch accordingly.
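
For reference, the canonical form being discussed can be sketched outside of GCC as below.  This is an illustration only: hwi stands in for HOST_WIDE_INT (assumed 64 bits here), and sign_extend_to_hwi is a made-up name for roughly the truncate-and-sign-extend step that gen_int_mode applies and GEN_INT does not.

```cpp
#include <cstdint>

// Stand-in for HOST_WIDE_INT, assumed to be 64 bits wide here.
using hwi = std::int64_t;

// Canonical form of VALUE for a mode PRECISION bits wide: keep the low
// PRECISION bits and sign-extend from the top one.  PRECISION must be
// in the range 1..64.
hwi
sign_extend_to_hwi (hwi value, unsigned precision)
{
  if (precision >= 64)
    return value;
  std::uint64_t low
    = static_cast<std::uint64_t> (value)
      & ((std::uint64_t {1} << precision) - 1);
  std::uint64_t sign = std::uint64_t {1} << (precision - 1);
  // Flip the sign bit and subtract it again; the unsigned arithmetic
  // wraps, leaving all higher bits equal to the sign bit.
  return static_cast<hwi> ((low ^ sign) - sign);
}
```

With a 32-bit precision, an input of 0x80000000 comes back as 0xffffffff80000000 when viewed as 64 unsigned bits, while 0x7fffffff is unchanged.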

-- 

Prakhar Bahuguna


Re: [PATCH 5/7] [ARC] Update (non)commutative_binary_comparison patterns.

2017-05-31 Thread Andrew Burgess
* Claudiu Zissulescu [2017-05-19 12:31:00 +0200]:

> gcc/
> 2016-12-20  Claudiu Zissulescu  
> 
>   * config/arc/arc.md (commutative_binary_comparison): Remove 'I'
>   constraint. It is not valid for the pattern.
>   (noncommutative_binary_comparison): Likewise.

Looks good, thanks,
Andrew


> ---
>  gcc/config/arc/arc.md | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index c0ad86c..743a844 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -948,15 +948,15 @@
>[(set (match_operand:CC_ZN 0 "cc_set_register" "")
>   (match_operator:CC_ZN 5 "zn_compare_operator"
> [(match_operator:SI 4 "commutative_operator"
> -  [(match_operand:SI 1 "register_operand" "%c,c,c")
> -   (match_operand:SI 2 "nonmemory_operand" "cL,I,?Cal")])
> +  [(match_operand:SI 1 "register_operand" "%c,c")
> +   (match_operand:SI 2 "nonmemory_operand" "cL,Cal")])
>  (const_int 0)]))
> -   (clobber (match_scratch:SI 3 "=X,1,X"))]
> +   (clobber (match_scratch:SI 3 "=X,X"))]
>""
>"%O4.f 0,%1,%2"
>[(set_attr "type" "compare")
> (set_attr "cond" "set_zn")
> -   (set_attr "length" "4,4,8")])
> +   (set_attr "length" "4,8")])
>  
>  ; for flag setting 'add' instructions like if (a+b) { ...}
>  ; the combiner needs this pattern
> @@ -1050,15 +1050,15 @@
>[(set (match_operand:CC_ZN 0 "cc_set_register" "")
>   (match_operator:CC_ZN 5 "zn_compare_operator"
> [(match_operator:SI 4 "noncommutative_operator"
> -  [(match_operand:SI 1 "register_operand" "c,c,c")
> -   (match_operand:SI 2 "nonmemory_operand" "cL,I,?Cal")])
> +  [(match_operand:SI 1 "register_operand" "c,c")
> +   (match_operand:SI 2 "nonmemory_operand" "cL,Cal")])
>  (const_int 0)]))
> -   (clobber (match_scratch:SI 3 "=X,1,X"))]
> +   (clobber (match_scratch:SI 3 "=X,X"))]
>"TARGET_BARREL_SHIFTER || GET_CODE (operands[4]) == MINUS"
>"%O4.f 0,%1,%2"
>[(set_attr "type" "compare")
> (set_attr "cond" "set_zn")
> -   (set_attr "length" "4,4,8")])
> +   (set_attr "length" "4,8")])
>  
>  (define_expand "bic_f_zn"
>[(parallel
> -- 
> 1.9.1
> 


Re: [Patch, fortran] PR35339 Optimize implied do loops in io statements

2017-05-31 Thread Dominique d'Humières
If I am not mistaken, compiling the following code with the patch applied

program test_ivs
  use iso_varying_string
  implicit none

  type(varying_string),dimension(:,:),allocatable :: array2d
  type(varying_string) :: extra
  integer :: i,j

  allocate(array2d(2,3))

  extra = "four"

  array2d(:,:) = reshape((/ var_str("1"), &
   var_str("2"), var_str("3"), &
   extra, var_str("5"), &
   var_str("six") /), (/ 2, 3 /))


  print *,"array2d second ",ubound(array2d),(("'"//char(array2d(i,j))//"' ",i=1,size(array2d,1)),j=1,size(array2d,2))

end program test_ivs

gives an ICE.

TIA

Dominique

> On 31 May 2017 at 08:16, Bernhard Reutner-Fischer wrote:
> 
> On 29 May 2017 17:49:30 CEST, Nicolas Koenig  wrote:
>> Hello Dominique,
>> 
>> mea culpa, there was a bit of confusion with the file being open in emacs
>> and vi at the same time. Attached is the new patch with the #define
>> removed.



Re: [Patch, fortran] PR35339 Optimize implied do loops in io statements

2017-05-31 Thread Dominique d'Humières

> On 31 May 2017 at 17:40, Dominique d'Humières wrote:
> 
> If I am not mistaken, compiling the following code with the patch applied

simpler test

  print *,(huge(0),i=1,6)
!  print*,(i,i=1,6)
!  print*,(i,i=1,6,1)
  end

> 
> gives an ICE.
> 
> TIA
> 
> Dominique



Re: [PATCH 7/7] [ARC] Test against frame_pointer_needed in arc_can_eliminate.

2017-05-31 Thread Andrew Burgess
* Claudiu Zissulescu [2017-05-19 12:31:02 +0200]:

> arc_can_eliminate is using arc_frame_pointer_required() which is wrong
> as the frame_pointer_needed can be set on different conditions. Fix it
> by calling arc_frame_pointer_needed().
> 
> gcc/
> 2017-01-09  Claudiu Zissulescu  
> 
>   * config/arc/arc.c (arc_can_eliminate): Test against
>   arc_frame_pointer_needed.

Looks good,

thanks,
Andrew



> ---
>  gcc/config/arc/arc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 0c4c901..aac1952 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -4733,7 +4733,7 @@ arc_final_prescan_insn (rtx_insn *insn, rtx *opvec 
> ATTRIBUTE_UNUSED,
>  static bool
>  arc_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to)
>  {
> -  return to == FRAME_POINTER_REGNUM || !arc_frame_pointer_required ();
> +  return ((to == FRAME_POINTER_REGNUM) || !arc_frame_pointer_needed ());
>  }
>  
>  /* Define the offset between two registers, one to be eliminated, and
> -- 
> 1.9.1
> 


Re: SSA range class and removal of VR_ANTI_RANGEs

2017-05-31 Thread Jakub Jelinek
On Wed, May 31, 2017 at 05:36:12PM +0200, Richard Biener wrote:
> On May 31, 2017 5:10:04 PM GMT+02:00, Jakub Jelinek  wrote:
> >On Wed, May 31, 2017 at 10:20:51AM -0400, Aldy Hernandez wrote:
> >> The biggest number of SSA_NAMEs I saw was actually 472,225.  Of these,
> >> 357,032 were non-pointers, so could conceivably have range information. In
> >> reality, 77,398 had range information, so 16% of all pointer and non-pointer
> >> SSA_NAMEs actually have range information.
> >
> >I've tried to look just at insn-recog.c with the usual stage3 flags +
> >-fsanitize=address,undefined and I see there
> >ssa_name_nodes_created
> >4092344
> >(of course, that doesn't mean there are 4M SSA_NAMEs all live at the same
> >time, but I think they don't go through ggc_free and thus the only way that
> >number goes down is during GC.  There are both 72 and 80 bytes alloc pools,
> >even a 32MB increase is something we shouldn't ignore.
> >Furthermore, as e.g. PR80917 shows, we really should track nonzero bits next
> >to value ranges, the current tracking of it only in tree-ssa-ccp which
> >doesn't have ASSERT_EXPRs nor any kind of framework to do something similar
> >without them is insufficient.
> 
> I think the important part to recognize is that the VR type used during
> propagation does (and likely should) not have to be the same as that used
> for long-term storage alongside SSA names.
> 
> The first thing to do is improve the one we use internally in VRP where we
> can also add a separate bit-lattice easily (though we have to iterate
> until both converge).

I believe Andrew/Aldy's goal is to make the "during VRP propagation" vs.
in other passes line fuzzier, but I agree it would be best to do it
like wide_int can - have a template that can work on different range/nz bits
storages, and have a compact storage for the on SSA_NAME data and
less compact one for other purposes.
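
A rough sketch of the shape that could take, with the operations written once as templates over whichever storage they are handed.  All names here (compact_storage, wide_storage, range_view, ranges_intersect_p) are hypothetical and none of this is existing GCC code; it only shows the idea of sharing operations across a small persistent form and a roomier working form.

```cpp
#include <cstdint>

// Hypothetical storages; neither is a GCC type.
struct compact_storage      // what might hang off an SSA_NAME
{
  std::int32_t lo, hi;
  std::uint32_t nonzero;    // bits that may be nonzero
};

struct wide_storage         // roomier working form inside a pass
{
  std::int64_t lo, hi;
  std::uint64_t nonzero;    // bits that may be nonzero
};

// One set of operations, parameterized on the storage behind it.
template <typename Storage>
struct range_view
{
  Storage s;

  bool contains (std::int64_t v) const
  {
    return v >= s.lo && v <= s.hi;
  }

  bool singleton_p () const { return s.lo == s.hi; }
};

// Operations can also mix storages, e.g. a persistent range against a
// working one.
template <typename A, typename B>
bool
ranges_intersect_p (const range_view<A> &a, const range_view<B> &b)
{
  return a.s.lo <= b.s.hi && b.s.lo <= a.s.hi;
}
```

The trade-off is the usual one for this pattern: the compact form pays for its size with a narrower representable range, so a real design would also need a rule for what happens when a working value does not fit.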

Jakub


Re: [PATCH 6/7] [ARC] Prevent moving stores to the frame before the stack adjustment.

2017-05-31 Thread Andrew Burgess
* Claudiu Zissulescu [2017-05-19 12:31:01 +0200]:

> From: Claudiu Zissulescu 
> 
> If the frame pointer is needed, emit a special barrier that will prevent
> the scheduler from moving stores to the frame before the stack adjustment.
> 
> 2017-01-03  Claudiu Zissulescu  
> 
>   * config/arc/arc.c (arc_expand_prologue): Emit a special barrier
>   to prevent store reordering.
>   * config/arc/arc.md (UNSPEC_ARC_STKTIE): Define.
>   (type): Add block type.
>   (stack_tie): Define special instruction to be used in
>   expand_prologue.

Given the description the code looks fine.  It would be nice to see
more of a _why_ in the commit message.  I'm guessing this is either
something related to signal handling, or debugging... I don't see why
this would be needed for functional correctness.

Thanks,
Andrew






> ---
>  gcc/config/arc/arc.c  | 10 +-
>  gcc/config/arc/arc.md | 15 ++-
>  2 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index ff86f6c..0c4c901 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -3030,7 +3030,15 @@ arc_expand_prologue (void)
>frame_size_to_allocate -= first_offset;
>/* Allocate the stack frame.  */
>if (frame_size_to_allocate > 0)
> -frame_stack_add ((HOST_WIDE_INT) 0 - frame_size_to_allocate);
> +{
> +  frame_stack_add ((HOST_WIDE_INT) 0 - frame_size_to_allocate);
> +  /* If the frame pointer is needed, emit a special barrier that
> +  will prevent the scheduler from moving stores to the frame
> +  before the stack adjustment.  */
> +  if (arc_frame_pointer_needed ())
> + emit_insn (gen_stack_tie (stack_pointer_rtx,
> +   hard_frame_pointer_rtx));
> +}
>  
>/* Setup the gp register, if needed.  */
>if (crtl->uses_pic_offset_table)
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index 743a844..6cd192a 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -135,6 +135,7 @@
>UNSPEC_ARC_VMAC2HU
>UNSPEC_ARC_VMPY2H
>UNSPEC_ARC_VMPY2HU
> +  UNSPEC_ARC_STKTIE
>])
>  
>  (define_c_enum "vunspec" [
> @@ -205,7 +206,7 @@
> simd_vcompare, simd_vpermute, simd_vpack, simd_vpack_with_acc,
> simd_valign, simd_valign_with_acc, simd_vcontrol,
> simd_vspecial_3cycle, simd_vspecial_4cycle, simd_dma, mul16_em, div_rem,
> -   fpu"
> +   fpu, block"
>(cond [(eq_attr "is_sfunc" "yes")
>(cond [(match_test "!TARGET_LONG_CALLS_SET && (!TARGET_MEDIUM_CALLS || 
> GET_CODE (PATTERN (insn)) != COND_EXEC)") (const_string "call")
>   (match_test "flag_pic") (const_string "sfunc")]
> @@ -6547,6 +6548,18 @@
>(set_attr "predicable" "yes,no,no,yes,no")
>(set_attr "cond" "canuse,nocond,nocond,canuse_limm,nocond")])
>  
> +(define_insn "stack_tie"
> +  [(set (mem:BLK (scratch))
> + (unspec:BLK [(match_operand:SI 0 "register_operand" "rb")
> +  (match_operand:SI 1 "register_operand" "rb")]
> + UNSPEC_ARC_STKTIE))]
> +  ""
> +  ""
> +  [(set_attr "length" "0")
> +   (set_attr "iscompact" "false")
> +   (set_attr "type" "block")]
> +  )
> +
>  ;; include the arc-FPX instructions
>  (include "fpx.md")
>  
> -- 
> 1.9.1
> 


[C++ PATCH] using directive

2017-05-31 Thread Nathan Sidwell
I've committed this new testcase, from the modules branch.  Something 
that got fixed in the name-lookup change.


nathan
--
Nathan Sidwell
2017-05-31  Nathan Sidwell  

	* g++.dg/lookup/lambda1.C: New.
Index: testsuite/g++.dg/lookup/lambda1.C
===
--- testsuite/g++.dg/lookup/lambda1.C	(revision 0)
+++ testsuite/g++.dg/lookup/lambda1.C	(working copy)
@@ -0,0 +1,13 @@
+// { dg-do compile { target c++11 } }
+
+namespace std
+{
+  typedef int I;
+}
+
+void foo ()
+{
+  using namespace std;
+
+  auto l = [] (I) {};
+}


Re: SSA range class and removal of VR_ANTI_RANGEs

2017-05-31 Thread Richard Biener
On May 31, 2017 6:28:26 PM GMT+02:00, Jakub Jelinek  wrote:
>On Wed, May 31, 2017 at 05:36:12PM +0200, Richard Biener wrote:
>> On May 31, 2017 5:10:04 PM GMT+02:00, Jakub Jelinek wrote:
>> >On Wed, May 31, 2017 at 10:20:51AM -0400, Aldy Hernandez wrote:
>> >> The biggest number of SSA_NAMEs I saw was actually 472,225.  Of these,
>> >> 357,032 were non-pointers, so could conceivably have range information. In
>> >> reality, 77,398 had range information, so 16% of all pointer and non-pointer
>> >> SSA_NAMEs actually have range information.
>> >
>> >I've tried to look just at insn-recog.c with the usual stage3 flags +
>> >-fsanitize=address,undefined and I see there
>> >ssa_name_nodes_created
>> >4092344
>> >(of course, that doesn't mean there are 4M SSA_NAMEs all live at the same
>> >time, but I think they don't go through ggc_free and thus the only way that
>> >number goes down is during GC.  There are both 72 and 80 bytes alloc pools,
>> >even a 32MB increase is something we shouldn't ignore.
>> >Furthermore, as e.g. PR80917 shows, we really should track nonzero bits next
>> >to value ranges, the current tracking of it only in tree-ssa-ccp which
>> >doesn't have ASSERT_EXPRs nor any kind of framework to do something similar
>> >without them is insufficient.
>> 
>> I think the important part to recognize is that the VR type used during
>> propagation does (and likely should) not have to be the same as that used
>> for long-term storage alongside SSA names.
>> 
>> The first thing to do is improve the one we use internally in VRP where we
>> can also add a separate bit-lattice easily (though we have to iterate
>> until both converge).
>
>I believe Andrew/Aldy's goal is to make the "during VRP propagation" vs.
>in other passes line fuzzier

I realize that.  But for caching purposes they do have to keep a sparse lattice 
anyway which can use a more precise info (and also expose that through another 
API than the persistent one).
It's hopefully going to be much like SCEV.

I've not yet seen any code of course (not that I have high hopes...)

Richard.


>, but I agree it would be best to do it
>like wide_int can - have a template that can work on different range/nz bits
>storages, and have a compact storage for the on SSA_NAME data and
>less compact one for other purposes.
>
>   Jakub



[C++ PATCH] lang_decl selector & decomposition

2017-05-31 Thread Nathan Sidwell
This patch reworks how decl_decomposition is marked.  With the new
lang_decl_decomp struct, the selector grew another used bit -- which
affected me on the modules branch.  However, we can use that selector
value in place of the decomposition_p bitfield.


This patch makes that change, but it also replaces the use of magic 
constants [0-4] with an enumeration.
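
In miniature, the idea looks something like the sketch below.  The lds_* names follow the patch, but the toy_* structs and the predicate are made-up stand-ins, far smaller than the real lang_decl variants, and only meant to show why the separate flag becomes redundant.

```cpp
// Selector names follow the patch; everything prefixed toy_ is a
// hypothetical stand-in for illustration.
enum lang_decl_selector { lds_min, lds_fn, lds_ns, lds_parm, lds_decomp };

struct toy_lang_decl_base
{
  unsigned selector : 16;     // holds a lang_decl_selector value
  unsigned other_flags : 16;
};

struct toy_lang_decl
{
  toy_lang_decl_base base;
  void *decomp_base;          // only meaningful for lds_decomp
};

// With the variant recorded in the selector, no separate
// decomposition_p bit is needed to answer this question.
inline bool
toy_decl_decomposition_p (const toy_lang_decl &d)
{
  return d.base.selector == lds_decomp;
}
```

Comparing against a named enumerator also reads better at the check sites than a bare 4 would.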


Rather than have retrofit_lang_decl know about decomposition conversion, 
I broke out a fit_decomposition_lang_decl function for just that purpose.


The other change I made is breaking out new maybe_add_lang_decl_raw and 
maybe_add_lang_type_raw functions.  On the modules branch I need to 
recreate the lang_decl/lang_type nodes, but based on an untrustworth 
binary file.  So we don't want to assert if they're called incorrectly. 
I was going to leave that on the modules branch, until Jakub altered 
retrofit_lang_decl, and I had a more complicated merge.


Jakub, I discovered that fit_decomposition_lang_decl can get called 
repeatedly on the same var but with different base vars.  Is that expected?


nathan

--
Nathan Sidwell
2017-05-31  Nathan Sidwell  

	* cp-tree.h (lang_decl_slector): New enum.
	(lang_decl_base): Make selector an enum.  Drop decomposition_p
	field.
	(lang_decl): Use enum for discrimination.
	(LANG_DECL_FN_CHECK, LANG_DECL_NS_CHECK, LANG_DECL_PARM_CHECK,
	LANG_DECL_DECOMP_CHECK): Use enum.
	(DECL_DECOMPOSITION_P): Use selector value.
	(SET_DECL_DECOMPOSITION_P): Delete.
	(retrofit_lang_decl): Lose SEL parm.
	(fit_decomposition_lang_decl): Declare.
	* decl.c (cp_finish_decomp, grokdeclarator): Use
	fit_decomposition_lang_decl.
	* lex.c (maybe_add_lang_decl_raw): New. Broken out of
	retrofit_lang_decl.
	(set_decl_linkage): New.  Broken out of retrofit_lang_decl.  Use enum.
	(fit_decomposition_lang_decl): Likewise.
	(retrofit_lang_decl): Use worker functions.
	(cxx_dup_lang_specific_decl): Use selector enum.
	(maybe_add_lang_type_raw): New.  Broken out of ...
	(cxx_make_type_name): ... here.  Call it.

Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 248745)
+++ cp/cp-tree.h	(working copy)
@@ -2423,13 +2423,25 @@ struct GTY(()) lang_type {
 #define NAMESPACE_LEVEL(NODE) \
   (LANG_DECL_NS_CHECK (NODE)->level)
 
+/* Discriminator values for lang_decl.  */
+
+enum lang_decl_selector
+{
+  lds_min,
+  lds_fn,
+  lds_ns,
+  lds_parm,
+  lds_decomp
+};
+
 /* Flags shared by all forms of DECL_LANG_SPECIFIC.
 
Some of the flags live here only to make lang_decl_min/fn smaller.  Do
not make this struct larger than 32 bits; instead, make sel smaller.  */
 struct GTY(()) lang_decl_base {
-  unsigned selector : 16;   /* Larger than necessary for faster access.  */
+  /* Larger than necessary for faster access.  */
+  ENUM_BITFIELD(lang_decl_selector) selector : 16;
   ENUM_BITFIELD(languages) language : 1;
   unsigned use_template : 2;
   unsigned not_really_extern : 1;	   /* var or fn */
@@ -2444,8 +2456,7 @@ struct GTY(()) lang_decl_base {
   unsigned u2sel : 1;
   unsigned concept_p : 1;  /* applies to vars and functions */
   unsigned var_declared_inline_p : 1;	   /* var */
-  unsigned decomposition_p : 1;		   /* var */
-  /* 1 spare bit */
+  /* 2 spare bits */
 };
 
 /* True for DECL codes which have template info and access.  */
@@ -2577,12 +2588,13 @@ struct GTY(()) lang_decl_decomp {
 
 struct GTY(()) lang_decl {
   union GTY((desc ("%h.base.selector"))) lang_decl_u {
+ /* Nothing of only the base type exists.  */
 struct lang_decl_base GTY ((default)) base;
-struct lang_decl_min GTY((tag ("0"))) min;
-struct lang_decl_fn GTY ((tag ("1"))) fn;
-struct lang_decl_ns GTY((tag ("2"))) ns;
-struct lang_decl_parm GTY((tag ("3"))) parm;
-struct lang_decl_decomp GTY((tag ("4"))) decomp;
+struct lang_decl_min GTY((tag ("lds_min"))) min;
+struct lang_decl_fn GTY ((tag ("lds_fn"))) fn;
+struct lang_decl_ns GTY((tag ("lds_ns"))) ns;
+struct lang_decl_parm GTY((tag ("lds_parm"))) parm;
+struct lang_decl_decomp GTY((tag ("lds_decomp"))) decomp;
   } u;
 };
 
@@ -2603,26 +2615,29 @@ struct GTY(()) lang_decl {
lang_decl_fn, look down through a TEMPLATE_DECL into its result.  */
 #define LANG_DECL_FN_CHECK(NODE) __extension__\
 ({ struct lang_decl *lt = DECL_LANG_SPECIFIC (STRIP_TEMPLATE (NODE));	\
-   if (!DECL_DECLARES_FUNCTION_P (NODE) || lt->u.base.selector != 1)	\
+   if (!DECL_DECLARES_FUNCTION_P (NODE)	\
+   || lt->u.base.selector != lds_fn)\
  lang_check_failed (__FILE__, __LINE__, __FUNCTION__);		\
    &lt->u.fn; })
 
 #define LANG_DECL_NS_CHECK(NODE) __extension__\
 ({ struct lang_decl *lt = DECL_LANG_SPECIFIC (NODE);			\
-   if (TREE_CODE (NODE) != NAMESPACE_DECL || lt->u.base.selector != 2)	\
+   if (TREE_CODE (NODE) != NAMESPACE_DECL\
+   || lt->u.base.selector != lds_ns)\
  lang_check_failed (__FILE__, __LINE__, __FUNCTION__);		\
    &lt->u.ns; })
 
 #def

Re: [C++ PATCH] lang_decl selector & decomposition

2017-05-31 Thread Jakub Jelinek
On Wed, May 31, 2017 at 12:54:24PM -0400, Nathan Sidwell wrote:
> This patch reworks how decl_decomposition is marked.  With the new
> lang_decl_decomp struct, the selector grew another used bit -- which
> affected me on the modules branch.  However, we can use that selector
> value in place of the decomposition_p bitfield.
> 
> This patch makes that change, but it also replaces the use of magic
> constants [0-4] with an enumeration.
> 
> Rather than have retrofit_lang_decl know about decomposition conversion, I
> broke out a fit_decomposition_lang_decl function for just that purpose.
> 
> The other change I made is breaking out new maybe_add_lang_decl_raw and
> maybe_add_lang_type_raw functions.  On the modules branch I need to recreate
> the lang_decl/lang_type nodes, but based on an untrustworthy binary file.  So
> we don't want to assert if they're called incorrectly. I was going to leave
> that on the modules branch, until Jakub altered retrofit_lang_decl, and I
> had a more complicated merge.
> 
> Jakub, I discovered that fit_decomposition_lang_decl can get called
> repeatedly on the same var but with different base vars.  Is that expected?

That is weird, that sounds like a bug somewhere?  Which testcase is it on?

> 2017-05-31  Nathan Sidwell  
> 
>   * cp-tree.h (lang_decl_slector): New enum.

selector?

Jakub


C++ PATCH for c++/80840, ICE with constexpr and reference template parm

2017-05-31 Thread Jason Merrill
In convert_nontype_argument to reference type we were inappropriately
checking value_dependent_expression_p on an expression that might be a
VAR_DECL and might be a TEMPLATE_PARM_INDEX of reference type.  It's
inappropriate in the former case because we don't care about the value
of the object, only its address; we only want to test for the latter
case.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 388ec8de844dda6785d55daef0fffc90eb49118c
Author: Jason Merrill 
Date:   Tue May 30 21:38:17 2017 -0400

PR c++/80840 - ICE with constexpr and reference

* pt.c (convert_nontype_argument): Don't test whether a decl is
value-dependent when binding to a reference.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index c1a8885..8239a3e 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -6749,7 +6749,11 @@ convert_nontype_argument (tree type, tree expr, 
tsubst_flags_t complain)
}
}
 
-  if (!value_dependent_expression_p (expr))
+  if (TREE_CODE (TREE_TYPE (expr)) == REFERENCE_TYPE
+ && value_dependent_expression_p (expr))
+   /* OK, dependent reference.  We don't want to ask whether a DECL is
+  itself value-dependent, since what we want here is its address.  */;
+  else
{
  if (!DECL_P (expr))
{
@@ -6771,8 +6775,11 @@ convert_nontype_argument (tree type, tree expr, 
tsubst_flags_t complain)
  return NULL_TREE;
}
 
- expr = build_nop (type, build_address (expr));
+ expr = build_address (expr);
}
+
+  if (!same_type_p (type, TREE_TYPE (expr)))
+   expr = build_nop (type, expr);
 }
   /* [temp.arg.nontype]/5, bullet 4
 
diff --git a/gcc/testsuite/g++.dg/template/ref10.C 
b/gcc/testsuite/g++.dg/template/ref10.C
new file mode 100644
index 000..60e91d0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/ref10.C
@@ -0,0 +1,13 @@
+// PR c++/80840
+// { dg-do compile { target c++11 } }
+
+template 
+struct Just;
+
+template 
+struct Number {
+static constexpr double value = X;
+using result = Just;
+};
+
+int main() {}


Re: [C++ PATCH] lang_decl selector & decomposition

2017-05-31 Thread Nathan Sidwell

On 05/31/2017 01:05 PM, Jakub Jelinek wrote:


> That is weird, that sounds like a bug somewhere?  Which testcase is it on?


I think decomp25 & decomp7 & 8 (sorry, can't recall full name).
I discovered them because my first attempt had an assert that the 
incoming base was the same as the stored one, for the already-converted 
case.


It wasn't the case of turning a NULL into non-NULL.

nathan

--
Nathan Sidwell


Re: [C++ PATCH] lang_decl selector & decomposition

2017-05-31 Thread Jakub Jelinek
On Wed, May 31, 2017 at 01:16:36PM -0400, Nathan Sidwell wrote:
> On 05/31/2017 01:05 PM, Jakub Jelinek wrote:
> 
> > That is weird, that sounds like a bug somewhere?  Which testcase is it on?
> 
> I think decomp25 & decomp7 & 8 (sorry, can't recall full name).
> I discovered them because my first attempt had an assert that the incoming
> base was the same as the stored one, for the already-converted case.
> 
> It wasn't the case of turning a NULL into non-NULL.

Even that shouldn't happen.  NULL should be only for the base declaration,
i.e. the underlying artificial var, non-NULL should be set on the VAR_DECLs
for the user identifiers and should point to the underlying artificial var.

Jakub


[PATCH rs6000] Addition fixes to BMI intrinsic tests, 3rd edition

2017-05-31 Thread Steven Munroe
Bill Seurer pointed out that building the BMI tests on a power8 but with
gcc built --with-cpu=power6 fails with link errors. The intrinsics
_pdep_u64/32 and _pext_u64/32 are guarded with #ifdef _ARCH_PWR7 as the
implementation uses bpermd and popcntd instructions introduced with
power7 (PowerISA-2.06).

But if the GCC is built --with-cpu=power6, the compiler is capable of
supporting -mcpu=power7 but will not generate bpermd/popcntd by default.
Then if some code uses, say, _pext_u64 with -mcpu=power6, the
intrinsic is not supported (needs power7) and so is not defined.

The { dg-require-effective-target powerpc_vsx_ok } is not sufficient for
the { dg-do run } and needs to be changed to vsx_hw. We also need to add
-mcpu=power7 to dg-options to ensure the compiler will generate the
bpermd/popcntd instructions.
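
For anyone not familiar with the operations themselves, here is a portable reference sketch of what pext and pdep compute, following the x86 BMI2 definitions.  The function names are made up, and this is not the header's bpermd/popcntd implementation; it needs no particular -mcpu.

```cpp
#include <cstdint>

// Hypothetical reference versions; the names are not from the header.

// Gather the bits of SRC selected by MASK into the low bits of the
// result, lowest mask bit first.
std::uint64_t
pext64_reference (std::uint64_t src, std::uint64_t mask)
{
  std::uint64_t result = 0;
  unsigned out = 0;
  // Each iteration handles the lowest set bit of MASK, then clears it.
  for (; mask != 0; mask &= mask - 1)
    {
      std::uint64_t lowest = mask & (~mask + 1);
      if (src & lowest)
        result |= std::uint64_t {1} << out;
      ++out;
    }
  return result;
}

// Scatter the low bits of SRC to the positions selected by MASK,
// lowest mask bit first.
std::uint64_t
pdep64_reference (std::uint64_t src, std::uint64_t mask)
{
  std::uint64_t result = 0;
  for (; mask != 0; mask &= mask - 1, src >>= 1)
    if (src & 1)
      result |= mask & (~mask + 1);
  return result;
}
```

With a mask of 0xff00ff00, extracting from 0x12345678 gives 0x1256, and depositing 0x1256 gives back 0x12005600.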

Also added:

{ dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" }
{ "-mcpu=power7" } }

and 

{ dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } }

To ward off the evil spirits 

Tested on BE --with-cpu=power6 -m32/-m64 and LE --with-cpu=power8. All
bmi/bmi2 intrinsic tests passed.

[gcc/testsuite]

2017-05-31  Steven Munroe  

* gcc.target/powerpc/bmi2-pdep32-1.c: Add -mcpu=power7 to
dg-options.  Change dg-require-effective-target powerpc_vsx_ok
to vsx_hw.  Add dg-skip-if directive to disable this test if
-mcpu is overridden.
* gcc.target/powerpc/bmi2-pdep64-1.c: Likewise.
* gcc.target/powerpc/bmi2-pext32-1.c: Likewise.
* gcc.target/powerpc/bmi2-pext64-1.c: Likewise.
* gcc.target/powerpc/bmi2-pext64-1a.c: Add -mcpu=power7
to dg-options.  Add dg-skip-if directive to disable this test
for darwin.

Index: gcc/testsuite/gcc.target/powerpc/bmi2-pdep32-1.c
===
--- gcc/testsuite/gcc.target/powerpc/bmi2-pdep32-1.c(revision 248468)
+++ gcc/testsuite/gcc.target/powerpc/bmi2-pdep32-1.c(working copy)
@@ -1,7 +1,8 @@
 /* { dg-do run } */
-/* { dg-options "-O3" } */
+/* { dg-options "-O3 -mcpu=power7" } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" }
{ "-mcpu=power7" } } */
 
 #define NO_WARN_X86_INTRINSICS 1
 #include 
Index: gcc/testsuite/gcc.target/powerpc/bmi2-pdep64-1.c
===
--- gcc/testsuite/gcc.target/powerpc/bmi2-pdep64-1.c(revision 248468)
+++ gcc/testsuite/gcc.target/powerpc/bmi2-pdep64-1.c(working copy)
@@ -1,7 +1,8 @@
 /* { dg-do run } */
-/* { dg-options "-O3" } */
+/* { dg-options "-O3 -mcpu=power7" } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" }
{ "-mcpu=power7" } } */
 
 #define NO_WARN_X86_INTRINSICS 1
 #include 
Index: gcc/testsuite/gcc.target/powerpc/bmi2-pext32-1.c
===
--- gcc/testsuite/gcc.target/powerpc/bmi2-pext32-1.c(revision 248468)
+++ gcc/testsuite/gcc.target/powerpc/bmi2-pext32-1.c(working copy)
@@ -1,7 +1,8 @@
 /* { dg-do run } */
-/* { dg-options "-O3" } */
+/* { dg-options "-O3 -mcpu=power7" } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" }
{ "-mcpu=power7" } } */
 
 #define NO_WARN_X86_INTRINSICS 1
 #include 
Index: gcc/testsuite/gcc.target/powerpc/bmi2-pext64-1.c
===
--- gcc/testsuite/gcc.target/powerpc/bmi2-pext64-1.c(revision 248468)
+++ gcc/testsuite/gcc.target/powerpc/bmi2-pext64-1.c(working copy)
@@ -1,7 +1,8 @@
 /* { dg-do run } */
-/* { dg-options "-O3" } */
+/* { dg-options "-O3 -mcpu=power7" } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" }
{ "-mcpu=power7" } } */
 
 #define NO_WARN_X86_INTRINSICS 1
 #include 
Index: gcc/testsuite/gcc.target/powerpc/bmi2-pext64-1a.c
===
--- gcc/testsuite/gcc.target/powerpc/bmi2-pext64-1a.c   (revision 248468)
+++ gcc/testsuite/gcc.target/powerpc/bmi2-pext64-1a.c   (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-O3" } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-options "-O3 -mcpu=power7" } */
 /* { dg-require-effective-target lp64 } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
 



Re: [PATCH, rs6000] fold-vec-logical-ors-longlong test update

2017-05-31 Thread Segher Boessenkool
On Fri, May 26, 2017 at 12:20:05PM -0500, Will Schmidt wrote:
> This test has been flaky on both AIX and in older linux based
> environments. Notably, the number of xxlor instructions
> generated varies depending on the platform and the specified
> bit-size (32/64), with older environments generating either 6 or 24
> xxlor instructions.  The behavior appears to level out on power8 targets,
> so update this test to target power8-vector and specify the
> -mpower8-vector option.

Okay for trunk.  Thanks!


Segher


> 2017-05-26  Will Schmidt  
> 
>   * gcc.target/powerpc/fold-vec-logical-ors-longlong.c:
>   Update the target to powerpc_p8vector_ok. Update dg-options
>   value to -mpower8-vector.


Re: [PATCH][x86] Add missing mask intrinsics for MAX[SD,SS] and MIN[SD,SS]

2017-05-31 Thread Uros Bizjak
On Tue, May 30, 2017 at 4:29 PM, Peryt, Sebastian
 wrote:
> Hi,
>
> This patch adds missing intrinsics for MAX[SD,SS] and MIN[SD,SS] listed below:
> - _mm_mask_max_sd,
> - _mm_maskz_max_sd,
> - _mm_mask_max_ss,
> - _mm_maskz_max_ss,
>
> - _mm_mask_min_sd,
> - _mm_maskz_min_sd,
> - _mm_mask_min_ss,
> - _mm_maskz_min_ss.
>
> gcc/
> * config/i386/avx512fintrin.h (_mm_mask_max_sd,
> _mm_maskz_max_sd, _mm_mask_max_ss, _mm_maskz_max_ss,
> _mm_mask_min_sd, _mm_maskz_min_sd, _mm_mask_min_ss,
> _mm_maskz_min_ss): New intrinsics.
>
> gcc/testsuite/
> * gcc.target/i386/avx512f-vmaxsd-1.c (_mm_mask_max_sd,
> _mm_maskz_max_sd): Test new intrinsics.
> * gcc.target/i386/avx512f-vmaxsd-2.c (_mm_mask_max_sd,
> _mm_maskz_max_sd): Test new intrinsics.
> * gcc.target/i386/avx512f-vmaxss-1.c (_mm_mask_max_ss,
> _mm_maskz_max_ss): Test new intrinsics.
> * gcc.target/i386/avx512f-vmaxss-2.c (_mm_mask_max_ss,
> _mm_maskz_max_ss): Test new intrinsics.
> * gcc.target/i386/avx512f-vminsd-1.c (_mm_mask_min_sd,
> _mm_maskz_min_sd): Test new intrinsics.
> * gcc.target/i386/avx512f-vminsd-2.c (_mm_mask_min_sd,
> _mm_maskz_min_sd): Test new intrinsics.
> * gcc.target/i386/avx512f-vminss-1.c (_mm_mask_min_ss,
> _mm_maskz_min_ss): Test new intrinsics.
> * gcc.target/i386/avx512f-vminss-2.c (_mm_mask_min_ss,
> _mm_maskz_min_ss): Test new intrinsics.
>
> Is it ok for trunk?

Approved and committed to mainline SVN.

Thanks,
Uros.


[PATCH, i386]: Allow direct XMM->GR zero extensions for 32bit targets

2017-05-31 Thread Uros Bizjak
Hello!

Attached patch allows direct XMM->GR zero extensions for 32bit
targets. This insn will be split after reload to a direct
XMM->lowpart(GR) move and 0->highpart(GR) zeroing.
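
As a rough scalar model of that split (a sketch only; the gr_pair type and
the function names are hypothetical and do not appear in i386.md), a
64-bit zero extension on a 32-bit target amounts to one move into the low
half of the register pair and one constant store into the high half:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a DImode value held in two 32-bit general
   registers on a 32-bit target.  */
struct gr_pair
{
  uint32_t low;   /* lowpart(GR) */
  uint32_t high;  /* highpart(GR) */
};

/* Model of the post-reload split: move the 32-bit source into the low
   part and zero the high part.  */
struct gr_pair
zero_extend_split (uint32_t xmm_low32)
{
  struct gr_pair dst;
  dst.low = xmm_low32;  /* XMM->lowpart(GR) move */
  dst.high = 0;         /* 0->highpart(GR) zeroing */
  return dst;
}

/* Reassemble the pair so it can be compared with a plain C conversion.  */
uint64_t
gr_pair_value (struct gr_pair p)
{
  return ((uint64_t) p.high << 32) | p.low;
}

/* The split gives the same value as an ordinary zero extension.  */
void
zero_extend_split_examples (void)
{
  assert (gr_pair_value (zero_extend_split (0x80000000u)) == 0x80000000ull);
  assert (gr_pair_value (zero_extend_split (0xFFFFFFFFu))
          == (uint64_t) 0xFFFFFFFFu);
}
```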

2017-05-31  Uros Bizjak  

* config/i386/i386.md (*zero_extendsidi2): Enable alternative (?r, *Yj)
also for 32bit target.  Update insn attributes.
(zero-extendsidi2 splitter): Allow all registers for operand 1.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 248692)
+++ config/i386/i386.md (working copy)
@@ -3843,7 +3843,7 @@
   [(set (attr "isa")
  (cond [(eq_attr "alternative" "0,1,2")
  (const_string "nox64")
-   (eq_attr "alternative" "3,7")
+   (eq_attr "alternative" "3")
  (const_string "x64")
(eq_attr "alternative" "9")
  (const_string "sse2")
@@ -3860,7 +3860,11 @@
  (const_string "multi")
(eq_attr "alternative" "5,6")
  (const_string "mmxmov")
-   (eq_attr "alternative" "7,8,9,10,11")
+   (eq_attr "alternative" "7")
+ (if_then_else (match_test "TARGET_64BIT")
+   (const_string "ssemov")
+   (const_string "multi"))
+   (eq_attr "alternative" "8,9,10,11")
  (const_string "ssemov")
(eq_attr "alternative" "12")
  (const_string "mskmov")
@@ -3881,8 +3885,11 @@
(set (attr "mode")
  (cond [(eq_attr "alternative" "5,6")
  (const_string "DI")
-   (eq_attr "alternative" "7,8,10,11")
+   (and (eq_attr "alternative" "7")
+(match_test "TARGET_64BIT"))
  (const_string "TI")
+   (eq_attr "alternative" "8,10,11")
+ (const_string "TI")
   ]
   (const_string "SI")))])
 
@@ -3903,7 +3910,7 @@
 
 (define_split
   [(set (match_operand:DI 0 "nonimmediate_gr_operand")
-   (zero_extend:DI (match_operand:SI 1 "nonimmediate_gr_operand")))]
+   (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand")))]
   "!TARGET_64BIT && reload_completed
&& !(MEM_P (operands[0]) && MEM_P (operands[1]))"
   [(set (match_dup 3) (match_dup 1))


Re: [C++ Patch] PR 80896 ("[[nodiscard]] is ignored for functions returning references")

2017-05-31 Thread Jason Merrill
OK.

On Wed, May 31, 2017 at 8:04 AM, Paolo Carlini  wrote:
> Hi,
>
> this one appears to be a rather simple case of missing diagnostic: in
> convert_to_void we aren't calling maybe_warn_nodiscard when we strip an
> INDIRECT_REF wrapping a CALL_EXPR thus we don't issue the diagnostic that we
> normally provide for plain CALL_EXPRs (eg, for a func returning a plain
> int). Tested x86_64-linux.
>
> Thanks, Paolo.
>
> //
>


Re: [Patch, fortran] PR35339 Optimize implied do loops in io statements

2017-05-31 Thread Nicolas Koenig

Hello Dominique,

attached is the next try, this time without stupidities (I hope). Both 
test cases you posted don't ICE anymore.


Ok for trunk?

Nicolas

Regression tested for x86_64-pc-linux-gnu.

Changelog (still the same):
2017-05-27  Nicolas Koenig  

PR fortran/35339
* frontend-passes.c (traverse_io_block): New function.
(simplify_io_impl_do): New function.
(optimize_namespace): Invoke gfc_code_walker with
simplify_io_impl_do.

2017-05-27  Nicolas Koenig  

PR fortran/35339
* gfortran.dg/implied_do_io_1.f90: New Test.

On 05/31/2017 05:49 PM, Dominique d'Humières wrote:

Le 31 mai 2017 à 17:40, Dominique d'Humières  a écrit :

If I am not mistaken, compiling the following code with the patch applied

simpler test

   print *,(huge(0),i=1,6)
!  print*,(i,i=1,6)
!  print*,(i,i=1,6,1)
   end


gives an ICE.

TIA

Dominique


Index: frontend-passes.c
===
--- frontend-passes.c	(revision 248539)
+++ frontend-passes.c	(working copy)
@@ -1060,6 +1060,257 @@ convert_elseif (gfc_code **c, int *walk_subtrees A
   return 0;
 }
 
+struct do_stack
+{
+  struct do_stack *prev;
+  gfc_iterator *iter;
+  gfc_code *code;
+} *stack_top;
+
+/* Recursively traverse the block of a WRITE or READ statement, and, can it be
+   optimized, do so. It optimizes it by replacing do loops with their analog
+   array slices. For example:
+   
+ write (*,*) (a(i), i=1,4)
+ 
+   is replaced with
+ 
+ write (*,*) a(1:4:1) .  */
+
+static bool 
+traverse_io_block(gfc_code *code, bool *has_reached, gfc_code *prev)
+{
+  gfc_code *curr; 
+  gfc_expr *new_e, *expr, *start;
+  gfc_ref *ref;
+  struct do_stack ds_push;
+  int i, future_rank = 0;
+  gfc_iterator *iters[GFC_MAX_DIMENSIONS];
+
+  /* Find the first transfer/do statement.  */
+  for (curr = code; curr; curr = curr->next)
+{
+  if (curr->op == EXEC_DO || curr->op == EXEC_TRANSFER)
+break;
+}
+
+  /* Ensure it is the only transfer/do statement because cases like
+   
+   write (*,*) (a(i), b(i), i=1,4)
+
+ cannot be optimized.  */
+
+  if (!curr || curr->next)
+return false;
+
+  if (curr->op == EXEC_DO)
+{
+  if (curr->ext.iterator->var->ref)
+return false;
+  ds_push.prev = stack_top;
+  ds_push.iter = curr->ext.iterator;
+  ds_push.code = curr;
+  stack_top = &ds_push;
+  if (traverse_io_block(curr->block->next, has_reached, prev))
+{
+	  if (curr != stack_top->code && !*has_reached)
+	{
+  curr->block->next = NULL;
+  gfc_free_statements(curr);
+	}
+	  else
+	*has_reached = true;
+	  return true;
+}
+  return false;
+}
+
+  gcc_assert(curr->op == EXEC_TRANSFER);
+
+  ref = curr->expr1->ref;
+  if (!ref || ref->type != REF_ARRAY || ref->u.ar.codimen != 0 || ref->next)
+return false;
+
+  /* Find the iterators belonging to each variable and check conditions.  */
+  for (i = 0; i < ref->u.ar.dimen; i++)
+{
+  if (!ref->u.ar.start[i] || ref->u.ar.start[i]->ref
+  || ref->u.ar.dimen_type[i] != DIMEN_ELEMENT)
+return false;
+  
+  start = ref->u.ar.start[i];
+  gfc_simplify_expr(start, 0);
+  switch (start->expr_type)
+{
+	case EXPR_VARIABLE:
+
+	  /* write (*,*) (a(i), i=a%b,1) not handled yet.  */
+	  if (start->ref)
+	return false;
+
+	  /*  Check for (a(k), i=1,4) or ((a(j, i), i=1,4), j=1,4).  */
+	  if (!stack_top || !stack_top->iter 
+	 || stack_top->iter->var->symtree != start->symtree)
+	iters[i] = NULL; 
+	  else
+	{
+  iters[i] = stack_top->iter;
+	  stack_top = stack_top->prev;
+	  future_rank++;
+	}
+	  break;
+case EXPR_CONSTANT:
+	  iters[i] = NULL;
+	  break;
+	case EXPR_OP:
+  switch (start->value.op.op)
+	{
+	case INTRINSIC_PLUS:
+	case INTRINSIC_TIMES:
+	  if (start->value.op.op1->expr_type != EXPR_VARIABLE)
+	std::swap(start->value.op.op1, start->value.op.op2);
+	gcc_fallthrough();
+	case INTRINSIC_MINUS:
+	  if ((start->value.op.op1->expr_type!= EXPR_VARIABLE 
+	&& start->value.op.op2->expr_type != EXPR_CONSTANT)
+	  || start->value.op.op1->ref)
+	return false;
+  if (!stack_top || !stack_top->iter 
+	 || stack_top->iter->var->symtree 
+		!= start->value.op.op1->symtree)
+	return false;
+	  iters[i] = stack_top->iter; 
+	  stack_top = stack_top->prev;
+	  break;
+	default:
+	  return false;
+	}
+	future_rank++;
+	  break;
+	default:
+	  return false;
+}
+}
+
+  /* Create new expr.  */
+  new_e = gfc_copy_expr(curr->expr1);
+  new_e->expr_type = EXPR_VARIABLE;
+  new_e->rank = future_rank; 
+  if (curr->expr1->shape)
+{
+  new_e->shape = gfc_get_shape(new_e->rank);
+}
+
+
+  /* Assign new starts, ends and strides if necessary.  */
+  fo

Re: Optimisation of std::binary_search of the header

2017-05-31 Thread Mike Stump
On May 31, 2017, at 12:33 AM, jay pokarna  wrote:
> 
>Could you tell the way as to how can I measure the time taken
> by my algorithm and compare it with the inbuilt functions ?

No, that's beyond our charter.  We review patches for gcc.  I'd recommend 
google, it has answers to most questions.  I think this one might be covered.  
[ quick check ] https://gist.github.com/nfarring/1624742, yeah, it's covered.

> Also , could you recommend some data that could be helpful to help the
> comparison between the function and the std::binary_search?

I think google can find some data for you, just search; try google("data sets 
for searching").



[PATCH v2, rs6000] Fold vector absolutes in GIMPLE

2017-05-31 Thread Will Schmidt
Hi,
 
Add support for early expansion of vector absolute built-ins.

[V2] Per reviews and feedback, skip the early folding for
integral types based on a check against TYPE_OVERFLOW_WRAPS(arg0).
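
For illustration only (not part of the patch): the integer tests below
scan for a modular subtract (vsububm) followed by a signed maximum
(vmaxsb), so the assumption in this sketch is that each lane is computed
as max (x, 0 - x) with wrapping subtraction. Under that assumption the
most negative value maps to itself, which matches a wrapping type but is
not something ABS_EXPR promises when overflow is undefined. The function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one signed-char lane: modular "0 - x" followed by a
   signed maximum.  The subtraction is done in int and reduced by hand so
   the C code itself never overflows.  */
int8_t
lane_abs_wrapping (int8_t x)
{
  int neg = -(int) x;        /* Range -127..128, no overflow in int.  */
  if (neg > INT8_MAX)
    neg -= 256;              /* Wrap 128 to -128, as modulo arithmetic does.  */
  return (int8_t) (neg > x ? neg : x);
}

/* Ordinary values behave like abs; only the minimum value wraps.  */
void
lane_abs_wrapping_examples (void)
{
  assert (lane_abs_wrapping (-5) == 5);
  assert (lane_abs_wrapping (7) == 7);
  assert (lane_abs_wrapping (INT8_MIN) == INT8_MIN);
}
```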

Added test variants to exercise the -fwrapv option during
this folding.

OK for trunk?  (bootstraps running, pending review).

[gcc]

2017-05-31  Will Schmidt  

* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling
for early expansion of vector absolute builtins.

[gcc/testsuite]

2017-05-31  Will Schmidt  

* gcc.target/powerpc/fold-vec-abs-char.c: New.
* gcc.target/powerpc/fold-vec-abs-floatdouble.c: New.
* gcc.target/powerpc/fold-vec-abs-int.c: New.
* gcc.target/powerpc/fold-vec-abs-longlong.c: New.
* gcc.target/powerpc/fold-vec-abs-short.c: New.
* gcc.target/powerpc/fold-vec-abs-char-fwrapv.c: New.
* gcc.target/powerpc/fold-vec-abs-int-fwrapv.c: New.
* gcc.target/powerpc/fold-vec-abs-longlong-fwrapv.c: New.
* gcc.target/powerpc/fold-vec-abs-short-fwrapv.c: New.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index dac673c..46d281a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -17333,6 +17333,24 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
gsi_replace (gsi, g, true);
return true;
   }
+/* flavors of vec_abs. */
+case ALTIVEC_BUILTIN_ABS_V16QI:
+case ALTIVEC_BUILTIN_ABS_V8HI:
+case ALTIVEC_BUILTIN_ABS_V4SI:
+case ALTIVEC_BUILTIN_ABS_V4SF:
+case P8V_BUILTIN_ABS_V2DI:
+case VSX_BUILTIN_XVABSDP:
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   if ( INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE(arg0)))
+   && ! TYPE_OVERFLOW_WRAPS (TREE_TYPE (TREE_TYPE(arg0))))
+ return false;
+   lhs = gimple_call_lhs (stmt);
+   gimple *g = gimple_build_assign (lhs, ABS_EXPR, arg0);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_replace (gsi, g, true);
+   return true;
+  }
 default:
   break;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char-fwrapv.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char-fwrapv.c
new file mode 100644
index 000..739f06e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char-fwrapv.c
@@ -0,0 +1,18 @@
+/* Verify that overloaded built-ins for vec_abs with char
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O2 -fwrapv" } */
+
+#include 
+
+vector signed char
+test2 (vector signed char x)
+{
+  return vec_abs (x);
+}
+
+/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
+/* { dg-final { scan-assembler-times "vsububm" 1 } } */
+/* { dg-final { scan-assembler-times "vmaxsb" 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char.c
new file mode 100644
index 000..239c919
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char.c
@@ -0,0 +1,18 @@
+/* Verify that overloaded built-ins for vec_abs with char
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O2" } */
+
+#include 
+
+vector signed char
+test2 (vector signed char x)
+{
+  return vec_abs (x);
+}
+
+/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
+/* { dg-final { scan-assembler-times "vsububm" 1 } } */
+/* { dg-final { scan-assembler-times "vmaxsb" 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-floatdouble.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-floatdouble.c
new file mode 100644
index 000..1a08618
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-floatdouble.c
@@ -0,0 +1,23 @@
+/* Verify that overloaded built-ins for vec_abs with float and
+   double inputs for VSX produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include 
+
+vector float
+test1 (vector float x)
+{
+  return vec_abs (x);
+}
+
+vector double
+test2 (vector double x)
+{
+  return vec_abs (x);
+}
+
+/* { dg-final { scan-assembler-times "xvabssp" 1 } } */
+/* { dg-final { scan-assembler-times "xvabsdp" 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-int-fwrapv.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-int-fwrapv.c
new file mode 100644
index 000..34dead4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-int-fwrapv.c
@@ -0,0 +1,18 @@
+/* Verify that overloaded built-ins for vec_abs with int
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O2 -fwrapv" } */
+
+#include 
+
+vector signed int
+test1 (vector signed int x)
+{
+  return vec_abs (x);
+}
+
+/* { dg-final { scan-assembler-tim

[PATCH, rs6000] fold vector min/max in GIMPLE

2017-05-31 Thread Will Schmidt
Hi, 

(resending with folks on CC, apologies to anyone having deja-vu)

Add support for early expansion of vec_min, vec_max built-ins.
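
For illustration only (not part of the patch): a scalar sketch of why the
signed and unsigned variants have to be kept apart once the built-ins
become plain MIN_EXPR/MAX_EXPR, where the result depends on the
signedness of the operand type. The same bit pattern 0xFF is the smaller
value as a signed byte (-1) and the larger one as an unsigned byte (255).
The function names and the LANES constant are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 16

/* Lane-wise minimum treating each byte as signed (models vminsb).  */
void
lanes_min_signed (const int8_t *x, const int8_t *y, int8_t *out)
{
  for (int i = 0; i < LANES; i++)
    out[i] = x[i] < y[i] ? x[i] : y[i];
}

/* Lane-wise minimum treating each byte as unsigned (models vminub).  */
void
lanes_min_unsigned (const uint8_t *x, const uint8_t *y, uint8_t *out)
{
  for (int i = 0; i < LANES; i++)
    out[i] = x[i] < y[i] ? x[i] : y[i];
}

/* Lane 0 holds the bit patterns 0xFF and 0x01; the other lanes are 0.  */
void
lanes_min_examples (void)
{
  int8_t sx[LANES] = { -1 }, sy[LANES] = { 1 }, sout[LANES];
  uint8_t ux[LANES] = { 0xFF }, uy[LANES] = { 0x01 }, uout[LANES];

  lanes_min_signed (sx, sy, sout);
  lanes_min_unsigned (ux, uy, uout);

  assert (sout[0] == -1);     /* Signed: 0xFF is -1, the smaller value.  */
  assert (uout[0] == 0x01);   /* Unsigned: 0xFF is 255, the larger value.  */
  assert (sout[1] == 0 && uout[1] == 0);
}
```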

Bootstraps currently running.

OK for trunk?

Thanks,
-Will

[gcc]

2017-05-26  Will Schmidt  
* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling
for early expansion of vec_min and vec_max builtins.
(builtin_function_type): Add min/max unsigned variants to those
identified as having unsigned arguments.

[gcc/testsuite]

2017-05-26  Will Schmidt  

*  testsuite/gcc.target/powerpc/fold-vec-minmax-char.c: New.
*  testsuite/gcc.target/powerpc/fold-vec-minmax-floatdouble.c: New.
*  testsuite/gcc.target/powerpc/fold-vec-minmax-int.c: New.
*  testsuite/gcc.target/powerpc/fold-vec-minmax-longlong.c: New.
*  testsuite/gcc.target/powerpc/fold-vec-minmax-short.c: New.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 104a052..ce6cc1b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -17348,6 +17348,46 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
gsi_replace (gsi, g, true);
return true;
   }
+/* flavors of vec_min. */
+case VSX_BUILTIN_XVMINDP:
+case P8V_BUILTIN_VMINSD:
+case P8V_BUILTIN_VMINUD:
+case ALTIVEC_BUILTIN_VMINSB:
+case ALTIVEC_BUILTIN_VMINSH:
+case ALTIVEC_BUILTIN_VMINSW:
+case ALTIVEC_BUILTIN_VMINUB:
+case ALTIVEC_BUILTIN_VMINUH:
+case ALTIVEC_BUILTIN_VMINUW:
+case ALTIVEC_BUILTIN_VMINFP:
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   arg1 = gimple_call_arg (stmt, 1);
+   lhs = gimple_call_lhs (stmt);
+   gimple *g = gimple_build_assign (lhs, MIN_EXPR, arg0, arg1);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_replace (gsi, g, true);
+   return true;
+  }
+/* flavors of vec_max. */
+case VSX_BUILTIN_XVMAXDP:
+case P8V_BUILTIN_VMAXSD:
+case P8V_BUILTIN_VMAXUD:
+case ALTIVEC_BUILTIN_VMAXSB:
+case ALTIVEC_BUILTIN_VMAXSH:
+case ALTIVEC_BUILTIN_VMAXSW:
+case ALTIVEC_BUILTIN_VMAXUB:
+case ALTIVEC_BUILTIN_VMAXUH:
+case ALTIVEC_BUILTIN_VMAXUW:
+case ALTIVEC_BUILTIN_VMAXFP:
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   arg1 = gimple_call_arg (stmt, 1);
+   lhs = gimple_call_lhs (stmt);
+   gimple *g = gimple_build_assign (lhs, MAX_EXPR, arg0, arg1);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_replace (gsi, g, true);
+   return true;
+  }
 default:
   break;
 }
@@ -18986,6 +19026,14 @@ builtin_function_type (machine_mode mode_ret, 
machine_mode mode_arg0,
 case MISC_BUILTIN_DIVDEU:
 case MISC_BUILTIN_DIVDEUO:
 case VSX_BUILTIN_UDIV_V2DI:
+case ALTIVEC_BUILTIN_VMAXUB:
+case ALTIVEC_BUILTIN_VMINUB:
+case ALTIVEC_BUILTIN_VMAXUH:
+case ALTIVEC_BUILTIN_VMINUH:
+case ALTIVEC_BUILTIN_VMAXUW:
+case ALTIVEC_BUILTIN_VMINUW:
+case P8V_BUILTIN_VMAXUD:
+case P8V_BUILTIN_VMINUD:
   h.uns_p[0] = 1;
   h.uns_p[1] = 1;
   h.uns_p[2] = 1;
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-minmax-char.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-minmax-char.c
new file mode 100644
index 000..9df6ecd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-minmax-char.c
@@ -0,0 +1,37 @@
+/* Verify that overloaded built-ins for vec_min with char
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec" } */
+
+#include 
+
+vector signed char
+test3_min (vector signed char x, vector signed char y)
+{
+  return vec_min (x, y);
+}
+
+vector unsigned char
+test6_min (vector unsigned char x, vector unsigned char y)
+{
+  return vec_min (x, y);
+}
+
+vector signed char
+test3_max (vector signed char x, vector signed char y)
+{
+  return vec_max (x, y);
+}
+
+vector unsigned char
+test6_max (vector unsigned char x, vector unsigned char y)
+{
+  return vec_max (x, y);
+}
+
+/* { dg-final { scan-assembler-times "vminsb" 1 } } */
+/* { dg-final { scan-assembler-times "vmaxsb" 1 } } */
+/* { dg-final { scan-assembler-times "vminub" 1 } } */
+/* { dg-final { scan-assembler-times "vmaxub" 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-minmax-floatdouble.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-minmax-floatdouble.c
new file mode 100644
index 000..1185ce2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-minmax-floatdouble.c
@@ -0,0 +1,37 @@
+/* Verify that overloaded built-ins for vec_max with float and
+   double inputs for VSX produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include 
+
+vector float
+test1_min (vector float x, vector float y)
+{
+  return vec_min (x, y);
+}
+
+vector double
+test2_min (vector double x, vector double

[PATCH, rs6000] Fold vector logicals (eqv) in GIMPLE

2017-05-31 Thread Will Schmidt
Hi, 

Add support for early expansion of vector eqv built-ins.
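
For illustration only (not part of the patch): a scalar sketch of one
32-bit lane, written as the same two steps the folding below emits, an
exclusive OR into a temporary followed by a bitwise NOT. The function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit lane of vec_eqv: result bits are 1 where A and B agree.  */
uint32_t
lane_eqv (uint32_t a, uint32_t b)
{
  uint32_t temp = a ^ b;  /* BIT_XOR_EXPR */
  return ~temp;           /* BIT_NOT_EXPR */
}

/* Equal inputs give all ones; complementary inputs give all zeros.  */
void
lane_eqv_examples (void)
{
  assert (lane_eqv (0xF0F0F0F0u, 0xF0F0F0F0u) == 0xFFFFFFFFu);
  assert (lane_eqv (0xFFFF0000u, 0x0000FFFFu) == 0x00000000u);
  assert (lane_eqv (0xFF00FF00u, 0xFFFF0000u) == 0xFF0000FFu);
}
```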

Bootstraps currently running.  

OK for trunk?

Thanks,
-Will

  
[gcc]

2017-05-26  Will Schmidt  

* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling
for early expansion of vec_eqv.

[gcc/testsuite]

2017-05-26  Will Schmidt  

* testsuite/gcc.target/powerpc/fold-vec-logical-eqv-char.c: New.
* testsuite/gcc.target/powerpc/fold-vec-logical-eqv-float.c: New.
* testsuite/gcc.target/powerpc/fold-vec-logical-eqv-floatdouble.c: New.
* testsuite/gcc.target/powerpc/fold-vec-logical-eqv-int.c: New.
* testsuite/gcc.target/powerpc/fold-vec-logical-eqv-longlong.c: New.
* testsuite/gcc.target/powerpc/fold-vec-logical-eqv-short.c: New.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ce6cc1b..8adbc06 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -17388,6 +17388,26 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
gsi_replace (gsi, g, true);
return true;
   }
+/* Flavors of vec_eqv.  */
+case P8V_BUILTIN_EQV_V16QI:
+case P8V_BUILTIN_EQV_V8HI:
+case P8V_BUILTIN_EQV_V4SI:
+case P8V_BUILTIN_EQV_V4SF:
+case P8V_BUILTIN_EQV_V2DF:
+case P8V_BUILTIN_EQV_V2DI:
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   arg1 = gimple_call_arg (stmt, 1);
+   lhs = gimple_call_lhs (stmt);
+   tree temp = create_tmp_reg_or_ssa_name (TREE_TYPE (arg1));
+   gimple *g = gimple_build_assign (temp, BIT_XOR_EXPR, arg0, arg1);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_insert_before(gsi, g, GSI_SAME_STMT);
+   g = gimple_build_assign (lhs, BIT_NOT_EXPR, temp);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_replace (gsi, g, true);
+   return true;
+  }
 default:
   break;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-eqv-char.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-eqv-char.c
new file mode 100644
index 000..6810848
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-eqv-char.c
@@ -0,0 +1,28 @@
+/* Verify that overloaded built-ins for vec_eqv with char
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mpower8-vector -O2" } */
+
+#include 
+
+vector bool char
+test1 (vector bool char x, vector bool char y)
+{
+  return vec_eqv (x, y);
+}
+
+vector signed char
+test3 (vector signed char x, vector signed char y)
+{
+  return vec_eqv (x, y);
+}
+
+vector unsigned char
+test6 (vector unsigned char x, vector unsigned char y)
+{
+  return vec_eqv (x, y);
+}
+
+/* { dg-final { scan-assembler-times "xxleqv" 3 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-eqv-float.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-eqv-float.c
new file mode 100644
index 000..d206cfe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-eqv-float.c
@@ -0,0 +1,16 @@
+/* Verify that overloaded built-ins for vec_eqv with float
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mpower8-vector -O2" } */
+
+#include 
+
+vector float
+test1 (vector float x, vector float y)
+{
+  return vec_eqv (x, y);
+}
+
+/* { dg-final { scan-assembler-times "xxleqv" 1 } } */
diff --git 
a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-eqv-floatdouble.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-eqv-floatdouble.c
new file mode 100644
index 000..56b7cac
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-eqv-floatdouble.c
@@ -0,0 +1,22 @@
+/* Verify that overloaded built-ins for vec_eqv with float and
+   double inputs for VSX produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mpower8-vector -O2" } */
+
+#include 
+
+vector float
+test1 (vector float x, vector float y)
+{
+  return vec_eqv (x, y);
+}
+
+vector double
+test2 (vector double x, vector double y)
+{
+  return vec_eqv (x, y);
+}
+
+/* { dg-final { scan-assembler-times "xxleqv" 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-eqv-int.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-eqv-int.c
new file mode 100644
index 000..f5d292e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-eqv-int.c
@@ -0,0 +1,28 @@
+/* Verify that overloaded built-ins for vec_eqv with int
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mpower8-vector -O2" } */
+
+#include 
+
+vector bool int
+test1 (vector bool int x, vector bool int y)
+{
+  return vec_eqv (x, y);
+}
+
+vector signed int
+test3 (vector signed int x, vector si

[PATCH, rs6000] Fold vector shifts in GIMPLE

2017-05-31 Thread Will Schmidt
Hi, 

Add support for early expansion of vector shifts, including
vec_sl (shift left), vec_sr (shift right), vec_sra (shift
right algebraic), and vec_rl (rotate left).
Part of this includes adding the vector shift right instructions to
the list of those instructions having an unsigned second argument.

The VSR (vector shift right) folding is a bit more complex than
the others. This is because arg0 must be converted to unsigned before
the gimple RSHIFT_EXPR assignment is built, so that the shift is
logical rather than algebraic.
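
For illustration only (not part of the patch): a scalar sketch of one
32-bit lane showing the difference the unsigned conversion makes, plus
the rotate. The function names are hypothetical. Results are returned as
unsigned bit patterns so the C code avoids implementation-defined
conversions, and shift counts are required to be below the lane width.

```c
#include <assert.h>
#include <stdint.h>

/* Logical shift right (vec_sr): view the lane as unsigned first so that
   zeros, not copies of the sign bit, enter from the left.  */
uint32_t
lane_shift_right_logical (int32_t x, unsigned count)
{
  assert (count < 32);
  uint32_t ux = (uint32_t) x;   /* Convert arg0 to unsigned.  */
  return ux >> count;           /* Shift on the unsigned value.  */
}

/* Algebraic shift right (vec_sra): copies of the sign bit enter from the
   left.  Written with unsigned arithmetic so the result is portable.  */
uint32_t
lane_shift_right_algebraic (int32_t x, unsigned count)
{
  assert (count < 32);
  uint32_t ux = (uint32_t) x;
  return x < 0 ? ~(~ux >> count) : ux >> count;
}

/* Rotate left (vec_rl).  */
uint32_t
lane_rotate_left (uint32_t x, unsigned count)
{
  assert (count < 32);
  if (count == 0)
    return x;
  return (x << count) | (x >> (32 - count));
}

/* The two right shifts differ only for negative inputs.  */
void
lane_shift_examples (void)
{
  assert (lane_shift_right_logical (-8, 1) == 0x7FFFFFFCu);
  assert (lane_shift_right_algebraic (-8, 1) == 0xFFFFFFFCu);
  assert (lane_shift_right_logical (16, 2) == 4u);
  assert (lane_rotate_left (0x80000001u, 1) == 0x00000003u);
}
```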

[gcc]

2017-05-26  Will Schmidt  

* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling
for early expansion of vector shifts (sl,sr,sra,rl).
(builtin_function_type): Add vector shift right instructions
to the unsigned argument list.

[gcc/testsuite]

2017-05-26  Will Schmidt  

* testsuite/gcc.target/powerpc/fold-vec-shift-char.c: New.
* testsuite/gcc.target/powerpc/fold-vec-shift-int.c: New.
* testsuite/gcc.target/powerpc/fold-vec-shift-longlong.c: New.
* testsuite/gcc.target/powerpc/fold-vec-shift-short.c: New.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8adbc06..6ee0bfd 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -17408,6 +17408,76 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
gsi_replace (gsi, g, true);
return true;
   }
+/* Flavors of vec_rotate_left . */
+case ALTIVEC_BUILTIN_VRLB:
+case ALTIVEC_BUILTIN_VRLH:
+case ALTIVEC_BUILTIN_VRLW:
+case P8V_BUILTIN_VRLD:
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   arg1 = gimple_call_arg (stmt, 1);
+   lhs = gimple_call_lhs (stmt);
+   gimple *g = gimple_build_assign (lhs, LROTATE_EXPR, arg0, arg1);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_replace (gsi, g, true);
+   return true;
+  }
+  /* Flavors of vector shift right algebraic.  vec_sra{b,h,w} -> vsra{b,h,w}. 
*/
+case ALTIVEC_BUILTIN_VSRAB:
+case ALTIVEC_BUILTIN_VSRAH:
+case ALTIVEC_BUILTIN_VSRAW:
+case P8V_BUILTIN_VSRAD:
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   arg1 = gimple_call_arg (stmt, 1);
+   lhs = gimple_call_lhs (stmt);
+   gimple *g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, arg1);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_replace (gsi, g, true);
+   return true;
+  }
+   /* Flavors of vector shift left.  builtin_altivec_vsl{b,h,w} -> vsl{b,h,w}. 
 */
+case ALTIVEC_BUILTIN_VSLB:
+case ALTIVEC_BUILTIN_VSLH:
+case ALTIVEC_BUILTIN_VSLW:
+case P8V_BUILTIN_VSLD:
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   arg1 = gimple_call_arg (stmt, 1);
+   lhs = gimple_call_lhs (stmt);
+   gimple *g = gimple_build_assign (lhs, LSHIFT_EXPR, arg0, arg1);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_replace (gsi, g, true);
+   return true;
+  }
+/* Flavors of vector shift right. */
+case ALTIVEC_BUILTIN_VSRB:
+case ALTIVEC_BUILTIN_VSRH:
+case ALTIVEC_BUILTIN_VSRW:
+case P8V_BUILTIN_VSRD:
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   arg1 = gimple_call_arg (stmt, 1);
+   lhs = gimple_call_lhs (stmt);
+   gimple *g;
+   /* convert arg0 to unsigned */
+   arg0 = convert(unsigned_type_for(TREE_TYPE(arg0)),arg0);
+   tree arg0_uns = 
create_tmp_reg_or_ssa_name(unsigned_type_for(TREE_TYPE(arg0)));
+   g = gimple_build_assign(arg0_uns,arg0);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_insert_before (gsi, g, GSI_SAME_STMT);
+   /* convert lhs to unsigned and do the shift.  */
+   tree lhs_uns = 
create_tmp_reg_or_ssa_name(unsigned_type_for(TREE_TYPE(lhs)));
+   g = gimple_build_assign (lhs_uns, RSHIFT_EXPR, arg0_uns, arg1);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_insert_before (gsi, g, GSI_SAME_STMT);
+   /* convert lhs back to a signed type for the return. */
+   lhs_uns = convert(signed_type_for(TREE_TYPE(lhs)),lhs_uns);
+   g = gimple_build_assign(lhs,lhs_uns);
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_replace (gsi, g, true);
+   return true;
+  }
 default:
   break;
 }
@@ -19128,6 +19198,14 @@ builtin_function_type (machine_mode mode_ret, 
machine_mode mode_arg0,
   h.uns_p[2] = 1;
   break;
 
+   /* unsigned second arguments (vector shift right).  */
+case ALTIVEC_BUILTIN_VSRB:
+case ALTIVEC_BUILTIN_VSRH:
+case ALTIVEC_BUILTIN_VSRW:
+case P8V_BUILTIN_VSRD:
+  h.uns_p[2] = 1;
+  break;
+
 default:
   break;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-char.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-char.c
new file mode 100644
index 000..ebe91e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-char.c
@@ -0,0 +1,66 @@
+/* Verify that overloaded built-ins 

Re: Default std::vector default and move constructor

2017-05-31 Thread François Dumont

On 31/05/2017 12:34, Jonathan Wakely wrote:


Well in general the is_nothrow_default_constructible trait also tells
you if the type is default-constructible at all, but the form above
won't compile if it isn't default-constructible. In this specific case
it doesn't matter, because that constructor won't compile anyway if
the allocator isn't default-constructible.



Thanks for the explanation. For the moment I kept the noexcept calls;
tell me if that is fine in this new proposal.


   I'll complete testing and add a test on this value-initialization 
before commit if you agree.


So here is the new proposal with the additional test.

Unless I made a mistake, it revealed that restoring the explicit call to
_Bit_alloc_type() in the default constructor was not enough: G++ doesn't
transform it into a value-init when needed. I don't know whether this is a
compiler bug, but I had to do it just as presented in the Standard to
achieve the expected behavior.
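
Editorial note: for readers following along, here is a minimal sketch of what value-initialization guarantees for a type without a user-provided default constructor. It does not reproduce the behavior reported above and is not libstdc++ code; PlainAlloc and Holder are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical stand-in for a stateful allocator with no user-provided
// default constructor.
struct PlainAlloc { int state; };

// Hypothetical holder whose constructor value-initializes its base.
struct Holder : PlainAlloc
{
  Holder() : PlainAlloc() { }   // empty parentheses: value-initialization
};

// Construct a Holder in storage pre-filled with 0xFF bytes and return the
// allocator state observed afterwards.
inline int state_after_value_init()
{
  alignas(Holder) unsigned char storage[sizeof(Holder)];
  std::memset(storage, 0xFF, sizeof(storage));
  Holder* h = new (storage) Holder;   // runs Holder(), hence PlainAlloc()
  const int result = h->state;        // zero-initialized, so reading is fine
  h->~Holder();
  return result;
}
```

Pre-filling the storage and using placement new makes the zero-initialization observable; without the explicit PlainAlloc() in the mem-initializer list, the member would be left indeterminate.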


This value-init is specific to post-C++11, right? Maybe I could remove
the useless explicit call to _Bit_alloc_type() in pre-C++11 mode?


Now I wonder if I really introduced a regression in rb_tree...

Tested under Linux x86_64.

* include/bits/stl_bvector.h
(__fill_bvector(_Bit_type*, unsigned int, unsigned int, bool)):
Change signature.
(std::fill(_Bit_iterator, _Bit_iterator, bool)): Adapt.
(_Bvector_impl_data): New.
(_Bvector_impl): Inherits from latter.
(_Bvector_impl(_Bit_alloc_type&&)): Delete.
(_Bvector_impl(_Bvector_impl&&)): New, default.
(_Bvector_base()): Default.
(_Bvector_base(_Bvector_base&&)): Default.
(_Bvector_base::_M_move_data(_Bvector_base&&)): New.
(vector(vector&&, const allocator_type&)): Use latter.
(vector::operator=(vector&&)): Likewise.
(vector::vector()): Default.
(vector::assign(_InputIterator, _InputIterator)): Use
_M_assign_aux.
(vector::assign(initializer_list)): Likewise.
(vector::_M_initialize_value(bool)): New.
(vector(size_type, const bool&, const allocator_type&)): Use
latter.
(vector::_M_initialize_dispatch(_Integer, _Integer, __true_type)):
Likewise.
(vector::_M_fill_assign(size_t, bool)): Likewise.

Ok to commit ?

François
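
Editorial note on the __fill_bvector change in this patch: instead of assigning bit by bit, it builds one mask covering the bit range inside a word and applies it with a single OR or AND. Below is a standalone sketch of that technique, assuming unsigned long words. The names word_type, word_bits and fill_bits are hypothetical stand-ins, and the early return for an empty range is an addition of this sketch, made because shifting by the full word width is undefined.

```cpp
#include <cassert>
#include <climits>

typedef unsigned long word_type;                           // stand-in for _Bit_type
const unsigned word_bits = sizeof(word_type) * CHAR_BIT;   // stand-in for _S_word_bit

// Return word with bits [first, last) set or cleared.
// Precondition assumed by this sketch: last <= word_bits.
inline word_type fill_bits(word_type word, unsigned first, unsigned last, bool value)
{
  if (first >= last)
    return word;   // empty range; also avoids a shift by word_bits
  const word_type from_first = ~word_type(0) << first;                // bits >= first
  const word_type below_last = ~word_type(0) >> (word_bits - last);   // bits < last
  const word_type mask = from_first & below_last;
  return value ? (word | mask) : (word & ~mask);
}
```

For example, filling bits 4 to 7 of a zero word yields 0xF0, and clearing bits 5 and 6 of that result leaves 0x90.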
diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 78195c1..c441957 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -388,10 +388,17 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   { return __x + __n; }
 
   inline void
-  __fill_bvector(_Bit_iterator __first, _Bit_iterator __last, bool __x)
+  __fill_bvector(_Bit_type * __v,
+		 unsigned int __first, unsigned int __last, bool __x)
   {
-for (; __first != __last; ++__first)
-  *__first = __x;
+const _Bit_type __fmask = ~0ul << __first;
+const _Bit_type __lmask = ~0ul >> (_S_word_bit - __last);
+const _Bit_type __mask = __fmask & __lmask;
+
+if (__x)
+  *__v |= __mask;
+else
+  *__v &= ~__mask;
   }
 
   inline void
@@ -399,12 +406,18 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   {
 if (__first._M_p != __last._M_p)
   {
-	std::fill(__first._M_p + 1, __last._M_p, __x ? ~0 : 0);
-	__fill_bvector(__first, _Bit_iterator(__first._M_p + 1, 0), __x);
-	__fill_bvector(_Bit_iterator(__last._M_p, 0), __last, __x);
+	_Bit_type *__first_p = __first._M_p;
+	if (__first._M_offset != 0)
+	  __fill_bvector(__first_p++, __first._M_offset, _S_word_bit, __x);
+
+	__builtin_memset(__first_p, __x ? ~0 : 0,
+			 (__last._M_p - __first_p) * sizeof(_Bit_type));
+
+	if (__last._M_offset != 0)
+	  __fill_bvector(__last._M_p, 0, __last._M_offset, __x);
   }
 else
-  __fill_bvector(__first, __last, __x);
+  __fill_bvector(__first._M_p, __first._M_offset, __last._M_offset, __x);
   }
 
   template<typename _Alloc>
@@ -416,33 +429,70 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	_Bit_alloc_traits;
   typedef typename _Bit_alloc_traits::pointer _Bit_pointer;
 
-  struct _Bvector_impl
-  : public _Bit_alloc_type
+  struct _Bvector_impl_data
   {
 	_Bit_iterator 	_M_start;
 	_Bit_iterator 	_M_finish;
 	_Bit_pointer 	_M_end_of_storage;
 
+	_Bvector_impl_data() _GLIBCXX_NOEXCEPT
+	: _M_start(), _M_finish(), _M_end_of_storage()
+	{ }
+
+#if __cplusplus >= 201103L
+	_Bvector_impl_data(_Bvector_impl_data&& __x) noexcept
+	: _M_start(__x._M_start), _M_finish(__x._M_finish)
+	, _M_end_of_storage(__x._M_end_of_storage)
+	{ __x._M_reset(); }
+
+	void
+	_M_move_data(_Bvector_impl_data&& __x) noexcept
+	{
+	  this->_M_start = __x._M_start;
+	  this->_M_finish = __x._M_finish;
+	  this->_M_end_of_storage = __x._M_end_of_storage;
+	  __x._M_reset();
+	}
+#endif
+
+	void
+	_M_reset() _GLIBCXX_NOEXCEPT
+	{
+	  _M_start = _M_finish = _Bit_iterator();
+	  _M_end_of_storage = _Bit_pointer();
+	}
+  };
+
+  struct _Bvector_impl
+	: public _Bit_alloc_type, public _Bvector_impl_data
+	{
+	public:
+#if __cplusplus >= 201103L
+	  _Bvector_impl()
+	noexcept( noexcept

Re: [PATCH] DWARF: for variants, produce unsigned discr. when debug type is unsigned

2017-05-31 Thread Jason Merrill

On 05/30/2017 05:06 AM, Pierre-Marie de Rodat wrote:

Hello,

In Ada, the Character type is supposed to be unsigned.  However,
depending on the sign of C char types, GNAT can materialize it as a
signed type for code generation purposes.  When this is the case, GNAT
also attaches a debug type to it so it is represented as an unsigned base
type in the debug information.

This change adapts record variant parts processing in the DWARF back-end
so that when the debug type of a discriminant is unsigned while the
discriminant values themselves are signed, we output unsigned
discriminant values in DWARF.
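
Editorial note: the conversion described above can be pictured with a small scalar sketch, assuming an 8-bit discriminant stored as a signed value whose debug type is unsigned. The function name as_unsigned_discr is hypothetical and is not the dwarf2out.c code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: reinterpret a signed 8-bit discriminant value as the
// unsigned value that the unsigned debug type would describe.
inline unsigned as_unsigned_discr(std::int8_t stored)
{
  // The conversion is modulo 2^8, so negative values map to 128..255.
  return static_cast<std::uint8_t>(stored);
}
```

Under these assumptions, a stored value of -56 would be emitted as 200, while values in 0..127 are unchanged.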

Bootstrapped and reg-tested on x86_64-linux.  Ok to commit?  Thanks!

gcc/

* dwarf2out.c (get_discr_value): Call the get_debug_type hook on
the type of the input discriminant value.  Convert the
discriminant value if signedness varies.


OK.

Jason



Re: [C++ PATCH] PR c++/80812

2017-05-31 Thread Jason Merrill

On 05/25/2017 05:29 AM, Ville Voutilainen wrote:

Tested on Linux-x64, running full suite on Linux-ppc64. It seems fitting
to put the test into the library tests, we don't have separate tests
on the front-end side for __is_constructible, so I think adding such
would be a separate job.

2017-05-25  Ville Voutilainen  

 cp/

 PR c++/80812
 * method.c (constructible_expr): Strip array types before calling
 build_value_init.


OK.

Jason




Re: [PATCH] New C++ warning -Wcatch-value

2017-05-31 Thread Jason Merrill
On Tue, May 30, 2017 at 2:14 AM, Volker Reichelt wrote:
> On 24 May, Jason Merrill wrote:
>> On Mon, May 15, 2017 at 3:58 PM, Martin Sebor  wrote:
>>>> So how about the following then? I stayed with the catch part and added
>>>> a parameter to the warning to let the user decide on the warnings she/he
>>>> wants to get: -Wcatch-value=n.
>>>> -Wcatch-value=1 only warns for polymorphic classes that are caught by
>>>> value (to avoid slicing), -Wcatch-value=2 warns for all classes that
>>>> are caught by value (to avoid copies). And finally -Wcatch-value=3
>>>> warns for everything not caught by reference to find typos (like pointer
>>>> instead of reference) and bad coding practices.
>>>
>>> It seems reasonable to me.  I'm not too fond of multi-level
>>> warnings since few users take advantage of anything but the
>>> default, but this case is simple and innocuous enough that
>>> I don't think it can do harm.
>>
>>>> Bootstrapped and regtested on x86_64-pc-linux-gnu.
>>>> OK for trunk?
>>
>> OK.
>
> Committed.
>
>>>> If so, would it make sense to add -Wcatch-value=1 to -Wextra or even -Wall?
>>>> I would do this in a separate patch, because I haven't checked what that
>>>> would mean for the testsuite.
>>>
>>> I can't think of a use case for polymorphic slicing that's not
>>> harmful so unless there is a common one that escapes me, I'd say
>>> -Wall.
>>
>> Agreed.  But then you'll probably want to allow -Wno-catch-value to turn it 
>> off.
>
> So how about the following then?
> Bootstrapped and regtested on x86_64-pc-linux-gnu.
> OK for trunk?

OK, thanks.

Jason
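
Editorial note: the slicing that -Wcatch-value=1 is described as targeting in the thread above can be shown with a short sketch. The class names Base and Derived are hypothetical; per the description quoted above, the by-value handler is the kind of code the warning is meant to flag.

```cpp
#include <cassert>
#include <string>

struct Base
{
  virtual ~Base() { }
  virtual std::string name() const { return "Base"; }
};

struct Derived : Base
{
  std::string name() const override { return "Derived"; }
};

// Handler catches a polymorphic class by value: only the Base part of the
// thrown Derived object is copied into b (slicing).
inline std::string caught_by_value()
{
  std::string result;
  try { throw Derived(); }
  catch (Base b) { result = b.name(); }
  return result;
}

// Handler catches by reference: the dynamic type is preserved.
inline std::string caught_by_reference()
{
  std::string result;
  try { throw Derived(); }
  catch (const Base& b) { result = b.name(); }
  return result;
}
```

The first function reports "Base" even though a Derived was thrown, which is the silent loss of information the warning is intended to catch.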


[PATCH] rs6000: Don't write "nor" as (not (ior () ())) (PR80618)

2017-05-31 Thread Segher Boessenkool
The canonical RTL for "nor" is (and (not ()) (not ())), and that is
indeed what we use in boolccv2df3_internal1.  So, the splitter for
*vector_uneq should use that form, not (not (ior () ())), which
does not match any pattern.
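
Editorial note: the two RTL spellings compute the same value by De Morgan's law, so the change only affects which pattern the result matches. A scalar sketch in C++ follows, with hypothetical function names; it models the bitwise identity only, not RTL matching.

```cpp
#include <cassert>
#include <cstdint>

// nor written as (not (ior a b)).
inline std::uint32_t nor_as_not_ior(std::uint32_t a, std::uint32_t b)
{
  return ~(a | b);
}

// nor written as (and (not a) (not b)), the canonical form named above.
inline std::uint32_t nor_as_and_not(std::uint32_t a, std::uint32_t b)
{
  return ~a & ~b;
}
```

Both functions return the same result for every pair of inputs.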

Tested on powerpc64-linux {-m32,-m64}, and tested the pr50310-2.c
testcase on powerpc64le-linux (it failed before, works after).
Committing to trunk.


Segher


2017-05-31  Segher Boessenkool  

PR target/80618
* config/rs6000/vector.md (*vector_uneq): Write the nor in the
splitter result in the canonical way.

---
 gcc/config/rs6000/vector.md | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index c35875d..a3d53e7 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -582,13 +582,12 @@ (define_insn_and_split "*vector_uneq"
(gt:VEC_F (match_dup 2)
  (match_dup 1)))
(set (match_dup 0)
-   (not:VEC_F (ior:VEC_F (match_dup 3)
- (match_dup 4))))]
-  "
+   (and:VEC_F (not:VEC_F (match_dup 3))
+  (not:VEC_F (match_dup 4))))]
 {
   operands[3] = gen_reg_rtx (<MODE>mode);
   operands[4] = gen_reg_rtx (<MODE>mode);
-}")
+})
 
 (define_insn_and_split "*vector_ltgt"
   [(set (match_operand:VEC_F 0 "vfloat_operand" "")
-- 
1.9.3


