[PATCH][C++] Fix PR86763, wrong-code with TYPE_TYPELESS_STORAGE

2018-08-02 Thread Richard Biener


This fixes the PR by making sure CLASSTYPE_AS_BASE types inherit
TYPE_TYPELESS_STORAGE from the main type so that types that inherit
a type with TYPE_TYPELESS_STORAGE also get TYPE_TYPELESS_STORAGE
propagated.

Bootstrap & regtest running on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

2018-08-02  Richard Biener  

PR c++/86763
* class.c (layout_class_type): Copy TYPE_TYPELESS_STORAGE
to the CLASSTYPE_AS_BASE.

* g++.dg/torture/pr86763.C: New testcase.

Index: gcc/cp/class.c
===
--- gcc/cp/class.c  (revision 263209)
+++ gcc/cp/class.c  (working copy)
@@ -6243,6 +6243,7 @@ layout_class_type (tree t, tree *virtual
  bitsize_int (BITS_PER_UNIT)));
   SET_TYPE_ALIGN (base_t, rli->record_align);
   TYPE_USER_ALIGN (base_t) = TYPE_USER_ALIGN (t);
+  TYPE_TYPELESS_STORAGE (base_t) = TYPE_TYPELESS_STORAGE (t);
 
   /* Copy the non-static data members of T. This will include its
 direct non-virtual bases & vtable.  */
Index: gcc/testsuite/g++.dg/torture/pr86763.C
===
--- gcc/testsuite/g++.dg/torture/pr86763.C  (nonexistent)
+++ gcc/testsuite/g++.dg/torture/pr86763.C  (working copy)
@@ -0,0 +1,36 @@
+// { dg-do run }
+// { dg-additional-options "-fschedule-insns2 -fstrict-aliasing" }
+
+#include 
+#include 
+#include 
+struct ID {
+  uint64_t value;
+};
+uint64_t value(ID id) { return id.value; }
+uint64_t gen { 1000 };
+struct Msg {
+  uint64_t time;
+  ID id;
+};
+struct V {
+  V() { }
+  V(Msg const & msg) : msg(msg) { }
+  Msg & get() { return msg; }
+  Msg msg;
+  char pad[237 - sizeof(Msg)];
+};
+struct T : V { using V::V; };
+Msg init_msg() {
+  Msg msg;
+  timespec t;
+  clock_gettime(CLOCK_REALTIME, &t);
+  msg.time = t.tv_sec + t.tv_nsec;
+  msg.id.value = ++gen;
+  return msg;
+}
+int main() {
+  T t;
+  t = init_msg();
+  assert(value(t.get().id) == 1001);
+}


Re: [PATCH, Ada] Makefile patches from initial RISC-V cross/native build.

2018-08-02 Thread Pierre-Marie de Rodat

Hello Jim,

On 07/06/2018 01:45 AM, Jim Wilson wrote:

These are some patches I needed to complete my cross build of a native
riscv linux Ada compiler.  Some paths were different on the build machine
and host machine.  I needed to pass options into gnatmake to work around this,
and that required fixing some makefile rules to use $(GNATMAKE) instead of
calling gnatmake directly.


At AdaCore, we realized that this patch broke our canadian builds: in 
the cases your patch involved, the intent is to call the build 
toolchain, not the host one (to which GNATMAKE expands to).


Given that you described this change as a workaround for cross-machine 
path issues, I was wondering: do you still need this? If not, we should 
probably revert it right now, otherwise we should investigate why you 
needed it in the first place. What do you think?


--
Pierre-Marie de Rodat


Re: [Patch, fortran] A first small step towards CFI descriptor implementation

2018-08-02 Thread Paul Richard Thomas
The ordering is fixed by the ordering of types in the CFI part of the
standard. Intrinsic types, then derived types and finally all the
others.

For the time being I will interchange character and derived types in
the conversion functions.

Cheers

Paul
On Tue, 31 Jul 2018 at 18:10, Janus Weil  wrote:
>
> Hi Paul,
>
> 2018-07-31 14:06 GMT+02:00 Paul Richard Thomas 
> :
> > Daniel Celis Garza and Damian Rouson have developed a runtime library
> > and include file for the TS 29113 and F2018 C descriptors.
> > https://github.com/sourceryinstitute/ISO_Fortran_binding
> >
> > The ordering of types is different to the current 'bt' enum in
> > libgfortran.h. This patch interchanges BT_DERIVED and BT_CHARACTER to
> > fix this.
>
> is this ordering actually fixed by the F18 standard, or is there any
> other reason why it needs to be like this? What's wrong with
> gfortran's current ordering?
>
> Cheers,
> Janus



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein


Re: [PATCH][x86] Match movss and movsd "blend" instructions

2018-08-02 Thread Allan Sandfeld Jensen
On Mittwoch, 1. August 2018 18:51:41 CEST Marc Glisse wrote:
> On Wed, 1 Aug 2018, Allan Sandfeld Jensen wrote:
> >  extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__,
> > 
> > __artificial__))
> > 
> >  _mm_move_sd (__m128d __A, __m128d __B)
> >  {
> > 
> > -  return (__m128d) __builtin_ia32_movsd ((__v2df)__A, (__v2df)__B);
> > +  return __extension__ (__m128d)(__v2df){__B[0],__A[1]};
> > 
> >  }
> 
> If the goal is to have it represented as a VEC_PERM_EXPR internally, I
> wonder if we should be explicit and use __builtin_shuffle instead of
> relying on some forwprop pass to transform it. Maybe not, just asking. And
> the answer need not even be the same for _mm_move_sd and _mm_move_ss.

I wrote it this way because this pattern could later also be used for the 
other _ss intrinsics, such as _mm_add_ss, where a _builtin_shuffle could not. 
To match the other intrinsics the logic that tries to match vector 
construction just needs to be extended to try merge patterns even if one of 
the subexpressions is not simple.

'Allan




[PATCH] Handle TARGET_SPLIT_COMPLEX_ARG in default VA_ARG_EXPR implementation

2018-08-02 Thread Chung-Lin Tang

Hi, during testing of the new in-review C-SKY port, we discovered some FAILs 
due to
the default va_args gimplifying (targhooks.c:std_gimplify_va_arg_expr) not 
properly
having the logic to handle the TARGET_SPLIT_COMPLEX_ARG hook.

It appears that all other targets that happen to use TARGET_SPLIT_COMPLEX_ARG 
also
defines TARGET_GIMPLIFY_VA_ARG_EXPR, so this went undiscovered.
The C-SKY port happens to only have TARGET_SPLIT_COMPLEX_ARG defined.

This patch completes this handling in std_gimplify_va_arg_expr(), though at the
moment it's only really exercised by the C-SKY port, which we tested to fix 
several
_Complex va_args related FAILs and without regressions.
(the patch fragment is actually adapted from the xtensa port, FWIW)

Is this okay for trunk?

Thanks,
Chung-Lin

2018-08-02  Chung-Lin Tang  

* targhooks.c (std_gimplify_va_arg_expr): Properly handle case of when
TARGET_SPLIT_COMPLEX_ARG is defined.
Index: targhooks.c
===
--- targhooks.c (revision 263244)
+++ targhooks.c (working copy)
@@ -2154,6 +2154,23 @@ std_gimplify_va_arg_expr (tree valist, tree type,
   if (indirect)
 type = build_pointer_type (type);
 
+  if (targetm.calls.split_complex_arg
+  && TREE_CODE (type) == COMPLEX_TYPE
+  && targetm.calls.split_complex_arg (type))
+{
+  tree real_part, imag_part;
+
+  real_part = std_gimplify_va_arg_expr (valist,
+   TREE_TYPE (type), pre_p, NULL);
+  real_part = get_initialized_tmp_var (real_part, pre_p, NULL);
+
+  imag_part = std_gimplify_va_arg_expr (unshare_expr (valist),
+   TREE_TYPE (type), pre_p, NULL);
+  imag_part = get_initialized_tmp_var (imag_part, pre_p, NULL);
+
+  return build2 (COMPLEX_EXPR, type, real_part, imag_part);
+   }
+
   align = PARM_BOUNDARY / BITS_PER_UNIT;
   boundary = targetm.calls.function_arg_boundary (TYPE_MODE (type), type);
 


Re: [PATCH][x86] Match movss and movsd "blend" instructions

2018-08-02 Thread Richard Biener
On Thu, Aug 2, 2018 at 11:12 AM Allan Sandfeld Jensen
 wrote:
>
> On Mittwoch, 1. August 2018 18:51:41 CEST Marc Glisse wrote:
> > On Wed, 1 Aug 2018, Allan Sandfeld Jensen wrote:
> > >  extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__,
> > >
> > > __artificial__))
> > >
> > >  _mm_move_sd (__m128d __A, __m128d __B)
> > >  {
> > >
> > > -  return (__m128d) __builtin_ia32_movsd ((__v2df)__A, (__v2df)__B);
> > > +  return __extension__ (__m128d)(__v2df){__B[0],__A[1]};
> > >
> > >  }
> >
> > If the goal is to have it represented as a VEC_PERM_EXPR internally, I
> > wonder if we should be explicit and use __builtin_shuffle instead of
> > relying on some forwprop pass to transform it. Maybe not, just asking. And
> > the answer need not even be the same for _mm_move_sd and _mm_move_ss.
>
> I wrote it this way because this pattern could later also be used for the
> other _ss intrinsics, such as _mm_add_ss, where a _builtin_shuffle could not.
> To match the other intrinsics the logic that tries to match vector
> construction just needs to be extended to try merge patterns even if one of
> the subexpressions is not simple.

The question is what users expect and get when they use -O0 with intrinsics?

Richard.

> 'Allan
>
>


[PATCH] Cherry-pick compiler-rt revision 338606 (PR sanitizer/86022).

2018-08-02 Thread Martin Liška
Hi.

I'm sending following cherry pick that I'm going to install. And I'll backport 
that
then.

Martin

Fix sizeof(struct pthread) in glibc 2.14.

libsanitizer/ChangeLog:

2018-08-02  Martin Liska  

PR sanitizer/86022
* sanitizer_common/sanitizer_linux_libcdep.cc (ThreadDescriptorSize):
Cherry-pick compiler-rt revision 338606.
---
 libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cc b/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cc
index d27a8435802..3b1a2174c46 100644
--- a/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cc
+++ b/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cc
@@ -235,7 +235,7 @@ uptr ThreadDescriptorSize() {
 val = FIRST_32_SECOND_64(1168, 1776);
   else if (minor == 11 || (minor == 12 && patch == 1))
 val = FIRST_32_SECOND_64(1168, 2288);
-  else if (minor <= 13)
+  else if (minor <= 14)
 val = FIRST_32_SECOND_64(1168, 2304);
   else
 val = FIRST_32_SECOND_64(1216, 2304);



[PATCH] Fix gcov misleading error (PR gcov-profile/86817).

2018-08-02 Thread Martin Liška
Hi.

As seen in the PR, we must first read all functions into global array functions
and then processing of these can happen. Currently we process functions
multiple times.

I'm planning to install that after it survives regression tests.

Martin

gcc/ChangeLog:

2018-08-02  Martin Liska  

PR gcov-profile/86817
* gcov.c (process_all_functions): New function.
(main): Call it.
(process_file): Move functions processing to
process_all_functions.
---
 gcc/gcov.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)


diff --git a/gcc/gcov.c b/gcc/gcov.c
index 78a3e0e19e9..43dfc9a4b2c 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -543,6 +543,7 @@ static int process_args (int, char **);
 static void print_usage (int) ATTRIBUTE_NORETURN;
 static void print_version (void) ATTRIBUTE_NORETURN;
 static void process_file (const char *);
+static void process_all_functions (void);
 static void generate_results (const char *);
 static void create_file_names (const char *);
 static char *canonicalize_name (const char *);
@@ -798,6 +799,7 @@ main (int argc, char **argv)
 
   if (flag_intermediate_format || argno == argc - 1)
 	{
+	  process_all_functions ();
 	  generate_results (argv[argno]);
 	  release_structures ();
 	}
@@ -1145,11 +1147,14 @@ process_file (const char *file_name)
 {
   create_file_names (file_name);
   read_graph_file ();
-  if (functions.empty ())
-return;
-
   read_count_file ();
+}
 
+/* Process all functions in all files.  */
+
+static void
+process_all_functions (void)
+{
   hash_map fn_map;
 
   /* Identify group functions.  */
@@ -1226,7 +1231,6 @@ process_file (const char *file_name)
 	  if (fn->is_group)
 	fn->lines.resize (fn->end_line - fn->start_line + 1);
 
-
 	  solve_flow_graph (fn);
 	  if (fn->has_catch)
 	find_exception_blocks (fn);



Re: [PATCH] Print default options selection for -march,-mcpu and -mtune for aarch64 (PR driver/83193).

2018-08-02 Thread Richard Earnshaw (lists)
On 18/07/18 16:48, Martin Liška wrote:
> Hi.
> 
> This is aarch64 fix for PR83193. It's about setting of default options
> so that --help=target -Q prints proper numbers:
> 
> Now this is seen on my cross-compiler:
> 
> --- /home/marxin/Downloads/options-2-before.txt   2018-07-18 
> 14:53:11.658146543 +0200
> +++ /home/marxin/Downloads/options-2.txt  2018-07-18 14:52:30.113274284 
> +0200
> @@ -1,10 +1,10 @@
>  The following options are target specific:
>-mabi=ABI  lp64
> -  -march=ARCH
> +  -march=armv8-a

So we have

-mabi=ABI   lp64

but

-march= armv8-a
   ^ blank

Isn't that inconsistent?

R.

>-mbig-endian   [disabled]
>-mbionic   [disabled]
>-mcmodel=  small
> -  -mcpu=CPU  
> +  -mcpu= generic
>-mfix-cortex-a53-835769[enabled]
>-mfix-cortex-a53-843419[enabled]
>-mgeneral-regs-only[disabled]
> @@ -19,7 +19,7 @@
>-msve-vector-bits=Nscalable
>-mtls-dialect= desc
>-mtls-size=24
> -  -mtune=CPU 
> +  -mtune=generic
>-muclibc   [disabled]
> 
> May I please ask ARM folks to test the patch?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2018-07-18  Martin Liska  
> 
> PR driver/83193
>   * config/aarch64/aarch64.c (aarch64_override_options_internal):
> Set default values for x_aarch64_*_string strings.
>   * config/aarch64/aarch64.opt: Remove --{march,mcpu,mtune}==
> prefix.
> ---
>  gcc/config/aarch64/aarch64.c   | 7 +++
>  gcc/config/aarch64/aarch64.opt | 6 +++---
>  2 files changed, 10 insertions(+), 3 deletions(-)
> 
> 
> 
> 0001-Print-default-options-selection-for-march-mcpu-and-m.patch
> 
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 6fa03e4b091..d48e6278efa 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -10713,6 +10713,13 @@ aarch64_override_options_internal (struct 
> gcc_options *opts)
>&& opts->x_optimize >= aarch64_tune_params.prefetch->default_opt_level)
>  opts->x_flag_prefetch_loop_arrays = 1;
>  
> +  if (opts->x_aarch64_arch_string == NULL)
> +opts->x_aarch64_arch_string = selected_arch->name;
> +  if (opts->x_aarch64_cpu_string == NULL)
> +opts->x_aarch64_cpu_string = selected_cpu->name;
> +  if (opts->x_aarch64_tune_string == NULL)
> +opts->x_aarch64_tune_string = selected_tune->name;
> +
>aarch64_override_options_after_change_1 (opts);
>  }
>  
> diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
> index 1426b45ff0f..7f0b65de37b 100644
> --- a/gcc/config/aarch64/aarch64.opt
> +++ b/gcc/config/aarch64/aarch64.opt
> @@ -117,15 +117,15 @@ Enum(aarch64_tls_size) String(48) Value(48)
>  
>  march=
>  Target RejectNegative ToLower Joined Var(aarch64_arch_string)
> --march=ARCH  Use features of architecture ARCH.
> +Use features of architecture ARCH.
>  
>  mcpu=
>  Target RejectNegative ToLower Joined Var(aarch64_cpu_string)
> --mcpu=CPUUse features of and optimize for CPU.
> +Use features of and optimize for CPU.
>  
>  mtune=
>  Target RejectNegative ToLower Joined Var(aarch64_tune_string)
> --mtune=CPU   Optimize for CPU.
> +Optimize for CPU.
>  
>  mabi=
>  Target RejectNegative Joined Enum(aarch64_abi) Var(aarch64_abi) 
> Init(AARCH64_ABI_DEFAULT)
> 



Re: [PATCH] Print default options selection for -march,-mcpu and -mtune for aarch64 (PR driver/83193).

2018-08-02 Thread Martin Liška
On 08/02/2018 11:39 AM, Richard Earnshaw (lists) wrote:
> On 18/07/18 16:48, Martin Liška wrote:
>> Hi.
>>
>> This is aarch64 fix for PR83193. It's about setting of default options
>> so that --help=target -Q prints proper numbers:
>>
>> Now this is seen on my cross-compiler:
>>
>> --- /home/marxin/Downloads/options-2-before.txt  2018-07-18 
>> 14:53:11.658146543 +0200
>> +++ /home/marxin/Downloads/options-2.txt 2018-07-18 14:52:30.113274284 
>> +0200
>> @@ -1,10 +1,10 @@
>>  The following options are target specific:
>>-mabi=ABI lp64
>> -  -march=ARCH   
>> +  -march=   armv8-a
> 
> So we have
> 
> -mabi=ABI lp64
> 
> but
> 
> -march=   armv8-a
>^ blank
> 
> Isn't that inconsistent?

It is probably, in this case I would remove 'ABI' from -mabi option. It's 
explained bellow
what are possible options:

  Known AArch64 ABIs (for use with the -mabi= option):
ilp32 lp64

Similarly for:
  -moverride=STRING   Power users only! Override CPU optimization 
parameters.
  -msve-vector-bits=N Set the number of bits in an SVE vector register 
to N.

It's more common to  notation, there are some samples from --help=common:

  -fmax-errors=   Maximum number of errors to report.
  -fmessage-length=   Limit diagnostics to  characters per 
line.  0 suppresses line-wrapping.
  -fira-region=[one|all|mixed] Set regions for IRA.
  -fira-verbose=  Control IRA's level of diagnostic messages.
  -flifetime-dse=<0,2>This option lacks documentation.
  -fstack-limit-register= Trap if the stack goes past .
  -fstack-limit-symbol= Trap if the stack goes past symbol .

Are you fine with the suggested approach?

Martin

> 
> R.
> 
>>-mbig-endian  [disabled]
>>-mbionic  [disabled]
>>-mcmodel= small
>> -  -mcpu=CPU 
>> +  -mcpu=generic
>>-mfix-cortex-a53-835769   [enabled]
>>-mfix-cortex-a53-843419   [enabled]
>>-mgeneral-regs-only   [disabled]
>> @@ -19,7 +19,7 @@
>>-msve-vector-bits=N   scalable
>>-mtls-dialect=desc
>>-mtls-size=   24
>> -  -mtune=CPU
>> +  -mtune=   generic
>>-muclibc  [disabled]
>>
>> May I please ask ARM folks to test the patch?
>> Thanks,
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2018-07-18  Martin Liska  
>>
>> PR driver/83193
>>  * config/aarch64/aarch64.c (aarch64_override_options_internal):
>> Set default values for x_aarch64_*_string strings.
>>  * config/aarch64/aarch64.opt: Remove --{march,mcpu,mtune}==
>> prefix.
>> ---
>>  gcc/config/aarch64/aarch64.c   | 7 +++
>>  gcc/config/aarch64/aarch64.opt | 6 +++---
>>  2 files changed, 10 insertions(+), 3 deletions(-)
>>
>>
>>
>> 0001-Print-default-options-selection-for-march-mcpu-and-m.patch
>>
>>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 6fa03e4b091..d48e6278efa 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -10713,6 +10713,13 @@ aarch64_override_options_internal (struct 
>> gcc_options *opts)
>>&& opts->x_optimize >= 
>> aarch64_tune_params.prefetch->default_opt_level)
>>  opts->x_flag_prefetch_loop_arrays = 1;
>>  
>> +  if (opts->x_aarch64_arch_string == NULL)
>> +opts->x_aarch64_arch_string = selected_arch->name;
>> +  if (opts->x_aarch64_cpu_string == NULL)
>> +opts->x_aarch64_cpu_string = selected_cpu->name;
>> +  if (opts->x_aarch64_tune_string == NULL)
>> +opts->x_aarch64_tune_string = selected_tune->name;
>> +
>>aarch64_override_options_after_change_1 (opts);
>>  }
>>  
>> diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
>> index 1426b45ff0f..7f0b65de37b 100644
>> --- a/gcc/config/aarch64/aarch64.opt
>> +++ b/gcc/config/aarch64/aarch64.opt
>> @@ -117,15 +117,15 @@ Enum(aarch64_tls_size) String(48) Value(48)
>>  
>>  march=
>>  Target RejectNegative ToLower Joined Var(aarch64_arch_string)
>> --march=ARCH Use features of architecture ARCH.
>> +Use features of architecture ARCH.
>>  
>>  mcpu=
>>  Target RejectNegative ToLower Joined Var(aarch64_cpu_string)
>> --mcpu=CPU   Use features of and optimize for CPU.
>> +Use features of and optimize for CPU.
>>  
>>  mtune=
>>  Target RejectNegative ToLower Joined Var(aarch64_tune_string)
>> --mtune=CPU  Optimize for CPU.
>> +Optimize for CPU.
>>  
>>  mabi=
>>  Target RejectNegative Joined Enum(aarch64_abi) Var(aarch64_abi) 
>> Init(AARCH64_ABI_DEFAULT)
>>
> 



Re: [14/46] Make STMT_VINFO_VEC_STMT a stmt_vec_info

2018-08-02 Thread Richard Sandiford
"H.J. Lu"  writes:
> On Tue, Jul 24, 2018 at 2:58 AM, Richard Sandiford
>  wrote:
>> This patch changes STMT_VINFO_VEC_STMT from a gimple stmt to a
>> stmt_vec_info and makes the vectorizable_* routines pass back
>> a stmt_vec_info to vect_transform_stmt.
>>
>>
>> 2018-07-24  Richard Sandiford  
>>
>> gcc/
>> * tree-vectorizer.h (_stmt_vec_info::vectorized_stmt): Change from
>> a gimple stmt to a stmt_vec_info.
>> (vectorizable_condition, vectorizable_live_operation)
>> (vectorizable_reduction, vectorizable_induction): Pass back the
>> vectorized statement as a stmt_vec_info.
>> * tree-vect-data-refs.c (vect_record_grouped_load_vectors): Update
>> use of STMT_VINFO_VEC_STMT.
>> * tree-vect-loop.c (vect_create_epilog_for_reduction): Likewise,
>> accumulating the inner phis that feed the STMT_VINFO_VEC_STMT
>> as stmt_vec_infos rather than gimple stmts.
>> (vectorize_fold_left_reduction): Change vec_stmt from a gimple stmt
>> to a stmt_vec_info.
>> (vectorizable_live_operation): Likewise.
>> (vectorizable_reduction, vectorizable_induction): Likewise,
>> updating use of STMT_VINFO_VEC_STMT.
>> * tree-vect-stmts.c (vect_get_vec_def_for_operand_1): Update use
>> of STMT_VINFO_VEC_STMT.
>> (vect_build_gather_load_calls, vectorizable_bswap, vectorizable_call)
>> (vectorizable_simd_clone_call, vectorizable_conversion)
>> (vectorizable_assignment, vectorizable_shift, vectorizable_operation)
>> (vectorizable_store, vectorizable_load, vectorizable_condition)
>> (vectorizable_comparison, can_vectorize_live_stmts): Change vec_stmt
>> from a gimple stmt to a stmt_vec_info.
>> (vect_transform_stmt): Update use of STMT_VINFO_VEC_STMT.  Pass a
>> pointer to a stmt_vec_info to the vectorizable_* routines.
>>
>
> This caused:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86824

Should be fixed by r263222 (tested on an x86 SPEC2k6 run).
Sorry again for the breakage.

Richard


Re: [gen/AArch64] Generate helpers for substituting iterator values into pattern names

2018-08-02 Thread Richard Earnshaw (lists)
On 13/07/18 10:15, Richard Sandiford wrote:
> Given a pattern like:
> 
>   (define_insn "aarch64_frecpe" ...)
> 
> the SVE ACLE implementation wants to generate the pattern for a
> particular (non-constant) mode.  This patch automatically generates
> helpers to do that, specifically:
> 
>   // Return CODE_FOR_nothing on failure.
>   insn_code maybe_code_for_aarch64_frecpe (machine_mode);
> 
>   // Assert that the code exists.
>   insn_code code_for_aarch64_frecpe (machine_mode);
> 
>   // Return NULL_RTX on failure.
>   rtx maybe_gen_aarch64_frecpe (machine_mode, rtx, rtx);
> 
>   // Assert that generation succeeds.
>   rtx gen_aarch64_frecpe (machine_mode, rtx, rtx);
> 
> Many patterns don't have sensible names when all <...>s are removed.
> E.g. "2" would give a base name "2".  The new functions
> therefore require explicit opt-in, which should also help to reduce
> code bloat.
> 
> The (arbitrary) opt-in syntax I went for was to prefix the pattern
> name with '@', similarly to the existing '*' marker.
> 
> The patch also makes config/aarch64 use the new routines in cases where
> they obviously apply.  This was mostly straight-forward, but it seemed
> odd that we defined:
> 
>aarch64_reload_movcp<...>
> 
> but then only used it with DImode, never SImode.  If we should be
> using Pmode instead of DImode, then that's a simple change,
> but should probably be a separate patch.
> 
> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
> and x86_64-linux-gnu.  I think I can self-approve the gen* bits,
> but OK for the AArch64 parts?
> 
> Any objections to this approach or syntax?

So if I have two or more iterator rules in the MD, with a similar
'shape', eg

(define_insn "@fred"...

(define_insn "@fred"...

this will generate a single factory function?  If so, that sounds sensible.

(havent looked at the details of the patch, but if James has already
done that, then that should be enough).

R.

> 
> Richard
> 
> 
> 2018-07-13  Richard Sandiford  
> 
> gcc/
>   * doc/md.texi: Expand the documentation of instruction names
>   to mention port-local uses.  Document '@' in pattern names.
>   * read-md.h (overloaded_instance, overloaded_name): New structs.
>   (mapping): Declare.
>   (md_reader::handle_overloaded_name): New member function.
>   (md_reader::get_overloads): Likewise.
>   (md_reader::m_first_overload): New member variable.
>   (md_reader::m_next_overload_ptr): Likewise.
>   (md_reader::m_overloads_htab): Likewise.
>   * read-md.c (md_reader::md_reader): Initialize m_first_overload,
>   m_next_overload_ptr and m_overloads_htab.
>   * read-rtl.c (iterator_group): Add "type" and "get_c_token" fields.
>   (get_mode_token, get_code_token, get_int_token): New functions.
>   (map_attr_string): Add an optional argument that passes back
>   the associated iterator.
>   (overloaded_name_hash, overloaded_name_eq_p, named_rtx_p):
>   (md_reader::handle_overloaded_name, add_overload_instance): New
>   functions.
>   (apply_iterators): Handle '@' names.  Report an error if '@'
>   is used without iterators.
>   (initialize_iterators): Initialize the new iterator_group fields.
>   * gencodes.c (handle_overloaded_code_for): New function.
>   (main): Use it to print declarations of maybe_code_for_* functions
>   and inline definitions of code_for_*.
>   * genflags.c (emit_overloaded_gen_proto): New function.
>   (main): Use it to print declarations of maybe_gen_* functions
>   and inline definitions of gen_*.
>   * genemit.c (print_overload_arguments, print_overload_test)
>   (handle_overloaded_code_for, handle_overloaded_gen): New functions.
>   (main): Use it to print definitions of maybe_code_for_* and
>   maybe_gen_* functions.
>   * config/aarch64/aarch64.c (aarch64_split_128bit_move): Use
>   gen_aarch64_mov{low,high}_di and gen_aarch64_movdi_{low,high}
>   instead of explicit mode checks.
>   (aarch64_split_simd_combine): Likewise gen_aarch64_simd_combine.
>   (aarch64_split_simd_move): Likewise gen_aarch64_split_simd_mov.
>   (aarch64_emit_load_exclusive): Likewise gen_aarch64_load_exclusive.
>   (aarch64_emit_store_exclusive): Likewise gen_aarch64_store_exclusive.
>   (aarch64_expand_compare_and_swap): Likewise
>   gen_aarch64_compare_and_swap and gen_aarch64_compare_and_swap_lse
>   (aarch64_gen_atomic_cas): Likewise gen_aarch64_atomic_cas.
>   (aarch64_emit_atomic_swap): Likewise gen_aarch64_atomic_swp.
>   (aarch64_constant_pool_reload_icode): Delete.
>   (aarch64_secondary_reload): Use code_for_aarch64_reload_movcp
>   instead of aarch64_constant_pool_reload_icode.  Use
>   code_for_aarch64_reload_mov instead of explicit mode checks.
>   (rsqrte_type, get_rsqrte_type, rsqrts_type, get_rsqrts_type): Delete.
>   (aarch64_emit_approx_sqrt): Use gen_aarch64_rsqrte instead of
>   get_rsqrte_type and 

Re: [PATCH] Make strlen range computations more conservative

2018-08-02 Thread Bernd Edlinger
Hi,

this is an update of the patch to prevent unsafe optimizations due to strlen 
range assuming
always zero-terminated char arrays.

Since the previous version I do no longer try to locate the outermost char 
array,
but just bail out if there is a typecast, that means, supposed we have a 
2-dimensional
array, char a[x][y], strlen (s.a[x]) may assume a[x] is zero-terminated, if the 
optimization
is enabled.  strlen ((char*)s.a) involves a type cast and should assume nothing.

Additionally due to the discussion, I came to the conclusion that this strlen 
range optimization
should only be used when enabled with -fassume-zero-terminated-char-arrays or 
-Ofast.

Note that I included the test case with the removed assertion as
gcc/testsuite/gcc.dg/strlenopt-57.c

Initially it would only miscompile with -Ofast, but with r263018 aka
PR tree-optimization/86043, PR tree-optimization/86042 it was miscompiled with 
-O3 as well.

I located the reason in gcc/tree-ssa-strlen.c (get_min_string_length) which did 
not
check if the string constant is in fact zero terminated:

@@ -3192,7 +3194,9 @@ get_min_string_length (tree rhs, bool *full_string
   && TREE_READONLY (rhs))
 rhs = DECL_INITIAL (rhs);

-  if (rhs && TREE_CODE (rhs) == STRING_CST)
+  if (rhs && TREE_CODE (rhs) == STRING_CST
+  && compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (rhs)),
+  TREE_STRING_LENGTH (rhs)) >= 0)
 {
   *full_string_p = true;
   return strlen (TREE_STRING_POINTER (rhs));

Fortunately this also shows a way how to narrow strlen return value 
expectations when
we are able to positively prove that a string must be zero terminated.



Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
gcc:
2018-08-01  Bernd Edlinger  

* common.opt: Add new optimization option
-fassume-zero-terminated-char-arrays.
* opts.c (default_options): Enable -fassume-zero-terminated-char-arrays
with -Ofast.
* gimple-fold.c (get_inner_char_array_unless_typecast): Helper
function for strlen range estimations.
(get_range_strlen): Use get_inner_char_array_unless_typecast.
* gimple-fold.h (get_inner_char_array_unless_typecast): Declare.
* tree-ssa-strlen.c (maybe_set_strlen_range): Likewise.
(get_min_string_length): Avoid not NUL terminated string literals.
* doc/invoke.texi: Document -fassume-zero-terminated-char-arrays.

testsuite:
2018-08-01  Bernd Edlinger  

* gcc.dg/tree-ssa/builtin-snprintf-warn-1.c: Add
-fassume-zero-terminated-char-arrays.
* gcc.dg/tree-ssa/builtin-snprintf-warn-2.c: Likewise
* gcc.dg/tree-ssa/builtin-snprintf-warn-3.c: Likewise
* gcc.dg/tree-ssa/builtin-sprintf-warn-14.c: Likewise
* gcc.dg/tree-ssa/builtin-sprintf-warn-2.c: Likewise
* gcc.dg/tree-ssa/pr79376.c: Likewise
* gcc.dg/Wstringop-overflow-5.c: Likewise.
* gcc.dg/Wstringop-truncation-3.c: Likewise.
* gcc.dg/Wstringop-truncation.c: Likewise.
* gcc.dg/pr79538.c: Likewise.
* gcc.dg/pr83373.c: Likewise.
* gcc.dg/strlenopt-36.c: Likewise.
* gcc.dg/strlenopt-40.c: Adjust test expectations.
* gcc.dg/strlenopt-45.c: Likewise.
* gcc.dg/strlenopt-48.c: Likewise.
* gcc.dg/strlenopt-51.c: Likewise.
* gcc.dg/strlenopt-55.c: New test.
* gcc.dg/strlenopt-56.c: New test.
* gcc.dg/strlenopt-57.c: New test.

Index: gcc/common.opt
===
--- gcc/common.opt	(revision 263029)
+++ gcc/common.opt	(working copy)
@@ -1025,6 +1025,10 @@ fsanitize-undefined-trap-on-error
 Common Driver Report Var(flag_sanitize_undefined_trap_on_error) Init(0)
 Use trap instead of a library function for undefined behavior sanitization.
 
+fassume-zero-terminated-char-arrays
+Common Var(flag_assume_zero_terminated_char_arrays) Optimization Init(0)
+Optimize under the assumption that char arrays must always be zero terminated.
+
 fasynchronous-unwind-tables
 Common Report Var(flag_asynchronous_unwind_tables) Optimization
 Generate unwind tables that are exact at each instruction boundary.
Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c	(revision 263029)
+++ gcc/gimple-fold.c	(working copy)
@@ -1257,7 +1257,45 @@ gimple_fold_builtin_memset (gimple_stmt_iterator *
   return true;
 }
 
+/* Obtain the inner char array for strlen range estimations.
+   Return NULL if ARG is not a char array, or if the inner reference
+   chain goes through a type cast.  */
 
+tree
+get_inner_char_array_unless_typecast (tree arg)
+{
+  if (!flag_assume_zero_terminated_char_arrays)
+return NULL_TREE;
+
+  /* We handle arrays of integer types.  */
+  if (TREE_CODE (TREE_TYPE (arg)) != ARRAY_TYPE
+  || TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) != INTEGER_TYPE
+  || TYPE_MODE (TREE_TYPE (TREE_TYPE (arg)

[PATCH] [aarch64] Fix falkor pipeline description for dup

2018-08-02 Thread Siddhesh Poyarekar
There was a typo in the pipeline description where DUP was assigned to
the vector pipes for quad mode ops when it really only uses the VTOG
pipes.  Fixing this does not show any noticeable difference in
performance (there's a very small bump of 1.7% in x264 but that's
about it) in my tests but is the more precise description of operations
for falkor.

* gcc/config/aarch64/falkor.md (falkor_am_1_vxvy_vxvy): Move
neon_dup_q to...
(falkor_am_1_gtov_gtov): ... a new insn reservation.

CC: james.greenha...@arm.com
CC: kyrylo.tkac...@foss.arm.com
---
 gcc/config/aarch64/falkor.md | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/falkor.md b/gcc/config/aarch64/falkor.md
index 5cbc5150ef3..83b4b14b7f4 100644
--- a/gcc/config/aarch64/falkor.md
+++ b/gcc/config/aarch64/falkor.md
@@ -322,6 +322,12 @@
(eq_attr "type" "neon_from_gp_q"))
   "falkor_gtov,falkor_gtov")
 
+;; DUP  does not use vector pipes in Q mode, only gtov+gtov.
+(define_insn_reservation "falkor_am_1_gtov_gtov" 1
+  (and (eq_attr "tune" "falkor")
+   (eq_attr "type" "neon_dup_q"))
+  "falkor_gtov,falkor_gtov")
+
 ;; neon_to_gp_q is used for 32-bit ARM instructions that move 64-bits of data
 ;; so no use needed here.
 
@@ -337,7 +343,7 @@
 
 (define_insn_reservation "falkor_am_1_vxvy_vxvy" 1
   (and (eq_attr "tune" "falkor")
-   (eq_attr "type" 
"neon_bsl_q,neon_dup_q,neon_ext_q,neon_move_q,neon_rev_q,neon_tbl1_q,neon_permute_q"))
+   (eq_attr "type" 
"neon_bsl_q,neon_ext_q,neon_move_q,neon_rev_q,neon_tbl1_q,neon_permute_q"))
   "falkor_vxvy+falkor_vxvy")
 
 (define_insn_reservation "falkor_am_2_vxvy" 2
-- 
2.17.1



Re: [PATCH] [aarch64] Fix falkor pipeline description for dup

2018-08-02 Thread Kyrill Tkachov

Hi Siddhesh,

On 02/08/18 11:23, Siddhesh Poyarekar wrote:

There was a typo in the pipeline description where DUP was assigned to
the vector pipes for quad mode ops when it really only uses the VTOG
pipes.  Fixing this does not show any noticeable difference in
performance (there's a very small bump of 1.7% in x264 but that's
about it) in my tests but is the more precise description of operations
for falkor.



You know the microarchitecture better, so I'm assuming that's correct for 
Falkor.
I assume this has been tested with a bootstrap and test.


* gcc/config/aarch64/falkor.md (falkor_am_1_vxvy_vxvy): Move
neon_dup_q to...
(falkor_am_1_gtov_gtov): ... a new insn reservation.



No gcc/ prefix in the ChangeLog path.


CC: james.greenha...@arm.com
CC: kyrylo.tkac...@foss.arm.com
---
 gcc/config/aarch64/falkor.md | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/falkor.md b/gcc/config/aarch64/falkor.md
index 5cbc5150ef3..83b4b14b7f4 100644
--- a/gcc/config/aarch64/falkor.md
+++ b/gcc/config/aarch64/falkor.md
@@ -322,6 +322,12 @@
(eq_attr "type" "neon_from_gp_q"))
   "falkor_gtov,falkor_gtov")

+;; DUP  does not use vector pipes in Q mode, only gtov+gtov.
+(define_insn_reservation "falkor_am_1_gtov_gtov" 1
+  (and (eq_attr "tune" "falkor")
+   (eq_attr "type" "neon_dup_q"))
+  "falkor_gtov,falkor_gtov")


You should be able to use the "*" construct here like this:
falkor_gtov*2

Otherwise looks ok to me (but you'll need a maintainer to approve)
Thanks,
Kyrill


+
 ;; neon_to_gp_q is used for 32-bit ARM instructions that move 64-bits of data
 ;; so no use needed here.

@@ -337,7 +343,7 @@

 (define_insn_reservation "falkor_am_1_vxvy_vxvy" 1
   (and (eq_attr "tune" "falkor")
-   (eq_attr "type" 
"neon_bsl_q,neon_dup_q,neon_ext_q,neon_move_q,neon_rev_q,neon_tbl1_q,neon_permute_q"))
+   (eq_attr "type" 
"neon_bsl_q,neon_ext_q,neon_move_q,neon_rev_q,neon_tbl1_q,neon_permute_q"))
   "falkor_vxvy+falkor_vxvy")

 (define_insn_reservation "falkor_am_2_vxvy" 2
--
2.17.1





Re: [PATCH] Add malloc predictor (PR middle-end/83023).

2018-08-02 Thread Martin Liška
On 08/01/2018 05:04 PM, Marc Glisse wrote:
> On Wed, 1 Aug 2018, Martin Liška wrote:
> 
>> On 08/01/2018 02:25 PM, Marc Glisse wrote:
>>> On Wed, 1 Aug 2018, Martin Liška wrote:
>>>
 On 07/27/2018 02:38 PM, Marc Glisse wrote:
> On Fri, 27 Jul 2018, Martin Liška wrote:
>
>> So answer is yes, the builtin can be then removed.
>
> Good, thanks. While looking at how widely it is going to apply, I noticed 
> that the default, throwing operator new has attribute malloc and 
> everything, but the non-throwing variant declared in  doesn't, so it 
> won't benefit from the new predictor. I don't know if there is a good 
> reason for this disparity...
>

 Well in case somebody uses operator new:

     int* p1 = new int;
     if (p1)
  delete p1;

 we optimize out that to if (true), even when one has used defined
 operator new. Thus it's probably OK.
>>>
>>> Throwing new is returns_nonnull (errors are reported with exceptions) so 
>>> that's fine, but non-throwing new is not:
>>>
>>> int* p1 = new(std::nothrow) int;
>>>
>>> Here errors are reported by returning 0, so it is common to test if p1 is 0 
>>> and this is precisely the case that could benefit from a predictor but does 
>>> not have the attribute to do so (there are also consequences on aliasing).
>>
>> Then it can be handled with DECL_IS_OPERATOR_NEW, for those we can also set 
>> the newly introduced predictor.
> 
> Independently of whether you extend the predictor to DECL_IS_OPERATOR_NEW, it 
> would be good for this nothrow operator new to get the aliasing benefits of 
> attribute malloc. I'll open a PR.
> 
>>> (Jan's remark about functions with an inferred malloc attribute reminds me 
>>> that at some point, the code was adding attribute malloc for functions that 
>>> always return 0...)
>>
>> By inferred do you mean function that are marked as malloc in IPA pure-const 
>> (propagate_malloc)?
> 
> Yes.
> 
>> Example would be appreciated.
> 
> I used the past tense, I am not claiming this still happens.
> 

Ok, so I'm sending updated version of the patch including DECL_IS_OPERATOR_NEW 
operator.
In SPEC2006 it's seen 1021 times. There are some statistics about frequency of 
individual
benchmarks:

400.perlbench: 4
401.bzip2: 4
403.gcc: 13
410.bwaves: 0
416.gamess: 0
429.mcf: 4
433.milc: 23
434.zeusmp: 0
435.gromacs: 13
436.cactusADM: 221
437.leslie3d: 84
444.namd: 1
445.gobmk: 15
447.dealII: 52
450.soplex: 87
453.povray: 20
454.calculix: 102
456.hmmer: 40
458.sjeng: 4
459.GemsFDTD: 0
462.libquantum: 19
464.h264ref: 166
465.tonto: 55
470.lbm: 1
471.omnetpp: 3
473.astar: 0
481.wrf: 47
482.sphinx3: 21
483.xalancbmk: 22

Ready for trunk?

Martin
>From 8689ac0907c27709d81f72782eaa5a3a752a2991 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 26 Jul 2018 15:25:00 +0200
Subject: [PATCH] Add malloc predictor (PR middle-end/83023).

gcc/ChangeLog:

2018-07-26  Martin Liska  

PR middle-end/83023
	* predict.c (expr_expected_value_1): Handle DECL_IS_MALLOC,
BUILT_IN_REALLOC and DECL_IS_OPERATOR_NEW.
	* predict.def (PRED_MALLOC_NONNULL): New predictor.

gcc/testsuite/ChangeLog:

2018-07-26  Martin Liska  

PR middle-end/83023
	* gcc.dg/predict-16.c: New test.
	* g++.dg/predict-1.C: New test.
---
 gcc/predict.c | 12 +++
 gcc/predict.def   |  5 -
 gcc/testsuite/g++.dg/predict-1.C  | 15 +
 gcc/testsuite/gcc.dg/predict-16.c | 36 +++
 4 files changed, 67 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/predict-1.C
 create mode 100644 gcc/testsuite/gcc.dg/predict-16.c

diff --git a/gcc/predict.c b/gcc/predict.c
index 2ee8274fe61..ef6794edda5 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -2380,6 +2380,14 @@ expr_expected_value_1 (tree type, tree op0, enum tree_code code,
 		}
 	  return NULL;
 	}
+
+	  if (DECL_IS_MALLOC (decl) || DECL_IS_OPERATOR_NEW (decl))
+	{
+	  if (predictor)
+		*predictor = PRED_MALLOC_NONNULL;
+	  return boolean_true_node;
+	}
+
 	  if (DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL)
 	switch (DECL_FUNCTION_CODE (decl))
 	  {
@@ -2420,6 +2428,10 @@ expr_expected_value_1 (tree type, tree op0, enum tree_code code,
 		if (predictor)
 		  *predictor = PRED_COMPARE_AND_SWAP;
 		return boolean_true_node;
+	  case BUILT_IN_REALLOC:
+		if (predictor)
+		  *predictor = PRED_MALLOC_NONNULL;
+		return boolean_true_node;
 	  default:
 		break;
 	}
diff --git a/gcc/predict.def b/gcc/predict.def
index 03900bf89b3..bf659704bfc 100644
--- a/gcc/predict.def
+++ b/gcc/predict.def
@@ -55,6 +55,10 @@ DEF_PREDICTOR (PRED_UNCONDITIONAL, "unconditional jump", PROB_ALWAYS,
 DEF_PREDICTOR (PRED_BUILTIN_UNPREDICTABLE, "__builtin_unpredictable", PROB_EVEN,
   PRED_FLAG_FIRST_MATCH)
 
+/* Return value of malloc function is almost always non-null.  */
+DEF_PREDICTOR (PRED_MALLOC_NONNULL, "malloc returned non

Re: [gen/AArch64] Generate helpers for substituting iterator values into pattern names

2018-08-02 Thread Richard Sandiford
"Richard Earnshaw (lists)"  writes:
> On 13/07/18 10:15, Richard Sandiford wrote:
>> Given a pattern like:
>> 
>>   (define_insn "aarch64_frecpe" ...)
>> 
>> the SVE ACLE implementation wants to generate the pattern for a
>> particular (non-constant) mode.  This patch automatically generates
>> helpers to do that, specifically:
>> 
>>   // Return CODE_FOR_nothing on failure.
>>   insn_code maybe_code_for_aarch64_frecpe (machine_mode);
>> 
>>   // Assert that the code exists.
>>   insn_code code_for_aarch64_frecpe (machine_mode);
>> 
>>   // Return NULL_RTX on failure.
>>   rtx maybe_gen_aarch64_frecpe (machine_mode, rtx, rtx);
>> 
>>   // Assert that generation succeeds.
>>   rtx gen_aarch64_frecpe (machine_mode, rtx, rtx);
>> 
>> Many patterns don't have sensible names when all <...>s are removed.
>> E.g. "2" would give a base name "2".  The new functions
>> therefore require explicit opt-in, which should also help to reduce
>> code bloat.
>> 
>> The (arbitrary) opt-in syntax I went for was to prefix the pattern
>> name with '@', similarly to the existing '*' marker.
>> 
>> The patch also makes config/aarch64 use the new routines in cases where
>> they obviously apply.  This was mostly straight-forward, but it seemed
>> odd that we defined:
>> 
>>aarch64_reload_movcp<...>
>> 
>> but then only used it with DImode, never SImode.  If we should be
>> using Pmode instead of DImode, then that's a simple change,
>> but should probably be a separate patch.
>> 
>> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
>> and x86_64-linux-gnu.  I think I can self-approve the gen* bits,
>> but OK for the AArch64 parts?
>> 
>> Any objections to this approach or syntax?
>
> So if I have two or more iterator rules in the MD, with a similar
> 'shape', eg
>
> (define_insn "@fred"...
>
> (define_insn "@fred"...
>
> this will generate a single factory function?  If so, that sounds sensible.

Yeah, that's right.  I think this is going to be quite common with the
SVE ACLE implementation, where some patterns need to cope with immediates
and others don't, and so are more naturally done as separate patterns.

Thanks to C++ overloading, it's also possible to have different shapes
(and thus different factories) with the same name, although YMMV on
whether that's a good idea in practice.

The original patch put the code_for_* declarations in insn-codes.h
(where codes are defined) and the gen_* declarations in insn-flags.h
(where normal generators are declared).  Kugan pointed out that that
wouldn't work with code iterators (not covered by the aarch64 changes),
so I've moved then to insn-opinit.h instead.  In some ways this seems
cleaner, since this is providing a very similar service to optabs.

I also realised that just removing <...> could lead to awkward
trailing or double underscores, so this patch avoids that.

I tested both updates with local changes to aarch64-sve.md
(not part of the commit).

Retested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
and x86_64-linux-gnu.  Applied.

Thanks,
Richard


2018-08-02  Richard Sandiford  

gcc/
* doc/md.texi: Expand the documentation of instruction names
to mention port-local uses.  Document '@' in pattern names.
* read-md.h (overloaded_instance, overloaded_name): New structs.
(mapping): Declare.
(md_reader::handle_overloaded_name): New member function.
(md_reader::get_overloads): Likewise.
(md_reader::m_first_overload): New member variable.
(md_reader::m_next_overload_ptr): Likewise.
(md_reader::m_overloads_htab): Likewise.
* read-md.c (md_reader::md_reader): Initialize m_first_overload,
m_next_overload_ptr and m_overloads_htab.
* read-rtl.c (iterator_group): Add "type" and "get_c_token" fields.
(get_mode_token, get_code_token, get_int_token): New functions.
(map_attr_string): Add an optional argument that passes back
the associated iterator.
(overloaded_name_hash, overloaded_name_eq_p, named_rtx_p):
(md_reader::handle_overloaded_name, add_overload_instance): New
functions.
(apply_iterators): Handle '@' names.  Report an error if '@'
is used without iterators.
(initialize_iterators): Initialize the new iterator_group fields.
* genopinit.c (handle_overloaded_code_for)
(handle_overloaded_gen): New functions.
(main): Use them to print declarations of maybe_code_for_* and
maybe_gen_* functions, and inline definitions of code_for_* and gen_*.
* genemit.c (print_overload_arguments, print_overload_test)
(handle_overloaded_code_for, handle_overloaded_gen): New functions.
(main): Use it to print definitions of maybe_code_for_* and
maybe_gen_* functions.
* config/aarch64/aarch64.c (aarch64_split_128bit_move): Use
gen_aarch64_mov{low,high}_di and gen_aarch64_movdi_{low,high}
instead of explicit mode checks.
 

[PATCH] Fix PR86816

2018-08-02 Thread Richard Biener


The following fixes PR86816 - CFG cleanup can introduce new SSA names
which we may not query the VN lattice with.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2018-08-02  Richard Biener  

PR tree-optimization/86816
* tree-ssa-tail-merge.c (tail_merge_valueize): New function
which checks for value availability before querying it.
(gvn_uses_equal): Use it.
(same_succ_hash): Likewise.
(gimple_equal_p): Likewise.

* g++.dg/torture/pr86816.C: New testcase.

Index: gcc/tree-ssa-tail-merge.c
===
--- gcc/tree-ssa-tail-merge.c   (revision 263190)
+++ gcc/tree-ssa-tail-merge.c   (working copy)
@@ -286,6 +286,21 @@ struct aux_bb_info
 #define BB_VOP_AT_EXIT(bb) (((struct aux_bb_info *)bb->aux)->vop_at_exit)
 #define BB_DEP_BB(bb) (((struct aux_bb_info *)bb->aux)->dep_bb)
 
+/* Valueization helper querying the VN lattice.  */
+
+static tree
+tail_merge_valueize (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME
+  && has_VN_INFO (name))
+{
+  tree tem = VN_INFO (name)->valnum;
+  if (tem != VN_TOP)
+   return tem;
+}
+  return name;
+}
+
 /* Returns true if the only effect a statement STMT has, is to define locally
used SSA_NAMEs.  */
 
@@ -371,7 +386,7 @@ gvn_uses_equal (tree val1, tree val2)
   if (val1 == val2)
 return true;
 
-  if (vn_valueize (val1) != vn_valueize (val2))
+  if (tail_merge_valueize (val1) != tail_merge_valueize (val2))
 return false;
 
   return ((TREE_CODE (val1) == SSA_NAME || CONSTANT_CLASS_P (val1))
@@ -481,7 +496,7 @@ same_succ_hash (const same_succ *e)
   for (i = 0; i < gimple_call_num_args (stmt); i++)
{
  arg = gimple_call_arg (stmt, i);
- arg = vn_valueize (arg);
+ arg = tail_merge_valueize (arg);
  inchash::add_expr (arg, hstate);
}
 }
@@ -1147,7 +1162,7 @@ gimple_equal_p (same_succ *same_succ, gi
   if (lhs1 == NULL_TREE || lhs2 == NULL_TREE)
return false;
   if (TREE_CODE (lhs1) == SSA_NAME && TREE_CODE (lhs2) == SSA_NAME)
-   return vn_valueize (lhs1) == vn_valueize (lhs2);
+   return tail_merge_valueize (lhs1) == tail_merge_valueize (lhs2);
   return operand_equal_p (lhs1, lhs2, 0);
 
 case GIMPLE_ASSIGN:
Index: gcc/testsuite/g++.dg/torture/pr86816.C
===
--- gcc/testsuite/g++.dg/torture/pr86816.C  (nonexistent)
+++ gcc/testsuite/g++.dg/torture/pr86816.C  (working copy)
@@ -0,0 +1,42 @@
+// { dg-do compile }
+
+class Signal
+{
+public:
+int  m_Mode;
+};
+
+class Ctx
+{
+public:
+bool  m_Invert;
+
+void DoSomething();
+};
+
+class Test
+{
+  void TestIce( Ctx& ctx, Signal* sig);
+};
+
+void Test::TestIce( Ctx& ctx, Signal* sig)
+{
+  int invert = false;
+
+  if( ! ctx.m_Invert)
+invert = ! invert;
+
+  switch( sig->m_Mode)
+{
+case 1:
+invert = ! invert;
+break;
+
+case 2:
+invert = true;
+break;
+}
+
+  if( invert)
+ctx.DoSomething();
+}


Re: [PATCHv3 3/6] libstdc++ futex: Support waiting on std::chrono::steady_clock directly

2018-08-02 Thread Mike Crowe
On Wednesday 01 August 2018 at 20:38:33 +0100, Jonathan Wakely wrote:
> On 01/08/18 14:19 +0100, Mike Crowe wrote:
> > I believe that I've added this functionality in a way that it doesn't break
> > ABI compatibility, but that has made it more verbose and less type safe. I
> > believe that it would be better to maintain the timeout as an instance of
> > the correct clock type all the way down to a single _M_futex_wait_until
> > function with an overload for each clock. The current scheme of separating
> > out the seconds and nanoseconds early risks accidentally calling the wait
> > function for the wrong clock.
> 
> Surely that would just be a programming error?

Yes, but it's not one that the compiler can detect.

> Users aren't calling
> these functions, and we only call them in a very limited number of
> places, so the risk of calling the wrong one seems manageable. Just
> don't do that.

I see no reason not to use the type system to help avoid mistakes in the
internal impementation of the library as well as in its external interface.
But, as you say, the function is only called from a few places so there's
little reason to change it now.

> > diff --git a/libstdc++-v3/src/c++11/futex.cc 
> > b/libstdc++-v3/src/c++11/futex.cc
[snip]
> > +   // We only get to here if futex_clock_realtime_unavailable was
> 
> This should say futex_clock_monotonic_unavailable.

Thanks. I've fixed that, but I'm not sure whether you want me to continue
to updating these patches or whether you plan to apply all the fixes
yourself before committing them.

Thanks.

Mike.


[PATCH] atomic_futex: Avoid rounding errors in std::future::wait_for

2018-08-02 Thread Mike Crowe
Convert the specified duration to the target clock's duration type before
adding it to the current time. This avoids the situation described in PR
libstdc++/68519 when the specified duration type lacks sufficient
precision.
---
 libstdc++-v3/include/bits/atomic_futex.h |  6 +-
 libstdc++-v3/testsuite/30_threads/async/async.cc | 15 +++
 2 files changed, 20 insertions(+), 1 deletion(-)

This change depends on the some of the changes in the series I posted
yesterday. See https://gcc.gnu.org/ml/libstdc++/2018-08/msg1.html

diff --git a/libstdc++-v3/include/bits/atomic_futex.h 
b/libstdc++-v3/include/bits/atomic_futex.h
index b7ffb7fb191..9703d752981 100644
--- a/libstdc++-v3/include/bits/atomic_futex.h
+++ b/libstdc++-v3/include/bits/atomic_futex.h
@@ -219,8 +219,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_load_when_equal_for(unsigned __val, memory_order __mo,
  const chrono::duration<_Rep, _Period>& __rtime)
   {
+   using __dur = typename __clock_t::duration;
+   auto __reltime = chrono::duration_cast<__dur>(__rtime);
+   if (__reltime < __rtime)
+ ++__reltime;
return _M_load_when_equal_until(__val, __mo,
-   __clock_t::now() + __rtime);
+   __clock_t::now() + __reltime);
   }
 
 // Returns false iff a timeout occurred.
diff --git a/libstdc++-v3/testsuite/30_threads/async/async.cc 
b/libstdc++-v3/testsuite/30_threads/async/async.cc
index 755c95cbea6..cd5113d5010 100644
--- a/libstdc++-v3/testsuite/30_threads/async/async.cc
+++ b/libstdc++-v3/testsuite/30_threads/async/async.cc
@@ -157,6 +157,20 @@ void test04()
   }
 }
 
+void test_pr36827()
+{
+  future f1 = async(launch::async, []() {
+  std::this_thread::sleep_for(std::chrono::seconds(1));
+});
+
+  std::chrono::duration const wait_time = std::chrono::seconds(1);
+  auto const start_steady = chrono::steady_clock::now();
+  auto status = f1.wait_for(wait_time);
+  auto const elapsed_steady = chrono::steady_clock::now() - start_steady;
+
+  VERIFY( elapsed_steady >= std::chrono::seconds(1) );
+}
+
 int main()
 {
   test01();
@@ -165,5 +179,6 @@ int main()
   test03();
   test03();
   test04();
+  test_pr36827();
   return 0;
 }
-- 
2.11.0



[committed] dumpfile.c/h: add "const" to dump location ctors

2018-08-02 Thread David Malcolm
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

Committed to trunk as r263244, under the "obvious" rule.

gcc/ChangeLog:
* dumpfile.c (dump_user_location_t::dump_user_location_t): Add
"const" to the "gimple *" and "rtx_insn *" parameters.
* dumpfile.h (dump_user_location_t::dump_user_location_t):
Likewise.
(dump_location_t::dump_location_t): Likewise.
---
 gcc/dumpfile.c | 4 ++--
 gcc/dumpfile.h | 8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
index 10e9cab..76a2ee8 100644
--- a/gcc/dumpfile.c
+++ b/gcc/dumpfile.c
@@ -397,7 +397,7 @@ dump_open_alternate_stream (struct dump_file_info *dfi)
 /* Construct a dump_user_location_t from STMT (using its location and
hotness).  */
 
-dump_user_location_t::dump_user_location_t (gimple *stmt)
+dump_user_location_t::dump_user_location_t (const gimple *stmt)
 : m_count (), m_loc (UNKNOWN_LOCATION)
 {
   if (stmt)
@@ -411,7 +411,7 @@ dump_user_location_t::dump_user_location_t (gimple *stmt)
 /* Construct a dump_user_location_t from an RTL instruction (using its
location and hotness).  */
 
-dump_user_location_t::dump_user_location_t (rtx_insn *insn)
+dump_user_location_t::dump_user_location_t (const rtx_insn *insn)
 : m_count (), m_loc (UNKNOWN_LOCATION)
 {
   if (insn)
diff --git a/gcc/dumpfile.h b/gcc/dumpfile.h
index 2b174e5..8de001d 100644
--- a/gcc/dumpfile.h
+++ b/gcc/dumpfile.h
@@ -285,10 +285,10 @@ class dump_user_location_t
   dump_user_location_t () : m_count (), m_loc (UNKNOWN_LOCATION) {}
 
   /* Construct from a gimple statement (using its location and hotness).  */
-  dump_user_location_t (gimple *stmt);
+  dump_user_location_t (const gimple *stmt);
 
   /* Construct from an RTL instruction (using its location and hotness).  */
-  dump_user_location_t (rtx_insn *insn);
+  dump_user_location_t (const rtx_insn *insn);
 
   /* Construct from a location_t.  This one is deprecated (since it doesn't
  capture hotness information); it thus needs to be spelled out.  */
@@ -363,7 +363,7 @@ class dump_location_t
   }
 
   /* Construct from a gimple statement (using its location and hotness).  */
-  dump_location_t (gimple *stmt,
+  dump_location_t (const gimple *stmt,
   const dump_impl_location_t &impl_location
 = dump_impl_location_t ())
   : m_user_location (dump_user_location_t (stmt)),
@@ -372,7 +372,7 @@ class dump_location_t
   }
 
   /* Construct from an RTL instruction (using its location and hotness).  */
-  dump_location_t (rtx_insn *insn,
+  dump_location_t (const rtx_insn *insn,
   const dump_impl_location_t &impl_location
   = dump_impl_location_t ())
   : m_user_location (dump_user_location_t (insn)),
-- 
1.8.5.3



Async I/O patch with compilation fix

2018-08-02 Thread Nicolas Koenig

Hello everyone,

Here is an updated version of the patch that hopefully fixes the compilation
problems by disabling async I/O if conditions are not supported by the target.

I would appreciate if people could test it on systems on which it failed 
before. As for the array_constructor_8.f90 failure reported in the PR, why
it fails is beyond me, it doesn't even use I/O. Maybe/Probably something
unrelated?

Nicolas


2018-08-02  Nicolas Koenig  
Thomas Koenig 

PR fortran/25829
* gfortran.texi: Add description of asynchronous I/O.
* trans-decl.c (gfc_finish_var_decl): Treat asynchronous variables
as volatile.
* trans-io.c (gfc_build_io_library_fndecls): Rename st_wait to
st_wait_async and change argument spec from ".X" to ".w".
(gfc_trans_wait): Pass ID argument via reference.

2018-08-02  Nicolas Koenig  
Thomas Koenig 

PR fortran/25829
* gfortran.dg/f2003_inquire_1.f03: Add write statement.
* gfortran.dg/f2003_io_1.f03: Add wait statement.

2018-08-02  Nicolas Koenig  
Thomas Koenig 

PR fortran/25829
* Makefile.am: Add async.c to gfor_io_src.
Add async.h to gfor_io_headers.
* Makefile.in: Regenerated.
* gfortran.map: Add _gfortran_st_wait_async.
* io/async.c: New file.
* io/async.h: New file.
* io/close.c: Include async.h.
(st_close): Call async_wait for an asynchronous unit.
* io/file_pos.c (st_backspace): Likewise.
(st_endfile): Likewise.
(st_rewind): Likewise.
(st_flush): Likewise.
* io/inquire.c: Add handling for asynchronous PENDING
and ID arguments.
* io/io.h (st_parameter_dt): Add async bit.
(st_parameter_wait): Correct.
(gfc_unit): Add au pointer.
(st_wait_async): Add prototype.
(transfer_array_inner): Likewise.
(st_write_done_worker): Likewise.
* io/open.c: Include async.h.
(new_unit): Initialize asynchronous unit.
* io/transfer.c (async_opt): New struct.
(wrap_scalar_transfer): New function.
(transfer_integer): Call wrap_scalar_transfer to do the work.
(transfer_real): Likewise.
(transfer_real_write): Likewise.
(transfer_character): Likewise.
(transfer_character_wide): Likewise.
(transfer_complex): Likewise.
(transfer_array_inner): New function.
(transfer_array): Call transfer_array_inner.
(transfer_derived): Call wrap_scalar_transfer.
(data_transfer_init): Check for asynchronous I/O.
Perform a wait operation on any pending asynchronous I/O
if the data transfer is synchronous. Copy PDT and enqueue
thread for data transfer.
(st_read_done_worker): New function.
(st_read_done): Enqueue transfer or call st_read_done_worker.
(st_write_done_worker): New function.
(st_write_done): Enqueue transfer or call st_read_done_worker.
(st_wait): Document as no-op for compatibility reasons.
(st_wait_async): New function.
* io/unit.c (insert_unit): Use macros LOCK, UNLOCK and TRYLOCK;
add NOTE where necessary.
(get_gfc_unit): Likewise.
(init_units): Likewise.
(close_unit_1): Likewise. Call async_close if asynchronous.
(close_unit): Use macros LOCK and UNLOCK.
(finish_last_advance_record): Likewise.
(newunit_alloc): Likewise.
* io/unix.c (find_file): Likewise.
(flush_all_units_1): Likewise.
(flush_all_units): Likewise.
* libgfortran.h (generate_error_common): Add prototype.
* runtime/error.c: Include io.h and async.h.
(generate_error_common): New function.

2018-08-02  Nicolas Koenig  
Thomas Koenig 

PR fortran/25829
* testsuite/libgomp.fortran/async_io_1.f90: New test.
* testsuite/libgomp.fortran/async_io_2.f90: New test.
* testsuite/libgomp.fortran/async_io_3.f90: New test.
* testsuite/libgomp.fortran/async_io_4.f90: New test.
* testsuite/libgomp.fortran/async_io_5.f90: New test.
* testsuite/libgomp.fortran/async_io_6.f90: New test.
* testsuite/libgomp.fortran/async_io_7.f90: New test.
Index: gcc/fortran/gfortran.texi
===
--- gcc/fortran/gfortran.texi	(revision 263244)
+++ gcc/fortran/gfortran.texi	(working copy)
@@ -879,8 +879,7 @@ than @code{(/.../)}.  Type-specification for array
 @item Extensions to the specification and initialization expressions,
 including the support for intrinsics with real and complex arguments.
 
-@item Support for the asynchronous input/output syntax; however, the
-data transfer is currently always synchronously performed. 
+@item Support for the asynchronous input/output.
 
 @item
 @cindex @code{FLUSH} statement
@@ -1183,6 +1182,7 @@ might in some way or another become visible to th

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-08-02 Thread H.J. Lu
On Tue, Jul 31, 2018 at 6:36 AM, Kyrill  Tkachov
 wrote:
> Hi Thomas,
>
>
> On 25/07/18 14:28, Thomas Preudhomme wrote:
>>
>> Hi Kyrill,
>>
>> Using memory_operand worked, the issues I encountered when using it in
>> earlier versions of the patch must have been due to the missing test
>> on address_operand in the preparation statements which I added later.
>> Please find an updated patch in attachment. ChangeLog entry is as
>> follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2018-07-05  Thomas Preud'homme  
>>
>>  * target-insns.def (stack_protect_combined_set): Define new standard
>>  pattern name.
>>  (stack_protect_combined_test): Likewise.
>>  * cfgexpand.c (stack_protect_prologue): Try new
>>  stack_protect_combined_set pattern first.
>>  * function.c (stack_protect_epilogue): Try new
>>  stack_protect_combined_test pattern first.
>>  * config/arm/arm.c (require_pic_register): Add pic_reg and
>> compute_now
>>  parameters to control which register to use as PIC register and force
>>  reloading PIC register respectively.  Insert in the stream of insns
>> if
>>  possible.
>>  (legitimize_pic_address): Expose above new parameters in prototype
>> and
>>  adapt recursive calls accordingly.
>>  (arm_legitimize_address): Adapt to new legitimize_pic_address
>>  prototype.
>>  (thumb_legitimize_address): Likewise.
>>  (arm_emit_call_insn): Adapt to new require_pic_register prototype.
>>  * config/arm/arm-protos.h (legitimize_pic_address): Adapt to
>> prototype
>>  change.
>>  * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
>>  prototype change.
>>  (stack_protect_combined_set): New insn_and_split pattern.
>>  (stack_protect_set): New insn pattern.
>>  (stack_protect_combined_test): New insn_and_split pattern.
>>  (stack_protect_test): New insn pattern.
>>  * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
>>  (UNSPEC_SP_TEST): Likewise.
>>  * doc/md.texi (stack_protect_combined_set): Document new standard
>>  pattern name.
>>  (stack_protect_set): Clarify that the operand for guard's address is
>>  legal.
>>  (stack_protect_combined_test): Document new standard pattern name.
>>  (stack_protect_test): Clarify that the operand for guard's address is
>>  legal.
>>
>> *** gcc/testsuite/ChangeLog ***
>>
>> 2018-07-05  Thomas Preud'homme  
>>
>>  * gcc.target/arm/pr85434.c: New test.
>>
>> Bootstrapped again for Arm and Thumb-2 and regtested with and without
>> -fstack-protector-all without any regression.
>
>
> This looks ok to me now.
> Thank you for your patience and addressing my comments from before.
>

This breaks x86:

FAIL: gcc.dg/fstack-protector-strong.c (internal compiler error)
FAIL: gcc.dg/fstack-protector-strong.c (test for excess errors)
FAIL: gcc.dg/fstack-protector-strong.c  (test for warnings, line 109)
FAIL: gcc.dg/pr71585-3.c (internal compiler error)
FAIL: gcc.dg/pr71585-3.c (test for excess errors)
FAIL: gcc.dg/pr71585.c (internal compiler error)
FAIL: gcc.dg/pr71585.c (test for excess errors)
FAIL: gcc.target/i386/pr37275.c (internal compiler error)
FAIL: gcc.target/i386/pr37275.c (test for excess errors)
FAIL: gcc.target/i386/pr47780.c (internal compiler error)
FAIL: gcc.target/i386/pr47780.c (test for excess errors)
FAIL: gcc.target/i386/pr50788.c (internal compiler error)
FAIL: gcc.target/i386/pr50788.c (test for excess errors)
FAIL: gcc.target/i386/pr68680.c (internal compiler error)
FAIL: gcc.target/i386/pr68680.c (test for excess errors)
FAIL: gcc.target/i386/stack-prot-guard.c (internal compiler error)
FAIL: gcc.target/i386/stack-prot-guard.c (test for excess errors)
FAIL: gcc.target/i386/stack-prot-sym.c (internal compiler error)
FAIL: gcc.target/i386/stack-prot-sym.c (test for excess errors)
FAIL: g++.dg/fstack-protector-strong.C  -std=gnu++11 (internal compiler error)
FAIL: g++.dg/fstack-protector-strong.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/fstack-protector-strong.C  -std=gnu++14 (internal compiler error)
FAIL: g++.dg/fstack-protector-strong.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/fstack-protector-strong.C  -std=gnu++98 (internal compiler error)
FAIL: g++.dg/fstack-protector-strong.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/pr65032.C   (internal compiler error)
FAIL: g++.dg/pr65032.C   (test for excess errors)
FAIL: g++.dg/stackprotectexplicit2.C  -std=gnu++11 (internal compiler error)
FAIL: g++.dg/stackprotectexplicit2.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/stackprotectexplicit2.C  -std=gnu++14 (internal compiler error)
FAIL: g++.dg/stackprotectexplicit2.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/stackprotectexplicit2.C  -std=gnu++98 (internal compiler error)
FAIL: g++.dg/stackprotectexplicit2.C  -std=gnu++98 (test for excess errors)


-- 
H.J.


PING [PATCH v2 0/7] Support partial clobbers around TLS calls on Aarch64 SVE

2018-08-02 Thread Alan Hayward



> On 26 Jul 2018, at 10:13, Alan Hayward  wrote:
> 
> This is rebasing of the patch posted in November.
> It's aim is to support aarch64 SVE register preservation around TLS calls
> by adding a CLOBBER_HIGH expression.
> 
> Across a TLS call, Aarch64 SVE does not explicitly preserve the
> SVE vector registers. However, the Neon vector registers are preserved.
> Due to overlapping of registers, this means the lower 128bits of all
> SVE vector registers will be preserved.
> 
> The existing GCC code assume all Neon and SVE registers are clobbered
> across TLS calls.
> 
> This patch introduces a CLOBBER_HIGH expression. This behaves a bit like
> a CLOBBER expression. CLOBBER_HIGH can only refer to a single register.
> The mode of the expression indicates the size of the lower bits which
> will be preserved. If the register contains a value bigger than this
> mode then the code will treat the register as clobbered, otherwise the
> register remains untouched.
> 
> The means in order to evaluate if a clobber high is relevant, we need to
> ensure the mode of the existing value in a register is tracked.
> 
> The first two patches introduce CLOBBER_HIGH and generation support.
> The next patch adds a helper function for easily determining if a register
> gets clobbered by a CLOBBER_HIGH.
> The next three patches add clobber high checks to all of the passes. I
> couldn't think of a better way of splitting this up (maybe needs dividing
> into smaller patches?).
> Finally the last patch adds the CLOBBER_HIGHS around a TLS call for
> aarch64 SVE and some test cases.
> 
> Alan Hayward (7):
>  Add CLOBBER_HIGH expression
>  Generation support for CLOBBER_HIGH
>  Add func to check if register is clobbered by clobber_high
>  lra support for clobber_high
>  cse support for clobber_high
>  Remaining support for clobber high
>  Enable clobber high for tls descs on Aarch64
> 
> gcc/alias.c|  11 ++
> gcc/cfgexpand.c|   1 +
> gcc/combine-stack-adj.c|   1 +
> gcc/combine.c  |  38 -
> gcc/config/aarch64/aarch64.md  |  69 ++--
> gcc/cse.c  | 187 ++---
> gcc/cselib.c   |  42 +++--
> gcc/cselib.h   |   2 +-
> gcc/dce.c  |  11 +-
> gcc/df-scan.c  |   6 +
> gcc/doc/rtl.texi   |  15 +-
> gcc/dwarf2out.c|   1 +
> gcc/emit-rtl.c |  16 ++
> gcc/genconfig.c|   1 +
> gcc/genemit.c  |  51 +++---
> gcc/genrecog.c |   3 +-
> gcc/haifa-sched.c  |   3 +
> gcc/ira-build.c|   5 +
> gcc/ira-costs.c|   7 +
> gcc/ira.c  |   6 +-
> gcc/jump.c |   1 +
> gcc/lra-eliminations.c |  11 ++
> gcc/lra-int.h  |   2 +
> gcc/lra-lives.c|  31 ++--
> gcc/lra.c  |  66 +---
> gcc/postreload-gcse.c  |  21 ++-
> gcc/postreload.c   |  25 ++-
> gcc/print-rtl.c|   1 +
> gcc/recog.c|   9 +-
> gcc/regcprop.c |  10 +-
> gcc/reginfo.c  |   4 +
> gcc/reload1.c  |  16 +-
> gcc/reorg.c|  27 ++-
> gcc/resource.c |  24 ++-
> gcc/rtl.c  |  15 ++
> gcc/rtl.def|  10 ++
> gcc/rtl.h  |  27 ++-
> gcc/rtlanal.c  |  47 +-
> gcc/sched-deps.c   |  15 +-
> .../gcc.target/aarch64/sve_tls_preserve_1.c|  19 +++
> .../gcc.target/aarch64/sve_tls_preserve_2.c|  24 +++
> .../gcc.target/aarch64/sve_tls_preserve_3.c|  24 +++
> 42 files changed, 725 insertions(+), 180 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve_tls_preserve_1.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve_tls_preserve_2.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve_tls_preserve_3.c
> 
> -- 
> 2.15.2 (Apple Git-101.1)
> 



Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552, 86711, 86714) )

2018-08-02 Thread Bernd Edlinger
On 08/02/18 04:44, Martin Sebor wrote:
> Since the foundation of the patch is detecting and avoiding
> the overly aggressive folding of unterminated char arrays,
> besides issuing a warning for such arguments to strlen,
> the patch also fixes pr86711 - wrong folding of memchr, and
> pr86714 - tree-ssa-forwprop.c confused by too long initializer.
> 
> The substance of the attached updated patch is unchanged,
> I have just added test cases for the two additional bugs.
> 
> Bernd, as I mentioned Wednesday, the patch supersedes
> yours here:
> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01800.html
> 

No problem, but I hope you understand, that I still uphold
my patch.

So we have two patches now:
- mine, fixing a wrong code bug,
- yours, implementing a new warning and fixing a wrong
code bug at the same time.

I will add a few comments to your patch below.

> Martin
> 
> On 07/30/2018 01:17 PM, Martin Sebor wrote:
>> Attached is an updated version of the patch that handles more
>> instances of calling strlen() on a constant array that is not
>> a nul-terminated string.
>>
>> No other functions except strlen are explicitly handled yet,
>> and neither are constant arrays with braced-initializer lists
>> like const char a[] = { 'a', 'b', 'c' };  I am testing
>> an independent solution for those (bug 86552).  Once those
>> are handled the warning will be able to detect those as well.
>>
>> Tested on x86_64-linux.
>>
>> On 07/25/2018 05:38 PM, Martin Sebor wrote:
>>> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01124.html
>>>
>>> The fix for bug 86532 has been checked in so this enhancement
>>> can now be applied on top of it (with only minor adjustments).
>>>
>>> On 07/19/2018 02:08 PM, Martin Sebor wrote:
 In the discussion of my patch for pr86532 Bernd noted that
 GCC silently accepts constant character arrays with no
 terminating nul as arguments to strlen (and other string
 functions).

 The attached patch is a first step in detecting these kinds
 of bugs in strlen calls by issuing -Wstringop-overflow.
 The next step is to modify all other handlers of built-in
 functions to detect the same problem (not part of this patch).
 Yet another step is to detect these problems in arguments
 initialized using the non-string form:

   const char a[] = { 'a', 'b', 'c' };

 This patch is meant to apply on top of the one for bug 86532
 (I tested it with an earlier version of that patch so there
 is code in the context that does not appear in the latest
 version of the other diff).

 Martin

>>>
>>
> 
> PR tree-optimization/86714 - tree-ssa-forwprop.c confused by too long 
> initializer
> PR tree-optimization/86711 - wrong folding of memchr
> PR tree-optimization/86552 - missing warning for reading past the end of 
> non-string arrays
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/86714
>   PR tree-optimization/86711
>   PR tree-optimization/86552
>   * builtins.h (warn_string_no_nul): Declare..
>   (c_strlen): Add argument.
>   * builtins.c (warn_string_no_nul): New function.
>   (fold_builtin_strlen): Add argument.  Detect missing nul.
>   (fold_builtin_1): Adjust.
>   (string_length): Add argument and use it.
>   (c_strlen): Same.
>   (expand_builtin_strlen): Detect missing nul.
>   * expr.c (string_constant): Add arguments.  Detect missing nul
>   terminator and outermost declaration it's missing in.
>   * expr.h (string_constant): Add argument.
>   * fold-const.c (c_getstr): Change argument to bool*, rename
>   other arguments.
>   * fold-const-call.c (fold_const_call): Detect missing nul.
>   * gimple-fold.c (get_range_strlen): Add argument.
>   (get_maxval_strlen): Adjust.
>   * gimple-fold.h (get_range_strlen): Add argument.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/86714
>   PR tree-optimization/86711
>   PR tree-optimization/86552
>   * gcc.c-torture/execute/memchr-1.c: New test.
>   * gcc.c-torture/execute/pr86714.c: New test.
>   * gcc.dg/warn-strlen-no-nul.c: New test.
> 
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index aa3e0d8..f4924d5 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -150,7 +150,7 @@ static tree stabilize_va_list_loc (location_t, tree, int);
>  static rtx expand_builtin_expect (tree, rtx);
>  static tree fold_builtin_constant_p (tree);
>  static tree fold_builtin_classify_type (tree);
> -static tree fold_builtin_strlen (location_t, tree, tree);
> +static tree fold_builtin_strlen (location_t, tree, tree, tree);
>  static tree fold_builtin_inf (location_t, tree, int);
>  static tree rewrite_call_expr (location_t, tree, int, tree, int, ...);
>  static bool validate_arg (const_tree, enum tree_code code);
> @@ -550,6 +550,36 @@ string_length (const void *ptr, unsigned eltsize, 
> unsigned maxelts)
>return n;
>  }
>  
> +/* For a call expression EXP to a function that ex

Re: [PATCH][C++] Fix PR86763, wrong-code with TYPE_TYPELESS_STORAGE

2018-08-02 Thread Nathan Sidwell

On 08/02/2018 01:30 AM, Richard Biener wrote:


This fixes the PR by making sure CLASSTYPE_AS_BASE types inherit
TYPE_TYPELESS_STORAGE from the main type so that types that inherit
a type with TYPE_TYPELESS_STORAGE also get TYPE_TYPELESS_STORAGE
propagated.

Bootstrap & regtest running on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

2018-08-02  Richard Biener  

PR c++/86763
* class.c (layout_class_type): Copy TYPE_TYPELESS_STORAGE
to the CLASSTYPE_AS_BASE.

* g++.dg/torture/pr86763.C: New testcase.


Ok, thanks.


--
Nathan Sidwell


[PATCH] Fix PR86505

2018-08-02 Thread Richard Biener


I am testing the following to fix another __builtin_va_arg_pack_len ()
inlining issue.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

2018-08-02  Richard Biener  

PR middle-end/86505
* tree-inline.c (copy_bb): When inlining __builtin_va_arg_pack_len ()
across a va-arg-pack using call adjust its return value accordingly.

* gcc.dg/torture/pr86505.c: New testcase.

Index: gcc/tree-inline.c
===
--- gcc/tree-inline.c   (revision 263258)
+++ gcc/tree-inline.c   (working copy)
@@ -1940,8 +1940,7 @@ copy_bb (copy_body_data *id, basic_block
   && id->call_stmt
   && (decl = gimple_call_fndecl (stmt))
   && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL
-  && DECL_FUNCTION_CODE (decl) == BUILT_IN_VA_ARG_PACK_LEN
-  && ! gimple_call_va_arg_pack_p (id->call_stmt))
+  && DECL_FUNCTION_CODE (decl) == BUILT_IN_VA_ARG_PACK_LEN)
{
  /* __builtin_va_arg_pack_len () should be replaced by
 the number of anonymous arguments.  */
@@ -1952,10 +1951,22 @@ copy_bb (copy_body_data *id, basic_block
  for (p = DECL_ARGUMENTS (id->src_fn); p; p = DECL_CHAIN (p))
nargs--;
 
- count = build_int_cst (integer_type_node, nargs);
- new_stmt = gimple_build_assign (gimple_call_lhs (stmt), count);
- gsi_replace (©_gsi, new_stmt, false);
- stmt = new_stmt;
+ if (!gimple_call_va_arg_pack_p (id->call_stmt))
+   {
+ count = build_int_cst (integer_type_node, nargs);
+ new_stmt = gimple_build_assign (gimple_call_lhs (stmt), 
count);
+ gsi_replace (©_gsi, new_stmt, false);
+ stmt = new_stmt;
+   }
+ else if (nargs != 0)
+   {
+ tree newlhs = create_tmp_reg_or_ssa_name (integer_type_node);
+ count = build_int_cst (integer_type_node, nargs);
+ new_stmt = gimple_build_assign (gimple_call_lhs (stmt),
+ PLUS_EXPR, newlhs, count);
+ gimple_call_set_lhs (stmt, newlhs);
+ gsi_insert_after (©_gsi, new_stmt, GSI_NEW_STMT);
+   }
}
  else if (call_stmt
   && id->call_stmt
Index: gcc/testsuite/gcc.dg/torture/pr86505.c
===
--- gcc/testsuite/gcc.dg/torture/pr86505.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr86505.c  (working copy)
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+
+static inline __attribute__(( __always_inline__)) int 
+funA(unsigned int param, ...) 
+{ 
+  return __builtin_va_arg_pack_len(); 
+}
+
+static inline __attribute__(( __always_inline__)) int
+funB(unsigned int param, ...)
+{ 
+  return funA(param,  2, 4, __builtin_va_arg_pack()); 
+}
+
+int 
+testBuiltin(void) 
+{ 
+  int rc = funB(0,1,2); 
+  if (rc != 4)
+return 1;
+  return 0;
+}
+
+int
+main()
+{
+  int rc = testBuiltin();
+  if (rc == 1)
+__builtin_abort ();
+
+  return 0;
+}


[committed] Typo fix

2018-08-02 Thread Richard Sandiford
Noticed by Tamar (thanks).

Tested on aarch64-linux-gnu and committed as obvious.

Richard


2018-08-02  Richard Sandiford  

gcc/
* genemit.c (print_overload_test): Fix typo.

Index: gcc/genemit.c
===
--- gcc/genemit.c   2018-08-02 11:59:06.855355888 +0100
+++ gcc/genemit.c   2018-08-02 15:31:35.445774594 +0100
@@ -761,7 +761,7 @@ print_overload_arguments (overloaded_nam
 printf ("%s%s arg%d", i == 0 ? "" : ", ", oname->arg_types[i], i);
 }
 
-/* Print code to test whether INSTANCE should be chosne, given that
+/* Print code to test whether INSTANCE should be chosen, given that
argument N of the overload is available as "arg".  */
 
 static void


Re: [PATCH, Ada] Makefile patches from initial RISC-V cross/native build.

2018-08-02 Thread Jim Wilson
On Thu, Aug 2, 2018 at 1:44 AM, Pierre-Marie de Rodat
 wrote:
> On 07/06/2018 01:45 AM, Jim Wilson wrote:
>>
>> These are some patches I needed to complete my cross build of a native
>> riscv linux Ada compiler.  Some paths were different on the build machine
>> and host machine.  I needed to pass options into gnatmake to work around
>> this,
>> and that required fixing some makefile rules to use $(GNATMAKE) instead of
>> calling gnatmake directly.
>
>
> At AdaCore, we realized that this patch broke our canadian builds: in the
> cases your patch involved, the intent is to call the build toolchain, not
> the host one (to which GNATMAKE expands to).
>
> Given that you described this change as a workaround for cross-machine path
> issues, I was wondering: do you still need this? If not, we should probably
> revert it right now, otherwise we should investigate why you needed it in
> the first place. What do you think?

I only needed it for the first canadian cross build.  If it is causing
problems for you it can be dropped.  If I need it again, it is easy
enough to write again.

I ran into a problem where the search paths were wrong when running
the canadian cross built compiler for the native build, and I had to
add something like -cargs -L$installdir/lib to gnatmake to make it
work, but it is possible I did something wrong.  These canadian cross
builds aren't something I do often, so I might have made a mistake.
Also, RISC-V is a new target, so there are lots of things that are
just a little broken and have to be worked around.  This might have
been related to one of those problems.

Jim


[SVE ACLE] Resync "@" pattern support

2018-08-02 Thread Richard Sandiford
This patch just resyncs the branch version of the "@" pattern support
to match the current trunk version.  It now supports "@" patterns
with code iterators and avoids trailing and double underscores
when removing "<...>" from the name.

Tested on aarch64-linux-gnu and applied to aarch64/sve-acle-branch

Richard


diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 1c3b2c1a43b..0558cdfc8e6 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -10909,10 +10909,12 @@ A port might need to generate this pattern for a 
variable
 ways of doing this.  The first is to build the rtx for the pattern
 directly from C++ code; this is a valid technique and avoids any risk
 of combinatorial explosion.  The second is to prefix the instruction
-name with the special character @samp{@@}.  This tells GCC to generate
-the four additional functions below, where @var{name} is the name of
-the instruction without the @samp{@@} and without the @samp{<@dots{}>}
-placeholders.
+name with the special character @samp{@@}, which tells GCC to generate
+the four additional functions below.  In each case, @var{name} is the
+name of the instruction without the leading @samp{@@} character,
+without the @samp{<@dots{}>} placeholders, and with any underscore
+before a @samp{<@dots{}>} placeholder removed if keeping it would
+lead to a double or trailing underscore.
 
 @table @samp
 @item insn_code maybe_code_for_@var{name} (@var{i1}, @var{i2}, @dots{})
diff --git a/gcc/gencodes.c b/gcc/gencodes.c
index 05635e4da26..891f465df04 100644
--- a/gcc/gencodes.c
+++ b/gcc/gencodes.c
@@ -46,29 +46,6 @@ gen_insn (md_rtx_info *info)
 }
 }
 
-/* Declare the maybe_code_for_* function for ONAME, and provide
-   an inline definition of the assserting code_for_* wrapper.  */
-
-static void
-handle_overloaded_code_for (overloaded_name *oname)
-{
-  printf ("\nextern insn_code maybe_code_for_%s (", oname->name);
-  for (unsigned int i = 0; i < oname->arg_types.length (); ++i)
-printf ("%s%s", i == 0 ? "" : ", ", oname->arg_types[i]);
-  printf (");\n");
-
-  printf ("inline insn_code\ncode_for_%s (", oname->name);
-  for (unsigned int i = 0; i < oname->arg_types.length (); ++i)
-printf ("%s%s arg%d", i == 0 ? "" : ", ", oname->arg_types[i], i);
-  printf (")\n{\n  insn_code code = maybe_code_for_%s (", oname->name);
-  for (unsigned int i = 0; i < oname->arg_types.length (); ++i)
-printf ("%sarg%d", i == 0 ? "" : ", ", i);
-  printf (");\n"
- "  gcc_assert (code != CODE_FOR_nothing);\n"
- "  return code;\n"
- "}\n");
-}
-
 int
 main (int argc, const char **argv)
 {
@@ -108,13 +85,8 @@ enum insn_code {\n\
 
   printf ("\n};\n\
 \n\
-const unsigned int NUM_INSN_CODES = %d;\n", get_num_insn_codes ());
-
-  for (overloaded_name *oname = rtx_reader_ptr->get_overloads ();
-   oname; oname = oname->next)
-handle_overloaded_code_for (oname);
-
-  printf ("\n#endif /* GCC_INSN_CODES_H */\n");
+const unsigned int NUM_INSN_CODES = %d;\n\
+#endif /* GCC_INSN_CODES_H */\n", get_num_insn_codes ());
 
   if (ferror (stdout) || fflush (stdout) || fclose (stdout))
 return FATAL_EXIT_CODE;
diff --git a/gcc/genemit.c b/gcc/genemit.c
index 92270302119..675cf04ff4b 100644
--- a/gcc/genemit.c
+++ b/gcc/genemit.c
@@ -761,7 +761,7 @@ print_overload_arguments (overloaded_name *oname)
 printf ("%s%s arg%d", i == 0 ? "" : ", ", oname->arg_types[i], i);
 }
 
-/* Print code to test whether INSTANCE should be chosne, given that
+/* Print code to test whether INSTANCE should be chosen, given that
argument N of the overload is available as "arg".  */
 
 static void
diff --git a/gcc/genflags.c b/gcc/genflags.c
index ce838e7..6186d6433d9 100644
--- a/gcc/genflags.c
+++ b/gcc/genflags.c
@@ -197,37 +197,6 @@ gen_insn (md_rtx_info *info)
   obstack_grow (&obstack, &insn, sizeof (rtx));
 }
 
-/* Declare the maybe_gen_* function for ONAME, and provide
-   an inline definition of the assserting gen_* wrapper.  */
-
-static void
-emit_overloaded_gen_proto (overloaded_name *oname)
-{
-  unsigned int num_ops = num_operands (oname->first_instance->insn);
-
-  printf ("\nextern rtx maybe_gen_%s (", oname->name);
-  for (unsigned int i = 0; i < oname->arg_types.length (); ++i)
-printf ("%s%s", i == 0 ? "" : ", ", oname->arg_types[i]);
-  for (unsigned int i = 0; i < num_ops; ++i)
-printf (", rtx");
-  printf (");\n");
-
-  printf ("inline rtx\ngen_%s (", oname->name);
-  for (unsigned int i = 0; i < oname->arg_types.length (); ++i)
-printf ("%s%s arg%d", i == 0 ? "" : ", ", oname->arg_types[i], i);
-  for (unsigned int i = 0; i < num_ops; ++i)
-printf (", rtx x%d", i);
-  printf (")\n{\n  rtx res = maybe_gen_%s (", oname->name);
-  for (unsigned int i = 0; i < oname->arg_types.length (); ++i)
-printf ("%sarg%d", i == 0 ? "" : ", ", i);
-  for (unsigned int i = 0; i < num_ops; ++i)
-printf (", x%d", i);
-  printf (");\n"
- "  gcc_assert (res);\n"
- "  return res;\n"
- "}\n");
-}
-
 int
 main (i

[PATCH][nvptx] Ignore c++ exceptions

2018-08-02 Thread Tom de Vries
Hi,

The nvptx port can't support exceptions using sjlj, because ptx does not
support sjlj.  However, default_except_unwind_info still returns UI_SJLJ, even
even if we configure with --disable-sjlj-exceptions, because UI_SJLJ is the
fallback option.

The reason default_except_unwind_info doesn't return UI_DWARF2 is because
DWARF2_UNWIND_INFO is not defined in defaults.h, because
INCOMING_RETURN_ADDR_RTX is not defined, because there's no ptx equivalent.

Testcase libgomp.c++/for-15.C currently doesn't compile unless fno-exceptions
is added because:
- it tries to generate sjlj exception handling code, and
- it tries to generate exception tables using label-addressed .byte sequence.
  Ptx doesn't support generating random data at a label, nor being able to
  load/write data relative to a label.

This patch fixes the first problem by using UI_TARGET for nvptx.

The second problem is worked around by generating all .byte sequences commented
out.  It would be better to have a narrower workaround, and define
TARGET_ASM_BYTE_OP to "error: .byte unsupported " or some such.

This patch does not enable exceptions for nvptx, it merely allows c++ programs
to run correctly if they do no use exception handling.

Build on x86_64 with nvptx accelerator and reg-tested libgomp.

OK for trunk?

Thanks,
- Tom

[nvptx] Ignore c++ exceptions

2018-08-02  Tom de Vries  

PR target/86660
* common/config/nvptx/nvptx-common.c (nvptx_except_unwind_info): New
function.  Return UI_TARGET unconditionally.
(TARGET_EXCEPT_UNWIND_INFO): Redefine to nvptx_except_unwind_info.
* config/nvptx/nvptx.c (TARGET_ASM_BYTE_OP): Emit commented out '.byte'.

* testsuite/libgomp.oacc-c++/routine-1-auto.C: Remove -fno-exceptions.
* testsuite/libgomp.oacc-c++/routine-1-template-auto.C: Same.
* testsuite/libgomp.oacc-c++/routine-1-template-trailing-return-type.C:
Same.
* testsuite/libgomp.oacc-c++/routine-1-template.C: Same.
* testsuite/libgomp.oacc-c++/routine-1-trailing-return-type.C: Same.
* testsuite/libgomp.oacc-c-c++-common/routine-1.c: Same.

---
 gcc/common/config/nvptx/nvptx-common.c   | 9 +
 gcc/config/nvptx/nvptx.c | 3 +++
 libgomp/testsuite/libgomp.oacc-c++/routine-1-auto.C  | 2 --
 libgomp/testsuite/libgomp.oacc-c++/routine-1-template-auto.C | 2 --
 .../libgomp.oacc-c++/routine-1-template-trailing-return-type.C   | 2 --
 libgomp/testsuite/libgomp.oacc-c++/routine-1-template.C  | 2 --
 .../testsuite/libgomp.oacc-c++/routine-1-trailing-return-type.C  | 2 --
 libgomp/testsuite/libgomp.oacc-c-c++-common/routine-1.c  | 2 --
 8 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/gcc/common/config/nvptx/nvptx-common.c 
b/gcc/common/config/nvptx/nvptx-common.c
index 27a4f4675d0..f31e0696281 100644
--- a/gcc/common/config/nvptx/nvptx-common.c
+++ b/gcc/common/config/nvptx/nvptx-common.c
@@ -30,10 +30,19 @@ along with GCC; see the file COPYING3.  If not see
 #include "opts.h"
 #include "flags.h"
 
+enum unwind_info_type
+nvptx_except_unwind_info (struct gcc_options *opts ATTRIBUTE_UNUSED)
+{
+  return UI_TARGET;
+}
+
 #undef TARGET_HAVE_NAMED_SECTIONS
 #define TARGET_HAVE_NAMED_SECTIONS false
 
 #undef TARGET_DEFAULT_TARGET_FLAGS
 #define TARGET_DEFAULT_TARGET_FLAGS MASK_ABI64
 
+#undef TARGET_EXCEPT_UNWIND_INFO
+#define TARGET_EXCEPT_UNWIND_INFO nvptx_except_unwind_info
+
 struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index c1946e75f42..b804f5611fb 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -6048,6 +6048,9 @@ nvptx_can_change_mode_class (machine_mode, machine_mode, 
reg_class_t)
 #undef TARGET_CAN_CHANGE_MODE_CLASS
 #define TARGET_CAN_CHANGE_MODE_CLASS nvptx_can_change_mode_class
 
+#undef TARGET_ASM_BYTE_OP
+#define TARGET_ASM_BYTE_OP "// .byte "
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-nvptx.h"
diff --git a/libgomp/testsuite/libgomp.oacc-c++/routine-1-auto.C 
b/libgomp/testsuite/libgomp.oacc-c++/routine-1-auto.C
index f4b54e55fa7..771a2734306 100644
--- a/libgomp/testsuite/libgomp.oacc-c++/routine-1-auto.C
+++ b/libgomp/testsuite/libgomp.oacc-c++/routine-1-auto.C
@@ -1,7 +1,5 @@
 // Routine with "auto" return type.
 
-// { dg-additional-options "-fno-exceptions" }
-
 #define TEMPLATE
 #define TYPE int
 #define RETURN_1 auto
diff --git a/libgomp/testsuite/libgomp.oacc-c++/routine-1-template-auto.C 
b/libgomp/testsuite/libgomp.oacc-c++/routine-1-template-auto.C
index 444f1f32a76..17bdaa0c1c1 100644
--- a/libgomp/testsuite/libgomp.oacc-c++/routine-1-template-auto.C
+++ b/libgomp/testsuite/libgomp.oacc-c++/routine-1-template-auto.C
@@ -1,7 +1,5 @@
 // Templated routine with "auto" return type.
 
-// { dg-additional-options "-fno-exceptions" }
-
 #define TEMPLATE template
 #define RETURN_1 auto
 #define 

Re: [PATCH] PR libstdc++/60555 std::system_category() should recognise POSIX errno values

2018-08-02 Thread David Edelsohn
Jonathan

This patch broke AIX bootstrap.

/nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/system_error.cc: In
member function 'virtual std::error_condition
{anonymous}::system_error_category::default_error_condition(int)
const':
/nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/system_error.cc:245:7:
error: duplicate case value
   case ENOTEMPTY:
   ^~~~
/nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/system_error.cc:136:7:
note: previously used here
   case EEXIST:
   ^~~~

AIX errno.h

/*
 * AIX returns EEXIST where 4.3BSD used ENOTEMPTY;
 * but, the standards insist on unique errno values for each errno.
 * A unique value is reserved for users that want to code case
 * statements for systems that return either EEXIST or ENOTEMPTY.
 */
#if defined(_ALL_SOURCE) && !defined(_LINUX_SOURCE_COMPAT)
#define ENOTEMPTY   EEXIST  /* Directory not empty */
#else   /* not _ALL_SOURCE */
#define ENOTEMPTY   87
#endif  /* _ALL_SOURCE */

I will add _LINUX_SOURCE_COMPAT and see how bootstrap proceeds. If it
works, it also will be required in the backport.

Thanks, David


Re: [PATCH][nvptx] Ignore c++ exceptions

2018-08-02 Thread Jakub Jelinek
On Thu, Aug 02, 2018 at 05:30:53PM +0200, Tom de Vries wrote:
> The nvptx port can't support exceptions using sjlj, because ptx does not
> support sjlj.  However, default_except_unwind_info still returns UI_SJLJ, even
> even if we configure with --disable-sjlj-exceptions, because UI_SJLJ is the
> fallback option.
> 
> The reason default_except_unwind_info doesn't return UI_DWARF2 is because
> DWARF2_UNWIND_INFO is not defined in defaults.h, because
> INCOMING_RETURN_ADDR_RTX is not defined, because there's no ptx equivalent.
> 
> Testcase libgomp.c++/for-15.C currently doesn't compile unless fno-exceptions
> is added because:
> - it tries to generate sjlj exception handling code, and
> - it tries to generate exception tables using label-addressed .byte sequence.
>   Ptx doesn't support generating random data at a label, nor being able to
>   load/write data relative to a label.
> 
> This patch fixes the first problem by using UI_TARGET for nvptx.
> 
> The second problem is worked around by generating all .byte sequences 
> commented
> out.  It would be better to have a narrower workaround, and define
> TARGET_ASM_BYTE_OP to "error: .byte unsupported " or some such.

Hopefully this part is temporary and there is some way to figure this out
later.

> This patch does not enable exceptions for nvptx, it merely allows c++ programs
> to run correctly if they do no use exception handling.
> 
> Build on x86_64 with nvptx accelerator and reg-tested libgomp.
> 
> OK for trunk?

You are the nvptx maintainer, so you can approve the changes yourself.
The libgomp changes are ok.

> [nvptx] Ignore c++ exceptions
> 
> 2018-08-02  Tom de Vries  
> 
>   PR target/86660
>   * common/config/nvptx/nvptx-common.c (nvptx_except_unwind_info): New
>   function.  Return UI_TARGET unconditionally.
>   (TARGET_EXCEPT_UNWIND_INFO): Redefine to nvptx_except_unwind_info.
>   * config/nvptx/nvptx.c (TARGET_ASM_BYTE_OP): Emit commented out '.byte'.
> 
>   * testsuite/libgomp.oacc-c++/routine-1-auto.C: Remove -fno-exceptions.
>   * testsuite/libgomp.oacc-c++/routine-1-template-auto.C: Same.
>   * testsuite/libgomp.oacc-c++/routine-1-template-trailing-return-type.C:
>   Same.
>   * testsuite/libgomp.oacc-c++/routine-1-template.C: Same.
>   * testsuite/libgomp.oacc-c++/routine-1-trailing-return-type.C: Same.
>   * testsuite/libgomp.oacc-c-c++-common/routine-1.c: Same.

Jakub


Re: [PATCH] Make strlen range computations more conservative

2018-08-02 Thread Martin Sebor

On 08/02/2018 04:22 AM, Bernd Edlinger wrote:

Hi,

this is an update of the patch to prevent unsafe optimizations due to strlen 
range assuming
always zero-terminated char arrays.

Since the previous version I do no longer try to locate the outermost char 
array,
but just bail out if there is a typecast, that means, supposed we have a 
2-dimensional
array, char a[x][y], strlen (s.a[x]) may assume a[x] is zero-terminated, if the 
optimization
is enabled.  strlen ((char*)s.a) involves a type cast and should assume nothing.

Additionally due to the discussion, I came to the conclusion that this strlen 
range optimization
should only be used when enabled with -fassume-zero-terminated-char-arrays or 
-Ofast.

Note that I included the test case with the removed assertion as
gcc/testsuite/gcc.dg/strlenopt-57.c

Initially it would only miscompile with -Ofast, but with r263018 aka
PR tree-optimization/86043, PR tree-optimization/86042 it was miscompiled with 
-O3 as well.

I located the reason in gcc/tree-ssa-strlen.c (get_min_string_length) which did 
not
check if the string constant is in fact zero terminated:

@@ -3192,7 +3194,9 @@ get_min_string_length (tree rhs, bool *full_string
   && TREE_READONLY (rhs))
 rhs = DECL_INITIAL (rhs);

-  if (rhs && TREE_CODE (rhs) == STRING_CST)
+  if (rhs && TREE_CODE (rhs) == STRING_CST
+  && compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (rhs)),
+  TREE_STRING_LENGTH (rhs)) >= 0)
 {
   *full_string_p = true;
   return strlen (TREE_STRING_POINTER (rhs));

Fortunately this also shows a way how to narrow strlen return value 
expectations when
we are able to positively prove that a string must be zero terminated.



Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


The warning bits are definitely not okay by me.  The purpose
of the warnings (-W{format,sprintf}-{overflow,truncation} is
to detect buffer overflows.  When a warning doesn't have access
to string length information for dynamically created strings
(like the strlen pass does) it uses array sizes as a proxy.
This is useful both to detect possible buffer overflows and
to prevent false positives for overflows that cannot happen
in correctly written programs.

By changing the logic to not consider the bounds of the array
the warnkngs will become prone to many false positives as is
evident from your changes to the tests (you had to explicitly
enable the new option).

In effect, the patch undoes a couple of years worth of fine
tuning of the warnings to strike a balance between true and
false positives.  That's completely unacceptable to me, and,
frankly, proposals along these lines are exceedingly
disruptive.

Beyond that, I don't have a problem with adding a new option
but I continue to strongly disagree with disabling the strlen
optimization by default.  It implies that by default GCC
targets invalid code.  If there is consensus that some new
option is necessary, the right setting is on by default,
along with warnings for programs that pass non-terminated
arrays to string functions.  I have already submitted a patch
that effect for strlen and const arrays and am working on
extending it to other built-in functions and dynamically
created strings (when I have a chance between defending
my past work).

Either way, if a new option is introduced, the array bound
computation for sprintf (and all other buffer overflow
warnings, likely including -Wrestrict) will need to be
decoupled from the option so the current behavior is
preserved.

As I have also said, the area of the provenance of subobject
vs enclosing object pointers is under discussion in WG14 and
the C Object Model study group.  The informal consensus so
far is to maintain the provenance of subobjects and introduce
some mechanism to make it possible to extend the provenance
to the enclosing object.  This would be a pervasive change,
not one just limited to strings.  So if introducing some new
option at this stage is thought to be necessary this should
be taken into consideration.

Martin


Re: Async I/O patch with compilation fix

2018-08-02 Thread Christophe Lyon
On Thu, 2 Aug 2018 at 13:35, Nicolas Koenig  wrote:
>
>
> Hello everyone,
>
> Here is an updated version of the patch that hopefully fixes the compilation
> problems by disabling async I/O if conditions are not supported by the target.
>
> I would appreciate if people could test it on systems on which it failed
> before. As for the array_constructor_8.f90 failure reported in the PR, why
> it fails is beyond me, it doesn't even use I/O. Maybe/Probably something
> unrelated?
>

Hi,
I'm probably missing something obvious, but after applying this patch
on top of r263136, the builds fail while building libgfortran:
/tmp/9271913_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgfortran/runtime/error.c:28:10:
fatal error: async.h: No such file or directory
 #include "async.h"
  ^
compilation terminated.
make[3]: *** [error.lo] Error 1

> Nicolas
>
>
> 2018-08-02  Nicolas Koenig  
> Thomas Koenig 
>
> PR fortran/25829
> * gfortran.texi: Add description of asynchronous I/O.
> * trans-decl.c (gfc_finish_var_decl): Treat asynchronous variables
> as volatile.
> * trans-io.c (gfc_build_io_library_fndecls): Rename st_wait to
> st_wait_async and change argument spec from ".X" to ".w".
> (gfc_trans_wait): Pass ID argument via reference.
>
> 2018-08-02  Nicolas Koenig  
> Thomas Koenig 
>
> PR fortran/25829
> * gfortran.dg/f2003_inquire_1.f03: Add write statement.
> * gfortran.dg/f2003_io_1.f03: Add wait statement.
>
> 2018-08-02  Nicolas Koenig  
> Thomas Koenig 
>
> PR fortran/25829
> * Makefile.am: Add async.c to gfor_io_src.
> Add async.h to gfor_io_headers.
> * Makefile.in: Regenerated.
> * gfortran.map: Add _gfortran_st_wait_async.
> * io/async.c: New file.
> * io/async.h: New file.
> * io/close.c: Include async.h.
> (st_close): Call async_wait for an asynchronous unit.
> * io/file_pos.c (st_backspace): Likewise.
> (st_endfile): Likewise.
> (st_rewind): Likewise.
> (st_flush): Likewise.
> * io/inquire.c: Add handling for asynchronous PENDING
> and ID arguments.
> * io/io.h (st_parameter_dt): Add async bit.
> (st_parameter_wait): Correct.
> (gfc_unit): Add au pointer.
> (st_wait_async): Add prototype.
> (transfer_array_inner): Likewise.
> (st_write_done_worker): Likewise.
> * io/open.c: Include async.h.
> (new_unit): Initialize asynchronous unit.
> * io/transfer.c (async_opt): New struct.
> (wrap_scalar_transfer): New function.
> (transfer_integer): Call wrap_scalar_transfer to do the work.
> (transfer_real): Likewise.
> (transfer_real_write): Likewise.
> (transfer_character): Likewise.
> (transfer_character_wide): Likewise.
> (transfer_complex): Likewise.
> (transfer_array_inner): New function.
> (transfer_array): Call transfer_array_inner.
> (transfer_derived): Call wrap_scalar_transfer.
> (data_transfer_init): Check for asynchronous I/O.
> Perform a wait operation on any pending asynchronous I/O
> if the data transfer is synchronous. Copy PDT and enqueue
> thread for data transfer.
> (st_read_done_worker): New function.
> (st_read_done): Enqueue transfer or call st_read_done_worker.
> (st_write_done_worker): New function.
> (st_write_done): Enqueue transfer or call st_read_done_worker.
> (st_wait): Document as no-op for compatibility reasons.
> (st_wait_async): New function.
> * io/unit.c (insert_unit): Use macros LOCK, UNLOCK and TRYLOCK;
> add NOTE where necessary.
> (get_gfc_unit): Likewise.
> (init_units): Likewise.
> (close_unit_1): Likewise. Call async_close if asynchronous.
> (close_unit): Use macros LOCK and UNLOCK.
> (finish_last_advance_record): Likewise.
> (newunit_alloc): Likewise.
> * io/unix.c (find_file): Likewise.
> (flush_all_units_1): Likewise.
> (flush_all_units): Likewise.
> * libgfortran.h (generate_error_common): Add prototype.
> * runtime/error.c: Include io.h and async.h.
> (generate_error_common): New function.
>
> 2018-08-02  Nicolas Koenig  
> Thomas Koenig 
>
> PR fortran/25829
> * testsuite/libgomp.fortran/async_io_1.f90: New test.
> * testsuite/libgomp.fortran/async_io_2.f90: New test.
> * testsuite/libgomp.fortran/async_io_3.f90: New test.
> * testsuite/libgomp.fortran/async_io_4.f90: New test.
> * testsuite/libgomp.fortran/async_io_5.f90: New test.
> * testsuite/libgomp.fortran/async_io_6.f90: New test.
> * testsuite/libgomp.fortran/async_io_7.f90: New test.


Re: [PATCH][4/4] RPO-style value-numbering for FRE/PRE

2018-08-02 Thread Jeff Law
On 08/01/2018 08:33 AM, Richard Biener wrote:
> 
> Repeating the boiler-plate from 0/4 here again, in case you didn't read 
> it.  This is the main patch (and thus big).  It scraps all of the
> SCCVN engine replacing it with RPO stuff but the stmt-based workers mostly
> stay the same.
> 
> 
> This rewrites the value-numbering algorithm used for FRE and PRE from
> SSA SCC based to RPO based, thus switching from an algorithm that
> handles SSA SCCs optimistically to one that handles CFG SCCs optimistically.
> 
> The main motivation for this besides being more optimistic was that
> adding CFG context sensitive info is easier in RPO style.  Also
> tracking availability and thus making expression simplification not
> based on values like with SCCVN is possible which allows us to remove
> all the warts that scrap side-info we store on SSA names.  It also
> fixes PR86554 which is another manifestation of the same issue.
> 
> Another motivation was that we're in the need of applying value-numbering
> on regions like when unrolling loops or as part of cleanup on code
> generated by other passes like the vectorizer.  Thus this rewrite
> makes sure that the value-numbering works efficiently on regions
> (though in a non-iterative mode), avoiding work and space that is
> on the order of the function size rather than the region size to work on.
> Sofar the GIMPLE unroller makes use of this, scrapping its own
> simple constant propagation engine.  I expect that DOM could get rid of
> its value-numbering and instead use a non-iterative RPO-VN run as well.
Yea, I think we kicked this (using a single RPO-VN and dropping DOM's
own hashing).

I've got a ton of cleanups to do over the next couple months (primarily
related to the VRP/DOM interactions and making DOM usable on subgraphs).
 I'll try to slot the RPO-VN changes in after those.

jeff



Re: [PATCH] doc: discourage const/volatile on register variables

2018-08-02 Thread Jeff Law
On 07/26/2018 05:18 AM, Richard Biener wrote:
> On Thu, Jul 26, 2018 at 12:29 PM Alexander Monakov  wrote:
>>
>> Hi,
>>
>> when using explicit register variables ('register int foo asm("%ebp");'),
>> using const/volatile type qualifiers on the declaration does not result in
>> the logically expected effect.
>>
>> The main issue is that gcc-8 got "better" at propagating initializers of
>> 'register const' variables to their uses in asm operands, losing the
>> association with the register and thus causing the operand to
>> unexpectedly appear in some other register. This caused build issues for
>> the Linux kernel and was reported a couple of times in the GCC Bugzilla.
>>
>> This patch adds a few lines to the documentation to say that qualifiers
>> won't work as expected. OK for trunk?
> 
> Looks ok to me.  Maybe we should change FEs to ignore those
> qualifiers on explicit register variables and emit a warning like
> 
> warning: const/volatile qualifier ignored on X
It'd certainly help catch these kinds of issues earlier rather than
seeing the assembler barf when the constant gets propagated or some
unexpected optimization.

jeff


Re: [PATCH][nvptx] Ignore c++ exceptions

2018-08-02 Thread Tom de Vries
On 08/02/2018 05:39 PM, Jakub Jelinek wrote:
> On Thu, Aug 02, 2018 at 05:30:53PM +0200, Tom de Vries wrote:
>> The nvptx port can't support exceptions using sjlj, because ptx does not
>> support sjlj.  However, default_except_unwind_info still returns UI_SJLJ, 
>> even
>> even if we configure with --disable-sjlj-exceptions, because UI_SJLJ is the
>> fallback option.
>>
>> The reason default_except_unwind_info doesn't return UI_DWARF2 is because
>> DWARF2_UNWIND_INFO is not defined in defaults.h, because
>> INCOMING_RETURN_ADDR_RTX is not defined, because there's no ptx equivalent.
>>
>> Testcase libgomp.c++/for-15.C currently doesn't compile unless fno-exceptions
>> is added because:
>> - it tries to generate sjlj exception handling code, and
>> - it tries to generate exception tables using label-addressed .byte sequence.
>>   Ptx doesn't support generating random data at a label, nor being able to
>>   load/write data relative to a label.
>>
>> This patch fixes the first problem by using UI_TARGET for nvptx.
>>
>> The second problem is worked around by generating all .byte sequences 
>> commented
>> out.  It would be better to have a narrower workaround, and define
>> TARGET_ASM_BYTE_OP to "error: .byte unsupported " or some such.
> 
> Hopefully this part is temporary and there is some way to figure this out
> later.
> 

I except that for UI_NONE we don't want to write out exception tables,
so that's something we can fix, and then we can return UI_NONE in
nvptx_except_unwind_info.

>> This patch does not enable exceptions for nvptx, it merely allows c++ 
>> programs
>> to run correctly if they do no use exception handling.
>>
>> Build on x86_64 with nvptx accelerator and reg-tested libgomp.
>>
>> OK for trunk?
> 
> You are the nvptx maintainer, so you can approve the changes yourself.

Right, sorry, force of habit, I meant: 'any comments?'

> The libgomp changes are ok.

Committed.

Thanks,
- Tom

> 
>> [nvptx] Ignore c++ exceptions
>>
>> 2018-08-02  Tom de Vries  
>>
>>  PR target/86660
>>  * common/config/nvptx/nvptx-common.c (nvptx_except_unwind_info): New
>>  function.  Return UI_TARGET unconditionally.
>>  (TARGET_EXCEPT_UNWIND_INFO): Redefine to nvptx_except_unwind_info.
>>  * config/nvptx/nvptx.c (TARGET_ASM_BYTE_OP): Emit commented out '.byte'.
>>
>>  * testsuite/libgomp.oacc-c++/routine-1-auto.C: Remove -fno-exceptions.
>>  * testsuite/libgomp.oacc-c++/routine-1-template-auto.C: Same.
>>  * testsuite/libgomp.oacc-c++/routine-1-template-trailing-return-type.C:
>>  Same.
>>  * testsuite/libgomp.oacc-c++/routine-1-template.C: Same.
>>  * testsuite/libgomp.oacc-c++/routine-1-trailing-return-type.C: Same.
>>  * testsuite/libgomp.oacc-c-c++-common/routine-1.c: Same.
> 
>   Jakub
> 


Re: [PATCH] PR libstdc++/60555 std::system_category() should recognise POSIX errno values

2018-08-02 Thread Jonathan Wakely

On 02/08/18 11:32 -0400, David Edelsohn wrote:

Jonathan

This patch broke AIX bootstrap.

/nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/system_error.cc: In
member function 'virtual std::error_condition
{anonymous}::system_error_category::default_error_condition(int)
const':
/nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/system_error.cc:245:7:
error: duplicate case value
  case ENOTEMPTY:
  ^~~~
/nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/system_error.cc:136:7:
note: previously used here
  case EEXIST:
  ^~~~

AIX errno.h

/*
* AIX returns EEXIST where 4.3BSD used ENOTEMPTY;
* but, the standards insist on unique errno values for each errno.
* A unique value is reserved for users that want to code case
* statements for systems that return either EEXIST or ENOTEMPTY.
*/
#if defined(_ALL_SOURCE) && !defined(_LINUX_SOURCE_COMPAT)
#define ENOTEMPTY   EEXIST  /* Directory not empty */
#else   /* not _ALL_SOURCE */
#define ENOTEMPTY   87
#endif  /* _ALL_SOURCE */


Strange that "linux source compat" is required for POSIX conformance.


I will add _LINUX_SOURCE_COMPAT and see how bootstrap proceeds. If it
works, it also will be required in the backport.


Let's just workaround the maybe-not-unique values instead:

--- a/libstdc++-v3/src/c++11/system_error.cc
+++ b/libstdc++-v3/src/c++11/system_error.cc
@@ -241,7 +241,8 @@ namespace
#ifdef ENOTDIR
  case ENOTDIR:
#endif
-#ifdef ENOTEMPTY
+#if defined ENOTEMPTY && (!defined EEXIST || ENOTEMPTY != EEXIST)
+  // AIX sometimes uses the same value for EEXIST and ENOTEMPTY
  case ENOTEMPTY:
#endif
#ifdef ENOTRECOVERABLE





Re: PING [PATCH] treat -Wxxx-larger-than=HWI_MAX special (PR 86631)

2018-08-02 Thread Joseph Myers
On Wed, 1 Aug 2018, Martin Sebor wrote:

> Richard, do you have any further comments or suggestions or is
> the patch acceptable?
> 
> I realize it's not ideal but I don't see how to achieve the ideal
> (understanding PTRDIFF_MAX) without deferring the processing of
> these options until the back end has been initialized.  It would
> still mean setting aside some special value, traversing options
> again, looking for the Host_Wide_Int ones with the special value,
> and replacing it with the value of PTRDIFF_MAX.  That doesn't seem
> any cleaner than the current solution, just a lot more work (not
> just to implement but for the compiler to go through at startup).
> 
> Joseph, can you think of anything better?
> 
>   https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01455.html

I think the cleanest option for accepting symbolic values would require 
options accepting such a value in addition to numeric arguments to have 
additional data allocated (effectively storing both a numeric argument and 
an Enum argument), and then the late initialization as noted to set the 
numeric value based on the Enum argument if one was given.  As noted, 
that's significant extra complexity.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, rs6000] Correct descriptions of __builtin_bcdadd* and _builtin_bcdsub* functions

2018-08-02 Thread Segher Boessenkool
Hi Kelvin,

On Wed, Aug 01, 2018 at 02:55:22PM -0500, Kelvin Nilsen wrote:
> Several errors were discovered in the descriptions of the __builtin_bcdadd, 
> __builtin_bcdadd_lt, __builtin_bcdadd_eq, __builtin_bcdadd_gt, 
> __builtin_bcdadd_ov, __builtin_bcdsub, __builtin_bcdsub_lt, 
> __builtin_bcdsub_eq, __builtin_bcdsub_gt, and __builtin_bcdsub_ov functions.  
> This patch corrects these documentation errors.

What did you check this against?  The ABI doc, or what is currently
implemented?  Neither is very clear to me :-/


Segher


Re: [PATCH] [aarch64] Fix falkor pipeline description for dup

2018-08-02 Thread Siddhesh Poyarekar

On 08/02/2018 04:19 PM, Kyrill Tkachov wrote:

Hi Siddhesh,

On 02/08/18 11:23, Siddhesh Poyarekar wrote:

There was a typo in the pipeline description where DUP was assigned to
the vector pipes for quad mode ops when it really only uses the VTOG
pipes.  Fixing this does not show any noticeable difference in
performance (there's a very small bump of 1.7% in x264 but that's
about it) in my tests but is the more precise description of operations
for falkor.



You know the microarchitecture better, so I'm assuming that's correct 
for Falkor.

I assume this has been tested with a bootstrap and test.


Sorry I should have mentioned this: bootstrapped and tested 
--with-cpu=falkor and no new regressions resulted from this patch.





    * gcc/config/aarch64/falkor.md (falkor_am_1_vxvy_vxvy): Move
    neon_dup_q to...
    (falkor_am_1_gtov_gtov): ... a new insn reservation.



No gcc/ prefix in the ChangeLog path.


Oops, will fix.


You should be able to use the "*" construct here like this:
falkor_gtov*2


Fixed and tested.


Otherwise looks ok to me (but you'll need a maintainer to approve)


Thanks, I'll send the updated patch for James' review.

Siddhesh


Re: [PATCH] Make strlen range computations more conservative

2018-08-02 Thread Jeff Law
On 07/25/2018 01:08 AM, Richard Biener wrote:
>>
>> So ISTM that you really need a stronger justification using the
>> standards compliance and/or real world code that is made less safe by
>> keeping string lengths as accurate as possible.
> 
> Note you cannot solely look at what the C standard says.  Instead
> you have to see where the middle-end lessens that constraints since
> these functions are not only called from C FE context.
Agreed.  I guess my point WRT standards was to start with C/C++ and see
if they rule in/out either case.

If (for example) we could make a determination that either standard
didn't allow us to narrow the length of a string after a cast in the way
the code currently does, then that would make the current code simply
wrong and we should either revert those bits or use Bernd's patches in
that space.

If the language standards don't give us guidance of that nature, then we
look at common uses as well as our own GIMPLE/GENERIC (unwritten)
semantics to guide us one way or the other.


> 
> So arguing from a C language lawyer point here is pointless.  You
> have to argue from a GENERIC language lawyer point which is going
> to be impossible since apart from the implementation (which
> includes how all existing GCC FEs _and_ the middle-end uses it)
> there is no formal specification available.  Yes, in most cases
> we say we match C language behavior and constraints but in some
> cases we are clearly and definitely less strict and that has
> followup consequences in other areas.
> 
> While exploiting all fine details of the C language might get
> you more constraints on things like string lengths you have to
> apply some common sense and middle-end knowledge - which means
> from my side trusting hunch feelings whether something is
> possibly not safe.
I'm absolutely not suggesting we want to exploit all the finer language
details here.  The language standards are just one piece of the puzzle IMHO.


> 
> As for patches in this area I would really love to see _smaller_
> changes.  And I'd like to see changes that make it clear
> what cases where supposed to be handled and _not_ including
> other cases "by accident".  See especially my comments on this patch
> from Bernd.
Agreed.


> 
>>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>>> Is it OK for trunk?
>> I'd like to ask we hold on this until I return from PTO (Aug 1) so that
>> we can discuss the best thing to do here for each class of change.
>>
>> I think you, Martin, Richi and myself should hash through the technical
>> issues raised by the patch.  Obviously others can chime in, but I think
>> the 4 of us probably need to drive the discussion.
> 
> I think with respect to patches to fix issues in previous patches
> at this point a better option might be to revert the patches causing
> the issues and start from scratch in a more defined manner.
I wouldn't object to that.  I think there's enough concerns here that we
ought to slow down and re-visit the reasoning behind each change and
make sure it's sensible.  If the best way to do that is revert, break
the patches down and resubmit, then let's do that.

Mostly what I wanted to do was give us time to hash through the changes
and decide what the best option was without having two developers
changing, then reverting bits of each other's work.  To that end, again,
I'm open to reverting Martin's work and having it resubmitted as we has
through each issue.

My general preference is to lean towards more accurate length
information (mostly to drive accuracy in warnings elsewhere), but that
obviously has to be within the constraints of the language standards,
common practice, as well as our own internal semantics.  If we need to
pull back in some spaces, that's fine, but we should clearly document
when/how we're doing that and if possible make it consistent throughout
the compiler if at all possible.

I'm also open to the concept of splitting this stuff up a bit in the
sense of what we'll use for optimization vs what we'll use to increase
accuracy of our other warnings such as sprintf).



> 
> Giving recent (temporary) regressions in the testsuite it feels like
> Martin is going too fast.
ACK.

jeff


Re: [Patch-86512]: Subnormal float support in armv7(with -msoft-float) for intrinsics

2018-08-02 Thread Richard Earnshaw (lists)
On 27/07/18 17:45, Nicolas Pitre wrote:
> On Fri, 27 Jul 2018, Wilco Dijkstra wrote:
> 
>> Nicolas Pitre wrote:
>>
 However if r4 is non-zero, the carry will be set, and the tsths will be 
 executed. This
 clears the carry and sets the Z flag based on bit 20.
>>>
>>> No, not at all. The carry is not affected. And that's the point of the 
>>> tst instruction here rather than a cmp: it sets the N and Z flags but 
>>> leaves C alone as there is no shifter involved.
>>
>> No, the carry is always set by logical operations with a shifted immediate. 
>> It is only
>> unchanged if the immediate uses a zero rotate. So any shifted immediate > 255
>> will set the carry.
> 
> Holy cow !!!
> 
> Who knows all the bugs I must have created in the past 25 years having 
> missed this particular subtlety.
> 
> Here's the updated patch with your suggestion.
> 

Thanks.

Committed.

R.

> 
> fix.patch
> 
> 
> diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
> index c13bf4cb2f6..c19d05c8a2e 100644
> --- a/libgcc/ChangeLog
> +++ b/libgcc/ChangeLog
> @@ -1,3 +1,9 @@
> +2018-07-26  Nicolas Pitre 
> +
> + * config/arm/ieee754-df.S: Don't shortcut denormal handling when
> + exponent goes negative. Update my email address.
> + * config/arm/ieee754-sf.S: Likewise.
> +
>  2018-07-05  James Clarke  
>  
>   * configure: Regenerated.
> diff --git a/libgcc/config/arm/ieee754-df.S b/libgcc/config/arm/ieee754-df.S
> index 8741aa99245..9a1b2dd2a4e 100644
> --- a/libgcc/config/arm/ieee754-df.S
> +++ b/libgcc/config/arm/ieee754-df.S
> @@ -1,7 +1,7 @@
>  /* ieee754-df.S double-precision floating point support for ARM
>  
> Copyright (C) 2003-2018 Free Software Foundation, Inc.
> -   Contributed by Nicolas Pitre (n...@cam.org)
> +   Contributed by Nicolas Pitre (n...@fluxnic.net)
>  
> This file is free software; you can redistribute it and/or modify it
> under the terms of the GNU General Public License as published by the
> @@ -238,9 +238,10 @@ LSYM(Lad_a):
>   movsip, ip, lsl #1
>   adcsxl, xl, xl
>   adc xh, xh, xh
> - tst xh, #0x0010
> - sub r4, r4, #1
> - bne LSYM(Lad_e)
> + subsr4, r4, #1
> + do_it   hs
> + cmphs   xh, #0x0010
> + bhs LSYM(Lad_e)
>  
>   @ No rounding necessary since ip will always be 0 at this point.
>  LSYM(Lad_l):
> diff --git a/libgcc/config/arm/ieee754-sf.S b/libgcc/config/arm/ieee754-sf.S
> index d80d5e9080c..b8a81279a3c 100644
> --- a/libgcc/config/arm/ieee754-sf.S
> +++ b/libgcc/config/arm/ieee754-sf.S
> @@ -1,7 +1,7 @@
>  /* ieee754-sf.S single-precision floating point support for ARM
>  
> Copyright (C) 2003-2018 Free Software Foundation, Inc.
> -   Contributed by Nicolas Pitre (n...@cam.org)
> +   Contributed by Nicolas Pitre (n...@fluxnic.net)
>  
> This file is free software; you can redistribute it and/or modify it
> under the terms of the GNU General Public License as published by the
> @@ -168,10 +168,11 @@ LSYM(Lad_e):
>  LSYM(Lad_a):
>   movsr1, r1, lsl #1
>   adc r0, r0, r0
> - tst r0, #0x0080
> - sub r2, r2, #1
> - bne LSYM(Lad_e)
> - 
> + subsr2, r2, #1
> + do_it   hs
> + cmphs   r0, #0x0080
> + bhs LSYM(Lad_e)
> +
>   @ No rounding necessary since r1 will always be 0 at this point.
>  LSYM(Lad_l):
>  
> 



Re: [PATCH, rs6000] refactor/cleanup in rs6000-string.c

2018-08-02 Thread Segher Boessenkool
On Tue, Jul 31, 2018 at 10:33:59AM -0500, Aaron Sawdey wrote:
> Just teasing things apart a bit more in this function so I can add
> vec/vsx code generation without making it enormous and
> incomprehensible.
> 
> Bootstrap/regtest passes on powerpc64le, ok for trunk?

Yes please.  Thanks!


Segher


> 2018-07-31  Aaron Sawdey  
> 
>   * config/rs6000/rs6000-string.c (select_block_compare_mode): Move test
>   for word_mode_ok here instead of passing as argument.
>   (expand_block_compare): Change select_block_compare_mode() call.
>   (expand_strncmp_gpr_sequence): New function.
>   (expand_strn_compare): Make use of expand_strncmp_gpr_sequence.


[PATCH] [aarch64][v2] Fix falkor pipeline description for dup

2018-08-02 Thread Siddhesh Poyarekar
There was a typo in the pipeline description where DUP was assigned to
the vector pipes for quad mode ops when it really only uses the VTOG
pipes.  Fixing this does not show any noticeable difference in
performance (there's a very small bump of 1.7% in x264 but that's
about it) in my tests but is the more precise description of operations
for falkor.

Bootstrapped and tested with --with-cpu=falkor to confirm that there
are no regressions.

Siddhesh

Changes from v1:

- Fixed up ChangeLog entry

- Replaced falkor_gtov,falkor_gtov with falkor_gtov*2

* config/aarch64/falkor.md (falkor_am_1_vxvy_vxvy): Move
neon_dup_q to...
(falkor_am_1_gtov_gtov): ... a new insn reservation.

CC: james.greenha...@arm.com
CC: kyrylo.tkac...@foss.arm.com
---
 gcc/config/aarch64/falkor.md | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/falkor.md b/gcc/config/aarch64/falkor.md
index 5cbc5150ef3..45cbff93b24 100644
--- a/gcc/config/aarch64/falkor.md
+++ b/gcc/config/aarch64/falkor.md
@@ -322,6 +322,12 @@
(eq_attr "type" "neon_from_gp_q"))
   "falkor_gtov,falkor_gtov")
 
+;; DUP  does not use vector pipes in Q mode, only gtov+gtov.
+(define_insn_reservation "falkor_am_1_gtov_gtov" 1
+  (and (eq_attr "tune" "falkor")
+   (eq_attr "type" "neon_dup_q"))
+  "falkor_gtov*2")
+
 ;; neon_to_gp_q is used for 32-bit ARM instructions that move 64-bits of data
 ;; so no use needed here.
 
@@ -337,7 +343,7 @@
 
 (define_insn_reservation "falkor_am_1_vxvy_vxvy" 1
   (and (eq_attr "tune" "falkor")
-   (eq_attr "type" 
"neon_bsl_q,neon_dup_q,neon_ext_q,neon_move_q,neon_rev_q,neon_tbl1_q,neon_permute_q"))
+   (eq_attr "type" 
"neon_bsl_q,neon_ext_q,neon_move_q,neon_rev_q,neon_tbl1_q,neon_permute_q"))
   "falkor_vxvy+falkor_vxvy")
 
 (define_insn_reservation "falkor_am_2_vxvy" 2
-- 
2.17.1



Re: [PATCH] Make strlen range computations more conservative

2018-08-02 Thread Martin Sebor

As an alternate approach I have been thinking about, if
there is a strong feeling that allowing strlen to iterate
past the subobject boundary is necessary (I don't believe
it is.)

Rather than indiscriminately expanding the provenance of
the subobject regardless of what members follow it in
the enclosing structure, only consider doing that if
the next member is an array of the same type.  E.g.,

  struct S { char a[4], b[3], c[2], d; };
  extern struct S *p;

  strlen (p->a);   // consider p->a's bounds to be char[9]

I.e., treat p->a, p->b, and p->c as one array but exclude
from it d because it's not an array.

(This wouldn't solve the warning problem below -- a separate
computation would still be necessary to determine the tighter
bound of the member itself.)

On 08/02/2018 09:42 AM, Martin Sebor wrote:

On 08/02/2018 04:22 AM, Bernd Edlinger wrote:

Hi,

this is an update of the patch to prevent unsafe optimizations due to
strlen range assuming
always zero-terminated char arrays.

Since the previous version I do no longer try to locate the outermost
char array,
but just bail out if there is a typecast, that means, supposed we have
a 2-dimensional
array, char a[x][y], strlen (s.a[x]) may assume a[x] is
zero-terminated, if the optimization
is enabled.  strlen ((char*)s.a) involves a type cast and should
assume nothing.

Additionally due to the discussion, I came to the conclusion that this
strlen range optimization
should only be used when enabled with
-fassume-zero-terminated-char-arrays or -Ofast.

Note that I included the test case with the removed assertion as
gcc/testsuite/gcc.dg/strlenopt-57.c

Initially it would only miscompile with -Ofast, but with r263018 aka
PR tree-optimization/86043, PR tree-optimization/86042 it was
miscompiled with -O3 as well.

I located the reason in gcc/tree-ssa-strlen.c (get_min_string_length)
which did not
check if the string constant is in fact zero terminated:

@@ -3192,7 +3194,9 @@ get_min_string_length (tree rhs, bool *full_string
   && TREE_READONLY (rhs))
 rhs = DECL_INITIAL (rhs);

-  if (rhs && TREE_CODE (rhs) == STRING_CST)
+  if (rhs && TREE_CODE (rhs) == STRING_CST
+  && compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (rhs)),
+  TREE_STRING_LENGTH (rhs)) >= 0)
 {
   *full_string_p = true;
   return strlen (TREE_STRING_POINTER (rhs));

Fortunately this also shows a way how to narrow strlen return value
expectations when
we are able to positively prove that a string must be zero terminated.



Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


The warning bits are definitely not okay by me.  The purpose
of the warnings (-W{format,sprintf}-{overflow,truncation} is
to detect buffer overflows.  When a warning doesn't have access
to string length information for dynamically created strings
(like the strlen pass does) it uses array sizes as a proxy.
This is useful both to detect possible buffer overflows and
to prevent false positives for overflows that cannot happen
in correctly written programs.

By changing the logic to not consider the bounds of the array
the warnkngs will become prone to many false positives as is
evident from your changes to the tests (you had to explicitly
enable the new option).

In effect, the patch undoes a couple of years worth of fine
tuning of the warnings to strike a balance between true and
false positives.  That's completely unacceptable to me, and,
frankly, proposals along these lines are exceedingly
disruptive.

Beyond that, I don't have a problem with adding a new option
but I continue to strongly disagree with disabling the strlen
optimization by default.  It implies that by default GCC
targets invalid code.  If there is consensus that some new
option is necessary, the right setting is on by default,
along with warnings for programs that pass non-terminated
arrays to string functions.  I have already submitted a patch
that effect for strlen and const arrays and am working on
extending it to other built-in functions and dynamically
created strings (when I have a chance between defending
my past work).

Either way, if a new option is introduced, the array bound
computation for sprintf (and all other buffer overflow
warnings, likely including -Wrestrict) will need to be
decoupled from the option so the current behavior is
preserved.

As I have also said, the area of the provenance of subobject
vs enclosing object pointers is under discussion in WG14 and
the C Object Model study group.  The informal consensus so
far is to maintain the provenance of subobjects and introduce
some mechanism to make it possible to extend the provenance
to the enclosing object.  This would be a pervasive change,
not one just limited to strings.  So if introducing some new
option at this stage is thought to be necessary this should
be taken into consideration.

Martin




Re: [PATCH] PR libstdc++/60555 std::system_category() should recognise POSIX errno values

2018-08-02 Thread David Edelsohn
On Thu, Aug 2, 2018 at 12:12 PM Jonathan Wakely  wrote:
>
> On 02/08/18 11:32 -0400, David Edelsohn wrote:
> >Jonathan
> >
> >This patch broke AIX bootstrap.
> >
> >/nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/system_error.cc: In
> >member function 'virtual std::error_condition
> >{anonymous}::system_error_category::default_error_condition(int)
> >const':
> >/nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/system_error.cc:245:7:
> >error: duplicate case value
> >   case ENOTEMPTY:
> >   ^~~~
> >/nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/system_error.cc:136:7:
> >note: previously used here
> >   case EEXIST:
> >   ^~~~
> >
> >AIX errno.h
> >
> >/*
> > * AIX returns EEXIST where 4.3BSD used ENOTEMPTY;
> > * but, the standards insist on unique errno values for each errno.
> > * A unique value is reserved for users that want to code case
> > * statements for systems that return either EEXIST or ENOTEMPTY.
> > */
> >#if defined(_ALL_SOURCE) && !defined(_LINUX_SOURCE_COMPAT)
> >#define ENOTEMPTY   EEXIST  /* Directory not empty */
> >#else   /* not _ALL_SOURCE */
> >#define ENOTEMPTY   87
> >#endif  /* _ALL_SOURCE */
>
> Strange that "linux source compat" is required for POSIX conformance.
>
> >I will add _LINUX_SOURCE_COMPAT and see how bootstrap proceeds. If it
> >works, it also will be required in the backport.
>
> Let's just workaround the maybe-not-unique values instead:
>
> --- a/libstdc++-v3/src/c++11/system_error.cc
> +++ b/libstdc++-v3/src/c++11/system_error.cc
> @@ -241,7 +241,8 @@ namespace
>  #ifdef ENOTDIR
>case ENOTDIR:
>  #endif
> -#ifdef ENOTEMPTY
> +#if defined ENOTEMPTY && (!defined EEXIST || ENOTEMPTY != EEXIST)
> +  // AIX sometimes uses the same value for EEXIST and ENOTEMPTY
>case ENOTEMPTY:
>  #endif
>  #ifdef ENOTRECOVERABLE

Whatever you prefer.  _LINUX_SOURCE_COMPAT seems to be working for
bootstrap so far.

Thanks, David


[PATCH] v2: Formatted printing for dump_* in the middle-end

2018-08-02 Thread David Malcolm
On Tue, 2018-07-31 at 19:56 +, Joseph Myers wrote:
> On Tue, 31 Jul 2018, David Malcolm wrote:
> 
> > I didn't exhaustively check every callsite to the changed calls;
> > I'm
> > assuming that -Wformat during bootstrap has effectively checked
> > that
> > for me.  Though now I think about it, I note that we use
> > HOST_WIDE_INT_PRINT_DEC in many places: is this guaranteed to be a
> > valid input to pp_format on all of our configurations?
> 
> HOST_WIDE_INT_PRINT_DEC should not be considered safe with pp_format 
> (although since r197049 may have effectively stopped using %I64 on
> MinGW 
> hosts, I'm not sure if there are current cases where it won't
> work).  
> Rather, it is the job of pp_format to map the 'w' length specifier
> to 
> HOST_WIDE_INT_PRINT_DEC etc.
> 
> I think it clearly makes for cleaner code to limit use of 
> HOST_WIDE_INT_PRINT_* to as few places as possible and to prefer use
> of 
> internal printf-like functions that accept formats such as %wd where 
> possible.

Thanks.

I grepped the tree for every use of HOST_WIDE_INT_PRINT* and found
all that were within dump_printf[_loc] calls.  All such uses were
within files of the form "gcc/tree-vect*.c".

Here's an updated version of the patch (v2) which fixes those
callsites to use "%wd" or "%wu"; the dumpfile.c changes are as
per v1.

Changed in v2:
* rebased to after r263239 (which also touched c-format.c/h)
* avoid HOST_WIDE_INT_PRINT* within dump_printf* calls (in
  gcc/tree-vect*.c)

I didn't add test coverage for %wd and %wu, as pretty-print.c already
has selftest coverage for these.

Richard's review of the v1 patch was:
"The patch is OK if C family maintainers agree on their parts."
so I'm looking for review of:
(a) the c-format.c changes (Joseph?), and
(b) the tree-vect*.c changes (Richard?  Joseph?)

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

Is this patch OK for trunk?

Thanks
Dave


Blurb from v1, for reference:

This patch converts dump_print and dump_printf_loc from using
printf (and thus ATTRIBUTE_PRINTF) to using a new pretty-printer
based on pp_format, which supports formatting middle-end types.

In particular, the following codes are implemented (in addition
to the standard pretty_printer ones):

   %E: gimple *:
   Equivalent to: dump_gimple_expr (MSG_*, TDF_SLIM, stmt, 0)
   %G: gimple *:
   Equivalent to: dump_gimple_stmt (MSG_*, TDF_SLIM, stmt, 0)
   %T: tree:
   Equivalent to: dump_generic_expr (MSG_*, arg, TDF_SLIM).

Hence it becomes possible to convert e.g.:

  if (dump_enabled_p ())
{
  dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
   "not vectorized: different sized vector "
   "types in statement, ");
  dump_generic_expr (MSG_MISSED_OPTIMIZATION, TDF_SLIM, vectype);
  dump_printf (MSG_MISSED_OPTIMIZATION, " and ");
  dump_generic_expr (MSG_MISSED_OPTIMIZATION, TDF_SLIM, nunits_vectype);
  dump_printf (MSG_MISSED_OPTIMIZATION, "\n");
}

into a one-liner:

  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 "not vectorized: different sized vector "
 "types in statement, %T and %T\n",
 vectype, nunits_vectype);

Unlike regular pretty-printers, this one captures optinfo_item
instances for the formatted chunks as appropriate, so that when
written out to a JSON optimization record, the relevant parts of
the message are labelled by type, and by source location (so that
e.g. %G is entirely equivalent to using dump_gimple_stmt).

dump_printf and dump_printf_loc become marked with
ATTRIBUTE_GCC_DUMP_PRINTF, which the patch also implements.

gcc/c-family/ChangeLog:
* c-format.c (enum format_type): Add gcc_dump_printf_format_type.
(gcc_dump_printf_length_specs): New.
(gcc_dump_printf_flag_pairs): New.
(gcc_dump_printf_flag_specs): New.
(gcc_dump_printf_char_table): New.
(format_types_orig): Add entry for "gcc_dump_printf".
(init_dynamic_diag_info): Set up length_char_specs and
conversion_specs for gcc_dump_printf_format_type.
(handle_format_attribute): Handle gcc_dump_printf_format_type.

gcc/ChangeLog:
* dump-context.h: Include "dumpfile.h".
(dump_context::dump_printf_va): Convert final param from va_list
to va_list *.  Convert from ATTRIBUTE_PRINTF to
ATTRIBUTE_GCC_DUMP_PRINTF.
(dump_context::dump_printf_loc_va): Likewise.
* dumpfile.c: Include "stringpool.h".
(make_item_for_dump_printf_va): Delete.
(make_item_for_dump_printf): Delete.
(class dump_pretty_printer): New class.
(dump_pretty_printer::dump_pretty_printer): New ctor.
(dump_pretty_printer::emit_items): New member function.
(dump_pretty_printer::emit_any_pending_textual_chunks): New member
function.
(dump_pretty_printer::emit_item): New member function.
(dump_pretty

Re: Async I/O patch with compilation fix

2018-08-02 Thread Nicolas Koenig
On Thu, Aug 02, 2018 at 05:42:46PM +0200, Christophe Lyon wrote:
> On Thu, 2 Aug 2018 at 13:35, Nicolas Koenig  wrote:
> >
> >
> > Hello everyone,
> >
> > Here is an updated version of the patch that hopefully fixes the compilation
> > problems by disabling async I/O if conditions are not supported by the 
> > target.
> >
> > I would appreciate if people could test it on systems on which it failed
> > before. As for the array_constructor_8.f90 failure reported in the PR, why
> > it fails is beyond me, it doesn't even use I/O. Maybe/Probably something
> > unrelated?
> >
> 
> Hi,
> I'm probably missing something obvious, but after applying this patch
> on top of r263136, the builds fail while building libgfortran:
> /tmp/9271913_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgfortran/runtime/error.c:28:10:
> fatal error: async.h: No such file or directory
>  #include "async.h"
>   ^
> compilation terminated.
> make[3]: *** [error.lo] Error 1
> 

Hi,

It wasn't you who missed something obvious. Typing `svn add` is hard.
Here is a version of the patch with the two new files.

Nicolas

> > Nicolas
> >
> >
> > 2018-08-02  Nicolas Koenig  
> > Thomas Koenig 
> >
> > PR fortran/25829
> > * gfortran.texi: Add description of asynchronous I/O.
> > * trans-decl.c (gfc_finish_var_decl): Treat asynchronous variables
> > as volatile.
> > * trans-io.c (gfc_build_io_library_fndecls): Rename st_wait to
> > st_wait_async and change argument spec from ".X" to ".w".
> > (gfc_trans_wait): Pass ID argument via reference.
> >
> > 2018-08-02  Nicolas Koenig  
> > Thomas Koenig 
> >
> > PR fortran/25829
> > * gfortran.dg/f2003_inquire_1.f03: Add write statement.
> > * gfortran.dg/f2003_io_1.f03: Add wait statement.
> >
> > 2018-08-02  Nicolas Koenig  
> > Thomas Koenig 
> >
> > PR fortran/25829
> > * Makefile.am: Add async.c to gfor_io_src.
> > Add async.h to gfor_io_headers.
> > * Makefile.in: Regenerated.
> > * gfortran.map: Add _gfortran_st_wait_async.
> > * io/async.c: New file.
> > * io/async.h: New file.
> > * io/close.c: Include async.h.
> > (st_close): Call async_wait for an asynchronous unit.
> > * io/file_pos.c (st_backspace): Likewise.
> > (st_endfile): Likewise.
> > (st_rewind): Likewise.
> > (st_flush): Likewise.
> > * io/inquire.c: Add handling for asynchronous PENDING
> > and ID arguments.
> > * io/io.h (st_parameter_dt): Add async bit.
> > (st_parameter_wait): Correct.
> > (gfc_unit): Add au pointer.
> > (st_wait_async): Add prototype.
> > (transfer_array_inner): Likewise.
> > (st_write_done_worker): Likewise.
> > * io/open.c: Include async.h.
> > (new_unit): Initialize asynchronous unit.
> > * io/transfer.c (async_opt): New struct.
> > (wrap_scalar_transfer): New function.
> > (transfer_integer): Call wrap_scalar_transfer to do the work.
> > (transfer_real): Likewise.
> > (transfer_real_write): Likewise.
> > (transfer_character): Likewise.
> > (transfer_character_wide): Likewise.
> > (transfer_complex): Likewise.
> > (transfer_array_inner): New function.
> > (transfer_array): Call transfer_array_inner.
> > (transfer_derived): Call wrap_scalar_transfer.
> > (data_transfer_init): Check for asynchronous I/O.
> > Perform a wait operation on any pending asynchronous I/O
> > if the data transfer is synchronous. Copy PDT and enqueue
> > thread for data transfer.
> > (st_read_done_worker): New function.
> > (st_read_done): Enqueue transfer or call st_read_done_worker.
> > (st_write_done_worker): New function.
> > (st_write_done): Enqueue transfer or call st_read_done_worker.
> > (st_wait): Document as no-op for compatibility reasons.
> > (st_wait_async): New function.
> > * io/unit.c (insert_unit): Use macros LOCK, UNLOCK and TRYLOCK;
> > add NOTE where necessary.
> > (get_gfc_unit): Likewise.
> > (init_units): Likewise.
> > (close_unit_1): Likewise. Call async_close if asynchronous.
> > (close_unit): Use macros LOCK and UNLOCK.
> > (finish_last_advance_record): Likewise.
> > (newunit_alloc): Likewise.
> > * io/unix.c (find_file): Likewise.
> > (flush_all_units_1): Likewise.
> > (flush_all_units): Likewise.
> > * libgfortran.h (generate_error_common): Add prototype.
> > * runtime/error.c: Include io.h and async.h.
> > (generate_error_common): New function.
> >
> > 2018-08-02  Nicolas Koenig  
> > Thomas Koenig 
> >
> > PR fortran/25829
> > * testsuite/libgomp.fortran/async_io_1.f90: New test.
> > * test

Re: Fix invalid cc_status after [const_][us]mulsi3_highpart

2018-08-02 Thread Andreas Schwab
On Jul 17 2018, Andreas Schwab  wrote:

> The mulu.l insn sets the CC according to the 64-bit result, but we are
> only using the high part, so the Z flag cannot be used.  The other flags
> would still be valid, but we have no cc_status flag for that case.
>
>   * config/m68k/m68k.md (umulsi3_highpart+1, const_umulsi3_highpart)
>   (smulsi3_highpart+1, const_smulsi3_highpart): Add CC_STATUS_INIT.
>
> testsuite/:
>   * gcc.target/m68k/mulsi_highpart.c: New test.

Backported to gcc-8 branch to fix PR target/86820.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


[committed] Fix target/86784

2018-08-02 Thread Jeff Law

I'm not aware of any H8 part that does speculation that would cause
problems here.

Committed to the trunk.

Jeff
commit a56e14aaf64f3726a3e2114dc76db74925a4b078
Author: law 
Date:   Thu Aug 2 17:24:59 2018 +

PR target/86784
* config/h8300/h8300.c (TARGET_HAVE_SPECULATION_SAFE_VALUE):
Define to speculation_safe_value_not_needed.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@263270 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ef198bdc209..06db972e5d6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2018-08-02  Jeff Law  
+
+   PR target/86784
+   * config/h8300/h8300.c (TARGET_HAVE_SPECULATION_SAFE_VALUE):
+   Define to speculation_safe_value_not_needed.
+
 2018-08-02  Tom de Vries  
 
PR target/86660
diff --git a/gcc/config/h8300/h8300.c b/gcc/config/h8300/h8300.c
index 01c765dbc09..596f2fd2cda 100644
--- a/gcc/config/h8300/h8300.c
+++ b/gcc/config/h8300/h8300.c
@@ -6148,4 +6148,7 @@ h8300_push_rounding (poly_int64 bytes)
 #undef TARGET_MODE_DEPENDENT_ADDRESS_P
 #define TARGET_MODE_DEPENDENT_ADDRESS_P h8300_mode_dependent_address_p
 
+#undef TARGET_HAVE_SPECULATION_SAFE_VALUE
+#define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed
+
 struct gcc_target targetm = TARGET_INITIALIZER;


[PATCH][GCC] Correct name of file in ChangeLog

2018-08-02 Thread matthew . malcomson
My first patch included an incorrect ChangeLog entry -- the filename was
misspelt. This corrects it.diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 
76f935a48adbcbab2ec765372f48a4599ad85966..6e56ba2d2f081b5e755b2ff834c7878a1268a3ba
 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -18,7 +18,7 @@
 
 2018-07-24  Matthew Malcomson  
 
-   * gcc.target/aarch64/vect-su-add-sub.c: New.
+   * gcc.target/aarch64/simd/vect_su_add_sub.c: New.
 
 2018-07-24  Jakub Jelinek  
 



[PATCH][GCC][AARCH64] Use stdint integers in vect_su_add_sub.c

2018-08-02 Thread Matthew Malcomson
Fixing the ilp32 issue that Christophe found.

The existing testcase uses `long` to represent a 64 bit integer.
This breaks when compiled using the `-mabi=ilp32` flag.
We switch the use of int/long for int32_t/int64_t to avoid this problem
and show the requirement of a widening operation more clearly.

Ok for trunk?

gcc/testsuite/Changelog
2018-07-31  Matthew Malcomson  

* gcc.target/aarch64/simd/vect_su_add_sub.c: Use stdint types

diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vect_su_add_sub.c 
b/gcc/testsuite/gcc.target/aarch64/simd/vect_su_add_sub.c
index 
338da54f6281c90e1c6b1c59fa50d9b719005c77..e3a2d0175e75b20e309c9b734b747512e466fbb1
 100644
--- a/gcc/testsuite/gcc.target/aarch64/simd/vect_su_add_sub.c
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vect_su_add_sub.c
@@ -1,21 +1,24 @@
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#include 
+#include 
+
 /* Ensure we use the signed/unsigned extend vectorized add and sub
instructions.  */
 #define N 1024
 
-int a[N];
-long c[N];
-long d[N];
-unsigned int ua[N];
-unsigned long uc[N];
-unsigned long ud[N];
+int32_t a[N];
+int64_t c[N];
+int64_t d[N];
+uint32_t ua[N];
+uint64_t uc[N];
+uint64_t ud[N];
 
 void
 add ()
 {
-  for (int i = 0; i < N; i++)
+  for (size_t i = 0; i < N; i++)
 d[i] = a[i] + c[i];
 }
 /* { dg-final { scan-assembler-times "\[ \t\]saddw2\[ \t\]+" 1 } } */
@@ -24,7 +27,7 @@ add ()
 void
 subtract ()
 {
-  for (int i = 0; i < N; i++)
+  for (size_t i = 0; i < N; i++)
 d[i] = c[i] - a[i];
 }
 /* { dg-final { scan-assembler-times "\[ \t\]ssubw2\[ \t\]+" 1 } } */
@@ -33,7 +36,7 @@ subtract ()
 void
 uadd ()
 {
-  for (int i = 0; i < N; i++)
+  for (size_t i = 0; i < N; i++)
 ud[i] = ua[i] + uc[i];
 }
 /* { dg-final { scan-assembler-times "\[ \t\]uaddw2\[ \t\]+" 1 } } */
@@ -42,7 +45,7 @@ uadd ()
 void
 usubtract ()
 {
-  for (int i = 0; i < N; i++)
+  for (size_t i = 0; i < N; i++)
 ud[i] = uc[i] - ua[i];
 }
 /* { dg-final { scan-assembler-times "\[ \t\]usubw2\[ \t\]+" 1 } } */



[PING 2][PATCH] [v4][aarch64] Avoid tag collisions for loads falkor

2018-08-02 Thread Siddhesh Poyarekar

Ping!

Siddhesh

On 07/24/2018 12:37 PM, Siddhesh Poyarekar wrote:

Hi,

This is a rewrite of the tag collision avoidance patch that Kugan had
written as a machine reorg pass back in February.

The falkor hardware prefetching system uses a combination of the
source, destination and offset to decide which prefetcher unit to
train with the load.  This is great when loads in a loop are
sequential but sub-optimal if there are unrelated loads in a loop that
tag to the same prefetcher unit.

This pass attempts to rename the desination register of such colliding
loads using routines available in regrename.c so that their tags do
not collide.  This shows some performance gains with mcf and xalancbmk
(~5% each) and will be tweaked further.  The pass is placed near the
fag end of the pass list so that subsequent passes don't inadvertantly
end up undoing the renames.

A full gcc bootstrap and testsuite ran successfully on aarch64, i.e. it
did not introduce any new regressions.  I also did a make-check with
-mcpu=falkor to ensure that there were no regressions.  The couple of
regressions I found were target-specific and were related to scheduling
and cost differences and are not correctness issues.

Changes from v3:
- Avoid renaming argument/return registers and registers that have a
   specific architectural meaning, i.e. stack pointer, frame pointer,
   etc.  Try renaming their aliases instead.

Changes from v2:
- Ignore SVE instead of asserting that falkor does not support sve

Changes from v1:

- Fixed up issues pointed out by Kyrill
- Avoid renaming R0/V0 since they could be return values
- Fixed minor formatting issues.

2018-07-02  Siddhesh Poyarekar  
Kugan Vivekanandarajah  

* config/aarch64/falkor-tag-collision-avoidance.c: New file.
* config.gcc (extra_objs): Build it.
* config/aarch64/t-aarch64 (falkor-tag-collision-avoidance.o):
Likewise.
* config/aarch64/aarch64-passes.def
(pass_tag_collision_avoidance): New pass.
* config/aarch64/aarch64.c (qdf24xx_tunings): Add
AARCH64_EXTRA_TUNE_RENAME_LOAD_REGS to tuning_flags.
(aarch64_classify_address): Remove static qualifier.
(aarch64_address_info, aarch64_address_type): Move to...
* config/aarch64/aarch64-protos.h: ... here.
(make_pass_tag_collision_avoidance): New function.
* config/aarch64/aarch64-tuning-flags.def (rename_load_regs):
New tuning flag.

CC: james.greenha...@arm.com
CC: kyrylo.tkac...@foss.arm.com
---
  gcc/config.gcc|   2 +-
  gcc/config/aarch64/aarch64-passes.def |   1 +
  gcc/config/aarch64/aarch64-protos.h   |  49 +
  gcc/config/aarch64/aarch64-tuning-flags.def   |   2 +
  gcc/config/aarch64/aarch64.c  |  48 +-
  .../aarch64/falkor-tag-collision-avoidance.c  | 881 ++
  gcc/config/aarch64/t-aarch64  |   9 +
  7 files changed, 946 insertions(+), 46 deletions(-)
  create mode 100644 gcc/config/aarch64/falkor-tag-collision-avoidance.c

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 78e84c2b864..8f5e458e8a6 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -304,7 +304,7 @@ aarch64*-*-*)
extra_headers="arm_fp16.h arm_neon.h arm_acle.h"
c_target_objs="aarch64-c.o"
cxx_target_objs="aarch64-c.o"
-   extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o"
+   extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o 
falkor-tag-collision-avoidance.o"
target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.c"
target_has_targetm_common=yes
;;
diff --git a/gcc/config/aarch64/aarch64-passes.def 
b/gcc/config/aarch64/aarch64-passes.def
index 87747b420b0..f61a8870aa1 100644
--- a/gcc/config/aarch64/aarch64-passes.def
+++ b/gcc/config/aarch64/aarch64-passes.def
@@ -19,3 +19,4 @@
 .  */
  
  INSERT_PASS_AFTER (pass_regrename, 1, pass_fma_steering);

+INSERT_PASS_AFTER (pass_machine_reorg, 1, pass_tag_collision_avoidance);
diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index af5db9c5953..647ad7a9c37 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -288,6 +288,49 @@ struct tune_params
const struct cpu_prefetch_tune *prefetch;
  };
  
+/* Classifies an address.

+
+   ADDRESS_REG_IMM
+   A simple base register plus immediate offset.
+
+   ADDRESS_REG_WB
+   A base register indexed by immediate offset with writeback.
+
+   ADDRESS_REG_REG
+   A base register indexed by (optionally scaled) register.
+
+   ADDRESS_REG_UXTW
+   A base register indexed by (optionally scaled) zero-extended register.
+
+   ADDRESS_REG_SXTW
+   A base register indexed by (optionally scaled) sign-extended register.
+
+   ADDRESS_LO_SUM
+   A LO_SUM rtx with a base register and "LO12" symbol relocation.
+
+   ADDRESS_SYMBOLIC:
+

Re: [PATCH][GCC] Correct name of file in ChangeLog

2018-08-02 Thread Sudakshina Das

Hi Matthew

On 01/08/18 10:25, matthew.malcom...@arm.com wrote:

My first patch included an incorrect ChangeLog entry -- the filename was
misspelt. This corrects it.



I think this counts as an obvious change. I have committed this on your 
behalf.


Thanks
Sudi


Re: [PATCH] Backport gettext fixes to get rid of warnings on macOS

2018-08-02 Thread Tom Tromey
> "Simon" == Simon Marchi  writes:

Simon> intl/ChangeLog:

Simon>  * libgnuintl.h (_INTL_MAY_RETURN_STRING_ARG, gettext, dgettext,
Simon>  dcgettext, ngettext, dngettext, dcngettext): Backport changes
Simon>  from upstream gettext.

Thanks, I think you should check this in.

Tom


[PATCH][GCC][AARCH64] Use STLUR for atomic_store

2018-08-02 Thread matthew . malcomson
Use the STLUR instruction introduced in Armv8.4-a.
This insruction has the store-release semantic like STLR but can take a
9-bit unscaled signed immediate offset.

Example test case:
```
void
foo ()
{
int32_t *atomic_vals = calloc (4, sizeof (int32_t));
atomic_store_explicit (atomic_vals + 1, 2, memory_order_release);
}
```

Before patch generates
```
foo:
stp x29, x30, [sp, -16]!
mov x1, 4
mov x0, x1
mov x29, sp
bl  calloc
mov w1, 2
add x0, x0, 4
stlrw1, [x0]
ldp x29, x30, [sp], 16
ret
```

After patch generates
```
foo:
stp x29, x30, [sp, -16]!
mov x1, 4
mov x0, x1
mov x29, sp
bl  calloc
mov w1, 2
stlur   w1, [x0, 4]
ldp x29, x30, [sp], 16
ret
```

Full bootstrap and regression test done on aarch64.

Ok for trunk?

gcc/
2018-07-26  Matthew Malcomson  

* config/aarch64/aarch64-protos.h
(aarch64_offset_9bit_signed_unscaled_p): New declaration.
* config/aarch64/aarch64.c
(aarch64_offset_9bit_signed_unscaled_p): Rename from
offset_9bit_signed_unscaled_p.
* config/aarch64/aarch64.h (TARGET_ARMV8_4): Add feature macro.
* config/aarch64/atomics.md (atomic_store): Allow offset
and use stlur.
* config/aarch64/constraints.md (Ust): New constraint.
* config/aarch64/predicates.md.
(aarch64_sync_or_stlur_memory_operand): New predicate.

gcc/testsuite/
2018-07-26  Matthew Malcomson  

* gcc.target/aarch64/atomic-store.c: New.diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
af5db9c595385f7586692258f750b6aceb3ed9c8..630a75bf776fcdc374aa9ffa4bb020fea3719320
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -393,6 +393,7 @@ void aarch64_split_add_offset (scalar_int_mode, rtx, rtx, 
rtx, rtx, rtx);
 bool aarch64_mov_operand_p (rtx, machine_mode);
 rtx aarch64_reverse_mask (machine_mode, unsigned int);
 bool aarch64_offset_7bit_signed_scaled_p (machine_mode, poly_int64);
+bool aarch64_offset_9bit_signed_unscaled_p (machine_mode, poly_int64);
 char *aarch64_output_sve_cnt_immediate (const char *, const char *, rtx);
 char *aarch64_output_sve_addvl_addpl (rtx, rtx, rtx);
 char *aarch64_output_sve_inc_dec_immediate (const char *, rtx);
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
c1218503bab19323eee1cca8b7e4bea8fbfcf573..328512e11f4230e24223bc51e55bdca8b31f6a20
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -237,6 +237,9 @@ extern unsigned aarch64_architecture_version;
 /* ARMv8.3-A features.  */
 #define TARGET_ARMV8_3 (AARCH64_ISA_V8_3)
 
+/* ARMv8.4-A features.  */
+#define TARGET_ARMV8_4 (AARCH64_ISA_V8_4)
+
 /* Make sure this is always defined so we don't have to check for ifdefs
but rather use normal ifs.  */
 #ifndef TARGET_FIX_ERR_A53_835769_DEFAULT
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
fa01475aa9ee579b6a3b2526295b622157120660..8560150d552b4f830335ccd420a0749f01d4bb6a
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4546,8 +4546,8 @@ aarch64_offset_7bit_signed_scaled_p (machine_mode mode, 
poly_int64 offset)
 
 /* Return true if OFFSET is a signed 9-bit value.  */
 
-static inline bool
-offset_9bit_signed_unscaled_p (machine_mode mode ATTRIBUTE_UNUSED,
+bool
+aarch64_offset_9bit_signed_unscaled_p (machine_mode mode ATTRIBUTE_UNUSED,
   poly_int64 offset)
 {
   HOST_WIDE_INT const_offset;
@@ -5823,7 +5823,7 @@ aarch64_classify_address (struct aarch64_address_info 
*info,
 instruction memory accesses.  */
  if (mode == TImode || mode == TFmode)
return (aarch64_offset_7bit_signed_scaled_p (DImode, offset)
-   && (offset_9bit_signed_unscaled_p (mode, offset)
+   && (aarch64_offset_9bit_signed_unscaled_p (mode, offset)
|| offset_12bit_unsigned_scaled_p (mode, offset)));
 
  /* A 7bit offset check because OImode will emit a ldp/stp
@@ -5837,7 +5837,7 @@ aarch64_classify_address (struct aarch64_address_info 
*info,
 ldr/str instructions (only big endian will get here).  */
  if (mode == CImode)
return (aarch64_offset_7bit_signed_scaled_p (TImode, offset)
-   && (offset_9bit_signed_unscaled_p (V16QImode, offset + 32)
+   && (aarch64_offset_9bit_signed_unscaled_p (V16QImode, 
offset + 32)
|| offset_12bit_unsigned_scaled_p (V16QImode,
   offset + 32)));
 
@@ -5877,7 +5877,7 @@ aarch64_classify_address (struct aarch64_address_info 
*info,
 || known_eq (GET_MODE_SIZE (mode), 16))
   

[committed] Fix PR target/86790 (m68k)

2018-08-02 Thread Jeff Law

While m68k does have branch prediction, it's a dead architecture, does
not have out of order properties and AFAIK doesn't have a suitably
accurate timer.  Thus, it doesn't seem worth trying to do any spectre v1
mitigations.

Committed to the trunk.

Jeff
commit 78095ef84bbe24cd371a4a3baeaf83944efac6a2
Author: law 
Date:   Thu Aug 2 17:50:16 2018 +

PR target/86790
* config/m68k/m68k.c (TARGET_HAVE_SPECULATION_SAFE_VALUE):
Define to speculation_safe_value_not_needed.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@263272 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 06db972e5d6..762e14ef224 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,9 @@
 2018-08-02  Jeff Law  
 
+   PR target/86790
+   * config/m68k/m68k.c (TARGET_HAVE_SPECULATION_SAFE_VALUE):
+   Define to speculation_safe_value_not_needed.
+
PR target/86784
* config/h8300/h8300.c (TARGET_HAVE_SPECULATION_SAFE_VALUE):
Define to speculation_safe_value_not_needed.
diff --git a/gcc/config/m68k/m68k.c b/gcc/config/m68k/m68k.c
index ef8604ebe3d..75a5a5b69b9 100644
--- a/gcc/config/m68k/m68k.c
+++ b/gcc/config/m68k/m68k.c
@@ -353,6 +353,9 @@ static machine_mode m68k_promote_function_mode (const_tree, 
machine_mode,
 #undef TARGET_PROMOTE_FUNCTION_MODE
 #define TARGET_PROMOTE_FUNCTION_MODE m68k_promote_function_mode
 
+#undef  TARGET_HAVE_SPECULATION_SAFE_VALUE
+#define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed
+
 static const struct attribute_spec m68k_attribute_table[] =
 {
   /* { name, min_len, max_len, decl_req, type_req, fn_type_req,


Re: Move all wide_int_range* functions into wide-int-range.[ch]

2018-08-02 Thread Jeff Law
On 08/01/2018 03:19 PM, David Malcolm wrote:
> On Wed, 2018-08-01 at 15:39 -0400, Aldy Hernandez wrote:
>> This is actually an obvious patch, but I'm not committing it just in 
>> case you'd prefer another name for the files.
> 
> BTW, is it our policy that new gcc C++ source files should have a .cc
> extension?
I believe so.
jeff


Re: [PATCH] Make strlen range computations more conservative

2018-08-02 Thread Bernd Edlinger
On 08/02/18 19:00, Martin Sebor wrote:
> As an alternate approach I have been thinking about, if
> there is a strong feeling that allowing strlen to iterate
> past the subobject boundary is necessary (I don't believe
> it is.)
> 
> Rather than indiscriminately expanding the provenance of
> the subobject regardless of what members follow it in
> the enclosing structure, only consider doing that if
> the next member is an array of the same type.  E.g.,
> 
>    struct S { char a[4], b[3], c[2], d; };
>    extern struct S *p;
> 
>    strlen (p->a);   // consider p->a's bounds to be char[9]
> 

No, initially I thought in the same direction,
but looking at the way how X-server is broken,
I realized that will probably not be sufficient.

Maybe it would be good to have one set of optimistic range infos
that follow the standards, and can be used to control the warnings,
and another set of pessimistic range infos, that control optimizations.

Consider
extern char garbage[10];
char x[10];
memcpy(x, garbage, 10);
x[9] = 0;

strlen(x) will be 0..9 no matter what was in garbage.
That information can be safely used for optimizations.

But if we have
extern char garbage[10];
char x[10];
memcpy(x, garbage, 10);
char *y = x;
if (strlen(y) < 10) <= may not be removed pessimistically

char z[10];
sprintf(z, "%s", y); <= omitting warning would be okay optimistically.

It is not really easy to do but possible.

In the moment I would like to concentrate exclusively on wrong code issues
and not new warnings, even some regressions on the warnings look
acceptable to me.


Bernd.

> I.e., treat p->a, p->b, and p->c as one array but exclude
> from it d because it's not an array.
> 
> (This wouldn't solve the warning problem below -- a separate
> computation would still be necessary to determine the tighter
> bound of the member itself.)
> 
> On 08/02/2018 09:42 AM, Martin Sebor wrote:
>> On 08/02/2018 04:22 AM, Bernd Edlinger wrote:
>>> Hi,
>>>
>>> this is an update of the patch to prevent unsafe optimizations due to
>>> strlen range assuming
>>> always zero-terminated char arrays.
>>>
>>> Since the previous version I do no longer try to locate the outermost
>>> char array,
>>> but just bail out if there is a typecast, that means, supposed we have
>>> a 2-dimensional
>>> array, char a[x][y], strlen (s.a[x]) may assume a[x] is
>>> zero-terminated, if the optimization
>>> is enabled.  strlen ((char*)s.a) involves a type cast and should
>>> assume nothing.
>>>
>>> Additionally due to the discussion, I came to the conclusion that this
>>> strlen range optimization
>>> should only be used when enabled with
>>> -fassume-zero-terminated-char-arrays or -Ofast.
>>>
>>> Note that I included the test case with the removed assertion as
>>> gcc/testsuite/gcc.dg/strlenopt-57.c
>>>
>>> Initially it would only miscompile with -Ofast, but with r263018 aka
>>> PR tree-optimization/86043, PR tree-optimization/86042 it was
>>> miscompiled with -O3 as well.
>>>
>>> I located the reason in gcc/tree-ssa-strlen.c (get_min_string_length)
>>> which did not
>>> check if the string constant is in fact zero terminated:
>>>
>>> @@ -3192,7 +3194,9 @@ get_min_string_length (tree rhs, bool *full_string
>>>    && TREE_READONLY (rhs))
>>>  rhs = DECL_INITIAL (rhs);
>>>
>>> -  if (rhs && TREE_CODE (rhs) == STRING_CST)
>>> +  if (rhs && TREE_CODE (rhs) == STRING_CST
>>> +  && compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (rhs)),
>>> +  TREE_STRING_LENGTH (rhs)) >= 0)
>>>  {
>>>    *full_string_p = true;
>>>    return strlen (TREE_STRING_POINTER (rhs));
>>>
>>> Fortunately this also shows a way how to narrow strlen return value
>>> expectations when
>>> we are able to positively prove that a string must be zero terminated.
>>>
>>>
>>>
>>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>>> Is it OK for trunk?
>>
>> The warning bits are definitely not okay by me.  The purpose
>> of the warnings (-W{format,sprintf}-{overflow,truncation} is
>> to detect buffer overflows.  When a warning doesn't have access
>> to string length information for dynamically created strings
>> (like the strlen pass does) it uses array sizes as a proxy.
>> This is useful both to detect possible buffer overflows and
>> to prevent false positives for overflows that cannot happen
>> in correctly written programs.
>>
>> By changing the logic to not consider the bounds of the array
>> the warnkngs will become prone to many false positives as is
>> evident from your changes to the tests (you had to explicitly
>> enable the new option).
>>
>> In effect, the patch undoes a couple of years worth of fine
>> tuning of the warnings to strike a balance between true and
>> false positives.  That's completely unacceptable to me, and,
>> frankly, proposals along these lines are exceedingly
>> disruptive.
>>
>> Beyond that, I don't have a problem with adding a new option
>> but I continue to strongly disagree with disabling the strlen
>> optimization by default.  

Re: [PATCH] Make strlen range computations more conservative

2018-08-02 Thread Jakub Jelinek
On Thu, Aug 02, 2018 at 11:00:32AM -0600, Martin Sebor wrote:
> As an alternate approach I have been thinking about, if
> there is a strong feeling that allowing strlen to iterate
> past the subobject boundary is necessary (I don't believe
> it is.)
> 
> Rather than indiscriminately expanding the provenance of
> the subobject regardless of what members follow it in
> the enclosing structure, only consider doing that if
> the next member is an array of the same type.  E.g.,
> 
>   struct S { char a[4], b[3], c[2], d; };
>   extern struct S *p;
> 
>   strlen (p->a);   // consider p->a's bounds to be char[9]

See the mail with testcases where the middle-end doesn't distinguish
between p->a and (char *) p, unless you want to warn or optimize
in the FEs or extremely early in the lowering passes, that isn't going to
work.

Jakub


Re: Move all wide_int_range* functions into wide-int-range.[ch]

2018-08-02 Thread Aldy Hernandez




On 08/02/2018 02:07 PM, Jeff Law wrote:

On 08/01/2018 03:19 PM, David Malcolm wrote:

On Wed, 2018-08-01 at 15:39 -0400, Aldy Hernandez wrote:

This is actually an obvious patch, but I'm not committing it just in
case you'd prefer another name for the files.


BTW, is it our policy that new gcc C++ source files should have a .cc
extension?

I believe so.
jeff



I did not know that.  I'm happy to rename it to .cc.

OK with that change ;-)?

Aldy


[gomp5] Iterator fixes

2018-08-02 Thread Jakub Jelinek
Hi!

Another thing being voted into OpenMP 5.0 these days are clarifications for
iterators - step must be integral, and whether it is > 0 determines if it
iterates up or down, so one can use downward iterators even with unsigned
iterator type, either just use signed negative step constant, or some signed
negative expression.  Wrap-around, both signed and unsigned, of the iterator
is unspecified behavior, which simplifies things.

Tested on x86_64-linux, committed to gomp-5_0-branch.

2018-08-02  Jakub Jelinek  

* gimplify.c (gimplify_omp_depend): Load block from elt 5 instead
of 4, in 4 expect to find original step expression, gimplify it and
use it to determine if iterating upwards or downwards.  When iterating
downwards with unsigned iterator type, negate both the difference and
step before division.
gcc/c/
* c-parser.c (c_parser_omp_iterators): Build vector with 6 elts
instead of 5.
(c_parser_omp_clause_depend): Put block into elt 5 instead of 4.
* c-typeck.c (c_omp_finish_iterators): Remove iterator if step is
errorneous, diagnose if step doesn't have integral type.  Remember
original step expression wrapped with save_expr and store that to
elt 4.
gcc/cp/
* parser.c (cp_parser_omp_iterators): Build vector with 6 elts
instead of 5.
(cp_parser_omp_clause_depend): Put block into elt 5 instead of 4.
* semantics.c (cp_omp_finish_iterators): Remove iterator if step is
errorneous, diagnose if step doesn't have integral type.  Remember
original step expression wrapped with save_expr and store that to
elt 4.  If processing_template_decl, punt earlier if begin/end/step
are type dependent expression, and only update step to the orig_step.
* pt.c (tsubst_omp_clause_decl): Put block into elt 5 instead of 4.
gcc/testsuite/
* c-c++-common/gomp/depend-iterator-2.c (f1): Adjust expected
diagnostics, split the test with step 3.5 into one where only
begin/end are floating point, and one where only step is floating
point.
* g++.dg/gomp/depend-iterator-2.C (f1, f3): Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/depend-iterator-1.c: Add tests for
unsigned iterators, add gaps in between different arr2 bits.
* testsuite/libgomp.c++/depend-iterator-1.C: Likewise.

--- gcc/gimplify.c.jj   2018-08-01 16:36:33.391000517 +0200
+++ gcc/gimplify.c  2018-08-02 17:59:48.980047042 +0200
@@ -7572,7 +7572,9 @@ gimplify_omp_depend (tree *list_p, gimpl
   is_gimple_val, fb_rvalue) == GS_ERROR
|| gimplify_expr (&TREE_VEC_ELT (it, 2), pre_p, NULL,
  is_gimple_val, fb_rvalue) == GS_ERROR
-   || (gimplify_expr (&TREE_VEC_ELT (it, 3), pre_p, NULL,
+   || gimplify_expr (&TREE_VEC_ELT (it, 3), pre_p, NULL,
+ is_gimple_val, fb_rvalue) == GS_ERROR
+   || (gimplify_expr (&TREE_VEC_ELT (it, 4), pre_p, NULL,
   is_gimple_val, fb_rvalue)
== GS_ERROR))
  return 2;
@@ -7580,12 +7582,13 @@ gimplify_omp_depend (tree *list_p, gimpl
tree begin = TREE_VEC_ELT (it, 1);
tree end = TREE_VEC_ELT (it, 2);
tree step = TREE_VEC_ELT (it, 3);
+   tree orig_step = TREE_VEC_ELT (it, 4);
tree type = TREE_TYPE (var);
tree stype = TREE_TYPE (step);
location_t loc = DECL_SOURCE_LOCATION (var);
tree endmbegin;
/* Compute count for this iterator as
-  step > 0
+  orig_step > 0
   ? (begin < end ? (end - begin + (step - 1)) / step : 0)
   : (begin > end ? (end - begin + (step + 1)) / step : 0)
   and compute product of those for the entire depend
@@ -7608,8 +7611,14 @@ gimplify_omp_depend (tree *list_p, gimpl
   pos, step);
tree neg = fold_build2_loc (loc, PLUS_EXPR, stype,
endmbegin, stepp1);
+   if (TYPE_UNSIGNED (stype))
+ {
+   neg = fold_build1_loc (loc, NEGATE_EXPR, stype, neg);
+   step = fold_build1_loc (loc, NEGATE_EXPR, stype, step);
+ }
neg = fold_build2_loc (loc, TRUNC_DIV_EXPR, stype,
   neg, step);
+   step = NULL_TREE;
tree cond = fold_build2_loc (loc, LT_EXPR,
 boolean_type_node,
 

Re: [PATCH] [aarch64][v2] Fix falkor pipeline description for dup

2018-08-02 Thread James Greenhalgh
On Thu, Aug 02, 2018 at 11:58:37AM -0500, Siddhesh Poyarekar wrote:
> There was a typo in the pipeline description where DUP was assigned to
> the vector pipes for quad mode ops when it really only uses the VTOG
> pipes.  Fixing this does not show any noticeable difference in
> performance (there's a very small bump of 1.7% in x264 but that's
> about it) in my tests but is the more precise description of operations
> for falkor.
> 
> Bootstrapped and tested with --with-cpu=falkor to confirm that there
> are no regressions.

OK.

Thanks,
James

> 
> Siddhesh
> 
> Changes from v1:
> 
> - Fixed up ChangeLog entry
> 
> - Replaced falkor_gtov,falkor_gtov with falkor_gtov*2
> 
>   * config/aarch64/falkor.md (falkor_am_1_vxvy_vxvy): Move
>   neon_dup_q to...
>   (falkor_am_1_gtov_gtov): ... a new insn reservation.
> 



Re: [gomp5] Iterator fixes

2018-08-02 Thread Marek Polacek
On Thu, Aug 02, 2018 at 08:28:22PM +0200, Jakub Jelinek wrote:
> --- gcc/c/c-typeck.c.jj   2018-08-01 17:00:46.329549760 +0200
> +++ gcc/c/c-typeck.c  2018-08-02 18:57:31.969477582 +0200
> @@ -13112,6 +13112,7 @@ c_omp_finish_iterators (tree iter)
>tree begin = TREE_VEC_ELT (it, 1);
>tree end = TREE_VEC_ELT (it, 2);
>tree step = TREE_VEC_ELT (it, 3);
> +  tree orig_step;
>tree type = TREE_TYPE (var);
>location_t loc = DECL_SOURCE_LOCATION (var);
>if (type == error_mark_node)
> @@ -13138,11 +13139,24 @@ c_omp_finish_iterators (tree iter)
> ret = true;
> continue;
>   }
> -
> +  else if (step == error_mark_node
> +|| TREE_TYPE (step) == error_mark_node)

You can use error_operand_p for this.

Marek


Re: [PATCH 00/11] (v2) Mitigation against unsafe data speculation (CVE-2017-5753)

2018-08-02 Thread Jeff Law
On 07/27/2018 01:48 PM, John David Anglin wrote:
> On 2018-07-27 5:37 AM, Richard Earnshaw wrote:
>> Port Maintainers: You need to decide what action is required for your
>> port to handle speculative execution, even if that action is to use
>> the trivial no-speculation on this architecture.  You must also
>> consider whether or not a furture implementation of your architecture
>> might need to deal with this in making that decision.
> On hppa, I think we should go with the hook that assumes there is no
> speculative execution.
> 
> Nominally, there is branch prediction and speculative execution; but the
> spectre test program
> was not able to successfully access memory on my rp3440.
> 
> As far as I know, the details of speculative execution on PA-RISC are
> not public.  Jeff would know
It's been eons.   I think there's enough building blocks on the PA to
mount a spectre v1 attack.  They've got branch prediction with varying
degress of speculative execution, caches and user accessable cycle timers.

There's varying degrees of out of order execution all the way back in
the PA7xxx processors (hit-under-miss) to full o-o-o execution in the
PA8xxx series (including the PA8900 that's in the rp3440).

I suspect that given enough time we could figure out why the test didn't
indicate spectre v1 vulnerability on your system and twiddle it, but
given it's a dead processor, I doubt it's worth the effort.

jeff


Re: Move all wide_int_range* functions into wide-int-range.[ch]

2018-08-02 Thread Jeff Law
On 08/02/2018 12:23 PM, Aldy Hernandez wrote:
> 
> 
> On 08/02/2018 02:07 PM, Jeff Law wrote:
>> On 08/01/2018 03:19 PM, David Malcolm wrote:
>>> On Wed, 2018-08-01 at 15:39 -0400, Aldy Hernandez wrote:
 This is actually an obvious patch, but I'm not committing it just in
 case you'd prefer another name for the files.
>>>
>>> BTW, is it our policy that new gcc C++ source files should have a .cc
>>> extension?
>> I believe so.
>> jeff
>>
> 
> I did not know that.  I'm happy to rename it to .cc.
> 
> OK with that change ;-)?
If it's just moving code around, yes, OK.

Jeff



Re: Move all wide_int_range* functions into wide-int-range.[ch]

2018-08-02 Thread Aldy Hernandez



On 08/02/2018 02:23 PM, Aldy Hernandez wrote:



On 08/02/2018 02:07 PM, Jeff Law wrote:

On 08/01/2018 03:19 PM, David Malcolm wrote:

On Wed, 2018-08-01 at 15:39 -0400, Aldy Hernandez wrote:

This is actually an obvious patch, but I'm not committing it just in
case you'd prefer another name for the files.


BTW, is it our policy that new gcc C++ source files should have a .cc
extension?

I believe so.
jeff



I did not know that.  I'm happy to rename it to .cc.

OK with that change ;-)?


Like this, btw.

As promised, I also moved wide_int_range_shift_undefined_p() into an 
inlined function in the header file.


Aldy
gcc/

	* Makefile.in (wide-int-range.o): New.
	* tree-vrp.c: Move all the wide_int_* functions to...
	* wide-int-range.cc: ...here.
	* tree-vrp.h: Move all the wide_int_* prototypes to...
	* wide-int-range.h: ...here.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b8716406533..e7d818d174c 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1601,6 +1601,7 @@ OBJS = \
 	web.o \
 	wide-int.o \
 	wide-int-print.o \
+	wide-int-range.o \
 	xcoffout.o \
 	$(out_object_file) \
 	$(EXTRA_OBJS) \
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d18c72d0a02..e1875d8d46e 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -67,6 +67,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "vr-values.h"
 #include "builtins.h"
+#include "wide-int-range.h"
 
 /* Set of SSA names found live during the RPO traversal of the function
for still active basic-blocks.  */
@@ -956,98 +957,7 @@ value_range_constant_singleton (value_range *vr)
   return NULL_TREE;
 }
 
-/* Wrapper around wide_int_binop that adjusts for overflow.
-
-   Return true if we can compute the result; i.e. if the operation
-   doesn't overflow or if the overflow is undefined.  In the latter
-   case (if the operation overflows and overflow is undefined), then
-   adjust the result to be -INF or +INF depending on CODE, VAL1 and
-   VAL2.  Return the value in *RES.
-
-   Return false for division by zero, for which the result is
-   indeterminate.  */
-
-static bool
-wide_int_binop_overflow (wide_int &res,
-			 enum tree_code code,
-			 const wide_int &w0, const wide_int &w1,
-			 signop sign, bool overflow_undefined)
-{
-  wi::overflow_type overflow;
-  if (!wide_int_binop (res, code, w0, w1, sign, &overflow))
-return false;
-
-  /* If the operation overflowed return -INF or +INF depending on the
- operation and the combination of signs of the operands.  */
-  if (overflow && overflow_undefined)
-{
-  switch (code)
-	{
-	case MULT_EXPR:
-	  /* For multiplication, the sign of the overflow is given
-	 by the comparison of the signs of the operands.  */
-	  if (sign == UNSIGNED || w0.sign_mask () == w1.sign_mask ())
-	res = wi::max_value (w0.get_precision (), sign);
-	  else
-	res = wi::min_value (w0.get_precision (), sign);
-	  return true;
-
-	case TRUNC_DIV_EXPR:
-	case FLOOR_DIV_EXPR:
-	case CEIL_DIV_EXPR:
-	case EXACT_DIV_EXPR:
-	case ROUND_DIV_EXPR:
-	  /* For division, the only case is -INF / -1 = +INF.  */
-	  res = wi::max_value (w0.get_precision (), sign);
-	  return true;
-
-	default:
-	  gcc_unreachable ();
-	}
-}
-  return !overflow;
-}
-
-/* For range [LB, UB] compute two wide_int bit masks.
-
-   In the MAY_BE_NONZERO bit mask, if some bit is unset, it means that
-   for all numbers in the range the bit is 0, otherwise it might be 0
-   or 1.
-
-   In the MUST_BE_NONZERO bit mask, if some bit is set, it means that
-   for all numbers in the range the bit is 1, otherwise it might be 0
-   or 1.  */
-
-void
-wide_int_set_zero_nonzero_bits (signop sign,
-const wide_int &lb, const wide_int &ub,
-wide_int &may_be_nonzero,
-wide_int &must_be_nonzero)
-{
-  may_be_nonzero = wi::minus_one (lb.get_precision ());
-  must_be_nonzero = wi::zero (lb.get_precision ());
-
-  if (wi::eq_p (lb, ub))
-{
-  may_be_nonzero = lb;
-  must_be_nonzero = may_be_nonzero;
-}
-  else if (wi::ge_p (lb, 0, sign) || wi::lt_p (ub, 0, sign))
-{
-  wide_int xor_mask = lb ^ ub;
-  may_be_nonzero = lb | ub;
-  must_be_nonzero = lb & ub;
-  if (xor_mask != 0)
-	{
-	  wide_int mask = wi::mask (wi::floor_log2 (xor_mask), false,
-may_be_nonzero.get_precision ());
-	  may_be_nonzero = may_be_nonzero | mask;
-	  must_be_nonzero = wi::bit_and_not (must_be_nonzero, mask);
-	}
-}
-}
-
-/* Value range wrapper for wide_int_set_zero_nonzero_bits.
+/* Value range wrapper for wide_int_range_set_zero_nonzero_bits.
 
Compute MAY_BE_NONZERO and MUST_BE_NONZERO bit masks for range in VR.
 
@@ -1066,9 +976,10 @@ vrp_set_zero_nonzero_bits (const tree expr_type,
   *must_be_nonzero = wi::zero (TYPE_PRECISION (expr_type));
   return false;
 }
-  wide_int_set_zero_nonzero_bits (TYPE_SIGN (expr_type),
- wi::to_wide (vr->min), wi::to_wide (vr->max),
- *may_be_nonzero, *must_be_nonzero);
+  wide_int_range_set_zero_nonzero_bits (TY

Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552, 86711, 86714) )

2018-08-02 Thread Bernd Edlinger
On 08/02/18 15:26, Bernd Edlinger wrote:
>>
>>    /* If the length can be computed at compile-time, return it.  */
>> -  len = c_strlen (src, 0);
>> +  tree array;
>> +  tree len = c_strlen (src, 0, &array);
> 
> You know the c_strlen tries to compute wide character sizes,
> but strlen does not do that, strlen (L"abc") should give 1
> (or 0 on a BE machine)
> I wonder if that is correct.
> 
[snip]
>>
>>  static tree
>> -fold_builtin_strlen (location_t loc, tree type, tree arg)
>> +fold_builtin_strlen (location_t loc, tree fndecl, tree type, tree arg)
>>  {
>>    if (!validate_arg (arg, POINTER_TYPE))
>>  return NULL_TREE;
>>    else
>>  {
>> -  tree len = c_strlen (arg, 0);
>> -
>> +  tree arr = NULL_TREE;
>> +  tree len = c_strlen (arg, 0, &arr);
> 
> Is it possible to write a test case where strlen(L"test") reaches this point?
> what will c_strlen return then?
> 

Yes, of course it is:

$ cat y.c
int f(char *x)
{
   return __builtin_strlen(x);
}

int main ()
{
   return f((char*)&L"abcdef"[0]);
}
$ gcc -O3 -S y.c
$ cat y.s
main:
.LFB1:
.cfi_startproc
movl$6, %eax
ret
.cfi_endproc

The reason is that c_strlen tries to fold wide chars at all.
I do not know when that was introduced, was that already before your last 
patches?
Is it possible to revert the last few patches cleanly?


Bernd.


Re: [PATCH] Backport gettext fixes to get rid of warnings on macOS

2018-08-02 Thread Simon Marchi

On 2018-08-02 13:44, Tom Tromey wrote:

"Simon" == Simon Marchi  writes:


Simon> intl/ChangeLog:

Simon>   * libgnuintl.h (_INTL_MAY_RETURN_STRING_ARG, gettext, dgettext,
Simon>   dcgettext, ngettext, dngettext, dcngettext): Backport changes
Simon>   from upstream gettext.

Thanks, I think you should check this in.


I was hoping to get an approval from gcc and binutils first, but this is 
probably obvious enough, so I pushed it in.  Could somebody from gcc 
take care of merging it in gcc too?


Thanks,

Simon


Re: Move all wide_int_range* functions into wide-int-range.[ch]

2018-08-02 Thread Aldy Hernandez




On 08/02/2018 02:50 PM, Jeff Law wrote:

On 08/02/2018 12:23 PM, Aldy Hernandez wrote:



On 08/02/2018 02:07 PM, Jeff Law wrote:

On 08/01/2018 03:19 PM, David Malcolm wrote:

On Wed, 2018-08-01 at 15:39 -0400, Aldy Hernandez wrote:

This is actually an obvious patch, but I'm not committing it just in
case you'd prefer another name for the files.


BTW, is it our policy that new gcc C++ source files should have a .cc
extension?

I believe so.
jeff



I did not know that.  I'm happy to rename it to .cc.

OK with that change ;-)?

If it's just moving code around, yes, OK.


Yes it is.

Will commit after another bootstrap after my header change, just in case.

Thanks.
Aldy


Re: [PATCH] Make function clone name numbering independent.

2018-08-02 Thread Michael Ploujnikov
On 2018-08-01 06:37 AM, Richard Biener wrote:
> On Tue, Jul 31, 2018 at 7:40 PM Michael Ploujnikov
>  wrote:
>>
>> On 2018-07-26 01:27 PM, Michael Ploujnikov wrote:
>>> On 2018-07-24 09:57 AM, Michael Ploujnikov wrote:
 On 2018-07-20 06:05 AM, Richard Biener wrote:
>>  /* Return a new assembler name for a clone with SUFFIX of a decl named
>> NAME.  */
>> @@ -521,14 +521,13 @@ tree
>>  clone_function_name_1 (const char *name, const char *suffix)
>
> pass this function the counter to use
>
>>  {
>>size_t len = strlen (name);
>> -  char *tmp_name, *prefix;
>> +  char *prefix;
>>
>>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
>>memcpy (prefix, name, len);
>>strcpy (prefix + len + 1, suffix);
>>prefix[len] = symbol_table::symbol_suffix_separator ();
>> -  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
>
> and keep using ASM_FORMAT_PRIVATE_NAME here.  You need to change
> the lto/lto-partition.c caller (just use zero as counter).
>
>> -  return get_identifier (tmp_name);
>> +  return get_identifier (prefix);
>>  }
>>
>>  /* Return a new assembler name for a clone of DECL with SUFFIX.  */
>> @@ -537,7 +536,17 @@ tree
>>  clone_function_name (tree decl, const char *suffix)
>>  {
>>tree name = DECL_ASSEMBLER_NAME (decl);
>> -  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
>> +  const char *decl_name = IDENTIFIER_POINTER (name);
>> +  char *numbered_name;
>> +  unsigned int *suffix_counter;
>> +  if (!clone_fn_ids) {
>> +/* Initialize the per-function counter hash table if this is the 
>> first call */
>> +clone_fn_ids = hash_map::create_ggc (64);
>> +  }
>
> I still do not like throwing memory at the problem in this way for the
> little benefit
> this change provides :/
>
> So no approval from me at this point...
>
> Richard.

 Can you give me an idea of the memory constraints that are involved?

 The highest memory usage increase that I could find was when compiling
 a source file (from Linux) with roughly 10,000 functions. It showed a 2kB
 increase over the before-patch use of 6936kB which is barely 0.03%.

 Using a single counter can result in more confusing namespacing when
 you have .bar.clone.4 despite there only being 3 clones of .bar.

 From a practical point of view this change is helpful to anyone
 diffing binary output such as forensic analysts, Debian Reproducible
 Builds or even someone validating compiler output (before and after an 
 input
 source patch). The extra changes that this patch alleviates are a
 distraction and could even be misleading. For example, applying a
 source patch to the same Linux source produces the following binary
 diff before my change:

 --- /tmp/output.o.objdump
 +++ /tmp/patched-output.o.objdump
 @@ -1,5 +1,5 @@

 -/tmp/uverbs_cmd/output.o: file format elf32-i386
 +/tmp/uverbs_cmd/patched-output.o: file format elf32-i386


  Disassembly of section .text.get_order:
 @@ -265,12 +265,12 @@
 3:   e9 fc ff ff ff  jmp4 
  4: R_386_PC32   .text.put_uobj_read

 -Disassembly of section .text.trace_kmalloc.constprop.3:
 +Disassembly of section .text.trace_kmalloc.constprop.4:

 - :
 + :
 0:   83 3d 04 00 00 00 00cmpl   $0x0,0x4
  2: R_386_32 __tracepoint_kmalloc
 -   7:   74 34   je 3d 
 
 +   7:   74 34   je 3d 
 
 9:   55  push   %ebp
 a:   89 cd   mov%ecx,%ebp
 c:   57  push   %edi
 @@ -281,7 +281,7 @@
13:   8b 1d 10 00 00 00   mov0x10,%ebx
  15: R_386_32__tracepoint_kmalloc
19:   85 db   test   %ebx,%ebx
 -  1b:   74 1b   je 38 
 
 +  1b:   74 1b   je 38 
 
1d:   68 d0 00 00 00  push   $0xd0
22:   89 fa   mov%edi,%edx
24:   89 f0   mov%esi,%eax
 @@ -292,7 +292,7 @@
31:   58  pop%eax
32:   83 3b 00cmpl   $0x0,(%ebx)
35:   5a  pop%edx
 -  36:   eb e3   jmp1b 
 
 +  36:   eb e3   jmp1b 
 
38:   5b  pop%ebx
39:   5e  pop%esi
3a:   5f  pop%edi
 @@ -846,7 +846,7 @@
78:   b8 5f 00 00 00

Re: [PATCH][GCC][AARCH64] Use stdint integers in vect_su_add_sub.c

2018-08-02 Thread James Greenhalgh
On Tue, Jul 31, 2018 at 04:53:19AM -0500, Matthew Malcomson wrote:
> Fixing the ilp32 issue that Christophe found.
> 
> The existing testcase uses `long` to represent a 64 bit integer.
> This breaks when compiled using the `-mabi=ilp32` flag.
> We switch the use of int/long for int32_t/int64_t to avoid this problem
> and show the requirement of a widening operation more clearly.

Normally we try to avoid pulling more headers than we need in the testsuite.

Have you seen this construct:

  typedef unsigned __attribute__((mode(DI))) uint64_t;

It can be a good way to ensure you have the right type for the patterns you
are adding.

Thanks,
James

> Ok for trunk?
> 
> gcc/testsuite/Changelog
> 2018-07-31  Matthew Malcomson  
> 
>   * gcc.target/aarch64/simd/vect_su_add_sub.c: Use stdint types
> 

> diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vect_su_add_sub.c 
> b/gcc/testsuite/gcc.target/aarch64/simd/vect_su_add_sub.c
> index 
> 338da54f6281c90e1c6b1c59fa50d9b719005c77..e3a2d0175e75b20e309c9b734b747512e466fbb1
>  100644
> --- a/gcc/testsuite/gcc.target/aarch64/simd/vect_su_add_sub.c
> +++ b/gcc/testsuite/gcc.target/aarch64/simd/vect_su_add_sub.c
> @@ -1,21 +1,24 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O3" } */
>  
> +#include 
> +#include 
> +
>  /* Ensure we use the signed/unsigned extend vectorized add and sub
> instructions.  */
>  #define N 1024
>  
> -int a[N];
> -long c[N];
> -long d[N];
> -unsigned int ua[N];
> -unsigned long uc[N];
> -unsigned long ud[N];
> +int32_t a[N];
> +int64_t c[N];
> +int64_t d[N];
> +uint32_t ua[N];
> +uint64_t uc[N];
> +uint64_t ud[N];
>  
>  void
>  add ()
>  {
> -  for (int i = 0; i < N; i++)
> +  for (size_t i = 0; i < N; i++)
>  d[i] = a[i] + c[i];
>  }
>  /* { dg-final { scan-assembler-times "\[ \t\]saddw2\[ \t\]+" 1 } } */
> @@ -24,7 +27,7 @@ add ()
>  void
>  subtract ()
>  {
> -  for (int i = 0; i < N; i++)
> +  for (size_t i = 0; i < N; i++)
>  d[i] = c[i] - a[i];
>  }
>  /* { dg-final { scan-assembler-times "\[ \t\]ssubw2\[ \t\]+" 1 } } */
> @@ -33,7 +36,7 @@ subtract ()
>  void
>  uadd ()
>  {
> -  for (int i = 0; i < N; i++)
> +  for (size_t i = 0; i < N; i++)
>  ud[i] = ua[i] + uc[i];
>  }
>  /* { dg-final { scan-assembler-times "\[ \t\]uaddw2\[ \t\]+" 1 } } */
> @@ -42,7 +45,7 @@ uadd ()
>  void
>  usubtract ()
>  {
> -  for (int i = 0; i < N; i++)
> +  for (size_t i = 0; i < N; i++)
>  ud[i] = uc[i] - ua[i];
>  }
>  /* { dg-final { scan-assembler-times "\[ \t\]usubw2\[ \t\]+" 1 } } */
> 



[PATCH] adjust sprintf range for AIX QNaN output (PR 86571)

2018-08-02 Thread Martin Sebor

The recently added test gcc.dg/torture/builtin-sprintf.c
to verify that the sprintf result computed by GCC matches
libc's for Infinity and NaN has been failing on AIX which
formats NaN as either QNaN or SNaN, contrary to C/POSIX
requirements.  The attached tweak adjusts the result
computed by GCC to include the AIX format.  If there are
no objections I'll commit the tweak later this week and
backport it to GCC 8 the next.

Martin
PR tree-optimization/86571 - AIX NaNQ and NaNS output format conflicts with __builtin_sprintf

gcc/ChangeLog:

	PR tree-optimization/86571
	* gimple-ssa-sprintf.c (format_floating): Extend upper bound of
	NaN output to 4.

Index: gcc/gimple-ssa-sprintf.c
===
--- gcc/gimple-ssa-sprintf.c	(revision 263268)
+++ gcc/gimple-ssa-sprintf.c	(working copy)
@@ -2014,8 +2014,15 @@ format_floating (const directive &dir, tree arg, v
 
   res.range.likely = res.range.min;
   res.range.max = res.range.min;
-  /* The inlikely maximum is "[-/+]infinity" or "[-/+]nan".  */
-  res.range.unlikely = sign + (real_isinf (rvp) ? 8 : 3);
+  /* The unlikely maximum is "[-/+]infinity" or "[-/+][qs]nan".
+	 For NaN, the C/POSIX standards specify two formats:
+	   "[-/+]nan"
+	 and
+	   "[-/+]nan(n-char-sequence)"
+	 No known printf implementation outputs the latter format but AIX
+	 outputs QNaN and SNaN for quiet and signalling NaN, respectively,
+	 so the unlikely maximum reflects that.  */
+  res.range.unlikely = sign + (real_isinf (rvp) ? 8 : 4);
 
   /* The range for infinity and NaN is known unless either width
 	 or precision is unknown.  Width has the same effect regardless


[committed] Fix memory leak of pretty_printer prefixes

2018-08-02 Thread David Malcolm
We were rather sloppy about handling the ownership of prefixes for
pretty_printer, and this lead to a memory leak for any time a
diagnostic_show_locus call emits multiple line spans.

This showed up in "make selftest-valgrind" as:

3,976 bytes in 28 blocks are definitely lost in loss record 632 of 669
   at 0x4A0645D: malloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x1F08227: xmalloc (xmalloc.c:147)
   by 0x1F083E6: xvasprintf (xvasprintf.c:58)
   by 0x1E7EC7D: build_message_string(char const*, ...) (diagnostic.c:78)
   by 0x1E7F438: diagnostic_get_location_text(diagnostic_context*, 
expanded_location) (diagnostic.c:328)
   by 0x1E7FD54: default_diagnostic_start_span_fn(diagnostic_context*, 
expanded_location) (diagnostic.c:626)
   by 0x1EB3508: 
selftest::test_diagnostic_context::start_span_cb(diagnostic_context*, 
expanded_location) (selftest-diagnostic.c:57)
   by 0x1E89215: diagnostic_show_locus(diagnostic_context*, rich_location*, 
diagnostic_t) (diagnostic-show-locus.c:1992)
   by 0x1E8ECAD: 
selftest::test_fixit_insert_containing_newline_2(selftest::line_table_case 
const&) (diagnostic-show-locus.c:3044)
   by 0x1EB0606: selftest::for_each_line_table_case(void 
(*)(selftest::line_table_case const&)) (input.c:3525)
   by 0x1E8F3F5: selftest::diagnostic_show_locus_c_tests() 
(diagnostic-show-locus.c:3164)
   by 0x1E010BF: selftest::run_tests() (selftest-run-tests.c:88)

4,004 bytes in 28 blocks are definitely lost in loss record 633 of 669
   at 0x4A0645D: malloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x1F08227: xmalloc (xmalloc.c:147)
   by 0x1F083E6: xvasprintf (xvasprintf.c:58)
   by 0x1E7EC7D: build_message_string(char const*, ...) (diagnostic.c:78)
   by 0x1E7F438: diagnostic_get_location_text(diagnostic_context*, 
expanded_location) (diagnostic.c:328)
   by 0x1E7FD54: default_diagnostic_start_span_fn(diagnostic_context*, 
expanded_location) (diagnostic.c:626)
   by 0x1EB3508: 
selftest::test_diagnostic_context::start_span_cb(diagnostic_context*, 
expanded_location) (selftest-diagnostic.c:57)
   by 0x1E89215: diagnostic_show_locus(diagnostic_context*, rich_location*, 
diagnostic_t) (diagnostic-show-locus.c:1992)
   by 0x1E8B373: 
selftest::test_diagnostic_show_locus_fixit_lines(selftest::line_table_case 
const&) (diagnostic-show-locus.c:2500)
   by 0x1EB0606: selftest::for_each_line_table_case(void 
(*)(selftest::line_table_case const&)) (input.c:3525)
   by 0x1E8F3B9: selftest::diagnostic_show_locus_c_tests() 
(diagnostic-show-locus.c:3159)
   by 0x1E010BF: selftest::run_tests() (selftest-run-tests.c:88)

This patch fixes the leaks by ensuring that the pretty_printer "owns"
the prefix if it's non-NULL, freeing it in the dtor and in pp_set_prefix.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu;
manually verified "make selftest-valgrind" output (one unrelated leak
remains).

Committed to trunk as r263275.

gcc/cp/ChangeLog:
* error.c (cxx_print_error_function): Duplicate "file" before
passing it to pp_set_prefix.
(cp_print_error_function): Use pp_take_prefix when saving the
existing prefix.

gcc/ChangeLog:
* diagnostic-show-locus.c (diagnostic_show_locus): Use
pp_take_prefix when saving the existing prefix.
* diagnostic.c (diagnostic_append_note): Likewise.
* langhooks.c (lhd_print_error_function): Likewise.
* pretty-print.c (pp_set_prefix): Drop the "const" from "prefix"
param's type.  Free the existing prefix.
(pp_take_prefix): New function.
(pretty_printer::pretty_printer): Drop the prefix parameter.
Rename the length parameter to match the comment.
(pretty_printer::~pretty_printer): Free the prefix.
* pretty-print.h (pretty_printer::pretty_printer): Drop the prefix
parameter.
(struct pretty_printer): Drop the "const" from "prefix" field's
type and clarify memory management.
(pp_set_prefix): Drop the "const" from the 2nd param.
(pp_take_prefix): New decl.
---
 gcc/cp/error.c  |  9 +++--
 gcc/diagnostic-show-locus.c |  2 +-
 gcc/diagnostic.c|  3 +--
 gcc/langhooks.c |  2 +-
 gcc/pretty-print.c  | 31 +++
 gcc/pretty-print.h  | 14 --
 6 files changed, 41 insertions(+), 20 deletions(-)

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index a12fbc5..c49f4d7 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -3287,8 +3287,13 @@ void
 cxx_print_error_function (diagnostic_context *context, const char *file,
  diagnostic_info *diagnostic)
 {
+  char *prefix;
+  if (file)
+prefix = xstrdup (file);
+  else
+prefix = NULL;
   lhd_print_error_function (context, file, diagnostic);
-  pp_set_prefix (context->printer, file);
+  pp_set_prefix (context->printer, prefix);
   maybe_print_instantiation_context (context);
 }
 
@@ -3316,7 +3321,7 @@ cp_print_error_funct

PR libstdc++/68222 Hide safe iterator operators

2018-08-02 Thread François Dumont

Hi

    Here is a patch to avoid definition of invalid operators on the 
Debug mode safe iterator type depending on its category.


    Even if it is limited to the Debug mode code I would like to have a 
feedback before committing. Especially on the following points:


- _Safe_tagged_iterator: Is the name ok ?

- Inheritance between the different _Safe_tagged_iterator 
instantiations. I am already working on making the operators friends as 
we discuss in another thread so I might review this design at this moment.


- Are concept checks I added to testsuite_containers.h citerator ok ?

    This patch also does some cleanup on Debug functions. 
__check_dereferenceable was not used (anymore maybe) so I removed it. I 
also prefer to take iterator by value in the fallback implementations 
like it is done in std algos. Only _Safe_tagged_iterator are taken by 
lvalue ref as they are quite expensive to copy (mutex lock).


    I also attach the ChangeLog entry which is quite big.

Tested under Linux x86_64 Debug mode.

François


diff --git a/libstdc++-v3/include/debug/deque b/libstdc++-v3/include/debug/deque
index 93b82cf..213e23b 100644
--- a/libstdc++-v3/include/debug/deque
+++ b/libstdc++-v3/include/debug/deque
@@ -57,12 +57,14 @@ namespace __debug
   typedef __gnu_debug::_Equal_to<_Base_const_iterator> _Equal;
 
 public:
+  typedef _Base	normal_type;
+
   typedef typename _Base::reference			reference;
   typedef typename _Base::const_reference		const_reference;
 
-  typedef __gnu_debug::_Safe_iterator<_Base_iterator, deque>
+  typedef __gnu_debug::_Safe_tagged_iterator<_Base_iterator, deque>
 			iterator;
-  typedef __gnu_debug::_Safe_iterator<_Base_const_iterator, deque>
+  typedef __gnu_debug::_Safe_tagged_iterator<_Base_const_iterator, deque>
 			const_iterator;
 
   typedef typename _Base::size_type			size_type;
diff --git a/libstdc++-v3/include/debug/formatter.h b/libstdc++-v3/include/debug/formatter.h
index 7b3c30b..33bc2b2 100644
--- a/libstdc++-v3/include/debug/formatter.h
+++ b/libstdc++-v3/include/debug/formatter.h
@@ -76,8 +76,8 @@ namespace __gnu_debug
 
   class _Safe_sequence_base;
 
-  template
-class _Safe_iterator;
+  template
+class _Safe_tagged_iterator;
 
   template
 class _Safe_local_iterator;
@@ -264,8 +264,9 @@ namespace __gnu_debug
 	_M_variant._M_string._M_value = __value;
   }
 
-  template
-	_Parameter(_Safe_iterator<_Iterator, _Sequence> const& __it,
+  template
+	_Parameter(_Safe_tagged_iterator<_Iterator, _Sequence,
+	 _Category> const& __it,
 		   const char* __name, _Is_iterator)
 	: _M_kind(__iterator),  _M_variant()
 	{
@@ -379,9 +380,10 @@ namespace __gnu_debug
 	= _S_reverse_state(_M_variant._M_iterator._M_state);
 	}
 
-  template
-	_Parameter(std::reverse_iterator<_Safe_iterator<_Iterator,
-			_Sequence>> const& __it,
+  template
+	_Parameter(
+	  std::reverse_iterator<_Safe_tagged_iterator<_Iterator, _Sequence,
+		  _Category>> const& __it,
 	  const char* __name, _Is_iterator)
 	: _Parameter(__it.base(), __name, _Is_iterator{})
 	{
@@ -397,9 +399,10 @@ namespace __gnu_debug
 	: _Parameter(__it.base(), __name, _Is_iterator{})
 	{ _M_variant._M_iterator._M_type = _GLIBCXX_TYPEID(__it); }
 
-  template
-	_Parameter(std::move_iterator<_Safe_iterator<_Iterator,
-		 _Sequence>> const& __it,
+  template
+	_Parameter(
+	  std::move_iterator<_Safe_tagged_iterator<_Iterator, _Sequence,
+		   _Category>> const& __it,
 	  const char* __name, _Is_iterator)
 	: _Parameter(__it.base(), __name, _Is_iterator{})
   {
diff --git a/libstdc++-v3/include/debug/forward_list b/libstdc++-v3/include/debug/forward_list
index 633af1a..6b874db 100644
--- a/libstdc++-v3/include/debug/forward_list
+++ b/libstdc++-v3/include/debug/forward_list
@@ -194,12 +194,14 @@ namespace __debug
   typedef typename _Base::const_iterator	_Base_const_iterator;
 
 public:
+  typedef _Basenormal_type;
+
   typedef typename _Base::reference		reference;
   typedef typename _Base::const_reference	const_reference;
 
-  typedef __gnu_debug::_Safe_iterator<
+  typedef __gnu_debug::_Safe_tagged_iterator<
 	_Base_iterator, forward_list>		iterator;
-  typedef __gnu_debug::_Safe_iterator<
+  typedef __gnu_debug::_Safe_tagged_iterator<
 	_Base_const_iterator, forward_list>	const_iterator;
 
   typedef typename _Base::size_type		size_type;
diff --git a/libstdc++-v3/include/debug/functions.h b/libstdc++-v3/include/debug/functions.h
index ce501f2..6719703 100644
--- a/libstdc++-v3/include/debug/functions.h
+++ b/libstdc++-v3/include/debug/functions.h
@@ -31,7 +31,9 @@
 
 #include 		// for __addressof
 #include 	// for less
+
 #if __cplusplus >= 201103L
+# include 	// for __miter_base
 # include 		// for is_lvalue_reference and conditional.
 #endif
 
@@ -64,19 +66,6 @@ namespace __gnu_debug
 __check_singular(const _Tp* __ptr)
 { return __ptr == 0; }
 

Re: [PATCH 00/11] (v2) Mitigation against unsafe data speculation (CVE-2017-5753)

2018-08-02 Thread John David Anglin

On 2018-08-02 2:40 PM, Jeff Law wrote:

It's been eons.   I think there's enough building blocks on the PA to
mount a spectre v1 attack.  They've got branch prediction with varying
degress of speculative execution, caches and user accessable cycle timers.

Yes.


There's varying degrees of out of order execution all the way back in
the PA7xxx processors (hit-under-miss) to full o-o-o execution in the
PA8xxx series (including the PA8900 that's in the rp3440).

However, as far as I know, loads and stores are always ordered.


I suspect that given enough time we could figure out why the test didn't
indicate spectre v1 vulnerability on your system and twiddle it, but
given it's a dead processor, I doubt it's worth the effort.

Spectre output looks like this:
dave@mx3210:~/meltdown$ ./spectre
Reading 40 bytes:
Reading at malicious_x = 0xef10... Unclear: 0xFE='?' score=999    
(second best: 0xFC score=999)
Reading at malicious_x = 0xef11... Unclear: 0xFC='?' score=999    
(second best: 0xFB score=999)
Reading at malicious_x = 0xef12... Unclear: 0xFE='?' score=999    
(second best: 0xFC score=999)


I don't think there's a suitable barrier.  The sync instruction seems 
like overkill.


So, I'm going to install attached change after testing is complete.

Dave

--
John David Anglin  dave.ang...@bell.net

Index: config/pa/pa.c
===
--- config/pa/pa.c  (revision 263228)
+++ config/pa/pa.c  (working copy)
@@ -428,6 +428,9 @@
 #undef TARGET_STARTING_FRAME_OFFSET
 #define TARGET_STARTING_FRAME_OFFSET pa_starting_frame_offset
 
+#undef TARGET_HAVE_SPECULATION_SAFE_VALUE
+#define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 /* Parse the -mfixed-range= option string.  */


Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552, 86711, 86714) )

2018-08-02 Thread Martin Sebor

On 08/02/2018 12:56 PM, Bernd Edlinger wrote:

On 08/02/18 15:26, Bernd Edlinger wrote:


   /* If the length can be computed at compile-time, return it.  */
-  len = c_strlen (src, 0);
+  tree array;
+  tree len = c_strlen (src, 0, &array);


You know the c_strlen tries to compute wide character sizes,
but strlen does not do that, strlen (L"abc") should give 1
(or 0 on a BE machine)
I wonder if that is correct.


[snip]


 static tree
-fold_builtin_strlen (location_t loc, tree type, tree arg)
+fold_builtin_strlen (location_t loc, tree fndecl, tree type, tree arg)
 {
   if (!validate_arg (arg, POINTER_TYPE))
 return NULL_TREE;
   else
 {
-  tree len = c_strlen (arg, 0);
-
+  tree arr = NULL_TREE;
+  tree len = c_strlen (arg, 0, &arr);


Is it possible to write a test case where strlen(L"test") reaches this point?
what will c_strlen return then?



Yes, of course it is:

$ cat y.c
int f(char *x)
{
   return __builtin_strlen(x);
}

int main ()
{
   return f((char*)&L"abcdef"[0]);
}
$ gcc -O3 -S y.c
$ cat y.s
main:
.LFB1:
.cfi_startproc
movl$6, %eax
ret
.cfi_endproc

The reason is that c_strlen tries to fold wide chars at all.
I do not know when that was introduced, was that already before your last 
patches?


The function itself was introduced in 1992 if not earlier,
before wide strings even existed.  AFAICS, it has always
accepted strings of all widths.  Until r241489 (in GCC 7)
it computed their length in bytes, not characters.  I don't
know if that was on purpose or if it was just never changed
to compute the length in characters when wide strings were
first introduced.  From the name I assume it's the latter.
The difference wasn't detected until sprintf tests were added
for wide string directives.  The ChangeLog description for
the change reads: Correctly handle wide strings.  I didn't
consider pathological cases like strlen (L"abc").  It
shouldn't be difficult to change to fix this case.

Martin


Re: [PATCH][x86] Match movss and movsd "blend" instructions

2018-08-02 Thread Allan Sandfeld Jensen
On Mittwoch, 1. August 2018 18:51:41 CEST Marc Glisse wrote:
> On Wed, 1 Aug 2018, Allan Sandfeld Jensen wrote:
> >  extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__,
> > 
> > __artificial__))
> > 
> >  _mm_move_sd (__m128d __A, __m128d __B)
> >  {
> > 
> > -  return (__m128d) __builtin_ia32_movsd ((__v2df)__A, (__v2df)__B);
> > +  return __extension__ (__m128d)(__v2df){__B[0],__A[1]};
> > 
> >  }
> 
> If the goal is to have it represented as a VEC_PERM_EXPR internally, I
> wonder if we should be explicit and use __builtin_shuffle instead of
> relying on some forwprop pass to transform it. Maybe not, just asking. And
> the answer need not even be the same for _mm_move_sd and _mm_move_ss.

I forgot. One of the things that makes using __builtin_shuffle ugly is that 
__v4si  as the suffle argument needs to be in _mm_move_ss, is declared
in emmintrin.h, but _mm_move_ss is in xmmintrin.h.

In general the gcc __builtin_shuffle syntax with the argument being a vector 
is kind of ackward. At least for the declaring intrinsics, the clang still 
where the permutator is extra argument is easier to deal with:
__builtin_shuffle(a, b, (__v4si){4, 0, 1, 2})
 vs
 __builtin_shuffle(a, b, 4, 0, 1, 2)







Re: [PATCH][x86] Match movss and movsd "blend" instructions

2018-08-02 Thread Allan Sandfeld Jensen
On Donnerstag, 2. August 2018 11:18:41 CEST Richard Biener wrote:
> On Thu, Aug 2, 2018 at 11:12 AM Allan Sandfeld Jensen
> 
>  wrote:
> > On Mittwoch, 1. August 2018 18:51:41 CEST Marc Glisse wrote:
> > > On Wed, 1 Aug 2018, Allan Sandfeld Jensen wrote:
> > > >  extern __inline __m128d __attribute__((__gnu_inline__,
> > > >  __always_inline__,
> > > > 
> > > > __artificial__))
> > > > 
> > > >  _mm_move_sd (__m128d __A, __m128d __B)
> > > >  {
> > > > 
> > > > -  return (__m128d) __builtin_ia32_movsd ((__v2df)__A, (__v2df)__B);
> > > > +  return __extension__ (__m128d)(__v2df){__B[0],__A[1]};
> > > > 
> > > >  }
> > > 
> > > If the goal is to have it represented as a VEC_PERM_EXPR internally, I
> > > wonder if we should be explicit and use __builtin_shuffle instead of
> > > relying on some forwprop pass to transform it. Maybe not, just asking.
> > > And
> > > the answer need not even be the same for _mm_move_sd and _mm_move_ss.
> > 
> > I wrote it this way because this pattern could later also be used for the
> > other _ss intrinsics, such as _mm_add_ss, where a _builtin_shuffle could
> > not. To match the other intrinsics the logic that tries to match vector
> > construction just needs to be extended to try merge patterns even if one
> > of the subexpressions is not simple.
> 
> The question is what users expect and get when they use -O0 with intrinsics?
> 
> Richard.
> 
Here is the version with __builtin_shuffle. It might be more expectable -O0, 
but it is also uglier.

diff --git a/gcc/config/i386/emmintrin.h b/gcc/config/i386/emmintrin.h
index b940a39d27b..6501638f619 100644
--- a/gcc/config/i386/emmintrin.h
+++ b/gcc/config/i386/emmintrin.h
@@ -113,7 +113,7 @@ _mm_setzero_pd (void)
 extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_move_sd (__m128d __A, __m128d __B)
 {
-  return (__m128d) __builtin_ia32_movsd ((__v2df)__A, (__v2df)__B);
+  return __extension__ (__m128d) __builtin_shuffle((__v2df)__A, (__v2df)__B, (__v2di){2, 1});
 }
 
 /* Load two DPFP values from P.  The address must be 16-byte aligned.  */
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ee409cfe7e4..2337ef5ea08 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -46143,6 +46143,46 @@ expand_vselect_vconcat (rtx target, rtx op0, rtx op1,
   return ok;
 }
 
+/* A subroutine of ix86_expand_vec_perm_builtin_1.  Try to implement D
+   using movss or movsd.  */
+static bool
+expand_vec_perm_movs (struct expand_vec_perm_d *d)
+{
+  machine_mode vmode = d->vmode;
+  unsigned i, nelt = d->nelt;
+  rtx x;
+
+  if (d->one_operand_p)
+return false;
+
+  if (TARGET_SSE2 && (vmode == V2DFmode || vmode == V4SFmode))
+;
+  else
+return false;
+
+  /* Only the first element is changed. */
+  if (d->perm[0] != nelt && d->perm[0] != 0)
+return false;
+  for (i = 1; i < nelt; ++i) {
+{
+  if (d->perm[i] != i + nelt - d->perm[0])
+return false;
+}
+  }
+
+  if (d->testing_p)
+return true;
+
+  if (d->perm[0] == nelt)
+x = gen_rtx_VEC_MERGE (vmode, d->op1, d->op0, GEN_INT (1));
+  else
+x = gen_rtx_VEC_MERGE (vmode, d->op0, d->op1, GEN_INT (1));
+
+  emit_insn (gen_rtx_SET (d->target, x));
+
+  return true;
+}
+
 /* A subroutine of ix86_expand_vec_perm_builtin_1.  Try to implement D
in terms of blendp[sd] / pblendw / pblendvb / vpblendd.  */
 
@@ -46885,6 +46925,10 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
 	}
 }
 
+  /* Try movss/movsd instructions.  */
+  if (expand_vec_perm_movs (d))
+return true;
+
   /* Finally, try the fully general two operand permute.  */
   if (expand_vselect_vconcat (d->target, d->op0, d->op1, d->perm, nelt,
 			  d->testing_p))
diff --git a/gcc/config/i386/xmmintrin.h b/gcc/config/i386/xmmintrin.h
index f64f3f74a0b..45b99ff87d5 100644
--- a/gcc/config/i386/xmmintrin.h
+++ b/gcc/config/i386/xmmintrin.h
@@ -1011,7 +1011,8 @@ _mm_storer_ps (float *__P, __m128 __A)
 extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_move_ss (__m128 __A, __m128 __B)
 {
-  return (__m128) __builtin_ia32_movss ((__v4sf)__A, (__v4sf)__B);
+  return __extension__ (__m128) __builtin_shuffle((__v4sf)__A, (__v4sf)__B,
+  (__attribute__((__vector_size__ (16))) int){4, 1, 2, 3});
 }
 
 /* Extracts one of the four words of A.  The selector N must be immediate.  */


Re: [PATCH][x86] Match movss and movsd "blend" instructions

2018-08-02 Thread Marc Glisse

On Thu, 2 Aug 2018, Allan Sandfeld Jensen wrote:


I forgot. One of the things that makes using __builtin_shuffle ugly is that
__v4si  as the suffle argument needs to be in _mm_move_ss, is declared
in emmintrin.h, but _mm_move_ss is in xmmintrin.h.


__v4si is some internal detail, I don't see much issue with moving it to 
xmmintrin.h if you want to use it there.



In general the gcc __builtin_shuffle syntax with the argument being a vector
is kind of ackward. At least for the declaring intrinsics, the clang still
where the permutator is extra argument is easier to deal with:
__builtin_shuffle(a, b, (__v4si){4, 0, 1, 2})
vs
__builtin_shuffle(a, b, 4, 0, 1, 2)


__builtin_shufflevector IIRC



The question is what users expect and get when they use -O0 with intrinsics?


Here is the version with __builtin_shuffle. It might be more expectable -O0,
but it is also uglier.


I am not convinced -O0 is very important.

If you start extending your approach to _mm_add_sd and others, while one 
instruction is easy enough to recognize, if we put several in a row, they 
will be partially simplified and may become harder to recognize.
{ x*(y+v[0]-z), v[1] } requires that you notice that the upper part of 
this vector is v[1], i.e. the upper part of a vector whose lower part 
appears somewhere in the arbitrarily complex expression for the lower 
part of the result. And you then have to propagate the fact that you are 
doing vector operations all the way back to v[0].


I don't have a strong opinion on what the best approach is.

--
Marc Glisse


Re: [PATCH][x86] Match movss and movsd "blend" instructions

2018-08-02 Thread Jakub Jelinek
On Thu, Aug 02, 2018 at 10:50:58PM +0200, Allan Sandfeld Jensen wrote:
> Here is the version with __builtin_shuffle. It might be more expectable -O0, 
> but it is also uglier.

I don't find anything ugly on it, except the formatting glitches (missing
space before (, overlong line, and useless __extension__.
Improving code generated for __builtin_shuffle is desirable too.

> --- a/gcc/config/i386/xmmintrin.h
> +++ b/gcc/config/i386/xmmintrin.h
> @@ -1011,7 +1011,8 @@ _mm_storer_ps (float *__P, __m128 __A)
>  extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, 
> __artificial__))
>  _mm_move_ss (__m128 __A, __m128 __B)
>  {
> -  return (__m128) __builtin_ia32_movss ((__v4sf)__A, (__v4sf)__B);
> +  return __extension__ (__m128) __builtin_shuffle((__v4sf)__A, (__v4sf)__B,
> +  
> (__attribute__((__vector_size__ (16))) int){4, 1, 2, 3});

And obviously use __v4si here instead of __attribute__((__vector_size__ (16))) 
int.

Jakub


Re: [PATCH][x86] Match movss and movsd "blend" instructions

2018-08-02 Thread Allan Sandfeld Jensen
On Donnerstag, 2. August 2018 23:46:37 CEST Jakub Jelinek wrote:
> On Thu, Aug 02, 2018 at 10:50:58PM +0200, Allan Sandfeld Jensen wrote:
> > Here is the version with __builtin_shuffle. It might be more expectable
> > -O0, but it is also uglier.
> 
> I don't find anything ugly on it, except the formatting glitches (missing
> space before (, overlong line, and useless __extension__.
> Improving code generated for __builtin_shuffle is desirable too.
> 

__extension__ is needed when using the the {...} initialization otherwise -
std=C89 will produce warnings about standards.  The line is a bit long, but I 
thought it looked better like this rather than adding any emergency line 
breaks. Is there a hard limit?

> > --- a/gcc/config/i386/xmmintrin.h
> > +++ b/gcc/config/i386/xmmintrin.h
> > @@ -1011,7 +1011,8 @@ _mm_storer_ps (float *__P, __m128 __A)
> > 
> >  extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__,
> >  __artificial__)) _mm_move_ss (__m128 __A, __m128 __B)
> >  {
> > 
> > -  return (__m128) __builtin_ia32_movss ((__v4sf)__A, (__v4sf)__B);
> > +  return __extension__ (__m128) __builtin_shuffle((__v4sf)__A,
> > (__v4sf)__B, + 
> > (__attribute__((__vector_size__ (16))) int){4, 1, 2, 3});
> And obviously use __v4si here instead of __attribute__((__vector_size__
> (16))) int.
> 
__v4si is declared in emmintrin.h, so I couldn't use it here unless I moved 
the definition. I tried changing as little as possible to not trigger bike 
shedding.

'Allan





Re: [PATCH][x86] Match movss and movsd "blend" instructions

2018-08-02 Thread Allan Sandfeld Jensen
On Donnerstag, 2. August 2018 23:15:28 CEST Marc Glisse wrote:
> On Thu, 2 Aug 2018, Allan Sandfeld Jensen wrote:
> > I forgot. One of the things that makes using __builtin_shuffle ugly is
> > that
> > __v4si  as the suffle argument needs to be in _mm_move_ss, is declared
> > in emmintrin.h, but _mm_move_ss is in xmmintrin.h.
> 
> __v4si is some internal detail, I don't see much issue with moving it to
> xmmintrin.h if you want to use it there.
> 
> > In general the gcc __builtin_shuffle syntax with the argument being a
> > vector is kind of ackward. At least for the declaring intrinsics, the
> > clang still where the permutator is extra argument is easier to deal
> > with:
> > __builtin_shuffle(a, b, (__v4si){4, 0, 1, 2})
> > vs
> > __builtin_shuffle(a, b, 4, 0, 1, 2)
> 
> __builtin_shufflevector IIRC
> 
> >> The question is what users expect and get when they use -O0 with
> >> intrinsics?> 
> > Here is the version with __builtin_shuffle. It might be more expectable
> > -O0, but it is also uglier.
> 
> I am not convinced -O0 is very important.
> 
Me neither, and in any case I would argue the logic that recognizes the vector 
constructions patterns are not optimizations but instruction matching.

> If you start extending your approach to _mm_add_sd and others, while one
> instruction is easy enough to recognize, if we put several in a row, they
> will be partially simplified and may become harder to recognize.
> { x*(y+v[0]-z), v[1] } requires that you notice that the upper part of
> this vector is v[1], i.e. the upper part of a vector whose lower part
> appears somewhere in the arbitrarily complex expression for the lower
> part of the result. And you then have to propagate the fact that you are
> doing vector operations all the way back to v[0].
> 
> I don't have a strong opinion on what the best approach is.

Yes, I am not sure all of those could be done exhaustively with the existing 
logic, and it might also be of dubious value as in almost all cases the ps 
instructions have the same latency and bandwidth as the ss instructions, so 
developers should probably use _ps versions as they are scheduled better by 
the compiler (or at least better by gcc).
It was just an idea, and I haven't tried it at this point.

'Allan





Re: [2/5] C-SKY port: Backend implementation

2018-08-02 Thread Jeff Law
On 07/26/2018 12:06 AM, 瞿仙淼 wrote:
> 
>> 在 2018年7月25日,上午5:24,Jeff Law  写道:
>>
>> On 07/24/2018 12:18 PM, Sandra Loosemore wrote:
>>> On 07/24/2018 09:45 AM, Jeff Law wrote:
 On 07/23/2018 10:21 PM, Sandra Loosemore wrote:
 I'm not a big fan of more awk code, but I'm not going to object to it :-)

 Why does the port have its own little pass for condition code
 optimization (cse_cc)?  What is it doing that can't be done with our
 generic optimizers?
>>>
>>> This pass was included in the initial patch set we got from C-SKY, and
>>> as it didn't seem to break anything I left it in.  Perhaps C-SKY can
>>> provide a testcase that demonstrates why it's still useful in the
>>> current version of GCC; otherwise we can remove this from the initial
>>> port submission and restore it later if some performance analysis shows
>>> it is still worthwhile.
>> FWIW it looks like we model CC setting on just a few insns, (add,
>> subtract) so I'd be surprised if this little mini pass found much.  I'd
>> definitely like to hear from the csky authors here.
>>
>> Alternately, you could do add some instrumentation to flag when it
>> triggers, take a test or two that does, reduce it and we can then look
>> at the key RTL sequences and see what the pass is really doing.
>>
> 
> I wrote a case to reproduce this problem on C-SKY. C code is as follows:
> ---
> int e1, e2;
> 
> void func (int a, int b, int c, int d, int f, int g)
> {
>   e1 = a > b ? f : g;
>   e2 = a > b ? c : d;
> 
>   return;
> }
> ---
> 
> compile to assembler with option “-O3 -S” :
> ---
> func:
>   cmplt a1, a0
>   ld.w  t1, (sp, 0)
>   ld.w  t0, (sp, 4)
>   movt  t0, t1
>   cmplt a1, a0
>   movt  a3, a2
>   lrw a1, e2
>   lrw a2, e1
>   st.w  a3, (a1, 0)
>   st.w  t0, (a2, 0)
>   rts
> ---
> There is an extra “cmplt a1, a0" in the above code without cse_cc. This 
> situation mainly occurs when a relatively short branch jump is converted into 
> a conditional execution instruction. And the CSE pass can not reduce the same 
> conditional comparison instruction . Here is the rtx sequence after “cse2” 
> pass.
> 
> ---
> (insn 28 13 29 2 (set (reg:CC 33 c)
> (gt:CC (reg/v:SI 77 [ a ])
> (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
>  (nil))
> (insn 29 28 30 2 (set (reg/v:SI 82 [ g ])
> (if_then_else:SI (eq (reg:CC 33 c)
> (const_int 0 [0]))
> (reg/v:SI 82 [ g ])
> (reg/v:SI 81 [ f ]))) func.c:5 983 {movf}
>  (expr_list:REG_DEAD (reg/v:SI 81 [ f ])
> (expr_list:REG_DEAD (reg:CC 33 c)
> (nil
> (insn 30 29 31 2 (set (reg:CC 33 c)
> (gt:CC (reg/v:SI 77 [ a ])
> (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
>  (expr_list:REG_DEAD (reg/v:SI 78 [ b ])
> (expr_list:REG_DEAD (reg/v:SI 77 [ a ])
> (nil
> (insn 31 30 18 2 (set (reg/v:SI 80 [ d ])
> (if_then_else:SI (eq (reg:CC 33 c)
> (const_int 0 [0]))
> (reg/v:SI 80 [ d ])
> (reg/v:SI 79 [ c ]))) func.c:5 983 {movf}
>  (expr_list:REG_DEAD (reg/v:SI 79 [ c ])
> (expr_list:REG_DEAD (reg:CC 33 c)
> (nil
> ---
> 
> It doesn't seem to check the same conditional comparison instruction .So I 
> wrote this to solve this problem, but I am not sure if this is the best way : 
> )
> 
> PS, the same conditional comparison instruction cannot be reduced with the 
> latest version gcc with C-SKY because I just insert the “cse_cc” after 
> “cse1”, when I insert after “cse2”, this problem can be solved very well.
I think the cse_cc pass is really just working around one or more bugs
in CSE and/or a backend bug.  The RTL above clearly shows a common
subexpression that is not eliminated by CSE.

What CSE should be trying to do is changing the second and third
occurrences of (gt:CC (reg 77) (reg 78)) with (reg 33) which would
create nop-sets which would be subsequently deleted.  I suspect you do
not have an insn which matches that nop set of the CC register.  If you
fix that I suspect CSE will work better and eliminate the need for your
cse_cc pass.

jeff



Re: [2/5] C-SKY port: Backend implementation

2018-08-02 Thread Jeff Law
On 07/27/2018 07:49 PM, Sandra Loosemore wrote:
> On 07/26/2018 12:06 AM, 瞿仙淼 wrote:
>>
>> I wrote a case to reproduce this problem on C-SKY. C code is as follows:
>> ---
>> int e1, e2;
>>
>> void func (int a, int b, int c, int d, int f, int g)
>> {
>>    e1 = a > b ? f : g;
>>    e2 = a > b ? c : d;
>>
>>    return;
>> }
>> ---
>>
>> compile to assembler with option “-O3 -S” :
>> ---
>> func:
>>    cmplt a1, a0
>>    ld.w  t1, (sp, 0)
>>    ld.w  t0, (sp, 4)
>>    movt  t0, t1
>>    cmplt a1, a0
>>    movt  a3, a2
>>    lrw a1, e2
>>    lrw a2, e1
>>    st.w  a3, (a1, 0)
>>    st.w  t0, (a2, 0)
>>    rts
>> ---
>> There is an extra “cmplt a1, a0" in the above code without cse_cc.
>> This situation mainly occurs when a relatively short branch jump is
>> converted into a conditional execution instruction. And the CSE pass
>> can not reduce the same conditional comparison instruction . Here is
>> the rtx sequence after “cse2” pass.
>>
>> ---
>> (insn 28 13 29 2 (set (reg:CC 33 c)
>>  (gt:CC (reg/v:SI 77 [ a ])
>>  (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
>>   (nil))
>> (insn 29 28 30 2 (set (reg/v:SI 82 [ g ])
>>  (if_then_else:SI (eq (reg:CC 33 c)
>>  (const_int 0 [0]))
>>  (reg/v:SI 82 [ g ])
>>  (reg/v:SI 81 [ f ]))) func.c:5 983 {movf}
>>   (expr_list:REG_DEAD (reg/v:SI 81 [ f ])
>>  (expr_list:REG_DEAD (reg:CC 33 c)
>>  (nil
>> (insn 30 29 31 2 (set (reg:CC 33 c)
>>  (gt:CC (reg/v:SI 77 [ a ])
>>  (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
>>   (expr_list:REG_DEAD (reg/v:SI 78 [ b ])
>>  (expr_list:REG_DEAD (reg/v:SI 77 [ a ])
>>  (nil
>> (insn 31 30 18 2 (set (reg/v:SI 80 [ d ])
>>  (if_then_else:SI (eq (reg:CC 33 c)
>>  (const_int 0 [0]))
>>  (reg/v:SI 80 [ d ])
>>  (reg/v:SI 79 [ c ]))) func.c:5 983 {movf}
>>   (expr_list:REG_DEAD (reg/v:SI 79 [ c ])
>>  (expr_list:REG_DEAD (reg:CC 33 c)
>>  (nil
>> ---
>>
>> It doesn't seem to check the same conditional comparison instruction
>> .So I wrote this to solve this problem, but I am not sure if this is
>> the best way : )
>>
>> PS, the same conditional comparison instruction cannot be reduced with
>> the latest version gcc with C-SKY because I just insert the “cse_cc”
>> after “cse1”, when I insert after “cse2”, this problem can be solved
>> very well.
> 
> Thanks, this is very helpful.  I've verified this and I'm moving the
> pass as you suggest, adding a more detailed comment to the source to
> explain what the pass is for, and adding your test case to the
> testsuite.  This will be included when I resubmit the patches to address
> other review comments too.
> 
> Jeff, does that adequately address your concerns about whether the pass
> is useful?
I think the pass is papering over problems elsewhere (see my most other
reply on this thread).  I do think it would be useful to take that code
and create a test based on it.  I suspect you'll want to verify multiple
GT expressions prior to CSE2 and that after CSE2 you have a single GT
expression.  Presumably it'd be in the csky specific test.

jeff


Re: [PATCH] Print heuristics probability fraction part with 2 digits.

2018-08-02 Thread Jeff Law
On 07/26/2018 08:43 AM, Martin Liška wrote:
> Hi.
> 
> It's just a cosmetics change where I want to print 2 digits of fraction
> part of heuristics probabilities. It helps to distinguish 100% from
> PROB_VERY_LIKELY (99.96%).
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?
> Martin
> 
> gcc/ChangeLog:
> 
> 2018-07-26  Martin Liska  
> 
>   * predict.c (dump_prediction): Change to 2 digits
> in fraction part.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-07-26  Martin Liska  
> 
>   * gcc.dg/predict-1.c: Adjust scanned pattern to cover 2 digits.
>   * gcc.dg/predict-13.c:Likewise.
>   * gcc.dg/predict-3.c:Likewise.
>   * gcc.dg/predict-4.c:Likewise.
>   * gcc.dg/predict-5.c:Likewise.
>   * gcc.dg/predict-6.c:Likewise.
>   * gcc.dg/predict-9.c:Likewise.
>   * gfortran.dg/predict-1.f90:Likewise.
OK.
jeff


Re: [PATCH] Fix hard regno checks

2018-08-02 Thread Jeff Law
On 07/27/2018 09:36 AM, Vladimir Makarov wrote:
> On 07/23/2018 05:14 AM, Ilya Leoshkevich wrote:
>> FIRST_PSEUDO_REGISTER is not a hard regno, so comparisons should use
>> "<" instead of "<=", and ">=" instread of ">".
>>
> Thank you for finding these typos.  LRA parts of the patch are ok for me.
The rest look good as well.  I went ahead and committed the changes to
the trunk.
jeff



[PATCH] i386: Always set cfun->machine->max_used_stack_alignment

2018-08-02 Thread H.J. Lu
We should always set cfun->machine->max_used_stack_alignment if the
maximum stack slot alignment may be greater than 64 bits.

Tested on i686 and x86-64.  OK for master and backport for GCC 8?

Thanks.

H.J.

gcc/

PR target/86386
* config/i386/i386.c (ix86_finalize_stack_frame_flags): Set
cfun->machine->max_used_stack_alignment if needed.

gcc/testsuite/

PR target/86386
* gcc.target/i386/pr86386.c: New file.
---
 gcc/config/i386/i386.c  |  6 +++---
 gcc/testsuite/gcc.target/i386/pr86386.c | 26 +
 2 files changed, 29 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr86386.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ee409cfe7e4..4a0a050b3a2 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13281,12 +13281,12 @@ ix86_finalize_stack_frame_flags (void)
  recompute_frame_layout_p = true;
}
 }
-  else if (crtl->max_used_stack_slot_alignment
-  > crtl->preferred_stack_boundary)
+  else if (crtl->max_used_stack_slot_alignment > 64)
 {
   /* We don't need to realign stack.  But we still need to keep
 stack frame properly aligned to satisfy the largest alignment
-of stack slots.  */
+of stack slots if the maximum stack slot alignment may be
+greater than 64 bits.  */
   if (ix86_find_max_used_stack_alignment (stack_alignment, true))
cfun->machine->max_used_stack_alignment
  = stack_alignment / BITS_PER_UNIT;
diff --git a/gcc/testsuite/gcc.target/i386/pr86386.c 
b/gcc/testsuite/gcc.target/i386/pr86386.c
new file mode 100644
index 000..a67cf45444e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr86386.c
@@ -0,0 +1,26 @@
+/* PR target/86386 */
+/* { dg-do run { target { avx_runtime && int128 } } } */
+/* { dg-options "-Os -fno-tree-dce -mstringop-strategy=vector_loop -mavx" } */
+
+unsigned c, d, e, f;
+
+unsigned __attribute__((noipa))
+foo (unsigned char g, unsigned short h, unsigned i, unsigned long long j,
+ unsigned char k, unsigned short l, unsigned m, unsigned __int128 n)
+{
+  __builtin_memset (&e, 0, 3);
+  n <<= m;
+  __builtin_memcpy (&m, 2 + (char *) &n, 1);
+  m >>= 0;
+  d ^= __builtin_mul_overflow (l, n, &m);
+  return m;
+}
+
+int
+main ()
+{
+  unsigned __int128 x = foo (0, 0, 0, 0, 0, 4, 1, 3);
+  if (x != 24)
+__builtin_abort ();
+  return 0;
+}
-- 
2.17.1



[PATCH] Fix memory leak in selftest::test_expansion_to_rtl

2018-08-02 Thread David Malcolm
"make selftest-valgrind" shows:

187 bytes in 1 blocks are definitely lost in loss record 567 of 669
at 0x4A081D4: calloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x1F08260: xcalloc (xmalloc.c:162)
by 0xB24F32: init_emit() (emit-rtl.c:5843)
by 0xC10080: prepare_function_start() (function.c:4803)
by 0xC10254: init_function_start(tree_node*) (function.c:4877)
by 0x1CDF92A: selftest::test_expansion_to_rtl() (function-tests.c:595)
by 0x1CE007C: selftest::function_tests_c_tests() (function-tests.c:676)
by 0x1E010E7: selftest::run_tests() (selftest-run-tests.c:98)
by 0x1062D1E: toplev::run_self_tests() (toplev.c:2225)
by 0x1062F40: toplev::main(int, char**) (toplev.c:2303)
by 0x1E5B90A: main (main.c:39)

The allocation in question is:

  crtl->emit.regno_pointer_align
= XCNEWVEC (unsigned char, crtl->emit.regno_pointer_align_length);

This patch fixes this leak (and makes the output of
"make selftest-valgrind" clean) by calling free_after_compilation at the
end of the selftest in question.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
* function-tests.c (selftest::test_expansion_to_rtl): Call
free_after_compilation.
---
 gcc/function-tests.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/function-tests.c b/gcc/function-tests.c
index 1b5ebf3..196b3a3 100644
--- a/gcc/function-tests.c
+++ b/gcc/function-tests.c
@@ -661,6 +661,7 @@ test_expansion_to_rtl ()
   ASSERT_STR_CONTAINS (dump, ") ;; function \"test_fn\"\n");
 
   free (dump);
+  free_after_compilation (fun);
 }
 
 /* Run all of the selftests within this file.  */
-- 
1.8.5.3



Re: [PATCH] [aarch64][v2] Fix falkor pipeline description for dup

2018-08-02 Thread Siddhesh Poyarekar

On 08/03/2018 12:02 AM, James Greenhalgh wrote:

On Thu, Aug 02, 2018 at 11:58:37AM -0500, Siddhesh Poyarekar wrote:

There was a typo in the pipeline description where DUP was assigned to
the vector pipes for quad mode ops when it really only uses the VTOG
pipes.  Fixing this does not show any noticeable difference in
performance (there's a very small bump of 1.7% in x264 but that's
about it) in my tests but is the more precise description of operations
for falkor.

Bootstrapped and tested with --with-cpu=falkor to confirm that there
are no regressions.


OK.


Thanks, may I backport this to gcc 8.x as well given that it has zero 
impact on anything except falkor?  We are targeting gcc 8.x as the 
toolchain for falkor and so would like to at least get all falkor 
specific optimizations and fixes back into the gcc 8.x branch.


Siddhesh


Re: [PATCH] Make strlen range computations more conservative

2018-08-02 Thread Martin Sebor

On 08/02/2018 12:15 PM, Bernd Edlinger wrote:

On 08/02/18 19:00, Martin Sebor wrote:

As an alternate approach I have been thinking about, if
there is a strong feeling that allowing strlen to iterate
past the subobject boundary is necessary (I don't believe
it is.)

Rather than indiscriminately expanding the provenance of
the subobject regardless of what members follow it in
the enclosing structure, only consider doing that if
the next member is an array of the same type.  E.g.,

   struct S { char a[4], b[3], c[2], d; };
   extern struct S *p;

   strlen (p->a);   // consider p->a's bounds to be char[9]



No, initially I thought in the same direction,
but looking at the way how X-server is broken,
I realized that will probably not be sufficient.

Maybe it would be good to have one set of optimistic range infos
that follow the standards, and can be used to control the warnings,
and another set of pessimistic range infos, that control optimizations.

Consider
extern char garbage[10];
char x[10];
memcpy(x, garbage, 10);
x[9] = 0;

strlen(x) will be 0..9 no matter what was in garbage.
That information can be safely used for optimizations.

But if we have
extern char garbage[10];
char x[10];
memcpy(x, garbage, 10);
char *y = x;
if (strlen(y) < 10) <= may not be removed pessimistically


This example involves a whole object.  There should be no question
that running off the end of a full object is undefined and a bug.
It's far preferable to avoid such bugs.  Emitting a library call
that can return an arbitrary value or crash is the least secure
and least user-friendly solution.


char z[10];
sprintf(z, "%s", y); <= omitting warning would be okay optimistically.

It is not really easy to do but possible.


I see three cases here:

1) y points to a constant array whose initializer we know
2) y points to an array with contents (length) known to
   the strlen pass
3) we don't know what y points to or its contents

(1) can be handled easily by extending the approach in my patch
for bug 86552 to other built-ins.  I have a patch that does that
for sprintf.

(2) can be handled by extending the strlen pass to also track
the sizes of array objects into which it tracks stores.  I'm
working on a patch for GCC 9.

(3) can only be handled by adding an even stricter setting
for -Wformat-overflow by warning for all %s arguments of
unknown length.  I haven't considered this yet.



In the moment I would like to concentrate exclusively on wrong code issues


The only wrong code is in programs that run off the end of
unterminated buffers.  Those are the ones I believe we should
focus on helping detect and prevent.  Warnings are the fist step.
Enhancing _FORTIFY_SOURCE and the sanitizers to also detect such
reads would be another solution (in addition to warnings).
Emitting traps (say under an option) would be yet another.  But
changing GCC to silently accept such programs is, in my opinion,
the worst possible approach.

Martin


Re: [PATCH] Make strlen range computations more conservative

2018-08-02 Thread Martin Sebor

On 08/02/2018 12:20 PM, Jakub Jelinek wrote:

On Thu, Aug 02, 2018 at 11:00:32AM -0600, Martin Sebor wrote:

As an alternate approach I have been thinking about, if
there is a strong feeling that allowing strlen to iterate
past the subobject boundary is necessary (I don't believe
it is.)

Rather than indiscriminately expanding the provenance of
the subobject regardless of what members follow it in
the enclosing structure, only consider doing that if
the next member is an array of the same type.  E.g.,

  struct S { char a[4], b[3], c[2], d; };
  extern struct S *p;

  strlen (p->a);   // consider p->a's bounds to be char[9]


See the mail with testcases where the middle-end doesn't distinguish
between p->a and (char *) p, unless you want to warn or optimize
in the FEs or extremely early in the lowering passes, that isn't going to
work.


When the object structure is lost (as in MEM_REF) the middle-end
(specifically strlen) already considers the whole object, so that
wouldn't change.  The only impact would be on the cases where
the middle end currently does consider the subobject: instead
of taking just its size, it would consider the size of
the subsequnt members.  For the case of strlen, that would be
a simple change to get_range_strlen().

Martin


Re: [PATCH] Make strlen range computations more conservative

2018-08-02 Thread Martin Sebor

On 08/01/2018 02:40 AM, Jakub Jelinek wrote:

On Wed, Aug 01, 2018 at 09:19:43AM +0200, Richard Biener wrote:

And if so, what makes it well defined?


The fact that strlen takes a char * argument and thus inline-expansion
of a trivial implementation like

 int len = 0;
 for (; *p; ++p)
   ++len;

will have

 p = &s.a;

and the middle-end doesn't reconstruct s.a[..] from the pointer
access.


Some testcases:
struct S { char a[3]; char b[5]; } s[3] = { { "ab", "defg" }, { "h", "klmn" }, { "opq", 
"rstu" } };

__SIZE_TYPE__
foo (int i, int a)
{
  volatile char *p = (volatile char *) &s[i].a;
  if (a)
p = (volatile char *) &s[i];
  return __builtin_strlen ((char *) p);
}

If I call this with foo (2, 1), do you still claim it is not valid C?


String functions like strlen operate on character strings stored
in character arrays.  Calling strlen (&s[1]) is invalid because
&s[1] is not the address of a character array.  The fact that
objects can be represented as arrays of bytes doesn't change
that.  The standard may be somewhat loose with words on this
distinction but the intent certainly isn't for strlen to traverse
arbitrary sequences of bytes that cross subobject boundaries.
(That is the intent behind the raw memory functions, but
the current text doesn't make the distinction clear.)


I don't see in C anything that would say that this is not valid for strlen,
but valid for memcpy, and if you say it is not valid even for memcpy, then
pretty much nothing will work, we need memcpy to be able to copy whole objects
containing subobjects.


Yes,  I understand the concern, and it is something for
the standard to reconcile.  As I said, the object model study
group is considering how to reflect this in the model.  One
suggestion was to treat unsigned char (and only it) special.
Another was to introduce a new type, byte, like C++ has, and
treat it special.  Yet another is to come up with a special
cast to change the provenance of a pointer.  There probably
will be others.


 The middle-end optimizes the above into
  p_3 = &s[i_2(D)].a;
  _7 = __builtin_strlen (p_3);
(in fre3 in particular).

Or:
int bar (char *) __attribute__((pure));
int v;

__SIZE_TYPE__
baz (int i, int a)
{
  int b = bar (&s[i].a[0]);
  __SIZE_TYPE__ t = __builtin_strlen ((char *) &s[i]);
  v = b;
  return t;
}

where bar may say just int bar (char *p) { return *p; }
The middle-end optimizes this into:
  _1 = &s[i_3(D)].a[0];
  b_5 = bar (_1);
  t_6 = __builtin_strlen (_1);

That is why for __builtin_object_size (..., [13]) where we want to
differentiate between that we need to handle it very early, because later on
the distinction is gone.


Yep.  It would be nice to be able to hold on to the type.
In this case it doesn't look like it's a problem since &s[i]
is also the address of the first member of struct S and
those two need to be interchangeable (this also might need
tweaking in the standard).

Martin


Re: [PATCH] Make strlen range computations more conservative

2018-08-02 Thread Martin Sebor

On 08/01/2018 12:55 AM, Bernd Edlinger wrote:

Certainly not every "strlen" has these semantics.  For example,
this open-coded one doesn't:

   int len = 0;
   for (int i = 0; s.a[i]; ++i)
 ++len;

It computes 2 (with no warning for the out-of-bounds access).



yes, which is questionable as well, but that happens only
if the source code accesses the array via s.a[i]
not if it happens to use char *, as this experiment shows:


Yes, that just happens to be the case with GCC in some
situations, and not in others.  That's why it shouldn't
be relied on.


The point I make is that it is impossible to know where the function
is inlined, and if the original code can be broken in surprising ways.
And most importantly strlen is often used in security relevant ways.


Code that's concerned with security or safety (which should
be all of it) needs to follow the basic rules of the language.
Calling strlen() on a char[4] argument expecting it to return
a value larger than 3 as an indication that the array isn't
nul-terminated is not a secure coding practice -- it's a plain
old bug.  Don't take my word for it -- read any of the secure
coding standards: CEERT STR32-C. Do not pass a non-null-terminated
character sequence to a library function that expects a string,
CWE-170: Improper Null Termination, OWASP String Termination
Error.  This is elementary material that shouldn't need
explaining.

Martin