Re: RFC: C++ PATCH to adjust empty class parameter passing ABI

2016-04-14 Thread Christophe Lyon
On 13 April 2016 at 22:12, Jason Merrill  wrote:
> On 04/13/2016 03:18 PM, Jakub Jelinek wrote:
>>
>> On Wed, Apr 13, 2016 at 03:11:34PM -0400, Jason Merrill wrote:
>>>
>>> commit 761983a023b5217ef831a43f423779940c788ecf
>>> Author: Jason Merrill 
>>> Date:   Tue Apr 12 13:16:50 2016 -0400
>>>
>>>  gcc/
>>> * cfgexpand.c (pass_expand::execute): Handle attribute
>>> abi_warning.
>>> * expr.c (expand_expr_real_1): Likewise.
>>>  gcc/cp/
>>> * call.c (empty_class_msg, mark_for_abi_warning): New.
>>> (build_call_a): Use them.
>>> * decl.c (store_parm_decls): Use mark_for_abi_warning.
>>> * error.c (pp_format_to_string): New.
>>
>>
>> I think you should put a space into the attribute name instead of _
>> to make it clear that it is not an attribute users can use directly in
>> their
>> code through __attribute__.
>>
>> Otherwise it looks reasonable to me.
>
>
> Thanks, applied with that change.
>

Hi,

I'm seeing g++.dg/abi/empty13.C failing at execution on
aarch64-none-linux-gnu (using qemu):
qemu: uncaught target signal 11 (Segmentation fault) - core dumped

and aarch64_be-none-elf (using the Foundation Model):
Terminated by exception.

But it passes on aarch64-none-elf (using the Foundation Model too)

Am I the only one?

Christophe.



> Jason
>


Re: RFC: C++ PATCH to adjust empty class parameter passing ABI

2016-04-14 Thread Kyrill Tkachov


On 14/04/16 09:02, Christophe Lyon wrote:

On 13 April 2016 at 22:12, Jason Merrill  wrote:

On 04/13/2016 03:18 PM, Jakub Jelinek wrote:

On Wed, Apr 13, 2016 at 03:11:34PM -0400, Jason Merrill wrote:

commit 761983a023b5217ef831a43f423779940c788ecf
Author: Jason Merrill 
Date:   Tue Apr 12 13:16:50 2016 -0400

  gcc/
 * cfgexpand.c (pass_expand::execute): Handle attribute
abi_warning.
 * expr.c (expand_expr_real_1): Likewise.
  gcc/cp/
 * call.c (empty_class_msg, mark_for_abi_warning): New.
 (build_call_a): Use them.
 * decl.c (store_parm_decls): Use mark_for_abi_warning.
 * error.c (pp_format_to_string): New.


I think you should put a space into the attribute name instead of _
to make it clear that it is not an attribute users can use directly in
their
code through __attribute__.

Otherwise it looks reasonable to me.


Thanks, applied with that change.


Hi,

I'm seeing g++.dg/abi/empty13.C failing at execution on
aarch64-none-linux-gnu (using qemu):
qemu: uncaught target signal 11 (Segmentation fault) - core dumped

and aarch64_be-none-elf (using the Foundation Model):
Terminated by exception.

But it passes on aarch64-none-elf (using the Foundation Model too)

Am I the only one?

Hi Christophe,

I see the test failing on aarch64-none-linux-gnu (native)
with no output, just:
spawn [open ...]
FAIL: g++.dg/abi/empty13.C  -std=gnu++98 execution test

And I see it passing on aarch64-none-elf.

I haven't tried aarch64_be-none-elf yet.

Kyrill



Christophe.




Jason





[C++ Patch] PR 70540 ("[4.9/5/6 Regression] ICE on invalid code in cxx_incomplete_type_diagnostic...")

2016-04-14 Thread Paolo Carlini

Hi,

in this regression we ICE during error recovery after an additional 
redundant error message. I think it's one of those cases (we have got 
quite a few elsewhere, in semantics.c too) when it's better to 
immediately return error_mark_node when mark_used returns false, even if 
we aren't in a SFINAE context. To be sure, I double checked that in 
those cases mark_used certainly issues an error, thus we aren't risking 
creating accepts-invalid bugs, it's only matter of fine tuning error 
recovery. Tested x86_64-linux.


Thanks,
Paolo.

/
/cp
2016-04-14  Paolo Carlini  

PR c++/70540
* semantics.c (process_outer_var_ref): Unconditionally return
error_mark_node when mark_used returns false.

/testsuite
2016-04-14  Paolo Carlini  

PR c++/70540
* g++.dg/cpp0x/auto47.C: New.

Index: cp/semantics.c
===
--- cp/semantics.c  (revision 234970)
+++ cp/semantics.c  (working copy)
@@ -3276,7 +3276,7 @@ process_outer_var_ref (tree decl, tsubst_flags_t c
   tree initializer = convert_from_reference (decl);
 
   /* Mark it as used now even if the use is ill-formed.  */
-  if (!mark_used (decl, complain) && !(complain & tf_error))
+  if (!mark_used (decl, complain))
 return error_mark_node;
 
   bool saw_generic_lambda = false;
Index: testsuite/g++.dg/cpp0x/auto47.C
===
--- testsuite/g++.dg/cpp0x/auto47.C (revision 0)
+++ testsuite/g++.dg/cpp0x/auto47.C (working copy)
@@ -0,0 +1,8 @@
+// PR c++/70540
+// { dg-do compile { target c++11 } }
+
+void
+foo ()
+{
+  auto f = [&] { return f; };  // { dg-error "before deduction" }
+}


Re: RFC: C++ PATCH to adjust empty class parameter passing ABI

2016-04-14 Thread Ramana Radhakrishnan
On Thu, Apr 14, 2016 at 10:41 AM, Kyrill Tkachov
 wrote:
>
> On 14/04/16 09:02, Christophe Lyon wrote:
>>
>> On 13 April 2016 at 22:12, Jason Merrill  wrote:
>>>
>>> On 04/13/2016 03:18 PM, Jakub Jelinek wrote:

 On Wed, Apr 13, 2016 at 03:11:34PM -0400, Jason Merrill wrote:
>
> commit 761983a023b5217ef831a43f423779940c788ecf
> Author: Jason Merrill 
> Date:   Tue Apr 12 13:16:50 2016 -0400
>
>   gcc/
>  * cfgexpand.c (pass_expand::execute): Handle attribute
> abi_warning.
>  * expr.c (expand_expr_real_1): Likewise.
>   gcc/cp/
>  * call.c (empty_class_msg, mark_for_abi_warning): New.
>  (build_call_a): Use them.
>  * decl.c (store_parm_decls): Use mark_for_abi_warning.
>  * error.c (pp_format_to_string): New.


 I think you should put a space into the attribute name instead of _
 to make it clear that it is not an attribute users can use directly in
 their
 code through __attribute__.

 Otherwise it looks reasonable to me.
>>>
>>>
>>> Thanks, applied with that change.
>>>
>> Hi,
>>
>> I'm seeing g++.dg/abi/empty13.C failing at execution on
>> aarch64-none-linux-gnu (using qemu):
>> qemu: uncaught target signal 11 (Segmentation fault) - core dumped
>>
>> and aarch64_be-none-elf (using the Foundation Model):
>> Terminated by exception.
>>
>> But it passes on aarch64-none-elf (using the Foundation Model too)
>>
>> Am I the only one?
>
> Hi Christophe,
>
> I see the test failing on aarch64-none-linux-gnu (native)
> with no output, just:
> spawn [open ...]
> FAIL: g++.dg/abi/empty13.C  -std=gnu++98 execution test
>
> And I see it passing on aarch64-none-elf.

IIRC on AArch64 we document that empty classes and structs take up a
slot as per the PCS .

http://infocenter.arm.com/help/topic/com.arm.doc.ihi0059b/IHI0059B_cppabi64.pdf

See section 3.1

For the purposes of parameter passing in
[
AAPCS64
], a parameter whose type is an empty class shall be treated as if its
type were an aggregate with a
single
member of type unsigned byte.



regards
Ramana



>
> I haven't tried aarch64_be-none-elf yet.
>
> Kyrill
>
>
>> Christophe.
>>
>>
>>
>>> Jason
>>>
>


Re: RFC: C++ PATCH to adjust empty class parameter passing ABI

2016-04-14 Thread Jakub Jelinek
On Thu, Apr 14, 2016 at 11:25:07AM +0100, Ramana Radhakrishnan wrote:
> > I see the test failing on aarch64-none-linux-gnu (native)
> > with no output, just:
> > spawn [open ...]
> > FAIL: g++.dg/abi/empty13.C  -std=gnu++98 execution test
> >
> > And I see it passing on aarch64-none-elf.
> 
> IIRC on AArch64 we document that empty classes and structs take up a
> slot as per the PCS .
> 
> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0059b/IHI0059B_cppabi64.pdf
> 
> See section 3.1
> 
> For the purposes of parameter passing in
> [
> AAPCS64
> ], a parameter whose type is an empty class shall be treated as if its
> type were an aggregate with a
> single
> member of type unsigned byte.

One thing is passing such arguments in registers, another is how they are
passed on the stack (e.g. if there are several other arguments that are
passed in registers).
Also, note that the (new) definition of C++ empty class I believe includes
even empty classes that are normally very large, i.e. don't pretend they
are just single byte, but can be e.g. 1024 bytes or longer.
Do you want to still pass them in registers and/or on the stack as if
it is 1 byte, 0 bytes or more bytes?
How do you pass empty C structures?

E.g. looking at sparc-solaris2.12 cross, C for empty structures passes
both in registers and on the stack as if there is one argument (and C++ did
too for the empty classes that pretended to have size 1), so appart from the
ICE in emit_move_insn there is no ABI change, only perhaps for the larger
structs.

Guess we need to figure out what do the various targets do for C empty
structures and decide what we do want to do with C++ empty classes.
Perhaps the C++ FE should just set some flag on the empty class parameters,
and let the backends decide what to do with those?

Jakub


Re: [PATCH] 69517 - [5/6 regression] SEGV on a VLA with excess initializer elements

2016-04-14 Thread Andreas Schwab
Martin Sebor  writes:

> diff --git a/gcc/testsuite/g++.dg/cpp1y/vla11.C 
> b/gcc/testsuite/g++.dg/cpp1y/vla11.C
> new file mode 100644
> index 000..af9624a
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp1y/vla11.C
> @@ -0,0 +1,711 @@
> +// PR c++/69517 - [5/6 regression] SEGV on a VLA with excess initializer
> +//   elements
> +// PR c++/70019 - VLA size overflow not detected
> +//
> +// Runtime test to verify that attempting to either construct a VLA with
> +// erroneous bounds, or initialize one with an initializer-list that
> +// contains more elements than the VLA's non-constant (runtime) bounds
> +// causes an exception to be thrown.  Test also verifies that valid
> +// VLAs and their initializers don't cause such an exception.
> +
> +// { dg-do run { target c++11 } }
> +// { dg-additional-options "-Wno-vla" }

On m68k:

/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In 
instantiation of 'struct TestType<32u>':
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:201:1:   
required from here
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:89:27: error: 
requested alignment 32 is larger than 16 [-Wattributes]
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In function 
'int testcase201(const char*)':
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: 
static assertion failed: wrong test type size
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:201:1: note: 
in expansion of macro 'TEST'
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In 
instantiation of 'struct TestType<64u>':
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:202:1:   
required from here
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:89:27: error: 
requested alignment 64 is larger than 16 [-Wattributes]
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In function 
'int testcase202(const char*)':
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: 
static assertion failed: wrong test type size
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:202:1: note: 
in expansion of macro 'TEST'
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In function 
'int testcase704(const char*)':
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: 
static assertion failed: wrong test type size
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:704:1: note: 
in expansion of macro 'TEST'
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In function 
'int testcase705(const char*)':
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: 
static assertion failed: wrong test type size
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:705:1: note: 
in expansion of macro 'TEST'

FAIL: g++.dg/cpp1y/vla11.C  -std=c++11 (test for excess errors)
Excess errors:
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:89:27: error: 
requested alignment 32 is larger than 16 [-Wattributes]
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: 
static assertion failed: wrong test type size
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:89:27: error: 
requested alignment 64 is larger than 16 [-Wattributes]
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: 
static assertion failed: wrong test type size
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: 
static assertion failed: wrong test type size
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:122:5: error: 
static assertion failed: wrong test type size

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[PATCH] Another small PRE speedup / cleanup

2016-04-14 Thread Richard Biener

As partial antic compute ignores back-edges (it's a union DF problem
and thus would blow up when working over back-edges) we can avoid
iterating if we use the correct processing order (postorder).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Queued for stage1.

Richard.

2016-04-14  Richard Biener  

* tree-ssa-pre.c (postorder, postorder_num): Make locals ...
(compute_antic): ... here.  For partial antic use regular
postorder and scrap iteration.
(compute_partial_antic_aux): Remove unused return value.
(init_pre): Do not allocate postorder.
(fini_pre): Do not free postorder.

Index: gcc/tree-ssa-pre.c
===
*** gcc/tree-ssa-pre.c  (revision 234970)
--- gcc/tree-ssa-pre.c  (working copy)
*** typedef struct bb_bitmap_sets
*** 432,441 
  #define BB_LIVE_VOP_ON_EXIT(BB) ((bb_value_sets_t) ((BB)->aux))->vop_on_exit
  
  
- /* Basic block list in postorder.  */
- static int *postorder;
- static int postorder_num;
- 
  /* This structure is used to keep track of statistics on what
 optimization PRE was able to perform.  */
  static struct
--- 432,437 
*** compute_antic_aux (basic_block block, bo
*** 2209,2219 
  - ANTIC_IN[BLOCK])
  
  */
! static bool
  compute_partial_antic_aux (basic_block block,
   bool block_has_abnormal_pred_edge)
  {
-   bool changed = false;
bitmap_set_t old_PA_IN;
bitmap_set_t PA_OUT;
edge e;
--- 2228,2237 
  - ANTIC_IN[BLOCK])
  
  */
! static void
  compute_partial_antic_aux (basic_block block,
   bool block_has_abnormal_pred_edge)
  {
bitmap_set_t old_PA_IN;
bitmap_set_t PA_OUT;
edge e;
*** compute_partial_antic_aux (basic_block b
*** 2312,2320 
  
dependent_clean (PA_IN (block), ANTIC_IN (block));
  
-   if (!bitmap_set_equal (old_PA_IN, PA_IN (block)))
- changed = true;
- 
   maybe_dump_sets:
if (dump_file && (dump_flags & TDF_DETAILS))
  {
--- 2330,2335 
*** compute_partial_antic_aux (basic_block b
*** 2327,2333 
  bitmap_set_free (old_PA_IN);
if (PA_OUT)
  bitmap_set_free (PA_OUT);
-   return changed;
  }
  
  /* Compute ANTIC and partial ANTIC sets.  */
--- 2342,2347 
*** compute_antic (void)
*** 2349,2371 
  
FOR_ALL_BB_FN (block, cfun)
  {
FOR_EACH_EDGE (e, ei, block->preds)
if (e->flags & EDGE_ABNORMAL)
  {
bitmap_set_bit (has_abnormal_preds, block->index);
break;
  }
  
-   BB_VISITED (block) = 0;
- 
/* While we are here, give empty ANTIC_IN sets to each block.  */
ANTIC_IN (block) = bitmap_set_new ();
!   PA_IN (block) = bitmap_set_new ();
  }
  
/* At the exit block we anticipate nothing.  */
BB_VISITED (EXIT_BLOCK_PTR_FOR_FN (cfun)) = 1;
  
sbitmap worklist = sbitmap_alloc (last_basic_block_for_fn (cfun) + 1);
bitmap_ones (worklist);
while (changed)
--- 2363,2395 
  
FOR_ALL_BB_FN (block, cfun)
  {
+   BB_VISITED (block) = 0;
+ 
FOR_EACH_EDGE (e, ei, block->preds)
if (e->flags & EDGE_ABNORMAL)
  {
bitmap_set_bit (has_abnormal_preds, block->index);
+ 
+   /* We also anticipate nothing.  */
+   BB_VISITED (block) = 1;
break;
  }
  
/* While we are here, give empty ANTIC_IN sets to each block.  */
ANTIC_IN (block) = bitmap_set_new ();
!   if (do_partial_partial)
!   PA_IN (block) = bitmap_set_new ();
  }
  
/* At the exit block we anticipate nothing.  */
BB_VISITED (EXIT_BLOCK_PTR_FOR_FN (cfun)) = 1;
  
+   /* For ANTIC computation we need a postorder that also guarantees that
+  a block with a single successor is visited after its successor.
+  RPO on the inverted CFG has this property.  */
+   int *postorder = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
+   int postorder_num = inverted_post_order_compute (postorder);
+ 
sbitmap worklist = sbitmap_alloc (last_basic_block_for_fn (cfun) + 1);
bitmap_ones (worklist);
while (changed)
*** compute_antic (void)
*** 2403,2441 
  
if (do_partial_partial)
  {
!   bitmap_ones (worklist);
!   num_iterations = 0;
!   changed = true;
!   while (changed)
{
! if (dump_file && (dump_flags & TDF_DETAILS))
!   fprintf (dump_file, "Starting iteration %d\n", num_iterations);
! num_iterations++;
! changed = false;
! for (i = postorder_num - 1 ; i >= 0; i--)
!   {
! if (bitmap_bit_p (worklist, postorder[i]))
!   {
! basic_block block = BASIC_BLOCK_FOR_FN (cfun, postorder[i]);
! bitmap_clear_bit (worklist, block->index);
! if (compute_partial_a

[PATCH] Fix PR70614

2016-04-14 Thread Richard Biener

The following patch (the 2nd hunk actually) brings down compile-time
for the testcase in the PR from  to 0.5s.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2016-04-14  Richard Biener  

PR tree-optimization/70614
* tree-scalar-evolution.c (analyze_evolution_in_loop): Terminate
loop if the evolution dropped to chrec_dont_know.
(interpret_condition_phi): Likewise.

Index: gcc/tree-scalar-evolution.c
===
*** gcc/tree-scalar-evolution.c (revision 234970)
--- gcc/tree-scalar-evolution.c (working copy)
*** analyze_evolution_in_loop (gphi *loop_ph
*** 1510,1515 
--- 1510,1518 
/* When there are multiple back edges of the loop (which in fact never
 happens currently, but nevertheless), merge their evolutions.  */
evolution_function = chrec_merge (evolution_function, ev_fn);
+ 
+   if (evolution_function == chrec_dont_know)
+   break;
  }
  
if (dump_file && (dump_flags & TDF_SCEV))
*** interpret_condition_phi (struct loop *lo
*** 1687,1692 
--- 1690,1697 
(loop, PHI_ARG_DEF (condition_phi, i));
  
res = chrec_merge (res, branch_chrec);
+   if (res == chrec_dont_know)
+   break;
  }
  
return res;



Re: RFC: C++ PATCH to adjust empty class parameter passing ABI

2016-04-14 Thread Ramana Radhakrishnan
On Thu, Apr 14, 2016 at 11:36 AM, Jakub Jelinek  wrote:
> On Thu, Apr 14, 2016 at 11:25:07AM +0100, Ramana Radhakrishnan wrote:
>> > I see the test failing on aarch64-none-linux-gnu (native)
>> > with no output, just:
>> > spawn [open ...]
>> > FAIL: g++.dg/abi/empty13.C  -std=gnu++98 execution test
>> >
>> > And I see it passing on aarch64-none-elf.
>>


I don't understand why this passes on aarch64-none-elf btw.

>> IIRC on AArch64 we document that empty classes and structs take up a
>> slot as per the PCS .
>>
>> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0059b/IHI0059B_cppabi64.pdf
>>
>> See section 3.1
>>
>> For the purposes of parameter passing in
>> [
>> AAPCS64
>> ], a parameter whose type is an empty class shall be treated as if its
>> type were an aggregate with a
>> single
>> member of type unsigned byte.
>
> One thing is passing such arguments in registers, another is how they are
> passed on the stack (e.g. if there are several other arguments that are
> passed in registers).


It would just be treated like a structure with an unsigned byte and
treated the same way for parameter passing on the stack - i.e. IIRC
take a slot of 8 bytes.

> Also, note that the (new) definition of C++ empty class I believe includes
> even empty classes that are normally very large, i.e. don't pretend they
> are just single byte, but can be e.g. 1024 bytes or longer.
> Do you want to still pass them in registers and/or on the stack as if
> it is 1 byte, 0 bytes or more bytes?

I don't know off-hand.

Is this a valid example for what you have in mind ?

struct baz
{
  char a[1024];
};
struct foo : baz
{
};

int bar (struct foo b, int x)

if so, that would get treated as though it is just a 1 byte quantity
for C++. as per current behaviour by both gcc and llvm for aarch64.

> How do you pass empty C structures?

I don't think we do anything special for empty C structures which IIRC
means they don't take a slot.


>
> E.g. looking at sparc-solaris2.12 cross, C for empty structures passes
> both in registers and on the stack as if there is one argument (and C++ did
> too for the empty classes that pretended to have size 1), so appart from the
> ICE in emit_move_insn there is no ABI change, only perhaps for the larger
> structs.
>
> Guess we need to figure out what do the various targets do for C empty
> structures and decide what we do want to do with C++ empty classes.
> Perhaps the C++ FE should just set some flag on the empty class parameters,
> and let the backends decide what to do with those.



Ramana

>
> Jakub


[PATCH] Fix PR70130

2016-04-14 Thread Richard Biener

The following fixes PR70130 - improved SLP capabilities now run into
the realignment code on ppc which doesn't properly verify that all
vector loads emitted by vectorizable_load share the same alignment.

Bootstrap / regtest pending on x86_64-unknown-linux-gnu.

Bootstrapped / tested on ppc64le by Alan.

Richard.

2016-04-14  Richard Biener  
Alan Modra  

PR tree-optimization/70130
* tree-vect-data-refs.c (vect_supportable_dr_alignment): Detect
when alignment stays not the same and no not use the realign
scheme then.

* gcc.dg/vect/O3-pr70130.c: New testcase.

Index: gcc/tree-vect-data-refs.c
===
*** gcc/tree-vect-data-refs.c   (revision 234970)
--- gcc/tree-vect-data-refs.c   (working copy)
*** vect_supportable_dr_alignment (struct da
*** 5983,5992 
  || targetm.vectorize.builtin_mask_for_load ()))
{
  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
! if ((nested_in_vect_loop
!  && (TREE_INT_CST_LOW (DR_STEP (dr))
!  != GET_MODE_SIZE (TYPE_MODE (vectype
!   || !loop_vinfo)
return dr_explicit_realign;
  else
return dr_explicit_realign_optimized;
--- 5983,6001 
  || targetm.vectorize.builtin_mask_for_load ()))
{
  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
! 
! /* If we are doing SLP then the accesses need not have the
!same alignment, instead it depends on the SLP group size.  */
! if (loop_vinfo
! && STMT_SLP_TYPE (stmt_info)
! && (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
! * GROUP_SIZE (vinfo_for_stmt (GROUP_FIRST_ELEMENT 
(stmt_info
! % TYPE_VECTOR_SUBPARTS (vectype) != 0)
!   ;
! else if (!loop_vinfo
!  || (nested_in_vect_loop
!  && (TREE_INT_CST_LOW (DR_STEP (dr))
!  != GET_MODE_SIZE (TYPE_MODE (vectype)
return dr_explicit_realign;
  else
return dr_explicit_realign_optimized;
Index: gcc/testsuite/gcc.dg/vect/O3-pr70130.c
===
*** gcc/testsuite/gcc.dg/vect/O3-pr70130.c  (revision 0)
--- gcc/testsuite/gcc.dg/vect/O3-pr70130.c  (working copy)
***
*** 0 
--- 1,94 
+ /* { dg-do run } */
+ /* { dg-require-effective-target vsx_hw { target powerpc*-*-* } } */
+ /* { dg-additional-options "-mcpu=power7" { target powerpc*-*-* } } */
+ 
+ struct foo
+ {
+   short a[3][16][16];
+   short pad;
+ } images[8];
+ 
+ void __attribute__ ((noinline, noclone))
+ Loop_err (struct foo *img, const int s[16][2], int s0)
+ {
+   int i, j;
+ 
+   for (j = 0; j < 16; j++)
+ {
+   for (i=0; i < 16; i++)
+   {
+ img->a[0][j][i] = s[i][0];
+ img->a[1][j][i] = s[j][1];
+ img->a[2][j][i] = s0;
+   }
+ }
+ }
+ 
+ const int s[16][2] = { { 1, 16 }, { 2, 15 }, { 3, 14 }, { 4, 13 },
+  { 5, 12 }, { 6, 11 }, { 7, 10 }, { 8, 9 },
+  { 9, 8 }, { 10, 7 }, { 11, 6 }, { 12, 5 },
+  { 13, 4 }, { 14, 3 }, { 15, 2 }, { 16, 1 } };
+ const struct foo expected
+ = { { { { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
+   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 } },
+   { { 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16 },
+   { 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15 },
+   { 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14 },
+   { 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13 },
+   { 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12 },
+   { 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11 },
+   { 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10 },
+   { 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9 

Re: RFC: C++ PATCH to adjust empty class parameter passing ABI

2016-04-14 Thread Jakub Jelinek
On Thu, Apr 14, 2016 at 12:19:26PM +0100, Ramana Radhakrishnan wrote:
> Is this a valid example for what you have in mind ?
> 
> struct baz
> {
>   char a[1024];
> };
> struct foo : baz
> {
> };
> 
> int bar (struct foo b, int x)

No, I meant say:
struct A {};
struct B { A a[1024]; };
int bar (struct B b, int c)
{ 
  return c;
} 
int baz (int a, int b, int c, int d, int e, int f, int g, struct B h, int i)
{
  return g + i;
}

Strangely, we warn about this with -Wabi=9 even on x86_64-linux on both
testcases, while only on baz it actually changes code generation.

> if so, that would get treated as though it is just a 1 byte quantity
> for C++. as per current behaviour by both gcc and llvm for aarch64.
> 
> > How do you pass empty C structures?
> 
> I don't think we do anything special for empty C structures which IIRC
> means they don't take a slot.

I think the intent of the C++ changes was to make them roughly match (at
least for the empty classes where C++ says sizeof (...) == 1 while
C says sizeof (...) == 0) what C does.

Jakub


Re: RFC: C++ PATCH to adjust empty class parameter passing ABI

2016-04-14 Thread Ramana Radhakrishnan
On Thu, Apr 14, 2016 at 12:25 PM, Jakub Jelinek  wrote:
> On Thu, Apr 14, 2016 at 12:19:26PM +0100, Ramana Radhakrishnan wrote:
>> Is this a valid example for what you have in mind ?
>>
>> struct baz
>> {
>>   char a[1024];
>> };
>> struct foo : baz
>> {
>> };
>>
>> int bar (struct foo b, int x)
>
> No, I meant say:
> struct A {};
> struct B { A a[1024]; };
> int bar (struct B b, int c)
> {
>   return c;
> }
> int baz (int a, int b, int c, int d, int e, int f, int g, struct B h, int i)
> {
>   return g + i;
> }
>
> Strangely, we warn about this with -Wabi=9 even on x86_64-linux on both
> testcases, while only on baz it actually changes code generation.
>

Seems to be treated as passing a 1 byte quantity before the change.
Note you need one more parameter on aarch64 to force struct B into the
stack.

IIRC the same problem should happen on AArch32 as well but I haven't
tried that yet.


regards
Ramana


Re: [PATCH] [AArch64] support -mfentry feature for arm64

2016-04-14 Thread Maxim Kuvyrkov
On Mar 14, 2016, at 11:14 AM, Li Bin  wrote:
> 
> As ARM64 is entering enterprise world, machines can not be stopped for
> some critical enterprise production environment, that is, live patch as
> one of the RAS features is increasing more important for ARM64 arch now.
> 
> Now, the mainstream live patch implementation which has been merged in
> Linux kernel (x86/s390) is based on the 'ftrace with regs' feature, and
> this feature needs the help of gcc. 
> 
> This patch proposes a generic solution for arm64 gcc which called mfentry,
> following the example of x86, mips, s390, etc. and on these archs, this
> feature has been used to implement the ftrace feature 'ftrace with regs'
> to support live patch.
> 
> By now, there is an another solution from linaro [1], which proposes to
> implement a new option -fprolog-pad=N that generate a pad of N nops at the
> beginning of each function. This solution is a arch-independent way for gcc,
> but there may be some limitations which have not been recognized for Linux
> kernel to adapt to this solution besides the discussion on [2]

It appears that implementing -fprolog-pad=N option in GCC will not enable 
kernel live-patching support for AArch64.  The proposal for the option was to 
make GCC output a given number of NOPs at the beginning of each function, and 
then the kernel could use that NOP pad to insert whatever instructions it 
needs.  The modification of kernel instruction stream needs to be done 
atomically, and, unfortunately, it seems the kernel can use only 
architecture-provided atomicity primitives -- i.e., changing at most 8 bytes at 
a time.

>From the kernel discussion thread it appears that the pad needs to be more 
>than 8 bytes, and that the kernel can't update that atomically.  However if 
>-mfentry approach is used, then we need to update only 4 (or 8) bytes of the 
>pad, and we avoid the atomicity problem.

Therefore, [unless there is a clever multi-stage update process to atomically 
change NOPs to whatever we need,] I think we have to go with Li's -mfentry 
approach.

Comments?

--
Maxim Kuvyrkov
www.linaro.org


> , typically
> for powerpc archs. Furthermore I think there are no good reasons to promote
> the other archs (such as x86) which have implemented the feature 'ftrace with 
> regs'
> to replace the current method with the new option, which may bring heavily
> target-dependent code adaption, as a result it becomes a arm64 dedicated
> solution, leaving kernel with two different forms of implementation. 
> 
> [1] https://gcc.gnu.org/ml/gcc/2015-10/msg00090.html
> [2] 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401854.html



Re: [PATCH] Fix PR70130

2016-04-14 Thread Bill Schmidt
On Thu, 2016-04-14 at 13:22 +0200, Richard Biener wrote:
> The following fixes PR70130 - improved SLP capabilities now run into
> the realignment code on ppc which doesn't properly verify that all
> vector loads emitted by vectorizable_load share the same alignment.
> 
> Bootstrap / regtest pending on x86_64-unknown-linux-gnu.
> 
> Bootstrapped / tested on ppc64le by Alan.
> 
> Richard.
> 
> 2016-04-14  Richard Biener  
>   Alan Modra  
> 
>   PR tree-optimization/70130
>   * tree-vect-data-refs.c (vect_supportable_dr_alignment): Detect
>   when alignment stays not the same and no not use the realign
>   scheme then.
> 
>   * gcc.dg/vect/O3-pr70130.c: New testcase.
> 
> Index: gcc/tree-vect-data-refs.c
> ===
> *** gcc/tree-vect-data-refs.c (revision 234970)
> --- gcc/tree-vect-data-refs.c (working copy)
> *** vect_supportable_dr_alignment (struct da
> *** 5983,5992 
> || targetm.vectorize.builtin_mask_for_load ()))
>   {
> tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> !   if ((nested_in_vect_loop
> !&& (TREE_INT_CST_LOW (DR_STEP (dr))
> !!= GET_MODE_SIZE (TYPE_MODE (vectype
> !   || !loop_vinfo)
>   return dr_explicit_realign;
> else
>   return dr_explicit_realign_optimized;
> --- 5983,6001 
> || targetm.vectorize.builtin_mask_for_load ()))
>   {
> tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> ! 
> !   /* If we are doing SLP then the accesses need not have the
> !  same alignment, instead it depends on the SLP group size.  */
> !   if (loop_vinfo
> !   && STMT_SLP_TYPE (stmt_info)
> !   && (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
> !   * GROUP_SIZE (vinfo_for_stmt (GROUP_FIRST_ELEMENT 
> (stmt_info
> !   % TYPE_VECTOR_SUBPARTS (vectype) != 0)

Parentheses here look wrong.  Should be one fewer ending paren on the "*
GROUP_SIZE" line and one more on the following line, right?

Bill

> ! ;
> !   else if (!loop_vinfo
> !|| (nested_in_vect_loop
> !&& (TREE_INT_CST_LOW (DR_STEP (dr))
> !!= GET_MODE_SIZE (TYPE_MODE (vectype)
>   return dr_explicit_realign;
> else
>   return dr_explicit_realign_optimized;
> Index: gcc/testsuite/gcc.dg/vect/O3-pr70130.c
> ===
> *** gcc/testsuite/gcc.dg/vect/O3-pr70130.c(revision 0)
> --- gcc/testsuite/gcc.dg/vect/O3-pr70130.c(working copy)
> ***
> *** 0 
> --- 1,94 
> + /* { dg-do run } */
> + /* { dg-require-effective-target vsx_hw { target powerpc*-*-* } } */
> + /* { dg-additional-options "-mcpu=power7" { target powerpc*-*-* } } */
> + 
> + struct foo
> + {
> +   short a[3][16][16];
> +   short pad;
> + } images[8];
> + 
> + void __attribute__ ((noinline, noclone))
> + Loop_err (struct foo *img, const int s[16][2], int s0)
> + {
> +   int i, j;
> + 
> +   for (j = 0; j < 16; j++)
> + {
> +   for (i=0; i < 16; i++)
> + {
> +   img->a[0][j][i] = s[i][0];
> +   img->a[1][j][i] = s[j][1];
> +   img->a[2][j][i] = s0;
> + }
> + }
> + }
> + 
> + const int s[16][2] = { { 1, 16 }, { 2, 15 }, { 3, 14 }, { 4, 13 },
> +{ 5, 12 }, { 6, 11 }, { 7, 10 }, { 8, 9 },
> +{ 9, 8 }, { 10, 7 }, { 11, 6 }, { 12, 5 },
> +{ 13, 4 }, { 14, 3 }, { 15, 2 }, { 16, 1 } };
> + const struct foo expected
> + = { { { { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> + { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 } },
> +   { { 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16 },
> + { 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15 },
> + { 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14 },
> + { 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13

Re: [PATCH] [AArch64] support -mfentry feature for arm64

2016-04-14 Thread Andrew Pinski
On Thu, Apr 14, 2016 at 9:08 PM, Maxim Kuvyrkov
 wrote:
> On Mar 14, 2016, at 11:14 AM, Li Bin  wrote:
>>
>> As ARM64 is entering enterprise world, machines can not be stopped for
>> some critical enterprise production environment, that is, live patch as
>> one of the RAS features is increasing more important for ARM64 arch now.
>>
>> Now, the mainstream live patch implementation which has been merged in
>> Linux kernel (x86/s390) is based on the 'ftrace with regs' feature, and
>> this feature needs the help of gcc.
>>
>> This patch proposes a generic solution for arm64 gcc which called mfentry,
>> following the example of x86, mips, s390, etc. and on these archs, this
>> feature has been used to implement the ftrace feature 'ftrace with regs'
>> to support live patch.
>>
>> By now, there is an another solution from linaro [1], which proposes to
>> implement a new option -fprolog-pad=N that generate a pad of N nops at the
>> beginning of each function. This solution is a arch-independent way for gcc,
>> but there may be some limitations which have not been recognized for Linux
>> kernel to adapt to this solution besides the discussion on [2]
>
> It appears that implementing -fprolog-pad=N option in GCC will not enable 
> kernel live-patching support for AArch64.  The proposal for the option was to 
> make GCC output a given number of NOPs at the beginning of each function, and 
> then the kernel could use that NOP pad to insert whatever instructions it 
> needs.  The modification of kernel instruction stream needs to be done 
> atomically, and, unfortunately, it seems the kernel can use only 
> architecture-provided atomicity primitives -- i.e., changing at most 8 bytes 
> at a time.
>

Can't we add a 16byte atomic primitive for ARM64 to the kernel?
Though you need to align all functions to a 16 byte boundary if the
-fprolog-pag=N needs to happen.  Do you know what the size that needs
to be modified?  It does seem to be either 12 or 16 bytes.

> From the kernel discussion thread it appears that the pad needs to be more 
> than 8 bytes, and that the kernel can't update that atomically.  However if 
> -mfentry approach is used, then we need to update only 4 (or 8) bytes of the 
> pad, and we avoid the atomicity problem.

I think you are incorrect, you could add a 16 byte atomic primitive if needed.

>
> Therefore, [unless there is a clever multi-stage update process to atomically 
> change NOPs to whatever we need,] I think we have to go with Li's -mfentry 
> approach.

Please consider the above of having a 16 byte (128bit) atomic
instructions be available would that be enough?

Thanks,
Andrew

>
> Comments?
>
> --
> Maxim Kuvyrkov
> www.linaro.org
>
>
>> , typically
>> for powerpc archs. Furthermore I think there are no good reasons to promote
>> the other archs (such as x86) which have implemented the feature 'ftrace 
>> with regs'
>> to replace the current method with the new option, which may bring heavily
>> target-dependent code adaption, as a result it becomes a arm64 dedicated
>> solution, leaving kernel with two different forms of implementation.
>>
>> [1] https://gcc.gnu.org/ml/gcc/2015-10/msg00090.html
>> [2] 
>> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401854.html
>


Re: [PATCH] Fix PR70130

2016-04-14 Thread Richard Biener
On Thu, 14 Apr 2016, Bill Schmidt wrote:

> On Thu, 2016-04-14 at 13:22 +0200, Richard Biener wrote:
> > The following fixes PR70130 - improved SLP capabilities now run into
> > the realignment code on ppc which doesn't properly verify that all
> > vector loads emitted by vectorizable_load share the same alignment.
> > 
> > Bootstrap / regtest pending on x86_64-unknown-linux-gnu.
> > 
> > Bootstrapped / tested on ppc64le by Alan.
> > 
> > Richard.
> > 
> > 2016-04-14  Richard Biener  
> > Alan Modra  
> > 
> > PR tree-optimization/70130
> > * tree-vect-data-refs.c (vect_supportable_dr_alignment): Detect
> > when alignment stays not the same and no not use the realign
> > scheme then.
> > 
> > * gcc.dg/vect/O3-pr70130.c: New testcase.
> > 
> > Index: gcc/tree-vect-data-refs.c
> > ===
> > *** gcc/tree-vect-data-refs.c   (revision 234970)
> > --- gcc/tree-vect-data-refs.c   (working copy)
> > *** vect_supportable_dr_alignment (struct da
> > *** 5983,5992 
> >   || targetm.vectorize.builtin_mask_for_load ()))
> > {
> >   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> > ! if ((nested_in_vect_loop
> > !  && (TREE_INT_CST_LOW (DR_STEP (dr))
> > !  != GET_MODE_SIZE (TYPE_MODE (vectype
> > !   || !loop_vinfo)
> > return dr_explicit_realign;
> >   else
> > return dr_explicit_realign_optimized;
> > --- 5983,6001 
> >   || targetm.vectorize.builtin_mask_for_load ()))
> > {
> >   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> > ! 
> > ! /* If we are doing SLP then the accesses need not have the
> > !same alignment, instead it depends on the SLP group size.  */
> > ! if (loop_vinfo
> > ! && STMT_SLP_TYPE (stmt_info)
> > ! && (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
> > ! * GROUP_SIZE (vinfo_for_stmt (GROUP_FIRST_ELEMENT 
> > (stmt_info
> > ! % TYPE_VECTOR_SUBPARTS (vectype) != 0)
> 
> Parentheses here look wrong.  Should be one fewer ending paren on the "*
> GROUP_SIZE" line and one more on the following line, right?

Yeah, though it shouldn't matter in practice.  I'll fix things up
before committing.

Richard.

> Bill
> 
> > !   ;
> > ! else if (!loop_vinfo
> > !  || (nested_in_vect_loop
> > !  && (TREE_INT_CST_LOW (DR_STEP (dr))
> > !  != GET_MODE_SIZE (TYPE_MODE (vectype)
> > return dr_explicit_realign;
> >   else
> > return dr_explicit_realign_optimized;
> > Index: gcc/testsuite/gcc.dg/vect/O3-pr70130.c
> > ===
> > *** gcc/testsuite/gcc.dg/vect/O3-pr70130.c  (revision 0)
> > --- gcc/testsuite/gcc.dg/vect/O3-pr70130.c  (working copy)
> > ***
> > *** 0 
> > --- 1,94 
> > + /* { dg-do run } */
> > + /* { dg-require-effective-target vsx_hw { target powerpc*-*-* } } */
> > + /* { dg-additional-options "-mcpu=power7" { target powerpc*-*-* } } */
> > + 
> > + struct foo
> > + {
> > +   short a[3][16][16];
> > +   short pad;
> > + } images[8];
> > + 
> > + void __attribute__ ((noinline, noclone))
> > + Loop_err (struct foo *img, const int s[16][2], int s0)
> > + {
> > +   int i, j;
> > + 
> > +   for (j = 0; j < 16; j++)
> > + {
> > +   for (i=0; i < 16; i++)
> > +   {
> > + img->a[0][j][i] = s[i][0];
> > + img->a[1][j][i] = s[j][1];
> > + img->a[2][j][i] = s0;
> > +   }
> > + }
> > + }
> > + 
> > + const int s[16][2] = { { 1, 16 }, { 2, 15 }, { 3, 14 }, { 4, 13 },
> > +  { 5, 12 }, { 6, 11 }, { 7, 10 }, { 8, 9 },
> > +  { 9, 8 }, { 10, 7 }, { 11, 6 }, { 12, 5 },
> > +  { 13, 4 }, { 14, 3 }, { 15, 2 }, { 16, 1 } };
> > + const struct foo expected
> > + = { { { { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 },
> > +   { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 } },
> > +   { { 16, 16, 

Re: [PATCH][AArch64] Backport of PR 70044 fix to GCC 5

2016-04-14 Thread Kyrill Tkachov

Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00142.html

Thanks,
Kyrill
On 04/04/16 10:10, Kyrill Tkachov wrote:

Hi all,

I'd like to backport Nicks' patch for PR 70044 to the GCC 5 branch.
The patch doesn't apply cleanly because the 
aarch64_override_options_after_change and
associated machinery was reworked for GCC 6. This is the (simple) backport of 
that
patch to GCC 5.

Bootstrapped and tested on aarch64-none-linux-gnu. Confirmed that the test 
fails before the patch
and passes with it.

Ok to commit?

Thanks,
Kyrill

2016-04-04  Nick Clifton  
Kyrylo Tkachov  

PR target/70044
* config/aarch64/aarch64.c
(aarch64_override_options_after_change): When forcing
flag_omit_frame_pointer to be true, use a special value that can
be detected if this function is called again, thus preventing
flag_omit_leaf_frame_pointer from being forced to be false.

2016-04-04  Kyrylo Tkachov  

Backport from mainline
2016-03-31  Nick Clifton  

PR target/70044
* gcc.target/aarch64/pr70044.c: New test.




Re: [PATCH] c++/70594 debug info differences

2016-04-14 Thread Nathan Sidwell

On 04/13/16 15:41, Jason Merrill wrote:


The fini_constexpr stuff is OK immediately.


As those two objects are currently GTY((deletable)) I don't think that's 
necessary.  Have I missed something?



The rest of the patch is OK if Jakub thinks it should go in, but if his approach
addresses the problem, we might as well continue to GC the tables.


Jakub?  I see you've obscured the UIDs in the dump, addressed a LABELs issue, 
and found another problem.  Do we still want DECL_UID stability?


nathan



Re: [PATCH][AArch64] Backport of PR 70044 fix to GCC 5

2016-04-14 Thread James Greenhalgh
On Thu, Apr 14, 2016 at 02:24:48PM +0100, Kyrill Tkachov wrote:
> Ping.
> https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00142.html
> 
> Thanks,
> Kyrill
> On 04/04/16 10:10, Kyrill Tkachov wrote:
> >Hi all,
> >
> >I'd like to backport Nicks' patch for PR 70044 to the GCC 5 branch.
> >The patch doesn't apply cleanly because the 
> >aarch64_override_options_after_change and
> >associated machinery was reworked for GCC 6. This is the (simple) backport 
> >of that
> >patch to GCC 5.
> >
> >Bootstrapped and tested on aarch64-none-linux-gnu. Confirmed that the test 
> >fails before the patch
> >and passes with it.
> >
> >Ok to commit?

OK.

Thanks,
James



Re: Splitting up gcc/omp-low.c?

2016-04-14 Thread Thomas Schwinge
Hi!

On Wed, 13 Apr 2016 20:20:19 +0200, Bernd Schmidt  wrote:
> On 04/13/2016 07:56 PM, Thomas Schwinge wrote:
> > 0001-Split-up-gcc-omp-low.c-plain.patch.xz attached.
> 
> That looks much better. However, the //OMPWHATEVER comments are not 
> really all that helpful.

They're helpful in the word-diff patch, as they show where I moved stuff.
I agree that they can't serve this purpose in the line-diff patch.

> I think the next step would be to clear out all 
> this stuff including the OMPCUT markers

I think the first step would be to get agreement on the several questions
that I posted yesterday.  That is, which stuff to move, into which files,
how to structure it all, and so on.  That's what I hoped the word-diff
patch would help with.  Everything else sounds a bit like busywork to me;
that is:

> start with an 
> initial patch that includes everything that actually modifies code 
> rather than moving it (omp_is_reference seems to be the major thing here).

I don't know how meaningful it is to create/discuss/review/commit the
following patch until the overall approach has been agreed upon?

commit 8ac6b83350022c74bb2af8006e51f3620889e7c1
Author: Thomas Schwinge 
Date:   Thu Apr 14 15:25:24 2016 +0200

gcc/omp-low.c:is_reference -> gcc/omp-low.h:omp_is_reference

gcc/
* omp-low.h: Include "langhooks.h".  Define omp_is_reference here,
and...
* omp-low.c: ... don't define is_reference here.  Adjust all
users.
---
 gcc/omp-low.c | 102 +++---
 gcc/omp-low.h |  10 ++
 2 files changed, 57 insertions(+), 55 deletions(-)

diff --git gcc/omp-low.c gcc/omp-low.c
index 7335abc..bb9c61e 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -1039,21 +1039,13 @@ is_variable_sized (const_tree expr)
   return !TREE_CONSTANT (TYPE_SIZE_UNIT (TREE_TYPE (expr)));
 }
 
-/* Return true if DECL is a reference type.  */
-
-static inline bool
-is_reference (tree decl)
-{
-  return lang_hooks.decls.omp_privatize_by_reference (decl);
-}
-
 /* Return the type of a decl.  If the decl is reference type,
return its base type.  */
 static inline tree
 get_base_type (tree decl)
 {
   tree type = TREE_TYPE (decl);
-  if (is_reference (decl))
+  if (omp_is_reference (decl))
 type = TREE_TYPE (type);
   return type;
 }
@@ -1348,7 +1340,7 @@ build_outer_var_ref (tree var, omp_context *ctx, bool 
lastprivate = false)
}
x = lookup_decl (var, outer);
 }
-  else if (is_reference (var))
+  else if (omp_is_reference (var))
 /* This can happen with orphaned constructs.  If var is reference, it is
possible it is shared and as such valid.  */
 x = var;
@@ -1371,7 +1363,7 @@ build_outer_var_ref (tree var, omp_context *ctx, bool 
lastprivate = false)
}
 }
 
-  if (is_reference (var))
+  if (omp_is_reference (var))
 x = build_simple_mem_ref (x);
 
   return x;
@@ -1433,7 +1425,7 @@ install_var_field (tree var, bool by_ref, int mask, 
omp_context *ctx,
   if (base_pointers_restrict)
type = build_qualified_type (type, TYPE_QUAL_RESTRICT);
 }
-  else if ((mask & 3) == 1 && is_reference (var))
+  else if ((mask & 3) == 1 && omp_is_reference (var))
 type = TREE_TYPE (type);
 
   field = build_decl (DECL_SOURCE_LOCATION (var),
@@ -1904,7 +1896,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx,
  if ((! TREE_READONLY (decl) && !OMP_CLAUSE_SHARED_READONLY (c))
  || TREE_ADDRESSABLE (decl)
  || by_ref
- || is_reference (decl))
+ || omp_is_reference (decl))
{
  by_ref = use_pointer_for_field (decl, ctx);
  install_var_field (decl, by_ref, 3, ctx);
@@ -1954,7 +1946,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx,
  && is_gimple_omp_offloaded (ctx->stmt))
{
  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE)
-   install_var_field (decl, !is_reference (decl), 3, ctx);
+   install_var_field (decl, !omp_is_reference (decl), 3, ctx);
  else if (TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE)
install_var_field (decl, true, 3, ctx);
  else
@@ -1973,7 +1965,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx,
  by_ref = use_pointer_for_field (decl, NULL);
 
  if (is_task_ctx (ctx)
- && (global || by_ref || is_reference (decl)))
+ && (global || by_ref || omp_is_reference (decl)))
{
  install_var_field (decl, false, 1, ctx);
  if (!global)
@@ -4673,10 +4665,10 @@ lower_rec_input_clauses (tree clauses, gimple_seq 
*ilist, gimple_seq *dlist,
  tree ref = build_outer_var_ref (var, ctx);
  /* For ref build_outer_var_ref already performs this.  */
  if (TREE_CODE (d) == INDIRECT_REF)
-   gcc_assert (is_reference (var));
+

Re: [PATCH] c++/70594 debug info differences

2016-04-14 Thread Jason Merrill

On 04/14/2016 09:25 AM, Nathan Sidwell wrote:

On 04/13/16 15:41, Jason Merrill wrote:


The fini_constexpr stuff is OK immediately.


As those two objects are currently GTY((deletable)) I don't think that's
necessary.  Have I missed something?


True, I suppose it doesn't make much difference.

Jason



Re: [PATCH] c++/70594 debug info differences

2016-04-14 Thread Jakub Jelinek
On Thu, Apr 14, 2016 at 09:43:26AM -0400, Jason Merrill wrote:
> On 04/14/2016 09:25 AM, Nathan Sidwell wrote:
> >On 04/13/16 15:41, Jason Merrill wrote:
> >
> >>The fini_constexpr stuff is OK immediately.
> >
> >As those two objects are currently GTY((deletable)) I don't think that's
> >necessary.  Have I missed something?
> 
> True, I suppose it doesn't make much difference.

Right now our debugging shows that the remaining issue is just an unrelated
ipa-devirt bug (where the aggressive pruning of BLOCK trees with -g0 results
in different devirt decisions from non-pruned one).

So we hopefully can get away with deletable.

Jakub


PATCH to disable the canonical types check in verify_type (PR c++/70029)

2016-04-14 Thread Marek Polacek
Looking at this PR again, seems we have reached conclusion that the way forward
for GCC 6 is to temporarily disable the check, so I'm posting a patch for that,
so as to finally resolve this PR.  The problem is that the C++ FE violates the
check when it sets FUNCTION_*_QUALIFIED flags.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-04-14  Marek Polacek  
Jan Hubicka  

PR c++/70029
* tree.c (verify_type): Disable the canonical type of main variant
check.

* g++.dg/torture/pr70029.C: New test.

diff --git gcc/testsuite/g++.dg/torture/pr70029.C 
gcc/testsuite/g++.dg/torture/pr70029.C
index e69de29..9592f0c 100644
--- gcc/testsuite/g++.dg/torture/pr70029.C
+++ gcc/testsuite/g++.dg/torture/pr70029.C
@@ -0,0 +1,12 @@
+// PR c++/70029
+// { dg-do compile }
+// { dg-options "-std=c++11 -g -flto" }
+// { dg-require-effective-target lto }
+
+struct A
+{
+  A();
+  int foo() && __attribute__ ((__warn_unused_result__)) { return 0; }
+};
+
+A a;
diff --git gcc/tree.c gcc/tree.c
index ed28429..c64d720 100644
--- gcc/tree.c
+++ gcc/tree.c
@@ -13584,7 +13584,9 @@ verify_type (const_tree t)
   debug_tree (ct);
   error_found = true;
 }
-  if (TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != ct)
+  /* FIXME: this is violated by the C++ FE as discussed in PR70029, when
+ FUNCTION_*_QUALIFIED flags are set.  */
+  if (0 && TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != ct)
{
   error ("TYPE_CANONICAL of main variant is not main variant");
   debug_tree (ct);

Marek


Re: PATCH to disable the canonical types check in verify_type (PR c++/70029)

2016-04-14 Thread Jason Merrill

On 04/14/2016 10:30 AM, Marek Polacek wrote:

+  /* FIXME: this is violated by the C++ FE as discussed in PR70029, when
+ FUNCTION_*_QUALIFIED flags are set.  */
+  if (0 && TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != ct)


How about guarding this check with flag_checking rather than disabling 
it entirely?  That way it won't affect released compilers, and we can 
downgrade the PR from P1, but doesn't hide the bug.


Jason



Re: PATCH to disable the canonical types check in verify_type (PR c++/70029)

2016-04-14 Thread Jakub Jelinek
On Thu, Apr 14, 2016 at 11:01:02AM -0400, Jason Merrill wrote:
> On 04/14/2016 10:30 AM, Marek Polacek wrote:
> >+  /* FIXME: this is violated by the C++ FE as discussed in PR70029, when
> >+ FUNCTION_*_QUALIFIED flags are set.  */
> >+  if (0 && TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != ct)
> 
> How about guarding this check with flag_checking rather than disabling it
> entirely?  That way it won't affect released compilers, and we can downgrade
> the PR from P1, but doesn't hide the bug.

That will still mean people who use -fchecking will keep reporting such
ICEs.  I think it is better to disable it and reenable after GCC 6 branches.

Jakub


Re: PATCH to disable the canonical types check in verify_type (PR c++/70029)

2016-04-14 Thread Jason Merrill

On 04/14/2016 11:05 AM, Jakub Jelinek wrote:

On Thu, Apr 14, 2016 at 11:01:02AM -0400, Jason Merrill wrote:

On 04/14/2016 10:30 AM, Marek Polacek wrote:

+  /* FIXME: this is violated by the C++ FE as discussed in PR70029, when
+ FUNCTION_*_QUALIFIED flags are set.  */
+  if (0 && TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != ct)


How about guarding this check with flag_checking rather than disabling it
entirely?  That way it won't affect released compilers, and we can downgrade
the PR from P1, but doesn't hide the bug.


That will still mean people who use -fchecking will keep reporting such
ICEs.  I think it is better to disable it and reenable after GCC 6 branches.


OK.

Jason




Re: [PATCH] 69517 - [5/6 regression] SEGV on a VLA with excess initializer elements

2016-04-14 Thread Martin Sebor

On 04/14/2016 04:39 AM, Andreas Schwab wrote:

Martin Sebor  writes:


diff --git a/gcc/testsuite/g++.dg/cpp1y/vla11.C 
b/gcc/testsuite/g++.dg/cpp1y/vla11.C
new file mode 100644
index 000..af9624a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/vla11.C
@@ -0,0 +1,711 @@
+// PR c++/69517 - [5/6 regression] SEGV on a VLA with excess initializer
+//   elements
+// PR c++/70019 - VLA size overflow not detected
+//
+// Runtime test to verify that attempting to either construct a VLA with
+// erroneous bounds, or initialize one with an initializer-list that
+// contains more elements than the VLA's non-constant (runtime) bounds
+// causes an exception to be thrown.  Test also verifies that valid
+// VLAs and their initializers don't cause such an exception.
+
+// { dg-do run { target c++11 } }
+// { dg-additional-options "-Wno-vla" }


On m68k:

/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C: In instantiation 
of 'struct TestType<32u>':
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:201:1:   
required from here
/daten/aranym/gcc/gcc-20160414/gcc/testsuite/g++.dg/cpp1y/vla11.C:89:27: error: 
requested alignment 32 is larger than 16 [-Wattributes]


Thank you for the heads up (and sorry about the breakage).  I've
committed r234976 to fix that.

Martin


Re: [PATCH] [CLEANUP] Mark locally-used functions static

2016-04-14 Thread Thomas Schwinge
Hi!

On Tue, 15 Apr 2014 09:58:41 +0200, Richard Biener  
wrote:
> On Mon, Apr 14, 2014 at 4:48 PM, Patrick Palka  wrote:
> > This patch marks "static" a bunch of locally-used, non-debug functions
> > within the GCC sources.  Doing so addresses a subset of the warnings emitted
> > when compiling the GCC sources with -Wmissing-declarations.
> >
> > I bootstrapped and regtested this change on x86_64-unknown-linux-gnu.
> 
> The gcc/ parts are ok.

It seems this hasn't been committed.  I just re-discovered that
gcc/omp-low.c:simd_clone_vector_of_formal_parm_types could/should be
static but isn't.  Instead of now resolving that one individually:
Patrick, are you going to resolve the whole lot (in next stage 1, I
suppose)?

For reference:

> > 2014-04-13  Patrick Palka  
> >
> > gcc/c/
> > * c-array-notation.c (replace_invariant_exprs): Make static.
> >
> > gcc/cp/
> > * class.c (inherit_targ_abi_tags): Make static.
> > * cp-array-notation.c (create_cmp_incr): Likewise.
> > * mangle.c (tree_string_cmp): Likewise.
> > * pt.c (fixed_parameter_pack_p): Likewise.
> > * vtable-class-hierarchy.c
> > (vtv_register_class_hierarchy_information): Likewise.
> >
> > gcc/fortran/
> > * class.c (gfc_intrinsic_hash_value): Make static.
> > * trans-expr.c (gfc_conv_intrinsic_to_class): Likewise.
> >
> > gcc/
> > * asan.c (asan_mem_ref_get_end, replace_invariant_exprs): Make
> > static.
> > * cgraphclones.c (redirect_edge_duplicating_thunks): Likewise.
> > * cgraphunit.c (decide_is_symbol_needed): Likewise.
> > * config/i386/i386.c (make_pass_insert_vzeroupper,
> > ix86_avx_emit_vzeroupper): Likewise.
> > * dwarf2out.c (init_addr_table_entry): Likewise.
> > * gengtype.c (get_string_option, already_seen_tag,
> > mark_tag_as_seen): Likewise.
> > * gimple-ssa-isolate-paths (isolate_path): Likewise.
> > * graphite.c (graphite_transform_loops): Likewise.
> > * internal-fn.c
> > (ubsan_expand_si_overflow_{addsub,neg,mul}_check): Likewise.
> > * ipa-devirt.c (hash_type_name, likely_target_p): Likewise.
> > * ipa-inline-analysis.c (simple_edge_hints): Likewise.
> > * ipa-profile.c (cmp_counts, contains_hot_call_p): Likewise.
> > * ipa-prop.c (ipa_alloc_node_params,
> > write_agg_replacement_chain): Likewise.
> > * ipa.c (can_replace_by_local_alias): Likewise.
> > * lto-streamer-out.c (output_spmbol_p): Likewise.
> > * omp-low.c (simd_clone_vector_of_formal_parm_types): Likewise.
> > * tree-inline.c (redirect_all_calls, freqs_to_counts): Likewise.
> > * tree-predcom.c (tree_predictive_commoning): Likewise.
> > * tree-sra.c (ipa_sra_modify_function_body): Likewise.
> > * tree-ssa-loop-im.c (movement_possibilyt, tree_ssa_lim):
> > Likewise.
> > * tree-ssa-loop-ivcanon.c (canonicalize_induction_variables,
> > tree_unroll_loops_completely): Likewise.
> > * tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise.
> > * tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise.
> > * tree-ssa-threadupdate.c (ssa_fix_duplicate_block_edges):
> > Likewise.
> > * tree-vect-loop-manip.c (vect_create_cond_for_alias_checks):
> > Likewise.
> > * ubsan.c (tree_type_map_hash): Likewise.
> > * varpool.c (varpool_call_variable_insertion_hooks): Likewise.
> >
> > libiberty/
> > * make-temp-file.c (choose_tmpdir): Make static.
> > ---
> >  gcc/asan.c  | 4 ++--
> >  gcc/c/c-array-notation.c| 2 +-
> >  gcc/cgraphclones.c  | 2 +-
> >  gcc/cgraphunit.c| 2 +-
> >  gcc/config/i386/i386.c  | 4 ++--
> >  gcc/cp/class.c  | 2 +-
> >  gcc/cp/cp-array-notation.c  | 2 +-
> >  gcc/cp/mangle.c | 2 +-
> >  gcc/cp/pt.c | 2 +-
> >  gcc/cp/vtable-class-hierarchy.c | 2 +-
> >  gcc/dwarf2out.c | 2 +-
> >  gcc/fortran/class.c | 2 +-
> >  gcc/fortran/trans-expr.c| 2 +-
> >  gcc/gengtype.c  | 6 +++---
> >  gcc/gimple-ssa-isolate-paths.c  | 2 +-
> >  gcc/graphite.c  | 2 +-
> >  gcc/internal-fn.c   | 6 +++---
> >  gcc/ipa-devirt.c| 4 ++--
> >  gcc/ipa-inline-analysis.c   | 2 +-
> >  gcc/ipa-profile.c   | 4 ++--
> >  gcc/ipa-prop.c  | 4 ++--
> >  gcc/ipa.c   | 2 +-
> >  gcc/lto-streamer-out.c  | 2 +-
> >  gcc/omp-low.c   | 2 +-
> >  gcc/tree-inline.c   | 4 ++--
> >  gcc/tree-predcom.c  | 2 +-
> >  gcc/tree-sra.c  | 2 +-
> >  gcc/tree-ssa-loop-im.c  | 4 ++--
> >  gcc/tree-ssa-loop-ivcanon.c | 4 ++--
> >  gcc/tree-ssa-loop-prefetch.c| 2 +-
> >  gcc/tree-ssa-loop-unswitch.c  

Re: [PATCH] 69517 - [5/6 regression] SEGV on a VLA with excess initializer elements

2016-04-14 Thread Martin Sebor

It caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70652


Sorry about that (I missed java among the languages when configuring
my build for regression testing).  I'm testing a libjava patch that
I expect will eliminate the linker error.

Martin



Re: rs6000 stack_tie mishap again

2016-04-14 Thread Andreas Krebbel
On 04/11/2016 12:15 PM, Olivier Hainque wrote:
> 
>> On Mar 28, 2016, at 19:41 , Richard Henderson  wrote:
>>
>> But I expect for stage4, the best solution is to strengthen the stack_tie 
>> pattern to block all memory.  Early scheduling of the stack frame 
>> deallocation (a simple logic insn) can't really be that important to 
>> performance.
> 
> Something like the attached patch, at least for next stage1 until the
> more general issue is sorted out ?
> 
> Only basic testing at this point. I can do more if considered appropriate.
> 
> Thanks for your feedback,
> 
> With Kind Regards,
> 
> Olivier
> 
>   * config/rs6000/rs6000.c (rs6000_emit_stack_tie): Add a
> (clobber mem:BLK (scratch)) to the set of effects, blocking
>   all memory accesses.

We have the same problem on S/390 and I'm also looking for a way to solve that. 
I'm not sure about
the (clobber (mem)) approach. Wouldn't it be theoretically possible to move a 
memory write over a
(clobber (mem))?

The patch I'm currently testing follows a proposal from Joseph Myers in 2011:
https://gcc.gnu.org/ml/gcc-patches/2011-11/msg00977.html

+ [(set (mem:BLK (scratch))
+ (unspec:BLK [(match_operand:BLK 0 "memory_operand" "+m")] UNSPEC_TIE))]

What I'm wondering is whether the memory read is actually needed here?! 
Wouldn't the following have
the same effect?

[(set (mem:BLK (scratch)) (unspec:BLK [(const_int 0)] UNSPEC_TIE))]

To my understanding this should block memory reads and writes from being moved 
across.
The MEM probably should be set up to be in the frame alias set?

Bye,

-Andreas-



Re: [PATCH] [AArch64] support -mfentry feature for arm64

2016-04-14 Thread Szabolcs Nagy
On 14/04/16 14:15, Andrew Pinski wrote:
> On Thu, Apr 14, 2016 at 9:08 PM, Maxim Kuvyrkov
>  wrote:
>> On Mar 14, 2016, at 11:14 AM, Li Bin  wrote:
>>>
>>> As ARM64 is entering enterprise world, machines can not be stopped for
>>> some critical enterprise production environment, that is, live patch as
>>> one of the RAS features is increasing more important for ARM64 arch now.
>>>
>>> Now, the mainstream live patch implementation which has been merged in
>>> Linux kernel (x86/s390) is based on the 'ftrace with regs' feature, and
>>> this feature needs the help of gcc.
>>>
>>> This patch proposes a generic solution for arm64 gcc which called mfentry,
>>> following the example of x86, mips, s390, etc. and on these archs, this
>>> feature has been used to implement the ftrace feature 'ftrace with regs'
>>> to support live patch.
>>>
>>> By now, there is an another solution from linaro [1], which proposes to
>>> implement a new option -fprolog-pad=N that generate a pad of N nops at the
>>> beginning of each function. This solution is a arch-independent way for gcc,
>>> but there may be some limitations which have not been recognized for Linux
>>> kernel to adapt to this solution besides the discussion on [2]
>>
>> It appears that implementing -fprolog-pad=N option in GCC will not enable 
>> kernel live-patching support for AArch64.  The proposal for the option was 
>> to make GCC output a given number of NOPs at the beginning of each function, 
>> and then the kernel could use that NOP pad to insert whatever instructions 
>> it needs.  The modification of kernel instruction stream needs to be done 
>> atomically, and, unfortunately, it seems the kernel can use only 
>> architecture-provided atomicity primitives -- i.e., changing at most 8 bytes 
>> at a time.
>>
> 
> Can't we add a 16byte atomic primitive for ARM64 to the kernel?
> Though you need to align all functions to a 16 byte boundary if the
> -fprolog-pag=N needs to happen.  Do you know what the size that needs
> to be modified?  It does seem to be either 12 or 16 bytes.
> 

looking at [2] i don't see why

func:
  mov x9, x30
  bl _tracefunc
  

is not good for the kernel.

mov x9, x30 is a nop at function entry, so in
theory 4 byte atomic write should be enough
to enable/disable tracing.

>> From the kernel discussion thread it appears that the pad needs to be more 
>> than 8 bytes, and that the kernel can't update that atomically.  However if 
>> -mfentry approach is used, then we need to update only 4 (or 8) bytes of the 
>> pad, and we avoid the atomicity problem.
> 
> I think you are incorrect, you could add a 16 byte atomic primitive if needed.
> 
>>
>> Therefore, [unless there is a clever multi-stage update process to 
>> atomically change NOPs to whatever we need,] I think we have to go with Li's 
>> -mfentry approach.
> 
> Please consider the above of having a 16 byte (128bit) atomic
> instructions be available would that be enough?
> 
> Thanks,
> Andrew
> 
>>
>> Comments?
>>
>> --
>> Maxim Kuvyrkov
>> www.linaro.org
>>
>>
>>> , typically
>>> for powerpc archs. Furthermore I think there are no good reasons to promote
>>> the other archs (such as x86) which have implemented the feature 'ftrace 
>>> with regs'
>>> to replace the current method with the new option, which may bring heavily
>>> target-dependent code adaption, as a result it becomes a arm64 dedicated
>>> solution, leaving kernel with two different forms of implementation.
>>>
>>> [1] https://gcc.gnu.org/ml/gcc/2015-10/msg00090.html
>>> [2] 
>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401854.html
>>
> 



Re: Splitting up gcc/omp-low.c?

2016-04-14 Thread Bernd Schmidt

On 04/14/2016 03:36 PM, Thomas Schwinge wrote:


I don't know how meaningful it is to create/discuss/review/commit the
following patch until the overall approach has been agreed upon?


Why not? We agree the file should be split, and this makes it easier. So 
I'll approve it for stage1, or earlier if RMs want this split in gcc-6.


You could however probably also remove the #include langhooks from the C 
file.



Bernd


Re: Splitting up gcc/omp-low.c?

2016-04-14 Thread Thomas Schwinge
Hi!

On Wed, 13 Apr 2016 18:01:09 +0200, I wrote:
> On Fri, 08 Apr 2016 11:36:03 +0200, I wrote:
> > On Thu, 10 Dec 2015 09:08:35 +0100, Jakub Jelinek  wrote:
> > > On Wed, Dec 09, 2015 at 06:23:22PM +0100, Bernd Schmidt wrote:
> > > > On 12/09/2015 05:24 PM, Thomas Schwinge wrote:
> > > > >how about we split up gcc/omp-low.c into several
> > > > >files?  Would it make sense (I have not yet looked in detail) to do so
> > > > >along the borders of the several passes defined therein?
> 
> > > > I suspect a split along the ompexp/omplow boundary would be quite easy 
> > > > to
> > > > achieve.

> > And possibly some kind of omp-simd.c, and omp-checking.c, and so
> > on, if feasible.
> 
> Not yet looked into these.

..., and here they are: word-diff patches creating omp-diagnostics.c and
omp-simd.c, 0001-new-file-gcc-omp-diagnostics.c.patch.xz and
0002-new-file-gcc-omp-simd.c.patch.xz.  The former contains the
"diagnose_omp_blocks" pass and the latter the "simdclone" pass, with the
respective supporting code.  I will certainly submit line-diff patches if
we agree that this is sound -- these two may actually be good candidates
to do first, individually, and do that now, because they're completely
self-contained.  Makes sense?

Should possibly rename omp-simd.c to omp-simd-clone.c to make it clear
that's the only thing it does, the "simdclone" pass?

Should we maybe rename omp-diagnostics.c to omp-structured-blocks.c, and
really only have it contain that "diagnose_omp_blocks" pass, or should we
move other diagnostic stuff like check_omp_nesting_restrictions into that
file, too?

One //OMPTODO:

--- gcc/omp-low.h
+++ gcc/omp-low.h
@@ -95,0 +96,11 @@ extern gimple *build_omp_barrier (tree);
+//OMPTODO: moved from omp-low.c.  Renamed from WALK_SUBSTMTS to 
OMP_WALK_SUBSTMTS because of the very generic name.  Used in omp-low.c and 
omp-diagnostics.c.  Alternatively, WALK_SUBSTMTS should perhaps simply be 
duplicated in the two files?
+#define OMP_WALK_SUBSTMTS  \
+case GIMPLE_BIND: \
+case GIMPLE_TRY: \
+case GIMPLE_CATCH: \
+case GIMPLE_EH_FILTER: \
+case GIMPLE_TRANSACTION: \
+  /* The sub-statements for these should be walked.  */ \
+  *handled_ops_p = false; \
+  break;

Instead of that, duplicate WALK_SUBSTMTS into both omp-low.c and
omp-diagnostics.c?


Grüße
 Thomas




0001-new-file-gcc-omp-diagnostics.c.patch.xz
Description: application/xz


0002-new-file-gcc-omp-simd.c.patch.xz
Description: application/xz


Re: PATCH to disable the canonical types check in verify_type (PR c++/70029)

2016-04-14 Thread Marek Polacek
On Thu, Apr 14, 2016 at 11:18:59AM -0400, Jason Merrill wrote:
> On 04/14/2016 11:05 AM, Jakub Jelinek wrote:
> >On Thu, Apr 14, 2016 at 11:01:02AM -0400, Jason Merrill wrote:
> >>On 04/14/2016 10:30 AM, Marek Polacek wrote:
> >>>+  /* FIXME: this is violated by the C++ FE as discussed in PR70029, when
> >>>+ FUNCTION_*_QUALIFIED flags are set.  */
> >>>+  if (0 && TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != 
> >>>ct)
> >>
> >>How about guarding this check with flag_checking rather than disabling it
> >>entirely?  That way it won't affect released compilers, and we can downgrade
> >>the PR from P1, but doesn't hide the bug.
> >
> >That will still mean people who use -fchecking will keep reporting such
> >ICEs.  I think it is better to disable it and reenable after GCC 6 branches.
> 
> OK.

Should I consider the patch approved or do we want to hear from Honza?

Marek


Re: PATCH to disable the canonical types check in verify_type (PR c++/70029)

2016-04-14 Thread Jeff Law

On 04/14/2016 10:12 AM, Marek Polacek wrote:

On Thu, Apr 14, 2016 at 11:18:59AM -0400, Jason Merrill wrote:

On 04/14/2016 11:05 AM, Jakub Jelinek wrote:

On Thu, Apr 14, 2016 at 11:01:02AM -0400, Jason Merrill wrote:

On 04/14/2016 10:30 AM, Marek Polacek wrote:

+  /* FIXME: this is violated by the C++ FE as discussed in PR70029, when
+ FUNCTION_*_QUALIFIED flags are set.  */
+  if (0 && TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != ct)


How about guarding this check with flag_checking rather than disabling it
entirely?  That way it won't affect released compilers, and we can downgrade
the PR from P1, but doesn't hide the bug.


That will still mean people who use -fchecking will keep reporting such
ICEs.  I think it is better to disable it and reenable after GCC 6 branches.


OK.


Should I consider the patch approved or do we want to hear from Honza?

Given the BZ & list discussion, I'd consider the patch approved.

I *think* the way to deal with the BZ is to change the regression marker 
to 7 and the target milestone as well.  I think leaving it as a P1 would 
be fine as that'll force revisiting in the gcc-7 timeframe.


jeff


Re: rs6000 stack_tie mishap again

2016-04-14 Thread Jeff Law

On 04/14/2016 09:47 AM, Andreas Krebbel wrote:

On 04/11/2016 12:15 PM, Olivier Hainque wrote:



On Mar 28, 2016, at 19:41 , Richard Henderson  wrote:

But I expect for stage4, the best solution is to strengthen the stack_tie 
pattern to block all memory.  Early scheduling of the stack frame deallocation 
(a simple logic insn) can't really be that important to performance.


Something like the attached patch, at least for next stage1 until the
more general issue is sorted out ?

Only basic testing at this point. I can do more if considered appropriate.

Thanks for your feedback,

With Kind Regards,

Olivier

* config/rs6000/rs6000.c (rs6000_emit_stack_tie): Add a
 (clobber mem:BLK (scratch)) to the set of effects, blocking
all memory accesses.


We have the same problem on S/390 and I'm also looking for a way to solve that. 
I'm not sure about
the (clobber (mem)) approach. Wouldn't it be theoretically possible to move a 
memory write over a
(clobber (mem))?
I thought we had code to handle this case specially, but I can't 
immediately find it in sched-deps.c.


Some ports also use an unspec_volatile to achieve the same effect:

;; UNSPEC_VOLATILE is considered to use and clobber all hard registers and
;; all of memory.  This blocks insns from being moved across this point.

(define_insn "blockage"
  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
  ""
  ""
  [(set_attr "length" "0")])

;; Do not schedule instructions accessing memory across this point.

(define_expand "memory_blockage"
  [(set (match_dup 0)
(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BLOCKAGE))]
  ""
{
  operands[0] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
  MEM_VOLATILE_P (operands[0]) = 1;
})

(define_insn "*memory_blockage"
  [(set (match_operand:BLK 0)
(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BLOCKAGE))]
  ""
  ""
  [(set_attr "length" "0")])


Jeff


Re: PATCH to disable the canonical types check in verify_type (PR c++/70029)

2016-04-14 Thread Marek Polacek
On Thu, Apr 14, 2016 at 10:40:43AM -0600, Jeff Law wrote:
> Given the BZ & list discussion, I'd consider the patch approved.
> 
> I *think* the way to deal with the BZ is to change the regression marker to
> 7 and the target milestone as well.  I think leaving it as a P1 would be
> fine as that'll force revisiting in the gcc-7 timeframe.

Thanks, I've done so now.

Marek


Re: rs6000 stack_tie mishap again

2016-04-14 Thread Olivier Hainque

> On 14 Apr 2016, at 18:50, Jeff Law  wrote:
> 
> I thought we had code to handle this case specially, but I can't immediately 
> find it in sched-deps.c.

Unless I misunderstood what you were exactly looking for,
the special code is in alias.c. E.g. write_dependence_p:

  /* (mem:BLK (scratch)) is a special mechanism to conflict with everything.
 This is used in epilogue deallocation functions.  */
  ...

> Some ports also use an unspec_volatile to achieve the same effect:
> 
> ;; UNSPEC_VOLATILE is considered to use and clobber all hard registers and
> ;; all of memory.  This blocks insns from being moved across this point.
> 
> (define_insn "blockage"
>  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
>  ""
>  ""
>  [(set_attr "length" "0")])

Yes, I pondered this one and thought that a memory barrier
would be less aggressive, while good enough.

> ;; Do not schedule instructions accessing memory across this point.
> 
> (define_expand "memory_blockage"
>  [(set (match_dup 0)
>(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BLOCKAGE))]
>  ""
> {
>  operands[0] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
>  MEM_VOLATILE_P (operands[0]) = 1;
> })
> 
> (define_insn "*memory_blockage"
>  [(set (match_operand:BLK 0)
>(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BLOCKAGE))]
>  ""
>  ""
>  [(set_attr "length" "0")])

Emitting just that alone isn't enough though. The first attempt
I made did that and failed because the whole 

  reg-restore
  reg-restore
  tie (mem blockage only)
  
sequence was allowed to move past the stack pointer reset when
it's performed as a mere register to register move.

So I ended up adding the clobber mem:blk scratch to the existing
tie parallel, which references the stack pointer register as
well so is forbidden to move past it's update.

Olivier




Re: Fix some x86 peepholes vs. the red-zone

2016-04-14 Thread Jeff Law

On 04/12/2016 05:57 AM, Bernd Schmidt wrote:

With some changes I was working on, I produced a situation where one of
the x86 peepholes made an incorrect transformation. In the prologue
expansion code, we have

/* When using red zone we may start register saving before allocating
the stack frame saving one cycle of the prologue.  However, avoid
doing this if we have to probe the stack; at least on x86_64 the
stack probe can turn into a call that clobbers a red zone location.

So, we can in some situations produce something like:

mov %ebx, -8(%rsp)
mov %edi, -16(%rsp)
subl $16, %rsp

which is fine if using a redzone. The problem is that there are
peepholes which convert sp subtractions into pushes, clobbering the
saved registers.

I've made these peepholes conditional on !x86_using_red_zone.
Bootstrapped and tested on x86_64-linux, ok (now or later)?

OK for stage1.  Uros's call whether or not to OK for the trunk right now.

jeff


[PATCH] Fix -fcompare-debug due to incorrect block pruning (PR c++/70594)

2016-04-14 Thread Jakub Jelinek
Hi!

On a testcase Tobias provided privately there is a -fcompare-debug
failure due to different cgraph node->order values in the printout.
The problem is in different result from the
noncall_stmt_may_be_vtbl_ptr_store function on a store, which is caused
by too aggressive block pruning with -g0.

The noncall_stmt_may_be_vtbl_ptr_store relies on enough BLOCKS to be
preserved, such that if block_ultimate_origin of some BLOCK (gimple_block or
up through its BLOCK_SUPERCONTEXTs) is a FUNCTION_DECL, then it is
a cdtor if the stmt is originally inlined from a cdtor, or is some
other FUNCTION_DECL if it is not originally inlined from a cdtor.
To achieve that, tree-ssa-live.c attempts to keep the blocks
with block_ultimate_origin of FUNCTION_DECL which is cdtor,
and nearest block with block_ultimate_origin of FUNCTION_DECL
that is not cdtor below the cdtor one.

There is:
  /* For ipa-polymorphic-call.c purposes, preserve blocks:
 1) with BLOCK_ABSTRACT_ORIGIN of a ctor/dtor or their clones  */
  if (inlined_polymorphic_ctor_dtor_block_p (scope, true))
{
  in_ctor_dtor_block = true;
  unused = false;
}
  /* 2) inside such blocks, the outermost block with BLOCK_ABSTRACT_ORIGIN
 being a FUNCTION_DECL.  */
  else if (in_ctor_dtor_block
   && BLOCK_ABSTRACT_ORIGIN (scope)
   && TREE_CODE (BLOCK_ABSTRACT_ORIGIN (scope)) == FUNCTION_DECL)
{
  in_ctor_dtor_block = false;
  unused = false;
}
which is roughly right, but there are 2 bugs, both of which affect
the testcase I've looked at.

1) the above works well if a non-cdtor function is inlined into cdtor
and the cdtor is then inlined into some other function and pruning happens
only afterwards, but if a non-cdtor function is inlined into cdtor, then
pruning happens on the cdtor, the block with block_ultimate_origin
of FUNCTION_DECL can be pruned, and then finally the cdtor function is
inlined into some other one and noncall_stmt_may_be_vtbl_ptr_store depends
on -g

2) another issue is that noncall_stmt_may_be_vtbl_ptr_store uses
block_ultimate_origin, and for the cdtor blocks we use it in
remove_unused_scope_block_p too (because
inlined_polymorphic_ctor_dtor_block_p calls it), but for the other
blocks we just check BLOCK_ABSTRACT_ORIGIN if it is FUNCTION_DECL.
If it isn't a FUNCTION_DECL, but block_ultimate_origin is FUNCTION_DECL,
we still prune it, even when we shouldn't.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-04-14  Jakub Jelinek  

PR c++/70594
* ipa-utils.h (polymorphic_ctor_dtor_p): New prototype.
* ipa-polymorphic-call.c (polymorphic_ctor_dtor_p): New function.
(inlined_polymorphic_ctor_dtor_block_p): Use it.
* tree-ssa-live.c (remove_unused_scope_block_p): When
in_ctor_dtor_block, avoid discarding not just BLOCKs with
BLOCK_ABSTRACT_ORIGIN being FUNCTION_DECL, but even when
block_ultimate_origin is FUNCTION_DECL.
(remove_unused_locals): If current_function_decl is
polymorphic_ctor_dtor_p, pass initial true to
remove_unused_scope_block_p' is_ctor_dtor_block.

--- gcc/ipa-utils.h.jj  2016-01-04 14:55:51.0 +0100
+++ gcc/ipa-utils.h 2016-04-14 16:46:08.828444152 +0200
@@ -70,6 +70,7 @@ void dump_possible_polymorphic_call_targ
 bool possible_polymorphic_call_target_p (tree, HOST_WIDE_INT,
 const ipa_polymorphic_call_context &,
 struct cgraph_node *);
+tree polymorphic_ctor_dtor_p (tree, bool);
 tree inlined_polymorphic_ctor_dtor_block_p (tree, bool);
 bool decl_maybe_in_construction_p (tree, tree, gimple *, tree);
 tree vtable_pointer_value_to_binfo (const_tree);
--- gcc/ipa-polymorphic-call.c.jj   2016-03-30 16:00:17.0 +0200
+++ gcc/ipa-polymorphic-call.c  2016-04-14 16:45:45.407754387 +0200
@@ -479,16 +479,12 @@ contains_type_p (tree outer_type, HOST_W
 }
 
 
-/* Return a FUNCTION_DECL if BLOCK represents a constructor or destructor.
+/* Return a FUNCTION_DECL if FN represent a constructor or destructor.
If CHECK_CLONES is true, also check for clones of ctor/dtors.  */
 
 tree
-inlined_polymorphic_ctor_dtor_block_p (tree block, bool check_clones)
+polymorphic_ctor_dtor_p (tree fn, bool check_clones)
 {
-  tree fn = block_ultimate_origin (block);
-  if (fn == NULL || TREE_CODE (fn) != FUNCTION_DECL)
-return NULL_TREE;
-
   if (TREE_CODE (TREE_TYPE (fn)) != METHOD_TYPE
   || (!DECL_CXX_CONSTRUCTOR_P (fn) && !DECL_CXX_DESTRUCTOR_P (fn)))
 {
@@ -510,6 +506,19 @@ inlined_polymorphic_ctor_dtor_block_p (t
   return fn;
 }
 
+/* Return a FUNCTION_DECL if BLOCK represents a constructor or destructor.
+   If CHECK_CLONES is true, also check for clones of ctor/dtors.  */
+
+tree
+inlined_polymorphic_ctor_dtor_block_p (tree block, bool check_clones)
+{
+  tree fn = block_ultimate_origin (block);
+  if (fn == NULL || TREE_CODE (fn) != FUNCTION_DECL)
+return NULL_TREE;
+
+  retur

Re: [PATCH] Fix -fcompare-debug due to incorrect block pruning (PR c++/70594)

2016-04-14 Thread Richard Biener
On April 14, 2016 8:31:32 PM GMT+02:00, Jakub Jelinek  wrote:
>Hi!
>
>On a testcase Tobias provided privately there is a -fcompare-debug
>failure due to different cgraph node->order values in the printout.
>The problem is in different result from the
>noncall_stmt_may_be_vtbl_ptr_store function on a store, which is caused
>by too aggressive block pruning with -g0.
>
>The noncall_stmt_may_be_vtbl_ptr_store relies on enough BLOCKS to be
>preserved, such that if block_ultimate_origin of some BLOCK
>(gimple_block or
>up through its BLOCK_SUPERCONTEXTs) is a FUNCTION_DECL, then it is
>a cdtor if the stmt is originally inlined from a cdtor, or is some
>other FUNCTION_DECL if it is not originally inlined from a cdtor.
>To achieve that, tree-ssa-live.c attempts to keep the blocks
>with block_ultimate_origin of FUNCTION_DECL which is cdtor,
>and nearest block with block_ultimate_origin of FUNCTION_DECL
>that is not cdtor below the cdtor one.
>
>There is:
>  /* For ipa-polymorphic-call.c purposes, preserve blocks:
> 1) with BLOCK_ABSTRACT_ORIGIN of a ctor/dtor or their clones  */
>  if (inlined_polymorphic_ctor_dtor_block_p (scope, true))
>{
>  in_ctor_dtor_block = true;
>  unused = false;
>}
>/* 2) inside such blocks, the outermost block with
>BLOCK_ABSTRACT_ORIGIN
> being a FUNCTION_DECL.  */
>  else if (in_ctor_dtor_block
>   && BLOCK_ABSTRACT_ORIGIN (scope)
> && TREE_CODE (BLOCK_ABSTRACT_ORIGIN (scope)) == FUNCTION_DECL)
>{
>  in_ctor_dtor_block = false;
>  unused = false;
>}
>which is roughly right, but there are 2 bugs, both of which affect
>the testcase I've looked at.
>
>1) the above works well if a non-cdtor function is inlined into cdtor
>and the cdtor is then inlined into some other function and pruning
>happens
>only afterwards, but if a non-cdtor function is inlined into cdtor,
>then
>pruning happens on the cdtor, the block with block_ultimate_origin
>of FUNCTION_DECL can be pruned, and then finally the cdtor function is
>inlined into some other one and noncall_stmt_may_be_vtbl_ptr_store
>depends
>on -g
>
>2) another issue is that noncall_stmt_may_be_vtbl_ptr_store uses
>block_ultimate_origin, and for the cdtor blocks we use it in
>remove_unused_scope_block_p too (because
>inlined_polymorphic_ctor_dtor_block_p calls it), but for the other
>blocks we just check BLOCK_ABSTRACT_ORIGIN if it is FUNCTION_DECL.
>If it isn't a FUNCTION_DECL, but block_ultimate_origin is
>FUNCTION_DECL,
>we still prune it, even when we shouldn't.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

>2016-04-14  Jakub Jelinek  
>
>   PR c++/70594
>   * ipa-utils.h (polymorphic_ctor_dtor_p): New prototype.
>   * ipa-polymorphic-call.c (polymorphic_ctor_dtor_p): New function.
>   (inlined_polymorphic_ctor_dtor_block_p): Use it.
>   * tree-ssa-live.c (remove_unused_scope_block_p): When
>   in_ctor_dtor_block, avoid discarding not just BLOCKs with
>   BLOCK_ABSTRACT_ORIGIN being FUNCTION_DECL, but even when
>   block_ultimate_origin is FUNCTION_DECL.
>   (remove_unused_locals): If current_function_decl is
>   polymorphic_ctor_dtor_p, pass initial true to
>   remove_unused_scope_block_p' is_ctor_dtor_block.
>
>--- gcc/ipa-utils.h.jj 2016-01-04 14:55:51.0 +0100
>+++ gcc/ipa-utils.h2016-04-14 16:46:08.828444152 +0200
>@@ -70,6 +70,7 @@ void dump_possible_polymorphic_call_targ
> bool possible_polymorphic_call_target_p (tree, HOST_WIDE_INT,
>const ipa_polymorphic_call_context &,
>struct cgraph_node *);
>+tree polymorphic_ctor_dtor_p (tree, bool);
> tree inlined_polymorphic_ctor_dtor_block_p (tree, bool);
> bool decl_maybe_in_construction_p (tree, tree, gimple *, tree);
> tree vtable_pointer_value_to_binfo (const_tree);
>--- gcc/ipa-polymorphic-call.c.jj  2016-03-30 16:00:17.0 +0200
>+++ gcc/ipa-polymorphic-call.c 2016-04-14 16:45:45.407754387 +0200
>@@ -479,16 +479,12 @@ contains_type_p (tree outer_type, HOST_W
> }
> 
> 
>-/* Return a FUNCTION_DECL if BLOCK represents a constructor or
>destructor.
>+/* Return a FUNCTION_DECL if FN represent a constructor or destructor.
>If CHECK_CLONES is true, also check for clones of ctor/dtors.  */
> 
> tree
>-inlined_polymorphic_ctor_dtor_block_p (tree block, bool check_clones)
>+polymorphic_ctor_dtor_p (tree fn, bool check_clones)
> {
>-  tree fn = block_ultimate_origin (block);
>-  if (fn == NULL || TREE_CODE (fn) != FUNCTION_DECL)
>-return NULL_TREE;
>-
>   if (TREE_CODE (TREE_TYPE (fn)) != METHOD_TYPE
>  || (!DECL_CXX_CONSTRUCTOR_P (fn) && !DECL_CXX_DESTRUCTOR_P (fn)))
> {
>@@ -510,6 +506,19 @@ inlined_polymorphic_ctor_dtor_block_p (t
>   return fn;
> }
> 
>+/* Return a FUNCTION_DECL if BLOCK represents a constructor or
>destructor.
>+   If CHECK_CLONES is true, also check for clones of ctor/dtors.  */
>+
>+tree
>+inlined_polymorphic_ctor

C++ PATCH for c++/70648 (wrong error with constexpr constructor)

2016-04-14 Thread Jason Merrill
When we overwrite one CONSTRUCTOR with another, we need to transfer 
CONSTRUCTOR_NO_IMPLICIT_ZERO as well.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit aa2f91a9e517f9351ae9624d35b27f6a25fc1b97
Author: Jason Merrill 
Date:   Thu Apr 14 13:08:20 2016 -0400

	PR c++/70648
	* constexpr.c (cxx_eval_store_expression): Also copy
	CONSTRUCTOR_NO_IMPLICIT_ZERO.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 37cc336..4abff20 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -3149,6 +3149,8 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
   CONSTRUCTOR_ELTS (*valp) = CONSTRUCTOR_ELTS (init);
   TREE_CONSTANT (*valp) = TREE_CONSTANT (init);
   TREE_SIDE_EFFECTS (*valp) = TREE_SIDE_EFFECTS (init);
+  CONSTRUCTOR_NO_IMPLICIT_ZERO (*valp)
+	= CONSTRUCTOR_NO_IMPLICIT_ZERO (init);
 }
   else
 *valp = init;
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-initlist10.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-initlist10.C
new file mode 100644
index 000..c12347d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-initlist10.C
@@ -0,0 +1,11 @@
+// PR c++/70648
+// { dg-do compile { target c++11 } }
+
+struct C
+{
+  template 
+  constexpr C (...) : c { static_cast(0)... } {}
+  int c[1];
+};
+
+static constexpr int b = C{}.c[0];


C++ PATCH for c++/70543 (wrong error with static const as array bound)

2016-04-14 Thread Jason Merrill
When we consider whether the initializer for a constant variable makes 
it value-dependent, we should check type-dependence as well as 
value-dependence.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 4af96532b224029a20b192be2e7cedc60b9d12d0
Author: Jason Merrill 
Date:   Thu Apr 14 13:23:44 2016 -0400

	PR c++/70543

	* pt.c (value_dependent_expression_p) [VAR_DECL]: A type-dependent
	initializer also makes the variable value-dependent.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index d066e55..4a00530 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -22670,6 +22670,7 @@ value_dependent_expression_p (tree expression)
 	  && (TREE_CODE (DECL_INITIAL (expression)) == TREE_LIST
 	  /* cp_finish_decl doesn't fold reference initializers.  */
 	  || TREE_CODE (TREE_TYPE (expression)) == REFERENCE_TYPE
+	  || type_dependent_expression_p (DECL_INITIAL (expression))
 	  || value_dependent_expression_p (DECL_INITIAL (expression
 	return true;
   return false;
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-template9.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-template9.C
new file mode 100644
index 000..2ca641d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-template9.C
@@ -0,0 +1,17 @@
+// PR c++/70543
+// { dg-do compile { target c++11 } }
+
+template 
+struct X
+{
+  template 
+  static constexpr int
+  calc (void)
+  {
+return 0;
+  }
+
+  static constexpr unsigned int value = calc ();
+
+  char foo[value];
+};


Re: [PATCH #2], PR target/70640, Fix thinkos in PowerPC IEEE 128-bit floating point emulation

2016-04-14 Thread Michael Meissner
After I applied the fix for PR 70640, one the regression testers reported the
test as an error.  This system is a power7 system that does not have an
asembler that understands the power8/power9 instructions.  This means that
-mcpu=power8 will only generate power7 code.

The difference for this test is that on power7, it generates a vspltisw
instruciton to create a register with all 1's set, while on power8 it generates
xxlorc instead (vspltisw is restricted to the Altivec registers, while xxlorc
can target all VSX registers, so it is preferred for power8 code).

Since it was in a new test and obvious, I went ahead and commited the following
patch:

2016-04-14  Michael Meissner  

PR target/70640
* gcc.target/powerpc/pr70640.c: Fix test so it correctly works on
a power7 system that does not have an assembler that supports
power8.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/testsuite/gcc.target/powerpc/pr70640.c
===
--- gcc/testsuite/gcc.target/powerpc/pr70640.c  (revision 234980)
+++ gcc/testsuite/gcc.target/powerpc/pr70640.c  (working copy)
@@ -5,7 +5,7 @@
 
 __float128 foo (__float128 a) { return -a; }
 
-/* { dg-final { scan-assembler "xxlorc" } } */
+/* { dg-final { scan-assembler "xxlorc\|vspltisw" } } */
 /* { dg-final { scan-assembler "xxlxor" } } */
 /* { dg-final { scan-assembler "vslb"   } } */
 /* { dg-final { scan-assembler "vsldoi" } } */


C++ PATCH for c++/70622 (wrong error with multiple auto)

2016-04-14 Thread Jason Merrill
My change to how we check for consistency of multiple declarators for 
the same auto type-specifier failed to consider declarators of different 
forms; we need to compare only the part of the declared type that 
corresponds to the type-specifier.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 824d6696c29b11ff8015f632c58272685c0bbfe6
Author: Jason Merrill 
Date:   Thu Apr 14 15:05:44 2016 -0400

	PR c++/70622

	* parser.c (cp_parser_init_declarator): Add auto_result parm.
	(cp_parser_simple_declaration): Pass it.
	(strip_declarator_types): New.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 00e211e..cba2d65 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -2193,7 +2193,7 @@ static tree cp_parser_decltype
 
 static tree cp_parser_init_declarator
   (cp_parser *, cp_decl_specifier_seq *, vec *,
-   bool, bool, int, bool *, tree *, location_t *);
+   bool, bool, int, bool *, tree *, location_t *, tree *);
 static cp_declarator *cp_parser_declarator
   (cp_parser *, cp_parser_declarator_kind, int *, bool *, bool, bool);
 static cp_declarator *cp_parser_direct_declarator
@@ -12337,10 +12337,9 @@ cp_parser_simple_declaration (cp_parser* parser,
   && !cp_parser_error_occurred (parser))
 cp_parser_commit_to_tentative_parse (parser);
 
-  tree last_type, auto_node;
+  tree last_type;
 
   last_type = NULL_TREE;
-  auto_node = type_uses_auto (decl_specifiers.type);
 
   /* Keep going until we hit the `;' at the end of the simple
  declaration.  */
@@ -12351,6 +12350,7 @@ cp_parser_simple_declaration (cp_parser* parser,
   cp_token *token;
   bool function_definition_p;
   tree decl;
+  tree auto_result = NULL_TREE;
 
   if (saw_declarator)
 	{
@@ -12376,7 +12376,8 @@ cp_parser_simple_declaration (cp_parser* parser,
 	declares_class_or_enum,
 	&function_definition_p,
 	maybe_range_for_decl,
-	&init_loc);
+	&init_loc,
+	&auto_result);
   /* If an error occurred while parsing tentatively, exit quickly.
 	 (That usually happens when in the body of a function; each
 	 statement is treated as a declaration-statement until proven
@@ -12384,10 +12385,10 @@ cp_parser_simple_declaration (cp_parser* parser,
   if (cp_parser_error_occurred (parser))
 	goto done;
 
-  if (auto_node)
+  if (auto_result)
 	{
-	  tree type = TREE_TYPE (decl);
-	  if (last_type && !same_type_p (type, last_type))
+	  if (last_type && last_type != error_mark_node
+	  && !same_type_p (auto_result, last_type))
 	{
 	  /* If the list of declarators contains more than one declarator,
 		 the type of each declared variable is determined as described
@@ -12395,10 +12396,11 @@ cp_parser_simple_declaration (cp_parser* parser,
 		 the same in each deduction, the program is ill-formed.  */
 	  error_at (decl_specifiers.locations[ds_type_spec],
 			"inconsistent deduction for %qT: %qT and then %qT",
-			decl_specifiers.type, last_type, type);
-	  auto_node = NULL_TREE;
+			decl_specifiers.type, last_type, auto_result);
+	  last_type = error_mark_node;
 	}
-	  last_type = type;
+	  else
+	last_type = auto_result;
 	}
 
   /* Handle function definitions specially.  */
@@ -18221,6 +18223,31 @@ cp_parser_asm_definition (cp_parser* parser)
 }
 }
 
+/* Given the type TYPE of a declaration with declarator DECLARATOR, return the
+   type that comes from the decl-specifier-seq.  */
+
+static tree
+strip_declarator_types (tree type, cp_declarator *declarator)
+{
+  for (cp_declarator *d = declarator; d;)
+switch (d->kind)
+  {
+  case cdk_id:
+  case cdk_error:
+	d = NULL;
+	break;
+
+  default:
+	if (TYPE_PTRMEMFUNC_P (type))
+	  type = TYPE_PTRMEMFUNC_FN_TYPE (type);
+	type = TREE_TYPE (type);
+	d = d->declarator;
+	break;
+  }
+
+  return type;
+}
+
 /* Declarators [gram.dcl.decl] */
 
 /* Parse an init-declarator.
@@ -18286,7 +18313,8 @@ cp_parser_init_declarator (cp_parser* parser,
 			   int declares_class_or_enum,
 			   bool* function_definition_p,
 			   tree* maybe_range_for_decl,
-			   location_t* init_loc)
+			   location_t* init_loc,
+			   tree* auto_result)
 {
   cp_token *token = NULL, *asm_spec_start_token = NULL,
*attributes_start_token = NULL;
@@ -18677,6 +18705,10 @@ cp_parser_init_declarator (cp_parser* parser,
 	finish_fully_implicit_template (parser, /*member_decl_opt=*/0);
 }
 
+  if (auto_result && is_initialized && decl_specifiers->type
+  && type_uses_auto (decl_specifiers->type))
+*auto_result = strip_declarator_types (TREE_TYPE (decl), declarator);
+
   return decl;
 }
 
@@ -25808,7 +25840,7 @@ cp_parser_single_declaration (cp_parser* parser,
 member_p,
 declares_class_or_enum,
 &function_definition_p,
-	NULL, NULL);
+	NULL, NULL, NULL);
 
 /* 7.1.1-1 [dcl.stc]
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/auto47.C b/gcc/testsuite/g++.dg/cpp0x/auto47.C
new file mode 100644
index 000..0d80be6
--- /d

[patch] fix an openacc test case

2016-04-14 Thread Cesar Philippidis
This patch fixes a segfault in libgomp.oacc-fortran/non-scalar-data.f90.
The problem here is that 'n' is a parameter, and the kernels region
implicitly adds a copy clause to n. Naturally, the test segfaults when
it comes time to write the value back to the host as the kernels region
terminates. This problem only occurs on nvptx targets.

I'll apply this patch to trunk as obvious.

Cesar
2016-04-14  Cesar Philippidis  

	libgomp/
	* testsuite/libgomp.oacc-fortran/non-scalar-data.f90: Don't
	pass parameter variables to subroutines.

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 b/libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90
index 4afb562..94e4228 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90
@@ -6,9 +6,11 @@
 program main
   implicit none
 
-  integer, parameter :: n = 100
-  integer :: array(n), i
-  
+  integer,parameter :: size = 100
+  integer :: array(size), i, n
+
+  n = size
+
   !$acc data copy(array)
   call kernels(array, n)
 


Re: rs6000 stack_tie mishap again

2016-04-14 Thread Segher Boessenkool
On Mon, Apr 11, 2016 at 12:15:10PM +0200, Olivier Hainque wrote:
> 
> > On Mar 28, 2016, at 19:41 , Richard Henderson  wrote:
> > 
> > But I expect for stage4, the best solution is to strengthen the stack_tie 
> > pattern to block all memory.  Early scheduling of the stack frame 
> > deallocation (a simple logic insn) can't really be that important to 
> > performance.
> 
> Something like the attached patch, at least for next stage1 until the
> more general issue is sorted out ?

It's a bit heavy; as a workaround, we may want this clobber in the stack
deallocation in the epilogue, but not in all other places stack_tie is
used.

And for stage 1 we do not really want a workaround, we want to fix the
actual problem ;-)


Segher


[PATCH] Fix PR target/70669 (allow __float128 to use direct move)

2016-04-14 Thread Michael Meissner
When adding the basic __float128 support, I forgot to enable direct move
support for moving __float128 between VSX registers and GPR registers.

This patch enables using direct move for __float128 variables on Power8
systems.  I bootstrapped the compiler and found no regressions with this
patch.  Is it ok to apply to the GCC trunk?

[gcc]
2016-04-14  Michael Meissner  

PR target/70669
* config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Add
direct move handlers for KFmode. Change TFmode handlers test from
FLOAT128_IEEE_P to FLOAT128_VECTOR_P.

[gcc/testsuite]
2016-04-14  Michael Meissner  

PR target/70669
* gcc.target/powerpc/pr70669.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 234910)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -3132,8 +3132,6 @@ rs6000_init_hard_regno_mode_ok (bool glo
  reg_addr[V4SFmode].reload_load   = CODE_FOR_reload_v4sf_di_load;
  reg_addr[V2DFmode].reload_store  = CODE_FOR_reload_v2df_di_store;
  reg_addr[V2DFmode].reload_load   = CODE_FOR_reload_v2df_di_load;
- reg_addr[KFmode].reload_store= CODE_FOR_reload_kf_di_store;
- reg_addr[KFmode].reload_load = CODE_FOR_reload_kf_di_load;
  reg_addr[DFmode].reload_store= CODE_FOR_reload_df_di_store;
  reg_addr[DFmode].reload_load = CODE_FOR_reload_df_di_load;
  reg_addr[DDmode].reload_store= CODE_FOR_reload_dd_di_store;
@@ -3141,7 +3139,13 @@ rs6000_init_hard_regno_mode_ok (bool glo
  reg_addr[SFmode].reload_store= CODE_FOR_reload_sf_di_store;
  reg_addr[SFmode].reload_load = CODE_FOR_reload_sf_di_load;
 
- if (FLOAT128_IEEE_P (TFmode))
+ if (FLOAT128_VECTOR_P (KFmode))
+   {
+ reg_addr[KFmode].reload_store = CODE_FOR_reload_kf_di_store;
+ reg_addr[KFmode].reload_load  = CODE_FOR_reload_kf_di_load;
+   }
+
+ if (FLOAT128_VECTOR_P (TFmode))
{
  reg_addr[TFmode].reload_store = CODE_FOR_reload_tf_di_store;
  reg_addr[TFmode].reload_load  = CODE_FOR_reload_tf_di_load;
@@ -3182,6 +3186,18 @@ rs6000_init_hard_regno_mode_ok (bool glo
  reg_addr[V8HImode].reload_vsx_gpr  = 
CODE_FOR_reload_vsx_from_gprv8hi;
  reg_addr[V16QImode].reload_vsx_gpr = 
CODE_FOR_reload_vsx_from_gprv16qi;
  reg_addr[SFmode].reload_vsx_gpr= 
CODE_FOR_reload_vsx_from_gprsf;
+
+ if (FLOAT128_VECTOR_P (KFmode))
+   {
+ reg_addr[KFmode].reload_gpr_vsx = 
CODE_FOR_reload_gpr_from_vsxkf;
+ reg_addr[KFmode].reload_vsx_gpr = 
CODE_FOR_reload_vsx_from_gprkf;
+   }
+
+ if (FLOAT128_VECTOR_P (TFmode))
+   {
+ reg_addr[TFmode].reload_gpr_vsx = 
CODE_FOR_reload_gpr_from_vsxtf;
+ reg_addr[TFmode].reload_vsx_gpr = 
CODE_FOR_reload_vsx_from_gprtf;
+   }
}
}
   else
@@ -3200,8 +3216,6 @@ rs6000_init_hard_regno_mode_ok (bool glo
  reg_addr[V4SFmode].reload_load   = CODE_FOR_reload_v4sf_si_load;
  reg_addr[V2DFmode].reload_store  = CODE_FOR_reload_v2df_si_store;
  reg_addr[V2DFmode].reload_load   = CODE_FOR_reload_v2df_si_load;
- reg_addr[KFmode].reload_store= CODE_FOR_reload_kf_si_store;
- reg_addr[KFmode].reload_load = CODE_FOR_reload_kf_si_load;
  reg_addr[DFmode].reload_store= CODE_FOR_reload_df_si_store;
  reg_addr[DFmode].reload_load = CODE_FOR_reload_df_si_load;
  reg_addr[DDmode].reload_store= CODE_FOR_reload_dd_si_store;
@@ -3209,6 +3223,12 @@ rs6000_init_hard_regno_mode_ok (bool glo
  reg_addr[SFmode].reload_store= CODE_FOR_reload_sf_si_store;
  reg_addr[SFmode].reload_load = CODE_FOR_reload_sf_si_load;
 
+ if (FLOAT128_VECTOR_P (KFmode))
+   {
+ reg_addr[KFmode].reload_store = CODE_FOR_reload_kf_si_store;
+ reg_addr[KFmode].reload_load  = CODE_FOR_reload_kf_si_load;
+   }
+
  if (FLOAT128_IEEE_P (TFmode))
{
  reg_addr[TFmode].reload_store = CODE_FOR_reload_tf_si_store;
Index: gcc/testsuite/gcc.target/powerpc/pr70669.c
===
--- gcc/testsuite/gcc.target/powerpc/pr70669.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr70669.c  (working copy)
@@ -0,0 +1,22 @@
+/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power8" } } */
+/* { dg-options "-O2 -mcpu=power8 -

Re: [PATCH] Fix PR target/70669 (allow __float128 to use direct move)

2016-04-14 Thread David Edelsohn
On Thu, Apr 14, 2016 at 6:43 PM, Michael Meissner
 wrote:
> When adding the basic __float128 support, I forgot to enable direct move
> support for moving __float128 between VSX registers and GPR registers.
>
> This patch enables using direct move for __float128 variables on Power8
> systems.  I bootstrapped the compiler and found no regressions with this
> patch.  Is it ok to apply to the GCC trunk?
>
> [gcc]
> 2016-04-14  Michael Meissner  
>
> PR target/70669
> * config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Add
> direct move handlers for KFmode. Change TFmode handlers test from
> FLOAT128_IEEE_P to FLOAT128_VECTOR_P.
>
> [gcc/testsuite]
> 2016-04-14  Michael Meissner  
>
> PR target/70669
> * gcc.target/powerpc/pr70669.c: New test.

Okay.

Thanks, David


Re: [PATCH] rtlanal: Fix bits/bytes confusion in set_noop_p (PR68814)

2016-04-14 Thread Segher Boessenkool
Hi,

This now also shows up on GCC 5, see PR70672.  It there happens in
the jump1 pass already.

Is this okay to backport to 5 and 4.9?


Segher


On Wed, Dec 09, 2015 at 08:17:51PM +, Segher Boessenkool wrote:
> The meaning of ZERO_EXTRACT depends on BITS_BIG_ENDIAN, not on
> BYTES_BIG_ENDIAN.  This caused PR68814.
> 
> Testing in progress on powerpc64le-linux; if it passes, is this
> okay for trunk?
> 
> 
> Segher
> 
> 
> 2015-12-09  Segher Boessenkool  
> 
>   PR rtl-optimization/68814
>   * rtlanal.c (set_noop_p): Use BITS_BIG_ENDIAN instead of
>   BYTES_BIG_ENDIAN.
> ---
>  gcc/rtlanal.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
> index 67098e5..f893bca 100644
> --- a/gcc/rtlanal.c
> +++ b/gcc/rtlanal.c
> @@ -1534,7 +1534,7 @@ set_noop_p (const_rtx set)
>  
>if (GET_CODE (dst) == ZERO_EXTRACT)
>  return rtx_equal_p (XEXP (dst, 0), src)
> -&& ! BYTES_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx
> +&& !BITS_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx
>  && !side_effects_p (src);
>  
>if (GET_CODE (dst) == STRICT_LOW_PART)
> -- 
> 1.9.3


C++ PATCH for c++/70528 (mutual dependence error)

2016-04-14 Thread Jason Merrill
The problem with this testcase was that in order to determine whether J 
is a literal type, we needed to implicitly declare its default 
constructor, which calls the H default constructor, and instantiating 
the template arguments for H() requires the K default constructor, which 
needs the First default mem-initializer, which we haven't parsed yet.


We can break this chain by just treating the implicitly-declared default 
constructor as constexpr.  This doesn't break anything in the regression 
testsuite: tests for non-literal type diagnostics still pass, just 
because of a member/base type being non-literal rather than because of 
the default constructor, which actually produces better diagnostics.


This is also a direction that Core has been discussing; the notion of 
literal type is only a convenience for diagnostics, it's not important 
to get it right at the cost of excessive instantiation.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 743489d435da4976c6ecdd66468b85b3ef9a1968
Author: Jason Merrill 
Date:   Thu Apr 7 07:45:03 2016 -0400

	PR c++/70528

	* class.c (type_has_constexpr_default_constructor): Return true
	for an implicitly declared constructor.

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 02a992f..e6d5bb0 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -3346,7 +3346,6 @@ add_implicitly_declared_members (tree t, tree* access_decls,
   CLASSTYPE_LAZY_DEFAULT_CTOR (t) = 1;
   if (cxx_dialect >= cxx11)
 	TYPE_HAS_CONSTEXPR_CTOR (t)
-	  /* This might force the declaration.  */
 	  = type_has_constexpr_default_constructor (t);
 }
 
@@ -5349,8 +5348,11 @@ type_has_constexpr_default_constructor (tree t)
 {
   if (!TYPE_HAS_COMPLEX_DFLT (t))
 	return trivial_default_constructor_is_constexpr (t);
-  /* Non-trivial, we need to check subobject constructors.  */
-  lazily_declare_fn (sfk_constructor, t);
+  /* Assume it's constexpr to avoid unnecessary instantiation; if the
+	 definition would have made the class non-literal, it will still be
+	 non-literal because of the base or member in question, and that
+	 gives a better diagnostic.  */
+  return true;
 }
   fns = locate_ctor (t);
   return (fns && DECL_DECLARED_CONSTEXPR_P (fns));
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-ctor12.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-ctor12.C
index 45e1ed2..42ca30a 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-ctor12.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-ctor12.C
@@ -3,6 +3,7 @@
 
 template 
 struct C {
+  C() = default;
   constexpr C(const Tp& r) { }
 };
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-default-ctor.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-default-ctor.C
index a67505d..8d352d0 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-default-ctor.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-default-ctor.C
@@ -7,6 +7,6 @@ struct A {
 struct B: A { };
 constexpr int f(B b) { return b.i; }
 
-struct C { C(); };	   // { dg-message "calls non-constexpr" }
-struct D: C { };	   // { dg-message "no constexpr constructor" }
+struct C { C(); };	   // { dg-message "" }
+struct D: C { };	   // { dg-message "" }
 constexpr int g(D d) { return 42; } // { dg-error "invalid type" }
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-ice6.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-ice6.C
index 30e0a64..bf95b24 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-ice6.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-ice6.C
@@ -6,6 +6,6 @@ struct A
   A(int);
 };
 
-struct B : A {};   // { dg-error "no matching" }
+struct B : A {};   // { dg-message "" }
 
 constexpr int foo(B) { return 0; } // { dg-error "invalid type" }
diff --git a/gcc/testsuite/g++.dg/cpp0x/inh-ctor19.C b/gcc/testsuite/g++.dg/cpp0x/inh-ctor19.C
index 7a22f88..7e2d58b 100644
--- a/gcc/testsuite/g++.dg/cpp0x/inh-ctor19.C
+++ b/gcc/testsuite/g++.dg/cpp0x/inh-ctor19.C
@@ -8,7 +8,7 @@ struct A
 
 struct B : A
 {
-  using A::A;   // { dg-error "inherited" }
+  using A::A;
 };
 
 constexpr B b;  // { dg-error "literal" }
diff --git a/gcc/testsuite/g++.dg/cpp0x/pr70528.C b/gcc/testsuite/g++.dg/cpp0x/pr70528.C
new file mode 100644
index 000..af1c84e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/pr70528.C
@@ -0,0 +1,16 @@
+// PR c++/70258
+// { dg-do compile { target c++11 } }
+
+template 
+struct H
+{
+  template 
+  H ();
+};
+
+struct J {
+  struct K {
+int First = 0;
+  };
+  H FunctionMDInfo;
+};


C++ PATCH for c++/70494 (ICE with lambda capture of array)

2016-04-14 Thread Jason Merrill
split_nonconstant_init_1 didn't know how to deal with arrays when 
building a cleanup.  Rather than add that logic there, we can use 
cxx_maybe_build_cleanup, which does understand arrays, once we adjust it 
to handle non-VAR_DECLs.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 4c385da85bf8f9a1c0f26b25441aa2d182a2c02d
Author: Jason Merrill 
Date:   Thu Apr 14 23:58:39 2016 -0400

	PR c++/70494

	* decl.c (cxx_maybe_build_cleanup): Handle non-decls.
	* typeck2.c (split_nonconstant_init_1): Use it.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 380bc79..38e6bd8 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -15021,7 +15021,8 @@ complete_vars (tree type)
 
 /* If DECL is of a type which needs a cleanup, build and return an
expression to perform that cleanup here.  Return NULL_TREE if no
-   cleanup need be done.  */
+   cleanup need be done.  DECL can also be a _REF when called from
+   split_nonconstant_init_1.  */
 
 tree
 cxx_maybe_build_cleanup (tree decl, tsubst_flags_t complain)
@@ -15039,7 +15040,10 @@ cxx_maybe_build_cleanup (tree decl, tsubst_flags_t complain)
   /* Handle "__attribute__((cleanup))".  We run the cleanup function
  before the destructor since the destructor is what actually
  terminates the lifetime of the object.  */
-  attr = lookup_attribute ("cleanup", DECL_ATTRIBUTES (decl));
+  if (DECL_P (decl))
+attr = lookup_attribute ("cleanup", DECL_ATTRIBUTES (decl));
+  else
+attr = NULL_TREE;
   if (attr)
 {
   tree id;
@@ -15098,6 +15102,7 @@ cxx_maybe_build_cleanup (tree decl, tsubst_flags_t complain)
   protected_set_expr_location (cleanup, UNKNOWN_LOCATION);
 
   if (cleanup
+  && DECL_P (decl)
   && !lookup_attribute ("warn_unused", TYPE_ATTRIBUTES (TREE_TYPE (decl)))
   /* Treat objects with destructors as used; the destructor may do
 	 something substantive.  */
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index b921689..e59ad51 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -688,14 +688,9 @@ split_nonconstant_init_1 (tree dest, tree init)
 		  code = build_stmt (input_location, EXPR_STMT, code);
 		  code = maybe_cleanup_point_expr_void (code);
 		  add_stmt (code);
-		  if (type_build_dtor_call (inner_type))
-		{
-		  code = (build_special_member_call
-			  (sub, complete_dtor_identifier, NULL, inner_type,
-			   LOOKUP_NORMAL, tf_warning_or_error));
-		  if (!TYPE_HAS_TRIVIAL_DESTRUCTOR (inner_type))
-			finish_eh_cleanup (code);
-		}
+		  if (tree cleanup
+		  = cxx_maybe_build_cleanup (sub, tf_warning_or_error))
+		finish_eh_cleanup (cleanup);
 		}
 
 	  num_split_elts++;
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-array2.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-array2.C
new file mode 100644
index 000..d0063e1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-array2.C
@@ -0,0 +1,10 @@
+// PR c++/70494
+// { dg-do compile { target c++11 } }
+
+struct A { ~A(); };
+
+int main()
+{
+  A v[] = { A(), A() };
+  auto lambda = [v]{};
+}


Re: rs6000 stack_tie mishap again

2016-04-14 Thread Andreas Krebbel
On 04/14/2016 07:10 PM, Olivier Hainque wrote:
> 
>> On 14 Apr 2016, at 18:50, Jeff Law  wrote:
>>
>> I thought we had code to handle this case specially, but I can't immediately 
>> find it in sched-deps.c.
> 
> Unless I misunderstood what you were exactly looking for,
> the special code is in alias.c. E.g. write_dependence_p:
> 
>   /* (mem:BLK (scratch)) is a special mechanism to conflict with everything.
>  This is used in epilogue deallocation functions.  */
>   ...

Ok thanks. I've verified that the dependencies are also generated when
using this expression in a clobber.

> Emitting just that alone isn't enough though. The first attempt
> I made did that and failed because the whole 
> 
>   reg-restore
>   reg-restore
>   tie (mem blockage only)
>   
> sequence was allowed to move past the stack pointer reset when
> it's performed as a mere register to register move.
> 
> So I ended up adding the clobber mem:blk scratch to the existing
> tie parallel, which references the stack pointer register as
> well so is forbidden to move past it's update.

I've seen the same on S/390 when restoring the stack pointer
from a floating point reg. But I think it is not limited to
reg-reg stack pointer restores.  Potentially this could also
happen with the stack pointer decrement in the prologue which
could also get moved across a mem only barrier.

Bye,

-Andreas-




Re: Split out OMP constructs' SIMD clone supporting code

2016-04-14 Thread Thomas Schwinge
Hi!

On Thu, 14 Apr 2016 22:27:40 +0200, I wrote:
> On Thu, 14 Apr 2016 18:01:13 +0200, I wrote:
> > "simdclone" pass, with the
> > respective supporting code.  I will certainly submit line-diff patches if
> > we agree that this is sound -- these two may actually be good candidates
> > to do first, individually, and do that now, because they're completely
> > self-contained.  Makes sense?
> 
> ;-) Made enough sense to me, so that I prepared the attached patch.  I'm
> also attaching a "-C" variant that I created using Git's -C5% option, and
> which shows how the new file gcc/omp-simd-clone.c can be created from the
> original gcc/omp-low.c.  (Is that useful for review, or are you manually
> doing something like that anyway?)
> 
> > Should possibly rename omp-simd.c to omp-simd-clone.c to make it clear
> > that's the only thing it does, the "simdclone" pass?
> 
> I did that.
> 
> I manually determined a reduced #include list for the new file
> gcc/omp-simd-clone.c.  Hope that's alright.
> 
> OK to commit once bootstrap testing succeeded?

Bootstrap/testing look good.  OK to commit?

> commit 8f33dc59ad24a995694d42ee9013e0853426e190
> Author: Thomas Schwinge 
> Date:   Thu Apr 14 21:56:31 2016 +0200
> 
> Split out OMP constructs' SIMD clone supporting code
> 
>   gcc/
>   * omp-low.c (simd_clone_struct_alloc, simd_clone_struct_copy)
>   (simd_clone_vector_of_formal_parm_types)
>   (simd_clone_clauses_extract, simd_clone_compute_base_data_type)
>   (simd_clone_mangle, simd_clone_create)
>   (simd_clone_adjust_return_type, create_tmp_simd_array)
>   (simd_clone_adjust_argument_types, simd_clone_init_simd_arrays)
>   (struct modify_stmt_info, ipa_simd_modify_stmt_ops)
>   (ipa_simd_modify_function_body, simd_clone_linear_addend)
>   (simd_clone_adjust, expand_simd_clones, ipa_omp_simd_clone)
>   (pass_data_omp_simd_clone, class pass_omp_simd_clone)
>   (pass_omp_simd_clone::gate, make_pass_omp_simd_clone): Move
>   into...
>   * omp-simd-clone.c: ... this new file.
>   (simd_clone_vector_of_formal_parm_types): Make it static.
>   * Makefile.in (OBJS): Add omp-simd-clone.o.
> 
>  gcc/Makefile.in  |1 +
>  gcc/omp-low.c| 1606 
>  gcc/omp-simd-clone.c | 1654 
> ++
>  3 files changed, 1655 insertions(+), 1606 deletions(-)


Grüße
 Thomas