ipa-icf::merge TLC

2015-02-25 Thread Jan Hubicka
Hi,
this patch reorganizes sem_function::merge and sem_variable::merge.
I read the code in detail and found several issues that are fixed in the
following patch.

  1) The logic whether address matters was wrong and ignored symbol aliases.

 I separated it into symtab_node::address_matters predicate and also
 added special case for C++ cdtors that are already handled specially
 by ipa-visibility.

  2) Turning definition to alias will result in aliases of aliases.
 Because we normally do not output them, I think it is better to
 flatten the structure or we may run into interesting target issues
 with non-GNU assemblers.

 I fixed it in symtab_node::resolve_alias.

  3) The check in cgraph_edge::verify_corresponds_to_fndecl used fndecl instead
 of the callee's flag, which is useless because fndecl may have its symbol
 removed.
 This was hacked around in sem_function::merge by applying the change
 immediately, which however breaks with WPA.

  4) Redirection of callers ignored the existence of aliases.
 Fixed by introducing the redirect_all_callers function. I did not reuse
 similar code in clone redirection because the semantics are a bit different
 WRT interposable aliases.

  5) binds_to_local_def_p introduced to fix PR64146 is wrong; it is possible
 to merge interposable functions (by introducing thunk with local alias).
 What is wrong with the testcase is that sem equality thinks they are
 equivalent just because the interposable callees are equivalent.
 For everything interposable sem equality should resort to comparing
 functions by symtab_node::semantically_equivalent_p

 The testcase can be turned into wrong code again by making the wrapper
 functions non-interposable.

 Because Martin has similar patch on the way I decided to not fix this
 issue, so the patch currently breaks the testcase.

 I would like to stop snowballing here.  Either I will XFAIL the testcase
 momentarily or Martin will get his patch in tomorrow.

  6) logic about discarded symbols had weird code:
-  if (original->resolution == LDPR_PREEMPTED_REG
-  || original->resolution == LDPR_PREEMPTED_IR)
-original_discardable = true;
 It took me a while to remember why it is there.  Basically it is about
 the case where ORIGINAL definition is known to not appear in the final
 binary, because linker preempted it by something else.
 In this case introducing an alias is counter-effective and should be
 avoided.

 Martin: It would be nice if the ORIGINAL was always chosen in a way
 that it binds_to_local_def if possible.

  8) Thunks should not be produced for empty functions and such.

  9) Logic about COMDAT groups was quite wrong - basically it is possible
 to unify within groups but extra care needs to be taken to unify across
 groups

  10) There is no need to always give up when local alias of original can
  not be created

  11) When ALIAS is interposable, it is still possible to turn it into a thunk
  and keep interpositions correct

  12) I merged diagnostics to be all "Not unifying;" or "Unified;";
  this simplifies grepping.

  13) sem_variable::merge diverged somewhat from function version and
  had strange code preventing alias cycles that should not fire because
  we should not be merging aliases, only their targets.
  Code also disabled itself with -fvariable-sections for no reason.

  14) As noticed by Jakub, it does not make that much sense to do
  just partial redirection when original function is going to stay
  for other reason.
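
Point 2) above (flattening aliases of aliases) can be pictured with a small
stand-alone sketch.  This is not code from the patch: struct sym,
ultimate_target and make_alias_flat are hypothetical names, and the model
ignores everything a real symbol table tracks besides the alias link.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a symbol table node; not GCC's symtab_node.  */
struct sym
{
  const char *name;
  struct sym *alias_target;   /* NULL for a real definition.  */
};

/* Follow the alias chain to the symbol that carries the definition.  */
static struct sym *
ultimate_target (struct sym *s)
{
  while (s->alias_target != NULL)
    s = s->alias_target;
  return s;
}

/* Turn the definition DEF into an alias of TARGET.  Every symbol in
   SYMS[0..N-1] that was an alias of DEF is re-pointed at the final
   definition, so no alias of an alias is left behind.  */
static void
make_alias_flat (struct sym *def, struct sym *target,
                 struct sym **syms, size_t n)
{
  struct sym *final = ultimate_target (target);

  assert (final != def);      /* Refuse to create an alias cycle.  */
  def->alias_target = final;
  for (size_t i = 0; i < n; i++)
    if (syms[i]->alias_target == def)
      syms[i]->alias_target = final;
}
```

With symbols a, b and b_alias (an alias of b), turning b into an alias of a
leaves both b and b_alias pointing directly at a.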

I tested the patch on x86_64-linux and on LTO firefox.
I also tested LTO firefox with symbol aliases disabled (to imitate darwin)
and plan to regtest with this flag.

Finally I tried to reorganize the code so it is easier to follow.  First the
function decides whether alias is possible and if not it checks conditions that
make wrapper/redirection possible.  If everything fails it reports reason and
terminates.  After all checking it finally performs changes.
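
The decide-first, change-later ordering described above can be sketched
roughly as follows.  Everything here is an assumption made for illustration:
the struct, the enum and the message tails are invented, and only the two
prefixes "Unified;" and "Not unifying;" come from the mail.  The real
conditions are in the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum merge_action { MERGE_NONE, MERGE_ALIAS, MERGE_WRAPPER_OR_REDIRECT };

/* Hypothetical summary of the checks done before anything is changed.  */
struct merge_checks
{
  bool alias_possible;
  bool wrapper_possible;
  bool redirect_possible;
};

/* Decision phase only: nothing is modified here.  REASON receives a
   diagnostic that starts with one of the two uniform prefixes.  */
static enum merge_action
decide_merge (const struct merge_checks *c, const char **reason)
{
  if (c->alias_possible)
    {
      *reason = "Unified; alias can be created.";
      return MERGE_ALIAS;
    }
  if (c->wrapper_possible || c->redirect_possible)
    {
      *reason = "Unified; wrapper or redirection can be used.";
      return MERGE_WRAPPER_OR_REDIRECT;
    }
  *reason = "Not unifying; no alias, wrapper or redirection is possible.";
  return MERGE_NONE;
}
```

A caller would act on the returned value only after decide_merge has run, so
a rejected merge never leaves a half-applied change behind.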

I plan to commit after some further testing tomorrow and after giving
Martin a chance to look over the changes and discuss 5).

Honza

PR ipa/65150
* ipa-icf.c (redirect_all_callers): New function.
(sem_function::merge): Reorganize and fix merging issues.
(sem_variable::merge): Likewise.
(sem_variable::compare_sections): Remove.
* symtab.c (symtab_node::resolve_alias): When alias has aliases,
redirect them.
(address_matters_1): New function.
(symtab_node::address_taken_from_non_vtable_p): Move here from
ipa-visibility.
(symtab_node::address_matters_p): New function.
* cgraph.c (cgraph_edge::verify_corresponds_to_fndecl): Fix
check for merged flag.
* cgraph.h (address_matters_p): Declare.
* ipa-visibility.c (symtab_node::address_taken_from_non_vtable_p):
Remove.
 

Re: [patch, avr] Tidy up avr-log.c

2015-02-25 Thread Denis Chertykov
2015-02-24 16:09 GMT+03:00 Georg-Johann Lay :
> avr-log.c and respective macros in avr-protos.h still assume that the
> implementation language is C90, i.e. no variadic macros are available.
>
> This patch cleans up the code from the cumbersome old approach and uses
> variadic macros for avr_dump, avr_edump and avr_fdump.
>
> Ok for trunk?
>
> Johann
>
>
> Use variadic macros with avr-log.c.
>
> * config/avr/avr-protos.h (avr_vdump): New prototype.
> (avr_log_set_caller_e, avr_log_set_caller_f): Remove protos.
> (avr_edump, avr_fdump, avr_dump): (Re)define to use avr_vdump.
> * config/avr/avr-log.c: Adjust comments.
> (avr_vdump): New function.
> (avr_vadump): Pass caller as 2nd argument instead of format string.
> (avr_log_caller, avr_log_fdump_e, avr_log_fdump_f)
> (avr_log_set_caller_e, avr_log_set_caller_f): Remove.

Ok.

Denis.
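
For readers unfamiliar with the technique, here is a minimal sketch of a
variadic logging macro that captures the caller's name at the expansion site.
It is not the avr-log.c code: log_vdump, LOG_DUMP and reorder_insns are
hypothetical names, and the sketch writes into a buffer instead of a dump
file so that the result is easy to check.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "CALLER: " followed by the formatted message
   into BUF.  Returns the total length that was (or would have been)
   written, or a negative value on a formatting error.  */
static int
log_vdump (char *buf, size_t size, const char *caller, const char *fmt, ...)
{
  int n = snprintf (buf, size, "%s: ", caller);
  if (n < 0 || (size_t) n >= size)
    return n;

  va_list ap;
  va_start (ap, fmt);
  int m = vsnprintf (buf + n, size - (size_t) n, fmt, ap);
  va_end (ap);
  return m < 0 ? m : n + m;
}

/* Variadic macro: the caller's name is passed as an ordinary argument, so
   no separate "set caller" step is needed before the call.  */
#define LOG_DUMP(buf, size, ...) \
  log_vdump ((buf), (size), __func__, __VA_ARGS__)

/* Example caller.  */
static void
reorder_insns (char *buf, size_t size, int count)
{
  LOG_DUMP (buf, size, "%d insns", count);
}
```

Calling reorder_insns (buf, sizeof buf, 3) leaves "reorder_insns: 3 insns"
in buf.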


Re: [PATCH] Fix up LTO TARGET_OPTION_NODE handling on x86 (PR lto/64374)

2015-02-25 Thread Richard Biener
On Tue, 24 Feb 2015, Jakub Jelinek wrote:

> On Tue, Feb 24, 2015 at 08:48:19PM +0100, Jan Hubicka wrote:
> > Thanks, the i386 parts of the patch are OK, but I think you want to add the reverse
> > transformation, too.  I.e. if someone compiles with -fPIC but links without.
> 
> I've only done it this way because that is what
> ix86_option_override_internal was doing, but supposedly only because the
> command line option is only about the non-PIC variants.
> So I agree that the other direction makes sense too and will adjust it.
> 
> > My plan to fix the testcase was to put it into ix86_function_specific_restore
> > which would save need for a new hook. But I am fine either way (just can't
> > approve the new hook)
> 
> The way the streaming in now works is that we don't have a gcc_options
> structure anywhere, so if it was done in the *_restore hook, you'd need
> to *_save it first and then restore.
> 
> Richard, are you ok with the new hook?

Yeah.  Can't think of a better way (other than not doing what the
x86 backend does).

Richard.


Re: Patch ping

2015-02-25 Thread Richard Biener
On Wed, 25 Feb 2015, Jakub Jelinek wrote:

> On Wed, Feb 18, 2015 at 11:00:35AM +0100, Jakub Jelinek wrote:
> > On Tue, Feb 17, 2015 at 11:00:14AM +0100, Richard Biener wrote:
> > > I'm just looking for a way to make this less of a hack (and the LTO IL
> > > less target dependent).  Not for GCC 5 for which something like your
> > > patch is probably ok, but for the future.
> > 
> > So, given Ilya's and Thomas' testing, is this acceptable for now, and
> > perhaps we can try to do something better for GCC 6?
> > 
> > Here is the patch with full ChangeLog:
> 
> I'd like to ping following patch:
> http://gcc.gnu.org/ml/gcc-patches/2015-02/msg01080.html

Oops, totally forgot about this one.

Shouldn't

+   default:
+ error ("unsupported mode %s\n", mname);

be a fatal_error ()?  After all if we hit this but continue we'll
stream random crap.  I also think we should be a bit more user-centric
here and maybe report "for host / offload target combination".

+static GTY(()) const unsigned char *lto_mode_identity_table;

why in GC memory?

Ok with changes along these lines.

Thanks,
Richard.


> > 2015-02-18  Jakub Jelinek  
> > 
> > * passes.c (ipa_write_summaries_1): Call lto_output_init_mode_table.
> > (ipa_write_optimization_summaries): Likewise.
> > * tree-streamer.h: Include data-streamer.h.
> > (streamer_mode_table): Declare extern variable.
> > (bp_pack_machine_mode, bp_unpack_machine_mode): New inline functions.
> > * lto-streamer-out.c (lto_output_init_mode_table,
> > lto_write_mode_table): New functions.
> > (produce_asm_for_decls): Call lto_write_mode_table when streaming
> > offloading LTO.
> > * lto-section-in.c (lto_section_name): Add "mode_table" entry.
> > (lto_create_simple_input_block): Add mode_table argument to the
> > lto_input_block constructors.
> > * ipa-prop.c (ipa_prop_read_section, read_replacements_section):
> > Likewise.
> > * data-streamer-in.c (string_for_index): Likewise.
> > * ipa-inline-analysis.c (inline_read_section): Likewise.
> > * ipa-icf.c (sem_item_optimizer::read_section): Likewise.
> > * lto-cgraph.c (input_cgraph_opt_section): Likewise.
> > * lto-streamer-in.c (lto_read_body_or_constructor,
> > lto_input_toplevel_asms): Likewise.
> > (lto_input_mode_table): New function.
> > * tree-streamer-out.c (pack_ts_fixed_cst_value_fields,
> > pack_ts_decl_common_value_fields, pack_ts_type_common_value_fields):
> > Use bp_pack_machine_mode.
> > * real.h (struct real_format): Add name field.
> > * lto-streamer.h (enum lto_section_type): Add LTO_section_mode_table.
> > (class lto_input_block): Add mode_table member.
> > (lto_input_block::lto_input_block): Add mode_table_ argument,
> > initialize mode_table.
> > (struct lto_file_decl_data): Add mode_table field.
> > (lto_input_mode_table, lto_output_init_mode_table): New prototypes.
> > * tree-streamer-in.c (unpack_ts_fixed_cst_value_fields,
> > unpack_ts_decl_common_value_fields,
> > unpack_ts_type_common_value_fields): Call bp_unpack_machine_mode.
> > * tree-streamer.c (streamer_mode_table): New variable.
> > * real.c (ieee_single_format, mips_single_format,
> > motorola_single_format, spu_single_format, ieee_double_format,
> > mips_double_format, motorola_double_format,
> > ieee_extended_motorola_format, ieee_extended_intel_96_format,
> > ieee_extended_intel_128_format, ieee_extended_intel_96_round_53_format,
> > ibm_extended_format, mips_extended_format, ieee_quad_format,
> > mips_quad_format, vax_f_format, vax_d_format, vax_g_format,
> > decimal_single_format, decimal_double_format, decimal_quad_format,
> > ieee_half_format, arm_half_format, real_internal_format): Add name
> > field.
> > * config/pdp11/pdp11.c (pdp11_f_format, pdp11_d_format): Likewise.
> > lto/
> > * lto.c (lto_mode_identity_table): New variable.
> > (lto_read_decls): Add mode_table argument to the lto_input_block
> > constructor.
> > (lto_file_finalize): Initialize mode_table.
> > (lto_init): Initialize lto_mode_identity_table.
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


[PATCH, CHKP, PR target/65183] Avoid wrong pass local data usage

2015-02-25 Thread Ilya Enkovich
Hi,

This patch fixes a case when outdated checker local data is used to process 
external calls.  Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for 
trunk?

Thanks,
Ilya
--
gcc/

2015-02-25  Ilya Enkovich  

PR target/65183
* tree-chkp.c (chkp_check_lower): Don't check against
zero bounds for already instrumented functions.
(chkp_check_upper): Likewise.
(chkp_fini): Clean pass local data to avoid wrong reusage.

gcc/testsuite/

2015-02-25  Ilya Enkovich  

PR target/65183
* gcc.target/i386/pr65183.c: New.


diff --git a/gcc/testsuite/gcc.target/i386/pr65183.c b/gcc/testsuite/gcc.target/i386/pr65183.c
new file mode 100644
index 000..069a543
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr65183.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target mpx } */
+/* { dg-options "-O -fcheck-pointer-bounds -fchkp-use-nochk-string-functions -mmpx" } */
+
+extern void bar(void *);
+extern void baz(void);
+
+static int lc[32];
+
+void foobar(void *c)
+{
+  bar(&c);
+  __builtin_memcpy (lc, c, lc[0]);
+}
+
+void foo ()
+{
+  baz ();
+  foobar(0);
+}
diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
index b0a3a15..d2df4ba 100644
--- a/gcc/tree-chkp.c
+++ b/gcc/tree-chkp.c
@@ -1268,7 +1268,8 @@ chkp_check_lower (tree addr, tree bounds,
   gimple check;
   tree node;
 
-  if (bounds == chkp_get_zero_bounds ())
+  if (!chkp_function_instrumented_p (current_function_decl)
+  && bounds == chkp_get_zero_bounds ())
 return;
 
   if (dirflag == integer_zero_node
@@ -1314,7 +1315,8 @@ chkp_check_upper (tree addr, tree bounds,
   gimple check;
   tree node;
 
-  if (bounds == chkp_get_zero_bounds ())
+  if (!chkp_function_instrumented_p (current_function_decl)
+  && bounds == chkp_get_zero_bounds ())
 return;
 
   if (dirflag == integer_zero_node
@@ -4306,6 +4308,10 @@ chkp_fini (void)
   free_dominance_info (CDI_POST_DOMINATORS);
 
   bitmap_obstack_release (NULL);
+
+  entry_block = NULL;
+  zero_bounds = NULL_TREE;
+  none_bounds = NULL_TREE;
 }
 
 /* Main instrumentation pass function.  */
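
The chkp_fini part of the change follows a general pattern: a lazily created,
pass-local object must be forgotten when the pass finishes, or the next
function processed will pick up an object that belongs to the previous one.
The sketch below models only that pattern.  struct bounds, get_zero_bounds
and pass_fini are hypothetical names, not the tree-chkp.c implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: an object created on demand for the function
   currently being processed, identified here by a small integer.  */
struct bounds { int owner_func_id; };

static struct bounds storage[8];
static struct bounds *zero_bounds;   /* Pass-local cache.  */

/* Return the cached object, creating it for FUNC_ID on first use.  */
static struct bounds *
get_zero_bounds (int func_id)
{
  assert (func_id >= 0 && func_id < 8);
  if (zero_bounds == NULL)
    {
      storage[func_id].owner_func_id = func_id;
      zero_bounds = &storage[func_id];
    }
  return zero_bounds;
}

/* End-of-pass cleanup.  Without this reset, the next function processed
   would receive the object created for the previous one.  */
static void
pass_fini (void)
{
  zero_bounds = NULL;
}
```

Processing function 1, calling pass_fini, then processing function 2 yields
an object owned by function 2; dropping the reset would hand back the one
owned by function 1.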


Re: [PATCH PR65161]

2015-02-25 Thread Yuri Rumyantsev
Hi All,

I prepared new patch which includes test-case.

I can't agree with patch proposed by Alexander since other functions
doing ready list reordering also use HID interface, so I put escape
check in ix86_sched_reorder.

Is it OK for trunk?

2015-02-25  Yuri Rumyantsev  

PR target/65161
* config/i386/i386.c (ix86_sched_reorder): Skip instruction reordering
for selective scheduling.

gcc/testsuite/ChangeLog
* gcc.target/i386/pr65161.c: New test.


2015-02-25 2:54 GMT+03:00 Andrey Belevantsev :
> On 24.02.2015 21:16, Alexander Monakov wrote:
>>
>>
>>
>> On Tue, 24 Feb 2015, Yuri Rumyantsev wrote:
>>
>>> Hi All!
>>>
>>> Here is a simple patch to not perform instruction reordering for
>>> selective scheduling since it uses interface of list scheduling
>>> defined in "sched-int.h".
>>
>>
>> As I see, the exact problem is that swap_top_of_ready_list accesses HID,
>> so
>> please use the more specialized patch below instead.
>
>
> You have missed a space before call parentheses in the patch, otherwise it
> looks fine.
>
> Andrey
>
>
>
>>
>> Thanks.
>> Alexander
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 7f5796a..6eccd54 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -26615,6 +26615,12 @@ swap_top_of_ready_list (rtx_insn **ready, int
>> n_ready)
>> dep_t dep;
>> int clock1 = -1;
>> int clock2 = -1;
>> +
>> +  /* The following heuristic inspects h_i_d, but it is not extended for
>> insns
>> + created when doing selective scheduling.  */
>> +  if (sel_sched_p())
>> +return false;
>> +
>> #define INSN_TICK(INSN) (HID (INSN)->tick)
>>
>> if (!TARGET_SILVERMONT && !TARGET_INTEL)
>>
>


patch.1
Description: Binary data


libgomp nvptx plugin: rework initialisation and support the proposed load/unload hooks (was: Merge current set of OpenACC changes from gomp-4_0-branch)

2015-02-25 Thread Thomas Schwinge
Hi!

On Tue, 24 Feb 2015 11:29:51 +, Julian Brown  
wrote:
> On Wed, 4 Feb 2015 15:05:45 +
> Julian Brown  wrote:
> 
> > The major changes are: [...]

Thanks for looking into this!

> This is a version of the previously-posted patch to rework
> initialisation and support the proposed load/unload hooks, merged to
> gomp4 branch and tested alongside the two patches (from
> https://gcc.gnu.org/wiki/Offloading#nvptx_Offloading):
> 
> http://news.gmane.org/find-root.php?message_id=%3C20150218100035.GF1746%40tucnak.redhat.com%3E
> 
> http://news.gmane.org/find-root.php?message_id=%3C546CF508.9010807%40codesourcery.com%3E
> 
> As well as Ilya Verbin's patch:
> 
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01605.html

(I also added

to the mix.)

> Test results look OK, barring a suspected harness issue (lib-83
> failing with a timeout for nvptx

Yes; Jim's rewriting the timing code.

However, I'm seeing a class of testsuite regressions: all variants of
libgomp.oacc-fortran/lib-5.f90 and libgomp.oacc-fortran/lib-7.f90 FAIL:
»libgomp: cuMemFreeHost error: invalid value«.  I see these two test
cases contain a lot of acc_get_num_devices and similar calls -- I've been
testing this on our nvidiak20-2 system, which contains two Nvidia K20
cards, so maybe there's something wrong in that regard.  (But why is this
failing only for Fortran -- are we missing C/C++ tests in that area?)
Can you have a look, or want me to?

> OK for gomp4 branch? I could commit Ilya's patch there too if so.

I'll leave the decision to Jakub, but, what about trunk?  As Ilya
indicated in
,
(at least part of) these patches are fixing a regression with offloading
from shared libraries.  (And maybe the rest qualifies as fixes and
extensions to new code (offloading), so no danger to cause any
regressions compared to the last GCC release?)


I have not reviewed all your changes; just a few comments:

> --- a/gcc/config/nvptx/mkoffload.c
> +++ b/gcc/config/nvptx/mkoffload.c
> @@ -850,16 +851,17 @@ process (FILE *in, FILE *out)

>fprintf (out, "static const void *target_data[] = {\n");
> -  fprintf (out, "  ptx_code, var_mappings, func_mappings\n");
> +  fprintf (out, "  ptx_code, (void*) %u, var_mappings, (void*) %u, "
> + "func_mappings\n", nvars, nfuncs);
>fprintf (out, "};\n\n");

I wondered if it's maybe more elegant to just separate those by NULL
delimiters instead of the size integers cast to void * (spaces
missing)?  But then, that'd need "double scanning" in the consumer,
libgomp/plugin/plugin-nvptx.c:GOMP_OFFLOAD_load_image, because we need to
allocate an appropriately sized array, so maybe your more expressive
approach is better indeed.
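
A minimal sketch of how a consumer could decode the five-slot layout emitted
by the quoted fprintf calls, assuming exactly that slot order.  struct
image_desc and parse_target_data are hypothetical names; this is not the
GOMP_OFFLOAD_load_image code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the image descriptor after parsing.  */
struct image_desc
{
  const void *ptx_code;
  size_t nvars;
  const void *var_mappings;
  size_t nfuncs;
  const void *func_mappings;
};

/* Decode the layout
     { ptx_code, (void *) nvars, var_mappings, (void *) nfuncs, func_mappings }
   The counts sit directly in pointer-sized slots, so the consumer knows how
   much to allocate without first scanning for a terminator.  */
static struct image_desc
parse_target_data (const void *const *target_data)
{
  struct image_desc d;

  d.ptx_code = target_data[0];
  d.nvars = (size_t) (uintptr_t) target_data[1];
  d.var_mappings = target_data[2];
  d.nfuncs = (size_t) (uintptr_t) target_data[3];
  d.func_mappings = target_data[4];
  return d;
}
```

With NULL delimiters instead, the consumer would have to walk each mapping
list once to count the entries and a second time to copy them, which is the
"double scanning" mentioned above.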

> --- a/libgomp/oacc-async.c
> +++ b/libgomp/oacc-async.c
> @@ -34,44 +34,68 @@
>  int
>  acc_async_test (int async)
>  {
> +  struct goacc_thread *thr = goacc_thread ();
> +
>if (async < acc_async_sync)
>  gomp_fatal ("invalid async argument: %d", async);
>  
> -  return base_dev->openacc.async_test_func (async);
> +  assert (thr->dev);
> +
> +  return thr->dev->openacc.async_test_func (async);
>  }

(Here, and in several other places: I would have placed the declaration
of thr and its initialization just before its first use, but then, no
need to change that now.)

Here, and in several other places: is this code conforming to the OpenACC
specification?  Do we need to (lazily) initialize in all these places, or
in goacc_thread, or gracefully fail (see below) if not initialized
(basically in all places where you currently assert (thr->dev)?

#include <openacc.h>

int main(int argc, char *argv[])
{
  return acc_async_test(0);
}

$ build-gcc/gcc/xgcc -Bbuild-gcc/gcc/ 
-Bbuild-gcc/x86_64-unknown-linux-gnu/./libgomp/ 
-Bbuild-gcc/x86_64-unknown-linux-gnu/./libgomp/.libs 
-Ibuild-gcc/x86_64-unknown-linux-gnu/./libgomp -Isource-gcc/libgomp 
-Binstall/offload-nvptx-none/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0 
-Binstall/offload-nvptx-none/bin 
-Binstall/offload-x86_64-intelmicemul-linux-gnu/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0
 -Binstall/offload-x86_64-intelmicemul-linux-gnu/bin 
-Lbuild-gcc/x86_64-unknown-linux-gnu/./libgomp/.libs 
-Wl,-rpath,build-gcc/x86_64-unknown-linux-gnu/./libgomp/.libs -Wall ../a.c 
-fopenacc -g
$ gdb -q a.out 
Reading symbols from a.out...done.
(gdb) r
Starting program: [...]/a.out 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
acc_async_test (async=0) at [...]/source-gcc/libgomp/oacc-async.c:42
42	  assert (thr->dev);
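
The crash happens because thr itself is NULL, so evaluating thr->dev inside
the assert already dereferences a null pointer.  One of the options raised
above, lazy initialisation, could look roughly like this.  All names are
hypothetical and the "host device" is a stub; this only illustrates the
control flow, not libgomp's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

struct device { int (*async_test_func) (int); };
struct thread_state { struct device *dev; };

/* Hypothetical per-thread state; NULL until the runtime is initialised.  */
static struct thread_state *current_thread;

/* Stub device whose asynchronous operations are always complete.  */
static int always_done (int async) { (void) async; return 1; }
static struct device host_device = { always_done };
static struct thread_state host_thread = { &host_device };

/* Initialise the per-thread state on first use instead of assuming that
   an explicit init call has already happened.  */
static struct thread_state *
get_thread_lazy (void)
{
  if (current_thread == NULL)
    current_thread = &host_thread;
  return current_thread;
}

static int
async_test (int async)
{
  struct thread_state *thr = get_thread_lazy ();

  /* Both pointers are checked, so a failure is reported by the assert
     rather than showing up as a segmentation fault.  */
  assert (thr != NULL && thr->dev != NULL);
  return thr->dev->async_test_func (async);
}
```

Whether lazy initialisation or a graceful failure is the right behaviour is
exactly the open question; the sketch does not settle it.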

Also, I'm not sure what the expected outcome of this code sequence is:

acc_init(acc_device_nvidia);
acc_shutdown(acc_device

[PR58315] reset inlined debug vars at return-to point

2015-02-25 Thread Alexandre Oliva
This patch fixes a problem that has been with us for several years.
Variable tracking has no idea about the end of the lifetime of inlined
variables, so it keeps on computing locations for them over and over,
even though the computed locations make no sense whatsoever because the
variable can't even be accessed any more.

With this patch, we unbind all inlined variables at the point the
inlined function returns to, so that the locations for those variables
will not be touched any further.

In theory, we could do something similar to non-inlined auto variables,
when they go out of scope, but their decls apply to the entire function
and I'm told gdb sort-of expects the variables to be accessible
throughout the function, so I'm not tackling that in this patch, for I'm
happy enough with what this patch gets us:

- almost 99% reduction in the output asm for the PR testcase

- more than 90% reduction in the peak memory use compiling that testcase

- 63% reduction in the compile time for that testcase

What's scary is that the testcase is not particularly pathological.  Any
function that calls a longish sequence of inlined functions, that in
turn call other inline functions, and so on, something that's not
particularly unusual in C++, will likely observe significant
improvement, as we won't see growing sequences of var_location notes
after each call or so, as var-tracking computes a new in-stack location
for the implicit this argument of each previously-inlined function.
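
The shape of code described above can be illustrated in plain C (the PR
testcase itself is C++ and is not reproduced here).  The names are made up,
and whether GCC actually inlines these calls depends on the optimization
level.

```c
#include <assert.h>

struct point { int x, y; };

static inline int get_x (const struct point *p) { return p->x; }
static inline int get_y (const struct point *p) { return p->y; }

/* Each level calls further small inline functions.  */
static inline int
sum (const struct point *p)
{
  return get_x (p) + get_y (p);
}

static inline int
scaled_sum (const struct point *p, int k)
{
  return k * sum (p);
}

int
total (const struct point *a, const struct point *b, const struct point *c)
{
  /* Once inlined, every call above contributes debug bindings for its own
     parameter 'p'.  None of them is meaningful after that call returns,
     yet without a reset they would keep being tracked through the rest of
     this function.  */
  return scaled_sum (a, 1) + scaled_sum (b, 2) + scaled_sum (c, 3);
}
```

Compiling such code with -O2 -g and inspecting the generated assembly before
and after the patch is one way to observe the difference in location notes.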

Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?


Reset inlined debug variables at the end of the inlined function

From: Alexandre Oliva 

for  gcc/ChangeLog

PR debug/58315
* tree-inline.c (reset_debug_binding): New.
(reset_debug_bindings): Likewise.
(expand_call_inline): Call it.
---
 gcc/tree-inline.c |   56 +
 1 file changed, 56 insertions(+)

diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index d8abe03..5b58d8b 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -4345,6 +4345,60 @@ add_local_variables (struct function *callee, struct function *caller,
   }
 }
 
+/* Add to BINDINGS a debug stmt resetting SRCVAR if inlining might
+   have brought in or introduced any debug stmts for SRCVAR.  */
+
+static inline void
+reset_debug_binding (copy_body_data *id, tree srcvar, gimple_seq *bindings)
+{
+  tree *remappedvarp = id->decl_map->get (srcvar);
+
+  if (!remappedvarp)
+return;
+
+  if (TREE_CODE (*remappedvarp) != VAR_DECL)
+return;
+
+  if (*remappedvarp == id->retvar || *remappedvarp == id->retbnd)
+return;
+
+  tree tvar = target_for_debug_bind (*remappedvarp);
+  if (!tvar)
+return;
+
+  gdebug *stmt = gimple_build_debug_bind (tvar, NULL_TREE,
+ id->call_stmt);
+  gimple_seq_add_stmt (bindings, stmt);
+}
+
+/* For each inlined variable for which we may have debug bind stmts,
+   add before GSI a final debug stmt resetting it, marking the end of
+   its life, so that var-tracking knows it doesn't have to compute
+   further locations for it.  */
+
+static inline void
+reset_debug_bindings (copy_body_data *id, gimple_stmt_iterator gsi)
+{
+  tree var;
+  unsigned ix;
+  gimple_seq bindings = NULL;
+
+  if (!gimple_in_ssa_p (id->src_cfun))
+return;
+
+  if (!opt_for_fn (id->dst_fn, flag_var_tracking_assignments))
+return;
+
+  for (var = DECL_ARGUMENTS (id->src_fn);
+   var; var = DECL_CHAIN (var))
+reset_debug_binding (id, var, &bindings);
+
+  FOR_EACH_LOCAL_DECL (id->src_cfun, ix, var)
+reset_debug_binding (id, var, &bindings);
+
+  gsi_insert_seq_before_without_update (&gsi, bindings, GSI_SAME_STMT);
+}
+
 /* If STMT is a GIMPLE_CALL, replace it with its inline expansion.  */
 
 static bool
@@ -4650,6 +4704,8 @@ expand_call_inline (basic_block bb, gimple stmt, copy_body_data *id)
 GCOV_COMPUTE_SCALE (cg_edge->frequency, CGRAPH_FREQ_BASE),
 bb, return_block, NULL);
 
+  reset_debug_bindings (id, stmt_gsi);
+
   /* Reset the escaped solution.  */
   if (cfun->gimple_df)
 pt_solution_reset (&cfun->gimple_df->escaped);


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PATCH][wwwdocs] Mention xgene-1 in arm and aarch64, FreeBSD support for arm

2015-02-25 Thread Kyrill Tkachov


On 13/02/15 10:14, Richard Earnshaw wrote:

On 13/02/15 09:52, Kyrill Tkachov wrote:

Hi all,

This patch to changes.html mentions the xgene1 support in GCC 5 for arm
and aarch64 and also the FreeBSD support for ARM.

Is this ok?

The repetitive nature of all these new cpus being added looks rather
wooden.  I think it would be better to merge them into one change block,
that lists all the cpus and their internal names, then mentions once at
the end that these names can be used as arguments to -mcpu and -mtune.


Yeah, that makes sense. I've incorporated the feedback.
Here's a proposed patch.

How's this?

Kyrill



R.


I added the FreeBSD part in the Operating Systems section similar to
Dragonfly BSD.

Thanks,
Kyrill

wwwdocs-xgene-freebsd.patch


Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.80
diff -U 3 -r1.80 changes.html
--- htdocs/gcc-5/changes.html   9 Feb 2015 11:54:27 -   1.80
+++ htdocs/gcc-5/changes.html   11 Feb 2015 17:24:14 -
@@ -480,6 +480,9 @@
   Support for the Cavium ThunderX processor is now available through 
the
  -mcpu=thunderx and -mtune=thunderx options.
   
+ Support for the Applied Micro X-Gene processor is now available 
through
+ the -mcpu=xgene1 and -mtune=xgene1 options.
+ 
   Support for the Cortex-A72 processor has been added through
 the -mcpu=cortex-a72 option and the big.LITTLE
 variant -mcpu=cortex-a72.cortex-a53.  Using these options
@@ -512,6 +515,9 @@
 Support for the Cortex-A17 processor has been added through the
-mcpu=cortex-a17 and -mtune=cortex-a17 
options.

+ Support for the Applied Micro X-Gene processor is now available 
through
+ the -mcpu=xgene1 and -mtune=xgene1 options.
+ 
 Support for the Cortex-M7 processor has been added through the
-mcpu=cortex-m7 and -mtune=cortex-m7 options.

@@ -666,6 +672,13 @@
  GCC now supports the DragonFly BSD operating system.

  
+  FreeBSD

+
+  
+GCC now supports the FreeBSD operating system for the arm port
+through the arm*-*-freebsd* target triplets.
+  
+
VxWorks MILS
  





Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.80
diff -U 3 -r1.80 changes.html
--- htdocs/gcc-5/changes.html	9 Feb 2015 11:54:27 -	1.80
+++ htdocs/gcc-5/changes.html	17 Feb 2015 12:24:51 -
@@ -477,13 +477,18 @@
-mcpu or -march e.g.
-mcpu=cortex-a53+crypto.
  
- Support for the Cavium ThunderX processor is now available through the
--mcpu=thunderx and -mtune=thunderx options.
- 
- Support for the Cortex-A72 processor has been added through
-   the -mcpu=cortex-a72 option and the big.LITTLE
-   variant -mcpu=cortex-a72.cortex-a53.  Using these options
-   requires a version of GNU binutils that has support for the Cortex-A72.
+  Support has been added for the following processors
+  (GCC identifiers in parentheses): ARM Cortex-A72
+  (cortex-a72) and initial support for its big.LITTLE
+  combination with the ARM Cortex-A53 (cortex-a72.cortex-a53),
+  Cavium ThunderX (thunderx), Applied Micro X-Gene 1
+  (xgene1).
+  The GCC identifiers can be used
+  as arguments to the -mcpu or -mtune options,
+  for example: -mcpu=xgene1 or
+  -mtune=cortex-a72.cortex-a53.
+  Using -mcpu=cortex-a72 requires a version of GNU binutils
+  that has support for the Cortex-A72.
  
  The transitional options -mlra and -mno-lra
have been removed. The AArch64 backend now uses the local register
@@ -509,21 +514,19 @@
to offer increased performance when compiling with
-mcpu=cortex-a57 or -mtune=cortex-a57.
   
-   Support for the Cortex-A17 processor has been added through the
-  -mcpu=cortex-a17 and -mtune=cortex-a17 options.
-  
-   Support for the Cortex-M7 processor has been added through the
-  -mcpu=cortex-m7 and -mtune=cortex-m7 options.
-  
-   Initial big.LITTLE tuning support for the combination of Cortex-A17
-   and Cortex-A7 has been added through the
-   -mcpu=cortex-a17.cortex-a7  and
-   -mtune=cortex-a17.cortex-a7 options.
-  
-  Support for the Cortex-A72 processor has been added through
-the -mcpu=cortex-a72 option and the big.LITTLE
-variant -mcpu=cortex-a72.cortex-a53.  Using these options
-requires a version of GNU binutils that has support for the Cortex-A72.
+   Support has been added for the following processors
+   (GCC identifiers in parentheses): ARM Cortex-A17 (cortex-a17) and
+   initial support for its big.LITTLE combination with the ARM Cortex-A7
+   (cortex-a17.cortex-a7), ARM Cortex-A72
+   (cortex-a72) and initial s

[patch]: [Bug tree-optimization/61917] [4.9/5 Regression] ICE on valid code at -O3 on x86_64-linux-gnu in vectorizable_reduction, at tree-vect-loop.c:4913

2015-02-25 Thread Kai Tietz
Hello,

ChangeLog

2015-02-25  Kai Tietz  

PR tree-optimization/61917
* tree-vect-loop.c (vectorizable_reduction): Allow
vect_internal_def without reduction to exit gracefully.

ChangeLog testsuite/

2015-02-25  Kai Tietz  

PR tree-optimization/61917
* gcc.dg/vect/vect-pr61917.c: New file.

Tested for x86_64-unknown-linux.  OK to apply?
Regards,
Kai

Index: tree-vect-loop.c
===
--- tree-vect-loop.c	(revision 220958)
+++ tree-vect-loop.c	(working copy)
@@ -4990,7 +4990,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
   /* For pattern recognized stmts, orig_stmt might be a reduction,
  but some helper statements for the pattern might not, or
  might be COND_EXPRs with reduction uses in the condition.  */
-  gcc_assert (orig_stmt);
+  gcc_assert (orig_stmt || dt == vect_internal_def);
   return false;
 }
   if (!found_nested_cycle_def)
Index: gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
===
--- /dev/null
+++ gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+int a, b, c, d;
+
+int
+fn1 ()
+{
+  for (; c; c++)
+    for (b = 0; b < 2; b++)
+      d = a - d;
+  return d;
+}


[Patch,microblaze]: Optimized usage of pcmp conditional instruction.

2015-02-25 Thread Ajit Kumar Agarwal
Hello All:

Please find the patch for the optimized usage of pcmp instructions in
microblaze. No regressions are seen in the DejaGnu tests. Many testcases
already in the DejaGnu testsuite check the generation of pcmpne/pcmpeq
instructions and are used to check the validity.

commit b74acf44ce4286649e5be7cff7518d814cb2491f
Author: Ajit Kumar Agarwal 
Date:   Wed Feb 25 15:33:02 2015 +0530

[Patch,microblaze]: Optimized usage of pcmp conditional instruction.

The changes are made in the patch for optimized usage of pcmpne/pcmpeq
instructions. The xor with register to register is replaced with pcmpeq
/pcmpne instructions and for immediate check still the xori will be used.
The purpose of the change is to achieve the aggressive usage of pcmpne
/pcmpeq instructions instead of xor being used for comparison.
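
Why either instruction can drive an equality branch is easy to see from a
small C model of their results.  The model_* functions are illustrative
helpers, not part of the patch: xor yields zero exactly when the operands
are equal, pcmpeq yields 1 in that case, and pcmpne yields 0.

```c
#include <assert.h>
#include <stdint.h>

/* Result of a register-register xor: zero iff the operands are equal.  */
static uint32_t
model_xor (uint32_t a, uint32_t b)
{
  return a ^ b;
}

/* Result of pcmpeq: 1 if the operands are equal, otherwise 0.  */
static uint32_t
model_pcmpeq (uint32_t a, uint32_t b)
{
  return a == b;
}

/* Result of pcmpne: 1 if the operands differ, otherwise 0.  */
static uint32_t
model_pcmpne (uint32_t a, uint32_t b)
{
  return a != b;
}

/* The three ways of deciding "a == b" always agree.  */
static int
all_agree (uint32_t a, uint32_t b)
{
  int by_xor = model_xor (a, b) == 0;
  int by_pcmpeq = model_pcmpeq (a, b) == 1;
  int by_pcmpne = model_pcmpne (a, b) == 0;

  return by_xor == by_pcmpeq && by_pcmpeq == by_pcmpne;
}
```

The model says nothing about cost; it only shows that a branch on the xor
result and a branch on the pcmpeq or pcmpne result select the same path.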

ChangeLog:
2015-02-25  Ajit Agarwal  

* config/microblaze/microblaze.md (cbranchsi4): Added immediate
constraints.
(cbranchsi4_reg): New.
* config/microblaze/microblaze.c
(microblaze_expand_conditional_branch_reg): New.
* config/microblaze/microblaze-protos.h
(microblaze_expand_conditional_branch_reg): New prototype.

Signed-off-by:Ajit Agarwal ajit...@xilinx.com

Thanks & Regards
Ajit


0001-Patch-microblaze-Optimized-usage-of-pcmp-conditional.patch
Description: 0001-Patch-microblaze-Optimized-usage-of-pcmp-conditional.patch


[Patch,microblaze]: Optimized usage of fint instruction.

2015-02-25 Thread Ajit Kumar Agarwal
Hello All:

Please find the patch for the optimized usage of the fint instruction. No
regressions are seen in the DejaGnu tests.

commit ed4dc0b96bf43c200cacad97f73a98ab7048e51b
Author: Ajit Kumar Agarwal 
Date:   Wed Feb 25 15:36:29 2015 +0530

[Patch,microblaze]: Optimized usage of fint instruction.

This patch optimizes the usage of the fint instruction. The sequence
fint/cond_branch is replaced with fcmp/cond_branch. The fint instruction
takes 6/7 cycles, whereas the fcmp instruction takes 1 cycle. The
conversion from float to int with fint is not required; the values can be
compared directly with fcmp.

ChangeLog:
2015-02-25  Ajit Agarwal  

* config/microblaze/microblaze.md (peephole2): New.

Signed-off-by:Ajit Agarwal ajit...@xilinx.com

Thanks & Regards
Ajit


0001-Patch-microblaze-Optimized-usage-of-fint-instruction.patch
Description: 0001-Patch-microblaze-Optimized-usage-of-fint-instruction.patch


Re: [PATCH PR65161]

2015-02-25 Thread Alexander Monakov


On Wed, 25 Feb 2015, Yuri Rumyantsev wrote:

> Hi All,
> 
> I prepared new patch which includes test-case.
> 
> I can't agree with patch proposed by Alexander since other functions
> doing ready list reordering also use HID interface, so I put escape
> check in ix86_sched_reorder.

I don't see how that is the case.  Can you point me to specific lines of code
in ix86_sched_reorder or its callees that access HID?

In any case please use sel_sched_p () rather than flag_selective_scheduling2.

Thanks.

Alexander


Re: [PATCH] Optimize bfi by remove redundant zero_extend

2015-02-25 Thread Renlin Li

On 13/02/15 17:04, Richard Henderson wrote:

On 02/13/2015 08:26 AM, Renlin Li wrote:

+  /* Complete overlap.  We can remove the source ZERO_EXTEND.  */
+  if (width == inner_size
+ && (regno < FIRST_PSEUDO_REGISTER)
+ && HARD_REGNO_MODE_OK (regno, mode))
+  {
+   rtx reg = gen_rtx_REG (mode, regno);
+   return gen_rtx_SET (VOIDmode, dest, reg);

What in the world are you doing here with the hard registers?


r~



Ah, so we don't have the convention to play with hard registers in this pass?

This AND rtx will be transformed into a zero_extend rtx from a
sub_reg expression, and further into the form I mentioned above, as
the hard register supports this.


(and:SI (reg:SI 1 x1 [ y ])
(const_int 255 [0xff])))

I have modified my patch a little bit. Is the following method okay?

  if (GET_CODE (dest) == ZERO_EXTRACT
      && CONST_INT_P (XEXP (dest, 1))
      && CONST_INT_P (XEXP (dest, 2))
      && GET_CODE (src) == ZERO_EXTEND)
    {
      HOST_WIDE_INT width = INTVAL (XEXP (dest, 1));
      mode = GET_MODE (dest);

      /* Handle sub_reg rtx.  */
      if (GET_CODE (XEXP (src, 0)) == SUBREG
          && REG_P (XEXP (XEXP (src, 0), 0)))
        {
          rtx reg = XEXP (XEXP (src, 0), 0);
          unsigned int inner_size
            = GET_MODE_BITSIZE (GET_MODE (XEXP (src, 0)));

          /* Complete overlap.  We can remove the source ZERO_EXTEND.  */
          if (width == inner_size
              && mode == GET_MODE (reg))
            return gen_rtx_SET (VOIDmode, dest, reg);
        }
      /* Handle the hard register case, as the sub_reg is simplified
         further.  */
      else if (REG_P (XEXP (src, 0)))
        {
          rtx tmp = XEXP (src, 0);
          unsigned int inner_size = GET_MODE_BITSIZE (GET_MODE (tmp));

          /* Complete overlap.  We can remove the source ZERO_EXTEND.  */
          if (width == inner_size && can_change_dest_mode (tmp, 0, mode))
            {
              rtx reg = gen_rtx_REG (mode, REGNO (tmp));
              return gen_rtx_SET (VOIDmode, dest, reg);
            }
        }
    }
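
The "complete overlap" condition can be checked with plain integer
arithmetic. The sketch below is only a model of a bit-field insert on
32-bit values, with hypothetical helper names; it is not RTL and not part
of the patch. When the inserted width equals the size of the value that
was zero-extended, the extension changes none of the bits that end up in
the destination.

```c
#include <stdint.h>

/* Mask with the low BITS bits set (BITS may be 32).  */
static uint32_t low_mask (unsigned bits)
{
  return bits >= 32 ? UINT32_MAX : (UINT32_C (1) << bits) - 1;
}

/* Model of zero-extending the low INNER_SIZE bits of VALUE.  */
static uint32_t zero_extend_low (uint32_t value, unsigned inner_size)
{
  return value & low_mask (inner_size);
}

/* Model of a bit-field insert: write the low WIDTH bits of SRC into
   DEST starting at bit POS (POS must be below 32).  */
static uint32_t bitfield_insert (uint32_t dest, uint32_t src,
                                 unsigned width, unsigned pos)
{
  uint32_t mask = low_mask (width) << pos;
  return (dest & ~mask) | ((src << pos) & mask);
}
```

With width == inner_size, bitfield_insert (dest, zero_extend_low (src, 8),
8, pos) and bitfield_insert (dest, src, 8, pos) give the same result,
which is the property the transformation relies on.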



Is there a better way to do this, here or somewhere else? Or would it be
better to implement it as a specific rtx insn?

A define_split won't work in this case. This should be handled in the
combine pass.

Thank you for any suggestions!
Regards,
Renlin Li





Option overriding in the offloading code path (was: [nvptx] -freorder-blocks-and-partition, -freorder-functions)

2015-02-25 Thread Thomas Schwinge
Hi!

On Wed, 11 Feb 2015 15:50:20 +0100, I wrote:
> On Wed, 11 Feb 2015 15:44:26 +0100, I wrote:
> > If -freorder-blocks-and-partition is active, this results in PTX code
> > such as: [...]

> Such partitioning might not make a lot of sense for the virtual ISA that
> PTX is, but disabling it in nvptx.c:nvptx_option_override does not work.
> (Because that is not invoked in the offloading code path?)  I see x86 has
> a ix86_option_override_internal (but I don't know how that options
> processing works) -- is something like that needed for nvptx, too, and
> how to interconnect that with the offloading code path?  Sounds a bit
> like what Jakub suggests in ?

Am I on the right track with my assumption that it is correct that
nvptx.c:nvptx_option_override is not invoked in the offloading code path,
so we'd need a new target hook (?) to consolidate/override the options in
this scenario?


Using this to forcefully disable -fvar-tracking (as done in
nvptx_option_override), should then allow me to drop the following
beautiful specimen of a patch (which I didn't commit anywhere, so far):

commit ab5a010357f4c7347dd892f3666cdeecd08cc083
Author: Thomas Schwinge 
Date:   Mon Feb 16 13:57:08 2015 +0100

libgomp Fortran testing: for -g torture testing, disable variable tracking.

Otherwise, the nvptx-none offloading compiler will run into issues such as:

source-gcc/libgomp/testsuite/libgomp.fortran/examples-4/e.50.1.f90: In 
function '__e_50_1_mod_MOD_vec_mult._omp_fn.1':

source-gcc/libgomp/testsuite/libgomp.fortran/examples-4/e.50.1.f90:31:0: 
internal compiler error: in use_type, at var-tracking.c:5442
 p(i) = v1(i) * v2(i)
 ^
0xc4dc72 use_type
source-gcc/gcc/var-tracking.c:5442
0xc504b3 add_stores
source-gcc/gcc/var-tracking.c:5869
0xc4cd28 add_with_sets
source-gcc/gcc/var-tracking.c:6553
0x5e9b7d cselib_record_sets
source-gcc/gcc/cselib.c:2574
0x5ea8a7 cselib_process_insn(rtx_insn*)
source-gcc/gcc/cselib.c:2686
0xc586a3 vt_initialize
source-gcc/gcc/var-tracking.c:10126
0xc65a8e variable_tracking_main_1
source-gcc/gcc/var-tracking.c:10322
0xc65a8e variable_tracking_main
source-gcc/gcc/var-tracking.c:10375
0xc65a8e execute
source-gcc/gcc/var-tracking.c:10412
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
mkoffload: fatal error: 
install/offload-nvptx-none/bin//x86_64-unknown-linux-gnu-accel-nvptx-none-gcc 
returned 1 exit status
---
 libgomp/testsuite/libgomp.fortran/fortran.exp  |3 +++
 libgomp/testsuite/libgomp.oacc-fortran/fortran.exp |3 +++
 2 files changed, 6 insertions(+)

diff --git libgomp/testsuite/libgomp.fortran/fortran.exp libgomp/testsuite/libgomp.fortran/fortran.exp
index 9e6b643..0b597e6 100644
--- libgomp/testsuite/libgomp.fortran/fortran.exp
+++ libgomp/testsuite/libgomp.fortran/fortran.exp
@@ -21,6 +21,9 @@ dg-init
 # Turn on OpenMP.
 lappend ALWAYS_CFLAGS "additional_flags=-fopenmp"
 
+# TODO: for -g torture testing, disable variable tracking.
+regsub -all -- { -g[^ ]*} $DG_TORTURE_OPTIONS {& -fno-var-tracking} DG_TORTURE_OPTIONS
+
 if { $blddir != "" } {
 set lang_source_re {^.*\.[fF](|90|95|03|08)$}
 set lang_include_flags "-fintrinsic-modules-path=${blddir}"
diff --git libgomp/testsuite/libgomp.oacc-fortran/fortran.exp libgomp/testsuite/libgomp.oacc-fortran/fortran.exp
index a8f62e8..080a7b9 100644
--- libgomp/testsuite/libgomp.oacc-fortran/fortran.exp
+++ libgomp/testsuite/libgomp.oacc-fortran/fortran.exp
@@ -23,6 +23,9 @@ dg-init
 # Turn on OpenACC.
 lappend ALWAYS_CFLAGS "additional_flags=-fopenacc"
 
+# TODO: for -g torture testing, disable variable tracking.
+regsub -all -- { -g[^ ]*} $DG_TORTURE_OPTIONS {& -fno-var-tracking} DG_TORTURE_OPTIONS
+
 if { $blddir != "" } {
 set lang_source_re {^.*\.[fF](|90|95|03|08)$}
 set lang_include_flags "-fintrinsic-modules-path=${blddir}"


Grüße,
 Thomas


signature.asc
Description: PGP signature


RE: [Patch][ARM]Don't put volatile memory access in IT block for cortex-m7

2015-02-25 Thread Terry Guo


> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Richard Earnshaw
> Sent: Wednesday, February 18, 2015 2:45 AM
> To: Terry Guo; gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw; Ramana Radhakrishnan
> Subject: Re: [Patch][ARM]Don't put volatile memory access in IT block for
> cortex-m7
> 
> On 12/02/15 11:12, Terry Guo wrote:
> > Hi there,
> >
> > This patch intends to prevent gcc from putting volatile memory access
> > into IT block for target like cortex-m7.
> >
> > gcc/ChangeLog:
> >
> > 2015-02-12  Terry Guo  
> >
> > * config/arm/arm.c (arm_tune_cortex_m7): New global variable.
> > * config/arm/arm.h (TARGET_NO_VOLATILE_CE): New macro.
> > (arm_tune_cortex_m7): Declare new global variable.
> > * config/arm/arm.md (arm_comparison_operator): Disabled if not
> allow
> >  volatile memory access in IT block.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 2015-02-12  Terry Guo  
> >
> > * gcc.target/arm/cortex-m7-it-volatile.c: New test.
> >
> 
> Not ok.
> 
> +/* Targets that don't support accessing volatile memory inside IT
> block.  */
> +#define TARGET_NO_VOLATILE_CE(arm_tune_cortex_m7)
> 
> Please don't create feature bits that explicitly test for a particular target.
> Instead, define generic 'features' and then arrange for either the
> architecture tables, or tuning tables (as appropriate) to enable that feature.
> 
> See how arm_arch_arm_hwdiv is defined for how to do this.
> 
> R.
> 

Thanks Richard.  Patch is updated per your suggestion. Is this one OK for 
current stage and 4.8/4.9?

BR,
Terry
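
For readers following the review comment, the feature-bit pattern looks
roughly like this. It is a standalone sketch, not the GCC sources: the
value of FL_NO_VOLATILE_CE and the names arm_arch_no_volatile_ce and
TARGET_NO_VOLATILE_CE come from the patch, while the table, the
EXAMPLE_FL_LDSCHED value and example_option_override are hypothetical
stand-ins for arm-cores.def and arm_option_override.

```c
#include <stddef.h>
#include <string.h>

/* Bit value taken from the quoted arm-protos.h hunk.  */
#define FL_NO_VOLATILE_CE (1UL << 27)
/* Placeholder bit for this sketch only; not the real FL_LDSCHED value.  */
#define EXAMPLE_FL_LDSCHED (1UL << 0)

/* Hypothetical stand-in for one row of the core table.  */
struct example_core
{
  const char *name;
  unsigned long flags;
};

static const struct example_core example_cores[] = {
  { "cortex-m4", EXAMPLE_FL_LDSCHED },
  { "cortex-m7", EXAMPLE_FL_LDSCHED | FL_NO_VOLATILE_CE },
};

/* Nonzero if chip disallows volatile memory access in IT block.  */
static int arm_arch_no_volatile_ce;

/* Generic feature test: no core name appears here.  */
#define TARGET_NO_VOLATILE_CE (arm_arch_no_volatile_ce)

/* Assumed shape of the option-override step: derive the feature
   variable from the selected core's flag word.  Returns 0 when the
   core name is not in the table.  */
static int example_option_override (const char *cpu)
{
  for (size_t i = 0; i < sizeof example_cores / sizeof example_cores[0]; i++)
    if (strcmp (example_cores[i].name, cpu) == 0)
      {
        arm_arch_no_volatile_ce
          = (example_cores[i].flags & FL_NO_VOLATILE_CE) != 0;
        return 1;
      }
  return 0;
}
```

The point of the indirection is that adding another core with the same
restriction only needs a new table row, not a change to the macro.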

gcc/testsuite/ChangeLog:

2015-02-25  Terry Guo  

* gcc.target/arm/no-volatile-in-it.c: New test.


gcc/ChangeLog:

2015-02-25  Terry Guo  

* config/arm/arm-cores.def (cortex-m7): Add flag FL_NO_VOLATILE_CE.
* config/arm/arm-protos.h (FL_NO_VOLATILE_CE): New flag.
(arm_arch_no_volatile_ce): Declare new global variable.
* config/arm/arm.c (arm_arch_no_volatile_ce): Define new global variable.
(arm_option_override): Assign value to arm_arch_no_volatile_ce.
* config/arm/arm.h (arm_arch_no_volatile_ce): Declare it.
(TARGET_NO_VOLATILE_CE): New macro.
* config/arm/arm.md (arm_comparison_operator): Disabled if not allow
volatile memory access in IT block.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index d7e730d..b22ea7f 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -155,7 +155,7 @@ ARM_CORE("cortex-r4",   cortexr4, cortexr4, 7R,  FL_LDSCHED, cortex)
 ARM_CORE("cortex-r4f", cortexr4f, cortexr4f,   7R,  FL_LDSCHED, cortex)
 ARM_CORE("cortex-r5",  cortexr5, cortexr5, 7R,  FL_LDSCHED | FL_ARM_DIV, cortex)
 ARM_CORE("cortex-r7",  cortexr7, cortexr7, 7R,  FL_LDSCHED | FL_ARM_DIV, cortex)
-ARM_CORE("cortex-m7",  cortexm7, cortexm7, 7EM, FL_LDSCHED, cortex_m7)
+ARM_CORE("cortex-m7",  cortexm7, cortexm7, 7EM, FL_LDSCHED | FL_NO_VOLATILE_CE, cortex_m7)
 ARM_CORE("cortex-m4",  cortexm4, cortexm4, 7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3",  cortexm3, cortexm3, 7M,  FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4",marvell_pj4, marvell_pj4,   7A,  FL_LDSCHED, 9e)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 307babb..28ffe52 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -360,6 +360,7 @@ extern bool arm_is_constant_pool_ref (rtx);
 #define FL_CRC32  (1 << 25)  /* ARMv8 CRC32 instructions.  */
 
 #define FL_SMALLMUL   (1 << 26)   /* Small multiply supported.  */
+#define FL_NO_VOLATILE_CE   (1 << 27) /* No volatile memory in IT block.  */
 
 #define FL_IWMMXT (1 << 29)  /* XScale v2 or "Intel Wireless MMX technology".  */
 #define FL_IWMMXT2(1 << 30)   /* "Intel Wireless MMX2 technology".  */
@@ -482,6 +483,9 @@ extern int arm_arch_thumb2;
 extern int arm_arch_arm_hwdiv;
 extern int arm_arch_thumb_hwdiv;
 
+/* Nonzero if chip disallows volatile memory access in IT block.  */
+extern int arm_arch_no_volatile_ce;
+
 /* Nonzero if we should use Neon to handle 64-bits operations rather
than core registers.  */
 extern int prefer_neon_for_64bits;
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 297dfe1..8c10ea3 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -383,6 +383,9 @@ extern void (*arm_lang_output_object_attributes_hook)(void);
 #define TARGET_IDIV((TARGET_ARM && arm_arch_arm_hwdiv) \
 || (TARGET_THUMB2 && arm_arch_thumb_hwdiv))
 
+/* Nonzero if disallow volatile memory access in IT block.  */
+#define TARGET_NO_VOLATILE_CE  (arm_arch_no_volatile_ce)
+
 /* Should NEON be used for 64-bits bitops.  */
 #define TARGET_PREFER_NEON

Re: [PR58315] reset inlined debug vars at return-to point

2015-02-25 Thread Richard Biener
On Wed, Feb 25, 2015 at 10:40 AM, Alexandre Oliva  wrote:
> This patch fixes a problem that has been with us for several years.
> Variable tracking has no idea about the end of the lifetime of inlined
> variables, so it keeps on computing locations for them over and over,
> even though the computed locations make no sense whatsoever because the
> variable can't even be accessed any more.
>
> With this patch, we unbind all inlined variables at the point the
> inlined function returns to, so that the locations for those variables
> will not be touched any further.
>
> In theory, we could do something similar to non-inlined auto variables,
> when they go out of scope, but their decls apply to the entire function
> and I'm told gdb sort-of expects the variables to be accessible
> throughout the function, so I'm not tackling that in this patch, for I'm
> happy enough with what this patch gets us:
>
> - almost 99% reduction in the output asm for the PR testcase
>
> - more than 90% reduction in the peak memory use compiling that testcase
>
> - 63% reduction in the compile time for that testcase
>
> What's scary is that the testcase is not particularly pathological.  Any
> function that calls a longish sequence of inlined functions, that in
> turn call other inline functions, and so on, something that's not
> particularly unusual in C++, will likely observe significant
> improvement, as we won't see growing sequences of var_location notes
> after each call or so, as var-tracking computes a new in-stack location
> for the implicit this argument of each previously-inlined function.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?

But code-motion could still move stmts from the inlined functions
across these resets?  That said - shouldn't this simply be performed
by proper var-tracking u-ops produced by a backward scan over the
function for "live" scope-blocks?  That is, when you see a scope
block becoming live from exit then add u-ops resetting all
vars from that scope block?

Your patch as-is would add very many debug stmts to for example
tramp3d.  And as you say, the same reasoning applies to scopes
in general, not just for inlines.

Richard.
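
To put rough numbers on the growth described in the quoted patch
description, here is a deliberately simplified toy model. It assumes one
tracked variable per inlined call and one location note per still-bound
variable after each call; real var-tracking behaviour is more involved,
and both function names are hypothetical.

```c
/* Toy model: every inlined call adds one tracked variable, and after
   each call one location note is emitted per variable still bound.
   Without resets, earlier variables stay bound for the rest of the
   function.  */
static unsigned long notes_without_reset (unsigned calls)
{
  unsigned long bound = 0;
  unsigned long notes = 0;
  for (unsigned i = 0; i < calls; i++)
    {
      bound++;          /* Variable of this inlined call.  */
      notes += bound;   /* One note per variable still bound.  */
    }
  return notes;
}

/* Same model, but each variable is unbound at the return-to point, so
   only the variable of the current call is ever bound.  */
static unsigned long notes_with_reset (unsigned calls)
{
  unsigned long notes = 0;
  for (unsigned i = 0; i < calls; i++)
    notes += 1;
  return notes;
}
```

In this model a sequence of 100 inlined calls produces 5050 notes without
resets and 100 with them, i.e. quadratic versus linear growth. The
percentages quoted for the PR testcase are measurements and are not
derived from this model.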

> Reset inlined debug variables at the end of the inlined function
>
> From: Alexandre Oliva 
>
> for  gcc/ChangeLog
>
> PR debug/58315
> * tree-inline.c (reset_debug_binding): New.
> (reset_debug_bindings): Likewise.
> (expand_call_inline): Call it.
> ---
>  gcc/tree-inline.c |   56 
> +
>  1 file changed, 56 insertions(+)
>
> diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
> index d8abe03..5b58d8b 100644
> --- a/gcc/tree-inline.c
> +++ b/gcc/tree-inline.c
> @@ -4345,6 +4345,60 @@ add_local_variables (struct function *callee, struct 
> function *caller,
>}
>  }
>
> +/* Add to BINDINGS a debug stmt resetting SRCVAR if inlining might
> +   have brought in or introduced any debug stmts for SRCVAR.  */
> +
> +static inline void
> +reset_debug_binding (copy_body_data *id, tree srcvar, gimple_seq *bindings)
> +{
> +  tree *remappedvarp = id->decl_map->get (srcvar);
> +
> +  if (!remappedvarp)
> +return;
> +
> +  if (TREE_CODE (*remappedvarp) != VAR_DECL)
> +return;
> +
> +  if (*remappedvarp == id->retvar || *remappedvarp == id->retbnd)
> +return;
> +
> +  tree tvar = target_for_debug_bind (*remappedvarp);
> +  if (!tvar)
> +return;
> +
> +  gdebug *stmt = gimple_build_debug_bind (tvar, NULL_TREE,
> + id->call_stmt);
> +  gimple_seq_add_stmt (bindings, stmt);
> +}
> +
> +/* For each inlined variable for which we may have debug bind stmts,
> +   add before GSI a final debug stmt resetting it, marking the end of
> +   its life, so that var-tracking knows it doesn't have to compute
> +   further locations for it.  */
> +
> +static inline void
> +reset_debug_bindings (copy_body_data *id, gimple_stmt_iterator gsi)
> +{
> +  tree var;
> +  unsigned ix;
> +  gimple_seq bindings = NULL;
> +
> +  if (!gimple_in_ssa_p (id->src_cfun))
> +return;
> +
> +  if (!opt_for_fn (id->dst_fn, flag_var_tracking_assignments))
> +return;
> +
> +  for (var = DECL_ARGUMENTS (id->src_fn);
> +   var; var = DECL_CHAIN (var))
> +reset_debug_binding (id, var, &bindings);
> +
> +  FOR_EACH_LOCAL_DECL (id->src_cfun, ix, var)
> +reset_debug_binding (id, var, &bindings);
> +
> +  gsi_insert_seq_before_without_update (&gsi, bindings, GSI_SAME_STMT);
> +}
> +
>  /* If STMT is a GIMPLE_CALL, replace it with its inline expansion.  */
>
>  static bool
> @@ -4650,6 +4704,8 @@ expand_call_inline (basic_block bb, gimple stmt, 
> copy_body_data *id)
>  GCOV_COMPUTE_SCALE (cg_edge->frequency, CGRAPH_FREQ_BASE),
>  bb, return_block, NULL);
>
> +  reset_debug_bindings (id, stmt_gsi);
> +
>/* Reset the escaped solution.  */
>if (cfun->gimple_df)
>  pt_solution_reset (&cfun->gimple_df->escaped);
>
>
> --

Re: [patch]: [Bug tree-optimization/61917] [4.9/5 Regression] ICE on valid code at -O3 on x86_64-linux-gnu in vectorizable_reduction, at tree-vect-loop.c:4913

2015-02-25 Thread Richard Biener
On Wed, Feb 25, 2015 at 11:06 AM, Kai Tietz  wrote:
> Hello,
>
> ChangeLog
>
> 2015-02-25  Kai Tietz  
>
> PR tree-optimization/61917
> * tree-vect-loop.c (vectorizable_reduction): Allow
> vect_internal_def without reduction to exit graceful.
>
> ChangeLog testsuite/
>
> 2015-02-25  Kai Tietz  
>
> PR tree-optimization/61917
> * gcc.dg/vect/vect-pr61917.c: New file.
>
> Tested for x86_64-unknown-linux.  OK to apply?

It doesn't make much sense to fail here as said in the comment
because of patterns if the actual case isn't a pattern.  Also
the patch causing this made vect_external_def possible, so
why does this affect vect_internal_def here?

It may paper over the issue - but clearly the fix is bogus.

Richard.

> Regards,
> Kai
>
> Index: tree-vect-loop.c
> ===
> --- tree-vect-loop.c(Revision 220958)
> +++ tree-vect-loop.c(Arbeitskopie)
> @@ -4990,7 +4990,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
>/* For pattern recognized stmts, orig_stmt might be a reduction,
>   but some helper statements for the pattern might not, or
>   might be COND_EXPRs with reduction uses in the condition.  */
> -  gcc_assert (orig_stmt);
> +  gcc_assert (orig_stmt || dt == vect_internal_def);
>return false;
>  }
>if (!found_nested_cycle_def)
> Index: gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
> ===
> --- /dev/null
> +++ gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O3" } */
> +
> +int a, b, c, d;
> +
> +int
> +fn1 ()
> +{
> +  for (; c; c++)
> +for (b = 0; b < 2; b++)
> +  d = a - d;
> +  return d;
> +}


Re: [Ada] convert GNAT doc to sphinx

2015-02-25 Thread Joseph Myers
On Sun, 22 Feb 2015, Arnaud Charlet wrote:

> Two .png files were missing, now added:
> 
> 2015-02-22  Arnaud Charlet  
>   
>   * doc/gnat_ugn/project-manager-figure.png,
>   doc/gnat_ugn/rtlibrary-structure.png: New. 

The maintainer-scripts/update_web_docs_svn script is still producing 
errors relating to those files: 
https://gcc.gnu.org/ml/gccadmin/2015-q1/msg00129.html

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Patch][ARM]Don't put volatile memory access in IT block for cortex-m7

2015-02-25 Thread Richard Earnshaw

On 25/02/15 10:42, Terry Guo wrote:




-Original Message-
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
ow...@gcc.gnu.org] On Behalf Of Richard Earnshaw
Sent: Wednesday, February 18, 2015 2:45 AM
To: Terry Guo; gcc-patches@gcc.gnu.org
Cc: Richard Earnshaw; Ramana Radhakrishnan
Subject: Re: [Patch][ARM]Don't put volatile memory access in IT block for
cortex-m7

On 12/02/15 11:12, Terry Guo wrote:

Hi there,

This patch intends to prevent gcc from putting volatile memory access
into IT block for target like cortex-m7.

gcc/ChangeLog:

2015-02-12  Terry Guo  

* config/arm/arm.c (arm_tune_cortex_m7): New global variable.
* config/arm/arm.h (TARGET_NO_VOLATILE_CE): New macro.
 (arm_tune_cortex_m7): Declare new global variable.
* config/arm/arm.md (arm_comparison_operator): Disabled if not

allow

  volatile memory access in IT block.

gcc/testsuite/ChangeLog:

2015-02-12  Terry Guo  

* gcc.target/arm/cortex-m7-it-volatile.c: New test.



Not ok.

+/* Targets that don't support accessing volatile memory inside IT
block.  */
+#define TARGET_NO_VOLATILE_CE  (arm_tune_cortex_m7)

Please don't create feature bits that explicitly test for a particular target.
Instead, define generic 'features' and then arrange for either the
architecture tables, or tuning tables (as appropriate) to enable that feature.

See how arm_arch_arm_hwdiv is defined for how to do this.

R.



Thanks Richard.  Patch is updated per your suggestion. Is this one OK for 
current stage and 4.8/4.9?



This is OK for trunk. I suggest you let it gestate there for a few days 
before doing back-ports.


R.


BR,
Terry

gcc/testsuite/ChangeLog:

2015-02-25  Terry Guo  

 * gcc.target/arm/no-volatile-in-it.c: New test.


gcc/ChangeLog:

2015-02-25  Terry Guo  

 * config/arm/arm-cores.def (cortex-m7): Add flag FL_NO_VOLATILE_CE.
 * config/arm/arm-protos.h (FL_NO_VOLATILE_CE): New flag.
 (arm_arch_no_volatile_ce): Declare new global variable.
 * config/arm/arm.c (arm_arch_no_volatile_ce): Define new global variable.
 (arm_option_override): Assign value to arm_arch_no_volatile_ce.
 * config/arm/arm.h (arm_arch_no_volatile_ce): Declare it.
 (TARGET_NO_VOLATILE_CE): New macro.
 * config/arm/arm.md (arm_comparison_operator): Disabled if not allow
 volatile memory access in IT block





Re: [patch]: [Bug tree-optimization/61917] [4.9/5 Regression] ICE on valid code at -O3 on x86_64-linux-gnu in vectorizable_reduction, at tree-vect-loop.c:4913

2015-02-25 Thread Kai Tietz
2015-02-25 11:57 GMT+01:00 Richard Biener :
> On Wed, Feb 25, 2015 at 11:06 AM, Kai Tietz  wrote:
>> Hello,
>>
>> ChangeLog
>>
>> 2015-02-25  Kai Tietz  
>>
>> PR tree-optimization/61917
>> * tree-vect-loop.c (vectorizable_reduction): Allow
>> vect_internal_def without reduction to exit graceful.
>>
>> ChangeLog testsuite/
>>
>> 2015-02-25  Kai Tietz  
>>
>> PR tree-optimization/61917
>> * gcc.dg/vect/vect-pr61917.c: New file.
>>
>> Tested for x86_64-unknown-linux.  OK to apply?
>
> It doesn't make much sense to fail here as said in the comment
> because of patterns if the actual case isn't a pattern.  Also
> the patch causing this made vect_external_def possible, so
> why does this affect vect_internal_def here?
>
> It may paper over the issue - but clearly the fix is bogus.

Well, actually I don't see any good reason for failing here at all,
but I assumed the author wanted to catch some missed cases with it.
AFAIU the code, we actually don't need/want to assert here, neither on
internals (where it is pretty likely that no pattern is present) nor on
externals.
I would be fine with removing this assert completely, but I thought
it might be of interest to see that orig_stmt isn't NULL for the nested
case.

Kai

> Richard.
>
>> Regards,
>> Kai
>>
>> Index: tree-vect-loop.c
>> ===
>> --- tree-vect-loop.c(Revision 220958)
>> +++ tree-vect-loop.c(Arbeitskopie)
>> @@ -4990,7 +4990,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
>>/* For pattern recognized stmts, orig_stmt might be a reduction,
>>   but some helper statements for the pattern might not, or
>>   might be COND_EXPRs with reduction uses in the condition.  */
>> -  gcc_assert (orig_stmt);
>> +  gcc_assert (orig_stmt || dt == vect_internal_def);
>>return false;
>>  }
>>if (!found_nested_cycle_def)
>> Index: gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
>> ===
>> --- /dev/null
>> +++ gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
>> @@ -0,0 +1,13 @@
>> +/* { dg-do compile } */
>> +/* { dg-additional-options "-O3" } */
>> +
>> +int a, b, c, d;
>> +
>> +int
>> +fn1 ()
>> +{
>> +  for (; c; c++)
>> +for (b = 0; b < 2; b++)
>> +  d = a - d;
>> +  return d;
>> +}


Re: [Ada] convert GNAT doc to sphinx

2015-02-25 Thread Arnaud Charlet
> > Two .png files were missing, now added:
> > 
> > 2015-02-22  Arnaud Charlet  
> >   
> > * doc/gnat_ugn/project-manager-figure.png,
> > doc/gnat_ugn/rtlibrary-structure.png: New. 
> 
> The maintainer-scripts/update_web_docs_svn script is still producing
> errors relating to those files: 
> https://gcc.gnu.org/ml/gccadmin/2015-q1/msg00129.html

Indeed. I'll need some help here since I do not know where these .png files
are searched for or how to change that (I tried copying the .png files under
gcc/ada but this didn't help. I also tried adding -I$(srcdir)/ada/doc/gnat_ugn
to the doc/gnat_ugn.dvi rule in gcc/ada/gcc-interface/Make-lang.in, with
no success).

Basically the images can be found under gcc/ada/doc/gnat_ugn so for someone
familiar with texinfo this should be an easy change.

Arno


Re: [Ada] convert GNAT doc to sphinx

2015-02-25 Thread Joseph Myers
On Wed, 25 Feb 2015, Arnaud Charlet wrote:

> > > Two .png files were missing, now added:
> > > 
> > > 2015-02-22  Arnaud Charlet  
> > >   
> > >   * doc/gnat_ugn/project-manager-figure.png,
> > >   doc/gnat_ugn/rtlibrary-structure.png: New. 
> > 
> > The maintainer-scripts/update_web_docs_svn script is still producing
> > errors relating to those files: 
> > https://gcc.gnu.org/ml/gccadmin/2015-q1/msg00129.html
> 
> Indeed. I'll need some help here since I do not know where these .png files
> are looked for or how to change that (I tried copying the .png file under
> gcc/ada but this didn't help. I also tried adding -I$(srcdir)/ada/doc/gnat_ugn
> to the doc/gnat_ugn.dvi rule in gcc/ada/gcc-interface/Make-lang.in rule, with
> no success).
> 
> Basically the images can be found under gcc/ada/doc/gnat_ugn so for someone
> familiar with texinfo this should be an easy change.

update_web_docs_svn does not use the makefiles.  So you need to update the 
find command therein not to remove anything that's part of the sources for 
this documentation, and possibly update -I options for building manuals as 
well.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Ada] convert GNAT doc to sphinx

2015-02-25 Thread Arnaud Charlet
> So you need to update the
> find command therein not to remove anything that's part of the sources for
> this documentation, and possibly update -I options for building manuals as
> well.

I've added a -I gcc/gcc/ada/doc/gnat_ugn there, that's as far as my
knowledge goes for this script so I hope this is enough.

I can help in transitioning the script to sphinx though, that would seem
more interesting/productive at this stage.

Arno


Re: [patch]: [Bug tree-optimization/61917] [4.9/5 Regression] ICE on valid code at -O3 on x86_64-linux-gnu in vectorizable_reduction, at tree-vect-loop.c:4913

2015-02-25 Thread Richard Biener
On Wed, Feb 25, 2015 at 12:05 PM, Kai Tietz  wrote:
> 2015-02-25 11:57 GMT+01:00 Richard Biener :
>> On Wed, Feb 25, 2015 at 11:06 AM, Kai Tietz  wrote:
>>> Hello,
>>>
>>> ChangeLog
>>>
>>> 2015-02-25  Kai Tietz  
>>>
>>> PR tree-optimization/61917
>>> * tree-vect-loop.c (vectorizable_reduction): Allow
>>> vect_internal_def without reduction to exit graceful.
>>>
>>> ChangeLog testsuite/
>>>
>>> 2015-02-25  Kai Tietz  
>>>
>>> PR tree-optimization/61917
>>> * gcc.dg/vect/vect-pr61917.c: New file.
>>>
>>> Tested for x86_64-unknown-linux.  OK to apply?
>>
>> It doesn't make much sense to fail here as said in the comment
>> because of patterns if the actual case isn't a pattern.  Also
>> the patch causing this made vect_external_def possible, so
>> why does this affect vect_internal_def here?
>>
>> It may paper over the issue - but clearly the fix is bogus.
>
> Well, actually I don't see any good reason for failing here at all,
> but I assumed that author wanted to get by it some missed cases.
> AFAIU the code we actually don't need/want to assert here either on
> internal (where it is pretty likely no pattern present), and also on
> externals, too.
> I would be fine to remove this assert here completely, but I thought
> it might be of interest to see that orig_stmt isn't NULL for nested
> case

I agree that the code is overusing asserts but the path you bail out
on is bogus.  Fact is that we somehow detected a reduction here
(via special-casing of minus I guess) but fail to properly handle it
later.  In fact we assert that we end up with a PHI node in the definition
but would ICE there as well.  Thus a better patch would be

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c(revision 220958)
+++ gcc/tree-vect-loop.c(working copy)
@@ -4981,6 +4981,13 @@ vectorizable_reduction (gimple stmt, gim
   if (!vectype_in)
 vectype_in = tem;
   gcc_assert (is_simple_use);
+
+  /* If the defining stmt isn't a PHI node then this isn't a reduction.  */
+  if (!found_nested_cycle_def)
+reduc_def_stmt = def_stmt;
+  if (gimple_code (reduc_def_stmt) != GIMPLE_PHI)
+return false;
+
   if (!(dt == vect_reduction_def
|| dt == vect_nested_cycle
|| ((dt == vect_internal_def || dt == vect_external_def
@@ -4993,10 +5000,7 @@ vectorizable_reduction (gimple stmt, gim
   gcc_assert (orig_stmt);
   return false;
 }
-  if (!found_nested_cycle_def)
-reduc_def_stmt = def_stmt;

-  gcc_assert (gimple_code (reduc_def_stmt) == GIMPLE_PHI);
   if (orig_stmt)
 gcc_assert (orig_stmt == vect_is_simple_reduction (loop_vinfo,
reduc_def_stmt,

Richard.
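
As a side note on the testcase, the inner-loop statement d = a - d has
the accumulator as the subtrahend, and applying it twice restores the
original value. The sketch below only demonstrates that arithmetic fact
with hypothetical helper names; it does not reproduce the vectorizer's
analysis or explain the ICE.

```c
/* One iteration of the inner-loop statement from the testcase.  */
static int step (int a, int d)
{
  return a - d;
}

/* The inner loop of the testcase, which runs exactly twice.  */
static int inner_loop (int a, int d)
{
  for (int b = 0; b < 2; b++)
    d = step (a, d);
  return d;
}
```

Since a - (a - d) == d, inner_loop (a, d) returns d whenever no overflow
occurs, so the statement does not accumulate anything across iterations
the way a sum such as d = d - a would.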

> Kai
>
>> Richard.
>>
>>> Regards,
>>> Kai
>>>
>>> Index: tree-vect-loop.c
>>> ===
>>> --- tree-vect-loop.c(Revision 220958)
>>> +++ tree-vect-loop.c(Arbeitskopie)
>>> @@ -4990,7 +4990,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
>>>/* For pattern recognized stmts, orig_stmt might be a reduction,
>>>   but some helper statements for the pattern might not, or
>>>   might be COND_EXPRs with reduction uses in the condition.  */
>>> -  gcc_assert (orig_stmt);
>>> +  gcc_assert (orig_stmt || dt == vect_internal_def);
>>>return false;
>>>  }
>>>if (!found_nested_cycle_def)
>>> Index: gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
>>> ===
>>> --- /dev/null
>>> +++ gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
>>> @@ -0,0 +1,13 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-additional-options "-O3" } */
>>> +
>>> +int a, b, c, d;
>>> +
>>> +int
>>> +fn1 ()
>>> +{
>>> +  for (; c; c++)
>>> +for (b = 0; b < 2; b++)
>>> +  d = a - d;
>>> +  return d;
>>> +}


Re: [PATCH] Use DO_PRAGMA in libgomp.oacc-c-c++-common/reduction-1.c

2015-02-25 Thread Thomas Schwinge
Hi!

On Mon, 23 Feb 2015 18:14:35 +0100, Tom de Vries  wrote:
> On 23-02-15 17:08, Jakub Jelinek wrote:
> > On Mon, Feb 23, 2015 at 04:52:56PM +0100, Tom de Vries wrote:
> >> The only thing I'm not sure about is the two-level pragma expansion using
> >> the apply pragmas. It maximizes factoring out common parts, but it makes
> >> things less readable.
> >>
> >> Tested on x86_64.
> >>
> >> OK for stage4?
> >
> > If Thomas is ok with that, it is ok with me too.

OK, thanks!

> For comparison, this is a less convoluted, but longer version.

Hmm, I can't quite decide which of the two to prefer.  Your call.  ;-P
(Maybe, toss a coin if it takes more than a minute to decide.)
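
For reference, the general DO_PRAGMA technique under discussion looks
like the sketch below. DO_PRAGMA built on _Pragma is the standard idiom;
the ACC_REDUCTION wrapper and sum_first are hypothetical and are not the
macros from the patch. Without -fopenacc the pragma is ignored (possibly
with an unknown-pragma warning) and the loop simply runs sequentially.

```c
/* Stringify the argument and hand it to the _Pragma operator.  */
#define DO_PRAGMA(x) _Pragma (#x)

/* Second level: build the pragma text from parameters, then let
   DO_PRAGMA stringify the already-substituted tokens.  */
#define ACC_REDUCTION(op, var) DO_PRAGMA (acc parallel reduction (op:var))

/* Sum of 1..N, with the reduction clause generated by the macro.  */
static int sum_first (int n)
{
  int sum = 0;
  ACC_REDUCTION (+, sum)
  for (int i = 1; i <= n; i++)
    sum += i;
  return sum;
}
```

The two-level form keeps the operator and variable in one place, at the
cost of hiding the pragma text from a casual reader, which is the
readability trade-off mentioned above.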


Grüße,
 Thomas


signature.asc
Description: PGP signature


Re: [Ada] convert GNAT doc to sphinx

2015-02-25 Thread Richard Biener
On Sun, Feb 22, 2015 at 8:15 PM, Arnaud Charlet  wrote:
>> Your patch removes these arguments to dircategory:
>> ...
>> $ git show bf5dffd3a47fe12ace71fe48e87cfb1b9ada1344 | grep dircategory
>> +@dircategory
>> -@dircategory GNU Ada tools
>> -@dircategory GNU Ada tools
>> +@dircategory
>> ...
>
> Well OK but these are automatically generated now, and this doesn't really
> answer my question about the documentation of @dircategory.
>
> I'll put a kludge for now to work around this. In the long term, if we could
> transition all docs to sphinx and get rid of texinfo that would be great.

I also see nonsensical @direntry - just compare to what is there on the
4.9 branch.

Richard.

> Arno


Re: libgomp nvptx plugin: rework initialisation and support the proposed load/unload hooks (was: Merge current set of OpenACC changes from gomp-4_0-branch)

2015-02-25 Thread Julian Brown
On Wed, 25 Feb 2015 10:36:08 +0100
Thomas Schwinge  wrote:

> Hi!
> 
> On Tue, 24 Feb 2015 11:29:51 +, Julian Brown
>  wrote:
> > Test results look OK, barring a suspected harness issue (lib-83
> > failing with a timeout for nvptx
> 
> However, I'm seeing a class of testsuite regressions: all variants of
> libgomp.oacc-fortran/lib-5.f90 and libgomp.oacc-fortran/lib-7.f90
> FAIL: »libgomp: cuMemFreeHost error: invalid value«.  I see these two
> test cases contain a lot of acc_get_num_devices and similar calls --
> I've been testing this on our nvidiak20-2 system, which contains two
> Nvidia K20 cards, so maybe there's something wrong in that regard.
> (But why is this failing only for Fortran -- are we missing C/C++
> tests in that area?) Can you have a look, or want me to?

I can have a look at that.

> > --- a/gcc/config/nvptx/mkoffload.c
> > +++ b/gcc/config/nvptx/mkoffload.c
> > @@ -850,16 +851,17 @@ process (FILE *in, FILE *out)
> 
> >fprintf (out, "static const void *target_data[] = {\n");
> > -  fprintf (out, "  ptx_code, var_mappings, func_mappings\n");
> > +  fprintf (out, "  ptx_code, (void*) %u, var_mappings, (void*) %u, "
> > +   "func_mappings\n", nvars, nfuncs);
> >fprintf (out, "};\n\n");
> 
> I wondered if it's maybe more elegant to just separate those by NULL
> delimiters instead of the size integers casted to void * (spaces
> missing)?  But then, that'd need "double scanning" in the consumer,
> libgomp/plugin/plugin-nvptx.c:GOMP_OFFLOAD_load_image, because we
> need to allocate an appropriately sized array, so maybe your more
> expressive approach is better indeed.

Yeah, I considered both: there's probably not much to choose between
the approaches. They use the same amount of space.
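
The two layouts can be compared with a small sketch.  This is only an
illustration under assumptions: the table contents, struct image_desc,
decode_image and count_until_null are made-up names, not the code in
mkoffload.c or plugin-nvptx.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the tables mkoffload would emit.  */
static const char ptx_code[] = "// ptx";
static const char *const var_mappings[] = { "var_a", "var_b" };
static const char *const func_mappings[] = { "fn_a" };

/* Layout as in the patch: each table is preceded by its element count,
   stored in a pointer-sized slot.  */
static const void *target_data[] = {
  ptx_code,
  (void *) (uintptr_t) 2, var_mappings,
  (void *) (uintptr_t) 1, func_mappings
};

struct image_desc
{
  const void *ptx;
  size_t nvars;
  const void *vars;
  size_t nfuncs;
  const void *funcs;
};

/* With explicit counts the consumer reads every field exactly once and
   knows the array sizes before it allocates anything.  */
static struct image_desc
decode_image (const void *const *data)
{
  struct image_desc d;

  d.ptx = data[0];
  d.nvars = (size_t) (uintptr_t) data[1];
  d.vars = data[2];
  d.nfuncs = (size_t) (uintptr_t) data[3];
  d.funcs = data[4];
  return d;
}

/* With NULL delimiters the consumer must first walk a table to learn
   its length, then walk it again to copy it ("double scanning").  */
static size_t
count_until_null (const void *const *table)
{
  size_t n = 0;

  while (table[n] != NULL)
    n++;
  return n;
}
```

Both variants cost one extra pointer-sized slot per table, which matches the
observation that they use the same amount of space.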

> > --- a/libgomp/oacc-async.c
> > +++ b/libgomp/oacc-async.c
> > @@ -34,44 +34,68 @@
> >  int
> >  acc_async_test (int async)
> >  {
> > +  struct goacc_thread *thr = goacc_thread ();
> > +
> >if (async < acc_async_sync)
> >  gomp_fatal ("invalid async argument: %d", async);
> >  
> > -  return base_dev->openacc.async_test_func (async);
> > +  assert (thr->dev);
> > +
> > +  return thr->dev->openacc.async_test_func (async);
> >  }

> Here, and in several other places: is this code conforming to the
> OpenACC specification?  Do we need to (lazily) initialize in all
> these places, or in goacc_thread, or gracefully fail (see below) if
> not initialized (basically in all places where you currently assert
> (thr->dev))?
> 
> #include 
> 
> int main(int argc, char *argv[])
> {
>   return acc_async_test(0);
> }
> 
> [sigsegv]

Whether it conforms to the spec or not is a hard question to answer,
because a lot of behaviour is left undefined. But here are two
possibly-useful made-up guidelines:

1. Does the program work the same with OpenACC disabled?

2. Does some strange use of OpenACC functionality (including library
   calls, etc.) probably indicate user error?

Much of the lazy initialisation code is there so that (1) can be true
-- i.e., a program can use OpenACC directives without making an
explicit call to "acc_init" or other API-specific initialisation code.

But this case is an explicit call to the OpenACC runtime library, so the
program can't work without -fopenacc enabled, so we can follow
guideline (2) instead. And in this case, it's meaningless to test for
completion of async operation when no device is active.

Of course though, this should be an actual error rather than a crash.
But, I don't think we want to lazily-initialise here.
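
A minimal sketch of what "an actual error rather than a crash" could look
like.  The types and the function checked_async_test are hypothetical
stand-ins, not libgomp's; in libgomp itself the error path would more likely
go through gomp_fatal, which does not return.  Here an error code is returned
so that the behaviour can be shown in isolation.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, cut-down versions of the per-thread state.  */
struct device
{
  int (*async_test_func) (int async);
};

struct thread_state
{
  struct device *dev;
};

enum { ASYNC_TEST_NO_DEVICE = -1 };

/* Report a diagnosable error when no device is active instead of
   dereferencing a null pointer or tripping an assert.  No lazy
   initialisation is attempted.  */
static int
checked_async_test (const struct thread_state *thr, int async)
{
  if (thr == NULL || thr->dev == NULL)
    {
      fprintf (stderr, "acc_async_test: no device is active\n");
      return ASYNC_TEST_NO_DEVICE;
    }
  return thr->dev->async_test_func (async);
}

/* Dummy backend hook used only for the illustration.  */
static int
always_done (int async)
{
  (void) async;
  return 1;
}
```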

> Also, I'm not sure what the expected outcome of this code sequence is:
> 
> acc_init(acc_device_nvidia);
> acc_shutdown(acc_device_nvidia);
> acc_async_test(0);
> 
> a.out: [...]/source-gcc/libgomp/oacc-async.c:42: acc_async_test:
> Assertion `thr->dev' failed. Aborted (core dumped)
> 
> If the OpenACC specification can be read such that all this indeed is
> "undefined behavior", then aborting/crashing is OK, of course.

Again, this would probably indicate user error in a real program, so it
should raise a (real) error message.

> > --- a/libgomp/oacc-cuda.c
> > +++ b/libgomp/oacc-cuda.c
> > @@ -34,51 +34,53 @@
> >  void *
> >  acc_get_current_cuda_device (void)
> >  {
> > -  void *p = NULL;
> > +  struct goacc_thread *thr = goacc_thread ();
> >  
> > -  if (base_dev && base_dev->openacc.cuda.get_current_device_func)
> > -p = base_dev->openacc.cuda.get_current_device_func ();
> > +  if (thr && thr->dev && thr->dev->openacc.cuda.get_current_device_func)
> > +return thr->dev->openacc.cuda.get_current_device_func ();
> >  
> > -  return p;
> > +  return NULL;
> >  }
> 
> Here, and in other places, it looks as if we'd fail gracefully.

Not sure about this (maybe it should be an error too?), but...

> >  int
> >  acc_set_cuda_stream (int async, void *stream)
> >  {
> > -  int s = -1;
> > +  struct goacc_thread *thr;
> >  
> >if (async < 0 || stream == NULL)
> >  return 0;
> >  
> >g

Re: libgomp nvptx plugin: rework initialisation and support the proposed load/unload hooks (was: Merge current set of OpenACC changes from gomp-4_0-branch)

2015-02-25 Thread Ilya Verbin
On Wed, Feb 25, 2015 at 10:36:08 +0100, Thomas Schwinge wrote:
> > Julian Brown  wrote:
> > OK for gomp4 branch? I could commit Ilya's patch there too if so.
> 
> I'll leave the decision to Jakub, but, what about trunk?  As Ilya
> indicated in
> ,
> (at least part of) these patches are fixing a regression with offloading
> from shared libraries.  (And maybe the rest qualifies as fixes and
> extensions to new code (offloading), so no danger to cause any
> regressions compared to the last GCC release?)

BTW, when I removed calls to gomp_init_tables in  
,
I could accidentally remove some necessary gomp_mutex_lock/unlock.
Also GOMP_offload_[un]register require some mutexes, as noted by Jakub.
I'm going to fix this.  So, I think we should commit all dependent patches to
gomp4 branch, and I will post a fix for mutexes on top of them.

Thanks,
  -- Ilya


Re: [patch]: [Bug tree-optimization/61917] [4.9/5 Regression] ICE on valid code at -O3 on x86_64-linux-gnu in vectorizable_reduction, at tree-vect-loop.c:4913

2015-02-25 Thread Kai Tietz
2015-02-25 12:35 GMT+01:00 Richard Biener :
> On Wed, Feb 25, 2015 at 12:05 PM, Kai Tietz  wrote:
>> 2015-02-25 11:57 GMT+01:00 Richard Biener :
>>> On Wed, Feb 25, 2015 at 11:06 AM, Kai Tietz  wrote:
>>>> Hello,
>>>>
>>>> ChangeLog
>>>>
>>>> 2015-02-25  Kai Tietz
>>>>
>>>> PR tree-optimization/61917
>>>> * tree-vect-loop.c (vectorizable_reduction): Allow
>>>> vect_internal_def without reduction to exit gracefully.
>>>>
>>>> ChangeLog testsuite/
>>>>
>>>> 2015-02-25  Kai Tietz
>>>>
>>>> PR tree-optimization/61917
>>>> * gcc.dg/vect/vect-pr61917.c: New file.
>>>>
>>>> Tested on x86_64-unknown-linux.  OK to apply?
>>>
>>> It doesn't make much sense to fail here as said in the comment
>>> because of patterns if the actual case isn't a pattern.  Also
>>> the patch causing this made vect_external_def possible, so
>>> why does this affect vect_internal_def here?
>>>
>>> It may paper over the issue - but clearly the fix is bogus.
>>
>> Well, actually I don't see any good reason for failing here at all,
>> but I assumed that the author wanted to catch some missed cases with it.
>> As far as I understand the code, we don't need or want to assert here,
>> neither for internal defs (where it is pretty likely that no pattern is
>> present) nor for external ones.
>> I would be fine with removing this assert completely, but I thought
>> it might be of interest to see that orig_stmt isn't NULL for the nested
>> case.
>
> I agree that the code is overusing asserts but the path you bail out
> on is bogus.  Fact is that we somehow detected a reduction here
> (via special-casing of minus I guess) but fail to properly handle it
> later.  In fact we assert that we end up with a PHI node in the definition
> but would ICE there as well.  Thus a better patch would be
>
> Index: gcc/tree-vect-loop.c
> ===
> --- gcc/tree-vect-loop.c(revision 220958)
> +++ gcc/tree-vect-loop.c(working copy)
> @@ -4981,6 +4981,13 @@ vectorizable_reduction (gimple stmt, gim
>if (!vectype_in)
>  vectype_in = tem;
>gcc_assert (is_simple_use);
> +
> +  /* If the defining stmt isn't a PHI node then this isn't a reduction.  */
> +  if (!found_nested_cycle_def)
> +reduc_def_stmt = def_stmt;
> +  if (gimple_code (reduc_def_stmt) != GIMPLE_PHI)
> +return false;
> +
>if (!(dt == vect_reduction_def
> || dt == vect_nested_cycle
> || ((dt == vect_internal_def || dt == vect_external_def
> @@ -4993,10 +5000,7 @@ vectorizable_reduction (gimple stmt, gim
>gcc_assert (orig_stmt);
>return false;
>  }
> -  if (!found_nested_cycle_def)
> -reduc_def_stmt = def_stmt;
>
> -  gcc_assert (gimple_code (reduc_def_stmt) == GIMPLE_PHI);
>if (orig_stmt)
>  gcc_assert (orig_stmt == vect_is_simple_reduction (loop_vinfo,
> reduc_def_stmt,
>
> Richard.

Yes, your version is easier to read and shows the intention better.
I will test it.

Kai
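
As an aside, the statement in the testcase can be contrasted with a genuine
minus reduction using plain C.  This sketch only illustrates the arithmetic;
the function names are made up, and it is not a description of what
vect_is_simple_reduction checks.

```c
/* The statement from the testcase: the accumulator d is the
   subtrahend, so each iteration reflects d around a.  Two iterations
   give a - (a - d) == d.  */
static int
repeat_a_minus_d (int a, int d, int n)
{
  int i;

  for (i = 0; i < n; i++)
    d = a - d;
  return d;
}

/* A genuine minus reduction: the accumulator is the minuend, and the
   result equals d + n * (-a).  */
static int
repeat_d_minus_a (int a, int d, int n)
{
  int i;

  for (i = 0; i < n; i++)
    d = d - a;
  return d;
}
```

With a = 5 and d = 3, two iterations of the first loop return 3 again, while
the second loop returns 3 - 2 * 5 = -7.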

>>> Richard.
>>>
>>>> Regards,
>>>> Kai
>>>>
>>>> Index: tree-vect-loop.c
>>>> ===
>>>> --- tree-vect-loop.c(Revision 220958)
>>>> +++ tree-vect-loop.c(Arbeitskopie)
>>>> @@ -4990,7 +4990,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
>>>>    /* For pattern recognized stmts, orig_stmt might be a reduction,
>>>>       but some helper statements for the pattern might not, or
>>>>       might be COND_EXPRs with reduction uses in the condition.  */
>>>> -  gcc_assert (orig_stmt);
>>>> +  gcc_assert (orig_stmt || dt == vect_internal_def);
>>>>    return false;
>>>>  }
>>>>    if (!found_nested_cycle_def)
>>>> Index: gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
>>>> ===
>>>> --- /dev/null
>>>> +++ gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
>>>> @@ -0,0 +1,13 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-additional-options "-O3" } */
>>>> +
>>>> +int a, b, c, d;
>>>> +
>>>> +int
>>>> +fn1 ()
>>>> +{
>>>> +  for (; c; c++)
>>>> +    for (b = 0; b < 2; b++)
>>>> +      d = a - d;
>>>> +  return d;
>>>> +}


Re: [PATCH PR65161]

2015-02-25 Thread Yuri Rumyantsev
Here is the updated patch, revised according to Alexander's comments.

BTW another function using HID interface is do_reorder_for_imul and it
is called from ix86_sched_reorder.

Is it OK for trunk?

2015-02-25 13:26 GMT+03:00 Alexander Monakov :
>
>
> On Wed, 25 Feb 2015, Yuri Rumyantsev wrote:
>
>> Hi All,
>>
>> I prepared new patch which includes test-case.
>>
>> I can't agree with patch proposed by Alexander since other functions
>> doing ready list reordering also use HID interface, so I put escape
>> check in ix86_sched_reorder.
>
> I don't see how that is the case.  Can you point me to specific lines of code
> in ix86_sched_reorder or its callees that access HID?
>
> In any case please use sel_sched_p () rather than flag_selective_scheduling2.
>
> Thanks.
>
> Alexander


patch.2
Description: Binary data


Re: [PATCH PR65161]

2015-02-25 Thread Alexander Monakov


On Wed, 25 Feb 2015, Yuri Rumyantsev wrote:

> Here is the updated patch, revised according to Alexander's comments.
> 
> BTW another function using HID interface is do_reorder_for_imul and it
> is called from ix86_sched_reorder.

do_reorder_for_imul uses dependency list iteration macros, which use HDID, not
HID, and HDID is populated during selective scheduling.  How exactly does
do_reorder_for_imul access HID?

Alexander


Re: ipa-icf::merge TLC

2015-02-25 Thread Markus Trippelsdorf
On 2015.02.25 at 09:38 +0100, Jan Hubicka wrote:
> this patch reorganize sem_function::merge and sem_variable::merge.
> I read the code in detail and found several issues that are fixed in the
> following patch.

I gave your patch a quick spin. It breaks Chromium. Its protocol buffer
compiler gets miscompiled:

 RULE Generating C++ and Python code from copresence/proto/enums.proto
FAILED: cd ../../components; python ../tools/protoc_wrapper/protoc_wrapper.py
--include "" --protobuf
"../out/Release/gen/protoc_out/components/copresence/proto/enums.pb.h"
--proto-in-dir copresence/proto --proto-in-file "enums.proto"
"--use-system-protobuf=0" -- ../out/Release/protoc --cpp_out
../out/Release/gen/protoc_out/components/copresence/proto
--python_out ../out/Release/pyproto/components/copresence/proto
*** Error in `../out/Release/protoc': double free or corruption (out): 0x7ffd966468b0 ***
=== Backtrace: =
/lib/libc.so.6(+0x7265e)[0x7fe09058065e]
/lib/libc.so.6(+0x77f1d)[0x7fe090585f1d]
/lib/libc.so.6(+0x7874b)[0x7fe09058674b]
../out/Release/protoc[0x45f5e1]
../out/Release/protoc[0x462caf]
../out/Release/protoc[0x462d1c]
../out/Release/protoc[0x46685c]
../out/Release/protoc[0x4677d5]
../out/Release/protoc[0x4679c3]
../out/Release/protoc[0x467b15]
../out/Release/protoc[0x479d89]
../out/Release/protoc[0x4aa058]
../out/Release/protoc[0x4aa0af]
../out/Release/protoc[0x46b217]
../out/Release/protoc[0x4a4832]
../out/Release/protoc[0x4a86cf]
../out/Release/protoc[0x4a890e]
../out/Release/protoc[0x4a0827]
../out/Release/protoc[0x4678fd]
../out/Release/protoc[0x467b15]
../out/Release/protoc[0x40c926]
../out/Release/protoc[0x403131]
/lib/libc.so.6(__libc_start_main+0xf0)[0x7fe09052e6d0]
../out/Release/protoc[0x4037f9]
=== Memory map: 
0040-004eb000 r-xp  00:13 36244132   
/var/tmp/chromium/src/out/Release/protoc
004ec000-004ef000 r--p 000eb000 00:13 36244132   
/var/tmp/chromium/src/out/Release/protoc
004ef000-004f rw-p 000ee000 00:13 36244132   
/var/tmp/chromium/src/out/Release/protoc
01ab-01b03000 rw-p  00:00 0  [heap]
7fe09027f000-7fe09030d000 r-xp  00:0f 2540736
/lib64/libm-2.21.90.so
7fe09030d000-7fe09050c000 ---p 0008e000 00:0f 2540736
/lib64/libm-2.21.90.so
7fe09050c000-7fe09050d000 r--p 0008d000 00:0f 2540736
/lib64/libm-2.21.90.so
7fe09050d000-7fe09050e000 rw-p 0008e000 00:0f 2540736
/lib64/libm-2.21.90.so
7fe09050e000-7fe09066e000 r-xp  00:0f 2541268
/lib64/libc-2.21.90.so
7fe09066e000-7fe09086d000 ---p 0016 00:0f 2541268
/lib64/libc-2.21.90.so
7fe09086d000-7fe090871000 r--p 0015f000 00:0f 2541268
/lib64/libc-2.21.90.so
7fe090871000-7fe090873000 rw-p 00163000 00:0f 2541268
/lib64/libc-2.21.90.so
7fe090873000-7fe090877000 rw-p  00:00 0 
7fe090877000-7fe09088f000 r-xp  00:0f 2541131
/lib64/libpthread-2.21.90.so
7fe09088f000-7fe090a8e000 ---p 00018000 00:0f 2541131
/lib64/libpthread-2.21.90.so
7fe090a8e000-7fe090a8f000 r--p 00017000 00:0f 2541131
/lib64/libpthread-2.21.90.so
7fe090a8f000-7fe090a9 rw-p 00018000 00:0f 2541131
/lib64/libpthread-2.21.90.so
7fe090a9-7fe090a94000 rw-p  00:00 0 
7fe090a94000-7fe090ab6000 r-xp  00:0f 2541267
/lib64/ld-2.21.90.so
7fe090ad2000-7fe090ad7000 rw-p  00:00 0 
7fe090ad7000-7fe090aee000 r-xp  00:0f 72800  
/lib64/libgcc_s.so.1
7fe090aee000-7fe090aef000 rw-p 00016000 00:0f 72800  
/lib64/libgcc_s.so.1
7fe090aef000-7fe090af rw-p  00:00 0 
7fe090af-7fe090c7b000 r-xp  00:0f 3018239
/usr/lib64/gcc/x86_64-pc-linux-gnu/5.0.0/libstdc++.so.6.0.21
7fe090c7b000-7fe090c7c000 ---p 0018b000 00:0f 3018239
/usr/lib64/gcc/x86_64-pc-linux-gnu/5.0.0/libstdc++.so.6.0.21
7fe090c7c000-7fe090c86000 r--p 0018b000 00:0f 3018239
/usr/lib64/gcc/x86_64-pc-linux-gnu/5.0.0/libstdc++.so.6.0.21
7fe090c86000-7fe090c8a000 rw-p 00195000 00:0f 3018239
/usr/lib64/gcc/x86_64-pc-linux-gnu/5.0.0/libstdc++.so.6.0.21
7fe090c8a000-7fe090c8d000 rw-p  00:00 0 
7fe090cb3000-7fe090cb5000 rw-p  00:00 0 
7fe090cb5000-7fe090cb6000 r--p 00021000 00:0f 2541267
/lib64/ld-2.21.90.so
7fe090cb6000-7fe090cb7000 rw-p 00022000 00:0f 2541267
/lib64/ld-2.21.90.so
7fe090cb7000-7fe090cb8000 rw-p  00:00 0 
7ffd96628000-7ffd96649000 rw-p  00:00 0  [stack]
7ffd967e4000-7ffd967e6000 r--p  00:00 0  [vvar]
7ffd967e6000-7ffd967e8000 r-xp  00:00 0  [vdso]

Re: Option overriding in the offloading code path

2015-02-25 Thread Bernd Schmidt

On 02/25/2015 11:28 AM, Thomas Schwinge wrote:


Am I on the right track with my assumption that it is correct that
nvptx.c:nvptx_option_override is not invoked in the offloading code path,
so we'd need a new target hook (?) to consolidate/override the options in
this scenario?


I'm surprised by this. Does lto1 not go through do_compile? I guess 
that's plausible, but I think in that case we should invoke this hook if 
ACCEL_COMPILER.



Bernd



Re: [patch]: [Bug tree-optimization/61917] [4.9/5 Regression] ICE on valid code at -O3 on x86_64-linux-gnu in vectorizable_reduction, at tree-vect-loop.c:4913

2015-02-25 Thread Kai Tietz
Hello,

So, I ran a full regression test for the following patch:

ChangeLog

2015-02-25  Richard Biener
Kai Tietz

PR tree-optimization/61917
* tree-vect-loop.c (vectorizable_reduction): Allow
vect_internal_def without reduction to exit gracefully.

ChangeLog testsuite/

2015-02-25  Kai Tietz

PR tree-optimization/61917
* gcc.dg/vect/vect-pr61917.c: New file.

Tested on x86_64-unknown-linux.  OK to apply?

Regards,
Kai

Index: tree-vect-loop.c
===
--- tree-vect-loop.c(Revision 220958)
+++ tree-vect-loop.c(Arbeitskopie)
@@ -4981,6 +4981,12 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
   if (!vectype_in)
 vectype_in = tem;
   gcc_assert (is_simple_use);
+  if (!found_nested_cycle_def)
+reduc_def_stmt = def_stmt;
+
+  if (gimple_code (reduc_def_stmt) != GIMPLE_PHI)
+return false;
+
   if (!(dt == vect_reduction_def
 || dt == vect_nested_cycle
 || ((dt == vect_internal_def || dt == vect_external_def
@@ -4993,10 +4999,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
   gcc_assert (orig_stmt);
   return false;
 }
-  if (!found_nested_cycle_def)
-reduc_def_stmt = def_stmt;

-  gcc_assert (gimple_code (reduc_def_stmt) == GIMPLE_PHI);
   if (orig_stmt)
 gcc_assert (orig_stmt == vect_is_simple_reduction (loop_vinfo,
reduc_def_stmt,
Index: gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
===
--- /dev/null
+++ gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+int a, b, c, d;
+
+int
+fn1 ()
+{
+  for (; c; c++)
+    for (b = 0; b < 2; b++)
+      d = a - d;
+  return d;
+}


Re: [patch]: [Bug tree-optimization/61917] [4.9/5 Regression] ICE on valid code at -O3 on x86_64-linux-gnu in vectorizable_reduction, at tree-vect-loop.c:4913

2015-02-25 Thread Richard Biener
On Wed, Feb 25, 2015 at 2:10 PM, Kai Tietz  wrote:
> Hello,
>
> So, I ran a full regression test for the following patch:
>
> ChangeLog
>
> 2015-02-25  Richard Biener
> Kai Tietz
>
> PR tree-optimization/61917
> * tree-vect-loop.c (vectorizable_reduction): Allow
> vect_internal_def without reduction to exit gracefully.
>
> ChangeLog testsuite/
>
> 2015-02-25  Kai Tietz
>
> PR tree-optimization/61917
> * gcc.dg/vect/vect-pr61917.c: New file.
>
> Tested on x86_64-unknown-linux.  OK to apply?

Ok.

Thanks,
Richard.

> Regards,
> Kai
>
> Index: tree-vect-loop.c
> ===
> --- tree-vect-loop.c(Revision 220958)
> +++ tree-vect-loop.c(Arbeitskopie)
> @@ -4981,6 +4981,12 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
>if (!vectype_in)
>  vectype_in = tem;
>gcc_assert (is_simple_use);
> +  if (!found_nested_cycle_def)
> +reduc_def_stmt = def_stmt;
> +
> +  if (gimple_code (reduc_def_stmt) != GIMPLE_PHI)
> +return false;
> +
>if (!(dt == vect_reduction_def
>  || dt == vect_nested_cycle
>  || ((dt == vect_internal_def || dt == vect_external_def
> @@ -4993,10 +4999,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
>gcc_assert (orig_stmt);
>return false;
>  }
> -  if (!found_nested_cycle_def)
> -reduc_def_stmt = def_stmt;
>
> -  gcc_assert (gimple_code (reduc_def_stmt) == GIMPLE_PHI);
>if (orig_stmt)
>  gcc_assert (orig_stmt == vect_is_simple_reduction (loop_vinfo,
> reduc_def_stmt,
> Index: gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
> ===
> --- /dev/null
> +++ gcc/gcc/testsuite/gcc.dg/vect/vect-pr61917.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O3" } */
> +
> +int a, b, c, d;
> +
> +int
> +fn1 ()
> +{
> +  for (; c; c++)
> +    for (b = 0; b < 2; b++)
> +      d = a - d;
> +  return d;
> +}


[PATCH] Fix PR65204

2015-02-25 Thread Richard Biener

This fixes missed tracking of alignment of non-invariant addresses
in CCP.

Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for GCC 6.

Richard.

2015-02-25  Richard Biener  

PR tree-optimization/65204
* tree-ssa-ccp.c (evaluate_stmt): Always evaluate address
takens for bit-CCP.

* gcc.dg/tree-ssa/ssa-ccp-35.c: New testcase.

Index: gcc/tree-ssa-ccp.c
===
--- gcc/tree-ssa-ccp.c  (revision 220958)
+++ gcc/tree-ssa-ccp.c  (working copy)
@@ -1748,7 +1769,9 @@ evaluate_stmt (gimple stmt)
 
   /* Resort to simplification for bitwise tracking.  */
   if (flag_tree_bit_ccp
-  && (likelyvalue == CONSTANT || is_gimple_call (stmt))
+  && (likelyvalue == CONSTANT || is_gimple_call (stmt)
+ || (gimple_assign_single_p (stmt)
+ && gimple_assign_rhs_code (stmt) == ADDR_EXPR))
   && !is_constant)
 {
   enum gimple_code code = gimple_code (stmt);
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-35.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-35.c  (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-35.c  (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp1" } */
+
+typedef char char16[16] __attribute__ ((aligned (16)));
+char16 c16[4] __attribute__ ((aligned (4)));
+
+int f5 (int i)
+{
+  __SIZE_TYPE__ s = (__SIZE_TYPE__)&c16[i];
+  /* 0 */
+  return 3 & s;
+}
+
+/* { dg-final { scan-tree-dump "return 0;" "ccp1" } } */
+/* { dg-final { cleanup-tree-dump "ccp1" } } */
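
Why the testcase can fold to "return 0" may be seen with a deliberately
simplified model.  This sketch tracks only the number of low bits known to be
zero; it is not GCC's value/mask lattice, and the helper names are
hypothetical.  It assumes the array base is at least 4-byte aligned, as the
variable's attribute in the testcase requests, and that each element is 16
bytes.

```c
#include <stdint.h>

static unsigned
min_unsigned (unsigned x, unsigned y)
{
  return x < y ? x : y;
}

/* Number of low bits known to be zero in base + index * stride, where
   the base alignment and the stride are powers of two given as log2
   and the index is unknown.  The product contributes stride_log2 zero
   bits; a sum keeps the smaller of the two counts.  */
static unsigned
known_zero_low_bits (unsigned base_align_log2, unsigned stride_log2)
{
  return min_unsigned (base_align_log2, stride_log2);
}

/* Return 1 if (address & mask) is known to be 0 given that the low
   known_zero bits of the address are zero.  */
static int
and_folds_to_zero (unsigned known_zero, uintptr_t mask)
{
  uintptr_t zero_bits;

  if (known_zero >= sizeof (uintptr_t) * 8)
    zero_bits = ~(uintptr_t) 0;
  else
    zero_bits = ((uintptr_t) 1 << known_zero) - 1;
  return (mask & ~zero_bits) == 0;
}
```

For &c16[i] the model gives min (2, 4) = 2 known-zero bits, so "3 & s" folds
to 0 even though i is not constant, whereas a mask of 7 would not fold.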


Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2015-02-25 Thread Xingxing Pan

Hi,

This patch merges the pipeline description for marvell-whitney into the latest code base.
Is it OK for trunk?

--
Regards,
Xingxing
commit 83974dde8d9f773df1004aa1d5e3b05d8a33f5e0
Author: Xingxing Pan 
Date:   Wed Feb 25 10:24:40 2015 +0800

2015-02-25 Xingxing Pan  

* config/arm/arm-cores.def: Add new core marvell-whitney.
* config/arm/arm-protos.h:
(marvell_whitney_vector_mode_qi): Declare.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm.c (arm_marvell_whitney_tune): New structure.
(arm_issue_rate): Add marvell_whitney.
(marvell_whitney_vector_mode_qi): New function.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm.md: Include marvell-whitney.md.
(generic_sched): Add marvell_whitney.
(generic_vfp): Ditto.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
* config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
* config/arm/marvell-whitney.md: New file.
* doc/invoke.texi: Document marvell-whitney.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index d7e730d..fc76eb5 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE("cortex-m7",		cortexm7, cortexm7,		7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE("cortex-m4",		cortexm4, cortexm4,		7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3",		cortexm3, cortexm3,		7M,  FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4",		marvell_pj4, marvell_pj4,	7A,  FL_LDSCHED, 9e)
+ARM_CORE("marvell-whitney",	marvell_whitney, marvell_whitney, 7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)
 
 /* V7 big.LITTLE implementations */
 ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7,	7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 307babb..d047dbc 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void);
 
 extern int arm_max_conditional_execute ();
 
+extern bool marvell_whitney_vector_mode_qi (rtx_insn *insn);
+extern bool marvell_whitney_inner_shift (rtx_insn *insn);
+
 /* Vectorizer cost model implementation.  */
 struct cpu_vec_costs {
   const int scalar_stmt_cost;   /* Cost of any scalar operation, excluding
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 3450e5b..f0f9f3f 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -298,6 +298,9 @@ EnumValue
 Enum(processor_type) String(marvell-pj4) Value(marvell_pj4)
 
 EnumValue
+Enum(processor_type) String(marvell-whitney) Value(marvell_whitney)
+
+EnumValue
 Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index d459f27..fbfab2e 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -31,7 +31,8 @@
 	cortexa15,cortexa17,cortexr4,
 	cortexr4f,cortexr5,cortexr7,
 	cortexm7,cortexm4,cortexm3,
-	marvell_pj4,cortexa15cortexa7,cortexa17cortexa7,
-	cortexa53,cortexa57,cortexa72,
-	xgene1,cortexa57cortexa53,cortexa72cortexa53"
+	marvell_pj4,marvell_whitney,cortexa15cortexa7,
+	cortexa17cortexa7,cortexa53,cortexa57,
+	cortexa72,xgene1,cortexa57cortexa53,
+	cortexa72cortexa53"
 	(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7bf5b4d..e68287f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2000,6 +2000,25 @@ const struct tune_params arm_cortex_a9_tune =
   ARM_SCHED_AUTOPREF_OFF			/* Sched L2 autopref.  */
 };
 
+const struct tune_params arm_marvell_whitney_tune =
+{
+  arm_9e_rtx_costs,
+  &cortexa9_extra_costs,
+  cortex_a9_sched_adjust_cost,
+  1,		/* Constant limit.  */
+  5,		/* Max cond insns.  */
+  ARM_PREFETCH_BENEFICIAL(4,32,32),
+  false,	/* Prefer constant pool.  */
+  arm_default_branch_cost,
+  false,	/* Prefer LDRD/STRD.  */
+  {true, true},	/* Prefer non short circuit.  */
+  &arm_default_vec_cost,/* Vectorizer costs.  */
+  false,/* Prefer Neon for 64-bits bitops.  */
+  false, false, /* Prefer 32-bit encodings.  */
+  false,	/* Prefer Neon for stringops.  */
+  8		/* Maximum insns to inline memset.  */
+};
+
 const struct tune_params arm_cortex_a12_tune =
 {
   arm_9e_rtx_costs,
@@ -11717,6 +11736,51 @@ fa726te_sched_adjust_cost (rtx_insn *insn, rtx link, rtx_insn *dep, int * cost)
   return true;
 }
 
+/* Return true if vector element size is byte.  */
+bool
+marvell_whitney_vector_mode_qi (rtx_insn *insn)
+{
+  machine_mode mode;
+
+  if (GET_CODE (PATTERN (insn)) == SET)
+{
+  mode = GET_MODE (SET_DEST (PATTERN (insn)));
+  if (

[PATCH, CHKP, i386, PR target/65167] Avoid motion of bounds args during scheduling

2015-02-25 Thread Ilya Enkovich
Hi,

This patch adds support for bounds registers to the argument recognition
mechanism used by the scheduler.  Bootstrapped and tested on
x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2015-02-25  Ilya Enkovich  

PR target/65167
* gcc/config/i386/i386.c (ix86_function_arg_regno_p): Support
bounds registers.
(avoid_func_arg_motion): Add dependencies for BNDSTX insns.

gcc/testsuite/

2015-02-25  Ilya Enkovich  

PR target/65167
* gcc.target/i386/pr65167.c: New.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 71a5b22..acbe25f 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -6068,6 +6068,9 @@ ix86_function_arg_regno_p (int regno)
   int i;
   const int *parm_regs;
 
+  if (TARGET_MPX && BND_REGNO_P (regno))
+return true;
+
   if (!TARGET_64BIT)
 {
   if (TARGET_MACHO)
@@ -26846,6 +26849,16 @@ avoid_func_arg_motion (rtx_insn *first_arg, rtx_insn *insn)
   rtx set;
   rtx tmp;
 
+  /* Add anti dependencies for bounds stores.  */
+  if (INSN_P (insn)
+  && GET_CODE (PATTERN (insn)) == PARALLEL
+  && GET_CODE (XVECEXP (PATTERN (insn), 0, 0)) == UNSPEC
+  && XINT (XVECEXP (PATTERN (insn), 0, 0), 1) == UNSPEC_BNDSTX)
+{
+  add_dependence (first_arg, insn, REG_DEP_ANTI);
+  return;
+}
+
   set = single_set (insn);
   if (!set)
 return;
diff --git a/gcc/testsuite/gcc.target/i386/pr65167.c b/gcc/testsuite/gcc.target/i386/pr65167.c
new file mode 100644
index 000..35f3d6b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr65167.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target mpx } */
+/* { dg-options "-O -fschedule-insns -fcheck-pointer-bounds -mmpx" } */
+
+void bar(int *a, int *b, int *c, int *d, int *e, int *f);
+
+int foo (int *a, int *b, int *c, int *d, int *e, int *f)
+{
+  bar (a, b, c, d, e, f);
+  return *f;
+}


Re: [PATCH, 2/2][ARM]: New CPU support for Marvell Whitney

2015-02-25 Thread Xingxing Pan

Hi,

This patch expands the following RTL types.  It has been merged into the
latest code base.

(neon_logic): Expand to neon_logic_reg and neon_logic_imm.
(neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
(neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
(neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q.
(neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
(neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.

Is it OK for trunk?

--
Regards,
Xingxing
commit b0d0ebf6a2553bc7b6cc8f72fbaa0104938d0d41
Author: Xingxing Pan 
Date:   Wed Feb 25 14:46:25 2015 +0800

2015-02-25  Xingxing Pan  

* config/arm/types.md:
(neon_logic): Expand to neon_logic_reg and neon_logic_imm.
(neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
(neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
(neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q.
(neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
(neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.
* config/aarch64/aarch64-simd.md: Ditto.
* config/aarch64/aarch64.md: Ditto.
* config/aarch64/thunderx.md: Ditto.
* config/arm/arm.md: Ditto.
* config/arm/cortex-a15-neon.md: Ditto.
* config/arm/cortex-a17-neon.md: Ditto.
* config/arm/cortex-a57.md: Ditto.
* config/arm/cortex-a8-neon.md: Ditto.
* config/arm/cortex-a9-neon.md: Ditto.
* config/arm/marvell-whitney.md: Ditto.
* config/arm/neon.md: Ditto.
* config/arm/types.md: Ditto.
* config/arm/xgene1.md: Ditto.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 0557570..611d14c 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -115,7 +115,7 @@
  }
 }
   [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, neon_to_gp, neon_from_gp,\
+ neon_logic_reg, neon_to_gp_scalar, neon_from_gp_scalar,\
  mov_reg, neon_move")]
 )
 
@@ -147,7 +147,7 @@
 }
 }
   [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, multiple, multiple, multiple,\
+ neon_logic_reg, multiple, multiple, multiple,\
  neon_move")
(set_attr "length" "4,4,4,8,8,8,4")]
 )
@@ -218,7 +218,7 @@
   (match_operand:VQ 2 "vect_par_cnst_lo_half" "")))]
   "TARGET_SIMD && reload_completed"
   "umov\t%0, %1.d[0]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
(set_attr "length" "4")
   ])
 
@@ -229,7 +229,7 @@
   (match_operand:VQ 2 "vect_par_cnst_hi_half" "")))]
   "TARGET_SIMD && reload_completed"
   "umov\t%0, %1.d[1]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
(set_attr "length" "4")
   ])
 
@@ -239,7 +239,7 @@
 		(match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "orn\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "bic3"
@@ -248,7 +248,7 @@
 		(match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "bic\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "add3"
@@ -444,7 +444,7 @@
 		 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "and\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "ior3"
@@ -453,7 +453,7 @@
 		 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "orr\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "xor3"
@@ -462,7 +462,7 @@
 		 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "eor\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "one_cmpl2"
@@ -470,7 +470,7 @@
 (not:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w")))]
   "TARGET_SIMD"
   "not\t%0., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "aarch64_simd_vec_set"
@@ -496,7 +496,7 @@
 	gcc_unreachable ();
  }
   }
-  [(set_attr "type" "neon_from_gp, neon_ins, neon_load1_1reg")]
+  [(set_attr "type" "neon_from_gp_scalar, neon_ins, neon_load1_1reg")]
 )
 
 (define_insn "aarch64_simd_lshr"
@@ -816,7 +816,7 @@
 	gcc_unreachable ();
   }
   }
-  [(set_attr "type" "neon_from_gp, neon_ins_q")]
+  [(set_attr "type" "neon_from_gp_scalar, neon_ins_q")]
 )
 
 (define_expand "vec_setv2di"
@@ -2426,7 +2426,7 @@
 operands[2] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[2])));
 return "smov\\t%0, %1.[%2]";
   }
-  [(set_attr "type" "neon_to_gp")]
+  [(set_attr "type" "neon_to_gp_scalar

Re: [PATCH PR65161]

2015-02-25 Thread Yuri Rumyantsev
I modified the patch according to Alexander's comments.

Is it OK for trunk?

2015-02-25 15:38 GMT+03:00 Alexander Monakov :
>
>
> On Wed, 25 Feb 2015, Yuri Rumyantsev wrote:
>
>> Here is updated patch accordingly to Alexander comments.
>>
>> BTW another function using HID interface is do_reorder_for_imul and it
>> is called from ix86_sched_reorder.
>
> do_reorder_for_imul uses dependency list iteration macros, which use HDID, not
> HID, and HDID is populated during selective scheduling.  How exactly does
> do_reorder_for_imul access HID?
>
> Alexander


patch.3
Description: Binary data


Re: [PATCH, 2/2][ARM]: New CPU support for Marvell Whitney

2015-02-25 Thread Xingxing Pan

On 02/25/2015 09:42 PM, Xingxing Pan wrote:

Hi,

This patch expands the following RTL types.  It has been merged to
the latest code base.

 (neon_logic): Expand to neon_logic_reg and neon_logic_imm.
 (neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
 (neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
 (neon_from_gp_q): Expand to neon_from_gp_q and
neon_from_gp_scalar_q.
 (neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
 (neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.

Is it OK for trunk?



Fix typos in commit message.

--
Regards,
Xingxing
commit b0d0ebf6a2553bc7b6cc8f72fbaa0104938d0d41
Author: Xingxing Pan 
Date:   Wed Feb 25 14:46:25 2015 +0800

2015-02-25  Xingxing Pan  

* config/arm/types.md:
(neon_logic): Expand to neon_logic_reg and neon_logic_imm.
(neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
(neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
(neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q.
(neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
(neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.
* config/aarch64/aarch64-simd.md: Ditto.
* config/aarch64/aarch64.md: Ditto.
* config/aarch64/thunderx.md: Ditto.
* config/arm/arm.md: Ditto.
* config/arm/cortex-a15-neon.md: Ditto.
* config/arm/cortex-a17-neon.md: Ditto.
* config/arm/cortex-a57.md: Ditto.
* config/arm/cortex-a8-neon.md: Ditto.
* config/arm/cortex-a9-neon.md: Ditto.
* config/arm/marvell-whitney.md: Ditto.
* config/arm/neon.md: Ditto.
* config/arm/xgene1.md: Ditto.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 0557570..611d14c 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -115,7 +115,7 @@
  }
 }
   [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, neon_to_gp, neon_from_gp,\
+ neon_logic_reg, neon_to_gp_scalar, neon_from_gp_scalar,\
  mov_reg, neon_move")]
 )
 
@@ -147,7 +147,7 @@
 }
 }
   [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, multiple, multiple, multiple,\
+ neon_logic_reg, multiple, multiple, multiple,\
  neon_move")
(set_attr "length" "4,4,4,8,8,8,4")]
 )
@@ -218,7 +218,7 @@
   (match_operand:VQ 2 "vect_par_cnst_lo_half" "")))]
   "TARGET_SIMD && reload_completed"
   "umov\t%0, %1.d[0]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
(set_attr "length" "4")
   ])
 
@@ -229,7 +229,7 @@
   (match_operand:VQ 2 "vect_par_cnst_hi_half" "")))]
   "TARGET_SIMD && reload_completed"
   "umov\t%0, %1.d[1]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
(set_attr "length" "4")
   ])
 
@@ -239,7 +239,7 @@
 		(match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "orn\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "bic3"
@@ -248,7 +248,7 @@
 		(match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "bic\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "add3"
@@ -444,7 +444,7 @@
 		 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "and\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "ior3"
@@ -453,7 +453,7 @@
 		 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "orr\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "xor3"
@@ -462,7 +462,7 @@
 		 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "eor\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "one_cmpl2"
@@ -470,7 +470,7 @@
 (not:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w")))]
   "TARGET_SIMD"
   "not\t%0., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "aarch64_simd_vec_set"
@@ -496,7 +496,7 @@
 	gcc_unreachable ();
  }
   }
-  [(set_attr "type" "neon_from_gp, neon_ins, neon_load1_1reg")]
+  [(set_attr "type" "neon_from_gp_scalar, neon_ins, neon_load1_1reg")]
 )
 
 (define_insn "aarch64_simd_lshr"
@@ -816,7 +816,7 @@
 	gcc_unreachable ();
   }
   }
-  [(set_attr "type" "neon_from_gp, neon_ins_q")]
+  [(set_attr "type" "neon_from_gp_scalar, neon_ins_q")]
 )
 
 (define_expand "vec_setv2di"
@@ -2426,7 +2426,7 @@
 operands[2] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[2])));
 return "smov\\t%0, %1.[%2]";
   }
-  [(set_attr "type" "neon_to_g

Re: [PATCH PR65161]

2015-02-25 Thread Alexander Monakov


On Wed, 25 Feb 2015, Yuri Rumyantsev wrote:

> I modified patch accordingly to Alexander comments.
> 
> Is it OK for trunk?

If possible, please add a short comment explaining why a shortcut is
necessary, for example "HID is not populated during selective scheduling".

OK for trunk from selective scheduler maintainer's perspective.

Thanks.
Alexander


Re: [PATCH, 2/2][ARM]: New CPU support for Marvell Whitney

2015-02-25 Thread James Greenhalgh
On Wed, Feb 25, 2015 at 01:42:39PM +, Xingxing Pan wrote:
> Hi,
> 
> This patch expanding the following RTL types. And it has been merged to the 
> latest code base.
> 
>  (neon_logic): Expand to neon_logic_reg and neon_logic_imm.
>  (neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
>  (neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
>  (neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q.
>  (neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
>  (neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.
> 
> Is it OK for trunk?
> 
> -- 
> Regards,
> Xingxing

I've had a look through the AArch64 parts, and they look OK to me
(though only Marcus or Richard can approve them), I have one additional
comment.

>  ;; In this insn, operand 1 should be low, and operand 2 the high part of the
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 8f157ce..8be2ebf 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -828,7 +828,7 @@
>   }
>  }
>[(set_attr "type" "mov_reg,mov_imm,mov_imm,load1,load1,store1,store1,\
> - neon_from_gp,neon_from_gp, neon_dup")
> + neon_to_gp_scalar,neon_from_gp, neon_dup")
> (set_attr "simd" "*,*,yes,*,*,*,*,yes,yes,yes")]
>  )

Here you change neon_from_gp to neon_to_gp_scalar.

This looks like the correct thing to do, but would you mind pulling it out
to a separate patch, first changing neon_from_gp to neon_to_gp?

I'd just like to have the bug-fix separate from the bigger infrastructure
change.

Thanks,
James



[wwwdocs] List IPA-CP of alignments in changes.html

2015-02-25 Thread Martin Jambor
Hi,

I'd like to commit the following to gcc-5/changes.html so that IPA-CP
alignment propagation is listed among other new features.  OK?

Thanks,

Martin

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.82
diff -u -r1.82 changes.html
--- changes.html25 Feb 2015 06:02:21 -  1.82
+++ changes.html25 Feb 2015 14:19:31 -
@@ -57,6 +57,10 @@
 seen when building Chromium with link-time optimization.
  The symbol table and call-graph API was reworked to C++ and
 simplified.
+ The interprocedural propagation of constants now also propagates
+ alignments of pointer parameters.  This for example means that the
+ vectorizer often does not need to generate loop prologues and epilogues
+ to make up for potential misalignments.
 
 Link-time optimization improvements:
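
To illustrate the kind of code the new changes.html entry is about, here is a
minimal sketch in C.  The array names, the 32-byte alignment and the
function add_arrays are assumptions made up for this example; whether a
given GCC version actually drops the peeling prologue here depends on the
target and the options used.

```c
#include <stddef.h>

#define N 64

/* 32-byte alignment is an arbitrary choice for illustration.  */
static float a[N] __attribute__ ((aligned (32)));
static float b[N] __attribute__ ((aligned (32)));
static float c[N] __attribute__ ((aligned (32)));

/* A local function whose only caller passes aligned pointers.  If the
   alignment is propagated to DST, X and Y, the vectorizer may not need
   a prologue loop to reach an aligned address.  */
static void __attribute__ ((noinline))
add_arrays (float *dst, const float *x, const float *y, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = x[i] + y[i];
}

/* Fill the inputs, run the loop and return the last element.  */
float
run_example (void)
{
  for (size_t i = 0; i < N; i++)
    {
      a[i] = (float) i;
      b[i] = 2.0f * (float) i;
    }
  add_arrays (c, a, b, N);
  return c[N - 1];
}
```

Comparing the vectorizer dump (for example with -O3 -fdump-tree-vect-details)
with and without the aligned attributes is one way to see the difference.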
 


[patch, avr, committed]: Fix PR65196 (ice-checking)

2015-02-25 Thread Georg-Johann Lay

http://gcc.gnu.org/r220963
http://gcc.gnu.org/r220964
http://gcc.gnu.org/r220965

Applied this obvious fix for ICE with checking enabled (recog_memoized used 
with invalid rtx, e.g. jump_table_data).


Johann

PR target/65196
* config/avr/avr.c (avr_adjust_insn_length): Call recog_memoized
only with NONDEBUG_INSN_P.


Index: config/avr/avr.c
===
--- config/avr/avr.c(revision 220738)
+++ config/avr/avr.c(working copy)
@@ -7778,7 +7778,8 @@ avr_adjust_insn_length (rtx insn, int le
  It is easier to state this in an insn attribute "adjust_len" than
  to clutter up code here...  */

-  if (-1 == recog_memoized (insn))
+  if (!NONDEBUG_INSN_P (insn)
+  || -1 == recog_memoized (insn))
 {
   return len;
 }



Re: [PATCH PR65161]

2015-02-25 Thread Yuri Rumyantsev
Hi All,

Here is the updated patch to fix the ICE.

Is it OK for trunk?

2015-02-25  Yuri Rumyantsev  

PR target/65161
* config/i386/i386.c (ix86_sched_reorder): Skip instruction reordering
for selective scheduling.

gcc/testsuite/ChangeLog
* gcc.target/i386/pr65161.c: New test.

2015-02-25 17:04 GMT+03:00 Alexander Monakov :
>
>
> On Wed, 25 Feb 2015, Yuri Rumyantsev wrote:
>
>> I modified patch accordingly to Alexander comments.
>>
>> Is it OK for trunk?
>
> If possible, please add a short comment explaining why a shortcut is
> necessary, for example "HID is not populated during selective scheduling".
>
> OK for trunk from selective scheduler maintainer's perspective.
>
> Thanks.
> Alexander


patch.3
Description: Binary data


[PATCH, CHKP, i386, PR target/65184] Fix pass_by_reference for MS ABI for bounds

2015-02-25 Thread Ilya Enkovich
Hi,

Currently ix86_pass_by_reference may return 1 for bounds if the MS ABI is used.
This patch explicitly states that bounds are never passed by reference.
Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?


Thanks,
Ilya
--
gcc/

2015-02-25  Ilya Enkovich  

PR target/65184
* gcc/config/i386/i386.c (ix86_pass_by_reference): Bounds
are never passed by reference.

gcc/testsuite/

2015-02-25  Ilya Enkovich  

PR target/65184
* gcc.target/i386/pr65184.c: New.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 71a5b22..28242d7 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -7859,6 +7859,11 @@ ix86_pass_by_reference (cumulative_args_t cum_v, machine_mode mode,
 {
   CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
 
+  /* Bounds are never passed by reference.  */
+  if ((type && POINTER_BOUNDS_TYPE_P (type))
+  || POINTER_BOUNDS_MODE_P (mode))
+return false;
+
   /* See Windows x64 Software Convention.  */
   if (TARGET_64BIT && (cum ? cum->call_abi : ix86_abi) == MS_ABI)
 {
diff --git a/gcc/testsuite/gcc.target/i386/pr65184.c b/gcc/testsuite/gcc.target/i386/pr65184.c
new file mode 100644
index 000..0355f29
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr65184.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target mpx } */
+/* { dg-options "-O2 -mabi=ms -fcheck-pointer-bounds -mmpx" } */
+
+void
+foo (int *a)
+{
+  if (a[0] != a[1] * 2333)
+__builtin_abort ();
+}
+
+void
+bar (int *a)
+{
+  if (a[0] != a[1] * 2333)
+__builtin_abort ();
+}


Re: [PATCH] gcc/reload.c: Initialize several arrays before use them in find_reloads()

2015-02-25 Thread Jeff Law

On 02/24/15 22:47, augustine.sterl...@gmail.com wrote:

On Tue, Feb 24, 2015 at 4:45 PM, Max Filippov  wrote:


Sterling,

I was referring to Jeff's patch; are you saying that his patch is not the
proper fix?


No, I was thinking of Chen's patch. Jeff's patch is the right one.

Jeff, your patch is OK for xtensa. Do you mind checking it in?

Done.
jeff


[Ping v2] [PATCH PR64820] Fix ASan UAR detection fails on 32-bit targets if SSP is enabled.

2015-02-25 Thread Maxim Ostapenko

On 02/16/2015 10:58 AM, Maxim Ostapenko wrote:

Hi,

While testing I noticed that compiling with both -fsanitize=address and
-fstack-protector for 32-bit architectures and running with
ASAN_OPTIONS=detect_stack_use_after_return=1 makes libsanitizer fail with:

  ==7299==AddressSanitizer CHECK failed:
/home/max/workspace/downloads/gcc/libsanitizer/asan/asan_poisoning.cc:25
"((AddrIsAlignedByGranularity(addr + size))) != (0)" (0x0, 0x0)
 #0 0xf72d8afc in AsanCheckFailed
/home/max/workspace/downloads/gcc/libsanitizer/asan/asan_rtl.cc:68
 #1 0xf72dda89 in __sanitizer::CheckFailed(char const*, int, char
const*, unsigned long long, unsigned long long)
/home/max/workspace/downloads/gcc/libsanitizer/sanitizer_common/sanitizer_common.cc:72 



This happens because SSP inserts a stack guard into the function, which
causes asan_emit_stack_protection to compute the wrong size parameter
for asan_stack_malloc.

This tiny patch resolves the issue.

Regtested with make -j12 -k check
RUNTESTFLAGS='--target_board=unix\{-m32,-m64\}' on 
x86_64-unknown-linux-gnu.


Bootstrapped, ASan-bootstrapped on x86_64-unknown-linux-gnu.

Ok to commit?

-Maxim




Ping.

-Maxim
gcc/ChangeLog:

2015-02-09  Max Ostapenko  

	PR sanitizer/64820
* cfgexpand.c (align_base): New function.
(alloc_stack_frame_space): Call it.
(expand_stack_vars): Align prev_frame to be sure
data->asan_vec elements aligned properly.

gcc/testsuite/ChangeLog:

2015-02-09  Max Ostapenko  

	PR sanitizer/64820
* c-c++-common/asan/pr64820.c: New test.

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 7dfe1f6..7845a17 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -282,6 +282,15 @@ align_local_variable (tree decl)
   return align / BITS_PER_UNIT;
 }
 
+/* Align given offset BASE with ALIGN.  Truncate up if ALIGN_UP is true,
+   down otherwise.  Return truncated BASE value.  */
+
+static inline unsigned HOST_WIDE_INT
+align_base (HOST_WIDE_INT base, unsigned HOST_WIDE_INT align, bool align_up)
+{
+  return align_up ? (base + align - 1) & -align : base & -align;
+}
+
 /* Allocate SIZE bytes at byte alignment ALIGN from the stack frame.
Return the frame offset.  */
 
@@ -293,17 +302,15 @@ alloc_stack_frame_space (HOST_WIDE_INT size, unsigned HOST_WIDE_INT align)
   new_frame_offset = frame_offset;
   if (FRAME_GROWS_DOWNWARD)
 {
-  new_frame_offset -= size + frame_phase;
-  new_frame_offset &= -align;
-  new_frame_offset += frame_phase;
+  new_frame_offset
+	= align_base (frame_offset - frame_phase - size,
+		  align, false) + frame_phase;
   offset = new_frame_offset;
 }
   else
 {
-  new_frame_offset -= frame_phase;
-  new_frame_offset += align - 1;
-  new_frame_offset &= -align;
-  new_frame_offset += frame_phase;
+  new_frame_offset
+	= align_base (frame_offset - frame_phase, align, true) + frame_phase;
   offset = new_frame_offset;
   new_frame_offset += size;
 }
@@ -1031,13 +1038,16 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
 	  base = virtual_stack_vars_rtx;
 	  if ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK && pred)
 	{
-	  HOST_WIDE_INT prev_offset = frame_offset;
+	  HOST_WIDE_INT prev_offset
+		= align_base (frame_offset,
+			  MAX (alignb, ASAN_RED_ZONE_SIZE),
+			  FRAME_GROWS_DOWNWARD);
 	  tree repr_decl = NULL_TREE;
-
 	  offset
 		= alloc_stack_frame_space (stack_vars[i].size
 	   + ASAN_RED_ZONE_SIZE,
 	   MAX (alignb, ASAN_RED_ZONE_SIZE));
+
 	  data->asan_vec.safe_push (prev_offset);
 	  data->asan_vec.safe_push (offset + stack_vars[i].size);
 	  /* Find best representative of the partition.
diff --git a/gcc/testsuite/c-c++-common/asan/pr64820.c b/gcc/testsuite/c-c++-common/asan/pr64820.c
new file mode 100644
index 000..885a662
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/pr64820.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fstack_protector } */
+/* { dg-options "-fstack-protector-strong" } */
+/* { dg-set-target-env-var ASAN_OPTIONS "detect_stack_use_after_return=1" } */
+/* { dg-shouldfail "asan" } */
+
+__attribute__((noinline))
+char *Ident(char *x) {
+  return x;
+}
+
+__attribute__((noinline))
+char *Func1() {
+  char local[1 << 12];
+  return Ident(local);
+}
+
+__attribute__((noinline))
+void Func2(char *x) {
+  *x = 1;
+}
+int main(int argc, char **argv) {
+  Func2(Func1());
+  return 0;
+}
+
+/* { dg-output "AddressSanitizer: stack-use-after-return on address 0x\[0-9a-f\]+\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "WRITE of size 1 at .* thread T0.*" } */
+/* { dg-output "#0.*(Func2)?.*pr64820.(c:21)?.*" } */
+/* { dg-output "is located in stack of thread T0 at offset.*" } */
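
The rounding done by align_base in the patch above can be tried outside the
compiler with the following sketch.  It assumes ALIGN is a power of two and
uses int64_t/uint64_t as stand-ins for HOST_WIDE_INT; the names
align_base_sketch and alloc_down_sketch are made up for this example and are
not part of GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round BASE to a multiple of ALIGN: up if ALIGN_UP is true, down
   otherwise.  ALIGN is assumed to be a power of two.  */
int64_t
align_base_sketch (int64_t base, uint64_t align, bool align_up)
{
  assert (align != 0 && (align & (align - 1)) == 0);

  uint64_t mask = ~(align - 1);	/* Same bit pattern as -align.  */
  uint64_t ubase = (uint64_t) base;

  if (align_up)
    ubase += align - 1;

  return (int64_t) (ubase & mask);
}

/* Frame offset for SIZE bytes when the frame grows downward, following
   the FRAME_GROWS_DOWNWARD branch of alloc_stack_frame_space.  */
int64_t
alloc_down_sketch (int64_t frame_offset, int64_t frame_phase,
		   int64_t size, uint64_t align)
{
  return align_base_sketch (frame_offset - frame_phase - size,
			    align, false) + frame_phase;
}
```

With a downward-growing frame the offsets are negative, so rounding down
moves the offset further from zero; this is why prev_offset in
expand_stack_vars is aligned in the direction given by FRAME_GROWS_DOWNWARD.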


Re: [PATCH, CHKP, i386, PR target/65167] Avoid motion of bounds args during scheduling

2015-02-25 Thread Uros Bizjak
Hello!

> 2015-02-25  Ilya Enkovich  
>
> PR target/65167
> * gcc/config/i386/i386.c (ix86_function_arg_regno_p): Support
> bounds registers.
> (avoid_func_arg_motion): Add dependencies for BNDSTX insns.
>
> gcc/testsuite/
>
> 2015-02-25  Ilya Enkovich  
>
> PR target/65167
> * gcc.target/i386/pr65167.c: New.

OK for mainline with a small change below.

+  if (TARGET_MPX && BND_REGNO_P (regno))
+return true;

The check for TARGET_MPX is not needed.

Uros.


Re: [wwwdocs] List IPA-CP of alignments in changes.html

2015-02-25 Thread Gerald Pfeifer
On Wed, 25 Feb 2015, Martin Jambor wrote:
> I'd like to commit the following to gcc-5/changes.html so that IPA-CP
> alignment propagation is listed among other new features.  OK?

Looks good, thank you!  (Just watch out for overly long lines.)

Gerald


Re: [PATCH, CHKP, i386, PR target/65167] Avoid motion of bounds args during scheduling

2015-02-25 Thread Uros Bizjak
On Wed, Feb 25, 2015 at 3:47 PM, Uros Bizjak  wrote:
> Hello!
>
>> 2015-02-25  Ilya Enkovich  
>>
>> PR target/65167
>> * gcc/config/i386/i386.c (ix86_function_arg_regno_p): Support
>> bounds registers.
>> (avoid_func_arg_motion): Add dependencies for BNDSTX insns.
>>
>> gcc/testsuite/
>>
>> 2015-02-25  Ilya Enkovich  
>>
>> PR target/65167
>> * gcc.target/i386/pr65167.c: New.
>
> OK for mainline with a small change below.

Ah, I missed the explanation w.r.t __builtin_apply_args in the PR.

The patch is OK as it is.

Thanks,
Uros.


[patch]: Fix PR target/64212

2015-02-25 Thread Kai Tietz
Hi,

The issue here is specific to PE-COFF targets: dllimported symbols have
a non-interposable part and an interposable part.
The dllimport address itself is not interposable, but its stub
function/variable is.

So the binds_to_local hook has to return false for dllimport;
nevertheless, for clones we want them to become interposable.
Therefore, as suggested by Honza, we need to make sure that
DECL_DLLIMPORT_P() is explicitly set to 0 in symtab.

ChangeLog

2015-02-25  Kai Tietz  

PR target/64212
* symtab.c (symtab_node::make_decl_local): Set DECL_DLLIMPORT_P
explicitly to 0.
(symtab_node::noninterposable_alias): Likewise.

Tested for x86_64-w64-mingw32, i686-w64-mingw32, and
x86_64-unknown-linux-gnu.  OK to apply?

Regards,
Kai

Index: symtab.c
===
--- symtab.c(Revision 220969)
+++ symtab.c(Arbeitskopie)
@@ -1165,6 +1165,7 @@ symtab_node::make_decl_local (void)
   DECL_VISIBILITY_SPECIFIED (decl) = 0;
   DECL_VISIBILITY (decl) = VISIBILITY_DEFAULT;
   TREE_PUBLIC (decl) = 0;
+  DECL_DLLIMPORT_P (decl) = 0;
   if (!DECL_RTL_SET_P (decl))
 return;

@@ -1534,7 +1535,6 @@ symtab_node::noninterposable_alias (symtab_node *n
  != flags_from_decl_or_type (fn->decl))
   || DECL_ATTRIBUTES (node->decl) != DECL_ATTRIBUTES (fn->decl))
 return false;
-
   *(symtab_node **)data = node;
   return true;
 }
@@ -1566,6 +1566,7 @@ symtab_node::noninterposable_alias (void)

   /* Otherwise create a new one.  */
   new_decl = copy_node (node->decl);
+  DECL_DLLIMPORT_P (new_decl) = 0;
   DECL_NAME (new_decl) = clone_function_name (node->decl, "localalias");
   if (TREE_CODE (new_decl) == FUNCTION_DECL)
 DECL_STRUCT_FUNCTION (new_decl) = NULL;


Re: [PATCH] gcc/reload.c: Initialize several arrays before use them in find_reloads()

2015-02-25 Thread augustine.sterl...@gmail.com
On Wed, Feb 25, 2015 at 6:39 AM, Jeff Law  wrote:
>
> Done.
> jeff

Thanks!


Re: [patch] Fix ICE on unaligned record field

2015-02-25 Thread Martin Jambor
Hi Eric and Richard,

On Tue, Jan 06, 2015 at 06:07:12PM +0100, Eric Botcazou wrote:
> Martin,
> 
> > I suppose that could be done by something like the following, which I
> > have tested only very mildly so far, in particular I have not double
> > checked that get_inner_reference is cfun-agnostic.
> 
> The patch introduces no regressions on x86-64/Linux and makes the testcase 
> (gnat.dg/specs/pack12.ads attached to the first message) pass.
> 
> Do you plan to install it (along with the testcase)?
> 

for various reasons I was not able to do it earlier, but today I have
re-bootstrapped the following (the only change is the added testcase)
on x86_64-linux and it passes OK.  Should I commit it to trunk then?

Thanks,

Martin


2015-02-25  Martin Jambor  
Eric Botcazou  

gcc/
* tree-sra.c (ipa_sra_check_caller_data): New type.
(has_caller_p): Removed.
(ipa_sra_check_caller): New function.
(ipa_sra_preliminary_function_checks): Use it.

gcc/testsuite/
* gnat.dg/specs/pack12.ads: New test.


diff --git a/gcc/testsuite/gnat.dg/specs/pack12.ads b/gcc/testsuite/gnat.dg/specs/pack12.ads
new file mode 100644
index 000..c5e962c
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/specs/pack12.ads
@@ -0,0 +1,21 @@
+-- { dg-do compile }
+-- { dg-options "-O2" }
+
+package Pack12 is
+
+  type Rec1 is record
+B : Boolean;
+N : Natural;
+  end record;
+
+  type Rec2 is record
+B : Boolean;
+R : Rec1;
+  end record;
+  pragma Pack (Rec2);
+
+  type Rec3 is tagged record
+R : Rec2;
+  end record;
+
+end Pack12;
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 023b817..a6cddaf 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -5009,13 +5009,54 @@ modify_function (struct cgraph_node *node, ipa_parm_adjustment_vec adjustments)
   return cfg_changed;
 }
 
-/* If NODE has a caller, return true.  */
+/* Means of communication between ipa_sra_check_caller and
+   ipa_sra_preliminary_function_checks.  */
+
+struct ipa_sra_check_caller_data
+{
+  bool has_callers;
+  bool bad_arg_alignment;
+};
+
+/* If NODE has a caller, mark that fact in DATA which is pointer to
+   ipa_sra_check_caller_data.  Also check all aggregate arguments in all known
+   calls if they are unit aligned and if not, set the appropriate flag in DATA
+   too. */
 
 static bool
-has_caller_p (struct cgraph_node *node, void *data ATTRIBUTE_UNUSED)
+ipa_sra_check_caller (struct cgraph_node *node, void *data)
 {
-  if (node->callers)
-return true;
+  if (!node->callers)
+return false;
+
+  struct ipa_sra_check_caller_data *iscc;
+  iscc = (struct ipa_sra_check_caller_data *) data;
+  iscc->has_callers = true;
+
+  for (cgraph_edge *cs = node->callers; cs; cs = cs->next_caller)
+{
+  gimple call_stmt = cs->call_stmt;
+  unsigned count = gimple_call_num_args (call_stmt);
+  for (unsigned i = 0; i < count; i++)
+   {
+ tree arg = gimple_call_arg (call_stmt, i);
+ if (is_gimple_reg (arg))
+ continue;
+
+ tree offset;
+ HOST_WIDE_INT bitsize, bitpos;
+ machine_mode mode;
+ int unsignedp, volatilep = 0;
+ get_inner_reference (arg, &bitsize, &bitpos, &offset, &mode,
+  &unsignedp, &volatilep, false);
+ if (bitpos % BITS_PER_UNIT)
+   {
+ iscc->bad_arg_alignment = true;
+ return true;
+   }
+   }
+}
+
   return false;
 }
 
@@ -5070,14 +5111,6 @@ ipa_sra_preliminary_function_checks (struct cgraph_node *node)
   return false;
 }
 
-  if (!node->call_for_symbol_thunks_and_aliases (has_caller_p, NULL, true))
-{
-  if (dump_file)
-   fprintf (dump_file,
-"Function has no callers in this compilation unit.\n");
-  return false;
-}
-
   if (cfun->stdarg)
 {
   if (dump_file)
@@ -5096,6 +5129,25 @@ ipa_sra_preliminary_function_checks (struct cgraph_node *node)
   return false;
 }
 
+  struct ipa_sra_check_caller_data iscc;
+  memset (&iscc, 0, sizeof(iscc));
+  node->call_for_symbol_thunks_and_aliases (ipa_sra_check_caller, &iscc, true);
+  if (!iscc.has_callers)
+{
+  if (dump_file)
+   fprintf (dump_file,
+"Function has no callers in this compilation unit.\n");
+  return false;
+}
+
+  if (iscc.bad_arg_alignment)
+{
+  if (dump_file)
+   fprintf (dump_file,
+"A function call has an argument with non-unit alignemnt.\n");
+  return false;
+}
+
   return true;
 }
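
The shape of the new check (one walk over the callers that records both
"has callers" and "some argument is not byte aligned") can be sketched in
plain C as follows.  struct call_site, check_callers and candidate_p are
hypothetical stand-ins for cgraph edges and the GCC callbacks, and
BITS_PER_UNIT is assumed to be 8.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_UNIT 8		/* Assumption: 8-bit storage units.  */

/* Hypothetical stand-in for a call site: the bit positions of its
   aggregate arguments within their enclosing objects.  */
struct call_site
{
  const long *arg_bitpos;
  size_t nargs;
  const struct call_site *next;
};

/* Flags filled in by the walk, as in ipa_sra_check_caller_data.  */
struct check_caller_data
{
  bool has_callers;
  bool bad_arg_alignment;
};

/* Record whether there are callers and whether any argument starts at a
   bit position that is not a multiple of BITS_PER_UNIT.  Return true to
   stop the walk early, false to let it continue.  */
bool
check_callers (const struct call_site *callers,
	       struct check_caller_data *data)
{
  if (!callers)
    return false;

  data->has_callers = true;

  for (const struct call_site *cs = callers; cs; cs = cs->next)
    for (size_t i = 0; i < cs->nargs; i++)
      if (cs->arg_bitpos[i] % BITS_PER_UNIT)
	{
	  data->bad_arg_alignment = true;
	  return true;
	}

  return false;
}

/* A function stays a candidate only if it has callers and none of them
   passes an argument that is not byte aligned.  */
bool
candidate_p (const struct call_site *callers)
{
  struct check_caller_data data;
  memset (&data, 0, sizeof (data));
  check_callers (callers, &data);
  return data.has_callers && !data.bad_arg_alignment;
}
```

Keeping both flags in one structure lets a single traversal answer both
questions, which is why the separate has_caller_p walk could be removed.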
 


Re: [Ada] convert GNAT doc to sphinx

2015-02-25 Thread Arnaud Charlet
> > I've added a -I gcc/gcc/ada/doc/gnat_ugn there, that's as far as my
> > knowledge goes for this script so I hope this is enough.
> 
> Well, since by default the find command deletes all files except those
> known to be documentation sources, you need at least to change it not to
> delete those particular files (and the actual RST sources of these
> manuals).

OK, I've applied the patch below, hopefully it should do the job.

> > I can help in transitioning the script to sphinx though, that would seem
> > more interesting/productive at this stage.
> 
> See the existing code to handle Sphinx documentation for the JIT.

That's a good reference. We'll need a more recent version of sphinx than
1.0 though (at least 1.2.2, or even better, 1.3b2 which is the version we use
at AdaCore).

Arno
--
--- update_web_docs_svn (revision 220961)
+++ update_web_docs_svn (working copy)
@@ -107,10 +107,8 @@
   svn -q export $SVNROOT/tags/$RELEASE gcc
 fi
 
-# Remove all unwanted files.  This is needed (a) to build the Ada
-# generator programs with the installed library, not the new one and
-# (b) to avoid packaging all the sources instead of only documentation
-# sources.
+# Remove all unwanted files.  This is needed to avoid packaging all the
+# sources instead of only documentation sources.
 # Note that we have to preserve gcc/jit/docs since the jit docs are
 # not .texi files (Makefile, .rst and .png), and the jit docs use
 # include directives to pull in content from jit/jit-common.h and
@@ -120,6 +118,7 @@
   -o -path gcc/gcc/doc/include/texinfo.tex \
   -o -path gcc/gcc/BASE-VER \
   -o -path gcc/gcc/DEV-PHASE \
+  -o -path "gcc/gcc/ada/doc/gnat_ugn/*.png" \
   -o -path "gcc/gcc/jit/docs/*" \
   -o -path "gcc/gcc/jit/jit-common.h" \
   -o -path "gcc/gcc/jit/notes.txt" \



C++ PATCH for useless NOTE_INSN_DELETED_DEBUG_LABELs

2015-02-25 Thread Jason Merrill
This isn't the main problem in debug/58315, but when looking at it I saw 
a bunch of useless


# DEBUG  => NULL

lines, which turned out to be deleted debug label notes for the
cdtor_label created in start_preparsed_function.  Since this is an
internal, unnamed label, we shouldn't have debug information about it,
but we were forgetting to mark it as artificial.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit b7d4cc82f45a906012f13d084af09840c1e37bc0
Author: Jason Merrill 
Date:   Wed Feb 25 08:53:43 2015 -0500

	PR debug/58315
	* decl.c (start_preparsed_function): Use create_artificial_label
	for cdtor_label.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 67c5ae7..83e060b 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -13721,9 +13721,7 @@ start_preparsed_function (tree decl1, tree attrs, int flags)
   || (DECL_CONSTRUCTOR_P (decl1)
 	  && targetm.cxx.cdtor_returns_this ()))
 {
-  cdtor_label = build_decl (input_location, 
-LABEL_DECL, NULL_TREE, void_type_node);
-  DECL_CONTEXT (cdtor_label) = current_function_decl;
+  cdtor_label = create_artificial_label (input_location);
 }
 
   start_fname_decls ();
diff --git a/gcc/testsuite/g++.dg/tree-ssa/deleted-label1.C b/gcc/testsuite/g++.dg/tree-ssa/deleted-label1.C
new file mode 100644
index 000..11c06be
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/deleted-label1.C
@@ -0,0 +1,19 @@
+// PR debug/58315
+// { dg-options "-O -g -fdump-tree-einline" }
+// { dg-final { scan-tree-dump-not "DEBUG " "einline" } }
+// { dg-final { cleanup-tree-dump "einline" } }
+
+// We used to emit useless NOTE_INSN_DELETED_DEBUG_LABELs for the
+// artificial cdtor_label.
+
+struct A
+{
+  ~A() {}
+};
+
+struct B: A {};
+
+int main()
+{
+  A a;
+}


PING Re: [patch] PR debug/46102 Disable -feliminate-dwarf2-dups when reading a PCH

2015-02-25 Thread Aldy Hernandez

On 02/19/2015 10:41 AM, Jakub Jelinek wrote:

On Thu, Feb 19, 2015 at 10:33:20AM -0800, Aldy Hernandez wrote:

Well, any PCH file we generate will have some sort of early DIE in it (at
the very least the compilation unit DIE) and we will read these in at PCH
read-in time, obliterating whatever was already there.  But most
importantly, with the attached patch we will not use these
DW_TAG_GNU_[BE]INCL* DIEs, since the reader will avoid reading the pch file.
So, I don't think erroring out at output time is necessary.

How does this look?


Looks reasonable to me, but I'd prefer to defer this to Jason as debug
maintainer.


commit d90a408ad21aa0868cc13de24ea38e210ef78a68
Author: Aldy Hernandez 
Date:   Thu Feb 19 07:35:59 2015 -0800

PR debug/46102
* c-pch.c (c_common_valid_pch): Mark PCH file with
-feliminate-dwarf2-dups as invalid.

diff --git a/gcc/c-family/c-pch.c b/gcc/c-family/c-pch.c
index 0ede92a..55163c9 100644
--- a/gcc/c-family/c-pch.c
+++ b/gcc/c-family/c-pch.c
@@ -224,6 +224,19 @@ c_common_valid_pch (cpp_reader *pfile, const char *name, int fd)
const char *pch_ident;
struct c_pch_validity v;

+  /* We may have outputted a few DIEs corresponding to
+ DW_TAG_GNU_[BE]INCL.  Reading the compiler state later will read
+ in these DIEs, and obliterate any DW_TAG_GNU_[BE]INCL the reader
+ may have generated itself.  Do not read the PCH if this may
+ happen.  */
+  if (flag_eliminate_dwarf2_dups)
+{
+  if (cpp_get_options (pfile)->warn_invalid_pch)
+   cpp_error (pfile, CPP_DL_WARNING,
+  "%s: cannot be used with -feliminate-dwarf2-dups", name);
+  return 2;
+}
+
/* Perform a quick test of whether this is a valid
   precompiled header for the current language.  */




Jakub





[patch, avr-tiny]: Fix handling of constant addresses.

2015-02-25 Thread Georg-Johann Lay
The current avr-gcc ICEs in avr.c::tiny_valid_direct_memory_access_range
because XEXP (op, 0) is used on operands that are not MEM_P (e.g. REG or
SUBREG).


If op is MEM_P, then INTVAL might be used on RTXes which are not CONST_INT,
e.g. CONST.


Anyway, using such functions in insn conditions is not the right approach to 
get valid addresses.  The right place is targetm.legitimate_address_p.  Taking 
away move insns from reload by means of such conditions is not robust; reload 
knows how to make addresses legitimate if told so...
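
The range test that the patch moves into avr_legitimate_address_p amounts to
the following, shown here as a standalone sketch.  The function name and the
macro AVRTINY_DIRECT_LIMIT are made up for this example; the limit 0xc0 and
the inclusive comparison follow the IN_RANGE call in the patch below.

```c
#include <stdbool.h>

/* One past the highest address reachable by IN / OUT (0..0x3f) and
   LDS / STS (0x40..0xbf) on avrtiny.  */
#define AVRTINY_DIRECT_LIMIT 0xc0

/* Return true if an access of MODE_SIZE bytes at constant address ADDR
   lies entirely within 0..0xbf.  */
bool
avrtiny_const_address_ok (long addr, int mode_size)
{
  return addr >= 0 && addr <= AVRTINY_DIRECT_LIMIT - mode_size;
}
```

Subtracting the access size from the limit makes sure that the last byte
of a multi-byte access is still inside the directly addressable range.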


Ok for trunk?

Johann


PR target/65192
* config/avr/avr.c (tiny_valid_direct_memory_access_range): Remove.
(avr_legitimate_address_p) :
Refuse any constant address not in 0..0xbf.
* config/avr/avr-protos.h: Same.
* config/avr/avr.md (*mov, *movsf): Remove
tiny_valid_direct_memory_access_range from insn conditions.
(mov): Don't special-case expansion of avrtiny addresses.
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 220964)
+++ config/avr/avr.c	(working copy)
@@ -1823,6 +1823,16 @@ avr_legitimate_address_p (machine_mode m
   break;
 }
 
+  if (AVR_TINY
+  && CONSTANT_ADDRESS_P (x))
+{
+  /* avrtiny's load / store instructions only cover addresses 0..0xbf:
+ IN / OUT range is 0..0x3f and LDS / STS can access 0x40..0xbf.  */
+
+  ok = (CONST_INT_P (x)
+&& IN_RANGE (INTVAL (x), 0, 0xc0 - GET_MODE_SIZE (mode)));
+}
+
   if (avr_log.legitimate_address_p)
 {
   avr_edump ("\n%?: ret=%d, mode=%m strict=%d "
@@ -3210,37 +3220,6 @@ avr_out_xload (rtx_insn *insn ATTRIBUTE_
 }
 
 
-/* AVRTC-579
-   If OP is a symbol or a constant expression with value > 0xbf
-   return FALSE, otherwise TRUE.
-   This check is used to avoid LDS / STS instruction with invalid memory
-   access range (valid range 0x40..0xbf).  For I/O operand range 0x0..0x3f,
-   IN / OUT instruction will be generated.  */
-
-bool
-tiny_valid_direct_memory_access_range (rtx op, machine_mode mode)
-{
-  rtx x;
-
-  if (!AVR_TINY)
-return true;
-
-  x = XEXP (op,0);
-
-  if (MEM_P (op) && x && GET_CODE (x) == SYMBOL_REF)
-{
-  return false;
-}
-
-  if (MEM_P (op) && x && (CONSTANT_ADDRESS_P (x))
-  && !(IN_RANGE (INTVAL (x), 0, 0xC0 - GET_MODE_SIZE (mode
-{
-  return false;
-}
-
-  return true;
-}
-
 const char*
 output_movqi (rtx_insn *insn, rtx operands[], int *plen)
 {
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 220854)
+++ config/avr/avr.md	(working copy)
@@ -671,32 +671,6 @@ (define_expand "mov"
 emit_insn (gen_load_libgcc (dest, src));
 DONE;
   }
-
-// AVRTC-579
-// If the source operand expression is out of range for LDS instruction
-// copy source operand expression to register.
-// For tiny core, LDS instruction's memory access range limited to 0x40..0xbf.
-
-if (!tiny_valid_direct_memory_access_range (src, mode))
-  {
-rtx srcx = XEXP (src, 0);
-operands[1] = src = replace_equiv_address (src, copy_to_mode_reg (GET_MODE (srcx), srcx));
-emit_move_insn (dest, src);
-DONE;
-  }
-
-// AVRTC-579
-// If the destination operand expression is out of range for STS instruction
-// copy destination operand expression to register.
-// For tiny core, STS instruction's memory access range limited to 0x40..0xbf.
-
-if (!tiny_valid_direct_memory_access_range (dest, mode))
-  {
-rtx destx = XEXP (dest, 0);
-operands[0] = dest = replace_equiv_address (dest, copy_to_mode_reg (GET_MODE (destx), destx));
-emit_move_insn (dest, src);
-DONE;
-  }
   })
 
 ;;
@@ -713,13 +687,8 @@ (define_expand "mov"
 (define_insn "mov_insn"
   [(set (match_operand:ALL1 0 "nonimmediate_operand" "=r,d,Qm   ,r ,q,r,*r")
 (match_operand:ALL1 1 "nox_general_operand"   "r Y00,n Ynn,r Y00,Qm,r,q,i"))]
-  "(register_operand (operands[0], mode)
-|| reg_or_0_operand (operands[1], mode))
-   /* Skip if operands are out of lds/sts memory access range(0x40..0xbf)
-  though access range is checked during define_expand, it is required
-  here to avoid merging RTXes during combine pass.  */
-   && tiny_valid_direct_memory_access_range (operands[0], QImode)
-   && tiny_valid_direct_memory_access_range (operands[1], QImode)"
+  "register_operand (operands[0], mode)
+|| reg_or_0_operand (operands[1], mode)"
   {
 return output_movqi (insn, operands, NULL);
   }
@@ -812,13 +781,8 @@ (define_insn "*reload_in"
 (define_insn "*mov"
   [(set (match_operand:ALL2 0 "nonimmediate_operand" "=r,r  ,r,m,d,*r,q,r")
 (match_operand:ALL2 1 "nox_general_operand"   "r,Y00,m,r Y00,i,i ,r,q"))]
-  "(register_operand (operands[0

[hsa] Do not ICE when regallocating function with zero pseudoregisters

2015-02-25 Thread Martin Jambor
Hi,

the new HSA register allocator ICEs when it tries to resize a vector
to zero length, which is something that our vectors do not take well,
when it processes a simple function which just returns a constant and
does not actually use any registers.
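
The shape of the fix is a plain early exit before any zero-length allocation
is attempted.  A minimal sketch of that pattern, assuming a hypothetical
helper built on calloc rather than GCC's vec (setup_reg_table and its
parameters are made up for this example):

```c
#include <stdlib.h>

/* Hypothetical stand-in for setting up a per-register table.  Returns the
   number of slots that were set up, or 0 if there was nothing to do (or the
   allocation failed).  */
size_t
setup_reg_table (int **table, size_t reg_count)
{
  *table = NULL;

  /* Nothing to allocate: leave before asking for a zero-length buffer.  */
  if (reg_count == 0)
    return 0;

  *table = calloc (reg_count, sizeof **table);
  return *table ? reg_count : 0;
}
```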

Fixed thusly, committed to the hsa branch after rudimentary testing.

Thanks,

Martin


2015-02-25  Martin Jambor  

* hsa-regalloc.c (regalloc): Bail out if there are no registers.

diff --git a/gcc/hsa-regalloc.c b/gcc/hsa-regalloc.c
index 94c88dc..8f0b4bb 100644
--- a/gcc/hsa-regalloc.c
+++ b/gcc/hsa-regalloc.c
@@ -745,6 +745,10 @@ regalloc (void)
   basic_block bb;
   reg_class_desc classes[4];
 
+  /* If there are no registers used in the function, exit right away. */
+  if (hsa_cfun.reg_count == 0)
+return;
+
   memset (classes, 0, sizeof (classes));
   classes[0].next_avail = 0;
   classes[0].max_num = 7;


Re: [Ada] convert GNAT doc to sphinx

2015-02-25 Thread Joseph Myers
On Wed, 25 Feb 2015, Arnaud Charlet wrote:

> > See the existing code to handle Sphinx documentation for the JIT.
> 
> That's a good reference. We'll need a more recent version of sphinx than
> 1.0 though (at least 1.2.2, or even better, 1.3b2 which is the version we use
> at AdaCore).

1.0 is apparently the most recent version readily available for RHEL 6 
which gcc.gnu.org runs.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Ada] convert GNAT doc to sphinx

2015-02-25 Thread Joseph Myers
On Wed, 25 Feb 2015, Arnaud Charlet wrote:

> > So you need to update the
> > find command therein not to remove anything that's part of the sources for
> > this documentation, and possibly update -I options for building manuals as
> > well.
> 
> I've added a -I gcc/gcc/ada/doc/gnat_ugn there, that's as far as my
> knowledge goes for this script so I hope this is enough.

Well, since by default the find command deletes all files except those 
known to be documentation sources, you need at least to change it not to 
delete those particular files (and the actual RST sources of these 
manuals).

> I can help in transitioning the script to sphinx though, that would seem
> more interesting/productive at this stage.

See the existing code to handle Sphinx documentation for the JIT.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch] PR debug/46102 Disable -feliminate-dwarf2-dups when reading a PCH

2015-02-25 Thread Jason Merrill

On 02/19/2015 11:50 AM, Jakub Jelinek wrote:

Wouldn't it be better to disable PCH reading if -feliminate-dwarf2-dups
is used?


In the abstract, perhaps, but given

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53118

I'd prefer to disable the useless thing.  :)

We might actually disable -feliminate-dwarf2-dups entirely until that 
bug is fixed.


Jason



Re: [PATCH] Use DO_PRAGMA in libgomp.oacc-c-c++-common/reduction-1.c

2015-02-25 Thread Tom de Vries

On 25-02-15 12:40, Thomas Schwinge wrote:

Hi!

On Mon, 23 Feb 2015 18:14:35 +0100, Tom de Vries  wrote:

On 23-02-15 17:08, Jakub Jelinek wrote:

On Mon, Feb 23, 2015 at 04:52:56PM +0100, Tom de Vries wrote:

The only thing I'm not sure about is the two-level pragma expansion using
the apply pragmas. It maximizes factoring out common parts, but it makes
things less readable.

Tested on x86_64.

OK for stage4?


If Thomas is ok with that, it is ok with me too.


OK, thanks!


For comparison, this is a less convoluted, but longer version.


Hmm, I can't quite decide which of the two to prefer.  Your call.  ;-P
(Maybe, toss a coin if it takes more than a minute to decide.)




Committed this version, I'm happy with this one. There are now just two 
reduction macros (check_reduction_op and check_reduction_macro), which are 
reasonably readable.
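
As a side note on why DO_PRAGMA is needed at all: a #pragma line cannot be
produced by a macro, but the _Pragma operator can, and it takes a string
literal, hence the stringizing wrapper.  The sketch below shows the same
trick in isolation.  It is not the committed test: DEFINE_REDUCTION and the
generated function names are hypothetical, and it uses an OpenMP pragma
instead of the OpenACC ones so that it builds with or without -fopenmp (the
pragma is simply ignored in the latter case).

```c
/* _Pragma takes a string literal, so stringize the macro argument.  */
#define DO_PRAGMA(x) _Pragma (#x)

/* Hypothetical helper macro: defines a function NAME that reduces
   ARR[0..N-1] with operator OP, starting from INIT.  OP is substituted
   into the pragma text before DO_PRAGMA stringizes it.  */
#define DEFINE_REDUCTION(name, op, init)                  \
  int                                                     \
  name (const int *arr, int n)                            \
  {                                                       \
    int res = (init);                                     \
    int i;                                                \
    DO_PRAGMA (omp parallel for reduction (op:res))       \
    for (i = 0; i < n; i++)                               \
      res = res op arr[i];                                \
    return res;                                           \
  }

DEFINE_REDUCTION (sum_ints, +, 0)
DEFINE_REDUCTION (xor_ints, ^, 0)
```

Calling sum_ints on {1, 2, 3, 4} gives 10 and xor_ints gives 4, whether or
not the pragma is honoured.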


Thanks,
- Tom

/* { dg-do run } */

/* Integer reductions.  */

#include <stdlib.h>
#include <stdbool.h>

#define vl 32

#define DO_PRAGMA(x) _Pragma (#x)

#define check_reduction_op(type, op, init, b)	\
  {		\
type res, vres;\
res = (init);\
DO_PRAGMA (acc parallel vector_length (vl))\
DO_PRAGMA (acc loop reduction (op:res))\
for (i = 0; i < n; i++)			\
  res = res op (b);\
		\
vres = (init);\
for (i = 0; i < n; i++)			\
  vres = vres op (b);			\
		\
if (res != vres)\
  abort ();	\
  }

static void
test_reductions_int (void)
{
  const int n = 1000;
  int i;
  int array[n];

  for (i = 0; i < n; i++)
array[i] = i;

  check_reduction_op (int, +, 0, array[i]);
  check_reduction_op (int, *, 1, array[i]);
  check_reduction_op (int, &, -1, array[i]);
  check_reduction_op (int, |, 0, array[i]);
  check_reduction_op (int, ^, 0, array[i]);
}

static void
test_reductions_bool (void)
{
  const int n = 1000;
  int i;
  int array[n];
  int cmp_val;

  for (i = 0; i < n; i++)
array[i] = i;

  cmp_val = 5;
  check_reduction_op (bool, &&, true, (cmp_val > array[i]));
  check_reduction_op (bool, ||, false, (cmp_val > array[i]));
}

#define check_reduction_macro(type, op, init, b)	\
  {			\
type res, vres;	\
res = (init);	\
DO_PRAGMA (acc parallel vector_length (vl))\
DO_PRAGMA (acc loop reduction (op:res))\
for (i = 0; i < n; i++)\
  res = op (res, (b));\
			\
vres = (init);	\
for (i = 0; i < n; i++)\
  vres = op (vres, (b));\
			\
if (res != vres)	\
  abort ();		\
  }

#define max(a, b) (((a) > (b)) ? (a) : (b))
#define min(a, b) (((a) < (b)) ? (a) : (b))

static void
test_reductions_minmax (void)
{
  const int n = 1000;
  int i;
  int array[n];

  for (i = 0; i < n; i++)
array[i] = i;

  check_reduction_macro (int, min, n + 1, array[i]);
  check_reduction_macro (int, max, -1, array[i]);
}

int
main (void)
{
  test_reductions_int ();
  test_reductions_bool ();
  test_reductions_minmax ();
  return 0;
}

2015-02-25  Tom de Vries  

	* testsuite/libgomp.oacc-c-c++-common/reduction-1.c (DO_PRAGMA)
	(check_reduction_op, check_reduction_macro, max, min):
	Declare.
	(test_reductions_int, test_reductions_minmax, test_reductions_bool): New
	function.
	(main): Use new functions.
---
 .../libgomp.oacc-c-c++-common/reduction-1.c| 223 +++--
 1 file changed, 76 insertions(+), 147 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-1.c
index acf9540..4501f8e 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-1.c
@@ -7,168 +7,97 @@
 
 #define vl 32
 
-int
-main(void)
+#define DO_PRAGMA(x) _Pragma (#x)
+
+#define check_reduction_op(type, op, init, b)	\
+  {		\
+type res, vres;\
+res = (init);\
+DO_PRAGMA (acc parallel vector_length (vl))\
+DO_PRAGMA (acc loop reduction (op:res))\
+for (i = 0; i < n; i++)			\
+  res = res op (b);\
+		\
+vres = (init);\
+for (i = 0; i < n; i++)			\
+  vres = vres op (b);			\
+		\
+if (res != vres)\
+  abort ();	\
+  }
+
+static void
+test_reductions_int (void)
 {
   const int n = 1000;
   int i;
-  int vresult, result, array[n];
-  bool lvresult, lresult;
+  int array[n];
 
   for (i = 0; i < n; i++)
 array[i] = i;
 
-  result = 0;
-  vresult = 0;
-
-  /* '+' reductions.  */
-#pragma acc parallel vector_length (vl)
-#pragma acc loop reduction (+:result)
-  for (i = 0; i < n; i++)
-result += array[i];
-
-  /* Verify the reduction.  */
-  for (i = 0; i < n; i++)
-vresult += array[i];
-
-  if (result != vresult)
-abort ();
-
-  result = 0;
-  vresult = 0;
-
-  /* '*' reductions.  */
-#pragma acc parallel vector_length (vl)
-#pragma acc loop reduction (*:result)
-  for (i = 0; i < n; i++)
-result *= array[i];
-
-  /* Verify the reduction.  */
-  for (i = 0; i < n; i++)
-vresult *= array[i];
-
-  if (result != vresult)
-abort ();
-
-//   

Re: [PR58315] reset inlined debug vars at return-to point

2015-02-25 Thread Jakub Jelinek
On Wed, Feb 25, 2015 at 11:54:16AM +0100, Richard Biener wrote:
> > Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?
> 
> But code-motion could still move stmts from the inlined functions
> across these resets?  That said - shouldn't this simply be performed
> by proper var-tracking u-ops produced by a backward scan over the
> function for "live" scope-blocks?  That is, when you see a scope
> block becoming live from exit then add u-ops resetting all
> vars from that scope block?

Yeah, wanted to say the same, I'm afraid such a change will very much affect
debugging experience close to the end of inlined functions, as sinking,
scheduling and various other passes may move statements from the inline
functions across those resets.  And, various tools and users really want to
be able to inspect variables and parameters on the return statement.

So, IMHO to do something like this, we'd need to mark those debug stmts
some way to say that they aren't normal debug binds, but debug binds at the
end of scopes (whether inlined functions or just lexical blocks), and
optimization passes that perform code motion should try to detect the case
when they are moving downward some statements across such debug stmts and
move those debug stmts along with those if possible.

And another thing is the amount of the added debug stmts, right now we don't
add debug stmts all the time for everything, just when something is needed,
while your patch adds it unconditionally, even when debug stmts for those
won't be really emitted.  As they are just resets, that hopefully will not
drastically affect var-tracking time, but might affect other optimization
passes, which would need to deal with many more statements than before.

Jakub


Re: [PATCH] Fix for PR ipa/64693

2015-02-25 Thread Martin Liška

On 02/20/2015 07:39 PM, Jan Hubicka wrote:

Hello.

Here is an updated version that reflects how we should handle congruence classes
that have any address reference. The patch bootstraps on x86_64-linux-pc and
introduces no new regressions.

Ready for trunk?
Thanks,
Martin





>From d7472e55b345214d55ed49f5f10deafa9a24a4fc Mon Sep 17 00:00:00 2001
From: mliska 
Date: Thu, 19 Feb 2015 16:08:09 +0100
Subject: [PATCH 1/2] Fix PR ipa/64693

gcc/ChangeLog:

2015-02-20  Martin Liska  

PR ipa/64693
* ipa-icf.c (sem_item_optimizer::add_item_to_class): Identify if
a newly added item has an address reference.
(sem_item_optimizer::subdivide_classes_by_addr_references):
New function.
(sem_item_optimizer::process_cong_reduction): Include subdivision
based on address references.
* ipa-icf.h (struct addr_refs_hashmap_traits): New struct.
* ipa-ref.h (has_addr_ref_p): New function.

gcc/testsuite/ChangeLog:

2015-02-20  Martin Liska  

* gcc.dg/ipa/ipa-icf-26.c: Update expected test results.
* gcc.dg/ipa/ipa-icf-33.c: Remove duplicate line.
* gcc.dg/ipa/ipa-icf-34.c: New test.




diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 494fdcf..859b9d1 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -1809,6 +1809,9 @@ sem_item_optimizer::add_item_to_class (congruence_class 
*cls, sem_item *item)
item->index_in_class = cls->members.length ();
cls->members.safe_push (item);
item->cls = cls;
+
+  if (!cls->has_member_with_addr_ref && item->node->ref_list.has_addr_ref_p ())
+cls->has_member_with_addr_ref = true;
  }

  /* Congruence classes are built by hash value.  */
@@ -1969,6 +1972,84 @@ sem_item_optimizer::subdivide_classes_by_equality (bool 
in_wpa)
verify_classes ();
  }

+/* Subdivide classes by address references that members of the class
+   reference. Example can be a pair of functions that have an address
+   taken from a function. If these addresses are different the class
+   is split.  */


OK, I am a bit surprised you have a separate loop for this instead of doing it at
a place where you compare ipa-ref references anyway, but I suppose you know the code
better than I do ;)

+  while(!worklist_empty ())
+  {
+/* Process complete congruence reduction.  */
+while ((cls = worklist_pop ()) != NULL)
+  do_congruence_step (cls);
+
+/* Subdivide newly created classes according to references.  */
+unsigned new_classes = subdivide_classes_by_addr_references ();


I think this needs to be performed just once, because subdividing does not
depend on congruences.


+/* Class is container for address references for a symtab_node.  */
+
+class addr_refs_collection
+{
+public:
+  addr_refs_collection (symtab_node *node)
+  {
+m_references.create (0);
+ipa_ref *ref;
+
+if (is_a  (node) && DECL_VIRTUAL_P (node->decl))
+  return;
+
+for (unsigned i = 0; i < node->num_references (); i++)
+  {
+   ref = node->iterate_reference (i, ref);
+   if (ref->use == IPA_REF_ADDR)
+ m_references.safe_push (ref->referred);


You do not need to consider IPA_REF_ADDR of virtual table/ctors/dtors and
virtual functions to be address references (because these are never compared
for equality).

The proper condition for when the address matters is
   if (!DECL_VIRTUAL_P (ref->referred->decl)
   && (TREE_CODE (ref->referred->decl) != FUNCTION_DECL
   || (!DECL_CXX_CONSTRUCTOR_P (ref->referred->decl)
   && !DECL_CXX_DESTRUCTOR_P (ref->referred->decl)))

Please also update sem_function::merge by adding cdtors in addition
to DECL_VIRTUAL_P.

Why does sem_item_optimizer::filter_removed_items check cdtors?

+  }
+  }
+
+  /* Vector of address references.  */
+  vec m_references;
+};
+
+/* Hash traits for addr_refs_collection map.  */
+
+struct addr_refs_hashmap_traits: default_hashmap_traits
+{
+  static hashval_t
+  hash (const addr_refs_collection *v)
+  {
+inchash::hash hstate;
+hstate.add_int (v->m_references.length ());
+
+return hstate.end ();


This looks like a poor choice of hash function because the count will likely
match.  equal_address_to basically walks to alias targets, so a safe
approximation is to hash ultimate_alias_target of all entries in your list.

+  }
+
+  static bool
+  equal_keys (const addr_refs_collection *a,
+ const addr_refs_collection *b)
+  {
+if (a->m_references.length () != b->m_references.length ())
+  return false;
+
+for (unsigned i = 0; i < a->m_references.length (); i++)
+  if (a->m_references[i]->equal_address_to (b->m_references[i]) != 1)
+   return false;
+
+return true;
+  }
+};


OK with these changes.
Honza
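
To make the hashing remark above concrete, here is a minimal sketch of
hashing a reference list by the ultimate alias target of each entry instead
of only by its length.  Everything in it is hypothetical: struct node is a
toy stand-in for symtab_node, ultimate_target only mimics what
ultimate_alias_target does, and the mixing step is FNV-style rather than
inchash.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of a symbol-table node: ALIAS_OF is NULL for a
   definition and points at the aliased node otherwise.  */
struct node
{
  struct node *alias_of;
  unsigned id;
};

/* Follow the alias chain to the node that is finally defined.  */
const struct node *
ultimate_target (const struct node *n)
{
  while (n->alias_of)
    n = n->alias_of;
  return n;
}

/* Hash a list of references by the identity of their ultimate targets,
   not merely by the list length.  */
uint32_t
hash_refs (const struct node *const *refs, size_t len)
{
  uint32_t h = 2166136261u;   /* FNV offset basis.  */
  for (size_t i = 0; i < len; i++)
    {
      h ^= ultimate_target (refs[i])->id;
      h *= 16777619u;         /* FNV prime.  */
    }
  return h;
}
```

A list holding an alias of a symbol then hashes the same as a list holding
the symbol itself, while lists of equal length that refer to different
symbols no longer collide by construction.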



Hello Honza.

I've updated the patch so that your notes are resolved. Moreover, I've added a
comparison for interposable symbols that are either the target of a reference or
are called by a function.
Please read the patch to verify the comparison is as you expected.

I'm going to r

Re: Patch ping

2015-02-25 Thread Jakub Jelinek
On Wed, Feb 25, 2015 at 10:10:52AM +0100, Richard Biener wrote:
> Oops, totally forgot about this one.
> 
> Shouldn't
> 
> + default:
> +   error ("unsupported mode %s\n", mname);
> 
> be a fatal_error ()?  After all if we hit this but continue we'll

Ok, I'll change it.

> stream random crap.  I also think we should be a bit more user-centric
> here and maybe report "for host / offload target combination".

Eventually, sure, we should be able (based on options) to turn all the
errors from the offloading compiler into warnings that just disable the
offloading for some particular offloading target.

> +static GTY(()) const unsigned char *lto_mode_identity_table;
> 
> why in GC memory?

The reason for that is that it is referenced from GC structure, and in the
offloading path they should be GC allocated, so that they can be released
when the corresponding GC structure holding pointer to that goes away.
In the non-offloading LTO, all those GC structures will contain the same
value, lto_mode_identity_table, but if that would be heap allocated, GC
would be upset.

Jakub


Re: [patch]: Fix PR target/64212

2015-02-25 Thread Kai Tietz
Applied at revision 22098 to trunk.  Jan approved patch on IRC.

Regards,
Kai


Re: [patch]: [Bug tree-optimization/61917] [4.9/5 Regression] ICE on valid code at -O3 on x86_64-linux-gnu in vectorizable_reduction, at tree-vect-loop.c:4913

2015-02-25 Thread H.J. Lu
On Wed, Feb 25, 2015 at 5:10 AM, Kai Tietz  wrote:
> Hello,
>
> So, I did full regression-test for following patch:
>
> ChangeLog
>
> 2015-02-25  Richard Biener  
> Kai Tietz  
>
> PR tree-optimization/61917
> * tree-vect-loop.c (vectorizable_reduction): Allow
> vect_internal_def without reduction to exit graceful.
>

I think it caused:

FAIL: gcc.dg/pr56350.c (internal compiler error)
FAIL: gcc.dg/pr56350.c (test for excess errors)

[hjl@gnu-6 gcc]$ ./xgcc -B./ -O -ftree-vectorize
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/pr56350.c
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/pr56350.c: In
function ‘f’:
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/pr56350.c:8:1:
internal compiler error: Segmentation fault
 f (void)
 ^
0xd1f836 crash_signal
/export/gnu/import/git/sources/gcc/gcc/toplev.c:383
0xfaf59a gimple_code
/export/gnu/import/git/sources/gcc/gcc/gimple.h:1553
0xfbd855 vectorizable_reduction(gimple_statement_base*,
gimple_stmt_iterator*, gimple_statement_base**, _slp_tree*)
/export/gnu/import/git/sources/gcc/gcc/tree-vect-loop.c:4987
0xfabc86 vect_analyze_stmt(gimple_statement_base*, bool*, _slp_tree*)
/export/gnu/import/git/sources/gcc/gcc/tree-vect-stmts.c:7170
0xfb50c9 vect_analyze_loop_operations
/export/gnu/import/git/sources/gcc/gcc/tree-vect-loop.c:1539
0xfb58cc vect_analyze_loop_2
/export/gnu/import/git/sources/gcc/gcc/tree-vect-loop.c:1800
0xfb5c70 vect_analyze_loop(loop*)
/export/gnu/import/git/sources/gcc/gcc/tree-vect-loop.c:1898
0xfd558f vectorize_loops()
/export/gnu/import/git/sources/gcc/gcc/tree-vectorizer.c:451
0xed3699 execute
/export/gnu/import/git/sources/gcc/gcc/tree-ssa-loop.c:295
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
[hjl@gnu-6 gcc]$


-- 
H.J.


Re: [PATCH] Fix for PR ipa/64693

2015-02-25 Thread Jan Hubicka
> Hello Honza.
> 
> I've updated the patch so that your notes are resolved. Moreover, I've added a
> comparison for interposable symbols that are either the target of a reference or
> are called by a function.
> Please read the patch to verify the comparison is as you expected.
> 
> I'm going to run testsuite.
> 
> Thanks,
> Martin

> >From 8dae064e67e30537486e0d502fc5df39d37cee3e Mon Sep 17 00:00:00 2001
> From: mliska 
> Date: Thu, 19 Feb 2015 16:08:09 +0100
> Subject: [PATCH 1/3] Fix PR ipa/64693
> 
> gcc/ChangeLog:
> 
> 2015-02-20  Martin Liska  
> 
>   PR ipa/64693
>   * ipa-icf.c (sem_item_optimizer::add_item_to_class): Identify if
>   a newly added item has an address reference.
>   (sem_item_optimizer::subdivide_classes_by_addr_references):
>   New function.
>   (sem_item_optimizer::process_cong_reduction): Include subdivision
>   based on address references.
>   * ipa-icf.h (struct addr_refs_hashmap_traits): New struct.
>   (sem_item::is_nonvirtual_or_cdtor): New function.
> 
> gcc/testsuite/ChangeLog:
> 
> 2015-02-20  Martin Liska  
> 
>   * gcc.dg/ipa/ipa-icf-26.c: Update expected test results.
>   * gcc.dg/ipa/ipa-icf-33.c: Remove duplicate line.
>   * gcc.dg/ipa/ipa-icf-34.c: New test.
> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
> index 494fdcf..fbb641d 100644
> --- a/gcc/ipa-icf.c
> +++ b/gcc/ipa-icf.c
> @@ -126,6 +126,40 @@ along with GCC; see the file COPYING3.  If not see
>  using namespace ipa_icf_gimple;
>  
>  namespace ipa_icf {
> +
> +/* Constructor.  */
> +
> +addr_refs_collection::addr_refs_collection (symtab_node *node)

I guess that because you now track two things, address references and interposable
symbols, the name could reflect it.
Perhaps symbol_compare_collection sounds more precise, but I leave the decision
to you.
> +{
> +  m_references.create (0);
> +  m_interposables.create (0);
> +
> +  ipa_ref *ref;
> +
> +  if (is_a  (node) && DECL_VIRTUAL_P (node->decl))
> +return;
> +
> +  for (unsigned i = 0; i < node->num_references (); i++)
> +{
> +  ref = node->iterate_reference (i, ref);
> +  if (ref->use == IPA_REF_ADDR
> +   && sem_item::is_nonvirtual_or_cdtor (ref->referred->decl))
> + m_references.safe_push (ref->referred);
Since I introduced the address_matters predicate, just make
is_nonvirtual_or_cdtor an address_matters_p predicate of ipa_ref itself.
Test that the reference is ADDR, the referring symbol is not a virtual table,
and the referred symbol is a non-virtual non-cdtor.

It is better to have this centralized in symbol table predicates because later
we may want to get smarter.
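
A rough sketch of the three-part test described above, written against a
flattened, hypothetical view of a reference (struct ref_info and its fields
are invented for this example; the real predicate would query ipa_ref and
the decls directly):

```c
#include <stdbool.h>

/* Hypothetical reference kinds, loosely mirroring IPA_REF_*.  */
enum ref_use { REF_LOAD, REF_STORE, REF_ADDR };

/* Hypothetical, flattened view of a reference and its two endpoints.  */
struct ref_info
{
  enum ref_use use;
  bool referring_is_vtable;   /* The referring symbol is a virtual table.  */
  bool referred_is_virtual;   /* DECL_VIRTUAL_P holds for the target.  */
  bool referred_is_cdtor;     /* The target is a C++ ctor or dtor.  */
};

/* True if the address taken by this reference can be compared for
   equality and therefore has to be preserved.  */
bool
address_matters_p (const struct ref_info *r)
{
  return (r->use == REF_ADDR
          && !r->referring_is_vtable
          && !r->referred_is_virtual
          && !r->referred_is_cdtor);
}
```
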
> @@ -638,11 +672,11 @@ sem_function::merge (sem_item *alias_item)
>  
>/* See if original and/or alias address can be compared for equality.  */
>original_address_matters
> -= (!DECL_VIRTUAL_P (original->decl)
> += (sem_item::is_nonvirtual_or_cdtor (original->decl)
> && (original->externally_visible
>  || original->address_taken_from_non_vtable_p ()));
>alias_address_matters
> -= (!DECL_VIRTUAL_P (alias->decl)
> += (sem_item::is_nonvirtual_or_cdtor (alias->decl)
> && (alias->externally_visible
>  || alias->address_taken_from_non_vtable_p ()));
>  

Let's leave this for an incremental patch for the ::merge revamp.
> @@ -1969,6 +2003,82 @@ sem_item_optimizer::subdivide_classes_by_equality 
> (bool in_wpa)
>verify_classes ();
>  }
>  
> +/* Subdivide classes by address references that members of the class
> +   reference. Example can be a pair of functions that have an address
> +   taken from a function. If these addresses are different the class
> +   is split.  */
> +
> +unsigned
> +sem_item_optimizer::subdivide_classes_by_addr_references ()

Similarly, this needs an update of the name.
> @@ -2258,8 +2368,20 @@ sem_item_optimizer::process_cong_reduction (void)
>  fprintf (dump_file, "Congruence class reduction\n");
>  
>congruence_class *cls;
> -  while ((cls = worklist_pop ()) != NULL)
> -do_congruence_step (cls);
> +
> +  while(!worklist_empty ())
> +  {
> +/* Process complete congruence reduction.  */
> +while ((cls = worklist_pop ()) != NULL)
> +  do_congruence_step (cls);
> +
> +/* Subdivide newly created classes according to references.  */
> +unsigned new_classes = subdivide_classes_by_addr_references ();

I still do not see why this needs to be iterated within the loop and not just
executed once ;)
> +class addr_refs_collection
> +{
> +public:
> +  /* Constructor.  */
> +  addr_refs_collection (symtab_node *node);
> +
> +  /* Destructor.  */
> +  ~addr_refs_collection ()
> +  {
> +m_references.release ();
> +m_interposables.release ();
> +  }
> +
> +  /* Vector of address references.  */
> +  vec m_references;
> +
> +  /* Vector of interposable references.  */
> +  vec m_interposables;
> +};
> +  static bool
> +  equal_keys (const addr_refs_collection *a,
> +   const addr_refs_collection *b)
> +  {
> +if (a->m_references.length () != b-

Re: [patch] PR debug/46102 Disable -feliminate-dwarf2-dups when reading a PCH

2015-02-25 Thread Aldy Hernandez

On 02/25/2015 07:59 AM, Jason Merrill wrote:

On 02/19/2015 11:50 AM, Jakub Jelinek wrote:

Wouldn't it be better to disable PCH reading if -feliminate-dwarf2-dups
is used?


In the abstract, perhaps, but given

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53118

I'd prefer to disable the useless thing.  :)


Patch attached.



We might actually disable -feliminate-dwarf2-dups entirely until that
bug is fixed.


Well technically, this bug is a subset of 53118.  I would like to mark 
it as a duplicate, and can tackle it as part of my early debug work. 
After all, we're going to get a lot more DIEs that will get streamed 
early on, which PCH will have to deal with.  So, this will all get fixed.


Also, can we downgrade 53118, perhaps to a P4?  As Ian mentions here:

https://gcc.gnu.org/ml/gcc-help/2010-09/msg00083.html

There are better ways of optimizing this at link time for dwarf4, and 
the fact that this has been broken since GCC 4.0 would hint that this 
may not be of P2 importance?


OK for mainline pending tests?
commit 512b997ad55f45898fce2704c0289d472d08cab1
Author: Aldy Hernandez 
Date:   Wed Feb 25 08:49:59 2015 -0800

PR debug/46102
* dwarf2out.c (dwarf2out_init): Disable -feliminate-dwarf2-dups.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index ebf41c8..3f2837b 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -22621,6 +22621,13 @@ output_macinfo (void)
 static void
 dwarf2out_init (const char *filename ATTRIBUTE_UNUSED)
 {
+  /* This option is currently broken, see (PR53118 and PR46102).  */
+  if (flag_eliminate_dwarf2_dups)
+{
+  warning (0, "ignoring unimplemented option -feliminate-dwarf2-dups");
+  flag_eliminate_dwarf2_dups = 0;
+}
+
   /* Allocate the file_table.  */
   file_table = hash_table::create_ggc (50);
 
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2-1.C 
b/gcc/testsuite/g++.dg/debug/dwarf2-1.C
index e90d510..913bfe5 100644
--- a/gcc/testsuite/g++.dg/debug/dwarf2-1.C
+++ b/gcc/testsuite/g++.dg/debug/dwarf2-1.C
@@ -20,3 +20,5 @@ namespace N
 }
 
 N::Derived thing;
+
+/* { dg-bogus "ignoring unimplemented option -feliminate-dwarf2-dups" 
"unimplemented" { xfail *-*-* } 1 } */
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2-2.C 
b/gcc/testsuite/g++.dg/debug/dwarf2-2.C
index 9e6dbd2..214bbb1 100644
--- a/gcc/testsuite/g++.dg/debug/dwarf2-2.C
+++ b/gcc/testsuite/g++.dg/debug/dwarf2-2.C
@@ -15,3 +15,5 @@ void A::foo ()
 {
   using namespace N;
 }
+
+/* { dg-bogus "ignoring unimplemented option -feliminate-dwarf2-dups" 
"unimplemented" { xfail *-*-* } 1 } */
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/typedef5.C 
b/gcc/testsuite/g++.dg/debug/dwarf2/typedef5.C
index d9d058c..17ffafa 100644
--- a/gcc/testsuite/g++.dg/debug/dwarf2/typedef5.C
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/typedef5.C
@@ -8,3 +8,5 @@ typedef struct
 } A;
 
 A a;
+
+/* { dg-bogus "ignoring unimplemented option -feliminate-dwarf2-dups" 
"unimplemented" { xfail *-*-* } 1 } */
diff --git a/gcc/testsuite/g++.dg/debug/pr46123.C 
b/gcc/testsuite/g++.dg/debug/pr46123.C
index 9e115cd..f5e5f9f 100644
--- a/gcc/testsuite/g++.dg/debug/pr46123.C
+++ b/gcc/testsuite/g++.dg/debug/pr46123.C
@@ -45,3 +45,5 @@ int main ()
 return 1;
   return 0;
 }
+
+/* { dg-bogus "ignoring unimplemented option -feliminate-dwarf2-dups" 
"unimplemented" { xfail *-*-* } 1 } */
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2-3.c 
b/gcc/testsuite/gcc.dg/debug/dwarf2-3.c
index f0c129c..e649dfa 100644
--- a/gcc/testsuite/gcc.dg/debug/dwarf2-3.c
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2-3.c
@@ -11,3 +11,5 @@ int main()
   p.x = 0;
   p.y = 0;
 }
+
+/* { dg-bogus "ignoring unimplemented option -feliminate-dwarf2-dups" 
"unimplemented" { xfail *-*-* } 1 } */
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/dups-types.c 
b/gcc/testsuite/gcc.dg/debug/dwarf2/dups-types.c
index d9c01d0..4d3a9e8 100644
--- a/gcc/testsuite/gcc.dg/debug/dwarf2/dups-types.c
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/dups-types.c
@@ -1,8 +1,10 @@
 /* Test that these two options can work together.  */
 /* { dg-options "-gdwarf-4 -dA -feliminate-dwarf2-dups -fdebug-types-section" 
} */
-/* { dg-final { scan-assembler "DW.dups_types\.h\[^)\]*. DW_TAG_typedef" } } */
+/* { dg-final { scan-assembler "DW.dups_types\.h\[^)\]*. DW_TAG_typedef" { 
xfail *-*-* } } } */
 /* { dg-final { scan-assembler "DW_TAG_type_unit" } } */
 
 #include "dups-types.h"
 
 A2 a;
+
+/* { dg-bogus "ignoring unimplemented option -feliminate-dwarf2-dups" 
"unimplemented" { xfail *-*-* } 1 } */


Re: Option overriding in the offloading code path (was: [nvptx] -freorder-blocks-and-partition, -freorder-functions)

2015-02-25 Thread Jakub Jelinek
On Wed, Feb 25, 2015 at 11:28:12AM +0100, Thomas Schwinge wrote:
> Am I on the right track with my assumption that it is correct that
> nvptx.c:nvptx_option_override is not invoked in the offloading code path,
> so we'd need a new target hook (?) to consolidate/override the options in
> this scenario?
> 
> 
> Using this to forcefully disable -fvar-tracking (as done in
> nvptx_option_override), should then allow me to drop the following
> beautiful specimen of a patch (which I didn't commit anywhere, so far):

Supposedly you could just disable var-tracking for
targetm.no_register_allocation case, or change that assert to
allow pseudos for targetm.no_register_allocation?

Anyway, if var-tracking is never useful for NVPTX, if you want to override
it early, flag_var_tracking* options are Optimization options, thus you'd
need a target hook similar to the one I've added today, but instead of
TARGET_OPTION_NODE do something similar for OPTIMIZATION_NODE streaming.

Jakub


Re: ipa-icf::merge TLC

2015-02-25 Thread Jan Hubicka
> On 2015.02.25 at 09:38 +0100, Jan Hubicka wrote:
> > this patch reorganize sem_function::merge and sem_variable::merge.
> > I read the code in detail and found several issues that are fixed in the
> > following patch.
> 
> I gave your patch a quick spin. It breaks Chromium. Its protocol buffer
> compiler gets miscompiled:

I see (I remember running into a similar-looking ICE last time I tried Chromium).
Is there any chance you can look into what goes wrong?  I added enough sanity
checking code that I do not see how merging itself can lead to wrong code
without ICEing, but the patch makes considerably more merging happen (the old
code had a bug where it tried to merge but didn't), so we may be running into
another previously latent issue.
Martin has 3 correctness ipa-icf patches on the way, so perhaps it is one of them ;)

Honza
> 
>  RULE Generating C++ and Python code from copresence/proto/enums.proto
> FAILED: cd ../../components; python ../tools/protoc_wrapper/protoc_wrapper.py 
> --include "" --protobuf 
> "../out/Release/gen/protoc_out/components/copresence/proto/enums.pb.h" -
> -proto-in-dir copresence/proto --proto-in-file "enums.proto" 
> "--use-system-protobuf=0" -- ../out/Release/protoc --cpp_out 
> ../out/Release/gen/protoc_out/components/copresence/
> proto --python_out ../out/Release/pyproto/components/copresence/proto
> *** Error in `../out/Release/protoc': double free or corruption (out): 
> 0x7ffd966468b0 ***
> === Backtrace: =
> /lib/libc.so.6(+0x7265e)[0x7fe09058065e]
> /lib/libc.so.6(+0x77f1d)[0x7fe090585f1d]
> /lib/libc.so.6(+0x7874b)[0x7fe09058674b]
> ../out/Release/protoc[0x45f5e1]
> ../out/Release/protoc[0x462caf]
> ../out/Release/protoc[0x462d1c]
> ../out/Release/protoc[0x46685c]
> ../out/Release/protoc[0x4677d5]
> ../out/Release/protoc[0x4679c3]
> ../out/Release/protoc[0x467b15]
> ../out/Release/protoc[0x479d89]
> ../out/Release/protoc[0x4aa058]
> ../out/Release/protoc[0x4aa0af]
> ../out/Release/protoc[0x46b217]
> ../out/Release/protoc[0x4a4832]
> ../out/Release/protoc[0x4a86cf]
> ../out/Release/protoc[0x4a890e]
> ../out/Release/protoc[0x4a0827]
> ../out/Release/protoc[0x4678fd]
> ../out/Release/protoc[0x467b15]
> ../out/Release/protoc[0x40c926]
> ../out/Release/protoc[0x403131]
> /lib/libc.so.6(__libc_start_main+0xf0)[0x7fe09052e6d0]
> ../out/Release/protoc[0x4037f9]
> === Memory map: 
> 0040-004eb000 r-xp  00:13 36244132   
> /var/tmp/chromium/src/out/Release/protoc
> 004ec000-004ef000 r--p 000eb000 00:13 36244132   
> /var/tmp/chromium/src/out/Release/protoc
> 004ef000-004f rw-p 000ee000 00:13 36244132   
> /var/tmp/chromium/src/out/Release/protoc
> 01ab-01b03000 rw-p  00:00 0  
> [heap]
> 7fe09027f000-7fe09030d000 r-xp  00:0f 2540736
> /lib64/libm-2.21.90.so
> 7fe09030d000-7fe09050c000 ---p 0008e000 00:0f 2540736
> /lib64/libm-2.21.90.so
> 7fe09050c000-7fe09050d000 r--p 0008d000 00:0f 2540736
> /lib64/libm-2.21.90.so
> 7fe09050d000-7fe09050e000 rw-p 0008e000 00:0f 2540736
> /lib64/libm-2.21.90.so
> 7fe09050e000-7fe09066e000 r-xp  00:0f 2541268
> /lib64/libc-2.21.90.so
> 7fe09066e000-7fe09086d000 ---p 0016 00:0f 2541268
> /lib64/libc-2.21.90.so
> 7fe09086d000-7fe090871000 r--p 0015f000 00:0f 2541268
> /lib64/libc-2.21.90.so
> 7fe090871000-7fe090873000 rw-p 00163000 00:0f 2541268
> /lib64/libc-2.21.90.so
> 7fe090873000-7fe090877000 rw-p  00:00 0 
> 7fe090877000-7fe09088f000 r-xp  00:0f 2541131
> /lib64/libpthread-2.21.90.so
> 7fe09088f000-7fe090a8e000 ---p 00018000 00:0f 2541131
> /lib64/libpthread-2.21.90.so
> 7fe090a8e000-7fe090a8f000 r--p 00017000 00:0f 2541131
> /lib64/libpthread-2.21.90.so
> 7fe090a8f000-7fe090a9 rw-p 00018000 00:0f 2541131
> /lib64/libpthread-2.21.90.so
> 7fe090a9-7fe090a94000 rw-p  00:00 0 
> 7fe090a94000-7fe090ab6000 r-xp  00:0f 2541267
> /lib64/ld-2.21.90.so
> 7fe090ad2000-7fe090ad7000 rw-p  00:00 0 
> 7fe090ad7000-7fe090aee000 r-xp  00:0f 72800  
> /lib64/libgcc_s.so.1
> 7fe090aee000-7fe090aef000 rw-p 00016000 00:0f 72800  
> /lib64/libgcc_s.so.1
> 7fe090aef000-7fe090af rw-p  00:00 0 
> 7fe090af-7fe090c7b000 r-xp  00:0f 3018239
> /usr/lib64/gcc/x86_64-pc-linux-gnu/5.0.0/libstdc++.so.6.0.21
> 7fe090c7b000-7fe090c7c000 ---p 0018b000 00:0f 3018239
> /usr/lib64/gcc/x86_64-pc-linux-gnu/5.0.0/libstdc++.so.6.0.21
> 7fe090c7c000-7fe090c86000 r--p 0018b000 00:0f 3018239
> /usr/lib64/gcc/x86_64-pc-linux-gnu/5.0.0/libstdc++.so.6.0.21
> 7fe090c86000-7fe090c8a000 rw-p 00

[4.8 branch] PATCH: PR middle-end/53623: [4.7/4.8 Regression] sign extension is effectively split into two x86-64 instructions

2015-02-25 Thread H.J. Lu
On Tue, Feb 17, 2015 at 4:47 AM, H.J. Lu  wrote:
> On Mon, Feb 16, 2015 at 5:24 AM, H.J. Lu  wrote:
>> On Mon, Feb 16, 2015 at 5:18 AM, Jakub Jelinek  wrote:
>>> On Mon, Feb 16, 2015 at 05:15:02AM -0800, H.J. Lu wrote:
>>>> On Mon, Feb 16, 2015 at 4:30 AM, H.J. Lu  wrote:
>>>> > On Mon, Feb 16, 2015 at 1:35 AM, Jakub Jelinek  wrote:
>>>> >> On Sun, Feb 15, 2015 at 12:53:39PM -0800, H.J. Lu wrote:
>>>> >>> This is a backport of the patch for PR middle-end/53623 plus all bug
>>>> >>> fixes caused by it.  Tested on Linux/x86-32, Linux/x86-64 and x32.  OK
>>>> >>> for 4.8 branch?
>>>> >>
>>>> >> What about PR64286 and PR63659, are you sure those aren't related?
>>>> >> I mean, they are on the 4.9 branch and I don't see why they couldn't
>>>> >> affect the 4.8 backport.
>>>> >>
>>>> >> Jakub
>>>> >
>>>> > Fix for PR 63659 has been backported to 4.8 branch.  I will check if
>>>> > fix for PR 64286 is needed.
>>>> >
>>>> > --
>>>> > H.J.
>>>>
>>>> The fix for PR 64286 is an updated fix for PR 59754 which is caused by
>>>> the fix for PR 53623.  But the testcase in the fix for PR 64286 doesn't
>>>> fail on 4.8 branch + my backport of the fix for PR 53623 on Haswell.
>>>> I suggest
>>>>
>>>> 1. We go without my current backport and backport the fix for PR 64286
>>>> in a separate patch.  Or
>>>> 2. We go without my backport minus the backport of the PR 59754
>>>> fix and backport the fixes for PR 59754 plus PR 64286 in a separate patch
>>>
>>> I think keeping the branch broken is bad; even if we don't have a testcase
>>> that really fails, presumably the issue is just latent.
>>> So I'd strongly prefer
>>> 3. Add the PR64286 fix to the patch being tested and commit only when it as
>>> a whole is tested, as one commit.
>>>
>>
>> I will do that and restart the testing.
>>
>
> I tested it on x86-32, x86-64 and x32.  There are no regressions.
>

Here is the patch.  OK for 4.8 branch?

Thanks.


-- 
H.J.
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 469ee31..01749de 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,94 @@
+2015-02-16  H.J. Lu  
+
+	Backported from 4.9 branch
+	2015-01-13  Jakub Jelinek  
+
+	PR rtl-optimization/64286
+	* ree.c (combine_reaching_defs): Move part of comment earlier,
+	remove !SCALAR_INT_MODE_P check.
+	(add_removable_extension): Don't add vector mode
+	extensions if all uses of the source register aren't the same
+	vector extensions.
+
+2015-02-15  H.J. Lu  
+
+	Backport from mainline:
+	2014-06-13  Jeff Law  
+
+	PR rtl-optimization/61094
+	PR rtl-optimization/61446
+	* ree.c (combine_reaching_defs): Get the mode for the copy from
+	the extension insn rather than the defining insn.
+
+	2014-06-02  Jeff Law  
+
+	PR rtl-optimization/61094
+	* ree.c (combine_reaching_defs): Do not reextend an insn if it
+	was marked as do_no_reextend.  If a copy is needed to eliminate
+	an extension, then mark it as do_not_reextend.
+
+	2014-02-14  Jeff Law  
+
+	PR rtl-optimization/60131
+	* ree.c (get_extended_src_reg): New function.
+	(combine_reaching_defs): Use it rather than assuming location
+	of REG.
+	(find_and_remove_re): Verify first operand of extension is
+	a REG before adding the insns to the copy list.
+
+	2014-01-17  Jeff Law  
+
+	* ree.c (combine_set_extension): Temporarily disable test for
+	changing number of hard registers.
+
+	2014-01-15  Jeff Law  
+
+	PR tree-optimization/59747
+	* ree.c (find_and_remove_re): Properly handle case where a second
+	eliminated extension requires widening a copy created for elimination
+	of a prior extension.
+	(combine_set_extension): Ensure that the number of hard regs needed
+	for a destination register does not change when we widen it.
+
+	2014-01-10  Jeff Law  
+
+	PR middle-end/59743
+	* ree.c (combine_reaching_defs): Ensure the defining statement
+	occurs before the extension when optimizing extensions with
+	different source and destination hard registers.
+
+	2014-01-10  Jakub Jelinek  
+
+	PR rtl-optimization/59754
+	* ree.c (combine_reaching_defs): Disallow !SCALAR_INT_MODE_P
+	modes in the REGNO != REGNO case.
+
+	2014-01-08  Jeff Law  
+
+	* ree.c (get_sub_rtx): New function, extracted from...
+	(merge_def_and_ext): Here.
+	(combine_reaching_defs): Use get_sub_rtx.
+
+	2014-01-07  Jeff Law  
+
+	PR middle-end/53623
+	* ree.c (combine_set_extension): Handle case where source
+	and destination registers in an extension insn are different.
+	(combine_reaching_defs): Allow source and destination
+	registers in extension to be different under limited
+	circumstances.
+	(add_removable_extension): Remove restriction that the
+	source and destination registers in the extension are the
+	same.
+	(find_and_remove_re): Emit a copy from the extension's
+	destination to its source after the defining insn if
+	the source and destination registers are different.
+
+	2013-12-12  Jeff Law  
+
+	* i386.md (simple LEA peephole2): Add missing mode to zero_extend
+	for zero-extended MULT simple LEA pattern.
+
 2015-02-12  Jakub Jelinek

[PATCH, rs6000 testsuite] Fix failures for implicit function declaration

2015-02-25 Thread Pat Haugen
The following patch fixes "excess errors" failures for implicit function 
declarations (memcmp/random) for the direct-move-*/pack01 tests. Tested 
on powerpc64le-unknown-linux-gnu.


Committed as obvious.


2015-02-25  Pat Haugen 

gcc/testsuite:
* gcc.target/powerpc/direct-move.h: Include string.h/stdlib.h.
* gcc.target/powerpc/pack01.c: Include string.h.


Index: gcc.target/powerpc/direct-move.h
===
--- gcc.target/powerpc/direct-move.h(revision 220981)
+++ gcc.target/powerpc/direct-move.h(working copy)
@@ -1,6 +1,8 @@
 /* Test functions for direct move support.  */

 #include 
+#include 
+#include 
 extern void abort (void);

 #ifndef VSX_REG_ATTR
Index: gcc.target/powerpc/pack01.c
===
--- gcc.target/powerpc/pack01.c(revision 220981)
+++ gcc.target/powerpc/pack01.c(working copy)
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include <string.h>

 #ifdef DEBUG
 #include 



[PATCH] ICF: handle correctly hard register variables

2015-02-25 Thread Martin Liška

Hello.

This patch adds support for hard register variables in ICF.  It is
pre-approved by Honza, so I'm going to install it.

No regressions on x86_64-linux-pc.

Thanks,
Martin

>From eff93050904e0aeaf26b47fb1d1e8eeb803f9af6 Mon Sep 17 00:00:00 2001
From: mliska 
Date: Wed, 25 Feb 2015 18:26:09 +0100
Subject: [PATCH] ICF: Validate correctly hard register variables.

gcc/ChangeLog:

2015-02-25  Martin Liska  

	* ipa-icf-gimple.c (func_checker::compare_variable_decl): Compare
	hard register variables.
---
 gcc/ipa-icf-gimple.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 5b176d0..53d2c38 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -575,6 +575,13 @@ func_checker::compare_variable_decl (tree t1, tree t2)
   if (t1 == t2)
 return true;
 
+  if (DECL_HARD_REGISTER (t1) != DECL_HARD_REGISTER (t2))
+return return_false_with_msg ("DECL_HARD_REGISTER are different");
+
+  if (DECL_HARD_REGISTER (t1)
+  && DECL_ASSEMBLER_NAME (t1) != DECL_ASSEMBLER_NAME (t2))
+return return_false_with_msg ("HARD REGISTERS are different");
+
   if (TREE_CODE (t1) == VAR_DECL && (DECL_EXTERNAL (t1) || TREE_STATIC (t1)))
 {
   symtab_node *n1 = symtab_node::get (t1);
-- 
2.1.2
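
The two checks added by the patch above can be illustrated outside GCC with a
minimal, self-contained sketch.  ToyDecl and compatible_register_decls are
hypothetical names used only for illustration and are not GCC APIs; the string
member stands in for what DECL_ASSEMBLER_NAME identifies.  One difference is
an assumption of the sketch: GCC compares the assembler names as identifier
nodes, while the sketch compares plain strings.

```cpp
#include <string>

// Hypothetical stand-in for a variable declaration; not a GCC type.
struct ToyDecl {
  bool hard_register;          // models DECL_HARD_REGISTER
  std::string assembler_name;  // models DECL_ASSEMBLER_NAME, e.g. "rbx"
};

// Returns true when the two declarations may be treated as equal with
// respect to register pinning, mirroring the two checks in the patch.
bool compatible_register_decls(const ToyDecl &a, const ToyDecl &b) {
  // One is pinned to a hard register and the other is not.
  if (a.hard_register != b.hard_register)
    return false;
  // Both are pinned, but to different registers.
  if (a.hard_register && a.assembler_name != b.assembler_name)
    return false;
  return true;
}
```

The name is only consulted when both declarations are hard register
variables, so ordinary variables with different names are not rejected by
this particular check.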



[patch]: Fix regression caused by fix for 61917

2015-02-25 Thread Kai Tietz
Hi,

The patch didn't handle the case of dt being vect_constant_def,
where of course the reduc_def_stmt is NULL.
By checking for NULL before testing for PHI, we now fall back for such
cases to the old behavior and return in the next if-statement.

2015-02-25  Richard Biener  
Kai Tietz  

PR tree-optimization/61917
* tree-vect-loop.c (vectorizable_reduction): Handle obvious case
that reduc_def_stmt is null.

Tested and will apply as obvious to trunk and 4.9 if there are no objections.

Sorry for the noise.
Kai

Index: tree-vect-loop.c
===
--- tree-vect-loop.c(Revision 220968)
+++ tree-vect-loop.c(Arbeitskopie)
@@ -4912,7 +4912,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
   if (!found_nested_cycle_def)
 reduc_def_stmt = def_stmt;

-  if (gimple_code (reduc_def_stmt) != GIMPLE_PHI)
+  if (reduc_def_stmt && gimple_code (reduc_def_stmt) != GIMPLE_PHI)
 return false;

   if (!(dt == vect_reduction_def
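
The guard in the hunk above relies on short-circuit evaluation: the statement
code is only inspected when the pointer is non-null.  A minimal sketch,
assuming hypothetical ToyStmt and rejects_as_non_phi names that are not part
of GCC:

```cpp
// Hypothetical stand-ins for a GIMPLE statement and its code; not GCC types.
enum class StmtCode { Phi, Assign };

struct ToyStmt {
  StmtCode code;
};

// Returns true when the early "not a PHI" bail-out should fire.  A null
// statement does not trigger it, so control reaches the following check,
// which is what restores the old behavior for constant definitions.
bool rejects_as_non_phi(const ToyStmt *reduc_def_stmt) {
  return reduc_def_stmt != nullptr && reduc_def_stmt->code != StmtCode::Phi;
}
```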


Re: [PATCH] Fix for PR ipa/64693

2015-02-25 Thread Martin Liška

On 02/25/2015 06:00 PM, Jan Hubicka wrote:

Hello Honza.

I've updated the patch so that your notes are resolved. Moreover, I've added
a comparison for interposable symbols that are either the target of a
reference or are called by a function.
Please read the patch to verify that the comparison is what you expected.

I'm going to run testsuite.

Thanks,
Martin



>From 8dae064e67e30537486e0d502fc5df39d37cee3e Mon Sep 17 00:00:00 2001
From: mliska 
Date: Thu, 19 Feb 2015 16:08:09 +0100
Subject: [PATCH 1/3] Fix PR ipa/64693

gcc/ChangeLog:

2015-02-20  Martin Liska  

PR ipa/64693
* ipa-icf.c (sem_item_optimizer::add_item_to_class): Identify if
a newly added item has an address reference.
(sem_item_optimizer::subdivide_classes_by_addr_references):
New function.
(sem_item_optimizer::process_cong_reduction): Include subdivision
based on address references.
* ipa-icf.h (struct addr_refs_hashmap_traits): New struct.
(sem_item::is_nonvirtual_or_cdtor): New function.

gcc/testsuite/ChangeLog:

2015-02-20  Martin Liska  

* gcc.dg/ipa/ipa-icf-26.c: Update expected test results.
* gcc.dg/ipa/ipa-icf-33.c: Remove duplicate line.
* gcc.dg/ipa/ipa-icf-34.c: New test.
diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 494fdcf..fbb641d 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -126,6 +126,40 @@ along with GCC; see the file COPYING3.  If not see
  using namespace ipa_icf_gimple;

  namespace ipa_icf {
+
+/* Constructor.  */
+
+addr_refs_collection::addr_refs_collection (symtab_node *node)


I guess, because you now track two things, address references and interposable
symbols, the function name could reflect that.
Perhaps symbol_compare_collection sounds more precise, but I leave the decision
to you.

+{
+  m_references.create (0);
+  m_interposables.create (0);
+
+  ipa_ref *ref;
+
+  if (is_a <varpool_node *> (node) && DECL_VIRTUAL_P (node->decl))
+return;
+
+  for (unsigned i = 0; i < node->num_references (); i++)
+{
+  ref = node->iterate_reference (i, ref);
+  if (ref->use == IPA_REF_ADDR
+ && sem_item::is_nonvirtual_or_cdtor (ref->referred->decl))
+   m_references.safe_push (ref->referred);

Since I introduced the address_matters predicate, just turn
is_nonvirtual_or_cdtor into an address_matters_p predicate of ipa_ref itself.
Test that the reference is ADDR, that the referring node is not a virtual
table, and that the referred node is a non-virtual non-cdtor.

It is better to have this centralized in the symbol table predicates because
later we may want to get smarter.


Agree with that, so I've taken the chunk which defines 'address_matters_p' from 
your big patch.


@@ -638,11 +672,11 @@ sem_function::merge (sem_item *alias_item)

/* See if original and/or alias address can be compared for equality.  */
original_address_matters
-= (!DECL_VIRTUAL_P (original->decl)
+= (sem_item::is_nonvirtual_or_cdtor (original->decl)
 && (original->externally_visible
   || original->address_taken_from_non_vtable_p ()));
alias_address_matters
-= (!DECL_VIRTUAL_P (alias->decl)
+= (sem_item::is_nonvirtual_or_cdtor (alias->decl)
 && (alias->externally_visible
   || alias->address_taken_from_non_vtable_p ()));



Let's leave this for an incremental patch for the ::merge revamp.


Ok, I'm going to write it.


@@ -1969,6 +2003,82 @@ sem_item_optimizer::subdivide_classes_by_equality (bool in_wpa)
verify_classes ();
  }

+/* Subdivide classes by address references that members of the class
+   reference. Example can be a pair of functions that have an address
+   taken from a function. If these addresses are different the class
+   is split.  */
+
+unsigned
+sem_item_optimizer::subdivide_classes_by_addr_references ()


Similarly, this needs its name updated.


Renamed.
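
The subdivision described in the hunk comment above (a congruence class is
split when its members reference different addresses) can be sketched in
isolation.  This is only a toy model under stated assumptions: ToyItem,
ToyClass and subdivide_by_refs are hypothetical names, references are plain
strings rather than symtab nodes, and grouping uses std::map instead of the
hash map with custom traits that the patch introduces.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a semantic item; not a GCC type.
struct ToyItem {
  std::string name;
  std::vector<std::string> address_refs;  // symbols whose address is taken
};

using ToyClass = std::vector<ToyItem>;

// Split one congruence class into subclasses whose members reference the
// same sequence of addresses.  Returns the resulting subclasses, ordered
// by their reference lists.
std::vector<ToyClass> subdivide_by_refs(const ToyClass &cls) {
  std::map<std::vector<std::string>, ToyClass> groups;
  for (const ToyItem &item : cls)
    groups[item.address_refs].push_back(item);

  std::vector<ToyClass> result;
  for (auto &entry : groups)
    result.push_back(entry.second);
  return result;
}
```

A class whose members all share one reference list comes back as a single
subclass, which corresponds to "no split needed".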


@@ -2258,8 +2368,20 @@ sem_item_optimizer::process_cong_reduction (void)
  fprintf (dump_file, "Congruence class reduction\n");

congruence_class *cls;
-  while ((cls = worklist_pop ()) != NULL)
-do_congruence_step (cls);
+
+  while(!worklist_empty ())
+  {
+/* Process complete congruence reduction.  */
+while ((cls = worklist_pop ()) != NULL)
+  do_congruence_step (cls);
+
+/* Subdivide newly created classes according to references.  */
+unsigned new_classes = subdivide_classes_by_addr_references ();


I still do not see why this needs to be iterated within the loop rather than
just executed once ;)


You are right, the loop is removed.


+class addr_refs_collection
+{
+public:
+  /* Constructor.  */
+  addr_refs_collection (symtab_node *node);
+
+  /* Destructor.  */
+  ~addr_refs_collection ()
+  {
+m_references.release ();
+m_interposables.release ();
+  }
+
+  /* Vector of address references.  */
+  vec<symtab_node *> m_references;
+
+  /* Vector of interposable references.  */
+  vec<symtab_node *> m_interposables;
+};
+  static bool
+  equal_keys (const addr_refs_collection *a,
+ const addr_refs_collection *b)
+  {
+if (a->m_references.l

Re: [patch]: Fix regression caused by fix for 61917

2015-02-25 Thread Jakub Jelinek
On Wed, Feb 25, 2015 at 07:13:55PM +0100, Kai Tietz wrote:
> Hi,
> 
> The patch didn't handle the case of dt being vect_constant_def,
> where of course the reduc_def_stmt is NULL.
> By checking for NULL before testing for PHI, we now fall back for such
> cases to the old behavior and return in the next if-statement.
> 
> 2015-02-25  Richard Biener  
> Kai Tietz  
> 
> PR tree-optimization/61917
> * tree-vect-loop.c (vectorizable_reduction): Handle obvious case
> that reduc_def_stmt is null.
> 
> Tested and will apply as obvious to trunk and 4.9 if there are no objections.

Ok, thanks.

> --- tree-vect-loop.c(Revision 220968)
> +++ tree-vect-loop.c(Arbeitskopie)
> @@ -4912,7 +4912,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
>if (!found_nested_cycle_def)
>  reduc_def_stmt = def_stmt;
> 
> -  if (gimple_code (reduc_def_stmt) != GIMPLE_PHI)
> +  if (reduc_def_stmt && gimple_code (reduc_def_stmt) != GIMPLE_PHI)
>  return false;
> 
>if (!(dt == vect_reduction_def

Jakub


Re: ipa-icf::merge TLC

2015-02-25 Thread Martin Liška

On 02/25/2015 06:15 PM, Jan Hubicka wrote:

On 2015.02.25 at 09:38 +0100, Jan Hubicka wrote:

this patch reorganize sem_function::merge and sem_variable::merge.
I read the code in detail and found several issues that are fixed in the
following patch.


I gave your patch a quick spin. It breaks Chromium. Its protocol buffer
compiler gets miscompiled:


I see (I remember running into a similar-looking ICE the last time I tried
Chromium).
Is there any chance you can look into what goes wrong?  I added enough
sanity-checking code that I do not see how merging itself can lead to wrong
code without ICEing, but the patch makes considerably more merging happen
(the old code had a bug where it tried to merge but didn't), so we may run
into another previously latent issue.
Martin has 3 ipa-icf correctness patches on the way, so perhaps one of them ;)

Honza


Hello.

I've just updated Chromium to the latest version, but unfortunately I cannot
reproduce the memory corruption.
Markus, which build flags do you use for Chromium? Can you please run valgrind
to spot the problematic function?
Moreover, I would appreciate it if you could find the corresponding merge
operation (-fdump-ipa-icf).

Thank you,
Martin



  RULE Generating C++ and Python code from copresence/proto/enums.proto
FAILED: cd ../../components; python ../tools/protoc_wrapper/protoc_wrapper.py --include 
"" --protobuf 
"../out/Release/gen/protoc_out/components/copresence/proto/enums.pb.h" -
-proto-in-dir copresence/proto --proto-in-file "enums.proto" 
"--use-system-protobuf=0" -- ../out/Release/protoc --cpp_out 
../out/Release/gen/protoc_out/components/copresence/
proto --python_out ../out/Release/pyproto/components/copresence/proto
*** Error in `../out/Release/protoc': double free or corruption (out): 
0x7ffd966468b0 ***
=== Backtrace: =
/lib/libc.so.6(+0x7265e)[0x7fe09058065e]
/lib/libc.so.6(+0x77f1d)[0x7fe090585f1d]
/lib/libc.so.6(+0x7874b)[0x7fe09058674b]
../out/Release/protoc[0x45f5e1]
../out/Release/protoc[0x462caf]
../out/Release/protoc[0x462d1c]
../out/Release/protoc[0x46685c]
../out/Release/protoc[0x4677d5]
../out/Release/protoc[0x4679c3]
../out/Release/protoc[0x467b15]
../out/Release/protoc[0x479d89]
../out/Release/protoc[0x4aa058]
../out/Release/protoc[0x4aa0af]
../out/Release/protoc[0x46b217]
../out/Release/protoc[0x4a4832]
../out/Release/protoc[0x4a86cf]
../out/Release/protoc[0x4a890e]
../out/Release/protoc[0x4a0827]
../out/Release/protoc[0x4678fd]
../out/Release/protoc[0x467b15]
../out/Release/protoc[0x40c926]
../out/Release/protoc[0x403131]
/lib/libc.so.6(__libc_start_main+0xf0)[0x7fe09052e6d0]
../out/Release/protoc[0x4037f9]
=== Memory map: 
0040-004eb000 r-xp  00:13 36244132   
/var/tmp/chromium/src/out/Release/protoc
004ec000-004ef000 r--p 000eb000 00:13 36244132   
/var/tmp/chromium/src/out/Release/protoc
004ef000-004f rw-p 000ee000 00:13 36244132   
/var/tmp/chromium/src/out/Release/protoc
01ab-01b03000 rw-p  00:00 0  [heap]
7fe09027f000-7fe09030d000 r-xp  00:0f 2540736
/lib64/libm-2.21.90.so
7fe09030d000-7fe09050c000 ---p 0008e000 00:0f 2540736
/lib64/libm-2.21.90.so
7fe09050c000-7fe09050d000 r--p 0008d000 00:0f 2540736
/lib64/libm-2.21.90.so
7fe09050d000-7fe09050e000 rw-p 0008e000 00:0f 2540736
/lib64/libm-2.21.90.so
7fe09050e000-7fe09066e000 r-xp  00:0f 2541268
/lib64/libc-2.21.90.so
7fe09066e000-7fe09086d000 ---p 0016 00:0f 2541268
/lib64/libc-2.21.90.so
7fe09086d000-7fe090871000 r--p 0015f000 00:0f 2541268
/lib64/libc-2.21.90.so
7fe090871000-7fe090873000 rw-p 00163000 00:0f 2541268
/lib64/libc-2.21.90.so
7fe090873000-7fe090877000 rw-p  00:00 0
7fe090877000-7fe09088f000 r-xp  00:0f 2541131
/lib64/libpthread-2.21.90.so
7fe09088f000-7fe090a8e000 ---p 00018000 00:0f 2541131
/lib64/libpthread-2.21.90.so
7fe090a8e000-7fe090a8f000 r--p 00017000 00:0f 2541131
/lib64/libpthread-2.21.90.so
7fe090a8f000-7fe090a9 rw-p 00018000 00:0f 2541131
/lib64/libpthread-2.21.90.so
7fe090a9-7fe090a94000 rw-p  00:00 0
7fe090a94000-7fe090ab6000 r-xp  00:0f 2541267
/lib64/ld-2.21.90.so
7fe090ad2000-7fe090ad7000 rw-p  00:00 0
7fe090ad7000-7fe090aee000 r-xp  00:0f 72800  
/lib64/libgcc_s.so.1
7fe090aee000-7fe090aef000 rw-p 00016000 00:0f 72800  
/lib64/libgcc_s.so.1
7fe090aef000-7fe090af rw-p  00:00 0
7fe090af-7fe090c7b000 r-xp  00:0f 3018239
/usr/lib64/gcc/x86_64-pc-linux-gnu/5.0.0/libstdc++.so.6.0.21
7fe090c7b000-7fe090c7c000 ---p 0018b000 00:0f 3018239
/usr/lib64/gcc/x86_64

Re: [patch] fix PR65048: check that jump-thread paths are still valid

2015-02-25 Thread Sebastian Pop
Jeff Law wrote:
> >Registering FSM jump thread: (10, 12)  (12, 13)  (13, 15)  (15, 3)
> [ snip ]
> >Registering FSM jump thread: (7, 10)  (10, 12)  (12, 13)  (13, 14)
> 
> What I'm having a bit of trouble wrapping my head around is how can
> those two paths both be valid when you register them?  They have
> different transitions out of bb13, one going to bb15 the other to
> bb14, but they're both coming in via (10, 12).

Here is the output of debug_loops(3) showing the two paths before we start the
FSM code generation:

bb_7 (preds = {bb_4 }, succs = {bb_8 bb_10 bb_11 })
{
:
  # .MEM_26 = VDEF <.MEM_1>
  a = 84;
  switch (x_8(D)) , case 65: , case 85: >
}
bb_10 (preds = {bb_9 bb_5 bb_7 }, succs = {bb_12 })
{
  # .MEM_30 = PHI <.MEM_4(9), .MEM_20(5), .MEM_26(7)>
  # _34 = PHI <_7(9), 65(5), 84(7)>
:
  goto  ();
}
bb_12 (preds = {bb_9 bb_11 bb_10 }, succs = {bb_3 bb_13 })
{
  # _13 = PHI <_12(9), 65(11), 84(10)>
  # .MEM_32 = PHI <.MEM_4(9), .MEM_31(11), .MEM_30(10)>
  # _36 = PHI <_7(9), _35(11), _34(10)>
:
  # .MEM_6 = VDEF <.MEM_32>
  b = _13;
  # VUSE <.MEM_6>
  c.0_10 = c;
  switch (c.0_10) , case 85: >
}
bb_13 (preds = {bb_12 }, succs = {bb_14 bb_15 })
{
:
  switch (_36) , case 71: >
}
bb_14 (preds = {bb_13 bb_6 bb_8 }, succs = {bb_15 })
{
  # .MEM_33 = PHI <.MEM_6(13), .MEM_23(6), .MEM_28(8)>
  # _37 = PHI <_36(13), 65(6), 84(8)>
  # _39 = PHI <_13(13), _12(6), _12(8)>
:
  # .MEM_18 = VDEF <.MEM_33>
  fn ();
}
bb_15 (preds = {bb_13 bb_14 }, succs = {bb_3 bb_16 })
{
  # .MEM_17 = PHI <.MEM_6(13), .MEM_18(14)>
  # _38 = PHI <_36(13), _37(14)>
  # _40 = PHI <_13(13), _39(14)>
:
  switch (_40) , case 65: >
}
bb_3 (preds = {bb_16 bb_15 bb_12 bb_6 bb_8 }, succs = {bb_4 })
{
  # .MEM_3 = PHI <.MEM_19(16), .MEM_17(15), .MEM_6(12), .MEM_23(6), 
.MEM_28(8)>
  # _9 = PHI <_38(16), _38(15), _36(12), 65(6), 84(8)>
  # _16 = PHI <65(16), _40(15), _13(12), _12(6), _12(8)>
:
}

First, let's look at why we jump thread from 10 to 3:
> >Registering FSM jump thread: (10, 12)  (12, 13)  (13, 15)  (15, 3)

In other words, let's see how we can infer that "from bb_15 we are guaranteed to
jump into bb_3 if we come from bb_10."

So this switch in bb_15 is guaranteed to jump to the default case:
switch (_40) , case 65: > 

 # _40 = PHI <_13(13), _39(14)>

because when coming from bb_13, "_40 = _13", and in bb_12 we have the definition
# _13 = PHI <_12(9), 65(11), 84(10)>

so if we come from bb_10, the value of _13 is 84.
Because 84 != 65, switch (_40) takes the default case, that is, a jump from
bb_15 to bb_3.


Now let's see how we jump thread from 7 to 14:
> >Registering FSM jump thread: (7, 10)  (10, 12)  (12, 13)  (13, 14)

Why do we know that from bb_13 we necessarily jump to bb_14 if we have just
executed the code in bb_7?

In other words, why do we jump to the default case of
switch (_36) , case 71: >

# _36 = PHI <_7(9), _35(11), _34(10)>
# _34 = PHI <_7(9), 65(5), 84(7)>

so coming from bb_7 (through bb_10), the value of _36 is 84, which differs
from case 71, so we jump to the default case in "switch (_36) , case 71: >".

> let's make sure the paths that are registered are reasonable first

I think it is reasonable to jump thread these two paths.
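
The reasoning above (follow the PHI arguments backwards along the path until a
constant is reached) can be sketched as a small stand-alone program.  This is
a toy model and not the jump-threading code in GCC: PhiArg, Phi, constant,
ssa and resolve_on_path are hypothetical names, and the sketch assumes that
every value of interest is defined by a PHI node.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// A PHI argument is either a known constant or another SSA name.
struct PhiArg {
  bool is_const;
  int value;         // meaningful when is_const
  std::string name;  // meaningful when !is_const
};

PhiArg constant(int v) { return PhiArg{true, v, ""}; }
PhiArg ssa(const std::string &n) { return PhiArg{false, 0, n}; }

// One PHI node: the block that defines it and one argument per predecessor.
struct Phi {
  int block;
  std::map<int, PhiArg> args;  // predecessor block -> argument
};

// Walk PATH (block numbers in execution order) backwards and follow NAME
// through the PHI nodes it crosses.  Returns true and sets OUT when the
// value is a constant on this path, false when it cannot be determined.
bool resolve_on_path(const std::map<std::string, Phi> &phis,
                     const std::vector<int> &path, std::string name,
                     int &out) {
  for (std::size_t i = path.size(); i-- > 1;) {
    auto it = phis.find(name);
    if (it == phis.end())
      return false;  // not defined by a PHI: give up in this sketch
    const Phi &phi = it->second;
    if (phi.block != path[i])
      continue;  // NAME is defined in an earlier block of the path
    auto arg = phi.args.find(path[i - 1]);
    if (arg == phi.args.end())
      return false;  // the path does not enter through a known predecessor
    if (arg->second.is_const) {
      out = arg->second.value;
      return true;
    }
    name = arg->second.name;  // keep following the incoming SSA name
  }
  return false;
}
```

With the PHI nodes for _13, _40, _34 and _36 from the dump entered into the
map, the path 10, 12, 13, 15 resolves _40 to 84 and the path 7, 10, 12, 13
resolves _36 to 84, matching the two registered threads.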

Sebastian


Re: ipa-icf::merge TLC

2015-02-25 Thread Markus Trippelsdorf
On 2015.02.25 at 19:32 +0100, Martin Liška wrote:
> On 02/25/2015 06:15 PM, Jan Hubicka wrote:
> >> On 2015.02.25 at 09:38 +0100, Jan Hubicka wrote:
> >>> this patch reorganize sem_function::merge and sem_variable::merge.
> >>> I read the code in detail and found several issues that are fixed in the
> >>> following patch.
> >>
> >> I gave your patch a quick spin. It breaks Chromium. Its protocol buffer
> >> compiler gets miscompiled:
> >
> > I see (I remember running into a similar-looking ICE the last time I tried
> > Chromium).
> > Is there any chance you can look into what goes wrong?  I added enough
> > sanity-checking code that I do not see how merging itself can lead to wrong
> > code without ICEing, but the patch makes considerably more merging happen
> > (the old code had a bug where it tried to merge but didn't), so we may run
> > into another previously latent issue.
> > Martin has 3 ipa-icf correctness patches on the way, so perhaps one of them ;)
> >
> > Honza
> 
> Hello.
> 
> I've just updated Chromium to the latest version, but unfortunately I cannot
> reproduce the memory corruption.
> Markus, which build flags do you use for Chromium? Can you please run valgrind
> to spot the problematic function?
> Moreover, I would appreciate it if you could find the corresponding merge
> operation (-fdump-ipa-icf).

I'm just using the defaults:
GYP_DEFINES="ffmpeg_branding=Chrome use_kerberos=0 fastbuild=1 
remove_webcore_debug_symbols=1 use_pulseaudio=0 use_gnome_keyring=0 
use_linux_link_gnome_keyring=0 disable_nacl=1 clang=0 host_clang=0 
linux_use_bundled_binutils=0 linux_use_bundled_gold=0 
google_api_key=AIzaSyDEAOvatFo0eTgsV_ZlEzx0ObmepsMzfAc 
google_default_client_id=329227923882.apps.googleusercontent.com 
google_default_client_secret=vgKG0NNv7GoDpbtoFNLxCUXu werror=" gclient sync

and then "ninja chrome" to build.

Program received signal SIGABRT, Aborted.
0x778886f8 in raise () from /lib/libc.so.6
(gdb) bt
#0  0x778886f8 in raise () from /lib/libc.so.6
#1  0x77889bbd in abort () from /lib/libc.so.6
#2  0x778c7663 in __libc_message () from /lib/libc.so.6
#3  0x778ccf1d in malloc_printerr () from /lib/libc.so.6
#4  0x778cd74b in _int_free () from /lib/libc.so.6
#5  0x0045f5c1 in 
google::protobuf::DescriptorBuilder::BuildFieldOrExtension(google::protobuf::FieldDescriptorProto
 const&, google::protobuf::Descriptor const*, googl
e::protobuf::FieldDescriptor*, bool) ()
#6  0x00462c8f in 
google::protobuf::DescriptorBuilder::BuildMessage(google::protobuf::DescriptorProto
 const&, google::protobuf::Descriptor const*, google::protobuf::D
escriptor*) ()
#7  0x00462cfc in 
google::protobuf::DescriptorBuilder::BuildMessage(google::protobuf::DescriptorProto
 const&, google::protobuf::Descriptor const*, google::protobuf::D
escriptor*) ()
#8  0x0046683c in 
google::protobuf::DescriptorBuilder::BuildFile(google::protobuf::FileDescriptorProto
 const&) ()
#9  0x004677b5 in 
google::protobuf::DescriptorPool::BuildFileFromDatabase(google::protobuf::FileDescriptorProto
 const&) const ()
#10 0x004679a3 in 
google::protobuf::DescriptorPool::TryFindFileInFallbackDatabase(std::__cxx11::basic_string, std::allocator > cons
t&) const ()
#11 0x00467af5 in 
google::protobuf::DescriptorPool::FindFileByName(std::__cxx11::basic_string, std::allocator > const&) const ()
#12 0x00479d69 in 
google::protobuf::protobuf_AssignDesc_google_2fprotobuf_2fdescriptor_2eproto() 
()
#13 0x004aa038 in google::protobuf::GoogleOnceInitImpl(long*, 
google::protobuf::Closure*) ()
#14 0x004aa08f in google::protobuf::GoogleOnceInit(long*, void (*)()) ()
#15 0x0046b1f7 in google::protobuf::FileOptions::GetMetadata() const ()
#16 0x004a4812 in 
google::protobuf::compiler::Parser::ParseOption(google::protobuf::Message*, 
google::protobuf::compiler::Parser::LocationRecorder const&, google::pro
tobuf::compiler::Parser::OptionStyle) ()
#17 0x004a86af in 
google::protobuf::compiler::Parser::ParseTopLevelStatement(google::protobuf::FileDescriptorProto*,
 google::protobuf::compiler::Parser::LocationRecor
der const&) ()
#18 0x004a88ee in 
google::protobuf::compiler::Parser::Parse(google::protobuf::io::Tokenizer*, 
google::protobuf::FileDescriptorProto*) ()
#19 0x004a0807 in 
google::protobuf::compiler::SourceTreeDescriptorDatabase::FindFileByName(std::__cxx11::basic_string, std::allocator > const&, google::protobuf::FileDescriptorProto*) ()
#20 0x004678dd in 
google::protobuf::DescriptorPool::TryFindFileInFallbackDatabase(std::__cxx11::basic_string, std::allocator > cons
t&) const ()
#21 0x00467af5 in 
google::protobuf::DescriptorPool::FindFileByName(std::__cxx11::basic_string, std::allocator > const&) const ()
#22 0x0040c906 in 
google::protobuf::compiler::CommandLineInterface::Run(int, char const* const*) 
()
#23 0x00403131 in main ()

I have tried

Re: [PATCH] Fix for PR ipa/64693

2015-02-25 Thread Jan Hubicka
> 

> >From dd240028726cb7fdc777acd0b6d14c4f89aed714 Mon Sep 17 00:00:00 2001
> From: mliska 
> Date: Thu, 19 Feb 2015 16:08:09 +0100
> Subject: [PATCH 1/3] Fix PR ipa/64693
> 
> 2015-02-25  Martin Liska  
>   Jan Hubicka  
> 
>   * gcc.dg/ipa/ipa-icf-26.c: Update test.
>   * gcc.dg/ipa/ipa-icf-33.c: Remove redundant line.
>   * gcc.dg/ipa/ipa-icf-34.c: New test.
> 
> gcc/ChangeLog:
> 
> 2015-02-25  Martin Liska  
>   Jan Hubicka  
> 
>   * cgraph.h (address_matters_p): New function.
>   * ipa-icf.c (symbol_compare_collection::symbol_compare_collection): New.
>   (sem_item_optimizer::subdivide_classes_by_sensitive_refs): New function.
>   (sem_item_optimizer::process_cong_reduction): Include division by
>   sensitive references.
>   * ipa-icf.h (struct symbol_compare_hashmap_traits): New class.
>   * ipa-visibility.c (symtab_node::address_taken_from_non_vtable_p): 
> Removed.
>   * symtab.c (address_matters_1):  New function.
>   (symtab_node::address_matters_p): Moved from ipa-visibility.c.
> +  if (is_a <varpool_node *> (node) && DECL_VIRTUAL_P (node->decl))
> +return;
> +
> +  for (unsigned i = 0; i < node->num_references (); i++)
> +{
> +  ref = node->iterate_reference (i, ref);
> +  if (ref->use == IPA_REF_ADDR && ref->referred->address_matters_p ()
> +   && !DECL_VIRTUAL_P (ref->referring->decl))
!address_matters_p should be implied by !DECL_VIRTUAL_P (ref->referring->decl).
> + m_references.safe_push (ref->referred);
> +
> +  if (ref->referred->get_availability () <= AVAIL_INTERPOSABLE)
> + m_interposables.safe_push (ref->referred);
Push into m_references if ref->use is IPA_REF_ADDR.  We care about address and 
not value then.
> +}
> +
> +  if (is_a <cgraph_node *> (node))
> +{
> +  cgraph_node *cnode = dyn_cast <cgraph_node *> (node);
> +
> +  for (cgraph_edge *e = cnode->callees; e; e = e->next_callee)
> + if (e->callee->get_availability () <= AVAIL_INTERPOSABLE)
> +   m_interposables.safe_push (e->callee);
> +}
> @@ -140,6 +204,15 @@ public:
>   contains_polymorphic_type_p comparison.  */
>static bool get_base_types (tree *t1, tree *t2);
>  
> +  /* Return true if given DECL is neither virtual nor cdtor.  */
> +  static bool is_nonvirtual_or_cdtor (tree decl)

You should be able to drop this one.
> +/* Return true if address of N is possibly compared.  */
> +
> +static bool
> +address_matters_1 (symtab_node *n, void *)
> +{
> +  if (DECL_VIRTUAL_P (n->decl))
> +return false;
> +  if (is_a <cgraph_node *> (n)
> +  && (DECL_CXX_CONSTRUCTOR_P (n->decl)
> +   || DECL_CXX_DESTRUCTOR_P (n->decl)))
> +return false;
> +  if (n->externally_visible
> +  || n->symtab_node::address_taken_from_non_vtable_p ())
> +return true;
> +  return false;
> +}

Aha, I meant adding an address_matters_p predicate to ipa-ref that will test
whether a given reference may lead to the address being used for comparison.

Something like

/* Return true if reference may be used in address compare.  */
bool
ipa_ref::address_matters_p ()
{
  if (use != IPA_REF_ADDR)
    return false;
  /* Addresses taken from virtual tables are never compared.  */
  if (is_a <varpool_node *> (referring)
      && DECL_VIRTUAL_P (referring->decl))
    return false;
  /* Address of virtual tables and functions is never compared.  */
  if (DECL_VIRTUAL_P (referred->decl))
    return false;
  /* Address of C++ cdtors is never compared.  */
  if (is_a <cgraph_node *> (referred)
      && (DECL_CXX_CONSTRUCTOR_P (referred->decl)
          || DECL_CXX_DESTRUCTOR_P (referred->decl)))
    return false;
  return true;
}
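
For illustration only, the predicate above can be modelled outside of GCC.
The sketch below is a toy: toy_symbol, toy_ref and the REF_* enumerators are
hypothetical stand-ins for symtab_node, ipa_ref and IPA_REF_*, and the boolean
fields replace the is_a <> and DECL_* checks.  It only mirrors the order of
the tests in the proposal; it is not the GCC implementation.

```cpp
// Toy stand-ins for GCC's symbol table types (hypothetical names).
// Only the properties the predicate looks at are modelled.
enum toy_use { REF_LOAD, REF_STORE, REF_ADDR };

struct toy_symbol
{
  bool is_variable;  // models is_a <varpool_node *>
  bool is_function;  // models is_a <cgraph_node *>
  bool is_virtual;   // models DECL_VIRTUAL_P
  bool is_cdtor;     // models DECL_CXX_CONSTRUCTOR_P || DECL_CXX_DESTRUCTOR_P
};

struct toy_ref
{
  toy_use use;
  const toy_symbol *referring;
  const toy_symbol *referred;

  // Return true if this reference may lead to an address comparison.
  bool address_matters_p () const
  {
    // Only address-taking references can expose an address.
    if (use != REF_ADDR)
      return false;
    // Addresses stored in virtual tables are never compared.
    if (referring->is_variable && referring->is_virtual)
      return false;
    // Addresses of virtual tables and virtual functions are never compared.
    if (referred->is_virtual)
      return false;
    // Addresses of C++ constructors and destructors are never compared.
    if (referred->is_function && referred->is_cdtor)
      return false;
    return true;
  }
};
```

With this model, an address reference from an ordinary variable to an ordinary
function matters, while the same reference coming from a virtual table, or
pointing to a constructor, does not.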

Honza


Re: [PATCH, alpha]: Fix PR/47230 [4.6/4.7 Regression] gcc fails to bootstrap on alpha in stage2 with "relocation truncated to fit: GPREL16 against ..."

2015-02-25 Thread Uros Bizjak
On Mon, Jul 28, 2014 at 7:02 PM, Richard Henderson  wrote:
> On 07/26/2014 05:35 AM, Uros Bizjak wrote:
>> On Mon, May 2, 2011 at 9:21 AM, Uros Bizjak  wrote:
>>
>>> It looks that GP relative relocations do not fit anymore into GPREL16
>>> reloc, so bootstrap on alpha hosts fail in stage2 with  "relocation
>>> truncated to fit: GPREL16 against ...". I found no other solution but
>>> to pass --no-relax to linker in order to finish the bootstrap.

[...]

> FYI, this was a bug in ld that I fixed "recently", i.e. after the current 2.24
> release.  It's an ok workaround for release branches, but I'd like it not to be
> committed to mainline please.

The attached patch reverts the PR47230 workaround on mainline SVN.  In
addition to the revert, the patch also documents that binutils 2.25 or
newer is required.

/

2015-02-25  Uros Bizjak  

Revert:
2014-07-26  Uros Bizjak  

PR target/47230
* configure.ac (alpha*-*-linux*): Use mh-alpha-linux.
* configure: Regenerate.

/config

2015-02-25  Uros Bizjak  

Revert:
2014-07-26  Uros Bizjak  

PR target/47230
* mh-alpha-linux: New file.

/gcc

2015-02-25  Uros Bizjak  

PR target/47230
* doc/install.texi (Specific, alpha*-*-*): Document that binutils 2.25
or newer are required.

The patch was tested on alpha-linux-gnu and alphaev68-linux-gnu for
all default languages plus obj-c++ and go.

OK for mainline?

Uros.
Index: config/mh-alpha-linux
===
--- config/mh-alpha-linux   (revision 220920)
+++ config/mh-alpha-linux   (working copy)
@@ -1,3 +0,0 @@
-# Prevent GPREL16 relocation truncation
-LDFLAGS += -Wl,--no-relax
-BOOT_LDFLAGS += -Wl,--no-relax
Index: configure
===
--- configure   (revision 220920)
+++ configure   (working copy)
@@ -3944,9 +3944,6 @@
   *-mingw*)
 host_makefile_frag="config/mh-mingw"
 ;;
-  alpha*-*-linux*)
-host_makefile_frag="config/mh-alpha-linux"
-;;
   hppa*-hp-hpux10*)
 host_makefile_frag="config/mh-pa-hpux10"
 ;;
Index: configure.ac
===
--- configure.ac   (revision 220920)
+++ configure.ac   (working copy)
@@ -1275,9 +1275,6 @@
   *-mingw*)
 host_makefile_frag="config/mh-mingw"
 ;;
-  alpha*-*-linux*)
-host_makefile_frag="config/mh-alpha-linux"
-;;
   hppa*-hp-hpux10*)
 host_makefile_frag="config/mh-pa-hpux10"
 ;;
Index: gcc/doc/install.texi
===
--- gcc/doc/install.texi   (revision 220920)
+++ gcc/doc/install.texi   (working copy)
@@ -3416,10 +3416,11 @@
 DEC OSF/1, Digital UNIX and Tru64 UNIX)@.  In addition to reading this
 section, please read all other sections that match your target.
 
-We require binutils 2.11.2 or newer.
-Previous binutils releases had a number of problems with DWARF 2
-debugging information, not the least of which is incorrect linking of
-shared libraries.
+Binutils 2.25 or newer are required.
+Versions prior to 2.25 have faulty relaxation pass that may result in
+GPREL16 relocation truncation errors.  Versions prior to 2.11.2 had a
+number of problems with DWARF 2 debugging information, not the least of
+which is incorrect linking of shared libraries.
 
 @html
 


Re: [PATCH, alpha]: Fix PR/47230 [4.6/4.7 Regression] gcc fails to bootstrap on alpha in stage2 with "relocation truncated to fit: GPREL16 against ..."

2015-02-25 Thread Richard Henderson
On 02/25/2015 09:02 AM, Uros Bizjak wrote:
> The patch was tested on alpha-linux-gnu and alphaev68-linux-gnu for
> all default languages plus obj-c++ and go.
> 
> OK for mainline?

Ok.  Thanks.


r~


Re: [patch] fix PR65048: check that jump-thread paths are still valid

2015-02-25 Thread Sebastian Pop
Jeff Law wrote:
> On 02/18/15 15:27, Sebastian Pop wrote:
> >>The dumps for the FSM threads are a bit sparse -- they don't show
> >>the entire path.  That makes it much harder to see what's going on.
> >
> >Would a patch improving the FSM dumps ok to commit separately to trunk?
> Most definitely.  I realize we're in stage4, but I'd approve such a change.

Attached:

* tree-ssa-threadupdate.c (dump_jump_thread_path): Print all edges
of an EDGE_FSM_THREAD.

I'm regstrapping on x86_64-linux.  Ok for trunk if it passes?

Thanks,
Sebastian


>From df583e5d511db2ed87c14cd5290f382c139636d9 Mon Sep 17 00:00:00 2001
From: Sebastian Pop 
Date: Wed, 25 Feb 2015 20:12:33 +0100
Subject: [PATCH] improve FSM jump thread dump

---
 gcc/tree-ssa-threadupdate.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 7a41ab2..7a159bb 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -197,6 +197,9 @@ dump_jump_thread_path (FILE *dump_file, vec path,
   if (path[i]->type == EDGE_NO_COPY_SRC_BLOCK)
fprintf (dump_file, " (%d, %d) nocopy;",
 		 path[i]->e->src->index, path[i]->e->dest->index);
+  if (path[0]->type == EDGE_FSM_THREAD)
+	fprintf (dump_file, " (%d, %d) ",
+		 path[i]->e->src->index, path[i]->e->dest->index);
 }
   fputc ('\n', dump_file);
 }
-- 
1.7.2.5
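
As a rough illustration of what the extra fprintf adds to the dump, the
following self-contained sketch formats every edge of a path the same way,
as " (src, dest) ".  The types toy_edge and toy_edge_type and the function
format_fsm_path are hypothetical and only model the fields used here; the
real code works on jump_thread_edge and writes to dump_file.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical, simplified model of a jump-thread path edge.
enum toy_edge_type { TOY_START, TOY_FSM_THREAD, TOY_COPY, TOY_NO_COPY };

struct toy_edge
{
  int src;             // index of the source basic block
  int dest;            // index of the destination basic block
  toy_edge_type type;
};

// Return the text printed for an FSM path: one " (src, dest) " per edge.
// Paths whose first edge is not an FSM edge produce no extra output here.
std::string format_fsm_path (const std::vector<toy_edge> &path)
{
  std::string out;
  if (path.empty () || path[0].type != TOY_FSM_THREAD)
    return out;
  for (const toy_edge &e : path)
    {
      char buf[64];
      std::snprintf (buf, sizeof buf, " (%d, %d) ", e.src, e.dest);
      out += buf;
    }
  return out;
}
```

As in the patch, the decision is taken from the type of the first edge of the
path, so every edge of an FSM path is printed.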



Re: [PATCH 1/n] OpenMP 4.0 offloading infrastructure

2015-02-25 Thread Ilya Verbin
On Fri, Feb 20, 2015 at 15:50:53 +0100, Jakub Jelinek wrote:
> On Fri, Feb 20, 2015 at 03:41:40PM +0100, Thomas Schwinge wrote:
> > Well, but users (like Jakub, for example) ;-) may decide to build the
> > offloading compilers without specifying --enable-languages, and that'll
> > then default to include Java, and you'll end up with:
> 
> Yeah.  We can perhaps tweak what languages we include by default for
> the --enable-as-accelerator-for= configurations, but if the user decides
> to explicitly add some language, we should still support that.

Here is the patch.
Build and install seems to be working both for accel and regular modes, with
--enable-languages=c,c++,fortran,go,java,jit,lto,objc,obj-c++
OK for trunk?


gcc/ada/
* gcc-interface/Make-lang.in (ada.install-common): Do not install for
the offloading compiler.
gcc/go/
* Make-lang.in (go.install-common): Do not install for the offloading
compiler.
gcc/java/
* Make-lang.in (java.install-common): Do not install for the offloading
compiler.
gcc/jit/
* Make-lang.in (jit.install-common): Do not install for the offloading
compiler.


diff --git a/gcc/ada/gcc-interface/Make-lang.in b/gcc/ada/gcc-interface/Make-lang.in
index 4696203..81fd80a 100644
--- a/gcc/ada/gcc-interface/Make-lang.in
+++ b/gcc/ada/gcc-interface/Make-lang.in
@@ -754,25 +754,27 @@ doc/gnat-style.pdf: ada/gnat-style.texi $(gcc_docdir)/include/fdl.texi
 # vxaddr2line is only used for cross VxWorks ports (it calls the underlying
 # cross addr2line).
 ada.install-common:
-   $(MKDIR) $(DESTDIR)$(bindir)
-   -if [ -f gnat1$(exeext) ] ; \
-   then \
- for tool in $(ADA_TOOLS) ; do \
-   install_name=`echo $$tool|sed '$(program_transform_name)'`$(exeext); \
-   $(RM) $(DESTDIR)$(bindir)/$$install_name; \
-   if [ -f $$tool-cross$(exeext) ] ; \
+   -if test "$(enable_as_accelerator)" != "yes" ; then \
+ $(MKDIR) $(DESTDIR)$(bindir); \
+ if [ -f gnat1$(exeext) ] ; \
+ then \
+   for tool in $(ADA_TOOLS) ; do \
+ install_name=`echo $$tool|sed '$(program_transform_name)'`$(exeext); \
+ $(RM) $(DESTDIR)$(bindir)/$$install_name; \
+ if [ -f $$tool-cross$(exeext) ] ; \
+ then \
+   $(INSTALL_PROGRAM) $$tool-cross$(exeext) $(DESTDIR)$(bindir)/$$install_name; \
+ else \
+   $(INSTALL_PROGRAM) $$tool$(exeext) $(DESTDIR)$(bindir)/$$install_name; \
+ fi ; \
+   done; \
+   $(RM) $(DESTDIR)$(bindir)/gnatdll$(exeext); \
+   $(INSTALL_PROGRAM) gnatdll$(exeext) $(DESTDIR)$(bindir)/gnatdll$(exeext); \
+   if [ -f vxaddr2line$(exeext) ] ; \
then \
- $(INSTALL_PROGRAM) $$tool-cross$(exeext) $(DESTDIR)$(bindir)/$$install_name; \
-   else \
- $(INSTALL_PROGRAM) $$tool$(exeext) $(DESTDIR)$(bindir)/$$install_name; \
+ $(RM) $(DESTDIR)$(bindir)/vxaddr2line$(exeext); \
+ $(INSTALL_PROGRAM) vxaddr2line$(exeext) $(DESTDIR)$(bindir)/vxaddr2line$(exeext); \
fi ; \
- done; \
- $(RM) $(DESTDIR)$(bindir)/gnatdll$(exeext); \
- $(INSTALL_PROGRAM) gnatdll$(exeext) $(DESTDIR)$(bindir)/gnatdll$(exeext); \
- if [ -f vxaddr2line$(exeext) ] ; \
- then \
-   $(RM) $(DESTDIR)$(bindir)/vxaddr2line$(exeext); \
-   $(INSTALL_PROGRAM) vxaddr2line$(exeext) $(DESTDIR)$(bindir)/vxaddr2line$(exeext); \
  fi ; \
fi
 
diff --git a/gcc/go/Make-lang.in b/gcc/go/Make-lang.in
index 6c5968a..891b610 100644
--- a/gcc/go/Make-lang.in
+++ b/gcc/go/Make-lang.in
@@ -136,15 +136,17 @@ check_go_parallelize = 10
 # Install hooks.
 
 go.install-common: installdirs
-   -rm -f $(DESTDIR)$(bindir)/$(GCCGO_INSTALL_NAME)$(exeext)
-   $(INSTALL_PROGRAM) gccgo$(exeext) $(DESTDIR)$(bindir)/$(GCCGO_INSTALL_NAME)$(exeext)
-   -if test -f go1$(exeext); then \
- if test -f gccgo-cross$(exeext); then \
-   :; \
- else \
-   rm -f $(DESTDIR)$(bindir)/$(GCCGO_TARGET_INSTALL_NAME)$(exeext); \
-   ( cd $(DESTDIR)$(bindir) && \
- $(LN) $(GCCGO_INSTALL_NAME)$(exeext) $(GCCGO_TARGET_INSTALL_NAME)$(exeext) ); \
+   -if test "$(enable_as_accelerator)" != "yes" ; then \
+ rm -f $(DESTDIR)$(bindir)/$(GCCGO_INSTALL_NAME)$(exeext); \
+ $(INSTALL_PROGRAM) gccgo$(exeext) $(DESTDIR)$(bindir)/$(GCCGO_INSTALL_NAME)$(exeext); \
+ if test -f go1$(exeext); then \
+   if test -f gccgo-cross$(exeext); then \
+ :; \
+   else \
+ rm -f $(DESTDIR)$(bindir)/$(GCCGO_TARGET_INSTALL_NAME)$(exeext); \
+ ( cd $(DESTDIR)$(bindir) && \
+   $(LN) $(GCCGO_INSTALL_NAME)$(exeext) $(GCCGO_TARGET_INSTALL_NAME)$(exeext) ); \
+   fi; \
  fi; \
fi
 
diff --git a/gcc/java/Make-lang.in b/gcc/java/Make

Re: Fix alignment propagation

2015-02-25 Thread Martin Jambor
Hi,

On Fri, Feb 20, 2015 at 07:22:02PM +0100, Jan Hubicka wrote:
> > > +/* Decrease alignment info DEST to be at most CUR.  */
> > > +
> > > +static bool
> > > +decrease_alignment (ipa_alignment *dest, ipa_alignment cur)
> > > +{
> > > +  bool changed = false;
> > > +
> > > +  if (!cur.known)
> > > +return false;
> > 
> > I really think this should be return set_alignment_to_bottom (dest);
> > 
> > If some known alignment has been already propagated to DEST along a
> > different edge and now along the current edge an unknown alignment is
> > coming in, then the result value of the lattice must be BOTTOM and not
> > the previous alignment this code leaves in place.
> 
> Well, because this is an optimistic propagation now, !cur.known means TOP
> that is "as good alignment as you can think of".
> You have one known alignment in DEST and TOP in other, result is TOP.

It seems to be clear now that the fact that I used the same structure
for the alignment information in the jump function and for the
alignment lattice (and so for example known meant pessimistic
assumptions in the former but optimistic in the latter) was really a
confusing idea.  So, at the risk of proposing a slightly larger patch
at this late stage, let me backtrack and come up with a real lattice,
with bottom and top which are called that way and with a real meet
operation.  Otherwise, the functionality is the same as Honza's patch,
with the increase_alignment function ditched, because it would never
be used anyway.  We can revisit that in the next stage1, just as we
can perhaps make the storage more compact.  At this point I wanted to
minimize risk.

The decrease_alignment is now called meet_with and I hope it is now
clear why I requested the changes.
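
To make the three states concrete, here is a minimal stand-alone sketch of
such a lattice.  It is deliberately simplified and is not the code in the
patch: the class name toy_alignment_lattice is made up, the second meet_with
overload takes no offset, and two different known alignments simply drop to
BOTTOM, whereas a real implementation could keep a weaker alignment both
values agree on.

```cpp
// Simplified three-state alignment lattice: TOP (nothing seen yet),
// one known (align, misalign) pair, or BOTTOM (nothing can be assumed).
class toy_alignment_lattice
{
public:
  unsigned align = 0;
  unsigned misalign = 0;

  bool bottom_p () const { return bottom; }
  bool top_p () const { return !bottom && !not_top; }

  // Drop to BOTTOM.  Return true if the lattice changed.
  bool set_to_bottom ()
  {
    if (bottom)
      return false;
    bottom = true;
    return true;
  }

  // Meet with a known alignment.  Return true if the lattice changed.
  bool meet_with (unsigned new_align, unsigned new_misalign)
  {
    if (bottom)
      return false;
    if (top_p ())
      {
        // First value seen: TOP meet X is X.
        not_top = true;
        align = new_align;
        misalign = new_misalign;
        return true;
      }
    if (align == new_align && misalign == new_misalign)
      return false;
    // Simplification: conflicting values give BOTTOM.
    return set_to_bottom ();
  }

  // Meet with another lattice.  Return true if this lattice changed.
  bool meet_with (const toy_alignment_lattice &other)
  {
    if (other.bottom_p ())
      return set_to_bottom ();
    if (other.top_p ())
      return false;  // X meet TOP is X.
    return meet_with (other.align, other.misalign);
  }

private:
  // Both flags false means TOP, so zero-initialization yields TOP.
  bool bottom = false;
  bool not_top = false;
};
```

The point of the explicit states is visible in the second overload: an
incoming TOP leaves the destination alone, while an incoming BOTTOM forces
the destination to BOTTOM, which is the distinction the single "known" flag
could not express.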

The patch is currently undergoing bootstrap and testing, Honza
promised to test on Firefox, it would be great if people burnt by the
second bug in PR 65028 could run their tests too.

It's likely there will be comments I'll need to incorporate, but I
would like to commit this soon to avoid the confusion the multiple
uses of ipa_alignment structure apparently caused.

Thanks,

Martin


2015-02-25  Martin Jambor  
Jan Hubicka  

* ipa-cp.c (ipcp_alignment_lattice): New type.
(ipcp_param_lattices): Use the above to represent alignment.
(ipcp_alignment_lattice::print): New function.
(print_all_lattices): Use it to print alignment information.
(ipcp_alignment_lattice::top_p): New function.
(ipcp_alignment_lattice::bottom_p): Likewise.
(ipcp_alignment_lattice::set_to_bottom): Likewise.
(ipcp_alignment_lattice::meet_with_1): Likewise.
(ipcp_alignment_lattice::meet_with): Two new overloaded functions.
(set_all_contains_variable): Use set_to_bottom of alignment lattice.
(initialize_node_lattices): Likewise.
(propagate_alignment_accross_jump_function): Work with the new class
for alignment lattices.
(propagate_constants_accross_call): Pass only the alignment lattice to
propagate_alignment_accross_jump_function.
(ipcp_store_alignment_results): Work with the new class for alignment
lattices.

testsuite/
* gcc.dg/ipa/propalign-4.c: New test.
* gcc.dg/ipa/propalign-5.c: Likewise.

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index bfe4d97..5ebe04a 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -257,6 +257,36 @@ public:
   struct ipcp_agg_lattice *next;
 };
 
+/* Lattice of pointer alignment.  Unlike the previous types of lattices, this
+   one is only capable of holding one value.  */
+
+class ipcp_alignment_lattice
+{
+public:
+  /* If bottom and top are both false, these two fields hold values as given by
+ ptr_info_def and get_pointer_alignment_1.  */
+  unsigned align;
+  unsigned misalign;
+
+  inline bool bottom_p () const;
+  inline bool top_p () const;
+  inline bool set_to_bottom ();
+  bool meet_with (unsigned new_align, unsigned new_misalign);
+  bool meet_with (const ipcp_alignment_lattice &other, HOST_WIDE_INT offset);
+  void print (FILE * f);
+private:
+  /* If set, this lattice is bottom and all other fields should be
+ disregarded.  */
+  bool bottom;
+  /* If bottom and not_top are false, the lattice is TOP.  If not_top is true,
+ the known alignment is stored in the fields align and misalign.  The field
+ is negated so that memset to zero initializes the lattice to TOP
+ state.  */
+  bool not_top;
+
+  bool meet_with_1 (unsigned new_align, unsigned new_misalign);
+};
+
 /* Structure containing lattices for a parameter itself and for pieces of
aggregates that are passed in the parameter or by a reference in a parameter
plus some other useful flags.  */
@@ -270,9 +300,8 @@ public:
   ipcp_lattice ctxlat;
   /* Lattices describing aggregate parts.  */
   ipcp_agg_lattice *aggs;
-  /* Alignment information.  Very basic one value lattice where !known means
- TOP and zero alignment bottom.  */
-  ipa_al

Re: [PATCH PR65161]

2015-02-25 Thread Uros Bizjak
On Wed, Feb 25, 2015 at 3:24 PM, Yuri Rumyantsev  wrote:
> Hi All,
>
> Here is updated patch to fix ICE.
>
> Is it OK for trunk?
>
> 2015-02-25  Yuri Rumyantsev  
>
> PR target/65161
> * config/i386/i386.c (ix86_sched_reorder): Skip instruction reordering
> for selective scheduling.
>
> gcc/testsuite/ChangeLog
> * gcc.target/i386/pr65161.c: New test.

OK.

Thanks,
Uros.


[patch, libstdc++] Use explicit relative imports for the pretty printers

2015-02-25 Thread Matthias Klose
When gdb is linked/used with Python 3, import of the pretty printers fails:

Traceback (most recent call last):
  File "/usr/share/gdb/auto-load/usr/lib/i386-linux-gnu/libstdc++.so.6.0.21-gdb.py", line 58, in <module>
    import libstdcxx.v6
  File "/usr/lib/i386-linux-gnu/../../share/gcc-5/python/libstdcxx/v6/__init__.py", line 19, in <module>
    from printers import register_libstdcxx_printers
ImportError: No module named 'printers'
[Inferior 1 (process 6130) exited normally]

Python3 doesn't support implicit relative imports anymore.  Use explicit
relative imports instead.  This syntax is compatible with Python 2.5 and newer
2.x versions.  Ok for the trunk?

  Matthias


2015-02-25  Matthias Klose  

* python/libstdcxx/v6/__init__.py: Use explicit relative imports.

Index: libstdc++-v3/python/libstdcxx/v6/__init__.py
===
--- libstdc++-v3/python/libstdcxx/v6/__init__.py   (revision 220970)
+++ libstdc++-v3/python/libstdcxx/v6/__init__.py   (working copy)
@@ -16,7 +16,7 @@
 import gdb

 # Load the pretty-printers.
-from printers import register_libstdcxx_printers
+from .printers import register_libstdcxx_printers
 register_libstdcxx_printers(gdb.current_objfile())

 # Load the xmethods if GDB supports them.
@@ -28,5 +28,5 @@
 return False

 if gdb_has_xmethods():
-    from xmethods import register_libstdcxx_xmethods
+    from .xmethods import register_libstdcxx_xmethods
     register_libstdcxx_xmethods(gdb.current_objfile())


Re: [PR58315] reset inlined debug vars at return-to point

2015-02-25 Thread Alexandre Oliva
On Feb 25, 2015, Richard Biener  wrote:

> But code-motion could still move stmts from the inlined functions
> across these resets?

Sure, just like it could still move stmts across any other debug stmts.
Once you return from a function, it's as if all of its variables ceased
to exist, so what is the problem with this?

The real error, IMHO, is to assume the moved instruction is still inside
the inline function.  It isn't.  If you wanted to inspect the variable
before it went out of scope, the debugger should have helped you stop
there, not stop you at an instruction that is outside the expected flow.

> That said - shouldn't this simply performed by proper var-tracking
> u-ops produced by a backward scan over the function for "live"
> scope-blocks?

Please elaborate (ideally with a patch); I have no idea of how you plan
to map scope blocks to their original (and thus correct) position (i.e.,
before any code motion).

> That is, when you see a scope block becoming live from exit then add
> u-ops resetting all vars from that scope block?

Oh, you want code motion artifacts to leak into the language VM
execution modeled in debug info?  That would be fundamentally against
the model implemented with debug stmts and insns.  /me mumbles something
about mixing metaphors and leaky screwdrivers ;-D

IOW, I don't think that would be appropriate at all.  Remember the VTA
presentation at the GCC Summit years ago, when I showed you just can't
hope to recover debug info from code already mangled by the compiler,
because optimizations are lossy?  You get to a point in which you don't
know what you don't know, so you're bound to make wrong decisions.  So,
take note of what you're going to need when you know it's still
accurate.

> Your patch as-is would add very many debug stmts to for example
> tramp3d.

Do you have any evidence that this would have a negative impact on
compile time, memory use or any other relevant metric?  Odds are that,
if it makes any difference whatsoever, it will be a very positive one.

The absolute worst case of this patch is doubling the debug stmt count
(proof: it will add at most one debug stmt per inlined variable that had
at least one debug stmt).

Now, if you're concerned about debug stmt count, we could introduce
another debug stmt/insn form that can reset multiple variables in a
single swoop.  It does not seem to me like this would be worth the
effort.

> And as you say, the same reasoning applies to scopes in general, not
> just for inlines.

I actually said the opposite.  We turn block-local variables into
function-wide declarations very early, so apparently we *can* reference
the variables even when they're out of scope.  But we *cannot* do that
for inlined variables.  That's why I drew the line where I did.  (Plus,
introducing the debug temps at the point I did was easy to try, and it
had a huge positive impact :-)

Sure we *could* introduce debug unbind stmts at the end of scopes.  If
you have evidence or even a hunch that this will have a positive effect
on relevant metrics, go for it!

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [patch] fix PR65048: check that jump-thread paths are still valid

2015-02-25 Thread Jeff Law

On 02/25/15 12:18, Sebastian Pop wrote:
> Jeff Law wrote:
>> On 02/18/15 15:27, Sebastian Pop wrote:
>>>> The dumps for the FSM threads are a bit sparse -- they don't show
>>>> the entire path.  That makes it much harder to see what's going on.
>>>
>>> Would a patch improving the FSM dumps ok to commit separately to trunk?
>> Most definitely.  I realize we're in stage4, but I'd approve such a change.
>
> Attached:
>
>  * tree-ssa-threadupdate.c (dump_jump_thread_path): Print all edges
>  of an EDGE_FSM_THREAD.
>
> I'm regstrapping on x86_64-linux.  Ok for trunk if it passes?

Yes.

jeff


Re: [patch] PR debug/46102 Disable -feliminate-dwarf2-dups when reading a PCH

2015-02-25 Thread Jason Merrill

On 02/25/2015 12:02 PM, Aldy Hernandez wrote:
> +  if (flag_eliminate_dwarf2_dups)
> +    {
> +      warning (0, "ignoring unimplemented option -feliminate-dwarf2-dups");
> +      flag_eliminate_dwarf2_dups = 0;
> +    }

I think we only want to disable it for C++, not all languages.

Jason



Re: [PR58315] reset inlined debug vars at return-to point

2015-02-25 Thread Alexandre Oliva
On Feb 25, 2015, Jakub Jelinek  wrote:

> various tools and users really want to
> be able to inspect variables and parameters on the return statement.

This patch won't affect the return statement.  The resets are at the
return-to statement; if you stop at the return statement (assuming you
have code generated for it in place), you should still get the same
bindings.  Now, if *all* the code for the return statement was optimized
away, and you stop at the subsequent instruction, that happens to be
past the return, then, yeah, you're out of luck, but then you were
already out of luck before.

Now, there is of course the case in which there is some code left in
place for the return stmt, but it is no longer in its natural position
in the code flow.  You stop there, you're technically out of the inline
scope, but now you're also past the "clobbering" of the inlined
variables.  The real solution for this is to implement the stmt
frontiers I presented at some GCC Summit, so that, when you stop at a
statement, you get the state you expect regardless of code motion,
because you get the state at the natural flow of control, even if no
actual code remained at that point.

> And another thing is the amount of the added debug stmts, right now we don't
> add debug stmts all the time for everything, just when something is needed,

We add debug stmts whenever a trackable (auto "gimple register")
variable is modified.  They are "clobbered" at the end of the inline
function they expanded out of, so this just corrects a long-standing
and quite expensive oversight.

You won't get debug stmts for unused inlined variables, for example: you
should only get them for variables that were remapped while inlining the
code to begin with.  If you got code for them and they are trackable,
you certainly got debug stmts for them as well.

> while your patch adds it unconditionally, even when debug stmts for those
> won't be really emitted.

It shouldn't.  Please show me?

> As they are just resets, that hopefully will not
> drastically affect var-tracking time

They will.  But in a positive way :-)

> but might affect other optimization passes, which would need to deal
> with much more statements than before.

It shouldn't be hard to test this hypothesis one way or another.  Tweak
the code that introduces the new debug temps so that they all refer to
the same fake variable, as opposed to resetting the intended variables,
and then you'll have an exact measure of the compile-time impact
*without* the savings afforded by the resets.  Then we can compare
whether it's an overall win or loss.

My measurements, for a not particularly unusual testcase, showed an
overall reduction of 63% in compile time, as indicated yesterday.  Now,
who should bear the burden of collecting evidence to back up the claims
against the change?  Are those concerns enough to hold it up?

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [patch] PR debug/46102 Disable -feliminate-dwarf2-dups when reading a PCH

2015-02-25 Thread Mike Stump
On Feb 25, 2015, at 1:13 PM, Jason Merrill  wrote:
> On 02/25/2015 12:02 PM, Aldy Hernandez wrote:
>> +  if (flag_eliminate_dwarf2_dups)
>> +{
>> +  warning (0, "ignoring unimplemented option -feliminate-dwarf2-dups");
>> +  flag_eliminate_dwarf2_dups = 0;
>> +}
> 
> I think we only want to disable it for C++, not all languages.

And Objective-C++ (if you strcmp the name in a dwarf file).  I'd prefer
flag_eliminate_dwarf2_dups = 0 in the C++ startup code someplace.

Re: [PR58315] reset inlined debug vars at return-to point

2015-02-25 Thread Jakub Jelinek
On Wed, Feb 25, 2015 at 06:17:33PM -0300, Alexandre Oliva wrote:
> My measurements, for a not particularly unusual testcase, showed an
> overall reduction of 63% in compile time, as indicated yesterday.  Now,
> who should bear the burden of collecting evidence to back up the claims
> against the change?  Are those concerns enough to hold it up?

Can you e.g. run dwlocstat on some larger C++ binaries built without and
with your patch?  I believe dwlocstat is supposed to count only the
instructions where the variables or parameters are in scope, so should be
exactly what we care about here.
E.g. cc1plus and libstdc++.so.6 might be good candidates from gcc itself,
perhaps firefox or similar as something even larger.

Jakub


[SH] Adding some peepholes (PR 61142)

2015-02-25 Thread Oleg Endo
Hi,

These are the peepholes as mentioned in PR 65153 and in PR 61142.  They
try to wallpaper some bad RA choices and reduce the CSiBE code size by
approx. 3.9K bytes.

A problem I ran into with this one is that the peephole2 pass drops
REG_INC notes, which makes the following passes produce garbage
sometimes.  Instead of rejecting automodify mems in the peephole2
patterns, for now I'm manually adding the REG_INC notes after emitting
move insns.  Maybe peephole2 could do that automatically in the future.

Tested with
make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}".

Kaz, could you also please pre-test this on sh4-linux?

Cheers,
Oleg

gcc/ChangeLog
PR target/61142
* config/sh/sh.c (sh_check_add_incdec_notes): New function.
* config/sh/sh-protos.h (sh_check_add_incdec_notes): Declare it.
* config/sh/predicates.md (const_logical_operand): New predicate.
* config/sh/sh.md: Add new peephole2 patterns.
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 220947)
+++ gcc/config/sh/sh.md	(working copy)
@@ -14532,6 +14532,179 @@
 	(mem:HI (plus:SI (match_dup 1) (match_dup 2]
   "")
 
+;;	extu.bw	a,b
+;;	mov	b,c	->	extu.bw	a,c
+(define_peephole2
+  [(set (match_operand:SI 0 "arith_reg_dest")
+	(zero_extend:SI (match_operand:QIHI 1 "arith_reg_operand")))
+   (set (match_operand:SI 2 "arith_reg_dest")
+	(match_dup 0))]
+  "TARGET_SH1 && peep2_reg_dead_p (2, operands[0])"
+  [(set (match_dup 2) (zero_extend:SI (match_dup 1)))])
+
+;;	mov	r0,r1
+;;	extu.bw	r1,r1   ->	extu.bw	r0,r1
+(define_peephole2
+  [(set (match_operand 0 "arith_reg_dest")
+	(match_operand 1 "arith_reg_operand"))
+   (set (match_operand:SI 2 "arith_reg_dest")
+	(zero_extend:SI (match_operand:QIHI 3 "arith_reg_operand")))]
+  "TARGET_SH1
+   && REGNO (operands[0]) == REGNO (operands[3])
+   && (REGNO (operands[0]) == REGNO (operands[2])
+   || peep2_reg_dead_p (2, operands[0]))"
+  [(set (match_dup 2) (zero_extend:SI (match_dup 1)))]
+{
+  operands[1] = gen_rtx_REG (<MODE>mode, REGNO (operands[1]));
+})
+
+;;	mov	a,b
+;;	mov	b,a	->	< nop >
+(define_peephole2
+  [(set (match_operand 0 "register_operand")
+	(match_operand 1 "register_operand"))
+   (set (match_operand 2 "register_operand")
+	(match_operand 3 "register_operand"))]
+  "TARGET_SH1
+   && REGNO (operands[0]) == REGNO (operands[3])
+   && REGNO (operands[1]) == REGNO (operands[2])
+   && peep2_reg_dead_p (2, operands[3])"
+  [(const_int 0)])
+
+;;	mov	#3,r4
+;;	and	r4,r1	->	mov	r1,r0
+;;	mov	r1,r0		and	#3,r0
+(define_code_iterator ANDIORXOR [and ior xor])
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+	(match_operand:SI 1 "const_logical_operand"))
+   (set (match_operand:SI 2) (ANDIORXOR:SI (match_dup 2) (match_dup 0)))
+   (set (reg:SI R0_REG) (match_dup 2))]
+  "TARGET_SH1
+   && peep2_reg_dead_p (3, operands[0]) && peep2_reg_dead_p (3, operands[2])"
+  [(set (reg:SI R0_REG) (match_dup 2))
+   (set (reg:SI R0_REG) (ANDIORXOR:SI (reg:SI R0_REG) (match_dup 1)))])
+
+;;	...	r2,r0		...	r2,r0
+;;	or	r1,r0	->	or	r0,r1
+;;	mov	r0,r1
+;;	(r0 dead)
+(define_code_iterator ANDIORXORPLUS [and ior xor plus])
+(define_peephole2
+  [(set (match_operand:SI 0 "arith_reg_dest")
+	(ANDIORXORPLUS:SI (match_dup 0) (match_operand:SI 1 "arith_reg_dest")))
+   (set (match_dup 1) (match_dup 0))]
+  "TARGET_SH1 && peep2_reg_dead_p (2, operands[0])"
+  [(set (match_dup 1) (ANDIORXORPLUS:SI (match_dup 1) (match_dup 0)))])
+
+;;	mov	r12,r0
+;;	add	#-48,r0 ->	add	#-48,r12
+;;	mov.l	r0,@(4,r10)	mov.l	r12,@(4,r10)
+;;	(r12 dead)
+(define_peephole2
+  [(set (match_operand:SI 0 "arith_reg_dest")
+	(match_operand:SI 1 "arith_reg_dest"))
+   (set (match_dup 0) (plus:SI (match_dup 0)
+			   (match_operand:SI 2 "const_int_operand")))
+   (set (match_operand:SI 3 "general_movdst_operand") (match_dup 0))]
+  "TARGET_SH1
+   && peep2_reg_dead_p (2, operands[1]) && peep2_reg_dead_p (3, operands[0])"
+  [(const_int 0)]
+{
+  emit_insn (gen_addsi3 (operands[1], operands[1], operands[2]));
+  sh_check_add_incdec_notes (emit_move_insn (operands[3], operands[1]));
+})
+
+;;	mov.l	@(r0,r9),r1
+;;	mov	r1,r0	->	mov	@(r0,r9),r0
+(define_peephole2
+  [(set (match_operand:SI 0 "arith_reg_dest")
+	(match_operand:SI 1 "general_movsrc_operand"))
+   (set (match_operand:SI 2 "arith_reg_dest")
+	(match_dup 0))]
+  "TARGET_SH1 && peep2_reg_dead_p (2, operands[0])"
+  [(const_int 0)]
+{
+  sh_check_add_incdec_notes (emit_move_insn (operands[2], operands[1]));
+})
+
+(define_peephole2
+  [(set (match_operand:QIHI 0 "register_operand")
+	(match_operand:QIHI 1 "movsrc_no_disp_mem_operand"))
+   (set (match_operand:QIHI 2 "register_operand")
+	(match_dup 0))]
+  "TARGET_SH1 && peep2_reg_dead_p (2, operands[0])"
+  [(const_int 0)]
+{
+  sh_check_add_incdec_notes (emit_move_insn (operands[2], operands[1]));
+})
+
+(define_peephole

Re: [patch] fix PR65048: check that jump-thread paths are still valid

2015-02-25 Thread Jeff Law

On 02/25/15 11:37, Sebastian Pop wrote:

Jeff Law wrote:

Registering FSM jump thread: (10, 12)  (12, 13)  (13, 15)  (15, 3)

[ snip ]

Registering FSM jump thread: (7, 10)  (10, 12)  (12, 13)  (13, 14)


What I'm having a bit of trouble wrapping my head around is how can
those two paths both be valid when you register them?  They have
different transitions out of bb13, one going to bb15 the other to
bb14, but they're both coming in via (10, 12).


Here is the output of debug_loops(3) showing the two paths before we start the
FSM code generation:

 bb_7 (preds = {bb_4 }, succs = {bb_8 bb_10 bb_11 })
 {
 :
   # .MEM_26 = VDEF <.MEM_1>
   a = 84;
   switch (x_8(D)) , case 65: , case 85: >
 }
 bb_10 (preds = {bb_9 bb_5 bb_7 }, succs = {bb_12 })
 {
   # .MEM_30 = PHI <.MEM_4(9), .MEM_20(5), .MEM_26(7)>
   # _34 = PHI <_7(9), 65(5), 84(7)>
 :
   goto  ();
 }
 bb_12 (preds = {bb_9 bb_11 bb_10 }, succs = {bb_3 bb_13 })
 {
   # _13 = PHI <_12(9), 65(11), 84(10)>
   # .MEM_32 = PHI <.MEM_4(9), .MEM_31(11), .MEM_30(10)>
   # _36 = PHI <_7(9), _35(11), _34(10)>
 :
   # .MEM_6 = VDEF <.MEM_32>
   b = _13;
   # VUSE <.MEM_6>
   c.0_10 = c;
   switch (c.0_10) , case 85: >
 }
 bb_13 (preds = {bb_12 }, succs = {bb_14 bb_15 })
 {
 :
   switch (_36) , case 71: >
 }
 bb_14 (preds = {bb_13 bb_6 bb_8 }, succs = {bb_15 })
 {
   # .MEM_33 = PHI <.MEM_6(13), .MEM_23(6), .MEM_28(8)>
   # _37 = PHI <_36(13), 65(6), 84(8)>
   # _39 = PHI <_13(13), _12(6), _12(8)>
 :
   # .MEM_18 = VDEF <.MEM_33>
   fn ();
 }
 bb_15 (preds = {bb_13 bb_14 }, succs = {bb_3 bb_16 })
 {
   # .MEM_17 = PHI <.MEM_6(13), .MEM_18(14)>
   # _38 = PHI <_36(13), _37(14)>
   # _40 = PHI <_13(13), _39(14)>
 :
   switch (_40) , case 65: >
 }
 bb_3 (preds = {bb_16 bb_15 bb_12 bb_6 bb_8 }, succs = {bb_4 })
 {
   # .MEM_3 = PHI <.MEM_19(16), .MEM_17(15), .MEM_6(12), .MEM_23(6), .MEM_28(8)>
   # _9 = PHI <_38(16), _38(15), _36(12), 65(6), 84(8)>
   # _16 = PHI <65(16), _40(15), _13(12), _12(6), _12(8)>
 :
 }

First, let's look at why we jump thread from 10 to 3:

Registering FSM jump thread: (10, 12)  (12, 13)  (13, 15)  (15, 3)


In other words, let's see how we can infer that "from bb_15 we are guaranteed to
jump into bb_3 if we come from bb_10."

So this switch in bb_15 is guaranteed to jump to the default case:
switch (_40) , case 65: >

  # _40 = PHI <_13(13), _39(14)>

because when coming from bb_13, "_40 = _13", and then in bb_12 we have the
definition
# _13 = PHI <_12(9), 65(11), 84(10)>

and so if we come from bb_10, the value of _13 is 84.
Because 84 != 65, switch (_40) takes the default case, that is, a jump from
bb_15 to bb_3.


Now let's see how we jump thread from 7 to 14:

Registering FSM jump thread: (7, 10)  (10, 12)  (12, 13)  (13, 14)


Why do we know that from bb_13 we necessarily jump to bb_14 if we have just
executed the code in bb_7?

In other words, why do we jump to the default case of
switch (_36) , case 71: >

# _36 = PHI <_7(9), _35(11), _34(10)>
# _34 = PHI <_7(9), 65(5), 84(7)>

So, coming from bb_7, the value of the switch is 84, which differs from case 71,
and we jump to the default case in "switch (_36) , case 71: >".
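
The two derivations above can be replayed mechanically. What follows is only a
minimal sketch, not the threader's actual code: it transcribes the four PHI
nodes quoted in this message into a table and follows the incoming operand for
each edge of a path. The names `Phi`, `PhiTable`, `pr65048_phis`,
`resolve_on_path` and `takes_default` are hypothetical and exist only for this
illustration; the sketch also assumes that no block appears twice on a path.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// A PHI argument is either an integer constant or another SSA name.
using Operand = std::variant<int, std::string>;

struct Phi {
  int block;                    // basic block that holds the PHI
  std::map<int, Operand> args;  // predecessor block -> incoming operand
};

using PhiTable = std::map<std::string, Phi>;

// The PHI nodes quoted in the dump, transcribed by hand.
PhiTable pr65048_phis()
{
  return {
    {"_13", {12, {{9, std::string("_12")}, {11, 65}, {10, 84}}}},
    {"_40", {15, {{13, std::string("_13")}, {14, std::string("_39")}}}},
    {"_36", {12, {{9, std::string("_7")}, {11, std::string("_35")},
                  {10, std::string("_34")}}}},
    {"_34", {10, {{9, std::string("_7")}, {5, 65}, {7, 84}}}},
  };
}

// Try to fold NAME to a constant, knowing only the blocks on PATH
// (listed in execution order).  Returns nullopt when the value
// depends on something outside the path.
std::optional<int> resolve_on_path(const PhiTable &phis,
                                   const std::vector<int> &path,
                                   std::string name)
{
  assert(!path.empty());
  std::size_t limit = path.size();  // only path[0 .. limit) is searched
  for (;;) {
    auto def = phis.find(name);
    if (def == phis.end())
      return std::nullopt;          // not defined by a PHI we know about
    const Phi &phi = def->second;

    // Locate the defining block on the path, scanning backwards.
    std::size_t i = limit;
    while (i > 0 && path[i - 1] != phi.block)
      --i;
    if (i < 2)
      return std::nullopt;          // block absent, or no predecessor on path

    auto arg = phi.args.find(path[i - 2]);
    if (arg == phi.args.end())
      return std::nullopt;          // edge not listed in the PHI
    if (const int *constant = std::get_if<int>(&arg->second))
      return *constant;

    // The operand is another SSA name: keep walking towards the path start.
    name = std::get<std::string>(arg->second);
    limit = i - 1;
  }
}

// A switch goes to its default label when VALUE matches no explicit case.
bool takes_default(int value, const std::vector<int> &cases)
{
  for (int label : cases)
    if (label == value)
      return false;
  return true;
}
```

With this table, resolving `_40` along (10, 12, 13, 15) and `_36` along
(7, 10, 12, 13) both give 84, so both switches take their default case. Asking
for `_36` along the first path gives no constant, because that path does not
say how bb_10 was reached.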


let's make sure the paths that are registered are reasonable first


I think it is reasonable to jump thread these two paths.
So this is why the debugging output is so important ;-)  We might
consider further enhancing the debug output to mark the state of each of
those intermediate blocks; of particular importance is whether or not
the block has a known static destination on the path.


In this test, the branch out of 13 is not known statically in the first
path, but is known in the second path.  Thus 13' in the first path will
continue to have a conditional branch (to 14 or 15', where 15' always
goes to 3).  13'' in the second path will unconditionally transfer to 14.


So, now time for me to go read that patch thoroughly.

Thanks,

jeff



C++ PATCH for c++/65209 (linkage of anonymous namespace)

2015-02-25 Thread Jason Merrill
There are two issues with this testcase.  One, when we internalize a decl
because it involves an anonymous namespace we need to clear DECL_COMDAT,
now that we're setting it early.  Two, our handling of pointers to
local functions also needs to handle references.
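
For readers who want the second issue in isolation, here is a reduced sketch,
separate from the testcase in the patch, of the two non-type template argument
forms that need the same treatment: a pointer to, and a reference to, a
function in an anonymous namespace. The names `local_fn`, `ByPointer`,
`ByReference` and `call_both` are made up for this illustration.

```cpp
namespace {
  // Internal linkage: only visible in this translation unit.
  inline int local_fn() { return 42; }
}

// Non-type template parameter that is a pointer to function.
template <int (*fn)()>
struct ByPointer
{
  int operator()() const { return fn(); }
};

// Non-type template parameter that is a reference to function.
template <int (&fn)()>
struct ByReference
{
  int operator()() const { return fn(); }
};

// Both specializations name a function from the anonymous namespace,
// so neither should be emitted with external (COMDAT) linkage.
static ByPointer<&local_fn> via_pointer;
static ByReference<local_fn> via_reference;

int call_both()
{
  return via_pointer() + via_reference();
}
```

The expectation is that both specializations are constrained to internal
linkage. The testcase in the patch checks the reference form by scanning the
assembler output for "comdat".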


Tested x86_64-pc-linux-gnu, applying to trunk.  Also applying the second 
hunk to 4.9, as the second issue is a regression there as well.


commit 4e63fb3b23cc61a1e8709ae153e7f8e77bb09e39
Author: Jason Merrill 
Date:   Wed Feb 25 15:42:35 2015 -0500

	PR c++/65209
	* decl2.c (constrain_visibility) [VISIBILITY_ANON]: Clear
	DECL_COMDAT.
	(constrain_visibility_for_template): Handle reference arguments.

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index a7bc08f..a4a5ebf 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -2175,6 +2175,7 @@ constrain_visibility (tree decl, int visibility, bool tmpl)
 	  TREE_PUBLIC (decl) = 0;
 	  DECL_WEAK (decl) = 0;
 	  DECL_COMMON (decl) = 0;
+	  DECL_COMDAT (decl) = false;
 	  if (TREE_CODE (decl) == FUNCTION_DECL
 	  || TREE_CODE (decl) == VAR_DECL)
 	{
@@ -2215,9 +2216,12 @@ constrain_visibility_for_template (tree decl, tree targs)
   tree arg = TREE_VEC_ELT (args, i-1);
   if (TYPE_P (arg))
 	vis = type_visibility (arg);
-  else if (TREE_TYPE (arg) && POINTER_TYPE_P (TREE_TYPE (arg)))
+  else
 	{
-	  STRIP_NOPS (arg);
+	  if (REFERENCE_REF_P (arg))
+	arg = TREE_OPERAND (arg, 0);
+	  if (TREE_TYPE (arg))
+	STRIP_NOPS (arg);
 	  if (TREE_CODE (arg) == ADDR_EXPR)
 	arg = TREE_OPERAND (arg, 0);
 	  if (VAR_OR_FUNCTION_DECL_P (arg))
diff --git a/gcc/testsuite/g++.dg/abi/anon4.C b/gcc/testsuite/g++.dg/abi/anon4.C
new file mode 100644
index 000..088ba99
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/anon4.C
@@ -0,0 +1,41 @@
+// PR c++/65209
+// { dg-final { scan-assembler-not "comdat" } }
+
+// Everything involving the anonymous namespace bits should be private, not
+// COMDAT.
+
+struct Bar
+{
+  static Bar *self();
+  char pad[24];
+};
+
+template <Bar *(&holderFunction)()>
+struct BarGlobalStatic
+{
+  Bar *operator()() { return holderFunction(); }
+};
+
+namespace {
+  namespace Q_QGS_s_self {
+inline Bar *innerFunction() {
+  static struct Holder {
+	Bar value;
+	~Holder() {}
+  } holder;
+  return &holder.value;
+}
+  }
+}
+static BarGlobalStatic<Q_QGS_s_self::innerFunction> s_self;
+
+Bar *Bar::self()
+{
+  return s_self();
+}
+
+int main(int argc, char *argv[])
+{
+  Bar* bar = Bar::self();
+  return 0;
+}

