date:20141020

[PATCH] Improve scheduler dumps of ready list

2014-10-20 Thread Maxim Kuvyrkov

Hi,

Following the previous improvement to scheduler dumps, which showed which
heuristics in rank_for_schedule make the most decisions, this patch adds
printouts that show the deciding reason for each instruction in the ready list
being at its particular place.
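
To illustrate the idea for readers of the archive, here is a minimal sketch of a comparison function that remembers which heuristic decided the order of two instructions. It is not the code from the patch: the struct, the three heuristics, and every name (rfs_reason, last_win, rank_insns) are assumptions made up for this example.

```c
#include <assert.h>

/* Hypothetical heuristic identifiers; the real list in GCC's scheduler
   is longer and is not reproduced here.  */
enum rfs_reason { RFS_NONE, RFS_PRIORITY, RFS_PRESSURE, RFS_UID, RFS_N };

struct insn
{
  int uid;                   /* unique id, final tie-breaker */
  int priority;              /* higher priority goes first */
  int pressure;              /* lower register-pressure cost goes first */
  enum rfs_reason last_win;  /* heuristic that last decided this insn's place */
};

/* Record REASON on both instructions and pass RESULT through.  */
static int
rfs_result (enum rfs_reason reason, int result, struct insn *a, struct insn *b)
{
  a->last_win = reason;
  b->last_win = reason;
  return result;
}

/* Return a negative value if A should go before B, positive otherwise.
   The first heuristic that distinguishes the two is recorded.  */
int
rank_insns (struct insn *a, struct insn *b)
{
  if (a->priority != b->priority)
    return rfs_result (RFS_PRIORITY, b->priority - a->priority, a, b);
  if (a->pressure != b->pressure)
    return rfs_result (RFS_PRESSURE, a->pressure - b->pressure, a, b);
  return rfs_result (RFS_UID, a->uid - b->uid, a, b);
}

/* Name to print next to an instruction when dumping the ready list.  */
const char *
rfs_reason_name (enum rfs_reason r)
{
  static const char *const names[RFS_N]
    = { "none", "priority", "pressure", "uid" };
  assert (r < RFS_N);
  return names[r];
}
```

A dump routine would then print rfs_reason_name (insn->last_win) beside each ready-list entry, which is roughly the kind of output described above.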

This patch allowed me to troubleshoot several problems in register-pressure
scheduling.

Tested on x86_64-linux-gnu, arm-linux-gnueabihf and aarch64-linux-gnu.

OK to apply?

Thank you, 

--
Maxim Kuvyrkov
www.linaro.org





Re: [PATCH] Fix for PR63569

2014-10-20 Thread Richard Biener

On Fri, Oct 17, 2014 at 3:36 PM, Martin Liška  wrote:
> Hello.
>
> Following patch fixes PR63569.
>
> Bootstrap executed on ppc64-linux and no regression seen on x86_64-pc-linux.
> Ready for trunk?

Um.  As suggested in the bug report I replied to, please work on splitting
out general operand vs. memory operand compare.

+bool
+func_checker::compare_volatility (tree t1, tree t2)
+{
+  if (t1 && t2)
+    return TREE_THIS_VOLATILE (t1) == TREE_THIS_VOLATILE (t2);
+
+  return !(t1 || t2);

The last check looks unrelated to me.  Either you want to do this
quick check inline for all operand compares or defer to compare_operand?

Btw, I think the volatility check would be better placed in the memory
operand compare function when comparing handled-components and
decls.
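
For reference, the behaviour of the quoted helper can be reproduced in a standalone sketch. The node type below is a stand-in for GCC's tree with only a volatile flag (an assumption for illustration); the logic mirrors the quoted hunk, including the final null check that the review calls unrelated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a GCC tree node; only the volatile flag is modelled.  */
struct node { bool is_volatile; };

/* Mirror of the quoted hunk: operands match if both exist and agree on
   volatility, or if both are missing.  */
bool
compare_volatility (const struct node *t1, const struct node *t2)
{
  if (t1 && t2)
    return t1->is_volatile == t2->is_volatile;

  /* The check questioned in the review: it compares presence, not
     volatility.  */
  return !(t1 || t2);
}

/* Small usage example covering the four interesting cases.  */
void
compare_volatility_selftest (void)
{
  struct node v = { true }, nv = { false };
  assert (compare_volatility (&v, &v));
  assert (!compare_volatility (&v, &nv));
  assert (compare_volatility (NULL, NULL));
  assert (!compare_volatility (&v, NULL));
}
```

The last two cases show why the null handling reads as a separate concern: it answers "are both operands present?" rather than "do they agree on volatility?".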

Richard.

> Thank you,
> Martin

Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.

2014-10-20 Thread Richard Biener

On Fri, Oct 17, 2014 at 4:09 PM, Yuri Rumyantsev  wrote:
> Richard,
>
> I reworked the patch as you proposed, but I didn't understand what
> you meant by:
>
>>So please rework the patch so critical edges are always handled
>>correctly.
>
> In the current patch flag_force_vectorize is used (1) to reject phi nodes
> with more than 2 arguments; (2) to reject basic blocks with only
> critical incoming edges, since support for extended predication of phi
> nodes will be in the next patch.

I mean that (2) should not be rejected depending on flag_force_vectorize.
It was rejected because if-cvt couldn't handle it correctly before but with
this patch this is fixed.  I see no reason to still reject this then even
for !flag_force_vectorize.

Rejecting PHIs with more than two arguments with flag_force_vectorize
is ok.
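
As background for the discussion of critical edges, here is a minimal sketch of the predicate in question, assuming a toy CFG representation (struct block and struct edge below are invented for this example and are not GCC's basic_block or edge types). An edge is critical when its source has more than one successor and its destination has more than one predecessor.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_EDGES 4

struct block;

/* Toy CFG edge from SRC to DEST.  */
struct edge { struct block *src, *dest; };

/* Toy basic block: edge counts plus the list of incoming edges.  */
struct block
{
  int n_preds, n_succs;
  struct edge *preds[MAX_EDGES];
};

/* An edge is critical if its source has several successors and its
   destination has several predecessors.  */
bool
edge_critical_p (const struct edge *e)
{
  return e->src->n_succs > 1 && e->dest->n_preds > 1;
}

/* Return true if every incoming edge of BB is critical.  A block with
   no predecessors trivially satisfies this.  */
bool
all_preds_critical_p (const struct block *bb)
{
  assert (bb->n_preds <= MAX_EDGES);
  for (int i = 0; i < bb->n_preds; i++)
    if (!edge_critical_p (bb->preds[i]))
      return false;
  return true;
}
```

With this definition, a join block reached from a conditional branch and from a fall-through block has one critical and one non-critical incoming edge, so the predicate is false for it.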

Richard.

> Could you please clarify your statement.
>
> I attached modified patch.
>
> ChangeLog:
>
> 2014-10-17  Yuri Rumyantsev  
>
> (flag_force_vectorize): New variable.
> (edge_predicate): New function.
> (set_edge_predicate): New function.
> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list
> if destination block of edge is not always executed. Set-up predicate
> for critical edge.
> (if_convertible_phi_p): Accept phi nodes with more than two args
> if FLAG_FORCE_VECTORIZE was set-up.
> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
> (if_convertible_stmt_p): Fix up pre-function comments.
> (all_edges_are_critical): New function.
> (if_convertible_bb_p): Use call of all_preds_critical_p
> to reject block if-conversion with incoming critical edges only if
> FLAG_FORCE_VECTORIZE was not set-up.
> (predicate_bbs): Skip loop exit block also.  Invoke build2_loc
> to compute predicate instead of fold_build2_loc.
> Add zeroing of edge 'aux' field.
> (find_phi_replacement_condition): Extend function interface:
> it returns NULL if given phi node must be handled by means of
> extended phi node predication. If number of predecessors of phi-block
> is equal to 2 and at least one incoming edge is not critical, the original
> algorithm is used.
> (tree_if_conversion): Temporarily set FLAG_FORCE_VECTORIZE to false.
> Nullify 'aux' field of edges for blocks with two successors.
>
>
>
>
> 2014-10-17 13:09 GMT+04:00 Richard Biener :
>> On Thu, Oct 16, 2014 at 5:42 PM, Yuri Rumyantsev  wrote:
>>> Richard,
>>>
>>> Here is reduced patch as you requested. All your remarks have been fixed.
>>> Could you please look at it ( I have already sent the patch with
>>> changes in add_to_predicate_list for review).
>>
>> + if (dump_file && (dump_flags & TDF_DETAILS))
>> +   fprintf (dump_file, "More than two phi node args.\n");
>> + return false;
>> +   }
>> +
>> +}
>>
>> Excess vertical space.
>>
>>
>> +/* Assumes that BB has more than 2 predecessors.
>>
>> More than 1 predecessor?
>>
>> +   Returns false if at least one successor is not on critical edge
>> +   and true otherwise.  */
>> +
>> +static inline bool
>> +all_edges_are_critical (basic_block bb)
>> +{
>>
>> "all_preds_critical_p" would be a better name
>>
>> +  if (EDGE_COUNT (bb->preds) > 2)
>> +{
>> +  if (!flag_force_vectorize)
>> +   return false;
>> +}
>>
>> as I said in the last review I don't think we should restrict edge
>> predicates to flag_force_vectorize.  At least I can't see how
>> if-conversion is magically more expensive for that case?
>>
>> So please rework the patch so critical edges are always handled
>> correctly.
>>
>> Ok with that and the above suggested changes.
>>
>> Thanks,
>> Richard.
>>
>>
>>> Thanks.
>>> Yuri.
>>> ChangeLog
>>> 2014-10-16  Yuri Rumyantsev  
>>>
>>> (flag_force_vectorize): New variable.
>>> (edge_predicate): New function.
>>> (set_edge_predicate): New function.
>>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list
>>> if destination block of edge is not always executed. Set-up predicate
>>> for critical edge.
>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>> if FLAG_FORCE_VECTORIZE was set-up.
>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>>> (if_convertible_stmt_p): Fix up pre-function comments.
>>> (all_edges_are_critical): New function.
>>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical
>>> to reject block if-conversion with incoming critical edges only if
>>> FLAG_FORCE_VECTORIZE was not set-up.
>>> (predicate_bbs): Skip loop exit block also.  Invoke build2_loc
>>> to compute predicate instead of fold_build2_loc.
>>> Add zeroing of edge 'aux' field.
>>> (find_phi_replacement_condition): Extend function interface:
>>> it returns NULL if given phi node must be handled by means of
>>> extended phi node predication. If number of predecessors of phi-block
>>> is equal to 2 and at least one incoming edge is not critical, the original
>>> algorithm is used.
>>> (tree_if_conversion): Temporary set-up F

Re: [patch] Create cfgrtl.h

2014-10-20 Thread Richard Biener

On Fri, Oct 17, 2014 at 6:44 PM, Andrew MacLeod  wrote:
> Rather than trying to flatten basic-block.h and do all the work associated
> in one big patch,  I'll try to do it in smaller steps :-)
>
> This patch creates cfgrtl.h to maintain the prototypes for functions
> exported from cfgrtl.c.  For the moment, basic-block.h  includes cfgrtl.h,
> keeping everything compiling.
>
> When basic-block.h gets flattened, I'll reduce inclusion of cfgrtl.h to only
> files which actually need it.
>
> I also took a couple of trivial things out of basic-block.h that didn't
> belong there:
>  - extern const struct gcov_ctr_summary *profile_info;  belonged in
> profile.h since it is exported from profile.c. This required a few .c files
> to include profile.h now.
>  - the prototypes for gt_ggc_mx (edge_def *e) and gt_pch_nx (edge_def *e)
> were moved to tree-cfg.h since they are declared in tree-cfg.c.
>
> Bootstraps on x86_64-unknown-linux-gnu; regression tests are running, but
> compilation is likely to be enough to confirm it's correct.
>
> Assuming all is fine, OK for trunk?

Ok.

Thanks,
Richard.

> Andrew

Re: [PATCH] PR preprocessor/42014

2014-10-20 Thread Krzesimir Nowak

2014-10-18 23:07 GMT+02:00 Krzesimir Nowak :
> Hello.
>
> This is my first patch for GCC. I have already started the paperwork for
> copyright assignment (sent an email to fsf-records at gnu org) and am
> waiting for a response.
>
> So, about this patch - it basically removes column printing from "In
> file included from ..." lines, as the column information always
> returned 0. Not sure if this is a correct assumption - I tested only C
> and C++, so I don't know if other frontends (ada, go?) provide column
> information for include lines. Anyway, column information here is
> probably not useful.
>
> Or maybe it is, if GCC supports some language with include syntax like
> the following:
> #include , , 
>
> Maybe in this case printing the column number makes sense?
>
> I need help with testcase - I don't know how to implement it
> correctly. The output of compilation is something like this:
>
> In file included from .../pr42014-2.h:2,
>  from .../pr42014-1.h:3,
>  from .../pr42014.c:4:
> .../pr42014-3.h:1:7: error: 'foo' was not declared in this scope
>
> How do I check the "from" lines? Is there some dg-foo (dg-grep?) command
> for it? dg-excess-errors is likely not suited for this purpose.

I suppose I will have to add a preprocessed file and try using dg-message.

>
> Also, do I need to run make -k check for both vanilla and changed GCC
> to compare the results? These tests take ages to complete, so maybe
> there is some subset of tests which is enough for regression checking
> in this case? Currently I am only running the following command in the gcc
> directory:
> make check-c++ RUNTESTFLAGS="-v dg.exp=cpp/pr42014.c"
>
> Krzesimir Nowak (1):
>   Fix PR preprocessor/42014
>
>  gcc/ChangeLog  |  6 ++
>  gcc/diagnostic.c   | 27 +++
>  gcc/testsuite/ChangeLog|  8 
>  gcc/testsuite/c-c++-common/cpp/pr42014-1.h |  3 +++
>  gcc/testsuite/c-c++-common/cpp/pr42014-2.h |  2 ++
>  gcc/testsuite/c-c++-common/cpp/pr42014-3.h |  1 +
>  gcc/testsuite/c-c++-common/cpp/pr42014.c   |  8 
>  7 files changed, 43 insertions(+), 12 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr42014-1.h
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr42014-2.h
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr42014-3.h
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr42014.c
>
> --
> 1.9.3
>

Re: intN patch 3/5: main int128 -> __intN conversion.

2014-10-20 Thread Andreas Schwab

DJ Delorie  writes:

>> FAIL: g++.dg/init/enum1.C  -std=gnu++11  (test for errors, line 12)
>> FAIL: g++.dg/init/enum1.C  -std=gnu++1y  (test for errors, line 12)
>> FAIL: g++.dg/init/enum1.C  -std=gnu++98  (test for errors, line 12)
>> 
>> That used to complain about "enum1.C:12:1: error: no integral type can
>> represent all of the enumerator values for 'test'"
>
> On what host?  It works OK on x86-64.

Have you considered doing proper testing?

http://gcc.gnu.org/ml/gcc-testresults/2014-10/msg02106.html

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: [PATCH PR63530] Fix the pointer alignment in vectorization

2014-10-20 Thread Richard Biener

On Fri, Oct 17, 2014 at 7:58 PM, Carrot Wei  wrote:
> Hi
>
> In the current vectorization pass, when a new vector pointer is created,
> its alignment is not set correctly. We should use DR_MISALIGNMENT (dr)
> since only this alignment is adjusted when loop peeling or
> multi-versioning occurs.
>
> This patch passed following tests:
> x86_64 bootstrap.
> x86_64 regression test.
> armv7 regression test.
>
> OK for trunk and 4.9 branch?

I miss a testcase.  I also miss a comment before this code explaining
why DR_MISALIGNMENT if not -1 is valid and why it is not valid if
'offset' is supplied (what about 'byte_offset' btw?).  Also if peeling
for alignment aligned this ref (misalign == 0) you don't set the alignment.

Thus you may fix a bug (not sure without a testcase) but the new code
certainly doesn't look 100% correct.

That said, I would have expected that we can unconditionally do

 set_ptr_info_alignment (..., align, misalign)

if misalign is != -1 and if we adjust misalign by offset * step + byte_offset
(usually both are constants).

Also we can still trust the alignment copied from addr_base modulo
vector element size even if DR_MISALIGNMENT is -1.  This may matter
for targets that require element-alignment for vector accesses.
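
The misalignment adjustment suggested above can be written out as plain arithmetic. This is only a sketch under assumptions: adjust_misalign is a made-up helper, all quantities are treated as compile-time constants in bytes, and -1 stands for "unknown", as DR_MISALIGNMENT does in the discussion.

```c
#include <assert.h>

/* Given that a base pointer is MISALIGN bytes past an ALIGN-byte
   boundary, return the misalignment of
   base + OFFSET * STEP + BYTE_OFFSET relative to the same boundary.
   Returns -1 if MISALIGN is unknown (negative).  ALIGN must be a
   nonzero power of two.  */
int
adjust_misalign (int misalign, unsigned align, long offset, long step,
                 long byte_offset)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  if (misalign < 0)
    return -1;

  long a = (long) align;
  long delta = offset * step + byte_offset;
  long m = ((long) misalign + delta) % a;

  /* C's % can yield a negative remainder for a negative displacement;
     fold it back into the range [0, ALIGN).  */
  if (m < 0)
    m += a;
  return (int) m;
}
```

For example, a base that is 4 bytes past a 16-byte boundary, advanced by one 4-byte step, ends up 8 bytes past the boundary, and that pair (16, 8) is what one would hand to a call like the set_ptr_info_alignment one mentioned above.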

Thanks,
Richard.

> thanks
> Guozhi Wei
>
> 2014-10-17  Guozhi Wei  
>
> PR tree-optimization/63530
> tree-vect-data-refs.c (vect_create_addr_base_for_vector_ref): Set
> pointer alignment according to DR_MISALIGNMENT.

Re: [PATCH, rtl-optimization]: Remove const_alias_set

2014-10-20 Thread Richard Biener

On Sun, Oct 19, 2014 at 8:13 PM, Uros Bizjak  wrote:
> Hello!
>
> The fix that fixed scheduler issues with AND addresses (the fix
> prevented early exit for MEM_READONLY_P addresses when AND alignment
> addresses were involved) caused some fall-out for libgo testsuite.
> These tests triggered an assert in mems_in_disjoint_alias_sets_p,
> which checks for zero alias set when flag_strict_aliasing is false. We
> have had some off-list discussion with Ian Lance Taylor about this
> issue.
>
> The problem was that Go dynamically switches off flag_strict_aliasing
> after compilation has started, when the "unsafe" package is imported
> (similar to when __attribute__ ((optimize ("-fno-strict-aliasing"))) is
> used in C). To mitigate this issue, the Go frontend called
> init_varasm_once again to recalculate (= clear) const_alias_set in
> this case.
>
> As observed in [1], the fix for canon_true_dependence [2] that introduced
> a quick exit for MEM_READONLY_P operands made const_alias_set
> redundant; it is no longer used for anything.
>
> The patch that fixed scheduling of AND operands removed early
> MEM_READONLY_P exit for memory operands with AND realignment, so
> operands could reach more complex code later in the function that was
> able to determine dependence of memory operands. This code includes
> the call to mems_in_disjoint_alias_sets_p, and the assert triggered
> again for some MEM_READONLY_P operands that have had non-zero alias
> set, set from the value, cached in const_alias_set from before
> flag_strict_aliasing flag was cleared.
>
> The proposed solution is to remove const_alias_set altogether.
> MEM_READONLY_P successfully supersedes the const_alias_set functionality,
> and this is also confirmed by the removal of the second
> init_varasm_once call in the Go frontend. In an off-list discussion,
> Ian agreed that the attached patch should also fix the problem.
>
> 2014-10-19  Uros Bizjak  
>
> * varasm.c (const_alias_set): Remove.
> (init_varasm_once): Remove initialization of const_alias_set.
> (build_constant_desc): Do not set alias set to const_alias_set.
>
> The patch was tested on alpha-linux-gnu [3], alphaev68-linux-gnu and
> x86_64-linux-gnu {,-m32} for all default languages plus Go and
> obj-c++.
>
> The patch fixes all mentioned libgo failures on alpha.
>
> OK for mainline?

Ok.

Thanks,
Richard.

> [1] https://gcc.gnu.org/ml/gcc-patches/2013-07/msg01033.html
> [2] https://gcc.gnu.org/ml/gcc-patches/2010-07/msg01758.html
> [3] https://gcc.gnu.org/ml/gcc-testresults/2014-10/msg02041.html
>
> Uros.

Re: [GOOGLE] Increase max-early-inliner-iterations to 2 for profile-gen and use

2014-10-20 Thread Richard Biener

On Mon, Oct 20, 2014 at 12:02 AM, Xinliang David Li  wrote:
> On Sat, Oct 18, 2014 at 4:19 PM, Xinliang David Li  wrote:
>> On Sat, Oct 18, 2014 at 3:27 PM, Jan Hubicka  wrote:
 The difference in instrumentation runtime is huge -- as topn profiler
 is pretty expensive to run.

 With FDO, it is probably better to make early inlining more aggressive
 in order to get more context sensitive profiling.
>>>
>>> I agree with that, I just would like to understand where increasing the
>>> iterations helps and if we can handle it without iterating (because Richi
>>> originally requested to drop the iteration for correctness issues)

Well, I requested to do any iteration with an IPA view in mind.  That is,
iterate for cgraph cycles for example where currently we face the situation
that at least one function is inlined unoptimized.  For this we'd like to
first optimize without inlining (well, maybe inlining doesn't hurt) and then
inline (and re-optimize if we inlined).

Indirect edges are more interesting, but basically you'd want to re-inline
once you discover new direct calls during early opts (but then make
sure to do that only after the direct callee was early-optimized first).

Thus it would be nice if somebody could improve on the currently very
simple function ordering we apply early opts, integrating "iteration"
in a better way (not iterating over all functions but only where it
might make a difference, focused on inlining).

>>> Do you have some examples?
>>
>> We can do FDO experiment by shutting down einline. (Note that
>> increasing iteration to 2 did not actually improve performance with
>> our benchmarks).
>
> Early inlining itself has large performance impact for FDO (the
> runtime of the profile-use build). With it disabled, the FDO
> performance drops by >2% on average. The degradation is seen across
> all benchmarks except for one.

Only 2%?  You are lucky ;)  For tramp3d introducing early inlining
made a difference of 10% ;)  (yes, statistically for tramp3d
we have for each assembler instruction generated 100 calls in the
initial code ... wheee C++ template metaprogramming!)

So indeed early inlining was absolutely required to make FDO usable at all.

Richard.

> David
>
>
>>
>> David
>>
>>> Honza

 David

 On Sat, Oct 18, 2014 at 10:05 AM, Jan Hubicka  wrote:
 >> Increasing the number of early inliner iterations from 1 to 2 enables 
 >> more
 >> indirect calls to be promoted/inlined before instrumentation. This in 
 >> turn
 >> reduces the instrumentation overhead, particularly for more expensive 
 >> indirect
 >> call topn profiling.
 >
 > How much difference do you get here? One possibility would also be to run
 > specialized ipa-cp before profile instrumentation.
 >
 > Honza
 >>
 >> Passes internal testing and regression tests. Ok for google/4_9?
 >>
 >> 2014-10-18  Teresa Johnson  
 >>
 >> Google ref b/17934523
 >> * opts.c (finish_options): Increase 
 >> max-early-inliner-iterations to 2
 >> for profile-gen and profile-use builds.
 >>
 >> Index: opts.c
 >> ===
 >> --- opts.c  (revision 216286)
 >> +++ opts.c  (working copy)
 >> @@ -870,6 +869,14 @@ finish_options (struct gcc_options *opts, struct g
 >>  opts->x_param_values, opts_set->x_param_values);
 >>  }
 >>
 >> +  if (opts->x_profile_arc_flag
 >> +  || opts->x_flag_branch_probabilities)
 >> +{
 >> +  maybe_set_param_value
 >> +   (PARAM_EARLY_INLINER_MAX_ITERATIONS, 2,
 >> +opts->x_param_values, opts_set->x_param_values);
 >> +}
 >> +
 >>if (!(opts->x_flag_auto_profile
 >>  || (opts->x_profile_arc_flag || 
 >> opts->x_flag_branch_probabilities)))
 >>  {
 >>
 >>
 >> --
 >> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [AArch64] Add --enable-fix-cortex-a53-835769 configure-time option

2014-10-20 Thread Kyrill Tkachov



On 19/10/14 21:31, Gerald Pfeifer wrote:

On Friday 2014-10-10 11:53, Kyrill Tkachov wrote:

This adds a new configure-time option --enable-fix-cortex-a53-835769 that
will enable the Cortex-A53 erratum fix by default
so you don't have to specify -mfix-cortex-a53-835769 every time.

Documentation in install.texi is added.

Thank you.  Can you please also update gcc-5/changes.html on the
web side of things?


Sure, but I'm not sure how to get access to the web pages cvs.
Could you point me to the magic runes please?

Thanks,
Kyrill



Gerald

Re: [gomp4] Use GOMP_PLUGIN_ not gomp_plugin_ for libgomp plugin API

2014-10-20 Thread Thomas Schwinge

Hi Julian!

On Fri, 17 Oct 2014 16:48:26 +0100, Julian Brown  
wrote:
> As the title says, this patch makes the libgomp plugin API use the
> GOMP_PLUGIN_ prefix rather than gomp_plugin_. This is purely a
> mechanical change.
> 
> OK for the gomp4 branch?

Yes, thanks.

> libgomp/
> * libgomp-plugin.c (gomp_plugin_*): Rename to...
> (GOMP_PLUGIN_*): This.
> * libgomp-plugin.h: Likewise.
> * libgomp.map: Likewise.
> * oacc-host.c (GOMP): Use GOMP_PLUGIN_ in macro expansion.
> * oacc-plugin.c (gomp_plugin_*): Rename to...
> (GOMP_PLUGIN_*): This.
> * plugin-nvptx.c: Likewise.


Grüße,
 Thomas



Re: [gomp4] Fix include path configury for gomp-constants.h

2014-10-20 Thread Thomas Schwinge

Hi Julian!

On Fri, 17 Oct 2014 16:51:17 +0100, Julian Brown  
wrote:
> This patch tweaks the include path configury used by libgomp to find
> the gomp-constants.h header, as suggested by Jakub.
> 
> OK for the gomp4 branch?

Thanks, yes, with the following changed:

> libgomp/
> * Makefile.am (AM_CPPFLAGS): Fix search path for locating
> gomp-constants.h.
> * Makefile.in: Regenerate.

> --- a/libgomp/Makefile.am
> +++ b/libgomp/Makefile.am
> @@ -14,8 +14,7 @@ libsubincludedir = 
> $(libdir)/gcc/$(target_alias)/$(gcc_version)/include
>  
>  vpath % $(strip $(search_path))
>  
> -AM_CPPFLAGS = $(addprefix -I, $(search_path)) \
> - $(addprefix -I, $(search_path)/../include)
> +AM_CPPFLAGS = $(addprefix -I, $(search_path)) -I $(top_srcdir)/../include

   ^

No space here?


Grüße,
 Thomas



Re: [gomp4] Asynchronous data unmapping & wait fixes for OpenACC

2014-10-20 Thread Thomas Schwinge

Hi Julian!

On Fri, 17 Oct 2014 17:05:40 +0100, Julian Brown  
wrote:
> This patch introduces a new plugin hook in libgomp to register a
> callback function to clean up host-side bookkeeping data after an
> asynchronous operation has completed (replacing the previous ad-hoc
> method used in the NVPTX backend), and adds code to ensure that same
> cleanup is done reliably in the NVPTX backend when the user program
> hits a "wait" directive, or equivalent.
> 
> OK for the gomp4 branch?

Yes, thanks.
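
For readers following along, the callback idea in the quoted summary can be sketched as below. None of this is libgomp code: the fixed-size event table and the names register_async_cleanup, complete_event, run_completed_cleanups and wait_all are hypothetical, and device completion is simulated by setting a flag.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*cleanup_fn) (void *);

/* One pending asynchronous operation with its host-side cleanup.  */
struct async_event { cleanup_fn fn; void *data; int done; };

#define MAX_EVENTS 8
static struct async_event events[MAX_EVENTS];
static size_t n_events;

/* Queue FN (DATA) to run once the operation completes.
   Returns 0 on success, -1 if the table is full.  */
int
register_async_cleanup (cleanup_fn fn, void *data)
{
  if (n_events == MAX_EVENTS)
    return -1;
  events[n_events].fn = fn;
  events[n_events].data = data;
  events[n_events].done = 0;
  n_events++;
  return 0;
}

/* Stand-in for the device signalling that event IDX has finished.  */
void
complete_event (size_t idx)
{
  assert (idx < n_events);
  events[idx].done = 1;
}

/* Run cleanups for completed events and drop them from the table.
   Returns the number of cleanups that ran.  */
size_t
run_completed_cleanups (void)
{
  size_t kept = 0, cleaned = 0;
  for (size_t i = 0; i < n_events; i++)
    if (events[i].done)
      {
        events[i].fn (events[i].data);
        cleaned++;
      }
    else
      events[kept++] = events[i];
  n_events = kept;
  return cleaned;
}

/* A "wait" finishes everything outstanding, then cleans up reliably.  */
size_t
wait_all (void)
{
  for (size_t i = 0; i < n_events; i++)
    events[i].done = 1;
  return run_completed_cleanups ();
}

/* Example callback: count how many times cleanup ran.  */
void
count_cleanup (void *data)
{
  ++*(int *) data;
}
```

The point of the design is that the wait path and the polling path share one cleanup routine, so bookkeeping is released no matter which one observes completion first.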

> libgomp/
> * oacc-host.c (openacc_register_async_cleanup): New.
> (host_dispatch): Initialise register_async_cleanup_func entry.
> * oacc-int.h (struct ACC_dispatch_t): Add
> register_async_cleanup_func hook.
> * oacc-parallel.c (GOACC_parallel): Call
> register_async_cleanup_func hook after queuing asynchronous
> copy-back.
> * plugin-nvptx.c (enum PTX_event_type): Add PTX_EVT_ASYNC_CLEANUP.
> (struct PTX_event): Remove tgt field.
> (event_gc): Don't do async cleanup in PTX_EVT_KNL, do it in
> PTX_EVT_ASYNC_CLEANUP instead.
> (event_add): Remove tgt argument. Support PTX_EVT_ASYNC_CLEANUP
> events.
> (PTX_exec, PTX_host2dev, PTX_dev2host, PTX_wait_async)
> (PTX_wait_all_async): Update calls to event_add.
> (openacc_register_async_cleanup): New.
> (PTX_async_test): Call event_gc on success path.
> (PTX_async_test_all): Likewise.
> * target.c (gomp_load_plugin_for_device): Initialise
> register_async_cleanup hook.


Grüße,
 Thomas



Re: [gomp4] OpenACC / C++

2014-10-20 Thread Thomas Schwinge

Hi!

On Thu, 16 Oct 2014 19:38:23 +0200, I wrote:
> On Wed, 15 Oct 2014 11:21:05 -0500, James Norris  
> wrote:
> > This patch adds OpenACC support to C++ in the gomp4 branch.
> 
> We understand that there will be further patches required on top of this,
> but we shall work on that incrementally.

One such patch I have just applied in r216456: »Enable compiler testing
for OpenACC/C++«.

There are several FAILs that need to be addressed.  As appropriate,
rework the source code, adjust some dg-* directives' regular expressions
a little bit (the diagnostic messages of C and C++ should typically be
very similar), and perhaps add some »{ target c }« or »{ target c++ }« to
the dg-* directives.  If test cases truly are appropriate for C but not
for C++, then they should be moved out of c-c++-common into the respective
C-only directory.

commit 8f75331ca65b10331dbade6f35504af6447f5853
Author: tschwinge 
Date:   Mon Oct 20 10:11:38 2014 +

Enable compiler testing for OpenACC/C++.

gcc/testsuite/
* gcc.dg/goacc/sb-1.c: Move file...
* c-c++-common/goacc/sb-1.c: ... here.
* gcc.dg/goacc/sb-2.c: Move file...
* c-c++-common/goacc/sb-2.c: ... here.
* gcc.dg/goacc/sb-3.c: Move file...
* c-c++-common/goacc/sb-3.c: ... here.

* g++.dg/goacc-gomp/goacc-gomp.exp: New file.
* g++.dg/goacc/goacc.exp: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@216456 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog.gomp   | 12 
 .../{gcc.dg => c-c++-common}/goacc/sb-1.c  |  0
 .../{gcc.dg => c-c++-common}/goacc/sb-2.c  |  0
 .../{gcc.dg => c-c++-common}/goacc/sb-3.c  |  0
 gcc/testsuite/g++.dg/goacc-gomp/goacc-gomp.exp | 36 ++
 gcc/testsuite/g++.dg/goacc/goacc.exp   | 35 +
 6 files changed, 83 insertions(+)

diff --git gcc/testsuite/ChangeLog.gomp gcc/testsuite/ChangeLog.gomp
index 48651ad..10232bc 100644
--- gcc/testsuite/ChangeLog.gomp
+++ gcc/testsuite/ChangeLog.gomp
@@ -1,3 +1,15 @@
+2014-10-20  Thomas Schwinge  
+
+   * gcc.dg/goacc/sb-1.c: Move file...
+   * c-c++-common/goacc/sb-1.c: ... here.
+   * gcc.dg/goacc/sb-2.c: Move file...
+   * c-c++-common/goacc/sb-2.c: ... here.
+   * gcc.dg/goacc/sb-3.c: Move file...
+   * c-c++-common/goacc/sb-3.c: ... here.
+
+   * g++.dg/goacc-gomp/goacc-gomp.exp: New file.
+   * g++.dg/goacc/goacc.exp: Likewise.
+
 2014-10-09  Thomas Schwinge  
 
* gcc.dg/goacc/collapse.c: Move file to
diff --git gcc/testsuite/gcc.dg/goacc/sb-1.c 
gcc/testsuite/c-c++-common/goacc/sb-1.c
similarity index 100%
rename from gcc/testsuite/gcc.dg/goacc/sb-1.c
rename to gcc/testsuite/c-c++-common/goacc/sb-1.c
diff --git gcc/testsuite/gcc.dg/goacc/sb-2.c 
gcc/testsuite/c-c++-common/goacc/sb-2.c
similarity index 100%
rename from gcc/testsuite/gcc.dg/goacc/sb-2.c
rename to gcc/testsuite/c-c++-common/goacc/sb-2.c
diff --git gcc/testsuite/gcc.dg/goacc/sb-3.c 
gcc/testsuite/c-c++-common/goacc/sb-3.c
similarity index 100%
rename from gcc/testsuite/gcc.dg/goacc/sb-3.c
rename to gcc/testsuite/c-c++-common/goacc/sb-3.c
diff --git gcc/testsuite/g++.dg/goacc-gomp/goacc-gomp.exp 
gcc/testsuite/g++.dg/goacc-gomp/goacc-gomp.exp
new file mode 100644
index 000..ca42de4
--- /dev/null
+++ gcc/testsuite/g++.dg/goacc-gomp/goacc-gomp.exp
@@ -0,0 +1,36 @@
+# Copyright (C) 2006-2014 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Load support procs.
+load_lib g++-dg.exp
+
+if { ![check_effective_target_fopenacc] \
+ || ![check_effective_target_fopenmp] } {
+  return
+}
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+g++-dg-runtest [lsort [concat \
+   [find $srcdir/$subdir *.C] \
+   [find $srcdir/c-c++-common/goacc-gomp *.c]]] "" "-fopenacc -fopenmp"
+
+# All done.
+dg-finish
diff --git gcc/testsuite/g++.dg/goacc/goacc.exp 
gcc/testsuite/g++.dg/goacc/goacc.exp
new file mode 100644
index 000..1889a86
--- /dev/null
+++ gcc/testsuite/g++.dg/goacc/goacc.exp
@@ -0,0 +1,35 @@
+# Copyright (C) 2006-2014 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the

[PATCH] Backport misalign tests to 4.9

2014-10-20 Thread Yury Gribov


On Fri, Oct 17, 2014 at 06:15:11PM +0400, Yury Gribov wrote:

On 10/17/2014 05:49 PM, Jakub Jelinek wrote:

>> So, what about this?  Just checked that with
>> make -k check-g{cc,++} RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} asan.exp tsan.exp ubsan.exp'

>> so far.  Plus if you add misalign tests...


Sure, can do this on Monday.


Here are the said tests, passed (on x64) as expected. Ok to commit?

-Y
commit f4007db6a5f90f71fe977c8232ea7fe2de1c6c28
Author: jakub 
Date:   Fri May 30 18:37:59 2014 +

2014-10-20  Yury Gribov  

	Backported from mainline
	2014-05-30  Jakub Jelinek  

	* c-c++-common/asan/misalign-1.c: New test.
	* c-c++-common/asan/misalign-2.c: New test.

diff --git a/gcc/testsuite/c-c++-common/asan/misalign-1.c b/gcc/testsuite/c-c++-common/asan/misalign-1.c
new file mode 100644
index 000..0c5b6e0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/misalign-1.c
@@ -0,0 +1,42 @@
+/* { dg-do run { target { ilp32 || lp64 } } } */
+/* { dg-options "-O2" } */
+/* { dg-shouldfail "asan" } */
+
+struct S { int i; } __attribute__ ((packed));
+
+__attribute__((noinline, noclone)) int
+foo (struct S *s)
+{
+  return s->i;
+}
+
+__attribute__((noinline, noclone)) int
+bar (int *s)
+{
+  return *s;
+}
+
+__attribute__((noinline, noclone)) struct S
+baz (struct S *s)
+{
+  return *s;
+}
+
+int
+main ()
+{
+  struct T { char a[3]; struct S b[3]; char c; } t;
+  int v = 5;
+  struct S *p = t.b;
+  asm volatile ("" : "+rm" (p));
+  p += 3;
+  if (bar (&v) != 5) __builtin_abort ();
+  volatile int w = foo (p);
+  return 0;
+}
+
+/* { dg-output "ERROR: AddressSanitizer:\[^\n\r]*on address\[^\n\r]*" } */
+/* { dg-output "0x\[0-9a-f\]+ at pc 0x\[0-9a-f\]+ bp 0x\[0-9a-f\]+ sp 0x\[0-9a-f\]+\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*READ of size 4 at 0x\[0-9a-f\]+ thread T0\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#0 0x\[0-9a-f\]+ (in _*foo(\[^\n\r]*misalign-1.c:10|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#1 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*misalign-1.c:34|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */
diff --git a/gcc/testsuite/c-c++-common/asan/misalign-2.c b/gcc/testsuite/c-c++-common/asan/misalign-2.c
new file mode 100644
index 000..7fbe299
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/misalign-2.c
@@ -0,0 +1,42 @@
+/* { dg-do run { target { ilp32 || lp64 } } } */
+/* { dg-options "-O2" } */
+/* { dg-shouldfail "asan" } */
+
+struct S { int i; } __attribute__ ((packed));
+
+__attribute__((noinline, noclone)) int
+foo (struct S *s)
+{
+  return s->i;
+}
+
+__attribute__((noinline, noclone)) int
+bar (int *s)
+{
+  return *s;
+}
+
+__attribute__((noinline, noclone)) struct S
+baz (struct S *s)
+{
+  return *s;
+}
+
+int
+main ()
+{
+  struct T { char a[3]; struct S b[3]; char c; } t;
+  int v = 5;
+  struct S *p = t.b;
+  asm volatile ("" : "+rm" (p));
+  p += 3;
+  if (bar (&v) != 5) __builtin_abort ();
+  volatile struct S w = baz (p);
+  return 0;
+}
+
+/* { dg-output "ERROR: AddressSanitizer:\[^\n\r]*on address\[^\n\r]*" } */
+/* { dg-output "0x\[0-9a-f\]+ at pc 0x\[0-9a-f\]+ bp 0x\[0-9a-f\]+ sp 0x\[0-9a-f\]+\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*READ of size 4 at 0x\[0-9a-f\]+ thread T0\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#0 0x\[0-9a-f\]+ (in _*baz(\[^\n\r]*misalign-2.c:22|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#1 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*misalign-2.c:34|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */

Re: [PATCH] -fsanitize-recover=list

2014-10-20 Thread Yury Gribov


On 10/17/2014 08:13 PM, Jakub Jelinek wrote:

On Mon, Oct 13, 2014 at 02:47:07PM +0400, Yury Gribov wrote:

On 09/30/2014 09:39 PM, Jakub Jelinek wrote:

LGTM, will hack it up soon in GCC then.


Do you plan to work on this in near future?


Here is an only very lightly tested patch; I didn't get to updating the
documentation though, plus there is no testsuite coverage for it.
Supposedly, most of the tests that use -fno-sanitize-recover
or -fsanitize-recover in dg-options should be changed to use
-fno-sanitize-recover= or -fsanitize-recover (in some cases for
the kind that is enabled with -fsanitize= only, in other cases
perhaps for something covering that and some other options),
plus perhaps some new smallish tests that test that if you e.g.
do -fsanitize=undefined -fno-sanitize-recover=divide , that
you can recover from several say out of bound shifts, but that
the first divide will terminate, etc.
Yuri or Marek, would you have spare time for that?


I'll take on this!

-Y

Re: [PATCH] Backport misalign tests to 4.9

2014-10-20 Thread Jakub Jelinek

On Mon, Oct 20, 2014 at 02:37:21PM +0400, Yury Gribov wrote:
> On Fri, Oct 17, 2014 at 06:15:11PM +0400, Yury Gribov wrote:
> >On 10/17/2014 05:49 PM, Jakub Jelinek wrote:
> >> So, what about this?  Just checked that with
> >> make -k check-g{cc,++} RUNTESTFLAGS='--target_board=unix\{-m32,-m64\}
> asan.exp tsan.exp ubsan.exp'
> >> so far.  Plus if you add misalign tests...
> >
> >Sure, can do this on Monday.
> 
> Here are the said tests, passed (on x64) as expected. Ok to commit?
> 
> -Y

> commit f4007db6a5f90f71fe977c8232ea7fe2de1c6c28
> Author: jakub 
> Date:   Fri May 30 18:37:59 2014 +
> 
> 2014-10-20  Yury Gribov  
> 
>   Backported from mainline
>   2014-05-30  Jakub Jelinek  
> 
>   * c-c++-common/asan/misalign-1.c: New test.
>   * c-c++-common/asan/misalign-2.c: New test.

Ok, thanks.

Jakub

Re: PR debug/60655, debug loc expressions

2014-10-20 Thread Alan Modra

On Thu, Oct 16, 2014 at 09:07:58AM +0200, Jakub Jelinek wrote:
> So, please find the spot where we forget to simplify stuff, and put the
> simplification there.

You were correct to be suspicious that we weren't simplifying as we
should.  After more time in the debugger than I care to admit, I found
the underlying cause.

One of the var loc expressions is
(plus:SI (plus:SI (not:SI (debug_expr:SI D#9))
(value/u:SI 58:4373 @0x18d3968/0x18ef230))
(debug_expr:SI D#5))

which after substitution (in bb7) becomes
(plus:SI (plus:SI (not:SI (plus:SI (reg:SI 5 5 [orig:212 D.2333 ] [212])
(const:SI (plus:SI (symbol_ref:SI ("*.LANCHOR0") [flags 0x182])
(const_int -1 [0x])
(reg:SI 10 10 [orig:223 ivtmp.33 ] [223]))
(plus:SI (reg:SI 5 5 [orig:212 D.2333 ] [212])
(const:SI (plus:SI (symbol_ref:SI ("*.LANCHOR0") [flags 0x182])
(const_int 323 [0x143])

The above has 8 ops by the time you turn ~x into -x - 1, and exceeds
the allowed number of elements in the simplify_plus_minus ops array.
Note that the ops array has 8 elements but the code only allows 7 to
be entered, a bug since the "spare" element isn't a sentinel or used
in any other way.
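
To illustrate the array-size point, here is a minimal sketch (not the
simplify-rtx.c code): a fixed-capacity operand list whose "full" test is
written against ARRAY_SIZE rather than a hard-coded 7, so all eight slots
are usable.  The struct names, op_list_push and the local ARRAY_SIZE
definition are assumptions made for the example.

```c
#include <stddef.h>

/* Local definition for the sketch; GCC has its own ARRAY_SIZE macro.  */
#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))

/* Hypothetical stand-in for simplify_plus_minus_op_data.  */
struct op_data
{
  int op;
  int neg;
};

/* Hypothetical container: a fixed array plus the number of used slots.  */
struct op_list
{
  struct op_data ops[8];
  size_t n_ops;
};

/* Append OP to LIST.  Return 1 on success, 0 if the array is full.
   Testing against ARRAY_SIZE keeps the limit tied to the declaration,
   so no slot is silently left unused.  */
int
op_list_push (struct op_list *list, int op, int neg)
{
  if (list->n_ops == ARRAY_SIZE (list->ops))
    return 0;
  list->ops[list->n_ops].op = op;
  list->ops[list->n_ops].neg = neg;
  list->n_ops++;
  return 1;
}
```

With a literal 7 in place of ARRAY_SIZE, the eighth push would be rejected
even though the array has room for it.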

This resulted in a partial simplification of the expression to
(plus:SI (plus:SI (reg:SI 10 10 [orig:223 ivtmp.33 ] [223])
(symbol_ref:SI ("*.LANCHOR0") [flags 0x182]))
(const:SI (minus:SI (const_int 323 [0x143])
(symbol_ref:SI ("*.LANCHOR0") [flags 0x182]

I also noticed another small bug in simplify_plus_minus.  n_constants
ought to be the number of constants in ops, not the number of times
we look at a constant.
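
A small sketch of the n_constants point, under simplified assumptions:
when an array is rescanned until nothing changes, a counter that is only
initialised once counts visits, while a counter reset at the top of each
pass counts elements.  The struct operand type and count_constants are
hypothetical and only model the shape of the loop.

```c
#include <stddef.h>

/* Hypothetical operand: IS_CONST marks a constant, EXPANDABLE marks an
   operand that one pass rewrites, which forces another pass.  */
struct operand
{
  int is_const;
  int expandable;
};

/* Scan OPS repeatedly until a pass changes nothing and return the number
   of constants seen in the final pass.  Because the counter is reset at
   the top of each pass, the result is the number of constants in OPS,
   not the number of times a constant was visited.  */
int
count_constants (struct operand *ops, size_t n_ops)
{
  int changed, n_constants;

  do
    {
      changed = 0;
      n_constants = 0;

      for (size_t i = 0; i < n_ops; i++)
        {
          if (ops[i].expandable)
            {
              /* Pretend the operand was rewritten in place.  */
              ops[i].expandable = 0;
              changed = 1;
            }
          if (ops[i].is_const)
            n_constants++;
        }
    }
  while (changed);

  return n_constants;
}
```

For two constants and one expandable operand the loop runs twice; without
the reset the counter would end at 4 instead of 2.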

The "Handle CONST wrapped NOT, NEG and MINUS" in the previous patch
seems to no longer be necessary, so I took that out (didn't hit the
code in powerpc64-linux, powerpc-linux and x86_64-linux bootstrap and
regression tests).

Bootstrapped and regression tested powerpc64-linux and x86_64-linux.
OK to apply?

PR debug/60655
* simplify-rtx.c (simplify_plus_minus): Delete unused "input_ops".
Increase "ops" array size.  Correct array size tests.  Init
n_constants in loop.  Break out of innermost loop when finding
a trivial CONST expression.

Index: gcc/simplify-rtx.c
===
--- gcc/simplify-rtx.c  (revision 216420)
+++ gcc/simplify-rtx.c  (working copy)
@@ -3965,10 +3965,10 @@
 simplify_plus_minus (enum rtx_code code, enum machine_mode mode, rtx op0,
 rtx op1)
 {
-  struct simplify_plus_minus_op_data ops[8];
+  struct simplify_plus_minus_op_data ops[16];
   rtx result, tem;
-  int n_ops = 2, input_ops = 2;
-  int changed, n_constants = 0, canonicalized = 0;
+  int n_ops = 2;
+  int changed, n_constants, canonicalized = 0;
   int i, j;
 
   memset (ops, 0, sizeof ops);
@@ -3985,6 +3985,7 @@
   do
 {
   changed = 0;
+  n_constants = 0;
 
   for (i = 0; i < n_ops; i++)
{
@@ -3996,7 +3997,7 @@
{
case PLUS:
case MINUS:
- if (n_ops == 7)
+ if (n_ops == ARRAY_SIZE (ops))
return NULL_RTX;
 
  ops[n_ops].op = XEXP (this_op, 1);
@@ -4004,7 +4005,6 @@
  n_ops++;
 
  ops[i].op = XEXP (this_op, 0);
- input_ops++;
  changed = 1;
  canonicalized |= this_neg;
  break;
@@ -4017,7 +4017,7 @@
  break;
 
case CONST:
- if (n_ops < 7
+ if (n_ops != ARRAY_SIZE (ops)
  && GET_CODE (XEXP (this_op, 0)) == PLUS
  && CONSTANT_P (XEXP (XEXP (this_op, 0), 0))
  && CONSTANT_P (XEXP (XEXP (this_op, 0), 1)))
@@ -4033,7 +4033,7 @@
 
case NOT:
  /* ~a -> (-a - 1) */
- if (n_ops != 7)
+ if (n_ops != ARRAY_SIZE (ops))
{
  ops[n_ops].op = CONSTM1_RTX (mode);
  ops[n_ops++].neg = this_neg;
@@ -4097,7 +4097,7 @@
   /* Now simplify each pair of operands until nothing changes.  */
   do
 {
-  /* Insertion sort is good enough for an eight-element array.  */
+  /* Insertion sort is good enough for a small array.  */
   for (i = 1; i < n_ops; i++)
 {
   struct simplify_plus_minus_op_data save;
@@ -4148,16 +4148,21 @@
else
  tem = simplify_binary_operation (ncode, mode, lhs, rhs);
 
-   /* Reject "simplifications" that just wrap the two
-  arguments in a CONST.  Failure to do so can result
-  in infinite recursion with simplify_binary_operation
-  when it calls us to simplify CONST operations.  */
-   if (tem
-   && ! (GET_CODE (tem) == CONST
- && GET_CODE (XEXP (tem, 0)) == ncode
- && XEXP (XEXP (tem,

Re: PR debug/60655, debug loc expressions

2014-10-20 Thread Jakub Jelinek

On Mon, Oct 20, 2014 at 09:16:57PM +1030, Alan Modra wrote:
>   PR debug/60655
>   * simplify-rtx.c (simplify_plus_minus): Delete unused "input_ops".
>   Increase "ops" array size.  Correct array size tests.  Init
>   n_constants in loop.  Break out of innermost loop when finding
>   a trivial CONST expression.

LGTM, thanks.

Jakub

Re: [PATCH 2/n] OpenMP 4.0 offloading infrastructure: LTO streaming

2014-10-20 Thread Ilya Verbin

On 15 Oct 16:23, Richard Biener wrote:
> > +static bool
> > +initialize_offload (void)
> > +{
> > +  bool have_offload = false;
> > +  struct cgraph_node *node;
> > +  struct varpool_node *vnode;
> > +
> > +  FOR_EACH_DEFINED_FUNCTION (node)
> > +if (lookup_attribute ("omp declare target", DECL_ATTRIBUTES 
> > (node->decl)))
> > +  {
> > +   have_offload = true;
> > +   break;
> > +  }
> > +
> > +  FOR_EACH_DEFINED_VARIABLE (vnode)
> > +{
> > +  if (!lookup_attribute ("omp declare target",
> > +DECL_ATTRIBUTES (vnode->decl))
> > + || TREE_CODE (vnode->decl) != VAR_DECL
> > + || DECL_SIZE (vnode->decl) == 0)
> > +   continue;
> > +  have_offload = true;
> > +}
> > +
> > +  return have_offload;
> > +}
> > +
> 
> I wonder if we can avoid the above by means of a global have_offload
> flag?  (or inside gcc::context)

So you propose to set global have_offload flag somewhere in expand_omp_target,
etc. where functions and global variables are created?

> >  static void
> >  ipa_passes (void)
> >  {
> > +  bool have_offload = false;
> >gcc::pass_manager *passes = g->get_passes ();
> >  
> >set_cfun (NULL);
> > @@ -2004,6 +2036,14 @@ ipa_passes (void)
> >gimple_register_cfg_hooks ();
> >bitmap_obstack_initialize (NULL);
> >  
> > +  if (!in_lto_p && flag_openmp)
> 
> As -fopenmp is not generally available it's odd to test
> flag_openmp (though that is available everywhere as
> implementation detail).  Doesn't offloading work
> without -fopenmp?

In this patch series offloading is implemented only for OpenMP.
OpenACC guys will add flag_openacc here.

> >/* If LTO is enabled, initialize the streamer hooks needed by GIMPLE.  */
> > -  if (flag_lto)
> > +  if (flag_lto || flag_openmp)
> 
> flag_generate_lto?
> 
> >/* When not optimizing, do not bother to analyze.  Inlining is still done
> >   because edge redirection needs to happen there.  */
> > -  if (!optimize && !flag_lto && !flag_wpa)
> > +  if (!optimize && !flag_lto && !flag_wpa && !flag_openmp)
> >  return;
> 
> Likewise !flag_generate_lto

Currently this is not working, since symbol_table::compile is executed before
ipa_passes.  But with global have_offload it should work.

> > +/* Select what needs to be streamed out.  In regular lto mode stream 
> > everything.
> > +   In offload lto mode stream only stuff marked with an attribute.  */
> > +void
> > +select_what_to_stream (bool offload_lto_mode)
> > +{
> > +  struct symtab_node *snode;
> > +  FOR_EACH_SYMBOL (snode)
> > +snode->need_lto_streaming
> > +  = !offload_lto_mode || lookup_attribute ("omp declare target",
> > +  DECL_ATTRIBUTES (snode->decl));
> 
> I suppose I suggested this already earlier this year.  Why keep this
> artificial attribute when you have a cgraph node flag?

> > + /* If '#pragma omp critical' is inside target region, the symbol must
> > +have an 'omp declare target' attribute.  */
> > + omp_context *octx;
> > + for (octx = ctx->outer; octx; octx = octx->outer)
> > +   if (is_targetreg_ctx (octx))
> > + {
> > +   DECL_ATTRIBUTES (decl)
> > + = tree_cons (get_identifier ("omp declare target"),
> > +  NULL_TREE, DECL_ATTRIBUTES (decl));
> 
> Here - why not set a flag on cgraph_get_node (decl) instead?

I thought that select_what_to_stream is exactly what you've suggested.
Could you please clarify this?  You propose to replace "omp declare target"
attribute with some cgraph node flag like need_offload?  But we'll need
need_lto_streaming anyway, since for LTO it should be 1 for all nodes, but for
offloading it should be equal to need_offload.

Thanks,
  -- Ilya

Re: [PATCH][1/n] Merge from match-and-simplify, public API

2014-10-20 Thread Richard Biener

On Fri, 17 Oct 2014, Jakub Jelinek wrote:

> On Wed, Oct 15, 2014 at 01:40:07PM +0200, Richard Biener wrote:
> > 2014-10-15  Richard Biener  
> > 
> > * gimple-fold.h (gimple_build): Declare various overloads.
> > (gimple_simplify): Likewise.
> > (gimple_convert): Re-implement in terms of gimple_build.
> > * gimple-fold.c (gimple_convert): Remove.
> > (gimple_build): New functions.
> > 
> > --- 45,141 
> >   extern bool arith_code_with_undefined_signed_overflow (tree_code);
> >   extern gimple_seq rewrite_to_defined_overflow (gimple);
> >   
> > ! /* gimple_build, functionally matching fold_buildN, outputs stmts
> > !int the provided sequence, matching and simplifying them on-the-fly.
> > !Supposed to replace force_gimple_operand (fold_buildN (...), ...).  */
> > ! tree gimple_build (gimple_seq *, location_t,
> > !  enum tree_code, tree, tree,
> > !  tree (*valueize) (tree) = NULL);
> 
> I find mixing prototypes with and without extern keyword weird,
> most of the prototypes in headers use extern, I think it would be cleaner
> to use it everywhere.

Fixed.

> > *** gcc/gimple-fold.c.orig  2014-10-14 15:49:30.634356179 +0200
> > --- gcc/gimple-fold.c   2014-10-15 13:02:08.158099055 +0200
> > *** along with GCC; see the file COPYING3.
> > *** 56,61 
> > --- 56,62 
> >   #include "builtins.h"
> >   #include "output.h"
> >   
> > + 
> >   /* Return true when DECL can be referenced from current unit.
> >  FROM_DECL (if non-null) specify constructor of variable DECL was taken 
> > from.
> >  We can get declarations that are not possible to reference for various
> 
> Why the whitespace change?

Fixed.

> >   
> >   tree
> > ! gimple_convert (gimple_seq *seq, location_t loc, tree type, tree op)
> >   {
> > !   if (useless_type_conversion_p (type, TREE_TYPE (op)))
> > ! return op;
> > !   op = fold_convert_loc (loc, type, op);
> > !   gimple_seq stmts = NULL;
> > !   op = force_gimple_operand (op, &stmts, true, NULL_TREE);
> > !   gimple_seq_add_seq_without_update (seq, stmts);
> > !   return op;
> >   }
> > --- 5297,5487 
> > return stmts;
> >   }
> >   
> > ! 
> > ! 
> 
> 3 lines of vertical space too much?

Reduced to two.

I moved gimple_convert out-of-line to avoid

/* ???  Forward from gimple-expr.h.  */
extern bool useless_type_conversion_p (tree, tree);

Thanks,
Richard.

2014-10-20  Richard Biener  

* gimple-fold.c (gimple_convert): Move out-of-line from ...
* gimple-fold.h (gimple_convert): ... here.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 216459)
+++ gcc/gimple-fold.c   (working copy)
@@ -64,7 +64,6 @@ along with GCC; see the file COPYING3.
 #include "tree-eh.h"
 #include "gimple-match.h"
 
-
 /* Return true when DECL can be referenced from current unit.
FROM_DECL (if non-null) specify constructor of variable DECL was taken from.
We can get declarations that are not possible to reference for various
@@ -5538,7 +5537,6 @@ rewrite_to_defined_overflow (gimple stmt
 }
 
 
-
 /* Build the expression CODE OP0 of type TYPE with location LOC,
simplifying it first if possible using VALUEIZE if not NULL.
OP0 is expected to be valueized already.  Returns the built
@@ -5597,7 +5595,6 @@ gimple_build (gimple_seq *seq, location_
   return res;
 }
 
-
 /* Build the expression (CODE OP0 OP1 OP2) of type TYPE with location LOC,
simplifying it first if possible using VALUEIZE if not NULL.
OP0, OP1 and OP2 are expected to be valueized already.  Returns the built
@@ -5725,3 +5722,16 @@ gimple_build (gimple_seq *seq, location_
   return res;
 }
 
+/* Build the conversion (TYPE) OP with a result of type TYPE
+   with location LOC if such conversion is neccesary in GIMPLE,
+   simplifying it first.
+   Returns the built expression value and appends
+   statements possibly defining it to SEQ.  */
+
+tree
+gimple_convert (gimple_seq *seq, location_t loc, tree type, tree op)
+{
+  if (useless_type_conversion_p (type, TREE_TYPE (op)))
+return op;
+  return gimple_build (seq, loc, NOP_EXPR, type, op);
+}
Index: gcc/gimple-fold.h
===
--- gcc/gimple-fold.h   (revision 216459)
+++ gcc/gimple-fold.h   (working copy)
@@ -51,54 +51,54 @@ extern gimple_seq rewrite_to_defined_ove
 /* gimple_build, functionally matching fold_buildN, outputs stmts
int the provided sequence, matching and simplifying them on-the-fly.
Supposed to replace force_gimple_operand (fold_buildN (...), ...).  */
-tree gimple_build (gimple_seq *, location_t,
-  enum tree_code, tree, tree,
-  tree (*valueize) (tree) = NULL);
+extern tree gimple_build (gimple_seq *, location_t,
+ enum tree_code, tree, tree,
+ tree (*valueize) (tree) = NULL);
 inline tree
 gimple_build (gimple_seq *seq,

Re: [PATCH 2/n] OpenMP 4.0 offloading infrastructure: LTO streaming

2014-10-20 Thread Jakub Jelinek

On Mon, Oct 20, 2014 at 03:19:35PM +0400, Ilya Verbin wrote:
> > > +   /* If '#pragma omp critical' is inside target region, the symbol must
> > > +  have an 'omp declare target' attribute.  */
> > > +   omp_context *octx;
> > > +   for (octx = ctx->outer; octx; octx = octx->outer)
> > > + if (is_targetreg_ctx (octx))
> > > +   {
> > > + DECL_ATTRIBUTES (decl)
> > > +   = tree_cons (get_identifier ("omp declare target"),
> > > +NULL_TREE, DECL_ATTRIBUTES (decl));
> > 
> > Here - why not set a flag on cgraph_get_node (decl) instead?
> 
> I thought that select_what_to_stream is exactly what you've suggested.
> Could you please clarify this?  You propose to replace "omp declare target"
> attribute with some cgraph node flag like need_offload?  But we'll need
> need_lto_streaming anyway, since for LTO it should be 1 for all nodes, but for
> offloading it should be equal to need_offload.

Note, the attribute is created usually by the FEs, at points where
cgraph/varpool nodes can't be created yet.  So, it is not possible to get
rid of the artificial attribute easily, it could be cached in some
cgraph/varpool bit field of course.

Jakub

Re: [PATCH][1/n] Merge from match-and-simplify, infrastructure

2014-10-20 Thread Richard Biener

On Fri, 17 Oct 2014, Jakub Jelinek wrote:

> On Wed, Oct 15, 2014 at 01:39:33PM +0200, Richard Biener wrote:
> > 2014-10-15  Richard Biener  
> 
> Shouldn't Prathamesh be listed as co-author of the patch?

Yes, of course.

> > +   fprintf (f, "case SSA_NAME:\n");
> > +   fprintf (f, "{\n");
> > +   fprintf (f, "gimple def_stmt = SSA_NAME_DEF_STMT (%s);\n", 
> > kid_opname);
> 
> Etc.; so no attempt to indent the generated code, by tracking number
> of current indentation columns and translating that into a series of
> spaces or tabs or tabs+spaces?  Other generated sources like insn-*.c
> usually are indented, at least to some extent.

Yes, no tracking is done at the moment.  I will see how difficult it is
as a followup (the nests can become quite deep and thus have very long
lines - for debugging I chose to indent only pieces of it via vim
auto-re-indent)
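
For reference, one simple way such tracking could look is to pass the
current column count down and let printf's "%*s" produce the spaces.
This is only a sketch, not what genmatch.c does; format_line, emit_line
and emit_ssa_name_case are hypothetical helpers, and lines longer than
the local buffers would be truncated.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format LINE into BUF preceded by INDENT spaces and
   followed by a newline.  Returns what snprintf returns.  */
int
format_line (char *buf, size_t size, int indent, const char *line)
{
  return snprintf (buf, size, "%*s%s\n", indent, "", line);
}

/* Emit one indented line to F.  */
void
emit_line (FILE *f, int indent, const char *line)
{
  char buf[256];
  format_line (buf, sizeof buf, indent, line);
  fputs (buf, f);
}

/* Emit a nested block, two more columns per nesting level.  */
void
emit_ssa_name_case (FILE *f, int indent, const char *kid_opname)
{
  char stmt[128];
  snprintf (stmt, sizeof stmt,
            "gimple def_stmt = SSA_NAME_DEF_STMT (%s);", kid_opname);
  emit_line (f, indent, "case SSA_NAME:");
  emit_line (f, indent + 2, "{");
  emit_line (f, indent + 4, stmt);
  emit_line (f, indent + 2, "}");
}
```

Calling emit_ssa_name_case (stdout, 4, "op0") would print the case label
at column 4 and the body at columns 6 and 8.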

> > + char dest[32];
> > + snprintf (dest, 32, "  res_ops[%d]", j);
> > + const char *optype
> > + = get_operand_type (e->operation,
> 
> This seems to be indented too much.

Fixed.

> > + "type", e->expr_type,
> > + j == 0
> > + ? NULL : "TREE_TYPE (res_ops[0])");
> > + /* The genmatch generator progam.  It reads from a pattern description
> > +and outputs GIMPLE or GENERIC IL matching and simplification routines. 
> >  */
> > + 
> > + int
> > + main(int argc, char **argv)
> 
> Formatting ;)

Fixed.

> > + return 1;
> > + 
> > +   bool gimple = true;
> > +   bool verbose = false;
> > +   char *input = argv[argc-1];
> > +   for (int i = 1; i < argc - 1; ++i)
> > + {
> > +   if (strcmp (argv[i], "-gimple") == 0)
> > +   gimple = true;
> > +   else if (strcmp (argv[i], "-generic") == 0)
> > +   gimple = false;
> > +   else if (strcmp (argv[i], "-v") == 0)
> > +   verbose = true;
> > +   else
> > +   {
> > + fprintf (stderr, "Usage: genmatch [-gimple] [-generic] [-v] input\n");
> > + return 1;
> > +   }
> > + }
> 
> Wouldn't --gimple and --generic be nicer?

Changed.

Thanks,
Richard.

2014-10-20  Richard Biener  

* genmatch.c (main): Change -gimple and -generic to
--gimple and --generic.
* Makefile.in (s-match): Adjust.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 216459)
+++ gcc/genmatch.c  (working copy)
@@ -1950,10 +1950,10 @@ dt_simplify::gen (FILE *f, bool gimple)
  char dest[32];
  snprintf (dest, 32, "  res_ops[%d]", j);
  const char *optype
- = get_operand_type (e->operation,
- "type", e->expr_type,
- j == 0
- ? NULL : "TREE_TYPE (res_ops[0])");
+   = get_operand_type (e->operation,
+   "type", e->expr_type,
+   j == 0
+   ? NULL : "TREE_TYPE (res_ops[0])");
  e->ops[j]->gen_transform (f, dest, true, 1, optype, indexes);
}
 
@@ -2918,7 +2918,7 @@ round_alloc_size (size_t s)
and outputs GIMPLE or GENERIC IL matching and simplification routines.  */
 
 int
-main(int argc, char **argv)
+main (int argc, char **argv)
 {
   cpp_reader *r;
 
@@ -2932,15 +2932,16 @@ main(int argc, char **argv)
   char *input = argv[argc-1];
   for (int i = 1; i < argc - 1; ++i)
 {
-  if (strcmp (argv[i], "-gimple") == 0)
+  if (strcmp (argv[i], "--gimple") == 0)
gimple = true;
-  else if (strcmp (argv[i], "-generic") == 0)
+  else if (strcmp (argv[i], "--generic") == 0)
gimple = false;
   else if (strcmp (argv[i], "-v") == 0)
verbose = true;
   else
{
- fprintf (stderr, "Usage: genmatch [-gimple] [-generic] [-v] input\n");
+ fprintf (stderr, "Usage: genmatch "
+  "[--gimple] [--generic] [-v] input\n");
  return 1;
}
 }
@@ -3035,4 +3036,3 @@ add_operator (CONVERT2, "CONVERT2", "tcc
 
   return 0;
 }
-
Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 216459)
+++ gcc/Makefile.in (working copy)
@@ -2237,9 +2237,9 @@ gimple-match.c: s-match gimple-match-hea
 generic-match.c: s-match generic-match-head.c ; @true
 
 s-match: build/genmatch$(build_exeext) $(srcdir)/match*.pd
-   $(RUN_GEN) build/genmatch$(build_exeext) -gimple $(srcdir)/match.pd \
+   $(RUN_GEN) build/genmatch$(build_exeext) --gimple $(srcdir)/match.pd \
> tmp-gimple-match.c
-   $(RUN_GEN) build/genmatch$(build_exeext) -generic $(srcdir)/match.pd \
+   $(RUN_GEN) build/genmatch$(build_exeext) --generic $(srcdir)/match.pd \
> tmp-generic-match.c
$(SHELL) $(srcdir)/../move-if-change tmp-gimple-match.c \

Re: [wwwdocs] Add recent C++ changes to gcc-5/changes.html

2014-10-20 Thread Jonathan Wakely


On 19/10/14 22:19 +0200, Gerald Pfeifer wrote:
> On Friday 2014-10-17 13:34, Jonathan Wakely wrote:
> > Index: htdocs/gcc-5/changes.html
> > ===
> > @@ -128,12 +164,13 @@
> >
> > Class std::experimental::any;
> > Function template std::experimental::apply;
> > + Variable templates for type traits;
>
> The trailing semi-colon feels a bit odd, doesn't it?
>
> (Also, when using a semi-colon, wouldn't the next word -- Function
> and Variable here -- be lower-case?)


Yes, I'll fix these next time I change the file.

Re: [PATCH][3/n] Merge from match-and-simplify, first patterns and questions

2014-10-20 Thread Richard Biener

On Fri, 17 Oct 2014, Jakub Jelinek wrote:

> On Wed, Oct 15, 2014 at 01:40:49PM +0200, Richard Biener wrote:
> > 
> > This adds a bunch of simplifications with constant operands
> > or ones that simplify to constants, such as a + 0, x * 1.
> > 
> > It's a patch mainly to get a few questions answered for further
> > pattern merges:
> > 
> >  - The branch uses multiple .pd files and includes them from
> >match.pd trying to group related stuff together.  It has
> >become somewhat difficult to do that grouping in some
> >sensible manner so I am not sure this is the best approach.
> >Any opinion?  We can simply put everything into match.pd
> >and group visually by overall comments.
> 
> That would be probably my preference, unless match.pd grows too big.

Ok.

> >  - Each pattern I will add will either be already implemented
> >in some form in fold-const.c or tree-ssa-forwprop.c.  Once
> >the machinery is exercised from fold-const.c and
> >tree-ssa-forwprop.c I can remove the duplicates at the
> >same time I add a pattern.  Should I do that?
> 
> I guess it depends, if the new pattern covers the old one well, sure,
> the STRIP_{,SIGN_}NOPS issues might be more important, TREE_SIDE_EFFECTS
> probably less important (those shouldn't be really constant expressions
> and thus there should be fewer users expecting stuff to be folded).
> 
> In any cases, we need to be prepared to cure some folding
> regressions if people report them and we find them desirable to be
> restored.  Hopefully there won't be hundreds of such reports.

Agreed.  So any STRIP_NOPS that is not a STRIP_SIGN_NOPS (thus
a sign-changing conversion strip) would be a missed optimization
on GIMPLE which we should address anyhow.  For STRIP_SIGN_NOPS,
yes - if we find uses that matter we can address them or delay
the simplification to GIMPLE where the conversion should vanish
as useless.

Note that we can address the issues by amending patterns with
conditional conversions (with appropriate predicate on the
allowed types of course).  Note that the patterns would need to
also take care of putting back required conversions that
are stripped off similar to how fold-const.c wraps almost all
operands in a fold_convert (...).

In theory one could try removing all STRIP_NOPS calls from fold-const.c
and look for fallout.

I'll keep an eye on it.

Thanks,
Richard.

Re: [PATCH][0/n] Merge from match-and-simplify

2014-10-20 Thread Richard Biener

On Fri, 17 Oct 2014, Sebastian Pop wrote:

> Sebastian Pop wrote:
> > Richard Biener wrote:
> > > looks like
> > > RTL issues and/or IVOPTs issues?
> > 
> > I should have posted the first diff between the compilers with 
> > -fdump-tree-all:
> > that would expose the problem at its root.
> 
> Looks like this is caused by the fwprop pass:
> 
> diff -u -r ./foo.i.087t.forwprop3 ../mas/foo.i.087t.forwprop3
> --- ./foo.i.087t.forwprop3  2014-10-17 13:17:29.985327000 -0500
> +++ ../mas/foo.i.087t.forwprop3 2014-10-17 13:17:29.308814000 -0500
> @@ -5,6 +5,8 @@
>  Pass statistics:
>  
>  
> +Applying pattern match-comparison.pd:43, gimple-match.c:11747
> +gimple_simplified to if (i_20 != 99)
>  
>  Pass statistics:
>  
> @@ -60,7 +62,7 @@
>i_17 = i_20 + 1;
># DEBUG iD.2450 => i_17
># DEBUG iD.2450 => i_17
> -  if (i_17 != 100)
> +  if (i_20 != 99)
>  goto ;
>else
>  goto ;

Ok, so this is one effect of the thing Marc pointed out - currently
no patterns (well, none but one) guard themselves with has_single_use
predicates.

That was a conscious decision and the idea was that the caller should
do this via its lattice valueization function which could look like

tree
valueize (tree t)
{
  if (TREE_CODE (t) == SSA_NAME
  && !has_single_use (t))
return NULL_TREE;
  return t;
}

But of course doing that unconditionally would also pessimize code.
Generally we'd like to avoid un-CSEing stuff in a way that cannot
be CSEd again.  That's a more complex condition than what can be
implemented with has_single_use.  You might also consider a
stmt doing a_1 + a_1 where a_1 has two uses now.

For Sebastian's case above the issue is that we are apparently
bad at optimizing post-increment exit tests.  But if you'd consider
code like

  i_2 = i_1 + 1;
  b1_3 = i_2 < 100;
  b2_4 = i_2 > 50;
  if (b1_3 && b2_4)
...

then it is profitable to remove i_2 by changing the two comparisons
to i_1 <= 98 and i_1 > 49.
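
As a quick sanity check of that rewrite, here is a sketch in plain C
(function names are made up for the example, and signed overflow of
i_1 + 1 is assumed not to happen): with i_2 = i_1 + 1, the tests
i_2 < 100 and i_2 > 50 are the same as i_1 <= 98 and i_1 > 49, so the
increment is no longer needed for the condition.

```c
/* Comparisons as written, going through i_2 = i_1 + 1.  */
int
in_range_via_i2 (int i_1)
{
  int i_2 = i_1 + 1;
  return i_2 < 100 && i_2 > 50;
}

/* The same test expressed directly on i_1, so i_2 is dead if it has no
   other uses.  */
int
in_range_via_i1 (int i_1)
{
  return i_1 <= 98 && i_1 > 49;
}
```

Both functions agree for every i_1 where the addition does not overflow;
the boundary values are 49 (false), 50 (true), 98 (true) and 99 (false).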

I thought about doing all simplifications first without committing
any simplified sequence to the IL, then scanning over the result,
pruning out cases that end up pessimizing code (how exactly isn't
yet clear to me).

So I'm not sure what we want to do here now.  I don't very much like
doing things explicitly in the pattern description (nor using the
"has_single_use" predicate).
I suppose for the gimple_build () stuff we could restrict simplifications
to the expression we are building (not simplifying with SSA defs in the
IL), more exactly mimicking fold_buildN behavior.
I suppose for forwprop we could use the above valueize hook (but then
regress because not all patterns as implemented in forwprop guard
their def stmt lookup with has_single_use...).

Any opinion on this?  Any idea of a "simple" cost function if
you have the functions IL before and after simplifications (but
without any DCE/CSE applied)?

Thanks,
Richard.

Re: [PATCH][3/n] Merge from match-and-simplify, first patterns and questions

2014-10-20 Thread Richard Biener

On Sun, 19 Oct 2014, Marc Glisse wrote:

> Hello,
> 
> looking through the patterns on the branch (not specifically the ones attached
> here), I am surprised to see so few calls to has_single_use. In RTL-land, we
> don't even valueize if there are several uses, so the question doesn't occur.
> In generic, we assume everything is single use (CSE could later disagree, but
> that's the user's fault for writing his code that way). In
> tree-ssa-forwprop.c, helpers like get_prop_source_stmt do test for single use.
> Since has_single_use is a bit painful to use in .pd files (separate test for
> generic and constants), it might deserve another helper function, or a special
> syntax.

But I don't think "has_single_use" is a good tool to disable transforms 
on.  It's also used very inconsistently in tree-ssa-forwprop.c.

See also my other mail.

Richard.

[PATCH] More aggressively try swapping operands / SLP nodes

2014-10-20 Thread Richard Biener


This fixes the vectorizer testsuite fallout from folding all stmts
which can swap tree operands where the SLP vectorizer currently
cannot deal with that.  It fixes it by making the SLP vectorizer
deal with swapped operands in more cases, mostly SLP tree leafs
and operations with one dt_external operand.
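
The basic idea can be modelled in a few lines, heavily simplified and not
taken from tree-vect-slp.c: compare a statement's operand kinds against
those recorded for the first statement of the group and, for a
commutative operation, retry with the two operands exchanged.  The enum,
the function name and the exact meaning of the return values are
assumptions for this sketch; they only loosely echo the -1/0/1
convention in the patch.

```c
/* Hypothetical operand kinds, loosely modelled on vectorizer def types.  */
enum def_kind { DEF_INTERNAL, DEF_EXTERNAL, DEF_CONSTANT };

/* Compare the operand kinds CUR of a statement with the kinds FIRST
   recorded for the first statement of the group.
   Return 0 if they match as they are, 1 if they match only after swapping
   the two operands of a commutative operation (CUR is swapped in place),
   and -1 if they cannot be made to match.  */
int
match_operand_kinds (const enum def_kind first[2], enum def_kind cur[2],
                     int commutative)
{
  if (cur[0] == first[0] && cur[1] == first[1])
    return 0;

  if (commutative && cur[1] == first[0] && cur[0] == first[1])
    {
      enum def_kind tmp = cur[0];
      cur[0] = cur[1];
      cur[1] = tmp;
      return 1;
    }

  return -1;
}
```

For example, with a first statement of kinds (internal, external), a later
statement of kinds (external, internal) is accepted only when the
operation is commutative.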

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Might also fix some PR but a cursory look didn't find one.

Thanks,
Richard.

2014-10-20  Richard Biener  

* tree-vect-slp.c (vect_get_and_check_slp_defs): Try swapping
operands to get a def operand kind match.  Signal mismatches
to the parent so we can try swapping its operands.
(vect_build_slp_tree): Try swapping operands if they have
a mismatched operand kind.

Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 216448)
+++ gcc/tree-vect-slp.c (working copy)
@@ -205,9 +205,11 @@ vect_get_place_in_interleaving_chain (gi
 
 /* Get the defs for the rhs of STMT (collect them in OPRNDS_INFO), check that
they are of a valid type and that they match the defs of the first stmt of
-   the SLP group (stored in OPRNDS_INFO).  */
+   the SLP group (stored in OPRNDS_INFO).  If there was a fatal error
+   return -1, if the error could be corrected by swapping operands of the
+   operation return 1, if everything is ok return 0.  */
 
-static bool
+static int 
 vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo,
  gimple stmt, bool first,
  vec *oprnds_info)
@@ -220,8 +222,9 @@ vect_get_and_check_slp_defs (loop_vec_in
   struct loop *loop = NULL;
   bool pattern = false;
   slp_oprnd_info oprnd_info;
-  int op_idx = 1;
-  tree compare_rhs = NULL_TREE;
+  int first_op_idx = 1;
+  bool commutative = false;
+  bool first_op_cond = false;
 
   if (loop_vinfo)
 loop = LOOP_VINFO_LOOP (loop_vinfo);
@@ -229,35 +232,41 @@ vect_get_and_check_slp_defs (loop_vec_in
   if (is_gimple_call (stmt))
 {
   number_of_oprnds = gimple_call_num_args (stmt);
-  op_idx = 3;
+  first_op_idx = 3;
 }
   else if (is_gimple_assign (stmt))
 {
+  enum tree_code code = gimple_assign_rhs_code (stmt);
   number_of_oprnds = gimple_num_ops (stmt) - 1;
   if (gimple_assign_rhs_code (stmt) == COND_EXPR)
-number_of_oprnds++;
+   {
+ first_op_cond = true;
+ commutative = true;
+ number_of_oprnds++;
+   }
+  else
+   commutative = commutative_tree_code (code);
 }
   else
-return false;
+return -1;
 
+  bool swapped = false;
   for (i = 0; i < number_of_oprnds; i++)
 {
-  if (compare_rhs)
+again:
+  if (first_op_cond)
{
- oprnd = compare_rhs;
- compare_rhs = NULL_TREE;
+ if (i == 0 || i == 1)
+   oprnd = TREE_OPERAND (gimple_op (stmt, first_op_idx),
+ swapped ? !i : i);
+ else
+   oprnd = gimple_op (stmt, first_op_idx + i - 1);
}
   else
-oprnd = gimple_op (stmt, op_idx++);
+oprnd = gimple_op (stmt, first_op_idx + (swapped ? !i : i));
 
   oprnd_info = (*oprnds_info)[i];
 
-  if (COMPARISON_CLASS_P (oprnd))
-{
-  compare_rhs = TREE_OPERAND (oprnd, 1);
-  oprnd = TREE_OPERAND (oprnd, 0);
-   }
-
   if (!vect_is_simple_use (oprnd, NULL, loop_vinfo, bb_vinfo, &def_stmt,
   &def, &dt)
  || (!def_stmt && dt != vect_constant_def))
@@ -270,7 +279,7 @@ vect_get_and_check_slp_defs (loop_vec_in
   dump_printf (MSG_MISSED_OPTIMIZATION, "\n");
}
 
- return false;
+ return -1;
}
 
   /* Check if DEF_STMT is a part of a pattern in LOOP and get the def stmt
@@ -288,6 +297,14 @@ vect_get_and_check_slp_defs (loop_vec_in
   pattern = true;
   if (!first && !oprnd_info->first_pattern)
{
+ if (i == 0
+ && !swapped
+ && commutative)
+   {
+ swapped = true;
+ goto again;
+   }
+
  if (dump_enabled_p ())
{
  dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -297,7 +314,7 @@ vect_get_and_check_slp_defs (loop_vec_in
   dump_printf (MSG_MISSED_OPTIMIZATION, "\n");
}
 
- return false;
+ return 1;
 }
 
   def_stmt = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (def_stmt));
@@ -308,7 +325,7 @@ vect_get_and_check_slp_defs (loop_vec_in
   if (dump_enabled_p ())
 dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 "Unsupported pattern.\n");
-  return false;
+  return -1;
 }
 
   switch (gimple_code (def_stmt))
@@ -325,7 +

[AARCH64, NEON] Any regression testcase for AARCH64 NEON intrinsics in GCC testsuite?

2014-10-20 Thread Yangfei (Felix)

Hi,

  I am trying to improve the AARCH64 NEON intrinsics. It seems that we don't 
have enough testcases for this part in the GCC testsuite.
  How do you guys test your patches on this part? Any suggestions? Thanks.

Re: [patch] LWG 2019 - std::isblank

2014-10-20 Thread Jonathan Wakely


On 17/10/14 10:41 +0100, Jonathan Wakely wrote:

http://cplusplus.github.io/LWG/lwg-defects.html#2019

I've checked the relevant _ISblank/_ISBLANK/_CTYPE_B constant on all
targets except VxWorks where I chose something that looks reasonable.
Not all targets reserve a bit for isblank, but this way
ctype_base::blank is always defined, but on some platforms with the
same value as ctype_base::space. That means that on those targets
isblank(c, loc) is equivalent to isspace(c, loc) which is not correct,
but isn't completely crazy either.

Some systems (bionic, newlib, netbsd, openbsd) do define a _B (or
_CTYPE_B) constant, but as it says on netbsd:

/*
* isblank() is implemented as C function, due to insufficient bitwidth in
* _ctype_.  Note that _B does not mean isblank - it means isprint && !isgraph.
*/

On those targets there is no bitmask corresponding to the isblank set.
I don't know how to solve that without changing ctype_base::mask to a
wider type, which I'm not planning on doing.

N.B. on other BSDs (freebsd, darwin, dragonfly) _CTYPE_B *does*
correspond to isblank. Portability is fun.
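
To make the isblank/isspace difference concrete, here is a small sketch using the plain C <ctype.h> functions in the default "C" locale. It is not the libstdc++ implementation; the helper names space_but_not_blank and count_space_but_not_blank are made up for illustration. On a target where ctype_base::blank shares its value with ctype_base::space, the characters this helper reports would be classified as blank as well.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper: returns nonzero for characters that isspace()
   accepts but isblank() rejects.  In the "C" locale these are
   '\n', '\v', '\f' and '\r'.  */
int
space_but_not_blank (int c)
{
  /* The <ctype.h> functions require an unsigned char value (or EOF).  */
  assert (c >= 0 && c <= 255);
  return isspace (c) && !isblank (c);
}

/* Counts such characters over the whole unsigned char range.  */
int
count_space_but_not_blank (void)
{
  int n = 0;

  for (int c = 0; c <= 255; c++)
    if (space_but_not_blank (c))
      n++;
  return n;
}
```

Calling space_but_not_blank ('\n') gives a nonzero result, while ' ' and '\t' give zero, since those two are the only blank characters in the "C" locale.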

Some implementations of ctype::is(mask, char) and/or
ctype::do_is defined inline in config/os/*/ctype_inline.h
need the ctype_base::blank mask, but those files get included by C++98
code, so for some targets ctype_base::blank is always defined even in
C++98 mode. Solving that is too difficult.

Tested x86_64-linux, with --enable-clocale={gnu,generic}
and also by hacking configure.host to use config/os/generic, and also
tested on x86_64-netbsd5.1 and x86_64-dragonfly3.6. Something will
probably break on a target I didn't test, but should be easy to fix.

I plan to commit this later today.


Committed to trunk.

Re: [PATCH i386 AVX512] [56/n] Add plus/minus/abs/neg/andnot insn patterns.

2014-10-20 Thread Jakub Jelinek

On Tue, Oct 14, 2014 at 11:18:28AM +0400, Kirill Yukhin wrote:
>   * config/i386/sse.md (define_mode_iterator VI_AVX2): Extend
>   to support AVX-512BW.
>   (define_mode_iterator VI124_AVX2_48_AVX512F): Remove.
>   (define_expand "3"): Remove masking support.
>   (define_insn "*3"): Ditto.
>   (define_expand "3_mask"): New.
>   (define_expand "3_mask"): Ditto.
>   (define_insn "*3_mask"): Ditto.
>   (define_insn "*3_mask"): Ditto.
>   (define_expand "_andnot3"): Remove masking support.
>   (define_insn "*andnot3"): Ditto.
>   (define_expand "_andnot3_mask"): New.
>   (define_expand "_andnot3_mask"): Ditto.
>   (define_insn "*andnot3"): Ditto.
>   (define_insn "*andnot3"): Ditto.
>   (define_insn "*abs2"): Remove masking support.
>   (define_insn "abs2_mask"): New.
>   (define_insn "abs2_mask"): Ditto.
>   (define_expand "abs2"): Use VI_AVX2 mode iterator.

Unfortunately this caused PR63600.  The problem is that VI_AVX2
mode iterator includes V2DI and for AVX2 also V4DI, but for pre-ssse3
ix86_expand_sse2_abs doesn't handle V2DI (and can't easily, we don't have
PSRAQ instruction), for ssse3 there is no vpabsq instruction, and for
avx2 neither.
We can handle V2DI/V4DI only for TARGET_AVX512VL, and V8DI for
TARGET_AVX512F.
Thus, IMHO the mode iterator on at least
(define_insn "*abs2"
and on
(define_expand "abs2"
is wrong, should not include V2DI/V4DI unless TARGET_AVX512VL
(so new (or resurrected, was that VI124_AVX2_48_AVX512F?)
specialized mode iterator?).
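
For readers wondering why the missing PSRAQ matters: a common branch-free way to compute an absolute value needs the sign of each element replicated across the whole lane, which a vector implementation would normally get from an arithmetic right shift. The following is only a scalar sketch of that idiom for 64-bit lanes. Whether ix86_expand_sse2_abs uses exactly this sequence for the narrower modes is an assumption on my part, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sign mask of X: all ones if X is negative, zero otherwise.  A vector
   implementation would get this from an arithmetic right shift by 63,
   i.e. the 64-bit-lane shift that the email says is not available.  */
static uint64_t
sign_mask64 (int64_t x)
{
  return x < 0 ? UINT64_MAX : 0;
}

/* abs computed as (x ^ m) - m.  The arithmetic is done on unsigned
   values so that INT64_MIN wraps to 2^63 instead of overflowing.  */
uint64_t
abs64_via_mask (int64_t x)
{
  uint64_t m = sign_mask64 (x);

  return ((uint64_t) x ^ m) - m;
}

/* Applies the idiom to N lanes, the way a V2DI/V4DI pattern would.  */
void
abs64_lanes (const int64_t *in, uint64_t *out, int n)
{
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    out[i] = abs64_via_mask (in[i]);
}
```

The xor and subtract steps exist for 64-bit lanes on all the targets discussed; it is only the mask-producing step that has no single-instruction form there.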

Jakub

Re: [libstdc++ PATCH] More Fundamentals v1 variable templates

2014-10-20 Thread Jonathan Wakely


On 18/10/14 23:22 +0300, Ville Voutilainen wrote:

On 18 October 2014 23:18, Ville Voutilainen  wrote:

Tested on Linux-x64.

2014-10-18  Ville Voutilainen  

Implement more Library Fundamentals v1 variable templates for
type traits.
* include/Makefile.am: Add ratio, chrono and system_error.
* include/experimental/chrono: New.
* include/experimental/ratio: Likewise.
* include/experimental/system_error: Likewise.
* include/experimental/tuple (tuple_size_v): Likewise.
* testsuite/experimental/chrono/value.cc: Likewise.
* testsuite/experimental/ratio/value.cc: Likewise.
* testsuite/experimental/system_error/value.cc: Likewise.
* testsuite/experimental/tuple/tuple_size.cc: Likewise.


Hah, failed to uglify system_error. New patch attached.


Thanks.

The templates should also use 'typename' not 'class' but I can make
that change before committing it so no need for a new patch.

I'll do the commit tomorrow.

[PATCH] Adjust testcases to be robust against operand order changes

2014-10-20 Thread Richard Biener


When folding all stmts we can end up canonicalizing operand order
correctly which breaks at least the following two testcases.

Fixed by making their expected outcome more robust.

Tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-10-20  Richard Biener  

* gcc.dg/tree-ssa/slsr-19.c: Make robust against operand order changes.
* gcc.dg/tree-ssa/reassoc-20.c: Likewise.

Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-19.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/slsr-19.c (revision 216463)
+++ gcc/testsuite/gcc.dg/tree-ssa/slsr-19.c (working copy)
@@ -16,7 +16,7 @@ f (int c, int s)
   return x1 + x2;
 }
 
-/* { dg-final { scan-tree-dump-times " \\* y" 1 "optimized" } } */
-/* { dg-final { scan-tree-dump-times " \\* 2" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " \\* " 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " \\* 2;" 1 "optimized" } } */
 /* { dg-final { cleanup-tree-dump "optimized" } } */
 
Index: gcc/testsuite/gcc.dg/tree-ssa/reassoc-20.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/reassoc-20.c  (revision 216463)
+++ gcc/testsuite/gcc.dg/tree-ssa/reassoc-20.c  (working copy)
@@ -15,6 +15,6 @@ int main(void)
   printf ("%d %d\n", e, f);
 }
 
-/* { dg-final { scan-tree-dump-times "b.._. \\\+ a.._." 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\[ab\].._. \\\+ \[ab\].._." 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times " \\\+ " 2 "optimized" } } */
 /* { dg-final { cleanup-tree-dump "optimized" } } */
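
The effect of the reassoc-20.c change can be checked outside DejaGnu with a rough sketch like the one below. It uses POSIX extended regular expressions as a stand-in for the Tcl regexps the testsuite really uses, with the Tcl quoting removed from the patterns; the helper name and the sample operand names are invented for illustration and are not taken from an actual dump.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* The old and new patterns from reassoc-20.c, without Tcl quoting.  */
const char *const old_pattern = "b.._. \\+ a.._.";
const char *const new_pattern = "[ab].._. \\+ [ab].._.";

/* Returns 1 if TEXT contains a match for the POSIX extended regular
   expression PATTERN, and 0 otherwise (including on a bad pattern).  */
int
matches (const char *pattern, const char *text)
{
  regex_t re;
  int ok;

  assert (pattern != NULL && text != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  ok = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return ok;
}
```

With made-up operands, matches (old_pattern, "a.0_2 + b.1_3") fails while matches (new_pattern, "a.0_2 + b.1_3") succeeds, and both patterns accept "b.1_3 + a.0_2", which is the robustness the patch is after.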

Re: [PATCH i386 AVX512] [56/n] Add plus/minus/abs/neg/andnot insn patterns.

2014-10-20 Thread Kirill Yukhin

Hello,
On 20 Oct 14:36, Jakub Jelinek wrote:
> On Tue, Oct 14, 2014 at 11:18:28AM +0400, Kirill Yukhin wrote:
> > * config/i386/sse.md (define_mode_iterator VI_AVX2): Extend
> > to support AVX-512BW.
> > (define_mode_iterator VI124_AVX2_48_AVX512F): Remove.
> > (define_expand "3"): Remove masking support.
> > (define_insn "*3"): Ditto.
> > (define_expand "3_mask"): New.
> > (define_expand "3_mask"): Ditto.
> > (define_insn "*3_mask"): Ditto.
> > (define_insn "*3_mask"): Ditto.
> > (define_expand "_andnot3"): Remove masking support.
> > (define_insn "*andnot3"): Ditto.
> > (define_expand "_andnot3_mask"): New.
> > (define_expand "_andnot3_mask"): Ditto.
> > (define_insn "*andnot3"): Ditto.
> > (define_insn "*andnot3"): Ditto.
> > (define_insn "*abs2"): Remove masking support.
> > (define_insn "abs2_mask"): New.
> > (define_insn "abs2_mask"): Ditto.
> > (define_expand "abs2"): Use VI_AVX2 mode iterator.
> 
> Unfortunately this caused PR63600.  The problem is that VI_AVX2
> mode iterator includes V2DI and for AVX2 also V4DI, but for pre-ssse3
> ix86_expand_sse2_abs doesn't handle V2DI (and can't easily, we don't have
> PSRAQ instruction), for ssse3 there is no vpabsq instruction, and for
> avx2 neither.
> We can handle V2DI/V4DI only for TARGET_AVX512VL, and V8DI for
> TARGET_AVX512F.
> Thus, IMHO the mode iterator on at least
> (define_insn "*abs2"
> and on
> (define_expand "abs2"
> is wrong, should not include V2DI/V4DI unless TARGET_AVX512VL
> (so new (or resurrected, was that VI124_AVX2_48_AVX512F?)
> specialized mode iterator?).


This patch removes absq insn patterns for non-AVX-512 targets.


gcc/
* config/i386/sse.md (define_mode_iterator VI_AVX2): Restore to 128-,
256- bit integer modes only.
(define_mode_iterator VI_AVX2_AVX512): New.
(define_expand "neg2"): Use VI_AVX2_AVX512 mode iterator.
(define_expand "3"): Ditto.
(define_insn "*3"): Ditto.
(define_expand "_andnot3"): Ditto.
(define_mode_iterator VI1248_AVX512VL_AVX512BW): New.
(define_insn "abs2"): Ditto.

Bootstrap in progress. AVX-512 tests pass.

Is it ok for trunk?

--
Thanks, K

AVX-512. Disable absq for non AVX-512 targets.

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index fd40623..74aca48 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -271,6 +271,11 @@
(V4DI "TARGET_AVX") V2DI])
 
 (define_mode_iterator VI_AVX2
+  [(V32QI "TARGET_AVX2") V16QI
+   (V16HI "TARGET_AVX2") V8HI
+   (V8SI "TARGET_AVX2") V4SI])
+
+(define_mode_iterator VI_AVX2_AVX512
   [(V64QI "TARGET_AVX512BW") (V32QI "TARGET_AVX2") V16QI
(V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX2") V8HI
(V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX2") V4SI
@@ -9142,18 +9147,18 @@
 ;
 
 (define_expand "neg2"
-  [(set (match_operand:VI_AVX2 0 "register_operand")
-   (minus:VI_AVX2
+  [(set (match_operand:VI_AVX2_AVX512 0 "register_operand")
+   (minus:VI_AVX2_AVX512
  (match_dup 2)
- (match_operand:VI_AVX2 1 "nonimmediate_operand")))]
+ (match_operand:VI_AVX2_AVX512 1 "nonimmediate_operand")))]
   "TARGET_SSE2"
   "operands[2] = force_reg (mode, CONST0_RTX (mode));")
 
 (define_expand "3"
-  [(set (match_operand:VI_AVX2 0 "register_operand")
-   (plusminus:VI_AVX2
- (match_operand:VI_AVX2 1 "nonimmediate_operand")
- (match_operand:VI_AVX2 2 "nonimmediate_operand")))]
+  [(set (match_operand:VI_AVX2_AVX512 0 "register_operand")
+   (plusminus:VI_AVX2_AVX512
+ (match_operand:VI_AVX2_AVX512 1 "nonimmediate_operand")
+ (match_operand:VI_AVX2_AVX512 2 "nonimmediate_operand")))]
   "TARGET_SSE2"
   "ix86_fixup_binary_operands_no_copy (, mode, operands);")
 
@@ -9180,10 +9185,10 @@
   "ix86_fixup_binary_operands_no_copy (, mode, operands);")
 
 (define_insn "*3"
-  [(set (match_operand:VI_AVX2 0 "register_operand" "=x,v")
-   (plusminus:VI_AVX2
- (match_operand:VI_AVX2 1 "nonimmediate_operand" "0,v")
- (match_operand:VI_AVX2 2 "nonimmediate_operand" "xm,vm")))]
+  [(set (match_operand:VI_AVX2_AVX512 0 "register_operand" "=x,v")
+   (plusminus:VI_AVX2_AVX512
+ (match_operand:VI_AVX2_AVX512 1 "nonimmediate_operand" "0,v")
+ (match_operand:VI_AVX2_AVX512 2 "nonimmediate_operand" "xm,vm")))]
   "TARGET_SSE2
&& ix86_binary_operator_ok (, mode, operands)"
   "@
@@ -10715,10 +10720,10 @@
 })
 
 (define_expand "_andnot3"
-  [(set (match_operand:VI_AVX2 0 "register_operand")
-   (and:VI_AVX2
- (not:VI_AVX2 (match_operand:VI_AVX2 1 "register_operand"))
- (match_operand:VI_AVX2 2 "nonimmediate_operand")))]
+  [(set (match_operand:VI_AVX2_AVX512 0 "register_operand")
+   (and:VI_AVX2_AVX512
+ (not:VI_AVX2_AVX512 (match_operand:VI_AVX2_AVX512 1 "register_operand"))
+ (match_operand:VI_AVX2_A

Re: [PATCH i386 AVX512] [56/n] Add plus/minus/abs/neg/andnot insn patterns.

2014-10-20 Thread Jakub Jelinek

On Mon, Oct 20, 2014 at 05:30:36PM +0400, Kirill Yukhin wrote:
> > Unfortunately this caused PR63600.  The problem is that VI_AVX2
> > mode iterator includes V2DI and for AVX2 also V4DI, but for pre-ssse3
> > ix86_expand_sse2_abs doesn't handle V2DI (and can't easily, we don't have
> > PSRAQ instruction), for ssse3 there is no vpabsq instruction, and for
> > avx2 neither.
> > We can handle V2DI/V4DI only for TARGET_AVX512VL, and V8DI for
> > TARGET_AVX512F.
> > Thus, IMHO the mode iterator on at least
> > (define_insn "*abs2"
> > and on
> > (define_expand "abs2"
> > is wrong, should not include V2DI/V4DI unless TARGET_AVX512VL
> > (so new (or resurrected, was that VI124_AVX2_48_AVX512F?)
> > specialized mode iterator?).
> 
> 
> This patch removes absq insn patterns for non-AVX-512 targets.
> 
> 
> gcc/
>   * config/i386/sse.md (define_mode_iterator VI_AVX2): Restore to 128-,
>   256- bit integer modes only.
>   (define_mode_iterator VI_AVX2_AVX512): New.
>   (define_expand "neg2"): Use VI_AVX2_AVX512 mode iterator.
>   (define_expand "3"): Ditto.
>   (define_insn "*3"): Ditto.
>   (define_expand "_andnot3"): Ditto.
>   (define_mode_iterator VI1248_AVX512VL_AVX512BW): New.
>   (define_insn "abs2"): Ditto.
> 
> Bootstrap in progress. AVX-512 tests pass.
> 
> Is it ok for trunk?

I'll certainly leave the review to Uros, whatever he prefers.
That said, I was expecting you'd keep VI_AVX2 as is (because from the patch
clearly that is what is used most commonly, the V?DI modes are for most
insns normal integral vector modes, VI* uses the same modes and VI_AVX2
used to be just like VI, just with TARGET_AVX conditions replaced with
TARGET_AVX2), and just add a new mode iterator for the two abs patterns
(*abs2 and abs2), it can be specialized mode iterator just
for the abs with ABS in names or something.

Jakub

Re: [PATCH i386 AVX512] [81/n] Add new built-ins.

2014-10-20 Thread Jakub Jelinek

On Mon, Oct 20, 2014 at 05:41:25PM +0400, Kirill Yukhin wrote:
> Hello,
> This patch adds (almost) all built-ins needed by
> AVX-512VL,BW,DQ intrinsics.
> 
> Main questionable hunk is:
> 
> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> index b69312b..a639487 100644
> --- a/gcc/tree-core.h
> +++ b/gcc/tree-core.h
> @@ -1539,7 +1539,7 @@ struct GTY(()) tree_function_decl {
>   DECL_FUNCTION_CODE.  Otherwise unused.
>   ???  The bitfield needs to be able to hold all target function
> codes as well.  */
> -  ENUM_BITFIELD(built_in_function) function_code : 11;
> +  ENUM_BITFIELD(built_in_function) function_code : 12;
>ENUM_BITFIELD(built_in_class) built_in_class : 2;
>  
>unsigned static_ctor_flag : 1;

Well, decl_with_vis has 15 unused bits, so instead of growing
FUNCTION_DECL significantly, might be better to move one of the
flags to decl_with_vis and just document that it applies to FUNCTION_DECLs
only.  Or move some flag to cgraph if possible.
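
As a reminder of what the extra bit buys, here is a toy sketch of the capacity of such a field. The struct below is a hypothetical stand-in that only mirrors the width of function_code; the real member uses ENUM_BITFIELD and lives in tree_function_decl.

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant bits of tree_function_decl;
   only the width of function_code matters here.  */
struct decl_bits_11
{
  unsigned function_code : 11;
  unsigned built_in_class : 2;
};

/* Number of distinct values an unsigned bit-field of WIDTH bits holds.  */
unsigned
bitfield_capacity (unsigned width)
{
  assert (width > 0 && width < 32);
  return 1u << width;
}

/* Stores CODE in an 11-bit field and reads it back.  Values of 2048 and
   above are reduced modulo 2^11, so distinct codes would collide.  */
unsigned
roundtrip_11 (unsigned code)
{
  struct decl_bits_11 d = { 0, 0 };

  d.function_code = code;
  return d.function_code;
}
```

So 11 bits cover 2048 function codes and 12 bits cover 4096; a code of 2048 stored in the 11-bit field reads back as 0.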

But seeing e.g.
   IX86_BUILTIN_FIXUPIMMPD256, IX86_BUILTIN_FIXUPIMMPD256_MASK,
   IX86_BUILTIN_FIXUPIMMPD256_MASKZ
etc. I wonder if you really need that many builtins, weren't we adding
for avx512f just single builtin instead of 3 different ones, always
providing mask argument and depending on whether it is all ones, etc.
figuring out what kind of masking should be performed?
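
To illustrate the single-builtin idea, below is a scalar model of a merge-masked operation, sketched under the assumption of a 4-lane vector and an 8-bit mask. The names add_mask, mask_is_all_ones and LANES are hypothetical and do not correspond to any existing builtin.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4

/* Scalar model of merge-masking: lane I receives a[I] + b[I] when bit I
   of MASK is set and keeps src[I] otherwise.  Zero-masking is the same
   operation with SRC filled with zeros.  */
void
add_mask (int32_t *dst, const int32_t *src, const int32_t *a,
          const int32_t *b, uint8_t mask)
{
  assert (dst && src && a && b);
  for (int i = 0; i < LANES; i++)
    dst[i] = ((mask >> i) & 1) ? a[i] + b[i] : src[i];
}

/* Returns 1 if MASK selects every lane, in which case the masked form
   degenerates to the plain unmasked operation.  */
int
mask_is_all_ones (uint8_t mask)
{
  return (mask & ((1u << LANES) - 1)) == (1u << LANES) - 1;
}
```

With one entry point of this shape, the expander can inspect the mask and source operands and choose between the unmasked, merge-masked and zero-masked forms instead of exposing three builtins.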

Jakub

Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.

2014-10-20 Thread Yuri Rumyantsev

Richard,

Thanks for your answer!

In the current implementation, phi node conversion assumes that at least
one of the incoming edges to the bb containing the given phi is
non-critical, and chooses that edge to insert predicated code. If we
choose a critical edge instead, we need to determine the insertion point
and insertion direction (before/after), since otherwise we can get
invalid SSA form (use before def). This is done by my new function, which
is not in the current patch (I will present that patch later). So I assume
that we need to leave this patch as it is, to avoid introducing new bugs.
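
For reference, the notion of a critical edge used here can be sketched on a toy CFG: an edge is critical when its source block has more than one successor and its destination block has more than one predecessor. The data structures below are hypothetical and far simpler than GCC's basic_block and edge types; all_preds_critical_p only borrows the name suggested in the review.

```c
#include <assert.h>

#define MAX_EDGES 16

/* Hypothetical toy CFG: blocks are integers, edges are (src, dst) pairs.  */
struct edge { int src, dst; };
struct cfg { int n_edges; struct edge edges[MAX_EDGES]; };

static int
count_succs (const struct cfg *g, int bb)
{
  int n = 0;

  for (int i = 0; i < g->n_edges; i++)
    if (g->edges[i].src == bb)
      n++;
  return n;
}

static int
count_preds (const struct cfg *g, int bb)
{
  int n = 0;

  for (int i = 0; i < g->n_edges; i++)
    if (g->edges[i].dst == bb)
      n++;
  return n;
}

/* Edge E is critical if its source has several successors and its
   destination has several predecessors.  */
int
edge_critical_p (const struct cfg *g, int e)
{
  assert (e >= 0 && e < g->n_edges);
  return count_succs (g, g->edges[e].src) > 1
         && count_preds (g, g->edges[e].dst) > 1;
}

/* Returns 1 if every edge entering BB is critical, 0 otherwise.  */
int
all_preds_critical_p (const struct cfg *g, int bb)
{
  for (int i = 0; i < g->n_edges; i++)
    if (g->edges[i].dst == bb && !edge_critical_p (g, i))
      return 0;
  return 1;
}
```

A block for which all_preds_critical_p returns 1 has no incoming edge on which predicated code can be placed without affecting another path, which is the case that needs the extra insertion-point logic.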

Thanks.
Yuri.

2014-10-20 12:00 GMT+04:00 Richard Biener :
> On Fri, Oct 17, 2014 at 4:09 PM, Yuri Rumyantsev  wrote:
>> Richard,
>>
>> I reworked the patch as you proposed, but I didn't understand what
>> did you mean by:
>>
>>>So please rework the patch so critical edges are always handled
>>>correctly.
>>
>> In current patch flag_force_vectorize is used (1) to reject phi nodes
>> with more than 2 arguments; (2) to reject basic blocks with only
>> critical incoming edges since support for extended predication of phi
>> nodes will be in next patch.
>
> I mean that (2) should not be rejected dependent on flag_force_vectorize.
> It was rejected because if-cvt couldn't handle it correctly before but with
> this patch this is fixed.  I see no reason to still reject this then even
> for !flag_force_vectorize.
>
> Rejecting PHIs with more than two arguments with flag_force_vectorize
> is ok.
>
> Richard.
>
>> Could you please clarify your statement.
>>
>> I attached modified patch.
>>
>> ChangeLog:
>>
>> 2014-10-17  Yuri Rumyantsev  
>>
>> (flag_force_vectorize): New variable.
>> (edge_predicate): New function.
>> (set_edge_predicate): New function.
>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list
>> if destination block of edge is not always executed. Set-up predicate
>> for critical edge.
>> (if_convertible_phi_p): Accept phi nodes with more than two args
>> if FLAG_FORCE_VECTORIZE was set-up.
>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>> (if_convertible_stmt_p): Fix up pre-function comments.
>> (all_edges_are_critical): New function.
>> (if_convertible_bb_p): Use call of all_preds_critical_p
>> to reject block if-conversion with incoming critical edges only if
>> FLAG_FORCE_VECTORIZE was not set-up.
>> (predicate_bbs): Skip loop exit block also. Invoke build2_loc
>> to compute predicate instead of fold_build2_loc.
>> Add zeroing of edge 'aux' field.
>> (find_phi_replacement_condition): Extend function interface:
>> it returns NULL if given phi node must be handled by means of
>> extended phi node predication. If the number of predecessors of phi-block
>> is equal to 2 and at least one incoming edge is not critical, the original
>> algorithm is used.
>> (tree_if_conversion): Temporary set-up FLAG_FORCE_VECTORIZE to false.
>> Nullify 'aux' field of edges for blocks with two successors.
>>
>>
>>
>>
>> 2014-10-17 13:09 GMT+04:00 Richard Biener :
>>> On Thu, Oct 16, 2014 at 5:42 PM, Yuri Rumyantsev  wrote:
>>>> Richard,
>>>>
>>>> Here is reduced patch as you requested. All your remarks have been fixed.
>>>> Could you please look at it (I have already sent the patch with
>>>> changes in add_to_predicate_list for review).
>>>
>>> + if (dump_file && (dump_flags & TDF_DETAILS))
>>> +   fprintf (dump_file, "More than two phi node args.\n");
>>> + return false;
>>> +   }
>>> +
>>> +}
>>>
>>> Excess vertical space.
>>>
>>>
>>> +/* Assumes that BB has more than 2 predecessors.
>>>
>>> More than 1 predecessor?
>>>
>>> +   Returns false if at least one successor is not on critical edge
>>> +   and true otherwise.  */
>>> +
>>> +static inline bool
>>> +all_edges_are_critical (basic_block bb)
>>> +{
>>>
>>> "all_preds_critical_p" would be a better name
>>>
>>> +  if (EDGE_COUNT (bb->preds) > 2)
>>> +{
>>> +  if (!flag_force_vectorize)
>>> +   return false;
>>> +}
>>>
>>> as I said in the last review I don't think we should restrict edge
>>> predicates to flag_force_vectorize.  At least I can't see how
>>> if-conversion is magically more expensive for that case?
>>>
>>> So please rework the patch so critical edges are always handled
>>> correctly.
>>>
>>> Ok with that and the above suggested changes.
>>>
>>> Thanks,
>>> Richard.
>>>
>>>
>>>> Thanks.
>>>> Yuri.
>>>> ChangeLog
>>>> 2014-10-16  Yuri Rumyantsev  
>>>>
>>>> (flag_force_vectorize): New variable.
>>>> (edge_predicate): New function.
>>>> (set_edge_predicate): New function.
>>>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list
>>>> if destination block of edge is not always executed. Set-up predicate
>>>> for critical edge.
>>>> (if_convertible_phi_p): Accept phi nodes with more than two args
>>>> if FLAG_FORCE_VECTORIZE was set-up.
>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE.
>>>> (if_convertible_stmt_p): Fix up pre-function comments.
>>>> (all_edges_are_critical): New f

[Ada] Spurious output on optimized default-initialized limited aggregate

2014-10-20 Thread Arnaud Charlet

When expanding a limited aggregate into individual assignments, we create a
transient scope if the type of a component requires it. This must not be done
if the context is an initialization procedure, because the target of the
assignment must be visible outside of the block, and stack cleanup will happen
on return from the initialization call. Otherwise this may result in dangling
stack references in the back-end, which produce garbled results when compiled
at higher optimization levels.

Executing the following:

   gnatmake -q -O2 cutdown
   cutdown

must yield:

   0.0E+00

---
with Text_IO; use Text_IO;
procedure Cutdown is

   type Angle_Object_T is tagged record
  M_Value : Float := 0.0;
   end record;

   Zero : constant Angle_Object_T := (M_Value => 0.0);

   type Platform_T is record
  M_Roll : Angle_Object_T := Zero;
   end record;

   package Observable_Nongeneric is
  type Writer_T is tagged limited record
 M_Value : Platform_T;
  end record;

  function Init (Value : in Platform_T) return Writer_T;
   end Observable_Nongeneric;

   package body Observable_Nongeneric is

   --
  function Init (Value : in Platform_T) return Writer_T is
  begin
 return (M_Value => Value);
  end Init;
   --
   end Observable_Nongeneric;

   type Object_T is tagged limited record
  M_Platform : aliased Observable_Nongeneric.Writer_T :=
Observable_Nongeneric.Init (Platform_T'(others => <>));
   end record;

   Data : Object_T;
begin
   Put_Line (Data.M_Platform.M_Value.M_Roll.M_Value'Img);

   if Data.M_Platform.M_Value.M_Roll.M_Value /= 0.0 then
  raise Program_Error;
   end if;
end Cutdown;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-20  Ed Schonberg  

* exp_aggr.adb (Convert_To_Assignments): Do not create a
transient scope for a component whose type requires it, if the
context is an initialization procedure, because the target of
the assignment must be visible outside of the block.

Index: exp_aggr.adb
===
--- exp_aggr.adb(revision 216469)
+++ exp_aggr.adb(working copy)
@@ -3396,7 +3396,7 @@
  --  that any finalization chain will be associated with that scope.
  --  For extended returns, we delay expansion to avoid the creation
  --  of an unwanted transient scope that could result in premature
- --  finalization of the return object (which is built in in place
+ --  finalization of the return object (which is built in place
  --  within the caller's scope).
 
  or else
@@ -3409,7 +3409,14 @@
  return;
   end if;
 
-  if Requires_Transient_Scope (Typ) then
+  --  Otherwise, if a transient scope is required, create it now. If we
+  --  are within an initialization procedure do not create such, because
+  --  the target of the assignment must not be declared within a local
+  --  block, and because cleanup will take place on return from the
+  --  initialization procedure.
+  --  Should the condition be more restrictive ???
+
+  if Requires_Transient_Scope (Typ) and then not Inside_Init_Proc then
  Establish_Transient_Scope (N, Sec_Stack => Needs_Finalization (Typ));
   end if;

[Ada] Aspect specifications and incomplete views

2014-10-20 Thread Arnaud Charlet

Typically an indexing aspect is specified on the private view of a tagged
type. In the unusual case where there is an incomplete view and the aspect
specification appears on the full view, the aspect specification must be
analyzed on the full view rather than the incomplete one, to prevent freezing
anomalies with the class-wide type, which otherwise might be frozen before
the dispatch table for the type is constructed.

Compiling and executing try2.adb must yield:

   ab

---
pragma Ada_2012;
with Ada.Text_IO; use Ada.Text_IO;
procedure Try2 is
   package Pack is
  type T is tagged;
  function F (Obj : T; S : String; Pos : Positive) return Character;
  type T is tagged null record
with Constant_Indexing => F;
   end Pack;

   package body Pack is
  function F (Obj : T; S : String; Pos : Positive) return Character is
  begin
 return S (Pos);
  end F;
   end Pack;
   use Pack;

   V : T;
begin
   Put (V ("abcd", 1));
   Put (V ("abcd", 2));
   New_Line;
end Try2;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-20  Ed Schonberg  

* sem_ch3.adb (Analyze_Full_Type_Declaration): If previous view
is incomplete rather than private, and full type declaration
has aspects, analyze aspects on the full view rather than
the incomplete view, to prevent freezing anomalies with the
class-wide type.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 216469)
+++ sem_ch3.adb (working copy)
@@ -2777,9 +2777,18 @@
   --  them to the entity for the type which is currently the partial
   --  view, but which is the one that will be frozen.
 
+  --  In most cases the partial view is a private type, and both views
+  --  appear in different declarative parts. In the unusual case where the
+  --  partial view is incomplete, perform the analysis on the full view,
+  --  to prevent freezing anomalies with the corresponding class-wide type,
+  --  which otherwise might be frozen before the dispatch table is built.
+
   if Has_Aspects (N) then
- if Prev /= Def_Id then
+ if Prev /= Def_Id
+   and then Ekind (Prev) /= E_Incomplete_Type
+ then
 Analyze_Aspect_Specifications (N, Prev);
+
  else
 Analyze_Aspect_Specifications (N, Def_Id);
  end if;

Re: [PATCH] PR preprocessor/42014

2014-10-20 Thread Manuel López-Ibáñez

> 2014-10-18 23:07 GMT+02:00 Krzesimir Nowak :
>> Hello.
>>
>> This is my first patch for GCC. I already started a paperwork for
>> copyright assignment (sent an email to fsf-records at gnu org) -
>> waiting for response.
>>
>> So, about this patch - it basically removes column printing from "In
>> file included from ..." lines, as the column information always
>> returned 0. Not sure if this is correct assumption - I tested only C
>> and C++, so I don't know if other frontends (ada, go?) provide column
>> information for include lines. Anyway, column information here is
>> probably not useful.
>>
>> Or maybe it is, if GCC supports some language with include syntax like
>> the following:
>> #include , , 
>>
>> Maybe in this case printing column number makes sense?
>>
>> I need help with testcase - I don't know how to implement it
>> correctly. The output of compilation is something like this:
>>
>> In file included from .../pr42014-2.h:2,
>>  from .../pr42014-1.h:3,
>>  from .../pr42014.c:4:
>> .../pr42014-3.h:1:7: error: 'foo' was not declared in this scope
>>
>> How to check the "from" lines? Is there some dg-foo (dg-grep?) command
>> for it? dg-excess-errors is likely not suited for this purpose.
>
> I suppose I will have to add a preprocessed file and try using dg-message.

Hi Krzesimir,

I think you are overcomplicating it. The original reporter complained
simply that there is an inconsistency between the first line and the
next ones when -fshow-column is enabled (which is now the default but
it wasn't some years ago). The following patch is sufficient to fix
this inconsistency:

Index: diagnostic.c
===
--- diagnostic.c(revision 216462)
+++ diagnostic.c(working copy)
@@ -528,8 +528,8 @@
   if (context->show_column)
 pp_verbatim (context->printer,
  "In file included from %r%s:%d:%d%R", "locus",
- LINEMAP_FILE (map),
- LAST_SOURCE_LINE (map), LAST_SOURCE_COLUMN (map));
+ LINEMAP_FILE (map), LAST_SOURCE_LINE (map),
+ LAST_SOURCE_COLUMN (map));
   else
 pp_verbatim (context->printer,
  "In file included from %r%s:%d%R", "locus",
@@ -537,9 +537,15 @@
   while (! MAIN_FILE_P (map))
 {
   map = INCLUDED_FROM (line_table, map);
-  pp_verbatim (context->printer,
-   ",\n from %r%s:%d%R", "locus",
-   LINEMAP_FILE (map), LAST_SOURCE_LINE (map));
+  if (context->show_column)
+pp_verbatim (context->printer,
+ ",\n from %r%s:%d:%d%R", "locus",
+ LINEMAP_FILE (map), LAST_SOURCE_LINE (map),
+ LAST_SOURCE_COLUMN (map));
+  else
+pp_verbatim (context->printer,
+ ",\n from %r%s:%d%R", "locus",
+ LINEMAP_FILE (map), LAST_SOURCE_LINE (map));
 }
   pp_verbatim (context->printer, ":");
   pp_newline (context->printer);
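
As a standalone illustration of the intended behaviour, the sketch below formats an include chain so that every entry either carries a column or none does. It is not GCC code: struct include_loc and format_include_chain are hypothetical, snprintf replaces pp_verbatim, and the continuation prefix is simplified.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified location of one #include in the chain.  */
struct include_loc { const char *file; int line; int column; };

/* Formats CHAIN (innermost includer first) into BUF, printing the column
   on every entry when SHOW_COLUMN is nonzero and on none otherwise.
   Returns the length written, or -1 if BUF is too small.  */
int
format_include_chain (char *buf, size_t size,
                      const struct include_loc *chain, int n,
                      int show_column)
{
  size_t used = 0;

  assert (buf != NULL && size > 0 && chain != NULL && n > 0);
  for (int i = 0; i < n; i++)
    {
      const char *prefix = i == 0 ? "In file included from " : ",\n from ";
      int len;

      if (show_column)
        len = snprintf (buf + used, size - used, "%s%s:%d:%d", prefix,
                        chain[i].file, chain[i].line, chain[i].column);
      else
        len = snprintf (buf + used, size - used, "%s%s:%d", prefix,
                        chain[i].file, chain[i].line);
      if (len < 0 || (size_t) len >= size - used)
        return -1;
      used += (size_t) len;
    }
  if (size - used < 2)
    return -1;
  memcpy (buf + used, ":", 2);	/* Trailing colon plus the NUL.  */
  return (int) (used + 1);
}
```

The single show_column test is applied to the first line and to each "from" line alike, which is the consistency the original report asked for.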

You can test this by simply building gcc and using -fshow-column vs.
-fno-show-column. I think a testsuite testcase will be hard to build
because DejaGNU. It doesn't seem worth the effort for such a minor
issue. Given that you seem to have enough knowledge and ability to
modify GCC and submit good patches, it would be better to spend your
time on more important bugs.

For example, this one needs to be analyzed, we don't even know how it happens:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52998

Or this one, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45333
which I think is just a matter of adding (or factoring out) some of
the logic from maybe_unwind_expanded_macro_loc() and use it in various
places in cp/error.c (print_instantiation_*).

If you are not motivated by those, I can offer more suggestions...

Cheers,

Manuel.

The nvptx port [0/11+]

2014-10-20 Thread Bernd Schmidt

This is a patch kit that adds the nvptx port to gcc. It contains 
preliminary patches to add needed functionality, the target files, and 
one somewhat optional patch with additional target tools. There'll be 
more patch series, one for the testsuite, and one to make the offload 
functionality work with this port. Also required are the previous four 
rtl patches, two of which weren't entirely approved yet.


For the moment, I've stripped out all the address space support that got 
bogged down in review by brokenness in our representation of address 
spaces. The ptx address spaces are of course still defined and used 
inside the backend.


Ptx really isn't a usual target - it is a virtual target which is then 
translated by another compiler (ptxas) to the final code that runs on 
the GPU. There are many restrictions, some imposed by the GPU hardware, 
and some by the fact that not everything you'd want can be represented 
in ptx. Here are some of the highlights:

 * Everything is typed - variables, functions, registers. This can
   cause problems with K&R style C or anything else that doesn't
   have a proper type internally.
 * Declarations are needed, even for undefined variables.
 * Can't emit initializers referring to their variable's address since
   you can't write forward declarations for variables.
 * Variables can be declared only as scalars or arrays, not
   structures. Initializers must be in the variable's declared type,
   which requires some code in the backend, and it means that packed
   pointer values are not representable.
 * Since it's a virtual target, we skip register allocation - no good
   can probably come from doing that twice. This means asm statements
   aren't fixed up and will fail if they use matching constraints.
 * No support for indirect jumps, label values, nonlocal gotos.
 * No alloca - ptx defines it, but it's not implemented.
 * No trampolines.
 * No debugging (at all, for now - we may add line number directives).
 * Limited C library support - I have a hacked up copy of newlib
   that provides a reasonable subset.
 * malloc and free are defined by ptx (these appear to be
   undocumented), but there isn't a realloc. I have one patch for
   Fortran to use a malloc/memcpy helper function in cases where we
   know the old size.
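
The malloc/memcpy helper mentioned in the last item could look roughly like the sketch below. This is only an illustration of the idea, not the helper from the Fortran patch; the name realloc_known_size and its exact behaviour on failure are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical realloc replacement for targets that only provide malloc
   and free.  The caller must supply OLD_SIZE, since it cannot be
   recovered from the pointer.  On failure OLD is left untouched and NULL
   is returned, as with realloc.  */
void *
realloc_known_size (void *old, size_t old_size, size_t new_size)
{
  void *fresh;

  assert (old != NULL || old_size == 0);
  fresh = malloc (new_size);
  if (fresh == NULL)
    return NULL;
  if (old != NULL)
    {
      /* Copy only as much as fits in the new block.  */
      memcpy (fresh, old, old_size < new_size ? old_size : new_size);
      free (old);
    }
  return fresh;
}
```

This only works where the old size is statically or dynamically known at the call site, which is why it applies to some cases and not as a general realloc.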

All in all, this is not intended to be used as a C (or any other source 
language) compiler. I've gone through a lot of effort to make it work 
reasonably well, but only in order to get sufficient test coverage from 
the testsuites. The intended use for this is only to build it as an 
offload compiler, and use it through OpenACC by way of lto1. That leaves 
the question of how we should document it - does it need the usual 
constraint and option documentation, given that users aren't expected 
to use any of it?


A slightly earlier version of the entire patch kit was bootstrapped and 
tested on x86_64-linux. Ok for trunk?



Bernd

The nvptx port [2/11+] No register allocation

2014-10-20 Thread Bernd Schmidt

Since it's a virtual target, I've chosen not to run register allocation. 
This is one of the patches necessary to make that work; it primarily 
adds a target hook to disable it and fixes some of the fallout.



Bernd

The nvptx port [1/11+] indirect jumps

2014-10-20 Thread Bernd Schmidt


ptx doesn't have indirect jumps, so CODE_FOR_indirect_jump may not be
defined.  Add a sorry.


Bernd
	gcc/
	* optabs.c (emit_indirect_jump): Test HAVE_indirect_jump and emit a
	sorry if necessary.


Index: gcc/optabs.c
===
--- gcc/optabs.c	(revision 422345)
+++ gcc/optabs.c	(revision 422346)
@@ -4477,13 +4477,16 @@ prepare_float_lib_cmp (rtx x, rtx y, enu
 /* Generate code to indirectly jump to a location given in the rtx LOC.  */
 
 void
-emit_indirect_jump (rtx loc)
+emit_indirect_jump (rtx loc ATTRIBUTE_UNUSED)
 {
+#ifndef HAVE_indirect_jump
+  sorry ("indirect jumps are not available on this target");
+#else
   struct expand_operand ops[1];
-
   create_address_operand (&ops[0], loc);
   expand_jump_insn (CODE_FOR_indirect_jump, 1, ops);
   emit_barrier ();
+#endif
 }
 
 #ifdef HAVE_conditional_move
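
The pattern in this hunk, a feature macro that selects between the real expansion and a diagnostic, can be sketched in isolation as follows. Everything here is a toy: HAVE_FANCY_JUMP stands in for HAVE_indirect_jump, the local sorry function only imitates the shape of GCC's diagnostic, and the macro is deliberately left undefined so that the fallback branch is the one compiled.

```c
#include <assert.h>
#include <stdio.h>

/* Number of times the fallback diagnostic has been issued.  */
static int sorry_count;

/* Toy version of a "not implemented on this target" diagnostic.  */
static void
sorry (const char *msg)
{
  assert (msg != NULL);
  fprintf (stderr, "sorry, unimplemented: %s\n", msg);
  sorry_count++;
}

int
get_sorry_count (void)
{
  return sorry_count;
}

/* Returns 1 if a jump was emitted, 0 if the target lacks the feature.
   HAVE_FANCY_JUMP is a hypothetical feature macro, not defined here.  */
int
emit_fancy_jump (int target)
{
#ifndef HAVE_FANCY_JUMP
  (void) target;		/* Unused in the fallback branch.  */
  sorry ("fancy jumps are not available on this target");
  return 0;
#else
  printf ("jump to %d\n", target);
  return 1;
#endif
}
```

Because the choice is made by the preprocessor, the code that refers to the missing feature is never compiled on targets without it, which is why the parameter needs an unused marker in the fallback case.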

[Ada] Lift limitation on inter-unit inlining of instantiated subprograms

2014-10-20 Thread Arnaud Charlet

This change makes it so that instantiations of generic subprograms marked as
inline are considered for inter-unit inlining.  This was not previously the
case because of a technical limitation that was too broadly enforced (unlike
the associated comment which was more accurate) and excluded instantiations.

The call to Q.Compare must be inlined if the code is compiled with -O -gnatn:

with Q;

function F (A, B : Integer) return Boolean is
begin
  return Q.Compare (A, B);
end;

with G;

package Q is

  function Compare is new G (Integer);

end Q;

generic
  type T is (<>);
function G (Left,Right : T) return Boolean;
pragma Inline (G);

function G (Left,Right : T) return Boolean is
begin
  return Left /= Right;
end;

2014-10-20  Eric Botcazou  

* inline.adb (List_Inlining_Info): Minor tweaks.
(Add_Inlined_Body): Inline the enclosing package
if it is not internally generated, even if it doesn't come
from source.

Index: inline.adb
===
--- inline.adb  (revision 216469)
+++ inline.adb  (working copy)
@@ -414,7 +414,7 @@
 
elsif Level = Inline_Package
  and then not Is_Inlined (Pack)
- and then Comes_From_Source (E)
+ and then not Is_Internal (E)
  and then not In_Main_Unit_Or_Subunit (Pack)
then
   Set_Is_Inlined (Pack);
@@ -3888,7 +3888,7 @@
Count := Count + 1;
 
if Count = 1 then
-  Write_Str ("Listing of frontend inlined calls");
+  Write_Str ("List of calls inlined by the frontend");
   Write_Eol;
end if;
 
@@ -3917,7 +3917,7 @@
Count := Count + 1;
 
if Count = 1 then
-  Write_Str ("Listing of inlined calls passed to the backend");
+  Write_Str ("List of inlined calls passed to the backend");
   Write_Eol;
end if;
 
@@ -3947,7 +3947,7 @@
 
 if Count = 1 then
Write_Str
- ("Listing of inlined subprograms passed to the backend");
+ ("List of inlined subprograms passed to the backend");
Write_Eol;
 end if;
 
@@ -3964,7 +3964,7 @@
  end loop;
   end if;
 
-  --  Generate listing of subprogram that cannot be inlined by the backend
+  --  Generate listing of subprograms that cannot be inlined by the backend
 
   if Present (Backend_Not_Inlined_Subps)
 and then Back_End_Inlining
@@ -3979,7 +3979,7 @@
 
 if Count = 1 then
Write_Str
- ("Listing of subprograms that cannot inline the backend");
+ ("List of subprograms that cannot be inlined by the backend");
Write_Eol;
 end if;

The nvptx port [2/11+] No register allocation

2014-10-20 Thread Bernd Schmidt

Since it's a virtual target, I've chosen not to run register allocation. 
This is one of the patches necessary to make that work; it primarily 
adds a target hook to disable it and fixes some of the fallout.
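
A minimal sketch of the idea, assuming a much simplified model of the
target description: a boolean flag switches off the allocation passes
and relaxes the "no pseudos at output time" check.  The struct and
function names here are invented for illustration and are not the ones
used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a target description.  */
struct target_demo
{
  bool no_register_allocation;
};

/* Gate for the register allocation passes: they run unless the
   target opts out.  */
static bool
gate_regalloc_demo (const struct target_demo *t)
{
  assert (t != NULL);
  return !t->no_register_allocation;
}

/* Model of the relaxed output check: a pseudo register number is only
   an error when register allocation was supposed to have run.  */
static bool
operand_ok_demo (const struct target_demo *t, unsigned regno,
                 unsigned first_pseudo)
{
  assert (t != NULL);
  if (t->no_register_allocation)
    return true;
  return regno < first_pseudo;
}
```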



Bernd

	gcc/
	* target.def (no_register_allocation): New data hook.
	* doc/tm.texi.in: Add @hook TARGET_NO_REGISTER_ALLOCATION.
	* doc/tm.texi: Regenerate.
	* ira.c (gate_ira): New function.
	(pass_data_ira): Set has_gate.
	(pass_ira): Add a gate function.
	(pass_data_reload): Likewise.
	(pass_reload): Add a gate function.
	(pass_ira): Use it.
	* reload1.c (eliminate_regs): If reg_eliminate is NULL, assert that
	no register allocation happens on the target and return.
	* final.c (alter_subreg): Ensure register is not a pseudo before
	calling simplify_subreg.
	(output_operand): Assert that x isn't a pseudo only if doing
	register allocation.


Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi.orig
+++ gcc/doc/tm.texi
@@ -9520,11 +9520,19 @@ True if the @code{DW_AT_comp_dir} attrib
 @end deftypevr
 
 @deftypevr {Target Hook} bool TARGET_DELAY_SCHED2
-True if sched2 is not to be run at its normal place.  This usually means it will be run as part of machine-specific reorg.
+True if sched2 is not to be run at its normal place.
+This usually means it will be run as part of machine-specific reorg.
 @end deftypevr
 
 @deftypevr {Target Hook} bool TARGET_DELAY_VARTRACK
-True if vartrack is not to be run at its normal place.  This usually means it will be run as part of machine-specific reorg.
+True if vartrack is not to be run at its normal place.
+This usually means it will be run as part of machine-specific reorg.
+@end deftypevr
+
+@deftypevr {Target Hook} bool TARGET_NO_REGISTER_ALLOCATION
+True if register allocation and the passes
+following it should not be run.  Usually true only for virtual assembler
+targets.
 @end deftypevr
 
 @defmac ASM_OUTPUT_DWARF_DELTA (@var{stream}, @var{size}, @var{label1}, @var{label2})
Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in.orig
+++ gcc/doc/tm.texi.in
@@ -7188,6 +7188,8 @@ tables, and hence is desirable if it wor
 
 @hook TARGET_DELAY_VARTRACK
 
+@hook TARGET_NO_REGISTER_ALLOCATION
+
 @defmac ASM_OUTPUT_DWARF_DELTA (@var{stream}, @var{size}, @var{label1}, @var{label2})
 A C statement to issue assembly directives that create a difference
 @var{lab1} minus @var{lab2}, using an integer of the given @var{size}.
Index: gcc/target.def
===
--- gcc/target.def.orig
+++ gcc/target.def
@@ -5379,15 +5379,21 @@ DEFHOOKPOD
  bool, false)
 
 DEFHOOKPOD
-(delay_sched2, "True if sched2 is not to be run at its normal place.  \
+(delay_sched2, "True if sched2 is not to be run at its normal place.\n\
 This usually means it will be run as part of machine-specific reorg.",
 bool, false)
 
 DEFHOOKPOD
-(delay_vartrack, "True if vartrack is not to be run at its normal place.  \
+(delay_vartrack, "True if vartrack is not to be run at its normal place.\n\
 This usually means it will be run as part of machine-specific reorg.",
 bool, false)
 
+DEFHOOKPOD
+(no_register_allocation, "True if register allocation and the passes\n\
+following it should not be run.  Usually true only for virtual assembler\n\
+targets.",
+bool, false)
+
 /* Leave the boolean fields at the end.  */
 
 /* Close the 'struct gcc_target' definition.  */
Index: gcc/final.c
===
--- gcc/final.c.orig
+++ gcc/final.c
@@ -3129,7 +3129,7 @@ alter_subreg (rtx *xp, bool final_p)
   else
 	*xp = adjust_address_nv (y, GET_MODE (x), offset);
 }
-  else
+  else if (REG_P (y) && HARD_REGISTER_P (y))
 {
   rtx new_rtx = simplify_subreg (GET_MODE (x), y, GET_MODE (y),
  SUBREG_BYTE (x));
@@ -3816,7 +3816,8 @@ output_operand (rtx x, int code ATTRIBUT
 x = alter_subreg (&x, true);
 
   /* X must not be a pseudo reg.  */
-  gcc_assert (!x || !REG_P (x) || REGNO (x) < FIRST_PSEUDO_REGISTER);
+  if (!targetm.no_register_allocation)
+gcc_assert (!x || !REG_P (x) || REGNO (x) < FIRST_PSEUDO_REGISTER);
 
   targetm.asm_out.print_operand (asm_out_file, x, code);
 
Index: gcc/reload1.c
===
--- gcc/reload1.c.orig
+++ gcc/reload1.c
@@ -2947,6 +2947,11 @@ eliminate_regs_1 (rtx x, enum machine_mo
 rtx
 eliminate_regs (rtx x, enum machine_mode mem_mode, rtx insn)
 {
+  if (reg_eliminate == NULL)
+{
+  gcc_assert (targetm.no_register_allocation);
+  return x;
+}
   return eliminate_regs_1 (x, mem_mode, insn, false, false);
 }
 
Index: gcc/ira.c
===
--- gcc/ira.c.orig
+++ gcc/ira.c
@@ -5573,6 +5573,10 @@ public:
   {}
 
   /* opt_pass methods: */
+  virtua

Re: The nvptx port [3/11+] Struct returns

2014-10-20 Thread Bernd Schmidt

Even when returning a structure by passing an invisible reference, gcc 
still likes to set the return register to the address of the struct. 
This is undesirable on ptx where things like the return register have to 
be declared, and the function really returns void at ptx level. I've 
added a target hook to avoid this. I figure other targets might find it 
beneficial to omit this unnecessary set as well.
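
To make the two behaviours concrete, here is a hand-written C model of
what the lowered function looks like in each case.  It is only a sketch:
the hidden pointer is spelled out as an explicit parameter, and the
names (quad_demo and the two functions) are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

struct quad_demo
{
  long a, b, c, d;
};

/* Default behaviour described above: the caller passes an invisible
   pointer to the result slot, and the callee also hands that address
   back as if it were a normal pointer return value.  */
static struct quad_demo *
fill_and_return_address_demo (struct quad_demo *hidden_result, long x)
{
  assert (hidden_result != NULL);
  hidden_result->a = x;
  hidden_result->b = x + 1;
  hidden_result->c = x + 2;
  hidden_result->d = x + 3;
  return hidden_result;   /* the extra set the new hook would omit */
}

/* With the hook set to true: the result is still written through the
   invisible pointer, but the function itself returns nothing.  */
static void
fill_only_demo (struct quad_demo *hidden_result, long x)
{
  assert (hidden_result != NULL);
  hidden_result->a = x;
  hidden_result->b = x + 1;
  hidden_result->c = x + 2;
  hidden_result->d = x + 3;
}
```

In both variants the caller finds the same data in its result slot; the
only difference is whether the address is also returned.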



Bernd

	gcc/
	* target.def (omit_struct_return_reg): New data hook.
	* doc/tm.texi.in: Add @hook TARGET_OMIT_STRUCT_RETURN_REG.
	* doc/tm.texi: Regenerate.
	* function.c (expand_function_end): Use it.


Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi	(revision 422355)
+++ gcc/doc/tm.texi	(revision 422356)
@@ -4560,6 +4560,14 @@ need more space than is implied by @code
 saving and restoring an arbitrary return value.
 @end defmac
 
+@deftypevr {Target Hook} bool TARGET_OMIT_STRUCT_RETURN_REG
+Normally, when a function returns a structure by memory, the address
+is passed as an invisible pointer argument, but the compiler also
+arranges to return the address from the function like it would a normal
+pointer return value.  Define this to true if that behaviour is
+undesirable on your target.
+@end deftypevr
+
 @deftypefn {Target Hook} bool TARGET_RETURN_IN_MSB (const_tree @var{type})
 This hook should return true if values of type @var{type} are returned
 at the most significant end of a register (in other words, if they are
Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in	(revision 422355)
+++ gcc/doc/tm.texi.in	(revision 422356)
@@ -3769,6 +3769,8 @@ need more space than is implied by @code
 saving and restoring an arbitrary return value.
 @end defmac
 
+@hook TARGET_OMIT_STRUCT_RETURN_REG
+
 @hook TARGET_RETURN_IN_MSB
 
 @node Aggregate Return
Index: gcc/target.def
===
--- gcc/target.def	(revision 422355)
+++ gcc/target.def	(revision 422356)
@@ -3601,6 +3601,16 @@ structure value address at the beginning
 to emit adjusting code, you should do it at this point.",
  rtx, (tree fndecl, int incoming),
  hook_rtx_tree_int_null)
+
+DEFHOOKPOD
+(omit_struct_return_reg,
+ "Normally, when a function returns a structure by memory, the address\n\
+is passed as an invisible pointer argument, but the compiler also\n\
+arranges to return the address from the function like it would a normal\n\
+pointer return value.  Define this to true if that behaviour is\n\
+undesirable on your target.",
+ bool, false)
+
 DEFHOOK
 (return_in_memory,
  "This target hook should return a nonzero value to say to return the\n\
Index: gcc/function.c
===
--- gcc/function.c	(revision 422355)
+++ gcc/function.c	(revision 422356)
@@ -5179,8 +5179,8 @@ expand_function_end (void)
  If returning a structure PCC style,
  the caller also depends on this value.
  And cfun->returns_pcc_struct is not necessarily set.  */
-  if (cfun->returns_struct
-  || cfun->returns_pcc_struct)
+  if ((cfun->returns_struct || cfun->returns_pcc_struct)
+  && !targetm.calls.omit_struct_return_reg)
 {
   rtx value_address = DECL_RTL (DECL_RESULT (current_function_decl));
   tree type = TREE_TYPE (DECL_RESULT (current_function_decl));

[Ada] Implement pragma/aspect No_Tagged_Streams

2014-10-20 Thread Arnaud Charlet

The No_Tagged_Streams pragma (and aspect) provides a method for
selectively inhibiting the generation of stream routines for
tagged types. It can be used either in a form naming a specific
tagged type, or in a sequence of declarations to apply to all
subsequent declarations.

The following tests show the use of the pragma and the rejection
of attempts to use stream operations on affected types.

     1. with Ada.Text_IO; use Ada.Text_IO;
     2. with Ada.Text_IO.Text_Streams;
     3. use  Ada.Text_IO.Text_Streams;
     4. procedure NTS1 is
     5.    f : File_Type;
     6.    type R is tagged null record;
     7.    pragma No_Tagged_Streams (R);
     8.    RV : R;
     9. begin
    10.    R'Write (Stream (f), RV);
            |
        >>> no stream operations for "R"
            (No_Tagged_Streams at line 7)

    11. end;

     1. with Ada.Text_IO; use Ada.Text_IO;
     2. with Ada.Text_IO.Text_Streams;
     3. use  Ada.Text_IO.Text_Streams;
     4. procedure NTS2 is
     5.    pragma No_Tagged_Streams;
     6.    f : File_Type;
     7.    type R is tagged null record;
     8.    RV : R;
     9. begin
    10.    R'Write (Stream (f), RV);
            |
        >>> no stream operations for "R"
            (No_Tagged_Streams at line 5)

    11. end;

     1. with Ada.Text_IO; use Ada.Text_IO;
     2. with Ada.Text_IO.Text_Streams;
     3. use  Ada.Text_IO.Text_Streams;
     4. procedure NTS3 is
     5.    f : File_Type;
     6.    pragma No_Tagged_Streams;
     7.    type R is tagged null record;
     8.    RV : R;
     9. begin
    10.    R'Write (Stream (f), RV);
            |
        >>> no stream operations for "R"
            (No_Tagged_Streams at line 6)

    11. end;

     1. package NTS4 is
     2.    pragma No_Tagged_Streams;
     3.    type R is tagged null record;
     4. end;

     1. with Ada.Text_IO; use Ada.Text_IO;
     2. with Ada.Text_IO.Text_Streams;
     3. use  Ada.Text_IO.Text_Streams;
     4. with NTS4; use NTS4;
     5. procedure NTS4M is
     6.    f : File_Type;
     7.    RV : R;
     8. begin
     9.    R'Write (Stream (f), RV);
            |
        >>> no stream operations for "R"
            (No_Tagged_Streams at nts4.ads:2)

    10. end;

     1. with Ada.Text_IO; use Ada.Text_IO;
     2. with Ada.Text_IO.Text_Streams;
     3. use  Ada.Text_IO.Text_Streams;
     4. procedure NTS5 is
     5.    f : File_Type;
     6.    type R is tagged null record
     7.      with No_Tagged_Streams => True;
     8.    type R1 is new R with
     9.       record F : Integer; end record;
    10.    RV : R1;
    11. begin
    12.    R1'Write (Stream (f), RV);
             |
        >>> no stream operations for "R1"
            (No_Tagged_Streams at line 7)

    13. end;

The following test shows the rejection of incorrect usage

     1. pragma No_Tagged_Streams;
        |
        >>> pragma "NO_TAGGED_STREAMS" is not in
            declarative part or package spec

     2. procedure NTS6 is
     3.    type R is new Integer;
     4.    pragma No_Tagged_Streams (Entity => R);
                                               |
        >>> argument for pragma "NO_TAGGED_STREAMS"
            must be root tagged type

     5. begin
     6.    null;
     7. end;

2014-10-20  Robert Dewar  

* gnat_rm.texi: Document No_Tagged_Streams pragma and aspect.
* snames.ads-tmpl: Add entry for pragma No_Tagged_Streams.
* aspects.ads, aspects.adb: Add aspect No_Tagged_Streams.
* einfo.adb (No_Tagged_Streams_Pragma): New field.
* einfo.ads: Minor reformatting (reorder entries).
(No_Tagged_Streams_Pragma): New field.
* exp_ch3.adb: Minor comment update.
* opt.ads (No_Tagged_Streams): New variable.
* par-prag.adb: Add dummy entry for pragma No_Tagged_Streams.
* sem.ads (Save_No_Tagged_Streams): New field in scope record.
* sem_attr.adb (Check_Stream_Attribute): Check stream ops
prohibited by No_Tagged_Streams.
* sem_ch3.adb (Analyze_Full_Type_Declaration): Set
No_Tagged_Streams_Pragma.
(Analyze_Subtype_Declaration): ditto.
(Build_Derived_Record_Type): ditto.
(Record_Type_Declaration): ditto.
* sem_ch8.adb (Pop_Scope): Restore No_Tagged_Streams.
(Push_Scope): Save No_Tagged_Streams.
* sem_prag.adb (Analyze_Pragma, case No_Tagged_Streams): Implement new
pragma.

Index: aspects.adb
===
--- aspects.adb (revision 216469)
+++ aspects.adb (working copy)
@@ -546,6 +546,7 @@
 Aspect_Machine_Radix=> Aspect_Machine_Radix,
 Aspect_No_Elaboration_Code_All  => Aspect_No_Elaboration_Code_All,
 Aspect_No_Return=> Aspect_No_Return,
+Aspect_No_Tagged_Streams=> Aspect_No_Tagged_Streams,
 Aspect_Obsolescent  => Aspect_Obsolescent,
 Aspect_Object_Size  => Aspect_Object_Size,
 Aspect_Output

The nvptx port [4/11+] Post-RA pipeline

2014-10-20 Thread Bernd Schmidt

This stops most of the post-regalloc passes from being run if the target 
doesn't want register allocation. I'd previously moved them all out of 
postreload to the toplevel, but Jakub (I think) pointed out that the 
point of keeping them there is that they are not run if reload fails, 
e.g. for an invalid asm, which avoids crashes. So I've made a new 
container pass.


A later patch will make thread_prologue_and_epilogue_insns callable from 
the backend.
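
The container pass idea can be sketched in plain C as follows.  This is
a simplified model, not the pass manager's real data structures: a pass
may have a gate and a list of sub-passes, and a container whose gate
fails takes all of its sub-passes with it.  All names ending in _demo
are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pass_demo
{
  const char *name;
  bool (*gate) (void);          /* NULL means "always run" */
  const struct pass_demo *sub;  /* sub-passes, or NULL */
  size_t n_sub;
};

/* Stand-ins for the two conditions tested by the new gate.  */
static bool reload_completed_demo;
static bool no_register_allocation_demo;

static bool
gate_late_compilation_demo (void)
{
  return reload_completed_demo || no_register_allocation_demo;
}

/* Count the passes that would execute.  A gated-off container is
   skipped together with everything nested inside it.  */
static size_t
count_executed_demo (const struct pass_demo *passes, size_t n)
{
  size_t count = 0;

  assert (passes != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (passes[i].gate != NULL && !passes[i].gate ())
        continue;
      count++;
      count += count_executed_demo (passes[i].sub, passes[i].n_sub);
    }
  return count;
}
```

With both flags false (reload failed on a normal target), nothing in
the container runs; setting either flag lets the container and its
sub-passes run.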



Bernd

	gcc/
	* passes.def (pass_compute_alignments, pass_duplicate_computed_gotos,
	pass_variable_tracking, pass_free_cfg, pass_machine_reorg,
	pass_cleanup_barriers, pass_delay_slots,
	pass_split_for_shorten_branches, pass_convert_to_eh_region_ranges,
	pass_shorten_branches, pass_est_nothrow_function_flags,
	pass_dwarf2_frame, pass_final): Move outside of pass_postreload and
	into pass_late_compilation.
	(pass_late_compilation): Add.
	* passes.c (pass_data_late_compilation, pass_late_compilation,
	make_pass_late_compilation): New.
	* timevar.def (TV_LATE_COMPILATION): New.


Index: gcc/passes.def
===
--- gcc/passes.def.orig
+++ gcc/passes.def
@@ -415,6 +415,9 @@ along with GCC; see the file COPYING3.
 	  NEXT_PASS (pass_split_before_regstack);
 	  NEXT_PASS (pass_stack_regs_run);
 	  POP_INSERT_PASSES ()
+  POP_INSERT_PASSES ()
+  NEXT_PASS (pass_late_compilation);
+  PUSH_INSERT_PASSES_WITHIN (pass_late_compilation)
 	  NEXT_PASS (pass_compute_alignments);
 	  NEXT_PASS (pass_variable_tracking);
 	  NEXT_PASS (pass_free_cfg);
Index: gcc/passes.c
===
--- gcc/passes.c.orig
+++ gcc/passes.c
@@ -569,6 +569,44 @@ make_pass_postreload (gcc::context *ctxt
   return new pass_postreload (ctxt);
 }
 
+namespace {
+
+const pass_data pass_data_late_compilation =
+{
+  RTL_PASS, /* type */
+  "*all-late_compilation", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_LATE_COMPILATION, /* tv_id */
+  PROP_rtl, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_late_compilation : public rtl_opt_pass
+{
+public:
+  pass_late_compilation (gcc::context *ctxt)
+: rtl_opt_pass (pass_data_late_compilation, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *)
+  {
+return reload_completed || targetm.no_register_allocation;
+  }
+
+}; // class pass_late_compilation
+
+} // anon namespace
+
+static rtl_opt_pass *
+make_pass_late_compilation (gcc::context *ctxt)
+{
+  return new pass_late_compilation (ctxt);
+}
+
 
 
 /* Set the static pass number of pass PASS to ID and record that
Index: gcc/timevar.def
===
--- gcc/timevar.def.orig
+++ gcc/timevar.def
@@ -270,6 +270,7 @@ DEFTIMEVAR (TV_EARLY_LOCAL	 , "early
 DEFTIMEVAR (TV_OPTIMIZE		 , "unaccounted optimizations")
 DEFTIMEVAR (TV_REST_OF_COMPILATION   , "rest of compilation")
 DEFTIMEVAR (TV_POSTRELOAD	 , "unaccounted post reload")
+DEFTIMEVAR (TV_LATE_COMPILATION	 , "unaccounted late compilation")
 DEFTIMEVAR (TV_REMOVE_UNUSED	 , "remove unused locals")
 DEFTIMEVAR (TV_ADDRESS_TAKEN	 , "address taken")
 DEFTIMEVAR (TV_TODO		 , "unaccounted todo")

The nvptx port [5/11+] Variable declarations

2014-10-20 Thread Bernd Schmidt

ptx assembly follows rather different rules than what's typical 
elsewhere. We need a new hook to add a " };" string when we are finished 
outputting a variable with an initializer.
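
A small sketch of how such a hook slots in, assuming a toy emitter that
builds the output in a string buffer.  The "VAR x = {" text is a
placeholder and not real ptx syntax; the function names are likewise
invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef void (*decl_end_fn) (char *out, size_t size);

/* Append S to the NUL-terminated buffer OUT of capacity SIZE.  */
static void
append_demo (char *out, size_t size, const char *s)
{
  size_t len = strlen (out);
  assert (len < size);
  snprintf (out + len, size - len, "%s", s);
}

/* Default hook: most assemblers need no terminator.  */
static void
decl_end_default_demo (char *out, size_t size)
{
  (void) out;
  (void) size;
}

/* Hook for a target whose initializers must be closed explicitly.  */
static void
decl_end_braces_demo (char *out, size_t size)
{
  append_demo (out, size, " };");
}

/* Emit header, then data, then let the target finish the decl.  */
static void
assemble_variable_demo (char *out, size_t size, const char *header,
                        const char *data, decl_end_fn decl_end)
{
  out[0] = '\0';
  append_demo (out, size, header);
  append_demo (out, size, data);
  decl_end (out, size);
}
```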



Bernd

	gcc/
	* target.def (decl_end): New hook.
	* varasm.c (assemble_variable_contents, assemble_constant_contents):
	Use it.
	* doc/tm.texi.in (TARGET_ASM_DECL_END): Add.
	* doc/tm.texi: Regenerate.


Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi.orig
+++ gcc/doc/tm.texi
@@ -7575,6 +7575,11 @@ The default implementation of this hook
 when the relevant string is @code{NULL}.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_ASM_DECL_END (void)
+Define this hook if the target assembler requires a special marker to
+terminate an initialized variable declaration.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA (FILE *@var{file}, rtx @var{x})
 A target hook to recognize @var{rtx} patterns that @code{output_addr_const}
 can't deal with, and output assembly code to @var{file} corresponding to
Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in.orig
+++ gcc/doc/tm.texi.in
@@ -5412,6 +5412,8 @@ It must not be modified by command-line
 
 @hook TARGET_ASM_INTEGER
 
+@hook TARGET_ASM_DECL_END
+
 @hook TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA
 
 @defmac ASM_OUTPUT_ASCII (@var{stream}, @var{ptr}, @var{len})
Index: gcc/target.def
===
--- gcc/target.def.orig
+++ gcc/target.def
@@ -127,6 +127,15 @@ when the relevant string is @code{NULL}.
  bool, (rtx x, unsigned int size, int aligned_p),
  default_assemble_integer)
 
+/* Notify the backend that we have completed emitting the data for a
+   decl.  */
+DEFHOOK
+(decl_end,
+ "Define this hook if the target assembler requires a special marker to\n\
+terminate an initialized variable declaration.",
+ void, (void),
+ hook_void_void)
+
 /* Output code that will globalize a label.  */
 DEFHOOK
 (globalize_label,
Index: gcc/varasm.c
===
--- gcc/varasm.c.orig
+++ gcc/varasm.c
@@ -1945,6 +1945,7 @@ assemble_variable_contents (tree decl, c
   else
 	/* Leave space for it.  */
 	assemble_zeros (tree_to_uhwi (DECL_SIZE_UNIT (decl)));
+  targetm.asm_out.decl_end ();
 }
 }
 
@@ -3349,6 +3350,8 @@ assemble_constant_contents (tree exp, co
 
   /* Output the value of EXP.  */
   output_constant (exp, size, align);
+
+  targetm.asm_out.decl_end ();
 }
 
 /* We must output the constant data referred to by SYMBOL; do so.  */

[Ada] Improve error recovery for bad comma/semicolon in expression

2014-10-20 Thread Arnaud Charlet

This patch improves the error recovery for an errant comma or semicolon
after one condition in an expression when more conditions follow, as
shown in this example:

     1. procedure BadANDTHEN (X : Integer) is
     2. begin
     3.    if X > 10
     4.      and then X mod 4 = 2;
                                 |
        >>> extra ";" ignored

     5.      and then X mod 12 = 8
     6.    then
     7.       null;
     8.    end if;
     9. end;

Tested on x86_64-pc-linux-gnu, committed on trunk
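
The recovery relies on saving the scanner position, looking ahead, and
restoring the position if the guess was wrong.  Below is a minimal C
sketch of that technique on a toy token array; it is not a translation
of the Ada parser, and the token and function names are invented.  The
token array is assumed to end with TOK_EOF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum tok_demo
{
  TOK_COMMA, TOK_SEMI, TOK_AND, TOK_OR, TOK_THEN, TOK_ELSE,
  TOK_OTHER, TOK_EOF
};

struct scanner_demo
{
  const enum tok_demo *toks;   /* must be terminated by TOK_EOF */
  size_t pos;
};

/* If the current token is a comma or semicolon that is directly
   followed by AND THEN or OR ELSE, skip it and return true.
   Otherwise leave the scanner where it was and return false.  */
static bool
skip_errant_separator_demo (struct scanner_demo *s)
{
  assert (s != NULL && s->toks != NULL);

  enum tok_demo t = s->toks[s->pos];
  if (t != TOK_COMMA && t != TOK_SEMI)
    return false;

  size_t saved = s->pos;       /* save the scan state */
  s->pos++;                    /* scan past the separator */

  enum tok_demo op = s->toks[s->pos];
  enum tok_demo next = (op == TOK_EOF) ? TOK_EOF : s->toks[s->pos + 1];
  bool short_circuit = (op == TOK_AND && next == TOK_THEN)
                       || (op == TOK_OR && next == TOK_ELSE);

  if (short_circuit)
    return true;               /* separator is treated as ignored */

  s->pos = saved;              /* restore the scan state */
  return false;
}
```

As in the patch, plain AND and OR are deliberately not treated as a
signal that the separator was a mistake.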

2014-10-20  Robert Dewar  

* par-ch4.adb (P_Expression): Handle extraneous comma/semicolon
in middle of expression with logical operators.

Index: par-ch4.adb
===
--- par-ch4.adb (revision 216469)
+++ par-ch4.adb (working copy)
@@ -1708,6 +1708,48 @@
 Node1 := New_Op_Node (Logical_Op, Op_Location);
 Set_Left_Opnd (Node1, Node2);
 Set_Right_Opnd (Node1, P_Relation);
+
+--  Check for case of errant comma or semicolon
+
+if Token = Tok_Comma or else Token = Tok_Semicolon then
+   declare
+  Com: constant Boolean := Token = Tok_Comma;
+  Scan_State : Saved_Scan_State;
+  Logop  : Node_Kind;
+
+   begin
+  Save_Scan_State (Scan_State); -- at comma/semicolon
+  Scan; -- past comma/semicolon
+
+  --  Check for AND THEN or OR ELSE after comma/semicolon. We
+  --  do not deal with AND/OR because those cases get mixed up
+  --  with the select alternatives case.
+
+  if Token = Tok_And or else Token = Tok_Or then
+ Logop := P_Logical_Operator;
+ Restore_Scan_State (Scan_State); -- to comma/semicolon
+
+ if Nkind_In (Logop, N_And_Then, N_Or_Else) then
+Scan; -- past comma/semicolon
+
+if Com then
+   Error_Msg_SP -- CODEFIX
+ ("|extra "","" ignored");
+else
+   Error_Msg_SP -- CODEFIX
+ ("|extra "";"" ignored");
+end if;
+
+ else
+Restore_Scan_State (Scan_State); -- to comma/semicolon
+ end if;
+
+  else
+ Restore_Scan_State (Scan_State); -- to comma/semicolon
+  end if;
+   end;
+end if;
+
 exit when Token not in Token_Class_Logop;
  end loop;

[Ada] Improve recognition of misspelled aspects

2014-10-20 Thread Arnaud Charlet

As shown by this example, the recognition of misspelled aspects is
improved:

     1. package UnrecogAs with Prelaborate is
                               |
        >>> "Prelaborate" is not a valid aspect identifier
        >>> possible misspelling of "Preelaborate"

     2.    type R is tagged null record;
     3. end;

Tested on x86_64-pc-linux-gnu, committed on trunk
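
For readers unfamiliar with this kind of check, here is a simplified C
sketch of a "near miss" test: one substituted, missing, extra, or
transposed character.  This is an assumption about the general
technique, not a description of what GNAT's Is_Bad_Spelling_Of actually
does, and is_near_miss_demo is an invented name.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if FOUND differs from EXPECT by exactly one simple
   typing error.  Identical strings are not a misspelling.  */
static bool
is_near_miss_demo (const char *found, const char *expect)
{
  assert (found != NULL && expect != NULL);

  size_t fn = strlen (found), en = strlen (expect);
  size_t i = 0;

  /* Skip the common prefix.  */
  while (i < fn && i < en && found[i] == expect[i])
    i++;

  if (i == fn && i == en)
    return false;

  if (fn == en)
    {
      /* One substituted character.  */
      if (strcmp (found + i + 1, expect + i + 1) == 0)
        return true;
      /* Two adjacent characters transposed.  */
      return i + 1 < fn
             && found[i] == expect[i + 1]
             && found[i + 1] == expect[i]
             && strcmp (found + i + 2, expect + i + 2) == 0;
    }

  if (fn + 1 == en)   /* one character missing from FOUND */
    return strcmp (found + i, expect + i + 1) == 0;

  if (fn == en + 1)   /* one extra character in FOUND */
    return strcmp (found + i + 1, expect + i) == 0;

  return false;
}
```

The comparison here is case-sensitive for brevity; a real
implementation for Ada identifiers would need to ignore case.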

2014-10-20  Robert Dewar  

* par-ch13.adb (Possible_Misspelled_Aspect): New function.


Index: par-ch13.adb
===
--- par-ch13.adb(revision 216469)
+++ par-ch13.adb(working copy)
@@ -45,6 +45,26 @@
   Scan_State : Saved_Scan_State;
   Result : Boolean;
 
+  function Possible_Misspelled_Aspect return Boolean;
+  --  Returns True, if Token_Name is a misspelling of some aspect name
+
+  
+  -- Possible_Misspelled_Aspect --
+  
+
+  function Possible_Misspelled_Aspect return Boolean is
+  begin
+ for J in Aspect_Id_Exclude_No_Aspect loop
+if Is_Bad_Spelling_Of (Token_Name, Aspect_Names (J)) then
+   return True;
+end if;
+ end loop;
+
+ return False;
+  end Possible_Misspelled_Aspect;
+
+   --  Start of processing for Aspect_Specifications_Present
+
begin
   --  Definitely must have WITH to consider aspect specs to be present
 
@@ -74,17 +94,20 @@
   if Token /= Tok_Identifier then
  Result := False;
 
-  --  This is where we pay attention to the Strict mode. Normally when we
-  --  are in Ada 2012 mode, Strict is False, and we consider that we have
-  --  an aspect specification if the identifier is an aspect name (even if
-  --  not followed by =>) or the identifier is not an aspect name but is
-  --  followed by =>, by a comma, or by a semicolon. The last two cases
-  --  correspond to (misspelled) Boolean aspects with a defaulted value of
-  --  True. P_Aspect_Specifications will generate messages if the aspect
+  --  This is where we pay attention to the Strict mode. Normally when
+  --  we are in Ada 2012 mode, Strict is False, and we consider that we
+  --  have an aspect specification if the identifier is an aspect name
+  --  or a likely misspelling of one (even if not followed by =>) or
+  --  the identifier is not an aspect name but is followed by =>, by
+  --  a comma, or by a semicolon. The last two cases correspond to
+  --  (misspelled) Boolean aspects with a defaulted value of True.
+  --  P_Aspect_Specifications will generate messages if the aspect
   --  specification is ill-formed.
 
   elsif not Strict then
- if Get_Aspect_Id (Token_Name) /= No_Aspect then
+ if Get_Aspect_Id (Token_Name) /= No_Aspect
+   or else Possible_Misspelled_Aspect
+ then
 Result := True;
  else
 Scan; -- past identifier

The nvptx port [6/11+] Pseudo call args

2014-10-20 Thread Bernd Schmidt

On ptx, we'll be using pseudos to pass function args as well, and 
there's one assert that needs to be toned down to make that work.



Bernd

	gcc/
	* expr.c (use_reg_mode): Just return for pseudo registers.


Index: gcc/expr.c
===
--- gcc/expr.c	(revision 422421)
+++ gcc/expr.c	(revision 422422)
@@ -2321,7 +2321,10 @@ copy_blkmode_to_reg (enum machine_mode m
 void
 use_reg_mode (rtx *call_fusage, rtx reg, enum machine_mode mode)
 {
-  gcc_assert (REG_P (reg) && REGNO (reg) < FIRST_PSEUDO_REGISTER);
+  gcc_assert (REG_P (reg));
+
+  if (!HARD_REGISTER_P (reg))
+return;
 
   *call_fusage
 = gen_rtx_EXPR_LIST (mode, gen_rtx_USE (VOIDmode, reg), *call_fusage);

The nvptx port [7/11+] Inform the port about call arguments

2014-10-20 Thread Bernd Schmidt

In ptx assembly we need to decorate call insns with the arguments that 
are being passed. We also need to know the exact function type. This is 
kind of hard to do with the existing infrastructure since things like 
function_arg are called at other times rather than just when emitting a 
call, so this patch adds two more hooks, one called just before argument 
registers are loaded (once for each arg), and the other just after the 
call is complete.
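
The calling sequence of the two hooks can be modelled like this.  It is
a sketch under simplifying assumptions: arguments are plain strings, a
global log stands in for emitted RTL, and the "pc" placeholder models
the pc_rtx convention mentioned in the hook documentation.  All _demo
names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static char log_demo[256];

/* Append S to the log, checking that it fits.  */
static void
log_append_demo (const char *s)
{
  assert (strlen (log_demo) + strlen (s) < sizeof log_demo);
  strcat (log_demo, s);
}

/* Invoked once per argument, before the argument is loaded.  */
static void
call_args_demo (const char *arg, const char *fntype)
{
  (void) fntype;               /* a real port would inspect the type */
  log_append_demo ("arg:");
  log_append_demo (arg);
  log_append_demo (";");
}

/* Invoked once, after the call has been emitted.  */
static void
end_call_args_demo (void)
{
  log_append_demo ("end;");
}

static void
expand_call_demo (const char *fntype, const char *const *args,
                  size_t n_args)
{
  log_demo[0] = '\0';
  if (n_args == 0)
    call_args_demo ("pc", fntype);   /* no arguments: one dummy call */
  for (size_t i = 0; i < n_args; i++)
    call_args_demo (args[i], fntype);
  log_append_demo ("call;");
  end_call_args_demo ();
}
```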



Bernd

	gcc/
	* target.def (call_args, end_call_args): New hooks.
	* hooks.c (hook_void_rtx_tree): New empty function.
	* hooks.h (hook_void_rtx_tree): Declare.
	* doc/tm.texi.in (TARGET_CALL_ARGS, TARGET_END_CALL_ARGS): Add.
	* doc/tm.texi: Regenerate.
	* calls.c (expand_call): Slightly rearrange the code.  Use the two new
	hooks.
	(expand_library_call_value_1): Use the two new hooks.


Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi.orig
+++ gcc/doc/tm.texi
@@ -5027,6 +5027,29 @@ except the last are treated as named.
 You need not define this hook if it always returns @code{false}.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_CALL_ARGS (rtx, @var{tree})
+While generating RTL for a function call, this target hook is invoked once
+for each argument passed to the function, either a register returned by
+@code{TARGET_FUNCTION_ARG} or a memory location.  It is called just
+before the point where argument registers are stored.  The type of the
+function to be called is also passed as the second argument; it is
+@code{NULL_TREE} for libcalls.  The @code{TARGET_END_CALL_ARGS} hook is
+invoked just after the code to copy the return reg has been emitted.
+This functionality can be used to perform special setup of call argument
+registers if a target needs it.
+For functions without arguments, the hook is called once with @code{pc_rtx}
+passed instead of an argument register.
+Most ports do not need to implement anything for this hook.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_END_CALL_ARGS (void)
+This target hook is invoked while generating RTL for a function call,
+just after the point where the return reg is copied into a pseudo.  It
+signals that all the call argument and return registers for the just
+emitted call are now no longer in use.
+Most ports do not need to implement anything for this hook.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_PRETEND_OUTGOING_VARARGS_NAMED (cumulative_args_t @var{ca})
 If you need to conditionally change ABIs so that one works with
 @code{TARGET_SETUP_INCOMING_VARARGS}, but the other works like neither
Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in.orig
+++ gcc/doc/tm.texi.in
@@ -3929,6 +3929,10 @@ These machine description macros help im
 
 @hook TARGET_STRICT_ARGUMENT_NAMING
 
+@hook TARGET_CALL_ARGS
+
+@hook TARGET_END_CALL_ARGS
+
 @hook TARGET_PRETEND_OUTGOING_VARARGS_NAMED
 
 @node Trampolines
Index: gcc/hooks.c
===
--- gcc/hooks.c.orig
+++ gcc/hooks.c
@@ -245,6 +245,11 @@ hook_void_tree (tree a ATTRIBUTE_UNUSED)
 }
 
 void
+hook_void_rtx_tree (rtx, tree)
+{
+}
+
+void
 hook_void_constcharptr (const char *a ATTRIBUTE_UNUSED)
 {
 }
Index: gcc/hooks.h
===
--- gcc/hooks.h.orig
+++ gcc/hooks.h
@@ -70,6 +70,7 @@ extern void hook_void_constcharptr (cons
 extern void hook_void_rtx_int (rtx, int);
 extern void hook_void_FILEptr_constcharptr (FILE *, const char *);
 extern bool hook_bool_FILEptr_rtx_false (FILE *, rtx);
+extern void hook_void_rtx_tree (rtx, tree);
 extern void hook_void_tree (tree);
 extern void hook_void_tree_treeptr (tree, tree *);
 extern void hook_void_int_int (int, int);
Index: gcc/target.def
===
--- gcc/target.def.orig
+++ gcc/target.def
@@ -3825,6 +3825,33 @@ not generate any instructions in this ca
  default_setup_incoming_varargs)
 
 DEFHOOK
+(call_args,
+ "While generating RTL for a function call, this target hook is invoked once\n\
+for each argument passed to the function, either a register returned by\n\
+@code{TARGET_FUNCTION_ARG} or a memory location.  It is called just\n\
+before the point where argument registers are stored.  The type of the\n\
+function to be called is also passed as the second argument; it is\n\
+@code{NULL_TREE} for libcalls.  The @code{TARGET_END_CALL_ARGS} hook is\n\
+invoked just after the code to copy the return reg has been emitted.\n\
+This functionality can be used to perform special setup of call argument\n\
+registers if a target needs it.\n\
+For functions without arguments, the hook is called once with @code{pc_rtx}\n\
+passed instead of an argument register.\n\
+Most ports do not need to implement anything for this hook.",
+ void, (rtx, tree)

The nvptx port [8/11+] Write undefined decls.

2014-10-20 Thread Bernd Schmidt

ptx assembly requires that declarations are written for undefined 
variables. This adds that functionality.
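
As a sketch of the intended control flow (with invented names and a
made-up output format, not real ptx directives): when walking the
variable list, defined variables are emitted as usual and undefined
ones get a declaration instead of being silently skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct var_demo
{
  const char *name;
  bool has_definition;
};

/* Write one line per variable into OUT and return how many variables
   were only declared rather than defined.  */
static size_t
output_variables_demo (const struct var_demo *vars, size_t n,
                       char *out, size_t size)
{
  size_t n_declared = 0;

  assert (out != NULL && size > 0);
  out[0] = '\0';
  for (size_t i = 0; i < n; i++)
    {
      size_t len = strlen (out);
      if (vars[i].has_definition)
        snprintf (out + len, size - len, "define %s\n", vars[i].name);
      else
        {
          /* This branch models the new hook being called.  */
          snprintf (out + len, size - len, "declare %s\n", vars[i].name);
          n_declared++;
        }
    }
  return n_declared;
}
```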



Bernd

	gcc/
	* target.def (assemble_undefined_decl): New hook.
	* hooks.c (hook_void_FILEptr_constcharptr_const_tree): New function.
	* hooks.h (hook_void_FILEptr_constcharptr_const_tree): Declare.
	* doc/tm.texi.in (TARGET_ASM_ASSEMBLE_UNDEFINED_DECL): Add.
	* doc/tm.texi: Regenerate.
	* output.h (assemble_undefined_decl): Declare.
	(get_fnname_from_decl): Declare.
	* varasm.c (assemble_undefined_decl): New function.
	(get_fnname_from_decl): New function.
	* final.c (rest_of_handle_final): Use it.
	* varpool.c (varpool_output_variables): Call assemble_undefined_decl
	for nodes without a definition.


Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi.orig
+++ gcc/doc/tm.texi
@@ -7899,6 +7902,13 @@ global; that is, available for reference
 The default implementation uses the TARGET_ASM_GLOBALIZE_LABEL target hook.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_ASM_ASSEMBLE_UNDEFINED_DECL (FILE *@var{stream}, const char *@var{name}, const_tree @var{decl})
+This target hook is a function to output to the stdio stream
+@var{stream} some commands that will declare the name associated with
+@var{decl} which is not defined in the current translation unit.  Most
+assemblers do not require anything to be output in this case.
+@end deftypefn
+
 @defmac ASM_WEAKEN_LABEL (@var{stream}, @var{name})
 A C statement (sans semicolon) to output to the stdio stream
 @var{stream} some commands that will make the label @var{name} weak;
Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in.orig
+++ gcc/doc/tm.texi.in
@@ -5693,6 +5693,8 @@ You may wish to use @code{ASM_OUTPUT_SIZ
 
 @hook TARGET_ASM_GLOBALIZE_DECL_NAME
 
+@hook TARGET_ASM_ASSEMBLE_UNDEFINED_DECL
+
 @defmac ASM_WEAKEN_LABEL (@var{stream}, @var{name})
 A C statement (sans semicolon) to output to the stdio stream
 @var{stream} some commands that will make the label @var{name} weak;
Index: gcc/hooks.c
===
--- gcc/hooks.c.orig
+++ gcc/hooks.c
@@ -139,6 +139,13 @@ hook_void_FILEptr_constcharptr (FILE *a
 {
 }
 
+/* Generic hook that takes (FILE *, const char *, constr_tree *) and does
+   nothing.  */
+void
+hook_void_FILEptr_constcharptr_const_tree (FILE *, const char *, const_tree)
+{
+}
+
 /* Generic hook that takes (FILE *, rtx) and returns false.  */
 bool
 hook_bool_FILEptr_rtx_false (FILE *a ATTRIBUTE_UNUSED,
Index: gcc/hooks.h
===
--- gcc/hooks.h.orig
+++ gcc/hooks.h
@@ -69,6 +69,8 @@ extern void hook_void_void (void);
 extern void hook_void_constcharptr (const char *);
 extern void hook_void_rtx_int (rtx, int);
 extern void hook_void_FILEptr_constcharptr (FILE *, const char *);
+extern void hook_void_FILEptr_constcharptr_const_tree (FILE *, const char *,
+		   const_tree);
 extern bool hook_bool_FILEptr_rtx_false (FILE *, rtx);
 extern void hook_void_rtx (rtx);
 extern void hook_void_tree (tree);
Index: gcc/target.def
===
--- gcc/target.def.orig
+++ gcc/target.def
@@ -158,6 +158,16 @@ global; that is, available for reference
 The default implementation uses the TARGET_ASM_GLOBALIZE_LABEL target hook.",
  void, (FILE *stream, tree decl), default_globalize_decl_name)
 
+/* Output code that will declare an external variable.  */
+DEFHOOK
+(assemble_undefined_decl,
+ "This target hook is a function to output to the stdio stream\n\
+@var{stream} some commands that will declare the name associated with\n\
+@var{decl} which is not defined in the current translation unit.  Most\n\
+assemblers do not require anything to be output in this case.",
+ void, (FILE *stream, const char *name, const_tree decl),
+ hook_void_FILEptr_constcharptr_const_tree)
+
 /* Output code that will emit a label for unwind info, if this
target requires such labels.  Second argument is the decl the
unwind info is associated with, third is a boolean: true if
Index: gcc/final.c
===
--- gcc/final.c.orig
+++ gcc/final.c
@@ -4434,17 +4434,7 @@ leaf_renumber_regs_insn (rtx in_rtx)
 static unsigned int
 rest_of_handle_final (void)
 {
-  rtx x;
-  const char *fnname;
-
-  /* Get the function's name, as described by its RTL.  This may be
- different from the DECL_NAME name used in the source file.  */
-
-  x = DECL_RTL (current_function_decl);
-  gcc_assert (MEM_P (x));
-  x = XEXP (x, 0);
-  gcc_assert (GET_CODE (x) == SYMBOL_REF);
-  fnname = XSTR (x, 0);
+  const char *fnname = get_fnname_from_decl (current_function_decl);
 
   assemble_start_function (current_function_decl, fnname);
   final_start_function (get_insn

[Ada] Crash on unconstrained unchecked union declaration

2014-10-20 Thread Arnaud Charlet

When an object declaration has an indefinite type, the actual subtype of the
object is constructed from the expression itself. If the type is an unchecked
union, such a subtype cannot be constructed because discriminants cannot be
retrieved from the expression. In this case, rewrite the declaration as a
renaming declaration, where the back-end does not require a definite subtype.

Compiling and executing foo.adb must yield:


   TRUE
   FALSE
   TRUE
   'A'
    11
    100

---
with Text_IO; use Text_IO;
procedure Foo is
   type Rec_Type (I : Integer ) is record
  B : Boolean;
  case I is
 when 0 =>
null;
 when 1 .. 10 =>
C : Character;
 when others =>
N : Natural;
  end case;
   end record;
   pragma Unchecked_Union (Rec_Type);

   function Get (I : Integer) return Rec_Type is
   begin
 case I is
when 0   =>  return (I => 0, B => True);
when 1 .. 10 =>  return (I => 1, B => False, C => 'A');
when others  =>  return (I => 11, B => True, N => abs I);
 end case;
   end Get;

   procedure Nop (R : Rec_Type) is
   begin
  Put_Line (Boolean'Image (R.B));
   end Nop;

   R0 : constant Rec_Type  := Get (0);
   R1 : constant Rec_Type  := Get (1);
   R11 : constant Rec_Type  := Get (11);
   R100 : Rec_Type := Get (100);

begin
   Nop (R0);
   Nop (R1);
   Nop (R11);
   Put_Line (Character'Image (R1.C));
   Put_Line (Integer'Image (R11.N));
   Put_Line (Integer'Image (R100.N));
end Foo;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-20  Ed Schonberg  

* sem_ch3.adb (Analyze_Object_Declaration): If the type is
an unconstrained unchecked_union type, rewrite declaration
as a renaming to prevent attempt to retrieve non-existent
discriminants from expression.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 216476)
+++ sem_ch3.adb (working copy)
@@ -3839,6 +3839,32 @@
 elsif GNATprove_Mode then
null;
 
+--  If the type is an unchecked union, no subtype can be built from
+--  the expression. Rewrite declaration as a renaming, which the
+--  back-end can handle properly. This is a rather unusual case,
+--  because most unchecked_union declarations have default values
+--  for discriminants and are thus unconstrained.
+
+elsif Is_Unchecked_Union (T) then
+   if Constant_Present (N)
+ or else Nkind (E) = N_Function_Call
+   then
+  Set_Ekind (Id, E_Constant);
+   else
+  Set_Ekind (Id, E_Variable);
+   end if;
+
+   Rewrite (N,
+ Make_Object_Renaming_Declaration (Loc,
+Defining_Identifier => Id,
+Subtype_Mark => New_Occurrence_Of (T, Loc),
+Name => E));
+
+   Set_Renamed_Object (Id, E);
+   Freeze_Before (N, T);
+   Set_Is_Frozen (Id);
+   return;
+
 else
Expand_Subtype_From_Expr (N, T, Object_Definition (N), E);
Act_T := Find_Type_Of_Object (Object_Definition (N), N);

The nvptx port [9/11+] Epilogues

2014-10-20 Thread Bernd Schmidt

We skip the late compilation passes on ptx, but there's one piece we do 
need - fixing up the function so that we get return insns in the right 
places. This patch just makes thread_prologue_and_epilogue_insns 
callable from the reorg pass.



Bernd
	gcc/
	* function.c (thread_prologue_and_epilogue_insns): No longer static.
	* function.h (thread_prologue_and_epilogue_insns): Declare.


Index: gcc/function.c
===
--- gcc/function.c	(revision 422424)
+++ gcc/function.c	(revision 422425)
@@ -5945,7 +5945,7 @@ emit_return_for_exit (edge exit_fallthru
in a sibcall omit the sibcall_epilogue if the block is not in
ANTIC.  */
 
-static void
+void
 thread_prologue_and_epilogue_insns (void)
 {
   bool inserted;
Index: gcc/function.h
===
--- gcc/function.h	(revision 422424)
+++ gcc/function.h	(revision 422425)
@@ -773,6 +773,8 @@ extern void free_after_compilation (stru
 
 extern void init_varasm_status (void);
 
+extern void thread_prologue_and_epilogue_insns (void);
+
 #ifdef RTX_CODE
 extern void diddle_return_value (void (*)(rtx, void*), void*);
 extern void clobber_return_register (void);

The nvptx port [10/11+] Target files

2014-10-20 Thread Bernd Schmidt

These are the main target files for the ptx port. t-nvptx is empty for 
now but will grow some content with follow-up patches.



Bernd


	* configure.ac: Allow configuring lto for nvptx.
	* configure: Regenerate.

	gcc/
	* config/nvptx/nvptx.c: New file.
	* config/nvptx/nvptx.h: New file.
	* config/nvptx/nvptx-protos.h: New file.
	* config/nvptx/nvptx.md: New file.
	* config/nvptx/t-nvptx: New file.
	* config/nvptx/nvptx.opt: New file.
	* common/config/nvptx/nvptx-common.c: New file.
	* config.gcc: Handle nvptx-*-*.

	libgcc/
	* config.host: Handle nvptx-*-*.
	* config/nvptx/t-nvptx: New file.
	* config/nvptx/crt0.s: New file.


Index: gcc/common/config/nvptx/nvptx-common.c
===
--- /dev/null
+++ gcc/common/config/nvptx/nvptx-common.c
@@ -0,0 +1,38 @@
+/* NVPTX common hooks.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   Contributed by Bernd Schmidt 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "diagnostic-core.h"
+#include "tm.h"
+#include "tm_p.h"
+#include "common/common-target.h"
+#include "common/common-target-def.h"
+#include "opts.h"
+#include "flags.h"
+
+#undef TARGET_HAVE_NAMED_SECTIONS
+#define TARGET_HAVE_NAMED_SECTIONS false
+
+#undef TARGET_DEFAULT_TARGET_FLAGS
+#define TARGET_DEFAULT_TARGET_FLAGS MASK_ABI64
+
+struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
Index: gcc/config.gcc
===
--- gcc/config.gcc.orig
+++ gcc/config.gcc
@@ -420,6 +420,9 @@ nios2-*-*)
 	cpu_type=nios2
 	extra_options="${extra_options} g.opt"
 	;;
+nvptx-*-*)
+	cpu_type=nvptx
+	;;
 powerpc*-*-*)
 	cpu_type=rs6000
 	extra_headers="ppc-asm.h altivec.h spe.h ppu_intrinsics.h paired.h spu2vmx.h vec_types.h si2vmx.h htmintrin.h htmxlintrin.h"
@@ -2148,6 +2151,10 @@ nios2-*-*)
 		;;
 esac
 	;;
+nvptx-*)
+	tm_file="${tm_file} newlib-stdint.h"
+	tmake_file="nvptx/t-nvptx"
+	;;
 pdp11-*-*)
 	tm_file="${tm_file} newlib-stdint.h"
 	use_gcc_stdint=wrap
Index: gcc/config/nvptx/nvptx.c
===
--- /dev/null
+++ gcc/config/nvptx/nvptx.c
@@ -0,0 +1,2024 @@
+/* Target code for NVPTX.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   Contributed by Bernd Schmidt 
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "rtl.h"
+#include "tree.h"
+#include "insn-flags.h"
+#include "output.h"
+#include "insn-attr.h"
+#include "insn-codes.h"
+#include "expr.h"
+#include "regs.h"
+#include "optabs.h"
+#include "recog.h"
+#include "ggc.h"
+#include "timevar.h"
+#include "tm_p.h"
+#include "tm-preds.h"
+#include "tm-constrs.h"
+#include "function.h"
+#include "langhooks.h"
+#include "dbxout.h"
+#include "target.h"
+#include "target-def.h"
+#include "diagnostic.h"
+#include "basic-block.h"
+#include "stor-layout.h"
+#include "calls.h"
+#include "df.h"
+#include "builtins.h"
+#include "hashtab.h"
+#include 
+
+/* Record the function decls we've written, and the libfuncs and function
+   decls corresponding to them.  */
+static std::stringstream func_decls;
+static GTY((if_marked ("ggc_marked_p"), param_is (struct rtx_def)))
+  htab_t declared_libfuncs_htab;
+static GTY((if_marked ("ggc_marked_p"), param_is (union tree_node)))
+  htab_t declared_fndecls_htab;
+static GTY((if_marked ("ggc_marked_p"), param_is (union tree_node)))
+  htab_t needed_fndecls_htab;
+
+/* Allocate a new, cleared machine_function structure.  */
+
+static struct machine_function *
+nvptx_init_machine_status (vo

[Ada] Slices of parameterless calls

2014-10-20 Thread Arnaud Charlet

This patch handles correctly constructs of the forms F (T) where F denotes
a possibly overloaded function that can be invoked without actual parameters,
and T denotes a discrete type. The construct is parsed as an indexed component
but must be rewritten and analyzed as a slice of a parameterless call.

The following must compile quietly:

   gcc -c adf.adb

---
with Data_Stores;
with Sp_Adf_Types;
package Adf is
   PACKAGE selection IS

  PACKAGE on IS NEW Data_Stores.internal_array
   (index_type => sp_adf_types.boolean_selection_type,
data_type  => boolean);
   end Selection;
   procedure Frequency;
end Adf;
---
package body Adf is
   Previous_Selection : Selection.On.Data_Store_Type;
   procedure Frequency is separate;
end Adf;
---
separate (Adf)
procedure Frequency is
   P : String (1..3) := Selection.On.Get (1..3);
begin
   Previous_Selection (Sp_Adf_Types.Frequency_Type) :=
   Selection.On.Get (Sp_Adf_Types.Frequency_Type);
end Frequency;
---
package body Data_Stores is
   PACKAGE body internal_array IS

  Store : Data_Store_Type;
  PROCEDURE init (value : IN data_type := default) is
  begin
 Store := (others => Value);
  end;

  PROCEDURE put (index : IN index_type; data : IN data_type) is
  begin
 Store (Index) := Data;
  end;

  PROCEDURE put (data : IN data_store_type) is
  begin
 null;
  end;

  FUNCTION get (index : IN index_type) RETURN data_type is
  begin
 return Store (Index);
  end;

  FUNCTION get RETURN data_store_type is
  begin
 return Store;
  end;

  FUNCTION get RETURN String is
  begin
 return "What a wonderful morning";
  end;
   END internal_array;
end Data_Stores;
---
package Data_Stores is
   GENERIC
  TYPE index_type IS (<>);
  TYPE data_type IS (<>);
  default : data_type := data_type'first;

   PACKAGE internal_array IS

  TYPE data_store_type IS ARRAY (index_type) OF data_type;

  PROCEDURE init (value : IN data_type := default);

  PROCEDURE put (index : IN index_type; data : IN data_type);

  PROCEDURE put (data : IN data_store_type);

  FUNCTION get (index : IN index_type) RETURN data_type;

  FUNCTION get RETURN data_store_type;
  FUNCTION get RETURN String;

   END internal_array;
end Data_Stores;
---
package Sp_Adf_Types is
   TYPE operation_type IS (bearing, validity, control, power,
   tone, identify, adf, test, frequency,
   last_frequency, emergency_500_frequency,
   emergency_2182_frequency, tune, heading_bug);
   SUBTYPE boolean_selection_type IS operation_type RANGE test .. heading_bug;
   SUBTYPE frequency_type IS operation_type RANGE frequency .. tune;
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-10-20  Ed Schonberg  

* sem_ch4.adb (Process_Function_Call): If the first actual
denotes a discrete type, the node must be interpreted as a slice
of an array returned by a parameterless call.

Index: sem_ch4.adb
===
--- sem_ch4.adb (revision 216469)
+++ sem_ch4.adb (working copy)
@@ -2156,6 +2156,7 @@
   ---
 
   procedure Process_Function_Call is
+ Loc: constant Source_Ptr := Sloc (N);
  Actual : Node_Id;
 
   begin
@@ -2187,7 +2188,26 @@
 --  subsequent crashes or loops if there is an attempt to continue
 --  analysis of the program.
 
-Next (Actual);
+--  If there is a single actual and it is a type name, the node
+--  can only be interpreted as a slice of a parameterless call.
+--  Rebuild the node as such and analyze.
+
+if No (Next (Actual))
+  and then Is_Entity_Name (Actual)
+  and then Is_Type (Entity (Actual))
+  and then Is_Discrete_Type (Entity (Actual))
+then
+   Replace (N,
+  Make_Slice (Loc,
+Prefix => P,
+Discrete_Range =>
+   New_Occurrence_Of (Entity (Actual), Loc)));
+   Analyze (N);
+   return;
+
+else
+   Next (Actual);
+end if;
  end loop;
 
  Analyze_Call (N);

The nvptx port [11/11] More tools.

2014-10-20 Thread Bernd Schmidt

This is a "bonus" optional patch which adds ar, ranlib, as and ld to the 
ptx port. This is not proper binutils; ar and ranlib are just linked to 
the host versions, and the other two tools have the following functions:


* nvptx-as is required to convert the compiler output to actual valid
  ptx assembly, primarily by reordering declarations and definitions.
  Believe me when I say that I've tried to make that work in the
  compiler itself and it's pretty much impossible without some really
  invasive changes.
* nvptx-ld is just a pseudo linker that works by concatenating ptx
  input files and separating them with nul characters. Actual linking
  is something that happens later, when calling CUDA library functions,
  but existing build systems make it useful to have something called
  "ld" which is able to bundle everything that's needed into a single
  file, and this seemed to be the simplest way of achieving this.

There's a toplevel configure.ac change necessary to make ar/ranlib 
useable by the libgcc build. Having some tools built like this has some 
precedent in t-vmsnative, but as Thomas noted it does make feature tests 
in gcc's configure somewhat ugly (but everything works well enough to 
build the compiler). The alternative here is to bundle all these files 
into a separate nvptx-tools package which users would have to download - 
something that would be nice to avoid.


These tools currently require GNU extensions - something I probably 
ought to fix if we decide to add them to the gcc build itself.



Bernd

	* configure.ac (AR_FOR_TARGET, RANLIB_FOR_TARGET): If nvptx-*,
	look for them in the gcc build directory.
	* configure: Regenerate.

	gcc/
	* config.gcc (nvptx-*): Define extra_programs.
	* config/nvptx/nvptx-as.c: New file.
	* config/nvptx/nvptx-ld.c: New file.
	* config/nvptx/t-nvptx (nvptx-ld.o, nvptx-as.o, collect-ld$(exeext),
	as$(exeext), ar$(exeext), ranlib$(exeext)): New rules.

Index: git/gcc/config.gcc
===
--- git.orig/gcc/config.gcc
+++ git/gcc/config.gcc
@@ -2154,6 +2154,7 @@ nios2-*-*)
 nvptx-*)
 	tm_file="${tm_file} newlib-stdint.h"
 	tmake_file="nvptx/t-nvptx"
+	extra_programs="collect-ld\$(exeext) as\$(exeext) ar\$(exeext) ranlib\$(exeext)"
 	;;
 pdp11-*-*)
 	tm_file="${tm_file} newlib-stdint.h"
Index: git/gcc/config/nvptx/nvptx-as.c
===
--- /dev/null
+++ git/gcc/config/nvptx/nvptx-as.c
@@ -0,0 +1,961 @@
+/* An "assembler" for ptx.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   Contributed by Nathan Sidwell 
+   Contributed by Bernd Schmidt 
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+/* Munges gcc-generated PTX assembly so that it becomes acceptable for ptxas.
+
+   This is not a complete assembler.  We presume the source is well
+   formed from the compiler and can die horribly if it is not.  */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#define obstack_chunk_alloc malloc
+#define obstack_chunk_free free
+#include 
+#define HAVE_DECL_BASENAME 1
+#include 
+#include 
+
+#include 
+
+static const char *outname = NULL;
+
+static void __attribute__ ((format (printf, 1, 2)))
+fatal_error (const char * cmsgid, ...)
+{
+  va_list ap;
+
+  va_start (ap, cmsgid);
+  fprintf (stderr, "nvptx-as: ");
+  vfprintf (stderr, cmsgid, ap);
+  fprintf (stderr, "\n");
+  va_end (ap);
+
+  unlink (outname);
+  exit (1);
+}
+
+struct Stmt;
+
+class symbol
+{
+ public:
+  symbol (const char *k) : key (k), stmts (0), pending (0), emitted (0)
+{ }
+
+  /* The name of the symbol.  */
+  const char *key;
+  /* A linked list of dependencies for the initializer.  */
+  std::list deps;
+  /* The statement in which it is defined.  */
+  struct Stmt *stmts;
+  bool pending;
+  bool emitted;
+};
+
+/* Hash and comparison functions for these hash tables.  */
+
+static int hash_string_eq (const void *, const void *);
+static hashval_t hash_string_hash (const void *);
+
+static int
+hash_string_eq (const void *s1_p, const void *s2_p)
+{
+  const char *const *s1 = (const char *const *) s1_p;
+  const char *s2 = (const char *) s2_p;
+  return strcmp (*s1, s2) == 0;
+}
+
+static hashval_t
+hash_string_hash (const void *s_p)
+{
+  const char *const *s = (const char *const *) s_p;
+  return (*htab_hash_

Re: [PATCH i386 AVX512] [56/n] Add plus/minus/abs/neg/andnot insn patterns.

2014-10-20 Thread Uros Bizjak

On Mon, Oct 20, 2014 at 3:41 PM, Jakub Jelinek  wrote:
> On Mon, Oct 20, 2014 at 05:30:36PM +0400, Kirill Yukhin wrote:
>> > Unfortunately this caused PR63600.  The problem is that VI_AVX2
>> > mode iterator includes V2DI and for AVX2 also V4DI, but for pre-ssse3
>> > ix86_expand_sse2_abs doesn't handle V2DI (and can't easily, we don't have
>> > PSRAQ instruction), for ssse3 there is no vpabsq instruction, and for
>> > avx2 neither.
>> > We can handle V2DI/V4DI only for TARGET_AVX512VL, and V8DI for
>> > TARGET_AVX512F.
>> > Thus, IMHO the mode iterator on at least
>> > (define_insn "*abs2"
>> > and on
>> > (define_expand "abs2"
>> > is wrong, should not include V2DI/V4DI unless TARGET_AVX512VL
>> > (so new (or resurrected, was that VI124_AVX2_48_AVX512F?)
>> > specialized mode iterator?).
>>
>>
>> This patch removes absq insn patterns for non-AVX-512 targets.
>>
>>
>> gcc/
>>   * config/i386/sse.md (define_mode_iterator VI_AVX2): Restore to 128-,
>>   256- bit integer modes only.
>>   (define_mode_iterator VI_AVX2_AVX512): New.
>>   (define_expand "neg2"): Use VI_AVX2_AVX512 mode iterator.
>>   (define_expand "3"): Ditto.
>>   (define_insn "*3"): Ditto.
>>   (define_expand "_andnot3"): Ditto.
>>   (define_mode_iterator VI1248_AVX512VL_AVX512BW): New.
>>   (define_insn "abs2"): Ditto.
>>
>> Bootstrap in progress. AVX-512 tests pass.
>>
>> Is it ok for trunk?
>
> I'll certainly leave the review to Uros, whatever he prefers.
> That said, I was expecting you'd keep VI_AVX2 as is (because from the patch
> clearly that is what is used most commonly, the V?DI modes are for most
> insns normal integral vector modes, VI* uses the same modes and VI_AVX2
> used to be just like VI, just with TARGET_AVX conditions replaced with
> TARGET_AVX2), and just add a new mode iterator for the two abs patterns
> (*abs2 and abs2), it can be specialized mode iterator just
> for the abs with ABS in names or something.

Yes, I like this idea, too.

Just add VI1248_AVX512VL_AVX512BW and use it in abs patterns.

The changed patch is pre-approved, but please still make full
bootstrap and regtest cycle.

Thanks,
Uros.

Re: [AARCH64, NEON] Any regression testcase for AARCH64 NEON intrinsics in GCC testsuite?

2014-10-20 Thread Christophe Lyon

On 20 October 2014 14:01, Yangfei (Felix)  wrote:
> Hi,
>
>   I am trying to improve the AARCH64 NEON intrinsics. It seems that we don't 
> have enough testcases for this part in GCC testsuite.
>   How do you guys test your patch on this part? Any suggestions? Thanks.
>
Hello,

I have written a testsuite for AArch32 Neon intrinsics, available at
https://gitorious.org/arm-neon-tests

I am in the process of converting it into DejaGnu form for integration into GCC.

My most recent submission was
https://gcc.gnu.org/ml/gcc-patches/2014-07/msg00022.html
but I plan to submit another version soon.

As you'll notice, this first submission only covers a small subset of
the original testsuite, but I do plan to convert it all.

That being said, the current testsuite only covers AArch32 Neon
intrinsics, and needs to be expanded to cover AArch64. It is still
useful to test the AArch32 subset on AArch64.

Christophe.

Re: [PATCH i386 AVX512] [63.1/n] Add vpshufb, perm autogen (except for v64qi).

2014-10-20 Thread Ilya Tocar

> > 
> > The patch is OK with the above improvement.
> > 
> >
> 
> Will commit version below, if no objections in 24 hours.
> 
>
Sorry,
I missed palignr, which should also have a v64qi version,
and lost the return in the expand_vec_perm_palignr case
(this caused avx512f-vec-unpack test failures).
The patch below fixes it. Ok for trunk?

2014-10-20  Ilya Tocar  

* config/i386/i386.c (expand_vec_perm_1): Fix
expand_vec_perm_palignr case.
* config/i386/sse.md (_palignr_mask): Use
VI1_AVX512.

---
 gcc/config/i386/i386.c |  1 +
 gcc/config/i386/sse.md | 12 ++--
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 33b21f4..34273ca 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -43552,6 +43552,7 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
 
   /* Try the AVX2 vpalignr instruction.  */
   if (expand_vec_perm_palignr (d, true))
+return true;
 
   /* Try the AVX512F vpermi2 instructions.  */
   if (ix86_expand_vec_perm_vpermi2 (NULL_RTX, NULL_RTX, NULL_RTX, NULL_RTX, d))
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 8157045..a3f336f 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -13716,14 +13716,14 @@
(set_attr "mode" "DI")])
 
 (define_insn "_palignr_mask"
-  [(set (match_operand:VI1_AVX2 0 "register_operand" "=v")
-(vec_merge:VI1_AVX2
- (unspec:VI1_AVX2
-   [(match_operand:VI1_AVX2 1 "register_operand" "v")
-(match_operand:VI1_AVX2 2 "nonimmediate_operand" "vm")
+  [(set (match_operand:VI1_AVX512 0 "register_operand" "=v")
+(vec_merge:VI1_AVX512
+ (unspec:VI1_AVX512
+   [(match_operand:VI1_AVX512 1 "register_operand" "v")
+(match_operand:VI1_AVX512 2 "nonimmediate_operand" "vm")
 (match_operand:SI 3 "const_0_to_255_mul_8_operand" "n")]
UNSPEC_PALIGNR)
-   (match_operand:VI1_AVX2 4 "vector_move_operand" "0C")
+   (match_operand:VI1_AVX512 4 "vector_move_operand" "0C")
(match_operand: 5 "register_operand" "Yk")))]
   "TARGET_AVX512BW && ( == 64 || TARGET_AVX512VL)"
 {
-- 
1.8.3.1

RE: [Patch, MIPS] Add Octeon3 support

2014-10-20 Thread Matthew Fortune

> 2014-10-08  Andrew Pinski  
> 
> * config/mips/mips-cpus.def (octeon3): New cpu.
> * config/mips/mips.c (mips_rtx_cost_data): Add octeon3.
> (mips_print_operand ): Fix a bug as the mode
> of the comparison no longer matches mode of the operands.
> (mips_issue_rate): Handle PROCESSOR_OCTEON3.
> * config/mips/mips.h (TARGET_OCTEON):  Add Octeon3.
> (TARGET_OCTEON2): Likewise.
> (TUNE_OCTEON): Add Octeon3.
> * config/mips/mips.md (processor): Add octeon3.
> * config/mips/octeon.md (octeon_fpu): New automaton and cpu_unit.
> (octeon_arith): Add octeon3.
> (octeon_condmove): Remove.
> (octeon_condmove_o1): New reservation.
> (octeon_condmove_o2): New reservation.
> (octeon_condmove_o3_int_on_cc): New reservation.
> (octeon_load_o2): Add octeon3.
> (octeon_cop_o2): Likewise.
> (octeon_store): Likewise.
> (octeon_brj_o2): Likewise.
> (octeon_imul3_o2): Likewise.
> (octeon_imul_o2): Likewise.
> (octeon_mfhilo_o2): Likewise.
> (octeon_imadd_o2): Likewise.
> (octeon_idiv_o2_si): Likewise.
> (octeon_idiv_o2_di): Likewise.
> (octeon_fpu): Add to the automaton.
> (octeon_fpu): New cpu unit.
> (octeon_condmove_o2): Check for non floating point modes.
> (octeon_load_o2): Add prefetchx.
> (octeon_cop_o2): Don't check for octeon3.
> (octeon3_faddsubcvt): New reservation.
> (octeon3_fmul): Likewise.
> (octeon3_fmadd): Likewise.
> (octeon3_div_sf): Likewise.
> (octeon3_div_df): Likewise.
> (octeon3_sqrt_sf): Likewise.
> (octeon3_sqrt_df): Likewise.
> (octeon3_rsqrt_sf): Likewise.
> (octeon3_rsqrt_df): Likewise.
> (octeon3_fabsnegmov): Likewise.
> (octeon_fcond): Likewise.
> (octeon_fcondmov): Likewise.
> (octeon_fpmtc1): Likewise.
> (octeon_fpmfc1): Likewise.
> (octeon_fpload): Likewise.
> (octeon_fpstore): Likewise.
> * config/mips/mips-tables.opt: Regenerate.
> * doc/invoke.texi (-march=@var{arch}): Add octeon3.

Sorry for the delay.  Looks OK, just a couple of questions...

Is it intentional that you have not updated driver-native.c to
detect an Octeon 3 CPU? I guess you may be waiting on Octeon 3
being committed to the kernel so you know for sure how the
CPU will appear in procinfo?

Could you confirm what testing the patch has had?

Thanks,
Matthew

[c++-concepts]

2014-10-20 Thread Andrew Sutton

Fixing issues reported by users.

2014-10-20  Andrew Sutton  

Fixing user-reported issues and regressions
* gcc/cp/parser.c (cp_parser_template_declaration_after_exp):
Only pop access checks on failed parsing.
* gcc/cp/pt.c (type_dependent_expression_p): Always treat a
requires-expr as if dependently typed. Otherwise, we try to
evaluate these expressions when they have dependent types.
* gcc/cp/constraint.cc (normalize_stmt_list): Remove unused
function.
(normalize_call): Don't fold constraints during normalization.
* gcc/testsuite/g++.dg/concepts/decl-diagnose.C: Update diagnostics.

Andrew Sutton
Index: pt.c
===
--- pt.c	(revision 214991)
+++ pt.c	(working copy)
@@ -21646,6 +21646,16 @@ type_dependent_expression_p (tree expres
 	return dependent_type_p (type);
 }
 
+  // A requires expression has type bool, but is always treated as if
+  // it were a dependent expression.
+  //
+  // FIXME: This could be improved. Perhaps the type of the requires
+  // expression depends on the satisfaction of its constraints. That
+  // is, its type is bool only if its substitution into its normalized
+  // constraints succeeds.
+  if (TREE_CODE (expression) == REQUIRES_EXPR)
+return true;
+
   if (TREE_CODE (expression) == SCOPE_REF)
 {
   tree scope = TREE_OPERAND (expression, 0);
Index: constraint.cc
===
--- constraint.cc	(revision 216159)
+++ constraint.cc	(working copy)
@@ -326,7 +326,6 @@ tree normalize_nested_req (tree);
 tree normalize_var (tree);
 tree normalize_cleanup_point (tree);
 tree normalize_template_id (tree);
-tree normalize_stmt_list (tree);
 tree normalize_atom (tree);
 
 // Reduce the requirement T into a logical formula written in terms of
@@ -559,13 +558,13 @@ normalize_call (tree t)
   tree fn = TREE_VALUE (check);
   tree args = TREE_PURPOSE (check);
 
-  // Reduce the body of the function into the constriants language.
+  // Normalize the body of the function into the constriants language.
   tree body = normalize_constraints (DECL_SAVED_TREE (fn));
   if (!body)
 return error_mark_node;
 
   // Instantiate the reduced results using the deduced args.
-  tree result = tsubst_constraint_expr (body, args, false);
+  tree result = tsubst_constraint_expr (body, args, true);
   if (result == error_mark_node)
 return error_mark_node;
 
Index: testsuite/g++.dg/concepts/decl-diagnose.C
===
--- testsuite/g++.dg/concepts/decl-diagnose.C	(revision 214241)
+++ testsuite/g++.dg/concepts/decl-diagnose.C	(working copy)
@@ -4,12 +4,12 @@ typedef concept int CINT; // { dg-error
 
 void f(concept int); // { dg-error "a parameter cannot be declared 'concept'" }
 
-concept int f2(); // { dg-error "result must be bool" }
+concept int f2(); // { dg-error "return type" }
 concept bool f3();
 
 struct X
 {
-  concept int f4(); // { dg-error "result must be bool|declared with function parameters" }
+  concept int f4(); // { dg-error "return type|function parameters" }
   concept bool f5(); // { dg-error "declared with function parameters" }
   static concept bool f6(); // { dg-error "a concept cannot be a static member function" }
   static concept bool x; // { dg-error "declared 'concept'" }
Index: cp/parser.c
===
--- cp/parser.c	(revision 216159)
+++ cp/parser.c	(working copy)
@@ -24422,12 +24422,10 @@ cp_parser_template_declaration_after_exp
 
   push_deferring_access_checks (dk_deferred);
   parameter_list = cp_parser_template_introduction (parser);
-  pop_deferring_access_checks ();
-
   if (parameter_list == error_mark_node)
 {
-	  // Restore template requirements before returning.
 	  current_template_reqs = saved_template_reqs;
+  pop_deferring_access_checks ();
 	  return;
 }

Re: [PATCH] Don't call fatal_error before error reporting has been initialized.

2014-10-20 Thread Ilya Tocar

Same in collect2.

On 09 Oct 15:40, Ilya Tocar wrote:
> Ping.
> 
> On 29 Sep 18:02, Ilya Tocar wrote:
> > Hi,
> > 
> > Currently, if the call to atexit (lto_wrapper_cleanup) fails, we
> > won't report an error, as we haven't initialized the error-reporting
> > infrastructure. This patch moves the call after diagnostic_initialize.
> > I hope that we can't exit inside diagnostic_initialize; otherwise we
> > won't clean up after it.
> > Ok for trunk?
> >

---
 gcc/collect2.c| 6 +++---
 gcc/lto-wrapper.c | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/collect2.c b/gcc/collect2.c
index c54e6fb..b0784e8 100644
--- a/gcc/collect2.c
+++ b/gcc/collect2.c
@@ -955,9 +955,6 @@ main (int argc, char **argv)
   signal (SIGCHLD, SIG_DFL);
 #endif
 
-  if (atexit (collect_atexit) != 0)
-fatal_error ("atexit failed");
-
   /* Unlock the stdio streams.  */
   unlock_std_streams ();
 
@@ -965,6 +962,9 @@ main (int argc, char **argv)
 
   diagnostic_initialize (global_dc, 0);
 
+  if (atexit (collect_atexit) != 0)
+fatal_error ("atexit failed");
+
   /* Do not invoke xcalloc before this point, since locale needs to be
  set first, in case a diagnostic is issued.  */
 
diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 8033b15..d97f617 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -879,13 +879,13 @@ main (int argc, char *argv[])
 
   xmalloc_set_program_name (progname);
 
-  if (atexit (lto_wrapper_cleanup) != 0)
-fatal_error ("atexit failed");
-
   gcc_init_libintl ();
 
   diagnostic_initialize (global_dc, 0);
 
+  if (atexit (lto_wrapper_cleanup) != 0)
+fatal_error ("atexit failed");
+
   if (signal (SIGINT, SIG_IGN) != SIG_IGN)
 signal (SIGINT, fatal_signal);
 #ifdef SIGHUP
-- 
1.8.3.1

Re: [PING][PATCH] Warn about unclosed pragma omp declare target.

2014-10-20 Thread Ilya Tocar

Ping.

On 02 Oct 17:38, Ilya Tocar wrote:
> Ping.
> On 15 Aug 16:26, Ilya Tocar wrote:
> > Ping.
> > 
> > On 29 Jul 18:45, Ilya Tocar wrote:
> > > Hi,
> > > 
> > > > As discussed in https://gcc.gnu.org/ml/gcc/2014-01/msg00189.html,
> > > > GCC should complain about a pragma omp declare target without a
> > > > corresponding pragma omp end declare target. This patch adds a warning
> > > > for those cases.
> > > Bootstraps/passes make-check.
> > > Ok for trunk?
> > > 
> > > ChangeLog:
> > > 
> > > 2014-07-29  Ilya Tocar  
> > > 
> > >   * c-decl.c (omp_declare_target_location_stack): New.
> > >   * c-lang.h (omp_declare_target_location_stack): Declare.
> > >   * c-parser.c (warn_unclosed_pragma_omp_target): New.
> > >   (c_parser_translation_unit): Call it.
> > > >   (c_parser_omp_declare_target): Remember location.
> > >   (c_parser_omp_end_declare_target): Forget location.
> > > 
> > > And ChangeLog for testsuite:
> > > 
> > > 2014-07-29  Ilya Tocar  
> > > 
> > > >   * gcc.dg/gomp/target-3.c: New testcase.
> > > 
> > > ---
> > >  gcc/c/c-decl.c   |  3 +++
> > >  gcc/c/c-lang.h   |  3 +++
> > >  gcc/c/c-parser.c | 22 +-
> > >  gcc/testsuite/gcc.dg/gomp/target-3.c | 33 
> > > +
> > >  4 files changed, 60 insertions(+), 1 deletion(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/gomp/target-3.c
> > > 
> > > diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
> > > index 2a4b439..2dd5b2c 100644
> > > --- a/gcc/c/c-decl.c
> > > +++ b/gcc/c/c-decl.c
> > > @@ -158,6 +158,9 @@ enum machine_mode c_default_pointer_mode = VOIDmode;
> > >  /* If non-zero, implicit "omp declare target" attribute is added into the
> > > attribute lists.  */
> > >  int current_omp_declare_target_attribute;
> > > +
> > > +/* Holds locations of currently open "omp declare target" pragmas.  */
> > > > +vec<location_t> omp_declare_target_location_stack;
> > >  
> > >  /* Each c_binding structure describes one binding of an identifier to
> > > a decl.  All the decls in a scope - irrespective of namespace - are
> > > diff --git a/gcc/c/c-lang.h b/gcc/c/c-lang.h
> > > index e974906..cef995c 100644
> > > --- a/gcc/c/c-lang.h
> > > +++ b/gcc/c/c-lang.h
> > > @@ -59,4 +59,7 @@ struct GTY(()) language_function {
> > > attribute lists.  */
> > >  extern GTY(()) int current_omp_declare_target_attribute;
> > >  
> > > +/* Holds locations of currently open "omp declare target" pragmas.  */
> > > > +extern vec<location_t> omp_declare_target_location_stack;
> > > +
> > >  #endif /* ! GCC_C_LANG_H */
> > > diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> > > index e32bf04..0b96fe9 100644
> > > --- a/gcc/c/c-parser.c
> > > +++ b/gcc/c/c-parser.c
> > > @@ -1255,6 +1255,8 @@ static bool c_parser_cilk_verify_simd (c_parser *, 
> > > enum pragma_context);
> > >  static tree c_parser_array_notation (location_t, c_parser *, tree, tree);
> > >  static tree c_parser_cilk_clause_vectorlength (c_parser *, tree, bool);
> > >  
> > > +static void warn_unclosed_pragma_omp_target ();
> > > +
> > >  /* Parse a translation unit (C90 6.7, C99 6.9).
> > >  
> > > translation-unit:
> > > @@ -1290,6 +1292,8 @@ c_parser_translation_unit (c_parser *parser)
> > >   }
> > >while (c_parser_next_token_is_not (parser, CPP_EOF));
> > >  }
> > > +
> > > +  warn_unclosed_pragma_omp_target ();
> > >  }
> > >  
> > >  /* Parse an external declaration (C90 6.7, C99 6.9).
> > > @@ -13068,8 +13072,10 @@ c_finish_omp_declare_simd (c_parser *parser, 
> > > tree fndecl, tree parms,
> > >  static void
> > >  c_parser_omp_declare_target (c_parser *parser)
> > >  {
> > > +  location_t loc = c_parser_peek_token (parser)->location;
> > >c_parser_skip_to_pragma_eol (parser);
> > >current_omp_declare_target_attribute++;
> > > +  omp_declare_target_location_stack.safe_push (loc);
> > >  }
> > >  
> > >  static void
> > > @@ -13104,7 +13110,10 @@ c_parser_omp_end_declare_target (c_parser 
> > > *parser)
> > >  error_at (loc, "%<#pragma omp end declare target%> without 
> > > corresponding "
> > >  "%<#pragma omp declare target%>");
> > >else
> > > -current_omp_declare_target_attribute--;
> > > +{
> > > +  current_omp_declare_target_attribute--;
> > > +  omp_declare_target_location_stack.pop ();
> > > +}
> > >  }
> > >  
> > >  
> > > @@ -14267,4 +14276,15 @@ c_parser_array_notation (location_t loc, 
> > > c_parser *parser, tree initial_index,
> > >return value_tree;
> > >  }
> > >  
> > > +static void
> > > +warn_unclosed_pragma_omp_target ()
> > > +{
> > > +  int i;
> > > +  for (i = 0; i < current_omp_declare_target_attribute; i++)
> > > +warning_at (omp_declare_target_location_stack[i], 0,
> > > + "%<#pragma omp declare target%> without corresponding "
> > > + "%<#pragma omp end declare target%>");
> > > +  omp_declare_target_location_stack.release ();
> > > +}
> > > +
> > >  #include "gt-c-c-parser.h"
> > > diff --git a/gcc/testsui
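
The bookkeeping in the quoted patch (push a location when a declare target region opens, pop it when the region ends, and warn about whatever is left at end of file) can be sketched outside GCC roughly as follows. This is an illustration only: declare_target_tracker and toy_location are hypothetical names, and a plain std::vector stands in for GCC's vec.

```cpp
#include <vector>

// Hypothetical stand-in for location_t; here just a line number.
typedef int toy_location;

// Tracks currently open "omp declare target" regions.
class declare_target_tracker
{
public:
  // Called for "#pragma omp declare target".
  void open (toy_location loc) { open_locations_.push_back (loc); }

  // Called for "#pragma omp end declare target".  Returns false when
  // there is no matching open pragma (the case that is already an error).
  bool close ()
  {
    if (open_locations_.empty ())
      return false;
    open_locations_.pop_back ();
    return true;
  }

  // Called at end of translation unit.  Returns the locations that would
  // each receive a warning, and releases the stack.
  std::vector<toy_location> finish ()
  {
    std::vector<toy_location> unclosed;
    unclosed.swap (open_locations_);
    return unclosed;
  }

private:
  std::vector<toy_location> open_locations_;
};
```

Keeping a location per open pragma, rather than only a counter, is what allows each warning to point at the pragma that was never closed.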

Re: [GOOGLE] Increase max-early-inliner-iterations to 2 for profile-gen and use

2014-10-20 Thread Xinliang David Li

On Mon, Oct 20, 2014 at 1:32 AM, Richard Biener
 wrote:
> On Mon, Oct 20, 2014 at 12:02 AM, Xinliang David Li  
> wrote:
>> On Sat, Oct 18, 2014 at 4:19 PM, Xinliang David Li  
>> wrote:
>>> On Sat, Oct 18, 2014 at 3:27 PM, Jan Hubicka  wrote:
> The difference in instrumentation runtime is huge -- as topn profiler
> is pretty expensive to run.
>
> With FDO, it is probably better to make early inlining more aggressive
> in order to get more context sensitive profiling.

 I agree with that, I just would like to understand where increasing the
 iterations helps and if we can handle it without iterating (because Richi
 originally requested to drop the iteration for correctness issues)
>
> Well, I requested to do any iteration with an IPA view in mind.  That is,
> iterate for cgraph cycles for example where currently we face the situation
> that at least one function is inlined unoptimized.  For this we'd like to
> first optimize without inlining (well, maybe inlining doesn't hurt)

yes -- inlining decision made without callee cleanup is more
conservative and should not hurt.

>and then
> inline (and re-optimize if we inlined).
>
> Indirect edges are more interesting, but basically you'd want to re-inline
> once you discover new direct calls during early opts (but then make
> sure to do that only after the direct callee was early-optimized first).
>

It would be interesting to inline the newly introduced direct calls if
the callsites also have function pointer arguments that are known in
the call context.

> Thus it would be nice if somebody could improve on the currently very
> simple function ordering we apply early opts, integrating "iteration"
> in a better way (not iterating over all functions but only where it
> might make a difference, focused on inlining).
>
 Do you have some examples?
>>>
>>> We can do FDO experiment by shutting down einline. (Note that
>>> increasing iteration to 2 did not actually improve performance with
>>> our benchmarks).
>>
>> Early inlining itself has large performance impact for FDO (the
>> runtime of the profile-use build). With it disabled, the FDO
>> performance drops by >2% on average. The degradation is seen across
>> all benchmarks except for one.
>
> Only 2%?  You are lucky ;)

2% average is considered pretty significant for optimized build
runtime performance.


> For tramp3d introducing early inlining
> made a difference of 10% ;)  (yes, statistically for tramp3d
> we have for each assembler instruction generated 100 calls in the
> initial code ... wheee C++ template metaprogramming!)

Is this 10% difference from instrumentation build or optimized
build runtime?

>
> So indeed early inlining was absolutely required to make FDO usable at all.

thanks,

David
>
> Richard.
>
>> David
>>
>>
>>>
>>> David
>>>
 Honza
>
> David
>
> On Sat, Oct 18, 2014 at 10:05 AM, Jan Hubicka  wrote:
> >> Increasing the number of early inliner iterations from 1 to 2 enables more
> >> indirect calls to be promoted/inlined before instrumentation. This in turn
> >> reduces the instrumentation overhead, particularly for more expensive
> >> indirect call topn profiling.
> >
> > How much difference do you get here? One possibility would also be to run
> > specialized ipa-cp before profile instrumentation.
> >
> > Honza
> >>
> >> Passes internal testing and regression tests. Ok for google/4_9?
> >>
> >> 2014-10-18  Teresa Johnson  
> >>
> >> Google ref b/17934523
> >> * opts.c (finish_options): Increase 
> >> max-early-inliner-iterations to 2
> >> for profile-gen and profile-use builds.
> >>
> >> Index: opts.c
> >> ===
> >> --- opts.c  (revision 216286)
> >> +++ opts.c  (working copy)
> >> @@ -870,6 +869,14 @@ finish_options (struct gcc_options *opts, struct g
> >>  opts->x_param_values, opts_set->x_param_values);
> >>  }
> >>
> >> +  if (opts->x_profile_arc_flag
> >> +  || opts->x_flag_branch_probabilities)
> >> +{
> >> +  maybe_set_param_value
> >> +   (PARAM_EARLY_INLINER_MAX_ITERATIONS, 2,
> >> +opts->x_param_values, opts_set->x_param_values);
> >> +}
> >> +
> >>if (!(opts->x_flag_auto_profile
> >>  || (opts->x_profile_arc_flag || 
> >> opts->x_flag_branch_probabilities)))
> >>  {
> >>
> >>
> >> --
> >> Teresa Johnson | Software Engineer | tejohn...@google.com | 
> >> 408-460-2413

Re: [PATCH i386 AVX512] [63.1/n] Add vpshufb, perm autogen (except for v64qi).

2014-10-20 Thread Uros Bizjak

On Mon, Oct 20, 2014 at 5:19 PM, Ilya Tocar  wrote:
>> >
>> > The patch is OK with the above improvement.
>> >
>> >
>>
>> Will commit version below, if no objections in 24 hours.
>>
>>
> Sorry,
> I've missed palignr, which should also have v64qi version,
> and lost return in expand_vec_perm_palignr case
> (this caused avx512f-vec-unpack test failures).
> Patch below fixes it. Ok for trunk?
>
> 2014-10-20  Ilya Tocar  
>
> * config/i386/i386.c (expand_vec_perm_1): Fix
> expand_vec_perm_palignr case.
> * config/i386/sse.md (_palignr_mask): Use
> VI1_AVX512.

OK.

Thanks,
Uros.

[PING][PATCH] GCC/test: Set timeout factor for c11-atomic-exec-5.c

2014-10-20 Thread Maciej W. Rozycki

Hi,

 I thought http://gcc.gnu.org/ml/gcc-patches/2014-09/msg00242.html would 
be folded into PowerPC TARGET_ATOMIC_ASSIGN_EXPAND_FENV support, but I see 
r216437 went without it.  In that case would someone please review my 
proposal as a separate change?

 Thanks,

  Maciej

Re: -fuse-caller-save - Collect register usage information

2014-10-20 Thread Eric Botcazou

> But, given the preference of a number of others for fipa-ra, could you live
> with that?

Yes, IMO that's too vague a name but still better than the existing one. :-)

-- 
Eric Botcazou

[Patch, libstdc++/63497] Avoid dereferencing invalid iterator in regex_executor

2014-10-20 Thread Tim Shen

Bootstrapped and tested.

Thanks!


-- 
Regards,
Tim Shen
commit 95c73ab6280c1f8182d018ee29a44230965dd4ef
Author: timshen 
Date:   Sun Oct 19 15:14:55 2014 -0700

PR libstdc++/63497
include/bits/regex_executor.h (_Executor::_M_word_boundary): Remove
const qualifier.
include/bits/regex_executor.tcc (_Executor::_M_dfs,
_Executor::_M_word_boundary): Avoid dereferencing _M_current at _M_end
or other invalid position.

diff --git a/libstdc++-v3/include/bits/regex_executor.h 
b/libstdc++-v3/include/bits/regex_executor.h
index cd9e55d..b867951 100644
--- a/libstdc++-v3/include/bits/regex_executor.h
+++ b/libstdc++-v3/include/bits/regex_executor.h
@@ -145,7 +145,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 
   bool
-  _M_word_boundary(_State<_TraitsT> __state) const;
+  _M_word_boundary(_State<_TraitsT> __state);
 
   bool
   _M_lookahead(_State<_TraitsT> __state);
diff --git a/libstdc++-v3/include/bits/regex_executor.tcc 
b/libstdc++-v3/include/bits/regex_executor.tcc
index 5eab852..9655c7a 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -284,9 +284,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_dfs(__match_mode, __state._M_next);
  break;
case _S_opcode_match:
+ if (_M_current == _M_end)
+   break;
  if (__dfs_mode)
{
- if (_M_current != _M_end && __state._M_matches(*_M_current))
+ if (__state._M_matches(*_M_current))
{
  ++_M_current;
  _M_dfs(__match_mode, __state._M_next);
@@ -407,25 +409,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 bool _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
-_M_word_boundary(_State<_TraitsT>) const
+_M_word_boundary(_State<_TraitsT>)
 {
-  // By definition.
-  bool __ans = false;
-  auto __pre = _M_current;
-  --__pre;
-  if (!(_M_at_begin() && _M_at_end()))
+  bool __left_is_word = false;
+  if (_M_current != _M_begin
+ || (_M_flags & regex_constants::match_prev_avail))
{
- if (_M_at_begin())
-   __ans = _M_is_word(*_M_current)
- && !(_M_flags & regex_constants::match_not_bow);
- else if (_M_at_end())
-   __ans = _M_is_word(*__pre)
- && !(_M_flags & regex_constants::match_not_eow);
- else
-   __ans = _M_is_word(*_M_current)
- != _M_is_word(*__pre);
+ --_M_current;
+ if (_M_is_word(*_M_current))
+   __left_is_word = true;
+ ++_M_current;
}
-  return __ans;
+  bool __right_is_word = false;
+  if (_M_current != _M_end && _M_is_word(*_M_current))
+   __right_is_word = true;
+
+  if (__left_is_word == __right_is_word)
+   return false;
+  if (__left_is_word && !(_M_flags & regex_constants::match_not_eow))
+   return true;
+  if (__right_is_word && !(_M_flags & regex_constants::match_not_bow))
+   return true;
+  return false;
 }
 
 _GLIBCXX_END_NAMESPACE_VERSION
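
For readers who want to try the boundary logic outside libstdc++, the following is a rough standalone sketch of the same idea: look left only when there is a valid character to the left, and look right only when not at the end. The names boundary_flags, is_word_char and at_word_boundary are hypothetical, the word test is an ASCII approximation rather than the traits-based one, and indices into a std::string replace the iterators.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical flag bundle mirroring match_not_bow, match_not_eow and
// match_prev_avail; not the libstdc++ types.
struct boundary_flags
{
  bool not_bow = false;
  bool not_eow = false;
  bool prev_avail = false;
};

// ASCII approximation of a "word" character.
bool
is_word_char (char c)
{
  return std::isalnum (static_cast<unsigned char> (c)) != 0 || c == '_';
}

// Reports whether POS is a word boundary within TEXT[FIRST, LAST).
// With prev_avail set, the caller promises that FIRST > 0, so that
// TEXT[FIRST - 1] exists.
bool
at_word_boundary (const std::string &text, std::size_t first,
                  std::size_t last, std::size_t pos,
                  boundary_flags flags = boundary_flags ())
{
  bool left_is_word = false;
  if (pos != first || flags.prev_avail)
    left_is_word = is_word_char (text[pos - 1]);  // never reads before FIRST
                                                  // unless prev_avail

  // Never reads TEXT[LAST].
  bool right_is_word = pos != last && is_word_char (text[pos]);

  if (left_is_word == right_is_word)
    return false;
  if (left_is_word && !flags.not_eow)
    return true;
  if (right_is_word && !flags.not_bow)
    return true;
  return false;
}
```

For the text "ab cd" searched as a whole, positions 0, 2, 3 and 5 are boundaries and position 1 is not; setting not_bow suppresses the boundary at position 0.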

C++ PATCH for c++/63601 (lambda, 'this' outside class)

2014-10-20 Thread Jason Merrill

finish_this_expr needs to be prepared for lambda_expr_this_capture to 
return NULL_TREE.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit c631290819f1ab3754041c46d351745953fb8319
Author: Jason Merrill 
Date:   Mon Oct 20 09:56:35 2014 -0400

	PR c++/63601
	* lambda.c (current_nonlambda_function): New.
	* semantics.c (finish_this_expr): Use it.
	* cp-tree.h: Declare it.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index b6afc31..0923d9f 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5961,6 +5961,7 @@ extern bool is_normal_capture_proxy (tree);
 extern void register_capture_members		(tree);
 extern tree lambda_expr_this_capture(tree, bool);
 extern tree maybe_resolve_dummy			(tree, bool);
+extern tree current_nonlambda_function		(void);
 extern tree nonlambda_method_basetype		(void);
 extern void maybe_add_lambda_conv_op(tree);
 extern bool is_lambda_ignored_entity(tree);
diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 17fd037..d4030e3 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -777,6 +777,17 @@ maybe_resolve_dummy (tree object, bool add_capture_p)
   return object;
 }
 
+/* Returns the innermost non-lambda function.  */
+
+tree
+current_nonlambda_function (void)
+{
+  tree fn = current_function_decl;
+  while (fn && LAMBDA_FUNCTION_P (fn))
+fn = decl_function_context (fn);
+  return fn;
+}
+
 /* Returns the method basetype of the innermost non-lambda function, or
NULL_TREE if none.  */
 
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 0e675a3..26e66f5 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2438,7 +2438,7 @@ finish_increment_expr (tree expr, enum tree_code code)
 tree
 finish_this_expr (void)
 {
-  tree result;
+  tree result = NULL_TREE;
 
   if (current_class_ptr)
 {
@@ -2450,25 +2450,19 @@ finish_this_expr (void)
   else
 result = current_class_ptr;
 }
-  else if (current_function_decl
-	   && DECL_STATIC_FUNCTION_P (current_function_decl))
-{
-      error ("%<this%> is unavailable for static member functions");
-  result = error_mark_node;
-}
+
+  if (result)
+/* The keyword 'this' is a prvalue expression.  */
+return rvalue (result);
+
+  tree fn = current_nonlambda_function ();
+  if (fn && DECL_STATIC_FUNCTION_P (fn))
+    error ("%<this%> is unavailable for static member functions");
+  else if (fn)
+    error ("invalid use of %<this%> in non-member function");
   else
-{
-  if (current_function_decl)
-	error ("invalid use of %<this%> in non-member function");
-  else
-	error ("invalid use of %<this%> at top level");
-  result = error_mark_node;
-}
-
-  /* The keyword 'this' is a prvalue expression.  */
-  result = rvalue (result);
-
-  return result;
+error ("invalid use of %<this%> at top level");
+  return error_mark_node;
 }
 
 /* Finish a pseudo-destructor expression.  If SCOPE is NULL, the
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-this20.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-this20.C
new file mode 100644
index 000..0d27320
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-this20.C
@@ -0,0 +1,4 @@
+// PR c++/63601
+// { dg-do compile { target c++11 } }
+
+auto f = []{ sizeof(this); };	// { dg-error "this" }
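
As a rough illustration of the walk that current_nonlambda_function performs and of how the error is then chosen, here is a standalone sketch. Everything in it is a simplification: toy_fn is a hypothetical miniature of a function declaration, not GCC's tree, and the returned strings are abbreviations rather than the real diagnostics.

```cpp
#include <string>

// Hypothetical miniature of a function declaration; not GCC's tree type.
struct toy_fn
{
  bool is_lambda;
  bool is_static_member;
  const toy_fn *context;  // enclosing function, or nullptr at top level
};

// Skips enclosing lambda bodies until a non-lambda function (or nothing)
// is found.
const toy_fn *
innermost_nonlambda (const toy_fn *fn)
{
  while (fn && fn->is_lambda)
    fn = fn->context;
  return fn;
}

// Chooses the error text for a 'this' that has no class object to refer to.
std::string
this_error (const toy_fn *current)
{
  const toy_fn *fn = innermost_nonlambda (current);
  if (fn && fn->is_static_member)
    return "unavailable for static member functions";
  if (fn)
    return "invalid use in non-member function";
  return "invalid use at top level";
}
```

A lambda at namespace scope, as in the new testcase, has no enclosing function in this model, so the walk ends with a null pointer and the top-level message is chosen.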

[PATCH] Add top-level config support for gold mips target

2014-10-20 Thread Cary Coutant

This patch adds support for the mips target in gold.

OK to commit?

-cary


2014-10-20  Cary Coutant  

* configure (--enable-gold): Add mips*-*-*.
* configure.ac: Regenerate.


Index: configure
===
--- configure   (revision 216487)
+++ configure   (working copy)
@@ -2941,7 +2941,7 @@ case "${ENABLE_GOLD}" in
   # Check for target supported by gold.
   case "${target}" in
 i?86-*-* | x86_64-*-* | sparc*-*-* | powerpc*-*-* | arm*-*-* \
-| aarch64*-*-* | tilegx*-*-*)
+| aarch64*-*-* | tilegx*-*-* | mips*-*-*)
  configdirs="$configdirs gold"
  if test x${ENABLE_GOLD} = xdefault; then
default_ld=gold
Index: configure.ac
===
--- configure.ac(revision 216487)
+++ configure.ac(working copy)
@@ -332,7 +332,7 @@ case "${ENABLE_GOLD}" in
   # Check for target supported by gold.
   case "${target}" in
 i?86-*-* | x86_64-*-* | sparc*-*-* | powerpc*-*-* | arm*-*-* \
-| aarch64*-*-* | tilegx*-*-*)
+| aarch64*-*-* | tilegx*-*-* | mips*-*-*)
  configdirs="$configdirs gold"
  if test x${ENABLE_GOLD} = xdefault; then
default_ld=gold

[Google/gcc-4_9][PATCH][target/x86_64] PR 63538

2014-10-20 Thread Sriraman Tallam

Hi,

   This patch is under review for trunk GCC :
https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01638.html.

In the mean time, is this ok for google/gcc-4_9 branch?  Without
this, -mcmodel=medium is unusable if .lrodata goes beyond the 2G
boundary.

Thanks
Sri
Index: testsuite/gcc.dg/pr63538.c
===
--- testsuite/gcc.dg/pr63538.c  (revision 0)
+++ testsuite/gcc.dg/pr63538.c  (revision 0)
@@ -0,0 +1,14 @@
+/* PR63538 is about not using 64-bit addresses for .lrodata accesses when it
+   involves STRING_CSTs.  */
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options "-O2 -mcmodel=medium -mlarge-data-threshold=0" { target 
x86_64-*-* } } */
+
+#include <stdio.h>
+
+const char *str = "Hello World";
+
+int main() {
+ printf("str = %p %s\n",str, str);
+ return 0;
+}
+/* { dg-final { scan-assembler-not "movl" } } */
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 216287)
+++ config/i386/i386.c  (working copy)
@@ -41331,8 +41331,7 @@ ix86_encode_section_info (tree decl, rtx rtl, int
 {
   default_encode_section_info (decl, rtl, first);
 
-  if (TREE_CODE (decl) == VAR_DECL
-  && (TREE_STATIC (decl) || DECL_EXTERNAL (decl))
+  if ((TREE_STATIC (decl) || DECL_EXTERNAL (decl))
   && ix86_in_large_data_p (decl))
 SYMBOL_REF_FLAGS (XEXP (rtl, 0)) |= SYMBOL_FLAG_FAR_ADDR;
 }

Re: [Google/gcc-4_9][PATCH][target/x86_64] PR 63538

2014-10-20 Thread Xinliang David Li

Why remove the TREE_CODE check?

David

On Mon, Oct 20, 2014 at 10:35 AM, Sriraman Tallam  wrote:
> Hi,
>
>This patch is under review for trunk GCC :
> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01638.html.
>
> In the mean time, is this ok for google/gcc-4_9 branch?  Without
> this, -mcmodel=medium is unusable if .lrodata goes beyond the 2G
> boundary.
>
> Thanks
> Sri

Re: [Google/gcc-4_9][PATCH][target/x86_64] PR 63538

2014-10-20 Thread Sriraman Tallam

On Mon, Oct 20, 2014 at 10:42 AM, Xinliang David Li  wrote:
> Why remove the TREE_CODE check?

The actual problem is that STRING_CSTs (which end up in .lrodata) are
not marked with a far address, because they don't match the VAR_DECL
check here. Further, the "ix86_in_large_data_p" call already performs
the TREE_CODE check needed to do the right thing, so the check here
seems unnecessary and buggy.

Thanks
Sri

>
> David
>
> On Mon, Oct 20, 2014 at 10:35 AM, Sriraman Tallam  wrote:
>> Hi,
>>
>>This patch is under review for trunk GCC :
>> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01638.html.
>>
>> In the mean time, is this ok for google/gcc-4_9 branch?  Without
>> this, -mcmodel=medium is unusable if .lrodata goes beyond the 2G
>> boundary.
>>
>> Thanks
>> Sri

Re: [Google/gcc-4_9][PATCH][target/x86_64] PR 63538

2014-10-20 Thread Andrew Pinski

On Mon, Oct 20, 2014 at 10:46 AM, Sriraman Tallam  wrote:
> On Mon, Oct 20, 2014 at 10:42 AM, Xinliang David Li  
> wrote:
>> Why remove the TREE_CODE check?
>
> The actual problem is that STRING_CSTs (which end up in .lrodata) are
> not marked with a far address, because they don't match the VAR_DECL
> check here. Further, the "ix86_in_large_data_p" call already performs
> the TREE_CODE check needed to do the right thing, so the check here
> seems unnecessary and buggy.

I think he is asking because TREE_STATIC (decl) || DECL_EXTERNAL
(decl) might be an issue for STRING_CSTs.

Thanks,
Andrew


>
> Thanks
> Sri
>
>>
>> David
>>
>> On Mon, Oct 20, 2014 at 10:35 AM, Sriraman Tallam  
>> wrote:
>>> Hi,
>>>
>>>This patch is under review for trunk GCC :
>>> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01638.html.
>>>
>>> In the mean time, is this ok for google/gcc-4_9 branch?  Without
>>> this, -mcmodel=medium is unusable if .lrodata goes beyond the 2G
>>> boundary.
>>>
>>> Thanks
>>> Sri

[jit] Drop libgccjit.pc

2014-10-20 Thread David Malcolm

Committed to branch dmalcolm/jit:

pkg-config appears to be controversial, so don't provide a .pc file.

gcc/ChangeLog.jit:
* Makefile.in (pkgconfigdir): Drop this.
(installdirs): Likewise.
* configure.ac (gcc_version): Don't AC_SUBST this.
* configure: Regenerate.

gcc/jit/ChangeLog.jit:
* Make-lang.in (jit.install-common): Drop installation of
libgccjit.pc.
* config-lang.in (outputs): Drop jit/libgccjit.pc.
* libgccjit.pc.in: Delete.
---
 gcc/ChangeLog.jit   |  7 +++
 gcc/Makefile.in |  3 ---
 gcc/configure   |  6 ++
 gcc/configure.ac|  1 -
 gcc/jit/ChangeLog.jit   |  7 +++
 gcc/jit/Make-lang.in|  2 --
 gcc/jit/config-lang.in  |  4 
 gcc/jit/libgccjit.pc.in | 11 ---
 8 files changed, 16 insertions(+), 25 deletions(-)
 delete mode 100644 gcc/jit/libgccjit.pc.in

diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit
index bf2d6d2..8dec312 100644
--- a/gcc/ChangeLog.jit
+++ b/gcc/ChangeLog.jit
@@ -1,3 +1,10 @@
+2014-10-20  David Malcolm  
+
+   * Makefile.in (pkgconfigdir): Drop this.
+   (installdirs): Likewise.
+   * configure.ac (gcc_version): Don't AC_SUBST this.
+   * configure: Regenerate.
+
 2014-10-17  David Malcolm  
 
* Makefile.in (FULL_DRIVER_NAME): New variable, adapted from the
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 523d1db..954a1eb 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -570,8 +570,6 @@ bindir = @bindir@
 libdir = @libdir@
 # Directory in which GCC puts its executables.
 libexecdir = @libexecdir@
-# Directory in which to install .pc files for pkgconfig
-pkgconfigdir = @libdir@/pkgconfig
 
 # 
 # UNSORTED
@@ -3141,7 +3139,6 @@ installdirs:
$(mkinstalldirs) $(DESTDIR)$(infodir)
$(mkinstalldirs) $(DESTDIR)$(man1dir)
$(mkinstalldirs) $(DESTDIR)$(man7dir)
-   $(mkinstalldirs) $(DESTDIR)$(pkgconfigdir)
 
 PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
   toplev.h $(DIAGNOSTIC_CORE_H) $(BASIC_BLOCK_H) $(HASH_TABLE_H) \
diff --git a/gcc/configure b/gcc/configure
index 81634f2..0024ece 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -825,7 +825,6 @@ build_os
 build_vendor
 build_cpu
 build
-gcc_version
 target_alias
 host_alias
 build_alias
@@ -3042,7 +3041,6 @@ ac_config_headers="$ac_config_headers 
auto-host.h:config.in"
 
 gcc_version=`cat $srcdir/BASE-VER`
 
-
 # Determine the host, build, and target systems
 ac_aux_dir=
 for ac_dir in "$srcdir" "$srcdir/.." "$srcdir/../.."; do
@@ -18093,7 +18091,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18096 "configure"
+#line 18094 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -18199,7 +18197,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18202 "configure"
+#line 18200 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 0af7a77..37db6ab 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -29,7 +29,6 @@ AC_CONFIG_SRCDIR(tree.c)
 AC_CONFIG_HEADER(auto-host.h:config.in)
 
 gcc_version=`cat $srcdir/BASE-VER`
-AC_SUBST(gcc_version)
 
 # Determine the host, build, and target systems
 AC_CANONICAL_BUILD
diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index 0c55258..9a36dfd 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,10 @@
+2014-10-20  David Malcolm  
+
+   * Make-lang.in (jit.install-common): Drop installation of
+   libgccjit.pc.
+   * config-lang.in (outputs): Drop jit/libgccjit.pc.
+   * libgccjit.pc.in: Delete.
+
 2014-10-17  David Malcolm  
 
* Make-lang.in (jit): Add $(FULL_DRIVER_NAME) as a dependency, so
diff --git a/gcc/jit/Make-lang.in b/gcc/jit/Make-lang.in
index ac179f4..167fcad 100644
--- a/gcc/jit/Make-lang.in
+++ b/gcc/jit/Make-lang.in
@@ -260,8 +260,6 @@ jit.install-common: installdirs
  $(DESTDIR)/$(includedir)/libgccjit.h
$(INSTALL_PROGRAM) $(srcdir)/jit/libgccjit++.h \
  $(DESTDIR)/$(includedir)/libgccjit++.h
-   $(INSTALL_PROGRAM) jit/libgccjit.pc \
- $(DESTDIR)/$(libdir)/pkgconfig/libgccjit.pc
 
 jit.install-man:
 
diff --git a/gcc/jit/config-lang.in b/gcc/jit/config-lang.in
index b22a5ee..7a32afe 100644
--- a/gcc/jit/config-lang.in
+++ b/gcc/jit/config-lang.in
@@ -36,7 +36,3 @@ gtfiles="\$(srcdir)/jit/dummy-frontend.c"
 # Hence to get the jit, one must configure with:
 #   --enable-host-shared --enable-languages=jit
 build_by_default="no"
-
-# Ensure that libgccjit.pc is built from libgccjit.pc.in
-# via AC_CONFIG_FILES in gcc/configure.ac
-outputs="jit/libgccjit.pc"
diff --git a/gcc/jit/libgccjit.pc.in b/gcc/jit/libgccjit.pc.in
deleted file mode 100644
index faafea5..000
--- a/gcc/jit/libgccjit.pc.in
+++ /dev/null
@@ -1,11 +0,0 @@
-prefix=@prefix@
-exec_prefix=

Re: [Google/gcc-4_9][PATCH][target/x86_64] PR 63538

2014-10-20 Thread Xinliang David Li

Perhaps explicitly allowing STRING_CST to go through the large data
check, instead of removing the var-decl check? Do you see other
opcodes that need to be handled too?
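
To compare the three variants under discussion side by side, here is a toy model. It is a sketch under loud assumptions: toy_node and its fields are hypothetical, the in_large_data flag merely stands in for the result of ix86_in_large_data_p, and the "suggested" predicate is only one possible reading of the proposal above, in which a STRING_CST skips the storage-class test and goes straight to the large data check.

```cpp
// Hypothetical node kinds; not GCC's tree codes.
enum toy_code { TOY_VAR_DECL, TOY_STRING_CST, TOY_FUNCTION_DECL };

// Hypothetical miniature of a declaration or constant.
struct toy_node
{
  toy_code code;
  bool is_static;
  bool is_external;
  bool in_large_data;  // stands in for the large data check
};

bool
has_static_or_external_storage (const toy_node &n)
{
  return n.is_static || n.is_external;
}

// Before the patch: only variables can be marked as far addresses.
bool
far_addr_original (const toy_node &n)
{
  return n.code == TOY_VAR_DECL
         && has_static_or_external_storage (n)
         && n.in_large_data;
}

// With the patch: the node kind is no longer examined here.
bool
far_addr_patched (const toy_node &n)
{
  return has_static_or_external_storage (n) && n.in_large_data;
}

// One reading of the suggestion: keep the variable check and let string
// constants through to the large data check explicitly.
bool
far_addr_suggested (const toy_node &n)
{
  if (n.code == TOY_STRING_CST)
    return n.in_large_data;
  return far_addr_original (n);
}
```

In this model the patched and suggested predicates agree on variables and differ on every other node kind, which is where the question about other codes comes in.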

David

On Mon, Oct 20, 2014 at 10:46 AM, Sriraman Tallam  wrote:
> On Mon, Oct 20, 2014 at 10:42 AM, Xinliang David Li  
> wrote:
>> Why remove the TREE_CODE check?
>
> The actual problem is that STRING_CSTs (which end up in .lrodata) are
> not marked with a far address, because they don't match the VAR_DECL
> check here. Further, the "ix86_in_large_data_p" call already performs
> the TREE_CODE check needed to do the right thing, so the check here
> seems unnecessary and buggy.
>
> Thanks
> Sri
>
>>
>> David
>>
>> On Mon, Oct 20, 2014 at 10:35 AM, Sriraman Tallam  
>> wrote:
>>> Hi,
>>>
>>>This patch is under review for trunk GCC :
>>> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01638.html.
>>>
>>> In the mean time, is this ok for google/gcc-4_9 branch?  Without
>>> this, -mcmodel=medium is unusable if .lrodata goes beyond the 2G
>>> boundary.
>>>
>>> Thanks
>>> Sri

libgo patch committed: Allocate correct types in reflect for interface conversions

2014-10-20 Thread Ian Taylor

This patch to libgo is a copy of a patch I recently made to the master
Go library.  This changes the reflect package to allocate memory using
the correct types for interface conversions.  The code was incorrectly
allocating an empty interface type to hold a non-empty interface
value.  This was working in the master library because it
coincidentally always handled the values correctly.  This was working
in gccgo because we almost never have the types anyhow, although we
plan to change that shortly.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
diff -r 87f8de53800a libgo/go/reflect/value.go
--- a/libgo/go/reflect/value.go Fri Oct 17 17:41:04 2014 -0700
+++ b/libgo/go/reflect/value.go Mon Oct 20 10:59:43 2014 -0700
@@ -1405,9 +1405,9 @@
 func (v Value) Set(x Value) {
v.mustBeAssignable()
x.mustBeExported() // do not let unexported x leak
-   var target *interface{}
+   var target unsafe.Pointer
if v.kind() == Interface {
-   target = (*interface{})(v.ptr)
+   target = v.ptr
}
x = x.assignTo("reflect.Set", v.typ, target)
if x.flag&flagIndir != 0 {
@@ -2230,7 +2230,7 @@
 // assignTo returns a value v that can be assigned directly to typ.
 // It panics if v is not assignable to typ.
 // For a conversion to an interface type, target is a suggested scratch space 
to use.
-func (v Value) assignTo(context string, dst *rtype, target *interface{}) Value 
{
+func (v Value) assignTo(context string, dst *rtype, target unsafe.Pointer) 
Value {
if v.flag&flagMethod != 0 {
v = makeMethodValue(context, v)
}
@@ -2246,15 +2246,15 @@
 
case implements(dst, v.typ):
if target == nil {
-   target = new(interface{})
+   target = unsafe_New(dst)
}
x := valueInterface(v, false)
if dst.NumMethod() == 0 {
-   *target = x
+   *(*interface{})(target) = x
} else {
-   ifaceE2I(dst, x, unsafe.Pointer(target))
+   ifaceE2I(dst, x, target)
}
-   return Value{dst, unsafe.Pointer(target) /* 0, */, flagIndir | 
flag(Interface)< interface
 func cvtT2I(v Value, typ Type) Value {
-   target := new(interface{})
+   target := unsafe_New(typ.common())
x := valueInterface(v, false)
if typ.NumMethod() == 0 {
-   *target = x
+   *(*interface{})(target) = x
} else {
-   ifaceE2I(typ.(*rtype), x, unsafe.Pointer(target))
+   ifaceE2I(typ.(*rtype), x, target)
}
-   return Value{typ.common(), unsafe.Pointer(target) /* 0, */, 
v.flag&flagRO | flagIndir | flag(Interface)< interface

[patch] Second basic-block.h restructuring patch.

2014-10-20 Thread Andrew MacLeod

creates cfg.h, cfganal.h, lcm.h, and loop-unroll.h to house the 
prototypes for those .c files.


cfganal.h also gets "struct edge_list"  and "class control_dependences" 
definitions since that is where all the routines and manipulators are 
declared.


 loop-unroll.h only exports 2 routines, so rather than including that 
in basic-block.h I simply included it from the 2 .c files which consume 
those routines.  Again, the other includes will be flattened out of 
basic-block.h to just their consumers later.


loop-unroll.c also had one function I marked as static since it wasn't 
actually used anywhere else.


bootstraps on x86_64-unknown-linux-gnu, and regressions are running... I 
expect no regressions because of the nature of the changes.   OK to 
check in assuming everything is OK?


Andrew


	* cfg.h: New.  Header file for cfg.c.
	* cfganal.h: New.  Header file for cfganal.c.
	* lcm.h: New.  Header file for lcm.c.
	* loop-unroll.h: New.  Header file for loop-unroll.c.
	* cfgloop.h (unroll_loops): Remove prototype.
	* basic-block.h: Move prototypes and structs to new header files.
	Include cfg.h, cfganal.h, and lcm.h.
	* loop-init.c: Include loop-unroll.h.
	* loop-unroll.c (referenced_in_one_insn_in_loop_p): Make static.
	* modulo-sched.c: Include loop-unroll.h.


Index: cfg.h
===
--- cfg.h	(revision 0)
+++ cfg.h	(working copy)
@@ -0,0 +1,65 @@
+/* Control flow graph manipulation code header file.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_CFG_H
+#define GCC_CFG_H
+
+extern void init_flow (struct function *);
+extern void clear_edges (void);
+extern basic_block alloc_block (void);
+extern void link_block (basic_block, basic_block);
+extern void unlink_block (basic_block);
+extern void compact_blocks (void);
+extern void expunge_block (basic_block);
+extern edge unchecked_make_edge (basic_block, basic_block, int);
+extern edge cached_make_edge (sbitmap, basic_block, basic_block, int);
+extern edge make_edge (basic_block, basic_block, int);
+extern edge make_single_succ_edge (basic_block, basic_block, int);
+extern void remove_edge_raw (edge);
+extern void redirect_edge_succ (edge, basic_block);
+extern void redirect_edge_pred (edge, basic_block);
+extern void clear_bb_flags (void);
+extern void dump_edge_info (FILE *, edge, int, int);
+extern void debug (edge_def &ref);
+extern void debug (edge_def *ptr);
+extern void alloc_aux_for_blocks (int);
+extern void clear_aux_for_blocks (void);
+extern void free_aux_for_blocks (void);
+extern void alloc_aux_for_edge (edge, int);
+extern void alloc_aux_for_edges (int);
+extern void clear_aux_for_edges (void);
+extern void free_aux_for_edges (void);
+extern void debug_bb (basic_block);
+extern basic_block debug_bb_n (int);
+extern void dump_bb_info (FILE *, basic_block, int, int, bool, bool);
+extern void brief_dump_cfg (FILE *, int);
+extern void update_bb_profile_for_threading (basic_block, int, gcov_type, edge);
+extern void scale_bbs_frequencies_int (basic_block *, int, int, int);
+extern void scale_bbs_frequencies_gcov_type (basic_block *, int, gcov_type,
+	 gcov_type);
+extern void initialize_original_copy_tables (void);
+extern void free_original_copy_tables (void);
+extern void set_bb_original (basic_block, basic_block);
+extern basic_block get_bb_original (basic_block);
+extern void set_bb_copy (basic_block, basic_block);
+extern basic_block get_bb_copy (basic_block);
+void set_loop_copy (struct loop *, struct loop *);
+struct loop *get_loop_copy (struct loop *);
+
+#endif /* GCC_CFG_H */
Index: cfganal.h
===
--- cfganal.h	(revision 0)
+++ cfganal.h	(working copy)
@@ -0,0 +1,79 @@
+/* Control flow graph analysis header file.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should ha

[jit] Error-handling within gcc::jit::dump

2014-10-20 Thread David Malcolm

On Fri, 2014-10-17 at 21:52 +, Joseph S. Myers wrote:
[...snip static linkage discussion...]

> The dump file handling appears to have no I/O error checking (no checking 
> for error on fopen, nothing obvious to prevent fwrite to a NULL m_file if 
> fopen did have an error, no checking for error on fclose (or fwrite)).

Thanks.

Does the following look OK?  (I've committed it to branch
dmalcolm/jit)

gcc/jit/ChangeLog.jit:
* jit-recording.c (gcc::jit::dump::dump): Handle fopen failures
by emitting an error on the context.
(gcc::jit::dump::~dump): Likewise for fclose failures.
(gcc::jit::dump::write): Don't attempt further work if the fopen
failed.  Handle fwrite failures by emitting an error on the
context.
---
 gcc/jit/ChangeLog.jit   |  9 +
 gcc/jit/jit-recording.c | 25 ++---
 2 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index 9a36dfd..02664f0 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,5 +1,14 @@
 2014-10-20  David Malcolm  
 
+   * jit-recording.c (gcc::jit::dump::dump): Handle fopen failures
+   by emitting an error on the context.
+   (gcc::jit::dump::~dump): Likewise for fclose failures.
+   (gcc::jit::dump::write): Don't attempt further work if the fopen
+   failed.  Handle fwrite failures by emitting an error on the
+   context.
+
+2014-10-20  David Malcolm  
+
* Make-lang.in (jit.install-common): Drop installation of
libgccjit.pc.
* config-lang.in (outputs): Drop jit/libgccjit.pc.
diff --git a/gcc/jit/jit-recording.c b/gcc/jit/jit-recording.c
index 32ce49b..5a97f23 100644
--- a/gcc/jit/jit-recording.c
+++ b/gcc/jit/jit-recording.c
@@ -47,13 +47,25 @@ dump::dump (recording::context &ctxt,
   m_line (0),
   m_column (0)
 {
-  m_file  = fopen (filename, "w");
+  m_file = fopen (filename, "w");
+  if (!m_file)
+ctxt.add_error (NULL,
+   "error opening dump file %s for writing: %s",
+   filename,
+   xstrerror (errno));
 }
 
 dump::~dump ()
 {
   if (m_file)
-fclose (m_file);
+{
+  int err = fclose (m_file);
+  if (err)
+   m_ctxt.add_error (NULL,
+ "error closing dump file %s: %s",
+ m_filename,
+ xstrerror (errno));
+}
 }
 
 /* Write the given message to the dump, using printf-formatting
@@ -67,6 +79,11 @@ dump::write (const char *fmt, ...)
   va_list ap;
   char *buf = NULL;
 
+  /* If there was an error opening the file, we've already reported it.
+ Don't attempt further work.  */
+  if (!m_file)
+return;
+
   va_start (ap, fmt);
   vasprintf (&buf, fmt, ap);
   va_end (ap);
@@ -78,7 +95,9 @@ dump::write (const char *fmt, ...)
   return;
 }
 
-  fwrite (buf, strlen (buf), 1, m_file);
+  if (fwrite (buf, strlen (buf), 1, m_file) != 1)
+m_ctxt.add_error (NULL, "error writing to dump file %s",
+ m_filename);
 
   /* Update line/column: */
   for (const char *ptr = buf; *ptr; ptr++)
-- 
1.7.11.7
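
For readers who want to try the pattern outside the jit sources, here is a minimal, self-contained sketch of the same idea: report fopen, fwrite and fclose failures to an error sink, and skip writes once the open has failed.  The names error_sink and dump_sketch are hypothetical stand-ins, not libgccjit classes, and std::strerror is used here in place of libiberty's xstrerror.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for recording::context; it only collects messages.
struct error_sink
{
  std::vector<std::string> errors;
  void add_error (const std::string &msg) { errors.push_back (msg); }
};

// Hypothetical, simplified analogue of gcc::jit::dump.
class dump_sketch
{
public:
  dump_sketch (error_sink &sink, const std::string &filename)
    : m_sink (sink), m_filename (filename), m_file (NULL)
  {
    m_file = std::fopen (m_filename.c_str (), "w");
    if (!m_file)
      {
        int saved_errno = errno;  // capture before any other call
        m_sink.add_error ("error opening dump file " + m_filename
                          + " for writing: " + std::strerror (saved_errno));
      }
  }

  ~dump_sketch ()
  {
    if (m_file && std::fclose (m_file) != 0)
      {
        int saved_errno = errno;
        m_sink.add_error ("error closing dump file " + m_filename
                          + ": " + std::strerror (saved_errno));
      }
  }

  // The object owns a FILE *, so copying would lead to a double fclose.
  dump_sketch (const dump_sketch &) = delete;
  dump_sketch &operator= (const dump_sketch &) = delete;

  bool is_open () const { return m_file != NULL; }

  void write (const char *text)
  {
    assert (text != NULL);
    // The open failure was already reported; do no further work.
    if (!m_file)
      return;
    std::size_t len = std::strlen (text);
    // fwrite returns 0 when asked to write a zero-sized item, so an empty
    // string must not be mistaken for an I/O error.
    if (len == 0)
      return;
    if (std::fwrite (text, len, 1, m_file) != 1)
      m_sink.add_error ("error writing to dump file " + m_filename);
  }

private:
  error_sink &m_sink;
  std::string m_filename;
  std::FILE *m_file;
};
```

One design point the sketch makes explicit: because fwrite is called with a single item of strlen bytes, an empty message would yield a return value of 0.  Whether the real dump code can ever be handed an empty message is not something this thread establishes, so the early return above is an assumption of the sketch rather than a claim about the patch.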

Re: [Google/gcc-4_9][PATCH][target/x86_64] PR 63538

2014-10-20 Thread Sriraman Tallam

On Mon, Oct 20, 2014 at 10:59 AM, Xinliang David Li  wrote:
> Perhaps explicitly allowing STRING_CST to go through the large data
> check, instead of removing the var-decl check? Do you see other
> opcodes that need to be handled too?

I do not see any other opcodes explicitly but the code in
ix86_in_large_data_p seemingly handles all opcodes other than
FUNCTION_DECL through:

if (TREE_CODE (exp) == VAR_DECL && DECL_SECTION_NAME (exp))
{

}
else {

  }


However, I have modified the patch to explicitly check for STRING_CST
and I cannot think of any other case where the constant goes into
rodata but is not accessed via a VAR_DECL. Also note that TREE_STATIC
(decl) is true for STRING_CST.

Thanks
Sri




>
> David
>
> On Mon, Oct 20, 2014 at 10:46 AM, Sriraman Tallam  wrote:
>> On Mon, Oct 20, 2014 at 10:42 AM, Xinliang David Li  
>> wrote:
>>> Why removing the tree_code check?
>>
>> The actual problem happens because STRING_CSTs (which end up in .lrodata)
>> are not given a far address, as they don't match the VAR_DECL check here.
>> Further, the "ix86_in_large_data_p" call has the TREE_CODE check to do the
>> right thing, so this seems unnecessary & buggy here.
>>
>> Thanks
>> Sri
>>
>>>
>>> David
>>>
>>> On Mon, Oct 20, 2014 at 10:35 AM, Sriraman Tallam  
>>> wrote:
>>>> Hi,
>>>>
>>>>    This patch is under review for trunk GCC :
>>>> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01638.html.
>>>>
>>>> In the mean time, is this ok for google/gcc-4_9 branch?  Without
>>>> this, -mcmodel=medium is unusable if .lrodata goes beyond the 2G
>>>> boundary.
>>>>
>>>> Thanks
>>>> Sri
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 216287)
+++ config/i386/i386.c  (working copy)
@@ -41331,7 +41331,7 @@ ix86_encode_section_info (tree decl, rtx rtl, int
 {
   default_encode_section_info (decl, rtl, first);
 
-  if (TREE_CODE (decl) == VAR_DECL
+  if ((TREE_CODE (decl) == VAR_DECL || TREE_CODE (decl) == STRING_CST)
   && (TREE_STATIC (decl) || DECL_EXTERNAL (decl))
   && ix86_in_large_data_p (decl))
 SYMBOL_REF_FLAGS (XEXP (rtl, 0)) |= SYMBOL_FLAG_FAR_ADDR;
Index: testsuite/gcc.dg/pr63538.c
===
--- testsuite/gcc.dg/pr63538.c  (revision 0)
+++ testsuite/gcc.dg/pr63538.c  (revision 0)
@@ -0,0 +1,14 @@
+/* PR63538 is about not using 64-bit addresses for .lrodata accesses when it
+   involves STRING_CSTs.  */
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options "-O2 -mcmodel=medium -mlarge-data-threshold=0" { target x86_64-*-* } } */
+
+#include <stdio.h>
+
+const char *str = "Hello World";
+
+int main() {
+ printf("str = %p %s\n",str, str);
+ return 0;
+}
+/* { dg-final { scan-assembler-not "movl" } } */

Re: [Google/gcc-4_9][PATCH][target/x86_64] PR 63538

2014-10-20 Thread Sriraman Tallam

On Mon, Oct 20, 2014 at 10:51 AM, Andrew Pinski  wrote:
> On Mon, Oct 20, 2014 at 10:46 AM, Sriraman Tallam  wrote:
>> On Mon, Oct 20, 2014 at 10:42 AM, Xinliang David Li  
>> wrote:
>>> Why removing the tree_code check?
>>
>> The actual problem happens because STRING_CSTs (which end up in .lrodata)
>> are not given a far address, as they don't match the VAR_DECL check here.
>> Further, the "ix86_in_large_data_p" call has the TREE_CODE check to do the
>> right thing, so this seems unnecessary & buggy here.
>
> I think he is asking because TREE_STATIC (decl) || DECL_EXTERNAL
> (decl) might be an issue for STRING_CSTs.

TREE_STATIC is true for STRING_CSTs and DECL_EXTERNAL is false, so that looks OK.

Thanks
Sri

>
> Thanks,
> Andrew
>
>
>>
>> Thanks
>> Sri
>>
>>>
>>> David
>>>
>>> On Mon, Oct 20, 2014 at 10:35 AM, Sriraman Tallam  
>>> wrote:
>>>> Hi,
>>>>
>>>>    This patch is under review for trunk GCC :
>>>> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01638.html.
>>>>
>>>> In the mean time, is this ok for google/gcc-4_9 branch?  Without
>>>> this, -mcmodel=medium is unusable if .lrodata goes beyond the 2G
>>>> boundary.
>>>>
>>>> Thanks
>>>> Sri

Re: [PATCH] Improve scheduler dumps of ready list

2014-10-20 Thread Sebastian Pop

Maxim Kuvyrkov wrote:
> Hi,
> 
> Following previous improvement to scheduler dumps that provided insight into 
> which heuristics in rank_for_schedule make most decisions, this patch adds 
> print outs that show the deciding reason for an instruction in the ready list 
> to be at its particular place.
> 
> This patch allowed me to troubleshoot several scheduling problems in the 
> register pressure scheduling.

LGTM.
Please wait for the approval from a maintainer before committing.

Thanks,
Sebastian

Re: [PATCH] Account for prologue spills in reg_pressure scheduling

2014-10-20 Thread Sebastian Pop

Maxim Kuvyrkov wrote:
> Hi,
> 
> This patch improves register pressure scheduling (both 
> SCHED_PRESSURE_WEIGHTED and SCHED_PRESSURE_MODEL) to better estimate number 
> of available registers.
> 
> At the moment the scheduler does not account for spills in the prologues and 
> restores in the epilogue, which occur from use of call-used registers.  The 
> current state is, essentially, optimized for case when there is a hot loop 
> inside the function, and the loop executes significantly more often than the 
> prologue/epilogue.  However, on the opposite end, we have a case when the 
> function is just a single non-cyclic basic block, which executes just as 
> often as prologue / epilogue, so spills in the prologue hurt performance as 
> much as spills in the basic block itself.  In such a case the scheduler 
> should throttle-down on the number of available registers and try to not go 
> beyond call-clobbered registers.
> 
> The patch uses basic block frequencies to balance the cost of using call-used 
> registers for intermediate cases between the two above extremes.
> 
> The motivation for this patch was a floating-point testcase on 
> arm-linux-gnueabihf (ARM is one of the few targets that use register pressure 
> scheduling by default).
> 

Does aarch64 enable reg pressure sched by default, or what is the flag to 
enable it?
I'm planning to look at the perf impact of the patch.

Thanks,
Sebastian

Go patch committed: Pass type information to heap allocations

2014-10-20 Thread Ian Taylor

This patch by Chris Manghane passes type information to
compiler-generated heap allocations in gccgo.  This gives us precise
type information for much of the gccgo heap, and means that garbage
collection is much more precise and less prone to errors due to
mistaking integer or float values as pointers.  Bootstrapped and ran
Go testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
diff -r f6e937cbbe5a go/expressions.cc
--- a/go/expressions.cc Mon Oct 20 11:04:12 2014 -0700
+++ b/go/expressions.cc Mon Oct 20 12:04:16 2014 -0700
@@ -12170,7 +12170,7 @@
   { return this->vals_ == NULL ? 0 : this->vals_->size(); }
 
 protected:
-  int
+  virtual int
   do_traverse(Traverse* traverse);
 
   bool
@@ -12495,11 +12495,33 @@
 : Array_construction_expression(EXPRESSION_SLICE_CONSTRUCTION,
type, indexes, vals, location),
   valtype_(NULL)
-  { go_assert(type->is_slice_type()); }
+  {
+go_assert(type->is_slice_type());
+
+mpz_t lenval;
+Expression* length;
+if (vals == NULL || vals->empty())
+  mpz_init_set_ui(lenval, 0);
+else
+  {
+   if (this->indexes() == NULL)
+ mpz_init_set_ui(lenval, vals->size());
+   else
+ mpz_init_set_ui(lenval, indexes->back() + 1);
+  }
+Type* int_type = Type::lookup_integer_type("int");
+length = Expression::make_integer(&lenval, int_type, location);
+mpz_clear(lenval);
+Type* element_type = type->array_type()->element_type();
+this->valtype_ = Type::make_array_type(element_type, length);
+  }
 
  protected:
   // Note that taking the address of a slice literal is invalid.
 
+  int
+  do_traverse(Traverse* traverse);
+
   Expression*
   do_copy()
   {
@@ -12518,6 +12540,19 @@
   Type* valtype_;
 };
 
+// Traversal.
+
+int
+Slice_construction_expression::do_traverse(Traverse* traverse)
+{
+  if (this->Array_construction_expression::do_traverse(traverse)
+  == TRAVERSE_EXIT)
+return TRAVERSE_EXIT;
+  if (Type::traverse(this->valtype_, traverse) == TRAVERSE_EXIT)
+return TRAVERSE_EXIT;
+  return TRAVERSE_CONTINUE;
+}
+
 // Return the backend representation for constructing a slice.
 
 Bexpression*
@@ -12532,24 +12567,7 @@
 
   Location loc = this->location();
   Type* element_type = array_type->element_type();
-  if (this->valtype_ == NULL)
-{
-  mpz_t lenval;
-  Expression* length;
-  if (this->vals() == NULL || this->vals()->empty())
-mpz_init_set_ui(lenval, 0);
-  else
-{
-  if (this->indexes() == NULL)
-mpz_init_set_ui(lenval, this->vals()->size());
-  else
-mpz_init_set_ui(lenval, this->indexes()->back() + 1);
-}
-  Type* int_type = Type::lookup_integer_type("int");
-  length = Expression::make_integer(&lenval, int_type, loc);
-  mpz_clear(lenval);
-  this->valtype_ = Type::make_array_type(element_type, length);
-}
+  go_assert(this->valtype_ != NULL);
 
   Expression_list* vals = this->vals();
   if (this->vals() == NULL || this->vals()->empty())
@@ -14028,7 +14046,7 @@
  protected:
   Type*
   do_type()
-  { return Type::make_pointer_type(Type::make_void_type()); }
+  { return Type::lookup_integer_type("uintptr"); }
 
   bool
   do_is_immutable() const
diff -r f6e937cbbe5a go/go.cc
--- a/go/go.cc  Mon Oct 20 11:04:12 2014 -0700
+++ b/go/go.cc  Mon Oct 20 12:04:16 2014 -0700
@@ -96,9 +96,6 @@
   // Create function descriptors as needed.
   ::gogo->create_function_descriptors();
 
-  // Write out queued up functions for hash and comparison of types.
-  ::gogo->write_specific_type_functions();
-
   // Now that we have seen all the names, verify that types are
   // correct.
   ::gogo->verify_types();
@@ -130,6 +127,9 @@
   // Convert complicated go and defer statements into simpler ones.
   ::gogo->simplify_thunk_statements();
 
+  // Write out queued up functions for hash and comparison of types.
+  ::gogo->write_specific_type_functions();
+
   // Flatten the parse tree.
   ::gogo->flatten();
 
diff -r f6e937cbbe5a go/gogo.cc
--- a/go/gogo.cc  Mon Oct 20 11:04:12 2014 -0700
+++ b/go/gogo.cc  Mon Oct 20 12:04:16 2014 -0700
@@ -4196,21 +4196,18 @@
 Expression*
 Gogo::allocate_memory(Type* type, Location location)
 {
-  Btype* btype = type->get_backend(this);
-  size_t size = this->backend()->type_size(btype);
-  mpz_t size_val;
-  mpz_init_set_ui(size_val, size);
-  Type* uintptr = Type::lookup_integer_type("uintptr");
-  Expression* size_expr =
-Expression::make_integer(&size_val, uintptr, location);
-
-  // If the package imports unsafe, then it may play games with
-  // pointers that look like integers.
+  Expression* td = Expression::make_type_descriptor(type, location);
+  Expression* size =
+Expression::make_type_info(type, Expression::TYPE_INFO_SIZE);
+
+  // If this package imports unsafe, then it may play games with
+  // pointers that look like integers.  We should be able to determine
+  // whether or not to use

Re: [PATCH] Account for prologue spills in reg_pressure scheduling

2014-10-20 Thread Maxim Kuvyrkov

On Oct 21, 2014, at 8:11 AM, Sebastian Pop  wrote:

> Maxim Kuvyrkov wrote:
>> Hi,
>> 
>> This patch improves register pressure scheduling (both 
>> SCHED_PRESSURE_WEIGHTED and SCHED_PRESSURE_MODEL) to better estimate number 
>> of available registers.
>> 
>> At the moment the scheduler does not account for spills in the prologues and 
>> restores in the epilogue, which occur from use of call-used registers.  The 
>> current state is, essentially, optimized for case when there is a hot loop 
>> inside the function, and the loop executes significantly more often than the 
>> prologue/epilogue. However, on the opposite end, we have a case when the 
>> function is just a single non-cyclic basic block, which executes just as 
>> often as prologue / epilogue, so spills in the prologue hurt performance as 
>> much as spills in the basic block itself.  In such a case the scheduler 
>> should throttle-down on the number of available registers and try to not go 
>> beyond call-clobbered registers.
>> 
>> The patch uses basic block frequencies to balance the cost of using 
>> call-used registers for intermediate cases between the two above extremes.
>> 
>> The motivation for this patch was a floating-point testcase on 
>> arm-linux-gnueabihf (ARM is one of the few targets that use register 
>> pressure scheduling by default).
>> 
> 
> Does aarch64 enable reg pressure sched by default, or what is the flag to 
> enable it?
> I'm planning to look at the perf impact of the patch.

Thanks, benchmarking results are welcome!  AArch64 doesn't use reg_pressure 
scheduling by default.  Use "-fsched-pressure 
--param=sched-pressure-algorithm=2" to enable the same thing as on ARM.  I would 
imagine C++ and Fortran floating-point code to be most affected.

--
Maxim Kuvyrkov
www.linaro.org
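
As a reading aid for the description quoted above, the following toy function illustrates what "balancing by basic block frequency" could look like.  It is emphatically not the formula used by the patch, which is not shown in this thread; the function name, the clamping to 1, and the idea of expressing the result as a weight are all assumptions made for illustration.

```cpp
#include <cassert>

// Illustration only: NOT the formula from the patch.
//
// Returns a weight in (0, 1] describing how much a save/restore pair in the
// prologue/epilogue "hurts" relative to a spill inside the block being
// scheduled.  entry_freq models how often the function is entered and
// block_freq how often the block executes; both names are hypothetical.
double prologue_cost_weight_sketch (long entry_freq, long block_freq)
{
  assert (entry_freq > 0 && block_freq > 0);
  double w = static_cast<double> (entry_freq)
             / static_cast<double> (block_freq);
  // A block cannot make prologue spills cost more than one spill per
  // execution in this toy model, so clamp at 1.
  return w < 1.0 ? w : 1.0;
}
```

Under this model a single straight-line block (entry_freq equal to block_freq) gets weight 1, matching the "spills in the prologue hurt as much as spills in the block" extreme, while a hot loop body that runs many times per call gets a weight close to 0, matching the other extreme.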

[jit] Add Sphinx to install.texi

2014-10-20 Thread David Malcolm

On Fri, 2014-10-17 at 21:25 +, Joseph S. Myers wrote:
> Although Sphinx isn't a build dependency, as a dependency for 
> regenerating checked-in files I think it should be documented in 
> install.texi (like autoconf, gettext, etc.).

Does this look OK?  (Committed to branch dmalcolm/jit for now)

gcc/ChangeLog.jit:
* doc/install.texi (Tools/packages necessary for modifying GCC):
Add Sphinx.
---
 gcc/ChangeLog.jit| 5 +
 gcc/doc/install.texi | 5 +
 2 files changed, 10 insertions(+)

diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit
index 8dec312..bcd72a4 100644
--- a/gcc/ChangeLog.jit
+++ b/gcc/ChangeLog.jit
@@ -1,5 +1,10 @@
 2014-10-20  David Malcolm  
 
+   * doc/install.texi (Tools/packages necessary for modifying GCC):
+   Add Sphinx.
+
+2014-10-20  David Malcolm  
+
* Makefile.in (pkgconfigdir): Drop this.
(installdirs): Likewise.
* configure.ac (gcc_version): Don't AC_SUBST this.
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index c92de28..b4027ea 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -491,6 +491,11 @@ Necessary for running @command{texi2dvi} and @command{texi2pdf}, which
 are used when running @command{make dvi} or @command{make pdf} to create
 DVI or PDF files, respectively.
 
+@item Sphinx (any working version)
+
+Necessary to regenerate @file{jit/docs/_build/texinfo} from the .rst
+files in the directories below @file{jit/docs}.
+
 @item SVN (any version)
 @itemx SSH (any version)
 
-- 
1.7.11.7

Re: [Google/gcc-4_9][PATCH][target/x86_64] PR 63538

2014-10-20 Thread Xinliang David Li

On Mon, Oct 20, 2014 at 11:59 AM, Sriraman Tallam  wrote:
> On Mon, Oct 20, 2014 at 10:51 AM, Andrew Pinski  wrote:
>> On Mon, Oct 20, 2014 at 10:46 AM, Sriraman Tallam  
>> wrote:
>>> On Mon, Oct 20, 2014 at 10:42 AM, Xinliang David Li  
>>> wrote:
>>>> Why removing the tree_code check?
>>>
>>> The actual problem happens because STRING_CSTs (which end up in .lrodata)
>>> are not given a far address, as they don't match the VAR_DECL check here.
>>> Further, the "ix86_in_large_data_p" call has the TREE_CODE check to do the
>>> right thing, so this seems unnecessary & buggy here.
>>
>> I think he is asking because TREE_STATIC (decl) || DECL_EXTERNAL
>> (decl) might be an issue for STRING_CSTs.
>
> TREE_STATIC is true for STRING_CSTs and DECL_EXTERNAL is false, so that looks OK.

The values for STRING_CST make sense, but it is not documented in
tree.h for use with STRING_CST. Maybe do this:

 if (((TREE_CODE (decl) == VAR_DECL
       && (TREE_STATIC (decl) || DECL_EXTERNAL (decl)))
      || TREE_CODE (decl) == STRING_CST)
     && ix86_in_large_data_p (decl))

which can be simplified to:

 if (((TREE_CODE (decl) == VAR_DECL && is_global_var (decl))
      || TREE_CODE (decl) == STRING_CST)
     && ix86_in_large_data_p (decl))
 ...

David
>
> Thanks
> Sri
>
>>
>> Thanks,
>> Andrew
>>
>>
>>>
>>> Thanks
>>> Sri
>>>

>>>> David
>>>>
>>>> On Mon, Oct 20, 2014 at 10:35 AM, Sriraman Tallam  
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>>    This patch is under review for trunk GCC :
>>>>> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01638.html.
>>>>>
>>>>> In the mean time, is this ok for google/gcc-4_9 branch?  Without
>>>>> this, -mcmodel=medium is unusable if .lrodata goes beyond the 2G
>>>>> boundary.
>>>>>
>>>>> Thanks
>>>>> Sri
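
The condition David proposes in the reply above can be modelled without any GCC headers, which makes it easy to see which declarations change behaviour.  This is only a sketch: decl_sketch and the *_SK codes are hypothetical stand-ins for GCC trees, in_large_data stands for the result of ix86_in_large_data_p, and is_global_var_sketch assumes that is_global_var means "TREE_STATIC or DECL_EXTERNAL".

```cpp
#include <cassert>

// Hypothetical stand-ins for the relevant tree codes.
enum code_sketch { VAR_DECL_SK, STRING_CST_SK, FUNCTION_DECL_SK };

struct decl_sketch
{
  code_sketch code;
  bool is_static;      // models TREE_STATIC
  bool is_external;    // models DECL_EXTERNAL
  bool in_large_data;  // models the result of ix86_in_large_data_p
};

// Assumed meaning of is_global_var: static or external storage.
static bool is_global_var_sketch (const decl_sketch &d)
{
  return d.is_static || d.is_external;
}

// The check before the fix: only VAR_DECLs can get the far-address flag.
bool far_addr_before (const decl_sketch &d)
{
  return d.code == VAR_DECL_SK
         && (d.is_static || d.is_external)
         && d.in_large_data;
}

// The form suggested in the thread: the storage test applies to VAR_DECLs
// only, and STRING_CSTs are admitted purely on their tree code.
bool far_addr_suggested (const decl_sketch &d)
{
  return ((d.code == VAR_DECL_SK && is_global_var_sketch (d))
          || d.code == STRING_CST_SK)
         && d.in_large_data;
}

// Small self-check showing the one case where the two forms differ.
void far_addr_demo ()
{
  decl_sketch str = { STRING_CST_SK, true, false, true };
  assert (!far_addr_before (str));
  assert (far_addr_suggested (str));
}
```

The suggested form does not consult the static/external flags for string constants at all, which sidesteps the question raised in the thread of whether those flags are documented for STRING_CST.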

Re: [PATCH PR63530] Fix the pointer alignment in vectorization

2014-10-20 Thread Carrot Wei

Hi Richard

An arm testcase that can reproduce this bug is attached.

2014-10-20  Guozhi Wei  

PR tree-optimization/63530
* gcc.target/arm/pr63530.c: New testcase.

Index: pr63530.c
===
--- pr63530.c (revision 0)
+++ pr63530.c (revision 0)
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon } */
+/* { dg-options "-march=armv7-a -mfloat-abi=hard -mfpu=neon -marm -O2 -ftree-vectorize -funroll-loops --param \"max-completely-peeled-insns=400\"" } */
+
+typedef struct {
+  unsigned char map[256];
+  int i;
+} A, *AP;
+
+void* calloc(int, int);
+
+AP foo (int n)
+{
+  AP b = (AP)calloc (1, sizeof (A));
+  int i;
+  for (i = n; i < 256; i++)
+b->map[i] = i;
+  return b;
+}
+
+/* { dg-final { scan-assembler-not "vst1.64" } } */

On Mon, Oct 20, 2014 at 1:19 AM, Richard Biener
 wrote:
> On Fri, Oct 17, 2014 at 7:58 PM, Carrot Wei  wrote:
>
> I miss a testcase.  I also miss a comment before this code explaining
> why DR_MISALIGNMENT if not -1 is valid and why it is not valid if

DR_MISALIGNMENT (dr) == -1 means some unknown misalignment, otherwise
it means some known misalignment.
See the usage in file tree-vect-stmts.c.

> 'offset' is supplied (what about 'byte_offset' btw?).  Also if peeling

This is to be conservative: the patch doesn't change the logic when an offset
is supplied.  I've checked that most of the offsets passed in are caused by a
negative step, whose impact on DR_MISALIGNMENT should already have been
accounted for in function vect_update_misalignment_for_peel, but the comments
on vect_create_addr_base_for_vector_ref do not guarantee this usage of
offset.

The usage of byte_offset is quite broken: many direct or indirect
callers don't provide the parameter.  So only the author can comment
on this.

> for alignment aligned this ref (misalign == 0) you don't set the alignment.
>
I assume that if no misalignment is specified, the natural alignment of the
vector type is used, and that this caused the wrong code in our case.  Is that
right?

> Thus you may fix a bug (not sure without a testcase) but the new code
> certainly doesn't look 100% correct.
>
> That said, I would have expected that we can unconditionally do
>
>  set_ptr_info_alignment (..., align, misalign)
>
> if misalign is != -1 and if we adjust misalign by offset * step + byte_offset
> (usually both are constants).
>
> Also we can still trust the alignment copied from addr_base modulo
> vector element size even if DR_MISALIGN is -1.  This may matter
> for targets that require element-alignment for vector accesses.
>

Re: [jit] Drop libgccjit.pc

2014-10-20 Thread Basile Starynkevitch

On Mon, 2014-10-20 at 13:54 -0400, David Malcolm wrote:
> Committed to branch dmalcolm/jit:
> 
> pkg-config appears to be controversial, so don't provide a .pc file.


I would put it under contrib/; it is controversial, but some would like
to have it.

Cheers.

-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***

[gomp4] c++ delete clause

2014-10-20 Thread Cesar Philippidis

The OpenACC delete clause isn't detected in the c++ front end because
the lexer classifies it as a keyword, which it is. This patch makes the
openacc pragma parser aware of that.

I've committed this patch to gomp-4_0-branch. A test case will be
included in a follow up patch along with support for the acc enter/exit
data directive.

Cesar
2014-10-20  Cesar Philippidis  

	gcc/cp/
	* parser.c (cp_parser_omp_clause_name): Also consider CPP_KEYWORD
	typed tokens as clauses for delete.


diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 8fd470a..19cbf37 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -27321,7 +27321,9 @@ cp_parser_omp_clause_name (cp_parser *parser)
 result = PRAGMA_OMP_CLAUSE_PRIVATE;
   else if (cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR))
 result = PRAGMA_OMP_CLAUSE_FOR;
-  else if (cp_lexer_next_token_is (parser->lexer, CPP_NAME))
+  /* The lexer classifies "delete" as a keyword.  */
+  else if (cp_lexer_next_token_is (parser->lexer, CPP_NAME)
+	   || cp_lexer_next_token_is (parser->lexer, CPP_KEYWORD))
 {
   tree id = cp_lexer_peek_token (parser->lexer)->u.value;
   const char *p = IDENTIFIER_POINTER (id);

[gomp4] acc update bug

2014-10-20 Thread Cesar Philippidis

The OpenACC update directive would cause an ICE if there was an error
parsing one of its clauses in the c front end. E.g. #pragma acc update
copy(a(1:10)). This patch fixes that. It also declares GOACC_update
in libgomp_g.h.

I've committed this patch to gomp-4_0-branch. A test case will be
provided later.

Cesar
2014-10-20  Cesar Philippidis  

	gcc/c/
	* c-parser.c (c_parser_oacc_update): Don't create a new stmt
	if the pragma is bogus.

	libgomp/
	* libgomp_g.h (GOACC_update): Declare.


diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 17085bf..d1956b8 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -12071,6 +12071,9 @@ c_parser_oacc_update (c_parser *parser)
   return;
 }
 
+  if (parser->error)
+return;
+
   tree stmt = make_node (OACC_UPDATE);
   TREE_TYPE (stmt) = void_type_node;
   OACC_UPDATE_CLAUSES (stmt) = clauses;
diff --git a/libgomp/libgomp_g.h b/libgomp/libgomp_g.h
index 44f200c..35b0627 100644
--- a/libgomp/libgomp_g.h
+++ b/libgomp/libgomp_g.h
@@ -225,6 +225,10 @@ extern void GOACC_kernels (int, void (*) (void *), const void *,
 extern void GOACC_parallel (int, void (*) (void *), const void *,
 			size_t, void **, size_t *, unsigned short *,
 			int, int, int, int, int, ...);
+extern void GOACC_update (int device, const void *openmp_target, size_t mapnum,
+			  void **hostaddrs, size_t *sizes,
+			  unsigned short *kinds, int async,
+			  int num_waits, ...);
 extern void GOACC_wait (int, int, ...);
 
 #endif /* LIBGOMP_G_H */

Re: [gomp4] c++ delete clause

2014-10-20 Thread Jakub Jelinek

On Mon, Oct 20, 2014 at 01:12:08PM -0700, Cesar Philippidis wrote:
> The OpenACC delete clause isn't detected in the c++ front end because
> the lexer classifies it as a keyword, which it is. This patch makes the
> openacc pragma parser aware of that.
> 
> I've committed this patch to gomp-4_0-branch. A test case will be
> included in a follow up patch along with support for the acc enter/exit
> data directive.
> 
> Cesar

> 2014-10-20  Cesar Philippidis  
> 
>   gcc/cp/
>   * parser.c (cp_parser_omp_clause_name): Also consider CPP_KEYWORD
>   typed tokens as clauses for delete.
> 
> 
> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> index 8fd470a..19cbf37 100644
> --- a/gcc/cp/parser.c
> +++ b/gcc/cp/parser.c
> @@ -27321,7 +27321,9 @@ cp_parser_omp_clause_name (cp_parser *parser)
>  result = PRAGMA_OMP_CLAUSE_PRIVATE;
>else if (cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR))
>  result = PRAGMA_OMP_CLAUSE_FOR;
> -  else if (cp_lexer_next_token_is (parser->lexer, CPP_NAME))
> +  /* The lexer classifies "delete" as a keyword.  */
> +  else if (cp_lexer_next_token_is (parser->lexer, CPP_NAME)
> +|| cp_lexer_next_token_is (parser->lexer, CPP_KEYWORD))
>  {
>tree id = cp_lexer_peek_token (parser->lexer)->u.value;
>const char *p = IDENTIFIER_POINTER (id);

See how the private and for clauses are handled earlier; you should
not need to parse an identifier to handle RID_DELETE as
PRAGMA_OACC_CLAUSE_DELETE.

Jakub
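
The shape of Jakub's suggestion, dispatching on the keyword before falling back to identifier spelling, can be shown with a toy token model.  Everything below is hypothetical (token_sketch, the KW_* and CLAUSE_* names); it does not use the real cp_lexer API and only mirrors the structure visible in the quoted hunk.

```cpp
#include <cassert>
#include <cstring>

enum token_type_sketch { TOK_NAME, TOK_KEYWORD, TOK_OTHER };
enum keyword_sketch { KW_NONE, KW_PRIVATE, KW_FOR, KW_DELETE };
enum clause_sketch
{
  CLAUSE_NONE, CLAUSE_PRIVATE, CLAUSE_FOR, CLAUSE_DELETE, CLAUSE_COPY
};

struct token_sketch
{
  token_type_sketch type;
  keyword_sketch keyword;  // meaningful only for TOK_KEYWORD
  const char *spelling;    // meaningful only for TOK_NAME
};

clause_sketch clause_name_sketch (const token_sketch &tok)
{
  // Clause names that the lexer reports as C++ keywords are matched by
  // keyword id, so their spelling never has to be inspected.
  if (tok.type == TOK_KEYWORD)
    switch (tok.keyword)
      {
      case KW_PRIVATE: return CLAUSE_PRIVATE;
      case KW_FOR:     return CLAUSE_FOR;
      case KW_DELETE:  return CLAUSE_DELETE;
      default:         return CLAUSE_NONE;
      }

  // Ordinary identifiers are matched by spelling, as before.
  if (tok.type == TOK_NAME)
    {
      assert (tok.spelling != NULL);
      if (std::strcmp (tok.spelling, "copy") == 0)
        return CLAUSE_COPY;
    }
  return CLAUSE_NONE;
}
```

Compared with widening the identifier branch to accept every keyword token, this keeps unrelated keywords from ever reaching the spelling-based comparison.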

[Patch, Fortran] Add CO_REDUCE

2014-10-20 Thread Tobias Burnus

This patch adds some more checks and the actual implementation (compiler 
side and libcaf_single) for CO_REDUCE. It also rejects coindexed 
variables elsewhere, in line with the recent J3 changes. I also updated 
the API documentation (adding doc for collectives.)


Still unsupported as elsewhere are allocatable components and 
finalization – the latter also still has to be addressed in the standard 
itself.


Built and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias
2014-10-20  Tobias Burnus  

gcc/fortran
	* check.c (check_co_collective): Reject coindexed A args.
	(gfc_check_co_reduce): Add OPERATOR checks.
	* gfortran.texi (_gfortran_caf_co_broadcast, _gfortran_caf_co_max,
	_gfortran_caf_co_min, _gfortran_caf_co_sum,
	_gfortran_caf_co_reduce): Add ABI documentation.
	* intrinsic.texi (CO_REDUCE): Document intrinsic.
	(DPROD): Returns double not single precision.
	* trans-decl.c (gfor_fndecl_co_reduce): New global var.
	(gfc_build_builtin_function_decls): Init it.
	* trans.h (gfor_fndecl_co_reduce): Declare it.
	* trans-intrinsic.c (conv_co_collective,
	gfc_conv_intrinsic_subroutine): Handle CO_REDUCE.

gcc/testsuite/
	* gfortran.dg/coarray_collectives_9.f90: Remove dg-error.
	* gfortran.dg/coarray_collectives_13.f90: New.
	* gfortran.dg/coarray_collectives_14.f90: New.
	* gfortran.dg/coarray_collectives_15.f90: New.
	* gfortran.dg/coarray_collectives_16.f90: New.

diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index 0a08c73..6f1fe3f 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -1433,6 +1433,13 @@ check_co_collective (gfc_expr *a, gfc_expr *image_idx, gfc_expr *stat,
   return false;
 }
 
+  if (gfc_is_coindexed (a))
+{
+  gfc_error ("The A argument at %L to the intrinsic %s shall not be "
+		 "coindexed", &a->where, gfc_current_intrinsic);
+  return false;
+}
+
   if (image_idx != NULL)
 {
   if (!type_check (image_idx, co_reduce ? 2 : 1, BT_INTEGER))
@@ -1490,10 +1497,10 @@ gfc_check_co_broadcast (gfc_expr *a, gfc_expr *source_image, gfc_expr *stat,
 {
   if (a->ts.type == BT_CLASS || gfc_expr_attr (a).alloc_comp)
 {
-   gfc_error ("Support for the A argument at %L which is polymorphic A "
-  "argument or has allocatable components is not yet "
-		  "implemented", &a->where);
-   return false;
+  gfc_error ("Support for the A argument at %L which is polymorphic A "
+		 "argument or has allocatable components is not yet "
+		 "implemented", &a->where);
+  return false;
 }
   return check_co_collective (a, source_image, stat, errmsg, false);
 }
@@ -1504,38 +1511,164 @@ gfc_check_co_reduce (gfc_expr *a, gfc_expr *op, gfc_expr *result_image,
 		 gfc_expr *stat, gfc_expr *errmsg)
 {
   symbol_attribute attr;
+  gfc_formal_arglist *formal;
+  gfc_symbol *sym;
 
   if (a->ts.type == BT_CLASS)
 {
-   gfc_error ("The A argument at %L of CO_REDUCE shall not be polymorphic",
-		  &a->where);
-   return false;
+  gfc_error ("The A argument at %L of CO_REDUCE shall not be polymorphic",
+		 &a->where);
+  return false;
 }
 
   if (gfc_expr_attr (a).alloc_comp)
 {
-   gfc_error ("Support for the A argument at %L with allocatable components"
-  " is not yet implemented", &a->where);
-   return false;
+  gfc_error ("Support for the A argument at %L with allocatable components"
+ " is not yet implemented", &a->where);
+  return false;
 }
 
+  if (!check_co_collective (a, result_image, stat, errmsg, true))
+return false;
+
+  if (!gfc_resolve_expr (op))
+return false;
+
   attr = gfc_expr_attr (op);
   if (!attr.pure || !attr.function)
 {
-   gfc_error ("OPERATOR argument at %L must be a PURE function",
-		  &op->where);
-   return false;
+  gfc_error ("OPERATOR argument at %L must be a PURE function",
+		 &op->where);
+  return false;
 }
 
-  if (!check_co_collective (a, result_image, stat, errmsg, true))
-return false;
+  if (attr.intrinsic)
+{
+  /* None of the intrinsics fulfills the criteria of taking two arguments,
+	 returning the same type and kind as the arguments and being permitted
+	 as actual argument.  */
+  gfc_error ("Intrinsic function %s at %L is not permitted for CO_REDUCE",
+		 op->symtree->n.sym->name, &op->where);
+  return false;
+}
 
-  /* FIXME: After J3/WG5 has decided what they actually exactly want, more
- checks such as same-argument checks have to be added, implemented and
- intrinsic.texi upated.  */
+  if (gfc_is_proc_ptr_comp (op))
+{
+  gfc_component *comp = gfc_get_proc_ptr_comp (op);
+  sym = comp->ts.interface;
+}
+  else
+sym = op->symtree->n.sym;
 
-  gfc_error("CO_REDUCE at %L is not yet implemented", &a->where);
-  return false;
+  formal = sym->formal;
+
+  if (!formal || !formal->next || formal->next->next)
+{
+  gfc_error ("The function passed as OPERATOR at %L shall have two "
+		 "arguments", &op->w
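
The patch text is cut off above. The last visible test, `!formal || !formal->next || formal->next->next`, rejects an OPERATOR whose formal argument list does not have exactly two entries. The following is a minimal standalone sketch of that list-length idiom, not gfortran code: `struct formal_arg`, `has_exactly_two_args` and `check_arity_examples` are hypothetical names standing in for `gfc_formal_arglist` and the check in `gfc_check_co_reduce`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for gfc_formal_arglist; only the link is modelled.  */
struct formal_arg
{
  struct formal_arg *next;
};

/* Return true iff the list holds exactly two nodes.  This is the negation
   of the rejection test "!formal || !formal->next || formal->next->next".  */
static bool
has_exactly_two_args (const struct formal_arg *formal)
{
  return formal != NULL && formal->next != NULL && formal->next->next == NULL;
}

/* Exercise the predicate on lists of length 0 to 3.  */
static void
check_arity_examples (void)
{
  struct formal_arg c = { NULL };
  struct formal_arg b = { &c };
  struct formal_arg a = { &b };

  assert (!has_exactly_two_args (NULL)); /* no arguments */
  assert (!has_exactly_two_args (&c));   /* one argument */
  assert (has_exactly_two_args (&b));    /* two arguments: b -> c */
  assert (!has_exactly_two_args (&a));   /* three arguments: a -> b -> c */
}
```

Testing the first two links and then requiring the third to be absent avoids walking the whole list, which is presumably why the patch spells the condition out rather than counting.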

Re: [gomp4] c++ delete clause

2014-10-20 Thread Cesar Philippidis

On 10/20/2014 01:18 PM, Jakub Jelinek wrote:
> On Mon, Oct 20, 2014 at 01:12:08PM -0700, Cesar Philippidis wrote:
>> The OpenACC delete clause isn't detected in the c++ front end because
>> the lexer classifies it as a keyword, which it is. This patch makes the
>> openacc pragma parser aware of that.
>>
>> I've committed this patch to gomp-4_0-branch. A test case will be
>> included in a follow up patch along with support for the acc enter/exit
>> data directive.
>>
>> Cesar
> 
>> 2014-10-20  Cesar Philippidis  
>>
>>  gcc/cp/
>>  * parser.c (cp_parser_omp_clause_name): Also consider CPP_KEYWORD
>>  typed tokens as clauses for delete.
>>
>>
>> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
>> index 8fd470a..19cbf37 100644
>> --- a/gcc/cp/parser.c
>> +++ b/gcc/cp/parser.c
>> @@ -27321,7 +27321,9 @@ cp_parser_omp_clause_name (cp_parser *parser)
>>  result = PRAGMA_OMP_CLAUSE_PRIVATE;
>>else if (cp_lexer_next_token_is_keyword (parser->lexer, RID_FOR))
>>  result = PRAGMA_OMP_CLAUSE_FOR;
>> -  else if (cp_lexer_next_token_is (parser->lexer, CPP_NAME))
>> +  /* The lexer classifies "delete" as a keyword.  */
>> +  else if (cp_lexer_next_token_is (parser->lexer, CPP_NAME)
>> +   || cp_lexer_next_token_is (parser->lexer, CPP_KEYWORD))
>>  {
>>tree id = cp_lexer_peek_token (parser->lexer)->u.value;
>>const char *p = IDENTIFIER_POINTER (id);
> 
> See how private or for clauses are handled earlier, you should
> not need to parse identifier to handle RID_DELETE as
> PRAGMA_OACC_CLAUSE_DELETE.

I forgot about private being a keyword in c++. Thanks for the pointer!
I'll fix the handling of delete accordingly.

Thanks,
Cesar
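
Jakub's point is that keyword tokens should be dispatched on their keyword id, as is already done for private and for, rather than widening the identifier path to accept every CPP_KEYWORD token. The following is a minimal standalone sketch of that dispatch order, not the actual parser code: the token model and every name in it (`struct token`, `clause_name`, the enums, `check_clause_examples`) are hypothetical. In the real parser the fix would presumably be one more `cp_lexer_next_token_is_keyword` branch testing RID_DELETE, but that is an assumption, since the follow-up patch is not shown here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the lexer's token model.  */
enum tok_type { TOK_NAME, TOK_KEYWORD };
enum keyword { KW_NONE, KW_PRIVATE, KW_FOR, KW_DELETE };
enum clause
{
  CLAUSE_NONE, CLAUSE_PRIVATE, CLAUSE_FOR, CLAUSE_DELETE, CLAUSE_COPY
};

struct token
{
  enum tok_type type;
  enum keyword kw;      /* Meaningful only when type == TOK_KEYWORD.  */
  const char *spelling;
};

/* Map the next token to a clause kind.  */
static enum clause
clause_name (const struct token *t)
{
  /* Keywords are matched by keyword id; their spelling is never parsed.  */
  if (t->type == TOK_KEYWORD)
    switch (t->kw)
      {
      case KW_PRIVATE: return CLAUSE_PRIVATE;
      case KW_FOR:     return CLAUSE_FOR;
      case KW_DELETE:  return CLAUSE_DELETE;
      default:         return CLAUSE_NONE;
      }

  /* Only ordinary identifiers reach the spelling comparison.  */
  if (strcmp (t->spelling, "copy") == 0)
    return CLAUSE_COPY;
  return CLAUSE_NONE;
}

/* Exercise both paths.  */
static void
check_clause_examples (void)
{
  struct token del = { TOK_KEYWORD, KW_DELETE, "delete" };
  struct token copy = { TOK_NAME, KW_NONE, "copy" };

  assert (clause_name (&del) == CLAUSE_DELETE);
  assert (clause_name (&copy) == CLAUSE_COPY);
}
```

With this ordering an unrelated keyword never enters the identifier path, so the spelling-based lookup only ever sees true identifiers.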

Re: [jit] Drop libgccjit.pc

2014-10-20 Thread Matthias Klose

On 20.10.2014 at 22:11, Basile Starynkevitch wrote:
> On Mon, 2014-10-20 at 13:54 -0400, David Malcolm wrote:
>> Committed to branch dmalcolm/jit:
>>
>> pkg-config appears to be controversial, so don't provide a .pc file.
> 
> 
> I would put it under contrib/; it is controversial, but some would like
> to have it.

Please don't. If it's not always installed, then you have to check for a
fallback in any case.

Matthias

[gomp4] acc dealloc map

2014-10-20 Thread Cesar Philippidis

All of the various OpenACC memory maps are now fully supported in GCC.
This patch removes an obsolete sorry message complaining about
DEALLOCATE maps not being implemented.

I've committed this to gomp-4_0-branch.

Cesar
2014-10-20  Cesar Philippidis  

	gcc/
	* gimplify.c (gimplify_scan_omp_clauses): Remove switch stmt which
	declared OMP_CLAUSE_MAP_FORCE_DEALLOC as unimplemented.
	(gimplify_expr): Remove OACC_WAIT, since it is handled directly by the
	front ends.


diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 5a8904f..448673e 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6052,17 +6052,6 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 	  goto do_add;
 
 	case OMP_CLAUSE_MAP:
-	  switch (OMP_CLAUSE_MAP_KIND (c))
-	{
-	case OMP_CLAUSE_MAP_FORCE_DEALLOC:
-	  input_location = OMP_CLAUSE_LOCATION (c);
-	  /* TODO.  */
-	  sorry ("data clause not yet implemented");
-	  remove = true;
-	  break;
-	default:
-	  break;
-	}
 	  decl = OMP_CLAUSE_DECL (c);
 	  if (error_operand_p (decl))
 	{
@@ -8307,7 +8296,6 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	case OACC_DECLARE:
 	case OACC_ENTER_DATA:
 	case OACC_EXIT_DATA:
-	case OACC_WAIT:
 	case OACC_CACHE:
 	  sorry ("directive not yet implemented");
 	  ret = GS_ALL_DONE;

Re: [jit] Drop libgccjit.pc

2014-10-20 Thread David Malcolm

On Mon, 2014-10-20 at 22:11 +0200, Basile Starynkevitch wrote:
> On Mon, 2014-10-20 at 13:54 -0400, David Malcolm wrote:
> > Committed to branch dmalcolm/jit:
> > 
> > pkg-config appears to be controversial, so don't provide a .pc file.
> 
> 
> I would put it under contrib/; it is controversial, but some would like
> to have it.

It was generated, via some Makefile.in and configure.ac hooks, so I
don't think putting it under contrib/ allows us to sidestep that.

(FWIW, I'm one of the ones who'd like to have a .pc file, but I'd rather
focus on getting the jit in before stage1 closes)

Thanks
Dave
