date:20111018

Re: Intrinsics for N2965: Type traits and base classes

2011-10-18 Thread Eric Botcazou

> This patch consists intrinsics to properly create the bases and
> direct_bases of a class in the correct order (including multiple nested
> ambiguous virtual and non-virtual classes) for N2965
> (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2965.html).
> This allows you to create type traits for giving the base classes of the
> class:

Please post a ChangeLog entry with a patch.  Someone added one for you:

2011-10-17  Michael Spertus  

* gcc/c-family/c-common.c (c_common_reswords): Add __bases,
__direct_bases.
* gcc/c-family/c-common.h: Add RID_BASES and RID_DIRECT_BASES.

but it is in the wrong file (c-family has its own ChangeLog) and shouldn't 
contain the gcc/c-family prefix (paths are relative to the directory).

-- 
Eric Botcazou

Re: [C++ Patch] PR 50757

2011-10-18 Thread Eric Botcazou

> exactly like the recently fixed c++/17212. Tested x86_64-linux.

c-family has its own ChangeLog file, all changes must be documented there.

-- 
Eric Botcazou

Re: [patch] dwarf2out: Drop the size + performance overhead of DW_AT_sibling

2011-10-18 Thread Tristan Gingold

On Oct 17, 2011, at 3:16 PM, Tom Tromey wrote:

>> "Tristan" == Tristan Gingold  writes:
> 
> Tom> Another way to look at it is that there have been many changes to GCC's
> Tom> DWARF output in the last few years.  Surely these have broken these
> Tom> DWARF consumers more than this change possibly could.
> 
> Tristan> Yes, but there is -gstrict-dwarf to stay compatible with old 
> behavior.
> 
> Yes, but this change is strictly compliant.

Agreed.

>  What makes it different
> from any other change that was made to make GCC more compliant?
> Hypothetical consumers could also break on those changes.

The others changes were often corner cases, while this one is very visible.

What is wrong with my suggestion of adding a command line option to keep the 
siblings ?  This option could be removed in a few years if nobody complained 
about sibling removal.

Tristan.

Re: RFA: Improve tree-ssa-sink block selection

2011-10-18 Thread Paolo Bonzini


On 10/18/2011 07:10 AM, Jeff Law wrote:

--- 467,475 
 if (gimple_code (use) != GIMPLE_PHI)
{
  sinkbb = gimple_bb (use);
! sinkbb = select_best_block (frombb, gimple_bb (use), stmt);

! if (sinkbb == frombb)
return false;

  *togsi = gsi_for_stmt (use);


Useless assignment of sinkbb, otherwise looks fine.

By the way, is it intended that sink_code_in_bb visits again 
postdominators that were already visited (which with domwalk would come 
for free)?  As it is, the pass is quadratic when you have something like 
this:


  if (x1) y; else z;
  if (x2) y; else z;
  ...
  if (x) y; else z;
  if (x1) y; else z;

where the postdominator tree is

 x1
  |
 x2
  |
   x
  |
   x1 y1 z1 y2 z2 ... y1 z1
   \___|__|__|__|__/__,---'
  |
  EXIT_BLOCK

Paolo

Re: resent2 [PATCH] Fix ICE in redirect_jump, at jump.c:1497 PR50496

2011-10-18 Thread Eric Botcazou

> thread_prologue_and_epilogue_insns should detect all cases where a
> return insn can be created. So any CFG cleanup that runs before it does
> not need this functionality.

So we're left with CFG cleanups that run after it and could forward edges to an 
edge from a return insn to the exit block in order to build a new return insn.

-- 
Eric Botcazou

Re: [PATCH][RFC] "Fix" PR50716, override type alignment knowledge

2011-10-18 Thread Richard Guenther

On Mon, 17 Oct 2011, Eric Botcazou wrote:

> > 2011-10-17  Richard Guenther  
> >
> > PR middle-end/50716
> > * expr.c (get_object_or_type_alignment): New function.
> > (expand_assignment): Use it.
> > (expand_expr_real_1): Likewise.
> 
> Maybe move it to builtins.c alongside the other get_*_alignment functions.

I explicitly didn't do that because the function shouldn't be used
without knowing exactly what you do ;)

> > Index: gcc/expr.c
> > ===
> > *** gcc/expr.c  (revision 180077)
> > --- gcc/expr.c  (working copy)
> > *** get_bit_range (unsigned HOST_WIDE_INT *b
> > *** 4544,4549 
> > --- 4544,4568 
> >   }
> >   }
> >
> > + /* Return the alignment of the object EXP, also considering its type
> > +when we do not know of explicit misalignment.
> > +???  Note that generally the type of an expression is not kept
> > +consistent with misalignment information by the frontend, for
> > +example when taking the address of a member of a packed structure.
> > +So taking into account type information for alignment is generally
> > +wrong, but is done here as a compromise.  */
> 
> This sounds backwards, as taking into account type information is generally 
> correct, i.e. the packed case is the exception.  Maybe use "in the general 
> case" instead:
> 
>   ??? Note that, in the general case, the type of an expression is not kept
>   consistent with the misalignment by the front-end, for example when taking
>   the address of a member of a packed structure.  However, in most of the
>   cases, expressions have the alignment of their type, so we fall back to
>   the alignment of the type when we cannot compute a misalignment.

Ok, I'll adjust it a bit, adding "so we fall back optimistically to
the alignment...", because that's what is true - for the packed case
we simply compute wrong alignment this way (but for the expand case
this has been the case since forever, noticed only on strict-align
targets).

Note that "cannot compute a misalignment" applies also to the
case where we for some reason explicitly know the alignment
is just BITS_PER_UNIT (you cannot 'misalign' that).  So there are
still a lot of cases where known (in some sense of known) alignment
is overridden by maybe bogus type information.  It's just less
cases now, but I think we may be able to improve the situation
by maybe returning an extra argument from get_object_alignment_1
that says whether the alignment might be bigger or whether the
alignment is definitively known.  But that's for a followup I
presume.

> 
> + static unsigned int
> + get_object_or_type_alignment (tree exp)
> + {
> +   unsigned HOST_WIDE_INT misalign;
> +   unsigned int align = get_object_alignment_1 (exp, &misalign);
> +   align = MAX (TYPE_ALIGN (TREE_TYPE (exp)), align);
> +   if (misalign != 0)
> + align = (misalign & -misalign);
> 
> You don't need to go through the MAX if misalign is non-zero.

Ok.

Committed as follows.

Richard.

2011-10-18  Richard Guenther  

PR middle-end/50716
* expr.c (get_object_or_type_alignment): New function.
(expand_assignment): Use it.
(expand_expr_real_1): Likewise.

Index: gcc/expr.c
===
*** gcc/expr.c  (revision 180077)
--- gcc/expr.c  (working copy)
*** get_bit_range (unsigned HOST_WIDE_INT *b
*** 4544,4549 
--- 4544,4570 
  }
  }
  
+ /* Return the alignment of the object EXP, also considering its type
+when we do not know of explicit misalignment.
+???  Note that, in the general case, the type of an expression is not kept
+consistent with misalignment information by the front-end, for
+example when taking the address of a member of a packed structure.
+However, in most of the cases, expressions have the alignment of
+their type, so we optimistically fall back to the alignment of the
+type when we cannot compute a misalignment.  */
+ 
+ static unsigned int
+ get_object_or_type_alignment (tree exp)
+ {
+   unsigned HOST_WIDE_INT misalign;
+   unsigned int align = get_object_alignment_1 (exp, &misalign);
+   if (misalign != 0)
+ align = (misalign & -misalign);
+   else
+ align = MAX (TYPE_ALIGN (TREE_TYPE (exp)), align);
+   return align;
+ }
+ 
  /* Expand an assignment that stores the value of FROM into TO.  If NONTEMPORAL
 is true, try generating a nontemporal store.  */
  
*** expand_assignment (tree to, tree from, b
*** 4553,4559 
rtx to_rtx = 0;
rtx result;
enum machine_mode mode;
!   int align;
enum insn_code icode;
  
/* Don't crash if the lhs of the assignment was erroneous.  */
--- 4574,4580 
rtx to_rtx = 0;
rtx result;
enum machine_mode mode;
!   unsigned int align;
enum insn_code icode;
  
/* Don't crash if the lhs of the assignment was erroneous.  */
*** expand_

[patch#2] dwarf2out: Drop the size + performance overhead of DW_AT_sibling

2011-10-18 Thread Jan Kratochvil

On Tue, 18 Oct 2011 09:24:08 +0200, Tristan Gingold wrote:
> What is wrong with my suggestion of adding a command line option to keep the
> siblings ?  This option could be removed in a few years if nobody complained
> about sibling removal.

I find an extra option just a pollution of doc and everything by an option
which will never get used.  Wouldn't it be enough to disable it by
-gstrict-dwarf?  While currently the -gstrict-dwarf meaning is different
I believe the purpose is correct - to be more backward compatible.


Thanks,
Jan


gcc/
2011-10-12  Jan Kratochvil  

Stop producing DW_AT_sibling without -gstrict-dwarf.
* dwarf2out.c (dwarf2out_finish): Remove calls of
add_sibling_attributes if !DWARF_STRICT.  Extend the comment with
reason.

--- gcc/dwarf2out.c (revision 180121)
+++ gcc/dwarf2out.c (working copy)
@@ -22496,13 +22496,17 @@ dwarf2out_finish (const char *filename)
   prune_unused_types ();
 }
 
-  /* Traverse the DIE's and add add sibling attributes to those DIE's
- that have children.  */
-  add_sibling_attributes (comp_unit_die ());
-  for (node = limbo_die_list; node; node = node->next)
-add_sibling_attributes (node->die);
-  for (ctnode = comdat_type_list; ctnode != NULL; ctnode = ctnode->next)
-add_sibling_attributes (ctnode->root_die);
+  if (dwarf_strict)
+{
+  /* Traverse the DIE's and add add sibling attributes to those DIE's that
+have children.  It is produced only for compatibility reasons as it is
+both a size and consumer performance hit.  */
+  add_sibling_attributes (comp_unit_die ());
+  for (node = limbo_die_list; node; node = node->next)
+   add_sibling_attributes (node->die);
+  for (ctnode = comdat_type_list; ctnode != NULL; ctnode = ctnode->next)
+   add_sibling_attributes (ctnode->root_die);
+}
 
   /* Output a terminator label for the .text section.  */
   switch_to_section (text_section);

Re: [patch#2] dwarf2out: Drop the size + performance overhead of DW_AT_sibling

2011-10-18 Thread Jakub Jelinek

On Tue, Oct 18, 2011 at 10:28:09AM +0200, Jan Kratochvil wrote:
> 2011-10-12  Jan Kratochvil  
> 
>   Stop producing DW_AT_sibling without -gstrict-dwarf.
>   * dwarf2out.c (dwarf2out_finish): Remove calls of
>   add_sibling_attributes if !DWARF_STRICT.  Extend the comment with
>   reason.

This is ok for trunk.

Jakub

Re: [Patch, Fortran] PR 47023: C_Sizeof: Rejects valid code

2011-10-18 Thread Tobias Burnus


On 10/17/2011 11:37 PM, Janus Weil wrote:

here is another patch for PR47023, which takes care of comment #1,
i.e. rejecting polymorphic variables in a C-binding context.

Regtested on x86_64-unknown-linux-gnu. Ok for trunk?


OK. Thanks for the patch.

If I saw it correctly, one still needs to fix the vendor-extension 
SIZEOF for procedure pointers (it currently returns one byte!) - and the 
issue in the original bug (comment 0).


Tobias


2011-10-17  Janus Weil

PR fortran/47023
* decl.c (verify_c_interop_param): Renamed to
'gfc_verify_c_interop_param'. Add error message for polymorphic
arguments.
(verify_c_interop): Renamed to 'gfc_verify_c_interop'. Reject
polymorphic variables.
(verify_bind_c_sym): Renamed 'verify_c_interop'.
* gfortran.h (verify_c_interop,verify_c_interop_param): Renamed.
* check.c (gfc_check_sizeof): Ditto.
* resolve.c (gfc_iso_c_func_interface,resolve_fl_procedure): Ditto.
* symbol.c (verify_bind_c_derived_type): Ditto.


2011-10-17  Janus Weil

PR fortran/47023
* gfortran.dg/iso_c_binding_class.f03: New.

Re: [PATCH] Add capability to run several iterations of early optimizations

2011-10-18 Thread Richard Guenther

On Tue, Oct 18, 2011 at 1:45 AM, Maxim Kuvyrkov  wrote:
> On 13/10/2011, at 12:58 AM, Richard Guenther wrote:
>
>> On Wed, Oct 12, 2011 at 8:50 AM, Maxim Kuvyrkov  
>> wrote:
>>> The following patch adds new knob to make GCC perform several iterations of 
>>> early optimizations and inlining.
>>>
>>> This is for dont-care-about-compile-time-optimize-all-you-can scenarios.  
>>> Performing several iterations of optimizations does significantly improve 
>>> code speed on a certain proprietary source base.  Some hand-tuning of the 
>>> parameter value is required to get optimum performance.  Another good use 
>>> for this option is for search and ad-hoc analysis of cases where GCC misses 
>>> optimization opportunities.
>>>
>>> With the default setting of '1', nothing is changed from the current status 
>>> quo.
>>>
>>> The patch was bootstrapped and regtested with 3 iterations set by default 
>>> on i686-linux-gnu.  The only failures in regression testsuite were due to 
>>> latent bugs in handling of EH information, which are being discussed in a 
>>> different thread.
>>>
>>> Performance impact on the standard benchmarks is not conclusive, there are 
>>> improvements in SPEC2000 of up to 4% and regressions down to -2%, see [*].  
>>> SPEC2006 benchmarks will take another day or two to complete and I will 
>>> update the spreadsheet then.  The benchmarks were run on a Core2 system for 
>>> all combinations of {-m32/-m64}{-O2/-O3}.
>>>
>>> Effect on compilation time is fairly predictable, about 10% compile time 
>>> increase with 3 iterations.
>>>
>>> OK for trunk?
>>
>> I don't think this is a good idea, especially in the form you implemented it.
>>
>> If we'd want to iterate early optimizations we'd want to do it by iterating
>> an IPA pass so that we benefit from more precise size estimates
>> when trying to inline a function the second time.
>
> Could you elaborate on this a bit?  Early optimizations are gimple passes, so 
> I'm missing your point here.

pass_early_local_passes is an IPA pass, you want to iterate
fn1, fn2, fn1, fn2, ..., not fn1, fn1 ..., fn2, fn2 ... precisely for better
inlining.  Thus you need to split pass_early_local_passes into pieces
so you can iterate one of the IPA pieces.

>> Also statically
>> scheduling the passes will mess up dump files and you have no
>> chance of say, noticing that nothing changed for function f and its
>> callees in iteration N and thus you can skip processing them in
>> iteration N + 1.
>
> Yes, these are the shortcomings.  The dump files name changes can be fixed, 
> e.g., by adding a suffix to the passes on iterations after the first one.  
> The analysis to avoid unnecessary iterations is more complex problem.

Sure.  I analyzed early passes by manually duplicating them and
test that they do nothing for tramp3d, which they pretty much all did
at some point.

>>
>> So, at least you should split the pass_early_local_passes IPA pass
>> into three, you'd iterate over the 2nd (definitely not over 
>> pass_split_functions
>> though), the third would be pass_profile and pass_split_functions only.
>> And you'd iterate from the place the 2nd IPA pass is executed, not
>> by scheduling them N times.
>
> OK, I will look into this.
>
>>
>> Then you'd have to analyze the compile-time impact of the IPA
>> splitting on its own when not iterating.  Then you should look
>> at what actually was the optimizations that were performed
>> that lead to the improvement (I can see some indirect inlining
>> happening, but everything else would be a bug in present
>> optimizers in the early pipeline - they are all designed to be
>> roughly independent on each other and _not_ expose new
>> opportunities by iteration).  Thus - testcases?
>
> The initial motivation for the patch was to enable more indirect inlining and 
> devirtualization opportunities.

Hm.

> Since then I found the patch to be helpful in searching for optimization 
> opportunities and bugs.  E.g., SPEC2006's 471.omnetpp drops 20% with 2 
> additional iterations of early optimizations [*].  Given that applying more 
> optimizations should, theoretically, not decrease performance, there is 
> likely a very real bug or deficiency behind that.

It is likely early SRA that messes up, or maybe convert switch.  Early
passes should be really restricted to always profitable cleanups.

Your experiment looks useful to track down these bugs, but in general
I don't think we want to expose iterating early passes.

Richard.

> Thank you,
>
> [*] 
> https://docs.google.com/spreadsheet/ccc?key=0AvK0Y-Pgj7bNdFBQMEJ6d3laeFdvdk9lQ1p0LUFkVFE&hl=en_US
>
> --
> Maxim Kuvyrkov
> CodeSourcery / Mentor Graphics
>
>
>
>

Re: [PATCH 3/6] Emit macro expansion related diagnostics

2011-10-18 Thread Dodji Seketeli

Jason Merrill  writes:

> If you have a patch like this that fixes a major regression, go ahead
> and check it in without waiting for approval; we can adjust it as
> necessary after build is working again.

OK.

>>   size_t num_expanded_macros;
>
>> -  fprintf (stderr, "Number of expanded macros: %5lu\n",
>> -   s.num_expanded_macros);
>> +  fprintf (stderr, "Number of expanded macros: %5ld\n",
>> +   (long)s.num_expanded_macros);
>
> If we're going to use %l in printf, we should use long rather than
> size_t in linemap_stats; that way we don't need the cast.

OK, I have checked the below in.

libcpp/

* include/line-map.h (struct linemap_stats): Change the type of
the members from size_t to long.
* macro.c (macro_arg_token_iter_init): Unconditionally initialize
iter->location_ptr.

gcc/c-family/

* c-lex.c (fe_file_change): Use LINEMAP_SYSP when
!NO_IMPLICIT_EXTERN_C.

gcc/
* input.c (dump_line_table_statistics): Use long, not size_t.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180124 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index be83b61..b151564 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -211,7 +211,7 @@ fe_file_change (const struct line_map *new_map)
 #ifndef NO_IMPLICIT_EXTERN_C
  if (c_header_level)
++c_header_level;
- else if (new_map->sysp == 2)
+ else if (LINEMAP_SYSP (new_map) == 2)
{
  c_header_level = 1;
  ++pending_lang_change;
@@ -224,7 +224,7 @@ fe_file_change (const struct line_map *new_map)
 #ifndef NO_IMPLICIT_EXTERN_C
   if (c_header_level && --c_header_level == 0)
{
- if (new_map->sysp == 2)
+ if (LINEMAP_SYSP (new_map) == 2)
warning (0, "badly nested C headers from preprocessor");
  --pending_lang_change;
}
diff --git a/gcc/input.c b/gcc/input.c
index 41842b7..a780f5c 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -80,7 +80,7 @@ void
 dump_line_table_statistics (void)
 {
   struct linemap_stats s;
-  size_t total_used_map_size,
+  long total_used_map_size,
 macro_maps_size,
 total_allocated_map_size;
 
@@ -99,45 +99,45 @@ dump_line_table_statistics (void)
 + s.macro_maps_used_size
 + s.macro_maps_locations_size;
 
-  fprintf (stderr, "Number of expanded macros: %5lu\n",
+  fprintf (stderr, "Number of expanded macros: %5ld\n",
s.num_expanded_macros);
   if (s.num_expanded_macros != 0)
-fprintf (stderr, "Average number of tokens per macro expansion:  %5lu\n",
+fprintf (stderr, "Average number of tokens per macro expansion:  %5ld\n",
  s.num_macro_tokens / s.num_expanded_macros);
   fprintf (stderr,
"\nLine Table allocations during the "
"compilation process\n");
-  fprintf (stderr, "Number of ordinary maps used:%5lu%c\n",
+  fprintf (stderr, "Number of ordinary maps used:%5ld%c\n",
SCALE (s.num_ordinary_maps_used),
STAT_LABEL (s.num_ordinary_maps_used));
-  fprintf (stderr, "Ordinary map used size:  %5lu%c\n",
+  fprintf (stderr, "Ordinary map used size:  %5ld%c\n",
SCALE (s.ordinary_maps_used_size),
STAT_LABEL (s.ordinary_maps_used_size));
-  fprintf (stderr, "Number of ordinary maps allocated:   %5lu%c\n",
+  fprintf (stderr, "Number of ordinary maps allocated:   %5ld%c\n",
SCALE (s.num_ordinary_maps_allocated),
STAT_LABEL (s.num_ordinary_maps_allocated));
-  fprintf (stderr, "Ordinary maps allocated size:%5lu%c\n",
+  fprintf (stderr, "Ordinary maps allocated size:%5ld%c\n",
SCALE (s.ordinary_maps_allocated_size),
STAT_LABEL (s.ordinary_maps_allocated_size));
-  fprintf (stderr, "Number of macro maps used:   %5lu%c\n",
+  fprintf (stderr, "Number of macro maps used:   %5ld%c\n",
SCALE (s.num_macro_maps_used),
STAT_LABEL (s.num_macro_maps_used));
-  fprintf (stderr, "Macro maps used size:%5lu%c\n",
+  fprintf (stderr, "Macro maps used size:%5ld%c\n",
SCALE (s.macro_maps_used_size),
STAT_LABEL (s.macro_maps_used_size));
-  fprintf (stderr, "Macro maps locations size:   %5lu%c\n",
+  fprintf (stderr, "Macro maps locations size:   %5ld%c\n",
SCALE (s.macro_maps_locations_size),
STAT_LABEL (s.macro_maps_locations_size));
-  fprintf (stderr, "Macro maps size: %5lu%c\n",
+  fprintf (stderr, "Macro maps size: %5ld%c\n",
SCALE (macro_maps_size),
STAT_LABEL (macro_maps_size));
-  fprintf (stderr, "Duplicated maps locations size:  %5lu%c\n",
+  fprintf (stderr, "Duplicated maps locat

Re: [PATCH] Simplify and fix restrict handling

2011-10-18 Thread Richard Guenther

On Mon, 17 Oct 2011, Richard Guenther wrote:

> On Fri, 14 Oct 2011, Richard Guenther wrote:
> 
> > 
> > This follows up Michas testcase where we fail to handle the
> > conservatively propagated restrict tags properly.  The following
> > patch simplifies handling of restrict in the oracle and thus
> > only excludes NONLOCAL (as designed), but not ESCAPED from
> > conflict checking.
> > 
> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> So, after some regressions caused by this patch and some more thinking
> (about more possible issues) I concluded that we can simplify things
> even more by not making restrict vars point to NONLOCAL, but only
> to their tag (but marking that as global and able to have a points-to
> set).  This way the special-casing of NONLOCAL vs. restrict can go
> away, and with it all its possible problems.  Restrict is now
> similar to malloc () memory that escapes.
> 
> Hopefully this one is without regressions ;)
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

And this is what I ended up applying after fixing constraints
again.  For restrict qualified global (or parameter) pointers
we now generate

p = &RESTRICT_TAG
RESTRICT_TAG = NONLOCAL

if we implmenent the proposed RESTRICT_CAST expression from

p = RESTRICT_CAST ;

we'd need to generate

p = &TAG;
TAG = *q;

With this in place we can now disambiguate restrict qualified
pointers against global decls which wasn't possible before
(we invented the DECL_IS_RESTRICTED_P flag for this).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2011-10-17  Richard Guenther  

* tree-ssa-alias.h (struct pt_solution): Remove
vars_contains_restrict member.
(pt_solutions_same_restrict_base): Remove.
(pt_solution_set): Adjust.
* tree-ssa-alias.c (ptr_deref_may_alias_decl_p): Remove
vars_contains_restrict handling.
(dump_points_to_solution): Likewise.
(ptr_derefs_may_alias_p): Do not call pt_solutions_same_restrict_base.
* tree-ssa-structalias.c (struct variable_info): Remove is_restrict_var
field.
(new_var_info): Do not initialize it.
(ipa_escaped_pt): Adjust.
(make_constraint_from_restrict): Make the tag global.
(make_constraint_from_global_restrict): New function.
(make_constraint_from_heapvar): Remove.
(create_variable_info_for): Do not make restrict vars point
to NONLOCAL.
(intra_create_variable_infos): Likewise.
(find_what_var_points_to): Remove vars_contains_restrict handling.
(pt_solution_set): Adjust.
(pt_solution_ior_into): Likewise.
(pt_solutions_same_restrict_base): Remove.
(compute_points_to_sets): Do not test is_restrict_var.
* cfgexpand.c (update_alias_info_with_stack_vars): Adjust.
* gimple-pretty-print.c (pp_points_to_solution): Likewise.

* gcc.dg/torture/restrict-1.c: New testcase.

Index: gcc/tree-ssa-alias.c
===
*** gcc/tree-ssa-alias.c.orig   2011-10-17 16:56:54.0 +0200
--- gcc/tree-ssa-alias.c2011-10-17 16:57:21.0 +0200
*** ptr_deref_may_alias_decl_p (tree ptr, tr
*** 219,231 
if (!pi)
  return true;

-   /* If the decl can be used as a restrict tag and we have a restrict
-  pointer and that pointers points-to set doesn't contain this decl
-  then they can't alias.  */
-   if (DECL_RESTRICTED_P (decl)
-   && pi->pt.vars_contains_restrict)
- return bitmap_bit_p (pi->pt.vars, DECL_PT_UID (decl));
- 
return pt_solution_includes (&pi->pt, decl);
  }

--- 219,224 
*** ptr_derefs_may_alias_p (tree ptr1, tree
*** 316,326 
if (!pi1 || !pi2)
  return true;

-   /* If both pointers are restrict-qualified try to disambiguate
-  with restrict information.  */
-   if (!pt_solutions_same_restrict_base (&pi1->pt, &pi2->pt))
- return false;
- 
/* ???  This does not use TBAA to prune decls from the intersection
   that not both pointers may access.  */
return pt_solutions_intersect (&pi1->pt, &pi2->pt);
--- 309,314 
*** dump_points_to_solution (FILE *file, str
*** 426,433 
dump_decl_set (file, pt->vars);
if (pt->vars_contains_global)
fprintf (file, " (includes global vars)");
-   if (pt->vars_contains_restrict)
-   fprintf (file, " (includes restrict tags)");
  }
  }

--- 414,419 
Index: gcc/tree-ssa-alias.h
===
*** gcc/tree-ssa-alias.h.orig   2011-10-17 16:56:54.0 +0200
--- gcc/tree-ssa-alias.h2011-10-17 16:57:21.0 +0200
*** struct GTY(()) pt_solution
*** 54,61 
/* Nonzero if the pt_vars bitmap includes a global variable.  */
unsigned int vars_contains_global : 1;

-   /* Nonzero if the pt_vars bitmap includes a r

Re: [PATCH 1/6] Linemap infrastructure for virtual locations

2011-10-18 Thread Dodji Seketeli

Gerald Pfeifer  writes:

> On Mon, 17 Oct 2011, Dodji Seketeli wrote:
>>> this looks like it's causing the following bootstrap failure for me
>>> on i386-unknown-freebsd9.0?
>> Yes this is http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50760, and I am
>> testing the patch attached to the bug at the moment.  Just curious, does
>> that patch help for your target?
>
> Yes, applying the input.c part of the patch I am onw beyond the point
> where bootstrap failed last night and into stage 3.

Thank you for testing.  I have checked the fix in.  It should hopefully
be fixed now.

-- 
Dodji

Re: [PR50672, PATCH] Fix ice triggered by -ftree-tail-merge: verify_ssa failed: no immediate_use list

2011-10-18 Thread Tom de Vries

On 10/17/2011 01:51 PM, Richard Guenther wrote:
> On Sun, Oct 16, 2011 at 12:05 PM, Tom de Vries  wrote:
>> On 10/14/2011 12:00 PM, Richard Guenther wrote:
>>> On Fri, Oct 14, 2011 at 1:12 AM, Tom de Vries  
>>> wrote:
 On 10/12/2011 02:19 PM, Richard Guenther wrote:
> On Wed, Oct 12, 2011 at 8:35 AM, Tom de Vries  
> wrote:
>> Richard,
>>
>> I have a patch for PR50672.
>>
>> When compiling the testcase from the PR with -ftree-tail-merge, the 
>> scenario is
>> as follows:
>>
>> We start out tail_merge_optimize with blocks 14 and 20, which are alike, 
>> but not
>> equal, since they have different successors:
>> ...
>>  # BLOCK 14 freq:690
>>  # PRED: 25 [61.0%]  (false,exec)
>>
>>  if (wD.2197_57(D) != 0B)
>>goto ;
>>  else
>>goto ;
>>  # SUCC: 15 [78.4%]  (true,exec) 16 [21.6%]  (false,exec)
>>
>>
>>  # BLOCK 20 freq:2900
>>  # PRED: 29 [100.0%]  (fallthru) 31 [100.0%]  (fallthru)
>>
>>  # .MEMD.2447_209 = PHI <.MEMD.2447_125(29), .MEMD.2447_129(31)>
>>  if (wD.2197_57(D) != 0B)
>>goto ;
>>  else
>>goto ;
>>  # SUCC: 5 [85.0%]  (true,exec) 6 [15.0%]  (false,exec)
>> ...
>>
>> In the first iteration, we merge block 5 with block 15 and block 6 with 
>> block
>> 16. After that, the blocks 14 and 20 are equal.
>>
>> In the second iteration, the blocks 14 and 20 are merged, by redirecting 
>> the
>> incoming edges of block 20 to block 14, and removing block 20.
>>
>> Block 20 also contains the definition of .MEMD.2447_209. Removing the 
>> definition
>> delinks the vuse of .MEMD.2447_209 in block 5:
>> ...
>>  # BLOCK 5 freq:6036
>>  # PRED: 20 [85.0%]  (true,exec)
>>
>>  # PT = nonlocal escaped
>>  D.2306_58 = &thisD.2200_10(D)->D.2156;
>>  # .MEMD.2447_132 = VDEF <.MEMD.2447_209>
>>  # USE = anything
>>  # CLB = anything
>>  drawLineD.2135 (D.2306_58, wD.2197_57(D), gcD.2198_59(D));
>>  goto ;
>>  # SUCC: 17 [100.0%]  (fallthru,exec)
>> ...
>
> And block 5 is retained and block 15 is discarded?
>

 Indeed.

>> After the pass, when executing the TODO_update_ssa_only_virtuals, we 
>> update the
>> drawLine call in block 5 using rewrite_update_stmt, which calls
>> maybe_replace_use for the vuse operand.
>>
>> However, maybe_replace_use doesn't have an effect since the old vuse and 
>> the new
>> vuse happen to be the same (rdef == use), so SET_USE is not called and 
>> the vuse
>> remains delinked:
>> ...
>>  if (rdef && rdef != use)
>>SET_USE (use_p, rdef);
>> ...
>>
>> The patch fixes this by forcing SET_USE for delinked uses.
>
> That isn't the correct fix.  Whoever unlinks the vuse (by removing its
> definition) has to replace it with something valid, which is either the
> bare symbol .MEM, or the VUSE associated with the removed VDEF
> (thus, as unlink_stmt_vdef does).
>

 Another try. For each deleted bb, we call unlink_stmt_vdef for the 
 statements,
 and replace the .MEM phi uses with the bare .MEM symbol.

 Bootstrapped and reg-tested on x86_64.

 Ok for trunk?
>>>
>>> Better.  For
>>>
>>> +
>>> +  FOR_EACH_IMM_USE_STMT (use_stmt, iter, res)
>>> +   {
>>> + FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
>>> +   SET_USE (use_p, SSA_NAME_VAR (res));
>>> +   }
>>>
>>> you can use mark_virtual_phi_result_for_renaming (phi) instead.
>>>
>>> +  for (i = gsi_last_bb (bb); !gsi_end_p (i); gsi_prev_nondebug (&i))
>>> +unlink_stmt_vdef (gsi_stmt (i));
>>>
>>> is that actually necessary?  That is, isn't the block that follows a
>>> deleted block always starting with a virtual PHI?
>>
>> Block 20 is deleted. Block 5 follows the deleted block 20. Block 5 does not
>> start with a virtual PHI.
>>
>>> If not it should
>>> be enough to walk to the first stmt that uses a virtual operand
>>> and similar to the PHI case replace all its uses with the bare
>>> symbol.
>>
>> I think we need to handle the exposed uses (meaning outside the block) of 
>> vdefs
>> in each deleted block. The only vdef that can have exposed uses is the last 
>> vdef
>> in the block. So we should find the last vdef (gimple with gimple_vdef or
>> virtual PHI) and handle the related uses.
>>
>> Bootstrapped and regtested on x86_64. OK for trunk?
> 
> Hmmm.  I can see that this will fail when the block has a store
> but not a virtual PHI node, thus, when renaming does not reach
> a bare .MEM symbol walking the use-def chain from the VUSE
> of the store.  What release_last_vdef should do is trigger a rename
> of subsequent VUSEs in successors of the deleted block.  Which
> requires you to do something like
> 
> static void
> rename_last_vdef (basic_block bb)
> {
> +  gimple_stmt_iterator i;
> +
> +  for (i = gsi_last_bb (bb); !gsi_

Re: [cxx-mem-model] Rename __sync_mem to __atomic

2011-10-18 Thread Eric Botcazou

> In order to align with a standard naming convention, this patch renames
> all the __sync_mem references to __atomic.
>
> The only real addition is in builtins.c::is_builtin_name needed to know
> that __atomic_ is also a builtin prefix.

Are you sure about the renaming get_builtin_sync_mem -> get_builtin_atomic?

-- 
Eric Botcazou

Re: [C++ Patch] PR 50757

2011-10-18 Thread Paolo Carlini

Hi,

>> exactly like the recently fixed c++/17212. Tested x86_64-linux.
> 
> c-family has its own ChangeLog file, all changes must be documented there.

Oh my, I thought I had checked. I'll fix it, sorry.

Paolo

Re: [patch] Support vectorization of widening shifts

2011-10-18 Thread Jakub Jelinek

On Tue, Oct 18, 2011 at 11:39:22AM +0200, Ira Rosen wrote:
> On 2 October 2011 10:30, Ira Rosen  wrote:
> > On 29 September 2011 17:30, Ramana Radhakrishnan
> >  wrote:
> >> On 19 September 2011 08:54, Ira Rosen  wrote:
> >>
> >>>
> >>> Bootstrapped on powerpc64-suse-linux, tested on powerpc64-suse-linux
> >>> and arm-linux-gnueabi
> >>> OK for mainline?
> >>
> >> Sorry I missed this patch. Is there any reason why we need unspecs in
> >> this case ? Can't this be represented by subregs and zero/ sign
> >> extensions in RTL without the UNSPECs ?
> 
> I committed the attached patch with Ramana's solution for testing

> +/* Detect widening shift pattern:
>  
> +   type a_t;
> +   TYPE a_T, res_T;
> +
> +   S1 a_t = ;
> +   S2 a_T = (TYPE) a_t;
> +   S3 res_T = a_T << CONST;
> +
> +  where type 'TYPE' is at least double the size of type 'type'.
> +
> +  Also detect unsgigned cases:

unsigned

Jakub

Re: [PATCH] fortran/50514 -- Fix static chekcing of ISHFT[C] arguments.

2011-10-18 Thread Tobias Burnus


On 10/17/2011 07:02 PM, Steve Kargl wrote:

On Mon, Oct 17, 2011 at 12:22:03PM +0200, Tobias Burnus wrote:

I am not so happy with complete test replacements. How about adding it
as new test case?

Well, the old testcase is
[...]

The only line that would survive is the first line, which
is covered in the updated testcase.


Convinced: Do the replacement.


I somehow find less_than_bitsize1's

+  if (strncmp (arg2, "ISHFT", 5) == 0)

not that elegant and would prefer another argument, which tells the
function that it should take the absolute value of the argument;
however, given that ISHFT seems to be the only function which allows
negative arguments, one could also leave it.

In looking at the other uses of less_than_bitsize1() I
could pass arg2=NULL for ISHFT[C], and then the code
would become

  if (arg2 == NULL) { /* Special case for ISHFT[C].  */

Would that be better?


It would be faster - but not necessarily better. I was rather thinking 
of adding another argument like "bool abs_ok".


However, I think all three solutions are OK. Pick one.

Tobias

Re: [C++ Patch] __builtin_choose_expr

2011-10-18 Thread Paolo Carlini

I'm under the impression that some tests are written in C, wouldn't 
better fit in c-c++-common?


Thanks,
Paolo.

Re: [patch] Support vectorization of widening shifts

2011-10-18 Thread Ira Rosen

On 18 October 2011 11:43, Jakub Jelinek  wrote:
> On Tue, Oct 18, 2011 at 11:39:22AM +0200, Ira Rosen wrote:
>> On 2 October 2011 10:30, Ira Rosen  wrote:
>> > On 29 September 2011 17:30, Ramana Radhakrishnan
>> >  wrote:
>> >> On 19 September 2011 08:54, Ira Rosen  wrote:
>> >>
>> >>>
>> >>> Bootstrapped on powerpc64-suse-linux, tested on powerpc64-suse-linux
>> >>> and arm-linux-gnueabi
>> >>> OK for mainline?
>> >>
>> >> Sorry I missed this patch. Is there any reason why we need unspecs in
>> >> this case ? Can't this be represented by subregs and zero/ sign
>> >> extensions in RTL without the UNSPECs ?
>>
>> I committed the attached patch with Ramana's solution for testing
>
>> +/* Detect widening shift pattern:
>>
>> +   type a_t;
>> +   TYPE a_T, res_T;
>> +
>> +   S1 a_t = ;
>> +   S2 a_T = (TYPE) a_t;
>> +   S3 res_T = a_T << CONST;
>> +
>> +  where type 'TYPE' is at least double the size of type 'type'.
>> +
>> +  Also detect unsgigned cases:
>
> unsigned

Thanks, I'll fix this.

Ira

>
>        Jakub
>

patch ping: Dump the "degree of overlap" to compare static profile with instrumentation profile

2011-10-18 Thread Dehao Chen

http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01293.html

Thanks!
Dehao

Re: [Patch]: fix typo in rs6000.c (AIX bootstrap broken)

2011-10-18 Thread Tristan Gingold

Ping…

On Oct 13, 2011, at 5:11 PM, Tristan Gingold wrote:

> Hi,
> 
> looks like an obvious typo.  Ok for trunk ?
> 
> Tristan.
> 
> 2011-10-13  Tristan Gingold  
> 
>   * config/rs6000/rs6000.c (rs6000_init_builtins): Fix typo.
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 4fd2192..3bfe33e 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -12213,7 +12213,7 @@ rs6000_init_builtins (void)
> 
> #if TARGET_XCOFF
>   /* AIX libm provides clog as __clog.  */
> -  if ((tdecl = builtin_decl_explicit ([BUILT_IN_CLOG))) != NULL_TREE)
> +  if ((tdecl = builtin_decl_explicit (BUILT_IN_CLOG)) != NULL_TREE)
> set_user_assembler_name (tdecl, "__clog");
> #endif
> 
>

Re: [Patch, Fortran] PR 47023: C_Sizeof: Rejects valid code

2011-10-18 Thread Janus Weil

>> here is another patch for PR47023, which takes care of comment #1,
>> i.e. rejecting polymorphic variables in a C-binding context.
>>
>> Regtested on x86_64-unknown-linux-gnu. Ok for trunk?
>
> OK. Thanks for the patch.

Thanks. Committed as r180130.


> If I saw it correctly, one still needs to fix the vendor-extension SIZEOF
> for procedure pointers (it currently returns one byte!)

Right. Actually I don't quite understand where that 1 byte comes from. Example:

use iso_c_binding
procedure(real), pointer :: pp
pp => sin
print *,sizeof(pp)
print *,sizeof(pp(0.))
end

This spits out:
1
4

The second one is correct (giving the pointee size). The first one
should probably give the pointer size, but "1" is neither the size of
the pointer nor the pointee.


> and the issue in the original bug (comment 0).

Yes. I think that's it.

Cheers,
Janus



>> 2011-10-17  Janus Weil
>>
>>        PR fortran/47023
>>        * decl.c (verify_c_interop_param): Renamed to
>>        'gfc_verify_c_interop_param'. Add error message for polymorphic
>>        arguments.
>>        (verify_c_interop): Renamed to 'gfc_verify_c_interop'. Reject
>>        polymorphic variables.
>>        (verify_bind_c_sym): Renamed 'verify_c_interop'.
>>        * gfortran.h (verify_c_interop,verify_c_interop_param): Renamed.
>>        * check.c (gfc_check_sizeof): Ditto.
>>        * resolve.c (gfc_iso_c_func_interface,resolve_fl_procedure): Ditto.
>>        * symbol.c (verify_bind_c_derived_type): Ditto.
>>
>>
>> 2011-10-17  Janus Weil
>>
>>        PR fortran/47023
>>        * gfortran.dg/iso_c_binding_class.f03: New.
>
>

[PATCH] Fix PR50767

2011-10-18 Thread Richard Guenther


This fixes an oversight of my patch folding the added stmt in PRE - 
failing to update it.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2011-10-18  Richard Guenther  

PR tree-optimization/50767
* tree-ssa-pre.c (create_expression_by_pieces): Update the
folded statement.

* gcc.dg/torture/pr50767.c: New testcase.

Index: gcc/tree-ssa-pre.c
===
*** gcc/tree-ssa-pre.c  (revision 180127)
--- gcc/tree-ssa-pre.c  (working copy)
*** create_expression_by_pieces (basic_block
*** 3188,3194 
  
/* Fold the last statement.  */
gsi = gsi_last (*stmts);
!   fold_stmt_inplace (&gsi);
  
/* Add a value number to the temporary.
   The value may already exist in either NEW_SETS, or AVAIL_OUT, because
--- 3188,3195 
  
/* Fold the last statement.  */
gsi = gsi_last (*stmts);
!   if (fold_stmt_inplace (&gsi))
! update_stmt (gsi_stmt (gsi));
  
/* Add a value number to the temporary.
   The value may already exist in either NEW_SETS, or AVAIL_OUT, because
Index: gcc/testsuite/gcc.dg/torture/pr50767.c
===
*** gcc/testsuite/gcc.dg/torture/pr50767.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr50767.c  (revision 0)
***
*** 0 
--- 1,21 
+ /* { dg-do compile } */
+ /* { dg-options "-fno-tree-copy-prop -fno-tree-dominator-opts" } */
+ 
+ struct S
+ {
+   struct S *s;
+ };
+ 
+ static struct S *ss;
+ struct S *s;
+ 
+ void bar(void);
+ 
+ void foo(void)
+ {
+   for (;;)
+ {
+   s->s = ss;
+   bar ();
+ }
+ }

[patch testsuite]: Adjust tree-ssa/builtin-expect-*.c tests for high cost-branching optimization

2011-10-18 Thread Kai Tietz

Hello,

this patch adjusts __builtin_expect tests in tree-ssa so, that short-circuit 
branch optimization don't lead to fallout.  I've used here a multiplication, as 
simple substraction/addition might get optimized away.

ChangeLog

2011-10-18  Kai Tietz  

* gcc.dg/tree-ssa/builtin-expect-1.c: Adjust test.
* gcc.dg/tree-ssa/builtin-expect-2.c: Adjust test.
* gcc.dg/tree-ssa/builtin-expect-3.c: Adjust test.
* gcc.dg/tree-ssa/builtin-expect-4.c: Adjust test.
* gcc.dg/tree-ssa/builtin-expect-5.c: Adjust test.


Regression tested for x86_64-unknown-linux-gnu. Ok for apply?

Regards,
Kai

Index: gcc-head/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-1.c
===
--- gcc-head.orig/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-1.c
+++ gcc-head/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-1.c
@@ -1,9 +1,9 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-gimple" } */
 
-f (int i, float j) 
+f (int i, float j, int i2, float j2) 
 { 
-  if (__builtin_expect (i > 0 && j, 0)) 
+  if (__builtin_expect ((i * i2) > 0 && (j * j2), 0)) 
 g ();
 } 
 
Index: gcc-head/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-2.c
===
--- gcc-head.orig/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-2.c
+++ gcc-head/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-2.c
@@ -1,9 +1,9 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-gimple" } */
 
-f (int i, float j) 
+f (int i, float j, int i2, float j2) 
 { 
-  if (__builtin_expect (i > 0 || j, 0))
+  if (__builtin_expect ((i * i2) > 0 || (j * j2), 0))
 ;
   else
 g ();
Index: gcc-head/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-3.c
===
--- gcc-head.orig/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-3.c
+++ gcc-head/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-3.c
@@ -1,9 +1,9 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-gimple" } */
 
-f (int i, float j) 
+f (int i, float j, int i2, float j2) 
 { 
-  if (__builtin_expect (i > 0 && j, 0))
+  if (__builtin_expect ((i * i2) > 0 && (j * j2), 0))
 a ();
   else
 b ();
Index: gcc-head/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-4.c
===
--- gcc-head.orig/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-4.c
+++ gcc-head/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-4.c
@@ -1,9 +1,9 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-gimple" } */
 
-f (int i, float j) 
+f (int i, float j, int i2, float j2) 
 { 
-  if (__builtin_expect (i > 0 || j, 0))
+  if (__builtin_expect ((i * i2) > 0 || (j * j2), 0))
 a ();
   else
 b ();
Index: gcc-head/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-5.c
===
--- gcc-head.orig/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-5.c
+++ gcc-head/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-5.c
@@ -1,9 +1,9 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-forwprop" } */
 
-f (int i, float j) 
+f (int i, float j, int i2, float j2) 
 { 
-  if (__builtin_expect (i > 0 && __builtin_expect (j != 0, 1), 0))
+  if (__builtin_expect ((i * i2) > 0 && __builtin_expect ((j * j2) != 0, 1), 
0))
 a ();
   else
 b ();

Re: [PATCH] fix -Wsign-compare error in objc-act.c (PR objc/50743)

2011-10-18 Thread Nicola Pero

> The FSF has had my papers since Feb '11, but are stalling trying to decide
> whether to accept my employer's slightly non-standard disclaimer or not.

Ah.  That seems a long time even for this kind of things.

Anyhow, this particular patch consisted of exactly 4 casts, so it seems to fall 
in the case
of a "tiny patch" which doesn't require copyright assignment, so I applied it 
for you.

Thanks

Re: [C++ Patch] __builtin_choose_expr

2011-10-18 Thread Andy Gibbs

- Original Message - 
From: "Paolo Carlini" 
To: "Andy Gibbs" 
Cc: ; 
Sent: Tuesday, October 18, 2011 12:00 PM
Subject: Re: [C++ Patch] __builtin_choose_expr

> I'm under the impression that some tests are written in C, wouldn't better 
> fit in c-c++-common?

Ok, I can do this but I think only 2 out of the 8 tests can be moved
across.  I will then move the duplicates out of the gcc.dg folder too.

I will post a replacement patch later this afternoon.

Andy

[Patch,AVR]: PR50447: Tweak addhi3

2011-10-18 Thread Georg-Johann Lay

This patch do some tweaks to addhi3 like adding QI scratch register.

The original *addhi3 insn is still there and located prior to new
addhi3_clobber insn because addhi3 is special to reload (thanks Danis for this
note) so that there is a version with and a version without scratch register.

Patch passes without regressions.

Ok for trunk?

PR target/50447
* config/avr/avr.md (cc): New alternative out_plus_noclobber.
(adjust_len): Ditto.
(addhi3): Don't pipe through short; use gen_int_mode instead.
Prior to reload, expand to gen_addhi3_clobber.
(*addhi3): Use avr_out_plus_noclobber if applicable, use
out_plus_noclobber in cc and adjust_len attribute.
(addhi3_clobber): 2 new RTL peepholes.
(addhi3_clobber): New insn.
* config/avr/avr-protos.h: (avr_out_plus_noclobber): New prototype.
* config/avr/avr.c (avr_out_plus_noclobber): New function.
(notice_update_cc): Handle CC_OUT_PLUS_NOCLOBBER.
(avr_out_plus_1): Tweak if only MSB is +/-1 and other bytes are 0.
Set cc0 to set_zn for adiw on 16-bit values.
(adjust_insn_length): Handle ADJUST_LEN_OUT_PLUS_NOCLOBBER.
(expand_epilogue): No need to add 0 to frame_pointer_rtx.

Johann
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 180104)
+++ config/avr/avr.md	(working copy)
@@ -78,7 +78,7 @@ (define_c_enum "unspecv"
   
 ;; Condition code settings.
 (define_attr "cc" "none,set_czn,set_zn,set_n,compare,clobber,
-   out_plus"
+   out_plus, out_plus_noclobber"
   (const_string "none"))
 
 (define_attr "type" "branch,branch1,arith,xcall"
@@ -125,7 +125,8 @@ (define_attr "length" ""
 ;; Otherwise do special processing depending on the attribute.
 
 (define_attr "adjust_len"
-  "out_bitop, out_plus, addto_sp, tsthi, tstsi, compare, call,
+  "out_bitop, out_plus, out_plus_noclobber, addto_sp,
+   tsthi, tstsi, compare, call,
mov8, mov16, mov32, reload_in16, reload_in32,
ashlqi, ashrqi, lshrqi,
ashlhi, ashrhi, lshrhi,
@@ -759,14 +760,23 @@ (define_expand "addhi3"
 	(plus:HI (match_operand:HI 1 "register_operand" "")
 		 (match_operand:HI 2 "nonmemory_operand" "")))]
   ""
-  "
-{
-  if (GET_CODE (operands[2]) == CONST_INT)
-{
-  short tmp = INTVAL (operands[2]);
-  operands[2] = GEN_INT(tmp);
-}
-}")
+  {
+if (CONST_INT_P (operands[2]))
+  {
+operands[2] = gen_int_mode (INTVAL (operands[2]), HImode);
+
+if (!reload_completed
+&& !reload_in_progress
+&& !stack_register_operand (operands[0], HImode)
+&& !stack_register_operand (operands[1], HImode)
+&& !d_register_operand (operands[0], HImode)
+&& !d_register_operand (operands[1], HImode))
+  {
+emit_insn (gen_addhi3_clobber (operands[0], operands[1], operands[2]));
+DONE;
+  }
+  }
+  })
 
 
 (define_insn "*addhi3_zero_extend"
@@ -803,20 +813,77 @@ (define_insn "*addhi3_sp_R"
(set_attr "adjust_len" "addto_sp")])
 
 (define_insn "*addhi3"
-  [(set (match_operand:HI 0 "register_operand" "=r,!w,!w,d,r,r")
- 	(plus:HI
- 	 (match_operand:HI 1 "register_operand" "%0,0,0,0,0,0")
- 	 (match_operand:HI 2 "nonmemory_operand" "r,I,J,i,P,N")))]
+  [(set (match_operand:HI 0 "register_operand"  "=r,d,d")
+(plus:HI (match_operand:HI 1 "register_operand" "%0,0,0")
+ (match_operand:HI 2 "nonmemory_operand" "r,s,n")))]
   ""
-  "@
- 	add %A0,%A2\;adc %B0,%B2
- 	adiw %A0,%2
- 	sbiw %A0,%n2
- 	subi %A0,lo8(-(%2))\;sbci %B0,hi8(-(%2))
- 	sec\;adc %A0,__zero_reg__\;adc %B0,__zero_reg__
- 	sec\;sbc %A0,__zero_reg__\;sbc %B0,__zero_reg__"
-  [(set_attr "length" "2,1,1,2,3,3")
-   (set_attr "cc" "set_n,set_czn,set_czn,set_czn,set_n,set_n")])
+  {
+static const char * const asm_code[] =
+  {
+"add %A0,%A2\;adc %B0,%B2",
+"subi %A0,lo8(-(%2))\;sbci %B0,hi8(-(%2))",
+""
+  };
+
+if (*asm_code[which_alternative])
+  return asm_code[which_alternative];
+
+return avr_out_plus_noclobber (operands, NULL, NULL);
+  }
+  [(set_attr "length" "2,2,2")
+   (set_attr "adjust_len" "*,*,out_plus_noclobber")
+   (set_attr "cc" "set_n,set_czn,out_plus_noclobber")])
+
+;; Adding a constant to NO_LD_REGS might have lead to a reload of
+;; that constant to LD_REGS.  We don't add a scratch to *addhi3
+;; itself because that insn is special to reload.
+
+(define_peephole2 ; *addhi3_clobber
+  [(set (match_operand:HI 0 "d_register_operand" "")
+(match_operand:HI 1 "const_int_operand" ""))
+   (set (match_operand:HI 2 "l_register_operand" "")
+	(plus:HI (match_dup 2)
+ (match_dup 0)))]
+  "peep2_reg_dead_p (2, operands[0])"
+  [(parallel [(set (match_dup 2)
+   (plus:HI (match_dup 2)
+(match_dup 1)))
+  (clobber (match_dup 3))

[Patch Darwin/Ada] work around PR target/50678

2011-10-18 Thread Iain Sandoe

The audit trail in the PR contains the detective work (mostly by Eric)  
that concludes we have a long-standing bug in the Darwin x86-64  
sigtramp unwind data.
This has been filed as radar #10302855, but we need a work-around  
until that is resolved (possibly forever on older systems).


OK for trunk?
(what opinion about 4.6?)

ada:

PR target/50678
* init.c (Darwin/__gnat_error_handler): Transpose rbx and rdx in the
handler.

Index: gcc/ada/init.c
===
--- gcc/ada/init.c  (revision 180097)
+++ gcc/ada/init.c  (working copy)
@@ -2287,6 +2287,16 @@ __gnat_error_handler (int sig, siginfo_t *si,  
void

 {
   struct Exception_Data *exception;
   const char *msg;
+#if defined (__x86_64__)
+  /* Work around radar #10302855/pr50678, where the unwinders  
(libunwind or
+ libgcc_s depending on the system revision) and the DWARF unwind  
data for
+ the sigtramp have different ideas about register numbering  
(causing rbx

+ and rdx to be transposed).  */
+  ucontext_t *uc = (ucontext_t *)ucontext ;
+  unsigned long t = uc->uc_mcontext->__ss.__rbx;
+  uc->uc_mcontext->__ss.__rbx = uc->uc_mcontext->__ss.__rdx;
+  uc->uc_mcontext->__ss.__rdx =t;
+#endif

   switch (sig)
 {

Re: [cxx-mem-model] Rename __sync_mem to __atomic

2011-10-18 Thread Andrew MacLeod


In order to align with a standard naming convention, this patch renames
all the __sync_mem references to __atomic.

The only real addition is in builtins.c::is_builtin_name needed to know
that __atomic_ is also a builtin prefix.

Are you sure about the renaming get_builtin_sync_mem ->  get_builtin_atomic?

Heh, yeah you are right. It should retain the old name.  I'll fix that 
along with some minor formatting issues I've come across.


Andrew

Re: [ARM] Fix PR49641

2011-10-18 Thread Bernd Schmidt

On 10/17/11 14:54, Richard Earnshaw wrote:
> On 14/10/11 14:31, Bernd Schmidt wrote:
>> On 07/13/11 16:03, Richard Earnshaw wrote:
* config/arm/arm.c (store_multiple_sequence): Avoid cases where
the base reg is stored iff compiling for Thumb1.

* gcc.target/arm/pr49641.c: New test.
>>
>> Ping.  Richard, you replied to the mail but didn't comment on the patch.
>>
>>
>> Bernd
>>
> 
> 
> Sorry, I thought I'd made it clear that I don't think the compiler
> should ever use STM with write-back if the base register is in the
> stored list.  We must certainly never do it if the base register is not
> the first register in the list as this has always been unpredictable.
> 
> BTW, this is not Thumb1 specific, it applies at all times.
> 
> 
> So, no the patch is not OK as it stands.

I'm confused. The patch disables the STM if THUMB1 and the base register
is in the stored list. We only ever enable write-back for Thumb1 (see
gen_stm_seq). So, what's the problem?


Bernd

[4.6] Fix type of SRAed enum accesses

2011-10-18 Thread Jakub Jelinek

On Tue, Sep 27, 2011 at 03:26:03PM +0100, Richard Sandiford wrote:
> This patch fixes a miscompilation of stage1 c-parser.o in an ARM bootstrap.
> When an access to an enum field was SRAed, a component ref used the type
> of the integer temporary variable instead of the type of the enum.
> It therefore didn't alias other accesses to the same structure,
> and was scheduled after a copy-load.
> 
> Tested on x86_64-linux-gnu, and by verifying that c-parser.o is correctly
> compiled for ARM after the patch.  Martin says he's going to test on ia64
> too (thanks) -- I'll add 50326 to the changelog if that goes OK.
> 
> OK to install if there are no regressions on ia64?

As mentioned in bugzilla, the patch that caused this regression has been
committed to 4.6 branch as well.

Here is a backport of that patch to 4.6 branch, bootstrapped/regtested on
x86_64-linux and i686-linux and Matthias AFAIK bootstrapped it on ia64-linux
(where it previously failed to bootstrap because of this), regtest pending.

Ok for 4.6?

2011-10-18  Jakub Jelinek  

PR target/50350
Backport from mainline
2011-09-27  Richard Sandiford  

PR middle-end/50386
PR middle-end/50326
* tree-sra.c (build_ref_for_model): Use the type of the field as
the type of the COMPONENT_REF.

--- gcc/tree-sra.c.jj   2011-09-30 17:19:10.0 +0200
+++ gcc/tree-sra.c  2011-10-18 08:01:28.682647833 +0200
@@ -1448,12 +1448,12 @@ build_ref_for_model (location_t loc, tre
 {
   if (TREE_CODE (model->expr) == COMPONENT_REF)
 {
-  tree t, exp_type;
-  offset -= int_bit_position (TREE_OPERAND (model->expr, 1));
+  tree t, exp_type, fld = TREE_OPERAND (model->expr, 1);
+  offset -= int_bit_position (fld);
   exp_type = TREE_TYPE (TREE_OPERAND (model->expr, 0));
   t = build_ref_for_offset (loc, base, offset, exp_type, gsi, 
insert_after);
-  return fold_build3_loc (loc, COMPONENT_REF, model->type, t,
- TREE_OPERAND (model->expr, 1), NULL_TREE);
+  return fold_build3_loc (loc, COMPONENT_REF, TREE_TYPE (fld), t, fld,
+ NULL_TREE);
 }
   else
 return build_ref_for_offset (loc, base, offset, model->type,

Jakub

Re: [Patch Darwin/Ada] work around PR target/50678

2011-10-18 Thread Arnaud Charlet

> The audit trail in the PR contains the detective work (mostly by Eric) that
> concludes we have a long-standing bug in the Darwin x86-64 sigtramp unwind
> data.
> This has been filed as radar #10302855, but we need a work-around until
> that is resolved (possibly forever on older systems).
> 
> OK for trunk?
> (what opinion about 4.6?)
> 
> ada:
> 
>   PR target/50678
>   * init.c (Darwin/__gnat_error_handler): Transpose rbx and rdx in the
>   handler.

Here is Tristan's review on the above patch:

--
According to the PR, this is ok.
(Maybe we should restrict the swap to the known broken libc ?)

Tristan.

Re: [Patch, Darwin/libiberty] fix target/48108 by adding a section wrapper scheme to the darwin port.

2011-10-18 Thread Iain Sandoe



On 30 Sep 2011, at 00:33, Iain Sandoe wrote:
I'll re-jig with the typographical changes (sorry that were so  
many ... )


I've addressed your points in :

http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01974.html

specifically (other than typographical issues_

 o we retain the ability to read an old-style file.

   - I wonder if there should be an argument for removing this from  
the trunk version, but retaining in 4.6.

   - or, at the least, can we remove it from 4.8?

 o I dropped the ability to read all the sections from a file (by  
retaining the requirement to specify a segment name).


  - I will try to submit a separate patch to do this - which would  
essentially give the darwin version of simple-object the
   same functionality as the others (at the moment one can only read  
data from one segment at a time).


 o there's no need to amend the header, since the functionality is  
unchanged.


- retested on i686/x86-64 darwin 9/10 with "bootstrap-lto bootstrap- 
debug"

 (darwin10 requires the ranlib -c fix to work when XCode >= 3.2.6 ).

Once again, apologies for the number of typos in the original.

OK for trunk/4.6?
Iain

libiberty:

PR target/48108
* simple-object-mach-o.c  (GNU_WRAPPER_SECTS, GNU_WRAPPER_INDEX,
GNU_WRAPPER_NAMES): New macros.
(simple_object_mach_o_segment): Handle wrapper scheme.
	(simple_object_mach_o_write_section_header): Allow the segment name  
to be supplied.
	(simple_object_mach_o_write_segment): Handle wrapper scheme.  Ensure  
that the

top-level segment name in the load command is empty.
	(simple_object_mach_o_write_to_file): Determine the number of  
sections during

segment output, use that in writing the header.

gcc:

PR target/48108
* config/darwin.c (top level): Amend comments concerning LTO output.
(lto_section_num): New variable.  (darwin_lto_section_e): New GTY.
(LTO_SECTS_SECTION, LTO_INDEX_SECTION): New.
(LTO_NAMES_SECTION): Rename.
(darwin_asm_named_section): Record LTO section counts and switches
in a vec of darwin_lto_section_e.
(darwin_file_start): Remove unused code.
(darwin_file_end): Put an LTO section termination label.  Handle output
of the wrapped LTO sections, index and names table.

Index: libiberty/simple-object-mach-o.c
===
--- libiberty/simple-object-mach-o.c(revision 180097)
+++ libiberty/simple-object-mach-o.c(working copy)
@@ -1,5 +1,5 @@
 /* simple-object-mach-o.c -- routines to manipulate Mach-O object files.
-   Copyright 2010 Free Software Foundation, Inc.
+   Copyright 2010, 2011 Free Software Foundation, Inc.
Written by Ian Lance Taylor, Google.
 
 This program is free software; you can redistribute it and/or modify it
@@ -174,6 +174,15 @@ struct mach_o_section_64
 
 #define GNU_SECTION_NAMES "__section_names"
 
+/* A GNU-specific extension to wrap multiple sections using three
+   mach-o sections within a given segment.  The section '__wrapper_sects'
+   is subdivided according to the index '__wrapper_index' and each sub
+   sect is named according to the names supplied in '__wrapper_names'.  */
+
+#define GNU_WRAPPER_SECTS "__wrapper_sects"
+#define GNU_WRAPPER_INDEX "__wrapper_index"
+#define GNU_WRAPPER_NAMES "__wrapper_names"
+
 /* Private data for an simple_object_read.  */
 
 struct simple_object_mach_o_read
@@ -214,8 +223,19 @@ struct simple_object_mach_o_attributes
   unsigned int reserved;
 };
 
-/* See if we have a Mach-O file.  */
+/* See if we have a Mach-O MH_OBJECT file:
 
+   A standard MH_OBJECT (from as) will have three load commands:
+   0 - LC_SEGMENT/LC_SEGMENT64
+   1 - LC_SYMTAB
+   2 - LC_DYSYMTAB
+
+   The LC_SEGMENT/LC_SEGMENT64 will introduce a single anonymous segment
+   containing all the sections.
+
+   Files written by simple-object will have only the segment command
+   (no symbol tables).  */
+
 static void *
 simple_object_mach_o_match (
 unsigned char header[SIMPLE_OBJECT_MATCH_HEADER_LEN],
@@ -356,9 +376,30 @@ simple_object_mach_o_section_info (int is_big_endi
 }
 }
 
-/* Handle a segment in a Mach-O file.  Return 1 if we should continue,
-   0 if the caller should return.  */
+/* Handle a segment in a Mach-O Object file.
 
+   This will callback to the function pfn for each "section found" the meaning
+   of which depends on gnu extensions to mach-o:
+
+   If we find mach-o sections (with the segment name as specified) which also
+   contain: a 'sects' wrapper, an index, and a  name table, we expand this into
+   as many sections as are specified in the index.  In this case, there will
+   be a callback for each of these.
+
+   We will also allow an extension that permits long names (more than 16
+   characters) to be used with mach-o.  In this case, the section name has
+   a specific format embedding an index into a name table, and the file must
+   contain such name table.
+
+   Re

Re: [Patch,AVR]: PR50447: Tweak addhi3

2011-10-18 Thread Denis Chertykov

2011/10/18 Georg-Johann Lay :
> This patch do some tweaks to addhi3 like adding QI scratch register.
>
> The original *addhi3 insn is still there and located prior to new
> addhi3_clobber insn because addhi3 is special to reload (thanks Danis for this
> note) so that there is a version with and a version without scratch register.
>
> Patch passes without regressions.
>

Which improvements added by this patch ?

Denis.

Re: [Patch Darwin/Ada] work around PR target/50678

2011-10-18 Thread Iain Sandoe



On 18 Oct 2011, at 13:22, Arnaud Charlet wrote:

The audit trail in the PR contains the detective work (mostly by  
Eric) that
concludes we have a long-standing bug in the Darwin x86-64 sigtramp  
unwind

data.
This has been filed as radar #10302855, but we need a work-around  
until

that is resolved (possibly forever on older systems).

OK for trunk?
(what opinion about 4.6?)

ada:

PR target/50678
* init.c (Darwin/__gnat_error_handler): Transpose rbx and rdx in the
handler.


Here is Tristan's review on the above patch:

--
According to the PR, this is ok.
(Maybe we should restrict the swap to the known broken libc ?)


It's broken in all Libc versions that are in the wild (AFAICT from  
looking at the released sources).


We will need to deal with configury/ 
__ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ stuff once there is a  
fixed Libc.


Iain

Re: [ARM] Fix PR49641

2011-10-18 Thread Richard Earnshaw

On 18/10/11 13:19, Bernd Schmidt wrote:
> On 10/17/11 14:54, Richard Earnshaw wrote:
>> On 14/10/11 14:31, Bernd Schmidt wrote:
>>> On 07/13/11 16:03, Richard Earnshaw wrote:
>   * config/arm/arm.c (store_multiple_sequence): Avoid cases where
>   the base reg is stored iff compiling for Thumb1.
>
>   * gcc.target/arm/pr49641.c: New test.
>>>
>>> Ping.  Richard, you replied to the mail but didn't comment on the patch.
>>>
>>>
>>> Bernd
>>>
>>
>>
>> Sorry, I thought I'd made it clear that I don't think the compiler
>> should ever use STM with write-back if the base register is in the
>> stored list.  We must certainly never do it if the base register is not
>> the first register in the list as this has always been unpredictable.
>>
>> BTW, this is not Thumb1 specific, it applies at all times.
>>
>>
>> So, no the patch is not OK as it stands.
> 
> I'm confused. The patch disables the STM if THUMB1 and the base register
> is in the stored list. We only ever enable write-back for Thumb1 (see
> gen_stm_seq). So, what's the problem?

Well, if that's the case why do we need to test for Thumb1 at all?  And
why do we only enable write-back for Thumb1?  other ISA variants can
also do that (I know that Thumb1 requires write-back, but it's
optionally available for the other ISA flavours).

R.

Re: [Patch Darwin/Ada] work around PR target/50678

2011-10-18 Thread Arnaud Charlet

> It's broken in all Libc versions that are in the wild (AFAICT from looking
> at the released sources).
> 
> We will need to deal with 
> configury/__ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ stuff once
> there is a fixed Libc.

OK, would be good to follow up with such patch when/if this is fixed.

Arno

Re: [Patch,AVR]: PR50447: Tweak addhi3

2011-10-18 Thread Georg-Johann Lay

Denis Chertykov schrieb:
> 2011/10/18 Georg-Johann Lay :
>> This patch do some tweaks to addhi3 like adding QI scratch register.
>>
>> The original *addhi3 insn is still there and located prior to new
>> addhi3_clobber insn because addhi3 is special to reload (thanks Danis for 
>> this
>> note) so that there is a version with and a version without scratch register.
>>
>> Patch passes without regressions.
>>
> 
> Which improvements added by this patch ?
> 
> Denis.

If the addhi3 is expanded early, the addition happens with QI scratch which
avoids reload of constant if target register is in NO_LD. And reduce register
pressure as only QI is needed and not reload of constant to HI.

Otherwise, there might be sequences like

ldi r31, 2; *reload_inhi
mov r12, r31
clr r13

add r14, r12  ; *addhi3
adc r15, r13

which now will be

ldi r31, 2; addhi3_clobber
add r14, r31
adc r15, __zero_reg__

Similar applies if the reload of the constant happens to LD regs:

ldi r30, 2; *movhi
clr r31

add r14, r12  ; *addhi3
adc r15, r13

will become

ldi r30, 2; addhi3_clobber
add r14, r30
adc r15, __zero_reg__

For *addhi3 insns the register pressure is not reduced but the insn sequence
might be smarter if peep2 comes up with a QI scratch or if it detects a
*reload_inhi insn just prior to the addition (and the reg that holds the
reloaded constant dies after the addition).

As *addhi3 is special to reload, there is still an "ordinary" add addhi insn
without scratch. This is easier because, e.g. prologue and epilogue generation
generate add insns (not by means of addhi3 expander but by explicit
gan_rtx_PLUS). Yet the addhi3 expander factors out the situations when an
addhi3 insn is to be generated via addhi3 expander late in the compilation 
process.

Johann

Re: [Patch,AVR]: PR50447: Tweak addhi3

2011-10-18 Thread Denis Chertykov

2011/10/18 Georg-Johann Lay :
> Denis Chertykov schrieb:
>> 2011/10/18 Georg-Johann Lay :
>>> This patch do some tweaks to addhi3 like adding QI scratch register.
>>>
>>> The original *addhi3 insn is still there and located prior to new
>>> addhi3_clobber insn because addhi3 is special to reload (thanks Danis for 
>>> this
>>> note) so that there is a version with and a version without scratch 
>>> register.
>>>
>>> Patch passes without regressions.
>>>
>>
>> Which improvements added by this patch ?
>>
>> Denis.
>
> If the addhi3 is expanded early, the addition happens with QI scratch which
> avoids reload of constant if target register is in NO_LD. And reduce register
> pressure as only QI is needed and not reload of constant to HI.
>
> Otherwise, there might be sequences like
>
> ldi r31, 2    ; *reload_inhi
> mov r12, r31
> clr r13
>
> add r14, r12  ; *addhi3
> adc r15, r13
>
> which now will be
>
> ldi r31, 2    ; addhi3_clobber
> add r14, r31
> adc r15, __zero_reg__
>
> Similar applies if the reload of the constant happens to LD regs:
>
> ldi r30, 2    ; *movhi
> clr r31
>
> add r14, r12  ; *addhi3
> adc r15, r13
>
> will become
>
> ldi r30, 2    ; addhi3_clobber
> add r14, r30
> adc r15, __zero_reg__
>
> For *addhi3 insns the register pressure is not reduced but the insn sequence
> might be smarter if peep2 comes up with a QI scratch or if it detects a
> *reload_inhi insn just prior to the addition (and the reg that holds the
> reloaded constant dies after the addition).
>
> As *addhi3 is special to reload, there is still an "ordinary" add addhi insn
> without scratch. This is easier because, e.g. prologue and epilogue generation
> generate add insns (not by means of addhi3 expander but by explicit
> gan_rtx_PLUS). Yet the addhi3 expander factors out the situations when an
> addhi3 insn is to be generated via addhi3 expander late in the compilation 
> process

Please provide any real world example.

Denis.

Re: [ARM] Fix PR49641

2011-10-18 Thread Bernd Schmidt

On 10/18/11 14:30, Richard Earnshaw wrote:
> Well, if that's the case why do we need to test for Thumb1 at all?  And
> why do we only enable write-back for Thumb1?  other ISA variants can
> also do that (I know that Thumb1 requires write-back, but it's
> optionally available for the other ISA flavours).

We're not trying to generate a writeback sequence with our peepholes.
The problem is that on Thumb, that's the only instruction available, and
we want to make use of it if possible (i.e. register dead afterwards etc.).

Bernd

[Patch, Fortran, 4.6, committed] PR 50016: mitigate performance regression on Windows by calling less often _commit

2011-10-18 Thread Tobias Burnus

This patch has been approved by Janne off list - and has been committed 
to the 4.6 branch only (Rev. 180138) after bootstrapping and regtesting it.


It is essentially my patch from
  http://gcc.gnu.org/ml/fortran/2011-10/msg00120.html
minus the .texi change. And the inquire.c part of Janne's patch at
  http://gcc.gnu.org/ml/fortran/2011-10/msg00094.html

In GCC 4.6 and 4.7, every time when the gfortran-internal buffer is 
flushed, _commit() was called on _WIN32 (= MinGW and MinGW-w64), which 
caused a severe slowdown. The reason is that _commit() causes that files 
are written to the hard disk.


With this patch, _commit() is only called when the user explicitly runs 
the FLUSH subroutine/statement. Additionally, if one inquires the size 
of an open file, now the internally known size is used instead of 
calling "stat".


 * * *

For 4.7 the same issue exists but as the release is still a couple of 
months away, we have time to learn more about how Windows handles 
buffering and which consistency should be provided. - The gist of the 
discussion is whether flush() should automatically call _commit or 
whether it shouldn't, which is a question about consistency vs. 
performance. For details, see the thread starting at 
http://gcc.gnu.org/ml/fortran/2011-10/threads.html#00079


For 4.6 we decided that it makes more sense to make the committed patch 
available for 4.6.2 than to delay it further. The patch should fix most 
of the performance issues.


Comment about which strategy to choose for 4.7 and insight about the 
buffering of Windows (i.e. whether it affects data and file meta data 
such as filesizes, or only the latter) are highly welcome.


Tobias
Index: libgfortran/ChangeLog
===
--- libgfortran/ChangeLog	(revision 180134)
+++ libgfortran/ChangeLog	(working copy)
@@ -1,3 +1,14 @@
+2011-10-18  Tobias Burnus  
+	Janne Blomqvist  
+
+	PR fortran/50016
+	* io/file_pos.c (st_flush): Call _commit on MinGW(-w64).
+	* io/intrinsics.c (flush_i4, flush_i8): Ditto.
+	* io/unix.c (flush_all_units_1, flush_all_units): Ditto.
+	(buf_flush): Remove _commit call.
+	* io/inquire.c (inquire_via_unit): Flush internal buffer
+	and call file_length instead of invoking stat via file_size.
+
 2011-09-11  Thomas Koenig  
 
 	Backport fron trunk
Index: libgfortran/io/file_pos.c
===
--- libgfortran/io/file_pos.c	(revision 180134)
+++ libgfortran/io/file_pos.c	(working copy)
@@ -452,6 +452,10 @@ st_flush (st_parameter_filepos *fpp)
 fbuf_flush (u, u->mode);
 
   sflush (u->s);
+#ifdef _WIN32
+  /* Without _commit, changes are not visible to other file descriptors.  */
+  _commit (u->s->fd);
+#endif
   unlock_unit (u);
 }
   else
Index: libgfortran/io/inquire.c
===
--- libgfortran/io/inquire.c	(revision 180134)
+++ libgfortran/io/inquire.c	(working copy)
@@ -409,7 +409,10 @@ inquire_via_unit (st_parameter_inquire *iqp, gfc_u
 	  if (u == NULL)
 	*iqp->size = -1;
 	  else
-	*iqp->size = file_size (u->file, (gfc_charlen_type) u->file_len);
+	{
+	  sflush (u->s);
+	  *iqp->size = file_length (u->s);
+	}
 	}
 }
 
Index: libgfortran/io/unix.c
===
--- libgfortran/io/unix.c	(revision 180134)
+++ libgfortran/io/unix.c	(working copy)
@@ -435,10 +435,6 @@ buf_flush (unix_stream * s)
   if (s->ndirty != 0)
 return -1;
 
-#ifdef _WIN32
-  _commit (s->fd);
-#endif
-
   return 0;
 }
 
@@ -1542,7 +1538,14 @@ flush_all_units_1 (gfc_unit *u, int min_unit)
 	  if (__gthread_mutex_trylock (&u->lock))
 	return u;
 	  if (u->s)
-	sflush (u->s);
+	{
+	  sflush (u->s);
+#ifdef _WIN32
+	  /* Without _commit, changes are not visible to other
+		 file descriptors.  */
+	  _commit (u->s->fd);
+#endif
+	}
 	  __gthread_mutex_unlock (&u->lock);
 	}
   u = u->right;
@@ -1573,6 +1576,11 @@ flush_all_units (void)
   if (u->closed == 0)
 	{
 	  sflush (u->s);
+#ifdef _WIN32
+	  /* Without _commit, changes are not visible to other
+	 file descriptors.  */
+	  _commit (u->s->fd);
+#endif
 	  __gthread_mutex_lock (&unit_lock);
 	  __gthread_mutex_unlock (&u->lock);
 	  (void) predec_waiting_locked (u);
Index: libgfortran/io/intrinsics.c
===
--- libgfortran/io/intrinsics.c	(revision 180134)
+++ libgfortran/io/intrinsics.c	(working copy)
@@ -207,6 +207,11 @@ flush_i4 (GFC_INTEGER_4 *unit)
   if (us != NULL)
 	{
 	  sflush (us->s);
+#ifdef _WIN32
+	  /* Without _commit, changes are not visible
+	 to other file descriptors.  */
+	  _commit (u->s->fd);
+#endif
 	  unlock_unit (us);
 	}
 }
@@ -230,6 +235,11 @@ flush_i8 (GFC_INTEGER_8 *unit)
   if (us != NULL)
 	{
 	  sflush (us->s);
+#ifdef _WIN32
+	  /* Without _commit, changes are not

Re: [ARM] Fix PR49641

2011-10-18 Thread Richard Earnshaw

On 18/10/11 13:47, Bernd Schmidt wrote:
> On 10/18/11 14:30, Richard Earnshaw wrote:
>> Well, if that's the case why do we need to test for Thumb1 at all?  And
>> why do we only enable write-back for Thumb1?  other ISA variants can
>> also do that (I know that Thumb1 requires write-back, but it's
>> optionally available for the other ISA flavours).
> 
> We're not trying to generate a writeback sequence with our peepholes.
> The problem is that on Thumb, that's the only instruction available, and
> we want to make use of it if possible (i.e. register dead afterwards etc.).
> 
> 
> Bernd
> 


OK, I understand now.  However, I think it's misleading to talk about
thumb1 directly here -- it implies that this doesn't apply in other cases.

A better patch, would be (I think)


--- gcc/config/arm/arm.c(revision 175906)
+++ gcc/config/arm/arm.c(working copy)
@@ -9950,7 +9950,9 @@ store_multiple_sequence (rtx *operands,
  /* If it isn't an integer register, then we can't do this.  */
  if (unsorted_regs[i] < 0
  || (TARGET_THUMB1 && unsorted_regs[i] > LAST_LO_REGNUM)
- || (TARGET_THUMB2 && unsorted_regs[i] == base_reg)
+ /* The effects are unpredictable if the base reg is
+both updated and stored.  */
+ || (base_writeback && unsorted_regs[i] == base_reg)
  || (TARGET_THUMB2 && unsorted_regs[i] == SP_REGNUM)
  || unsorted_regs[i] > 14)
return 0;

and then initialize base_writeback at the entry to the function (I
presume that currently we would just set it to be TARGET_THUMB1),
perhaps with a comment saying that we don't currently support
base_writeback for other ISA variants.

R.

Re: [patch testsuite]: Adjust tree-ssa/builtin-expect-*.c tests for high cost-branching optimization

2011-10-18 Thread Kai Tietz

- Original Message -
From: "Kai Tietz" 
To: gcc-patches@gcc.gnu.org
Cc: "Richard Guenther" 
Sent: Tuesday, October 18, 2011 1:33:17 PM
Subject: [patch testsuite]: Adjust tree-ssa/builtin-expect-*.c tests for high 
cost-branching optimization

Hello,

this patch adjusts __builtin_expect tests in tree-ssa so, that short-circuit 
branch optimization don't lead to fallout.  I've used here a multiplication, as 
simple substraction/addition might get optimized away.

ChangeLog

2011-10-18  Kai Tietz  

* gcc.dg/tree-ssa/builtin-expect-1.c: Adjust test.
* gcc.dg/tree-ssa/builtin-expect-2.c: Adjust test.
* gcc.dg/tree-ssa/builtin-expect-3.c: Adjust test.
* gcc.dg/tree-ssa/builtin-expect-4.c: Adjust test.
* gcc.dg/tree-ssa/builtin-expect-5.c: Adjust test.

Regression tested for x86_64-unknown-linux-gnu. Ok for apply?

Regards,
Kai

--- gcc-head.orig/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-5.c
+++ gcc-head/gcc/testsuite/gcc.dg/tree-ssa/builtin-expect-5.c
@@ -1,9 +1,9 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-forwprop" } */

-f (int i, float j) 
+f (int i, float j, int i2, float j2) 
 { 
-  if (__builtin_expect (i > 0 && __builtin_expect (j != 0, 1), 0))
+  if (__builtin_expect ((i * i2) > 0 && __builtin_expect ((j * j2) != 0, 1), 
0))
 a ();
   else
 b ();

Well, this change might be not necessary. But to keep in all 5 tests the same 
conditions checked, I modified this too.  On the other hand, this change isn't 
really necessary, if we assume that this check is testing that __builtin_expect 
conditions aren't merged on short-circuit optimization.

Regards,
Kai

Re: [PATCH][ARM] -m{cpu,tune,arch}=native

2011-10-18 Thread Andrew Stubbs


On 17/10/11 14:24, Richard Earnshaw wrote:

There's a presumption in host_detect_local_cpu() that "CPU implementer"
will appear before  "CPU part" in the output of /proc/cpuinfo.  That's
probably a pretty safe assumption (and it appears that it will handle
that case relatively safely -- ie not crash).  I'll let you decide
whether that's important enough to fix.


Yes, I was aware of that. I believe it wouldn't be worth the effort 
required to fix it, and it would only complicate the code for no advantage.


If, at some point in the future, the format of /proc/cpuinfo changes 
enough to break this assumption then it'll probably be the least of the 
problems needing fixing.



However another problem I've just spotted is that you never close the
file descriptor if you fail to parse the output properly (ie jump to
not_found).  That should be fixed before this is committed.

OK with the second issue resolved.


Thanks, fixed and committed, as attached.

Andrew
2011-10-18  Andrew Stubbs  

	gcc/
	* config.host (arm*-*-linux*): Add driver-arm.o and x-arm.
	* config/arm/arm.opt: Add 'native' processor_type and
	arm_arch enum values.
	* config/arm/arm.h (host_detect_local_cpu): New prototype.
	(EXTRA_SPEC_FUNCTIONS): New define.
	(MCPU_MTUNE_NATIVE_SPECS): New define.
	(DRIVER_SELF_SPECS): New define.
	* config/arm/driver-arm.c: New file.
	* config/arm/x-arm: New file.
	* doc/invoke.texi (ARM Options): Document -mcpu=native,
	-mtune=native and -march=native.

--- a/gcc/config.host
+++ b/gcc/config.host
@@ -100,6 +100,14 @@ case ${host} in
 esac
 
 case ${host} in
+  arm*-*-linux*)
+case ${target} in
+  arm*-*-*)
+	host_extra_gcc_objs="driver-arm.o"
+	host_xmake_file="${host_xmake_file} arm/x-arm"
+	;;
+esac
+;;
   alpha*-*-linux* | alpha*-dec-osf*)
 case ${target} in
   alpha*-*-linux* | alpha*-dec-osf*)
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2255,4 +2255,21 @@ extern int making_const_table;
 }	\
   while (0)
 
+/* -mcpu=native handling only makes sense with compiler running on
+   an ARM chip.  */
+#if defined(__arm__)
+extern const char *host_detect_local_cpu (int argc, const char **argv);
+# define EXTRA_SPEC_FUNCTIONS		\
+  { "local_cpu_detect", host_detect_local_cpu },
+
+# define MCPU_MTUNE_NATIVE_SPECS	\
+   " %{march=native:%http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "configargs.h"
+
+struct vendor_cpu {
+  const char *part_no;
+  const char *arch_name;
+  const char *cpu_name;
+};
+
+static struct vendor_cpu arm_cpu_table[] = {
+{"0x926", "armv5te", "arm926ej-s"},
+{"0xa26", "armv5te", "arm1026ej-s"},
+{"0xb02", "armv6k", "mpcore"},
+{"0xb36", "armv6j", "arm1136j-s"},
+{"0xb56", "armv6t2", "arm1156t2-s"},
+{"0xb76", "armv6zk", "arm1176jz-s"},
+{"0xc05", "armv7-a", "cortex-a5"},
+{"0xc08", "armv7-a", "cortex-a8"},
+{"0xc09", "armv7-a", "cortex-a9"},
+{"0xc0f", "armv7-a", "cortex-a15"},
+{"0xc14", "armv7-r", "cortex-r4"},
+{"0xc15", "armv7-r", "cortex-r5"},
+{"0xc20", "armv6-m", "cortex-m0"},
+{"0xc21", "armv6-m", "cortex-m1"},
+{"0xc23", "armv7-m", "cortex-m3"},
+{"0xc24", "armv7e-m", "cortex-m4"},
+{NULL, NULL, NULL}
+};
+
+struct {
+  const char *vendor_no;
+  const struct vendor_cpu *vendor_parts;
+} vendors[] = {
+{"0x41", arm_cpu_table},
+{NULL, NULL}
+};
+
+/* This will be called by the spec parser in gcc.c when it sees
+   a %:local_cpu_detect(args) construct.  Currently it will be called
+   with either "arch", "cpu" or "tune" as argument depending on if
+   -march=native, -mcpu=native or -mtune=native is to be substituted.
+
+   It returns a string containing new command line parameters to be
+   put at the place of the above two options, depending on what CPU
+   this is executed.  E.g. "-march=armv7-a" on a Cortex-A8 for
+   -march=native.  If the routine can't detect a known processor,
+   the -march or -mtune option is discarded.
+
+   ARGC and ARGV are set depending on the actual arguments given
+   in the spec.  */
+const char *
+host_detect_local_cpu (int argc, const char **argv)
+{
+  const char *val = NULL;
+  char buf[128];
+  FILE *f;
+  bool arch;
+  const struct vendor_cpu *cpu_table = NULL;
+
+  if (argc < 1)
+goto not_found;
+
+  arch = strcmp (argv[0], "arch") == 0;
+  if (!arch && strcmp (argv[0], "cpu") != 0 && strcmp (argv[0], "tune"))
+goto not_found;
+
+  f = fopen ("/proc/cpuinfo", "r");
+  if (f == NULL)
+goto not_found;
+
+  while (fgets (buf, sizeof (buf), f) != NULL)
+{
+  /* Ensure that CPU implementer is ARM (0x41).  */
+  if (strncmp (buf, "CPU implementer", sizeof ("CPU implementer") - 1) == 0)
+	{
+	  int i;
+	  for (i = 0; vendors[i].vendor_no != NULL; i++)
+	if (strstr (buf, vendors[i].vendor_no) != NULL)
+	  {
+		cpu_table = vendors[i].vendor_parts;
+		break;
+	  }
+	}
+
+  /* Detect arch/cpu.  */
+

Re: [Patch]: fix typo in rs6000.c (AIX bootstrap broken)

2011-10-18 Thread David Edelsohn

Tristan,

Mike fixed the typo already.  Why are you still seeing a problem?

- David

On Tue, Oct 18, 2011 at 6:52 AM, Tristan Gingold  wrote:
> Ping…
>
> On Oct 13, 2011, at 5:11 PM, Tristan Gingold wrote:
>
>> Hi,
>>
>> looks like an obvious typo.  Ok for trunk ?
>>
>> Tristan.
>>
>> 2011-10-13  Tristan Gingold  
>>
>>       * config/rs6000/rs6000.c (rs6000_init_builtins): Fix typo.
>>
>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>> index 4fd2192..3bfe33e 100644
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -12213,7 +12213,7 @@ rs6000_init_builtins (void)
>>
>> #if TARGET_XCOFF
>>   /* AIX libm provides clog as __clog.  */
>> -  if ((tdecl = builtin_decl_explicit ([BUILT_IN_CLOG))) != NULL_TREE)
>> +  if ((tdecl = builtin_decl_explicit (BUILT_IN_CLOG)) != NULL_TREE)
>>     set_user_assembler_name (tdecl, "__clog");
>> #endif
>>
>>
>
>

Re: [PATCH] Fix PR46556 (poor address generation)

2011-10-18 Thread William J. Schmidt

Greetings,

Here is a new revision of the tree portions of this patch.  I moved the
pattern recognizer to expand, and added additional logic to look for the
same pattern in gimple form.  I added two more tests to verify the new
logic.

I didn't run into any problems with the RTL CSE phases.  I can't recall
for certain what caused me to abandon the expand version previously.
There may not have been good reason; too many versions to keep track of
and too many interruptions, I suppose.  In any case, I'm much happier
having this code in the expander.

Paolo's RTL logic for unpropagating the zero-offset case is not going to
work out as is.  It causes a number of performance degradations, which I
suspect are due to the pass reordering.  That's a separate issue,
though, and not needed for this patch.

Bootstrapped and regression-tested on powerpc64-linux.  SPEC cpu2000
shows a number of small improvements and no significant degradations.
SPEC cpu2006 testing is pending.

Thanks,
Bill


2011-10-18  Bill Schmidt  

gcc:

PR rtl-optimization/46556
* expr.c (tree-pretty-print.h): New include.
(restructure_base_and_offset): New function.
(restructure_mem_ref): New function.
(expand_expr_real_1): In MEM_REF case, attempt restructure_mem_ref
first.  In normal_inner_ref case, attempt restructure_base_and_offset
first.
* Makefile.in: Update dependences for expr.o.

gcc/testsuite:

PR rtl-optimization/46556
* gcc.dg/tree-ssa-pr46556-1.c: New test.
* gcc.dg/tree-ssa-pr46556-2.c: Likewise.
* gcc.dg/tree-ssa-pr46556-3.c: Likewise.
* gcc.dg/tree-ssa-pr46556-4.c: Likewise.
* gcc.dg/tree-ssa-pr46556-5.c: Likewise.


Index: gcc/testsuite/gcc.dg/tree-ssa/pr46556-1.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr46556-1.c   (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr46556-1.c   (revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+
+struct x
+{
+  int a[16];
+  int b[16];
+  int c[16];
+};
+
+extern void foo (int, int, int);
+
+void
+f (struct x *p, unsigned int n)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+}
+
+/* { dg-final { scan-rtl-dump-times "\\(mem/s:SI \\(plus:" 2 "expand" } } */
+/* { dg-final { scan-rtl-dump-times "const_int 128" 1 "expand" } } */
+/* { dg-final { scan-rtl-dump-times "const_int 64 \\\[0x40\\\]\\)\\) \\\[" 1 
"expand" } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr46556-2.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr46556-2.c   (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr46556-2.c   (revision 0)
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+
+struct x
+{
+  int a[16];
+  int b[16];
+  int c[16];
+};
+
+extern void foo (int, int, int);
+
+void
+f (struct x *p, unsigned int n)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+  if (n > 12)
+foo (p->a[n], p->c[n], p->b[n]);
+  else if (n > 3)
+foo (p->b[n], p->a[n], p->c[n]);
+}
+
+/* { dg-final { scan-rtl-dump-times "\\(mem/s:SI \\(plus:" 6 "expand" } } */
+/* { dg-final { scan-rtl-dump-times "const_int 128" 3 "expand" } } */
+/* { dg-final { scan-rtl-dump-times "const_int 64 \\\[0x40\\\]\\)\\) \\\[" 3 
"expand" } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr46556-3.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr46556-3.c   (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr46556-3.c   (revision 0)
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+struct x
+{
+  int a[16];
+  int b[16];
+  int c[16];
+};
+
+extern void foo (int, int, int);
+
+void
+f (struct x *p, unsigned int n)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+  if (n > 3)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+  if (n > 12)
+   foo (p->b[n], p->a[n], p->c[n]);
+}
+}
+
+/* { dg-final { scan-rtl-dump-times "\\(mem/s:SI \\(plus:" 6 "expand" } } */
+/* { dg-final { scan-rtl-dump-times "const_int 128" 3 "expand" } } */
+/* { dg-final { scan-rtl-dump-times "const_int 64 \\\[0x40\\\]\\)\\) \\\[" 3 
"expand" } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr46556-4.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr46556-4.c   (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr46556-4.c   (revision 0)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+
+struct x
+{
+  int a[16];
+  int b[16];
+  int c[16];
+};
+
+extern void foo (int, int, int);
+
+void
+f (struct x *p, unsigned int n)
+{
+  foo (*((int *)p + n*4), *((int *)p + 32 + n*4), *((int *)p + 16 + n*4));
+  if (n > 3)
+{
+  foo (*((int *)p + n*4), *((int *)p + 32 + n*4), *

Re: [Patch]: fix typo in rs6000.c (AIX bootstrap broken)

2011-10-18 Thread Tristan Gingold


On Oct 18, 2011, at 4:23 PM, David Edelsohn wrote:

> Tristan,
> 
> Mike fixed the typo already.  Why are you still seeing a problem?

Indeed, was fixed later.

Tristan.

> 
> - David
> 
> On Tue, Oct 18, 2011 at 6:52 AM, Tristan Gingold  wrote:
>> Ping…
>> 
>> On Oct 13, 2011, at 5:11 PM, Tristan Gingold wrote:
>> 
>>> Hi,
>>> 
>>> looks like an obvious typo.  Ok for trunk ?
>>> 
>>> Tristan.
>>> 
>>> 2011-10-13  Tristan Gingold  
>>> 
>>>   * config/rs6000/rs6000.c (rs6000_init_builtins): Fix typo.
>>> 
>>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>>> index 4fd2192..3bfe33e 100644
>>> --- a/gcc/config/rs6000/rs6000.c
>>> +++ b/gcc/config/rs6000/rs6000.c
>>> @@ -12213,7 +12213,7 @@ rs6000_init_builtins (void)
>>> 
>>> #if TARGET_XCOFF
>>>   /* AIX libm provides clog as __clog.  */
>>> -  if ((tdecl = builtin_decl_explicit ([BUILT_IN_CLOG))) != NULL_TREE)
>>> +  if ((tdecl = builtin_decl_explicit (BUILT_IN_CLOG)) != NULL_TREE)
>>> set_user_assembler_name (tdecl, "__clog");
>>> #endif
>>> 
>>> 
>> 
>>

Re: [PATCH][ARM] -m{cpu,tune,arch}=native

2011-10-18 Thread Richard Earnshaw

On 18/10/11 15:23, Andrew Stubbs wrote:
> +  /* Detect arch/cpu.  */
> +  if (strncmp (buf, "CPU part", sizeof ("CPU part") - 1) == 0)
> + {
> +   int i;
> +
> +   if (cpu_table == NULL)
> + goto not_found;
> +

Which still jumps to not_found without closing f.

R.

Patch ping: [PR 50562]: configure

2011-10-18 Thread Jeremy Stenglein (jstengle)

http://gcc.gnu.org/ml/gcc-patches/2011-10/msg00291.html

configure: add target-libgloss to skipdirs when using --without-newlib.

Testing: Bootstrapped/tested on x86_64-linux and sparc-sun-solaris2.10 
with a gcc+newlib+libgloss combined tree.

Thanks,
Jeremy

[Patch, libfortran, 1/3] Simplify handling of special files

2011-10-18 Thread Janne Blomqvist

Hi,

in a few places in libgfortran we have some code for handling special
and/or non-seekable files differently. The problem is that special
files don't all have some nice consistent behavior. E.g. wrt. seeking,
some allow seeking just fine, others allow some seeks and not others,
others allow them but always return an offset of 0, and yet others
fail the seek completely.

The Fortran standard doesn't really help here except for noting that
some files may not be positionable, and thus statements requiring the
file position to be modified may fail on such files.

Obviously, libgfortran itself cannot enumerate all the possible
variations for how a special file may behave, and trying to impose
some kind of least common denominator may hide essential capability.
Having thought about this, my conclusion is that the only thing that
makes sense is that we do what the caller asks us to do, and if that
fails, we report the error back to the caller and let the caller
handle it. The attached patch implements this.

Regtested on x86_64-unknown-linux-gnu, Ok for trunk?

2011-10-18  Janne Blomqvist  

* io/file_pos.c (st_rewind): Handle regular and special files
identically.
* io/intrinsics.c (fseek_sub): Don't check whether we think the
file is seekable, just do what the caller says.
* io/transfer.c (skip_record): First try to seek, then fallback to
reading and throwing away what we read.
* io/unit.c (update_position): Don't check whether file is
seekable, just try to do what we're told.
(unit_truncate): Likewise.
* io/unix.c (struct unix_stream): Remove special_file flag.
(buf_flush): Remove code for handling unseekable files.
(buf_seek): Likewise.
(fd_to_stream): Use buffered IO only for regular files.
(file_length): Remove is_seekable() call.
(is_seekable): Remove function.
(is_special): Likewise.
* io/unix.h: Remove prototypes for is_seekable and is_special.


-- 
Janne Blomqvist
diff --git a/libgcc/configure b/libgcc/configure
old mode 100644
new mode 100755
diff --git a/libgfortran/io/file_pos.c b/libgfortran/io/file_pos.c
index b034f38..2caf601 100644
--- a/libgfortran/io/file_pos.c
+++ b/libgfortran/io/file_pos.c
@@ -407,20 +407,15 @@ st_rewind (st_parameter_filepos *fpp)
 	  if (sseek (u->s, 0, SEEK_SET) < 0)
 	generate_error (&fpp->common, LIBERROR_OS, NULL);
 
-	  /* Handle special files like /dev/null differently.  */
-	  if (!is_special (u->s))
+	  /* Set this for compatibilty with g77 for /dev/null.  */
+	  if (file_length (u->s) == 0)
+	u->endfile = AT_ENDFILE;
+	  else
 	{
 	  /* We are rewinding so we are not at the end.  */
 	  u->endfile = NO_ENDFILE;
 	}
-	  else
-	{
-	  /* Set this for compatibilty with g77 for /dev/null.  */
-	  if (file_length (u->s) == 0  && stell (u->s) == 0)
-		u->endfile = AT_ENDFILE;
-	  /* Future refinements on special files can go here.  */
-	}
-
+	  
 	  u->current_record = 0;
 	  u->strm_pos = 1;
 	  u->read_bad = 0;
diff --git a/libgfortran/io/intrinsics.c b/libgfortran/io/intrinsics.c
index f48bd77..c1287d4 100644
--- a/libgfortran/io/intrinsics.c
+++ b/libgfortran/io/intrinsics.c
@@ -246,7 +246,7 @@ fseek_sub (int * unit, GFC_IO_INT * offset, int * whence, int * status)
   gfc_unit * u = find_unit (*unit);
   ssize_t result = -1;
 
-  if (u != NULL && is_seekable(u->s))
+  if (u != NULL)
 {
   result = sseek(u->s, *offset, *whence);
 
diff --git a/libgfortran/io/transfer.c b/libgfortran/io/transfer.c
index 12aca97..1e054f8 100644
--- a/libgfortran/io/transfer.c
+++ b/libgfortran/io/transfer.c
@@ -2823,18 +2823,12 @@ skip_record (st_parameter_dt *dtp, ssize_t bytes)
   if (dtp->u.p.current_unit->bytes_left_subrecord == 0)
 return;
 
-  if (is_seekable (dtp->u.p.current_unit->s))
+  /* Direct access files do not generate END conditions,
+ only I/O errors.  */
+  if (sseek (dtp->u.p.current_unit->s, 
+	 dtp->u.p.current_unit->bytes_left_subrecord, SEEK_CUR) < 0)
 {
-  /* Direct access files do not generate END conditions,
-	 only I/O errors.  */
-  if (sseek (dtp->u.p.current_unit->s, 
-		 dtp->u.p.current_unit->bytes_left_subrecord, SEEK_CUR) < 0)
-	generate_error (&dtp->common, LIBERROR_OS, NULL);
-
-  dtp->u.p.current_unit->bytes_left_subrecord = 0;
-}
-  else
-{			/* Seek by reading data.  */
+  /* Seeking failed, fall back to seeking by reading data.  */
   while (dtp->u.p.current_unit->bytes_left_subrecord > 0)
 	{
 	  rlength = 
@@ -2850,8 +2844,9 @@ skip_record (st_parameter_dt *dtp, ssize_t bytes)
 
 	  dtp->u.p.current_unit->bytes_left_subrecord -= readb;
 	}
+  return;
 }
-
+  dtp->u.p.current_unit->bytes_left_subrecord = 0;
 }
 
 
diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c
index e8a9b84..d2fb6d0 100644
--- a/libgfortran/io/unit.c
+++ b/libgfortran/io/unit.c
@@ -714,23 +714,21 @@ update_position (gfc_unit *u

Re: [PATCH][ARM] -m{cpu,tune,arch}=native

2011-10-18 Thread Andrew Stubbs


On 18/10/11 15:34, Richard Earnshaw wrote:

On 18/10/11 15:23, Andrew Stubbs wrote:

+  /* Detect arch/cpu.  */
+  if (strncmp (buf, "CPU part", sizeof ("CPU part") - 1) == 0)
+   {
+ int i;
+
+ if (cpu_table == NULL)
+   goto not_found;
+


Which still jumps to not_found without closing f.


Hmmm, I know I fixed that, I know I did!

But I appear to have lost the change somewhere when I updated my checkout?

I'll fix it now.

Andrew

[Patch, libfortran, 2/3] Introduce a ssize member function

2011-10-18 Thread Janne Blomqvist

Hi,

the attached patch introduces a new "member function" ssize in order
to determine the size of a unit. The benefit is that with unbuffered
IO, when getting the size of a file we can replace 3 lseek() calls
with one fstat() call, and also that for unseekable file we get a size
of 0 rather than -1 (which would indicate that the syscall failed).

Regtested on x86_64-unknown-linux-gnu, Ok for trunk?

2011-10-18  Janne Blomqvist  

* io/unix.h (struct stream): Add size function pointer.
(ssize): New inline function.
(file_length): Remove prototype.
* io/unix.c (raw_size): New function.
(raw_init): Initialize st.size pointer.
(buf_size): New function.
(buf_init): Initialize st.size pointer.
(open_internal): Likewise.
(open_internal4): Likewise.
(file_length): Remove function.
* io/file_pos.c (st_rewind): Use ssize instead of file_length.
* io/open.c (test_endfile): Likewise.
* io/transfer.c (data_transfer_init): Likewise.
(next_record_r): Likewise.
(next_record_w): Likewise.
* io/unit.c (update_position): Likewise.


-- 
Janne Blomqvist
diff --git a/libgfortran/io/file_pos.c b/libgfortran/io/file_pos.c
index 2caf601..c8ecc3a 100644
--- a/libgfortran/io/file_pos.c
+++ b/libgfortran/io/file_pos.c
@@ -408,7 +408,7 @@ st_rewind (st_parameter_filepos *fpp)
 	generate_error (&fpp->common, LIBERROR_OS, NULL);
 
 	  /* Set this for compatibilty with g77 for /dev/null.  */
-	  if (file_length (u->s) == 0)
+	  if (ssize (u->s) == 0)
 	u->endfile = AT_ENDFILE;
 	  else
 	{
diff --git a/libgfortran/io/open.c b/libgfortran/io/open.c
index b26d14d..0102b9c 100644
--- a/libgfortran/io/open.c
+++ b/libgfortran/io/open.c
@@ -153,7 +153,7 @@ static const st_option async_opt[] =
 static void
 test_endfile (gfc_unit * u)
 {
-  if (u->endfile == NO_ENDFILE && file_length (u->s) == stell (u->s))
+  if (u->endfile == NO_ENDFILE && ssize (u->s) == stell (u->s))
 u->endfile = AT_ENDFILE;
 }
 
diff --git a/libgfortran/io/transfer.c b/libgfortran/io/transfer.c
index 1e054f8..26263ae 100644
--- a/libgfortran/io/transfer.c
+++ b/libgfortran/io/transfer.c
@@ -2627,7 +2627,7 @@ data_transfer_init (st_parameter_dt *dtp, int read_flag)
 	 a partial record needs to exist.  */
 
   if (dtp->u.p.mode == READING && (dtp->rec - 1)
-	  * dtp->u.p.current_unit->recl >= file_length (dtp->u.p.current_unit->s))
+	  * dtp->u.p.current_unit->recl >= ssize (dtp->u.p.current_unit->s))
 	{
 	  generate_error (&dtp->common, LIBERROR_BAD_OPTION,
 			  "Non-existing record number");
@@ -2944,7 +2944,7 @@ next_record_r (st_parameter_dt *dtp, int done)
 	{
 	  bytes_left = (int) dtp->u.p.current_unit->bytes_left;
 	  bytes_left = min_off (bytes_left, 
-		  file_length (dtp->u.p.current_unit->s)
+		  ssize (dtp->u.p.current_unit->s)
 		  - stell (dtp->u.p.current_unit->s));
 	  if (sseek (dtp->u.p.current_unit->s, 
 			 bytes_left, SEEK_CUR) < 0)
@@ -3309,7 +3309,7 @@ next_record_w (st_parameter_dt *dtp, int done)
 	{
 	  dtp->u.p.current_unit->strm_pos += len;
 	  if (dtp->u.p.current_unit->strm_pos
-		  < file_length (dtp->u.p.current_unit->s))
+		  < ssize (dtp->u.p.current_unit->s))
 		unit_truncate (dtp->u.p.current_unit,
dtp->u.p.current_unit->strm_pos - 1,
&dtp->common);
diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c
index d2fb6d0..1d36214 100644
--- a/libgfortran/io/unit.c
+++ b/libgfortran/io/unit.c
@@ -719,7 +719,7 @@ update_position (gfc_unit *u)
 return;
   else if (cur == 0)
 u->flags.position = POSITION_REWIND;
-  else if (file_length (u->s) == cur)
+  else if (ssize (u->s) == cur)
 u->flags.position = POSITION_APPEND;
   else
 u->flags.position = POSITION_ASIS;
diff --git a/libgfortran/io/unix.c b/libgfortran/io/unix.c
index 00f7c72..c87be13 100644
--- a/libgfortran/io/unix.c
+++ b/libgfortran/io/unix.c
@@ -332,6 +332,16 @@ raw_tell (unix_stream * s)
   return lseek (s->fd, 0, SEEK_CUR);
 }
 
+static gfc_offset
+raw_size (unix_stream * s)
+{
+  struct stat statbuf;
+  int ret = fstat (s->fd, &statbuf);
+  if (ret == -1)
+return ret;
+  return statbuf.st_size;
+}
+
 static int
 raw_truncate (unix_stream * s, gfc_offset length)
 {
@@ -398,6 +408,7 @@ raw_init (unix_stream * s)
   s->st.write = (void *) raw_write;
   s->st.seek = (void *) raw_seek;
   s->st.tell = (void *) raw_tell;
+  s->st.size = (void *) raw_size;
   s->st.trunc = (void *) raw_truncate;
   s->st.close = (void *) raw_close;
   s->st.flush = (void *) raw_flush;
@@ -584,6 +595,12 @@ buf_tell (unix_stream * s)
   return buf_seek (s, 0, SEEK_CUR);
 }
 
+static gfc_offset
+buf_size (unix_stream * s)
+{
+  return s->file_length;
+}
+
 static int
 buf_truncate (unix_stream * s, gfc_offset length)
 {
@@ -613,6 +630,7 @@ buf_init (unix_stream * s)
   s->st.write = (void *) buf_write;
   s->st.seek = (void *) bu

Re: [PATCH 3/6] Emit macro expansion related diagnostics

2011-10-18 Thread David Edelsohn

Hey, Dodji,

Your patch broke bootstrap on AIX because of the typedef "loc_t"
introduced in tree-diagnostics.c.  The typedef conflicts with a
typedef in an AIX 5.3 header file for locales.  AIX should not be
using that namespace, but the failure occurs before fix-includes, so
it is not possible to fix it there.

How can we avoid the namespace conflict?

Thanks, David

[Patch, libfortran, 3/3] Update file position lazily

2011-10-18 Thread Janne Blomqvist

Hi,

libgfortran maintains a position flag which is used by the
INQUIRE(POSITION=...) statement. Currently we update this flag after
every IO statement. For unbuffered IO this is somewhat tedious, as
figuring out whether we're at the beginning of a file or the end
requires at least two syscalls. The attached patch moves this checking
to the inquire implementation, which is certainly less frequently
invoked than READ or WRITE.

Also, I think I've found a small standards conformance bug. From F2008
(N1830) 9.10.2.23 (page 256): "... ASIS if the connection was opened
without changing its position." and "If the ﬁle has been repositioned
since the connection, the scalar-default-char-variable
is assigned a processor-dependent value, which shall not be REWIND
unless the ﬁle is positioned at its initial
point and shall not be APPEND unless the ﬁle is positioned so that its
endﬁle record is the next record or at its
terminal point if it has no endﬁle record.
"

If my understanding of the above is correct, returning ASIS is
incorrent unless the position is unchanged since the OPEN statement.
Currently we return ASIS by default if it's neither REWIND nor APPEND.
So the patch changes the implementation to return the
processor-dependent value UNSPECIFIED in this case.

Regtested on x86_64-unknown-linux-gnu, Ok for trunk?

2011-10-18  Janne Blomqvist  

* io/inquire.c (inquire_via_unit): Check whether we're at the
beginning or end if the position is unspecified. If the position
is not one of the 3 standard ones, return unspecified.
* io/io.h (update_position): Remove prototype.
* io/transfer.c (next_record): Set the position to unspecified,
letting inquire figure it out more exactly when needed.
* io/unit.c (update_position): Remove function.


testsuite ChangeLog:

2011-10-18  Janne Blomqvist  

* gfortran.dg/inquire_5.f90: Update testcase to match the standard
and current implementation.


-- 
Janne Blomqvist
diff --git a/gcc/testsuite/gfortran.dg/inquire_5.f90 b/gcc/testsuite/gfortran.dg/inquire_5.f90
index fe107a1..064f96d 100644
--- a/gcc/testsuite/gfortran.dg/inquire_5.f90
+++ b/gcc/testsuite/gfortran.dg/inquire_5.f90
@@ -1,11 +1,10 @@
 ! { dg-do run { target fd_truncate } }
-! { dg-options "-std=legacy" }
 !
 ! pr19314 inquire(..position=..) segfaults
 ! test by thomas.koe...@online.de
 ! bdavis9...@comcast.net
   implicit none
-  character*20 chr
+  character(len=20) chr
   open(7,STATUS='SCRATCH')
   inquire(7,position=chr)
   if (chr.NE.'ASIS') CALL ABORT
@@ -31,7 +30,8 @@
   write(7,*)'this is another record'
   backspace(7)
   inquire(7,position=chr)
-  if (chr.NE.'ASIS') CALL ABORT
+  if (chr.eq.'ASIS' .or. chr .eq. 'REWIND' &
+   .or. chr .eq. 'APPEND') CALL ABORT
   rewind(7)
   inquire(7,position=chr)
   if (chr.NE.'REWIND') CALL ABORT
diff --git a/libgfortran/io/inquire.c b/libgfortran/io/inquire.c
index 252f29f..fb525ca 100644
--- a/libgfortran/io/inquire.c
+++ b/libgfortran/io/inquire.c
@@ -418,24 +418,36 @@ inquire_via_unit (st_parameter_inquire *iqp, gfc_unit * u)
   if (u == NULL || u->flags.access == ACCESS_DIRECT)
 p = undefined;
   else
-switch (u->flags.position)
-  {
- case POSITION_REWIND:
-   p = "REWIND";
-   break;
- case POSITION_APPEND:
-   p = "APPEND";
-   break;
- case POSITION_ASIS:
-   p = "ASIS";
-   break;
- default:
-   /* if not direct access, it must be
-  either REWIND, APPEND, or ASIS.
-  ASIS seems to be the best default */
-   p = "ASIS";
-   break;
-  }
+	{
+	  /* If the position is unspecified, check if we can figure
+	 out whether it's at the beginning or end.  */
+	  if (u->flags.position == POSITION_UNSPECIFIED)
+	{
+	  gfc_offset cur = stell (u->s);
+	  if (cur == 0)
+		u->flags.position = POSITION_REWIND;
+	  else if (cur != -1 && (ssize (u->s) == cur))
+		u->flags.position = POSITION_APPEND;
+	}
+	  switch (u->flags.position)
+	{
+	case POSITION_REWIND:
+	  p = "REWIND";
+	  break;
+	case POSITION_APPEND:
+	  p = "APPEND";
+	  break;
+	case POSITION_ASIS:
+	  p = "ASIS";
+	  break;
+	default:
+	  /* If the position has changed and is not rewind or
+		 append, it must be set to a processor-dependent
+		 value.  */
+	  p = "UNSPECIFIED";
+	  break;
+	}
+	}
   cf_strcpy (iqp->position, iqp->position_len, p);
 }
 
diff --git a/libgfortran/io/io.h b/libgfortran/io/io.h
index 37353d7..23f07ca 100644
--- a/libgfortran/io/io.h
+++ b/libgfortran/io/io.h
@@ -608,9 +608,6 @@ internal_proto(get_unit);
 extern void unlock_unit (gfc_unit *);
 internal_proto(unlock_unit);
 
-extern void update_position (gfc_unit *);
-internal

Re: [PATCH 3/6] Emit macro expansion related diagnostics

2011-10-18 Thread Joseph S. Myers

On Tue, 18 Oct 2011, David Edelsohn wrote:

> Hey, Dodji,
> 
> Your patch broke bootstrap on AIX because of the typedef "loc_t"
> introduced in tree-diagnostics.c.  The typedef conflicts with a
> typedef in an AIX 5.3 header file for locales.  AIX should not be
> using that namespace, but the failure occurs before fix-includes, so
> it is not possible to fix it there.

The whole *_t namespace is reserved by POSIX, so it's not unreasonable for 
AIX to have some AIX-specific *_t typedefs although it seems quite rare 
for this to cause conflicts in practice (and AIX is obviously risking 
conflicts with future POSIX revisions by using *_t for its own types).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 3/6] Emit macro expansion related diagnostics

2011-10-18 Thread Jakub Jelinek

On Tue, Oct 18, 2011 at 11:09:04AM -0400, David Edelsohn wrote:
> Your patch broke bootstrap on AIX because of the typedef "loc_t"
> introduced in tree-diagnostics.c.  The typedef conflicts with a
> typedef in an AIX 5.3 header file for locales.  AIX should not be
> using that namespace, but the failure occurs before fix-includes, so
> it is not possible to fix it there.

At least in POSIX *_t is reserved for the implementation (in any header).
So gcc should use something else.

Jakub

Re: [PATCH 3/6] Emit macro expansion related diagnostics

2011-10-18 Thread Gabriel Dos Reis

On Tue, Oct 18, 2011 at 10:19 AM, Joseph S. Myers
 wrote:
> On Tue, 18 Oct 2011, David Edelsohn wrote:
>
>> Hey, Dodji,
>>
>> Your patch broke bootstrap on AIX because of the typedef "loc_t"
>> introduced in tree-diagnostics.c.  The typedef conflicts with a
>> typedef in an AIX 5.3 header file for locales.  AIX should not be
>> using that namespace, but the failure occurs before fix-includes, so
>> it is not possible to fix it there.
>
> The whole *_t namespace is reserved by POSIX, so it's not unreasonable for
> AIX to have some AIX-specific *_t typedefs although it seems quite rare
> for this to cause conflicts in practice (and AIX is obviously risking
> conflicts with future POSIX revisions by using *_t for its own types).

any reason we could not call it location_type? or just location?

[PATCH, m68k] Enable building for ColdFire Linux

2011-10-18 Thread Julian Brown

Hi,

The attached patch (by Maxim Kuvyrkov) enables building GCC for Linux
when the --with-arch=cf configuration option is in effect (i.e. when
building a compiler targeting ColdFire rather than legacy m68k).

Without the patch the build fails with an error such as:

.../gcc/config/m68k/t-mlibs:45: *** Error default cpu 'mcf5475' is not
in multilib set ''.  Stop.

OK to apply?

Thanks,

Julian

ChangeLog

Maxim Kuvyrkov  

gcc/
* config/m68k/t-linux (M68K_MLIB_CPU): Add ColdFire CPUs.
Index: gcc/config/m68k/t-linux
===
--- gcc/config/m68k/t-linux	(revision 179967)
+++ gcc/config/m68k/t-linux	(working copy)
@@ -18,8 +18,8 @@
 
 EXTRA_MULTILIB_PARTS=crtbegin.o crtend.o crtbeginS.o crtendS.o crtbeginT.o
 
-# Only include multilibs for 680x0 CPUs with an MMU.
-M68K_MLIB_CPU += && (CPU ~ "^m680") && (FLAGS ~ "FL_MMU")
+# Only include multilibs for 680x0 and ColdFire CPUs with an MMU.
+M68K_MLIB_CPU += && ((CPU ~ "^m680") || (CPU ~ "^mcf")) && (FLAGS ~ "FL_MMU")
 
 # This rule uses MULTILIB_MATCHES to generate a definition of
 # SYSROOT_SUFFIX_SPEC.

Re: [Patch, fortran] [00/14] PR fortran/50420 Support coarray subreferences

2011-10-18 Thread Mikael Morin

On Sunday 09 October 2011 18:25:25 Tobias Burnus wrote:
> On 07.10.2011 16:38, Mikael Morin wrote:
> > The full patchset has passed the fortran testsuite successfully.
> > OK for trunk?
> 
> OK for the whole patch set. Thanks for finding and fixing the issue!
> 
Committed as follows.
I committed 8 and 9 together because I was fearing breakage.

patch  revision
===
1  180140
2  180141
3  180142
4  180143
5  180144
6  180145
7  180146
8,9180147
10 180148
11 180149
12 180150
13 180151
14 180152
tests  180153

Mikael

[PATCH] i?86 vec_perm fixes and improvements

2011-10-18 Thread Jakub Jelinek

Hi!

Now that there is a better testsuite for constant reshuffling, this patch
fixes various issues I found plus improves various permutations.
Bootstrapped/regtested on x86_64-linux and i686-linux, additionally
tested with
GCC_TEST_RUN_EXPENSIVE=1 make check-gcc 
RUNTESTFLAGS='--target_board=unix\{-msse2,-msse4,-mavx\} dg-torture.exp=vshuf*'
on AVX capable box and tested -mavx2 compiled tests on sde.
Ok for trunk?

Examples of improvements, say for V16HImode:
-   vpshuflw$228, a(%rip), %ymm0
+   vmovdqa a(%rip), %ymm0
vmovdqa %ymm0, c(%rip)
(for identity permutation), ICE vs.
+   vpbroadcastwa(%rip), %ymm0
+   vmovdqa %ymm0, c(%rip)
using vpbroadcast* for broadcast shuffle,
-   vpshufb .LC0(%rip), %ymm0, %ymm1
-   vpshufb .LC1(%rip), %ymm0, %ymm0
-   vpermq  $78, %ymm1, %ymm1
-   vpor%ymm1, %ymm0, %ymm0
+   vperm2i128  $0, %ymm0, %ymm0, %ymm0
+   vpshufb .LC0(%rip), %ymm0, %ymm0
when both lanes refer to just one lane, > 20 insns (full two argument
non-constant shuffle) into:
+   vmovdqa a(%rip), %ymm0
+   vpunpcklwd  b(%rip), %ymm0, %ymm0
+   vpshufb .LC2(%rip), %ymm0, %ymm0
+   vmovdqa %ymm0, c(%rip)
(resp. vpunpckhwd) when interleave gives something vpshufb can reshuffle
afterwards,
-   vmovdqa a(%rip), %ymm0
-   vpshufb .LC11(%rip), %ymm0, %ymm1
-   vpshufb .LC12(%rip), %ymm0, %ymm0
-   vpermq  $78, %ymm1, %ymm1
-   vpor%ymm1, %ymm0, %ymm0
+   vpermq  $156, a(%rip), %ymm0
+   vpshufb .LC4(%rip), %ymm0, %ymm0
another case where vpermq can shuffle quadwords into something vpshufb can
reshuffle, etc.

2011-10-18  Jakub Jelinek  

* config/i386/i386.c (ix86_expand_vec_perm): In merge_two use
mode SUBREG of operands[0] as target.
(valid_perm_using_mode_p): Don't ignore higher bits of d->perm.
(expand_vec_pshufb): For V8SImode vmode emit avx2_permvarv8si.
(expand_vec_perm_1): Handle identity and some broadcast
permutations.
(expand_vec_perm_interleave2): Handle also 32-byte modes, using
vperm2[fi]128 or vpunpck[lh]* followed by single insn permutation.
For d->testing_p return true earlier to avoid creating more GC
garbage.
(expand_vec_perm_vpermq_perm_1): New function.
(expand_vec_perm_vpshufb2_vpermq): For d->testing_p return true
earlier to avoid creating more GC garbage.  Fix handling of
V16HImode.  Avoid some SUBREGs in SET_DEST.
(expand_vec_perm_broadcast_1): Return false for 32-byte integer
vector modes.
(expand_vec_perm_vpshufb4_vpermq2): New function.
(ix86_expand_vec_perm_builtin_1): Call expand_vec_perm_vpermq_perm_1
and expand_vec_perm_vpshufb4_vpermq2.

--- gcc/config/i386/i386.c.jj   2011-10-17 22:27:39.0 +0200
+++ gcc/config/i386/i386.c  2011-10-18 14:08:58.0 +0200
@@ -19663,7 +19663,7 @@ ix86_expand_vec_perm (rtx operands[])
   mask = expand_simple_binop (maskmode, AND, mask, vt,
  NULL_RTX, 0, OPTAB_DIRECT);
 
-  xops[0] = operands[0];
+  xops[0] = gen_lowpart (mode, operands[0]);
   xops[1] = gen_lowpart (mode, t2);
   xops[2] = gen_lowpart (mode, t1);
   xops[3] = gen_rtx_EQ (maskmode, mask, vt);
@@ -35006,8 +35006,7 @@ valid_perm_using_mode_p (enum machine_mo
   return false;
 else
   for (j = 1; j < chunk; ++j)
-   if ((d->perm[i] & (d->nelt - 1)) + j
-   != (d->perm[i + j] & (d->nelt - 1)))
+   if (d->perm[i] + j != d->perm[i + j])
  return false;
 
   return true;
@@ -35138,6 +35137,8 @@ expand_vec_perm_pshufb (struct expand_ve
emit_insn (gen_ssse3_pshufbv16qi3 (target, op0, vperm));
   else if (vmode == V32QImode)
emit_insn (gen_avx2_pshufbv32qi3 (target, op0, vperm));
+  else
+   emit_insn (gen_avx2_permvarv8si (target, vperm, op0));
 }
   else
 {
@@ -35163,9 +35164,58 @@ expand_vec_perm_1 (struct expand_vec_per
   if (d->op0 == d->op1)
 {
   int mask = nelt - 1;
+  bool identity_perm = true;
+  bool broadcast_perm = true;
 
   for (i = 0; i < nelt; i++)
-   perm2[i] = d->perm[i] & mask;
+   {
+ perm2[i] = d->perm[i] & mask;
+ if (perm2[i] != i)
+   identity_perm = false;
+ if (perm2[i])
+   broadcast_perm = false;
+   }
+
+  if (identity_perm)
+   {
+ if (!d->testing_p)
+   emit_move_insn (d->target, d->op0);
+ return true;
+   }
+  else if (broadcast_perm && TARGET_AVX2)
+   {
+ /* Use vpbroadcast{b,w,d}.  */
+ rtx op = d->op0, (*gen) (rtx, rtx) = NULL;
+ switch (d->vmode)
+   {
+   case V32QImode:
+ op = gen_lowpart (V16QImode, op);
+ gen = gen_avx2_pbroadcastv32qi;
+ break;
+   case V16HImode:
+ op = gen_lowpart (V8HImode, op);
+ gen = gen_avx2_pbro

Re: [PATCH] Clear DECL_GIMPLE_REG_P when making parameter copy addressable (PR tree-optimization/50735)

2011-10-18 Thread Jakub Jelinek

On Mon, Oct 17, 2011 at 08:49:34AM +0200, Richard Guenther wrote:
> On Sun, Oct 16, 2011 at 5:47 PM, Jakub Jelinek  wrote:
> I think this should be exactly the other way around, using create_tmp_var,
> copying TREE_ADDRESSABLE and setting DECL_GIMPLE_REG_P if it is not 
> addressable.
> 
> Ok with that change.

Here is what I've committed after bootstrap/regtest on x86_64-linux and
i686-linux:

2011-10-18  Jakub Jelinek  

PR tree-optimization/50735
* function.c (gimplify_parameters): Use create_tmp_var instead of
create_tmp_reg.  If parm is not TREE_ADDRESSABLE and type is complex
or vector type, set DECL_GIMPLE_REG_P.

--- gcc/function.c.jj   2011-10-17 09:20:38.0 +0200
+++ gcc/function.c  2011-10-18 13:54:02.0 +0200
@@ -3617,7 +3617,7 @@ gimplify_parameters (void)
   && compare_tree_int (DECL_SIZE_UNIT (parm),
STACK_CHECK_MAX_VAR_SIZE) > 0))
{
- local = create_tmp_reg (type, get_name (parm));
+ local = create_tmp_var (type, get_name (parm));
  DECL_IGNORED_P (local) = 0;
  /* If PARM was addressable, move that flag over
 to the local copy, as its address will be taken,
@@ -3625,6 +3625,9 @@ gimplify_parameters (void)
 as we'll query that flag during gimplification.  */
  if (TREE_ADDRESSABLE (parm))
TREE_ADDRESSABLE (local) = 1;
+ else if (TREE_CODE (type) == COMPLEX_TYPE
+  || TREE_CODE (type) == VECTOR_TYPE)
+   DECL_GIMPLE_REG_P (local) = 1;
}
  else
{

Jakub

[PATCH] Fix gcc.dg/guality/asm-1.c regressions on 4.6 branch

2011-10-18 Thread Jakub Jelinek

Hi!

Apparently a recent change (2011-10-1[23] or so) regressed
gcc.dg/guality/asm-1.c test on x86_64-linux on the 4.6 branch.
This turned out to be a bug in loc_descriptor which has been passing wrong
mode on the recursive call, so when we had var_location note containing
SImode subreg of some DImode expression, the inner loc_descriptor would
just give up because of the mode mismatch.

Fixed thusly (here the trunk version of the patch, attached 4.6 version),
bootstrapped/regtested on x86_64-linux and i686-linux (both branches in each
case), ok for trunk/4.6?

2011-10-18  Jakub Jelinek  

* dwarf2out.c (loc_descriptor): For SUBREG pass SUBREG_REG's mode as
second argument instead of mode.

--- gcc/dwarf2out.c.jj  2011-09-08 11:21:10.0 +0200
+++ gcc/dwarf2out.c 2011-10-18 14:09:46.0 +0200
@@ -12502,7 +12502,8 @@ loc_descriptor (rtx rtl, enum machine_mo
 legitimate to make the Dwarf info refer to the whole register which
 contains the given subreg.  */
   if (REG_P (SUBREG_REG (rtl)) && subreg_lowpart_p (rtl))
-   loc_result = loc_descriptor (SUBREG_REG (rtl), mode, initialized);
+   loc_result = loc_descriptor (SUBREG_REG (rtl),
+GET_MODE (SUBREG_REG (rtl)), initialized);
   else
goto do_default;
   break;

Jakub
2011-10-18  Jakub Jelinek  

* dwarf2out.c (loc_descriptor): For SUBREG pass SUBREG_REG's mode as
second argument instead of mode.

--- gcc/dwarf2out.c.jj  2011-08-02 10:42:23.0 +0200
+++ gcc/dwarf2out.c 2011-10-18 11:21:38.0 +0200
@@ -14608,7 +14608,8 @@ loc_descriptor (rtx rtl, enum machine_mo
 up an entire register.  For now, just assume that it is
 legitimate to make the Dwarf info refer to the whole register which
 contains the given subreg.  */
-  loc_result = loc_descriptor (SUBREG_REG (rtl), mode, initialized);
+  loc_result = loc_descriptor (SUBREG_REG (rtl),
+  GET_MODE (SUBREG_REG (rtl)), initialized);
   break;
 
 case REG:

Re: [PATCH, i386 tests] New tests to check vectorization for AVX2 insns.

2011-10-18 Thread H.J. Lu

On Mon, Oct 17, 2011 at 7:49 AM, Kirill Yukhin  wrote:
> Thanks, guys, could anybody please commit that?
>

I checked it in for you.


-- 
H.J.

[commit, spu] Fix -fPIC wrong-code regression

2011-10-18 Thread Ulrich Weigand

Hello,

current mainline miscompiles -fPIC code on SPU, which leads to wrong code
in libgcc, causing a large number of test failures.

The problem is in the back-end's get_pic_reg routine which implements an
optimization: in a leaf function, if register 74 is unused so far, use that
register instead of the standard register 124 as PIC register.  This helps
becaues 74 is call-clobbered, and thus no prologue/epilogue code is needed.

However, there is a problem in how the "register 74 is unused so far"
check is implemented: with current mainline, this check succeeds in
the post-reload splitter where uses of the PIC register are introduced.
However, that same check is later repeated in prologue generation in
order to emit code to set up the PIC register.  At this point, register
74 is now considered as in use.  This is true, of course, since it is
in use by the code that was added in the post-reload splitter ...

It would appear this used to work in the past because DF information
was not updated between the post-reload splitter and prologue emitter
passes.  This seems to have changed recently (maybe due to the shrink-
wrapping pass?), and now the back-end bug is exposed.

Fortunately, the fix is simple: we perform the DF check only once and
remember the result in cfun->machine.  The patch below implements this.

Tested on spu-elf, fixes many testcase failures.
Committed to mainline.

Bye,
Ulrich


ChangeLog:

* config/spu/spu.c (struct machine_function): New data structure.
(spu_init_machine_status): New function.
(spu_option_override): Install it.
(get_pic_reg): Set and use cfun->machine->pic_reg.
(spu_split_immediate): Do not set crtl->uses_pic_offset_table.
(need_to_save_reg): Use cfun->machine->pic_reg instead of
checking crtl->uses_pic_offset_table.
(spu_expand_prologue): Likewise.


Index: gcc/config/spu/spu.c
===
*** gcc/config/spu/spu.c(revision 179977)
--- gcc/config/spu/spu.c(working copy)
*** static void spu_setup_incoming_varargs (
*** 500,509 
--- 500,526 
  
  struct gcc_target targetm = TARGET_INITIALIZER;
  
+ /* Define the structure for the machine field in struct function.  */
+ struct GTY(()) machine_function
+ {
+   /* Register to use for PIC accesses.  */
+   rtx pic_reg;
+ };
+ 
+ /* How to allocate a 'struct machine_function'.  */
+ static struct machine_function *
+ spu_init_machine_status (void)
+ {
+   return ggc_alloc_cleared_machine_function ();
+ }
+ 
  /* Implement TARGET_OPTION_OVERRIDE.  */
  static void
  spu_option_override (void)
  {
+   /* Set up function hooks.  */
+   init_machine_status = spu_init_machine_status;
+ 
/* Small loops will be unpeeled at -O3.  For SPU it is more important
   to keep code small by default.  */
if (!flag_unroll_loops && !flag_peel_loops)
*** print_operand (FILE * file, rtx x, int c
*** 1741,1752 
  static rtx
  get_pic_reg (void)
  {
-   rtx pic_reg = pic_offset_table_rtx;
if (!reload_completed && !reload_in_progress)
  abort ();
!   if (current_function_is_leaf && !df_regs_ever_live_p (LAST_ARG_REGNUM))
! pic_reg = gen_rtx_REG (SImode, LAST_ARG_REGNUM);
!   return pic_reg;
  }
  
  /* Split constant addresses to handle cases that are too large. 
--- 1758,1779 
  static rtx
  get_pic_reg (void)
  {
if (!reload_completed && !reload_in_progress)
  abort ();
! 
!   /* If we've already made the decision, we need to keep with it.  Once we've
!  decided to use LAST_ARG_REGNUM, future calls to df_regs_ever_live_p may
!  return true since the register is now live; this should not cause us to
!  "switch back" to using pic_offset_table_rtx.  */
!   if (!cfun->machine->pic_reg)
! {
!   if (current_function_is_leaf && !df_regs_ever_live_p (LAST_ARG_REGNUM))
!   cfun->machine->pic_reg = gen_rtx_REG (SImode, LAST_ARG_REGNUM);
!   else
!   cfun->machine->pic_reg = pic_offset_table_rtx;
! }
! 
!   return cfun->machine->pic_reg;
  }
  
  /* Split constant addresses to handle cases that are too large. 
*** spu_split_immediate (rtx * ops)
*** 1849,1855 
{
  rtx pic_reg = get_pic_reg ();
  emit_insn (gen_addsi3 (ops[0], ops[0], pic_reg));
- crtl->uses_pic_offset_table = 1;
}
  return flag_pic || c == IC_IL2s;
}
--- 1876,1881 
*** need_to_save_reg (int regno, int saving)
*** 1875,1883 
  return 1;
if (flag_pic
&& regno == PIC_OFFSET_TABLE_REGNUM
!   && (!saving || crtl->uses_pic_offset_table)
!   && (!saving
! || !current_function_is_leaf || df_regs_ever_live_p 
(LAST_ARG_REGNUM)))
  return 1;
return 0;
  }
--- 1901,1907 
  return 1;
if (flag_pic
&& regno == PIC_OFFSET_TABLE_REGNUM
!   && (!saving || cfun->machine->pic_reg == pic_offset_table_

Re: [PATCH, i386 tests] New tests to check vectorization for AVX2 insns.

2011-10-18 Thread Kirill Yukhin

Thank you!

K

On Tue, Oct 18, 2011 at 7:42 PM, H.J. Lu  wrote:
> On Mon, Oct 17, 2011 at 7:49 AM, Kirill Yukhin  
> wrote:
>> Thanks, guys, could anybody please commit that?
>>
>
> I checked it in for you.
>
>
> --
> H.J.
>

Re: [PATCH 3/6] Emit macro expansion related diagnostics

2011-10-18 Thread Dodji Seketeli

David Edelsohn  writes:

> Your patch broke bootstrap on AIX because of the typedef "loc_t"
> introduced in tree-diagnostics.c.  The typedef conflicts with a
> typedef in an AIX 5.3 header file for locales.  AIX should not be
> using that namespace, but the failure occurs before fix-includes, so
> it is not possible to fix it there.
>
> How can we avoid the namespace conflict?

I'll change the name to location_type as Gaby proposed and commit it as
obvious.  Sorry for the inconvenience.

-- 
Dodji

[PATCH] Fix computed gotos on m68k

2011-10-18 Thread Julian Brown

Hi,

This patch fixes computed gotos on m68k, and probably other targets too
(the fix is in the middle end). Several tests fail at present with "-O3
-fomit-frame-pointer".

One example of erroneous behaviour is as follows: the
function 'x' in comp-goto-1.c compiles to:

x:
lea (-104,%sp),%sp
lea (104,%sp),%a0
move.l %a0,92(%sp)
fmovem #63,44(%sp)
movem.l #31996,(%sp)
move.l %sp,96(%sp)
move.l 108(%sp),-(%sp)
lea (96,%sp),%a0
jsr y.1217
.L3:
.L12:  
move.l 112(%sp),%d0   <--- wrong offset, should be 108.
movem.l (%sp),#31996
fmovem 44(%sp),#63
lea (104,%sp),%sp
rts

Where the source code looks like:

  int
  x(a)
  {
__label__ xlab;
void y(a)
  {
void *x = &&llab;
if (a==-1)
  goto *x;
if (a==0)
  goto xlab;
  llab:
y (a-1);
  }
y (a);
   xlab:;
return a;
  }

The offset is intended to load the argument value 'a' for the "return
a;" statement, but actually loads one word beyond that.

This has been broken because since mainline revision r160124, which
(correctly) causes 'y' to be flagged as noreturn -- y never returns via
the normal function return route. If ipa-pure-const.c is hacked to
check that a function does not perform non-local gotos before marking
it noreturn, the original (correct) offset is restored -- but that is
simply papering over the problem.

Prior to r160124, or with the aforementioned patch to inhibit the
"noreturn" attribute for the function y, the call to y from x (just
before "xlab:") has a goto to the following label emitted at expand
time. Before/after that patch is applied, the diff of 148r.expand
around the relevant part looks like:

   # BLOCK 2 freq:1
   # PRED: ENTRY [100.0%]  (fallthru,exec)
   y (a_1(D)); [static-chain: &FRAME.0]
-  # SUCC: 3 [100.0%]  (ab,exec)
+  goto  (xlab);
+  # SUCC: 3 [61.0%]  (ab,exec) 4 [39.0%]  (fallthru,exec)
 
-  # BLOCK 3 freq:1
-  # PRED: 2 [100.0%]  (ab,exec)
+  # BLOCK 3 freq:6100
+  # PRED: 2 [61.0%]  (ab,exec)
 : [non-local]
   # SUCC: 4 [100.0%]  (fallthru,exec)
 
   # BLOCK 4 freq:1
-  # PRED: 3 [100.0%]  (fallthru,exec)
+  # PRED: 2 [39.0%]  (fallthru,exec) 3 [100.0%]  (fallthru,exec)
 xlab:

Now, this is important because when the "goto " is emitted, a
stack adjustment is also emitted to pop y's arguments (via
dojump.c:do_pending_stack_adjust). Without that (i.e. the present state
with no patches), when later passes try to figure out elimination
offsets (from arg pointer to frame pointer, frame pointer to stack
pointer and so on) at the labels L0/xlab, they get the wrong answer.
(This happens in reload, called from ira.)

So, on to the attached fix. As mentioned previously, GCC tracks
elimination offsets throughout each function, which can vary as e.g.
other functions are called. There is a concept of an "initial" offset:
I believe that when nonlocal gotos are executed, and hence at all the
labels a nonlocal goto may reach, the elimination offsets must have
their "initial" values.

We can find out if a label is the target of a nonlocal goto (and hence
force the use of initial offsets) in a somewhat roundabout way, by
looking up its containing basic block, seeing if that BB is a nonlocal
goto target, then seeing if the label is the first instruction in the
block. This seems slightly clumsy, but I didn't find another way of
doing it.

Interestingly perhaps, the comment near the fix in
reload1.c:set_label_offsets hints at a possible issue which may be
related:

case CODE_LABEL:
  /* If we know nothing about this label, set the desired offsets.
Note that this sets the offset at a label to be the offset before a
label if we don't know anything about the label.  This is not
correct for the label after a BARRIER, but is the best guess we can
make.  If we guessed wrong, we will suppress an elimination that
might have been possible had we been able to guess correctly.  */

But obviously, in our case, rather than just failing to make an
elimination which may have been possible, we get wrong code instead
(there is a barrier before our non-local-goto-receiving label).

Anyway. The attached patch appears to work fine, and we get the
following changes in test results (cross-building and testing to
ColdFire Linux):

FAIL -> PASS: gcc.c-torture/execute/920501-7.c execution,  -O3 
-fomit-frame-pointer 
FAIL -> PASS: gcc.c-torture/execute/comp-goto-2.c execution,  -O3 
-fomit-frame-pointer 
FAIL -> PASS: gcc.dg/torture/stackalign/comp-goto-1.c  -O3 -fomit-frame-pointer 
 execution test
FAIL -> PASS: gcc.dg/torture/stackalign/comp-goto-1.c  -O3 -fomit-frame-pointer 
 execution test [#2]
FAIL -> PASS: gcc.dg/torture/stackalign/non-local-goto-4.c  -O3 
-fomit-frame-pointer  execution test
FAIL -> PASS: gcc.dg/torture/stackalign/non-local-goto-4.c  -O3 
-fomit-frame-pointer  execution test [#2]

Any comments, or OK to apply?

Thanks,

J

[PATCH] libiberty: fix psignal parameter type

2011-10-18 Thread Roland McGrath

When libiberty defines psignal, it doesn't use the canonical signature.
This came up as a problem in a configuration where libiberty wants to
define psignal itself, but the build environment's  declares
it too.  

This was a --with-newlib configuration using the newlib trunk, where newlib
does define psignal (and strsignal, among other things), but libiberty's
configure thinks it knows exactly what newlib does and doesn't define
without checking.  This hard-coding seems unwise to me, exactly because of
the potential for cases like this, where newlib's set of available
functions changes over time.

I think the fragility of that hard-coding ought to be addressed somehow.
But regardless, this change alleviates the immediate problem, and is
otherwise harmless.


Thanks,
Roland


libiberty/
2011-10-18  Roland McGrath  

* strsignal.c (psignal): Use const second in parameter type.
* functions.texi: Updated.

diff --git a/libiberty/functions.texi b/libiberty/functions.texi
index c9df186..2945c61 100644
--- a/libiberty/functions.texi
+++ b/libiberty/functions.texi
@@ -1097,7 +1097,7 @@ documented.
 @end deftypefn
 
 @c strsignal.c:541
-@deftypefn Supplemental void psignal (int @var{signo}, char *@var{message})
+@deftypefn Supplemental void psignal (int @var{signo}, const char 
*@var{message})
 
 Print @var{message} to the standard error, followed by a colon,
 followed by the description of the signal specified by @var{signo},
diff --git a/libiberty/strsignal.c b/libiberty/strsignal.c
index 666b1b4..3b56d16 100644
--- a/libiberty/strsignal.c
+++ b/libiberty/strsignal.c
@@ -538,7 +538,7 @@ strtosigno (const char *name)
 
 /*
 
-@deftypefn Supplemental void psignal (int @var{signo}, char *@var{message})
+@deftypefn Supplemental void psignal (int @var{signo}, const char 
*@var{message})
 
 Print @var{message} to the standard error, followed by a colon,
 followed by the description of the signal specified by @var{signo},
@@ -551,7 +551,7 @@ followed by a newline.
 #ifndef HAVE_PSIGNAL
 
 void
-psignal (int signo, char *message)
+psignal (int signo, const char *message)
 {
   if (signal_names == NULL)
 {

Re: [PATCH] libiberty: fix psignal parameter type

2011-10-18 Thread Andrew Pinski

On Tue, Oct 18, 2011 at 9:49 AM, Roland McGrath  wrote:
> When libiberty defines psignal, it doesn't use the canonical signature.
> This came up as a problem in a configuration where libiberty wants to
> define psignal itself, but the build environment's  declares
> it too.
>
> This was a --with-newlib configuration using the newlib trunk, where newlib
> does define psignal (and strsignal, among other things), but libiberty's
> configure thinks it knows exactly what newlib does and doesn't define
> without checking.  This hard-coding seems unwise to me, exactly because of
> the potential for cases like this, where newlib's set of available
> functions changes over time.

libiberty is no longer compiled for the target so this should never
happen really.

Thanks,
Andrew Pinski

Re: [PATCH] libiberty: fix psignal parameter type

2011-10-18 Thread Roland McGrath

On Tue, Oct 18, 2011 at 9:50 AM, Andrew Pinski  wrote:
> libiberty is no longer compiled for the target so this should never
> happen really.

I see.  I was indeed using an older source base, and just noticed that all
the offending configure logic was still the same.  Perhaps all the
--with-newlib logic in libiberty's configure should be removed now?

Regardless, the psignal change is an appropriate cleanup in its own right.


Thanks,
Roland

Re: [Patch,AVR]: PR50447: Tweak addhi3

2011-10-18 Thread Georg-Johann Lay

Denis Chertykov schrieb:
> 2011/10/18 Georg-Johann Lay :
>> Denis Chertykov schrieb:
>>> 2011/10/18 Georg-Johann Lay :
 This patch do some tweaks to addhi3 like adding QI scratch register.

 The original *addhi3 insn is still there and located prior to new
 addhi3_clobber insn because addhi3 is special to reload (thanks Danis for 
 this
 note) so that there is a version with and a version without scratch 
 register.

 Patch passes without regressions.

>>> Which improvements added by this patch ?
>>>
>>> Denis.
>> If the addhi3 is expanded early, the addition happens with QI scratch which
>> avoids reload of constant if target register is in NO_LD. And reduce register
>> pressure as only QI is needed and not reload of constant to HI.
>>
>> Otherwise, there might be sequences like
>>
>> ldi r31, 2; *reload_inhi
>> mov r12, r31
>> clr r13
>>
>> add r14, r12  ; *addhi3
>> adc r15, r13
>>
>> which now will be
>>
>> ldi r31, 2; addhi3_clobber
>> add r14, r31
>> adc r15, __zero_reg__
>>
>> Similar applies if the reload of the constant happens to LD regs:
>>
>> ldi r30, 2; *movhi
>> clr r31
>>
>> add r14, r12  ; *addhi3
>> adc r15, r13
>>
>> will become
>>
>> ldi r30, 2; addhi3_clobber
>> add r14, r30
>> adc r15, __zero_reg__
>>
>> For *addhi3 insns the register pressure is not reduced but the insn sequence
>> might be smarter if peep2 comes up with a QI scratch or if it detects a
>> *reload_inhi insn just prior to the addition (and the reg that holds the
>> reloaded constant dies after the addition).
>>
>> As *addhi3 is special to reload, there is still an "ordinary" add addhi insn
>> without scratch. This is easier because, e.g. prologue and epilogue 
>> generation
>> generate add insns (not by means of addhi3 expander but by explicit
>> gan_rtx_PLUS). Yet the addhi3 expander factors out the situations when an
>> addhi3 insn is to be generated via addhi3 expander late in the compilation 
>> process
> 
> Please provide any real world example.
> 
> Denis.

Consider avr-libc (under the assumption that it is "real world" code):

In avr-libc's build directory, and with the patch integrated:

$ cd avr/lib/avr4
$ make clean && make CFLAGS='-save-temps -dp -Os'
$ grep -A 2 'addhi3_clobber\/2' *.s > out-nopeep2.txt (see attachment)
$ grep 'addhi3_clobber\/2' *.s | wc -l
33

This shows that the insns are already there before peep2 and thus no reload of
16-bit constant is needed; an 8-bit scratch is sufficient.

Alternatively, the implementation could omit the expansion to addhi3_clobber in
addhi3 expander and instead rely completely on peep2. However, that does not
reduce register pressure because a 16-bit register will be allocated and the
peep2 just prints things smarter and needs just a QI scratch to call
avr_out_plus_clobber.

For +/-1, the addition with SEC/ADD/ADC resp. SEC/SBC/SBC leaves cc0 in a mess.
 as most loops use +/-1 on the counter variable, LDI/SUB/SBC is not shorter but
better because it sets cc0.

So you like this patch?
Or prefer a patch that is neutral with respect to register allocator and just
uses peep2 to print things smarter?

Johann


dtoa_prf.s: ldi r31,3; , ;  338 addhi3_clobber/2[length 
= 3]
dtoa_prf.s- add r12,r31  ;  s,
dtoa_prf.s- adc r13,__zero_reg__ ;  s
--
dtoa_prf.s: ldi r31,3; , ;  447 addhi3_clobber/2[length 
= 3]
dtoa_prf.s- add r12,r31  ;  s,
dtoa_prf.s- adc r13,__zero_reg__ ;  s
--
fgets.s:ldi r31,1; , ;  70  addhi3_clobber/2[length 
= 3]
fgets.s-sub r14,r31  ;  ivtmp.9,
fgets.s-sbc r15,__zero_reg__ ;  ivtmp.9
--
realloc.s:  ldi r17,2; , ;  80  addhi3_clobber/2[length 
= 3]
realloc.s-  add r12,r17  ;  tmp83,
realloc.s-  adc r13,__zero_reg__ ; 
--
realloc.s:  ldi r18,2; , ;  85  addhi3_clobber/2[length 
= 3]
realloc.s-  add r12,r18  ;  tmp84,
realloc.s-  adc r13,__zero_reg__ ; 
--
strtod.s:   ldi r31,1; , ;  101 addhi3_clobber/2[length 
= 3]
strtod.s-   sub r14,r31  ;  D.2581,
strtod.s-   sbc r15,__zero_reg__ ;  D.2581
--
strtod.s:   ldi r18,2; , ;  110 addhi3_clobber/2[length 
= 3]
strtod.s-   add r14,r18  ;  nptr,
strtod.s-   adc r15,__zero_reg__ ;  nptr
--
strtod.s:   ldi r21,7; , ;  120 addhi3_clobber/2[length 
= 3]
strtod.s-   add r14,r21  ;  nptr,
strtod.s-   adc r15,__zero_reg__ ;  nptr
--
strtod.s:   ldi r31,255  ; , ;  175 addhi3_clobber/2[length 
= 3]
strtod.s-   sub r14,r31  ;  exp,
strtod.s-   sbc r15,r31  ;  exp,
--
strtod.s:   ldi r18,1; , ;  185 addhi3_clobber/2[length 
= 3]
strtod.s-   sub r14,r18  ;  exp,
strtod.s-   sbc r15,__zero_reg__ ;  exp
--
strtod.s:   ldi r31,24   ; , ;  376 addhi3_clobbe

Re: [PATCH] libiberty: fix psignal parameter type

2011-10-18 Thread Ian Lance Taylor

On Tue, Oct 18, 2011 at 9:49 AM, Roland McGrath  wrote:
>
> libiberty/
> 2011-10-18  Roland McGrath  
>
>        * strsignal.c (psignal): Use const second in parameter type.
>        * functions.texi: Updated.

This is OK.

Thanks.

Ian

Re: [PATCH] Fix computed gotos on m68k

2011-10-18 Thread Mikael Pettersson

Julian Brown writes:
 > Hi,
 > 
 > This patch fixes computed gotos on m68k, and probably other targets too
 > (the fix is in the middle end). Several tests fail at present with "-O3
 > -fomit-frame-pointer".

This is the same bug as PR47918.  As described in that PR trail, the bug
can be triggered also on i386.

Thanks for debugging this very nasty issue!

> [...]

/Mikael

Re: [PATCH] libiberty: fix psignal parameter type

2011-10-18 Thread Roland McGrath

On Tue, Oct 18, 2011 at 10:18 AM, Ian Lance Taylor  wrote:
> On Tue, Oct 18, 2011 at 9:49 AM, Roland McGrath  wrote:
>>
>> libiberty/
>> 2011-10-18  Roland McGrath  
>>
>>        * strsignal.c (psignal): Use const second in parameter type.
>>        * functions.texi: Updated.
>
> This is OK.

Thanks.  But last I checked, I'm not a GCC committer.
So somebody has to do it for me.


Thanks,
Roland

Re: [PATCH] i?86 vec_perm fixes and improvements

2011-10-18 Thread Richard Henderson

On 10/18/2011 08:30 AM, Jakub Jelinek wrote:
>   * config/i386/i386.c (ix86_expand_vec_perm): In merge_two use
>   mode SUBREG of operands[0] as target.
>   (valid_perm_using_mode_p): Don't ignore higher bits of d->perm.
>   (expand_vec_pshufb): For V8SImode vmode emit avx2_permvarv8si.
>   (expand_vec_perm_1): Handle identity and some broadcast
>   permutations.
>   (expand_vec_perm_interleave2): Handle also 32-byte modes, using
>   vperm2[fi]128 or vpunpck[lh]* followed by single insn permutation.
>   For d->testing_p return true earlier to avoid creating more GC
>   garbage.
>   (expand_vec_perm_vpermq_perm_1): New function.
>   (expand_vec_perm_vpshufb2_vpermq): For d->testing_p return true
>   earlier to avoid creating more GC garbage.  Fix handling of
>   V16HImode.  Avoid some SUBREGs in SET_DEST.
>   (expand_vec_perm_broadcast_1): Return false for 32-byte integer
>   vector modes.
>   (expand_vec_perm_vpshufb4_vpermq2): New function.
>   (ix86_expand_vec_perm_builtin_1): Call expand_vec_perm_vpermq_perm_1
>   and expand_vec_perm_vpshufb4_vpermq2.

Ok.


r~

C++ PATCH for c++/50742 (ICE on switch with local using-decl)

2011-10-18 Thread Jason Merrill

A using-declaration adds a TREE_LIST to the list of names on a binding 
level, so we need to cope with that here.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 052e893fe307f33ca3d3c2ead06248e0ef738f16
Author: Jason Merrill 
Date:   Mon Oct 17 22:04:08 2011 -0400

	PR c++/50742
	* decl.c (check_previous_goto_1): Handle using-decl.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 8b5033f..4b5b6c8 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -2683,7 +2683,8 @@ check_previous_goto_1 (tree decl, cp_binding_level* level, tree names,
   tree new_decls, old_decls = (b == level ? names : NULL_TREE);
 
   for (new_decls = b->names; new_decls != old_decls;
-	   new_decls = DECL_CHAIN (new_decls))
+	   new_decls = (DECL_P (new_decls) ? DECL_CHAIN (new_decls)
+			: TREE_CHAIN (new_decls)))
 	{
 	  int problem = decl_jump_unsafe (new_decls);
 	  if (! problem)
diff --git a/gcc/testsuite/g++.dg/lookup/using23.C b/gcc/testsuite/g++.dg/lookup/using23.C
new file mode 100644
index 000..5dd8d85
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/using23.C
@@ -0,0 +1,13 @@
+// PR c++/50742
+
+typedef int A;
+
+void f(int i)
+{
+  switch (i)
+{
+case 0:
+  using ::A;
+default:;
+}
+}

Re: [PATCH 3/6] Emit macro expansion related diagnostics

2011-10-18 Thread Dodji Seketeli

Dodji Seketeli  writes:

> David Edelsohn  writes:
>
>> Your patch broke bootstrap on AIX because of the typedef "loc_t"
>> introduced in tree-diagnostics.c.

I have committed the patch below.  Tested on x86_64-unknown-linux-gnu
only, unfortunately.  I hope it fixes the build on AIX.

From: Dodji Seketeli 
Date: Tue, 18 Oct 2011 18:40:06 +0200
Subject: [PATCH] Do not use loc_t as a type name

gcc/
* tree-diagnostic (struct loc_t): Rename into struct loc_map_pair.
(maybe_unwind_expanded_macro_loc): Adjust.
---
 gcc/tree-diagnostic.c |   16 
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/tree-diagnostic.c b/gcc/tree-diagnostic.c
index 53b350b..b4b60dc 100644
--- a/gcc/tree-diagnostic.c
+++ b/gcc/tree-diagnostic.c
@@ -56,10 +56,10 @@ typedef struct
 {
   const struct line_map *map;
   source_location where;
-} loc_t;
+} loc_map_pair;
 
-DEF_VEC_O (loc_t);
-DEF_VEC_ALLOC_O (loc_t, heap);
+DEF_VEC_O (loc_map_pair);
+DEF_VEC_ALLOC_O (loc_map_pair, heap);
 
 /* Unwind the different macro expansions that lead to the token which
location is WHERE and emit diagnostics showing the resulting
@@ -111,9 +111,9 @@ maybe_unwind_expanded_macro_loc (diagnostic_context 
*context,
  const struct line_map **first_exp_point_map)
 {
   const struct line_map *map;
-  VEC(loc_t,heap) *loc_vec = NULL;
+  VEC(loc_map_pair,heap) *loc_vec = NULL;
   unsigned ix;
-  loc_t loc, *iter;
+  loc_map_pair loc, *iter;
 
   map = linemap_lookup (line_table, where);
   if (!linemap_macro_expansion_map_p (map))
@@ -132,7 +132,7 @@ maybe_unwind_expanded_macro_loc (diagnostic_context 
*context,
   loc.where = where;
   loc.map = map;
 
-  VEC_safe_push (loc_t, heap, loc_vec, &loc);
+  VEC_safe_push (loc_map_pair, heap, loc_vec, &loc);
 
   /* WHERE is the location of a token inside the expansion of a
  macro.  MAP is the map holding the locations of that macro
@@ -150,7 +150,7 @@ maybe_unwind_expanded_macro_loc (diagnostic_context 
*context,
  first macro which expansion triggered this trace was expanded
  inside a system header.  */
   if (!LINEMAP_SYSP (map))
-FOR_EACH_VEC_ELT (loc_t, loc_vec, ix, iter)
+FOR_EACH_VEC_ELT (loc_map_pair, loc_vec, ix, iter)
   {
 source_location resolved_def_loc = 0, resolved_exp_loc = 0;
 diagnostic_t saved_kind;
@@ -203,7 +203,7 @@ maybe_unwind_expanded_macro_loc (diagnostic_context 
*context,
 context->printer->prefix = saved_prefix;
   }
 
-  VEC_free (loc_t, heap, loc_vec);
+  VEC_free (loc_map_pair, heap, loc_vec);
 }
 
 /*  This is a diagnostic finalizer implementation that is aware of
-- 
1.7.6.2



-- 
Dodji

Re: [Patch,AVR]: PR50447: Tweak addhi3

2011-10-18 Thread Denis Chertykov

2011/10/18 Georg-Johann Lay :
> Denis Chertykov schrieb:
>> 2011/10/18 Georg-Johann Lay :
>>> Denis Chertykov schrieb:
 2011/10/18 Georg-Johann Lay :
> This patch do some tweaks to addhi3 like adding QI scratch register.
>
> The original *addhi3 insn is still there and located prior to new
> addhi3_clobber insn because addhi3 is special to reload (thanks Danis for 
> this
> note) so that there is a version with and a version without scratch 
> register.
>
> Patch passes without regressions.
>
 Which improvements added by this patch ?

 Denis.
>>> If the addhi3 is expanded early, the addition happens with QI scratch which
>>> avoids reload of constant if target register is in NO_LD. And reduce 
>>> register
>>> pressure as only QI is needed and not reload of constant to HI.
>>>
>>> Otherwise, there might be sequences like
>>>
>>> ldi r31, 2    ; *reload_inhi
>>> mov r12, r31
>>> clr r13
>>>
>>> add r14, r12  ; *addhi3
>>> adc r15, r13
>>>
>>> which now will be
>>>
>>> ldi r31, 2    ; addhi3_clobber
>>> add r14, r31
>>> adc r15, __zero_reg__
>>>
>>> Similar applies if the reload of the constant happens to LD regs:
>>>
>>> ldi r30, 2    ; *movhi
>>> clr r31
>>>
>>> add r14, r12  ; *addhi3
>>> adc r15, r13
>>>
>>> will become
>>>
>>> ldi r30, 2    ; addhi3_clobber
>>> add r14, r30
>>> adc r15, __zero_reg__
>>>
>>> For *addhi3 insns the register pressure is not reduced but the insn sequence
>>> might be smarter if peep2 comes up with a QI scratch or if it detects a
>>> *reload_inhi insn just prior to the addition (and the reg that holds the
>>> reloaded constant dies after the addition).
>>>
>>> As *addhi3 is special to reload, there is still an "ordinary" add addhi insn
>>> without scratch. This is easier because, e.g. prologue and epilogue 
>>> generation
>>> generate add insns (not by means of addhi3 expander but by explicit
>>> gan_rtx_PLUS). Yet the addhi3 expander factors out the situations when an
>>> addhi3 insn is to be generated via addhi3 expander late in the compilation 
>>> process
>>
>> Please provide any real world example.
>>
>> Denis.
>
> Consider avr-libc (under the assumption that it is "real world" code):
>
> In avr-libc's build directory, and with the patch integrated:
>
> $ cd avr/lib/avr4
> $ make clean && make CFLAGS='-save-temps -dp -Os'
> $ grep -A 2 'addhi3_clobber\/2' *.s > out-nopeep2.txt (see attachment)
> $ grep 'addhi3_clobber\/2' *.s | wc -l
> 33
>
> This shows that the insns are already there before peep2 and thus no reload of
> 16-bit constant is needed; an 8-bit scratch is sufficient.
>
> Alternatively, the implementation could omit the expansion to addhi3_clobber 
> in
> addhi3 expander and instead rely completely on peep2. However, that does not
> reduce register pressure because a 16-bit register will be allocated and the
> peep2 just prints things smarter and needs just a QI scratch to call
> avr_out_plus_clobber.
>
> For +/-1, the addition with SEC/ADD/ADC resp. SEC/SBC/SBC leaves cc0 in a 
> mess.
>  as most loops use +/-1 on the counter variable, LDI/SUB/SBC is not shorter 
> but
> better because it sets cc0.
>
> So you like this patch?
> Or prefer a patch that is neutral with respect to register allocator and just
> uses peep2 to print things smarter?

I'm interested in code improvements.
What difference in size of avr-libc ?

Denis.

Re: [Patch,AVR]: PR50447: Tweak addhi3

2011-10-18 Thread Denis Chertykov

Just a note.

Instead of `!reload_completed && !reload_in_progress'you can use
`can_create_pseudo_p()' declared in rtl.h

Denis.

Re: RFA: Improve tree-ssa-sink block selection

2011-10-18 Thread Jeff Law

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 10/18/11 01:44, Paolo Bonzini wrote:
> On 10/18/2011 07:10 AM, Jeff Law wrote:
>> --- 467,475  if (gimple_code (use) != GIMPLE_PHI) { sinkbb =
>> gimple_bb (use); !   sinkbb = select_best_block (frombb,
>> gimple_bb (use), stmt);
>> 
>> !   if (sinkbb == frombb) return false;
>> 
>> *togsi = gsi_for_stmt (use);
> 
> Useless assignment of sinkbb, otherwise looks fine.
Thanks.  I'll fix that.

> 
> By the way, is it intended that sink_code_in_bb visits again 
> postdominators that were already visited (which with domwalk would
> come for free)?  As it is, the pass is quadratic when you have
> something like this:
I haven't really looked at the visitation order in tree-ssa-sink.c.
The formulation as a walk over the pdom tree and sinking statements
within each block is unfortunate, but workable.

I'm not sure where you're seeing multiple visits of an individual
block, but maybe I've got some kind of a mental block.

The other way to formulate this problem is with recursion.  A DFS walk
to sink each statement's immediate uses, then sinking the statement
itself.  It's a nice, simple formulation that is annoyingly complex to
implement in GCC due to "interesting" aspects of our immediate use
iterators.


Jeff
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJOnb/bAAoJEBRtltQi2kC7i7oIAIXnmR76SMZIBDQLHvMqEAME
qTG5HkbL0kCb4EDMHYRXA4U8Wot1UWiuAkJzihBQMRWiqOs5v555iUx6N+M/l2Fu
w+1SyqeZacNBLm9/bI/Ynbr55BXF3aKPxES5kqRWIJE1kCoOqKFOWAizgWJmyfMG
ct13PpUqU1CT572cwl+GgCQulEWXQEgeOH/Iu/gLPFv+PSEhkEN0vA6foBYgCnk6
d4lGaa3NcQsUwhAlu4rNgQVAZL3cpTRL8QDIultxg3j/zEhcTwJ4YB4azDQcsRQ4
ZasKqD7m4HXnvuqS8TI66v+w4wKsparWyKTXAqp4dibnhON4l2Omes/ZmJokiUs=
=HXaj
-END PGP SIGNATURE-

Re: [Patch Darwin/Ada] work around PR target/50678

2011-10-18 Thread Mike Stump

On Oct 18, 2011, at 4:58 AM, Iain Sandoe wrote:
> The audit trail in the PR contains the detective work (mostly by Eric) that 
> concludes we have a long-standing bug in the Darwin x86-64 sigtramp unwind 
> data.

Yeah, well, it is a bug, if they ever fix it.  If they burn it in and define it 
as correct, then, it is an unfortunatism.  If they fix it, it should be 
software updated to older system and all tools fixed to use the new scheme.  I 
suspect they'll never fix it.

> This has been filed as radar #10302855, but we need a work-around until that 
> is resolved (possibly forever on older systems).

> +#if defined (__x86_64__)

If this is for a target machine compile, this is fine.  Looking through the 
code, it seems like target machine code, though, I wasn't sure.  If it is for 
host code, then this isn't cross compile safe.  For host code DARWIN_X86 and 
TARGET_64BIT (from tm.h) both need to be true, for one to be compiling for a 
64-bit x86 machine.

I think this is fine for relevant release branches.

Re: [Patch,AVR]: PR50447: Tweak addhi3

2011-10-18 Thread Georg-Johann Lay

Denis Chertykov schrieb:
> 2011/10/18 Georg-Johann Lay :
>> Denis Chertykov schrieb:
>>> 2011/10/18 Georg-Johann Lay :
 Denis Chertykov schrieb:
> 2011/10/18 Georg-Johann Lay :
>> This patch do some tweaks to addhi3 like adding QI scratch register.
>>
>> The original *addhi3 insn is still there and located prior to new
>> addhi3_clobber insn because addhi3 is special to reload (thanks Danis 
>> for this
>> note) so that there is a version with and a version without scratch 
>> register.
>>
>> Patch passes without regressions.
>>
> Which improvements added by this patch ?
>
> Denis.
 If the addhi3 is expanded early, the addition happens with QI scratch which
 avoids reload of constant if target register is in NO_LD. And reduce 
 register
 pressure as only QI is needed and not reload of constant to HI.

 Otherwise, there might be sequences like

 ldi r31, 2; *reload_inhi
 mov r12, r31
 clr r13

 add r14, r12  ; *addhi3
 adc r15, r13

 which now will be

 ldi r31, 2; addhi3_clobber
 add r14, r31
 adc r15, __zero_reg__

 Similar applies if the reload of the constant happens to LD regs:

 ldi r30, 2; *movhi
 clr r31

 add r14, r12  ; *addhi3
 adc r15, r13

 will become

 ldi r30, 2; addhi3_clobber
 add r14, r30
 adc r15, __zero_reg__

 For *addhi3 insns the register pressure is not reduced but the insn 
 sequence
 might be smarter if peep2 comes up with a QI scratch or if it detects a
 *reload_inhi insn just prior to the addition (and the reg that holds the
 reloaded constant dies after the addition).

 As *addhi3 is special to reload, there is still an "ordinary" add addhi 
 insn
 without scratch. This is easier because, e.g. prologue and epilogue 
 generation
 generate add insns (not by means of addhi3 expander but by explicit
 gan_rtx_PLUS). Yet the addhi3 expander factors out the situations when an
 addhi3 insn is to be generated via addhi3 expander late in the compilation 
 process
>>> Please provide any real world example.
>>>
>>> Denis.
>> Consider avr-libc (under the assumption that it is "real world" code):
>>
>> In avr-libc's build directory, and with the patch integrated:
>>
>> $ cd avr/lib/avr4
>> $ make clean && make CFLAGS='-save-temps -dp -Os'
>> $ grep -A 2 'addhi3_clobber\/2' *.s > out-nopeep2.txt (see attachment)
>> $ grep 'addhi3_clobber\/2' *.s | wc -l
>> 33
>>
>> This shows that the insns are already there before peep2 and thus no reload 
>> of
>> 16-bit constant is needed; an 8-bit scratch is sufficient.
>>
>> Alternatively, the implementation could omit the expansion to addhi3_clobber 
>> in
>> addhi3 expander and instead rely completely on peep2. However, that does not
>> reduce register pressure because a 16-bit register will be allocated and the
>> peep2 just prints things smarter and needs just a QI scratch to call
>> avr_out_plus_clobber.
>>
>> For +/-1, the addition with SEC/ADD/ADC resp. SEC/SBC/SBC leaves cc0 in a 
>> mess.
>>  as most loops use +/-1 on the counter variable, LDI/SUB/SBC is not shorter 
>> but
>> better because it sets cc0.
>>
>> So you like this patch?
>> Or prefer a patch that is neutral with respect to register allocator and just
>> uses peep2 to print things smarter?
> 
> I'm interested in code improvements.
> What difference in size of avr-libc ?
> 
> Denis.

I have to tool for smart size analysis, so here is just a diff:

After rebuilding avr-libc with respective compiler version, did respectively:

$ find . -name 'lib[mc].a' -exec avr-size {} ';' > size-orig.txt
$ find . -name 'lib[mc].a' -exec avr-size {} ';' > size-patch.txt

and then

$ diff -U 0 size-orig.txt size-patch.txt > size.diff

As far as I can see, there is not a big gain but no object increases in size.

For some files like ./avr/lib/avr2/libc.a:dtoa_prf.o size gain is 3%.
For ./avr/lib/avr4/libc.a:vfprintf_std.o it's 1.7% and for others just one
instruction better.

Johann
--- size-orig.txt	2011-10-18 19:59:52.0 +0200
+++ size-patch.txt	2011-10-18 19:50:59.0 +0200
@@ -7 +7 @@
-750	  0	  0	750	2ee	dtoa_prf.o (ex ./avr/lib/avr51/libc.a)
+724	  0	  0	724	2d4	dtoa_prf.o (ex ./avr/lib/avr51/libc.a)
@@ -11 +11 @@
-722	  6	  0	728	2d8	malloc.o (ex ./avr/lib/avr51/libc.a)
+720	  6	  0	726	2d6	malloc.o (ex ./avr/lib/avr51/libc.a)
@@ -15,2 +15,2 @@
-510	  0	  0	510	1fe	realloc.o (ex ./avr/lib/avr51/libc.a)
-747	  0	  0	747	2eb	strtod.o (ex ./avr/lib/avr51/libc.a)
+506	  0	  0	506	1fa	realloc.o (ex ./avr/lib/avr51/libc.a)
+739	  0	  0	739	2e3	strtod.o (ex ./avr/lib/avr51/libc.a)
@@ -18 +18 @@
-536	  0	  0	536	218	strtoul.o (ex ./avr/lib/avr51/libc.a

Re: [Patch, Darwin/libiberty] fix target/48108 by adding a section wrapper scheme to the darwin port.

2011-10-18 Thread Mike Stump

On Oct 18, 2011, at 5:23 AM, Iain Sandoe wrote:
> o we retain the ability to read an old-style file.

>   - I wonder if there should be an argument for removing this from the trunk 
> version, but retaining in 4.6.
>   - or, at the least, can we remove it from 4.8?

I'd be fine with simply requiring recompiles, since support is so new.  If you 
do up the code, might as well leave it in.

> OK for trunk/4.6?

Ok from my perspective.  Don't recall if others want to review the 
simple-object-mach-o.c change.

C++ PATCH for c++/50531 (ICE with template dtor defaulted outside fn)

2011-10-18 Thread Jason Merrill

Here we were failing to recognize that a defaulted fn had already been 
instantiated, and crashed trying to instantiate it again.


Tested x86_64-pc-linux-gnu, applying to trunk and 4.6.
commit 10ff48cb9f7817ae8e95e71bd4b22989dafb2be2
Author: Jason Merrill 
Date:   Tue Oct 18 14:25:42 2011 -0400

	PR c++/50531
	* pt.c (instantiate_decl): Recognize when a function defaulted
	outside the class is already instantiated.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 6fc60d5..56fa632 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -18045,6 +18045,8 @@ instantiate_decl (tree d, int defer_ok,
 d = DECL_CLONED_FUNCTION (d);
 
   if (DECL_TEMPLATE_INSTANTIATED (d)
+  || (TREE_CODE (d) == FUNCTION_DECL
+	  && DECL_DEFAULTED_FN (d) && DECL_INITIAL (d))
   || DECL_TEMPLATE_SPECIALIZATION (d))
 /* D has already been instantiated or explicitly specialized, so
there's nothing for us to do here.
diff --git a/gcc/testsuite/g++.dg/cpp0x/defaulted32.C b/gcc/testsuite/g++.dg/cpp0x/defaulted32.C
new file mode 100644
index 000..351cdae
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/defaulted32.C
@@ -0,0 +1,21 @@
+// PR c++/50531
+// { dg-options -std=c++0x }
+
+template 
+class DataFilter
+{
+ public:
+  inline virtual ~DataFilter();
+};
+
+template
+inline DataFilter::~DataFilter() = default;
+
+class ARCalculator : public DataFilter
+{
+ public:
+  virtual void dataStart(int, int);
+};
+
+void ARCalculator::dataStart(int, int)
+{}

Re: [google] AddressSanitizer for gcc, first attempt. (issue 5272048)

2011-10-18 Thread davidxl



http://codereview.appspot.com/5272048/diff/18001/tree-asan.c
File tree-asan.c (right):

http://codereview.appspot.com/5272048/diff/18001/tree-asan.c#newcode325
tree-asan.c:325: base = build_addr (t, current_function_decl);
You need to create a temp var and build as gimple assignment. See
init_tmp_var in tree-nested.c.

http://codereview.appspot.com/5272048/

Re: [trans-mem] Rename/split transaction into transaction_atomic and __transaction_relaxed.

2011-10-18 Thread Aldy Hernandez




Several ICEs in the TM tests on C++, but I think they are old. C TM
tests work except some missing optimizations (old failures too).
c-c++-common/tm/wrap-[12].c fail on C++, but not on C:
wrap-1.c:5:57: error: 'transaction_wrap' argument not an identifier
I believe this is an old error too?
(I don't yet have a testcase summary of before the changes, so I can't
compare properly. But will send an follow-up if there are regressions.)


Can you run the tests before your patch to make sure there are no 
regressions?  That is, let's be sure that these are indeed old failures.


The missed optimizations, particularly the memopt* ones are indeed old, 
or at least came with the merge.  I don't know about the other failures.

Re: [PATCH] Fix pr50717: widening multiply bad code

2011-10-18 Thread Andrew Stubbs


On 17/10/11 11:43, Richard Guenther wrote:

Ok.


Committed, thanks.

Andrew

Re: [Patch,AVR]: PR50447: Tweak addhi3

2011-10-18 Thread Georg-Johann Lay


Georg-Johann Lay schrieb:

Denis Chertykov schrieb:


What difference in size of avr-libc ?


I have no tool for smart size analysis, so here is just a diff:

After rebuilding avr-libc with respective compiler version, did respectively:

$ find . -name 'lib[mc].a' -exec avr-size {} ';' > size-orig.txt
$ find . -name 'lib[mc].a' -exec avr-size {} ';' > size-patch.txt

and then

$ diff -U 0 size-orig.txt size-patch.txt > size.diff

As far as I can see, there is not a big gain but no object increases in size.

For some files like ./avr/lib/avr2/libc.a:dtoa_prf.o size gain is 3%.
For ./avr/lib/avr4/libc.a:vfprintf_std.o it's 1.7% and for others just one
instruction better.


Actually there are some cases where the size increases by one instruction:


-464  0   0 464 1d0 realloc.o (ex 
./avr/lib/avr31/libc.a)
+466  0   0 466 1d2 realloc.o (ex 
./avr/lib/avr31/libc.a)



-464  0   0 464 1d0 realloc.o (ex 
./avr/lib/avr3/libc.a)
+466  0   0 466 1d2 realloc.o (ex 
./avr/lib/avr3/libc.a)


Will have a look tomorrow; presumably it's adding +/-1 that force a 
scratch whilst the old code does not.


Johann

[cxx-mem-model] compare_exchange implementation

2011-10-18 Thread Andrew MacLeod



Here's the last of the missing intrinsics.   compare_exchange is 
slightly different than the others since it doesn't map directly to an 
rtl pattern. Its format is :


  bool __atomic_compare_exchange (T* mem, T* expected, T desired, bool 
weak, memory_order success, memory_order failure)


note that the expected parameter is a pointer as well.  In the case 
where false is returned, the value of expected is updated.


Instead of providing a complex rtl pattern to try to match this, instead 
all that is required is to provide a compare_and_swap routine, which 
this builtin will then add some wrapper code around to match the 
required functionality.


there are 2 rtl patterns it looks for,
  atomic_compare_and_swap_weak and
  atomic_compare_and_swap_strong

both are of the form

QImode atomicCAS (mem(O), current_val(O), expected(I), desired(I), 
memorymodel(const int))


When neither of these patterns are present, the implementation defaults 
to using __sync_val_compare_and_swap to provide the implementation.


New tests have been added as well.

Bootstraps and no new regressions on x86_64-unknown-linux-gnu

so a bit of different rtl hacking around in there for me... does 
everything look OK?  it seems to work :-)Actual pattern 
implementation and debugging will happen later  when we actually write 
custom patterns for all these rather than defaulting to the current 
seq-cst patterns.


Andrew

* optabs.h (direct_optab_index): Replace DOI_atomic_compare_exchange
with DOI_atomic_compare_and_swap_{strong,weak}.
(direct_op): Add DOI_atomic_compare_and_swap_{strong,weak}.
* genopinit.c: Set atomic_compare_and_swap_{strong,weak}_optab.
* expr.h (expand_atomic_compare_exchange): Add parameter.
* builtins.c (builtin_atomic_compare_exchange): Add weak parameter
and verify it is a compile time constant.
* optabs.c (expand_atomic_compare_exchange): Look for the presence of
atomic_compare_and_swap_{strong,weak} patterns, otherwise use
__sync_val_compare_and_swap.
* builtin-types.def (BT_FN_BOOL_VPTR_PTR_I{1,2,4,8,16}_BOOL_INT_INT):
Add the bool parameter.
* sync-builtins.def (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_*): Use new
prototype.

* c-family/c-common.c (resolve_overloaded_builtin): Don't try to
process a return value with an error mark.

* fortran/types.def (BT_FN_BOOL_VPTR_PTR_I{1,2,4,8,16}_BOOL_INT_INT):
Add the bool parameter.

* testsuite/gcc.dg/atomic-invalid.c: Add compare_exchange failures.
* testsuite/gcc.dg/atomic-compare-exchange-{1-5}.c: New tests.

Index: optabs.h
===
*** optabs.h(revision 180092)
--- optabs.h(working copy)
*** enum direct_optab_index
*** 690,696 
  
/* Atomic operations with memory model parameters. */
DOI_atomic_exchange,
!   DOI_atomic_compare_exchange,
DOI_atomic_load,
DOI_atomic_store,
DOI_atomic_add_fetch,
--- 690,697 
  
/* Atomic operations with memory model parameters. */
DOI_atomic_exchange,
!   DOI_atomic_compare_and_swap_strong,
!   DOI_atomic_compare_and_swap_weak,
DOI_atomic_load,
DOI_atomic_store,
DOI_atomic_add_fetch,
*** typedef struct direct_optab_d *direct_op
*** 765,772 
  
  #define atomic_exchange_optab \
(&direct_optab_table[(int) DOI_atomic_exchange])
! #define atomic_compare_exchange_optab \
!   (&direct_optab_table[(int) DOI_atomic_compare_exchange])
  #define atomic_load_optab \
(&direct_optab_table[(int) DOI_atomic_load])
  #define atomic_store_optab \
--- 766,775 
  
  #define atomic_exchange_optab \
(&direct_optab_table[(int) DOI_atomic_exchange])
! #define atomic_compare_and_swap_strong_optab \
!   (&direct_optab_table[(int) DOI_atomic_compare_and_swap_strong])
! #define atomic_compare_and_swap_weak_optab \
!   (&direct_optab_table[(int) DOI_atomic_compare_and_swap_weak])
  #define atomic_load_optab \
(&direct_optab_table[(int) DOI_atomic_load])
  #define atomic_store_optab \
Index: genopinit.c
===
*** genopinit.c (revision 180092)
--- genopinit.c (working copy)
*** static const char * const optabs[] =
*** 244,250 
"set_direct_optab_handler (sync_lock_test_and_set_optab, $A, 
CODE_FOR_$(sync_lock_test_and_set$I$a$))",
"set_direct_optab_handler (sync_lock_release_optab, $A, 
CODE_FOR_$(sync_lock_release$I$a$))",
"set_direct_optab_handler (atomic_exchange_optab, $A, 
CODE_FOR_$(atomic_exchange$I$a$))",
!   "set_direct_optab_handler (atomic_compare_exchange_optab, $A, 
CODE_FOR_$(atomic_compare_exchange$I$a$))",
"set_direct_optab_handler (atomic_load_optab, $A, 
CODE_FOR_$(atomic_load$I$a$))",
"set_direct_optab_handler (atomic_store_optab, $A, 
CODE_FOR_$(atomic_store$I$a$))",
"set_direct_optab_handler (atomic_add_fetch_optab, $A,

Re: [cxx-mem-model] compare_exchange implementation

2011-10-18 Thread Jakub Jelinek

On Tue, Oct 18, 2011 at 06:03:16PM -0400, Andrew MacLeod wrote:
> 
> Here's the last of the missing intrinsics.   compare_exchange is
> slightly different than the others since it doesn't map directly to
> an rtl pattern. Its format is :
> 
>   bool __atomic_compare_exchange (T* mem, T* expected, T desired,
> bool weak, memory_order success, memory_order failure)
> 
> note that the expected parameter is a pointer as well.  In the case
> where false is returned, the value of expected is updated.

I think the __sync_* way here was much better (two intrinsics instead of
one, one bool-ish and one returning val).  With mandating T*expected
you force it to be addressable, probably until expansion time at which point
it will be just forced into memory, which is highly undesirable.

Jakub

Re: [cxx-mem-model] compare_exchange implementation

2011-10-18 Thread Andrew MacLeod


On 10/18/2011 06:07 PM, Jakub Jelinek wrote:

On Tue, Oct 18, 2011 at 06:03:16PM -0400, Andrew MacLeod wrote:

Here's the last of the missing intrinsics.   compare_exchange is
slightly different than the others since it doesn't map directly to
an rtl pattern. Its format is :

   bool __atomic_compare_exchange (T* mem, T* expected, T desired,
bool weak, memory_order success, memory_order failure)

note that the expected parameter is a pointer as well.  In the case
where false is returned, the value of expected is updated.

I think the __sync_* way here was much better (two intrinsics instead of
one, one bool-ish and one returning val).  With mandating T*expected
you force it to be addressable, probably until expansion time at which point
it will be just forced into memory, which is highly undesirable.




Its impossible to implement a weak compare and swap unless you return 
both parameters in one operation.


the compare_exchange matches the atomic interface for c++, and if we 
can't resolve it with a lock free instruction sequence, we have to leave 
an external call with this format to a library, so I that why I provide 
this built-in.


Neither rth nor I like the addressable parameter, so thats why I left 
the rtl pattern for weak and strong compare and swap without that 
addressable argument, and let this builtin generate wrapper code around it.


You could provide an __atomic version of the bool and val routines with 
memory model we toyed with making the compare_and_swap return both 
values so we could implement a weak version, but getting 2 return values 
is not pretty.  It could be done with 2 separate built-ins that relate 
to each other, but thats not great either.


we may think of a more brilliant way later to do it under the covers, 
but at the moment, this gets us the external interface we're looking for.


Andrew

[google] Suppress FDO-use related notes/warnings (issue5294043)

2011-10-18 Thread Rong Xu

Suppress verbose notes/warnings printed in FDO-use compilation.
(1) Add option -fprofile-use-verbose. When this option is on, FDO-use 
compilation prints out all the notes as that of today. When this
option is off (the default), all notes are suppressed.
(2) Make several unconditional warnings under OPT_Wcoverage_mismatch.
So that they can be turned off via -Wno-coverage-mismatch.
(3) Make several unconditional warnings for LIPO under flag_ripa_verbose.
(can be turned on via -fripa-verbose).

This patch is for google-main only.

Tested with bootstrap and regression tests.

2011-10-18   Rong Xu  

* gcc/common.opt (fprofile-use-verbose): New flag.
* gcc/value-prof.c (check_ic_counter): guard notes by
  flag_profile_use_verbose.
  (find_func_by_funcdef_no): Ditto. 
  (check_ic_target): Ditto.
  (check_counter): Ditto.
  (check_ic_counter): Ditto.
* gcc/mcf.c (find_minimum_cost_flow): Ditto.
* gcc/profile.c (read_profile_edge_counts): 
  (compute_branch_probabilities):
* gcc/coverage.c (read_counts_file): guard LIPO 
  warnings by flag_ripa_verbose.
  (get_coverage_counts): guard notes
  by flag_profile_use_verbose; make warning
  under OPT_Wcoverage_mismatch.
* gcc/tree-profile.c: (gimple_gen_reusedist): Ditto.
  (maybe_issue_profile_use_note): Ditto.
  (optimize_reusedist): Ditto.
* gcc/testsuite/gcc.dg/pr32773.c: add -fprofile-use-verbose.
* gcc/testsuite/gcc.dg/pr40209.c: Ditto.
* gcc/testsuite/gcc.dg/pr26570.c: Ditto.
* gcc/testsuite/g++.dg/tree-ssa/dom-invalid.C: Ditto.

Index: gcc/value-prof.c
===
--- gcc/value-prof.c(revision 180106)
+++ gcc/value-prof.c(working copy)
@@ -472,9 +472,10 @@
   : DECL_SOURCE_LOCATION (current_function_decl);
   if (flag_profile_correction)
 {
- inform (locus, "correcting inconsistent value profile: "
- "%s profiler overall count (%d) does not match BB count "
-  "(%d)", name, (int)*all, (int)bb_count);
+  if (flag_profile_use_verbose)
+inform (locus, "correcting inconsistent value profile: %s "
+   "profiler overall count (%d) does not match BB count "
+"(%d)", name, (int)*all, (int)bb_count);
  *all = bb_count;
  if (*count > *all)
 *count = *all;
@@ -510,33 +511,42 @@
   location_t locus;
   if (*count1 > all && flag_profile_correction)
 {
-  locus = (stmt != NULL)
-  ? gimple_location (stmt)
-  : DECL_SOURCE_LOCATION (current_function_decl);
-  inform (locus, "Correcting inconsistent value profile: "
-  "ic (topn) profiler top target count (%ld) exceeds "
- "BB count (%ld)", (long)*count1, (long)all);
+  if (flag_profile_use_verbose)
+{
+  locus = (stmt != NULL)
+  ? gimple_location (stmt)
+  : DECL_SOURCE_LOCATION (current_function_decl);
+  inform (locus, "Correcting inconsistent value profile: "
+  "ic (topn) profiler top target count (%ld) exceeds "
+  "BB count (%ld)", (long)*count1, (long)all);
+}
   *count1 = all;
 }
   if (*count2 > all && flag_profile_correction)
 {
-  locus = (stmt != NULL)
-  ? gimple_location (stmt)
-  : DECL_SOURCE_LOCATION (current_function_decl);
-  inform (locus, "Correcting inconsistent value profile: "
-  "ic (topn) profiler second target count (%ld) exceeds "
- "BB count (%ld)", (long)*count2, (long)all);
+  if (flag_profile_use_verbose)
+{
+  locus = (stmt != NULL)
+  ? gimple_location (stmt)
+  : DECL_SOURCE_LOCATION (current_function_decl);
+  inform (locus, "Correcting inconsistent value profile: "
+  "ic (topn) profiler second target count (%ld) exceeds "
+ "BB count (%ld)", (long)*count2, (long)all);
+}
   *count2 = all;
 }
   
   if (*count2 > *count1)
 {
-  locus = (stmt != NULL)
-  ? gimple_location (stmt)
-  : DECL_SOURCE_LOCATION (current_function_decl);
-  inform (locus, "Corrupted topn ic value profile: "
- "first target count (%ld) is less than the second "
- "target count (%ld)", (long)*count1, (long)*count2);
+  if (flag_profile_use_verbose)
+{
+  locus = (stmt != NULL)
+  ? gimple_location (stmt)
+  : DECL_SOURCE_LOCATION (current_function_decl);
+  inform (locus, "Corrupted topn ic value profile: "
+ "first target count (%ld) is less than the second "
+ "target count (%ld)", (long)*count1, (long)*count2);
+}
   return true;
 }
 
@@

Re: [google] AddressSanitizer for gcc, first attempt. (issue 5272048)

2011-10-18 Thread davidxl



http://codereview.appspot.com/5272048/diff/18001/tree-asan.c
File tree-asan.c (right):

http://codereview.appspot.com/5272048/diff/18001/tree-asan.c#newcode325
tree-asan.c:325: base = build_addr (t, current_function_decl);
There are issues with creating address expressions from TARGET_MEM_REF
in gcc. Since you want compiler to do optimization on instrumented code
as much as possible, asan instrumentation is better done as early as
possible after ipa inlining -- and this will also solve this problem. I
tried moving asan pass before loop opt, and it worked fine.

http://codereview.appspot.com/5272048/

Re: [google] AddressSanitizer for gcc, first attempt. (issue 5272048)

2011-10-18 Thread Xinliang David Li

On Tue, Oct 18, 2011 at 3:56 PM,   wrote:
> On 2011/10/18 22:52:33, davidxl wrote:
>>
>> http://codereview.appspot.com/5272048/diff/18001/tree-asan.c
>> File tree-asan.c (right):
>
>
> http://codereview.appspot.com/5272048/diff/18001/tree-asan.c#newcode325
>>
>> tree-asan.c:325: base = build_addr (t, current_function_decl);
>> There are issues with creating address expressions from TARGET_MEM_REF
>
> in gcc.
>>
>> Since you want compiler to do optimization on instrumented code as
>
> much as
>>
>> possible, asan instrumentation is better done as early as possible
>
> after ipa
>
> Why?
> I would actually say that I want the instrumentation to happen as late
> as possible because this will instrument fewer memory accesses.
> For example, asan certainly needs to happen after loop invariants are
> moved out and common subexpressions (including loads) are eliminated.
> No?

yes -- so a good choice would be after PRE and PDE (pre, sink_code)
which should handle most of the loop invariant memory loads.

David

>
>> inlining -- and this will also solve this problem. I tried moving asan
>
> pass
>>
>> before loop opt, and it worked fine.
>
>
>
> http://codereview.appspot.com/5272048/
>

Re: [PATCH] Add an intermediate coverage format for gcov

2011-10-18 Thread Sharad Singhai

On Wed, Oct 5, 2011 at 9:58 AM, Mike Stump  wrote:
>
> On Oct 5, 2011, at 12:47 AM, Sharad Singhai wrote:
> > This patch adds an intermediate coverage format (enabled via 'gcov
> > -i'). This is a compact format as it does not require source files.
>
> I don't like any of the tags, I think if you showed them to 100 people that 
> had used gcov before, very few of them would be able to figure out the 
> meaning without reading the manual.  Why make it completely cryptic?  A 
> binary encoded compress format is smaller and just as readable.
>
> SF, sounds like single float to me, I'd propose file.
> FN, sounds like filename, I'd propose line.
> FNDA, can't even guess at what it means, I'd propose fcount.
> BA, I'd propose branch, for 0, notexe, for 1, nottaken, for 2 taken.
> DA, I'd propose lcount
>
> I'd propose fcount be listed as fname,ecount, to match the ordering of 
> lcount.  Also, implicit in the name, is the ordering f, then count, l, then 
> count.
>
> I think if you showed the above to 100 people that had used gcov before, I 
> think we'd be up past 90% of the people able to figure it out, without 
> reading the doc.

Okay, I liked the idea of self-descriptive tags. I have updated the
patch based on your suggestions. I have simplified the format
somewhat. Instead of repeating function name, I use a 'function' tag
with the format

function:,,

I also dropped the unmangled function names, they were turning out to
be too unreadable and not really useful in this context.

Here is the updated patch. OK for trunk?

2011-10-18   Sharad Singhai  

* doc/gcov.texi: Document gcov intermediate format.
* gcov.c (print_usage): Handle new option.
(process_args): Handle new option.
(get_gcov_file_intermediate_name): New function.
(output_intermediate_file): New function.
(generate_results): Handle new option.
* testsuite/lib/gcov.exp: Handle intermediate format.
* testsuite/g++.dg/gcov/gcov-8.C: New testcase.

Index: doc/gcov.texi
===
--- doc/gcov.texi   (revision 179873)
+++ doc/gcov.texi   (working copy)
@@ -130,6 +130,7 @@ gcov [@option{-v}|@option{--version}] [@
  [@option{-f}|@option{--function-summaries}]
  [@option{-o}|@option{--object-directory} @var{directory|file}]
@var{sourcefiles}
  [@option{-u}|@option{--unconditional-branches}]
+ [@option{-i}|@option{--intermediate-format}]
  [@option{-d}|@option{--display-progress}]
 @c man end
 @c man begin SEEALSO
@@ -216,6 +217,45 @@ Unconditional branches are normally not
 @itemx --display-progress
 Display the progress on the standard output.

+@item -i
+@itemx --intermediate-format
+Output gcov file in an easy-to-parse intermediate text format that can
+be used by @command{lcov} or other tools. The output is a single
+@file{.gcov} file per @file{.gcda} file. No source code is required.
+
+The format of the intermediate @file{.gcov} file is plain text with
+one entry per line
+
+@smallexample
+file:@var{source_file_name}
+function:@var{function_name},@var{line_number},@var{execution_count}
+lcount:@var{line number},@var{execution_count}
+branch:@var{line_number},@var{branch_coverage_type}
+
+Where the @var{branch_coverage_type} is
+   notexec (Branch not executed)
+   taken (Branch executed and taken)
+   nottaken (Branch executed, but not taken)
+
+There can be multiple @var{file} entries in an intermediate gcov
+file. All entries following a @var{file} pertain to that source file
+until the next @var{file} entry.
+@end smallexample
+
+Here is a sample when @option{-i} is used in conjuction with
@option{-b} option:
+
+@smallexample
+file:array.cc
+function:_Z3sumRKSt6vectorIPiSaIS0_EE,11,1
+function:main,22,1
+lcount:11,1
+lcount:12,1
+lcount:14,1
+branch:14,taken
+lcount:26,1
+branch:28,nottaken
+@end smallexample
+
 @end table

 @command{gcov} should be run with the current directory the same as that
Index: gcov.c
===
--- gcov.c  (revision 179873)
+++ gcov.c  (working copy)
@@ -311,6 +311,9 @@ static int flag_gcov_file = 1;

 static int flag_display_progress = 0;

+/* Output *.gcov file in intermediate format used by 'lcov'.  */
+static int flag_intermediate_format = 0;
+
 /* For included files, make the gcov output file name include the name
of the input source file.  For example, if x.h is included in a.c,
then the output file name is a.c##x.h.gcov instead of x.h.gcov.  */
@@ -436,6 +439,7 @@ print_usage (int error_p)
   fnotice (file, "  -o, --object-directory DIR|FILE Search for object
files in DIR or called FILE\n");
   fnotice (file, "  -p, --preserve-pathsPreserve all
pathname components\n");
   fnotice (file, "  -u, --unconditional-branchesShow
unconditional branch counts too\n");
+  fnotice (file, "  -i, --intermediate-format   Output .gcov file
in intermediate text format\n");
   fnotice (file,

Re: [google] AddressSanitizer for gcc, first attempt. (issue 5272048)

2011-10-18 Thread Xinliang David Li

It will be weird to put the instrumentation pass inside loop opt,
besides memory loads which are loop invariants and redundant stores in
loop should be handled by pre/pde.

+cc Richard Guenther

You may want to ask the middle-end maintainer to review your code at
this point if you want it to be in trunk.

thanks,

David

On Tue, Oct 18, 2011 at 4:12 PM,   wrote:
>> yes -- so a good choice would be after PRE and PDE (pre, sink_code)
>> which should handle most of the loop invariant memory loads.
>
> how about pass_lim? I think asan should be after lim.
>
> http://codereview.appspot.com/5272048/
>

Re: [google] AddressSanitizer for gcc, first attempt. (issue 5272048)

2011-10-18 Thread Diego Novillo

On Tue, Oct 18, 2011 at 19:31, Xinliang David Li  wrote:
> It will be weird to put the instrumentation pass inside loop opt,
> besides memory loads which are loop invariants and redundant stores in
> loop should be handled by pre/pde.
>
> +cc Richard Guenther
>
> You may want to ask the middle-end maintainer to review your code at
> this point if you want it to be in trunk.

For trunk inclusion, we need to decide what to do wrt mudflap.
Clearly, if ASAN offers the same protections and considerable better
performance, then that should be an easy decision.

I still have not looked at the implementation in detail, but I like
the fact that it is a pure gimple pass.  If the instrumentation can be
statically optimized, then that is a clear advantage over mudflap,
which we have never been able to optimize properly.

More comments on the patch itself coming soon.

Diego.

Re: [PATCH, PR43814] Assume function arguments of pointer type are aligned.

2011-10-18 Thread Maxim Kuvyrkov

On 29/09/2011, at 10:23 PM, Richard Guenther wrote:

> On Thu, Sep 29, 2011 at 10:27 AM, Tom de Vries  wrote:

> +  && SSA_NAME_IS_DEFAULT_DEF (expr)
> +  && TREE_CODE (var) == PARM_DECL
> +  && POINTER_TYPE_P (TREE_TYPE (var))
> +  && TYPE_USER_ALIGN (TREE_TYPE (TREE_TYPE (var
> +   val = get_align_value (TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))),
> +  TREE_TYPE (var), 0);
> 
> I realize that the scope where this applies is pretty narrow (and thus
> bad fallout is quite unlikely).  But I suppose we should document
> somewhere that GCC treats alignment attributes on parameters
> more strict than say, on assignments.

Richard, the intention here is for a follow up patch to change "&& 
TYPE_USER_ALIGN" to "&& (TYPE_USER_ALIGN || flag_assume_aligned_parameters)".  
I understand that you will probably not like the idea of adding 
flag_assume_aligned_parameters, but it wouldn't make such an option any less 
valuable for experienced users of GCC.  For some users this option will the 
additional piece of rope to hang themselves; for others it will produce good 
benefits of better performance.

[Disclaimer: I've put significant amount of my time into investigation of this 
problem and hate to see it all go to waste :-).]

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics

Re: [google] Suppress FDO-use related notes/warnings (issue5294043)

2011-10-18 Thread Xinliang David Li

On Tue, Oct 18, 2011 at 3:48 PM, Rong Xu  wrote:
> Suppress verbose notes/warnings printed in FDO-use compilation.
> (1) Add option -fprofile-use-verbose.

Gcc currently does not emit informational messages on high level
transformations such as inlining, value profiling transformations (ic
promotion info messages are emitted in lipo mode in google/main), loop
peeling and unrolling, and other loop opts. I can see those are very
useful for triaging performance regressions etc, but turning those on
would be too verbose for non-power users and can be annoying. They
(when introduced in the future) should be guarded. I can see they
should be guarded using the same option and possibly with a verbose
level. Something like:

-fopt-info=<1,2,3>

For now, a nonparameterized option should be good enough.

(similarly, -fopt-report can be used to dump optimization report which
is more structured -- e.g, grouped via optimizations, not functions).

There are some unrelated changes (e.g, histogram free etc) which
should be committed separately.


David

> When this option is on, FDO-use
> compilation prints out all the notes as that of today. When this
> option is off (the default), all notes are suppressed.
> (2) Make several unconditional warnings under OPT_Wcoverage_mismatch.
> So that they can be turned off via -Wno-coverage-mismatch.
> (3) Make several unconditional warnings for LIPO under flag_ripa_verbose.
> (can be turned on via -fripa-verbose).



>
> This patch is for google-main only.
>
> Tested with bootstrap and regression tests.
>
> 2011-10-18   Rong Xu  
>
>        * gcc/common.opt (fprofile-use-verbose): New flag.
>        * gcc/value-prof.c (check_ic_counter): guard notes by
>          flag_profile_use_verbose.
>          (find_func_by_funcdef_no): Ditto.
>          (check_ic_target): Ditto.
>          (check_counter): Ditto.
>          (check_ic_counter): Ditto.
>        * gcc/mcf.c (find_minimum_cost_flow): Ditto.
>        * gcc/profile.c (read_profile_edge_counts):
>          (compute_branch_probabilities):
>        * gcc/coverage.c (read_counts_file): guard LIPO
>          warnings by flag_ripa_verbose.
>          (get_coverage_counts): guard notes
>          by flag_profile_use_verbose; make warning
>          under OPT_Wcoverage_mismatch.
>        * gcc/tree-profile.c: (gimple_gen_reusedist): Ditto.
>          (maybe_issue_profile_use_note): Ditto.
>          (optimize_reusedist): Ditto.
>        * gcc/testsuite/gcc.dg/pr32773.c: add -fprofile-use-verbose.
>        * gcc/testsuite/gcc.dg/pr40209.c: Ditto.
>        * gcc/testsuite/gcc.dg/pr26570.c: Ditto.
>        * gcc/testsuite/g++.dg/tree-ssa/dom-invalid.C: Ditto.
>
> Index: gcc/value-prof.c
> ===
> --- gcc/value-prof.c    (revision 180106)
> +++ gcc/value-prof.c    (working copy)
> @@ -472,9 +472,10 @@
>               : DECL_SOURCE_LOCATION (current_function_decl);
>       if (flag_profile_correction)
>         {
> -         inform (locus, "correcting inconsistent value profile: "
> -                 "%s profiler overall count (%d) does not match BB count "
> -                  "(%d)", name, (int)*all, (int)bb_count);
> +          if (flag_profile_use_verbose)
> +            inform (locus, "correcting inconsistent value profile: %s "
> +                   "profiler overall count (%d) does not match BB count "
> +                    "(%d)", name, (int)*all, (int)bb_count);
>          *all = bb_count;
>          if (*count > *all)
>             *count = *all;
> @@ -510,33 +511,42 @@
>   location_t locus;
>   if (*count1 > all && flag_profile_correction)
>     {
> -      locus = (stmt != NULL)
> -              ? gimple_location (stmt)
> -              : DECL_SOURCE_LOCATION (current_function_decl);
> -      inform (locus, "Correcting inconsistent value profile: "
> -              "ic (topn) profiler top target count (%ld) exceeds "
> -             "BB count (%ld)", (long)*count1, (long)all);
> +      if (flag_profile_use_verbose)
> +        {
> +          locus = (stmt != NULL)
> +                  ? gimple_location (stmt)
> +                  : DECL_SOURCE_LOCATION (current_function_decl);
> +          inform (locus, "Correcting inconsistent value profile: "
> +                  "ic (topn) profiler top target count (%ld) exceeds "
> +                  "BB count (%ld)", (long)*count1, (long)all);
> +        }
>       *count1 = all;
>     }
>   if (*count2 > all && flag_profile_correction)
>     {
> -      locus = (stmt != NULL)
> -              ? gimple_location (stmt)
> -              : DECL_SOURCE_LOCATION (current_function_decl);
> -      inform (locus, "Correcting inconsistent value profile: "
> -              "ic (topn) profiler second target count (%ld) exceeds "
> -             "BB count (%ld)", (long)*count2, (long)all);
> +      if (flag_profile_use_verbose)
> +        {
> +          locus = (stmt != NULL)
> +                  ? gimple_location (stmt)
> +                  : DECL

1 2 >

1 - 100 of 105 matches

Mail list logo