RE: Cilk Library

2013-11-08 Thread Thomas Schwinge
Hi!

On Thu, 7 Nov 2013 22:10:12 +, "Iyer, Balaji V"  
wrote:
> > > * Makefile.def: Add libcilkrts to target_modules.  Make libcilkrts
> > > depend on libstdc++ and libgcc.
> > > [...]
> > > * Makefile.in: Added libcilkrts related fields to support 
> > > building it.

> I manually added the fields in Makefile.in.

Oh.  :-|

> I read this website
> (http://gcc.gnu.org/install/prerequisites.html#TOC1) and misunderstood
> about what Makefile.in should be manually modified and what should be
> automatically generated.

Indeed there's potential for confusion in that description, if one don't
already know the structure and/or pay close attention.  (Or happen to
have read the comment in the first line of the top-level Makefile.in.)
;-)

> I am sorry for the mistake. I tried the regenerated makefile.in and it
> seem to work for me.

No problem -- this was easy enough to fix, and sorry for the extra work
you had doing these changes manually.


Grüße,
 Thomas


pgpm7FY8Oz3IJ.pgp
Description: PGP signature


[PATCH] Fix PR59038

2013-11-08 Thread Richard Biener

This fixes PR59038 by reverting the wrong fix for PR58955 and instead
installing the correct fix (fingers crossing).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-11-07  Richard Biener  

PR tree-optimization/59038
PR tree-optimization/58955
* tree-loop-distribution.c (pg_add_dependence_edges): Revert
previous change.  Handle known dependences correctly.

* gcc.dg/torture/pr59038.c: New testcase.

Index: gcc/tree-loop-distribution.c
===
*** gcc/tree-loop-distribution.c(revision 204506)
--- gcc/tree-loop-distribution.c(working copy)
*** pg_add_dependence_edges (struct graph *r
*** 1324,1330 
for (int ii = 0; drs1.iterate (ii, &dr1); ++ii)
  for (int jj = 0; drs2.iterate (jj, &dr2); ++jj)
{
!   int this_dir = -1;
ddr_p ddr;
/* Re-shuffle data-refs to be in dominator order.  */
if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
--- 1324,1330 
for (int ii = 0; drs1.iterate (ii, &dr1); ++ii)
  for (int jj = 0; drs2.iterate (jj, &dr2); ++jj)
{
!   int this_dir = 1;
ddr_p ddr;
/* Re-shuffle data-refs to be in dominator order.  */
if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
*** pg_add_dependence_edges (struct graph *r
*** 1350,1357 
  }
/* Known dependences can still be unordered througout the
   iteration space, see gcc.dg/tree-ssa/ldist-16.c.  */
!   if (DDR_NUM_DIST_VECTS (ddr) == 0)
  this_dir = 2;
  }
else
  this_dir = 0;
--- 1350,1366 
  }
/* Known dependences can still be unordered througout the
   iteration space, see gcc.dg/tree-ssa/ldist-16.c.  */
!   if (DDR_NUM_DIST_VECTS (ddr) != 1)
  this_dir = 2;
+   /* If the overlap is exact preserve stmt order.  */
+   else if (lambda_vector_zerop (DDR_DIST_VECT (ddr, 0), 1))
+ ;
+   else
+ {
+   /* Else as the distance vector is lexicographic positive
+  swap the dependence direction.  */
+   this_dir = -this_dir;
+ }
  }
else
  this_dir = 0;
Index: gcc/testsuite/gcc.dg/torture/pr59038.c
===
*** gcc/testsuite/gcc.dg/torture/pr59038.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr59038.c  (working copy)
***
*** 0 
--- 1,25 
+ /* { dg-do run } */
+ 
+ extern void abort (void);
+ 
+ unsigned char first_ones_8bit[256];
+ unsigned char connected_passed[256];
+ 
+ int
+ main ()
+ {
+   int i, j;
+   for (i=0;i<256;i++){
+   connected_passed[i]=0;
+   first_ones_8bit[i]=0;
+   for (j=7;j>0;j--){
+ if ((i & (3<<(7-j))) == (3<<(7-j))){
+ connected_passed[i]=j;
+ break;
+ }
+   }
+   }
+   if (connected_passed[3] != 7)
+ abort ();
+   return 0;
+ }


Re: [wide-int] Remove SHIFT_COUNT_TRUNCATED uses at tree/gimple level

2013-11-08 Thread Richard Biener
On Thu, 7 Nov 2013, Kenneth Zadeck wrote:

> On 11/07/2013 01:07 PM, Richard Sandiford wrote:
> > Kenneth Zadeck  writes:
> > > I very strongly disagree with this.  The standard needs to be high than
> > > "does it pass the test suite."
> > > 
> > > What we are introducing is a case where the program will behave one way
> > > with optimization and another way without it.  While, this is always
> > > true for timing dependent code, it is pretty rare for math.  We should
> > > always tread carefully when doing that, and the excuse that it is
> > > "cleaner" does not, in my mind, fit the bill.
> > If it makes you feel any better (probably not), this was already true
> > for !SHIFT_COUNT_TRUNCATED targets.  The tree level would then not do
> > any truncation, whereas as you said before, all real targets do in practice.
> > And that affected many of the main targets, including x86_64, powerpc,
> > z/Series, ARM, etc.
> > 
> > (And the tree level has to do _something_.  "unsigned int i = 1 << 32;"
> > must compile with default options, even if the behaviour's undefined.
> > Only rtl has the luxury of being able to punt.)
> > 
> > Although you could go out of your way to try to make the undefined
> > behaviour match what the target instructions would do, that'd be a
> > lot of work, especially for cases that are implemented using library
> > calls.  And there's a lot to be said for making the behaviour
> > consistent across targets, which is what removing the truncation
> > from the tree level does.
> > 
> > Things are different at the rtl level because there a backend can
> > legitimately exploit the behaviour of the shift instructions when
> > implementing optabs.  And optabs.c itself does this when creating
> > doubleword shifts.  So at the rtl level we can only optimise out-of-range
> > shifts if we're sure what the target would do.
> > 
> > Thanks,
> > Richard
> I understand that i have lost this battle.  But the truth is that this is a
> symptom of a fairly deep problem with GCC: the people who work at the tree
> level feel that it really is wrong or unclean to understand the target when
> they do transformations.  With the exception of the vectorizer, the tree level
> is maintained revels in it's lack of understanding of the target.  This is the
> primary reason that optimizations like sign extension elimination are done so
> poorly in gcc -  you really do need to know what the target can do for free
> and the cost of what it can't do.   Only then can you push the code in a
> direction where you do good.

That's a completely different issue than to try to match target behavior
when constant folding an expression with undefined behavior.

Richard.


Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-08 Thread Ilya Enkovich
Hi,

Here is an updated patch version with no langhook.

Regarding TLS objects issue - I do not think compiler should compensate the 
absence of instrumentation in libraries.  Compiler should be responsible for 
initialization of Bounds Tables for .tdata section.  Correct data copy is a 
responsibility of library.  User should use either instrumented library or 
wrapper calls if he needs this functionality.

Thanks,
Ilya
--
gcc/

2013-11-06  Ilya Enkovich  

* c/c-parser.c: Include tree-chkp.h.
(c_parser_declaration_or_fndef): Register statically
initialized decls in Pointer Bounds Checker.
* cp/decl.c: Include tree-chkp.h.
(cp_finish_decl): Register statically
initialized decls in Pointer Bounds Checker.
* gimplify.c: Include tree-chkp.h.
(gimplify_init_constructor): Register statically
initialized decls in Pointer Bounds Checker.


diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 9ccae3b..397b323 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cgraph.h"
 #include "plugin.h"
 #include "omp-low.h"
+#include "tree-chkp.h"
 
 
 /* Initialization routine for this file.  */
@@ -1682,6 +1683,12 @@ c_parser_declaration_or_fndef (c_parser *parser, bool 
fndef_ok,
  maybe_warn_string_init (TREE_TYPE (d), init);
  finish_decl (d, init_loc, init.value,
   init.original_type, asm_name);
+
+ /* Register all decls with initializers in Pointer
+Bounds Checker to generate required static bounds
+initializers.  */
+ if (DECL_INITIAL (d) != error_mark_node)
+   chkp_register_var_initializer (d);
}
}
  else
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 1e92f2a..74df02f 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "splay-tree.h"
 #include "plugin.h"
 #include "cgraph.h"
+#include "tree-chkp.h"
 
 /* Possible cases of bad specifiers type used by bad_specifiers. */
 enum bad_spec_place {
@@ -6379,6 +6380,12 @@ cp_finish_decl (tree decl, tree init, bool 
init_const_expr_p,
 the class specifier.  */
  if (!DECL_EXTERNAL (decl))
var_definition_p = true;
+
+ /* If var has initilizer then we need to register it in
+Pointer Bounds Checker to generate static bounds initilizer
+if required.  */
+ if (DECL_INITIAL (decl) && DECL_INITIAL (decl) != error_mark_node)
+   chkp_register_var_initializer (decl);
}
   /* If the variable has an array type, lay out the type, even if
 there is no initializer.  It is valid to index through the
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 4f52c27..7aaac15 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "vec.h"
 #include "omp-low.h"
 #include "gimple-low.h"
+#include "tree-chkp.h"
 
 #include "langhooks-def.h" /* FIXME: for lhd_set_decl_assembler_name */
 #include "tree-pass.h" /* FIXME: only for PROP_gimple_any */
@@ -4111,6 +4112,11 @@ gimplify_init_constructor (tree *expr_p, gimple_seq 
*pre_p, gimple_seq *post_p,
 
walk_tree (&ctor, force_labels_r, NULL, NULL);
ctor = tree_output_constant_def (ctor);
+
+   /* We need to register created constant object to
+  initialize bounds for pointers in it.  */
+   chkp_register_var_initializer (ctor);
+
if (!useless_type_conversion_p (type, TREE_TYPE (ctor)))
  ctor = build1 (VIEW_CONVERT_EXPR, type, ctor);
TREE_OPERAND (*expr_p, 1) = ctor;


Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-08 Thread Mike Stump
On Nov 7, 2013, at 5:13 PM, Mingjie Xing  wrote:
> Well, it is my understanding that the warning should be emitted for a
> volatile variable only if it is not accessed.  Initialization means
> accessing, even though it is not used anywhere.

Let me try.  A warning is useful, if there is no way a conforming program can 
tell that the variable exists or not.  So, the question is, how can you notice 
the variable?  Answer, there is no way, so, there is no utility in having the 
variable.  The warning is to tell the user to remove the dead variable.

Re: [PATCH, i386]: Fix PR 59021, new vzeroupper instructions generated with -mavx

2013-11-08 Thread Uros Bizjak
On Fri, Nov 8, 2013 at 1:07 AM, Joern Rennecke
 wrote:
> On 7 November 2013 20:01, Uros Bizjak  wrote:
>
>> OTOH, looking a bit deeper, it looks that there is a problem in
>> mode-switching infrastructure. If we have a BB without any mode
>> requirements, but an insn that switched the mode via MODE_AFTER, we
>> should not mark the block as transparent. Indeed, we even have a N.B.
>> comment in mode-switching.c:
>>
>>   /* Check for blocks without ANY mode requirements.
>>  N.B. because of MODE_AFTER, last_mode might still be different
>>  from no_mode.  */
>>
>> But, we do nothing w.r.t to transparency
>
> The optimize_mode_switching patch makes sense to me.

Thanks, I will propose a patch for this.

>> Attached incremental patch also fixes additional vzeroupper problem.
>> Calls with AVX argument register in fact do not have any mode
>> requirements, so we don't need to fake MODE_DIRTY requirement.
>
> Not sure about this - because I'm note sure what the meaning of values in
> avx_u128_state are.  This lack of clarity is not caused by your  patch,
> but is that something that could be reasonably be explained
> in a comment, or would that have to be too long to be helpful?

No problem here, these are state transitions for vzeroupper
mode-switching state. State should be changed to DIRTY by functions
that use AVX256 argument registers.

Best regards,
Uros.


Re: [c++] Fix pr58525

2013-11-08 Thread Alexander Ivchenko
ups, I did ping in the wrong thread about this issue. Sorry.


Anyway, I noticed that after r204334 my patch cannot be applied
without conflicts.
Here is the updated one attached. Bootstrapped and regtested on
x86_64-unknown-linux-gnu.

2013-11-08  Alexander Ivchenko  

* gcc/cp/call.c (build_operator_new_call): Add flag_exceptions check.
* gcc/cp/decl.c (compute_array_index_type): Ditto.
* gcc/cp/init.c (build_new_1): Ditto.
(build_vec_init): Ditto.



Could someone take a look please?
This issue broke the build of Android source tree with trunk compiler.

thanks,
Alexander

2013/10/7 Alexander Ivchenko :
> Hi,
>
> __cxa_throw_bad_array_new_length and __cxa_throw_bad_array_new_length
> are generated with -fno-exceptions right now. The attached patch fixes
> that problem. Bootstrapped and regtested on x86_64-unknown-linux-gnu:
>
> 2013-10-07  Alexander Ivchenko  
>
> * gcc/cp/call.c (build_operator_new_call): Add flag_exceptions check.
> * gcc/cp/decl.c (cp_finish_decl): Ditto.
> * gcc/cp/init.c (build_new_1): Ditto.
> (build_vec_init): Ditto.
>
>
>
> Is it ok?
>
> --Alexander


fix_58525_02.patch
Description: Binary data


Re: [patch] [arm] ARM Cortex-M3/M4 tuning

2013-11-08 Thread Ramana Radhakrishnan

ChangeLog:

 2013-11-01  Julian Brown  
 Joey Ye  

 * config/arm/arm.c (arm_cortex_m_branch_cost): New.
 (arm_v7m_tune): New.
 (arm_*_tune): Add comments for Sched adj cost.


List all names here please rather than a regexp.


 * config/arm/arm-cores.def (cortex-m4, cortex-m3):
 Use arm_v7m_tune tuning.



The ARM parts are ok but I'd like a testsuite maintainer to look at the 
testsuite changes before committing.


regards
Ramana


testsuite:
 2013-11-01  Joey Ye  

 * gcc.dg/tree-ssa/forwprop-28.c: Disable for cortex_m.
 * gcc.dg/tree-ssa/vrp47.c: Likewise.
 * gcc.dg/tree-ssa/vrp87.c: Likewise.
 * gcc.dg/tree-ssa/ssa-dom-thread-4.c: Ignore for cortex_m.
 * gcc.dg/tree-ssa/ssa-vrp-thread-1.c: Likewise.





Re: Recent Go patch broke Alpha bootstrap

2013-11-08 Thread Uros Bizjak
On Fri, Nov 8, 2013 at 12:39 AM, Ian Lance Taylor  wrote:

>> Recent Go mega-patch broke Alpha bootstrap. Following fixlet is needed:
>>
> Thanks for the patch and report.  This patch should fix them.
> Bootstrapped and tested on x86_64-unknown-linux-gnu, not that that
> proves much.  Committed to mainline.

With your patch, I was able to compile libgo and run libgo testsuite.
There are two new testsuite failures for newly introduced tests:

--- FAIL: TestKillStartProcess (0.00 seconds)
os_test.go:1173: Failed to build exe
/tmp/go-build420056795/main.exe: exec: "go": executable file not found
in $PATH
--- FAIL: TestKillFindProcess (0.00 seconds)
os_test.go:1173: Failed to build exe
/tmp/go-build353151102/main.exe: exec: "go": executable file not found
in $PATH
FAIL
FAIL: os
gmake[2]: *** [os/check] Error 1

This should be trivial, I have no go in $PATH (IIRC, we already have
fixed failure like this).

panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x1c]

goroutine 5 [running]:
syscall.Exitsyscall
../../../gcc-svn/trunk/libgo/runtime/proc.c:1986
pprof.profileWriter

/home/uros/gcc-build/alphaev68-unknown-linux-gnu/libgo/gotest29851/test/pprof.go:600
created by runtime_pprof.StartCPUProfile

/home/uros/gcc-build/alphaev68-unknown-linux-gnu/libgo/gotest29851/test/pprof.go:594

goroutine 1 [chan receive]:
testing.RunTests
../../../gcc-svn/trunk/libgo/go/testing/testing.go:470
testing.Main
../../../gcc-svn/trunk/libgo/go/testing/testing.go:401
main.main

/home/uros/gcc-build/alphaev68-unknown-linux-gnu/libgo/gotest29851/test/_testmain.go:34

goroutine 4 [chan receive]:
pprof_test.testCPUProfile

/home/uros/gcc-build/alphaev68-unknown-linux-gnu/libgo/gotest29851/test/pprof_test.go:108
testing.tRunner
../../../gcc-svn/trunk/libgo/go/testing/testing.go:389
created by testing.RunTests
../../../gcc-svn/trunk/libgo/go/testing/testing.go:469
FAIL: runtime/pprof
gmake[2]: *** [runtime/pprof/check] Error 1

This one is new, I have to look into it a bit deeper.

All other libgo tests pass.

Uros.


[PATCH, rtl]: Mode-switching: Mark block as nontransparent if its exiting mode != no_mode

2013-11-08 Thread Uros Bizjak
Hello!

Attached patch fixes an oversight in mode-switching. For blocks
without ANY mode requirements, we have to consider instructions with
MODE_AFTER mode changes. If the exiting mode from the block is
different that no_mode (the mode we start), we have to mark the block
as nontransparent.

2013-11-08  Uros Bizjak  

* mode-switching.c (optimize_mode_switching): Mark block as
nontransparent, if last_mode at block exit is different from no_mode.

Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}
core-avx-i configured bootstrap.

OK for mainline and branches?

Thanks,
Uros.
Index: mode-switching.c
===
--- mode-switching.c(revision 204496)
+++ mode-switching.c(working copy)
@@ -577,6 +577,8 @@ optimize_mode_switching (void)
{
  ptr = new_seginfo (no_mode, BB_END (bb), bb->index, live_now);
  add_seginfo (info + bb->index, ptr);
+ if (last_mode != no_mode)
+   bitmap_clear_bit (transp[bb->index], j);
}
}
 #if defined (MODE_ENTRY) && defined (MODE_EXIT)


Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-08 Thread Bin.Cheng
On Fri, Nov 8, 2013 at 5:03 PM, Mike Stump  wrote:
> On Nov 7, 2013, at 5:13 PM, Mingjie Xing  wrote:
>> Well, it is my understanding that the warning should be emitted for a
>> volatile variable only if it is not accessed.  Initialization means
>> accessing, even though it is not used anywhere.
>
> Let me try.  A warning is useful, if there is no way a conforming program can 
> tell that the variable exists or not.  So, the question is, how can you 
> notice the variable?  Answer, there is no way, so, there is no utility in 
> having the variable.  The warning is to tell the user to remove the dead 
> variable.

I am sort of lost.  The bug is straightforward and it suggests the
volatile variable is actually used if it is declared with an
initialization, and GCC should not emit warning message for it.

The patch is fixing this, right?  Any explanation?

Thanks,
bin

-- 
Best Regards.


[patch libgcc]: Solve issue about too early released libgcc's DLL

2013-11-08 Thread Kai Tietz
Hello,

this issue was discussed on cygwin's ML some time ago.  For shared libgcc-DLL 
use it might happen that the DLL is released too early, so we need to perform 
an explicit load of it for increasing the load-count.  By this we make sure 
that the DLL is still loaded on destruction.

I will apply this patch soon, if there are no objections.  Corinna, please 
retest that this patch fixes the reported issue.

Regards,
Kai

2013-11-08  Kai Tietz  

* config/i386/cygming-crtbegin.c (__gcc_register_frame):
Increment load-count on use of LIBGCC_SONAME DLL.
(hmod_libgcc): New static variable to hold handle of
LIBGCC_SONAME DLL.
(__gcc_deregister_frame): Decrement load-count of
LIBGCC_SONAME DLL.

Index: config/i386/cygming-crtbegin.c
===
--- config/i386/cygming-crtbegin.c  (revision 204561)
+++ config/i386/cygming-crtbegin.c  (working copy)
@@ -91,6 +91,9 @@ static EH_FRAME_SECTION_CONST char __EH_FRAME_BEGI
   = { };
 
 static struct object obj;
+
+/* Handle of libgcc's DLL reference.  */
+HANDLE hmod_libgcc;
 #endif
 
 #if TARGET_USE_JCR_SECTION
@@ -115,9 +118,14 @@ __gcc_register_frame (void)
 
   void (*register_frame_fn) (const void *, struct object *);
   HANDLE h = GetModuleHandle (LIBGCC_SONAME);
+
   if (h)
-register_frame_fn = (void (*) (const void *, struct object *))
-   GetProcAddress (h, "__register_frame_info");
+{
+  /* Increasing the load-count of LIBGCC_SONAME DLL.  */
+  hmod_libgcc = LoadLibrary (LIBGCC_SONAME);
+  register_frame_fn = (void (*) (const void *, struct object *))
+ GetProcAddress (h, "__register_frame_info");
+}
   else 
 register_frame_fn = __register_frame_info;
   if (register_frame_fn)
@@ -154,5 +162,7 @@ __gcc_deregister_frame (void)
 deregister_frame_fn = __deregister_frame_info;
   if (deregister_frame_fn)
  deregister_frame_fn (__EH_FRAME_BEGIN__);
+  if (hmod_libgcc)
+FreeLibrary (hmod_libgcc);
 #endif
 }


Re: [PATCH] make has_gate and has_execute useless

2013-11-08 Thread Richard Biener
On Thu, Nov 7, 2013 at 5:00 PM,   wrote:
> From: Trevor Saunders 
>
> Hi,
>
>  This is the result of seeing what it would take to get rid of the has_gate 
> and
> has_execute flags on pass_data.  It turns out not much, but I wanted
> confirmation this part is ok before I go deal with all the places that
> initialize the fields.
>
> I bootstrapped this part on x86_64-unknown-linux-gnu and saw no new test suite
> regressions (ignoring the silk stuff because the full paths in its test names
> break my test script for now) Any reason this patch with the actual removal 
> of the flags wouldn't be ok?

The has_gate flag is easy to remove (without a TODO_ hack), right?
No gate simply means that the gate returns always true.  The only
weird thing is

  /* If we have a gate, combine the properties that we could have with
 and without the pass being examined.  */
  if (pass->has_gate)
properties &= new_properties;
  else
properties = new_properties;

which I don't understand (and you just removed all properties handling there).

So can you split out removing has_gate?  This part is obviously ok.

Then, for ->execute () I'd have refactored the code to make
->sub passes explicitely executed by the default ->execute ()
implementation only.  That is, passes without overriding execute
are containers only.  Can you quickly check whether that would
work out?

Thanks,
Richard.

> Trev
>
> 2013-11-06  Trevor Saunders  
>
> * pass_manager.h (pass_manager): Adjust.
> * passes.c (opt_pass::execute): Tell the pass manager it doesn't need
> to do anything for this pass.
> (pass_manager::register_dump_files_1): Don't uselessly deal with
> properties of passes.
> (pass_manager::register_dump_files): Adjust.
> (dump_one_pass): Just call pass->gate ().
> (execute_ipa_summary_passes): Likewise.
> (execute_one_pass): Don't check pass->has_execute flag.
> (ipa_write_summaries_2): Don't check pass->has_gate flag.
> (ipa_write_optimization_summaries_1): Likewise.
> (ipa_read_summaries_1): Likewise.
> (ipa_read_optimization_summaries_1): Likewise.
> (execute_ipa_stmt_fixups): Likewise.
> * tree-pass.h (pass_data): Rename has_gate to useless_has_gate and
> has_execute to useless_has_execute to be sure they're unused.
> (TODO_absolutely_nothing): New constant.
>
>
> diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
> index 77d78eb..3bc0a99 100644
> --- a/gcc/pass_manager.h
> +++ b/gcc/pass_manager.h
> @@ -93,7 +93,7 @@ public:
>
>  private:
>void set_pass_for_id (int id, opt_pass *pass);
> -  int register_dump_files_1 (struct opt_pass *pass, int properties);
> +  void register_dump_files_1 (struct opt_pass *pass);
>void register_dump_files (struct opt_pass *pass, int properties);
>
>  private:
> diff --git a/gcc/passes.c b/gcc/passes.c
> index 19e5869..3b28dc9 100644
> --- a/gcc/passes.c
> +++ b/gcc/passes.c
> @@ -112,7 +112,7 @@ opt_pass::gate ()
>  unsigned int
>  opt_pass::execute ()
>  {
> -  return 0;
> +  return TODO_absolutely_nothing;
>  }
>
>  opt_pass::opt_pass (const pass_data &data, context *ctxt)
> @@ -701,33 +701,21 @@ pass_manager::register_one_dump_file (struct opt_pass 
> *pass)
>
>  /* Recursive worker function for register_dump_files.  */
>
> -int
> +void
>  pass_manager::
> -register_dump_files_1 (struct opt_pass *pass, int properties)
> +register_dump_files_1 (struct opt_pass *pass)
>  {
>do
>  {
> -  int new_properties = (properties | pass->properties_provided)
> -  & ~pass->properties_destroyed;
> -
>if (pass->name && pass->name[0] != '*')
>  register_one_dump_file (pass);
>
>if (pass->sub)
> -new_properties = register_dump_files_1 (pass->sub, new_properties);
> -
> -  /* If we have a gate, combine the properties that we could have with
> - and without the pass being examined.  */
> -  if (pass->has_gate)
> -properties &= new_properties;
> -  else
> -properties = new_properties;
> +register_dump_files_1 (pass->sub);
>
>pass = pass->next;
>  }
>while (pass);
> -
> -  return properties;
>  }
>
>  /* Register the dump files for the pass_manager starting at PASS.
> @@ -739,7 +727,7 @@ pass_manager::
>  register_dump_files (struct opt_pass *pass,int properties)
>  {
>pass->properties_required |= properties;
> -  register_dump_files_1 (pass, properties);
> +  register_dump_files_1 (pass);
>  }
>
>  struct pass_registry
> @@ -847,7 +835,7 @@ dump_one_pass (struct opt_pass *pass, int pass_indent)
>const char *pn;
>bool is_on, is_really_on;
>
> -  is_on = pass->has_gate ? pass->gate () : true;
> +  is_on = pass->gate ();
>is_really_on = override_gate_status (pass, current_function_decl, is_on);
>
>if (pass->static_pass_number <= 0)
> @@ -2002,7 +1990,7 @@ execute_ipa_summary_passes (struct ipa_opt_pass_

Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-08 Thread Mike Stump

On Nov 8, 2013, at 1:32 AM, Bin.Cheng  wrote:

> On Fri, Nov 8, 2013 at 5:03 PM, Mike Stump  wrote:
>> On Nov 7, 2013, at 5:13 PM, Mingjie Xing  wrote:
>>> Well, it is my understanding that the warning should be emitted for a
>>> volatile variable only if it is not accessed.  Initialization means
>>> accessing, even though it is not used anywhere.
>> 
>> Let me try.  A warning is useful, if there is no way a conforming program 
>> can tell that the variable exists or not.  So, the question is, how can you 
>> notice the variable?  Answer, there is no way, so, there is no utility in 
>> having the variable.  The warning is to tell the user to remove the dead 
>> variable.
> 
> I am sort of lost.

I can try again.  Begin your sentence, the important utility of this construct 
is demonstrated by the following code:

See if you can complete it.  If not, then, then there is no utility.  The 
warning says, there is no utility.  This isn't a theoretic thing, it is an 
engineering thing.

Re: [PATCH, MPX, 2/X] Pointers Checker [10/25] Calls copy and verification

2013-11-08 Thread Richard Biener
On Thu, Nov 7, 2013 at 7:55 PM, Jeff Law  wrote:
> On 11/07/13 04:50, Ilya Enkovich wrote:
>>
>> Hi,
>>
>> Here is an updated patch version.
>
> I think this needs to hold until we have a consensus on what the parameter
> passing looks like for bounded pointers.

I still think the best thing to do on GIMPLE is

arg_2 = __builtin_ia32_bnd_arg (arg_1(D));
foo (arg_2);

that is, make the parameter an implicit pair of {value, bound} where
the bound is determined by the value going through a bound association
builtin.  No extra explicit argument to the calls so arguments match
the fndecl and fntype.  All the complexity is defered to the expander
which can trivially lookup bound arguments via the SSA def (I suppose
it does that anyway now for getting at the explicit bound argument now).

As far as I can see (well, think), all currently passed bound arguments
are the return value of such builtin already.

Richard.



> Jeff


Re: Implement C11 _Atomic

2013-11-08 Thread Uros Bizjak
On Thu, Nov 7, 2013 at 7:26 PM, Joseph S. Myers  wrote:

>> Please note that following code form fenv.c won't generate overflow
>> exception on x87:
>>
>>   if (excepts & FP_EX_OVERFLOW)
>> {
>>   volatile float max = __FLT_MAX__;
>>   r = max * max;
>> }
>
> r being volatile is intended to ensure that the result does get stored
> back to memory, and so in particular that a result computed with excess
> precision gets converted back to float and the exception is raised.

Can we introduce a target-dependant source here, in the same way as
gfortran has different config/fpu-*.c files? I would like to propose
the code from (recently improved) libgcc/config/i386/sfp-exceptions.c
to generate exceptions on x86 targets. The main benefits of this code
are:
- precision: it generates exactly the exception it is supposed to
generate at the exact spot (x87 and SSE)
- the code doesn't do arithemtics with denormal results
- can also generate non-standard denormal exception

Your proposed code can be considered a generic fallback code then. I'm
sure other targets have their own, perhaps better ways to generate FP
exceptions.

Uros.


Re: [PATCH] Small fix: add { dg-require-effective-target vect_int } to testsuite/gcc.dg/vect/pr58508.c

2013-11-08 Thread Richard Biener
On Fri, Nov 8, 2013 at 3:24 AM, Cong Hou  wrote:
> Ping. OK for the trunk?

Ok.

Thanks,
Richard.

>
>
>
> thanks,
> Cong
>
>
> On Fri, Nov 1, 2013 at 10:47 AM, Cong Hou  wrote:
>> It seems that on some platforms the loops in
>> testsuite/gcc.dg/vect/pr58508.c may be unable to be vectorized. This
>> small patch added { dg-require-effective-target vect_int } to make
>> sure all loops can be vectorized.
>>
>>
>> thanks,
>> Cong
>>
>>
>> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
>> index 9d0f4a5..3d9916d 100644
>> --- a/gcc/testsuite/ChangeLog
>> +++ b/gcc/testsuite/ChangeLog
>> @@ -1,3 +1,7 @@
>> +2013-10-29  Cong Hou  
>> +
>> +   * gcc.dg/vect/pr58508.c: Update.
>> +
>>  2013-10-15  Cong Hou  
>>
>> * gcc.dg/vect/pr58508.c: New test.
>> diff --git a/gcc/testsuite/gcc.dg/vect/pr58508.c
>> b/gcc/testsuite/gcc.dg/vect/pr58508.c
>> index 6484a65..fff7a04 100644
>> --- a/gcc/testsuite/gcc.dg/vect/pr58508.c
>> +++ b/gcc/testsuite/gcc.dg/vect/pr58508.c
>> @@ -1,3 +1,4 @@
>> +/* { dg-require-effective-target vect_int } */
>>  /* { dg-do compile } */
>>  /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */


Re: [PATCH, MPX, 2/X] Pointers Checker [10/25] Calls copy and verification

2013-11-08 Thread Ilya Enkovich
2013/11/8 Richard Biener :
> On Thu, Nov 7, 2013 at 7:55 PM, Jeff Law  wrote:
>> On 11/07/13 04:50, Ilya Enkovich wrote:
>>>
>>> Hi,
>>>
>>> Here is an updated patch version.
>>
>> I think this needs to hold until we have a consensus on what the parameter
>> passing looks like for bounded pointers.
>
> I still think the best thing to do on GIMPLE is
>
> arg_2 = __builtin_ia32_bnd_arg (arg_1(D));
> foo (arg_2);
>
> that is, make the parameter an implicit pair of {value, bound} where
> the bound is determined by the value going through a bound association
> builtin.  No extra explicit argument to the calls so arguments match
> the fndecl and fntype.  All the complexity is defered to the expander
> which can trivially lookup bound arguments via the SSA def (I suppose
> it does that anyway now for getting at the explicit bound argument now).
>
> As far as I can see (well, think), all currently passed bound arguments
> are the return value of such builtin already.

All bounds are result of different builtin calls. Small example:

int *global_p;
void foo (int *p)
{
  int buf[10];
  bar (p, buf, global_p);
}


It is translated into:

  __bound_tmp.1_7 = __builtin_ia32_bndmk (&buf, 40);
  __bound_tmp.1_6 = __builtin_ia32_arg_bnd (p_3(D)(ab));
  global_p.0_2 = global_p;
  __bound_tmp.1_8 = __builtin_ia32_bndldx (&global_p, global_p.0_2);
  bar (p_3(D)(ab), __bound_tmp.1_6, &buf, __bound_tmp.1_7,
global_p.0_2, __bound_tmp.1_8);

Bounds binding via calls as you suggest may be done as following:

  __bound_tmp.1_7 = __builtin_ia32_bndmk (&buf, 40);
  __bound_tmp.1_6 = __builtin_ia32_arg_bnd (p_3(D)(ab));
  global_p.0_2 = global_p;
  __bound_tmp.1_8 = __builtin_ia32_bndldx (&global_p, global_p.0_2);
  _9 = __builtin_bind_bounds (p_3(D)(ab), __bound_tmp.1_6);
  _10 = __builtin_bind_bounds (&buf, __bound_tmp.1_7);
  _11 = __builtin_bind_bounds (global_p.0_2, __bound_tmp.1_8);
  bar (_9, _10, _11);

Is it close to what you propose?

Ilya
>
> Richard.
>
>
>
>> Jeff


Re: [PATCH, MPX, 2/X] Pointers Checker [10/25] Calls copy and verification

2013-11-08 Thread Jakub Jelinek
On Fri, Nov 08, 2013 at 10:43:26AM +0100, Richard Biener wrote:
> >> Here is an updated patch version.
> >
> > I think this needs to hold until we have a consensus on what the parameter
> > passing looks like for bounded pointers.
> 
> I still think the best thing to do on GIMPLE is
> 
> arg_2 = __builtin_ia32_bnd_arg (arg_1(D));
> foo (arg_2);

Well, in that case it would likely have to be an internal builtin fn,
because it needs to be a pass through for random argument types, so
return long for long argument, or double for double argument, etc.
Or is that only done for pointer arguments?  Then it could be
like __builtin_assume_aligned, which takes void * and returns void *.

Jakub


Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-08 Thread Richard Biener
On Fri, Nov 8, 2013 at 10:39 AM, Mike Stump  wrote:
>
> On Nov 8, 2013, at 1:32 AM, Bin.Cheng  wrote:
>
>> On Fri, Nov 8, 2013 at 5:03 PM, Mike Stump  wrote:
>>> On Nov 7, 2013, at 5:13 PM, Mingjie Xing  wrote:
 Well, it is my understanding that the warning should be emitted for a
 volatile variable only if it is not accessed.  Initialization means
 accessing, even though it is not used anywhere.
>>>
>>> Let me try.  A warning is useful, if there is no way a conforming program 
>>> can tell that the variable exists or not.  So, the question is, how can you 
>>> notice the variable?  Answer, there is no way, so, there is no utility in 
>>> having the variable.  The warning is to tell the user to remove the dead 
>>> variable.
>>
>> I am sort of lost.
>
> I can try again.  Begin your sentence, the important utility of this 
> construct is demonstrated by the following code:
>
> See if you can complete it.  If not, then, then there is no utility.  The 
> warning says, there is no utility.  This isn't a theoretic thing, it is an 
> engineering thing.

Yeah, and as opposed to the non-volatile case removing the volatile
set-but-unused variable even reduces code size!

Richard.


Re: [PATCH 1/3] libgcc: check for fenv.h in dfp configure check

2013-11-08 Thread Bernhard Reutner-Fischer
On 4 April 2013 23:01, Ian Lance Taylor  wrote:
> On Thu, Apr 4, 2013 at 12:53 PM, Bernhard Reutner-Fischer
>  wrote:
>>
>> 2013-03-24  Bernhard Reutner-Fischer  
>>
>> * configure.ac (libgcc_cv_dfp): Extend check to probe fenv.h
>> availability.
>> * configure: Regenerate
>
> This is OK.

Applied just now as r204562.
thanks,


Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-08 Thread Bin.Cheng
On Fri, Nov 8, 2013 at 5:39 PM, Mike Stump  wrote:
>
> On Nov 8, 2013, at 1:32 AM, Bin.Cheng  wrote:
>
>> On Fri, Nov 8, 2013 at 5:03 PM, Mike Stump  wrote:
>>> On Nov 7, 2013, at 5:13 PM, Mingjie Xing  wrote:
 Well, it is my understanding that the warning should be emitted for a
 volatile variable only if it is not accessed.  Initialization means
 accessing, even though it is not used anywhere.
>>>
>>> Let me try.  A warning is useful, if there is no way a conforming program 
>>> can tell that the variable exists or not.  So, the question is, how can you 
>>> notice the variable?  Answer, there is no way, so, there is no utility in 
>>> having the variable.  The warning is to tell the user to remove the dead 
>>> variable.
>>
>> I am sort of lost.
>
> I can try again.  Begin your sentence, the important utility of this 
> construct is demonstrated by the following code:
>
> See if you can complete it.  If not, then, then there is no utility.  The 
> warning says, there is no utility.  This isn't a theoretic thing, it is an 
> engineering thing.

Thanks for elaborating.  The warning message is actually for no-use of
variable, and it has few things to do with whether it's accessed or
not.

Thanks,
bin
-- 
Best Regards.


Re: [PATCH 2/3] libstdc++-v3: ::tmpnam depends on uClibc SUSV4_LEGACY

2013-11-08 Thread Bernhard Reutner-Fischer
On 11 April 2013 14:18, Paolo Carlini  wrote:
> Hi,
>
>
> On 04/11/2013 02:04 PM, Bernhard Reutner-Fischer wrote:
>>
>> I would have expected that somebody would tell me that omitting ::tmpnam
>> violates 27.9.2  from the spec but noone yelled at me yet?
>
> Frankly, I didn't because the targets I really care about aren't affected.
> The actual maintainers of this target should speak.

Attaching an updated patch that i was using since March (without
regressions) which takes Rainer's comments about _GLIBCXX_USE_TMPNAM
into account.
Ok for trunk?

libstdc++-v3/ChangeLog

2013-03-24  Bernhard Reutner-Fischer  

* acinclude.m4 (GLIBCXX_CHECK_TMPNAM): New check for tmpnam
function.
* configure.ac: Use GLIBCXX_CHECK_TMPNAM.
* (configure, config.h.in): Regenerate.
* include/c_global/cstdio: Guard ::tmpnam with _GLIBCXX_USE_TMPNAM


commit-6f2faa2
Description: Binary data


Some wide-int review comments

2013-11-08 Thread Richard Sandiford
Some comments from looking through the diff with the merge point,
ignoring wide-int.h and wide-int.cc.  A few more to follow in the
form of patchses.



dwarf2out.c has:

+case CONST_WIDE_INT:
+  if (mode == VOIDmode)
+   mode = GET_MODE (rtl);
+
+  if (mode != VOIDmode && (dwarf_version >= 4 || !dwarf_strict))
+   {
+ gcc_assert (mode == GET_MODE (rtl) || VOIDmode == GET_MODE (rtl));
+
+ /* Note that a CONST_DOUBLE rtx could represent either an integer
+or a floating-point constant.  A CONST_DOUBLE is used whenever
+the constant requires more than one word in order to be
+adequately represented.  We output CONST_DOUBLEs as blocks.  */
+ loc_result = new_loc_descr (DW_OP_implicit_value,
+ GET_MODE_SIZE (mode), 0);
+ loc_result->dw_loc_oprnd2.val_class = dw_val_class_wide_int;
+ loc_result->dw_loc_oprnd2.v.val_wide = ggc_alloc_cleared_wide_int ();
+ *loc_result->dw_loc_oprnd2.v.val_wide = std::make_pair (rtl, mode);

The comment looks like a cut-&-paste.  The "mode == GET_MODE (rtl)"
bit should never be true.



>From fold-const.c:

@@ -13686,14 +13548,17 @@ fold_binary_loc (location_t loc,
  break;
}
 
-   else if (TREE_INT_CST_HIGH (arg1) == signed_max_hi
-&& TREE_INT_CST_LOW (arg1) == signed_max_lo
+   else if (wi::eq_p (arg1, signed_max)
 && TYPE_UNSIGNED (arg1_type)
+/* KENNY QUESTIONS THE CHECKING OF THE BITSIZE
+   HERE.  HE FEELS THAT THE PRECISION SHOULD BE
+   CHECKED */
+
 /* We will flip the signedness of the comparison operator
associated with the mode of arg1, so the sign bit is
specified by this mode.  Check that arg1 is the signed
max associated with this sign bit.  */
-&& width == GET_MODE_BITSIZE (TYPE_MODE (arg1_type))
+&& prec == GET_MODE_BITSIZE (TYPE_MODE (arg1_type))
 /* signed_type does not work on pointer types.  */
 && INTEGRAL_TYPE_P (arg1_type))
  {

Looks like it should be resolved one way or the other before the merge.



>From gcse.c:

--- wide-int-base/gcc/gcc/gcse.c2013-11-05 13:09:32.148376180 +
+++ wide-int/gcc/gcc/gcse.c 2013-11-05 13:07:28.431495118 +
@@ -1997,6 +1997,13 @@ prune_insertions_deletions (int n_elems)
bitmap_clear_bit (pre_delete_map[i], j);
 }
 
+  if (dump_file)
+{
+  dump_bitmap_vector (dump_file, "pre_insert_map", "", pre_insert_map, 
n_edges);
+  dump_bitmap_vector (dump_file, "pre_delete_map", "", pre_delete_map,
+  last_basic_block);
+}
+
   sbitmap_free (prune_exprs);
   free (insertions);
   free (deletions);

This doesn't look related.



>From lcm.c:

diff -udpr '--exclude=.svn' '--exclude=.pc' '--exclude=patches' 
wide-int-base/gcc/gcc/lcm.c wide-int/gcc/gcc/lcm.c
--- wide-int-base/gcc/gcc/lcm.c 2013-08-22 09:00:23.068716382 +0100
+++ wide-int/gcc/gcc/lcm.c  2013-10-26 13:19:16.287277520 +0100
@@ -64,6 +64,7 @@ along with GCC; see the file COPYING3.
 #include "sbitmap.h"
 #include "dumpfile.h"
 
+#define LCM_DEBUG_INFO 1
 /* Edge based LCM routines.  */
 static void compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
 static void compute_earliest (struct edge_list *, int, sbitmap *, sbitmap *,
@@ -106,6 +107,7 @@ compute_antinout_edge (sbitmap *antloc,
   /* We want a maximal solution, so make an optimistic initialization of
  ANTIN.  */
   bitmap_vector_ones (antin, last_basic_block);
+  bitmap_vector_clear (antout, last_basic_block);
 
   /* Put every block on the worklist; this is necessary because of the
  optimistic initialization of ANTIN above.  */
@@ -432,6 +434,7 @@ pre_edge_lcm (int n_exprs, sbitmap *tran
 
   /* Allocate an extra element for the exit block in the laterin vector.  */
   laterin = sbitmap_vector_alloc (last_basic_block + 1, n_exprs);
+  bitmap_vector_clear (laterin, last_basic_block);
   compute_laterin (edge_list, earliest, antloc, later, laterin);
 
 #ifdef LCM_DEBUG_INFO

Same here.



>From real.c:

@@ -2144,43 +2148,131 @@ real_from_string3 (REAL_VALUE_TYPE *r, c
 real_convert (r, mode, r);
 }
 
-/* Initialize R from the integer pair HIGH+LOW.  */
+/* Initialize R from a HOST_WIDE_INT.  */
 
 void
 real_from_integer (REAL_VALUE_TYPE *r, enum machine_mode mode,
-  unsigned HOST_WIDE_INT low, HOST_WIDE_INT high,
-  int unsigned_p)
+  HOST_WIDE_INT val,
+  signop sgn)
 {
-  if (low == 0 && high == 0)
+  if (val == 0)
 get_zero (r, 0);
   else
 {
   memset (r, 0, sizeof (*r));
 

Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-08 Thread Mingjie Xing
Oops, if it is not a bug, please close the report
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57258

Thanks,
Mingjie

2013/11/8 Richard Biener :
> On Fri, Nov 8, 2013 at 10:39 AM, Mike Stump  wrote:
>>
>> On Nov 8, 2013, at 1:32 AM, Bin.Cheng  wrote:
>>
>>> On Fri, Nov 8, 2013 at 5:03 PM, Mike Stump  wrote:
 On Nov 7, 2013, at 5:13 PM, Mingjie Xing  wrote:
> Well, it is my understanding that the warning should be emitted for a
> volatile variable only if it is not accessed.  Initialization means
> accessing, even though it is not used anywhere.

 Let me try.  A warning is useful, if there is no way a conforming program 
 can tell that the variable exists or not.  So, the question is, how can 
 you notice the variable?  Answer, there is no way, so, there is no utility 
 in having the variable.  The warning is to tell the user to remove the 
 dead variable.
>>>
>>> I am sort of lost.
>>
>> I can try again.  Begin your sentence, the important utility of this 
>> construct is demonstrated by the following code:
>>
>> See if you can complete it.  If not, then, then there is no utility.  The 
>> warning says, there is no utility.  This isn't a theoretic thing, it is an 
>> engineering thing.
>
> Yeah, and as opposed to the non-volatile case removing the volatile
> set-but-unused variable even reduces code size!
>
> Richard.


Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-08 Thread Mike Stump
On Nov 8, 2013, at 2:25 AM, Bin.Cheng  wrote:
> Thanks for elaborating.  The warning message is actually for no-use of
> variable, and it has few things to do with whether it's accessed or
> not.

I disagree.  If you examine why the warning was put in, you realize it was put 
in so that lazy programmers can find dead things.  Why do they want to do that, 
to remove the dead things.  This construct is a dead thing.  If you disagree, 
you'd have to say why the warning was produced.


Re: [ARM, AArch64] Make aarch-common.c files more robust.

2013-11-08 Thread Ramana Radhakrishnan

On 11/06/13 09:45, James Greenhalgh wrote:


Hi,

This patch is a respin of the aarch-common improvements to use a generic
search to find SETs and the variety of shifts, rather than relying on the
hope that we will find something well formed.

I've bootstrapped the patch on a chromebook with -mcpu=cortex-a7,
  -mcpu=cortex-a9 and -mcpu=cortex-a15 with no issues, and run
the aarch64-none-elf regression suite with no issues. And I've thrown
a bunch of smoke tests at this over all supported -mcpu values
with no issues, and only a few minor scheduling differences where the
old code was too liberal and did the wrong thing.

Is this OK?


This is OK - thanks,

regards
Ramana


Thanks,
James

---
gcc/

2013-11-06  James Greenhalgh  

* config/arm/aarch-common.c
(search_term): New typedef.
(shift_rtx_costs): New array.
(arm_rtx_shift_left_p): New.
(arm_find_sub_rtx_with_search_term): Likewise.
(arm_find_sub_rtx_with_code): Likewise.
(arm_early_load_addr_dep): Add sanity checking.
(arm_no_early_alu_shift_dep): Likewise.
(arm_no_early_alu_shift_value_dep): Likewise.
(arm_no_early_mul_dep): Likewise.
(arm_no_early_store_addr_dep): Likewise.






Re: Some wide-int review comments

2013-11-08 Thread Mike Stump
On Nov 8, 2013, at 2:30 AM, Richard Sandiford  
wrote:
> From gcse.c:
> 
> --- wide-int-base/gcc/gcc/gcse.c  2013-11-05 13:09:32.148376180 +
> +++ wide-int/gcc/gcc/gcse.c   2013-11-05 13:07:28.431495118 +
> @@ -1997,6 +1997,13 @@ prune_insertions_deletions (int n_elems)
>   bitmap_clear_bit (pre_delete_map[i], j);
> }
> 
> +  if (dump_file)
> +{
> +  dump_bitmap_vector (dump_file, "pre_insert_map", "", pre_insert_map, 
> n_edges);
> +  dump_bitmap_vector (dump_file, "pre_delete_map", "", pre_delete_map,
> +last_basic_block);
> +}
> +
>   sbitmap_free (prune_exprs);
>   free (insertions);
>   free (deletions);
> 
> This doesn't look related.

I think this was so Kenny could find code-gen differences and pin point them 
faster by having the ability to compare dump files.  When we developed the 
branch, we found great utility in dump file comparisons.  It narrows down the 
scope of what went wrong and helps get you into the right data structures to 
look at.  Sometimes one can have identical code-gen, but, have different dump 
files, and comparing them and looking for differences was nice.

Re: [PATCH] Introducing SAD (Sum of Absolute Differences) operation to GCC vectorizer.

2013-11-08 Thread James Greenhalgh
> On Tue, Nov 5, 2013 at 9:58 AM, Cong Hou  wrote:
> > Thank you for your detailed explanation.
> >
> > Once GCC detects a reduction operation, it will automatically
> > accumulate all elements in the vector after the loop. In the loop the
> > reduction variable is always a vector whose elements are reductions of
> > corresponding values from other vectors. Therefore in your case the
> > only instruction you need to generate is:
> >
> > VABAL   ops[3], ops[1], ops[2]
> >
> > It is OK if you accumulate the elements into one in the vector inside
> > of the loop (if one instruction can do this), but you have to make
> > sure other elements in the vector should remain zero so that the final
> > result is correct.
> >
> > If you are confused about the documentation, check the one for
> > udot_prod (just above usad in md.texi), as it has very similar
> > behavior as usad. Actually I copied the text from there and did some
> > changes. As those two instruction patterns are both for vectorization,
> > their behavior should not be difficult to explain.
> >
> > If you have more questions or think that the documentation is still
> > improper please let me know.

Hi Cong,

Thanks for your reply.

I've looked at Dorit's original patch adding WIDEN_SUM_EXPR and
DOT_PROD_EXPR and I see that the same ambiguity exists for
DOT_PROD_EXPR. Can you please add a note in your tree.def
that SAD_EXPR, like DOT_PROD_EXPR can be expanded as either:

  tmp = WIDEN_MINUS_EXPR (arg1, arg2)
  tmp2 = ABS_EXPR (tmp)
  arg3 = PLUS_EXPR (tmp2, arg3)

or:

  tmp = WIDEN_MINUS_EXPR (arg1, arg2)
  tmp2 = ABS_EXPR (tmp)
  arg3 = WIDEN_SUM_EXPR (tmp2, arg3)

Where WIDEN_MINUS_EXPR is a signed MINUS_EXPR, returning a
a value of the same (widened) type as arg3.

Also, while looking for the history of DOT_PROD_EXPR I spotted this
patch:

  [autovect] [patch] detect mult-hi and sad patterns
  http://gcc.gnu.org/ml/gcc-patches/2005-10/msg01394.html

I wonder what the reason was for that patch to be dropped?

Thanks,
James



Re: [PATCH] fixincludes: use $(FI) instead of fixincl@EXEEXT@

2013-11-08 Thread Bernhard Reutner-Fischer
On 4 April 2013 22:20, Bruce Korb  wrote:
> Except as noted below, fine by me.
>
> On 04/04/13 12:56, Bernhard Reutner-Fischer wrote:
>> Bootstrapped and regtested on x86_64-unknown-linux-gnu and
>> x86_64-mine-linux-uclibc without regressions, ok for trunk?
>>
>> fixincludes/ChangeLog:
>>
>> 2013-04-04  Bernhard Reutner-Fischer  
>>
>>   Makefile.in: Use $(FI) instead of fixincl@EXEEXT@.
>>   Cleanup whitespace while at it.
>>
>> Signed-off-by: Bernhard Reutner-Fischer 
>> ---
>>  fixincludes/Makefile.in |   10 +-
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/fixincludes/Makefile.in b/fixincludes/Makefile.in
>> index ce850ff..3dc869d 100644
>> --- a/fixincludes/Makefile.in
>> +++ b/fixincludes/Makefile.in
>> @@ -131,7 +131,7 @@ fixinc.sh : fixinc.in mkfixinc.sh Makefile
>>  $(srcdir)/fixincl.x: @MAINT@ fixincl.tpl inclhack.def
>>   cd $(srcdir) ; $(SHELL) ./genfixes
>>
>> -mostlyclean :
>> +mostlyclean:
>
> I see no reason for changing things.

dropped this hunk.

> But if you are going to "clean up" the colons, then they should
> be in columns (mostly 12 or 16).  This one is already in 12.
>
>>   rm -f *.o *-stamp $(AF) $(FI) *~ fixinc.sh
>>
>>  clean: mostlyclean
>> @@ -179,18 +179,18 @@ check : all
>>
>>  install : all
>>   -rm -rf $(DESTDIR)$(itoolsdir)
>> - $(mkinstalldirs) $(DESTDIR)$(itoolsdir)
>> + $(mkinstalldirs) $(DESTDIR)$(itoolsdir)
>>   $(mkinstalldirs) $(DESTDIR)$(itoolsdatadir)/include
>>   $(INSTALL_DATA) $(srcdir)/README-fixinc \
>> $(DESTDIR)$(itoolsdatadir)/include/README
>>   $(INSTALL_SCRIPT) fixinc.sh $(DESTDIR)$(itoolsdir)/fixinc.sh
>> - $(INSTALL_PROGRAM) fixincl@EXEEXT@ \
>> -   $(DESTDIR)$(itoolsdir)/fixincl@EXEEXT@
>> + $(INSTALL_PROGRAM) $(FI) \
>> +   $(DESTDIR)$(itoolsdir)/$(FI)
>
> This should now fit on a single line.

ok
>
>>   $(INSTALL_SCRIPT) mkheaders $(DESTDIR)$(itoolsdir)/mkheaders
>>
>>  install-strip: install
>>   test -z '$(STRIP)' \
>> -   || $(STRIP) $(DESTDIR)$(itoolsdir)/fixincl@EXEEXT@
>> +   || $(STRIP) $(DESTDIR)$(itoolsdir)/$(FI)

changed this too to be on a single line now.
>>
>>  .PHONY: all check install install-strip
>>  .PHONY: dvi pdf info html install-pdf install-info install-html

Changelog remains the same.
II was using the attached updated patch since April, ok for trunk?


commit-8788d7c
Description: Binary data


Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-08 Thread Mike Stump
On Nov 8, 2013, at 2:35 AM, Mingjie Xing  wrote:
> Oops, if it is not a bug, please close the report
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57258

Well, I've stated my position.  I can be swayed by a good argument, if someone 
has one.  I'd give people a chance to weigh in if they can think of one.  A 
global variable with linkage is completely different, as the linker can grab 
it, dlsym can grab it, and who knows…  while a cleaver (Ian type of cleaver) 
person could grab an automatic variable, the difficulty is enormous and doing 
that would be exceedingly odd.  I don't know of any coding style where people 
do that sort of thing.  Reflection comes the closest; dynamic bindings (a la 
lisp) I guess comes second; shudder.  I'd make them mark it with the used 
attribute, since they're really weird.

Re: [patch] safe iterator simplification

2013-11-08 Thread Jonathan Wakely
On 7 November 2013 21:00, François Dumont wrote:
>
> * include/debug/safe_iterator.h (_BeforeBeginHelper<>::_S_Is):
> Take only a const safe iterator reference.
> (_BeforeBeingHelper<>::_S_Is_beginnest): Likewise.

There's a typo here, "Being" not "Begin"

> Ok to commit ?

Yes, with the tiny fix to the ChangeLog, thanks!


Recent Go patch broke Solaris bootstrap

2013-11-08 Thread Rainer Orth
The recent Go patch (couldn't find the submission on gcc-patches) broke
Solaris bootstrap: on Solaris 10/x86 I get

/vol/gcc/src/hg/trunk/local/libgo/go/net/fd_unix.go:414:72: error: reference to 
undefined identifier 'syscall.F_DUPFD_CLOEXEC'
   r0, _, e1 := syscall.Syscall(syscall.SYS_FCNTL, uintptr(fd), 
syscall.F_DUPFD_CLOEXEC, 0)

/vol/gcc/src/hg/trunk/local/libgo/go/net/tcpsockopt_unix.go:26:103: error: 
reference to undefined identifier 'syscall.TCP_KEEPINTVL'
  err := os.NewSyscallError("setsockopt", syscall.SetsockoptInt(fd.sysfd, 
syscall.IPPROTO_TCP, syscall.TCP_KEEPINTVL, secs))

   ^
/vol/gcc/src/hg/trunk/local/libgo/go/net/tcpsockopt_unix.go:30:103: error: 
reference to undefined identifier 'syscall.TCP_KEEPIDLE'
  return os.NewSyscallError("setsockopt", syscall.SetsockoptInt(fd.sysfd, 
syscall.IPPROTO_TCP, syscall.TCP_KEEPIDLE, secs))

   ^
/vol/gcc/src/hg/trunk/local/libgo/go/net/fd_select.go:90:30: error: use of 
undefined type 'pollServer'
 func (p *pollster) WaitFD(s *pollServer, nsec int64) (fd int, mode int, err 
error) {
  ^
/vol/gcc/src/hg/trunk/local/libgo/go/net/fd_select.go:113:5: error: reference 
to field 'Unlock' in object which has no fields or methods
s.Unlock()
 ^
/vol/gcc/src/hg/trunk/local/libgo/go/net/fd_select.go:115:5: error: reference 
to field 'Lock' in object which has no fields or methods
s.Lock()
 ^

Haven't yet started looking into how to fix this yet.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] RE: Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-11-08 Thread Bernhard Reutner-Fischer
On 6 November 2013 08:04, Jakub Jelinek  wrote:
> On Wed, Nov 06, 2013 at 02:24:01AM +, Iyer, Balaji V wrote:
>> Fixed patch is attached. The responses to your question are given below. 
>> Is this patch OK?
>>
>> Here is the ChangeLog entry:
>>
>> +2013-11-05  Balaji V. Iyer  
>> +
>> +   * c-c++-common/cilk-plus/CK/fib.c: Reduced the iteration from
>> +   40 to 30.  Replaced iteration variable with a #define.  Instead of
>> +   returning non-zero value for error, called __builtin_abort ().  Fixed
>> +   a bug of calling fib_serial in serial case instead of fib.
>> +   * c-c++-common/cilk-plus/CK/fib_init_expr_xy.c: Likewise.
>> +   * c-c++-common/cilk-plus/CK/fib_no_return.c: Likewise.
>> +   * c-c++-common/cilk-plus/CK/fib_no_sync.c: Likewise.
>> +   * gcc.dg/cilk-plus/cilk-plus.exp: Removed duplicate/un-necessary
>> +   compiler flag testing.
>
> Ok.
Balaji,
May i suggest you rephrase the .exp so it does not line-wrap and is
actually readable like in attached (untested) 01 or at least 00?

Thanks for your consideration..
Index: gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp
===
--- gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp	(revision 204561)
+++ gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp	(working copy)
@@ -25,31 +25,34 @@
 
 verbose "$tool $libdir" 1
 set library_var [get_multilibs]
-# Pointing the ld_library_path to the Cilk Runtime library binaries. 
+# Pointing the ld_library_path to the Cilk Runtime library binaries.
 set ld_library_path "${library_var}/libcilkrts/.libs"
 
 set ALWAYS_CFLAGS ""
 lappend ALWAYS_CFLAGS "-L${library_var}/libcilkrts/.libs"
 
 dg-init
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -O1 -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -O2 -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -O3 -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -g -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -g -O2 -ftree-vectorize -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -fcilkplus -std=c99" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -fcilkplus -O2 -std=c99" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -fcilkplus -O3 -std=c99" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -fcilkplus -g -O0 -std=c99" " "
 
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -g -fcilkplus $ALWAYS_CFLAGS " " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O1 -fcilkplus $ALWAYS_CFLAGS" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O2 -std=c99 -fcilkplus $ALWAYS_CFLAGS" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O2 -ftree-vectorize -fcilkplus $ALWAYS_CFLAGS" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O3 -g -fcilkplus $ALWAYS_CFLAGS" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O3 -flto -g -fcilkplus $ALWAYS_CFLAGS" " "
+set tests [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]]
 
+dg-runtest $tests " -fcilkplus" " "
+dg-runtest $tests " -O1 -fcilkplus" " "
+dg-runtest $tests " -O2 -fcilkplus" " "
+dg-runtest $tests " -O3 -fcilkplus" " "
+dg-runtest $tests " -g -fcilkplus" " "
+dg-runtest $tests " -g -O2 -ftree-vectorize -fcilkplus" " "
+dg-runtest $tests " -fcilkplus -std=c99" " "
+dg-runtest $tests " -fcilkplus -O2 -std=c99" " "
+dg-runtest $tests " -fcilkplus -O3 -std=c99" " "
+dg-runtest $tests " -fcilkplus -g -O0 -std=c99" " "
 
+set tests [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]]
 
+dg-runtest $tests " -g -fcilkplus $ALWAYS_CFLAGS " " "
+dg-runtest $tests " -O1 -fcilkplus $ALWAYS_CFLAGS" " "
+dg-runtest $tests " -O2 -std=c99 -fcilkplus $ALWAYS_CFLAGS" " "
+dg-runtest $tests " -O2 -ftree-vectorize -fcilkplus $ALWAYS_CFLAGS" " "
+dg-runtest $tests " -O3 -g -fcilkplus $ALWAYS_CFLAGS" " "
+dg-runtest $tests " -O3 -flto -g -fcilkplus $ALWAYS_CFLAGS" " "
+
 dg-finish
Index: gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp
===
--- gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp	(revision 204561)
+++ gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp	(working copy)
@@ -25,31 +25,40 @@
 
 verbose "$tool $libdir" 1
 set library_var [get_multilibs]
-# Pointing the ld_library_path to the Cilk Runtime library binaries. 
+# Pointing the ld_library_path to the Cilk Runtime library binaries.
 set ld_library_path "${library_var}/libcilkrts/

Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)

2013-11-08 Thread Richard Biener
> Hello,
> This is new patch version.
> Comments from Bernhard Reutner-Fischer review applied.
> Also, test case bitfildmrg2.c modified - it is now execute test.
> 
> 
> Example:
> 
> Original code:
>D.1351;
>D.1350;
>D.1349;
>   D.1349_2 = p1_1(D)->f1;
>   p2_3(D)->f1 = D.1349_2;
>   D.1350_4 = p1_1(D)->f2;
>   p2_3(D)->f2 = D.1350_4;
>   D.1351_5 = p1_1(D)->f3;
>   p2_3(D)->f3 = D.1351_5;
> 
> Optimized code:
>D.1358;
>   _16 = pr1_2(D)->_field0;
>   pr2_4(D)->_field0 = _16;
> 
> Algorithm works on basic block level and consists of following 3 major steps:
> 1. Go trough basic block statements list. If there are statement pairs
> that implement copy of bit field content from one memory location to
> another record statements pointers and other necessary data in
> corresponding data structure.
> 2. Identify records that represent adjacent bit field accesses and
> mark them as merged.
> 3. Modify trees accordingly.
> 
> New command line option "-ftree-bitfield-merge" is introduced.
> 
> Tested - passed gcc regression tests.

Comments inline (sorry for the late reply...)

> Changelog -
> 
> gcc/ChangeLog:
> 2013-09-24 Zoran Jovanovic (zoran.jovano...@imgtec.com)
>   * Makefile.in : Added tree-sra.c to GTFILES.
>   * common.opt (ftree-bitfield-merge): New option.
>   * doc/invoke.texi: Added reference to "-ftree-bitfield-merge".
>   * tree-sra.c (ssa_bitfield_merge): New function.
>   Entry for (-ftree-bitfield-merge).
>   (bitfield_stmt_access_pair_htab_hash): New function.
>   (bitfield_stmt_access_pair_htab_eq): New function.
>   (cmp_access): New function.
>   (create_and_insert_access): New function.
>   (get_bit_offset): New function.
>   (get_merged_bit_field_size): New function.
>   (add_stmt_access_pair): New function.
>   * dwarf2out.c (simple_type_size_in_bits): moved to tree.c.
>   (field_byte_offset): declaration moved to tree.h, static removed.
>   * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
>   * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
>   * tree-ssa-sccvn.c (expressions_equal_p): moved to tree.c.
>   * tree-ssa-sccvn.h (expressions_equal_p): declaration moved to tree.h.
>   * tree.c (expressions_equal_p): moved from tree-ssa-sccvn.c.
>   (simple_type_size_in_bits): moved from dwarf2out.c.
>   * tree.h (expressions_equal_p): declaration added.
>   (field_byte_offset): declaration added.
> 
> Patch -
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index a2e3f7a..54aa8e7 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -3847,6 +3847,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h
> $(srcdir)/coretypes.h \
>$(srcdir)/asan.c \
>$(srcdir)/ubsan.c \
>$(srcdir)/tsan.c $(srcdir)/ipa-devirt.c \
> +  $(srcdir)/tree-sra.c \
>@all_gtfiles@
> 
>  # Compute the list of GT header files from the corresponding C sources,
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 202e169..afac514 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2164,6 +2164,10 @@ ftree-sra
>  Common Report Var(flag_tree_sra) Optimization
>  Perform scalar replacement of aggregates
> 
> +ftree-bitfield-merge
> +Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
> +Enable bit-field merge on trees
> +

Drop the tree- prefix for new options, it doesn't tell users anything.
I suggest

fmerge-bitfields
Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
Merge loads and stores of consecutive bitfields

and adjust docs with regarding to the flag name change of course.

>  ftree-ter
>  Common Report Var(flag_tree_ter) Optimization
>  Replace temporary expressions in the SSA->normal pass
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index aa0f4ed..e588cae 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -412,7 +412,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
>  -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
> --ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> +-ftree-bitfield-merge -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
>  -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
>  -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
>  -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> @@ -7679,6 +7679,11 @@ pointer alignment information.
>  This pass only operates on local scalar variables and is enabled by default
>  at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
> 
> +@item -ftree-bitfield-merge
> +@opindex ftree-bitfield-merge
> +Combines several adjacent bit-field accesses that copy values
> +from one memory location to another into one single bit-field access.
> +
>  @item -ftree-ccp
>  @opindex ftree-ccp
>  Perform sparse conditional constant propagation (CCP) on trees.  This
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 95049e4..e74db17 100644
> --- a/gcc

Re: [patch libgcc]: Solve issue about too early released libgcc's DLL

2013-11-08 Thread Corinna Vinschen
Hi,

On Nov  8 04:36, Kai Tietz wrote:
> Hello,
> 
> this issue was discussed on cygwin's ML some time ago.  For shared
> libgcc-DLL use it might happen that the DLL is released too early, so
> we need to perform an explicit load of it for increasing the
> load-count.  By this we make sure that the DLL is still loaded on
> destruction.
> 
> I will apply this patch soon, if there are no objections.  Corinna,
> please retest that this patch fixes the reported issue.

I'm not set up for testing this, but the patch looks basically the same
as the original, which has been reported to fix the problem:

  http://cygwin.com/ml/cygwin/2013-11/msg00158.html


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat


pgpuvLOee2Q9D.pgp
Description: PGP signature


[PATCH] Fix PR59047

2013-11-08 Thread Richard Biener

The following fixes PR59047 - data-ref and bitfields.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

Index: gcc/tree-predcom.c
===
*** gcc/tree-predcom.c  (revision 204561)
--- gcc/tree-predcom.c  (working copy)
*** ref_at_iteration (data_reference_p dr, i
*** 1353,1362 
tree addr = fold_build_pointer_plus (DR_BASE_ADDRESS (dr), off);
addr = force_gimple_operand_1 (addr, stmts, is_gimple_mem_ref_addr,
 NULL_TREE);
!   return fold_build2 (MEM_REF, TREE_TYPE (DR_REF (dr)),
! addr,
! fold_convert (reference_alias_ptr_type (DR_REF (dr)),
!   coff));
  }
  
  /* Get the initialization expression for the INDEX-th temporary variable
--- 1353,1376 
tree addr = fold_build_pointer_plus (DR_BASE_ADDRESS (dr), off);
addr = force_gimple_operand_1 (addr, stmts, is_gimple_mem_ref_addr,
 NULL_TREE);
!   tree alias_ptr = fold_convert (reference_alias_ptr_type (DR_REF (dr)), 
coff);
!   /* While data-ref analysis punts on bit offsets it still handles
!  bitfield accesses at byte boundaries.  Cope with that.  Note that
!  we cannot simply re-apply the outer COMPONENT_REF because the
!  byte-granular portion of it is already applied via DR_INIT and
!  DR_OFFSET, so simply build a BIT_FIELD_REF knowing that the bits
!  start at offset zero.  */
!   if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF
!   && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1)))
! {
!   tree field = TREE_OPERAND (DR_REF (dr), 1);
!   return build3 (BIT_FIELD_REF, TREE_TYPE (DR_REF (dr)),
!build2 (MEM_REF, DECL_BIT_FIELD_TYPE (field),
!addr, alias_ptr),
!DECL_SIZE (field), bitsize_zero_node);
! }
!   else
! return fold_build2 (MEM_REF, TREE_TYPE (DR_REF (dr)), addr, alias_ptr);
  }
  
  /* Get the initialization expression for the INDEX-th temporary variable
Index: gcc/testsuite/gcc.dg/torture/pr59047.c
===
*** gcc/testsuite/gcc.dg/torture/pr59047.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr59047.c  (working copy)
***
*** 0 
--- 1,39 
+ /* { dg-do run } */
+ 
+ extern void abort (void);
+ 
+ struct
+ {
+   int f0;
+   int f1:1;
+   int f2:2;
+ } a = {0, 0, 1};
+ 
+ int b, c, *d, e, f;
+ 
+ int
+ fn1 ()
+ {
+   for (; b < 1; ++b)
+ {
+   for (e = 0; e < 1; e = 1)
+   {
+ int **g = &d;
+ *g = &c;
+   } 
+   *d = 0;
+   f = a.f1;
+   if (f)
+   return 0;
+ }
+   return 0;
+ }
+ 
+ int
+ main ()
+ {
+   fn1 ();
+   if (b != 1)
+ abort ();
+   return 0;
+ }


Re: Implement C11 _Atomic

2013-11-08 Thread Dominique Dhumieres
The tests introduced in revision 204544 fail with -m32
(see http://gcc.gnu.org/ml/gcc-regression/2013-11/msg00213.html
or http://gcc.gnu.org/ml/gcc-testresults/2013-11/msg00526.html ).
This revision may also causes the failures of
gfortran.dg/typebound_operator_9.f03 and 
FAIL: gfortran.fortran-torture/execute/forall_1.f90.

On x86_64-apple-darwin* typebound_operator_9.f03 gives the following ICE with 
-O3

/opt/gcc/work/gcc/testsuite/gfortran.dg/typebound_operator_9.f03: In function 
'nabla2_cart2d':
/opt/gcc/work/gcc/testsuite/gfortran.dg/typebound_operator_9.f03:272:0: 
internal compiler error: tree check: expected integer_cst, have plus_expr in 
tree_int_cst_lt, at tree.c:7083
   function nabla2_cart2d (obj)

and the failures of c11-atomic-exec-*.exe are of the kind

Executing on host: /opt/gcc/build_w/gcc/xgcc -B/opt/gcc/build_w/gcc/ 
/opt/gcc/work/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-4.c   
-B/opt/gcc/build_w/x86_64-apple-darwin13.0.0/i386/libatomic/  
-L/opt/gcc/build_w/x86_64-apple-darwin13.0.0/i386/libatomic/.libs -latomic  
-fno-diagnostics-show-caret -fdiagnostics-color=never   -Os  -std=c11 
-pedantic-errors -pthread -D_POSIX_C_SOURCE=200809L  -lm   -m32 -o 
./c11-atomic-exec-4.exe(timeout = 300)
spawn /opt/gcc/build_w/gcc/xgcc -B/opt/gcc/build_w/gcc/ 
/opt/gcc/work/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-4.c 
-B/opt/gcc/build_w/x86_64-apple-darwin13.0.0/i386/libatomic/ 
-L/opt/gcc/build_w/x86_64-apple-darwin13.0.0/i386/libatomic/.libs -latomic 
-fno-diagnostics-show-caret -fdiagnostics-color=never -Os -std=c11 
-pedantic-errors -pthread -D_POSIX_C_SOURCE=200809L -lm -m32 -o 
./c11-atomic-exec-4.exe
Undefined symbols for architecture i386:
  "___atomic_compare_exchange_16", referenced from:
  _test_main_long_double_add in ccBbKO9A.o
  _test_thread_long_double_add in ccBbKO9A.o
  _test_main_complex_double_add in ccBbKO9A.o
  _test_thread_complex_double_add in ccBbKO9A.o
  _test_main_long_double_postinc in ccBbKO9A.o
  _test_thread_long_double_postinc in ccBbKO9A.o
  _test_main_long_double_preinc in ccBbKO9A.o
  ...^M
  "___atomic_load_16", referenced from:
  _test_main_long_double_add in ccBbKO9A.o
  _test_thread_long_double_add in ccBbKO9A.o
  _test_main_complex_double_add in ccBbKO9A.o
  _test_thread_complex_double_add in ccBbKO9A.o
  _test_main_long_double_postinc in ccBbKO9A.o
  _test_thread_long_double_postinc in ccBbKO9A.o
  _test_main_long_double_preinc in ccBbKO9A.o
  ...^M
ld: symbol(s) not found for architecture i386

Dominique



Re: Implement C11 _Atomic

2013-11-08 Thread Joseph S. Myers
On Fri, 8 Nov 2013, Uros Bizjak wrote:

> Can we introduce a target-dependant source here, in the same way as

Sure, that seems a reasonable thing to do.  I think putting a file fenv.c 
in an appropriate subdirectory of libatomic/config will result in it being 
found automatically by the existing search path logic, but you'll need to 
test that.

The present code essentially follows what glibc's feraiseexcept does for 
lots of architectures, but with generic C code where the glibc code tends 
to use asms to control the exact instructions used (and thereby avoid the 
need for volatile, I suppose).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Implement C11 _Atomic

2013-11-08 Thread Joseph S. Myers
On Fri, 8 Nov 2013, Dominique Dhumieres wrote:

> This revision may also causes the failures of
> gfortran.dg/typebound_operator_9.f03 and 
> FAIL: gfortran.fortran-torture/execute/forall_1.f90.

I doubt the patch affects Fortran (there are language-independent changes, 
but they are fairly small and shouldn't affect code not using _Atomic 
qualifiers).

> Undefined symbols for architecture i386:
>   "___atomic_compare_exchange_16", referenced from:

>   "___atomic_load_16", referenced from:

Either those functions should be in libatomic (using locking where 
necessary), or the built-in functions should be expanding to use the 
size-generic libatomic function rather than the _16 one.  Richard or 
Andrew should be able to advise on which the design was for systems that 
don't have lock-free atomics for all of the standard sizes (1, 2, 4, 8, 
16).

My guess is that it would be hard to provide the _16 functions in 
libatomic when TImode types aren't supported, and so the bug is that _16 
function calls are generated when not supported.  That is, c-common.c 
functions resolve_overloaded_atomic_exchange, 
resolve_overloaded_atomic_compare_exchange, 
resolve_overloaded_atomic_load, resolve_overloaded_atomic_store, where 
they check n != 16, should do something like (n != 16 || 
!targetm.scalar_mode_supported_p (TImode)).  (Better, refactor the (n != 1 
&& n != 2 && n != 4 && n != 8 && n != 16) into an atomic_size_supported_p 
or similar function.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Reducing number of alias checks in vectorization.

2013-11-08 Thread Dominique Dhumieres
According to http://gcc.gnu.org/ml/gcc-regression/2013-11/msg00197.html
revision 204538 is breaking several tests. On x86_64-apple-darwin* the
failures I have looked at are of the kind

/opt/gcc/work/gcc/testsuite/gfortran.dg/typebound_operator_9.f03: In function 
'nabla2_cart2d':
/opt/gcc/work/gcc/testsuite/gfortran.dg/typebound_operator_9.f03:272:0: 
internal compiler error: tree check: expected integer_cst, have plus_expr in 
tree_int_cst_lt, at tree.c:7083
   function nabla2_cart2d (obj)

TIA

Dominique


Re: Implement C11 _Atomic

2013-11-08 Thread Dominique Dhumieres
> I doubt the patch affects Fortran (there are language-independent changes,
> but they are fairly small and shouldn't affect code not using _Atomic 
> qualifiers).

The Fortran failures seem related to revision 204538.

Dominique


[PATCH][ARM] Add break in handling of comparisons in rtx costs function

2013-11-08 Thread Kyrill Tkachov

Hi all,

In arm_new_rtx_costs we need a break; statement after handling the comparisons 
cases. Otherwise we fall through and compute garbage. This small patch adds that.


Tested arm-none-eabi on qemu.

Ok for trunk?

Thanks,
Kyrill


2013-11-08  Kyrylo Tkachov  

* config/arm/arm.c (arm_new_rtx_costs): Break after handling
comparisons.diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index ed57134..242fa09 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -10148,6 +10148,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	  *cost = 0;
 	  return true;
 	}
+  break;
 
 case ABS:
   if (TARGET_HARD_FLOAT && GET_MODE_CLASS (mode) == MODE_FLOAT

[PATCH GCC]Refactor force_expr_to_var_cost and handle type conversion

2013-11-08 Thread bin.cheng
Hi,
This patch refactors force_expr_to_var_cost and handles type conversion
along with other tree nodes.  It is split from the patch posted at
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00546.html
Bootstrap and test with the patch lowering address expressions on
x86/x86_64/arm.  Is it OK?

Thanks,
bin

2013-11-08  Bin Cheng  

* tree-ssa-loop-ivopts.c (force_expr_to_var_cost): Refactor the
code.
Handle type conversion.





Re: [PATCH GCC]Refactor force_expr_to_var_cost and handle type conversion

2013-11-08 Thread Richard Biener
On Fri, Nov 8, 2013 at 2:41 PM, bin.cheng  wrote:
> Hi,
> This patch refactors force_expr_to_var_cost and handles type conversion
> along with other tree nodes.  It is split from the patch posted at
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00546.html
> Bootstrap and test with the patch lowering address expressions on
> x86/x86_64/arm.  Is it OK?

ENOPATCH

> Thanks,
> bin
>
> 2013-11-08  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (force_expr_to_var_cost): Refactor the
> code.
> Handle type conversion.
>
>
>


[PATCH] Vectorization using elemental functions

2013-11-08 Thread Jakub Jelinek
Hi!

Here is an updated version of the patch I've posted yesterday.
The changes since then are that the expander can now handle the CONSTRUCTORs
this patch creates (although we probably want to add some vec_concat
optab and at least improve handling of concatenation of two half sized
vectors into one larger one (say concatenate V4SImode and V4SImode into
V8SImode, etc.)), and allows vectorization of non-const elemental function
calls (including calls that have no lhs) in #pragma {,omp }simd loops.

Does this look good for gomp-4_0-branch?

2013-11-07  Jakub Jelinek  

* tree-vectorizer.h (enum stmt_vec_info_type): Add
call_simd_clone_vec_info_type.
* expr.c (store_constructor): Allow CONSTRUCTOR with VECTOR_TYPE
(same sized) elements even if the type of the CONSTRUCTOR has
vector mode and target is a REG.
* tree-vect-data-refs.c: Include cgraph.h.
(vect_analyze_data_refs): Inline by hand find_data_references_in_loop
and find_data_references_in_bb, if find_data_references_in_stmt
fails, still allow calls to #pragma omp declare simd functions
in #pragma omp simd loops unless they contain data references among
the call arguments or in lhs.
* tree-vect-loop.c (vect_determine_vectorization_factor): If a call
doesn't have lhs, set STMT_VINFO_VECTYPE to vector type corresponding
to any of the argument types and exclude it from adjustments of the
vectorization factor.
* tree-vect-stmts.c: Include tree-ssa-loop.h and
tree-scalar-evolution.h.
(vectorizable_function): Don't handle functions with simd clones here.
(vectorizable_call): Nor here.  Return early if call doesn't have lhs.
(struct simd_call_arg_info): New type.
(vectorizable_simd_clone_call): New function.
(vect_analyze_stmt, vect_transform_stmt): Call it.

--- gcc/tree-vectorizer.h.jj2013-11-07 12:34:50.047501234 +0100
+++ gcc/tree-vectorizer.h   2013-11-07 12:37:17.742708618 +0100
@@ -416,6 +416,7 @@ enum stmt_vec_info_type {
   shift_vec_info_type,
   op_vec_info_type,
   call_vec_info_type,
+  call_simd_clone_vec_info_type,
   assignment_vec_info_type,
   condition_vec_info_type,
   reduc_vec_info_type,
--- gcc/expr.c.jj   2013-11-01 14:37:33.0 +0100
+++ gcc/expr.c  2013-11-08 10:10:14.469321209 +0100
@@ -6199,6 +6199,18 @@ store_constructor (tree exp, rtx target,
enum machine_mode mode = GET_MODE (target);
 
icode = (int) optab_handler (vec_init_optab, mode);
+   /* Don't use vec_init if some elements have VECTOR_TYPE.  */
+   if (icode != CODE_FOR_nothing)
+ {
+   tree value;
+
+   FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (exp), idx, value)
+ if (TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE)
+   {
+ icode = CODE_FOR_nothing;
+ break;
+   }
+ }
if (icode != CODE_FOR_nothing)
  {
unsigned int i;
@@ -6276,8 +6288,8 @@ store_constructor (tree exp, rtx target,
 
if (vector)
  {
-   /* Vector CONSTRUCTORs should only be built from smaller
-  vectors in the case of BLKmode vectors.  */
+   /* vec_init should not be used if there are VECTOR_TYPE
+  elements.  */
gcc_assert (TREE_CODE (TREE_TYPE (value)) != VECTOR_TYPE);
RTVEC_ELT (vector, eltpos)
  = expand_normal (value);
--- gcc/tree-vect-data-refs.c.jj2013-11-01 14:38:43.0 +0100
+++ gcc/tree-vect-data-refs.c   2013-11-08 14:44:33.634199598 +0100
@@ -44,6 +44,7 @@ along with GCC; see the file COPYING3.
 #include "tree-scalar-evolution.h"
 #include "tree-vectorizer.h"
 #include "diagnostic-core.h"
+#include "cgraph.h"
 /* Need to include rtl.h, expr.h, etc. for optabs.  */
 #include "expr.h"
 #include "optabs.h"
@@ -2959,10 +2960,11 @@ vect_analyze_data_refs (loop_vec_info lo
 
   if (loop_vinfo)
 {
+  basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
+
   loop = LOOP_VINFO_LOOP (loop_vinfo);
-  if (!find_loop_nest (loop, &LOOP_VINFO_LOOP_NEST (loop_vinfo))
- || find_data_references_in_loop
-  (loop, &LOOP_VINFO_DATAREFS (loop_vinfo)))
+  datarefs = LOOP_VINFO_DATAREFS (loop_vinfo);
+  if (!find_loop_nest (loop, &LOOP_VINFO_LOOP_NEST (loop_vinfo)))
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -2971,7 +2973,57 @@ vect_analyze_data_refs (loop_vec_info lo
  return false;
}
 
-  datarefs = LOOP_VINFO_DATAREFS (loop_vinfo);
+  for (i = 0; i < loop->num_nodes; i++)
+   {
+ gimple_stmt_iterator gsi;
+
+ for (gsi = gsi_start_bb (bbs[i]); !gsi_end_p (gsi); gsi_next (&gsi))
+   {
+ gimple stmt = 

Re: [PATCH] Add gimple subclasses for every gimple code (was Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3))

2013-11-08 Thread Michael Matz
Hi,

On Thu, 7 Nov 2013, Alec Teal wrote:

> > > subclass, by adding empty subclasses derived from the GSS_-based
> > > subclasses as appropriate (I don't bother for gimple codes that already
> > > have their own subclass due to having their own GSS layout).  I also
> > > copied the comments from gimple.def into gimple.h, so that Doxygen picks
> > > up on the descriptions and uses them to describe each subclass.
> > I don't like that.  The empty classes are just useless, they imply a
> > structure that isn't really there, some of the separate gimple codes are
> > basically selectors of specific subtypes of a generic concept, without
> > additional data or methods; creating a type for those is confusing.
> > 
> > Generally I don't like complicating the type system without good reasons
> > (as in actually also making use of the complicated types).  The fewer
> > types the better IMO.
> How would you do this?

How would I do what?  Making use of new types?  Well, like the opportunity 
that David already identified in tree-switching, I'd introduce the new 
type together with a patch to that pass to make use of it.  Further I'd 
try to think about how that new type represents a concept, how that would 
be generalized, and how it would fit in a really designed type hierarchy 
of gimple statements.  Then I would try to think about interaction of 
those new types with others we already have, in particular containers 
(should those become specialized as well?  I would say no from my gut, but 
that question does need some pondering).

> The types are different even if there is no actual difference, much like 
> there are different gimple codes in the first place. He is trying to 
> create a bijection between what we did have and what we now have

Yes, I understood that, and that's IMO the wrong way to introduce new 
types.  It really deserves some more than "we have these dozen gimple 
codes, let's create a dozen new types".  So, while a bijection between 
codes and types is simple to create, I'm pretty sure that's not the best 
mapping.

> so nothing breaks.

Currently nothing is broken, so that's certainly not the reason to 
introduce new types.


Ciao,
Michael.


Re: [PATCH] Vectorization using elemental functions

2013-11-08 Thread Aldy Hernandez

On 11/08/13 07:10, Jakub Jelinek wrote:

Hi!

Here is an updated version of the patch I've posted yesterday.
The changes since then are that the expander can now handle the CONSTRUCTORs
this patch creates (although we probably want to add some vec_concat
optab and at least improve handling of concatenation of two half sized
vectors into one larger one (say concatenate V4SImode and V4SImode into
V8SImode, etc.)), and allows vectorization of non-const elemental function
calls (including calls that have no lhs) in #pragma {,omp }simd loops.


Thanks for working on this.

BTW, could you add some comments to your changes to 
vect_analyze_data_refs?  Actually, the entire function needs comments 
throughout, but that's not your fault :).


Aldy


[GOOGLE] tune loop scale factor for AutoFDO

2013-11-08 Thread Dehao Chen
AutoFDO sometimes has 0 profile in the loop's entry block because the
debug info are lost and unrecoverable.

E.g.

if (a)
  if (b)
for () {}

This patch checks if the scale factor is 0, then use the normal scale.

Bootstrapped and passed regression test and performance test.

OK for google-4_8?

Thanks,
Dehao

Index: gcc/cfgloopmanip.c
===
--- gcc/cfgloopmanip.c (revision 204568)
+++ gcc/cfgloopmanip.c (working copy)
@@ -1262,6 +1262,8 @@ duplicate_loop_to_header_edge (struct loop *loop,
  }
   for (i = 0; i < ndupl; i++)
  gcc_assert (scale_step[i] >= 0 && scale_step[i] <= REG_BR_PROB_BASE);
+  if (flag_auto_profile && scale_main == 0)
+ scale_main = REG_BR_PROB_BASE;
   gcc_assert (scale_main >= 0 && scale_main <= REG_BR_PROB_BASE
&& scale_act >= 0  && scale_act <= REG_BR_PROB_BASE);
 }


[patch 1/4] std::regex refactoring

2013-11-08 Thread Jonathan Wakely
This creates new base classes for the parts of _State and _NFA which
are not dependent on template parameters, and replaces some copies
with moves.

2013-11-08  Jonathan Wakely  

* include/bits/regex_automaton.h (__detail::_State): Split
non-dependent parts into new _State_base.
(__detail::_NFA): Likewise for _NFA_base. Use std::move() to avoid
copies when inserting _MatcherT and _StateT objects.
* include/bits/regex_automaton.tcc: Move member definitions to base
class. Qualify dependent names.
* include/bits/regex_compiler.h (__detail::_Compiler::_M_get_nfa): Make
non-const and use std::move to avoid copying.
* include/bits/regex_compiler.tcc: Likewise.
* include/bits/regex_executor.h (__detail::_Executor::_M_is_word): Use
array, so past-the-end iterator is valid.

Tested x86_64-linux, committed to trunk.
commit 3404a066d12f458e2b441798dbd70dfa21ef537d
Author: Jonathan Wakely 
Date:   Wed Nov 6 09:09:35 2013 +

* include/bits/regex_automaton.h (__detail::_State): Split
non-dependent parts into new _State_base.
(__detail::_NFA): Likewise for _NFA_base. Use std::move() to avoid
copies when inserting _MatcherT and _StateT objects.
* include/bits/regex_automaton.tcc: Move member definitions to base
class. Qualify dependent names.
* include/bits/regex_compiler.h (__detail::_Compiler::_M_get_nfa): Make
non-const and use std::move to avoid copying.
* include/bits/regex_compiler.tcc: Likewise.
* include/bits/regex_executor.h (__detail::_Executor::_M_is_word): Use
array, so past-the-end iterator is valid.

diff --git a/libstdc++-v3/include/bits/regex_automaton.h 
b/libstdc++-v3/include/bits/regex_automaton.h
index e630512..ded3716 100644
--- a/libstdc++-v3/include/bits/regex_automaton.h
+++ b/libstdc++-v3/include/bits/regex_automaton.h
@@ -65,81 +65,114 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _S_opcode_accept,
   };
 
-  template
-class _State
+  struct _State_base
+  {
+_Opcode  _M_opcode;   // type of outgoing transition
+_StateIdT_M_next; // outgoing transition
+union // Since they are mutually exclusive.
 {
-public:
-  typedef _Matcher<_CharT>   _MatcherT;
-
-  _Opcode  _M_opcode;   // type of outgoing transition
-  _StateIdT_M_next; // outgoing transition
-  union // Since they are mutually exclusive.
+  size_t _M_subexpr;// for _S_opcode_subexpr_*
+  size_t _M_backref_index;  // for _S_opcode_backref
+  struct
   {
-   size_t _M_subexpr;// for _S_opcode_subexpr_*
-   size_t _M_backref_index;  // for _S_opcode_backref
-   struct
-   {
- // for _S_opcode_alternative.
- _StateIdT  _M_quant_index;
- // for _S_opcode_alternative or _S_opcode_subexpr_lookahead
- _StateIdT  _M_alt;
- // for _S_opcode_word_boundary or _S_opcode_subexpr_lookahead or
- // quantifiers(ungreedy if set true)
- bool   _M_neg;
-   };
+   // for _S_opcode_alternative.
+   _StateIdT  _M_quant_index;
+   // for _S_opcode_alternative or _S_opcode_subexpr_lookahead
+   _StateIdT  _M_alt;
+   // for _S_opcode_word_boundary or _S_opcode_subexpr_lookahead or
+   // quantifiers (ungreedy if set true)
+   bool   _M_neg;
   };
-  _MatcherT  _M_matches;// for _S_opcode_match
+};
 
-  explicit _State(_Opcode  __opcode)
-  : _M_opcode(__opcode), _M_next(_S_invalid_state_id)
-  { }
+explicit _State_base(_Opcode __opcode)
+: _M_opcode(__opcode), _M_next(_S_invalid_state_id)
+{ }
+
+  protected:
+~_State_base() = default;
 
+  public:
 #ifdef _GLIBCXX_DEBUG
-  std::ostream&
-  _M_print(std::ostream& ostr) const;
+std::ostream&
+_M_print(std::ostream& ostr) const;
 
-  // Prints graphviz dot commands for state.
-  std::ostream&
-  _M_dot(std::ostream& __ostr, _StateIdT __id) const;
+// Prints graphviz dot commands for state.
+std::ostream&
+_M_dot(std::ostream& __ostr, _StateIdT __id) const;
 #endif
-};
+  };
 
   template
-class _NFA
-: public std::vector<_State<_CharT, _TraitsT>>
+struct _State : _State_base
 {
-public:
-  typedef _State<_CharT, _TraitsT>_StateT;
-  typedef const _Matcher<_CharT>& _MatcherT;
-  typedef size_t  _SizeT;
-  typedef regex_constants::syntax_option_type _FlagT;
-
-  _NFA(_FlagT __f)
-  : _M_flags(__f), _M_start_state(0), _M_subexpr_count(0),
-  _M_quant_count(0), _M_has_backref(false)
-  { }
+  typedef _Matcher<_CharT>   _MatcherT;
 
-  _FlagT
-  _M_options() const
-  { return _M_flags; }
+  _MatcherT  _M_matches;// for _S_opcode_match
 
-  _StateIdT
-  _M_star

[patch 3/4] std::regex refactoring

2013-11-08 Thread Jonathan Wakely
This removes some more redundant _CharT template parameters.

2013-11-08  Jonathan Wakely  

* include/bits/regex_compiler.h (__detail::_AnyMatcher,
__detail::_CharMatcher, __detail::_BracketMatcher): Remove redundant
_CharT template parameters.
* include/bits/regex_compiler.tcc: Likewise.

Tested x86_64-linux, committed to trunk.
commit 1202237bdd397d1d3104bf1125e72f16d443d134
Author: Jonathan Wakely 
Date:   Fri Nov 8 12:45:26 2013 +

* include/bits/regex_compiler.h (__detail::_AnyMatcher,
__detail::_CharMatcher, __detail::_BracketMatcher): Remove redundant
_CharT template parameters.
* include/bits/regex_compiler.tcc: Likewise.

diff --git a/libstdc++-v3/include/bits/regex_compiler.h 
b/libstdc++-v3/include/bits/regex_compiler.h
index fef8862..406d9a9 100644
--- a/libstdc++-v3/include/bits/regex_compiler.h
+++ b/libstdc++-v3/include/bits/regex_compiler.h
@@ -39,7 +39,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* @{
*/
 
-  template
+  template
 struct _BracketMatcher;
 
   /// Builds an NFA from an input iterator interval.
@@ -59,13 +59,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { return make_shared<_RegexT>(std::move(_M_nfa)); }
 
 private:
-  typedef typename _TraitsT::char_type   _CharT;
   typedef _Scanner<_FwdIter>  _ScannerT;
   typedef typename _ScannerT::_TokenT _TokenT;
   typedef _StateSeq<_TraitsT>_StateSeqT;
   typedef std::stack<_StateSeqT, std::vector<_StateSeqT>> _StackT;
-  typedef _BracketMatcher<_CharT, _TraitsT>   _BMatcherT;
-  typedef std::ctype<_CharT>  _CtypeT;
+  typedef _BracketMatcher<_TraitsT>  
_BMatcherT;
+  typedef std::ctype_CtypeT;
 
   // accepts a specific token or returns false.
   bool
@@ -139,9 +138,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return _Cmplr(__first, __last, __traits, __flags)._M_get_nfa();
 }
 
-  template
+  template
 struct _AnyMatcher
 {
+  typedef typename _TraitsT::char_type   _CharT;
+
   explicit
   _AnyMatcher(const _TraitsT& __traits)
   : _M_traits(__traits)
@@ -159,9 +160,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   const _TraitsT& _M_traits;
 };
 
-  template
+  template
 struct _CharMatcher
 {
+  typedef typename _TraitsT::char_type   _CharT;
   typedef regex_constants::syntax_option_type _FlagT;
 
   explicit
@@ -188,9 +190,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
   /// Matches a character range (bracket expression)
-  template
+  template
 struct _BracketMatcher
 {
+  typedef typename _TraitsT::char_type   _CharT;
   typedef typename _TraitsT::char_class_type  _CharClassT;
   typedef typename _TraitsT::string_type  _StringT;
   typedef regex_constants::syntax_option_type _FlagT;
diff --git a/libstdc++-v3/include/bits/regex_compiler.tcc 
b/libstdc++-v3/include/bits/regex_compiler.tcc
index 49c32b8..f89498f 100644
--- a/libstdc++-v3/include/bits/regex_compiler.tcc
+++ b/libstdc++-v3/include/bits/regex_compiler.tcc
@@ -286,11 +286,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   if (_M_match_token(_ScannerT::_S_token_anychar))
_M_stack.push(_StateSeqT(_M_nfa,
_M_nfa._M_insert_matcher
-   (_AnyMatcher<_CharT, _TraitsT>(_M_traits;
+   (_AnyMatcher<_TraitsT>(_M_traits;
   else if (_M_try_char())
_M_stack.push(_StateSeqT(_M_nfa,
 _M_nfa._M_insert_matcher
-(_CharMatcher<_CharT, _TraitsT>(_M_value[0],
+(_CharMatcher<_TraitsT>(_M_value[0],
 _M_traits,
 _M_flags;
   else if (_M_match_token(_ScannerT::_S_token_backref))
@@ -430,9 +430,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return __v;
 }
 
-  template
+  template
 bool
-_BracketMatcher<_CharT, _TraitsT>::operator()(_CharT __ch) const
+_BracketMatcher<_TraitsT>::operator()(_CharT __ch) const
 {
   bool __ret = false;
   if (_M_traits.isctype(__ch, _M_class_set)




[patch 2/4] std::regex refactoring

2013-11-08 Thread Jonathan Wakely
This removes the redundant _CharT template parameters that can be
obtained from the _Traits parameters.  It also adds the __compile_nfa
function to create a _Compiler (deducing its template arguments) and
get the _NFA from it.

2013-11-08  Jonathan Wakely  

* include/bits/regex_automaton.h (__detail::_State, __detail::_NFA,
__detail::_StateSeq): Remove redundant _CharT template parameters.
* include/bits/regex_automaton.tcc: Likewise.
* include/bits/regex_compiler.h (__detail::_Compiler): Likewise.
(__compile_nfa): Add object generator for _Compiler.
* include/bits/regex_compiler.tcc: Remove _CharT template parameters.
* include/bits/regex_executor.h: Likewise.
* include/bits/regex_executor.tcc: Likewise.
* include/bits/regex.h (basic_regex): Assert char_type matches. Use
__compile_nfa object generator. Remove _CharT template parameter.

Tested x86_64-linux, committed to trunk.
commit adf28b3b99b86f1cab7d2f03930eb0f7ede04ed8
Author: Jonathan Wakely 
Date:   Fri Nov 8 12:11:04 2013 +

* include/bits/regex_automaton.h (__detail::_State, __detail::_NFA,
__detail::_StateSeq): Remove redundant _CharT template parameters.
* include/bits/regex_automaton.tcc: Likewise.
* include/bits/regex_compiler.h (__detail::_Compiler): Likewise.
(__compile_nfa): Add object generator for _Compiler.
* include/bits/regex_compiler.tcc: Remove _CharT template parameters.
* include/bits/regex_executor.h: Likewise.
* include/bits/regex_executor.tcc: Likewise.
* include/bits/regex.h (basic_regex): Assert char_type matches. Use
__compile_nfa object generator. Remove _CharT template parameter.

diff --git a/libstdc++-v3/include/bits/regex.h 
b/libstdc++-v3/include/bits/regex.h
index b1bda46..84b8cf1 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -377,10 +377,13 @@ _GLIBCXX_END_NAMESPACE_VERSION
* Storage for the regular expression is allocated and deallocated as
* necessary by the member functions of this class.
*/
-  template >
+  template>
 class basic_regex
 {
 public:
+  static_assert(is_same<_Ch_type, typename _Rx_traits::char_type>::value,
+   "regex traits class must have the same char_type");
+
   // types:
   typedef _Ch_typevalue_type;
   typedef _Rx_traits  traits_type;
@@ -498,8 +501,8 @@ _GLIBCXX_END_NAMESPACE_VERSION
basic_regex(_FwdIter __first, _FwdIter __last,
flag_type __f = ECMAScript)
: _M_flags(__f),
- _M_automaton(__detail::_Compiler<_FwdIter, _Ch_type, _Rx_traits>
-  (__first, __last, _M_traits, _M_flags)._M_get_nfa())
+ _M_automaton(__detail::__compile_nfa(__first, __last, _M_traits,
+  _M_flags))
{ }
 
   /**
@@ -634,9 +637,8 @@ _GLIBCXX_END_NAMESPACE_VERSION
   flag_type __flags = ECMAScript)
{
  _M_flags = __flags;
- _M_automaton =
-   __detail::_Compiler
-   (__s.begin(), __s.end(), _M_traits, _M_flags)._M_get_nfa();
+ _M_automaton = __detail::__compile_nfa(__s.begin(), __s.end(),
+_M_traits, _M_flags);
  return *this;
}
 
@@ -730,8 +732,7 @@ _GLIBCXX_END_NAMESPACE_VERSION
 #endif
 
 protected:
-  typedef std::shared_ptr<__detail::_NFA<_Ch_type, _Rx_traits>>
-   _AutomatonPtr;
+  typedef std::shared_ptr<__detail::_NFA<_Rx_traits>> _AutomatonPtr;
 
   template
diff --git a/libstdc++-v3/include/bits/regex_automaton.h 
b/libstdc++-v3/include/bits/regex_automaton.h
index ded3716..1be51221 100644
--- a/libstdc++-v3/include/bits/regex_automaton.h
+++ b/libstdc++-v3/include/bits/regex_automaton.h
@@ -103,10 +103,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
   };
 
-  template
+  template
 struct _State : _State_base
 {
-  typedef _Matcher<_CharT>   _MatcherT;
+  typedef _Matcher _MatcherT;
 
   _MatcherT  _M_matches;// for _S_opcode_match
 
@@ -155,12 +155,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 bool  _M_has_backref;
   };
 
-  template
+  template
 struct _NFA
-: _NFA_base, std::vector<_State<_CharT, _TraitsT>>
+: _NFA_base, std::vector<_State<_TraitsT>>
 {
-  typedef _State<_CharT, _TraitsT> _StateT;
-  typedef _Matcher<_CharT> _MatcherT;
+  typedef _State<_TraitsT> _StateT;
+  typedef _Matcher   _MatcherT;
 
   using _NFA_base::_NFA_base;
 
@@ -268,11 +268,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// Describes a sequence of one or more %_State, its current start
   /// and end(s).  This structure contains fragments of an NFA during
   /// construction.
-  template
+  template
 class _

RFA: Fix PR middle-end/59049

2013-11-08 Thread Joern Rennecke
bootstrapped / regtested on i686-pc-linux-gnu.
2013-11-08  Joern Rennecke  

PR middle-end/59049
* expmed.c (emit_cstore): Avoid generating a comparison of two
VOIDmode constants.

Index: gcc/expmed.c
===
--- gcc/expmed.c(revision 204568)
+++ gcc/expmed.c(working copy)
@@ -5112,6 +5112,12 @@ emit_cstore (rtx target, enum insn_code
   if (!target)
 target = gen_reg_rtx (target_mode);
 
+  /* If we compare two VOIDmode constants, we loose the mode to do the
+ comparison in.  To avoid that, copy one constant into a register.
+ We rely on subsequent optimizations (like ce1) to clean this up.  */
+  if (GET_MODE (x) == VOIDmode && GET_MODE (y) == VOIDmode)
+x = copy_to_mode_reg (compare_mode, x);
+
   comparison = gen_rtx_fmt_ee (code, result_mode, x, y);
 
   create_output_operand (&ops[0], optimize ? NULL_RTX : target, result_mode);


Re: [patch 4/4] std::regex refactoring

2013-11-08 Thread Daniel Krügler
2013/11/8 Jonathan Wakely :
> As I suggested yesterday on the libstdc++ list, this adds an overload
> for string and vector iterators to extract a raw pointer and re-use
> the _Compiler specialization, so that std::regex(".")
> and std::regex(std::string(".")) and std::regex(std::vector(1,
> '.')) only instantiate _Compiler once.
>
> 2013-11-08  Jonathan Wakely  
>
> * include/bits/regex_compiler.h (__detail::__compile_nfa): Overload
> so that std::basic_string and std::vector iterators dispatch to
> the const C* compiler.

I have fully not grasped for which T the specializations of
__has_contiguous_iter are intended to be used, but given the fact that

+  template
+struct __has_contiguous_iter>
+: std::true_type
+{ };

exists I think you really want to exclude any vector
by additionally adding

+  template
+struct __has_contiguous_iter>
+: std::false_type
+{ };

- Daniel


[patch 4/4] std::regex refactoring

2013-11-08 Thread Jonathan Wakely
As I suggested yesterday on the libstdc++ list, this adds an overload
for string and vector iterators to extract a raw pointer and re-use
the _Compiler specialization, so that std::regex(".")
and std::regex(std::string(".")) and std::regex(std::vector(1,
'.')) only instantiate _Compiler once.

2013-11-08  Jonathan Wakely  

* include/bits/regex_compiler.h (__detail::__compile_nfa): Overload
so that std::basic_string and std::vector iterators dispatch to
the const C* compiler.

Tested x86_64-linux, committed to trunk.
commit 76e9d5349657d9eed04d3429dbf7b6527bf03deb
Author: Jonathan Wakely 
Date:   Fri Nov 8 13:33:06 2013 +

* include/bits/regex_compiler.h (__detail::__compile_nfa): Overload
so that std::basic_string and std::vector iterators dispatch to
the const C* compiler.

diff --git a/libstdc++-v3/include/bits/regex_compiler.h 
b/libstdc++-v3/include/bits/regex_compiler.h
index 406d9a9..741098f 100644
--- a/libstdc++-v3/include/bits/regex_compiler.h
+++ b/libstdc++-v3/include/bits/regex_compiler.h
@@ -129,8 +129,40 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _StackT _M_stack;
 };
 
+  template
+struct __has_contiguous_iter : std::false_type { };
+
+  template
+struct __has_contiguous_iter>
+: std::true_type
+{ };
+
+  template
+struct __has_contiguous_iter>
+: std::true_type
+{ };
+
+  template
+struct __is_contiguous_normal_iter : std::false_type { };
+
+  template
+struct
+__is_contiguous_normal_iter<__gnu_cxx::__normal_iterator<_Tp, _Cont>>
+: __has_contiguous_iter<_Cont>::type
+{ };
+
+  template
+using __enable_if_contiguous_normal_iter
+  = typename enable_if< __is_contiguous_normal_iter<_Iter>::value,
+   std::shared_ptr<_NFA<_TraitsT>> >::type;
+
+  template
+using __disable_if_contiguous_normal_iter
+  = typename enable_if< !__is_contiguous_normal_iter<_Iter>::value,
+   std::shared_ptr<_NFA<_TraitsT>> >::type;
+
   template
-inline std::shared_ptr<_NFA<_TraitsT>>
+inline __disable_if_contiguous_normal_iter<_FwdIter, _TraitsT>
 __compile_nfa(_FwdIter __first, _FwdIter __last, const _TraitsT& __traits,
  regex_constants::syntax_option_type __flags)
 {
@@ -138,6 +170,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return _Cmplr(__first, __last, __traits, __flags)._M_get_nfa();
 }
 
+  template
+inline __enable_if_contiguous_normal_iter<_Iter, _TraitsT>
+__compile_nfa(_Iter __first, _Iter __last, const _TraitsT& __traits,
+ regex_constants::syntax_option_type __flags)
+{
+  size_t __len = __last - __first;
+  const auto* __cfirst = __len ? std::__addressof(*__first) : nullptr;
+  return __compile_nfa(__cfirst, __cfirst + __len, __traits, __flags);
+}
+
   template
 struct _AnyMatcher
 {



Re: RFA: Fix PR middle-end/59049

2013-11-08 Thread Steven Bosscher
On Fri, Nov 8, 2013 at 3:40 PM, Joern Rennecke
 wrote:
> bootstrapped / regtested on i686-pc-linux-gnu.

Not a very elaborate description of the patch, eh? :-)

This is IMHO not OK without at least an explanation of why the
comparison of two const_ints is not folded. Better yet would be to fix
that underlying problem. We should not present such non-sense to the
RTL parts of the middle end.

Also would need a test case.

Ciao!
Steven


Re: RFA: Fix PR middle-end/59049

2013-11-08 Thread Joern Rennecke
On 8 November 2013 14:45, Steven Bosscher  wrote:
> Also would need a test case.

As is mentioned in the PR, we already have a test case, which shows a
regression.


some prep work to make JUMP_TABLE_DATA a non-active_insn_p object

2013-11-08 Thread Steven Bosscher
Hello,

I'd like to make JUMP_TABLE_DATA a non-active insn before the end of
stage1. Most of the work required for this is pretty simple. It
involves finding and fixing the few places where insns are walked
across basic block boundaries and ignoring barriers. Ah, the madness
of that! :-) Fortunately almost no code does this in the shared RTL
middle end, and most targets are also safe.

This is the first patch of what I think will be four to fix those few places.

Bootstrapped&tested on powerpc64-unknown-linux-gnu. Also built SH to be sure.

OK for trunk?

Ciao!
Steven

* cfgrtl.c (can_fallthru): Reorder code to move tablejump check up.
Make that check explicit.  BB_HEAD cannot be NULL, remove check for it.
* haifa-sched.c (ready_remove_first_dispatch): Check INSN_P before
looking at INSN_CODE.
* reload1.c (delete_dead_insn) Do not expect JUMP_TABLE_DATA to be an
active_insn_p object, respect basic block boundaries.
* reorg.c (follow_jumps): Use invariant that JUMP_TABLE_DATA always
follows immediately after the jump table data label.
* config/nds32/nds32.c (nds32_output_casesi_pc_relative): Likewise.
* config/sh/sh.c (barrier_align): Likewise.  Rearrange code such
that JUMP_TABLE_DATA is not expected to be an active_insn_p object.

Index: cfgrtl.c
===
--- cfgrtl.c(revision 204543)
+++ cfgrtl.c(working copy)
@@ -610,7 +610,7 @@ forwarder_block_p (const_basic_block bb)
 }
 
 /* Return nonzero if we can reach target from src by falling through.  */
-/* FIXME: Make this a cfg hook.  */
+/* FIXME: Make this a cfg hook, the result is only valid in legacy cfgrtl 
mode.  */
 
 bool
 can_fallthru (basic_block src, basic_block target)
@@ -623,17 +623,21 @@ can_fallthru (basic_block src, basic_block target)
   if (target == EXIT_BLOCK_PTR)
 return true;
   if (src->next_bb != target)
-return 0;
+return false;
+
+  /* ??? Later we may add code to move jump tables offline.  */
+  if (tablejump_p (insn, NULL, NULL))
+return false;
+
   FOR_EACH_EDGE (e, ei, src->succs)
 if (e->dest == EXIT_BLOCK_PTR
&& e->flags & EDGE_FALLTHRU)
-  return 0;
+  return false;
 
   insn2 = BB_HEAD (target);
-  if (insn2 && !active_insn_p (insn2))
+  if (!active_insn_p (insn2))
 insn2 = next_active_insn (insn2);
 
-  /* ??? Later we may add code to move jump tables offline.  */
   return next_active_insn (insn) == insn2;
 }
 
Index: haifa-sched.c
===
--- haifa-sched.c   (revision 204543)
+++ haifa-sched.c   (working copy)
@@ -8589,8 +8589,8 @@ ready_remove_first_dispatch (struct ready_list *re
   rtx insn = ready_element (ready, 0);
 
   if (ready->n_ready == 1
+  || !INSN_P (insn)
   || INSN_CODE (insn) < 0
-  || !INSN_P (insn)
   || !active_insn_p (insn)
   || targetm.sched.dispatch (insn, FITS_DISPATCH_WINDOW))
 return ready_remove_first (ready);
@@ -8599,8 +8599,8 @@ ready_remove_first_dispatch (struct ready_list *re
 {
   insn = ready_element (ready, i);
 
-  if (INSN_CODE (insn) < 0
- || !INSN_P (insn)
+  if (!INSN_P (insn)
+ || INSN_CODE (insn) < 0
  || !active_insn_p (insn))
continue;
 
@@ -8619,8 +8619,8 @@ ready_remove_first_dispatch (struct ready_list *re
 {
   insn = ready_element (ready, i);
 
-  if (INSN_CODE (insn) < 0
- || !INSN_P (insn)
+  if (! INSN_P (insn)
+ || INSN_CODE (insn) < 0
  || !active_insn_p (insn))
continue;
 
Index: reload1.c
===
--- reload1.c   (revision 204543)
+++ reload1.c   (working copy)
@@ -2123,7 +2123,8 @@ delete_dead_insn (rtx insn)
  block local equivalences.  Instead of trying to figure out the exact
  circumstances where we can delete the potentially dead insns, just
  let DCE do the job.  */
-  if (prev && GET_CODE (PATTERN (prev)) == SET
+  if (prev && BLOCK_FOR_INSN (prev) == BLOCK_FOR_INSN (insn)
+  && GET_CODE (PATTERN (prev)) == SET
   && (prev_dest = SET_DEST (PATTERN (prev)), REG_P (prev_dest))
   && reg_mentioned_p (prev_dest, PATTERN (insn))
   && find_regno_note (insn, REG_DEAD, REGNO (prev_dest))
Index: reorg.c
===
--- reorg.c (revision 204543)
+++ reorg.c (working copy)
@@ -2302,15 +2302,16 @@ follow_jumps (rtx label, rtx jump, bool *crossing)
depth++)
 {
   rtx this_label = JUMP_LABEL (insn);
-  rtx tem;
 
   /* If we have found a cycle, make the insn jump to itself.  */
   if (this_label == label)
return label;
+
+  /* Cannot follow returns and cannot look through tablejumps.  */
   if (ANY_RETURN_P (this_label))
return this_label;
-  tem = next_active_insn (this_label);
-  if 

Re: RFA: Fix PR middle-end/59049

2013-11-08 Thread Joern Rennecke
On 8 November 2013 14:45, Steven Bosscher  wrote:
> This is IMHO not OK without at least an explanation of why the
> comparison of two const_ints is not folded. Better yet would be to fix
> that underlying problem. We should not present such non-sense to the
> RTL parts of the middle end.

Which part would be responsible to folding the comparison at -O1 ?
FWIW, as I just commented in the PR, one of the operands is an ssa_name
with a known value.


Re: [PATCH] make has_gate and has_execute useless

2013-11-08 Thread Trevor Saunders
On Fri, Nov 08, 2013 at 10:37:00AM +0100, Richard Biener wrote:
> On Thu, Nov 7, 2013 at 5:00 PM,   wrote:
> > From: Trevor Saunders 
> >
> > Hi,
> >
> >  This is the result of seeing what it would take to get rid of the has_gate 
> > and
> > has_execute flags on pass_data.  It turns out not much, but I wanted
> > confirmation this part is ok before I go deal with all the places that
> > initialize the fields.
> >
> > I bootstrapped this part on x86_64-unknown-linux-gnu and saw no new test 
> > suite
> > regressions (ignoring the silk stuff because the full paths in its test 
> > names
> > break my test script for now) Any reason this patch with the actual removal 
> > of the flags wouldn't be ok?
> 
> The has_gate flag is easy to remove (without a TODO_ hack), right?

yes

> No gate simply means that the gate returns always true.  The only
> weird thing is
> 
>   /* If we have a gate, combine the properties that we could have with
>  and without the pass being examined.  */
>   if (pass->has_gate)
> properties &= new_properties;
>   else
> properties = new_properties;
> 
> which I don't understand (and you just removed all properties handling there).

yes, because it was pretty obviously doing nothing useful, but I'm not sure
what it was trying to do.

> So can you split out removing has_gate?  This part is obviously ok.

 ugh, already wrote / sent the patch that got rid of setting both flags.

> Then, for ->execute () I'd have refactored the code to make
> ->sub passes explicitely executed by the default ->execute ()
> implementation only.  That is, passes without overriding execute
> are containers only.  Can you quickly check whether that would
> work out?

It seems nice and I'll try it.  A quick look makes me worry a little
about what execute_ipa_pass_list is doing with passes->sub, but I
suspect that's the wrong place for whatever it is, and maybe it isn't
actually an issue.

Trev

> 
> Thanks,
> Richard.
> 
> > Trev
> >
> > 2013-11-06  Trevor Saunders  
> >
> > * pass_manager.h (pass_manager): Adjust.
> > * passes.c (opt_pass::execute): Tell the pass manager it doesn't 
> > need
> > to do anything for this pass.
> > (pass_manager::register_dump_files_1): Don't uselessly deal with
> > properties of passes.
> > (pass_manager::register_dump_files): Adjust.
> > (dump_one_pass): Just call pass->gate ().
> > (execute_ipa_summary_passes): Likewise.
> > (execute_one_pass): Don't check pass->has_execute flag.
> > (ipa_write_summaries_2): Don't check pass->has_gate flag.
> > (ipa_write_optimization_summaries_1): Likewise.
> > (ipa_read_summaries_1): Likewise.
> > (ipa_read_optimization_summaries_1): Likewise.
> > (execute_ipa_stmt_fixups): Likewise.
> > * tree-pass.h (pass_data): Rename has_gate to useless_has_gate and
> > has_execute to useless_has_execute to be sure they're unused.
> > (TODO_absolutely_nothing): New constant.
> >
> >
> > diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
> > index 77d78eb..3bc0a99 100644
> > --- a/gcc/pass_manager.h
> > +++ b/gcc/pass_manager.h
> > @@ -93,7 +93,7 @@ public:
> >
> >  private:
> >void set_pass_for_id (int id, opt_pass *pass);
> > -  int register_dump_files_1 (struct opt_pass *pass, int properties);
> > +  void register_dump_files_1 (struct opt_pass *pass);
> >void register_dump_files (struct opt_pass *pass, int properties);
> >
> >  private:
> > diff --git a/gcc/passes.c b/gcc/passes.c
> > index 19e5869..3b28dc9 100644
> > --- a/gcc/passes.c
> > +++ b/gcc/passes.c
> > @@ -112,7 +112,7 @@ opt_pass::gate ()
> >  unsigned int
> >  opt_pass::execute ()
> >  {
> > -  return 0;
> > +  return TODO_absolutely_nothing;
> >  }
> >
> >  opt_pass::opt_pass (const pass_data &data, context *ctxt)
> > @@ -701,33 +701,21 @@ pass_manager::register_one_dump_file (struct opt_pass 
> > *pass)
> >
> >  /* Recursive worker function for register_dump_files.  */
> >
> > -int
> > +void
> >  pass_manager::
> > -register_dump_files_1 (struct opt_pass *pass, int properties)
> > +register_dump_files_1 (struct opt_pass *pass)
> >  {
> >do
> >  {
> > -  int new_properties = (properties | pass->properties_provided)
> > -  & ~pass->properties_destroyed;
> > -
> >if (pass->name && pass->name[0] != '*')
> >  register_one_dump_file (pass);
> >
> >if (pass->sub)
> > -new_properties = register_dump_files_1 (pass->sub, new_properties);
> > -
> > -  /* If we have a gate, combine the properties that we could have with
> > - and without the pass being examined.  */
> > -  if (pass->has_gate)
> > -properties &= new_properties;
> > -  else
> > -properties = new_properties;
> > +register_dump_files_1 (pass->sub);
> >
> >pass = pass->next;
> >  }
> >while (pass);
> > -
> > -  return properties;
> >  

Re: RFA: Fix PR middle-end/59049

2013-11-08 Thread Steven Bosscher
On Fri, Nov 8, 2013 at 4:18 PM, Joern Rennecke wrote:
> On 8 November 2013 14:45, Steven Bosscher wrote:
>> This is IMHO not OK without at least an explanation of why the
>> comparison of two const_ints is not folded. Better yet would be to fix
>> that underlying problem. We should not present such non-sense to the
>> RTL parts of the middle end.
>
> Which part would be responsible to folding the comparison at -O1 ?
> FWIW, as I just commented in the PR, one of the operands is an ssa_name
> with a known value.

This usually is taken care of by one of the many propagation passes
(CCP, VRP, forwprop, DOM, even FRE/PRE). The only cases I've
previously encountered, where a known value ends up not being
propagated, is when the propagation is somehow overlooked and the
known value is only replaced at out-of-ssa time, i.e. TER.

The only way to find out where the propagation should have happened,
is by looking at the gimple dumps.

Ciao!
Steven


Re: RFA: Fix PR middle-end/59049

2013-11-08 Thread Jakub Jelinek
On Fri, Nov 08, 2013 at 04:32:09PM +0100, Steven Bosscher wrote:
> On Fri, Nov 8, 2013 at 4:18 PM, Joern Rennecke wrote:
> > On 8 November 2013 14:45, Steven Bosscher wrote:
> >> This is IMHO not OK without at least an explanation of why the
> >> comparison of two const_ints is not folded. Better yet would be to fix
> >> that underlying problem. We should not present such non-sense to the
> >> RTL parts of the middle end.
> >
> > Which part would be responsible to folding the comparison at -O1 ?
> > FWIW, as I just commented in the PR, one of the operands is an ssa_name
> > with a known value.
> 
> This usually is taken care of by one of the many propagation passes
> (CCP, VRP, forwprop, DOM, even FRE/PRE). The only cases I've
> previously encountered, where a known value ends up not being
> propagated, is when the propagation is somehow overlooked and the
> known value is only replaced at out-of-ssa time, i.e. TER.
> 
> The only way to find out where the propagation should have happened,
> is by looking at the gimple dumps.

Well, most of the passes you've mentioned can be disabled with -fno-tree-*
but still we shouldn't ICE on it, so the expander can't assume that
it will never see something like that, only doesn't have to try too hard
to optimize that unlikely case.  Of course, it is nice to know why it hasn't
been optimized in the particular case.

Jakub


Re: [patch 4/4] std::regex refactoring

2013-11-08 Thread Jonathan Wakely
On 8 November 2013 14:51, Daniel Krügler wrote:
> I have fully not grasped for which T the specializations of
> __has_contiguous_iter are intended to be used,

Currently, only std::container iterators passed to a basic_regex
constructor, but in theory the trait could get moved to another header
and used elsewhere in future.

> but given the fact that
>
> +  template
> +struct __has_contiguous_iter>
> +: std::true_type
> +{ };
>
> exists I think you really want to exclude any vector
> by additionally adding
>
> +  template
> +struct __has_contiguous_iter>
> +: std::false_type
> +{ };

Hmm, I didn't think of that, damn vector!  It's only needed in
case someone does this:

  std::vector v;
  std::regex re(v.begin(), v.end());

which would be pretty crazy, but it has well-defined behaviour and
should throw regex_error.

I'll add the specialization, thanks for catching it.


Re: [PATCH] Generate a label for the split cold function while using -freorder-blocks-and-partition

2013-11-08 Thread Teresa Johnson
On Tue, May 14, 2013 at 3:47 PM, Steven Bosscher  wrote:
> On Wed, Apr 24, 2013 at 12:58 AM, Sriraman Tallam wrote:
>> Hi,
>>
>>   This patch generates labels for cold function parts that are split when
>> using the option -freorder-blocks-and-partition.  The cold label name
>> is generated by suffixing ".cold" to the assembler name of the hot
>> function.
>>
>>   This is useful when getting back traces from gdb when the cold function
>> part does get executed.
>
> Can you show how this changes your gdb back traces? Can you not
> already let GDB inform you that it's in another .text section?
>
> I don't like this label idea much. It seems so arbitrary, unless
> you're only interested in knowing when a piece of function from
> another .text section is executed. But I'm personally more interested
> in the whole function, i.e. also in not-so-hot (or even cold) blocks
> that are not in the .text.unlikely section. That means somehow passing
> the profile-use information from the compiler to the debugger.
>
> So what I would really like to have instead, is some kind of DWARF
> annotations for profile information in the function, with some way in
> GDB to inspect and modify the values, and to add break points based on
> execution count threshold.
>
> I haven't really thought this through yet, but I have this dream of
> having the data of a gcno file encoded in DWARF instead, and to have
> profile information included in the FDO-compiled binary to add the
> ability to verify that those parts of the program that you expect to
> be hot are really also considered hot by the compiler. The only way to
> check this now is by inspecting some dumps and that's IMHO not very
> practical.
>
> Ciao!
> Steven

I'd like to revisit this old patch from Sri to generate a label for
the split out cold section, now that all the patches to fix the
failures and insanities from -freorder-blocks-and-partition are in
(and I want to send a patch to enable it by default with fdo). The
problem is that the split code doesn't currently have any label to
identify where it came from, so in the final binary it appears to be
part of whatever non-cold function the linker happens to place it
after. This confuses tools like objdump, profilers and the debugger.

For example, from mcf with -freorder-blocks-and-partition:

main:
.LFB40:
.cfi_startproc
...
jne .L27   < jump to main's text.unlikely
...
.cfi_endproc
.section.text.unlikely
.cfi_startproc
.L27:   <--- start of main's text.unlikely
...

$ objdump -d mcf

00400aa0 :
  ...
  400b1a:   0f 85 1b fb ff ff   jne40063b
   < jump to main's text.unlikely

004004c0 :
  ...
  40063b:   bf 06 8a 49 00  mov$0x498a06,%edi<---
start of main's text.unlikely
  ...
  40065e:   e9 0d 05 00 00  jmpq   400b70 
<--- jump back to main's hot text


Note that objdump thinks we are jumping to/from another function. If
we end up executing the cold section (e.g. due to non-representative
profiles), profiling tools and gdb are going to think we are executing
replace_weaker_arc. With Sri's patch, this becomes much clearer:

00400aa0 :
  ...
  400b1a:   0f 85 1b fb ff ff   jne40063b 

0040063b :
  40063b:   bf 06 8a 49 00  mov$0x498a06,%edi
  ...
  40065e:   e9 0d 05 00 00  jmpq   400b70 


Steven, I think the ideas you describe above about passing profile
information in DWARF and doing some special handling in gdb to use
that seem orthogonal to this.

Given that, is this patch ok for trunk (pending successful re-runs of
regression testing)

Thanks,
Teresa

-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: RFA: Fix PR middle-end/59049

2013-11-08 Thread Steven Bosscher
On Fri, Nov 8, 2013 at 4:41 PM, Jakub Jelinek wrote:
> On Fri, Nov 08, 2013 at 04:32:09PM +0100, Steven Bosscher wrote:
>> On Fri, Nov 8, 2013 at 4:18 PM, Joern Rennecke wrote:
>> > On 8 November 2013 14:45, Steven Bosscher wrote:
>> >> This is IMHO not OK without at least an explanation of why the
>> >> comparison of two const_ints is not folded. Better yet would be to fix
>> >> that underlying problem. We should not present such non-sense to the
>> >> RTL parts of the middle end.
>> >
>> > Which part would be responsible to folding the comparison at -O1 ?
>> > FWIW, as I just commented in the PR, one of the operands is an ssa_name
>> > with a known value.
>>
>> This usually is taken care of by one of the many propagation passes
>> (CCP, VRP, forwprop, DOM, even FRE/PRE). The only cases I've
>> previously encountered, where a known value ends up not being
>> propagated, is when the propagation is somehow overlooked and the
>> known value is only replaced at out-of-ssa time, i.e. TER.
>>
>> The only way to find out where the propagation should have happened,
>> is by looking at the gimple dumps.
>
> Well, most of the passes you've mentioned can be disabled with -fno-tree-*
> but still we shouldn't ICE on it, so the expander can't assume that
> it will never see something like that, only doesn't have to try too hard
> to optimize that unlikely case.  Of course, it is nice to know why it hasn't
> been optimized in the particular case.


Even with the gimple opts disabled, a const-const comparison would
normally be folded by the RTL expanders.

Ciao!
Steven


Re: RFA: Fix PR middle-end/59049

2013-11-08 Thread Jeff Law

On 11/08/13 07:45, Steven Bosscher wrote:

On Fri, Nov 8, 2013 at 3:40 PM, Joern Rennecke
 wrote:

bootstrapped / regtested on i686-pc-linux-gnu.


Not a very elaborate description of the patch, eh? :-)

This is IMHO not OK without at least an explanation of why the
comparison of two const_ints is not folded. Better yet would be to fix
that underlying problem. We should not present such non-sense to the
RTL parts of the middle end.

Agreed.
jeff



Re: [patch 4/4] std::regex refactoring

2013-11-08 Thread Jonathan Wakely
On 8 November 2013 15:41, Jonathan Wakely wrote:
> On 8 November 2013 14:51, Daniel Krügler wrote:
>> I have fully not grasped for which T the specializations of
>> __has_contiguous_iter are intended to be used,
>
> Currently, only std::container iterators passed to a basic_regex
> constructor, but in theory the trait could get moved to another header
> and used elsewhere in future.

Currently the vector specialization can never be reached,
because std::vector doesn't use __gnu_cxx::__normal_iterator
(and trying to pass vector::iterator to a regex ctor fails
anyway) so this is only a theoretical problem if we re-use
__has_contiguous_iter elsewhere.


Re: [PATCH] Generate a label for the split cold function while using -freorder-blocks-and-partition

2013-11-08 Thread Steven Bosscher
On Fri, Nov 8, 2013 at 4:45 PM, Teresa Johnson wrote:
> On Tue, May 14, 2013 at 3:47 PM, Steven Bosscher wrote:
> I'd like to revisit this old patch from Sri to generate a label for
> the split out cold section, now that all the patches to fix the
> failures and insanities from -freorder-blocks-and-partition are in
> (and I want to send a patch to enable it by default with fdo). The
> problem is that the split code doesn't currently have any label to
> identify where it came from, so in the final binary it appears to be
> part of whatever non-cold function the linker happens to place it
> after. This confuses tools like objdump, profilers and the debugger.

Surely there are .loc directives in the code to clarify where the code
came from? So at least the debugger should know what function the code
belongs to.

> For example, from mcf with -freorder-blocks-and-partition:
>
> main:
> .LFB40:
> .cfi_startproc
> ...
> jne .L27   < jump to main's text.unlikely
> ...
> .cfi_endproc
> .section.text.unlikely
> .cfi_startproc
> .L27:   <--- start of main's text.unlikely
> ...
>
> $ objdump -d mcf
>
> 00400aa0 :
>   ...
>   400b1a:   0f 85 1b fb ff ff   jne40063b
>< jump to main's text.unlikely
>
> 004004c0 :
>   ...
>   40063b:   bf 06 8a 49 00  mov$0x498a06,%edi<---
> start of main's text.unlikely
>   ...
>   40065e:   e9 0d 05 00 00  jmpq   400b70 
> <--- jump back to main's hot text
>
>
> Note that objdump thinks we are jumping to/from another function. If
> we end up executing the cold section (e.g. due to non-representative
> profiles), profiling tools and gdb are going to think we are executing
> replace_weaker_arc. With Sri's patch, this becomes much clearer:
>
> 00400aa0 :
>   ...
>   400b1a:   0f 85 1b fb ff ff   jne40063b 
>
> 0040063b :
>   40063b:   bf 06 8a 49 00  mov$0x498a06,%edi
>   ...
>   40065e:   e9 0d 05 00 00  jmpq   400b70 
>

Does -ffunction-sections help here?


> Steven, I think the ideas you describe above about passing profile
> information in DWARF and doing some special handling in gdb to use
> that seem orthogonal to this.

It's not just about profile information but about what function some
code belongs to. I just cannot believe that there isn't already
something in DWARF to mark address ranges as being part of a given
function.


> Given that, is this patch ok for trunk (pending successful re-runs of
> regression testing)


IMHO, no. This is conceptually wrong. You create artificial named
labels in the compiler that might clash with user labels. Is that
likely to happen? No. But it could and therefore the approach of
adding functionname.cold labels is wrong.

Ciao!
Steven


Re: [PATCH] Generate a label for the split cold function while using -freorder-blocks-and-partition

2013-11-08 Thread Teresa Johnson
On Fri, Nov 8, 2013 at 8:14 AM, Steven Bosscher  wrote:
> On Fri, Nov 8, 2013 at 4:45 PM, Teresa Johnson wrote:
>> On Tue, May 14, 2013 at 3:47 PM, Steven Bosscher wrote:
>> I'd like to revisit this old patch from Sri to generate a label for
>> the split out cold section, now that all the patches to fix the
>> failures and insanities from -freorder-blocks-and-partition are in
>> (and I want to send a patch to enable it by default with fdo). The
>> problem is that the split code doesn't currently have any label to
>> identify where it came from, so in the final binary it appears to be
>> part of whatever non-cold function the linker happens to place it
>> after. This confuses tools like objdump, profilers and the debugger.
>
> Surely there are .loc directives in the code to clarify where the code
> came from? So at least the debugger should know what function the code
> belongs to.
>
>> For example, from mcf with -freorder-blocks-and-partition:
>>
>> main:
>> .LFB40:
>> .cfi_startproc
>> ...
>> jne .L27   < jump to main's text.unlikely
>> ...
>> .cfi_endproc
>> .section.text.unlikely
>> .cfi_startproc
>> .L27:   <--- start of main's text.unlikely
>> ...
>>
>> $ objdump -d mcf
>>
>> 00400aa0 :
>>   ...
>>   400b1a:   0f 85 1b fb ff ff   jne40063b
>>< jump to main's text.unlikely
>>
>> 004004c0 :
>>   ...
>>   40063b:   bf 06 8a 49 00  mov$0x498a06,%edi<---
>> start of main's text.unlikely
>>   ...
>>   40065e:   e9 0d 05 00 00  jmpq   400b70 
>> <--- jump back to main's hot text
>>
>>
>> Note that objdump thinks we are jumping to/from another function. If
>> we end up executing the cold section (e.g. due to non-representative
>> profiles), profiling tools and gdb are going to think we are executing
>> replace_weaker_arc. With Sri's patch, this becomes much clearer:
>>
>> 00400aa0 :
>>   ...
>>   400b1a:   0f 85 1b fb ff ff   jne40063b 
>>
>> 0040063b :
>>   40063b:   bf 06 8a 49 00  mov$0x498a06,%edi
>>   ...
>>   40065e:   e9 0d 05 00 00  jmpq   400b70 
>>
>
> Does -ffunction-sections help here?

No. It just moves the cold section to a different spot (so the
function it appears part of is different, but same issue).

>
>
>> Steven, I think the ideas you describe above about passing profile
>> information in DWARF and doing some special handling in gdb to use
>> that seem orthogonal to this.
>
> It's not just about profile information but about what function some
> code belongs to. I just cannot believe that there isn't already
> something in DWARF to mark address ranges as being part of a given
> function.

Cary, any ideas here?

>
>
>> Given that, is this patch ok for trunk (pending successful re-runs of
>> regression testing)
>
>
> IMHO, no. This is conceptually wrong. You create artificial named
> labels in the compiler that might clash with user labels. Is that
> likely to happen? No. But it could and therefore the approach of
> adding functionname.cold labels is wrong.

Is there a format for compiler-defined labels that would not be able
to clash with other user-generated labels?

Teresa

>
> Ciao!
> Steven



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


[PATCH] Commonize repeated code into function

2013-11-08 Thread Jeff Law


This has been on my todo list for a little while now.  Over time I've 
written the loop to delete a jump threading path several times.  It's 
just 3 lines of code, but I hate repeating it all over the place.


This pulls those trivial 3 lines into a function and calls it from the 
appropriate places.


Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  Applied 
to the trunk.


* tree-ssa-threadupdate.h (delete_thread_path): Declare.
* tree-ssa-threadupdate.c (delete_thread_path): New function.
(ssa_redirect_edges, thread_block_1): Use it.
(thread_through_loop_header, mark_threaded_blocks): Likewise.
(thread_through_all_blocks, register_jump_thread): Likewise.
* tree-ssa-threadedge.c (thread_across_edge): Likewise.



diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 4cff16d..cd2b34a 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -1086,9 +1086,7 @@ thread_across_edge (gimple dummy_cond,
  }
else
  {
-   for (unsigned int i = 0; i < path->length (); i++)
- delete (*path)[i];
-   path->release();
+   delete_jump_thread_path (path);
  }
   }
 BITMAP_FREE (visited);
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 1027191..24e7767 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -555,9 +555,7 @@ ssa_redirect_edges (struct redirection_data **slot,
 
   /* Go ahead and clear E->aux.  It's not needed anymore and failure
  to clear it will cause all kinds of unpleasant problems later.  */
-  for (unsigned int i = 0; i < path->length (); i++)
-   delete (*path)[i];
-  path->release ();
+  delete_jump_thread_path (path);
   e->aux = NULL;
 
 }
@@ -703,9 +701,7 @@ thread_block_1 (basic_block bb, bool noloop_only, bool 
joiners)
  /* Since this case is not handled by our special code
 to thread through a loop header, we must explicitly
 cancel the threading request here.  */
- for (unsigned int i = 0; i < path->length (); i++)
-   delete (*path)[i];
- path->release ();
+ delete_jump_thread_path (path);
  e->aux = NULL;
  continue;
}
@@ -1161,9 +1157,7 @@ thread_through_loop_header (struct loop *loop, bool 
may_peel_loop_headers)
  if (e->src->loop_father != e2->dest->loop_father
  && e2->dest != loop->header)
{
- for (unsigned int i = 0; i < path->length (); i++)
-   delete (*path)[i];
- path->release ();
+ delete_jump_thread_path (path);
  e->aux = NULL;
}
}
@@ -1213,9 +1207,7 @@ fail:
 
   if (path)
{
- for (unsigned int i = 0; i < path->length (); i++)
-   delete (*path)[i];
- path->release ();
+ delete_jump_thread_path (path);
  e->aux = NULL;
}
 }
@@ -1310,9 +1302,7 @@ mark_threaded_blocks (bitmap threaded_blocks)
 
  if (e2 && !phi_args_equal_on_edges (e2, final_edge))
{
- for (unsigned int i = 0; i < path->length (); i++)
-   delete (*path)[i];
- path->release ();
+ delete_jump_thread_path (path);
  e->aux = NULL;
}
}
@@ -1336,9 +1326,7 @@ mark_threaded_blocks (bitmap threaded_blocks)
  if (e->aux)
{
  vec *path = THREAD_PATH (e);
- for (unsigned int i = 0; i < path->length (); i++)
-   delete (*path)[i];
- path->release ();
+ delete_jump_thread_path (path);
  e->aux = NULL;
}
}
@@ -1395,9 +1383,7 @@ mark_threaded_blocks (bitmap threaded_blocks)
  || (path->last ()->type
  == EDGE_COPY_SRC_JOINER_BLOCK))
{
- for (unsigned int i = 0; i < path->length (); i++)
-   delete (*path)[i];
- path->release ();
+ delete_jump_thread_path (path);
  e->aux = NULL;
}
  break;
@@ -1498,9 +1484,7 @@ thread_through_all_blocks (bool may_peel_loop_headers)
  {
vec *path = THREAD_PATH (e);
 
-   for (unsigned int i = 0; i < path->length (); i++)
- delete (*path)[i];
-   path->release ();
+   delete_jump_thread_path (path);
e->aux = NULL;
  }
 }
@@ -1520,6 +1504,17 @@ thread_through_all_blocks (bool may_peel_loop_headers)
   return retval;
 }
 
+/*

Re: [PATCH] Generate a label for the split cold function while using -freorder-blocks-and-partition

2013-11-08 Thread Steven Bosscher
On Fri, Nov 8, 2013 at 4:45 PM, Teresa Johnson wrote:
> For example, from mcf with -freorder-blocks-and-partition:
>
> main:
> .LFB40:
> .cfi_startproc
> ...
> jne .L27   < jump to main's text.unlikely
> ...
> .cfi_endproc
> .section.text.unlikely
> .cfi_startproc
> .L27:   <--- start of main's text.unlikely
> ...
>
> $ objdump -d mcf
>
> 00400aa0 :
>   ...
>   400b1a:   0f 85 1b fb ff ff   jne40063b
>< jump to main's text.unlikely
>
> 004004c0 :
>   ...
>   40063b:   bf 06 8a 49 00  mov$0x498a06,%edi<---
> start of main's text.unlikely
>   ...
>   40065e:   e9 0d 05 00 00  jmpq   400b70 
> <--- jump back to main's hot text
>
>
> Note that objdump thinks we are jumping to/from another function. If
> we end up executing the cold section (e.g. due to non-representative
> profiles), profiling tools and gdb are going to think we are executing
> replace_weaker_arc.

Isn't this something that should be expressed in DWARF with
DW_AT_ranges? See DWARF4, section 2.17,

Does GCC generate such ranges?

Ciao!
Steven


Re: Update soft-fp from glibc

2013-11-08 Thread Uros Bizjak
Hello!

> I've applied this patch to update libgcc's copy of soft-fp from
> glibc.  There are lots of coding standards fixes, but also various bug
> fixes; I've added testcases for various of the fixed bugs illustrating
> them for __float128.

Is there really no way to get those "missing prototypes" warning fixed?

Uros.


Re: [PATCH] fixincludes: use $(FI) instead of fixincl@EXEEXT@

2013-11-08 Thread Bruce Korb
Sure.  Looks good to me.  Thanks

On Fri, Nov 8, 2013 at 2:57 AM, Bernhard Reutner-Fischer
 wrote:
> On 4 April 2013 22:20, Bruce Korb  wrote:
>> Except as noted below, fine by me.
>>
>> On 04/04/13 12:56, Bernhard Reutner-Fischer wrote:
>>> Bootstrapped and regtested on x86_64-unknown-linux-gnu and
>>> x86_64-mine-linux-uclibc without regressions, ok for trunk?
>>>
>>> fixincludes/ChangeLog:
>>>
>>> 2013-04-04  Bernhard Reutner-Fischer  
>>>
>>>   Makefile.in: Use $(FI) instead of fixincl@EXEEXT@.
>>>   Cleanup whitespace while at it.
>>>
>>> Signed-off-by: Bernhard Reutner-Fischer 
>>> ---
>>>  fixincludes/Makefile.in |   10 +-
>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/fixincludes/Makefile.in b/fixincludes/Makefile.in
>>> index ce850ff..3dc869d 100644
>>> --- a/fixincludes/Makefile.in
>>> +++ b/fixincludes/Makefile.in
>>> @@ -131,7 +131,7 @@ fixinc.sh : fixinc.in mkfixinc.sh Makefile
>>>  $(srcdir)/fixincl.x: @MAINT@ fixincl.tpl inclhack.def
>>>   cd $(srcdir) ; $(SHELL) ./genfixes
>>>
>>> -mostlyclean :
>>> +mostlyclean:
>>
>> I see no reason for changing things.
>
> dropped this hunk.
>
>> But if you are going to "clean up" the colons, then they should
>> be in columns (mostly 12 or 16).  This one is already in 12.
>>
>>>   rm -f *.o *-stamp $(AF) $(FI) *~ fixinc.sh
>>>
>>>  clean: mostlyclean
>>> @@ -179,18 +179,18 @@ check : all
>>>
>>>  install : all
>>>   -rm -rf $(DESTDIR)$(itoolsdir)
>>> - $(mkinstalldirs) $(DESTDIR)$(itoolsdir)
>>> + $(mkinstalldirs) $(DESTDIR)$(itoolsdir)
>>>   $(mkinstalldirs) $(DESTDIR)$(itoolsdatadir)/include
>>>   $(INSTALL_DATA) $(srcdir)/README-fixinc \
>>> $(DESTDIR)$(itoolsdatadir)/include/README
>>>   $(INSTALL_SCRIPT) fixinc.sh $(DESTDIR)$(itoolsdir)/fixinc.sh
>>> - $(INSTALL_PROGRAM) fixincl@EXEEXT@ \
>>> -   $(DESTDIR)$(itoolsdir)/fixincl@EXEEXT@
>>> + $(INSTALL_PROGRAM) $(FI) \
>>> +   $(DESTDIR)$(itoolsdir)/$(FI)
>>
>> This should now fit on a single line.
>
> ok
>>
>>>   $(INSTALL_SCRIPT) mkheaders $(DESTDIR)$(itoolsdir)/mkheaders
>>>
>>>  install-strip: install
>>>   test -z '$(STRIP)' \
>>> -   || $(STRIP) $(DESTDIR)$(itoolsdir)/fixincl@EXEEXT@
>>> +   || $(STRIP) $(DESTDIR)$(itoolsdir)/$(FI)
>
> changed this too to be on a single line now.
>>>
>>>  .PHONY: all check install install-strip
>>>  .PHONY: dvi pdf info html install-pdf install-info install-html
>
> Changelog remains the same.
> II was using the attached updated patch since April, ok for trunk?


Fix for PR 59039

2013-11-08 Thread Iyer, Balaji V
Hello Everyone,
Attached, please find a patch that will fix a crash in a function in 
libcilkrts when optimization was turned on. The issue was that the longjump and 
setjmp were called in the same function.  This patch should fix that issue. The 
change is not in the compiler side but only in the Cilk Library. If it is OK 
with everyone, I will check this in this afternoon.

Thanks,

Balaji V. Iyer.
Index: libcilkrts/runtime/cilk_fiber-unix.cpp
===
--- libcilkrts/runtime/cilk_fiber-unix.cpp  (revision 204546)
+++ libcilkrts/runtime/cilk_fiber-unix.cpp  (working copy)
@@ -44,24 +44,27 @@
 #include 
 #include 
 
+#include 
+#include 
+#include 
+
+// You'd think that getting a defintion for alloca would be easy.  But you'd
+// be wrong. Here's a variant on what's recommended in the autoconf doc.  I've
+// remove the Windows portion since this is Unix-specific code.
 #if defined HAVE_ALLOCA_H
-# include 
+#   include 
 #elif defined __GNUC__
-# define alloca __builtin_alloca
+#   define alloca __builtin_alloca
 #elif defined _AIX
-# define alloca __alloca
+#   define alloca __alloca
 #else
-# include 
-# ifdef  __cplusplus
+#   include 
+#   ifdef  __cplusplus
 extern "C"
-# endif
+#   endif
 void *alloca (size_t);
 #endif
 
-#include 
-#include 
-#include 
-
 // MAP_ANON is deprecated on Linux, but seems to be required on Mac...
 #ifndef MAP_ANONYMOUS
 #define MAP_ANONYMOUS MAP_ANON
@@ -163,8 +166,15 @@
 __cilkrts_bug("Should not get here");
 }
 
-#pragma GCC push_options
-#pragma GCC optimize ("-O0")
+// GCC doesn't allow us to call __builtin_longjmp in the same function that
+// calls __builtin_setjmp, so create a new function to house the call to
+// __builtin_longjmp
+static void __attribute__((noinline))
+do_cilk_longjmp(__CILK_JUMP_BUFFER jmpbuf)
+{
+CILK_LONGJMP(jmpbuf);
+}
+
 NORETURN cilk_fiber_sysdep::run()
 {
 // Only fibers created from a pool have a proc method to run and execute. 
@@ -201,7 +211,11 @@
 // switching to for any temporaries required for this run()
 // function.
 JMPBUF_SP(m_resume_jmpbuf) = m_stack_base - frame_size;
-CILK_LONGJMP(m_resume_jmpbuf);
+
+// GCC doesn't allow us to call __builtin_longjmp in the same function
+// that calls __builtin_setjmp, so it's been moved into it's own
+// function that cannot be inlined.
+do_cilk_longjmp(m_resume_jmpbuf);
 }
 
 // Note: our resetting of the stack pointer is valid only if the
@@ -228,7 +242,6 @@
 // User proc should never return.
 __cilkrts_bug("Should not get here");
 }
-#pragma GCC pop_options
 
 void cilk_fiber_sysdep::make_stack(size_t stack_size)
 {
Index: libcilkrts/ChangeLog
===
--- libcilkrts/ChangeLog(revision 204546)
+++ libcilkrts/ChangeLog(working copy)
@@ -1,3 +1,9 @@
+2013-11-08  Balaji V. Iyer  
+
+   PR c/59039
+   * runtime/cilk_fiber-unix.cpp: Fixed a crash in run() function
+   when optimization is turned on.
+
 2013-11-04  Balaji V. Iyer  
 
PR bootstrap/58951


Re: [PATCH][ARM] Add break in handling of comparisons in rtx costs function

2013-11-08 Thread Ramana Radhakrishnan

On 11/08/13 13:34, Kyrill Tkachov wrote:

Hi all,

In arm_new_rtx_costs we need a break; statement after handling the comparisons
cases. Otherwise we fall through and compute garbage. This small patch adds 
that.

Tested arm-none-eabi on qemu.

Ok for trunk?


Ok - thanks.

Ramana



Re: RFA: Fix PR middle-end/59049

2013-11-08 Thread Joern Rennecke
On 8 November 2013 15:50, Steven Bosscher  wrote:
> Even with the gimple opts disabled, a const-const comparison would
> normally be folded by the RTL expanders.

Well, in this spirit, attached is another way to address the RTL side
of the problem.
As mention in the PR, the tree side of the problem started showing up
in r204194.
2013-11-08  Joern Rennecke  

PR middle-end/59049
* expmed.c (emit_store_flag): Fail for const-const comparison.

Index: expmed.c
===
--- expmed.c(revision 204568)
+++ expmed.c(working copy)
@@ -5401,6 +5401,13 @@ emit_store_flag (rtx target, enum rtx_co
   rtx subtarget;
   rtx tem, last, trueval;
 
+  /* If we compare constants, we shouldn't use a store-flag operation,
+ but a constant load.  We can get there via the vanilla route that
+ usually generates a compare-branch sequence, but will in this case
+ fold the comparison to a constant, and thus elide the branch.  */
+  if (CONSTANT_P (op0) && CONSTANT_P (op1))
+return 0;
+
   tem = emit_store_flag_1 (target, code, op0, op1, mode, unsignedp, normalizep,
   target_mode);
   if (tem)


[PATCH, libatomic]: Add config/x86/fenv.c

2013-11-08 Thread Uros Bizjak
On Fri, Nov 8, 2013 at 2:13 PM, Joseph S. Myers  wrote:

>> Can we introduce a target-dependant source here, in the same way as
>
> Sure, that seems a reasonable thing to do.  I think putting a file fenv.c
> in an appropriate subdirectory of libatomic/config will result in it being
> found automatically by the existing search path logic, but you'll need to
> test that.

Attached is the x86 optimized implementation of fenv.c. The source
depends as little as possible on fenv.h definitions - these are
defined by hardware, and for sure won't change soon in
hardware-dependant file.

2013-11-08  Uros Bizjak  

* config/x86/fenv.c: New file.

Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. The
testing of atomics currently fails on 32bit target due to missing
__atomic_{load,store}_16 functions, so the patch is not adequately
tested yet.

Yes, I have tested that the libatomic's auto-configuration logic works ;)

Uros.
Index: ChangeLog
===
--- ChangeLog   (revision 204574)
+++ ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2013-11-08  Uros Bizjak  
+
+   * config/x86/fenv.c: New file.
+
 2013-11-07  Joseph Myers  
 
* fenv.c: New file.
Index: config/x86/fenv.c
===
--- config/x86/fenv.c   (revision 0)
+++ config/x86/fenv.c   (working copy)
@@ -0,0 +1,115 @@
+/* Copyright (C) 2013 Free Software Foundation, Inc.
+
+   This file is part of the GNU Atomic Library (libatomic).
+
+   Libatomic is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   Libatomic is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#include "libatomic_i.h"
+
+#ifdef HAVE_FENV_H
+# include 
+#endif
+
+struct fenv
+{
+  unsigned short int __control_word;
+  unsigned short int __unused1;
+  unsigned short int __status_word;
+  unsigned short int __unused2;
+  unsigned short int __tags;
+  unsigned short int __unused3;
+  unsigned int __eip;
+  unsigned short int __cs_selector;
+  unsigned int __opcode:11;
+  unsigned int __unused4:5;
+  unsigned int __data_offset;
+  unsigned short int __data_selector;
+  unsigned short int __unused5;
+};
+
+/* Raise the supported floating-point exceptions from EXCEPTS.  Other
+   bits in EXCEPTS are ignored.  */
+
+void
+__atomic_feraiseexcept (int excepts __attribute__ ((unused)))
+{
+#ifdef FE_INVALID
+  if (excepts & FE_INVALID)
+{
+  float f = 0.0f;
+#ifdef __x86_64__
+  volatile float r __attribute__ ((unused));
+  asm volatile ("%vdivss\t{%0, %d0|%d0, %0}" : "+x" (f));
+  r = f; /* Needed to trigger exception.   */
+#else
+  asm volatile ("fdiv\t{%y0, %0|%0, %y0}" : "+t" (f));
+  /* No need for fwait, exception is triggered by emitted fstp.  */
+#endif
+}
+#endif
+#ifdef FE_DIVBYZERO
+  if (excepts & FE_DIVBYZERO)
+{
+  float f = 1.0f, g = 0.0f;
+#ifdef __x86_64__
+  volatile float r __attribute__ ((unused));
+  asm volatile ("%vdivss\t{%1, %d0|%d0, %1}" : "+x" (f) : "xm" (g));
+  r = f; /* Needed to trigger exception.   */
+#else
+  asm volatile ("fdivs\t%1" : "+t" (f) : "m" (g));
+  /* No need for fwait, exception is triggered by emitted fstp.  */
+#endif
+}
+#endif
+#ifdef FE_OVERFLOW
+  if (excepts & FE_OVERFLOW)
+{
+  struct fenv temp;
+  asm volatile ("fnstenv\t%0" : "=m" (temp));
+  temp.__status_word |= 0x08;
+  asm volatile ("fldenv\t%0" : : "m" (temp));
+  asm volatile ("fwait");
+}
+#endif
+#ifdef FE_UNDERFLOW
+  if (excepts & FE_UNDERFLOW)
+{
+  struct fenv temp;
+  asm volatile ("fnstenv\t%0" : "=m" (temp));
+  temp.__status_word |= 0x10;
+  asm volatile ("fldenv\t%0" : : "m" (temp));
+  asm volatile ("fwait");
+}
+#endif
+#ifdef FE_INEXACT
+  if (excepts & FE_INEXACT)
+{
+  float f = 1.0f, g = 3.0f;
+#ifdef __x86_64__
+  volatile float r __attribute__ ((unused));
+  asm volatile ("%vdivss\t{%1, %d0|%d0, %1}" : "+x" (f) : "xm" (g));
+  r = f; /* Needed to trigger exception.   */
+#else
+  asm volatile ("fdivs\t%1" : "+t" (f) : "m" (g));
+  /* No need for fwait, exception is trigg

Re: [PATCH] Enhance ifcombine to recover non short circuit branches

2013-11-08 Thread Steven Bosscher
On Wed, Oct 30, 2013 at 5:03 AM, Andrew Pinski wrote:
> Here is what I applied in the end; Jeff told me just to remove the
> testcase.  I added the comment trying to explain why it was the
> opposite order of PHI-opt.
>
> Thanks,
> Andrew Pinski
>
> ChangeLog:
> * tree-ssa-ifcombine.c: Include rtl.h and tm_p.h.

Eh, why???

The file has this comment:

25/* rtl is needed only because arm back-end requires it for
26   BRANCH_COST.  */
27#include "rtl.h"
28#include "tm_p.h"

Can you please clarify why this is not something to be fixed in the
ARM back end?

You're taking the easy way out here, but it's a step in the wrong
direction from modularity point of view.

Ciao!
Steven


Re: [GOOGLE] tune loop scale factor for AutoFDO

2013-11-08 Thread Xinliang David Li
On Fri, Nov 8, 2013 at 6:23 AM, Dehao Chen  wrote:
> AutoFDO sometimes has 0 profile in the loop's entry block because the
> debug info are lost and unrecoverable.

Is that a separate problem to track and fix?


Also is it better to do a round of loop profiling smoothing after auto
profile annotation to eliminate the insanity?

David


>
> E.g.
>
> if (a)
>   if (b)
> for () {}
>
> This patch checks if the scale factor is 0, then use the normal scale.
>
> Bootstrapped and passed regression test and performance test.
>
> OK for google-4_8?
>
> Thanks,
> Dehao
>
> Index: gcc/cfgloopmanip.c
> ===
> --- gcc/cfgloopmanip.c (revision 204568)
> +++ gcc/cfgloopmanip.c (working copy)
> @@ -1262,6 +1262,8 @@ duplicate_loop_to_header_edge (struct loop *loop,
>   }
>for (i = 0; i < ndupl; i++)
>   gcc_assert (scale_step[i] >= 0 && scale_step[i] <= REG_BR_PROB_BASE);
> +  if (flag_auto_profile && scale_main == 0)
> + scale_main = REG_BR_PROB_BASE;
>gcc_assert (scale_main >= 0 && scale_main <= REG_BR_PROB_BASE
> && scale_act >= 0  && scale_act <= REG_BR_PROB_BASE);
>  }


Re: [patch 4/4] std::regex refactoring

2013-11-08 Thread Jonathan Wakely
On 8 November 2013 16:03, Jonathan Wakely wrote:
> On 8 November 2013 15:41, Jonathan Wakely wrote:
>> On 8 November 2013 14:51, Daniel Krügler wrote:
>>> I have fully not grasped for which T the specializations of
>>> __has_contiguous_iter are intended to be used,
>>
>> Currently, only std::container iterators passed to a basic_regex
>> constructor, but in theory the trait could get moved to another header
>> and used elsewhere in future.
>
> Currently the vector specialization can never be reached,
> because std::vector doesn't use __gnu_cxx::__normal_iterator
> (and trying to pass vector::iterator to a regex ctor fails
> anyway) so this is only a theoretical problem if we re-use
> __has_contiguous_iter elsewhere.

I've fixed the trait anyway, like so:

2013-11-08  Jonathan Wakely  

* include/bits/regex_compiler.h (__detail::__has_contiguous_iter):
vector storage is not contiguous.

Tested x86_64-linux, committed to trunk.
diff --git a/libstdc++-v3/include/bits/regex_compiler.h 
b/libstdc++-v3/include/bits/regex_compiler.h
index 741098f..b9f8127 100644
--- a/libstdc++-v3/include/bits/regex_compiler.h
+++ b/libstdc++-v3/include/bits/regex_compiler.h
@@ -134,12 +134,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
 struct __has_contiguous_iter>
-: std::true_type
+: std::true_type  // string storage is contiguous
 { };
 
   template
 struct __has_contiguous_iter>
-: std::true_type
+: std::true_type  // vector storage is contiguous
+{ };
+
+  template
+struct __has_contiguous_iter>
+: std::false_type // vector storage is not contiguous
 { };
 
   template


Go patch committed: Avoid bogus init loop error

2013-11-08 Thread Ian Lance Taylor
Code like this

type S struct {
F int
}

var V = S{F: 1}

var F = V.F

could trigger an incorrect "variable initializer refers to itself" error
because the Go frontend would confuse itself into thinking that the "F"
in the composite literal was the same as the global variable "F", rather
than being the name of the field "F".

This patch fixes the problem.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 60070f2ac887 go/expressions.cc
--- a/go/expressions.cc	Thu Nov 07 20:41:06 2013 -0800
+++ b/go/expressions.cc	Fri Nov 08 09:30:45 2013 -0800
@@ -13488,10 +13488,52 @@
 int
 Composite_literal_expression::do_traverse(Traverse* traverse)
 {
-  if (this->vals_ != NULL
-  && this->vals_->traverse(traverse) == TRAVERSE_EXIT)
+  if (Type::traverse(this->type_, traverse) == TRAVERSE_EXIT)
 return TRAVERSE_EXIT;
-  return Type::traverse(this->type_, traverse);
+
+  // If this is a struct composite literal with keys, then the keys
+  // are field names, not expressions.  We don't want to traverse them
+  // in that case.  If we do, we can give an erroneous error "variable
+  // initializer refers to itself."  See bug482.go in the testsuite.
+  if (this->has_keys_ && this->vals_ != NULL)
+{
+  // The type may not be resolvable at this point.
+  Type* type = this->type_;
+  while (true)
+	{
+	  if (type->classification() == Type::TYPE_NAMED)
+	type = type->named_type()->real_type();
+	  else if (type->classification() == Type::TYPE_FORWARD)
+	{
+	  Type* t = type->forwarded();
+	  if (t == type)
+		break;
+	  type = t;
+	}
+	  else
+	break;
+	}
+
+  if (type->classification() == Type::TYPE_STRUCT)
+	{
+	  Expression_list::iterator p = this->vals_->begin();
+	  while (p != this->vals_->end())
+	{
+	  // Skip key.
+	  ++p;
+	  go_assert(p != this->vals_->end());
+	  if (Expression::traverse(&*p, traverse) == TRAVERSE_EXIT)
+		return TRAVERSE_EXIT;
+	  ++p;
+	}
+	  return TRAVERSE_CONTINUE;
+	}
+}
+
+  if (this->vals_ != NULL)
+return this->vals_->traverse(traverse);
+
+  return TRAVERSE_CONTINUE;
 }
 
 // Lower a generic composite literal into a specific version based on


Re: [PATCH] Enhance ifcombine to recover non short circuit branches

2013-11-08 Thread Steven Bosscher
On Fri, Nov 8, 2013 at 6:20 PM, Steven Bosscher wrote:
> On Wed, Oct 30, 2013 at 5:03 AM, Andrew Pinski wrote:
>> Here is what I applied in the end; Jeff told me just to remove the
>> testcase.  I added the comment trying to explain why it was the
>> opposite order of PHI-opt.
>>
>> Thanks,
>> Andrew Pinski
>>
>> ChangeLog:
>> * tree-ssa-ifcombine.c: Include rtl.h and tm_p.h.
>
> Eh, why???
>
> The file has this comment:
>
> 25/* rtl is needed only because arm back-end requires it for
> 26   BRANCH_COST.  */
> 27#include "rtl.h"
> 28#include "tm_p.h"
>
> Can you please clarify why this is not something to be fixed in the
> ARM back end?


Could be fixed like attached. Can you please have a look and
foster-parent it if you like it?

Thanks,

Ciao!
Steven

* config/arm/arm-protos.h (arm_branch_cost,
arm_logical_op_non_short_circuit): New prototypes.
* config/arm/arm.c (arm_branch_cost): Implement it.
(arm_logical_op_non_short_circuit): Likewise.
* config/arm/arm.h (BRANCH_COST): Don't expose ARM back-end
internals to the RTL middle end.
(LOGICAL_OP_NON_SHORT_CIRCUIT): Likewise.
* tree-ssa-ifcombine.c: Don't include rtl.h here.
* tree-ssa-reassoc.c: Don't include rtl.h here either.

Index: config/arm/arm-protos.h
===
--- config/arm/arm-protos.h (revision 204565)
+++ config/arm/arm-protos.h (working copy)
@@ -288,4 +288,7 @@ extern bool arm_autoinc_modes_ok_p (enum machine_m
 
 extern void arm_emit_eabi_attribute (const char *, int, int);
 
+extern bool arm_branch_cost (bool, bool);
+extern bool arm_logical_op_non_short_circuit (void);
+
 #endif /* ! GCC_ARM_PROTOS_H */
Index: config/arm/arm.c
===
--- config/arm/arm.c(revision 204565)
+++ config/arm/arm.c(working copy)
@@ -30351,4 +30351,19 @@ arm_asan_shadow_offset (void)
   return (unsigned HOST_WIDE_INT) 1 << 29;
 }
 
+/* Implement BRANCH_COST as a function, to hide tune_params from the
+   rest of the compiler.  */
+bool
+arm_branch_cost (bool speed_p, bool predictable_p)
+{
+  return current_tune->branch_cost (speed_p, predictable_p);
+}
+
+/* Likewise for LOGICAL_OP_NON_SHORT_CIRCUIT.  */
+bool
+arm_logical_op_non_short_circuit (void)
+{
+  return (current_tune->logical_op_non_short_circuit[TARGET_ARM]);
+}
+
 #include "gt-arm.h"
Index: config/arm/arm.h
===
--- config/arm/arm.h(revision 204565)
+++ config/arm/arm.h(working copy)
@@ -2042,14 +2042,13 @@ enum arm_auto_incmodes
 /* Try to generate sequences that don't involve branches, we can then use
conditional instructions.  */
 #define BRANCH_COST(speed_p, predictable_p) \
-  (current_tune->branch_cost (speed_p, predictable_p))
+  arm_branch_cost (speed_p, predictable_p)
 
 /* False if short circuit operation is preferred.  */
 #define LOGICAL_OP_NON_SHORT_CIRCUIT   \
   ((optimize_size) \
? (TARGET_THUMB ? false : true) \
-   : (current_tune->logical_op_non_short_circuit[TARGET_ARM]))
-
+   : arm_logical_op_non_short_circuit ()
 
 /* Position Independent Code.  */
 /* We decide which register to use based on the compilation options and
Index: tree-ssa-ifcombine.c
===
--- tree-ssa-ifcombine.c(revision 204565)
+++ tree-ssa-ifcombine.c(working copy)
@@ -22,9 +22,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "tm.h"
-/* rtl is needed only because arm back-end requires it for
-   BRANCH_COST.  */
-#include "rtl.h"
 #include "tm_p.h"
 #include "tree.h"
 #include "basic-block.h"
Index: tree-ssa-reassoc.c
===
--- tree-ssa-reassoc.c  (revision 204565)
+++ tree-ssa-reassoc.c  (working copy)
@@ -23,7 +23,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "hash-table.h"
 #include "tm.h"
-#include "rtl.h"
 #include "tm_p.h"
 #include "tree.h"
 #include "basic-block.h"


Re: [PATCH] Enhance ifcombine to recover non short circuit branches

2013-11-08 Thread pinskia


> On Nov 8, 2013, at 9:20 AM, Steven Bosscher  wrote:
> 
>> On Wed, Oct 30, 2013 at 5:03 AM, Andrew Pinski wrote:
>> Here is what I applied in the end; Jeff told me just to remove the
>> testcase.  I added the comment trying to explain why it was the
>> opposite order of PHI-opt.
>> 
>> Thanks,
>> Andrew Pinski
>> 
>> ChangeLog:
>> * tree-ssa-ifcombine.c: Include rtl.h and tm_p.h.
> 
> Eh, why???
> 
> The file has this comment:
> 
> 25/* rtl is needed only because arm back-end requires it for
> 26   BRANCH_COST.  */
> 27#include "rtl.h"
> 28#include "tm_p.h"
> 
> Can you please clarify why this is not something to be fixed in the
> ARM back end?

  Really BRANCH_COST should be a target hook rather than a macro which should 
fix this issue.  fold-const.c has the same include for the same reason.  I 
thought I had saw a patch which changes it into a hook. Next week if I get 
time, I will do that.  I knew it was the wrong direction which is why I added a 
comment.

Thanks,
Andrew

> 
> You're taking the easy way out here, but it's a step in the wrong
> direction from modularity point of view.
> 
> Ciao!
> Steven


Re: RFC: simd enabled functions (omp declare simd / elementals)

2013-11-08 Thread Aldy Hernandez

On 11/07/13 09:09, Martin Jambor wrote:


I am glad this is becoming a useful infrastructure rather than just a
part of IPA-SRA.  Note that while ipa_combine_adjustments is not used
from anywhere and thus probably buggy anyway, it should in theory be
able to process new_param adjustments too.  Can you please at least
put a "not implemented" assert there?  (The reason is that the plan


Done.


diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index c38ba82..faae080 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -446,6 +446,13 @@ determine_versionability (struct cgraph_node *node)
  reason = "not a tree_versionable_function";
else if (cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE)
  reason = "insufficient body availability";
+  else if (node->has_simd_clones)
+{
+  /* Ideally we should clone the SIMD clones themselves and create
+vector copies of them, so IPA-cp and SIMD clones can happily
+coexist, but that may not be worth the effort.  */
+  reason = "function has SIMD clones";
+}


Lets hope we will eventually fix this in some followup :-)


Sure, but to be honest it's not super high on my priority list, perhaps 
once the basic functionality is in trunk.



+  tree new_arg_types = NULL;
+  for (int i = 0; i < len; i++)
  {
struct ipa_parm_adjustment *adj;
gcc_assert (link);

adj = &adjustments[i];
-  parm = oparms[adj->base_index];
+  tree parm;
+  if (adj->new_param)


I don't know what I was thinking when I invented copy_param and
remove_param as multiple flags rather than a single enum, I probably
wasn't thinking at all.  I can change it myself as a followup if you
have more pressing tasks now.  Meanwhile, can you gcc_checking_assert
that at most one flag is set at appropriate places?


Not a problem, I can implement the enum changes since I'm already 
changing all this code.  Done.





+   parm = NULL;
+  else
+   parm = oparms[adj->base_index];
adj->base = parm;


I do not think it makes sense for new parameters to have a base which
is basically the old decl.  Do you have any reasons for not setting it
to NULL?


In this particular case, adj->base is already being set to NULL because 
parm=NULL for adj->op.  The code now reads:




  if (adj->op == IPA_PARM_OP_NEW)
parm = NULL;
  else
parm = oparms[adj->base_index];
  adj->base = parm;


Am I missing something?  Base is already been set to NULL for new 
parameters.






if (adj->copy_param)
@@ -3417,8 +3418,18 @@ ipa_modify_formal_parameters (tree fndecl, 
ipa_parm_adjustment_vec adjustments,
  tree new_parm;
  tree ptype;




- if (adj->by_ref)
-   ptype = build_pointer_type (adj->type);


Please add gcc_checking_assert (!adj->by_ref || adj->simdlen == 0)
here...


Done.


+ const char *prefix
+   = adj->new_param ? adj->new_arg_prefix : synth_parm_prefix;


Can we perhaps get rid of synth_parm_prefix then and just have
adj->new_arg_prefix?  It's not particularly important but this is
weird.


Done.





+ DECL_NAME (new_parm) = create_tmp_var_name (prefix);
  DECL_ARTIFICIAL (new_parm) = 1;
  DECL_ARG_TYPE (new_parm) = ptype;
  DECL_CONTEXT (new_parm) = fndecl;
@@ -3436,17 +3448,20 @@ ipa_modify_formal_parameters (tree fndecl, 
ipa_parm_adjustment_vec adjustments,
  DECL_IGNORED_P (new_parm) = 1;
  layout_decl (new_parm, 0);

- adj->base = parm;
+ if (adj->new_param)
+   adj->base = new_parm;


Again, shouldn't this be NULL?


This one, yes :).  Done.


diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index 48634d2..8d7d9b9 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -634,9 +634,10 @@ struct ipa_parm_adjustment
   arguments.  */
tree alias_ptr_type;

-  /* The new declaration when creating/replacing a parameter.  Created by
- ipa_modify_formal_parameters, useful for functions modifying the body
- accordingly. */
+  /* The new declaration when creating/replacing a parameter.  Created
+ by ipa_modify_formal_parameters, useful for functions modifying
+ the body accordingly.  For brand new arguments, this is the newly
+ created argument.  */
tree reduction;


We should eventually rename this to new_decl or something, given that
this is not an SRA thing any more.  But that can be done later.


Done.





/* New declaration of a substitute variable that we may use to replace all
@@ -647,15 +648,36 @@ struct ipa_parm_adjustment
   is NULL), this is going to be its nonlocalized vars value.  */
tree nonlocal_value;

+  /* If this is a brand new argument, this holds the prefix to be used
+ for the DECL_NAME.  */
+  const char *new_arg_prefix;
+
/* Offset into the original parameter (for the cases when the new parameter
   is a component of an original one).  */
HOST_WIDE_INT offset;

-  /* Zero based index of the original param

Merge from GCC mainline to gccgo branch

2013-11-08 Thread Ian Lance Taylor
I merged GCC mainline revision 204583 to the gcgo branch.

Ian


Re: [PATCH] Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-11-08 Thread Mike Stump
On Nov 8, 2013, at 4:09 AM, Bernhard Reutner-Fischer  
wrote:
> On 6 November 2013 08:04, Jakub Jelinek  wrote:
>> On Wed, Nov 06, 2013 at 02:24:01AM +, Iyer, Balaji V wrote:
>>>Fixed patch is attached. The responses to your question are given below. 
>>> Is this patch OK?

> May i suggest you rephrase the .exp so it does not line-wrap and is
> actually readable like in attached (untested) 01 or at least 00?

I like the direction irrespective of line length.  :-)

Re: Fix for PR 59039

2013-11-08 Thread H.J. Lu
On Fri, Nov 8, 2013 at 8:48 AM, Iyer, Balaji V  wrote:
> Hello Everyone,
> Attached, please find a patch that will fix a crash in a function in 
> libcilkrts when optimization was turned on. The issue was that the longjump 
> and setjmp were called in the same function.  This patch should fix that 
> issue. The change is not in the compiler side but only in the Cilk Library. 
> If it is OK with everyone, I will check this in this afternoon.
>

It looks good to me.


-- 
H.J.


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Xinliang David Li
in libgcov-driver.c

> /* Flag when the profile has already been dumped via __gcov_dump().  */
> static int gcov_dump_complete;

> inline void
> set_gcov_dump_complete (void)
>{
  > gcov_dump_complete = 1;
>}

>inline void
>reset_gcov_dump_complete (void)
>{
>  gcov_dump_complete = 0;
>}


These should be moved to libgcov-interface.c. Is there reason to not
mark them as static?

in libgcov-profiler.c, line 142

> #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS)
> __thread

trailing whilte space.


> || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee
> && *(void **) cur_func == *(void **) __gcov_indirect_call_callee))

trailing white space.


in libgcov-merge.c

1) I don't think you need in_mem flag. For in memory merge, the source != NULL.
2) document the new source and weight parameter -- especially the weight.
3) How do you deal with integer overflow ?

David






On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu  wrote:
> On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers  
> wrote:
>> On Thu, 7 Nov 2013, Rong Xu wrote:
>>
>>> Thanks Joseph for these detailed comments/suggestions.
>>> The fixed patch is attached to this email.
>>> The only thing left out is the Texinfo manual. Do you mean this tool
>>> should have its it's own texi file in gcc/doc?
>>
>> Its own texi file, probably included as a chapter by gcc.texi (like
>> gcov.texi is).
>
> OK. will add this in.
>
> My last patch that changes the command handling is actually broken.
> Please ignore that patch and use the one in this email.
>
> Also did some minor adjust of the code in libgcov.
>
> Thanks,
>
> -Rong
>
>
>>
>> --
>> Joseph S. Myers
>> jos...@codesourcery.com


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Rong Xu
On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li  wrote:
> in libgcov-driver.c
>
>> /* Flag when the profile has already been dumped via __gcov_dump().  */
>> static int gcov_dump_complete;
>
>> inline void
>> set_gcov_dump_complete (void)
>>{
>   > gcov_dump_complete = 1;
>>}
>
>>inline void
>>reset_gcov_dump_complete (void)
>>{
>>  gcov_dump_complete = 0;
>>}
>
>
> These should be moved to libgcov-interface.c. Is there reason to not
> mark them as static?

gcov_dump_complete is used in gcov_exit() which is in
libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
in libgcov-interface.c.
I want gcov_dump_complete a static. So I have to add to global
functions that accessible from libgcov-interface.c.
Another choice is to move __gcov_dump and __gcov_reset to
libgcov-driver.c and that will simplify the code.

>
> in libgcov-profiler.c, line 142
>
>> #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS)
>> __thread
>
> trailing whilte space.
>

Will fix.

>
>> || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee
>> && *(void **) cur_func == *(void **) __gcov_indirect_call_callee))
>
> trailing white space.
>

Will fix.

>
> in libgcov-merge.c
>
> 1) I don't think you need in_mem flag. For in memory merge, the source != 
> NULL.
> 2) document the new source and weight parameter -- especially the weight.

Will do.

> 3) How do you deal with integer overflow ?

This is good question. gcvo_type is (typically) long long. I thought
the count value will not be nearly close to the max of long long.
(We did see overflow in compiler, but that's because of the scaling --
some of the scaling factors are really large.)

>
> David
>
>
>
>
>
>
> On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu  wrote:
>> On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers  
>> wrote:
>>> On Thu, 7 Nov 2013, Rong Xu wrote:
>>>
 Thanks Joseph for these detailed comments/suggestions.
 The fixed patch is attached to this email.
 The only thing left out is the Texinfo manual. Do you mean this tool
 should have its it's own texi file in gcc/doc?
>>>
>>> Its own texi file, probably included as a chapter by gcc.texi (like
>>> gcov.texi is).
>>
>> OK. will add this in.
>>
>> My last patch that changes the command handling is actually broken.
>> Please ignore that patch and use the one in this email.
>>
>> Also did some minor adjust of the code in libgcov.
>>
>> Thanks,
>>
>> -Rong
>>
>>
>>>
>>> --
>>> Joseph S. Myers
>>> jos...@codesourcery.com


[PATCH] Bug fix for PR59050

2013-11-08 Thread Cong Hou
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59050

This is my bad. I forget to check the test result for gfortran. With
this patch the bug should be fixed (tested on x86-64).


thanks,
Cong


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 90b01f2..e62c672 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2013-11-08  Cong Hou  
+
+   PR tree-optimization/59050
+   * tree-vect-data-refs.c (comp_dr_addr_with_seg_len_pair): Bug fix.
+
 2013-11-07  Cong Hou  

* tree-vect-loop-manip.c (vect_create_cond_for_alias_checks):
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index b2a31b1..b7eb926 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -2669,9 +2669,9 @@ comp_dr_addr_with_seg_len_pair (const void *p1_,
const void *p2_)
   if (comp_res != 0)
return comp_res;
 }
-  if (tree_int_cst_compare (p11.offset, p21.offset) < 0)
+  else if (tree_int_cst_compare (p11.offset, p21.offset) < 0)
 return -1;
-  if (tree_int_cst_compare (p11.offset, p21.offset) > 0)
+  else if (tree_int_cst_compare (p11.offset, p21.offset) > 0)
 return 1;
   if (TREE_CODE (p12.offset) != INTEGER_CST
   || TREE_CODE (p22.offset) != INTEGER_CST)
@@ -2680,9 +2680,9 @@ comp_dr_addr_with_seg_len_pair (const void *p1_,
const void *p2_)
   if (comp_res != 0)
return comp_res;
 }
-  if (tree_int_cst_compare (p12.offset, p22.offset) < 0)
+  else if (tree_int_cst_compare (p12.offset, p22.offset) < 0)
 return -1;
-  if (tree_int_cst_compare (p12.offset, p22.offset) > 0)
+  else if (tree_int_cst_compare (p12.offset, p22.offset) > 0)
 return 1;

   return 0;


Re: [C++ PATCH] Refactor implicit function template implementation and fix 58534, 58536, 58548, 58549 and 58637.

2013-11-08 Thread Jason Merrill

On 10/31/2013 05:47 AM, Adam Butcher wrote:

+ become_template = true;



+ push_deferring_access_checks (dk_deferred);


Why is this call here?  I don't see anything in the rest of the function 
that would trigger an access check, or a matching pop.



+  /* Create a distinct parameter pack type from the current parm and add it
+to the replacement args to tsubst below into the generic function
+parameter.  */
+
+  tree t = copy_type (TREE_TYPE (TREE_VALUE (TREE_VEC_ELT (current, i;
+  TYPE_STUB_DECL (t) = TYPE_NAME (t) = TEMPLATE_TYPE_DECL (t);


Is this changing anything?  I'm not sure if we need to copy the decl or 
if we can just reuse it, but either way we need to set the type of 
TEMPLATE_TYPE_DECL (t) to t.



+  SET_TYPE_STRUCTURAL_EQUALITY (t);


Why not

  TYPE_CANONICAL (t) = canonical_type_parameter (t);

?

Jason



Re: [PATCH] Reducing number of alias checks in vectorization.

2013-11-08 Thread Cong Hou
Thank you for the report. I have submitted a bug fix patch waiting to
be reviewed.



thanks,
Cong


On Fri, Nov 8, 2013 at 5:26 AM, Dominique Dhumieres  wrote:
> According to http://gcc.gnu.org/ml/gcc-regression/2013-11/msg00197.html
> revision 204538 is breaking several tests. On x86_64-apple-darwin* the
> failures I have looked at are of the kind
>
> /opt/gcc/work/gcc/testsuite/gfortran.dg/typebound_operator_9.f03: In function 
> 'nabla2_cart2d':
> /opt/gcc/work/gcc/testsuite/gfortran.dg/typebound_operator_9.f03:272:0: 
> internal compiler error: tree check: expected integer_cst, have plus_expr in 
> tree_int_cst_lt, at tree.c:7083
>function nabla2_cart2d (obj)
>
> TIA
>
> Dominique


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Xinliang David Li
> 1) I don't think you need in_mem flag. For in memory merge, the source != 
> NULL.

discard this comment.

David
>
>
>
>
>
>
> On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu  wrote:
>> On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers  
>> wrote:
>>> On Thu, 7 Nov 2013, Rong Xu wrote:
>>>
 Thanks Joseph for these detailed comments/suggestions.
 The fixed patch is attached to this email.
 The only thing left out is the Texinfo manual. Do you mean this tool
 should have its it's own texi file in gcc/doc?
>>>
>>> Its own texi file, probably included as a chapter by gcc.texi (like
>>> gcov.texi is).
>>
>> OK. will add this in.
>>
>> My last patch that changes the command handling is actually broken.
>> Please ignore that patch and use the one in this email.
>>
>> Also did some minor adjust of the code in libgcov.
>>
>> Thanks,
>>
>> -Rong
>>
>>
>>>
>>> --
>>> Joseph S. Myers
>>> jos...@codesourcery.com


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Rong Xu
A question about inhibit_libc.
When inhibit_libc is defined, we provide dummy functions for all the
__gcov_* methods.
Is it purposely to minimize the footprint?
I'm thinking to allow some codes that are independent of libc (and its
headers) under this. Is it OK?

-Rong

On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu  wrote:
> On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li  wrote:
>> in libgcov-driver.c
>>
>>> /* Flag when the profile has already been dumped via __gcov_dump().  */
>>> static int gcov_dump_complete;
>>
>>> inline void
>>> set_gcov_dump_complete (void)
>>>{
>>   > gcov_dump_complete = 1;
>>>}
>>
>>>inline void
>>>reset_gcov_dump_complete (void)
>>>{
>>>  gcov_dump_complete = 0;
>>>}
>>
>>
>> These should be moved to libgcov-interface.c. Is there reason to not
>> mark them as static?
>
> gcov_dump_complete is used in gcov_exit() which is in
> libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
> in libgcov-interface.c.
> I want gcov_dump_complete a static. So I have to add to global
> functions that accessible from libgcov-interface.c.
> Another choice is to move __gcov_dump and __gcov_reset to
> libgcov-driver.c and that will simplify the code.
>
>>
>> in libgcov-profiler.c, line 142
>>
>>> #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS)
>>> __thread
>>
>> trailing whilte space.
>>
>
> Will fix.
>
>>
>>> || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee
>>> && *(void **) cur_func == *(void **) __gcov_indirect_call_callee))
>>
>> trailing white space.
>>
>
> Will fix.
>
>>
>> in libgcov-merge.c
>>
>> 1) I don't think you need in_mem flag. For in memory merge, the source != 
>> NULL.
>> 2) document the new source and weight parameter -- especially the weight.
>
> Will do.
>
>> 3) How do you deal with integer overflow ?
>
> This is good question. gcvo_type is (typically) long long. I thought
> the count value will not be nearly close to the max of long long.
> (We did see overflow in compiler, but that's because of the scaling --
> some of the scaling factors are really large.)
>
>>
>> David
>>
>>
>>
>>
>>
>>
>> On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu  wrote:
>>> On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers  
>>> wrote:
 On Thu, 7 Nov 2013, Rong Xu wrote:

> Thanks Joseph for these detailed comments/suggestions.
> The fixed patch is attached to this email.
> The only thing left out is the Texinfo manual. Do you mean this tool
> should have its it's own texi file in gcc/doc?

 Its own texi file, probably included as a chapter by gcc.texi (like
 gcov.texi is).
>>>
>>> OK. will add this in.
>>>
>>> My last patch that changes the command handling is actually broken.
>>> Please ignore that patch and use the one in this email.
>>>
>>> Also did some minor adjust of the code in libgcov.
>>>
>>> Thanks,
>>>
>>> -Rong
>>>
>>>

 --
 Joseph S. Myers
 jos...@codesourcery.com


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Xinliang David Li
On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu  wrote:
> On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li  wrote:
>> in libgcov-driver.c
>>
>>> /* Flag when the profile has already been dumped via __gcov_dump().  */
>>> static int gcov_dump_complete;
>>
>>> inline void
>>> set_gcov_dump_complete (void)
>>>{
>>   > gcov_dump_complete = 1;
>>>}
>>
>>>inline void
>>>reset_gcov_dump_complete (void)
>>>{
>>>  gcov_dump_complete = 0;
>>>}
>>
>>
>> These should be moved to libgcov-interface.c. Is there reason to not
>> mark them as static?
>
> gcov_dump_complete is used in gcov_exit() which is in
> libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
> in libgcov-interface.c.
> I want gcov_dump_complete a static. So I have to add to global
> functions that accessible from libgcov-interface.c.
> Another choice is to move __gcov_dump and __gcov_reset to
> libgcov-driver.c and that will simplify the code.
>

Ok  then you should remove the 'inline' keyword for the set and reset
functions and keep them in libgcov-driver.c.

>>
>> in libgcov-profiler.c, line 142
>>
>>> #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS)
>>> __thread
>>
>> trailing whilte space.
>>
>
> Will fix.
>
>>
>>> || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee
>>> && *(void **) cur_func == *(void **) __gcov_indirect_call_callee))
>>
>> trailing white space.
>>
>
> Will fix.
>
>>
>> in libgcov-merge.c
>>
>> 1) I don't think you need in_mem flag. For in memory merge, the source != 
>> NULL.
>> 2) document the new source and weight parameter -- especially the weight.
>
> Will do.
>
>> 3) How do you deal with integer overflow ?
>
> This is good question. gcvo_type is (typically) long long. I thought
> the count value will not be nearly close to the max of long long.
> (We did see overflow in compiler, but that's because of the scaling --
> some of the scaling factors are really large.)
>

Why not passing weight as a scaled PERCENT  (as branch prob) for the
source? THis way, the merge tool does not need to do scaling twice.

David



>>
>> David
>>
>>
>>
>>
>>
>>
>> On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu  wrote:
>>> On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers  
>>> wrote:
 On Thu, 7 Nov 2013, Rong Xu wrote:

> Thanks Joseph for these detailed comments/suggestions.
> The fixed patch is attached to this email.
> The only thing left out is the Texinfo manual. Do you mean this tool
> should have its it's own texi file in gcc/doc?

 Its own texi file, probably included as a chapter by gcc.texi (like
 gcov.texi is).
>>>
>>> OK. will add this in.
>>>
>>> My last patch that changes the command handling is actually broken.
>>> Please ignore that patch and use the one in this email.
>>>
>>> Also did some minor adjust of the code in libgcov.
>>>
>>> Thanks,
>>>
>>> -Rong
>>>
>>>

 --
 Joseph S. Myers
 jos...@codesourcery.com


Re: [PATCH] Add gimple subclasses for every gimple code (was Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3))

2013-11-08 Thread David Malcolm
On Wed, 2013-11-06 at 22:32 -0700, Jeff Law wrote:
> [ Just a note, of this reply is meant for Michael and other parts for 
> David, hopefully the audience is clear from the context. ]
> 
> On 11/06/13 21:56, David Malcolm wrote:
> > On Wed, 2013-11-06 at 16:32 +0100, Michael Matz wrote:
> >> Hi,
> >>
> >> On Tue, 5 Nov 2013, David Malcolm wrote:
> >>
> >>> Here's a followup patch which ensures that every gimple code has its own
> >>> subclass, by adding empty subclasses derived from the GSS_-based
> >>> subclasses as appropriate (I don't bother for gimple codes that already
> >>> have their own subclass due to having their own GSS layout).  I also
> >>> copied the comments from gimple.def into gimple.h, so that Doxygen picks
> >>> up on the descriptions and uses them to describe each subclass.
> >>
> >> I don't like that.  The empty classes are just useless, they imply a
> >> structure that isn't really there, some of the separate gimple codes are
> >> basically selectors of specific subtypes of a generic concept, without
> >> additional data or methods; creating a type for those is confusing.
> >
> > A type system does more than just express memory layouts:
> > * it permits the proof of absence of certain bugs
> Right.  As you have probably surmised, this is the single biggest thing 
> we get from this work in my mind.  We use the type system to ensure a 
> certain class of bugs simply won't get through the compilation phase.
> 
> That's a significant and important change from where we are now.  Right 
> now we have no way of knowing that at compile time.  Instead we rely 
> upon an insane set of macros to check this kind of invariant at run 
> time.  Note carefully just because we don't hit a checking failure 
> doesn't mean the code is safe.  It just means we haven't found a set of 
> preconditions necessary to trip the problem at runtime.  Obviously in 
> some (many), where may be no such way to trigger the failure, but 
> there's no way to prove it.
> 
> Don't get me wrong the checking macros, when they were introduced were a 
> godsend.  But they're papering over a fundamental problems in GCC's 
> internal representations.
> 
> I think it's hard to overestimate the value we get by moving this stuff 
> into compile-time type checking.
> 
> 
> > I can post a followup patch that makes use of each of these, if it will
> > help.
> I wouldn't mind seeing a small example proof of concept posted to help 
> those who don't see where this is going understand the goal.  I would 
> recommend against posting another large patch for inclusion at this time.
Attached is a proof-of-concept patch which uses the
gimple_statement_switch subclass (as a "gimple_switch" typedef).  This
is one of the subclasses that the earlier patch added, which has no new
fields, but which carries the invariant that, if non-NULL,
   gimple_code (gs) == GIMPLE_SWITCH.

The patch adds compile-time type-checking for places where switch
statements are handled.  For example, in tree-vrp.c's switch_update:
 typedef struct {
-  gimple stmt;
+  gimple_switch stmt;
   tree vec;
 } switch_update;

we can capture the fact that the statements have code GIMPLE_SWITCH.

I was able to make 7 of the 9 gimple_switch_* accessors accept a
gimple_switch rather than a gimple, hence adding compile-time typesafety
for these.  We could do all of them, but doing the remaining two would
enlarge the patch (I did the ones where the site of the downcast already
has enclosing braces handy to scope the subclass pointer).

I kept the run-time checking in those accessors, though arguably they're
redundant.

I deliberately used a C style for the downcast from gimple to a more
specialized type, eschewing the dyn_cast<> template, though IMHO the
latter is a better style.   Perhaps a specialized dyncast method to
gimple would be more acceptable e.g.:

struct GTY((etc)) gimple statement_base {
   [...]

   gimple_switch is_switch () const
   {
if (gimple_code (this) == GIMPLE_SWITCH)
   return (gimple_switch)this;
else
   return NULL;
   }
};

allowing us to spell a dynamic cast like this:
   if (gimple_switch switch_stmt = stmt->is_switch ())
 {
/* do typesafe stuff with switch_stmt */
 }


commit b9e2375b4250428cc20e83c9be9be80afbfcb388
Author: David Malcolm 
Date:   Thu Nov 7 21:32:15 2013 -0500

Use gimple_switch in various places

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index fb05ce7..cc2d4d7 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -2221,7 +2221,7 @@ expand_gimple_stmt_1 (gimple stmt)
 case GIMPLE_PREDICT:
   break;
 case GIMPLE_SWITCH:
-  expand_case (stmt);
+  expand_case ((gimple_switch)stmt);
   break;
 case GIMPLE_ASM:
   expand_asm_stmt (stmt);
diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index 5a5cfbb..a72ebb4 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -69,6 +69,9 @@ typedef struct gimple_stmt_iterator_d gimple_stmt_iterator;
 /* FWIW I'd rather s

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Xinliang David Li
On Fri, Nov 8, 2013 at 10:56 AM, Rong Xu  wrote:
> A question about inhibit_libc.
> When inhibit_libc is defined, we provide dummy functions for all the
> __gcov_* methods.
> Is it purposely to minimize the footprint?
> I'm thinking to allow some codes that are independent of libc (and its
> headers) under this. Is it OK?
>


I only saw the macro defined for mcore targets -- not sure if they are
obsolete or not.

David


> -Rong
>
> On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu  wrote:
>> On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li  
>> wrote:
>>> in libgcov-driver.c
>>>
 /* Flag when the profile has already been dumped via __gcov_dump().  */
 static int gcov_dump_complete;
>>>
 inline void
 set_gcov_dump_complete (void)
{
>>>   > gcov_dump_complete = 1;
}
>>>
inline void
reset_gcov_dump_complete (void)
{
  gcov_dump_complete = 0;
}
>>>
>>>
>>> These should be moved to libgcov-interface.c. Is there reason to not
>>> mark them as static?
>>
>> gcov_dump_complete is used in gcov_exit() which is in
>> libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
>> in libgcov-interface.c.
>> I want gcov_dump_complete a static. So I have to add to global
>> functions that accessible from libgcov-interface.c.
>> Another choice is to move __gcov_dump and __gcov_reset to
>> libgcov-driver.c and that will simplify the code.
>>
>>>
>>> in libgcov-profiler.c, line 142
>>>
 #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS)
 __thread
>>>
>>> trailing whilte space.
>>>
>>
>> Will fix.
>>
>>>
 || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee
 && *(void **) cur_func == *(void **) __gcov_indirect_call_callee))
>>>
>>> trailing white space.
>>>
>>
>> Will fix.
>>
>>>
>>> in libgcov-merge.c
>>>
>>> 1) I don't think you need in_mem flag. For in memory merge, the source != 
>>> NULL.
>>> 2) document the new source and weight parameter -- especially the weight.
>>
>> Will do.
>>
>>> 3) How do you deal with integer overflow ?
>>
>> This is good question. gcvo_type is (typically) long long. I thought
>> the count value will not be nearly close to the max of long long.
>> (We did see overflow in compiler, but that's because of the scaling --
>> some of the scaling factors are really large.)
>>
>>>
>>> David
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu  wrote:
 On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers  
 wrote:
> On Thu, 7 Nov 2013, Rong Xu wrote:
>
>> Thanks Joseph for these detailed comments/suggestions.
>> The fixed patch is attached to this email.
>> The only thing left out is the Texinfo manual. Do you mean this tool
>> should have its it's own texi file in gcc/doc?
>
> Its own texi file, probably included as a chapter by gcc.texi (like
> gcov.texi is).

 OK. will add this in.

 My last patch that changes the command handling is actually broken.
 Please ignore that patch and use the one in this email.

 Also did some minor adjust of the code in libgcov.

 Thanks,

 -Rong


>
> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Rong Xu
On Fri, Nov 8, 2013 at 11:02 AM, Xinliang David Li  wrote:
> On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu  wrote:
>> On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li  
>> wrote:
>>> in libgcov-driver.c
>>>
 /* Flag when the profile has already been dumped via __gcov_dump().  */
 static int gcov_dump_complete;
>>>
 inline void
 set_gcov_dump_complete (void)
{
>>>   > gcov_dump_complete = 1;
}
>>>
inline void
reset_gcov_dump_complete (void)
{
  gcov_dump_complete = 0;
}
>>>
>>>
>>> These should be moved to libgcov-interface.c. Is there reason to not
>>> mark them as static?
>>
>> gcov_dump_complete is used in gcov_exit() which is in
>> libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
>> in libgcov-interface.c.
>> I want gcov_dump_complete a static. So I have to add to global
>> functions that accessible from libgcov-interface.c.
>> Another choice is to move __gcov_dump and __gcov_reset to
>> libgcov-driver.c and that will simplify the code.
>>
>
> Ok  then you should remove the 'inline' keyword for the set and reset
> functions and keep them in libgcov-driver.c.

Will remove 'inline' keyword.

>
>>>
>>> in libgcov-profiler.c, line 142
>>>
 #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS)
 __thread
>>>
>>> trailing whilte space.
>>>
>>
>> Will fix.
>>
>>>
 || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee
 && *(void **) cur_func == *(void **) __gcov_indirect_call_callee))
>>>
>>> trailing white space.
>>>
>>
>> Will fix.
>>
>>>
>>> in libgcov-merge.c
>>>
>>> 1) I don't think you need in_mem flag. For in memory merge, the source != 
>>> NULL.
>>> 2) document the new source and weight parameter -- especially the weight.
>>
>> Will do.
>>
>>> 3) How do you deal with integer overflow ?
>>
>> This is good question. gcvo_type is (typically) long long. I thought
>> the count value will not be nearly close to the max of long long.
>> (We did see overflow in compiler, but that's because of the scaling --
>> some of the scaling factors are really large.)
>>
>
> Why not passing weight as a scaled PERCENT  (as branch prob) for the
> source? THis way, the merge tool does not need to do scaling twice.
>

there are two weights, so unless you have two weights in the merge_add
functions, you have to
traverse the list twice. Do you mean pass addition parameters?

Currently, merge tools is done this way:
merge (p1, p2, w1, w2)
first pass: merge_add (p1, p1, w1)
second pass: merge_add (p1, p2, w2)

I initially had both weights in merge_add function. but later I found that
to deal with mis-match profile, I need to find out the gcov_info in p1
but not p2, (they need to multiply by w1 only).
So I choose the above structure.

Another thing about the PERCENT is the inaccuracy for floating points.

I have the scaling function in rewrite sub-command mainly for debug purpose:
I merge the same profile with a weight 1:9.
Then I scale the result profile by 0.1. I was expecting identical
profile as the source. But due to floating point inaccuracy, some
counters are off slightly.


> David
>
>
>
>>>
>>> David
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu  wrote:
 On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers  
 wrote:
> On Thu, 7 Nov 2013, Rong Xu wrote:
>
>> Thanks Joseph for these detailed comments/suggestions.
>> The fixed patch is attached to this email.
>> The only thing left out is the Texinfo manual. Do you mean this tool
>> should have its it's own texi file in gcc/doc?
>
> Its own texi file, probably included as a chapter by gcc.texi (like
> gcov.texi is).

 OK. will add this in.

 My last patch that changes the command handling is actually broken.
 Please ignore that patch and use the one in this email.

 Also did some minor adjust of the code in libgcov.

 Thanks,

 -Rong


>
> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Xinliang David Li
On Fri, Nov 8, 2013 at 11:21 AM, Rong Xu  wrote:
> On Fri, Nov 8, 2013 at 11:02 AM, Xinliang David Li  wrote:
>> On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu  wrote:
>>> On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li  
>>> wrote:
 in libgcov-driver.c

> /* Flag when the profile has already been dumped via __gcov_dump().  */
> static int gcov_dump_complete;

> inline void
> set_gcov_dump_complete (void)
>{
   > gcov_dump_complete = 1;
>}

>inline void
>reset_gcov_dump_complete (void)
>{
>  gcov_dump_complete = 0;
>}


 These should be moved to libgcov-interface.c. Is there reason to not
 mark them as static?
>>>
>>> gcov_dump_complete is used in gcov_exit() which is in
>>> libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
>>> in libgcov-interface.c.
>>> I want gcov_dump_complete a static. So I have to add to global
>>> functions that accessible from libgcov-interface.c.
>>> Another choice is to move __gcov_dump and __gcov_reset to
>>> libgcov-driver.c and that will simplify the code.
>>>
>>
>> Ok  then you should remove the 'inline' keyword for the set and reset
>> functions and keep them in libgcov-driver.c.
>
> Will remove 'inline' keyword.
>
>>

 in libgcov-profiler.c, line 142

> #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS)
> __thread

 trailing whilte space.

>>>
>>> Will fix.
>>>

> || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee
> && *(void **) cur_func == *(void **) __gcov_indirect_call_callee))

 trailing white space.

>>>
>>> Will fix.
>>>

 in libgcov-merge.c

 1) I don't think you need in_mem flag. For in memory merge, the source != 
 NULL.
 2) document the new source and weight parameter -- especially the weight.
>>>
>>> Will do.
>>>
 3) How do you deal with integer overflow ?
>>>
>>> This is good question. gcvo_type is (typically) long long. I thought
>>> the count value will not be nearly close to the max of long long.
>>> (We did see overflow in compiler, but that's because of the scaling --
>>> some of the scaling factors are really large.)
>>>
>>
>> Why not passing weight as a scaled PERCENT  (as branch prob) for the
>> source? THis way, the merge tool does not need to do scaling twice.
>>
>
> there are two weights, so unless you have two weights in the merge_add
> functions, you have to
> traverse the list twice. Do you mean pass addition parameters?
>
> Currently, merge tools is done this way:
> merge (p1, p2, w1, w2)
> first pass: merge_add (p1, p1, w1)
> second pass: merge_add (p1, p2, w2)
>

Only need to pass the normalized the weight (in percent) for one
profile source -- say norm_w2 : w2/(w1+w2), and the merge function can
do this in one pass as norm_w1 = 1-norm_w2.


> I initially had both weights in merge_add function. but later I found that
> to deal with mis-match profile, I need to find out the gcov_info in p1
> but not p2, (they need to multiply by w1 only).
> So I choose the above structure.
>
> Another thing about the PERCENT is the inaccuracy for floating points.

This is how profile update work in profile-use compilation -- so that
should not be a big problem.

Before you fix this, I'd like to hear what Honza and other reviewers think.

thanks,

David

>
> I have the scaling function in rewrite sub-command mainly for debug purpose:
> I merge the same profile with a weight 1:9.
> Then I scale the result profile by 0.1. I was expecting identical
> profile as the source. But due to floating point inaccuracy, some
> counters are off slightly.
>
>
>> David
>>
>>
>>

 David






 On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu  wrote:
> On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers  
> wrote:
>> On Thu, 7 Nov 2013, Rong Xu wrote:
>>
>>> Thanks Joseph for these detailed comments/suggestions.
>>> The fixed patch is attached to this email.
>>> The only thing left out is the Texinfo manual. Do you mean this tool
>>> should have its it's own texi file in gcc/doc?
>>
>> Its own texi file, probably included as a chapter by gcc.texi (like
>> gcov.texi is).
>
> OK. will add this in.
>
> My last patch that changes the command handling is actually broken.
> Please ignore that patch and use the one in this email.
>
> Also did some minor adjust of the code in libgcov.
>
> Thanks,
>
> -Rong
>
>
>>
>> --
>> Joseph S. Myers
>> jos...@codesourcery.com


Re: [rl78] Use canonical const_int for one_cmplqi2

2013-11-08 Thread DJ Delorie

> This patch just changes a QImode (const_int 255) to (const_int -1),
> since the canonical form is to sign-extend.  As things stand the pattern
> trips a new assert added on the wide-int branch.

Ok.  Thanks!


  1   2   >