date:20131107

Re: [PATCH] Factorize the two read_line functions present in gcov.c and input.c

2013-11-07 Thread Jakub Jelinek

On Thu, Nov 07, 2013 at 07:17:19AM +, Nathan Sidwell wrote:
> On 11/06/13 21:25, Dodji Seketeli wrote:
> >It appeared that gcov.c and input.c both have a static function named
> >read_line.  I guess we ought to keep just one.
> 
> As a general rule, that's good.

Generally yes, but IMHO not in this case, both because of the problems you
are mentioning, because it already now needs to do quite different things
and because I think in the future the version for caret diagnostics should
do much more complicated things (caching of the lines for faster further
caret diagnostics).  Furthermore, ftell/fseek aren't something that should
be used unless strictly necessary.

Jakub

Re: [gomp4 1/9] Add missing include.

2013-11-07 Thread Jakub Jelinek

On Wed, Nov 06, 2013 at 08:42:15PM +0100, tho...@codesourcery.com wrote:
> From: Thomas Schwinge 
> 
>   libgomp/
>   * libgomp_g.h: Include  for size_t.

I'm surprised why it is needed, because libgomp.h that includes it includes
stdlib.h that should provide size_t too.  But, including this certainly
doesn't hurt and makes the header more self-contained.  So, ok for
trunk/gomp-4_0-branch.

Jakub

Re: [gomp4 2/9] libgomp: Prepare for testcases without -fopenmp.

2013-11-07 Thread Jakub Jelinek

On Wed, Nov 06, 2013 at 08:42:16PM +0100, tho...@codesourcery.com wrote:
> From: Thomas Schwinge 
> 
>   libgomp/
>   * testsuite/lib/libgomp.exp (libgomp_init): Don't add -fopenmp to
>   ALWAYS_CFLAGS.
>   * testsuite/libgomp.c++/c++.exp (ALWAYS_CFLAGS): Add -fopenmp.
>   * testsuite/libgomp.c/c.exp (ALWAYS_CFLAGS): Likewise.
>   * testsuite/libgomp.fortran/fortran.exp (ALWAYS_CFLAGS): Likewise.
>   * testsuite/libgomp.graphite/graphite.exp (ALWAYS_CFLAGS):
>   Likewise.

Ok for trunk/gomp-4_0-branch.

Jakub

Re: [gomp4 3/9] OpenACC: Recognize -fopenacc.

2013-11-07 Thread Jakub Jelinek

On Wed, Nov 06, 2013 at 08:42:17PM +0100, tho...@codesourcery.com wrote:
> From: Thomas Schwinge 
> 
>   gcc/c-family/
>   * c.opt (fopenacc): New option.
>   gcc/fortran/
>   * lang.opt (fopenacc): New option.
>   * invoke.texi (-fopenacc): Document it.
>   * gfortran.h (gfc_option_t): New member.
>   * options.c (gfc_init_options, gfc_handle_option): Handle it.
>   gcc/testsuite/
>   * lib/target-supports.exp (check_effective_target_fopenacc): New
>   procedure.
>   gcc/
>   * doc/invoke.texi (-fopenacc): Document it.
>   * doc/sourcebuild.texi (fopenacc): Document it.
>   gcc/testsuite/
>   * c-c++-common/cpp/openacc-define-1.c: New file.
>   * c-c++-common/cpp/openacc-define-2.c: Likewise.
>   * c-c++-common/cpp/openacc-define-3.c: Likewise.
>   * gfortran.dg/openacc-define-1.f90: Likewise.
>   * gfortran.dg/openacc-define-2.f90: Likewise.
>   * gfortran.dg/openacc-define-3.f90: Likewise.

Ok for branch.

Jakub

Re: [gomp4 4/9] OpenACC: The runtime library will be implemented in libgomp, too.

2013-11-07 Thread Jakub Jelinek

On Wed, Nov 06, 2013 at 08:42:18PM +0100, tho...@codesourcery.com wrote:
> --- gcc/config/arc/arc.h
> +++ gcc/config/arc/arc.h
> @@ -174,7 +174,7 @@ along with GCC; see the file COPYING3.  If not see
>  %(linker) %l " LINK_PIE_SPEC "%X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} 
> %{r}\
>  %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\
>  %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\
> -%{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
> +%{fopenacc|fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\

Perhaps add here |ftree-parallelize-loops=* too?

>"%{!shared: \
> - %{mt|pthread:%{fopenmp:-lrt} -lpthread} \
> + %{mt|pthread:%{fopenacc|fopenmp:-lrt} -lpthread} \

Likewise.

>   %{p:%{!mlp64:-L/usr/lib/hpux32/libp} \
>%{mlp64:-L/usr/lib/hpux64/libp} -lprof} \
>   %{pg:%{!mlp64:-L/usr/lib/hpux32/libp} \
> diff --git gcc/config/pa/pa-hpux11.h gcc/config/pa/pa-hpux11.h
> index c217398..0676d74 100644
> --- gcc/config/pa/pa-hpux11.h
> +++ gcc/config/pa/pa-hpux11.h
> @@ -120,7 +120,7 @@ along with GCC; see the file COPYING3.  If not see
>  #undef LIB_SPEC
>  #define LIB_SPEC \
>"%{!shared:\
> - %{fopenmp:%{static:-a archive_shared} -lrt %{static:-a archive}}\
> + %{fopenacc|fopenmp:%{static:-a archive_shared} -lrt %{static:-a 
> archive}}\

Ditto (and several times further).

Otherwise LGTM for branch.

Jakub

Re: [gomp4 5/9] OpenACC: preprocessor definition, Fortran integer parameter.

2013-11-07 Thread Jakub Jelinek

On Wed, Nov 06, 2013 at 08:42:19PM +0100, tho...@codesourcery.com wrote:
> From: Thomas Schwinge 
> 
>   gcc/c-family/
>   * c-cppbuiltin.c (c_cpp_builtins): Conditionally define _OPENACC.
>   gcc/fortran/
>   * cpp.c (cpp_define_builtins): Conditionally define _OPENACC.
>   gcc/testsuite/
>   * c-c++-common/cpp/openacc-define-1.c: Test _OPENACC.
>   * c-c++-common/cpp/openacc-define-2.c: Likewise.
>   * c-c++-common/cpp/openacc-define-3.c: Likewise.
>   * gfortran.dg/openacc-define-1.f90: Likewise.
>   * gfortran.dg/openacc-define-2.f90: Likewise.
>   * gfortran.dg/openacc-define-3.f90: Likewise.
>   libgomp/
>   * openacc.f90 (openacc_version): New integer parameter.
>   * openacc_lib.h (openacc_version): Likewise.
>   * testsuite/libgomp.oacc-fortran/openacc_version-1.f: New file.
>   * testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Likewise.

Ok for branch.

Jakub

Re: [gomp4 7/9] OpenACC: Use OpenMP's lowering and expansion passes.

2013-11-07 Thread Jakub Jelinek

On Wed, Nov 06, 2013 at 08:42:21PM +0100, tho...@codesourcery.com wrote:
> From: Thomas Schwinge 
> 
>   gcc/
>   * gimplify.c (gimplify_body): Consider flag_openacc additionally
>   to flag_openmp.
>   * omp-low.c (execute_expand_omp, execute_lower_omp)
>   (gate_diagnose_omp_blocks): Likewise.
>   gcc/testsuite/
>   * gcc.dg/goacc-gomp/goacc-gomp.exp: New file.
>   * gcc.dg/goacc/goacc.exp: Likewise.

Ok for branch.

Jakub

Re: [gomp4 6/9] OpenACC: Infrastructure for builtins.

2013-11-07 Thread Jakub Jelinek

On Wed, Nov 06, 2013 at 08:42:20PM +0100, tho...@codesourcery.com wrote:
> From: Thomas Schwinge 
> 
>   gcc/
>   * oacc-builtins.def: New file.
>   * Makefile.in (BUILTINS_DEF): Add oacc-builtins.def.
>   * builtins.def (DEF_GOACC_BUILTIN): New macro.
>   Include "oacc-builtins.def".
>   gcc/fortran/
>   * f95-lang.c (DEF_GOACC_BUILTIN): New macro.
>   Include "../oacc-builtins.def".
>   libgomp/
>   * libgomp.map (GOACC_2.0): New symbol version.

Ok for branch.

Jakub

Re: [gomp4 8/9] OpenACC: Basic support for #pragma acc in the C front end.

2013-11-07 Thread Jakub Jelinek

On Wed, Nov 06, 2013 at 08:42:22PM +0100, tho...@codesourcery.com wrote:
> From: Thomas Schwinge 
> 
>   gcc/c-family/
>   * c-pragma.c (oacc_pragmas): New array.
>   (c_pp_lookup_pragma, init_pragma): Handle it.
>   gcc/
>   * doc/invoke.texi (-fopenacc): Update.
> 
>   gcc/c/
>   * c-parser.c (c_parser_omp_all_clauses): Make a parser error
>   message suitable for OpenACC, too.
>   gcc/cp/
>   * parser.c (cp_parser_omp_all_clauses): Make a parser error
>   message suitable for OpenACC, too.

Ok for branch.

Jakub

Re: [gomp4 9/9] OpenACC: Basic support for #pragma acc parallel.

2013-11-07 Thread Jakub Jelinek

On Wed, Nov 06, 2013 at 08:53:00PM +0100, Thomas Schwinge wrote:
> Forgot to pass the --patience switch to Git, so the diff algorithm
> decided to first patch the existing expand_omp_taskreg into
> expand_oacc_parallel, and then later re-add expand_omp_taskreg.  Here's a
> more readable version of that patch, avoiding all that back'n'forth:

I've just quickly skimmed this, other than what I've mentioned earlier
I haven't found anything totally wrong, so with that thing fixed I guess
it is ok for the branch, but my quick skimming was far from a proper review.

Jakub

Re: [gomp4] rewrite simd clone argument adjustment to avoid regimplification

2013-11-07 Thread Jakub Jelinek

On Wed, Nov 06, 2013 at 05:37:03PM -0700, Aldy Hernandez wrote:
> Hmmm, good point.  I've moved update_stmt and company to the caller,
> and modified the caller to call regimplify_operands only for
> GIMPLE_RETURN which is the special case.

Can't you (later) handle that without regimplification too?
> 
> Let me know if this is still ok so I can commit.

Ok.

Jakub

Re: [PATCH] Fix PR58941

2013-11-07 Thread Richard Biener

On Wed, 6 Nov 2013, Richard Sandiford wrote:

> Richard Biener  writes:
> > --- 599,615 
> >   
> > exp = TREE_OPERAND (exp, 0);
> >   }
> >   
> > +   /* We need to deal with variable arrays ending structures.  */
> > +   if (seen_variable_array_ref
> > +   && maxsize != -1
> > +   && (!bit_offset.fits_shwi ()
> > + || !host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1)
> > + || (bit_offset.to_shwi () + maxsize
> > + == (signed) TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (exp))
> > + maxsize = -1;
> 
> I realise this was in the old code too, but should it be "HOST_WIDE_INT"
> rather than "signed"?

Hmm, indeed.  I'll fix it.

Thanks,
Richard.

> Ignore me if not.  It just stood out as a bit weird in today's wide-int merge.
> 
> Thanks,
> Richard
> 
> 
> > + 
> > +  done:
> > if (!bit_offset.fits_shwi ())
> >   {
> > *poffset = 0;
> > *** get_ref_base_and_extent (tree exp, HOST_
> > *** 614,637 
> >   
> > hbit_offset = bit_offset.to_shwi ();
> >   
> > -   /* We need to deal with variable arrays ending structures such as
> > -struct { int length; int a[1]; } x;   x.a[d]
> > -struct { struct { int a; int b; } a[1]; } x;  x.a[d].a
> > -struct { struct { int a[1]; } a[1]; } x;  x.a[0][d], x.a[d][0]
> > -struct { int len; union { int a[1]; struct X x; } u; } x; x.u.a[d]
> > -  where we do not know maxsize for variable index accesses to
> > -  the array.  The simplest way to conservatively deal with this
> > -  is to punt in the case that offset + maxsize reaches the
> > -  base type boundary.  This needs to include possible trailing padding
> > -  that is there for alignment purposes.  */
> > - 
> > -   if (seen_variable_array_ref
> > -   && maxsize != -1
> > -   && (!host_integerp (TYPE_SIZE (base_type), 1)
> > - || (hbit_offset + maxsize
> > - == (signed) TREE_INT_CST_LOW (TYPE_SIZE (base_type)
> > - maxsize = -1;
> > - 
> > /* In case of a decl or constant base object we can do better.  */
> >   
> > if (DECL_P (exp))
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend

Re: [PATCH] Improve VRP assert creation for loops

2013-11-07 Thread Richard Biener

On Wed, 6 Nov 2013, Jakub Jelinek wrote:

> On Tue, Nov 05, 2013 at 02:00:16PM -0800, Cong Hou wrote:
> > > I'm also curious -- did this code show up in a particular benchmark, if 
> > > so,
> > > which one?
> > 
> > I didn't find this problem from any benchmark, but from another
> > concern about loop upper bound estimation. Look at the following code:
> > 
> > int foo(unsigned int n, int r)
> > {
> >   int i;
> >   if (n > 0)
> > if (n < 4)
> > {
> >   do {
> >  --n;
> >  r *= 2;
> >   } while (n > 0);
> > }
> >   return r+n;
> > }
> > 
> > 
> > In order to get the upper bound of the loop in this function, GCC
> > traverses conditions n<4 and n>0 separately and tries to get any
> > useful information. But as those two conditions cannot be combined
> > into one due to this issue (note that n>0 will be transformed into
> > n!=0), when GCC sees n<4, it will consider the possibility that n may
> > be equal to 0, in which case the upper bound is UINT_MAX. If those two
> > conditions can be combined into one, which is n-1<=2, then we can get
> > the correct upper bound of the loop.
> 
> I've looked at the above testcase to see why we aren't able to determine
> the number of iterations upper bound properly here.
> 
> That doesn't mean your patch is useless, though I must say I'm far from
> being convinced it is safe ATM and also it looks like quite ugly special
> case (trying to undo a VRP optimization but only one single specific case).
> 
> The first problem is VRP, we end up with:
>   :
>   # n_1 = PHI 
>   # r_3 = PHI 
>   # RANGE [0, 4294967295]
>   n_7 = n_1 + 4294967295;
>   # RANGE [-2147483648, 2147483647] NONZERO 0x0fffe
>   r_8 = r_3 * 2;
>   if (n_7 != 0)
> goto ;
>   else
> goto ;
>   
>   :
>   goto ;
> - missing range on n_1, extremely conservative range on n_7.  With the
> attached patch we get instead:
>   :
>   # RANGE [1, 3] NONZERO 0x3
>   # n_1 = PHI 
>   # r_3 = PHI 
>   # RANGE [0, 2] NONZERO 0x3
>   n_7 = n_1 + 4294967295;
>   # RANGE [-2147483648, 2147483647] NONZERO 0x0fffe
>   r_8 = r_3 * 2;
>   if (n_7 != 0)
> goto ;
>   else
> goto ;
> 
>   :
>   goto ;
> 
> The issue is that we use live_on_edge to determine if it is desirable
> to added ASSERT_EXPRs, but as we walk bbs in RPO order, we first enter
> the loop through the bb with exit edge and thus live of the latch isn't
> computed (and, generally the propagation for live ignores dfs back edges
> anyway), and because in the above live_on_edge ((5)->(6), n_7) is false,
> we don't add ASSERT_EXPR that n_7 is != 0 in the latch bb, so during
> iteration we start with n_1 being assumed [1, 3] (that is range of the
> assertion from the preceeding conditions on n_5(D)), but in the next
> iteration widen it to [0, 3] because we think n_7 can be [0, 2] in the
> PHI and thus end up with uselessly wide range because we think the
> subtraction can wrap around.  This patch improves live_on_edge for
> empty latches, by just looking at the PHIs on loop->header from the
> latch -> header edge and noting which SSA_NAMEs are used there.
> 
> I had to add -fno-tree-vrp to 4 unroll_*.c tests, because they disable
> various tree optimizations already and want to test unrolling of loops
> iterating exactly twice, but with this VRP change VRP is smart enough
> to replace the PHI argument on the i IV from
> -  # i_13 = PHI 
> +  # i_13 = PHI <1(3), 0(2)>
> (the loop iterates exactly twice) and RTL unrolling doesn't do the
> tested thing in that case.  But, this should affect only loops that roll
> exactly twice and those realy should be unrolled already far before, so IMHO
> it isn't something to try to optimize better at the RTL level.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2013-11-06  Jakub Jelinek  
> 
>   * tree-vrp.c (live_on_edge): If e->dest is an empty latch
>   of some loop, but live[e->dest->index] is not computed, look
>   for SSA_NAMEs used in loop header PHIs from the latch edge.
> 
>   * gcc.dg/unroll_1.c: Add -fno-tree-vrp to dg-options.
>   * gcc.dg/unroll_2.c: Likewise.
>   * gcc.dg/unroll_3.c: Likewise.
>   * gcc.dg/unroll_4.c: Likewise.
> 
> --- gcc/tree-vrp.c.jj 2013-11-06 08:48:01.0 +0100
> +++ gcc/tree-vrp.c2013-11-06 09:32:19.205104029 +0100
> @@ -92,6 +92,42 @@ static sbitmap *live;
>  static bool
>  live_on_edge (edge e, tree name)
>  {
> +  if (live[e->dest->index] == NULL)
> +{

I'd rather have live[] computed "correctly" than the following
which I think is a hack ... as you say the issue is ordering
(or rather that there isn't an "order" for CFG cycles unless
we want to iterate).  For loop BB order we explicitely handle
the latch, maybe just using a different order than
RPO order, with special-casing the latch, makes more sense?

Richard.

> +  /* Handle edges to empty latch blocks.  If NAME appears
> +  in loop header phis on edges

Re: [wide-int] Remove SHIFT_COUNT_TRUNCATED uses at tree/gimple level

2013-11-07 Thread Kenneth Zadeck

I very strongly disagree with this.  The standard needs to be high than "does 
it pass the test suite."

What we are introducing is a case where the program will behave one way with 
optimization and another way without it.  While, this is always true for timing 
dependent code, it is pretty rare for math.   We should always tread carefully 
when doing that, and the excuse that it is "cleaner" does not, in my mind, fit 
the bill.



On Nov 7, 2013, at 2:08 AM, Richard Sandiford  
wrote:

> Kenneth Zadeck  writes:
>> So what is the big plan here? if you remove it here and then do not 
>> do it in wide int, then it is not going to be truncated.
> 
> Yeah, that is the big plan for trees.  Mainline doesn't truncate at the
> tree level after:
> 
>http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00285.html
> 
> and this patch brings wide-int back in line.  (The truncations in this patch
> were only ever local to wide-int.)
> 
> If you're wondering why we don't want to truncate at the tree level,
> see the second half of:
> 
>http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00174.html
> 
> Thanks,
> Richard
> 
>> On 11/06/2013 05:10 PM, Richard Sandiford wrote:
>>> Following the removal of SHIFT_COUNT_TRUNCATED from double-int, this patch
>>> reverts the changed I'd made to mimic the old behaviour on wide-int.
>>> 
>>> Tested on powerpc64-linux-gnu and by assembly comparison on a range of 
>>> targets.
>>> OK to install?
>>> 
>>> Thanks,
>>> Richard
>>> 
>>> 
>>> Index: gcc/fold-const.c
>>> ===
>>> --- gcc/fold-const.c2013-11-05 13:06:56.985255941 +
>>> +++ gcc/fold-const.c2013-11-05 13:12:28.805655903 +
>>> @@ -1007,13 +1007,9 @@ int_const_binop_1 (enum tree_code code,
>>>  /* It's unclear from the C standard whether shifts can overflow.
>>> The following code ignores overflow; perhaps a C standard
>>> interpretation ruling is needed.  */
>>> -res = wi::rshift (arg1, arg2, sign,
>>> -  SHIFT_COUNT_TRUNCATED
>>> -  ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0);
>>> +res = wi::rshift (arg1, arg2, sign);
>>>else
>>> -res = wi::lshift (arg1, arg2,
>>> -  SHIFT_COUNT_TRUNCATED
>>> -  ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0);
>>> +res = wi::lshift (arg1, arg2);
>>>break;
>>> 
>>>  case RROTATE_EXPR:
>>> Index: gcc/tree-ssa-ccp.c
>>> ===
>>> --- gcc/tree-ssa-ccp.c2013-11-05 13:07:25.659474362 +
>>> +++ gcc/tree-ssa-ccp.c2013-11-05 13:12:28.806655910 +
>>> @@ -1272,20 +1272,15 @@ bit_value_binop_1 (enum tree_code code,
>>>else
>>>  code = RSHIFT_EXPR;
>>>  }
>>> -  int shift_precision = SHIFT_COUNT_TRUNCATED ? width : 0;
>>>if (code == RSHIFT_EXPR)
>>>  {
>>> -  *mask = wi::rshift (wi::ext (r1mask, width, sgn),
>>> -  shift, sgn, shift_precision);
>>> -  *val = wi::rshift (wi::ext (r1val, width, sgn),
>>> - shift, sgn, shift_precision);
>>> +  *mask = wi::rshift (wi::ext (r1mask, width, sgn), shift, sgn);
>>> +  *val = wi::rshift (wi::ext (r1val, width, sgn), shift, sgn);
>>>  }
>>>else
>>>  {
>>> -  *mask = wi::ext (wi::lshift (r1mask, shift, shift_precision),
>>> -   width, sgn);
>>> -  *val = wi::ext (wi::lshift (r1val, shift, shift_precision),
>>> -  width, sgn);
>>> +  *mask = wi::ext (wi::lshift (r1mask, shift), width, sgn);
>>> +  *val = wi::ext (wi::lshift (r1val, shift), width, sgn);
>>>  }
>>>  }
>>>  }

Re: [wide-int] Remove SHIFT_COUNT_TRUNCATED uses at tree/gimple level

2013-11-07 Thread Richard Biener

On Thu, 7 Nov 2013, Kenneth Zadeck wrote:

> I very strongly disagree with this.  The standard needs to be high than "does 
> it pass the test suite."
> 
> What we are introducing is a case where the program will behave one way 
> with optimization and another way without it.  While, this is always 
> true for timing dependent code, it is pretty rare for math.  We should 
> always tread carefully when doing that, and the excuse that it is 
> "cleaner" does not, in my mind, fit the bill.

It happens always for code with undefined behavior.  See the gazillions
of INVALID bugreports that complain that -O0 produces different results
than -On.

Patch is ok.

Thanks,
Richard. 

> On Nov 7, 2013, at 2:08 AM, Richard Sandiford  
> wrote:
> 
> > Kenneth Zadeck  writes:
> >> So what is the big plan here? if you remove it here and then do not 
> >> do it in wide int, then it is not going to be truncated.
> > 
> > Yeah, that is the big plan for trees.  Mainline doesn't truncate at the
> > tree level after:
> > 
> >http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00285.html
> > 
> > and this patch brings wide-int back in line.  (The truncations in this patch
> > were only ever local to wide-int.)
> > 
> > If you're wondering why we don't want to truncate at the tree level,
> > see the second half of:
> > 
> >http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00174.html
> > 
> > Thanks,
> > Richard
> > 
> >> On 11/06/2013 05:10 PM, Richard Sandiford wrote:
> >>> Following the removal of SHIFT_COUNT_TRUNCATED from double-int, this patch
> >>> reverts the changed I'd made to mimic the old behaviour on wide-int.
> >>> 
> >>> Tested on powerpc64-linux-gnu and by assembly comparison on a range of 
> >>> targets.
> >>> OK to install?
> >>> 
> >>> Thanks,
> >>> Richard
> >>> 
> >>> 
> >>> Index: gcc/fold-const.c
> >>> ===
> >>> --- gcc/fold-const.c2013-11-05 13:06:56.985255941 +
> >>> +++ gcc/fold-const.c2013-11-05 13:12:28.805655903 +
> >>> @@ -1007,13 +1007,9 @@ int_const_binop_1 (enum tree_code code,
> >>>  /* It's unclear from the C standard whether shifts can overflow.
> >>> The following code ignores overflow; perhaps a C standard
> >>> interpretation ruling is needed.  */
> >>> -res = wi::rshift (arg1, arg2, sign,
> >>> -  SHIFT_COUNT_TRUNCATED
> >>> -  ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0);
> >>> +res = wi::rshift (arg1, arg2, sign);
> >>>else
> >>> -res = wi::lshift (arg1, arg2,
> >>> -  SHIFT_COUNT_TRUNCATED
> >>> -  ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0);
> >>> +res = wi::lshift (arg1, arg2);
> >>>break;
> >>> 
> >>>  case RROTATE_EXPR:
> >>> Index: gcc/tree-ssa-ccp.c
> >>> ===
> >>> --- gcc/tree-ssa-ccp.c2013-11-05 13:07:25.659474362 +
> >>> +++ gcc/tree-ssa-ccp.c2013-11-05 13:12:28.806655910 +
> >>> @@ -1272,20 +1272,15 @@ bit_value_binop_1 (enum tree_code code,
> >>>else
> >>>  code = RSHIFT_EXPR;
> >>>  }
> >>> -  int shift_precision = SHIFT_COUNT_TRUNCATED ? width : 0;
> >>>if (code == RSHIFT_EXPR)
> >>>  {
> >>> -  *mask = wi::rshift (wi::ext (r1mask, width, sgn),
> >>> -  shift, sgn, shift_precision);
> >>> -  *val = wi::rshift (wi::ext (r1val, width, sgn),
> >>> - shift, sgn, shift_precision);
> >>> +  *mask = wi::rshift (wi::ext (r1mask, width, sgn), shift, sgn);
> >>> +  *val = wi::rshift (wi::ext (r1val, width, sgn), shift, sgn);
> >>>  }
> >>>else
> >>>  {
> >>> -  *mask = wi::ext (wi::lshift (r1mask, shift, shift_precision),
> >>> -   width, sgn);
> >>> -  *val = wi::ext (wi::lshift (r1val, shift, shift_precision),
> >>> -  width, sgn);
> >>> +  *mask = wi::ext (wi::lshift (r1mask, shift), width, sgn);
> >>> +  *val = wi::ext (wi::lshift (r1val, shift), width, sgn);
> >>>  }
> >>>  }
> >>>  }
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend

Re: [PATCH] Improve VRP assert creation for loops

2013-11-07 Thread Jakub Jelinek

On Thu, Nov 07, 2013 at 09:48:46AM +0100, Richard Biener wrote:
> > --- gcc/tree-vrp.c.jj   2013-11-06 08:48:01.0 +0100
> > +++ gcc/tree-vrp.c  2013-11-06 09:32:19.205104029 +0100
> > @@ -92,6 +92,42 @@ static sbitmap *live;
> >  static bool
> >  live_on_edge (edge e, tree name)
> >  {
> > +  if (live[e->dest->index] == NULL)
> > +{
> 
> I'd rather have live[] computed "correctly" than the following
> which I think is a hack ... as you say the issue is ordering
> (or rather that there isn't an "order" for CFG cycles unless
> we want to iterate).  For loop BB order we explicitely handle
> the latch, maybe just using a different order than
> RPO order, with special-casing the latch, makes more sense?

But is there any order that would help?  Sure, we could say just tweak
the rpo/bb_rpo arrays and put there latch bbs before any other bbs from the
same loop, but that would only help us in calling find_assert_locations_1
on the latch before other bbs in the loop.  But that call does nothing,
and we'd need to special case handling of the live for the latch block
anyway, just not in live_on_edge, but in find_assert_locations instead.

Is that what what you are looking for?

Because without that special casing of handling latch edges, you'd need to
process the header rather than latch first, and propagate through DFS_BACK
edges, but then you couldn't propagate through other edges, nor in this
testcase would be the latch live computed when processing the header which
has there the condition.

Jakub

Re: [PATCH] Improve VRP assert creation for loops

2013-11-07 Thread Richard Biener

On Thu, 7 Nov 2013, Jakub Jelinek wrote:

> On Thu, Nov 07, 2013 at 09:48:46AM +0100, Richard Biener wrote:
> > > --- gcc/tree-vrp.c.jj 2013-11-06 08:48:01.0 +0100
> > > +++ gcc/tree-vrp.c2013-11-06 09:32:19.205104029 +0100
> > > @@ -92,6 +92,42 @@ static sbitmap *live;
> > >  static bool
> > >  live_on_edge (edge e, tree name)
> > >  {
> > > +  if (live[e->dest->index] == NULL)
> > > +{
> > 
> > I'd rather have live[] computed "correctly" than the following
> > which I think is a hack ... as you say the issue is ordering
> > (or rather that there isn't an "order" for CFG cycles unless
> > we want to iterate).  For loop BB order we explicitely handle
> > the latch, maybe just using a different order than
> > RPO order, with special-casing the latch, makes more sense?
> 
> But is there any order that would help?  Sure, we could say just tweak
> the rpo/bb_rpo arrays and put there latch bbs before any other bbs from the
> same loop, but that would only help us in calling find_assert_locations_1
> on the latch before other bbs in the loop.  But that call does nothing,
> and we'd need to special case handling of the live for the latch block
> anyway, just not in live_on_edge, but in find_assert_locations instead.
> 
> Is that what what you are looking for?
> 
> Because without that special casing of handling latch edges, you'd need to
> process the header rather than latch first, and propagate through DFS_BACK
> edges, but then you couldn't propagate through other edges, nor in this
> testcase would be the latch live computed when processing the header which
> has there the condition.

I'm looking for adjusting of live compute - either by adjusting the
algorithm or by special casing the latches.  Like for example
with the following (untested, needs cleanup - the PHI processing
can be factored out from find_assert_locations_1 and re-used):

@@ -5904,6 +5898,27 @@ find_assert_locations (void)
   for (i = 0; i < rpo_cnt; ++i)
 bb_rpo[rpo[i]] = i;
 
+  /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
+ the order we compute liveness and insert asserts we otherwise
+ fail to insert asserts into the loop latch.  */
+  loop_p loop;
+  loop_iterator li;
+  FOR_EACH_LOOP (li, loop, 0)
+{
+  i = loop->latch->index;
+  live[i] = sbitmap_alloc (num_ssa_names);
+  bitmap_clear (live[i]);
+  for (gimple_stmt_iterator gsi = gsi_start_phis (loop->header);
+  !gsi_end_p (gsi); gsi_next (&gsi))
+   {
+ gimple phi = gsi_stmt (gsi);
+ if (virtual_operand_p (gimple_phi_result (phi)))
+   continue;
+ for (unsigned j = 0; j < gimple_phi_num_args (phi); ++j)
+   if (TREE_CODE (gimple_phi_arg_def (phi, j)) == SSA_NAME)
+ bitmap_set_bit (live[i], SSA_NAME_VERSION 
(gimple_phi_arg_def (phi, j)));
+   }
+}

Re: [PATCH] Factorize the two read_line functions present in gcov.c and input.c

2013-11-07 Thread Dodji Seketeli

I was a bit too hasty in committing this patch, thinking the review was
done.  I have reverted it yesterday night.

Let's try again after a night of sleep.

Bernd Edlinger  writes:

> I still do not see how this is supposed to work:
>
> If the previous invocation of get_line already had read some
> characters of the following line(s), how is that information
> recovered?

It's lost, you are right.  I felt under my radar that the file position
indicator adjustment that should be done in get_line after it finds a
line delimiter felt through the cracks in the transformation made from
the getline of glibc.  Hopefully in this version, the file position
indicator should be properly set when get_line finishes.

So am proposing two patches that I think should be committed in pair.

The first one is the one we've been discussing lately; it's amended to
make get_line remember update the file position indicator.

The second one is to factorize the two read_line (from gcov.c and
input.c) and make gcov.c just use one from input.c.

Nathan Sidwell  writes:

> How is input.o linked into gcov?

I didn't changed the way input.o is linked into gcov.  I just launched
the debugger on some of the regression tests of gcov in the suite, to
see that the read_line from input.c was being picked up.  But now that
you are asking, here is what I see from gcc/Makefile.in:

input.o is part of OBJS-libcommon:

OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o 
input.o version.o

$(OBJS-libcommon) ends up in libcommon.a which itself is part of LIBS
(and LIBDEPS)

And the gcov executable is linked with LIBS as per:

GCOV_OBJS = gcov.o
gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) $(LIBS) -o $@

> The former's a compiler proper source file. Furthermore, if it is
> linked in, doesn't it drag in a whole bunch of the compiler itself
> into gcov?

>From the above, what I can say is that input.o was already linked with
gcov.  But I think it's minimal enough to only drag libcpp and the
diagnostic subsystem.

> I think that means read_line needs to be broken out into a separate
> source file -- or at least one that's already common between gcov
> and cc1 (etc).

Hopefully this concern should be addressed by the above.

> Changelog doesn't match diff. (function names wrong, no mention of gcov)

Woops, sorry about that.  The second patch attached is an update that
should address that.  And with the changes in the first patch, this
new factorization patch is smaller.

Jakub Jelinek  writes:

> On Thu, Nov 07, 2013 at 07:17:19AM +, Nathan Sidwell wrote:
>> On 11/06/13 21:25, Dodji Seketeli wrote:
>> >It appeared that gcov.c and input.c both have a static function named
>> >read_line.  I guess we ought to keep just one.
>> 
>> As a general rule, that's good.
>
> Generally yes, but IMHO not in this case, both because of the problems you
> are mentioning, because it already now needs to do quite different things
> and because I think in the future the version for caret diagnostics should
> do much more complicated things (caching of the lines for faster further
> caret diagnostics).  Furthermore, ftell/fseek aren't something that should
> be used unless strictly necessary.

OK, let's say I am sending these patches here just for the record so
that it doesn't get lost :-)  I'll devise another scheme right away to
avoid the ftell/seek that you don't like.

I agree that, as you proposed at the beginning of the initial thread,
the caret diagnostics system should avoid opening/closing one file just
to cycle over it to find one line, repeatedly.  I just thought that
would come as a later patch after we fix this PR specifically.

Cheers.

>From 75cea3f6883f192e9472fd7905f42d84d875e4e7 Mon Sep 17 00:00:00 2001
From: dodji 
Date: Wed, 6 Nov 2013 11:33:52 +
Subject: [PATCH 1/2] PR preprocessor/58580 - preprocessor goes OOM with
 warning for zero literals

gcc/ChangeLog:

	* input.h (location_get_source_line): Take an additional line_size
	parameter.
	* input.c (get_line): New static function definition.
	(read_line): Take an additional line_length output parameter to be
	set to the size of the line.  Use the new get_line function do the
	actual line reading.
	(location_get_source_line): Take an additional output line_len
	parameter.  Update the use of read_line to pass it the line_len
	parameter.
	* diagnostic.c (adjust_line): Take an additional input parameter
	for the length of the line, rather than calculating it with
	strlen.
	(diagnostic_show_locus): Adjust the use of
	location_get_source_line and adjust_line with respect to their new
	signature.  While displaying a line now, do not stop at the first
	null byte.  Rather, display the zero byte as a space and keep
	going until we reach the size of the line.

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@204453 138bc75d-0

Re: patch to fix PR58784 (ARM LRA crash)

2013-11-07 Thread Yvan Roux

Hi,

I've tested LRA on ARM in several configuration and it occurs that
only one regression is observed in Fortran for targets A15 and A9 in
hard and soft fp amd Thumb mode enabled by default (ARMv5 is clean).
The failing test case is gfortran.dg/realloc_on_assign_11.f90 in -O3
and the error seems to be of the same nature of
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58784, in the sens that
LRA ICEs in check_rtl on an insn which doesn't satisfy its
constraints.  Here is the insn :

(insn 633 543 656 24 (parallel [
(set (mem/c:SI (plus:SI (reg/f:SI 907)
(const_int 8 [0x8])) [4 A.54+8 S4 A64])
(smax:SI (reg:SI 562 [ D.5235 ])
(reg:SI 621 [ D.5235 ])))
(clobber (reg:CC 100 cc))
]).../gfortran.dg/realloc_on_assign_11.f90:29 121 {*store_minmaxsi}
 (expr_list:REG_DEAD (reg:SI 621 [ D.5235 ])
(expr_list:REG_UNUSED (reg:CC 100 cc)
(nil

Thanks,
Yvan

On 31 October 2013 17:31, Yvan Roux  wrote:
> (this time in plain text !)
>
>> Does this mean we can now turn on LRA for ARM state by default and
>> start looking at performance issues if any ?
>
> With the other patch for 58785 and a small one I've to post, we are even
> about to turn LRA on by default completely, as the compiler now bootstrap
> also in thumb and first testsuite are clean :-)
>
> I juste want ton pass a full validation before.
>
> Yvan
>
>
>>> Ramana
>>>
>>>
>>>
>>>
>>>
>>> > 2013-10-30  Vladimir Makarov  
>>> >
>>> > PR target/58784
>>> > * lra.c (check_rtl): Remove address check before LRA work.
>>> >
>>> > 2013-10-30  Vladimir Makarov  
>>> >
>>> > PR target/58784
>>> > * gcc.target/arm/pr58784.c: New.
>>> >

Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-07 Thread Joey Ye

-  warning_at (DECL_SOURCE_LOCATION (p),
-   OPT_Wunused_but_set_variable,
-   "variable %qD set but not used", p);
+  {
+if (!TREE_THIS_VOLATILE (p))
+  warning_at (DECL_SOURCE_LOCATION (p),
+OPT_Wunused_but_set_variable,
+"variable %qD set but not used", p);
+  }
I'd prefer

else if (DECL_CONTEXT (p) == current_function_decl
&& !TREE_THIS_VOLATILE (p))

which concises the logic and avoids indent change

Thanks,
Joey

On Thu, Nov 7, 2013 at 3:37 PM, Bin.Cheng  wrote:
> On Thu, Nov 7, 2013 at 11:53 AM, Mingjie Xing  wrote:
>> 2013/11/6 Richard Biener :
>>> You miss a testcase.
>>>
>>> Also why should the warning be omitted for unused automatic
>>> volatile variables?  They cannot be used in any way.
>>>
>>> Richard.
>>
>> Thanks.  I've updated the patch with a test case.
>>
>> c/ChangeLog
>>
>> PR 57258
>
> Should be:
>PR c/57258
>
> Thanks,
> bin
>> * c-decl.c (pop_scope): Don't emit unused variable warnings for
>> accessed volatile variables.
>>
>> testsuite/ChangeLog
>>
>> * gcc.dg/Wunused-var-4.c: New test.
>>
>> Mingjie
>
>
>
> --
> Best Regards.

[rl78] Use canonical const_int for one_cmplqi2

2013-11-07 Thread Richard Sandiford

This patch just changes a QImode (const_int 255) to (const_int -1),
since the canonical form is to sign-extend.  As things stand the pattern
trips a new assert added on the wide-int branch.

Tested by building rl78-elf before and after the patch and making sure
that there were no changes in assembly code for gcc.c-torture, gcc.dg
and g++dg.  OK to install?

Thanks,
Richard


gcc/
* config/rl78/rl78-expand.md (one_cmplqi2): Use (const_int -1)
rather than (const_int 255).

Index: gcc/config/rl78/rl78-expand.md
===
--- gcc/config/rl78/rl78-expand.md  2013-10-15 19:24:15.821884272 +0100
+++ gcc/config/rl78/rl78-expand.md  2013-11-07 09:41:50.754226143 +
@@ -177,7 +177,7 @@ (define_expand "xorqi3"
 (define_expand "one_cmplqi2"
   [(set (match_operand:QI 0 "nonimmediate_operand")
(xor:QI (match_operand:QI 1 "general_operand")
-   (const_int 255)))
+   (const_int -1)))
]
   ""
   "if (rl78_force_nonfar_2 (operands, gen_one_cmplqi2))

Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-07 Thread Ilya Enkovich

2013/11/6 Ilya Enkovich :
> 2013/11/5 Jeff Law :
>> On 11/04/13 06:27, Richard Biener wrote:
>>>
>>> On Thu, Oct 31, 2013 at 10:02 AM, Ilya Enkovich 
>>> wrote:

 Hi,

 Here is a patch which hadles the problem with optimization of
 BUILT_IN_CHKP_ARG_BND calls.  Pointer Bounds Checker expects that argument
 of this call is a default SSA_NAME of the PARM_DECL whose bounds we want to
 get.  The problem is in optimizations which may replace arg with it's copy
 or a known value.  This patch suppress such modifications.
>>>
>>>
>>> This doesn't seem like a good fix.  I suppose you require the same on
>>> RTL, that is, have the incoming arg reg coalesced with the use?
>>> In that case better set SSA_NAME_OCCURS_IN_ABNORMAL_PHI.
>>
>> I agree.  This seems like the wrong direction.
>>
>> While I don't like abusing SSA_NAME_OCCURS_IN_ABNORMAL_PHI in this manner,
>> setting that flag should give them precisely the behaviour they want.  That
>> makes me think SSA_NAME_OCCURS_IN_ABNORMAL_PHI is poorly named.  But
>> addressing that is separate from this patch.
>>
>> Ilya, can you convert your code to set SSA_NAME_OCCURS_IN_ABNORMAL_PHI and
>> drop this patch?
>
> OK. I'll try SSA_NAME_OCCURS_IN_ABNORMAL_PHI approach and see if it
> causes any problems.

Seems flag works fine.  All my tests pass.  No need to make additional
changes in verifiers etc.  Fix will be included into instrumentation
pass patch.  This patch is dropped.

Thanks,
Ilya

>
> Thanks,
> Ilya
>
>>
>>
>> Jeff
>>

Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-07 Thread Richard Biener

On Wed, Nov 6, 2013 at 9:56 PM, Andrew MacLeod  wrote:
> On 11/06/2013 06:31 AM, Richard Biener wrote:
>>
>> On Wed, Nov 6, 2013 at 12:02 PM, Bernd Schmidt 
>> wrote:
>>>
>>>
>>> Maybe we need to revisit it? As one of those who were not in favour of
>>> the C++ move, can I ask you guys to step back for a moment and think
>>> about - what do all of these changes buy us, exactly? Imagine the state
>>> at the end, where everything is converted and supposedly the temporary
>>> ugliness is gone, what have we gained over the code as it is now?
>>
>> as_a gains us less runtime checking and more static type checking
>> which is good.
>>
>>> I still think all this effort is misdirected and distracts us from
>>> making changes that improve gcc for its users.
>>
>> That I agree to.  Instead of fixing the less than optimal separation /
>> boundary
>> between frontends and the middle-end, or fixing several other
>> long-standing
>> issues with GCC we spend a lot of time refactoring things to be C++.
>> But that was kind of part of the decision (though I remember that we
>> mainly wanted to convert containters and isolated stuff, not gimple
>> or trees (I bet that'll be next)).
>>
>> Of course I don't see contributors of "changes that improve gcc for its
>> users"
>> now wasting their time with converting code to C++.  That conversion
>> may slow down those people, but only so much.  It'll get more interesting
>> with branch maintainance ...
>>
>>
> What does it buy us?  I won't comment on the merit of other changes, but I
> will reiterate what I'm doing.
>
> 1 - I think the code restructuring  to enable the changes I proposed is
> quite worthwhile.  Our includes are a mess and prototypes locations are all
> over the map. I think this is a worthwhile cleanup that I probably wouldn't
> be doing otherwise.

True.

> 2 - I really believe gimple needs a type system different from front end
> trees, that is my primary motivation.  I'm tired of jumping through hoops to
> do anything slightly different, and I got fed up with it.  With a separate
> type system for gimple, we can rid ourselves of all the stuff that isn't
> related to optimization and codegen...   Could we do this with trees?
> possibly, but if I'm going to go to the effort of converting the front end
> tree types into a new or reduced-type subset, I might as well put that
> effort into something that is more appropriate right from the start.

I'm not sure.  Especially the idea of wrapping everything in a
I'm-not-a-tree-well-but-I-really-still-am scares the hell out of me.
And I put that into the very same basket of things you now complain about.

> 3 - Trees are very polymorphic and as a result, its difficult to tell where
> types are being used.

It's a feature.  Now the type vs. object thing may be something to
change (historically the reason they are entangled is that things
like a vector and a list also were trees - which is a feature).

> By replacing those uses with an actual type, we'll get
> static type checking which is another improvement.   There is only one way
> to move the middle/back end off of types as trees that I can see.  Go
> through the source base and find all the places which use trees as types,
> and replace them with an actual type.  This can't be done all at once, its
> too big a job...  So I propose a plan which allows a new gimple type and
> current trees to coexist.  Once the conversion is complete we can easily
> implement a type system that has the features and traits we want.

Note that on 'gimple' there are no types ... the types are on the objects
'gimple' refers to and those are 'tree's.  So you'd need to change the
'tree's to have TREE_TYPE not be a 'tree'.  Or simply subclass
'tree' so you can distinguish a tree type from a tree non-type statically
(a bit awkward with the current union setup).

But maybe I'm confusing what you call 'type'.

> 4 - I realized that if I am going through the entire code base to find all
> the type locations, its very little extra work to also do the same for decls
> and expressions. Then we'll have static type checking for those as well.
>
> 5 - When done we will be significantly closer to having the ability to
> create a self contained IL that we can strictly define...  That alone seems
> a worthwhile goal to me.  The IL then becomes appropriate for streaming at
> any point in during compilation.  This means any front end or tool can
> easily generate or consume a gimple stream which opens up numerous
> opportunities...  That is pretty hard to enable today.

Have fun with dealing with our legacy frontends ;)  Or didn't you
actually mean "any point" during compilation? ... oh, you talk about
GIMPLE.  So no plans for Frontend ASTs aka GENERIC for our legacy
frontends?

> Im going to do the initial bits on a branch for the first couple of months,
> and during that time convert a number of files over to the new gimple
> interface.  Then we can look at it, measure it, comment on it, tweak it,
> whate

Re: PATCH: PR target/59034: FAIL gcc.c-torture/compile/20031208-1.c

2013-11-07 Thread H.J. Lu

On Wed, Nov 6, 2013 at 11:42 PM, Uros Bizjak  wrote:
> On Thu, Nov 7, 2013 at 7:19 AM, H.J. Lu  wrote:
>
>> We should use Pmode with stack_pointer_rtx.  OK for trunk and 4.8
>> branch?
>
> OK everywhere.
>

I checked it into trunk and 4.8 branch.  For 4.7, I only checked
in the i386.md change since the testcases use the new
-maddress-mode= in 4.8.

Thanks.

-- 
H.J.

Re: [PATCH, i386]: Fix PR 59021, new vzeroupper instructions generated with -mavx

2013-11-07 Thread Uros Bizjak

On Wed, Nov 6, 2013 at 11:06 PM, Eric Botcazou  wrote:
>> Attached patch fixes PR 59021, where new vzeroupper instructions are
>> generated for -mavx after Vlad's useless insn removal patch.
>>
>> The problem was, that we depent on (useless) moves to AVX256 function
>> argument registers in front of the function call to switch the mode to
>> DIRTY mode. This is not true anymore, so call_insn has to switch to
>> DIRTY mode by itself.
>
> Thanks for fixing this!

OTOH, looking a bit deeper, it looks that there is a problem in
mode-switching infrastructure. If we have a BB without any mode
requirements, but an insn that switched the mode via MODE_AFTER, we
should not mark the block as transparent. Indeed, we even have a N.B.
comment in mode-switching.c:

  /* Check for blocks without ANY mode requirements.
 N.B. because of MODE_AFTER, last_mode might still be different
 from no_mode.  */

But, we do nothing w.r.t to transparency

Attached incremental patch also fixes additional vzeroupper problem.
Calls with AVX argument register in fact do not have any mode
requirements, so we don't need to fake MODE_DIRTY requirement.

I have added Joern to the discussion.

Uros.
Index: mode-switching.c
===
--- mode-switching.c(revision 204496)
+++ mode-switching.c(working copy)
@@ -577,6 +577,8 @@ optimize_mode_switching (void)
{
  ptr = new_seginfo (no_mode, BB_END (bb), bb->index, live_now);
  add_seginfo (info + bb->index, ptr);
+ if (last_mode != no_mode)
+   bitmap_clear_bit (transp[bb->index], j);
}
}
 #if defined (MODE_ENTRY) && defined (MODE_EXIT)
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 204496)
+++ config/i386/i386.c  (working copy)
@@ -15708,7 +15708,7 @@ ix86_avx_u128_mode_needed (rtx insn)
  rtx arg = XEXP (XEXP (link, 0), 0);

  if (ix86_check_avx256_register (&arg, NULL))
-   return AVX_U128_DIRTY;
+   return AVX_U128_ANY;
}
}

Re: Re-factor tree.h - Part 1

2013-11-07 Thread Richard Biener

On Wed, Nov 6, 2013 at 6:10 PM, Diego Novillo  wrote:
> On Wed, Nov 6, 2013 at 2:04 AM, Marc Glisse  wrote:
>> On Tue, 5 Nov 2013, Diego Novillo wrote:
>>
>>> This is the first patch in a series of patches to cleanup tree.h to
>>> reduce the exposure it has all over the compiler.
>>>
>>> In this patch, I'm moving functions that are used once into the files
>>> that use them, and make them private to that file. These functions
>>> were declared extern in tree.h and called from exactly one place.
>>
>>
>> I am not a big fan of doing it so automatically. For instance
>> widest_int_cst_value should imho remain next to int_cst_value since they
>> mostly share the same implementation. Doing this also doesn't promote code
>> reuse: if I am looking for a function that does some basic operation on
>> trees, I won't only need to look in the file that is semantically relevant,
>
> Yeah, that was what I was trying to impose. Functions that
> semantically make better sense in the place where they're called from.
> I filtered functions that were used once but would not make much sense
> to move (for instance, the init functions).
>
> How about this. Below is the list of functions that I took out of
> tree.h.  They are either not used or used only once, in which case I
> moved them (and their private transitive closure) over to the file
> that uses them.  There are more, but these are the ones that I
> originally considered to be too specific to the user file to be
> globally exposed.  Which ones would you folks keep global?
>
> I have marked some that we may decide to keep extern and one, which I
> would not mind to not move at all.  I also marked the functions I
> removed and the ones that I just made private to their definition
> file:
>
> Moved to builtins.c:
> more_const_call_expr_args_p
> expand_stack_restore
> expand_stack_save

Makes sense.

> Moved to cfgexpand.c:
> expand_main_function
> stack_protect_prologue
> expand_asm_stmt
> expand_computed_goto
> expand_goto
> expand_return

Likewise.

> Moved to cgraphclones.c:
> build_function_decl_skip_args

Likewise.

> Moved to explow.c:
> tree_expr_size (maybe keep extern?)

Weird enough function, no need to export it.

> Moved to expr.c:
> fields_length

Likewise.

> Moved to fold-const.c:
> size_low_cst

Same as int_cst_value just with lack of any checking?

> Moved to gimple-fold.c:
> truth_type_for (maybe keep extern?)

Uses a langhook so should stay here.

> Moved to symtab.c:
> decl_assembler_name_hash
> decl_assembler_name_equal

Ok.

> Moved to trans-mem.c:
> is_tm_safe_or_pure

Ok.

> Moved to tree-eh.c:
> in_array_bounds_p
> range_in_array_bounds_p

I'm not sure ...

> Moved to tree-ssa-dom.c:
> iterative_hash_exprs_commutative

I'm not sure - it seems that iterative_hash_expr does exactly
the same thing?  So instead of moving things this should
be refactored (make that function re-usable from iterative_hash_expr).

> Moved to tree-ssa-math-opts.c:
> widest_int_cst_value (maybe keep extern?)

Keep it in tree.c

> Moved to tree-vrp.c:
> fold_unary_to_constant (?)

No.

> Moved to cp/call.c
> ctor_to_vec

No.

> Moved to cp/decl.c:
> supports_one_only
> chain_member

No.

> Moved to java/class.c:
> build_method_type

No.

> Removed (not used anywhere)
> build_type_no_quals
> omp_remove_redundant_declare_simd_attrs
> fold_build3_initializer_loc
> real_twop
> print_vec_tree
> list_equal_p
> ssa_name_nonnegative_p
> addr_expr_of_non_mem_decl_p
> save_vtable_map_decl
>
> Made static (only used in its defining file)
> stabilize_reference_1
> tree_expr_nonzero_p
> tree_invalid_nonnegative_warnv_p
> tree_expr_nonzero_warnv_p
> fold_builtin_snprintf_chk
> validate_arglist
> simple_cst_list_equal
> lookup_scoped_attribute_spec
> get_attribute_namespace
> fini_object_sizes

Those are obvious.

Richard.

>
> Thanks.  Diego.

Re: [wide-int] Make integer_onep return false for signed 1-bit bitfields

2013-11-07 Thread Richard Biener

On Wed, Nov 6, 2013 at 11:31 PM, Richard Sandiford
 wrote:
> Ping.

Ok.

Thanks,
Richard.

> Richard Sandiford  writes:
>> As discussed on gcc@, integer_onep is supposed to return false for
>> a nonzero 1-bit value if the type is signed.
>>
>> Tested on x86_64-linux-gnu and powerpc64-linux-gnu.  OK to install?
>>
>> Thanks,
>> Richard
>>
>>
>> Index: gcc/tree.c
>> ===
>> --- gcc/tree.c2013-10-29 19:19:27.623468618 +
>> +++ gcc/tree.c2013-11-02 17:25:06.499657501 +
>> @@ -2092,7 +2092,7 @@ integer_onep (const_tree expr)
>>switch (TREE_CODE (expr))
>>  {
>>  case INTEGER_CST:
>> -  return wi::eq_p (expr, 1);
>> +  return wi::eq_p (wi::to_widest (expr), 1);
>>  case COMPLEX_CST:
>>return (integer_onep (TREE_REALPART (expr))
>> && integer_zerop (TREE_IMAGPART (expr)));

Re: [patch] Create gimple-expr..[ch] ... was Re: RFC: gimple.[ch] break apart

2013-11-07 Thread Basile Starynkevitch

On Tue, Nov 05, 2013 at 11:26:46AM -0500, Andrew MacLeod wrote:
> 
> I decided to name the new file gimple-expr.[ch] instead of
> gimple-decl   This will eventually split into gimple-type.[ch],
> gimple-decl.[ch], and gimple-expr.[ch].  

Since we are adding *new* C++ files, can't we please name them *.cc 
for the implementation part, so at least create gimple-expr.h and 
gimple-expr.cc but not gimple-expr.c, please!

There are some reasons to keep existing *.c files containing C++ code
(IMHO the reasons are bad ones, and related to poor habits and to 
deficiencies in the version control system we have to use, 
but I really don't want to open that debate again).

But for **NEW** files which are definitely in C++, I don't understand 
why they should be named .c files; this is confusing for all (and, 
for instance, when compiling them with Clang we are getting -IMHO 
rightly- some warnings about the file naming).

If I remember well, there have been (in the discussion about naming C++ source 
files of GCC) a suggestion (and perhaps even a consensus), 
probably by Diego Novillo, to name *.cc our new files which are in C++.

Having old C++ files named *.c is already a big frustration 
(most editors are by default configured to handle them as C, not as C++, files
and most developers, notably newbies to GCC contributions, 
are expecting them to be C files not C++ ones). 

But IMHO having new source files inside GCC coded in C++ with a *.c file 
extension
is non-sense; such files should have a .cc (or maybe .cpp or .cxx) extension, 
not a .c extension.

Regards.
-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***

Re: [PATCH]Fix computation of offset in ivopt

2013-11-07 Thread Richard Biener

On Wed, Nov 6, 2013 at 6:06 PM, Richard Sandiford
 wrote:
> Hi,
>
> "bin.cheng"  writes:
>> Index: gcc/tree-ssa-loop-ivopts.c
>> ===
>> --- gcc/tree-ssa-loop-ivopts.c(revision 203267)
>> +++ gcc/tree-ssa-loop-ivopts.c(working copy)
>> @@ -2037,12 +2037,12 @@ find_interesting_uses (struct ivopts_data *data)
>>
>>  static tree
>>  strip_offset_1 (tree expr, bool inside_addr, bool top_compref,
>> - unsigned HOST_WIDE_INT *offset)
>> + HOST_WIDE_INT *offset)
>>  {
>>tree op0 = NULL_TREE, op1 = NULL_TREE, tmp, step;
>>enum tree_code code;
>>tree type, orig_type = TREE_TYPE (expr);
>> -  unsigned HOST_WIDE_INT off0, off1, st;
>> +  HOST_WIDE_INT off0, off1, st;
>>tree orig_expr = expr;
>>
>>STRIP_NOPS (expr);
>> @@ -2133,19 +2133,32 @@ strip_offset_1 (tree expr, bool inside_addr, bool
>>break;
>>
>>  case COMPONENT_REF:
>> -  if (!inside_addr)
>> - return orig_expr;
>> +  {
>> + tree field;
>>
>> -  tmp = component_ref_field_offset (expr);
>> -  if (top_compref
>> -   && cst_and_fits_in_hwi (tmp))
>> - {
>> -   /* Strip the component reference completely.  */
>> -   op0 = TREE_OPERAND (expr, 0);
>> -   op0 = strip_offset_1 (op0, inside_addr, top_compref, &off0);
>> -   *offset = off0 + int_cst_value (tmp);
>> -   return op0;
>> - }
>> + if (!inside_addr)
>> +   return orig_expr;
>> +
>> + tmp = component_ref_field_offset (expr);
>> + field = TREE_OPERAND (expr, 1);
>> + if (top_compref
>> + && cst_and_fits_in_hwi (tmp)
>> + && cst_and_fits_in_hwi (DECL_FIELD_BIT_OFFSET (field)))
>
> While comparing output for wide-int and mainline, I noticed that
> this condition is now always false on x86_64, since DECL_FIELD_BIT_OFFSET
> is a 128-bit bitsizetype and since cst_and_fits_in_hwi rejects constants
> with precision greater than HWI:
>
>   if (TREE_CODE (x) != INTEGER_CST)
> return false;
>
>   if (TYPE_PRECISION (TREE_TYPE (x)) > HOST_BITS_PER_WIDE_INT)
> return false;

I think that's simply overly restrictive on the side of cst_and_fits_in_hwi.
I suppose this function is meant to complement int_cst_value (digging
in history might be nice here).

> Should this be host_integerp (DECL_FIELD_BIT_OFFSET (field), 0) instead?

That's not the same as it rejects -1U.  The function seems to ask
if the value, if casted to unsigned fits in a HOST_WIDE_INT.

So just drop the precision check from cst_and_fits_in_hwi.

Richard.

>
>> +   {
>> + HOST_WIDE_INT boffset, abs_off;
>> +
>> + /* Strip the component reference completely.  */
>> + op0 = TREE_OPERAND (expr, 0);
>> + op0 = strip_offset_1 (op0, inside_addr, top_compref, &off0);
>> + boffset = int_cst_value (DECL_FIELD_BIT_OFFSET (field));
>> + abs_off = abs_hwi (boffset) / BITS_PER_UNIT;
>> + if (boffset < 0)
>> +   abs_off = -abs_off;
>> +
>> + *offset = off0 + int_cst_value (tmp) + abs_off;
>> + return op0;
>> +   }
>> +  }
>>break;
>
> Thanks,
> Richard

Re: [PATCH] Improve VRP assert creation for loops

2013-11-07 Thread Jakub Jelinek

On Thu, Nov 07, 2013 at 10:32:10AM +0100, Richard Biener wrote:
> I'm looking for adjusting of live compute - either by adjusting the
> algorithm or by special casing the latches.  Like for example
> with the following (untested, needs cleanup - the PHI processing
> can be factored out from find_assert_locations_1 and re-used):
> 
> @@ -5904,6 +5898,27 @@ find_assert_locations (void)
>for (i = 0; i < rpo_cnt; ++i)
>  bb_rpo[rpo[i]] = i;
>  
> +  /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
> + the order we compute liveness and insert asserts we otherwise
> + fail to insert asserts into the loop latch.  */
> +  loop_p loop;
> +  loop_iterator li;
> +  FOR_EACH_LOOP (li, loop, 0)
> +{
> +  i = loop->latch->index;
> +  live[i] = sbitmap_alloc (num_ssa_names);
> +  bitmap_clear (live[i]);
> +  for (gimple_stmt_iterator gsi = gsi_start_phis (loop->header);
> +  !gsi_end_p (gsi); gsi_next (&gsi))
> +   {
> + gimple phi = gsi_stmt (gsi);
> + if (virtual_operand_p (gimple_phi_result (phi)))
> +   continue;
> + for (unsigned j = 0; j < gimple_phi_num_args (phi); ++j)
> +   if (TREE_CODE (gimple_phi_arg_def (phi, j)) == SSA_NAME)
> + bitmap_set_bit (live[i], SSA_NAME_VERSION 
> (gimple_phi_arg_def (phi, j)));
> +   }
> +}

LGTM, except that I think we should only care about phi args from the
latch -> header edge, not others and IMHO it is memory waste to allocate
sbitmap even when we wouldn't add anything.  So, I'm going to
rebootstrap/regtest the following (including new testcase Jeff asked for):

2013-11-07  Richard Biener  
Jakub Jelinek  

* tree-vrp.c (find_assert_locations): Pre-seed live bitmaps for loop
latches from header PHI arguments from the latch edge.

* gcc.dg/unroll_1.c: Add -fno-tree-vrp to dg-options.
* gcc.dg/unroll_2.c: Likewise.
* gcc.dg/unroll_3.c: Likewise.
* gcc.dg/unroll_4.c: Likewise.
* gcc.dg/vrp90.c: New test.

--- gcc/tree-vrp.c.jj   2013-11-06 16:58:04.675678567 +0100
+++ gcc/tree-vrp.c  2013-11-07 10:59:38.841283931 +0100
@@ -5906,6 +5906,34 @@ find_assert_locations (void)
   for (i = 0; i < rpo_cnt; ++i)
 bb_rpo[rpo[i]] = i;
 
+  /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
+ the order we compute liveness and insert asserts we otherwise
+ fail to insert asserts into the loop latch.  */
+  loop_p loop;
+  loop_iterator li;
+  FOR_EACH_LOOP (li, loop, 0)
+{
+  i = loop->latch->index;
+  unsigned int j = find_edge (loop->latch, loop->header)->dest_idx;
+  for (gimple_stmt_iterator gsi = gsi_start_phis (loop->header);
+  !gsi_end_p (gsi); gsi_next (&gsi))
+   {
+ gimple phi = gsi_stmt (gsi);
+ if (virtual_operand_p (gimple_phi_result (phi)))
+   continue;
+ tree arg = gimple_phi_arg_def (phi, j);
+ if (TREE_CODE (arg) == SSA_NAME)
+   {
+ if (live[i] == NULL)
+   {
+ live[i] = sbitmap_alloc (num_ssa_names);
+ bitmap_clear (live[i]);
+   }
+ bitmap_set_bit (live[i], SSA_NAME_VERSION (arg));
+   }
+   }
+}
+
   need_asserts = false;
   for (i = rpo_cnt - 1; i >= 0; --i)
 {
--- gcc/testsuite/gcc.dg/unroll_1.c.jj  2013-09-09 11:32:36.0 +0200
+++ gcc/testsuite/gcc.dg/unroll_1.c 2013-11-06 17:10:52.900722932 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops 
-fdisable-tree-cunroll -fdisable-tree-cunrolli -fenable-rtl-loop2_unroll" } */
+/* { dg-options "-O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops 
-fno-tree-vrp -fdisable-tree-cunroll -fdisable-tree-cunrolli 
-fenable-rtl-loop2_unroll" } */
 
 unsigned a[100], b[100];
 inline void bar()
--- gcc/testsuite/gcc.dg/unroll_2.c.jj  2013-08-30 14:38:39.0 +0200
+++ gcc/testsuite/gcc.dg/unroll_2.c 2013-11-06 17:10:30.751845504 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-rtl-loop2_unroll -fno-peel-loops 
-fdisable-tree-cunroll=foo -fdisable-tree-cunrolli=foo 
-fenable-rtl-loop2_unroll" } */
+/* { dg-options "-O2 -fdump-rtl-loop2_unroll -fno-peel-loops -fno-tree-vrp 
-fdisable-tree-cunroll=foo -fdisable-tree-cunrolli=foo 
-fenable-rtl-loop2_unroll" } */
 
 unsigned a[100], b[100];
 inline void bar()
--- gcc/testsuite/gcc.dg/unroll_3.c.jj  2013-08-30 14:38:39.0 +0200
+++ gcc/testsuite/gcc.dg/unroll_3.c 2013-11-06 17:10:38.864800338 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-rtl-loop2_unroll -fno-peel-loops 
-fdisable-tree-cunroll -fdisable-tree-cunrolli -fenable-rtl-loop2_unroll=foo" } 
*/
+/* { dg-options "-O2 -fdump-rtl-loop2_unroll -fno-peel-loops -fno-tree-vrp 
-fdisable-tree-cunroll -fdisable-tree-cunrolli -fenable-rtl-loop2_unroll=foo" } 
*/
 
 unsigned a[100], b[100];
 inli

Re: [PATCH] Improve VRP assert creation for loops

2013-11-07 Thread Richard Biener

On Thu, 7 Nov 2013, Jakub Jelinek wrote:

> On Thu, Nov 07, 2013 at 10:32:10AM +0100, Richard Biener wrote:
> > I'm looking for adjusting of live compute - either by adjusting the
> > algorithm or by special casing the latches.  Like for example
> > with the following (untested, needs cleanup - the PHI processing
> > can be factored out from find_assert_locations_1 and re-used):
> > 
> > @@ -5904,6 +5898,27 @@ find_assert_locations (void)
> >for (i = 0; i < rpo_cnt; ++i)
> >  bb_rpo[rpo[i]] = i;
> >  
> > +  /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
> > + the order we compute liveness and insert asserts we otherwise
> > + fail to insert asserts into the loop latch.  */
> > +  loop_p loop;
> > +  loop_iterator li;
> > +  FOR_EACH_LOOP (li, loop, 0)
> > +{
> > +  i = loop->latch->index;
> > +  live[i] = sbitmap_alloc (num_ssa_names);
> > +  bitmap_clear (live[i]);
> > +  for (gimple_stmt_iterator gsi = gsi_start_phis (loop->header);
> > +  !gsi_end_p (gsi); gsi_next (&gsi))
> > +   {
> > + gimple phi = gsi_stmt (gsi);
> > + if (virtual_operand_p (gimple_phi_result (phi)))
> > +   continue;
> > + for (unsigned j = 0; j < gimple_phi_num_args (phi); ++j)
> > +   if (TREE_CODE (gimple_phi_arg_def (phi, j)) == SSA_NAME)
> > + bitmap_set_bit (live[i], SSA_NAME_VERSION 
> > (gimple_phi_arg_def (phi, j)));
> > +   }
> > +}
> 
> LGTM, except that I think we should only care about phi args from the
> latch -> header edge, not others and IMHO it is memory waste to allocate
> sbitmap even when we wouldn't add anything.  So, I'm going to
> rebootstrap/regtest the following (including new testcase Jeff asked for):
> 
> 2013-11-07  Richard Biener  
>   Jakub Jelinek  
> 
>   * tree-vrp.c (find_assert_locations): Pre-seed live bitmaps for loop
>   latches from header PHI arguments from the latch edge.
> 
> * gcc.dg/unroll_1.c: Add -fno-tree-vrp to dg-options.
> * gcc.dg/unroll_2.c: Likewise.
> * gcc.dg/unroll_3.c: Likewise.
> * gcc.dg/unroll_4.c: Likewise.
>   * gcc.dg/vrp90.c: New test.
> 
> --- gcc/tree-vrp.c.jj 2013-11-06 16:58:04.675678567 +0100
> +++ gcc/tree-vrp.c2013-11-07 10:59:38.841283931 +0100
> @@ -5906,6 +5906,34 @@ find_assert_locations (void)
>for (i = 0; i < rpo_cnt; ++i)
>  bb_rpo[rpo[i]] = i;
>  
> +  /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
> + the order we compute liveness and insert asserts we otherwise
> + fail to insert asserts into the loop latch.  */
> +  loop_p loop;
> +  loop_iterator li;
> +  FOR_EACH_LOOP (li, loop, 0)
> +{
> +  i = loop->latch->index;
> +  unsigned int j = find_edge (loop->latch, loop->header)->dest_idx;

Actually there is only one edge from latch to header by definition,
so single_succ_edge (loop->latch) would be better?

Otherwise ok.

Thanks,
Richard.

> +  for (gimple_stmt_iterator gsi = gsi_start_phis (loop->header);
> +!gsi_end_p (gsi); gsi_next (&gsi))
> + {
> +   gimple phi = gsi_stmt (gsi);
> +   if (virtual_operand_p (gimple_phi_result (phi)))
> + continue;
> +   tree arg = gimple_phi_arg_def (phi, j);
> +   if (TREE_CODE (arg) == SSA_NAME)
> + {
> +   if (live[i] == NULL)
> + {
> +   live[i] = sbitmap_alloc (num_ssa_names);
> +   bitmap_clear (live[i]);
> + }
> +   bitmap_set_bit (live[i], SSA_NAME_VERSION (arg));
> + }
> + }
> +}
> +
>need_asserts = false;
>for (i = rpo_cnt - 1; i >= 0; --i)
>  {
> --- gcc/testsuite/gcc.dg/unroll_1.c.jj2013-09-09 11:32:36.0 
> +0200
> +++ gcc/testsuite/gcc.dg/unroll_1.c   2013-11-06 17:10:52.900722932 +0100
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops 
> -fdisable-tree-cunroll -fdisable-tree-cunrolli -fenable-rtl-loop2_unroll" } */
> +/* { dg-options "-O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops 
> -fno-tree-vrp -fdisable-tree-cunroll -fdisable-tree-cunrolli 
> -fenable-rtl-loop2_unroll" } */
>  
>  unsigned a[100], b[100];
>  inline void bar()
> --- gcc/testsuite/gcc.dg/unroll_2.c.jj2013-08-30 14:38:39.0 
> +0200
> +++ gcc/testsuite/gcc.dg/unroll_2.c   2013-11-06 17:10:30.751845504 +0100
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-rtl-loop2_unroll -fno-peel-loops 
> -fdisable-tree-cunroll=foo -fdisable-tree-cunrolli=foo 
> -fenable-rtl-loop2_unroll" } */
> +/* { dg-options "-O2 -fdump-rtl-loop2_unroll -fno-peel-loops -fno-tree-vrp 
> -fdisable-tree-cunroll=foo -fdisable-tree-cunrolli=foo 
> -fenable-rtl-loop2_unroll" } */
>  
>  unsigned a[100], b[100];
>  inline void bar()
> --- gcc/testsuite/gcc.dg/unroll_3.c.jj2013-08-30 14:38:39.0 
> +0200
> +++ gcc/testsuite/gcc.dg/unroll_3.c   2013-11-06 17:10:

[PATCH] Fix casting in get_ref_base_and_extent

2013-11-07 Thread Richard Biener


As noticed by Richard.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-11-07  Richard Biener  

* tree-dfa.c (get_ref_base_and_extent): Fix casting.

Index: gcc/tree-dfa.c
===
--- gcc/tree-dfa.c  (revision 204458)
+++ gcc/tree-dfa.c  (working copy)
@@ -569,7 +569,7 @@ get_ref_base_and_extent (tree exp, HOST_
  && (!bit_offset.fits_shwi ()
  || !host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1)
  || (bit_offset.to_shwi () + maxsize
- == (signed) TREE_INT_CST_LOW
+ == (HOST_WIDE_INT) TREE_INT_CST_LOW
(TYPE_SIZE (TREE_TYPE (exp))
maxsize = -1;
 
@@ -606,7 +606,8 @@ get_ref_base_and_extent (tree exp, HOST_
   && (!bit_offset.fits_shwi ()
  || !host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1)
  || (bit_offset.to_shwi () + maxsize
- == (signed) TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (exp))
+ == (HOST_WIDE_INT) TREE_INT_CST_LOW
+  (TYPE_SIZE (TREE_TYPE (exp))
 maxsize = -1;
 
  done:

[PATCH][2/2] Get TREE_OVERFLOW somewhat under control

2013-11-07 Thread Richard Biener


This fixes CCP and DOM to not spread TREE_OVERFLOW eventually set
from fold.  I've chosen to change the lattice updating which is
easiest.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-11-07  Richard Biener  

* tree-ssa-ccp.c (canonicalize_float_value): Rename to ...
(canonicalize_value): ... this.  Also handle stripping of
TREE_OVERFLOW.
(get_value, set_lattice_value, get_value_for_expr): Adjust.
* gimple-fold.c (canonicalize_constructor_val): Strip
TREE_OVERFLOW.
* tree-ssa-threadedge.c (set_ssa_name_value): Likewise.

Index: gcc/tree-ssa-ccp.c
===
--- gcc/tree-ssa-ccp.c  (revision 204504)
+++ gcc/tree-ssa-ccp.c  (working copy)
@@ -168,7 +168,7 @@ typedef struct prop_value_d prop_value_t
 static prop_value_t *const_val;
 static unsigned n_const_val;
 
-static void canonicalize_float_value (prop_value_t *);
+static void canonicalize_value (prop_value_t *);
 static bool ccp_fold_stmt (gimple_stmt_iterator *);
 
 /* Dump constant propagation value VAL to file OUTF prefixed by PREFIX.  */
@@ -326,7 +326,7 @@ get_value (tree var)
   if (val->lattice_val == UNINITIALIZED)
 *val = get_default_value (var);
 
-  canonicalize_float_value (val);
+  canonicalize_value (val);
 
   return val;
 }
@@ -378,17 +378,24 @@ set_value_varying (tree var)
  that HONOR_NANS is false, and we try to change the value of x to 0,
  causing an ICE.  With HONOR_NANS being false, the real appearance of
  NaN would cause undefined behavior, though, so claiming that y (and x)
- are UNDEFINED initially is correct.  */
+ are UNDEFINED initially is correct.
+
+  For other constants, make sure to drop TREE_OVERFLOW.  */
 
 static void
-canonicalize_float_value (prop_value_t *val)
+canonicalize_value (prop_value_t *val)
 {
   enum machine_mode mode;
   tree type;
   REAL_VALUE_TYPE d;
 
-  if (val->lattice_val != CONSTANT
-  || TREE_CODE (val->value) != REAL_CST)
+  if (val->lattice_val != CONSTANT)
+return;
+
+  if (TREE_OVERFLOW_P (val->value))
+val->value = drop_tree_overflow (val->value);
+
+  if (TREE_CODE (val->value) != REAL_CST)
 return;
 
   d = TREE_REAL_CST (val->value);
@@ -454,7 +461,7 @@ set_lattice_value (tree var, prop_value_
   /* We can deal with old UNINITIALIZED values just fine here.  */
   prop_value_t *old_val = &const_val[SSA_NAME_VERSION (var)];
 
-  canonicalize_float_value (&new_val);
+  canonicalize_value (&new_val);
 
   /* We have to be careful to not go up the bitwise lattice
  represented by the mask.
@@ -569,7 +576,7 @@ get_value_for_expr (tree expr, bool for_
   val.lattice_val = CONSTANT;
   val.value = expr;
   val.mask = double_int_zero;
-  canonicalize_float_value (&val);
+  canonicalize_value (&val);
 }
   else if (TREE_CODE (expr) == ADDR_EXPR)
 val = get_value_from_alignment (expr);
Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 204504)
+++ gcc/gimple-fold.c   (working copy)
@@ -205,6 +205,8 @@ canonicalize_constructor_val (tree cval,
cval = fold_convert (TREE_TYPE (orig_cval), cval);
   return cval;
 }
+  if (TREE_OVERFLOW_P (cval))
+return drop_tree_overflow (cval);
   return orig_cval;
 }
 
Index: gcc/tree-ssa-threadedge.c
===
--- gcc/tree-ssa-threadedge.c   (revision 204504)
+++ gcc/tree-ssa-threadedge.c   (working copy)
@@ -58,6 +58,8 @@ set_ssa_name_value (tree name, tree valu
 {
   if (SSA_NAME_VERSION (name) >= ssa_name_values.length ())
 ssa_name_values.safe_grow_cleared (SSA_NAME_VERSION (name) + 1);
+  if (value && TREE_OVERFLOW_P (value))
+value = drop_tree_overflow (value);
   ssa_name_values[SSA_NAME_VERSION (name)] = value;
 }

[PATCH, MPX, 2/X] Pointers Checker. Remove checker language hook.

2013-11-07 Thread Ilya Enkovich

Hi,

Here is a patch to remove language hook used by Pointer Bounds Checker.  To 
disable checker on non C languages option is moved to c.opt.

Thanks,
Ilya
--
gcc/

2013-11-06  Ilya Enkovich  

* common.opt (fcheck-pointer-bounds): Move to ...
* c-family/c.opt: ... here.
* langhooks-def.h (LANG_HOOKS_CHKP_SUPPORTED): Remove.
(LANG_HOOKS_INITIALIZER): Remove LANG_HOOKS_CHKP_SUPPORTED.
* langhooks.h (lang_hooks): Remove chkp_supported field.
* toplev.c (process_options): Remove chkp_supported check.


diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index b862eb9..c082bac 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -850,6 +850,11 @@ fcanonical-system-headers
 C ObjC C++ ObjC++
 Where shorter, use canonicalized paths to systems headers.
 
+fcheck-pointer-bounds
+C ObjC C++ ObjC++ LTO Report Var(flag_check_pointer_bounds)
+Add Pointer Bounds Checker instrumentation.  fchkp-* flags are used to
+control instrumentation.
+
 fcilkplus
 C ObjC C++ ObjC++ LTO Report Var(flag_enable_cilkplus) Init(0)
 Enable Cilk Plus
diff --git a/gcc/common.opt b/gcc/common.opt
index 5c2f56e..deeb3f2 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -874,11 +874,6 @@ fbounds-check
 Common Report Var(flag_bounds_check)
 Generate code to check bounds before indexing arrays
 
-fcheck-pointer-bounds
-Common Report Var(flag_check_pointer_bounds)
-Add Pointer Bounds Checker instrumentation.  fchkp-* flags are used to
-control instrumentation.  Currently available for C, C++ and ObjC.
-
 fbranch-count-reg
 Common Report Var(flag_branch_on_count_reg) Init(1) Optimization
 Replace add, compare, branch with branch on count register
diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
index 0597036..b7be472 100644
--- a/gcc/langhooks-def.h
+++ b/gcc/langhooks-def.h
@@ -118,7 +118,6 @@ extern bool lhd_omp_mappable_type (tree);
 #define LANG_HOOKS_BLOCK_MAY_FALLTHRU  hook_bool_const_tree_true
 #define LANG_HOOKS_EH_USE_CXA_END_CLEANUP  false
 #define LANG_HOOKS_DEEP_UNSHARING  false
-#define LANG_HOOKS_CHKP_SUPPORTED  false
 
 /* Attribute hooks.  */
 #define LANG_HOOKS_ATTRIBUTE_TABLE NULL
@@ -306,8 +305,7 @@ extern void lhd_end_section (void);
   LANG_HOOKS_EH_PROTECT_CLEANUP_ACTIONS, \
   LANG_HOOKS_BLOCK_MAY_FALLTHRU, \
   LANG_HOOKS_EH_USE_CXA_END_CLEANUP, \
-  LANG_HOOKS_DEEP_UNSHARING, \
-  LANG_HOOKS_CHKP_SUPPORTED \
+  LANG_HOOKS_DEEP_UNSHARING \
 }
 
 #endif /* GCC_LANG_HOOKS_DEF_H */
diff --git a/gcc/langhooks.h b/gcc/langhooks.h
index 002d7eb..a83bf7b 100644
--- a/gcc/langhooks.h
+++ b/gcc/langhooks.h
@@ -472,9 +472,6 @@ struct lang_hooks
  gimplification.  */
   bool deep_unsharing;
 
-  /* True if this language allows pointers checker instrumentation.  */
-  bool chkp_supported;
-
   /* Whenever you add entries here, make sure you adjust langhooks-def.h
  and langhooks.c accordingly.  */
 };
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 0eaf081..d3dac07 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1286,9 +1286,6 @@ process_options (void)
 {
   if (targetm.chkp_bound_mode () == VOIDmode)
error ("-fcheck-pointers is not supported for this target");
-
-  if (!lang_hooks.chkp_supported)
-   flag_check_pointer_bounds = 0;
 }
 
   /* One region RA really helps to decrease the code size.  */

[PATCH][AArch64][committed] Add comment on why plus_constant is not used in aarch64_legitimize_reload_address

2013-11-07 Thread Kyrill Tkachov


Hi all,

I've committed this patch as obvious. It adds a comment in 
aarch64_legitimize_reload_address explaining that we need the RTL structure to 
be preserved before push_reload and using plus_constant would fold that, causing 
ICEs.


Thanks,
Kyrill

[gcc/]
2013-11-07  Kyrylo Tkachov  

* config/aarch64/aarch64.c (aarch64_legitimize_reload_address):
Explain why plus_constant is not used.diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ee2cb4c..3fe70c1 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4007,7 +4007,11 @@ aarch64_legitimize_reload_address (rtx *x_p,
 	cst = force_const_mem (xmode, cst);
 
   /* Reload high part into base reg, leaving the low part
-	 in the mem instruction.  */
+	 in the mem instruction.
+	 Note that replacing this gen_rtx_PLUS with plus_constant is
+	 wrong in this case because we rely on the
+	 (plus (plus reg c1) c2) structure being preserved so that
+	 XEXP (*p, 0) in push_reload below uses the correct term.  */
   x = gen_rtx_PLUS (xmode,
 			gen_rtx_PLUS (xmode, XEXP (x, 0), cst),
 			GEN_INT (low));

Re: [PATCH, MPX, 2/X] Pointers Checker [10/25] Calls copy and verification

2013-11-07 Thread Ilya Enkovich

Hi,

Here is an updated patch version.

Thanks,
Ilya
--
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 52d9ab0..9d7ae85 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -3030,40 +3030,54 @@ gimple_check_call_args (gimple stmt, tree fndecl, bool 
args_count_match)
 {
   for (i = 0, p = DECL_ARGUMENTS (fndecl);
   i < nargs;
-  i++, p = DECL_CHAIN (p))
+  i++)
{
- tree arg;
+ tree arg = gimple_call_arg (stmt, i);
+
+ /* Skip bound args inserted by Pointer Bounds Checker.  */
+ if (POINTER_BOUNDS_P (arg))
+   continue;
+
  /* We cannot distinguish a varargs function from the case
 of excess parameters, still deferring the inlining decision
 to the callee is possible.  */
  if (!p)
break;
- arg = gimple_call_arg (stmt, i);
+
  if (p == error_mark_node
  || arg == error_mark_node
  || (!types_compatible_p (DECL_ARG_TYPE (p), TREE_TYPE (arg))
  && !fold_convertible_p (DECL_ARG_TYPE (p), arg)))
 return false;
+
+ p = DECL_CHAIN (p);
}
   if (args_count_match && p)
return false;
 }
   else if (parms)
 {
-  for (i = 0, p = parms; i < nargs; i++, p = TREE_CHAIN (p))
+  for (i = 0, p = parms; i < nargs; i++)
{
- tree arg;
+ tree arg = gimple_call_arg (stmt, i);
+
+ /* Skip bound args inserted by Pointer Bounds Checker.  */
+ if (POINTER_BOUNDS_P (arg))
+   continue;
+
  /* If this is a varargs function defer inlining decision
 to callee.  */
  if (!p)
break;
- arg = gimple_call_arg (stmt, i);
+
  if (TREE_VALUE (p) == error_mark_node
  || arg == error_mark_node
  || TREE_CODE (TREE_VALUE (p)) == VOID_TYPE
  || (!types_compatible_p (TREE_VALUE (p), TREE_TYPE (arg))
  && !fold_convertible_p (TREE_VALUE (p), arg)))
 return false;
+
+ p = TREE_CHAIN (p);
}
 }
   else
diff --git a/gcc/gimple.c b/gcc/gimple.c
index 20f6010..0fc704b 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -3043,20 +3043,27 @@ canonicalize_cond_expr_cond (tree t)
 }
 
 /* Build a GIMPLE_CALL identical to STMT but skipping the arguments in
-   the positions marked by the set ARGS_TO_SKIP.  */
+   the positions marked by the set ARGS_TO_SKIP.  ARGS_TO_SKIP does not
+   take into account bounds arguments.  Bounds passed for skipped args
+   are also skipped.  */
 
 gimple
 gimple_call_copy_skip_args (gimple stmt, bitmap args_to_skip)
 {
-  int i;
+  int i, bit;
   int nargs = gimple_call_num_args (stmt);
   vec vargs;
   vargs.create (nargs);
   gimple new_stmt;
 
-  for (i = 0; i < nargs; i++)
-if (!bitmap_bit_p (args_to_skip, i))
-  vargs.quick_push (gimple_call_arg (stmt, i));
+  for (i = 0, bit = 0; i < nargs; i++, bit++)
+  if (POINTER_BOUNDS_P (gimple_call_arg (stmt, i)))
+   {
+ if (!bitmap_bit_p (args_to_skip, --bit))
+   vargs.quick_push (gimple_call_arg (stmt, i));
+   }
+  else if (!bitmap_bit_p (args_to_skip, bit))
+ vargs.quick_push (gimple_call_arg (stmt, i));
 
   if (gimple_call_internal_p (stmt))
 new_stmt = gimple_build_call_internal_vec (gimple_call_internal_fn (stmt),
@@ -3702,6 +3709,9 @@ validate_call (gimple stmt, tree fndecl)
   if (!targs)
return true;
   tree arg = gimple_call_arg (stmt, i);
+  /* Skip bounds.  */
+  if (POINTER_BOUNDS_P (arg))
+   continue;
   if (INTEGRAL_TYPE_P (TREE_TYPE (arg))
  && INTEGRAL_TYPE_P (TREE_VALUE (targs)))
;

Re: [PATCH, MPX, 2/X] Pointers Checker. Remove checker language hook.

2013-11-07 Thread Richard Biener

On Thu, Nov 7, 2013 at 12:46 PM, Ilya Enkovich  wrote:
> Hi,
>
> Here is a patch to remove language hook used by Pointer Bounds Checker.  To 
> disable checker on non C languages option is moved to c.opt.

Ok.

Thanks,
Richard.

> Thanks,
> Ilya
> --
> gcc/
>
> 2013-11-06  Ilya Enkovich  
>
> * common.opt (fcheck-pointer-bounds): Move to ...
> * c-family/c.opt: ... here.
> * langhooks-def.h (LANG_HOOKS_CHKP_SUPPORTED): Remove.
> (LANG_HOOKS_INITIALIZER): Remove LANG_HOOKS_CHKP_SUPPORTED.
> * langhooks.h (lang_hooks): Remove chkp_supported field.
> * toplev.c (process_options): Remove chkp_supported check.
>
>
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index b862eb9..c082bac 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -850,6 +850,11 @@ fcanonical-system-headers
>  C ObjC C++ ObjC++
>  Where shorter, use canonicalized paths to systems headers.
>
> +fcheck-pointer-bounds
> +C ObjC C++ ObjC++ LTO Report Var(flag_check_pointer_bounds)
> +Add Pointer Bounds Checker instrumentation.  fchkp-* flags are used to
> +control instrumentation.
> +
>  fcilkplus
>  C ObjC C++ ObjC++ LTO Report Var(flag_enable_cilkplus) Init(0)
>  Enable Cilk Plus
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 5c2f56e..deeb3f2 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -874,11 +874,6 @@ fbounds-check
>  Common Report Var(flag_bounds_check)
>  Generate code to check bounds before indexing arrays
>
> -fcheck-pointer-bounds
> -Common Report Var(flag_check_pointer_bounds)
> -Add Pointer Bounds Checker instrumentation.  fchkp-* flags are used to
> -control instrumentation.  Currently available for C, C++ and ObjC.
> -
>  fbranch-count-reg
>  Common Report Var(flag_branch_on_count_reg) Init(1) Optimization
>  Replace add, compare, branch with branch on count register
> diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
> index 0597036..b7be472 100644
> --- a/gcc/langhooks-def.h
> +++ b/gcc/langhooks-def.h
> @@ -118,7 +118,6 @@ extern bool lhd_omp_mappable_type (tree);
>  #define LANG_HOOKS_BLOCK_MAY_FALLTHRU  hook_bool_const_tree_true
>  #define LANG_HOOKS_EH_USE_CXA_END_CLEANUP  false
>  #define LANG_HOOKS_DEEP_UNSHARING  false
> -#define LANG_HOOKS_CHKP_SUPPORTED  false
>
>  /* Attribute hooks.  */
>  #define LANG_HOOKS_ATTRIBUTE_TABLE NULL
> @@ -306,8 +305,7 @@ extern void lhd_end_section (void);
>LANG_HOOKS_EH_PROTECT_CLEANUP_ACTIONS, \
>LANG_HOOKS_BLOCK_MAY_FALLTHRU, \
>LANG_HOOKS_EH_USE_CXA_END_CLEANUP, \
> -  LANG_HOOKS_DEEP_UNSHARING, \
> -  LANG_HOOKS_CHKP_SUPPORTED \
> +  LANG_HOOKS_DEEP_UNSHARING \
>  }
>
>  #endif /* GCC_LANG_HOOKS_DEF_H */
> diff --git a/gcc/langhooks.h b/gcc/langhooks.h
> index 002d7eb..a83bf7b 100644
> --- a/gcc/langhooks.h
> +++ b/gcc/langhooks.h
> @@ -472,9 +472,6 @@ struct lang_hooks
>   gimplification.  */
>bool deep_unsharing;
>
> -  /* True if this language allows pointers checker instrumentation.  */
> -  bool chkp_supported;
> -
>/* Whenever you add entries here, make sure you adjust langhooks-def.h
>   and langhooks.c accordingly.  */
>  };
> diff --git a/gcc/toplev.c b/gcc/toplev.c
> index 0eaf081..d3dac07 100644
> --- a/gcc/toplev.c
> +++ b/gcc/toplev.c
> @@ -1286,9 +1286,6 @@ process_options (void)
>  {
>if (targetm.chkp_bound_mode () == VOIDmode)
> error ("-fcheck-pointers is not supported for this target");
> -
> -  if (!lang_hooks.chkp_supported)
> -   flag_check_pointer_bounds = 0;
>  }
>
>/* One region RA really helps to decrease the code size.  */

RE: [PATCH] Factorize the two read_line functions present in gcov.c and input.c

2013-11-07 Thread Bernd Edlinger

Hi,

just some thoughts...


fgetc will definitely be much faster than fread 16K and fseek back to the end 
of line position.

Note: fgetc is already buffered and not too slow on average, but only if you do 
not
fseek around. In that case the buffered file-stream data is lost.

Well, reading 16K in one piece _could_ be faster than fgetc, but only if you 
read the
data once and do not memcpy or memmove the buffer afterwards.

But if you read 16K starting from 0, then read 16K starting from line 1, and so 
on,
you will be really really slow. Did you ever profile your proposed patch?

The most easy way to do something for the performance would be in
location_get_source_line:

If you remember the last file and position, and keep the file open, just in 
case,
that the next time, either the same or a following line is requested. That 
would be
a really simple optimization.

Maybe that would be too simple? I do not know.

> fseek (fp, prev_pos + cur_len, SEEK_SET);
Furthermore this seek will position in the middle of the \r\n sequence on 
windows.


I am furthermore not sure, how the gcov.c would react on a NUL-char in the 
input stream.
Maybe in a comment, somewhere? Could it be possible that this reaches the 
gcov.c?

In that case the same OOM would occur because the memory size is always 
duplicated
after an incomplete line?

And gcov.c uses printf ("..%s\n") to print a source listing right?

If the line contains NUL-chars, the listing will only contain the part of the 
line
until the NUL-char, which is not what we want?

So in that case it would be better to replace NUL-chars in the line
buffer with SPACEs, to get a printable C string? What do you think?

Isn't a C string always more easy to handle, than this buffer with zero-chars?


Regards
Bernd.

[Commited] Fix for PR59029

2013-11-07 Thread Yury Gribov


Preapproved by Jakub in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59029
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 4991a3a..535b670 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2013-11-07  Yury Gribov  
+	Jakub Jelinek  
+
+	PR sanitizer/59029
+	* gcc/asan.c (get_mem_refs_of_builtin_call): Allow
+	integer literals as addresses in instrumented builtins.
+
 2013-11-07  H.J. Lu  
 
 	PR target/59034
diff --git a/gcc/asan.c b/gcc/asan.c
index fdca377..950d332 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -715,7 +715,7 @@ get_mem_refs_of_builtin_call (const gimple call,
 	   instrument_derefs.  */
 	if (TREE_CODE (dest) == ADDR_EXPR)
 	  dest = TREE_OPERAND (dest, 0);
-	else if (TREE_CODE (dest) == SSA_NAME)
+	else if (TREE_CODE (dest) == SSA_NAME || TREE_CODE (dest) == INTEGER_CST)
 	  dest = build2 (MEM_REF, TREE_TYPE (TREE_TYPE (dest)),
 			 dest, build_int_cst (TREE_TYPE (dest), 0));
 	else
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 56d30a3..f6e735f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2013-11-07  Yury Gribov  
+	Jakub Jelinek  
+
+	PR sanitizer/59029
+	* c-c++-common/asan/pr59029.c: New test.
+
 2013-11-07  H.J. Lu  
 
 	PR target/59034
diff --git a/gcc/testsuite/c-c++-common/asan/pr59029.c b/gcc/testsuite/c-c++-common/asan/pr59029.c
new file mode 100644
index 000..a1319b2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/pr59029.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+
+int
+foo ()
+{
+  return __sync_fetch_and_add ((int *) 0, 1);
+}

Re: [Commited] Fix for PR59029

2013-11-07 Thread Marek Polacek

On Thu, Nov 07, 2013 at 04:32:56PM +0400, Yury Gribov wrote:
> + PR sanitizer/59029
> + * gcc/asan.c (get_mem_refs_of_builtin_call): Allow
> + integer literals as addresses in instrumented builtins.
> +

The prefix gcc/ hasn't been dropped as it should.

Marek

[ia64] [PR target/57491] internal compiler error: in ia64_split_tmode -O2, quadmath

2013-11-07 Thread Kirill Yukhin

Hello,
I've looked into RTL after register allocation. Insn which
lead to assert is:
(insn 77 180 79 4 (set (reg/v:TF 44 loc12 [orig:359 T8 ] [359])
(mem:TF (post_modify:DI (reg/f:DI 14 r14 [435])
(plus:DI (reg/f:DI 14 r14 [435])
(reg:DI 45 loc13 [orig:342 D.3347 ] [342]))) [2 *R1_102+0 
S16 A128])) 1.i:2370 131 {*movtf_intern\
al}

When trying to split it into two DI loads here:
 ia64_split_tmode_move

We check if source and destination RTX are register overlapped,
and if so, set flag that says that load adress is dead after insns:
  if (GET_CODE (operands[1]) == MEM
  && reg_overlap_mentioned_p (operands[0], operands[1]))
...
dead = true;

This flags is asserted (to be unset) later in splitting of memory operand in 
ia64_split_tmode:
  case POST_MODIFY:
gcc_assert (!reversed && !dead);

I think that the insn (although op[0] and op1[1] are overlapped by regs)
is still safe to split. And we might end up with something like this:
  (insn 272 0 0 (set (reg:DI 44 loc12)
  (mem:DI (post_inc:DI (reg/f:DI 14 r14 [435])) [2 *R1_102+0 S8 A128]))
  (insn 273 272 0 (set (reg:DI 45 loc13)
  (mem:DI (post_modify:DI (reg/f:DI 14 r14 [435])
  (plus:DI (reg/f:DI 14 r14 [435])
  (reg:DI 45 loc13 [orig:342 D.3347 ] [342]))) [2 *R1_102+8 
S8 A64]))
  (set (reg/f:DI 14 r14 [435])
  (plus:DI (reg/f:DI 14 r14 [435])
  (const_int -8 [0xfff8])))

As far as I understand, second insn will update loc13 *after* ld is executed,
so value of location in r14 will be correct.

Patch in the bottom.

ChangeLog entry:
  * config/ia64/ia64.c (ia64_split_tmode_move): Relax `dead' flag 
setting.

Testcase builds w/o ICE with the patch.
Bootstrapped.

Could you pls take a look?

--
Thanks, K
---
 gcc/config/ia64/ia64.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index a128b19..446ee59 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -1527,12 +1527,19 @@ ia64_split_tmode_move (rtx operands[])
   && reg_overlap_mentioned_p (operands[0], operands[1]))
 {
   rtx base = XEXP (operands[1], 0);
+  rtx first_write = gen_rtx_REG (DImode, REGNO (operands[0]));
   while (GET_CODE (base) != REG)
base = XEXP (base, 0);

   if (REGNO (base) == REGNO (operands[0]))
-   reversed = true;
-  dead = true;
+   {
+ reversed = true;
+ first_write = gen_rtx_REG (DImode, REGNO (operands[0]) + 1);
+   }
+
+  if (GET_CODE (operands[0]) == REG
+ && reg_overlap_mentioned_p (first_write, operands[1]))
+   dead = true;
 }
   /* Another reason to do the moves in reversed order is if the first
  element of the target register pair is also the second element of

Re: [Commited] Fix for PR59029

2013-11-07 Thread Yury Gribov

Yup, Jakub already pointed this out. Should I fix the changelog with 
another commit?

---
From: Marek Polacek 
Sent:  Thursday, November 07, 2013 4:38PM
To: Yury Gribov 
Cc: gcc-patches@gcc.gnu.org, Jakub Jelinek , 
reich...@gcc.gnu.org, Viacheslav Garbuzov , 
Evgeny Gavrin 

Subject: Re: [Commited] Fix for PR59029
On 11/07/2013 04:38 PM, Marek Polacek wrote:

On Thu, Nov 07, 2013 at 04:32:56PM +0400, Yury Gribov wrote:

+   PR sanitizer/59029
+   * gcc/asan.c (get_mem_refs_of_builtin_call): Allow
+   integer literals as addresses in instrumented builtins.
+

The prefix gcc/ hasn't been dropped as it should.

Marek

Re: [PATCH] Factorize the two read_line functions present in gcov.c and input.c

2013-11-07 Thread Jakub Jelinek

On Thu, Nov 07, 2013 at 01:25:00PM +0100, Bernd Edlinger wrote:
> just some thoughts...
> 
> 
> fgetc will definitely be much faster than fread 16K and fseek back to the end 
> of line position.
> 
> Note: fgetc is already buffered and not too slow on average, but only if you 
> do not
> fseek around. In that case the buffered file-stream data is lost.

fgetc (unless you are talking about fgetc_unlocked which is somewhat better)
is actually very expensive.  Of course, fseek is highly undersirable, but
I've already said that.

Jakub

Re: [Commited] Fix for PR59029

2013-11-07 Thread Jakub Jelinek

On Thu, Nov 07, 2013 at 04:48:10PM +0400, Yury Gribov wrote:
> Yup, Jakub already pointed this out. Should I fix the changelog with
> another commit?

Yes, just fix up the ChangeLog and commit, for ChangeLog changes
of course you don't add a new ChangeLog entry.

Jakub

Re: [Commited] Fix for PR59029

2013-11-07 Thread Yury Gribov


>> Should I fix the changelog with another commit?
> Yes, just fix up the ChangeLog and commit

Fixed.

-Y

RE: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-11-07 Thread Bernd Edlinger

Hi,

On Fri, 25 Oct 2013 12:51:13, Richard Biener wrote:
>
> On Fri, Oct 25, 2013 at 12:02 PM, Bernd Edlinger
>  wrote:
>> Hi,
>>
>>> Eh ... even
>>>
>>> register struct { int i; int a[0]; } asm ("ebx");
>>>
>>> works. Also with int a[1] but not with a[2]. So just handling trailing
>>> arrays makes this case regress as well.
>>>
>>> Now I'd call this use questionable ... but we've likely supported that
>>> for decades and cannot change that now.
>>>
>>> Back to fixing everything in expand.
>>>
>>> Richard.
>>>
>>
>> Ok, finally you asked for it.
>>
>> Here is my previous version of that patch again.
>>
>> I have now added a new value "EXPAND_REFERENCE" to the expand_modifier
>> enumeration. It is almost like EXPAND_MEMORY but it does not interfere with
>> constant values.
>>
>> I have done the same modification to VIEW_CONVERT_EXPR too, because
>> this is a possible inner reference, itself. It is however inherently hard to
>> test around this code.
>>
>> To understand this patch it is good to know what type of object the
>> return value "tem" of get_inner_reference can be.
>>
>> From the program logic at get_inner_reference it is clear that the
>> return value may *not* be BIT_FIELD_REF, COMPONENT_REF, ARRAY_REF,
>> ARRAY_RANGE_REF, REALPART_EXPR, IMAGPART_EXPR. The result may
>> be VIEW_CONVERT_EXPR only on a STRICT_ALIGNMENT target. This is probably
>> further restricted because exp is gimplified.
>>
>> Usually the result will be a MEM_REF or a SSA_NAME of the memory where
>> the structure is to be found.
>>
>> When you look at where EXPAND_MEMORY is handled you see it is special-cased
>> in TARGET_MEM_REF, MEM_REF, ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF,
>> ARRAY_RANGE_REF.
>>
>> At TARGET_MEM_REF, MEM_REF, VIEW_CONVERT_EXPR, it should be the
>> same if EXPAND_MEMORY, EXPAND_WRITE or EXPAND_REFERENCE is given:
>> If it is an unaligned memory, we just return the unaligned reference.
>>
>> This was missing for VIEW_CONVERT_EXPR, and unfortunately I have no test 
>> case,
>> because it is only a problem for STRICT_ALIGNMENT targets, and even there it 
>> will
>> certainly be really hard to find test cases that exercise this code.
>>
>> In ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF
>> we do not have to touch the handling of the outer modifier. However we pass
>> EXPAND_REFERENCE to the inner object, which should not be a recursive
>> use of any ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF.
>>
>> TARGET_MEM_REF, MEM_REF and VIEW_CONVERT_EXPR know how to handle
>> EXPAND_REFERENCE, anything else handles it like EXPAND_NORMAL.
>>
>>
>> Boot-strapped and regression-tested on x86_64-linux-gnu
>> OK for trunk?
>
> You point to a weak spot in expansion - that it handles creating
> the base MEM to offset with handled components by recursing
> into the case that handles bare MEM_REFs. This makes the
> bare MEM_REF handling code somewhat awkward (it's the
> one to assign mem-attrs which are later adjusted for example).
>
> Maybe a better appropach than adding yet another expand
> modifier would be to split out the "base MEM" expansion part
> out of the bare MEM_REF handling code so we can call that
> instead of recursing.
>
> In this light - instead of a new expand modifier don't you want
> an actual flag that specifies we are coming from a call that
> wants to expand a base? That is, allow EXPAND_SUM
> but with the recursion flag set?
>

I think you are right. After some thought, I start to like that idea.

This way we have at least much more flexibility, how to handle the inner
references correctly, and if I change only  the interfaces of 
expand_expr_real/real_1
that will not be used at too many places, either.

> Finally I think the recursion into the VIEW_CONVERT_EXPR case
> is only there because of the keep_aligning flag of get_inner_reference
> which should be obsolete now that we properly handle its effects
> in get_object_alignment. So you wouldn't need to adjust this path
> if we finally can get rid of that.
>

I proposed a patch for that, but did not hear from you:
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02267.html

nevertheless, even if the VIEW_CONVERT_EXPR may or may not be an inner
reference, the code there should be kept in sync with the normal_inner_ref,
and MEM_REF, IMHO.

Attached, my updated patch for PR57748, Part 2.

Boot-strapped and regression-tested on X86_64-pc-linux-gnu.

Ok for trunk?


Thanks
Bernd.

> What do others think?
>
> Thanks,
> Richard.
>
>> Thanks
>> Bernd.

RE: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-11-07 Thread Bernd Edlinger

oops - this time with attachments...


> Hi,
>
> On Fri, 25 Oct 2013 12:51:13, Richard Biener wrote:
>>
>> On Fri, Oct 25, 2013 at 12:02 PM, Bernd Edlinger
>>  wrote:
>>> Hi,
>>>
 Eh ... even

 register struct { int i; int a[0]; } asm ("ebx");

 works. Also with int a[1] but not with a[2]. So just handling trailing
 arrays makes this case regress as well.

 Now I'd call this use questionable ... but we've likely supported that
 for decades and cannot change that now.

 Back to fixing everything in expand.

 Richard.

>>>
>>> Ok, finally you asked for it.
>>>
>>> Here is my previous version of that patch again.
>>>
>>> I have now added a new value "EXPAND_REFERENCE" to the expand_modifier
>>> enumeration. It is almost like EXPAND_MEMORY but it does not interfere with
>>> constant values.
>>>
>>> I have done the same modification to VIEW_CONVERT_EXPR too, because
>>> this is a possible inner reference, itself. It is however inherently hard to
>>> test around this code.
>>>
>>> To understand this patch it is good to know what type of object the
>>> return value "tem" of get_inner_reference can be.
>>>
>>> From the program logic at get_inner_reference it is clear that the
>>> return value may *not* be BIT_FIELD_REF, COMPONENT_REF, ARRAY_REF,
>>> ARRAY_RANGE_REF, REALPART_EXPR, IMAGPART_EXPR. The result may
>>> be VIEW_CONVERT_EXPR only on a STRICT_ALIGNMENT target. This is probably
>>> further restricted because exp is gimplified.
>>>
>>> Usually the result will be a MEM_REF or a SSA_NAME of the memory where
>>> the structure is to be found.
>>>
>>> When you look at where EXPAND_MEMORY is handled you see it is special-cased
>>> in TARGET_MEM_REF, MEM_REF, ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF,
>>> ARRAY_RANGE_REF.
>>>
>>> At TARGET_MEM_REF, MEM_REF, VIEW_CONVERT_EXPR, it should be the
>>> same if EXPAND_MEMORY, EXPAND_WRITE or EXPAND_REFERENCE is given:
>>> If it is an unaligned memory, we just return the unaligned reference.
>>>
>>> This was missing for VIEW_CONVERT_EXPR, and unfortunately I have no test 
>>> case,
>>> because it is only a problem for STRICT_ALIGNMENT targets, and even there 
>>> it will
>>> certainly be really hard to find test cases that exercise this code.
>>>
>>> In ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF
>>> we do not have to touch the handling of the outer modifier. However we pass
>>> EXPAND_REFERENCE to the inner object, which should not be a recursive
>>> use of any ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF.
>>>
>>> TARGET_MEM_REF, MEM_REF and VIEW_CONVERT_EXPR know how to handle
>>> EXPAND_REFERENCE, anything else handles it like EXPAND_NORMAL.
>>>
>>>
>>> Boot-strapped and regression-tested on x86_64-linux-gnu
>>> OK for trunk?
>>
>> You point to a weak spot in expansion - that it handles creating
>> the base MEM to offset with handled components by recursing
>> into the case that handles bare MEM_REFs. This makes the
>> bare MEM_REF handling code somewhat awkward (it's the
>> one to assign mem-attrs which are later adjusted for example).
>>
>> Maybe a better appropach than adding yet another expand
>> modifier would be to split out the "base MEM" expansion part
>> out of the bare MEM_REF handling code so we can call that
>> instead of recursing.
>>
>> In this light - instead of a new expand modifier don't you want
>> an actual flag that specifies we are coming from a call that
>> wants to expand a base? That is, allow EXPAND_SUM
>> but with the recursion flag set?
>>
>
> I think you are right. After some thought, I start to like that idea.
>
> This way we have at least much more flexibility, how to handle the inner
> references correctly, and if I change only the interfaces of 
> expand_expr_real/real_1
> that will not be used at too many places, either.
>
>> Finally I think the recursion into the VIEW_CONVERT_EXPR case
>> is only there because of the keep_aligning flag of get_inner_reference
>> which should be obsolete now that we properly handle its effects
>> in get_object_alignment. So you wouldn't need to adjust this path
>> if we finally can get rid of that.
>>
>
> I proposed a patch for that, but did not hear from you:
> http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02267.html
>
> nevertheless, even if the VIEW_CONVERT_EXPR may or may not be an inner
> reference, the code there should be kept in sync with the normal_inner_ref,
> and MEM_REF, IMHO.
>
> Attached, my updated patch for PR57748, Part 2.
>
> Boot-strapped and regression-tested on X86_64-pc-linux-gnu.
>
> Ok for trunk?
>
>
> Thanks
> Bernd.
>
>> What do others think?
>>
>> Thanks,
>> Richard.
>>
>>> Thanks
>>> Bernd.2013-11-07  Bernd Edlinger  

PR middle-end/57748
* expr.h (expand_expr_real, expand_expr_real_1): Add new parameter
expand_reference.
(expand_expr, expand_normal): Adjust.
* expr.c (expand_expr_real, expand_expr_real_1): Add new parame

[C++ Patch/RFC] PR 58176 (ICE with nullptr at -O0 in output_constant)

2013-11-07 Thread Paolo Carlini


Hi,

at -O0 we ICE for this valid testcase in output_constant because 
NULLPTR_TYPE is not handled. Should I dig further because we never want 
nullptr to reach varasm? Because otherwise the patchlet works fine, I 
checked that expand_expr can cope perfectly well with NULLPTR_TYPE as 
TREE_TYPE of the INTEGER_CST (generally, we are already handling 
NULLPTR_TYPE in a number of places outside /cp)


Thanks,
Paolo.


Index: testsuite/g++.dg/cpp0x/nullptr30.C
===
--- testsuite/g++.dg/cpp0x/nullptr30.C  (revision 0)
+++ testsuite/g++.dg/cpp0x/nullptr30.C  (working copy)
@@ -0,0 +1,40 @@
+// PR c++/58176
+// { dg-do compile { target c++11 } }
+
+// Nil
+struct nil_ { constexpr nil_ () {} };
+constexpr nil_ nil;
+
+// Cons
+template 
+struct cons_ {
+using head_ = H;
+using tail_ = T;
+
+H head;
+T tail;
+
+constexpr cons_() {}
+constexpr cons_(H const &h, T const &t) : head(h), tail(t) {}
+};
+template 
+constexpr cons_ cons (H const &h, T const &t = nil) { return
+cons_(h,t); }
+
+// List
+template  struct list_s;
+template 
+struct list_s {
+using type = cons_::type>;
+};
+template <>
+struct list_s<> {
+using type = nil_;
+};
+template 
+using list_ = typename list_s::type;
+constexpr nil_ list () { return nil; }
+template 
+constexpr list_ list (H h, T... t) { return cons(h, list(t...)); }
+
+constexpr auto l1 = list("monkey", 123.4, cons(1, 2), nullptr);
Index: varasm.c
===
--- varasm.c(revision 204508)
+++ varasm.c(working copy)
@@ -4685,6 +4685,7 @@ output_constant (tree exp, unsigned HOST_WIDE_INT
 case OFFSET_TYPE:
 case FIXED_POINT_TYPE:
 case POINTER_BOUNDS_TYPE:
+case NULLPTR_TYPE:
   if (! assemble_integer (expand_expr (exp, NULL_RTX, VOIDmode,
   EXPAND_INITIALIZER),
  MIN (size, thissize), align, 0))

Re: [patch] Create gimple-expr..[ch] ... was Re: RFC: gimple.[ch] break apart

2013-11-07 Thread Andrew MacLeod


On 11/07/2013 05:36 AM, Basile Starynkevitch wrote:

On Tue, Nov 05, 2013 at 11:26:46AM -0500, Andrew MacLeod wrote:

I decided to name the new file gimple-expr.[ch] instead of
gimple-decl   This will eventually split into gimple-type.[ch],
gimple-decl.[ch], and gimple-expr.[ch].


Since we are adding *new* C++ files, can't we please name them *.cc
for the implementation part, so at least create gimple-expr.h and
gimple-expr.cc but not gimple-expr.c, please!
Assuming we put this into stage 1 next year,  I would imagine we'd 
rename a number of things, including .cc, drop the tree-* from the 
tree-ssa-blah.[c]h files, etc etc.  There have been a few things people 
have suggested renaming... I think if we do renaming, they ought to be 
all at one time to minimize the pain.


At the moment, the new gimple-* files I'm creating are still C, so they 
are .c files...


Andrew

Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-07 Thread Joseph S. Myers

On Thu, 7 Nov 2013, Mingjie Xing wrote:

> 2013/11/6 Richard Biener :
> > You miss a testcase.
> >
> > Also why should the warning be omitted for unused automatic
> > volatile variables?  They cannot be used in any way.
> >
> > Richard.
> 
> Thanks.  I've updated the patch with a test case.

You don't seem to have answered Richard's question about why the change is 
desirable in the first place.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-07 Thread Joseph S. Myers

On Thu, 7 Nov 2013, Richard Biener wrote:

> Well, I'm betting that you'll re-invent sth like 'tree' just don't
> call it 'tree' ;)
> You need to transparently refer to constants, SSA names and decls
> (at least) as GIMPLE statement operands.  You probably will make
> a "gimple statement operand" base class.  Well - that's a 'tree' ;)

Uses of constants and decls in expressions are why it's not obvious that 
those get a static type different from expressions (or at least, they 
probably do need a common base class).  But my model of how "tree" should 
ideally be split up has expressions (maybe including decls) as a 
completely separate static type from types, and identifiers as yet another 
static type, none of those inheriting from a common base class "tree".

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH, MPX, 2/X] Pointers Checker [6/25] Instrumentation pass

2013-11-07 Thread Joseph S. Myers

On Thu, 7 Nov 2013, Ilya Enkovich wrote:

> Here is a new split.  I still include rtl.h into tree-chkp.c for MEM_P 
> usage.  Is it OK?

I have no further comments on this patch.

-- 
Joseph S. Myers
jos...@codesourcery.com

Generally link to libgomp for -ftree-parallelize-loops=* (was: [gomp4 4/9] OpenACC: The runtime library will be implemented in libgomp, too.)

2013-11-07 Thread Thomas Schwinge

Hi!

On Thu, 7 Nov 2013 09:15:45 +0100, Jakub Jelinek  wrote:
> On Wed, Nov 06, 2013 at 08:42:18PM +0100, tho...@codesourcery.com wrote:
> > --- gcc/config/arc/arc.h
> > +++ gcc/config/arc/arc.h
> > @@ -174,7 +174,7 @@ along with GCC; see the file COPYING3.  If not see
> >  %(linker) %l " LINK_PIE_SPEC "%X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} 
> > %{r}\
> >  %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\
> >  %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\
> > -%{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
> > +%{fopenacc|fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
> 
> Perhaps add here |ftree-parallelize-loops=* too?  [...]

Like this (for trunk and gomp-4_0-branch)?

gcc/
* config/arc/arc.h (LINK_COMMAND_SPEC): For
-ftree-parallelize-loops=*, link to libgomp and its dependencies.
* config/ia64/hpux.h (LIB_SPEC): Likewise.
* config/pa/pa-hpux11.h (LIB_SPEC): Likewise.
* config/pa/pa64-hpux.h (LIB_SPEC): Likewise.
* gcc.c (GOMP_SELF_SPECS): Update comment about libgomp's
dependencies.
libgomp/
* libgomp.spec.in: Update comment about libgomp's dependencies.
* configure.ac: Likewise.
* configure: Regenerate.

diff --git gcc/config/arc/arc.h gcc/config/arc/arc.h
index 637f7b6..87908d4 100644
--- gcc/config/arc/arc.h
+++ gcc/config/arc/arc.h
@@ -174,7 +174,8 @@ along with GCC; see the file COPYING3.  If not see
 %(linker) %l " LINK_PIE_SPEC "%X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} %{r}\
 %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\
 %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\
-%{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
+%{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\
+%(mflib)\
 %{fprofile-arcs|fprofile-generate|coverage:-lgcov}\
 %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\
 %{!A:%{!nostdlib:%{!nostartfiles:%E}}} %{T*} }}"
diff --git gcc/config/ia64/hpux.h gcc/config/ia64/hpux.h
index ca592e4..517132c 100644
--- gcc/config/ia64/hpux.h
+++ gcc/config/ia64/hpux.h
@@ -92,7 +92,7 @@ do {  \
 #undef  LIB_SPEC
 #define LIB_SPEC \
   "%{!shared: \
- %{mt|pthread:%{fopenmp:-lrt} -lpthread} \
+ %{mt|pthread:%{fopenmp|ftree-parallelize-loops=*:-lrt} -lpthread} \
  %{p:%{!mlp64:-L/usr/lib/hpux32/libp} \
 %{mlp64:-L/usr/lib/hpux64/libp} -lprof} \
  %{pg:%{!mlp64:-L/usr/lib/hpux32/libp} \
diff --git gcc/config/pa/pa-hpux11.h gcc/config/pa/pa-hpux11.h
index c217398..4da1b09 100644
--- gcc/config/pa/pa-hpux11.h
+++ gcc/config/pa/pa-hpux11.h
@@ -120,7 +120,8 @@ along with GCC; see the file COPYING3.  If not see
 #undef LIB_SPEC
 #define LIB_SPEC \
   "%{!shared:\
- %{fopenmp:%{static:-a archive_shared} -lrt %{static:-a archive}}\
+ %{fopenmp|ftree-parallelize-loops=*:%{static:-a archive_shared} -lrt\
+   %{static:-a archive}}\
  %{mt|pthread:-lpthread} -lc\
  %{static:%{!nolibdld:-a archive_shared -ldld -a archive -lc}\
%{!mt:%{!pthread:-a shared -lc -a archive\
diff --git gcc/config/pa/pa64-hpux.h gcc/config/pa/pa64-hpux.h
index a3e21d3..9efc137 100644
--- gcc/config/pa/pa64-hpux.h
+++ gcc/config/pa/pa64-hpux.h
@@ -58,19 +58,22 @@ along with GCC; see the file COPYING3.  If not see
 #if ((TARGET_DEFAULT | TARGET_CPU_DEFAULT) & MASK_GNU_LD)
 #define LIB_SPEC \
   "%{!shared:\
- %{!p:%{!pg:%{fopenmp:%{static:-a shared} -lrt %{static:-a archive}}\
+ %{!p:%{!pg:%{fopenmp|ftree-parallelize-loops=*:%{static:-a shared} -lrt\
+  %{static:-a archive}}\
%{mt|pthread:-lpthread} -lc\
%{static:%{!nolibdld:-a shared -ldld -a archive -lc}\
%{!mt:%{!pthread:-a shared -lc -a archive}\
  %{p:%{!pg:%{static:%{!mhp-ld:-a shared}%{mhp-ld:-a archive_shared}}\
   -lprof %{static:-a archive}\
-  %{fopenmp:%{static:-a shared} -lrt %{static:-a archive}}\
+  %{fopenmp|ftree-parallelize-loops=*:%{static:-a shared} -lrt\
+ %{static:-a archive}}\
   %{mt|pthread:-lpthread} -lc\
   %{static:%{!nolibdld:-a shared -ldld -a archive -lc}\
%{!mt:%{!pthread:-a shared -lc -a archive}\
  %{pg:%{static:%{!mhp-ld:-a shared}%{mhp-ld:-a archive_shared}}\
-lgprof %{static:-a archive}\
-   %{fopenmp:%{static:-a shared} -lrt %{static:-a archive}}\
+   %{fopenmp|ftree-parallelize-loops=*:%{static:-a shared} -lrt\
+ %{static:-a archive}}\
%{mt|pthread:-lpthread} -lc\
%{static:%{!nolibdld:-a shared -ldld -a archive -lc}\
%{!mt:%{!pthread:-a shared -lc -a archive}\
@@ -78,19 +81,22 @@ along with GCC; see the file COPYING3.  If not see
 #else
 #define LIB_SPEC \
   "%{!shared:\
- %{!p:%{!pg:%{fopenmp:%{static:-a shared} -lrt %{static:-a archive}}\
+ %{!p:%{!pg:%{fopenmp|ftree-

PATCH: Don't set misaligned_prologue_used twice

2013-11-07 Thread H.J. Lu

I checked in this patch to remove redundant setting of
misaligned_prologue_used.

H.J.
---
Index: ChangeLog
===
--- ChangeLog   (revision 204511)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2013-11-07  H.J. Lu  
+
+   * config/i386/i386.c (ix86_expand_set_or_movmem): Don't set
+   misaligned_prologue_used when it has been set.
+
 2013-11-07  Yury Gribov  
Jakub Jelinek  
 
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 204511)
+++ config/i386/i386.c  (working copy)
@@ -23796,7 +23796,6 @@ ix86_expand_set_or_movmem (rtx dst, rtx
   if (misaligned_prologue_used)
 {
   /* Misaligned move prologue handled small blocks by itself.  */
-  misaligned_prologue_used = true;
   expand_set_or_movmem_prologue_epilogue_by_misaligned_moves
   (dst, src, &destreg, &srcreg,
move_mode, promoted_val, vec_promoted_val,

Re: [C++ Patch/RFC] PR 58176 (ICE with nullptr at -O0 in output_constant)

2013-11-07 Thread Jason Merrill

OK.

Re: [PATCH] remove nolonger needed {cgraph,varpool}_node_{,asm_}name () functions

2013-11-07 Thread Václav Zeman

On 11/07/2013 01:58 AM, tsaund...@mozilla.com wrote:
> From: Trevor Saunders 
>
> Hi,
>
> I admit its another c++ification, but the functions are useless, and if I'm
> changing the name of the called function its just easier to drop the class 
> name
> bit all together.
>
> Trev
>
>[...]
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index 4f93713..754d245 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -44,6 +44,12 @@ class GTY((desc ("%h.type"), tag ("SYMTAB_SYMBOL"),
>symtab_node
>  {
>  public:
> +  /* Return name.  */
> +  const char *name () const;
> +
> +  /* Return asm name.  */
> +const char * asm_name () const;
> +
Missing indentation?

-- 
VZ

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-11-07 Thread Tejas Belagod


Richard Sandiford wrote:

Tejas Belagod  writes:

Richard Sandiford wrote:

Tejas Belagod  writes:

Richard Sandiford wrote:

Tejas Belagod  writes:

+  /* This is big-endian-safe because the elements are kept in target
+ memory order.  So, for eg. PARALLEL element value of 2 is the same in
+ either endian-ness.  */
+  if (GET_CODE (src) == VEC_SELECT
+  && REG_P (XEXP (src, 0)) && REG_P (dst)
+  && REGNO (XEXP (src, 0)) == REGNO (dst))
+{
+  rtx par = XEXP (src, 1);
+  int i;
+
+  for (i = 0; i < XVECLEN (par, 0); i++)
+   {
+ rtx tem = XVECEXP (par, 0, i);
+ if (!CONST_INT_P (tem) || INTVAL (tem) != i)
+   return 0;
+   }
+  return 1;
+}
+

I think for big endian it needs to be:

INTVAL (tem) != i + base

where base is something like:

int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN (par, 0);

E.g. a big-endian V4HI looks like:

msb  lsb


and shortening it to say V2HI only gives the low 32 bits:

msb  lsb


But, in this case we want

 msb  lsb
 

It depends on whether the result occupies a full register or not.
I was thinking of the case where it didn't, but I realise now you were
thinking of the case where it did.  And yeah, my suggestion doesn't
cope with that...

I was under the impression that the const vector parallel for vec_select 
represents the element indexes of the array in memory order.


Therefore, in bigendian,

  msb lsb
     
element  a[0] a[1] a[2] a[3]

and in littleendian

  msb lsb
     
element  a[3] a[2] a[1] a[0]

Right.  But if an N-bit value is stored in a register, it's assumed to
occupy the lsb of the register and the N-1 bits above that.  The other
bits in the register are don't-care.

E.g., leaving vectors to one side, if you have:

   (set (reg:HI N) (truncate:SI (reg:SI N)))

on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this:

   msb  lsb
   01234567
   
   4567

rather than:

   msb  lsb
   01234567
   
   0123

for both endiannesses.  The same principle applies to vectors.
The lsb of the register is always assumed to be significant.

So maybe the original patch was correct for partial-register and
full-register results on little-endian, but only for full-register
results on big-endian.

Ah, ok! I think I get it. By eliminating
set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

using the check INTVAL (tem) != i, I'm essentially making subsequent operations 
use (reg:V2DI n) in DI mode which is a partial register result and this

gives me
the wrong set of lanes in bigendian. So, if I want to use (reg n) in partial 
register mode, I have to make sure the correct elements coincide with

the lsb in
big-endian...

Thanks for your input, I'll apply the offset correction for big-endian you 
suggested. I'll respin the patch.


Thanks.  Just for avoidance of doubt, the result might be a full or
partial register, depending on the mode and target.  I was trying to
correct myself by agreeing that your original was right and mine was
wrong for big-endian if the result is a full register.



What I had in mind when I implemented this was a partial-reg result, but 
obviously it was wrong.


Sorry, I'm going to take a step back - I'm trying to figure out what a full 
register result would look like. Looking at the pattern,


  set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

the result is always a mode smaller than the src if the vec_select selects a 
subset of lanes. Plus if the src and dst are the same reg, because we're 
re-writing the src reg, wouldn't it always end up being a partial-reg?

In this case, wouldn't (reg:DI n) always represent

 msb  lsb
 

Thanks,
Tejas.



I don't know if there are existing helper functions for this kind of thing.

Richard

Re: [PATCH] Use get_range_info during number of iterations analysis

2013-11-07 Thread Jakub Jelinek

On Wed, Nov 06, 2013 at 03:21:04PM -0700, Jeff Law wrote:
> >2013-11-06  Jakub Jelinek  
> >
> > * tree-ssa-loop-niter.c: Include tree-ssanames.h.
> > (determine_value_range): Add loop argument.  Use get_range_info to
> > improve range.
> > (bound_difference): Adjust caller.
> Clever, using the range from the PHI in the loop header.  While it
> may be too wide, it's likely going to be a lot better than nothing.
> 
> Add a testcase and this is OK.

Ok, thanks, here is what I've committed after another bootstrap/regtest.

2013-11-07  Jakub Jelinek  

* tree-ssa-loop-niter.c: Include tree-ssanames.h.
(determine_value_range): Add loop argument.  Use get_range_info to
improve range.
(bound_difference): Adjust caller.

* gcc.dg/tree-ssa/loop-39.c: New test.

--- gcc/tree-ssa-loop-niter.c.jj2013-10-24 10:19:21.0 +0200
+++ gcc/tree-ssa-loop-niter.c   2013-11-07 11:49:37.568083138 +0100
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.
 #include "diagnostic-core.h"
 #include "tree-inline.h"
 #include "tree-pass.h"
+#include "tree-ssanames.h"
 
 
 #define SWAP(X, Y) do { affine_iv *tmp = (X); (X) = (Y); (Y) = tmp; } while (0)
@@ -119,9 +120,12 @@ split_to_var_and_offset (tree expr, tree
in TYPE to MIN and MAX.  */
 
 static void
-determine_value_range (tree type, tree var, mpz_t off,
+determine_value_range (struct loop *loop, tree type, tree var, mpz_t off,
   mpz_t min, mpz_t max)
 {
+  double_int minv, maxv;
+  enum value_range_type rtype = VR_VARYING;
+
   /* If the expression is a constant, we know its value exactly.  */
   if (integer_zerop (var))
 {
@@ -130,9 +134,73 @@ determine_value_range (tree type, tree v
   return;
 }
 
+  get_type_static_bounds (type, min, max);
+
+  /* See if we have some range info from VRP.  */
+  if (TREE_CODE (var) == SSA_NAME && INTEGRAL_TYPE_P (type))
+{
+  edge e = loop_preheader_edge (loop);
+  gimple_stmt_iterator gsi;
+
+  /* Either for VAR itself...  */
+  rtype = get_range_info (var, &minv, &maxv);
+  /* Or for PHI results in loop->header where VAR is used as
+PHI argument from the loop preheader edge.  */
+  for (gsi = gsi_start_phis (loop->header); !gsi_end_p (gsi); gsi_next 
(&gsi))
+   {
+ gimple phi = gsi_stmt (gsi);
+ double_int minc, maxc;
+ if (PHI_ARG_DEF_FROM_EDGE (phi, e) == var
+ && (get_range_info (gimple_phi_result (phi), &minc, &maxc)
+ == VR_RANGE))
+   {
+ if (rtype != VR_RANGE)
+   {
+ rtype = VR_RANGE;
+ minv = minc;
+ maxv = maxc;
+   }
+ else
+   {
+ minv = minv.max (minc, TYPE_UNSIGNED (type));
+ maxv = maxv.min (maxc, TYPE_UNSIGNED (type));
+ gcc_assert (minv.cmp (maxv, TYPE_UNSIGNED (type)) <= 0);
+   }
+   }
+   }
+  if (rtype == VR_RANGE)
+   {
+ mpz_t minm, maxm;
+ gcc_assert (minv.cmp (maxv, TYPE_UNSIGNED (type)) <= 0);
+ mpz_init (minm);
+ mpz_init (maxm);
+ mpz_set_double_int (minm, minv, TYPE_UNSIGNED (type));
+ mpz_set_double_int (maxm, maxv, TYPE_UNSIGNED (type));
+ mpz_add (minm, minm, off);
+ mpz_add (maxm, maxm, off);
+ /* If the computation may not wrap or off is zero, then this
+is always fine.  If off is negative and minv + off isn't
+smaller than type's minimum, or off is positive and
+maxv + off isn't bigger than type's maximum, use the more
+precise range too.  */
+ if (nowrap_type_p (type)
+ || mpz_sgn (off) == 0
+ || (mpz_sgn (off) < 0 && mpz_cmp (minm, min) >= 0)
+ || (mpz_sgn (off) > 0 && mpz_cmp (maxm, max) <= 0))
+   {
+ mpz_set (min, minm);
+ mpz_set (max, maxm);
+ mpz_clear (minm);
+ mpz_clear (maxm);
+ return;
+   }
+ mpz_clear (minm);
+ mpz_clear (maxm);
+   }
+}
+
   /* If the computation may wrap, we know nothing about the value, except for
  the range of the type.  */
-  get_type_static_bounds (type, min, max);
   if (!nowrap_type_p (type))
 return;
 
@@ -405,8 +473,8 @@ bound_difference (struct loop *loop, tre
   mpz_init (maxx);
   mpz_init (miny);
   mpz_init (maxy);
-  determine_value_range (type, varx, offx, minx, maxx);
-  determine_value_range (type, vary, offy, miny, maxy);
+  determine_value_range (loop, type, varx, offx, minx, maxx);
+  determine_value_range (loop, type, vary, offy, miny, maxy);
 
   mpz_sub (bnds->below, minx, maxy);
   mpz_sub (bnds->up, maxx, miny);
--- gcc/testsuite/gcc.dg/tree-ssa/loop-39.c.jj  2013-11-07 11:52:01.871309995 
+0100
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-39.c 2013-11-07 11

Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-07 Thread Richard Biener

On Thu, Nov 7, 2013 at 2:44 PM, Joseph S. Myers  wrote:
> On Thu, 7 Nov 2013, Richard Biener wrote:
>
>> Well, I'm betting that you'll re-invent sth like 'tree' just don't
>> call it 'tree' ;)
>> You need to transparently refer to constants, SSA names and decls
>> (at least) as GIMPLE statement operands.  You probably will make
>> a "gimple statement operand" base class.  Well - that's a 'tree' ;)
>
> Uses of constants and decls in expressions are why it's not obvious that
> those get a static type different from expressions (or at least, they
> probably do need a common base class).  But my model of how "tree" should
> ideally be split up has expressions (maybe including decls) as a
> completely separate static type from types, and identifiers as yet another
> static type, none of those inheriting from a common base class "tree".

Sure.  Also you make containers no longer trees (TREE_CHAIN
anyone, TREE_VEC, TREE_LIST ...).  Should be the first step
I think (already started some time ago with using VECs for some
pieces).

Richard.

> --
> Joseph S. Myers
> jos...@codesourcery.com

RE: [PATCH] Factorize the two read_line functions present in gcov.c and input.c

2013-11-07 Thread Bernd Edlinger

On Thu, 7 Nov 2013 13:48:14, Jakub Jelinek wrote:
> On Thu, Nov 07, 2013 at 01:25:00PM +0100, Bernd Edlinger wrote:
>> just some thoughts...
>>
>>
>> fgetc will definitely be much faster than fread 16K and fseek back to the 
>> end of line position.
>>
>> Note: fgetc is already buffered and not too slow on average, but only if you 
>> do not
>> fseek around. In that case the buffered file-stream data is lost.
>
> fgetc (unless you are talking about fgetc_unlocked which is somewhat better)
> is actually very expensive. Of course, fseek is highly undersirable, but
> I've already said that.
>
> Jakub

yes. That's true.

I think, maybe this patch tries just to reach too many goals at the same time.

What is the _real_ bug which should be fixed first, is that even a very short 
file
with NUL-chars goes OOM in read_line.

So stay with fgets, just detect that the file is binary and return in that case.

The performance and duplicate code can still be improved by a follow-up patch.

We do not really need any caret positions on a binary file, after all, we just
should not ICE.

Maybe that could be done by just detecting that we have a BINARY file, and stop
printing any further lines.

like this?

--- input.c.jj    2013-01-10 21:38:27.0 +0100
+++ input.c    2013-11-07 15:33:53.804990865 +0100
@@ -106,12 +106,15 @@ read_line (FILE *file)
 {
   size_t len = strlen (string + pos);

-  if (string[pos + len - 1] == '\n')
+  if (len && string[pos + len - 1] == '\n')
 {
   string[pos + len - 1] = 0;
   return string;
 }
   pos += len;
+  /* test if binary file or last line incomplete? */
+  if (pos < string_len-1)
+    return pos && feof (file) ? string : NULL;
   string = XRESIZEVEC (char, string, string_len * 2);
   string_len *= 2;
 }

Bernd.

Re: Generally link to libgomp for -ftree-parallelize-loops=* (was: [gomp4 4/9] OpenACC: The runtime library will be implemented in libgomp, too.)

2013-11-07 Thread Jakub Jelinek

On Thu, Nov 07, 2013 at 02:58:31PM +0100, Thomas Schwinge wrote:
> On Thu, 7 Nov 2013 09:15:45 +0100, Jakub Jelinek  wrote:
> > On Wed, Nov 06, 2013 at 08:42:18PM +0100, tho...@codesourcery.com wrote:
> > > --- gcc/config/arc/arc.h
> > > +++ gcc/config/arc/arc.h
> > > @@ -174,7 +174,7 @@ along with GCC; see the file COPYING3.  If not see
> > >  %(linker) %l " LINK_PIE_SPEC "%X %{o*} %{A} %{d} %{e*} %{m} %{N} 
> > > %{n} %{r}\
> > >  %{s} %{t} %{u*} %{x} %{z} %{Z} 
> > > %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\
> > >  %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\
> > > -%{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
> > > +%{fopenacc|fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
> > 
> > Perhaps add here |ftree-parallelize-loops=* too?  [...]
> 
> Like this (for trunk and gomp-4_0-branch)?
> 
>   gcc/
>   * config/arc/arc.h (LINK_COMMAND_SPEC): For
>   -ftree-parallelize-loops=*, link to libgomp and its dependencies.
>   * config/ia64/hpux.h (LIB_SPEC): Likewise.
>   * config/pa/pa-hpux11.h (LIB_SPEC): Likewise.
>   * config/pa/pa64-hpux.h (LIB_SPEC): Likewise.
>   * gcc.c (GOMP_SELF_SPECS): Update comment about libgomp's
>   dependencies.
>   libgomp/
>   * libgomp.spec.in: Update comment about libgomp's dependencies.
>   * configure.ac: Likewise.
>   * configure: Regenerate.

Yes, thanks.

Jakub

Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-07 Thread Andrew MacLeod


On 11/07/2013 05:08 AM, Richard Biener wrote:



2 - I really believe gimple needs a type system different from front end
trees, that is my primary motivation.  I'm tired of jumping through hoops to
do anything slightly different, and I got fed up with it.  With a separate
type system for gimple, we can rid ourselves of all the stuff that isn't
related to optimization and codegen...   Could we do this with trees?
possibly, but if I'm going to go to the effort of converting the front end
tree types into a new or reduced-type subset, I might as well put that
effort into something that is more appropriate right from the start.

I'm not sure.  Especially the idea of wrapping everything in a
I'm-not-a-tree-well-but-I-really-still-am scares the hell out of me.
And I put that into the very same basket of things you now complain about.


Well, this first phase is a transitional one... If it were possible, I'd 
change them all at once to not use a tree... but I don't think that is 
feasible. There is too much to find and change, and the source base is 
constantly changing.  We can't just freeze the base for a year.   so 
until I can get them all changes, they have to act like trees and 
interact like trees, so we make them trees under the covers.   Short 
term pain and discomfort...


If it helps, during the transition you can view it as replacing trees 
with oddly named tree subclasses which provide static type checking 
instead of run time GIMPLE_CHECK() calls :-)





By replacing those uses with an actual type, we'll get
static type checking which is another improvement.   There is only one way
to move the middle/back end off of types as trees that I can see.  Go
through the source base and find all the places which use trees as types,
and replace them with an actual type.  This can't be done all at once, its
too big a job...  So I propose a plan which allows a new gimple type and
current trees to coexist.  Once the conversion is complete we can easily
implement a type system that has the features and traits we want.

Note that on 'gimple' there are no types ... the types are on the objects
'gimple' refers to and those are 'tree's.  So you'd need to change the
'tree's to have TREE_TYPE not be a 'tree'.  Or simply subclass
'tree' so you can distinguish a tree type from a tree non-type statically
(a bit awkward with the current union setup).

But maybe I'm confusing what you call 'type'.


Im talking about changing the type gimple statements refer to, and also 
what decls and expressions will refer to...  making that the gimple_type.


 Today, thats a TREE_TYPE tree.  During the transition period this 
gimple_type is indeed a tree under the covers.. so that all the existing 
code can work even if it hasnt been converted yet.   Once everything is 
converted, then gimple_type has the tree removed from its declaration 
and we replace it with the bits we need/want... and it will no longer 
function as a tree..  gimplification will create these from the trees 
the front end created.





4 - I realized that if I am going through the entire code base to find all
the type locations, its very little extra work to also do the same for decls
and expressions. Then we'll have static type checking for those as well.

5 - When done we will be significantly closer to having the ability to
create a self contained IL that we can strictly define...  That alone seems
a worthwhile goal to me.  The IL then becomes appropriate for streaming at
any point in during compilation.  This means any front end or tool can
easily generate or consume a gimple stream which opens up numerous
opportunities...  That is pretty hard to enable today.

Have fun with dealing with our legacy frontends ;)  Or didn't you
actually mean "any point" during compilation? ... oh, you talk about
GIMPLE.  So no plans for Frontend ASTs aka GENERIC for our legacy
frontends?
I meant any point where we are in gimple...  Thats the only point we 
really care about.  Once control is passed to the middle end.  No plans 
to touch GENERIC at all... in fact, GENERIC and 'tree' will become true 
synonyms when I'm done, because gimple wont have any trees in it 
anywhere.  And rtl will also refer to gimple nodes.   It just wont 
matter what warts are in trees.

Im going to do the initial bits on a branch for the first couple of months,
and during that time convert a number of files over to the new gimple
interface.  Then we can look at it, measure it, comment on it, tweak it,
whatever.  You can see exactly what we'll be getting into before we fully
commit.

Before we fully commit we want to see the underlying object changes.
Because I still doubt this all will work out without seriously restructuring
the entanglement of our tree-based frontends and the middle-end
(say "fold").

That is, I still think you are starting this at the wrong end.  You first
need to solve the entanglement issue before you can seriously consider
making any real change to the data structures GIMPLE relie

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-11-07 Thread Richard Sandiford

Tejas Belagod  writes:
> Richard Sandiford wrote:
>> Tejas Belagod  writes:
>>> Richard Sandiford wrote:
 Tejas Belagod  writes:
> Richard Sandiford wrote:
>> Tejas Belagod  writes:
>>> +  /* This is big-endian-safe because the elements are kept in target
>>> + memory order.  So, for eg. PARALLEL element value of 2 is the
>>> same in
>>> + either endian-ness.  */
>>> +  if (GET_CODE (src) == VEC_SELECT
>>> +  && REG_P (XEXP (src, 0)) && REG_P (dst)
>>> +  && REGNO (XEXP (src, 0)) == REGNO (dst))
>>> +{
>>> +  rtx par = XEXP (src, 1);
>>> +  int i;
>>> +
>>> +  for (i = 0; i < XVECLEN (par, 0); i++)
>>> +   {
>>> + rtx tem = XVECEXP (par, 0, i);
>>> + if (!CONST_INT_P (tem) || INTVAL (tem) != i)
>>> +   return 0;
>>> +   }
>>> +  return 1;
>>> +}
>>> +
>> I think for big endian it needs to be:
>>
>> INTVAL (tem) != i + base
>>
>> where base is something like:
>>
>> int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN 
>> (par, 0);
>>
>> E.g. a big-endian V4HI looks like:
>>
>> msb  lsb
>> 
>>
>> and shortening it to say V2HI only gives the low 32 bits:
>>
>> msb  lsb
>> 
> But, in this case we want
>
>  msb  lsb
>  
 It depends on whether the result occupies a full register or not.
 I was thinking of the case where it didn't, but I realise now you were
 thinking of the case where it did.  And yeah, my suggestion doesn't
 cope with that...

> I was under the impression that the const vector parallel for vec_select 
> represents the element indexes of the array in memory order.
>
> Therefore, in bigendian,
>
>   msb lsb
>      
> element  a[0] a[1] a[2] a[3]
>
> and in littleendian
>
>   msb lsb
>      
> element  a[3] a[2] a[1] a[0]
 Right.  But if an N-bit value is stored in a register, it's assumed to
 occupy the lsb of the register and the N-1 bits above that.  The other
 bits in the register are don't-care.

 E.g., leaving vectors to one side, if you have:

(set (reg:HI N) (truncate:SI (reg:SI N)))

 on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this:

msb  lsb
01234567

4567

 rather than:

msb  lsb
01234567

0123

 for both endiannesses.  The same principle applies to vectors.
 The lsb of the register is always assumed to be significant.

 So maybe the original patch was correct for partial-register and
 full-register results on little-endian, but only for full-register
 results on big-endian.
>>> Ah, ok! I think I get it. By eliminating
>>> set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]
>>>
>>> using the check INTVAL (tem) != i, I'm essentially making subsequent
>>> operations
>>> use (reg:V2DI n) in DI mode which is a partial register result and this
>>> gives me
>>> the wrong set of lanes in bigendian. So, if I want to use (reg n) in 
>>> partial 
>>> register mode, I have to make sure the correct elements coincide with
>>> the lsb in
>>> big-endian...
>>>
>>> Thanks for your input, I'll apply the offset correction for big-endian you 
>>> suggested. I'll respin the patch.
>> 
>> Thanks.  Just for avoidance of doubt, the result might be a full or
>> partial register, depending on the mode and target.  I was trying to
>> correct myself by agreeing that your original was right and mine was
>> wrong for big-endian if the result is a full register.
>> 
>
> What I had in mind when I implemented this was a partial-reg result, but 
> obviously it was wrong.
>
> Sorry, I'm going to take a step back - I'm trying to figure out what a full 
> register result would look like. Looking at the pattern,
>
>set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]
>
> the result is always a mode smaller than the src if the vec_select selects a 
> subset of lanes. Plus if the src and dst are the same reg, because we're 
> re-writing the src reg, wouldn't it always end up being a partial-reg?
> In this case, wouldn't (reg:DI n) always represent
>
>   msb  lsb
>   

The problem is that one reg rtx can span several hard registers.
E.g. (reg:V4SI 32) might represent one 64-bit register (no. 32),
but it might instead represent two 32-bit registers (nos. 32 and 33).
Obviously the latter's not very likely for vectors this small,
but more likely for larger ones (including on NEON IIRC).

So if we had 2 32-bit registers being treated as a V4HI, it would be:

   <--32--><--33-->
   ms

Re: [PATCH] Add gimple subclasses for every gimple code (was Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3))

2013-11-07 Thread Michael Matz

Hi,

On Wed, 6 Nov 2013, David Malcolm wrote:

> > I don't like that.  The empty classes are just useless, they imply a 
> > structure that isn't really there, some of the separate gimple codes 
> > are basically selectors of specific subtypes of a generic concept, 
> > without additional data or methods; creating a type for those is 
> > confusing.
> 
> A type system does more than just express memory layouts:

I know all that.  But there's a point on the scale from a void* type 
system to a different-type-for-everything system where the advantages of a 
more complex type system are anulled by the disadvantages of a more 
complex type system.  And I'm merely of the opinion that you crossed that 
point with this separation.

You know we could have a typesystem where every different natural number 
is a type (and with C++ we _really_ could do that).  Obviously that's also 
a reductio ad absurdum, but it demonstrates the point that there _are_ 
type systems with a too low complexity to expressiveness ratio.

> * it permits the proof of absence of certain bugs
> * it supports abstraction
> * it documents the intent of the code

Well, yes.  You can document code with other things than types, you can 
add abstractions via different means, and certain bugs can indeed only be 
ruled out by different static types.

In my mind types should abstract different larger concepts foremost (and 
not every little aspect of a concept).  Only _then_ should they be used 
for other effects like the above, and only if the disadvantages don't 
overshadow that.  Disadvantages being mostly simply more complexity: which 
types relate to others in which way; what are the methods you can do with 
that but not the other type; which are the members of the 100 types there 
are (and if a type system is too large to be used only by IDEs with code 
completion, then it's too complex); and of course also simply having to 
write more text when you can't use the same gimple variable for several 
statements.

> To give an example from the patch, the proposed gimple_statement_switch
> subclass of gimple_statement_with_ops adds no extra fields, but it adds
> the invariant that (assuming the ptr is non-NULL), that code ==
> GIMPLE_SWITCH, or, more intuitively, "it's a switch statement".
> 
> This means that if we have a gimple_statement_switch, that both a human
> reading the code and the compiler can "know" at compile-time that the
> code == GIMPLE_SWITCH. 

But a gimple switch is actually a special case of a gimple conditional 
jump (the normal GIMPLE_COND being another) which itself is a special 
case of a gimple control transfer (a thing causing one or multiple 
outgoing edges and ending a basic block).  That enlightens that your 
hierarchy might not actually be the best split of types, it's merely a 
formal translation of gimple codes into types.

And I question exactly that automatic introduction of types.  There's not 
enough human thought put into the hierarchy to warrant the change.  Types 
are a fundament and they shouldn't be introduced lightly.  As gimple codes 
are simply flat it doesn't matter so much that the inherent structure of 
gimple statements isn't reflected (although there _is_ some structure by 
having the with_ops and with_mem_ops codes be together).  But for a _type 
hierarchy_ I would pretty much require that real thought and insights are 
put into it.

> Consider tree-switch-conversion.c: this contains various "gimple"
> variables but they don't work on arbitrary gimple; they require code ==
> GIMPLE_SWITCH:
> ...
> We could be doing this directly in the type system, and using the
> compiler to check it.

Then please do that while introducing one new type.

> >   The fewer 
> > types the better IMO.
> 
> There's a reductio ad absurdum of that argument:

See above for my variant of that showing that it also goes into the other 
direction of more types.

Ciao,
Michael.
P.S.: And do away with the "_statement" part of the type, gimple_switch 
ought to be enough.

Re: [PATCH] [libiberty] MAX_PATH problems with mingw gcc

2013-11-07 Thread Ian Lance Taylor

On Wed, Nov 6, 2013 at 11:55 PM, Vladimir Simonov
 wrote:
>> -Original Message-
>> From: Ian Lance Taylor [mailto:i...@google.com]
>> Sent: Wednesday, November 06, 2013 6:42 PM
>> To: Joey Ye
>> Cc: gcc-patches; d...@redhat.com; Vladimir Simonov
>> Subject: Re: [PATCH] [libiberty] MAX_PATH problems with mingw gcc
>
> Jan, thank you for your attention.
>
> It looks to me that you missed that the patch changes current gcc policy for 
> work with
> pathname  separators on "hosts" supporting both back and forward slashes from
> neutral(undefined) behavior to more defined - "From now on hosts/builds (in 
> terms of host-build-target)
> supporting both back and forward slashes gcc tries to use forward slashes 
> both in filenames saved
> in binaries for target and for internal work."
> And this patch is just first, little step in this direction. In fact the 
> patch was published just
> to show problems and start discussion about ways for their solution.
>
> Above may not satisfy you and other gcc developers/consumers.
> As minimum I'm interested in Mingw people opinion.
>
> Arguments for new policy are simple - this policy should not affect
> "native" builds but helps a lot in case when host/build supports both kinds 
> of separators but
> target supports only forward slashes.
>
> Without explicit consensus on above I see no sense in the patch details 
> discussion.


Personally, I'm fine with that proposed change in policy.

As somebody who does not use Windows, it does not affect me.  I agree
that getting the opinion of the mingw maintainers makes sense.

I commented on the details of the patch because Joey pinged me to
submit it, and I am a libiberty maintainer.  If you want to discuss
the above proposal, then by all means discuss it.

Ian

[PATCH, ARM, LRA] Fixed bootstrap failure in Thumb mode

2013-11-07 Thread Yvan Roux

Hi,

this patch fixed an LRA cycling due to secondary reload (Thumb mode).
Notice that this patch is a prerequisite to turn on LRA by default on
ARM.  Bootstrapped on a9 and a15 without any regression in the
testsuite as LRA is off by default and with the regression reported in
the thread bellow when LRA is on.

http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00725.html

Thanks,
Yvan

2013-11-07  Yvan Roux  

* config/arm/arm.h (THUMB_SECONDARY_INPUT_RELOAD_CLASS): Return NO_REGS
for LRA.
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 1781b75..d054906 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1266,11 +1266,12 @@ enum reg_class
 
 /* Must leave BASE_REGS reloads alone */
 #define THUMB_SECONDARY_INPUT_RELOAD_CLASS(CLASS, MODE, X)		\
-  ((CLASS) != LO_REGS && (CLASS) != BASE_REGS\
-   ? ((true_regnum (X) == -1 ? LO_REGS	\
-   : (true_regnum (X) + HARD_REGNO_NREGS (0, MODE) > 8) ? LO_REGS	\
-   : NO_REGS)) 			\
-   : NO_REGS)
+  (lra_in_progress ? NO_REGS		\
+   : ((CLASS) != LO_REGS && (CLASS) != BASE_REGS			\
+  ? ((true_regnum (X) == -1 ? LO_REGS\
+ : (true_regnum (X) + HARD_REGNO_NREGS (0, MODE) > 8) ? LO_REGS	\
+ : NO_REGS)) 			\
+  : NO_REGS))
 
 #define THUMB_SECONDARY_OUTPUT_RELOAD_CLASS(CLASS, MODE, X)		\
   ((CLASS) != LO_REGS && (CLASS) != BASE_REGS\

[PATCH] Don't simplify_conversion_from_bitmask on x & 1 (PR tree-optimization/59014)

2013-11-07 Thread Marek Polacek

Here, forward propagation turned
  _6 = a.1_5 & 1;
  _7 = (_Bool) _6;
into
  _7 = (_Bool) a.1_5;
but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is 1.
If a is an odd number, this is correct, but we don't know the value
of a, so I disabled this optimization for "x & 1" cases.  For e.g. "x & 255",
this still works correctly.

In this PR, VRP created wrong ASSERT_EXPR because of this, thus we were
miscompiling the following testcase...

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2013-11-07  Marek Polacek  

PR tree-optimization/59014
* tree-ssa-forwprop.c (simplify_conversion_from_bitmask): Don't
optimize x & 1 expressions here.
testsuite/
* gcc.dg/torture/pr59014.c: New test.

--- gcc/testsuite/gcc.dg/torture/pr59014.c.mp   2013-11-07 14:47:38.040941144 
+0100
+++ gcc/testsuite/gcc.dg/torture/pr59014.c  2013-11-07 14:49:03.646271031 
+0100
@@ -0,0 +1,26 @@
+/* PR tree-optimization/59014 */
+/* { dg-do run } */
+
+int a = 2, b, c, d;
+
+int
+foo ()
+{
+  for (;; c++)
+if ((b > 0) | (a & 1))
+  ;
+else
+  {
+   d = a;
+   return 0;
+  }
+}
+
+int
+main ()
+{
+  foo ();
+  if (d != 2)
+__builtin_abort ();
+  return 0;
+}
--- gcc/tree-ssa-forwprop.c.mp  2013-11-07 14:30:43.785733810 +0100
+++ gcc/tree-ssa-forwprop.c 2013-11-07 14:42:31.836765375 +0100
@@ -1199,9 +1199,16 @@ simplify_conversion_from_bitmask (gimple
   if (TREE_CODE (rhs_def_operand1) == SSA_NAME
  && ! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs_def_operand1)
  && TREE_CODE (rhs_def_operand2) == INTEGER_CST
+ /* We can't turn
+  _6 = a & 1;
+  _7 = (_Bool) _6;
+into
+  _7 = (_Bool) a;
+as that's wrong if A is an even number.  */
+ && ! integer_onep (rhs_def_operand2)
  && operand_equal_p (rhs_def_operand2,
  build_low_bits_mask (TREE_TYPE (rhs_def_operand2),
-  TYPE_PRECISION (lhs_type)),
+  TYPE_PRECISION (lhs_type)),
   0))
{
  /* This is an optimizable case.  Replace the source operand

Marek

Re: [gomp4] rewrite simd clone argument adjustment to avoid regimplification

2013-11-07 Thread Aldy Hernandez


On 11/07/13 00:38, Jakub Jelinek wrote:

On Wed, Nov 06, 2013 at 05:37:03PM -0700, Aldy Hernandez wrote:

Hmmm, good point.  I've moved update_stmt and company to the caller,
and modified the caller to call regimplify_operands only for
GIMPLE_RETURN which is the special case.


Can't you (later) handle that without regimplification too?


Sure!  See attached patch.

But as discussed on IRC, I wonder whether we can do without the 
following in the attached patch:


+   tree repl = make_ssa_name (TREE_TYPE (retval), NULL);
+   stmt = gimple_build_assign (repl, retval);
+   gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+   stmt = gimple_build_assign (ref, repl);

...and unconditionally do:

+ stmt = gimple_build_assign (ref, retval);

...since it seems all the GIMPLE_RETURNs I see can be replaced by 
``retval_array[iter] = whatever'' without creating something non-gimple 
(thus avoiding an SSA variable).


Either way, I'm ok.  Let me know.

Aldy
gcc/ChangeLog.gomp

* omp-low.c (ipa_simd_modify_function_body): Avoid
regimplification of GIMPLE_RETURNs.

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 14b8571..6039de9 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -11175,26 +11175,31 @@ ipa_simd_modify_function_body (struct cgraph_node 
*node,
{
case GIMPLE_RETURN:
  {
-   tree old_retval = gimple_return_retval (stmt);
-   if (old_retval)
+   tree retval = gimple_return_retval (stmt);
+   if (!retval)
  {
-   /* Replace `return foo' by `retval_array[iter] = foo'.  */
-   stmt = gimple_build_assign (build4 (ARRAY_REF,
-   TREE_TYPE (old_retval),
-   retval_array, iter,
-   NULL, NULL),
-   old_retval);
-   gsi_replace (&gsi, stmt, true);
-   gimple_regimplify_operands (stmt, &gsi);
-   info.modified = true;
+   gsi_remove (&gsi, true);
+   continue;
  }
+
+   /* Replace `return foo' with `retval_array[iter] = foo'.  */
+   tree ref = build4 (ARRAY_REF,
+  TREE_TYPE (retval),
+  retval_array, iter,
+  NULL, NULL);
+   if (is_gimple_reg (retval))
+ stmt = gimple_build_assign (ref, retval);
else
  {
-   gsi_remove (&gsi, true);
-   continue;
+   tree repl = make_ssa_name (TREE_TYPE (retval), NULL);
+   stmt = gimple_build_assign (repl, retval);
+   gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+   stmt = gimple_build_assign (ref, repl);
  }
-   break;
+   gsi_replace (&gsi, stmt, true);
+   info.modified = true;
  }
+ break;
 
default:
  walk_gimple_op (stmt, ipa_simd_modify_stmt_ops, &wi);

Re: [PATCH] Add check for aarch64 in vect_cmdline_needed

2013-11-07 Thread Cesar Philippidis

On 11/6/13, 5:06 PM, Joseph S. Myers wrote:

> You should be testing aarch64*-*-* so as to match aarch64_be targets.

Thank you for catching that. Please commit this new patch if is OK. I
don't have SVN access.

Thanks,
Cesar

2013-11-06  Cesar Philippidis  

gcc/testsuite/
* lib/target-supports.exp 
(check_effective_target_vect_cmdline_neeed): Add aarch64 to the list 
of targets that do not need command line argument to enable SIMD.

Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   (revision 423006)
+++ gcc/testsuite/lib/target-supports.exp   (working copy)
@@ -1920,7 +1920,8 @@ proc check_effective_target_vect_cmdline_needed {
 || [check_effective_target_powerpc_altivec]))
 || ([istarget sparc*-*-*] && [check_effective_target_sparc_vis])
  || [istarget spu-*-*]
-|| ([istarget arm*-*-*] && [check_effective_target_arm_neon]) } {
+|| ([istarget arm*-*-*] && [check_effective_target_arm_neon]) 
+|| [istarget aarch64*-*-*] } {
   set et_vect_cmdline_needed_saved 0
}
 }

Re: [gomp4] rewrite simd clone argument adjustment to avoid regimplification

2013-11-07 Thread Jakub Jelinek

On Thu, Nov 07, 2013 at 08:17:13AM -0700, Aldy Hernandez wrote:
> But as discussed on IRC, I wonder whether we can do without the
> following in the attached patch:
> 
> + tree repl = make_ssa_name (TREE_TYPE (retval), NULL);
> + stmt = gimple_build_assign (repl, retval);
> + gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
> + stmt = gimple_build_assign (ref, repl);
> 
> ...and unconditionally do:
> 
> +   stmt = gimple_build_assign (ref, retval);

This should be sufficient.
> 
> ...since it seems all the GIMPLE_RETURNs I see can be replaced by
> ``retval_array[iter] = whatever'' without creating something
> non-gimple (thus avoiding an SSA variable).
> 
> Either way, I'm ok.  Let me know.

Thanks.

Jakub

Re: [PATCH] Don't simplify_conversion_from_bitmask on x & 1 (PR tree-optimization/59014)

2013-11-07 Thread Jakub Jelinek

On Thu, Nov 07, 2013 at 04:15:11PM +0100, Marek Polacek wrote:
> Here, forward propagation turned
>   _6 = a.1_5 & 1;
>   _7 = (_Bool) _6;
> into
>   _7 = (_Bool) a.1_5;
> but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is 1.
> If a is an odd number, this is correct, but we don't know the value
> of a, so I disabled this optimization for "x & 1" cases.  For e.g. "x & 255",
> this still works correctly.
> 
> In this PR, VRP created wrong ASSERT_EXPR because of this, thus we were
> miscompiling the following testcase...
> 
> Regtested/bootstrapped on x86_64-linux, ok for trunk?

Isn't the problem here not the constant 1, but the fact that TREE_CODE
(lhs_type) == BOOLEAN_TYPE, with the special properties of boolean types?
I mean, if lhs_type is say some normal INTEGER_TYPE with precision 1
(or say 3 and stmt & 3 etc.), then the cast alone should cover the required
masking.

> 2013-11-07  Marek Polacek  
> 
>   PR tree-optimization/59014
>   * tree-ssa-forwprop.c (simplify_conversion_from_bitmask): Don't
>   optimize x & 1 expressions here.
> testsuite/
>   * gcc.dg/torture/pr59014.c: New test.
> 
> --- gcc/tree-ssa-forwprop.c.mp2013-11-07 14:30:43.785733810 +0100
> +++ gcc/tree-ssa-forwprop.c   2013-11-07 14:42:31.836765375 +0100
> @@ -1199,9 +1199,16 @@ simplify_conversion_from_bitmask (gimple
>if (TREE_CODE (rhs_def_operand1) == SSA_NAME
> && ! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs_def_operand1)
> && TREE_CODE (rhs_def_operand2) == INTEGER_CST
> +   /* We can't turn
> +_6 = a & 1;
> +_7 = (_Bool) _6;
> +  into
> +_7 = (_Bool) a;
> +  as that's wrong if A is an even number.  */
> +   && ! integer_onep (rhs_def_operand2)
> && operand_equal_p (rhs_def_operand2,
> build_low_bits_mask (TREE_TYPE (rhs_def_operand2),
> -TYPE_PRECISION (lhs_type)),
> +TYPE_PRECISION (lhs_type)),
>  0))
>   {
> /* This is an optimizable case.  Replace the source operand
> 
>   Marek

Jakub

Re: [PATCH] Don't simplify_conversion_from_bitmask on x & 1 (PR tree-optimization/59014)

2013-11-07 Thread Richard Biener

On Thu, Nov 7, 2013 at 4:15 PM, Marek Polacek  wrote:
> Here, forward propagation turned
>   _6 = a.1_5 & 1;
>   _7 = (_Bool) _6;
> into
>   _7 = (_Bool) a.1_5;
> but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is 1.

?

(_Bool) 2

should truncate the value to zero.

Richard.

> If a is an odd number, this is correct, but we don't know the value
> of a, so I disabled this optimization for "x & 1" cases.  For e.g. "x & 255",
> this still works correctly.


> In this PR, VRP created wrong ASSERT_EXPR because of this, thus we were
> miscompiling the following testcase...
>
> Regtested/bootstrapped on x86_64-linux, ok for trunk?
>
> 2013-11-07  Marek Polacek  
>
> PR tree-optimization/59014
> * tree-ssa-forwprop.c (simplify_conversion_from_bitmask): Don't
> optimize x & 1 expressions here.
> testsuite/
> * gcc.dg/torture/pr59014.c: New test.
>
> --- gcc/testsuite/gcc.dg/torture/pr59014.c.mp   2013-11-07 14:47:38.040941144 
> +0100
> +++ gcc/testsuite/gcc.dg/torture/pr59014.c  2013-11-07 14:49:03.646271031 
> +0100
> @@ -0,0 +1,26 @@
> +/* PR tree-optimization/59014 */
> +/* { dg-do run } */
> +
> +int a = 2, b, c, d;
> +
> +int
> +foo ()
> +{
> +  for (;; c++)
> +if ((b > 0) | (a & 1))
> +  ;
> +else
> +  {
> +   d = a;
> +   return 0;
> +  }
> +}
> +
> +int
> +main ()
> +{
> +  foo ();
> +  if (d != 2)
> +__builtin_abort ();
> +  return 0;
> +}
> --- gcc/tree-ssa-forwprop.c.mp  2013-11-07 14:30:43.785733810 +0100
> +++ gcc/tree-ssa-forwprop.c 2013-11-07 14:42:31.836765375 +0100
> @@ -1199,9 +1199,16 @@ simplify_conversion_from_bitmask (gimple
>if (TREE_CODE (rhs_def_operand1) == SSA_NAME
>   && ! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs_def_operand1)
>   && TREE_CODE (rhs_def_operand2) == INTEGER_CST
> + /* We can't turn
> +  _6 = a & 1;
> +  _7 = (_Bool) _6;
> +into
> +  _7 = (_Bool) a;
> +as that's wrong if A is an even number.  */
> + && ! integer_onep (rhs_def_operand2)
>   && operand_equal_p (rhs_def_operand2,
>   build_low_bits_mask (TREE_TYPE 
> (rhs_def_operand2),
> -  TYPE_PRECISION (lhs_type)),
> +  TYPE_PRECISION (lhs_type)),
>0))
> {
>   /* This is an optimizable case.  Replace the source operand
>
> Marek

Re: [PATCH] Don't simplify_conversion_from_bitmask on x & 1 (PR tree-optimization/59014)

2013-11-07 Thread Jakub Jelinek

On Thu, Nov 07, 2013 at 04:39:03PM +0100, Richard Biener wrote:
> On Thu, Nov 7, 2013 at 4:15 PM, Marek Polacek  wrote:
> > Here, forward propagation turned
> >   _6 = a.1_5 & 1;
> >   _7 = (_Bool) _6;
> > into
> >   _7 = (_Bool) a.1_5;
> > but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is 
> > 1.
> 
> ?
> 
> (_Bool) 2
> 
> should truncate the value to zero.

You're right, say:
_Bool foo (int x)
{
  return (_Bool) (x & 1);
}
which due to the simplify_conversion_from_bitmask optimization is optimized
into:
  _Bool _3;

  :
  _3 = (_Bool) x_1(D);
  return _3;
is indeed expanded as:
(set (reg:QI 90 [ D.1789 ])
(and:QI (reg:QI 90 [ D.1789 ])
(const_int 1 [0x1])))

Jakub

[PATCH] make has_gate and has_execute useless

2013-11-07 Thread tsaunders

From: Trevor Saunders 

Hi,

 This is the result of seeing what it would take to get rid of the has_gate and
has_execute flags on pass_data.  It turns out not much, but I wanted
confirmation this part is ok before I go deal with all the places that
initialize the fields.

I bootstrapped this part on x86_64-unknown-linux-gnu and saw no new test suite
regressions (ignoring the silk stuff because the full paths in its test names
break my test script for now) Any reason this patch with the actual removal of 
the flags wouldn't be ok?

Trev

2013-11-06  Trevor Saunders  

* pass_manager.h (pass_manager): Adjust.
* passes.c (opt_pass::execute): Tell the pass manager it doesn't need
to do anything for this pass.
(pass_manager::register_dump_files_1): Don't uselessly deal with
properties of passes.
(pass_manager::register_dump_files): Adjust.
(dump_one_pass): Just call pass->gate ().
(execute_ipa_summary_passes): Likewise.
(execute_one_pass): Don't check pass->has_execute flag.
(ipa_write_summaries_2): Don't check pass->has_gate flag.
(ipa_write_optimization_summaries_1): Likewise.
(ipa_read_summaries_1): Likewise.
(ipa_read_optimization_summaries_1): Likewise.
(execute_ipa_stmt_fixups): Likewise.
* tree-pass.h (pass_data): Rename has_gate to useless_has_gate and
has_execute to useless_has_execute to be sure they're unused.
(TODO_absolutely_nothing): New constant.


diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
index 77d78eb..3bc0a99 100644
--- a/gcc/pass_manager.h
+++ b/gcc/pass_manager.h
@@ -93,7 +93,7 @@ public:
 
 private:
   void set_pass_for_id (int id, opt_pass *pass);
-  int register_dump_files_1 (struct opt_pass *pass, int properties);
+  void register_dump_files_1 (struct opt_pass *pass);
   void register_dump_files (struct opt_pass *pass, int properties);
 
 private:
diff --git a/gcc/passes.c b/gcc/passes.c
index 19e5869..3b28dc9 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -112,7 +112,7 @@ opt_pass::gate ()
 unsigned int
 opt_pass::execute ()
 {
-  return 0;
+  return TODO_absolutely_nothing;
 }
 
 opt_pass::opt_pass (const pass_data &data, context *ctxt)
@@ -701,33 +701,21 @@ pass_manager::register_one_dump_file (struct opt_pass 
*pass)
 
 /* Recursive worker function for register_dump_files.  */
 
-int
+void
 pass_manager::
-register_dump_files_1 (struct opt_pass *pass, int properties)
+register_dump_files_1 (struct opt_pass *pass)
 {
   do
 {
-  int new_properties = (properties | pass->properties_provided)
-  & ~pass->properties_destroyed;
-
   if (pass->name && pass->name[0] != '*')
 register_one_dump_file (pass);
 
   if (pass->sub)
-new_properties = register_dump_files_1 (pass->sub, new_properties);
-
-  /* If we have a gate, combine the properties that we could have with
- and without the pass being examined.  */
-  if (pass->has_gate)
-properties &= new_properties;
-  else
-properties = new_properties;
+register_dump_files_1 (pass->sub);
 
   pass = pass->next;
 }
   while (pass);
-
-  return properties;
 }
 
 /* Register the dump files for the pass_manager starting at PASS.
@@ -739,7 +727,7 @@ pass_manager::
 register_dump_files (struct opt_pass *pass,int properties)
 {
   pass->properties_required |= properties;
-  register_dump_files_1 (pass, properties);
+  register_dump_files_1 (pass);
 }
 
 struct pass_registry
@@ -847,7 +835,7 @@ dump_one_pass (struct opt_pass *pass, int pass_indent)
   const char *pn;
   bool is_on, is_really_on;
 
-  is_on = pass->has_gate ? pass->gate () : true;
+  is_on = pass->gate ();
   is_really_on = override_gate_status (pass, current_function_decl, is_on);
 
   if (pass->static_pass_number <= 0)
@@ -2002,7 +1990,7 @@ execute_ipa_summary_passes (struct ipa_opt_pass_d 
*ipa_pass)
 
   /* Execute all of the IPA_PASSes in the list.  */
   if (ipa_pass->type == IPA_PASS
- && ((!pass->has_gate) || pass->gate ())
+ && pass->gate ()
  && ipa_pass->generate_summary)
{
  pass_init_dump_file (pass);
@@ -2153,7 +2141,7 @@ execute_one_pass (struct opt_pass *pass)
 
   /* Check whether gate check should be avoided.
  User controls the value of the gate through the parameter "gate_status". 
*/
-  gate_status = pass->has_gate ? pass->gate () : true;
+  gate_status = pass->gate ();
   gate_status = override_gate_status (pass, current_function_decl, 
gate_status);
 
   /* Override gate with plugin.  */
@@ -2210,11 +2198,11 @@ execute_one_pass (struct opt_pass *pass)
 timevar_push (pass->tv_id);
 
   /* Do it!  */
-  if (pass->has_execute)
-{
-  todo_after = pass->execute ();
-  do_per_function (clear_last_verified, NULL);
-}
+  todo_after = pass->execute ();
+  if (todo_after != TODO_absolutely_nothing)
+do_per_function (clear_last_verified, NULL);
+  els

Re: RFC: simd enabled functions (omp declare simd / elementals)

2013-11-07 Thread Martin Jambor

Hi,

On Thu, Oct 31, 2013 at 10:04:45PM -0500, Aldy Hernandez wrote:
> Hello gentlemen.  I'm CCing all of you, because each of you can
> provide valuable feedback to various parts of the compiler which I
> touch.  I have sprinkled love notes with your names throughout the
> post :).

sorry it took me so long, for various reasons I out of my control I've
accumulated quite a backlog of email and tasks last week and it took
me a lot of time to chew through it all.

...

> [Martin] I have adapted the ipa_parm_adjustment infrastructure to
> allow adding new arguments out of the blue like you mentioned was
> missing in ipa-prop.h.  I have also added support for creating
> vectors of arguments.  Could you take a look at my changes to
> ipa-prop.[ch]?

Sure, though I have only looked at ipa-* and tree-sra.c stuff.  I do
not have any real objections but would suggest a few amendments.  

I am glad this is becoming a useful infrastructure rather than just a
part of IPA-SRA.  Note that while ipa_combine_adjustments is not used
from anywhere and thus probably buggy anyway, it should in theory be
able to process new_param adjustments too.  Can you please at least
put a "not implemented" assert there?  (The reason is that the plan
still is to replace args_to_skip bitmaps in cgraphclones.c by
adjustments one day and we do need to combine clones.)

> 
> [Martin] I need to add new arguments in the case of inbranch clones,
> which add an additional vector with a mask as the last argument:
> For the following:
> 
> #pragma omp declare simd simdlen(4) inbranch
> int foo (int a)
> {
>   return a + 1234;
> }
> 
> ...we would generate a clone with:
> 
> vector(4) int
> foo.simdclone.0 (vector(4) int simd.4, vector(4) int mask.5)
> 
> I thought it best to enhance ipa_modify_formal_parameters() and
> associated machinery than to add the new argument ad-hoc.  We
> already have enough ways of doing tree and cgraph versioning in the
> compiler ;-).
> 

...

> gcc/ChangeLog.elementals
> 
>   * Makefile.in (omp-low.o): Depend on PRETTY_PRINT_H and IPA_PROP_H.
>   * tree-vect-stmts.c (vectorizable_call): Allow > 3 arguments when
>   a SIMD clone may be available.
>   (vectorizable_function): Use SIMD clone if available.
>   * ipa-cp.c (determine_versionability): Nodes with SIMD clones are
>   not versionable.
>   * ggc.h (ggc_alloc_cleared_simd_clone_stat): New.
>   * cgraph.h (enum linear_stride_type): New.
>   (struct simd_clone_arg): New.
>   (struct simd_clone): New.
>   (struct cgraph_node): Add simdclone and simdclone_of fields.
>   (get_simd_clone): Protoize.
>   * cgraph.c (get_simd_clone): New.
>   Add `has_simd_clones' field.
>   * ipa-cp.c (determine_versionability): Disallow functions with
>   simd clones.

(This looks like a repeated entry.)

>   * ipa-prop.h (ipa_sra_modify_function_body): Protoize.
>   (sra_ipa_modify_expr): Same.
>   (struct ipa_parm_adjustment): Add new_arg_prefix and new_param
>   fields.  Document their use.
>   * ipa-prop.c (ipa_modify_formal_parameters): Handle creating brand
>   new parameters and minor cleanups.
>   * omp-low.c: Add new pass_omp_simd_clone support code.
>   (make_pass_omp_simd_clone): New.
>   (pass_data_omp_simd_clone): Declare.
>   (class pass_omp_simd_clone): Declare.
>   (vecsize_mangle): New.
>   (ipa_omp_simd_clone): New.
>   (simd_clone_clauses_extract): New.
>   (simd_clone_compute_base_data_type): New.
>   (simd_clone_compute_vecsize_and_simdlen): New.
>   (simd_clone_create): New.
>   (simd_clone_adjust_return_type): New.
>   (simd_clone_adjust_return_types): New.
>   (simd_clone_adjust): New.
>   (simd_clone_init_simd_arrays): New.
>   (ipa_simd_modify_function_body): New.
>   (simd_clone_mangle): New.
>   (simd_clone_struct_alloc): New.
>   (simd_clone_struct_copy): New.
>   (class argno_map): New.
>   (argno_map::argno_map(tree)): New.
>   (argno_map::~argno_map): New.
>   (argno_map::operator []): New.
>   (argno_map::length): New.
>   (expand_simd_clones): New.
>   (create_tmp_simd_array): New.
>   * tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): New.
>   * tree-core.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Document.
>   * tree-pass.h (make_pass_omp_simd_clone): New.
>   * passes.def (pass_omp_simd_clone): New.
>   * target.def: Define new hook prefix "TARGET_CILKPLUS_".
>   (default_vecsize_mangle): New.
>   (vecsize_for_mangle): New.
>   * doc/tm.texi.in: Add placeholder for
>   TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE and
>   TARGET_CILKPLUS_VECSIZE_FOR_MANGLE.
>   * tree-sra.c (sra_ipa_modify_expr): Remove static modifier.
>   (ipa_sra_modify_function_body): Same.
>   * tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Define.
>   * doc/tm.texi: Regenerate.
>   * config/i386/i386.c (ix86_cilkplus_default_vecsize_mangle): New.
>

Re: [PATCH] Don't simplify_conversion_from_bitmask on x & 1 (PR tree-optimization/59014)

2013-11-07 Thread Marek Polacek

On Thu, Nov 07, 2013 at 04:39:03PM +0100, Richard Biener wrote:
> On Thu, Nov 7, 2013 at 4:15 PM, Marek Polacek  wrote:
> > Here, forward propagation turned
> >   _6 = a.1_5 & 1;
> >   _7 = (_Bool) _6;
> > into
> >   _7 = (_Bool) a.1_5;
> > but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is 
> > 1.
> 
> ?
> 
> (_Bool) 2
> 
> should truncate the value to zero.

Aha.  I can see now, said the blind man.

Anyway, I'll revisit this in stage 3; I now suspect the special
handling of BIT_IOR_EXPR in register_edge_assert_for_1.

Marek

Re: Implement C11 _Atomic

2013-11-07 Thread Uros Bizjak

On Wed, Nov 6, 2013 at 12:21 AM, Joseph S. Myers
 wrote:
> This patch, relative to trunk and based on work done on the C11-atomic
> branch, adds support for C11 _Atomic.  It is intended to include all
> the required language support.
>
> It does not include the  header; there's a version on the
> branch, but it needs further review against the standard and test
> coverage adding to the testsuite before I can propose it for mainline.
>
> Support for atomic types having bigger alignment than the
> corresponding non-atomic types is limited: it includes the code to
> increase the alignment of types whose size is exactly 1, 2, 4, 8 or 16
> to that of the corresponding integer type [*], but not anything for
> target-specific alignment increases.  There's code for target-specific
> alignment on the branch (and I intend to merge trunk back to the
> branch once this patch is on trunk, so it's easy to tell what the
> changes still left on the branch are), should any target maintainers
> wish to have such alignment.  Note however that ideally libstdc++
> atomics would be ABI-compatible with C atomics, requiring them to get
> the same alignment; the branch has an "atomic" attribute for that
> purpose, but I think further work on the C++ front-end changes would
> be needed for them to be ready for mainline.
>
> [*] The c-common.c code to resolve size-generic atomic built-in
> functions to size-specific ones assumes that types are naturally
> aligned (and so resolves to calls to the size-specific functions that
> require natural alignment) without checking it.  If users should be
> able to use the size-generic functions on types with lesser alignment,
> e.g. _Complex double (8-byte rather than 16-byte aligned), rather than
> just on their _Atomic variants, this is of course a bug in the code
> resolving those built-in functions.  (The libatomic functions properly
> check for alignment at runtime.)  But it seems reasonable enough
> anyway to make _Atomic _Complex double 16-byte aligned.
>
> Full use of _Atomic requires linking with libatomic, if you're doing
> any atomic operations that aren't fully expanded inline; arguably
> libatomic should be linked in by default with --as-needed (given that
> previously ordinary C code didn't require the user to link any
> libraries explicitly unless they directly called library functions),
> but that's independent of this patch.  Correct handling of atomic
> compound assignment for floating-point types also requires built-in
> support for floating-point environment manipulation operations roughly
> equivalent to feholdexcept, feclearexcept and feupdateenv (built-ins
> are used to avoid introducing libm dependencies, which generic C code
> not actually calling libm library functions shouldn't require); this
> patch includes such support for x86 [*], and I expect other
> architectures to be simpler.  If you don't have a libatomic port, the
> execution tests for _Atomic simply won't be run; if you have libatomic
> but not the floating-point support, the tests will be run but
> c11-atomic-exec-5.c will fail (assuming the library features are
> present for that test to run at all - it also requires pthreads).
>
> [*] This could be optimized if desired by passing the types in
> question to the target hook so it can generate code for only one of
> x87 and SSE in most cases, much like an optimization present in glibc
> when it internally does feholdexcept / fesetenv / feupdateenv
> sequences.
>
> _Atomic support is currently disabled for Objective-C and OpenMP.  For
> both (but mainly OpenMP), the relevant parser code needs checking to
> determine where convert_lvalue_to_rvalue calls need inserting to
> ensure that accesses to atomic variables involve atomic loads.  For
> Objective-C, there are also various special cases of compound
> assignment that need special handling for atomics just as standard C
> compound assignment is handled differently for atomics, as well as
> some TYPE_MAIN_VARIANT calls to check for correctness for atomics; see
> the comment on the relevant sorry () call for details.  OpenMP should
> also have TYPE_MAIN_VARIANT uses checked as well as a use of
> TYPE_QUALS_NO_ADDR_SPACE for a diagnostic in
> c_parser_omp_declare_reduction (where the diagnostic refers to a
> particular list of qualifiers).
>
> Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  OK to
> commit (the various non-front-end pieces)?
>
> gcc:
> 2013-11-05  Andrew MacLeod  
> Joseph Myers  
>
> * tree-core.h (enum cv_qualifier): Add TYPE_QUAL_ATOMIC.
> (enum tree_index): Add TI_ATOMICQI_TYPE, TI_ATOMICHI_TYPE,
> TI_ATOMICSI_TYPE, TI_ATOMICDI_TYPE and TI_ATOMICTI_TYPE.
> (struct tree_base): Add atomic_flag field.
> * tree.h (TYPE_ATOMIC): New accessor macro.
> (TYPE_QUALS, TYPE_QUALS_NO_ADDR_SPACE): Add TYPE_QUAL_ATOMIC.
> (TYPE_QUALS_NO_ADDR_SPACE_NO_ATOMIC): New macro.
> (atomicQI_type_node, atomicHI_type_node, at

Clean up atomic tests

2013-11-07 Thread Joseph S. Myers

This patch cleans up various issues with the tests of atomics built-in
functions and libatomic functions, in preparation for adapting those
tests to add test coverage of stdatomic.h macros.  The tests were
missing a return type for main (C11 doesn't allow implicit int return
types).  Some tests were incrementing a variable in one part of an
expression and using its value in another part without a sequence
point in between.  And the libatomic tests shouldn't have been
restricted to targets with hardware sync_* support, or adding
command-line options to make such support available, since they are
built with -fno-inline-atomics anyway to ensure the library
functionality is what's tested, and the library is meant to cover all
types regardless of what hardware support may be available.

Tested for x86_64-unknown-linux-gnu.  OK to commit?

gcc/testsuite:
2013-11-07  Joseph Myers  

* gcc.dg/atomic-compare-exchange-1.c,
gcc.dg/atomic-compare-exchange-2.c,
gcc.dg/atomic-compare-exchange-3.c,
gcc.dg/atomic-compare-exchange-4.c,
gcc.dg/atomic-compare-exchange-5.c, gcc.dg/atomic-exchange-1.c,
gcc.dg/atomic-exchange-2.c, gcc.dg/atomic-exchange-3.c,
gcc.dg/atomic-exchange-4.c, gcc.dg/atomic-exchange-5.c,
gcc.dg/atomic-fence.c, gcc.dg/atomic-flag.c,
gcc.dg/atomic-generic.c, gcc.dg/atomic-invalid.c,
gcc.dg/atomic-load-1.c, gcc.dg/atomic-load-2.c,
gcc.dg/atomic-load-3.c, gcc.dg/atomic-load-4.c,
gcc.dg/atomic-load-5.c, gcc.dg/atomic-lockfree.c,
gcc.dg/atomic-noinline.c, gcc.dg/atomic-op-1.c,
gcc.dg/atomic-op-2.c, gcc.dg/atomic-op-3.c, gcc.dg/atomic-op-4.c,
gcc.dg/atomic-op-5.c, gcc.dg/atomic-param.c,
gcc.dg/atomic-store-1.c, gcc.dg/atomic-store-2.c,
gcc.dg/atomic-store-3.c, gcc.dg/atomic-store-4.c,
gcc.dg/atomic-store-5.c: Declare main as returning int.
* gcc.dg/atomic-exchange-1.c, gcc.dg/atomic-exchange-2.c,
gcc.dg/atomic-exchange-3.c, gcc.dg/atomic-exchange-4.c,
gcc.dg/atomic-exchange-5.c: Separate increments of count from
expression using value of count.

libatomic:
2013-11-07  Joseph Myers  

* testsuite/libatomic.c/atomic-compare-exchange-1.c,
testsuite/libatomic.c/atomic-compare-exchange-2.c,
testsuite/libatomic.c/atomic-compare-exchange-3.c,
testsuite/libatomic.c/atomic-compare-exchange-4.c,
testsuite/libatomic.c/atomic-compare-exchange-5.c,
testsuite/libatomic.c/atomic-exchange-1.c,
testsuite/libatomic.c/atomic-exchange-2.c,
testsuite/libatomic.c/atomic-exchange-3.c,
testsuite/libatomic.c/atomic-exchange-4.c,
testsuite/libatomic.c/atomic-exchange-5.c,
testsuite/libatomic.c/atomic-generic.c,
testsuite/libatomic.c/atomic-load-1.c,
testsuite/libatomic.c/atomic-load-2.c,
testsuite/libatomic.c/atomic-load-3.c,
testsuite/libatomic.c/atomic-load-4.c,
testsuite/libatomic.c/atomic-load-5.c,
testsuite/libatomic.c/atomic-op-1.c,
testsuite/libatomic.c/atomic-op-2.c,
testsuite/libatomic.c/atomic-op-3.c,
testsuite/libatomic.c/atomic-op-4.c,
testsuite/libatomic.c/atomic-op-5.c,
testsuite/libatomic.c/atomic-store-1.c,
testsuite/libatomic.c/atomic-store-2.c,
testsuite/libatomic.c/atomic-store-3.c,
testsuite/libatomic.c/atomic-store-4.c,
testsuite/libatomic.c/atomic-store-5.c: Declare main as returning
int.  Do not require built-in sync support or add target-specific
options.
* testsuite/libatomic.c/atomic-exchange-1.c,
testsuite/libatomic.c/atomic-exchange-2.c,
testsuite/libatomic.c/atomic-exchange-3.c,
testsuite/libatomic.c/atomic-exchange-4.c,
testsuite/libatomic.c/atomic-exchange-5.c: Separate increments of
count from expression using value of count.

Index: gcc/testsuite/gcc.dg/atomic-op-2.c
===
--- gcc/testsuite/gcc.dg/atomic-op-2.c  (revision 204508)
+++ gcc/testsuite/gcc.dg/atomic-op-2.c  (working copy)
@@ -528,6 +528,7 @@
 abort ();
 }
 
+int
 main ()
 {
   test_fetch_add ();
Index: gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
===
--- gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c(revision 204508)
+++ gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c(working copy)
@@ -16,6 +16,7 @@
 #define STRONG 0
 #define WEAK 1
 
+int
 main ()
 {
 
Index: gcc/testsuite/gcc.dg/atomic-exchange-3.c
===
--- gcc/testsuite/gcc.dg/atomic-exchange-3.c(revision 204508)
+++ gcc/testsuite/gcc.dg/atomic-exchange-3.c(working copy)
@@ -9,25 +9,31 @@
 
 int v, count, ret;
 
+int
 main ()
 {
   v = 0;
   count = 0;
 
-  if (__atomic_exchange_n (&v, count + 1, __ATOMIC_RELAXED) !=  count++) 
+  if (_

Re: Implement C11 _Atomic

2013-11-07 Thread Jakub Jelinek

On Tue, Nov 05, 2013 at 11:21:56PM +, Joseph S. Myers wrote:
> 2013-11-05  Andrew MacLeod  
>   Joseph Myers  
> 
>   * tree-core.h (enum cv_qualifier): Add TYPE_QUAL_ATOMIC.
>   (enum tree_index): Add TI_ATOMICQI_TYPE, TI_ATOMICHI_TYPE,
>   TI_ATOMICSI_TYPE, TI_ATOMICDI_TYPE and TI_ATOMICTI_TYPE.
>   (struct tree_base): Add atomic_flag field.
>   * tree.h (TYPE_ATOMIC): New accessor macro.
>   (TYPE_QUALS, TYPE_QUALS_NO_ADDR_SPACE): Add TYPE_QUAL_ATOMIC.
>   (TYPE_QUALS_NO_ADDR_SPACE_NO_ATOMIC): New macro.
>   (atomicQI_type_node, atomicHI_type_node, atomicSI_type_node)
>   (atomicDI_type_node, atomicTI_type_node): New macros for type
>   nodes.
>   * tree.c (set_type_quals): Set TYPE_ATOMIC.
>   (find_atomic_core_type): New function.
>   (build_qualified_type): Adjust alignment for qualified types.
>   (build_atomic_base): New function
>   (build_common_tree_nodes): Build atomicQI_type_node,
>   atomicHI_type_node, atomicSI_type_node, atomicDI_type_node and
>   atomicTI_type_node.
>   * print-tree.c (print_node): Print atomic qualifier.
>   * tree-pretty-print.c (dump_generic_node): Print atomic type
>   attribute.
>   * target.def (atomic_assign_expand_fenv): New hook.
>   * doc/tm.texi.in (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New @hook.
>   * doc/tm.texi: Regenerate.
>   * targhooks.c (default_atomic_assign_expand_fenv): New function.
>   * targhooks.h (default_atomic_assign_expand_fenv): Declare.
>   * sync-builtins.def (__atomic_feraiseexcept): New built-in
>   function.

> 2013-11-05  Joseph Myers  
> 
>   * lib/target-supports.exp
>   (check_effective_target_fenv_exceptions): New function.
>   * lib/atomic-dg.exp, gcc.dg/atomic/atomic.exp: New files.
>   * gcc.dg/atomic/c11-atomic-exec-1.c,
>   gcc.dg/atomic/c11-atomic-exec-2.c,
>   gcc.dg/atomic/c11-atomic-exec-3.c,
>   gcc.dg/atomic/c11-atomic-exec-4.c,
>   gcc.dg/atomic/c11-atomic-exec-5.c, gcc.dg/c11-atomic-1.c,
>   gcc.dg/c11-atomic-2.c, gcc.dg/c11-atomic-3.c,
>   gcc.dg/c90-atomic-1.c, gcc.dg/c99-atomic-1.c: New tests.
> 
> libatomic:
> 2013-11-05  Joseph Myers  
> 
>   * fenv.c: New file.
>   * libatomic.map (LIBATOMIC_1.1): New symbol version.  Include
>   __atomic_feraiseexcept.
>   * configure.ac (libtool_VERSION): Change to 2:0:1.
>   (fenv.h): Test for header.
>   * Makefile.am (libatomic_la_SOURCES): Add fenv.c.
>   * Makefile.in, auto-config.h.in, configure: Regenerate.

The middle-end, libatomic and testsuite changes are ok.

Jakub

Re: Cilk Library

2013-11-07 Thread Jeff Law


On 11/07/13 06:11, Thomas Schwinge wrote:

Hi!

On Wed, 9 Oct 2013 18:32:11 +, "Iyer, Balaji V"  
wrote:

 * Makefile.def: Add libcilkrts to target_modules.  Make libcilkrts
 depend on libstdc++ and libgcc.
 [...]
 * Makefile.in: Added libcilkrts related fields to support building it.


How did you modify the latter file?  I noticed it is no longer in sync
with the former: if I regenerate it (»autogen Makefile.def«), then there
are differences.  This is easily fixed, of course, but as everyone now
has been using the "out-of-sync" Makefile.in, I wonder whether the
regeneration qualifies as obvious to commit, or rather something in
Makefile.def needs to be changed to make it match the Makefile.in as it
has been checked in in r204173?

* Makefile.in: Regenerate.
I think we should consider regeneration as an obvious change.  If that 
breaks something, then it's the original author who introduced the 
change without a corresponding regenerate that needs to fix their code.


Just to be explicit, this is fine for the trunk.

Jeff

Re: Re-factor tree.h - Part 1

2013-11-07 Thread Jeff Law


On 11/07/13 10:24, Diego Novillo wrote:

On Thu, Nov 7, 2013 at 5:29 AM, Richard Biener
 wrote:


Moved to fold-const.c:
 size_low_cst


Same as int_cst_value just with lack of any checking?


It may be. It's used by the POINTER_EXPR_PLUS handler. I will try to
replace it in a later patch.




Moved to gimple-fold.c:
 truth_type_for (maybe keep extern?)


Uses a langhook so should stay here.


Done.


Moved to tree-ssa-dom.c:
 iterative_hash_exprs_commutative


I'm not sure - it seems that iterative_hash_expr does exactly
the same thing?  So instead of moving things this should
be refactored (make that function re-usable from iterative_hash_expr).


Future patch? Making it static will at least prevent diverging uses of
the similar functions.
I think future patch.  In theory we ought to just make 
iterative_hash_exprs_commutative just go away, but there's some 
restructuring of the DOM code that would need to happen first.


jeff

Committed: arc_ifcvt: take commutativity into account

2013-11-07 Thread Joern Rennecke




tmp
Description: Binary data

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-07 Thread Joseph S. Myers

On Thu, 7 Nov 2013, Rong Xu wrote:

> Thanks Joseph for these detailed comments/suggestions.
> The fixed patch is attached to this email.
> The only thing left out is the Texinfo manual. Do you mean this tool
> should have its it's own texi file in gcc/doc?

Its own texi file, probably included as a chapter by gcc.texi (like 
gcov.texi is).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] make has_gate and has_execute useless

2013-11-07 Thread Jeff Law


On 11/07/13 09:00, tsaund...@mozilla.com wrote:

From: Trevor Saunders 

Hi,

  This is the result of seeing what it would take to get rid of the has_gate and
has_execute flags on pass_data.  It turns out not much, but I wanted
confirmation this part is ok before I go deal with all the places that
initialize the fields.

Most definitely, the has_gate/has_execute stuff is definitely a wart.



I bootstrapped this part on x86_64-unknown-linux-gnu and saw no new test suite
regressions (ignoring the silk stuff because the full paths in its test names
break my test script for now) Any reason this patch with the actual removal of 
the flags wouldn't be ok?

I don't see anything glaringly wrong.

Any reason why the register_dump_files bits were mucking with 
properties.  That just seems wrong.




+  todo_after = pass->execute ();
+  if (todo_after != TODO_absolutely_nothing)
+do_per_function (clear_last_verified, NULL);
+  else
+todo_after &= ~TODO_absolutely_nothing;


Isn't the last assignment there just
todo_after = 0;

?

Jeff

[wide-int, committed] Fix nds32 build

2013-11-07 Thread Richard Sandiford

Get the new port building on wide-int.  Seemed pretty obvious so I went
ahead and installed it.

Thanks,
Richard


Index: gcc/config/nds32/nds32.c
===
--- gcc/config/nds32/nds32.c2013-11-05 13:06:54.744239262 +
+++ gcc/config/nds32/nds32.c2013-11-07 10:27:03.339028225 +
@@ -1273,7 +1273,7 @@ nds32_construct_isr_vectors_information
  /* Pick up each vector id value.  */
  id = TREE_VALUE (id_list);
  /* Add vector_number_offset to get actual vector number.  */
- vector_id = TREE_INT_CST_LOW (id) + vector_number_offset;
+ vector_id = tree_to_uhwi (id) + vector_number_offset;
 
  /* Enable corresponding vector and set function name.  */
  nds32_isr_vectors[vector_id].category = (intr)
@@ -1315,7 +1315,7 @@ nds32_construct_isr_vectors_information
 
   /* The total vectors = interrupt + exception numbers + reset.
  There are 8 exception and 1 reset in nds32 architecture.  */
-  nds32_isr_vectors[0].total_n_vectors = TREE_INT_CST_LOW (id) + 8 + 1;
+  nds32_isr_vectors[0].total_n_vectors = tree_to_uhwi (id) + 8 + 1;
   strcpy (nds32_isr_vectors[0].func_name, func_name);
 
   /* Retrieve nmi and warm function.  */
@@ -3145,8 +3145,8 @@ nds32_insert_attributes (tree decl, tree
  id = TREE_VALUE (id_list);
  /* Issue error if it is not a valid integer value.  */
  if (TREE_CODE (id) != INTEGER_CST
- || TREE_INT_CST_LOW (id) < lower_bound
- || TREE_INT_CST_LOW (id) > upper_bound)
+ || wi::ltu_p (id, lower_bound)
+ || wi::gtu_p (id, upper_bound))
error ("invalid id value for interrupt/exception attribute");
 
  /* Advance to next id.  */
@@ -3173,8 +3173,8 @@ nds32_insert_attributes (tree decl, tree
 
  /* 3. Check valid integer value for reset.  */
  if (TREE_CODE (id) != INTEGER_CST
- || TREE_INT_CST_LOW (id) < lower_bound
- || TREE_INT_CST_LOW (id) > upper_bound)
+ || wi::ltu_p (id, lower_bound)
+ || wi::gtu_p (id, upper_bound))
error ("invalid id value for reset attribute");
 
  /* 4. Check valid function for nmi/warm.  */

Re: fixincludes patch RFA: Fix fenv.h on Ubuntu Precise

2013-11-07 Thread Bruce Korb


On 11/06/13 15:29, Ian Lance Taylor wrote:

When fenv.h is not fixed, libquadmath does not build.

This patch works around the problem.  Bootstrapped and tested on
x86_64-unknown-linux-gnu.

OK for mainline?


Hi Ian,

Yes, please.

This time, I'm on my dev box and looked at the code.
You remembered correctly that the first file name in the list
of file names needs to not have wild card characters so that
the testing scheme can create a file by that name.  A directory
named "*" is possible, but inconvenient.

Anyway, it also matches prior art in:


/*
 * glibc_c99_inline_3
 */
fix = {
hackname  = glibc_c99_inline_3;
files = bits/string2.h, '*/bits/string2.h';


but is inconsistent with:


/* Some versions of glibc have a version of bits/string2.h that
   produces "value computed is not used" warnings from strncpy; fix
   this definition by using __builtin_strncpy instead as in newer
   versions.  */
fix = {
hackname  = glibc_strncpy;
files = bits/string2.h;


Hmmm.  Does that fix need fixing, too?

Thanks -Bruce

wide-int testing results

2013-11-07 Thread Richard Sandiford

I wanted to make sure that each backend still builds with wide-int and that
there weren't any unexplained changes in assembly output.  I arbitrarily
picked one target for each CPU:

aarch64-linux-gnueabi alpha-linux-gnu arc-elf arm-linux-gnueabi
avr-rtems bfin-elf c6x-elf cr16-elf cris-elf epiphany-elf
fr30-elf frv-linux-gnu h8300-elf ia64-linux-gnu iq2000-elf
lm32-elf m32c-elf m32r-elf m68k-linux-gnu mcore-elf mep-elf
microblaze-elf mips-linux-gnu mmix mn10300-elf moxie-elf
msp430-elf nds32le-elf hppa64-hp-hpux11.23 pdp11 picochip-elf
powerpc-linux-gnu powerpc-eabispe rl78-elf rx-elf s390-linux-gnu
score-elf sh-linux-gnu sparc-linux-gnu spu-elf tilegx-elf
tilepro-elf xstormy16-elf v850-elf vax-netbsdelf xtensa-elf
x86_64-darwin

and built gcc and g++ from both the merge point and the wide-int branch.
The branch included the patches I've posted and a local change
to put CONST_WIDE_INT at the end of rtl.def (just for comparison,
as mentioned before).  I then compiled gcc.c-torture, gcc.dg and g++.dg
at -O2 and compared the asm output.  This obviously isn't a very strong
test, since none of the libraries are built, and since some of the test
cases rely on system headers, but it should at least be better than nothing.

pdp11 and picochip-elf don't build on mainline due to:

   gcc/target-def.h:69:34: error: ‘default_stabs_asm_out_destructor’ was not 
declared in this scope
   #   define TARGET_ASM_DESTRUCTOR default_stabs_asm_out_destructor

Both the merge point and the branch built the other targets and there were
no new warnings on the branch.  The only asm differences were in:

* gcc.dg/fixed-point/const-1.s on targets that support it

  I think this is a known improvement.  The test case checks a series of
  conversions to make sure that the out-of-range ones produce a warning.
  E.g.:

short _Fract sfF = 1.0;  /* { dg-warning "overflow" } */

  The asm differences are in the values produced for these out-of-range
  casees.  The old code used real_to_integer2, which saturates the result
  to double_int precision.  It then uses the fractional part of that result
  to initialise sfF.  The result therefore depends on the host (at least
  in the general case).

  The new code instead saturates to the precision of the result, just like
  we already do for:

signed short s = 32768.0f;

  So this seems like a good change. 

* gcc.dg/vect/pr51799.c on m32c-elf and
* gcc.c-torture/execute/divconst-2.c on xstormy16-elf

  These are cases where wide-int converts a narrower-than-HWI
  multiplication by 1 << (GET_MODE_PRECISION - 1) into a shift but
  mainline doesn't.  The mainline code looks like:

  /* Convert multiply by constant power of two into shift unless
 we are still generating RTL.  This test is a kludge.  */
  if (CONST_INT_P (trueop1)
  && (val = exact_log2 (UINTVAL (trueop1))) >= 0
  /* If the mode is larger than the host word size, and the
 uppermost bit is set, then this isn't a power of two due
 to implicit sign extension.  */
  && (width <= HOST_BITS_PER_WIDE_INT
  || val != HOST_BITS_PER_WIDE_INT - 1))
return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));

  and this exact_log2 doesn't cope properly with the sign-extended
  0xff8000... CONST_INT.  The wide-int code is:

  /* Convert multiply by constant power of two into shift.  */
  if (CONST_SCALAR_INT_P (trueop1))
{
  val = wi::exact_log2 (std::make_pair (trueop1, mode));
  if (val >= 0 && val < GET_MODE_BITSIZE (mode))
return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));
}

  So this too seems like a good thing.

Thanks,
Richard

Re: fixincludes patch RFA: Fix fenv.h on Ubuntu Precise

2013-11-07 Thread Bruce Korb


So is this the right patch?


$ svn diff inclhack.def
Index: inclhack.def
===
--- inclhack.def(revision 204533)
+++ inclhack.def(working copy)
@@ -1738,7 +1738,7 @@
versions.  */
 fix = {
 hackname  = glibc_strncpy;
-files = bits/string2.h;
+files = bits/string2.h, '*/bits/string2.h';
 bypass= "__builtin_strncpy";
 c_fix = format;
 c_fix_arg = "#  define strncpy(dest, src, n) __builtin_strncpy (dest, src, 
n)";
@@ -2411,7 +2411,7 @@
  */
 fix = {
 hackname  = huge_val_hex;
-files = bits/huge_val.h;
+files = bits/huge_val.h, '*/bits/huge_val.h';
 select= "^#[ \t]*define[ \t]*HUGE_VAL[ \t].*0x1\\.0p.*";
 bypass= "__builtin_huge_val";

@@ -2426,7 +2426,7 @@
  */
 fix = {
 hackname  = huge_valf_hex;
-files = bits/huge_val.h;
+files = bits/huge_val.h, '*/bits/huge_val.h';
 select= "^#[ \t]*define[ \t]*HUGE_VALF[ \t].*0x1\\.0p.*";
 bypass= "__builtin_huge_valf";

@@ -2441,7 +2441,7 @@
  */
 fix = {
 hackname  = huge_vall_hex;
-files = bits/huge_val.h;
+files = bits/huge_val.h, '*/bits/huge_val.h';
 select= "^#[ \t]*define[ \t]*HUGE_VALL[ \t].*0x1\\.0p.*";
 bypass= "__builtin_huge_vall";

@@ -4226,7 +4226,7 @@
 fix = {
 hackname  = thread_keyword;
 files = "pthread.h";
-files = "bits/sigthread.h";
+files = bits/sigthread.h, '*/bits/sigthread.h';
 select= "([* ])__thread([,)])";
 c_fix = format;
 c_fix_arg = "%1__thr%2";
@@ -4767,7 +4767,7 @@
 fix = {
 hackname  = feraiseexcept_nosse_invalid;
 mach  = 'i[34567]86-*-linux*', 'x86*-linux*', 'amd64-*-linux*';
-files = bits/fenv.h;
+files = bits/fenv.h, '*/bits/fenv.h';
 select= "^([\t ]*)__asm__ __volatile__ \\(\"divss %0, %0 *\" : "
": \"x\" \\(__f\\)\\);$";
 bypass= "\"fdiv .*; fwait\"";
@@ -4794,7 +4794,7 @@
 fix = {
 hackname  = feraiseexcept_nosse_divbyzero;
 mach  = 'i[34567]86-*-linux*', 'x86*-linux*', 'amd64-*-linux*';
-files = bits/fenv.h;
+files = bits/fenv.h, '*/bits/fenv.h';
 select= "^([\t ]*)__asm__ __volatile__ \\(\"divss %1, %0 *\" : "
": \"x\" \\(__f\\), \"x\" \\(__g\\)\\);$";
 bypass= "\"fdivp .*; fwait\"";

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-11-07 Thread Tejas Belagod


Richard Sandiford wrote:

Tejas Belagod  writes:

Richard Sandiford wrote:

Tejas Belagod  writes:

Richard Sandiford wrote:

Tejas Belagod  writes:

Richard Sandiford wrote:

Tejas Belagod  writes:

+  /* This is big-endian-safe because the elements are kept in target
+ memory order.  So, for eg. PARALLEL element value of 2 is the
same in
+ either endian-ness.  */
+  if (GET_CODE (src) == VEC_SELECT
+  && REG_P (XEXP (src, 0)) && REG_P (dst)
+  && REGNO (XEXP (src, 0)) == REGNO (dst))
+{
+  rtx par = XEXP (src, 1);
+  int i;
+
+  for (i = 0; i < XVECLEN (par, 0); i++)
+   {
+ rtx tem = XVECEXP (par, 0, i);
+ if (!CONST_INT_P (tem) || INTVAL (tem) != i)
+   return 0;
+   }
+  return 1;
+}
+

I think for big endian it needs to be:

INTVAL (tem) != i + base

where base is something like:

int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN (par, 0);

E.g. a big-endian V4HI looks like:

msb  lsb


and shortening it to say V2HI only gives the low 32 bits:

msb  lsb


But, in this case we want

 msb  lsb
 

It depends on whether the result occupies a full register or not.
I was thinking of the case where it didn't, but I realise now you were
thinking of the case where it did.  And yeah, my suggestion doesn't
cope with that...

I was under the impression that the const vector parallel for vec_select 
represents the element indexes of the array in memory order.


Therefore, in bigendian,

  msb lsb
     
element  a[0] a[1] a[2] a[3]

and in littleendian

  msb lsb
     
element  a[3] a[2] a[1] a[0]

Right.  But if an N-bit value is stored in a register, it's assumed to
occupy the lsb of the register and the N-1 bits above that.  The other
bits in the register are don't-care.

E.g., leaving vectors to one side, if you have:

   (set (reg:HI N) (truncate:SI (reg:SI N)))

on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this:

   msb  lsb
   01234567
   
   4567

rather than:

   msb  lsb
   01234567
   
   0123

for both endiannesses.  The same principle applies to vectors.
The lsb of the register is always assumed to be significant.

So maybe the original patch was correct for partial-register and
full-register results on little-endian, but only for full-register
results on big-endian.

Ah, ok! I think I get it. By eliminating
set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

using the check INTVAL (tem) != i, I'm essentially making subsequent
operations
use (reg:V2DI n) in DI mode which is a partial register result and this
gives me
the wrong set of lanes in bigendian. So, if I want to use (reg n) in partial 
register mode, I have to make sure the correct elements coincide with

the lsb in
big-endian...

Thanks for your input, I'll apply the offset correction for big-endian you 
suggested. I'll respin the patch.

Thanks.  Just for avoidance of doubt, the result might be a full or
partial register, depending on the mode and target.  I was trying to
correct myself by agreeing that your original was right and mine was
wrong for big-endian if the result is a full register.

What I had in mind when I implemented this was a partial-reg result, but 
obviously it was wrong.


Sorry, I'm going to take a step back - I'm trying to figure out what a full 
register result would look like. Looking at the pattern,


   set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

the result is always a mode smaller than the src if the vec_select selects a 
subset of lanes. Plus if the src and dst are the same reg, because we're 
re-writing the src reg, wouldn't it always end up being a partial-reg?

In this case, wouldn't (reg:DI n) always represent

  msb  lsb
  


The problem is that one reg rtx can span several hard registers.
E.g. (reg:V4SI 32) might represent one 64-bit register (no. 32),
but it might instead represent two 32-bit registers (nos. 32 and 33).
Obviously the latter's not very likely for vectors this small,
but more likely for larger ones (including on NEON IIRC).

So if we had 2 32-bit registers being treated as a V4HI, it would be:

   <--32--><--33-->
   msb  lsb
   
   
   
   msb  lsb
   <--32-->

for big endian and:

   <--33--><--32-->
   msb  lsb
   
   
   
   msb  lsb
   <--32-->

for little endian.


Ah, ok, that makes things clearer. Thanks for that.

I can't find any helper function that figures out if we're writing partial or 
full result regs. Would something like


REGNO (src) == REGNO (dst) &&
HARD_REGNO_NREGS (src) == HARD_REGNO_NREGS (dst) == 1

be a sane check for partial result regs?

Thanks,
Tejas.

Re: [gomp4] rewrite simd clone argument adjustment to avoid regimplification

2013-11-07 Thread Martin Jambor

Hi,

On Wed, Nov 06, 2013 at 05:37:03PM -0700, Aldy Hernandez wrote:
> On 11/06/13 15:48, Jakub Jelinek wrote:
> >On Wed, Nov 06, 2013 at 03:24:40PM -0700, Aldy Hernandez wrote:
> >>I have checked the following patch with the attached testcases that
> >>were previously ICEing, and with a handful of handcrafted tests that
> >>I checked manually (array references on lhs and rhs, vectors of
> >>pointers, etc).
> >
> >I'd like Martin to eyeball the ipa changes eventually.
> 
> Indeed!  Likewise for all my previous changes to ipa-prop.[ch] and
> tree-sra.c.

Sorry for the delay.  I'd just like to re-iterate one comment from my
previous email which is that I do not think tree-sra.c should export
any function to the outside world apart from the entry points of the
passes (yes, there is already build_ref_for_offset which I admit is
entirely my creation but that should be moved elswhere too).

So please put sra_ipa_get_adjustment_candidate to ipa-prop.c.  I see
that it requires get_ssa_base_param to be moved there as well but
since IPA-SRA uses it in different places, it would need exporting
too, which would be weird because it does not really do anything with
parameters.  Since the function is so trivial, I would even suggest
introducing another private copy to ipa-prop.c (and leaving the
original without the new parameter).  Alternatively, you can move the
function to tree.c but for that it looks too specialized.

Thanks,

Martin

> 
> >>+  if (!SSA_VAR_P (*tp))
> >>+{
> >>+  /* Make sure we treat subtrees as a RHS.  This makes sure that
> >>+when examining the `*foo' in *foo=x, the `foo' get treated as
> >>+a use properly.  */
> >
> >I think it wouldn't hurt to e.g. do
> > if (TYPE_P (*tp))
> >   *walk_subtrees = 0;
> >here (and drop ATTRIBUTE_UNUSED above.
> 
> Done.
> 
> >
> >>+  wi->is_lhs = false;
> >>+  wi->val_only = true;
> >>+  return NULL_TREE;
> >>+}
> >>+  struct modify_stmt_info *info = (struct modify_stmt_info *) wi->info;
> >>+  struct ipa_parm_adjustment *cand
> >>+= sra_ipa_get_adjustment_candidate (tp, NULL, info->adjustments, true);
> >>+  if (!cand)
> >>+return NULL_TREE;
> >>+
> >>tree t = *tp;
> >>+  tree repl = make_ssa_name (create_tmp_var (TREE_TYPE (t), NULL), NULL);
> >
> >Just do
> >   tree repl = make_ssa_name (TREE_TYPE (t), NULL);
> >no need to create underlying vars since 4.8.
> 
> Done.
> 
> >
> >>+  /* We have modified in place; update the SSA operands.  */
> >>+  info->modified = false;
> >
> >So you always set modified to false?  I was expecting you'd set
> >it to true here and defer update_stmt and maybe_clean_eh_stmt etc.
> >to after walk_gimple_op (so, do it only when all the changes on the stmt
> >have been performed).  Plus, when you modify something, there is no need
> >to walk subtrees, so you can just do *walk_subtrees = 0; too.
> 
> Hmmm, good point.  I've moved update_stmt and company to the caller,
> and modified the caller to call regimplify_operands only for
> GIMPLE_RETURN which is the special case.
> 
> Let me know if this is still ok so I can commit.
> 
> Thanks.
> Aldy

> gcc/ChangeLog.gomp
>   * ipa-prop.h (sra_ipa_get_adjustment_candidate): Protoize.
>   * omp-low.c (struct modify_stmt_info): New.
>   (ipa_simd_modify_function_body_ops_1): Remove.
>   (ipa_simd_modify_stmt_ops): New.
>   (ipa_simd_modify_function_body_ops): Remove.
>   (ipa_simd_modify_function_body): Set new callback info.
>   Remove special casing.  Handle all operators with walk_gimple_op.
>   * tree-sra.c (get_ssa_base_param): Add new argument.  Use it.
>   (disqualify_base_of_expr): Pass new argument to
>   get_ssa_base_param.
>   (sra_ipa_modify_expr): Abstract candidate search into...
>   (sra_ipa_get_adjustment_candidate): ...here.
>

Re: wide-int testing results

2013-11-07 Thread Kenneth Zadeck


I doubt that this list is comprehensive.

When i many of these things a long time ago, I was not that interested 
in bit for bit compatibly as long as was never introducing any 
problems.in some cases it would have been hard to replicate some of 
the end cases with the wide-int code.


On 11/07/2013 12:51 PM, Richard Sandiford wrote:

I wanted to make sure that each backend still builds with wide-int and that
there weren't any unexplained changes in assembly output.  I arbitrarily
picked one target for each CPU:

 aarch64-linux-gnueabi alpha-linux-gnu arc-elf arm-linux-gnueabi
 avr-rtems bfin-elf c6x-elf cr16-elf cris-elf epiphany-elf
 fr30-elf frv-linux-gnu h8300-elf ia64-linux-gnu iq2000-elf
 lm32-elf m32c-elf m32r-elf m68k-linux-gnu mcore-elf mep-elf
 microblaze-elf mips-linux-gnu mmix mn10300-elf moxie-elf
 msp430-elf nds32le-elf hppa64-hp-hpux11.23 pdp11 picochip-elf
 powerpc-linux-gnu powerpc-eabispe rl78-elf rx-elf s390-linux-gnu
 score-elf sh-linux-gnu sparc-linux-gnu spu-elf tilegx-elf
 tilepro-elf xstormy16-elf v850-elf vax-netbsdelf xtensa-elf
 x86_64-darwin

and built gcc and g++ from both the merge point and the wide-int branch.
The branch included the patches I've posted and a local change
to put CONST_WIDE_INT at the end of rtl.def (just for comparison,
as mentioned before).  I then compiled gcc.c-torture, gcc.dg and g++.dg
at -O2 and compared the asm output.  This obviously isn't a very strong
test, since none of the libraries are built, and since some of the test
cases rely on system headers, but it should at least be better than nothing.

pdp11 and picochip-elf don't build on mainline due to:

gcc/target-def.h:69:34: error: ‘default_stabs_asm_out_destructor’ was not 
declared in this scope
#   define TARGET_ASM_DESTRUCTOR default_stabs_asm_out_destructor

Both the merge point and the branch built the other targets and there were
no new warnings on the branch.  The only asm differences were in:

* gcc.dg/fixed-point/const-1.s on targets that support it

   I think this is a known improvement.  The test case checks a series of
   conversions to make sure that the out-of-range ones produce a warning.
   E.g.:

 short _Fract sfF = 1.0;  /* { dg-warning "overflow" } */

   The asm differences are in the values produced for these out-of-range
   casees.  The old code used real_to_integer2, which saturates the result
   to double_int precision.  It then uses the fractional part of that result
   to initialise sfF.  The result therefore depends on the host (at least
   in the general case).

   The new code instead saturates to the precision of the result, just like
   we already do for:

 signed short s = 32768.0f;

   So this seems like a good change.

* gcc.dg/vect/pr51799.c on m32c-elf and
* gcc.c-torture/execute/divconst-2.c on xstormy16-elf

   These are cases where wide-int converts a narrower-than-HWI
   multiplication by 1 << (GET_MODE_PRECISION - 1) into a shift but
   mainline doesn't.  The mainline code looks like:

   /* Convert multiply by constant power of two into shift unless
 we are still generating RTL.  This test is a kludge.  */
   if (CONST_INT_P (trueop1)
  && (val = exact_log2 (UINTVAL (trueop1))) >= 0
  /* If the mode is larger than the host word size, and the
 uppermost bit is set, then this isn't a power of two due
 to implicit sign extension.  */
  && (width <= HOST_BITS_PER_WIDE_INT
  || val != HOST_BITS_PER_WIDE_INT - 1))
return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));

   and this exact_log2 doesn't cope properly with the sign-extended
   0xff8000... CONST_INT.  The wide-int code is:

   /* Convert multiply by constant power of two into shift.  */
   if (CONST_SCALAR_INT_P (trueop1))
{
  val = wi::exact_log2 (std::make_pair (trueop1, mode));
  if (val >= 0 && val < GET_MODE_BITSIZE (mode))
return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));
}

   So this too seems like a good thing.

Thanks,
Richard

Re: Implement C11 _Atomic

2013-11-07 Thread Uros Bizjak

On Thu, Nov 7, 2013 at 5:45 PM, Jakub Jelinek  wrote:
> On Tue, Nov 05, 2013 at 11:21:56PM +, Joseph S. Myers wrote:
>> 2013-11-05  Andrew MacLeod  
>>   Joseph Myers  
>>
>>   * tree-core.h (enum cv_qualifier): Add TYPE_QUAL_ATOMIC.
>>   (enum tree_index): Add TI_ATOMICQI_TYPE, TI_ATOMICHI_TYPE,
>>   TI_ATOMICSI_TYPE, TI_ATOMICDI_TYPE and TI_ATOMICTI_TYPE.
>>   (struct tree_base): Add atomic_flag field.
>>   * tree.h (TYPE_ATOMIC): New accessor macro.
>>   (TYPE_QUALS, TYPE_QUALS_NO_ADDR_SPACE): Add TYPE_QUAL_ATOMIC.
>>   (TYPE_QUALS_NO_ADDR_SPACE_NO_ATOMIC): New macro.
>>   (atomicQI_type_node, atomicHI_type_node, atomicSI_type_node)
>>   (atomicDI_type_node, atomicTI_type_node): New macros for type
>>   nodes.
>>   * tree.c (set_type_quals): Set TYPE_ATOMIC.
>>   (find_atomic_core_type): New function.
>>   (build_qualified_type): Adjust alignment for qualified types.
>>   (build_atomic_base): New function
>>   (build_common_tree_nodes): Build atomicQI_type_node,
>>   atomicHI_type_node, atomicSI_type_node, atomicDI_type_node and
>>   atomicTI_type_node.
>>   * print-tree.c (print_node): Print atomic qualifier.
>>   * tree-pretty-print.c (dump_generic_node): Print atomic type
>>   attribute.
>>   * target.def (atomic_assign_expand_fenv): New hook.
>>   * doc/tm.texi.in (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New @hook.
>>   * doc/tm.texi: Regenerate.
>>   * targhooks.c (default_atomic_assign_expand_fenv): New function.
>>   * targhooks.h (default_atomic_assign_expand_fenv): Declare.
>>   * sync-builtins.def (__atomic_feraiseexcept): New built-in
>>   function.
>
>> 2013-11-05  Joseph Myers  
>>
>>   * lib/target-supports.exp
>>   (check_effective_target_fenv_exceptions): New function.
>>   * lib/atomic-dg.exp, gcc.dg/atomic/atomic.exp: New files.
>>   * gcc.dg/atomic/c11-atomic-exec-1.c,
>>   gcc.dg/atomic/c11-atomic-exec-2.c,
>>   gcc.dg/atomic/c11-atomic-exec-3.c,
>>   gcc.dg/atomic/c11-atomic-exec-4.c,
>>   gcc.dg/atomic/c11-atomic-exec-5.c, gcc.dg/c11-atomic-1.c,
>>   gcc.dg/c11-atomic-2.c, gcc.dg/c11-atomic-3.c,
>>   gcc.dg/c90-atomic-1.c, gcc.dg/c99-atomic-1.c: New tests.
>>
>> libatomic:
>> 2013-11-05  Joseph Myers  
>>
>>   * fenv.c: New file.
>>   * libatomic.map (LIBATOMIC_1.1): New symbol version.  Include
>>   __atomic_feraiseexcept.
>>   * configure.ac (libtool_VERSION): Change to 2:0:1.
>>   (fenv.h): Test for header.
>>   * Makefile.am (libatomic_la_SOURCES): Add fenv.c.
>>   * Makefile.in, auto-config.h.in, configure: Regenerate.
>
> The middle-end, libatomic and testsuite changes are ok.

Please note that following code form fenv.c won't generate overflow
exception on x87:

  if (excepts & FP_EX_OVERFLOW)
{
  volatile float max = __FLT_MAX__;
  r = max * max;
}

Uros.

Re: [patch] Create gimple-expr..[ch] ... was Re: RFC: gimple.[ch] break apart

2013-11-07 Thread Diego Novillo

On Thu, Nov 7, 2013 at 8:23 AM, Andrew MacLeod  wrote:
> On 11/07/2013 05:36 AM, Basile Starynkevitch wrote:
>>
>> On Tue, Nov 05, 2013 at 11:26:46AM -0500, Andrew MacLeod wrote:
>>>
>>> I decided to name the new file gimple-expr.[ch] instead of
>>> gimple-decl   This will eventually split into gimple-type.[ch],
>>> gimple-decl.[ch], and gimple-expr.[ch].
>>
>>
>> Since we are adding *new* C++ files, can't we please name them *.cc
>> for the implementation part, so at least create gimple-expr.h and
>> gimple-expr.cc but not gimple-expr.c, please!
>
> Assuming we put this into stage 1 next year,  I would imagine we'd rename a
> number of things, including .cc, drop the tree-* from the tree-ssa-blah.[c]h
> files, etc etc.  There have been a few things people have suggested
> renaming... I think if we do renaming, they ought to be all at one time to
> minimize the pain.
>
> At the moment, the new gimple-* files I'm creating are still C, so they are
> .c files...

Agreed. When we start shuffling files around seems a better time.
Doing both operations at once will be easier than going through two
phases of naming changes.


Diego.

Re: [wide-int] Remove SHIFT_COUNT_TRUNCATED uses at tree/gimple level

2013-11-07 Thread Richard Sandiford

Kenneth Zadeck  writes:
> I very strongly disagree with this.  The standard needs to be high than
> "does it pass the test suite."
>
> What we are introducing is a case where the program will behave one way
> with optimization and another way without it.  While, this is always
> true for timing dependent code, it is pretty rare for math.  We should
> always tread carefully when doing that, and the excuse that it is
> "cleaner" does not, in my mind, fit the bill.

If it makes you feel any better (probably not), this was already true
for !SHIFT_COUNT_TRUNCATED targets.  The tree level would then not do
any truncation, whereas as you said before, all real targets do in practice.
And that affected many of the main targets, including x86_64, powerpc,
z/Series, ARM, etc.

(And the tree level has to do _something_.  "unsigned int i = 1 << 32;"
must compile with default options, even if the behaviour's undefined.
Only rtl has the luxury of being able to punt.)

Although you could go out of your way to try to make the undefined
behaviour match what the target instructions would do, that'd be a
lot of work, especially for cases that are implemented using library
calls.  And there's a lot to be said for making the behaviour
consistent across targets, which is what removing the truncation
from the tree level does.

Things are different at the rtl level because there a backend can
legitimately exploit the behaviour of the shift instructions when
implementing optabs.  And optabs.c itself does this when creating
doubleword shifts.  So at the rtl level we can only optimise out-of-range
shifts if we're sure what the target would do.

Thanks,
Richard

RE: Initial submission of OpenACC support integrated into OpenMP's lowering and expansion passes

2013-11-07 Thread Evgeny Gavrin

Hi, Thomas!

This is really great!
I've looked at your changes and in most of front-end parts it looks reasonable 
to me. As you know, we're like-minded with you about how OpenACC's front-ends 
should look like. So, I think it's good that you have working flow for your 
implementation. Now we can start to merge front-ends from openacc-1_0-branch to 
GOMP, while keeping all understandings of ACC in sync and working.

Jakub, don't you mind if I prepare a few front-end patches likewise for review 
to extend Thomas' vision with the things that was already done by my colleagues 
on the ACC branch? I hope it'll much speed up the development.

> the last patch adds GOACC_parallel, which so far simply branches to 
> GOMP_target.  There's more to come.
I've reviewed ACC-related code from [openacc-1_0-branch] and from the patches, 
I noticed there are some mess around marking OpenACC specific code. There are 
ACC, GACC, OACC, GOACC abbreviations used across code, which one is the best? I 
prefer ACC, this is not critical, I'm just curious how to resolve this.

> Due to the conceptual similarity compared to OpenMP, and that (later) 
> it is reasonable to expect to embed OpenACC directives into OpenMP 
> ones, the approach I've chosen is to directly embed OpenACC handling 
> into the existing OpenMP infrastructure/passes.
Regarding front-ends, not only from view of OMP/ACC, but front-ends itself, it 
makes sense to move some parsing routines out of gomp/acc context. This will be 
useful to implement not only ACC, but also HMPP, if someone is interested, or 
other similar technology. And of cause, these functions should be used behind 
the existing facilities - not to break current implementation and other 
dependencies. So it won't not introduce any difficulties for back porting, 
while generalizing front-ends a bit.
For now, I prefer to extend the approved way of developing, but keep it in 
mind, please.

> This patch series doesn't contain any substantial rumtime library work 
> yet;
Can you share some technical details/ideas about further implementation, it's 
not going to be trivial. On the [openacc-1_0-branch] we resolved it with 
OpenCL-driven runtime. Also, OpenACC 2.0 standard declares some vendor-specific 
routines, this should be noted too.

> This is in contrast to Samsung's work, who are implementing OpenACC
> separately from the existing OpenMP support.
I'm not fully agree that this in contrast to things are taking place in 
[openacc-1_0-branch]. The main reason for keeping middle-end and back-end 
solutions far from GOMP is the understanding of OpenACC semantics and unclear 
understanding of what should be done to generalize support of accelerators in 
GCC. OpenACC and OpenOMP has some semantically differences, but such things 
matters only for middle-end and further code generation. The main one, is the 
memory concept. The 

> Yet, I hope we'll be able to share/re-use at least front end and some
middle end code.
I absolutely agree, it's needed to merge our efforts regarding front-ends.
Many things from ACC in [openacc-1_0-branch] is already done, so it doesn't 
necessary to re-implement same things. Maybe we can polish front-ends together 
and then commit changes to GOMP?

> We directly strive for OpenACC 2.0 support, skipping OpenACC 1.  We're 
> focussing on the C front end implementation first, following on with 
> C++ and Fortran later on.
How do you think, does it make sense to adopt current Fortran implementation 
from the OpenACC branch? This can be done quite soon. Also, I may help to 
extend C and C++ front-end the same way it's done now, by adding other 
implemented directives, too.

Regarding proposed patches:
[8/9] c_parser_omp_all_clauses and cp_parser_omp_all_clauses
- c_parser_error (parser, "expected %<#pragma omp%> clause");
+ c_parser_error (parser, "expected clause");

- cp_parser_error (parser, "expected %<#pragma omp%> clause");
+ cp_parser_error (parser, "expected clause");
Does it really needed to remove %<#pragma omp%> in these cases? We handle both, 
I think.

[9/9]
* gimple.h (gimple_build_oacc_*): New declaration.
gimple_oacc_parallel_* and other functions like that. Let's move these 
functions out of gimple.* to gimple-oacc.*! There are going to be much more of 
the stuff like this, maybe it'll make sense to keep it externally? May be not 
only gimple manipulation functions but also pretty-print Also, have a look on 
openacc_1-0_branch please, for gimple-oacc.*, is OK if I'll add some of this?

Handling clauses.
+  GIMPLE_CHECK (gs, GIMPLE_OACC_PARALLEL);  
+ gs->gimple_omp_parallel.clauses = clauses;
I think, we can freely add some kind of flag to gimple_omp_parallel to indicate 
that we're are working with ACC.
Or even add new gimple_acc_* statements.

On gimplification, lowering stuff and further, I think it's okay if targeting 
GOMP_target, but on my view, this is going to prevent us from emiting OpenCL or 
do some extended analysis of the

Re: Clean up atomic tests

2013-11-07 Thread Mike Stump

On Nov 7, 2013, at 8:43 AM, Joseph S. Myers  wrote:
> This patch cleans up various issues with the tests of atomics built-in
> functions and libatomic functions, in preparation for adapting those
> tests to add test coverage of stdatomic.h macros.  The tests were
> missing a return type for main (C11 doesn't allow implicit int return
> types).

Ok for those changes.

> Some tests were incrementing a variable in one part of an
> expression and using its value in another part without a sequence
> point in between.

Ok for those changes.

> And the libatomic tests shouldn't have been
> restricted to targets with hardware sync_* support, or adding
> command-line options to make such support available,

For this I'd like a domain expert to weigh in.  If that is you, then you can 
check it in as obvious if you'd like.  If you want a review, you can wait for 
them to speak up.

Re: [wide-int, committed] Fix nds32 build

2013-11-07 Thread Mike Stump

On Nov 7, 2013, at 9:47 AM, Richard Sandiford  
wrote:
> Get the new port building on wide-int.  Seemed pretty obvious so I went
> ahead and installed it.

Looks good.

Re: wide-int testing results

2013-11-07 Thread Richard Sandiford

Kenneth Zadeck  writes:
> I doubt that this list is comprehensive.

Hmm?  It was supposed to be one target for each CPU, so if you think
I've missed one, please point it out.  It certainly wasn't supposed
to be every target triple that gcc supports.

Even one target per CPU was useful, since quite a few wouldn't compile.

> When i many of these things a long time ago, I was not that interested 
> in bit for bit compatibly as long as was never introducing any 
> problems.in some cases it would have been hard to replicate some of 
> the end cases with the wide-int code.

Not sure what you mean.  The point was that if wide-int made things better,
we should just say "good" and move on.  Which is why I'm not trying to
get bit-for-bit compatibility in the testcases I pointed out.  But for
a change of this size I think we should at least know why any differences
occur.

Thanks,
Richard

> On 11/07/2013 12:51 PM, Richard Sandiford wrote:
>> I wanted to make sure that each backend still builds with wide-int and that
>> there weren't any unexplained changes in assembly output.  I arbitrarily
>> picked one target for each CPU:
>>
>>  aarch64-linux-gnueabi alpha-linux-gnu arc-elf arm-linux-gnueabi
>>  avr-rtems bfin-elf c6x-elf cr16-elf cris-elf epiphany-elf
>>  fr30-elf frv-linux-gnu h8300-elf ia64-linux-gnu iq2000-elf
>>  lm32-elf m32c-elf m32r-elf m68k-linux-gnu mcore-elf mep-elf
>>  microblaze-elf mips-linux-gnu mmix mn10300-elf moxie-elf
>>  msp430-elf nds32le-elf hppa64-hp-hpux11.23 pdp11 picochip-elf
>>  powerpc-linux-gnu powerpc-eabispe rl78-elf rx-elf s390-linux-gnu
>>  score-elf sh-linux-gnu sparc-linux-gnu spu-elf tilegx-elf
>>  tilepro-elf xstormy16-elf v850-elf vax-netbsdelf xtensa-elf
>>  x86_64-darwin
>>
>> and built gcc and g++ from both the merge point and the wide-int branch.
>> The branch included the patches I've posted and a local change
>> to put CONST_WIDE_INT at the end of rtl.def (just for comparison,
>> as mentioned before).  I then compiled gcc.c-torture, gcc.dg and g++.dg
>> at -O2 and compared the asm output.  This obviously isn't a very strong
>> test, since none of the libraries are built, and since some of the test
>> cases rely on system headers, but it should at least be better than nothing.
>>
>> pdp11 and picochip-elf don't build on mainline due to:
>>
>> gcc/target-def.h:69:34: error: ‘default_stabs_asm_out_destructor’ was 
>> not declared in this scope
>> #   define TARGET_ASM_DESTRUCTOR default_stabs_asm_out_destructor
>>
>> Both the merge point and the branch built the other targets and there were
>> no new warnings on the branch.  The only asm differences were in:
>>
>> * gcc.dg/fixed-point/const-1.s on targets that support it
>>
>>I think this is a known improvement.  The test case checks a series of
>>conversions to make sure that the out-of-range ones produce a warning.
>>E.g.:
>>
>>  short _Fract sfF = 1.0;  /* { dg-warning "overflow" } */
>>
>>The asm differences are in the values produced for these out-of-range
>>casees.  The old code used real_to_integer2, which saturates the result
>>to double_int precision.  It then uses the fractional part of that result
>>to initialise sfF.  The result therefore depends on the host (at least
>>in the general case).
>>
>>The new code instead saturates to the precision of the result, just like
>>we already do for:
>>
>>  signed short s = 32768.0f;
>>
>>So this seems like a good change.
>>
>> * gcc.dg/vect/pr51799.c on m32c-elf and
>> * gcc.c-torture/execute/divconst-2.c on xstormy16-elf
>>
>>These are cases where wide-int converts a narrower-than-HWI
>>multiplication by 1 << (GET_MODE_PRECISION - 1) into a shift but
>>mainline doesn't.  The mainline code looks like:
>>
>>/* Convert multiply by constant power of two into shift unless
>>   we are still generating RTL.  This test is a kludge.  */
>>if (CONST_INT_P (trueop1)
>>&& (val = exact_log2 (UINTVAL (trueop1))) >= 0
>>/* If the mode is larger than the host word size, and the
>>   uppermost bit is set, then this isn't a power of two due
>>   to implicit sign extension.  */
>>&& (width <= HOST_BITS_PER_WIDE_INT
>>|| val != HOST_BITS_PER_WIDE_INT - 1))
>>  return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));
>>
>>and this exact_log2 doesn't cope properly with the sign-extended
>>0xff8000... CONST_INT.  The wide-int code is:
>>
>>/* Convert multiply by constant power of two into shift.  */
>>if (CONST_SCALAR_INT_P (trueop1))
>>  {
>>val = wi::exact_log2 (std::make_pair (trueop1, mode));
>>if (val >= 0 && val < GET_MODE_BITSIZE (mode))
>>  return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));
>>  }
>>
>>So this too seems like a good thing.
>>
>> Thanks,
>> Richard

RE: [PATCH] Factorize the two read_line functions present in gcov.c and input.c

2013-11-07 Thread Bernd Edlinger

> On 11/07/13 09:32, Dodji Seketeli wrote:
>
>> From the above, what I can say is that input.o was already linked with
>> gcov. But I think it's minimal enough to only drag libcpp and the
>> diagnostic subsystem.
>
> I disagree. While input.o was available to gcov, I don't think it was being
> pulled into the executable (if it was, I'd like to understand why). I think
> dragging in libcpp & diagnostics is a major bloat -- have you measured the
> before and after gcov text sizes?


I agree that this should be avoided. However both versions of read_line
need to be fixed eventually.

Just for the records, this is not only a theoretical issue for gcov:
The OOM error can also be reproduced in gcov.

The attached test.c has some dozends of comment lines:
//xx\0EOL

That are lines with text before and after the \0.
If compiled without -Wall, it will crash in gcov:


gcc -fprofile-arcs -ftest-coverage test.c
./a.out
gcov test
File 'test.c'
Lines executed:100.00% of 1
Creating 'test.c.gcov'

gcov: out of memory allocating 1677721600 bytes after a total of 135168 bytes


Bernd.//00EOL
//01EOL
//02EOL
//03EOL
//04EOL
//05EOL
//06EOL
//07EOL
//08EOL
//09EOL
//10EOL
//11EOL
//12EOL
//13EOL
//14EOL
//15EOL
//16EOL
//17EOL
//18EOL
//19EOL
//20EOL
//21EOL
//22EOL
//23EOL
//24EOL
//25EOL
//26EOL
//27EOL
//28EOL
//29EOL
//30EOL
//31EOL
//32EOL
//33EOL
//34EOL
//35EOL
//36EOL
//37EOL
//38EOL
//39EOL
//40EOL
//41EOL
//42EOL
int main() { }
// EOF

Re: [PATCH] make has_gate and has_execute useless

2013-11-07 Thread Trevor Saunders

On Thu, Nov 07, 2013 at 10:30:16AM -0700, Jeff Law wrote:
> On 11/07/13 09:00, tsaund...@mozilla.com wrote:
> >From: Trevor Saunders 
> >
> >Hi,
> >
> >  This is the result of seeing what it would take to get rid of the has_gate 
> > and
> >has_execute flags on pass_data.  It turns out not much, but I wanted
> >confirmation this part is ok before I go deal with all the places that
> >initialize the fields.
> Most definitely, the has_gate/has_execute stuff is definitely a wart.
> 
> >
> >I bootstrapped this part on x86_64-unknown-linux-gnu and saw no new test 
> >suite
> >regressions (ignoring the silk stuff because the full paths in its test names
> >break my test script for now) Any reason this patch with the actual removal 
> >of the flags wouldn't be ok?
> I don't see anything glaringly wrong.

great, I'll work on ripping them out then :)

> Any reason why the register_dump_files bits were mucking with
> properties.  That just seems wrong.

 no, it looks like its pretty old, the most recent change I can see is
 r110742 in 2006, but I agree it seems crazy, but atleast its not
 actually doing anything now.

> >+  todo_after = pass->execute ();
> >+  if (todo_after != TODO_absolutely_nothing)
> >+do_per_function (clear_last_verified, NULL);
> >+  else
> >+todo_after &= ~TODO_absolutely_nothing;
> 
> Isn't the last assignment there just
> todo_after = 0;

yes

Trev

> 
> ?
> 
> Jeff

1 2 >

1 - 100 of 138 matches

Mail list logo